What Can Distributed Systems Teach Us About Concurrent Coding?
In 2010, three important questions plagued the author of this Stack Exchange question on designing concurrent systems:
- How do you figure out what can be made concurrent vs. what has to be sequential?
- How do you reproduce error conditions and view what is happening as the [concurrent] application executes?
- How do you visualize the interactions between the different concurrent parts of the application?
What is striking is that seven years later, at least for systems that are truly distributed across multiple machines, these problems have been mostly, if not fully, solved. As I read his question, I wondered: can the comparatively less Byzantine failures of ordinary concurrent code (web servers, say) be tackled in a similar way?
In this post, I would like to discuss these particular solutions and how they can be applied to writing resilient concurrent code.