There's been a fair bit of buzz about the implications of multi-core machines - especially given that today's 2- and 4-core systems will become tomorrow's 16- and 32-core (or more) systems. However, I don't think the answer lies in changing languages and compilers to automatically parallelize applications - at least, not as a general answer. That seems to be where Larry O'Brien was going in SD Times this week:
No mainstream programming language is automatically parallelizable. This is ironic, since object-oriented programming has its roots in simulation, where concurrency is a basic concern. However, since mainstream OO languages allow state to be shared between threads, they’re fundamentally crippled. When the basic rule for thread safety is “either write objects with no fields or write objects with no virtual method calls,” the paradigms are clashing.
Surprisingly, the mainstream language that seems to have the most far-reaching proposal for manycore programming is C/C++. Herb Sutter, who is an architect at Microsoft and chair of the ISO C++ committee, gave the first public airing of his Concur project at last September’s PDC. Along with emphasizing that Moore’s “free lunch is over,” Sutter proposes that existing approaches to concurrency such as OpenMP do not go far enough and that the abstractions of .NET (and Java, for that matter) are inadequate, focusing as they do on thread management, rather than the more general concept of delayed execution.
Developers have enough trouble writing multi-threaded applications now, especially when the threads are native. When multiple threads of execution access shared state, chaos tends to ensue. And while the hardware will certainly get better, the "wet-ware" - i.e., our brains - won't.
Ironically, the answer to this problem came up a long while back in the Unix world. Back in the day, Unix approached problem solving with lots of small, single-purpose applications that you wire together. Those ideas evolved into the architecture of modern servers like Apache. Run ps on a Linux box sometime - you'll see lots of Apache processes. That's because it's far easier to write a single-threaded application and run multiple copies of it than it is to get shared state properly managed in a single executable space with multiple threads going at it.
The other nice thing: the multiple-process approach works equally well whether you scale via multiple systems or via multiple cores - or both. It also works with existing development tools; it doesn't require custom compilers that will almost certainly be architecture-specific.
Which is more expensive - the multi-core hardware, or the developer trying to work on it? Based on that answer, which one makes the most sense to optimize?