Re: John Ousterhout on What Limits Software Growth
Posted: Sep 22, 2006 9:46 AM
This is a very serious discussion and there are a lot of points raised in the short article above.
> There's an insatiable demand for features - whether
> underlying characteristics, such as security or
> reliability, or actual feature sets or Web pages, or
> screens, or editing tools. There's just no limit to the
> set of things people would like to do in their software.
And that's the beauty of software: there is no limit to what one can do. Computers are able to create whole universes.
> To me, one of the most interesting things about
> software is that you're limited not by what you want to
> do, but by what you can do. We can conceive of so many
> things we'd like to have in our products, [but] we either
> don't have the time or the resources, or we just can't
> manage what it would take to build those things.
Or our tools are wrong and are more of an obstacle than a help.
> Unfortunately, everything in software leads to more
> complexity. There are various laws of physics people have
> discovered, and there are corresponding laws of software.
> The first law of software is that software systems tend
> towards increasing states of complexity. It's almost a
> perfect mirror of the First Law of Thermodynamics in
> physics...
The reason there is complexity, though, is that we have not yet captured the essence of computing. Very few people have actually understood what computers should be like and what they should do.
I will develop my view at the end, so please stay with me.
> I think extreme programming is a great idea. There are
> a lot of important things to be learned from that, and it
> will allow us in many cases to reduce complexity -
> temporarily. The techniques in extreme programming will
> allow us to do things that we've never done before, that
> we couldn't do in the past. And that makes even bigger,
> more complex projects feasible. What's immediately going
> to happen, as soon as people get the current stuff totally
> under control, easily manageable, [is that] their
> ambitions ... are going to go up dramatically. And they're
> going to build even bigger things. We will never go back
> to a simpler day, I'm afraid. We just find better ways to
> manage complex things.
If we are talking about extreme programming as it is widely known, then I have serious doubts that it can offer anything new. Every task requires a certain amount of planning, which extreme programming does not offer.
> What would be next, after that? Personally, I think
> there is a bottleneck around the development of Web-based
> applications. Web applications have a very different
> development style than traditional software [does]... It
> has something to do with the fact that there are so many
> different technologies that have to be mixed together to
> do Web application development. You end up using Java,
> those pieces is pretty good by itself, but when you try to
> combine all those together, projects become very difficult
> to manage. I think there is opportunity for someone to
> come up with a paradigm, or a toolkit, to make it
> dramatically simpler to develop really powerful
> applications over the Web. Over the next five or ten
> years, something is going to happen there. I can't tell
> you what it is.
> What do you see as the most limiting bottleneck in your
> current project?
The real problem with computers is that they don't manage information, they manage bytes. We humans are interested in information, and not in bytes. We have built all sorts of operating systems, but none of them manages information; they all manage bytes.
The concept of process/driver/filesystem/library is totally wrong...and that's where the problem is: we deal too much with the technical details of computers; 95% of our programs consist of code that deals with the technical details of our system instead of the actual computation needed.
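To make the bytes-versus-information point concrete, here is a minimal sketch in Python. The Contact record and its 18-byte layout are invented for the illustration; the point is only that the system hands back raw bytes and the knowledge of what they mean lives inside the application.

```python
import struct
from dataclasses import dataclass


@dataclass
class Contact:
    """The information the user actually cares about (hypothetical record)."""
    name: str
    age: int


# Hypothetical on-disk layout known only to this application:
# a 16-byte name field followed by a 16-bit unsigned age, little-endian.
RECORD = struct.Struct("<16sH")


def to_bytes(contact: Contact) -> bytes:
    # struct pads the name with NUL bytes up to 16 bytes.
    return RECORD.pack(contact.name.encode("utf-8"), contact.age)


def from_bytes(raw: bytes) -> Contact:
    name, age = RECORD.unpack(raw)
    return Contact(name.rstrip(b"\0").decode("utf-8"), age)


raw = to_bytes(Contact("Ada", 36))
restored = from_bytes(raw)
print(len(raw), restored)
```

Any other program that finds those 18 bytes on disk sees only bytes; without RECORD it cannot tell a name from a number.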
There are many things that are wrong:
1) operating systems are completely unaware of the structure of information. Only applications know the format of data. If there is a need to manage the information produced by an application, then one has to modify the application. Applications need to open and close files, open and close databases, open and close printers, open and close graphics devices. Data and code are not appropriately separated.
2) programs are fixed black boxes of computations with no way to communicate with the outside world. The only official interface of an executable has been the stdin/stdout mechanism. Of course each operating system provides its own communication mechanism, but there is no way to reuse a useful computation contained in a program...and updating a program is not as simple as installing a new function.
3) programs are built in a monolithic way: heaps of source code must be compiled before we even see a single dot on the screen illuminated by our program. Of course there are interactive environments, but this interactivity does not go through the whole system: at some point, the interpreted language will invoke system code, and nobody knows what happens in there.
4) our computers speak a thousand languages, all mutually incompatible! We have tons of different shells all with their own language, tons of different interfaces (CORBA, COM, RPC, SOAP, HTML, FTP, Windows messages, the X Window protocol, OpenGL), tons of different languages and protocols (and protocols are languages). Yet there is no single way for programs to co-operate, exchange data, discover capabilities, communicate.
5) our computers can't talk to each other; they are perfect strangers only capable of exchanging hand signals. The same thing that happens to programs happens to computers, only much worse: computers can't really communicate, they can only have limited communication through network filesystems.
6) the computing environment is not the programming environment. In order to enter programming mode, I have to go through a sequence of boring and unnecessary steps like: a) firing up the IDE, b) going through the API docs, c) writing the code, d) checking the code, e) compiling the code, f) figuring out what went wrong. For example, if I want to change the color of the active window frame, I cannot simply enter 'ActiveFrameColor = Orange' or whatever and have the frame color of all windows change, because I) there is no persistent system-wide 'ActiveFrameColor' variable and II) there is no global event model so that windows can listen to changes to 'ActiveFrameColor'.
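To show what I mean by the 'ActiveFrameColor' example, here is a rough sketch in Python. SystemVariables and Window are names I made up for the illustration, and the store lives in memory only; a real system would also have to make it persistent and system-wide.

```python
from typing import Any, Callable, Dict, List

# A listener receives (variable name, old value, new value).
Listener = Callable[[str, Any, Any], None]


class SystemVariables:
    """Hypothetical global variable store that raises an event on change."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._listeners: Dict[str, List[Listener]] = {}

    def get(self, name: str, default: Any = None) -> Any:
        return self._values.get(name, default)

    def listen(self, name: str, listener: Listener) -> None:
        self._listeners.setdefault(name, []).append(listener)

    def set(self, name: str, value: Any) -> None:
        old = self._values.get(name)
        self._values[name] = value
        if old != value:  # notify only on an actual change
            for listener in self._listeners.get(name, []):
                listener(name, old, value)


class Window:
    """Hypothetical window that follows the global frame color."""

    def __init__(self, title: str, variables: SystemVariables) -> None:
        self.title = title
        self.frame_color = variables.get("ActiveFrameColor", "Gray")
        variables.listen("ActiveFrameColor", self._on_color_change)

    def _on_color_change(self, name: str, old: Any, new: Any) -> None:
        self.frame_color = new


system = SystemVariables()
windows = [Window("editor", system), Window("terminal", system)]
system.set("ActiveFrameColor", "Orange")  # the one line the user wants to type
print([w.frame_color for w in windows])
```

The windows never poll and never get called directly; they just listen to the variable, which is the global event model missing from today's systems.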
What should the computing environment look like? Well, the system should be responsible for:
1) global definition of datatypes, including arrays, lists, tree maps, hashmaps, sorted maps, indexed arrays...not only at the semantic level but also at the binary level.
2) automatic persistence. It's the year 2006, we have had computers for over 50 years, and yet we still have to 'open file'. The system should be responsible for using the system's RAM as a cache for the hard disk.
3) system-wide persistent map of functions. A program should be a function. When a function is replaced, all programs are automatically updated. The difference from the current system is that the code should be distributed as source code (or in an intermediate form), and each function should be a different 'file'. The O/S should be responsible for compiling and caching the code, as well as checking for security.
4) support for reactive programming. Any state change should be an event where listeners can be attached. Reactive programming makes GUI programming very easy.
5) support for versioning. New definitions of data types and functions should be automatically versioned.
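Points 2, 3 and 5 above can be sketched together in a few lines of Python. FunctionStore and its methods are hypothetical names of mine, the standard shelve module stands in for the system's automatic persistence, and the security check is left out entirely, so take it as an illustration of the idea and not as a design.

```python
import os
import shelve
import tempfile
from typing import Any, Callable, Dict, List, Tuple


class FunctionStore:
    """Hypothetical persistent, versioned map from names to function source."""

    def __init__(self, path: str) -> None:
        self._db = shelve.open(path)  # the store, not the caller, handles the disk
        self._cache: Dict[Tuple[str, int], Callable[..., Any]] = {}

    def define(self, name: str, source: str) -> int:
        """Add a new version of a function; returns its version number."""
        versions: List[str] = self._db.get(name, [])
        versions.append(source)
        self._db[name] = versions  # reassign so shelve writes the change
        return len(versions)       # versions are numbered from 1

    def load(self, name: str, version: int = 0) -> Callable[..., Any]:
        """Compile and cache one version; 0 means the latest."""
        versions = self._db[name]
        if version == 0:
            version = len(versions)
        key = (name, version)
        if key not in self._cache:
            namespace: Dict[str, Any] = {}
            exec(versions[version - 1], namespace)  # no security check in this sketch
            self._cache[key] = namespace[name]
        return self._cache[key]

    def call(self, name: str, *args: Any) -> Any:
        return self.load(name)(*args)

    def close(self) -> None:
        self._db.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "functions")
    store = FunctionStore(path)
    store.define("area", "def area(w, h):\n    return w * h\n")
    store.define("area", "def area(w, h):\n    return abs(w * h)\n")
    store.close()

    store = FunctionStore(path)              # reopened: the definitions survived
    latest = store.call("area", -2, 3)       # uses version 2
    original = store.load("area", 1)(-2, 3)  # version 1 is still available
    store.close()

print(latest, original)
```

Every caller that goes through call() picks up the replacement automatically, while older versions stay reachable by number.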
Programming should be as simple as opening a new source code file and running it...the various functions and structure of the program should be automatically stored by the system. By incrementally building systems, development would be much better and faster.
Most of the applications and mechanisms invented so far are solutions to the same problem:
1) versioning systems are information management systems which deal with source code.
2) document management systems are information management systems which deal with documents.
3) filesystems are information management systems which manage bytes.
4) the Windows registry is an information management system which manages system settings.
5) the 'etc' directory in Unix is an information management system which manages system settings.
6) databases are information management systems where triggers play the role of the reactive system.
7) help files are information management systems.
8) hypertext is an information management system.
9) an O/S kernel maintains various pieces of information in its memory.
And there are many others...
So the bottleneck exists because the same problem is solved using many different solutions; if this bottleneck goes away, scalability will skyrocket and creativity will be unlocked...