jp's domain

coding like crazy

Rethinking the JVM

The problem with programming

What is the greatest problem in computer programming today? Is it the impending 'concurrency crisis'? Is it reducing bug counts? Is it shifting over to a programming language that has the full power of macros? Is it type inference in statically-typed languages? Is it dealing with licensing issues?

All of these are genuine issues, and they are being looked at, as they should be, but I would contend that the major problem in programming today is the waste of millions of programmer hours. What do I mean by this? Well, let's say I'm a Common Lisp programmer, and I need a library which will read Excel files. Such a library might exist in Python or Java, but it doesn't in Common Lisp, and if I really need that functionality, my only real choice is to write that library myself. That's not the worst thing that can happen, but I don't know much about the Excel format, and writing such a library is a job that could literally take hundreds of hours. Now remember, my original goal was perhaps to write a general-purpose file analyzer, not an Excel-reading library. What a waste of time!

We see the same problem throughout the programming world. Perl has its own libraries to read CSV files. So do Python and Ruby. In an ideal world, if it were possible to easily use Perl's CSV libraries from Python, would days of work have been spent on Python's CSV library? Probably not. More time could have been spent fixing what existed.

It doesn't end there!

This isn't the only problem that this Babel of programming has caused. Ruby's interpreter is widely seen as very mediocre. Due to pretty major speed issues, as well as other concerns, a new Ruby interpreter, YARV, is being written and is slated for a Christmas release this year. Writing interpreters and virtual machines isn't easy. Python has its own, Perl has its own, PHP has its own, and they all have their own little boxes of libraries and communities, where it is nearly impossible to use a library from one language in the next. Some languages sidestep the issue by compiling down to native code instead of running on a virtual machine, but the result is the same. Essentially, if you want to write a library that is easily usable from many different languages, you need to write it in C. C isn't bad, but it isn't 1975 anymore, either.
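To make the point about C concrete, here is a minimal sketch of a high-level language borrowing a function from a C library rather than from another high-level language. It uses Python's standard ctypes module to call strlen from the system C library. The wrapper name c_strlen is my own, purely for illustration, and the sketch assumes a Unix-like system where the C library can be loaded this way.

```python
import ctypes
import ctypes.util

# Locate the system C library. Assumption: a Unix-like platform. If the
# lookup returns None, CDLL(None) falls back to the symbols already loaded
# into the running process, which on Linux includes the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Hypothetical wrapper: return the length of a byte string via C's strlen."""
    return libc.strlen(text)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

Perl, Ruby, and most other languages have a comparable foreign-function mechanism, and each one targets C, which is exactly why C ends up as the only common ground.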

Back to those language implementations. There has been much uproar in the Python community this past week over the global interpreter lock (GIL) on the Python virtual machine. Basically, the GIL makes it difficult to write multi-threaded programs in Python that take advantage of multiple cores, but it does make the VM much more efficient, and it makes writing C extensions considerably easier. The response from Guido, the creator of Python, was essentially, "I'd love to have this feature, but it is a hard job, will take a lot of work, and I simply don't have the time to do it myself". Guido also has a different idea about the most effective way to use multiple cores, but that is beside the point. The point is, adding this feature to Python is something that will again take hundreds of man-hours. Ruby, with a similar design, is looking at the same issue in the near future. What a crappy way of doing things! As programmers, if we have too much duplication in our code, our first response is to rightly want to kick our own ass. The whole idea of programming is to reduce duplication as much as possible, which is why we have variables and functions, and why some languages have macros. "Don't repeat yourself" is a principle most of us work by. I'm not sure if I live by it...eating ice cream twice in the same day isn't so bad. Anyway...

A solution?

As you can see, this is a pretty big problem. Instead of improving on what already exists, we as programmers are spending exorbitant amounts of time, money, and effort duplicating libraries and features found in other programming languages. There's got to be a better way of doing this.

Parrot has long been touted as the solution. Essentially it is a language-neutral VM which should make it easy to run all of these languages on the same platform, with the ability to easily use libraries between them. So there is a light at the end of the tunnel! Well, that's what Parrot will be, eventually. Parrot has been a work in progress since 2001, and it isn't really clear whether it will be ready in the next six months or the next five years. In addition, its focus is dynamic languages, so it isn't clear how well static languages will fare on the VM. With Parrot, much like Perl 6, there's an air of uncertainty, and even when it hits its first 'production' release, there is no telling how fast uptake will be or what issues will be faced. When mentioning Parrot, it is worth noting that both PyPy and Rubinius have some similar potential but slightly different goals, along with that same dash of uncertainty.

But of course there couldn't be an existing software project without Microsoft having something similar. Microsoft's .NET project and its related DLR (Dynamic Language Runtime) also aim to make it easy for different languages to work together. Not only does C# run on the .NET runtime, but there is also the OCaml-inspired F#, IronPython, IronRuby, IronLisp, JavaScript, Boo, and several other languages on the platform. Perhaps most importantly, .NET works now, and it does have major backing. .NET isn't restricted to Windows, either. The Mono project is an open-source implementation of .NET that runs on Linux.
