There's been a lot of discussion in the Ruby community about the state of VMs. Everyone knows that the current Ruby interpreter has its flaws, so there has been active development on three different projects: YARV, Rubinius, and JRuby.
YARV is the most vanilla of the projects, although it is the one that most people will end up using. It is the official VM, it is C-based, and you can pretty much expect out of it what you might expect out of the Python or Perl VMs. JRuby is interesting because it rectifies many of the warts of Ruby. You get Unicode support, it compiles Ruby code down to Java bytecode (and therefore you get speed), you can use Java libraries from your Ruby code, you can drop down into Java when performance is really needed, and, perhaps most importantly, it should hit a 1.0 release very soon.
Rubinius is perhaps the most interesting and ambitious of the projects. It follows the Smalltalk-ish idea of writing its VM in itself. So basically you are talking about Ruby running on Ruby. Evan Phoenix, the head of the project, cites some advantages of this in his blog:
There are a lot of different ways to answer this question. So I’ll say this: rubinius aims to be simple at the core so it can be a lot of things to a lot of people. The longer way of saying that is that firstly, rubinius can be customized to make your life easier. While we know ruby is already highly customizable, rubinius aims to go even beyond that.
Secondly, rubinius will fit into places the current interpreter won’t. One good example is that rubinius is all driven by bytecode and will even have hooks to further help companies protect their ruby intellectual property. To companies, that means ruby is much more enticing to use. And the more companies that use ruby, the more people, and the community grows. But don’t misread that and think that I’ve only developed rubinius to make companies happy. That’s just an example of where the “simple core, simple architecture” paradigm allows the whole project to remain agile and customizable by everyone that wants to use it.
Similar ideas led to the creation of the PyPy project for Python. The thought was that by implementing Python in Python, you would get the benefit of a simple and flexible VM that could be easily modified by anyone because instead of being in C, it would be in Python. It was also hoped that through processes like JIT-compilation, PyPy could be as fast as or even faster than CPython. As their homepage still says: "Rumors have it that the secret goal is being faster-than-C which is nonsense, isn't it?".
Well, PyPy is getting nearer and nearer to its 1.0 release, so we can now take stock of the project and see how well its goals have been accomplished so far. Recently there have been several links on programming.reddit.com with discussion on PyPy.
Some quotes worth highlighting:
So far the PyPy code is not even conceptually simpler than the CPython code, which is overall quite nice and structured, though written in C and cluttered with INCREF and DECREF ( the only annoyance I have it, btw ). We all hope this changes in the future of the PyPy project but I admit I'm somewhat disappointed. This project had more credit than any other newly started OSS project in the world. It was even funded by the EU. The result is a very large amount of somewhat confusing [1], slow Python code that gives hope.
And from another user:
The last time I took a serious look at it, about a year ago, the code base was already insanely complex. Certainly it was much more complex than CPython ever was. Of course, it does a lot more than CPython in a number of ways, but the argument about it being inherently easier to modify for its user community, because it's coded in Python rather than C--well, I don't think that holds any water. If PyPy was closer to being a re-implementation of CPython in Python that would be likely be the case; the reality is that CPython is built on technology that has been well understood for the last 30-40 years, whereas some parts of PyPy are more like rocket science in comparison.
Here are also some interesting benchmarks.
What does this all mean? Well, it shows that PyPy still isn't as fast as CPython, but it is inching closer and closer. Keep in mind, though, that optimization becomes more and more difficult over time. It also shows that the PyPy codebase didn't really end up being a lot simpler, as was originally hoped. Now, as one of the quotes points out, PyPy isn't just a straight port of CPython, and it does quite a bit more, but it still likely won't be any easier for a Python hacker to get in there and modify the VM than it would be for someone to get in and mess with the C-based VM.
Why did I go off discussing PyPy when this post initially had to do with Ruby VMs? There are a few things you must realize when you look at PyPy.
1) They had time
2) They had backing
3) They had experience
The PyPy project started in 2003, and in December 2004 they got adopted as an EU project to receive over 2 million euros over the next 2 years. They had 5 people working full time on the project during those two years, as well as others working part time, and had some travel expenses paid for. In addition, the people working on the project really had experience. Armin Rigo had previously worked on the Psyco JIT implementation for Python.
This has to be one of the best-funded and best-supported open source projects in history. And yet despite this, after 2 years of funding and work, when they hit 1.0 they will not have a platform that is either faster or simpler than CPython.
What does this mean for Rubinius? It would appear to mean that Rubinius has a huge uphill battle ahead of it, as it is being worked on by a much smaller team, with less experience in things such as JIT, with a small fraction of the funding, and in a language that is somewhat more dynamic than Python. I really do wish all the best to the Rubinius team, but it is an ambitious project and it could be a long time before we ever see something you would run production code on.
Note: Please feel free to correct me if I am wrong in any of my statements. I don't mean to take any credit away from anything that has been accomplished by either of these projects, I just felt like making some observations. I really would like to hear input from anyone involved with either Rubinius or PyPy. And I'm looking forward to the 1.0 release of PyPy later this month. :)
last updated 2 years ago
Sure, I'll correct you. Isn't that why you have comments enabled? :-p
I know nothing about PyPy but I dived a little into Rubinius. For myself, I just love it and I'm happy it exists. I don't think that you can judge one project by the "failure" of another or by its amount of funding. People at rubinius are really nice and have a good dynamic, and I think that is a good factor for success.
There is a lot of work to do, but even if the project is abandoned in the end, lots of people will have reviewed Ruby (this is what you do on a rewrite). The contributed unit tests are for the most part also useful to the other projects, and participants have lots of fun.
Your "griefs" are maybe justified, but just come and give some help, I think that you'll enjoy it :)
zimbatm 2 years ago #