The joy of using Python and the perils of programming Ruby
After installing Planet Venus on a Mac OS X machine following a brief tutorial, it occurred to me that in the past months I have been using a bunch of useful Python applications, such as ViewVC, Trac, Bazaar and Venus itself; I also started to collect information about Django by reading tutorials and skimming through presentations. Since a considerable part of the infrastructure I’m using day in and day out is founded on Python, I thought I could revamp my Python skills, which had surely grown rusty during the last two years of sparse Ruby programming.
The occasion came with the release of Python 2.5, and a script to write for a friend that had to manipulate information in a bunch of MP3 files taking data from a CSV file. The MSI wizard asked if I wanted to install Python for a single user or for every user on my Windows 2000 machine (good, since installers for other scripting languages still have to catch up on this), did not require a restart (good), but when I logged in as another user I was unable to simply run Python programs from the command line (bad, but at least not as bad as having to scour the web for an initialization script to have irb let you type square brackets and curly braces on a European keyboard, as happened for a certain period with Ruby’s installer): either I got an error saying that some parts of the script application were not found, yadda yadda yadda, or python was not recognized as an executable command. So I had to manually adjust the PATH, then I spent a little more than an hour whipping up the first version of the script. The day after, the little Rubyist in me got curious about how to perform the same task in Ruby, and the train of thought usually provoked by such a comparison started.
I discovered that my brain is currently wired on objects: I seem to instinctively search for classes representing a part of the application domain, then to inspect them for methods performing some key operations, instead of looking for functions that use raw data. In the Python csv library, I spent a considerable amount of time staring at DictReader trying to understand its role, before scrolling down the documentation and finding the reader function. Besides, I also discovered that I expect not to have to compose entities or functions at the user API level: I ask things to work just with names, and to hide anything else under the hood. I fiddled with reader for quite some time, passing the CSV file name and wondering why only 11 of the 7159 rows got recognized, before looking more carefully at the documentation and noticing that the function needed an iterable object: strings are iterable entities, but file objects are as well, and one of those was the right thing to pass. Unfortunately, the PyDoc documentation for the csv module didn’t provide a code example I could copy and paste and quickly modify to match my needs. (Ah, the familiar sound of postmodern programming and its patterns!) Interestingly, the Ruby documentation provides a simple example of its csv module usage, albeit only online and not included in the distribution; and that example matches my expectation of passing just the filename to the function used to open a CSV file; then it follows the usual pattern of exploiting a block to perform operations on each row and to ensure that the file gets closed afterwards. You may indeed wonder why I shifted away from Ruby and performed the task in Python instead.
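For readers who want to see the csv.reader pitfall in isolation, here is a minimal sketch in present-day Python 3 (not the 2.5 I was using). The file name library.csv and the three sample rows are made up for illustration; I picked an 11-character name only because that would produce a count like the one I saw, and I am assuming that is what happened.

```python
import csv
import os
import tempfile

FILE_NAME = "library.csv"  # hypothetical name, 11 characters long

# Pitfall: a str is iterable too, so csv.reader treats every character
# of the file name as a separate "line" and yields one row per character.
rows_from_name = list(csv.reader(FILE_NAME))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, FILE_NAME)

    # Write a tiny sample file so the sketch is self-contained.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "artist", "title"])
        writer.writerow(["01.mp3", "Some Artist", "Some Title"])
        writer.writerow(["02.mp3", "Another Artist", "Another Title"])

    # Intended usage: pass the open file object, which iterates over lines.
    with open(path, newline="") as f:
        rows_from_file = list(csv.reader(f))

print(len(rows_from_name))  # one row per character of the name
print(len(rows_from_file))  # one row per line of the file
```

The with block plays roughly the same role as the Ruby block idiom: the file is closed when the block ends, whether or not the loop completes.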
Apart from the sheer curiosity of learning again a tool I was de facto already using extensively, I was also moved by the perception that the universe of Python libraries was more populated than the corresponding Ruby one, and that I could find scripts to do virtually everything. Later, I discovered how mythical that assumption was: while browsing the web for a library to manipulate ID3 tags (a feature that eventually did not make it into my script), the best pure Python option I found was limited to version 1 tags, while Ruby ID3 libraries seemed to be much more comprehensive. It really depends on the application domain, since if you happen to work with XML or feeds, the best tools still belong to the Python side. So it wasn’t an issue of availability but rather one of quality. Take the two csv modules, for example. I tried to perform a simple loop parsing my file and printing the total row count at the end. Measuring performance by means of the time command in the Cygwin bash shell, the Python version carried out its duty in slightly more than half a second, while the Ruby version took 2′23″. Of course I don’t want to draw any conclusion from this little experiment, but on my machine (a Pentium III 800MHz with 256MB of RAM) a huge performance penalty was paid either by the Ruby CSV library, or by the Ruby interpreter, or even by both. Noticing the huge difference, I couldn’t help asking myself how many Ruby applications were part of my day to day programming activity, while also being targeted at a broader public than just Ruby programmers, so as to make a fair comparison with the Python applications I listed at the beginning. The result was: zero; and I also noticed I don’t know of a popular Ruby application as widely deployed as, for example, MoinMoin.
If the Ruby environment is not mature enough to produce that kind of application, and is not efficient enough to quickly carry out a task as simple as looping over a CSV file, it seems that the only appealing things are the large number of programmer tools available (think RubyGems, Rake, Watir and others) and the supposedly more elegant and readable syntax.
However, you may have noticed that I didn’t mention syntax amongst the reasons for turning to Python. That’s because if you have s = 'HELLO WORLD' you can write ' '.join(map(capitalize, s.split())) in Python or s.split.collect { |w| w.capitalize }.join(' ') in Ruby, and the relative elegance of the two solutions is still, to a certain degree, a subjective matter, depending on personal experience and education. It’s just quite fun to spot oddities and contradictions here and there. On the one hand, in the Ruby camp, which makes readability one of its flagships: how writing s =~ /\/$/ to check that a string ends with a slash could be more readable than Python’s s.endswith('/') equivalent is really beyond me, and why a simple ends_with method hasn’t yet been included in the standard library, given that Ruby classes are full of applications of the TIMTOWTDI principle, puzzles me as well. On the other hand, in how function-oriented languages like Python grow: to check a directory for existence, rename and move a file, I had to import functions from three different modules. Syntax probably matters, but since it is still partly an affair of personal taste, it can hardly be the sole metric by which to judge and choose a scripting language.
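To make the Python side of these comparisons concrete, here is a small sketch in present-day Python 3. Two things in it are assumptions rather than what my original script contained: str.capitalize stands in for the capitalize function that Python 2.5 offered in the string module, and os.path, os and shutil are my guess at a typical trio of modules for the existence check, the rename and the move. The archive helper and all file names are hypothetical.

```python
import os
import shutil
import tempfile

s = "HELLO WORLD"

# Python 3 spelling; in Python 2.5 capitalize came from the string module.
capitalized = " ".join(map(str.capitalize, s.split()))

# The readable counterpart of Ruby's s =~ /\/$/
ends_with_slash = "some/dir/".endswith("/")


def archive(src_path, dest_dir, new_name):
    """Rename src_path to new_name, then move it into dest_dir.

    Hypothetical helper: returns the final path, or None when dest_dir
    does not exist.
    """
    if not os.path.isdir(dest_dir):            # existence check: os.path
        return None
    renamed = os.path.join(os.path.dirname(src_path), new_name)
    os.rename(src_path, renamed)               # rename: os
    return shutil.move(renamed, dest_dir)      # move: shutil


with tempfile.TemporaryDirectory() as root:
    dest = os.path.join(root, "done")
    os.mkdir(dest)
    src = os.path.join(root, "track.mp3")
    open(src, "w").close()                     # empty placeholder file

    archive(src, dest, "renamed.mp3")
    moved_ok = os.path.isfile(os.path.join(dest, "renamed.mp3"))

print(capitalized)
print(ends_with_slash, moved_ok)
```

Three imports for three closely related operations is exactly the kind of growth by accretion I mean; whether that is worse than hanging everything off one class is, again, partly a matter of taste.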
What instead was important enough to make me willing to switch language, at least for a small trial, was the different attitude of the communities around Ruby and Python; or, to put it better, the different attitude of the communities as perceived from the outside; that is, the attitude of the most prominent or visible members of those communities. There are people in the Ruby community that have made a base principle of the idea of being disliked, the so-called opinionated software, of which Rails is one of the most famous examples; and this I can understand. But what bothers me is the fact that those people seem to actually enjoy being disliked and breaking their world into opposite extremes: they seem to triumphantly play drums and trumpets instead of considering it evil; necessary, perhaps, but evil nonetheless. Someone can have fun at that; I did, too. But after the first few moments I felt the underlying arrogance, and it wasn’t funny anymore: just a way to tell others to fuck off and demand their apologies afterwards. Please compare those attitudes with the ones of Python community members that joined the discussions: willing to help, and speaking common sense. Besides, the Rails and Ruby breakthrough offered space for a new publishing business, an effort led by another important member of the community. I bought no less than four books: sometimes finding knowledge which didn’t have the high quality I expected, at least when compared with similar documents covering the same topics but freely available on the web; sometimes realizing that they were the only source of easily available information for certain subjects. Of course I don’t mind people making money and creating innovation by collecting and organizing knowledge on open source projects, but I like to make comparisons, and to try to understand consequences.
As such, I notice that the Rails online documentation is not as up to date and comprehensive as it could be; and that while the official Python distribution comes with updated and reasoned documentation including a nice tutorial, Ruby ships with an outdated “pragmatic” book covering an old version of the language nobody uses anymore, and just a bunch of links. Quite frankly, I prefer to live in worlds founded on different principles.

Ouch. I’d expect these posts from me, not from you
Well, so you don’t even have to write them, because I’ll do it for you!
[...] Do you remember when, at the end of this post, I tried to make a comparison between visible attitudes of the Python and the Ruby community? And as a key example I quoted Dive Into Python, available free on the Internet but also published in paper form by Apress? [...]
The Django book « The Long Dark Tea-time of the Blog said this on November 6, 2006 at 11:15 am
Though it has been almost a year and a half since this post, for the sake of future readers, I would now recommend the mutagen ID3 tag library for Python: http://www.sacredchao.net/quodlibet/wiki/Development/Mutagen
Thanks for the pointer, Gabriel!