The Problems of Perl: The Future of Bugzilla
Once upon a time, Bugzilla was an internal application at Netscape, written in Tcl. When it was open-sourced in 1998, Terry (the original programmer) decided to re-write Bugzilla in Perl. My understanding is that he chose Perl because a lot of system administrators know it, which would make it easier to get contributors.
In 1998, there were few advanced, object-oriented web scripting languages. In fact, Perl was pretty much it. PHP was at version 3.0, Python was at version 1.5, Java was just starting to become well-known, Ruby was almost unheard of, and some people were still writing their CGI scripts in C or C++.
Perl has many great features, most of all the number of libraries available and the extreme flexibility of the language.
However, Perl would not be my first choice for writing or maintaining a large project, such as Bugzilla. The same flexibility that makes Perl so powerful makes it very difficult to enforce code quality standards or to implement modern object-oriented designs. Here are the problems:
- Reviewing Perl code takes much longer than reviewing other languages. Here's why:
- There are many ways to do the same thing in Perl. It's even the motto: "There's More Than One Way To Do It." However, this means that each reviewer must either enforce very strict code guidelines on each coder or learn each coder's coding style. In the interest of consistency, we usually pick the former. This takes more reviewer time. It's also frustrating for contributors who are used to writing in a particular fashion.
- More ways to write the same thing means there are many more bad ways to write code in Perl than there are in other languages. In any language it's possible to do stupid things and not realize it. In Perl it's easy. And even when the code does what you mean it to, the fact that it works doesn't mean it's good. My experience is that Perl encourages people to write code that "just works," but might not be architected appropriately. Once again, this is possible in any language, but I think Perl makes it easier than other languages to write bad code.
- It's very easy to make Perl code hard to read. It uses strange variables like $_. It relies heavily on complex regular expressions. I don't think anybody would argue that Perl encourages readability.
- Perl lacks many of the features that implement what computer scientists call "design by contract." That is, Perl doesn't enforce things. For example, Perl doesn't check the type of arguments to subroutines. You can't make subroutines private in a class. Programmers have to actually read the documentation to know that a function is really "private" or "protected." Perl doesn't have real assertions (like the "assert" statement in C, Python, Java, and various other languages).
Perl's lack of enforcement is a nice feature for the casual programmer, but for the design of large applications, you want the programming language itself to do as much error-checking for you as possible, so that you don't have to write the error-checking yourself.
- Perl lacks a real exception mechanism. We would have to write our own if we want one.
- Under mod_perl, because of the design of Perl, Apache processes grow HUGE in size. We kill them if they get up to 70MB, but even 40MB for a single Apache process is too big. The fact that Perl never releases memory back to the kernel is a problem.
- Without some experience, it can be difficult to read Perl's compiler error messages and actually then determine what's wrong.
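To make the points about assertions and exceptions concrete, here is a minimal sketch in Python, one of the languages mentioned above. It only shows what a built-in `assert` and a built-in exception mechanism look like. The names `BugNotFound`, `BUGS`, `get_summary`, and `describe` are made up for illustration and are not taken from Bugzilla. Note that Python does not check argument types on its own either; here the `assert` is what does the checking.

```python
class BugNotFound(Exception):
    """Hypothetical error type for this sketch; not part of Bugzilla."""

    def __init__(self, bug_id):
        super().__init__(f"bug {bug_id} does not exist")
        self.bug_id = bug_id


# Hypothetical in-memory stand-in for a bug database.
BUGS = {1: "Crash on startup"}


def get_summary(bug_id):
    # Precondition stated in code rather than only in documentation.
    # (Assertions are skipped if Python runs with the -O flag.)
    assert isinstance(bug_id, int) and bug_id > 0, "bug_id must be a positive integer"
    try:
        return BUGS[bug_id]
    except KeyError:
        # Turn a low-level lookup failure into a domain-specific exception.
        raise BugNotFound(bug_id) from None


def describe(bug_id):
    # The caller handles the failure with the language's own try/except.
    try:
        return get_summary(bug_id)
    except BugNotFound as err:
        return f"error: {err}"
```

Calling `describe(1)` returns the stored summary, `describe(2)` returns an error string built from the exception, and `get_summary("1")` fails the assertion immediately instead of quietly carrying on with the wrong type.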
Since 1998 there have been many advances in programming languages. PHP has decent object-oriented features, Python has many libraries and excellent syntax, Java has matured a lot, and Ruby is coming up in the world quickly.
Nowadays, almost all of our competitors have one advantage: they are not written in Perl. They can actually develop features more quickly than we can, not because of the number of contributors they have, but because the language they're using allows it. There are at least two bug-trackers that I can think of off the top of my head that didn't even exist in 1998 and were developed rapidly up to a point where they could compete with Bugzilla.
However, honestly, I like Bugzilla better than all of our competitors. We have almost 10 years of experience writing a bug tracker. We know what people need and want from bug-tracking software.
But still, any of you long-term contributors to Bugzilla who also have experience in other languages, ask yourself this question: "In all the time I've spent working on Bugzilla in Perl, how far could I have gotten on writing another bug tracker, from SCRATCH, in another language?" My personal estimate is that I could have entirely re-written Bugzilla in Python or Ruby in half the time I've been working on it in Perl. (That would be re-writing it in a year and a half, not an unreasonable amount of time for 80,000 lines of code or so.)
Nowadays, even the virtue of "lots of system administrators speak Perl" is fading. New admins are more likely to know Python than Perl. And in about two years from now, I'll bet people will be just as likely to know Ruby. Perl will continue to fade in popularity, I suspect. Already there's no doubt that far more people know PHP than know Perl.
So the popularity argument is dead.
One advantage that Perl has is CPAN. There are a lot of libraries available. But then again, that's also a problem that Perl has--people need to install SO MANY modules just to use Bugzilla. Witness all the protesting there is from our users whenever we add new required modules to Bugzilla, and the support questions we get about problems with CPAN.
And even that advantage is fading. There are a lot of Python modules available now. Java has Jakarta and a lot of other modules. And Ruby has RubyGems, which are even easier to install than CPAN modules. PHP has PEAR, which is also very nice.
In 1998, Perl was the right choice for a language to re-write Bugzilla in. In 2007, though, having worked with Perl extensively for years on the Bugzilla project, I'd say the language itself is our greatest hindrance.
But what can we do about it? ohloh.net says that we have 43,762 lines of code in Bugzilla, and I think we might even have twice that many, if you count templates. Not to mention POD.
I think that the experience of Netscape and the Mozilla Project shows us that re-writing Bugzilla from scratch and totally ignoring our old code base would be a bad idea. If we stopped development for a year and a half, we'd be hopelessly behind and our users would start to abandon us in droves.
As far as I can see, if we want to move away from Perl and move to a language that will be better for us as time goes on, we have two choices:
- Figure out a way to re-write *parts* of Bugzilla in another language without affecting performance or greatly adding to the complexity of installation. We could then incrementally move to another language.
- Work on both projects at once--a small team working on the re-write in another language, and the same team we have now working on the current Perl version, up until version 3.4 or 3.6.
If the first option is possible, I'd obviously prefer that. However, if it would be extremely difficult or somehow bad for the project overall, then we could go with the second.
No matter which way we go, these are the steps we'd have to go through:
- Make a list of every single feature of Bugzilla that would have to be re-written in a new version.
- Using this list, decide on a language to use that would be the easiest and best for implementing all of those features. We could also decide if we want to use a web framework like Rails (Ruby) or Pylons (Python) to eliminate us having to write some code. After all, the less code we have to write, the better.
I've already been experimenting with various languages, and I've started a page that compares their advantages and disadvantages, from Bugzilla's perspective.
- Prototype some basic features of Bugzilla in that language, to see how easy it actually is.
- Prioritize the feature list of Bugzilla, to figure out what we have to re-implement in what order.
- Do some design of the system so that it makes sense and is coherent when it's done. We don't have to re-design Bugzilla at this point--we could get stuck in that forever. And we shouldn't design it by committee. One or two people should work on the design, and then present it to others for review.
However, no matter what the design is, it's important to maintain feature and API parity with the current Perl Bugzilla--otherwise it will be very hard to get users and extensions to upgrade.
- Start work, based on the design and the priority list.
Without taking some action, I'm not sure how many more years Bugzilla can stay alive as a product. Currently, our popularity is actually *increasing*, as far as I can see. So we shouldn't abandon what we're doing now. But I'm seeing more and more products come into the bug-tracking arena, and I'm not sure that we can stay competitive for more than a few more years if we are stuck with Perl.
PHP in the Wiki?
Re: PHP in the Wiki?
Btw, there is no real difference between Java and C#. They basically do the same thing and in the same way. And both are really good for web projects that need to burn money.
I prefer Ruby as a language over Python. But Ruby as a platform (in terms of libraries and performance) might not actually win. I've already started looking into the possibility of using Pylons and SQLAlchemy in Python.
-Max
(Anonymous)
On Python cons
Re: On Python cons
-Max
(Anonymous)
multiple languages; performance
I also don't think that bugzilla needs to be especially fast; certainly no faster than rails or pylons can be. If it were made to be more cache-friendly, it could use an explicit invalidation model (like mediawiki does, and like we're going to be doing for AMO in the near future) and performance would then be gated largely by rate-of-change rather than rate-of-use. Also, making it cluster-friendly would likely be relatively straightforward, which gives another powerful scaling tool.
- shaver
Re: multiple languages; performance
Bugzilla's concern is mostly page-load time, and also being fast enough to have an AJAX interface in the future.
Being more cache-friendly might help. But the rate-of-change is actually pretty high.
Bugzilla actually already is cluster-friendly, pretty much. :-) The features are just undocumented.
-Max
Perl vs perl-like?
Python, Ruby, etc. have fairly similar characteristics. Their runtimes are similarly bad. PHP should not even be considered.
Java/C# will kill you with their VM footprint and not provide any real advantages. A somewhat better type system is nice, but the giant runtime cancels that out pretty quickly.
I'd suggest checking out something a bit more out of the usual paradigm. Specifically, look at some functional languages. With OCaml, Erlang (or maybe Scheme) you'll get awesome performance and concise code. The downside of most functional languages is that they usually don't have a good web framework. Erlang is the sole exception. It scales well to SMP or clusters, has excellent web serving support with yaws, and is probably going to be the next popular language after Ruby. Give Erlang a spin, it may just be the best toy you've ever seen.
As for rewrites, I like option 1 of focusing on one part at a time.
Re: Perl vs perl-like?
I can take a look at Erlang, but the current popularity of languages is a big factor to me, when looking at how many contributors we can get, and I've only heard much about Erlang from computer scientists and people who really enjoy looking into new languages. I don't doubt it's awesome, but I don't think it will be popular enough, fast enough, for our needs.
-Max
Use PerlTidy and possibly recent developments like Perl::Critic to enforce consistency.
Messy Java looks clean. Messy Perl looks messy.
That and @_ are pretty much the only two “strange” variables you need with any regularity, and $_ is easy to avoid in most cases.
No. It just makes regular expressions easy to use and certainly makes them easier to read than in other languages. Have you seen what regexen look like in PHP or Java?
Neither does Ruby, Python, Javascript or PHP.
Sure can, although no one knows how.
5.10 will. Coming soon.
Anyone can make claims like that. “Real” in what sense?
That has no longer been the case in years.
I would say their advantage is they are not Bugzilla. Bugzilla competes with TWiki for the honour of worst big installable Perl app codebase. My suggestion for moving forward would be to refactor the architecture to something actually sane and to port it to a modern framework such as Catalyst. This is pretty much the same as you would do while porting to another language. Think of it as porting from 1995-style Perl to 2005-style Perl.
(Anonymous)
bad perl?
You apparently haven't actually looked at Bugzilla code in the last couple years. It's had an almost complete rewrite over the last 2 years, with exactly that goal.
(Anonymous)
Framework?
I don't know much about the features required by bugzilla, but from experience I've found features can be rolled out extremely quickly when using a good framework.
Re: Framework?
-Max
if a bugzilla hacker has not thought 'I could rewrite this'...
(Unfortunately, unless you can get a real rails performance guru on board, I'm not sure rails is the right choice from a deployment perspective. I can say with great confidence that from a development perspective it is heavenly.)
"If we stopped development for a year and a half, we'd be hopelessly behind and our users would start to abandon us in droves."
I'm not sure that is really the case. Because virtually every bugzilla installation is heavily customized, people have very different expectations about the bugzilla release/upgrade cycle than they do for consumer-facing, uncustomized software. Additionally, let's be frank: bugzilla has not exactly been blazing away with new features (in part because of the Perl problem). Yes, it would suck to stop having new features altogether, but it isn't like bugzilla is currently doing tons of new features and releasing quickly anyway. (Happy first birthday, 2.22. ;)
"Make a list of every single feature of Bugzilla that would have to be re-written in a new version."
I'd suggest that a rewrite (like the mozilla->firefox transition) might be defined as much by what features are taken out as by what features are kept. Inevitably, this will piss off some users, but the opportunity to come out with a light, clean restart would do bugzilla a great service.
Re: if a bugzilla hacker has not thought 'I could rewrite this'...
And yeah, I've heard the same thing over and over about Ruby performance--it's really bad. Which is terrible, because it's a beautiful language.
Many Bugzilla installations are heavily customized, but I'd say that 50-60% of them aren't. (I work as a Bugzilla consultant and I've worked with a lot of installations at a lot of places.)
As far as new features, actually, development-wise, we are blazing. :-) We're just not too hot on the bug-fix and release cycles. 3.0 has been essentially done for a long time, and it's coming out soon, but it took a long time to get through those cycles. (Mostly from lack of available reviewers and not as much motivation to fix bugs as there is to code features.)
Still, I'd be extremely wary of stopping development for that long. It really would hurt us, I'm sure.
And yeah, we could consider removing features, but I don't think we would do that. Well, except maybe we'd leave out New Charts, or something like that--something that isn't finished or has impossibly complex but useless code.
-Max
(Anonymous)
path not taken
- shaver
Re: path not taken
-Max
Consider CakePHP
(Anonymous)
Design by Contract in Perl
http://en.wikipedia.org/wiki/Design_by_contract#Languages_with_third-party_support
Re: Design by Contract in Perl
1) It's another module to install, and that's more pain for users and also puts us at the mercy of a module developer for *all* our code.
2) They're probably source filters, which are terrible things and most experienced Perl developers say to stay away from them.
-Max
(Anonymous)
The advantage isn't that they are not written in Perl. The advantage is that they weren't written in 1998. Adding new features to an almost 10 year old code base is much harder than adding them to something that's just been written. Please don't contribute to the Perl-makes-bad-programmers garbage (and you admit Terry was targeting sysadmins, not programmers, anyway) that's wallowing around the internet just because some code written a decade ago is now showing its age.
I'm not saying you should definitely pick Perl. I enjoy Perl (sometimes despite its syntax) and others don't. Pick what meets your requirements, but please don't insult a language for something a group of people did with it 10 years ago.
However, Bugzilla is very far from the code that it was 10 years ago. It's been heavily re-architected, in an effort led largely by me and greatly assisted by LpSolit. I know the difference between writing good Perl code for a large application, and writing good code for a large application in another scripting language.
The mindset of Perl developers is usually flexibility over simplicity. However, for me, in a situation where we have very limited resources as far as code review goes, and also no funding whatsoever, the strictness of the language itself, and the design of the language and its libraries, matters to me.
-Max
(Anonymous)
Groovy
I would stick with Perl though.
Re: Groovy
We're unlikely to go with Java, in any case, though.
-Max
A Bad Idea
chromatic points to some problems with your reasoning here. Let me point to some more:
Regards, Shlomi Fish.
Re: A Bad Idea
I wasn't really planning on moving to Java. It's just an option.
I agree with you about the Mozilla point. In fact, I made it in the article above, if you noticed.
-Max