To The Side

My Latest Side-Project

So, I haven't said much about it to anybody, but for a few weeks I've been working on re-writing AWStats. It's a fairly popular program that generates lots of different charts and graphs from your web server's logs, so you can see how many hits, visits, etc. you're getting.

However, the code is pretty messy, and it's not maintained very much by its current author. I emailed him, and he said that I should go ahead and make my changes, and that either he'll incorporate my patches into AWStats or I can just call it something different and ship it myself.

My first focus has been to take all of the HTML output and put it into templates using the Template Toolkit, which leads to a lot of code cleanup. That's what I'm still working on now, and I've made a lot of progress on it. It's probably also led to security improvements, since I've been correctly filtering values before they go into the HTML.
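To illustrate why filtering in templates matters, here is a minimal sketch. It is written in Python rather than the Perl and Template Toolkit code the project actually uses, and the template and the `render_row` name are made up for the example. The idea is only that every value gets HTML-escaped before it lands in the page, so a hostile string from a log (a referrer, say) shows up as text instead of markup.

```python
from html import escape
from string import Template

# Hypothetical template for one row of a stats table (not from AWStats).
ROW_TEMPLATE = Template("<tr><td>$referrer</td><td>$hits</td></tr>")


def render_row(referrer, hits):
    """Render one table row, escaping every value before substitution."""
    return ROW_TEMPLATE.substitute(
        referrer=escape(referrer),  # turns <, >, &, and quotes into entities
        hits=escape(str(hits)),
    )


if __name__ == "__main__":
    # A referrer taken from a log line could contain markup like this.
    print(render_row("<script>alert(1)</script>", 3))
```

With the escaping in one place (the template layer), it's much harder to forget than when the HTML is printed from all over the code.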

I've also started to make the code use CGI.pm instead of manually sending HTTP headers and manually parsing the URL query string.
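As a rough illustration of what a library buys you over hand-parsing, here is a small Python sketch (not the Perl/CGI.pm code itself). The `get_param` helper and the sample query string are assumptions made for the example. The library call takes care of splitting on `&` and `=` and of decoding percent-escapes, which is the part that hand-written parsers tend to get wrong.

```python
from urllib.parse import parse_qs


def get_param(query, name, default=""):
    """Return the first value of a query-string parameter, or a default."""
    # parse_qs splits the string and decodes %XX escapes and '+' for us.
    values = parse_qs(query, keep_blank_values=True).get(name)
    return values[0] if values else default


if __name__ == "__main__":
    # Illustrative query string only.
    query = "config=my%20site&output=main"
    print(get_param(query, "config"))
    print(get_param(query, "lang", default="en"))
```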

So far (I think I've worked about six or seven days cumulatively on it) the changes I've made can be represented as a patch of about 4700 lines and 203 KB. For anybody who doesn't know these things, that's a huge amount of development. And that's just how big it all is as one patch--if we counted each change I made separately, it'd probably be closer to 6000 lines.

I think this will be a pretty good thing to do, because the current status of all open-source web-log analyzers is:

Webalizer: Hasn't been updated in years.
Analog: Hasn't been updated in years.
AWStats: Maintained only barely, probably full of security holes, very messy code.

And that's it. So since I know Perl pretty well (and AWStats is in Perl), I figured it would be a good thing to re-write and get out there.


Good project!

I used Analog a while back, and liked it quite a bit even though the learning curve is steep.

Is there a way to download everything you've done, or is it not publish-worthy yet?
Well, theoretically you can check it out of my bzr tree like:

bzr co bzr://bzr.everythingsolved.com/awstats/mkanat

I have no idea if it works on anybody's machine other than mine, yet, though. Right now as I refactor I'm aiming for exact feature parity with the original, so the user-visible side of things won't be any different. This is because the only way to test my refactoring is to make sure that my new code outputs the same thing the old code did.
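The "same output as before" check can be sketched roughly like this. It's in Python for illustration, not the actual test setup used for this refactoring, and the `output_diff` name and the sample strings are made up. The idea is to capture the page the original code produces, capture the page the refactored code produces for the same input, and treat any difference as a regression.

```python
import difflib


def output_diff(old_html, new_html):
    """Return unified-diff lines between two outputs; empty list if identical."""
    return list(
        difflib.unified_diff(
            old_html.splitlines(),
            new_html.splitlines(),
            fromfile="original",
            tofile="refactored",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Stand-ins for the pages produced by the old and the new code.
    old = "<table>\n<tr><td>hits</td></tr>\n</table>"
    new = "<table>\n<tr><td>visits</td></tr>\n</table>"
    for line in output_diff(old, new):
        print(line)
```

An empty diff means the refactoring didn't change anything a user could see, which is exactly the feature parity being aimed for here.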

Eventually, once the internal refactoring is done, I'll be making some improvements, starting with making it output real XHTML and giving it real CSS classes and styling it correctly.

Even when I'm done, I want it to be fully compatible with the old config file format and AWStats's plugin API, so that people can directly migrate to it without any trouble. I'll probably even leave the old CSS classes around, so that people's old stylesheets will keep working.

I may end up thanking you for this later. As I read this I happen to have already had AWStats (and webalizer) open in another window as I take a little look at ye ol weblogs and tune my site and whatnot. I can definitely see what you mean about these things looking like they haven't been updated much in a long time even without seeing the source code. I hope to see some really awesome things come out of this...like say, it working really well and my webhost upgrading their stats package ;-)
Yeah, and you should see the source code. I mean, I admire the guy who wrote it for making a nice piece of software, but it's all basically one 10,000 line file--and a lot of those lines are longer than 100 characters.

But yeah, I hope to see good things come out of it too! :-) It'll be the only modern open-source web stats package, so I'm hoping to see some good adoption.

That would be totally awesome!

I'm still holding out hope for GOOG releasing Urchin for free, but anything has to be better than the current crop of freely available webstat packages.
Ha, yeah. :-)

I doubt Google will release Urchin for free. They already have the free hosted Google Analytics, and I think that's the whole deal. I didn't want to use that, though, because I wanted something that could actually parse my logs, not some JS that I'd have to put on every page.