10000 posts!

Today we received our 10000th post on our Pentaho Data Integration (Kettle) forum. It seems only a short while ago that user “rsheldon” posted the very first post on December 12th 2005.

I can still remember thinking: “I just open sourced years of work and you ask me an Oracle question on legal stuff? WTF??”

Fortunately, other people followed and although at first there were only a few posts, these days the forum is the doorway to a vibrant community that I am proud to be a part of. It seems these days we’re getting hundreds of posts each week and I’m extremely pleased that almost all questions get a reply.

Surely, the insanely quick pace of Kettle development is to blame for the large number of questions, but I for one can only be glad about this. What once was my pet project is now a full community effort. This is what Ohloh has to say about Kettle:

Factoid: “Large, active development team”

Over the past twelve months, 11 developers contributed new code to Kettle.

This is a relatively large team, putting this project among the top 10% of all project teams on Ohloh.

For this measurement, Ohloh considered only recent changes to the code. Over the entire history of the project, 12 developers have contributed.

In fact, more developers have contributed, but changes made more than 4 months ago are not taken into account because we recently migrated to a different source control server.

Until next time,


Kubuntu fun

I have been using (open) SuSE for my Linux “needs” ever since SuSE version 6.x.
For my new laptop (see previous posts on the topic), however, I thought I would give Kubuntu a try. I’ve been reading a lot of positive things about it and the Ubuntu attitude towards Microsoft is a bit clearer than Novell’s.

I have to say that I’m absolutely impressed. The only change I made to the distribution was the installation of the NVidia closed source graphics drivers, but to be honest, the open source drivers worked just the same. So that’s pretty impressive to me.

Now the only thing left that bothered me was the limit on the amount of memory I can use. I packed this machine with 4GB of RAM. Vista can access it, so why can’t I? Well, apparently the Intel Core 2 Duo 7600 is x86-64, so I probably should have picked the x86-64 DVD download from the Kubuntu site. I’m guessing that’s the mistake I made here. I won’t know until I find the time to re-install the machine (not in the next couple of weeks anyway). I’m going to try the “live DVD” one of these days to see if that gives any clues. I’ll keep you up to date in the comments.

As for the real reason why I like this laptop and Linux: the complete re-build of the Kettle project takes less than 10 seconds on the new box. This could take minutes on my old (2GHz Pentium-M) machine. That and “apt-get”. ’Nough said.

Until next time,


Improving performance

Now that the API conversion is done, we can continue to port steps. In the meantime, you guys can enjoy better performance on the steps that were already ported.

Here is an example: generating 10M rows with 2 integers in each:

For now, the GUI hasn’t changed at all, but you can start to help out with testing etc. This image can be used to get an initial view of the performance of the currently ported steps. In any case, the example above runs “somewhat” faster than the old version; try it for yourself.

Until next time,


P.S. A new download for the step plugin source code is available.

Taking me back…

We’re smack in the middle of porting Kettle to a new API, version 3.0.  So this is what it comes down to again…

Yep, you got it right.  Refactoring is at times a very time-consuming thing to do.

It is also something that brings me back to the start of Kettle. You see, I have done this sort of thing a few times before with the Kettle codebase. I can for example remember the time when I changed the Java code from using if/then structures to using Exceptions to make error handling more transparent and to clean up code. There is something about going through hundreds of classes, fixing all occurrences of a certain bad pattern, for days on end … it’s an acquired taste I have to say.
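For readers who haven’t done this kind of clean-up, here is a minimal sketch of the sort of change I mean. It is not actual Kettle code: the class name, the method names and `StepException` are all made up for illustration. The first method reports failure through a boolean that every caller has to check with an if/then; the second one throws an exception instead.

```java
// Illustrative sketch only: none of these names come from the Kettle codebase.
final class ErrorHandlingSketch {

    /** Hypothetical checked exception, invented for this example. */
    static class StepException extends Exception {
        StepException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * "Before" style: success is signalled through the return value and the
     * parsed number is handed back through a one-element array.
     */
    static boolean parseWithFlag(String text, int[] result) {
        if (text == null) {
            return false;
        }
        try {
            result[0] = Integer.parseInt(text.trim());
            return true;
        } catch (NumberFormatException e) {
            return false; // the reason for the failure is lost here
        }
    }

    /**
     * "After" style: the value is returned directly and every failure
     * travels up the call stack as an exception carrying a message.
     */
    static int parseWithException(String text) throws StepException {
        if (text == null) {
            throw new StepException("No value to parse", null);
        }
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new StepException("Not an integer: '" + text + "'", e);
        }
    }

    public static void main(String[] args) {
        // Old style: the caller must remember to test the flag.
        int[] holder = new int[1];
        if (parseWithFlag("12", holder)) {
            System.out.println("flag style parsed " + holder[0]);
        }

        // New style: the happy path reads straight through.
        try {
            System.out.println("exception style parsed " + parseWithException("12"));
            parseWithException("twelve");
        } catch (StepException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The point of the second form is that intermediate callers no longer need their own if/then around each call; they can let the exception pass through to the one place that knows how to report it.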

Anyway, the 3.0 porting of Spoon is well under way and I hope you can see the first results after the weekend.

Until then,



Apple vs Dell

I just ordered a new laptop. The old one, an Acer 8104, is still a nice computer but seriously lacks RAM. It has a single GB of RAM and that really is becoming more and more of a problem. I just arrived at a point where I could no longer cope with the constant opening and closing of applications, the jumping through hoops, the stalls when you run out of memory, etc.

So I have been window shopping for weeks now, and in the end there were 2 computers that had my interest…

The contenders

The first was an Apple MacBook Pro (CPU: 2.33GHz Core 2 Duo, RAM: 3GB 667MHz, HD: 120GB 5400 rpm, Screen: 15.4” @ 1440×900)


Price for this beauty: 3557,99 EURO

The second was a Dell Precision M65 (CPU: 2.33GHz Core 2 Duo, RAM: 4GB 667MHz, HD: 100GB 7200 rpm, Screen: 15.4” @ 1920×1200)


Dell charges 3596,12 EURO for this one and it’s the computer I picked.


So, all in all, the price is about the same. Even though the MBPro looks like a very nice piece of machinery and a lot of my colleagues at Pentaho already have one, I still went with the Dell. Why? Well, the extra GB of RAM will come in handy of course, but the low-resolution screen was also a big turn-off on the Mac. How can you claim any laptop is “Pro” if it doesn’t have at least 1680×1050? My old laptop has that resolution and I don’t want to go back. Anyway, if you think about it, the Dell has close to twice the number of pixels on the screen compared to the Apple laptop (roughly 2.3 million versus 1.3 million). That’s a lot of detail and it’s going to be fun to play around with that. The lack of a camera in the M65 doesn’t bother me in the least. It’s not going to be used as a home computer anyway…


The price on the Dell box included full accidental damage insurance for 3 years. If you’re like me and you actually move the laptop a lot, hauling it around the world on planes, trains and automobiles, to clients, meetings, conferences, presentations, cold and hot places, then that insurance is certainly something nice to have. The AppleCare protection plan details that it doesn’t help you out in case of accidents, wrong use, wrong applications, wrong software, temperature too high, too low,… It’s scary to read that stuff since there is no specification of what “wrong” or “inappropriate” means. (I read the Dutch terms, which are the ones that apply in my case.) Apple also doesn’t fix your computer on-site. That leads to the same problem I had with Acer: once you have any kind of problem you have to send the laptop out for repair and you don’t see it back for a week. At least Dell performs the maintenance on-site the next business day. In fact, the support seems to be worldwide, so in theory I could get repairs done when I’m abroad. I’m not counting on that though 🙂


Another thing the M65 has going for it is that it’s going to run Linux pretty much without a problem. This will be a box on which I will mainly run Linux, with perhaps a VM running Windows XP. I know that Dell is starting to offer Linux in the US, but I don’t want to wait until that program reaches Europe and until it is worth anything beyond marketing hype.

Before you ask, obviously I didn’t order the laptop with Vista 🙂

Until next time,