Book Review : Pentaho 3.2 Data Integration

Dear Kettle fans,

A few weeks ago, when I was stuck in the US after the MySQL User Conference, a new book was published by Packt Publishing.

That by itself is not too remarkable.  However, this time it’s a book about my brainchild, Kettle, which makes it very special to me. The full title is Pentaho 3.2 Data Integration : Beginner’s Guide (Amazon, Packt).  The title alone explains the purpose of this book: to give the reader a quick start with Pentaho Data Integration (Kettle).

The author, María Carina Roldán (blog, twitter), is a seasoned BI consultant and a valued member of the Kettle community. Besides her frequent appearances on our forum, she is appreciated by many for the time she spent on the Kettle Tutorial.

I’m not going to go over the detailed table of contents.  Since I wrote the foreword of the book, I’m sure you’ll agree I’m somewhat biased. However, in all objectivity, the book covers what it claims to cover: it does help the PDI/Kettle beginner tremendously.  It covers all you need to get started and then some: the installation of PDI, the typical “Hello World” setup of PDI, reading text files, calculating, scripting, databases, repositories, etc.  As the title indicates, this book covers the current 3.2 stable release of Kettle, not the upcoming 4.0 release. However, as far as 99% of the topics covered are concerned, that shouldn’t make too much of a difference.

So obviously I can recommend this book very much. It’s a time-saver for those that are starting with PDI.  For those that have dabbled with Kettle before I must say that María packed the book with nice tips and tricks so I’m sure you’ll be able to learn a thing or two.

Until next time,


6 thoughts on “Book Review : Pentaho 3.2 Data Integration”

  1. Hi Matt,

    I have just finished reading this book and would like to give you a short, personal impression:
    I had already gathered some experience using PDI (doing some rather small database synchronizations
    along with pulling data from LDAP servers).

    The first chapters didn’t contain much new information for me; nevertheless, they provided some more
    details which I didn’t come across during “learning by using PDI”.
    I especially like the chapter about the “Modified JavaScript” step, which is one of the most powerful steps IMHO.

    The last few chapters about the data warehouse stuff and the combination of the jobs/transformations
    (especially the sub-transformation stuff) were really interesting!

    All in all: Great work, I highly recommend this book to anyone who works with PDI (be it a new user or
    someone who already has gathered some experience).



  2. Hi Matt,
    We are currently working on a project for a client using Kettle 3.2 and now 4.0. I got the book for the new developers we will be onboarding. It does serve as a decent introduction to Kettle, but we are now hankering for some more advanced books, especially to do with plugin development and User Defined Java Classes. Is there any news of that kind of book coming out any time soon (are you writing one, for example?)
    We would appreciate any pointers to books on Kettle, as I have only found mention of 3 and one is not yet out (hint hint!!)

  3. Hi Robin,

    Around the end of September Wiley will be releasing another book written by yours truly, Roland Bouman and Jos Van Dongen:

    It will cover the advanced topics you are missing right now like extending Kettle (writing plugins), integration of Kettle, User Defined Java Class, performance tuning, and so on.

    All the best,
