Rolling back transactions

Pentaho Data Integration (Kettle) was never a real transactional database engine, and never pretended to be one. It was designed to handle large data volumes and to slam in a commit every couple of thousand rows, so that the database does not choke on its transaction logging.
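To make the commit-size behaviour concrete, here is a minimal sketch in Python with SQLite. It is not Kettle code: the `target` table, the `load_with_commit_size` function and the simulated failure are all made up for illustration. It only shows why rows that were already committed stay in the table when a later row fails.

```python
import sqlite3


def load_with_commit_size(conn, rows, commit_size, fail_at=None):
    """Insert rows, committing after every `commit_size` rows.

    `fail_at` is a hypothetical hook: the row index at which we
    simulate a failure in the middle of the load.
    """
    cur = conn.cursor()
    pending = 0
    for i, row in enumerate(rows):
        if fail_at is not None and i == fail_at:
            # Only the uncommitted tail can still be undone.
            conn.rollback()
            raise RuntimeError("simulated failure at row %d" % i)
        cur.execute("INSERT INTO target(id, name) VALUES (?, ?)", row)
        pending += 1
        if pending == commit_size:
            conn.commit()  # earlier rows are now permanent
            pending = 0
    conn.commit()  # commit whatever is left at the end


def demo_commit_size():
    # In-memory SQLite stands in for the real target database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target(id INTEGER, name TEXT)")
    conn.commit()
    rows = [(i, "row-%d" % i) for i in range(10)]
    try:
        load_with_commit_size(conn, rows, commit_size=4, fail_at=9)
    except RuntimeError:
        pass
    # Number of rows that survived the failed load.
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

With a commit size of 4 and a failure on the tenth row, the two batches that were already committed remain in the table, and only the uncommitted ninth row is undone.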

However, more and more people are using Kettle transformations in a transactional way. They want to have the option to roll back any change that happened to a database during the execution of a transformation in case anything goes wrong.

Well, we have worked on that in the past, but never quite got it right… until today, actually. As part of bug report 724, I moved the decision to commit or roll back all databases up to the transformation level.

Take for example a look at this transformation:

What happens is that the first 2 steps will always finish execution before a single row hits the Abort step. That means that all rows from the “CSV file input” step will be inserted into the database table before the transformation fails. In the past, even if you enabled “Unique connections”, those rows would have remained in the table.
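The idea of deciding at the transformation level can be sketched roughly like this, again in Python with SQLite rather than actual Kettle code. The names `table_output`, `abort_step` and `run_transformation` are invented for the example and only loosely mirror the steps described above. No step commits on its own; the outcome of the whole run decides between one commit and one rollback.

```python
import sqlite3


class AbortError(Exception):
    """Stands in for the Abort step failing the transformation."""


def table_output(conn, rows):
    # Insert every row, but do not commit here.
    conn.executemany("INSERT INTO target(id, name) VALUES (?, ?)", rows)


def abort_step(rows):
    # Fails as soon as any row arrives.
    if rows:
        raise AbortError("abort step reached")


def run_transformation(conn, rows, abort):
    """Run all steps in one transaction; commit or roll back at the end."""
    try:
        table_output(conn, rows)
        if abort:
            abort_step(rows)
    except Exception:
        conn.rollback()  # undo everything the run did
        raise
    else:
        conn.commit()    # one commit for the whole run


def count_after_run(abort):
    # In-memory SQLite stands in for the real target database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target(id INTEGER, name TEXT)")
    conn.commit()
    rows = [(1, "a"), (2, "b"), (3, "c")]
    try:
        run_transformation(conn, rows, abort)
    except AbortError:
        pass
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Here `count_after_run(abort=True)` leaves the table empty even though all rows were inserted before the failure, while `count_after_run(abort=False)` keeps all three rows.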

To test this yourself, build from revision 6587 in trunk or download tomorrow's nightly build.

With a little luck (further tests and then more tests) we can back-port this fix to version 3.0.2 this week, ready for the 3.0.2GA release at the end of next week.

I’m hoping to extend this same principle to jobs as well in the (more distant) future.

Until next time,
Matt

6 thoughts on “Rolling back transactions”

  1. Hi,

    I just tried to roll back a transaction in a job, but it seems the principle you described above is still not implemented for jobs, is it?
    I think this would be a nice feature, it is exactly what I need at the moment.

    Stefan

  2. O.K. As I understand from other sources:
    * The “Use Unique Connections” option is called “Make the transformation database transactional” in 4.1 (or earlier)
    * If this option is set in the transformation, all commit sizes are ignored

    Correct?
