10 thoughts on “Dynamic de-normalization of attributes stored in key-value pair tables”

  1. Hello,

    Haven't had the need for it yet, but my fingers are itching to experiment with it.

    Thanks for the update, Matt!

  2. Hi Matt,

    Thanks for this post… and the very useful “Inject Step” in PDI 4.2 too!

    In fact, many (old and badly designed…) RDBMS physical models for business applications store “free attributes” (customized by end-users), and nearly always as fixed columns in many business tables (customers, products), like this:
    Attribute_01 | Attribute_02 | Attribute_03 | … | Attribute_10 |

    Last week I had this exact request from a client, and a dynamic transformation with metadata injection is a very nice solution. In my case it’s the opposite of your example, with a “Normalize” step, but it should work too 😉
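    The fixed-columns-to-key-value normalization described above can be sketched outside PDI, for example in Python. This is only an illustration of the idea; the column and key names are made up, not taken from the post.

```python
# Sketch only: turn one "wide" row with fixed Attribute_NN columns into
# (id, attribute_name, value) key-value triples, skipping empty attributes.
# Column names ("id", "Attribute_01", ...) are illustrative assumptions.
def normalize_row(row, key_field="id", attr_prefix="Attribute_"):
    return [
        (row[key_field], name, value)
        for name, value in row.items()
        if name.startswith(attr_prefix) and value is not None
    ]

wide = {"id": 42, "Attribute_01": "red", "Attribute_02": None, "Attribute_03": "L"}
print(normalize_row(wide))
# [(42, 'Attribute_01', 'red'), (42, 'Attribute_03', 'L')]
```

    Metadata injection then plays the role of discovering which `Attribute_NN` columns exist at runtime instead of hard-coding them.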

    All the best


  3. Excellent article Matt! Thanks for sharing! I am currently working on a project which has similar requirements. Taking it a step further to automatically generate/amend a metadata model and an OLAP schema would be definitely very interesting.

  4. Hi Matt,
    It’s a great article that will help solve my current situation. I used the same idea to create a transformation
    that contains the following steps:
    1. a UDJC step that continuously receives messages through a socket
    2. a Split Fields step to parse each message into row fields

    — beginning of a sub-transformation in a Single Threader (batch time = 20000 ms)
    3. a Mapping input specification step
    4. a Modified Java Script Value step to print out debug messages
    5. a Sort rows step to sort the batch of rows
    6. a Group By step to compute the average
    7. a Mapping output specification step
    — end of the sub-transformation

    8. a Text file output step
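    Steps 5–6 above (sort the batch, then compute a per-group average) can be sketched outside PDI like this. The field names (“sensor”, “value”) are illustrative assumptions, not from the comment.

```python
from itertools import groupby

# Sketch of steps 5-6: sort a batch of rows by the grouping key, then
# compute the average per group. groupby() requires sorted input, which
# mirrors why a Sort rows step precedes Group By in PDI.
def batch_averages(rows, key="sensor", value="value"):
    rows = sorted(rows, key=lambda r: r[key])            # step 5: sort
    out = []
    for k, grp in groupby(rows, key=lambda r: r[key]):   # step 6: group by
        vals = [r[value] for r in grp]
        out.append({key: k, "avg": sum(vals) / len(vals)})
    return out

batch = [{"sensor": "b", "value": 4.0},
         {"sensor": "a", "value": 1.0},
         {"sensor": "a", "value": 3.0}]
print(batch_averages(batch))
# [{'sensor': 'a', 'avg': 2.0}, {'sensor': 'b', 'avg': 4.0}]
```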

    I ran it and noticed the following:
    1. the Single Threader only kicks off once, not every 20 secs (step 3 only gets printed once)
    2. once the sub-transformation is kicked off, I don’t see a batch of rows; only the first row is being passed.

    Any ideas? And how do I debug this?

  5. Hi Ken, try to simplify your transformation. Make sure that your UDJC step indeed continues to produce rows.
    If you want to post a sample, hop on the forum and create a discussion thread there.
    Good luck,


  6. Hi Matt, I think I’ve more or less figured out the issue; see if this makes sense to you. I noticed the following behavior:

    * The single threader is set to batch every 20 secs

    – events came in at the 1st, 2nd, 5th, and 10th seconds
    – no more events arrived before the 20-second time window expired.

    In this situation, I found the Single Threader would still be waiting without producing any output.

    – now, let’s say at the 50th second another event arrives; interestingly, the Single Threader finally spat out the result, aggregated with the previous events (at the 1st, 2nd, 5th, 10th, and 50th seconds)
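    The behavior described above is consistent with batching logic that is only evaluated when a new row arrives, so an expired window sits unflushed until the next event shows up. A minimal sketch of that arrival-driven pattern (an assumption about the behavior, not PDI’s actual Single Threader code):

```python
# Sketch only: flush logic runs inside add(), i.e. only when a row arrives.
# A batch whose time window has expired waits for the next event, and that
# triggering event is flushed together with the earlier ones -- matching
# the observation that the 50th-second event was aggregated with the rest.
class ArrivalDrivenBatcher:
    def __init__(self, window_secs=20):
        self.window = window_secs
        self.batch = []
        self.start = None

    def add(self, row, now):
        if self.start is None:
            self.start = now
        if now - self.start >= self.window:
            # Window expired: flush everything so far plus the new row.
            self.batch.append(row)
            flushed, self.batch, self.start = self.batch, [], None
            return flushed
        self.batch.append(row)
        return None

b = ArrivalDrivenBatcher(20)
for t, ev in [(1, "a"), (2, "b"), (5, "c"), (10, "d")]:
    b.add(ev, t)          # nothing flushes inside the window
print(b.add("e", 50))
# ['a', 'b', 'c', 'd', 'e']
```

    A timer-driven design would instead flush from a background thread when the window expires, regardless of whether new rows arrive.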

    Thanks for your help.

  7. Hi Matt,
    I’ve got a slightly different situation: we don’t have a person_attribute table; it is built dynamically as files arrive at our application. When I tried to do what you show here, it didn’t work. I wonder whether it only works if the data is stored in a table, or whether I can do it if the data comes in on a stream?
    Is there any modification I should make, or did I just do something wrong?

  8. Hi Matt

    Very interesting; I use this solution successfully.
    However, I have tried to use dynamic.ktr as a sub-transformation, i.e. a “Mapping input” step before “attribute_description” and a “Mapping output” step instead of the dummy “output” step.

    I have a transformation with a Mapping step that maps dynamic.ktr and a dummy step just after. When I run it, I get a “bad encoding” error.

    Thanks for your help
