ETL vs Reporting: more chatting

More chatting with a self-proclaimed “Noob” (not a real nickname). The soup was provided by yours truly.

I’m posting this as it might be interesting for others as well. 


(16:56:19)  Noob:  Hey man
(16:56:26)  Noob:  i wanted to ask about DWH
(16:56:31)  soup-nazi:  k
(16:56:53)  Noob:  after a lot of reading and such, am i right in assuming that most BI solutions are static as such
(16:57:09)  Noob:  considering we have to decide on DWH dimensions and fact tables before we start
(16:57:46)  soup-nazi:  No, you are not right, in fact…
(16:58:19)  soup-nazi:  You do need to decide up-front, but it is a given that people change and requirements change
(16:59:11)  soup-nazi:  So what you need is a waterfall project model: gather requirements – design model – make technical design – implement ETL – make reports / testing – handover / docs
(16:59:31)  soup-nazi:  Each time you go through the cycle your model is updated and the ETL changes.
(16:59:50)  soup-nazi:  A star schema is designed to allow this to happen.  It copes very well with changes to the requirements.
(17:00:27)  soup-nazi:  Your DWH can take a punch when you design it correctly and use technical (or surrogate) keys everywhere.
(17:01:09)  soup-nazi:  k?
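As an aside: the star schema with surrogate keys described above can be sketched in a few lines of SQL. This is a minimal, hypothetical example (run here through Python's sqlite3 module; the table and column names are made up for illustration, not taken from any real project):

```python
import sqlite3

# A minimal star schema sketch: one fact table that references its
# dimension tables only through surrogate (technical) keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_tk INTEGER PRIMARY KEY,  -- surrogate (technical) key
    customer_id TEXT,                 -- natural key from the source system
    name        TEXT
);
CREATE TABLE dim_date (
    date_tk INTEGER PRIMARY KEY,
    date    TEXT
);
CREATE TABLE fact_sales (
    customer_tk INTEGER REFERENCES dim_customer(customer_tk),
    date_tk     INTEGER REFERENCES dim_date(date_tk),
    amount      REAL
);
""")
```

The point is that the fact table never stores natural keys from the source systems; when a source key or attribute changes, only the dimension rows are touched and the facts survive unchanged.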
(17:01:34)  Noob:  That's interesting. so would you consider a customer an ongoing project, and if the customer required changes such as new reports, you would have to add to your original development?
(17:02:07)  soup-nazi:  Yes, absolutely.
(17:02:51)  soup-nazi:  Here is the inconvenient truth: users change their minds.  And they should be able to change their minds.
(17:03:09)  soup-nazi:  Business requirements change, everything changes.
(17:03:26)  soup-nazi:  In some organisations things change faster than in others, but changes are happening everywhere.
(17:03:59)  Noob:  that's very true. have you ever come up against a customer that wanted more control to develop their own reports as things change?
(17:04:22)  Noob:  as learning the ins and outs of the Pentaho BI platform isn’t an easy job
(17:05:16)  soup-nazi:  Yes, that’s why we developed Pentaho metadata and the Web Based Ad-hoc Querying and Reporting (WAQR) solution
(17:05:39)  soup-nazi:  That allows plain simple users to build their own reports.
(17:05:47)  soup-nazi:  online
(17:05:49)  soup-nazi:  on the web
(17:05:54)  soup-nazi:  available now
(17:06:00)  soup-nazi:  in version 1.5M3 or later
(17:06:08)  Noob:  actually build their own reports? not just edit ETL transformations?
(17:06:41)  soup-nazi:  yes, one has nothing to do with the other.
(17:06:48)  Noob:  you will have to forgive me if i ask stupid questions about the pentaho products. i've been thrown in the deep end here 🙂
(17:06:54)  soup-nazi:  ETL (Kettle) is not meant for reporting.
(17:07:46)  Noob:  Okay, well i have this little dedicated linux box sitting next to me and the plan is to use it to build a BI solution
(17:07:55)  Noob:  i've started with what i know best, Kettle
(17:08:14)  Noob:  and transformed my data into a DWH for some basic reporting
(17:08:39)  Noob:  from then on i would use the other Pentaho tools such as report designer and the .xaction editor to create the rest?
(17:09:00)  soup-nazi:  Yes.
(17:09:21)  soup-nazi:  Most of these tools start out with a SQL query that defines your data.
(17:09:45)  soup-nazi:  The SQL query is usually simple because it reads from a star schema DWH.
(17:10:08)  soup-nazi:  With Pentaho metadata you can even eliminate that SQL writing by designing your data model;
(17:10:21)  soup-nazi:  WAQR then writes the SQL for you based on a user selection.
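For illustration, this is the kind of simple star-schema query a reporting tool (or WAQR, on the user's behalf) typically ends up issuing: join the fact table to a dimension and aggregate a measure. A hypothetical sketch using Python's sqlite3 module, with made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_tk INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (customer_tk INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO fact_sales VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# Join the fact table to a dimension and sum the measure per customer.
rows = conn.execute("""
    SELECT d.name, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_tk = f.customer_tk
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
# rows -> [('Alice', 15.0), ('Bob', 7.5)]
```

Because the schema is a star, the query is a flat join plus a GROUP BY; there is no need for the multi-level subqueries you would get against a normalised OLTP schema.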
(17:10:35)  Noob:  ahhhhh, very nice =)
(17:10:54)  Noob:  WAQR you say. is there a release? maybe i've already seen it
(17:11:47)  soup-nazi:  Download Pentaho 1.5 milestone 3, it’s in there. (Create report from the first page after login)
(17:12:01)  soup-nazi:  It’s some AJAX like frontend
(17:12:12)  Noob:  AJAX, i like! =)
(17:12:25)  soup-nazi:  1.5M3 is available from the Pentaho frontpage at the moment
(17:14:11)  Noob:  Okay, i'm downloading it just now to have a look.
(17:14:32)  soup-nazi:  Pentaho metadata 1.5M3 is also available for download BTW.
(17:14:40)  Noob:  but in all, we start with Kettle and place all our OLTP data into a DWH
(17:14:59)  Noob:  from there we do everything else. reporting, dashboards, charts etc. etc.
(17:15:17)  soup-nazi:  yes.
(17:15:23)  Noob:  all the data for these reports is taken from the DWH
(17:15:28)  soup-nazi:  yes
(17:15:37)  Noob:  okay. starting to get a better understanding
(17:15:57)  soup-nazi:  In fact, once you put the lowest level of data, the transactions, in the DWH, you never go back to the source system.
(17:16:19)  soup-nazi:  You can build what we call aggregates or analytical fact tables that are derived from the lowest level fact tables.
(17:16:50)  soup-nazi:  For example, if you want to know how long ago it was that each individual customer bought something, you need to do lookups.
(17:17:05)  soup-nazi:  Those lookups are easy in ETL, but virtually impossible in reporting tools.
(17:17:12)  soup-nazi:  So you build an aggregate, etc…
(17:17:38)  soup-nazi:  You refine the data, you lift it up, until at the very end you get a highly aggregated, top-level fact table to drive your dashboards.
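To make the aggregate idea concrete: here is a hypothetical sketch (again via Python's sqlite3, with made-up table names and data) of deriving a "last purchase per customer" table from the lowest-level fact table once, during ETL, instead of repeating the lookup in every report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (customer_tk INTEGER, sale_date TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    (1, '2007-01-10', 10.0),
    (1, '2007-03-01',  5.0),
    (2, '2007-02-15',  7.5);
-- Derived aggregate: one row per customer with the date of their last
-- purchase, computed once in ETL rather than per report.
CREATE TABLE agg_last_purchase AS
    SELECT customer_tk, MAX(sale_date) AS last_purchase
    FROM fact_sales
    GROUP BY customer_tk;
""")

rows = conn.execute(
    "SELECT customer_tk, last_purchase FROM agg_last_purchase ORDER BY customer_tk"
).fetchall()
# rows -> [(1, '2007-03-01'), (2, '2007-02-15')]
```

Reports and dashboards then read the small aggregate table directly, which is both faster and far simpler than asking a reporting tool to do per-row lookups against the transaction-level facts.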
(17:18:36)  Noob:  wow, it's a lot to take in. just as well i'm gonna keep a copy of this conversation. Also i appreciate you helping this *Noob* =)
(17:18:55)  soup-nazi:  np

