Today (10th of November 2013) I have decided to start a series of book reviews, and my first pick is The Data Warehouse Toolkit, written by Ralph Kimball and Margy Ross.
This is the 3rd edition, released several months ago, and I was looking forward to this book for several reasons:
One is that it is up to date, and I like to be up to date.
The second reason is that I have gathered new experience and found interesting non-Kimball approaches for dealing with Kimball limitations that I just couldn't accept.
And the most important is that I want to know (and memorize) everything about the Kimball methodology :)
I've read a previous edition (probably the 2nd), but with my new perspective it will be like reading a new book on a well-known subject.
Before I start, let me tell you that I like the Kimball methodology for many reasons, but I believe it tries to do too many things in one go, so I prefer to mix methodologies.
This will be an in-depth review split across several days, and I hope it will also spark some discussion!
The first chapter, Data Warehousing, Business Intelligence and Dimensional Modeling Primer, is well written and not too lengthy, and I particularly like the Goals of DW and BI section. We also have Star Schema Versus OLAP Cubes, which I really like. I'm glad it has been included, as in previous editions I couldn't understand why OLAP cubes were not discussed.
The ETL part sounds like a piece of cake, but this is actually where I look for alternative approaches, as I believe following Kimball may cause serious issues.
The book has a very nice Restaurant Metaphor for the Kimball Architecture, and it underlines that preparation is key, which is where I believe many companies go wrong and end up with a difficult-to-maintain solution. However, again, I believe ETL can be significantly simplified by using alternative approaches that I will discuss very soon.
One of the nicest additions to this edition is Alternative DW/BI Architectures, and this passage is very encouraging:
"We acknowledge that organizations have successfully constructed DW/BI systems based on the approaches advocated by others. We strongly believe that rather than encouraging more consternation over our philosophical differences, the industry would be far better off devoting energy to ensure that our DW/BI deliverables are broadly accepted by the business to make better, more informed decisions"
No more war! Peace! Yippee! But is it? One of the discussed approaches is Hybrid Hub and Spoke; however, it is compared to 3NF ONLY, and I say this is incomplete! My knowledge is fairly "fresh", so when someone says Hybrid Hub and Spoke to me I don't think about Inmon's 3NF, I think about Data Vault. I agree with Kimball that 3NF is too complex; however, this is precisely the reason why I go with Data Vault, because it considerably simplifies the ETL process, which is still too complex with the Kimball methodology.
I like the idea of this section; however, they missed Data Vault, which overcomes all the 3NF cons and has other benefits. I don't believe the authors are unaware of Data Vault, especially since it seems Inmon now says that it should be part of his DW 2.0 (I cannot find a reliable source for this info; if you know of one, please let me know).
The only reason why this was missed might be that it is a good solution, so I think I might have to wait until the 4th edition to find an answer to this question (but I will try to get an answer sooner).
Another important aspect that has not been discussed yet is Master Data Management (for conformed dimensions), but I hope it will still be covered later in the book.
Day 3 coming soon!