Customer Dimension Vs Customer Fact Table
I’m re-reading the classic The Data Warehouse Toolkit by Ralph Kimball. It’s a great book for learning about data warehousing. I figured it was worth a refresh. In my current project, I’m working with some large data sets and I’ll be using data marts to make analyzing the data easier.
I’m at the section where he talks about CRM and adding a customer dimension. I’ve always wondered why he uses a customer dimension instead of an accumulating snapshot customer fact table.
There’s lots of interesting things about a customer that can be expressed as facts, like spending (total, yearly, quarterly, etc) and customer service requests. Adding a ‘has purchased’ dimension makes segmenting easy too. Data about the customer, like first purchase, last purchase and location, goes from being evil outriggers to full blown dimensions of their own.
Once you have a customer dimension, you can drill across data marts to get customer-centric slices of queries. Alternatively, the customer fact table can have a view to turn it into a dimension. If you’re data sets are really big, you can create mini dimensions from the customer fact table (which he recommends anyway).
I suppose hard drive and memory sizes were a lot smaller and CPU’s were a lot slower in 2002..

