The first effort at creating an automated build for the data warehouse has turned out to need some rethinking.
The original vision was to drop the datamart database, recreate it, run the create scripts for the tables, and then deploy and run the SSIS packages to populate the data. This turned out to be unrealistic.
Some tables in the warehouse simply cannot be dropped. Type 2 slowly changing dimensions (SCDs) and fact tables that keep history are two examples. These tables can't be recreated from scratch because the historical data no longer exists in the ODS. This requires a change in thinking.
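To make the problem concrete, here is a minimal in-memory sketch of why a Type 2 SCD can't be rebuilt from the ODS: on each change the old row is expired and a new version is inserted, so the dimension itself holds history that exists nowhere else. The table and column names (customer_id, city, valid_from, valid_to, is_current) are illustrative, not from this post.

```python
from datetime import date

def apply_type2_changes(dim_rows, source_rows, as_of):
    """Expire changed rows and append new versions; never delete history."""
    current = {r["customer_id"]: r for r in dim_rows if r["is_current"]}
    for src in source_rows:
        cur = current.get(src["customer_id"])
        if cur is None or cur["city"] != src["city"]:
            if cur is not None:
                cur["is_current"] = False      # expire the old version
                cur["valid_to"] = as_of
            dim_rows.append({                  # insert the new version
                "customer_id": src["customer_id"],
                "city": src["city"],
                "valid_from": as_of,
                "valid_to": None,
                "is_current": True,
            })
    return dim_rows

dim = [{"customer_id": 1, "city": "Leeds",
        "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]
dim = apply_type2_changes(dim, [{"customer_id": 1, "city": "York"}],
                          date(2021, 6, 1))
# dim now holds two rows for customer 1: the expired Leeds row and a
# current York row. Only the York row is still in the source, so a
# drop-and-rebuild would lose the Leeds history.
```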
The source for a warehouse build is not just the create scripts and the ODS – that isn't enough. We have to treat the current warehouse itself as a source for the revised version. This means modifying and updating the current datamart during a build instead of completely rebuilding it.
What does this mean for our Agile Warehouse – one that is continuously integrated?
We have to employ database change management techniques so that the whole update can run as one automated process.
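One common change management technique is versioned migration scripts plus a version table, so the same build can run safely against dev, test, and production without dropping anything. Here is a small sketch of that idea; SQLite stands in for SQL Server purely for illustration, and the script contents and table names are assumptions, not from this post.

```python
import sqlite3

# Numbered change scripts, applied strictly in order.
MIGRATIONS = [
    (1, "ALTER TABLE dim_customer ADD COLUMN region TEXT"),
    (2, "CREATE TABLE fact_orders_v2 (order_id INTEGER, amount REAL)"),
]

def migrate(conn):
    """Apply any migrations newer than what this database has seen."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    applied = row[0] or 0
    for version, sql in MIGRATIONS:
        if version > applied:                   # only newer changes run
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER)")
migrate(conn)   # applies both migrations
migrate(conn)   # second run is a no-op: the warehouse is updated, not rebuilt
```

Because each environment records which versions it has applied, the identical automated process works whether the target is a fresh dev database or the production datamart full of history.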
At a high level, the iteration lifecycle would include:
- Complete development on the dev database
- Restore the test environment to match the structure of the production environment
- Merge in changes from the dev environment to test environment
- Run automated tests on test environment
- If the automated tests pass and the current changes are ready for production, run the same sync-and-test process against the production environment and back up the test environment for the next iteration. If the automated tests fail, more development is done on dev and the process starts over
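The steps above can be sketched as a simple orchestration script. Each function is a placeholder for real tooling (restore jobs, change-script deployment, SSIS runs, a test runner); the names and the boolean test result are assumptions for illustration only.

```python
def restore_from_production(env):
    print(f"restoring {env} to match the production structure")

def merge_dev_changes(env):
    print(f"applying dev change scripts to {env}")

def run_automated_tests(env):
    print(f"running automated tests on {env}")
    return True  # stand-in for a real test-runner result

def backup(env):
    print(f"backing up {env}")

def run_iteration(promote_to_production=True):
    restore_from_production("test")
    merge_dev_changes("test")
    if not run_automated_tests("test"):
        return "back-to-dev"          # failed: more development, start over
    if promote_to_production:
        merge_dev_changes("production")
        run_automated_tests("production")
        backup("test")                # keep test ready for the next iteration
    return "promoted"

result = run_iteration()
```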
This creates the framework for the ongoing lifecycle of development. In my next post, I’ll talk a little about how you can build the technology infrastructure for making this environment a reality.