Building create scripts for user permissions

October 5, 2007

I needed to use T-SQL (as opposed to using the SQL Server object model and .NET) to write out CREATE scripts for logins to a database as well as the roles to which they were assigned.  I decided to use the Catalog Security Views to accomplish this task.

Here is a script that will create CREATE USER scripts for all users in the database:

select 'CREATE USER [' + database_principal_name + '] FOR LOGIN [' + server_login_name + '] WITH DEFAULT_SCHEMA=[dbo]' as sql
from (
    select dp.name as database_principal_name, sl.name as server_login_name
    from sys.database_principals dp
    inner join sys.sql_logins sl on dp.sid = sl.sid
    where dp.type = 'S'
      and dp.name not in ('dbo', 'guest', 'INFORMATION_SCHEMA', 'sys')
) ss

Here is a script that will create statements to add users to roles:

select 'EXEC sp_addrolemember N''' + role_name + ''', N''' + login_name + '''' as sql
from (
    select dp_login.name as login_name, dp_role.name as role_name
    from sys.database_role_members drm
    inner join (
        select * from sys.database_principals dp
        where dp.type = 'S'
          and dp.name not in ('dbo', 'guest', 'INFORMATION_SCHEMA', 'sys')
    ) dp_login on drm.member_principal_id = dp_login.principal_id
    inner join (
        select * from sys.database_principals dp
        where dp.type = 'R'
    ) dp_role on drm.role_principal_id = dp_role.principal_id
) ss

I tried to make the statements as readable as possible, but they don’t look very good in the blog window.  If you paste them into SSMS they will be easier to read.

I haven’t yet tried to script out table-level permissions, but I’ll bet that a similar approach would work.
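
As a sketch of that idea (untested, and not part of the original scripts), something like the following against sys.database_permissions would probably generate the object-level GRANT statements:

-- Rough, untested sketch: generate GRANT statements for object-level permissions
-- held by database users (adjust the excluded principals as needed)
select 'GRANT ' + perm.permission_name
     + ' ON [' + schema_name(o.schema_id) + '].[' + o.name + ']'
     + ' TO [' + dp.name + ']' as sql
from sys.database_permissions perm
inner join sys.objects o on perm.major_id = o.object_id
inner join sys.database_principals dp on perm.grantee_principal_id = dp.principal_id
where perm.class = 1        -- object- or column-level permissions
  and perm.state = 'G'      -- plain GRANT (ignoring DENY and GRANT WITH GRANT OPTION here)
  and dp.name not in ('dbo', 'guest', 'INFORMATION_SCHEMA', 'sys')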

In a subsequent post, I’ll talk about how you can use this kind of script to build an automated delivery system for datamarts.

Mark


Google Code Upgrade

December 20, 2006

Google Code project hosting has been upgraded!  I just noticed that there is a location for storing downloads and a new Wiki tab as well.  This was my only complaint about Google Code up to this point, and it is now fixed.

I have placed the latest executable in the download section for each of the two projects that I’m hosting.

So now you don’t have to download the source (although you are welcome to) and compile it to use the application.

(This goes for both NDataUnit and BuildDB)

Mark


NDataUnit – Database Unit Testing Tool

October 22, 2006

Unit testing is something that can really help in the development cycle for a data warehouse.  Having a robust suite of tests can help us refactor and create new features with confidence that we aren’t breaking existing features.  Obviously at the outset we won’t have a large set of tests, but as time passes, we should continually be creating simple, small, and fast tests that as a whole test every facet of the warehouse we can think to test.

So with much fanfare (drumroll please) I’d like to announce another open source tool that I have started:

NDataUnit

NDataUnit is written in C# under the GNU 2.0 license and is intended for anyone who needs to run tests against data and compare the returned results to expected results.  NDataUnit is a console app that can be called from NAnt or MSBuild.  It takes a test definition file (XML) and some command line arguments (Server, Database and Output File), runs the tests described in the definition file (SQL statements), and compares the results against those listed in the file.  When the run completes, a log of the test results is written to the output file if you specify one.  NDataUnit also returns the number of errors encountered during processing as its DOS ErrorLevel.  This means that if all the tests were successful, it returns 0 and NAnt will continue processing.  This allows for complete integration with NAnt and an automated build.

NDataUnit is also hosted on Google Code like BuildDB.

One way to make use of this tool and start unit testing your warehouse would be to take two subsets of your ODS data: load one into a structural copy of your ODS, and run the other through the ETL process into the star schema.  Then script out the star schema or fact table data into INSERT statements that can be run during the automated build.
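
For the scripting step, the INSERT statements themselves can be generated with a query.  This is only a rough sketch; the FactSales table and its columns are invented for illustration:

-- Hypothetical example: turn existing fact rows into INSERT statements for the build scripts
select 'INSERT INTO FactSales (DateKey, ProductKey, SalesAmount) VALUES ('
     + cast(DateKey as varchar(20)) + ', '
     + cast(ProductKey as varchar(20)) + ', '
     + cast(SalesAmount as varchar(30)) + ')' as sql
from FactSales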

After this is complete, you should be able to build your datamart from scripts (now including test data).  At this point we have the datamart in a known state.  This means we should be able to write tests to ensure that it is in the known state.  For instance, we could count the number of distinct values in a foreign key column in the fact table.  We could also sum up the total value in a measure column.  Many tests could be written to ensure that it is in the known state we expect.
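
As a hedged illustration of the kind of SQL such tests might run (the table, column names and the idea of fixed expected values are invented for this example):

-- Expect a fixed number of distinct product keys in the known test data
select count(distinct ProductKey) as distinct_product_keys from FactSales

-- Expect a fixed total for the measure column in the known test data
select sum(SalesAmount) as total_sales_amount from FactSales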

After we test that the datamart is in the known state, we would proceed to our regular transactional ETL process.  We would run the ETL process against the subset of data we identified earlier that is now in a structural copy of the ODS.  Since we are once again using known data, we should be able to write tests to ensure that our ETL process is running correctly.

We can approach this in two ways:

  • Write tests that compare the test-data ODS to the newly loaded datamart.
  • Write tests that look for known values in the datamart after ETL.

It would make sense to have a mixture of these two types of tests.  Each one can help find different kinds of problems.
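
As a sketch of the first kind of test in the list above (the database and table names are hypothetical), a single query can compare an aggregate in the test-data ODS against the fact table it feeds; the two totals should match:

-- Compare a source aggregate to the loaded fact table
select
    (select sum(ExtendedAmount) from ODS_Test.dbo.OrderLines)     as ods_total,
    (select sum(SalesAmount)    from Datamart_Test.dbo.FactSales) as datamart_total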

I hope that this is only the beginning of NDataUnit.  I’m very open to having help with enhancing the application by either letting me know what it needs to do, or someone stepping up and getting their hands dirty in the code.

Let me know what you think.


BuildDB – Open Source Tool for the Agile SQL2005 Database

September 24, 2006

I’d like to announce the beginning of a new open source project to fill a hole in the set of tools needed for an Agile Database / Datamart built on SQL 2005.

The name of the project is BuildDB and it is hosted by Google Project Hosting under the GNU General Public License 2.0.

In consultation with a friend and colleague, Greg Graham, I wrote a tool that will build a SQL 2005 database from script files using a command line console app.  The app is written in Visual Studio 2005 on the .NET Framework 2.0.  The project is definitely not finished and I hope that maybe a person or two will step up and help me test it and maybe even do some coding.

It works like this:

BuildDB [TargetServerName] [TargetDBName] [ScriptsRoot]

Under the ScriptsRoot directory, the application looks for these directories: “Logins”, “Users and Roles”, “Tables”, “User Defined Functions”, “Views”, “Stored Procedures”, “Triggers” and “Static Data”.

The tool will go through these directories and create the database, and then run the scripts in the “Static Data” directory, giving you the opportunity to insert data right from the get-go.
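
For example, a script in the “Static Data” directory might be nothing more than a handful of INSERT statements (the table and values here are made up):

INSERT INTO DimOrderStatus (OrderStatusKey, OrderStatusName) VALUES (1, 'Open')
INSERT INTO DimOrderStatus (OrderStatusKey, OrderStatusName) VALUES (2, 'Shipped')
INSERT INTO DimOrderStatus (OrderStatusKey, OrderStatusName) VALUES (3, 'Cancelled')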

There are already some known issues with the application and also some enhancements that I’ll be working on in the near future.  Even with these issues and enhancements outstanding, the tool is pretty useful, though maybe incapable of building a super complicated database.  If someone tries the tool and finds any other issues that I don’t have identified in the Issues List, please add them or get in touch with me via this blog.


Developer Edition to the Rescue

September 18, 2006

In an earlier post I had mentioned the problem that SSIS packages couldn’t be executed in a way that was “in-process” – in other words, my build would have to create a SQL job that executed a package that I had deployed to the SSIS server.  I would use sp_start_job to start the job, but sp_start_job only requests that the job start and reports whether that request succeeded; the build then moves on to the next step without waiting for the package to finish.  This would make it hard to load shared (conformed) dimensions first and then load the fact table second.
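
For reference, the fire-and-forget call looks like this (the job name is made up); it returns as soon as the job has been asked to start, not when the package finishes:

-- Starts the job and returns immediately
EXEC msdb.dbo.sp_start_job @job_name = N'Load Conformed Dimensions'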

This turns out not to be a problem, because we can use the Developer Edition of SQL Server to install SSIS on our local machines and run packages synchronously with dtexec on the command line.
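
The call I have in mind is along these lines (the package path is made up); dtexec runs the package to completion and returns a non-zero exit code on failure, so a build tool can stop when a load breaks:

dtexec /F "C:\Packages\LoadConformedDimensions.dtsx"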

I don’t know why I didn’t think of this earlier, but it completely solves the problem.

We hadn’t wanted to install SQL Server on our local machines for data security reasons, but installing only SSIS doesn’t cause that problem.

Problem solved!


The Continuously Integrated Datamart

September 9, 2006

The first effort for creating an automated build for the data warehouse has turned out to need some re-thinking. 

The original vision was to drop the datamart database, recreate it, play in the create scripts for the tables, and then deploy and run the SSIS packages to populate the data.  This turns out to be unrealistic.

There are tables in the warehouse that cannot be deleted.  Type two slowly changing dimensions (SCDs) and fact tables that keep history are two examples.  These tables can’t be re-created because the historical data isn’t in the ODS anymore.  This requires a change in thinking.

The source for a warehouse build is not just the create scripts and the ODS – this isn’t enough.  We have to think of the current warehouse as the source for the revised version.  This means modifying and updating the current datamart instead of completely rebuilding it during a build.

What does this mean for our Agile Warehouse – one that is continuously integrated?

We have to employ some database change management techniques so that structural changes can be applied to the existing datamart in one automated process.
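
In practice that usually means change scripts that check for their own effects before applying them, so the same script can be run safely against dev, test and production.  A minimal sketch, with a hypothetical table and column:

-- Only add the column if it isn't there yet, so the script is safe to re-run
if not exists (select * from sys.columns
               where object_id = object_id('dbo.FactSales') and name = 'DiscountAmount')
begin
    alter table dbo.FactSales add DiscountAmount money null
end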

The iteration lifecycle from a high level would include:

  1. Complete development on the dev database
  2. Restore the test environment to match the structure of the production environment (see the sketch after this list)
  3. Merge in changes from the dev environment to test environment
  4. Run automated tests on test environment
  5. If automated tests are successful and the current changes are to be moved to production, run the same sync and test process against the production environment and back up the test environment for the next iteration.  If the automated tests are not successful, then more development will be done on dev and the process starts over.
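
For step 2, the restore could be as simple as something like this (all names and paths are hypothetical):

-- Overwrite the test datamart with the latest production backup
restore database DatamartTest
from disk = 'E:\Backups\DatamartProd.bak'
with replace,
     move 'DatamartProd_Data' to 'E:\Data\DatamartTest.mdf',
     move 'DatamartProd_Log'  to 'E:\Logs\DatamartTest.ldf'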

This creates the framework for the ongoing lifecycle of development.  In my next post, I’ll talk a little about how you can build the technology infrastructure for making this environment a reality.