Well – I’ve been so busy lately that I haven’t had much time to blog, but I wanted to make an announcement here that I’m taking a job at Microsoft. I’ll be starting in a couple weeks as a TS for the Data Platform out of the North Central District. I’m really excited and hopefully I’ll be doing a little more blogging here as a result.
Some pretty cool stuff…
Did you know that SQL Server Management Studio makes a nice little regular expression (regex) editor and tester? Go to find and replace (Ctrl-H) and down at the bottom of the dialog box is a check box names “Use”. Check that box and choose Regular Expression. Now to the right of the box where you type the text, there is a glyph that will fly out a regular expression character helper. You can type in text in Management Studio and test a regular expression against it.
You can also leverage this regular expression functionality to do a multi-line find and replace. You can substitute in the old C# (C++) syntax \n for a new line. I needed to replace a block of text with a multi-line block of text and this worked beautifully!
Don’t forget you can use Management Studio to search in files too. A nice GREP tool too! Hit the glyph just to the right of Quick Replace or Quick Find at the top and you’ll get the option to search in files.
Now this is cool!
I’ve been needing a new demo machine for myself and finally bought a Dell Studio 16 - i7 processor (4 cores) and 8 gigs of ram.
I finally gave up and installed VMWare and everything works great. I’m a total Microsoft kind of person. I love not having to install anything I don’t have to, but I had to.
Hyper-V just isn’t ready to handle these new processors and new video cards. VMWare is.
And after all that, VMWare just works. It comes with all the virtual networks set up. They just work. There is a setting to optimize disk for performance by not waiting for disk write. This is perfect for demo machines.
Once Microsoft remedies the Hyper-V situation, I’ll gladly switch back.
Maybe I’m the only one, but I put one of my new Hyper-V imgages into a difficult state the other day by choosing the wrong network type. What I mean by this is that when Server 2008 came up (for the first time) and saw it was on the network (it was bound to the virtual NIC attached to the physical NIC on my laptop) it asked me if I was on a public network, home network or work network. I wasn’t thinking and said it was a public network. That makes it harder to access your VM as a UNC. I don’t know all the differences to these different settings, but I do know that if you choose public, that it isn’t discoverable from the netbios name.
Make sure you use either Home or Work network.
Ever wondered if it mattered what order your data was in inside an ETL load?
I hadn’t really considered it until this morning. I have already considered and used the technique of dropping indexes during ETL and then creating them afterwords. But this was new.
I had a load that took 50 seconds. It had a clustered index as well as a unique index that was used to enforce a primary key. The ETL load came in in no order.
The first thing I tried was to order the ETL in ascending order – which is the same as the order of the clustered index – to see if it mattered. I suspected it would and it did. the time went from 50 s. to 33 s. This was good - nearly half the time just by ordering the data. I wondered if ordering the data backwards from the clustered index would help or hurt, and to my surprise, it helped and lowered the time further to 28 s.
Now I wondered how much lower I could go by dropping the indexes before ETL and the rebuilding them afterwords.
I re-ran the non-ordered ETL using the drop and rebuild method and got 28 s. So if all I had done was just drop and rebuild indexes, I would have gone from 50 to 28 s. Not bad. But when I ordered the load descending it dropped even further to 14 s. The last test of course was the order it back to ascending and see which ordering worked best for the drop and rebuild method. It turned out that ordering ascending worked better and dropped the time down to 13 s.
This was a surprise to me. With the indexes built, the opposite order to the clustered index was better. When you drop and reload, the same order was better. I think the moral of the story isn’t specifically whether ascending or descending is better – I suppose it could be different in other cases. Mostly just remember that the order can really matter and when you are trying to squeeze every ounce of performance out of ETL, check out the order of your ETL.
Anyone on the SQL CAT team want to comment on why backwards works better when indexes are left intact?
I was trying to install SharePoint 2010 (final version) today on Windows Server 2008 (no luck with R2) and I had to try to intsall prerequisites twice before everything would install. If you have trouble, keep trying. :)