Sunday, 10 October 2010

Reminiscences from SQLBits 7

SQLBits is over for another 6 months. as ever it was a real pleasure to spend a day catching up with old friends, making new friends and hearing some very knowledgeable people speaking about SQL Server. Aside from taking some time to say a huge thankyou to the organising committee, the many people who help both at the event and doing the physical prep for it, and the fantastic speakers, all of whom give up their time for free. I wanted to share with you some of the things I took away from the conference.

I was not able to get to the Friday sessions, so the following are all from the Saturday agenda, also not having a time machine I was only able to attend 20% of the sessions on offer, pleases add your own memories from both days to the comments.

Thanks...

I started off the day with Thomas Kejser from the SQL CAT team, who was discussing designing and tuning high speed data loading.

Thomas currently holds the world record for the fastest ETL load so it is fair to say he knows what he is talking about. The biggest takeaway for me from this session was his discussions on using bulk insert to keep loggong to a minimum and how he designs his SSIS packages to ensure maximum parallelism.

When doing a bulk insert into a heap using the TABLOCK hint the insert will be minimally logged and will create a 'BULK UPDATE' lock on the target table. The BU lock will allow multiple threads to bulk update the table but will lock out any other processes that attempt to use the table for non-bulk update actions. What is new in 2008 is that the same behaviour can be produced when inserting into a table with a clustered index. This is achieved by setting the TF-610 flag to on as discussed in Sunil Agarwal's blog.

In order to leverage the parallel bulk insert capabilities Thomas constructs a prioritised queue of the jobs that need to be done and then uses multiple copies of a generic package that takes the next item from the queue, flags it as in progress, runs it and finally marks it as complete. this sequence is looped until there are no more jobs in the queue.

By running the load this way Thomas is able to keep development time for the packages to a minimum, and ensure that his server is as busy as it can be right up to the job has finished. The CAT team have a number of video presentations around SSIS performance which include details on producing this design pattern.

Following this I went to hear Ashwani Row talk about SSAS Best Practices.

The big takeaway for me from this session was seeing how normalising the relational datamart tables to a snowflake schema and providing a view over the top of this to present SSAS with a simple star schema improved processing performance without the overheads and complications that would normally be associated with a snowflake in the Cube.

A very handy tip that I'm sure I will use again and again.

My next session was on parallel tuning with Ramesh Meyyappan. I'm not a hardcore performance junkie, but I love listening to people that are. It allows me to get snippets and tips without having to do the hard graft of finding them for myself - and Ramesh is probably my favorite oracle on the subject.

The sample scripts for this presentation are available from SQL Workshops.

Some of the snippets that I came away with are that CTEs will not run parallel, though temporary tables will. Also when querying a table with a computed column the query will not run in parallel if there is a computed column using a scalar function.

It was also interesting to note that if the estimated number of rows is low then the query will not be considered for parallel processing. The estimated rows can bees viewed using the estimated query plan, and rearranging the query to present the high row part of a query first can change the estimated row count. While this sounds almost like the old school coders' myth that the table order in a join affected performance. Ramesh demonstrated this happening, and if you are still in any doubt I recommend that you either download his samples and try for yourself and / or find an opportunity to hear Ramesh speak.

For the lunchtime session I went to hear some myths about SQL Server hosted by Quest Software. No big takeaways from this (unless you count the T-shirt) but a very fun session.


Well, I have a couple of Ideras
  During the lunch break, the guys on the Idera stand 'enhanced' my Viking hat and where kind enough to tweet me the picture.


















Following lunch, I went to hear Jennifer Stirrup in a session Alice's Adventures in Reporting Services.

While Jen did nothing to dispel our image as geeks with her Dr Who enemy example data her session was an excellent piece in trying to ween people off of producing poor performing visualisations - a cause dear to my heart! 

she discussed...
preattentive attributes

   - Those 'things' that allow us to make snap observations such as differences in colour, relative size (bigger = closer) etc. and how these things can be utilised to highlight important perspectives on the information, or when misunderstood, to obscure or distort it.

Visual Integration
   -  This is the phase where the viewer makes the connections in the information that is presented, for example working out which graphical elements relate to which elements in the legend.

followed by the Cognitive Integration phase
   -  This is the bit where the viewer reads the presented information, understanding how the various elements interrelate and interpret the 'message' in your report.

Jen highlighted that people are inherently poor at judging 2D area and angles, also how chart-junk (including 3D effects obscure the information. She drew attention to the bullet graphs and sparklines now available in SSRS 2008 R2 and gave us a short demo on using them.

My Final session of the day was End to End Master Data Management... with Jeremy Kashel

This was a worthwhile introduction to MDS providing an insight into how the web front end is used to manage the process and how business rules are set up and applied to ensure that the consistency essential to the concept of Master Data are handled.

Jeremy also talked about how SSIS is leveraged to extract source data into the MDM repository and also to take clean and consolidated data back to the source systems.

Then it was on the the post event party where I was robbed of any prize for the 5 time running, but had a great time regardless.

Thanks again to all the hardworking guys who give up there time to make this the best SQL Server event in the UK time and time again. Please add your own memories from the event in the comments and remember to fill in your feedback forms!

See you all again in 6 months