Upcoming AI talks and workshops in Denmark

I will be traveling to Denmark to give a few talks and workshops on the topics of AI, machine learning, and deep learning.

  • 29-30 May – Intelligent Cloud Conference in Copenhagen, where I will give two talks: Introduction to deep learning and Microsoft Cognitive Framework and AI in the cloud: Machine learning, cognitive services, and bots.
  • 5 October – Emerging AI developer in Lyngby, Denmark.

Hope to see you there!


Integration Services (SSIS) internals and performance tuning

Back in 2014 I created a 214 page deck for a pre-con at SQLSaturday Johannesburg and Cape Town. Since then I’ve presented it at a good number of large and small conferences.

As I don’t really work much with SSIS anymore, I thought I’d publish the entire deck here, hoping someone will find value in it.

The material is quite text-heavy, so it should be readable, and it is compiled from a number of sources (see the last four slides). Also, please note that it doesn’t cover SSIS 2016 and some of the buffer optimizations that came in that version.


SQL Server Performance Troubleshooting Free Scripts and Tools List

Back in the day, I used to collect a lot of different scripts, tools, and other goodies for troubleshooting SQL Server performance issues. These days, however, I tend to use what is publicly and freely available (as well as some internal stuff), and keep a list of those in my head.

I’ve been meaning to write that list down for a while, and today Chrissy asked:

So here it is…

Disclaimer: While I do work as a Premier Field Engineer for Microsoft, this is my list – this is not an official list from my employer. Others likely have a list that differs from mine.

Free scripts and tools from Microsoft
These are scripts and tools provided for free by Microsoft that I use. Some of them come “as-is” without any official support.

  • BPCheck – This is by far my favorite script and one I use all the time. I like to joke that there is one good thing and one bad thing about this script. The good thing is that it gives me a lot of information. The bad thing is that it gives me a lot of information. It can be quite overwhelming the first time you run it, but once you have used it a few times, it is really useful.
  • Pssdiag/Sqldiag Manager – This tool creates a customized pssdiag utility that uses sqldiag to collect diagnostics data, such as profiler traces, extended events, perfmon and DMV information. I use this all the time for data collection when troubleshooting performance problems. There is also a Linux version available (see this blog post for how to use it).
  • SQL Nexus – Loads and analyzes performance data collected by Pssdiag. It loads the data into a database and does some reporting on top of that. Very useful.
  • PAL – The Performance Analysis of Logs (PAL) tool reads a perfmon counter log and analyzes it using different thresholds. It generates an HTML based report (or XML, if you prefer), which shows which counters to look out for. It has templates for SQL Server workloads, but works with a lot of other workloads too.
  • RML – The Replay Markup Language (RML) utilities can be used to analyze profiler traces or replay the trace file against another instance of SQL Server. For how to use it, read this blog post.
  • WinDbg – Windows Debugger. Useful for debugging a memory dump.
  • Tigertoolbox – This is the repository for the SQL Server Tiger Team and contains a lot of good stuff. Maybe you need some SQL Performance Baseline reports, need to fix your VLFs, or want to view the waits and latches. It is also the home of BPCheck – go check it out!
  • diskspd – Storage load generator / performance test tool. It replaces SQLIO. I use this if I want to benchmark the IO subsystem of a new machine / VM. There are a couple of good blog posts about how to use diskspd here and here.
  • SQL Server Management Studio – SSMS is now a standalone application (it used to ship with the SQL Server installation), and got quite a few improvements in the newer versions.
  • sysinternals – Tools for advanced troubleshooting on Windows, which I use from time to time.
  • SQL Server Data Tools – Not really a tool that I use for performance troubleshooting, but it is in my toolbox for when I have to look at an SSIS package.
  • Visual Studio Code – My go-to editor for… pretty much everything.

Free scripts, tools, and resources from the community
I also use a lot of stuff that is made available by the awesome SQL Server community.

  • SQL Server Diagnostic Information Queries – Before I discovered Glenn Berry’s DMV queries, I used to have my own script collection. Not anymore. Glenn updates his queries every month. While I personally prefer to use BPCheck to get the information from the DMVs, I find that Glenn’s scripts might be a bit better structured and easier to use.
  • sp_whoisactive – Think sp_who2… just much, much better. This script by Adam Machanic shows which statements are being executed right now, and can include the estimated execution plan, acquired locks, tempdb and resource allocation, and more.
  • sp_helpindex rewrite – sp_helpindex that ships with SQL Server does not show newer functionality (such as included columns and filters), so Kimberly Tripp rewrote it to include this information. Also, be sure to read her blog post on indexing strategies.
  • Ola Hallengren’s SQL Server Maintenance Solution – Ola’s scripts are great and really easy to use, and make life easier for a lot of DBAs around the world. The index and statistics maintenance part of the script is an easy way to reduce index fragmentation and make sure your statistics are up to date.
  • Plan Explorer – While SSMS can show the execution plan just fine, Plan Explorer makes it just a little easier to read.
  • Statistics Parser – Parses the output of SET STATISTICS IO and SET STATISTICS TIME and makes it a lot easier to read and analyze.
  • Poor SQL – Ever got a SQL query that is just really poorly formatted? Or even just one line? This site will help you format it into something readable.
  • dbatools – I don’t use these PowerShell modules for performance troubleshooting, but they are useful for checking best practices and other DBA tasks.
  • SQL Server Version List – I use this up-to-date list of SQL Server versions to check if any service packs or cumulative updates are missing.
  • SQL Server Wait Types Library and Latches Classes Library – While I do remember the most common wait types and latch classes that I see all the time, I sometimes run into one that I haven’t seen before or just can’t remember what it is. Paul Randal’s libraries are really useful resources for looking those up.
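As an example of why I like sp_whoisactive so much, a typical invocation might look like this (@get_plans and @get_locks are documented parameters of the procedure; check its built-in help for the full list):

```sql
-- Show currently executing statements, with plans and lock information.
EXEC sp_whoisactive
    @get_plans = 1, -- include the estimated execution plan for each request
    @get_locks = 1; -- include acquired locks as XML
```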

That’s it! If you have a free tool or script that you use (or wrote yourself), feel free to leave a comment – would love to check it out!


Data movement when partitioning existing table in SQL Server

The other day, Jørgen Guldmann (@jorgenguldmann) asked if data would move if we partitioned an existing table. The table was quite large, billions of rows, so moving data could take quite a bit of time. He wanted all the existing data in one partition, and data loaded in the future to be placed in new partitions. I have seen this with a number of Data Warehouses, where the partitioning strategy for the fact tables was designed and applied after the database had been put into production and been running for a while. So, I thought I would test it out to make sure I would provide a good answer.

In one of my previous blog posts, I wrote about when a partition split moves data. So the question is: would we see the same behavior, with LOB_INSERT_ROWS and LOB_DELETE_ROWS entries in the transaction log?

tl;dr: We don’t see data movement with LOB_INSERT_ROWS and LOB_DELETE_ROWS entries in the transaction log. However, the creation of the clustered index copies the data pages from the old structure (heap or existing clustered index) to the new structure – even if it is on the same filegroup. So yes, data will move, and it can have a significant performance impact.

In my test, I have a small-ish heap called Facts.Sales, with 1,000,000 rows, in a database called PartitionTest. I have then created a right range partition function SalesPF on the TransactionDate column and a partition scheme SalesPS, which uses the same filegroup as the heap. The partition function has its boundary value to the right of the existing values in TransactionDate, making sure that all existing data will end up in the first partition.
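As a sketch, the setup described above could look something like this (the data type, boundary value, and filegroup mapping are my assumptions for illustration, not the exact test script):

```sql
-- Illustrative setup: right range partition function and matching scheme.
-- Data type, boundary value, and filegroup name are assumptions.
CREATE PARTITION FUNCTION SalesPF (date)
AS RANGE RIGHT FOR VALUES ('2015-01-01'); -- boundary to the right of all existing TransactionDate values

CREATE PARTITION SCHEME SalesPS
AS PARTITION SalesPF ALL TO ([PRIMARY]); -- same filegroup as the heap
```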

To partition the existing table, we need to create a clustered index on the heap, placed on the partition scheme. If the table already had a clustered index, we would do the same thing by recreating the index WITH (DROP_EXISTING = ON).

But before we create the clustered index, and partition our table, let’s take a look at the pages the data is located on in our heap by using DBCC IND:

DBCC IND ('PartitionTest', 'Facts.Sales', 1);

This shows that data is placed on the data pages between page 8 and page 3131.

Now, let’s create the clustered index on the partition scheme SalesPS:

CREATE CLUSTERED INDEX CIX_Sales -- the index name here is illustrative
ON Facts.Sales(SalesID, TransactionDate)
ON SalesPS(TransactionDate);

After this, DBCC IND shows us that the data has been moved to the data pages between page 24,600 and page 27,599.

So, data has been moved.

If we take a look at the Books Online article for index disk space, it states that:

Whenever an index is created, rebuilt, or dropped, disk space for both the old (source) and new (target) structures is required in their appropriate files and filegroups. The old structure is not deallocated until the index creation transaction commits. Additional temporary disk space for sorting operations may also be needed.

This aligns with what we saw in the above test.

If we take a look at the transaction log using fn_dblog for the entries related to the transaction for the clustered index creation, we only get 3,123 entries. These entries are related to metadata operations and extent allocation. This is a very small footprint on the transaction log, compared to when data is moved due to a partition split. I reran the test with data placed in several partitions, and the result was the same. That being said, large-scale index operations can generate large data loads that can cause the transaction log to fill quickly.
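For reference, a simple way to get an overview of what is in the log is to aggregate the fn_dblog output (fn_dblog is undocumented and unsupported, so use it on test systems only; this aggregate is just one way to slice it):

```sql
-- Summarize transaction log records by operation and context.
-- fn_dblog is undocumented; run this in a test database only.
SELECT Operation, Context, COUNT(*) AS LogRecords
FROM fn_dblog(NULL, NULL)
GROUP BY Operation, Context
ORDER BY LogRecords DESC;
```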

In conclusion, when designing Data Warehouses and large fact tables, it is a good idea to design the partitioning strategy up front rather than apply it after the database is already in production and the fact table contains a large amount of data. Partitioning an already existing table will cause data movement when creating the clustered index, which can have a significant performance impact.


First time speaker at PASS Summit: How I prepared

PASS Summit 2014 was the first time I was selected to speak at a truly major tech conference. While I have spoken at over 30 conferences over the past two years, some of them with several hundred participants, this one was different; PASS Summit had 5,900 registrations. My session had 233 attendees, and the room felt pretty packed, with a few people standing in the back and sitting on the floor.

PASS Summit 2014 talk

The talk selected was called Integration Services (SSIS) for the DBA, a talk I had already delivered at SQLBits XII earlier that year. While my speaker evaluation from SQLBits was very good, I was not quite happy with the talk as it was. Therefore, I decided to restructure it.

And it paid off. Big time.

My speaker evaluation score was well above average, and one of the best I have ever got. Some of the comments in the evaluation were:

AWSOME presentation

The session was very well balanced. Great amount of slides and demos.

Fantastic, well prepared session, with great demos incl. Game of thrones demo!!

Explanation of version differences, in particular, was enlightening. Thanks.

Some very useful management techniques learned

David was extremely engaging and made the topic fun for everyone in the session.

Structuring the presentation

Structure is everything. When I wrote my dissertation at the University of Brandford, my supervisor introduced me to the following structure:

  1. What?
  2. How?
  3. So what?

Another structure is the three-act structure, which is mostly seen in writing, screenwriting, and storytelling.

I used the structure I had learned at university, and ended up with this:

  1. What is SQL Server Integration Services (SSIS)?
  2. How do I deploy, configure, and execute SSIS packages?
  3. How do I monitor and troubleshoot SSIS packages?

While the last part is not a “so what?” question, I thought that it would still fit into that category. At least close enough.

I also planned a five-ish minute introduction, where I introduced myself, the topic, who the talk was for (I wanted to make sure that people knew it was a level 200 DBA-focused talk, and not a deep dive into the topic from a developer point of view), and the three things above that they would learn.

Before I started working on the slides or demos, I wrote down a detailed plan for everything in Evernote. I find that I am much faster at creating slides and demos when I know exactly what it is I need to do. Also, writing these things down in plain text makes it easier for me to think about the content and structure than if I was doing it in PowerPoint.


I am not a big fan of slides with a lot of text. I know some people can pull it off, like Kimberly Tripp (twitter | blog) and Paul Randal (twitter | blog), but most people can’t; myself included. I do it for my pre-con material, because it is a whole day training, and it is nice for the audience to be able to go back and go through the notes. However, not for short talks.

Instead, I prefer a big font with just a few words. I am a big fan of Zach Holman’s slide decks, and would really like to be better at creating beautiful slide decks.

The slides would follow the structure that I had written down in Evernote.

I used PowerPoint, although it is not my favorite slide deck tool. I have done things on my Mac in Keynote and Deckset as well, which is a lot easier. However, I work in a Windows world where PowerPoint is the de-facto standard, so I have come to learn to live with it.

When using PowerPoint, I do the following:

  • Disable the presenter view. With presenter view enabled, the laptop goes from mirroring the screen to extending it. When I go out of the presentation to do a demo, the laptop goes back to mirroring the screen. The problem with this is that the projector will need to readjust. Best case, this gives a couple of seconds of black screen and a bit of flickering for another second or two. Worst case, it messes up the resolution. Don’t use it; use speaker notes on a separate device (iPad, Surface, piece of paper) instead.
  • Disable the Slide Show Popup Toolbar. I know others who like to draw on their slides, and they should probably keep it. I don’t, and I don’t want it to show up if I accidentally hit my mouse.
  • Disable the mouse cursor during the presentation. I don’t need it, and it is an annoyance if I accidentally hit my mouse or trackpad.


My demos took place in SQL Server Management Studio and SQL Server Data Tools. I tried to put in a bit of geekiness and humour, rather than just using the standard demo databases. For example, I got hold of a Game of Thrones data set that I used, and had Dr. Who quotes in my data flow. It spiced things up a bit, and worked really well.

The demos should be complex enough to prove your point, but simple enough for the audience to follow. I prefer to make simpler demos, rather than big complex demos. Fewer things can go wrong.

In my first demo, I started designing a package from scratch. While Buck Woody (blog) says “never type in demos”, I like how others do type in demos; it brings a bit more life to it. So, I drag-and-dropped my first demo, but only halfway. Then I brought in a finished package, so the audience did not have to see me drag-and-drop for 15 minutes. It worked very well, and much better than if I had only used the finished package to demonstrate how SSIS works.

The rest of my demos were pre-scripted in T-SQL.

Other things I did were:

  • Have a reset strategy. Although it was not one-click, I could reset my demo in about 30 seconds.
  • I had configured SQL Server Management Studio to use a larger font size, display the results in a separate tab, include the query in the result set, and show line numbers. I typically go with size 16, but the resolution was higher than I had been told, and it was hard to read from the back of the room, so I bumped it up to 18 when I did my tech check.
  • I used to use ZoomIt, but it had some things that bothered me. It did not feel smooth when being used. Instead I switched to Windows 8.1’s built-in magnifier, which can do multi-level zoom while still demoing, and seems smoother than ZoomIt. However, it does some funny things when being used inside VMware Workstation, as the mouse cursor sometimes shows up double (one inside the VM, and one at the host system) when it hovers over the VMware title bar.
  • I considered using bigger fonts in Visual Studio as well, but settled for using the magnifier. I think that worked better.

Be Utterly Prepared (No excuses)

As Scott Hanselman (blog | twitter) writes in a blog post about being prepared for absolute chaos at technical presentations: “Be Utterly Prepared (No excuses)”.

I try to be prepared for every talk I do, but this time I took it to another level. I have probably put in at least 80 hours into this one hour and 15 minute talk; not including the time spent on the first version for SQLBits.

And again, it paid off. Everything ran very smoothly. Here are some of the comments in the speaker evaluation:


Presenter was very knowledgeable about the subject. He kept the smooth pace throughout the session and kept it interesting.

The session was awesome. Very interactive. He kept audience awake and engaged in the presentation. He was relaxed but at the same time very focused. He presented enormous amount of information but in a very easy to follow and understand way.

The audience were kept well engaged- excellent use of humour and great slides.

Excellent! You really made a connection with your DBA audience!

Great personality. He enjoys his work & it shows in his lectures.

He had a really god connection with the crowd

Speaker had great energy and was very passionate and knowledgeable about the material. Would have bought whatever had he been a sales guy. Great job.

Very well done presentation; good match of material to time allotted. Pace was right on. Nice job!

Awesome sample data. We love the disarming, personal touch.

The speaker was very knowledgeable on the content and did an excellent job repeating questions and answering them.

David was extremely engaging and made the topic fun for everyone in the session. Loved it!

Here are some of the things I did to be utterly prepared:

  • I wrote down detailed speaker notes in Evernote. Not every sentence I wanted to say, but keywords and sentences in a bullet point format. This way I knew what to say when. And then I had the speaker notes on my iPad on stage. Nobody seemed to notice, as I simply glanced at it from time to time, and it helped me not forget anything. It was the first time I did this, but definitely not the last time. It really helped me remember some of the details.
  • Backup speaker notes. Yes, even with my speaker notes on my iPad, I had a few details I kept forgetting when doing my trial talks. I wrote them down on a few pieces of paper, in large writing, and placed them on the table on stage.
  • I practiced. And then I practiced some more. Until I knew my talk inside out. I always do a few practice runs before a talk, but this time I did it significantly more times than usual. And it paid off. I am not very good at winging it, mostly because I believe that structure is so fundamental to a good talk; and to be honest, I would feel a bit like speaker #47 if I did.
  • My demos worked. But oh my, did I have a lot of demo fails during my trial runs. Minor stuff mostly, but still something that annoyed me. Trying them out over and over and over again helped me get rid of those glitches, and made everything run very smoothly.
  • I ran my demos inside VMware Workstation. One of the benefits of using a VM is that it is easier to move the demos to a backup computer. However, I really want to fix the issues I mentioned above with the magnifier (ZoomIt was even worse).
  • I brought not one, but two backup laptops. Both with identical setup (slides and VM for demos) as the primary laptop. And am I happy I did! Because both of them failed. Luckily my primary laptop worked just fine.
  • I am getting really good at time management. I made sure that I had a few extra demos in case I had too much time, but also made sure that they were not demos that were essential to the talk. I ended up using one extra demo, and cutting three others. I used to have my iPad standing up showing a clock, but one of the best things I’ve bought is a Logitech Professional Presenter R800, which has a built-in LCD display that shows the remaining time.
  • I decided that I didn’t want to end my talk by taking questions. This was a tip I learned from Scott Hanselman, as it is better to end the talk on a high. Instead I took questions throughout the session. I was so happy with this decision, because what happened for most other presenters I saw taking questions at the end was that the audience started walking out during the questions. Not a few of them, but the majority. Instead my talk ended with a link to the slides and demos, a “thank you so much for attending and have a great conference”, and applause from the audience.

On the day

One of the most important things I do when I arrive at a conference is a tech check. While PASS Summit did have a projector in the speaker ready room, I went during the lunch break to the actual room I had to present in, and did a tech check there. There was trouble with the projector, as the resolution was not the one we were told (which unfortunately showed in a number of other talks, where the screen resolution was off and some of the screen was cut off). I got a technician to take a look at it, and got it fixed within 5 minutes or so. Don’t wait until right before the talk to do a tech check. Never. Things go wrong all the time (I have learned this the hard way).

I now have a pretty big bag with different cables, converters, USB sticks, batteries, an extra mouse, and an extra presenter. I didn’t need any of it, but in the past I have used all of it at one time or another.

While doing my tech check, I went several times to the back of the room, and made sure everything was working the way it was supposed to.

  • Does the resolution look correct? Is the image not stretched? I had trouble here, and needed to get a technician. The problem was that my laptop couldn’t figure out the correct resolution automatically.
  • Is something cut off on the projector? More often than not, the bottom or one of the sides of the screen is cut off, and that happened here too. After changing the resolution and pressing auto adjust on the projector, this was fixed.
  • Was the font on the slides big enough? It was.
  • Did the colours look good? Are they readable? Are the photos and other graphical elements sharp and easy to see? Check, check, check.
  • Was the font in the demos big enough? It wasn’t, so I bumped it up.
  • Did the magnifier work as it was supposed to? Yup.
  • Was the Logitech presenter and mouse working and fully charged? Yes.
  • Was the laptop fully charged and plugged in (and actually charging)? Always check that it is actually charging – I have run out of battery mid-talk before.

During the talk

I always try to start right on time (I am Danish that way), and end on time. Sometimes I small talk with the audience before the talk, sometimes I don’t. If I do, I am usually being a bit silly or telling a funny story. This time I didn’t. Next year, I will, as I think it works pretty well.

I followed the prepared script pretty closely, but still allowed time for questions throughout the talk and went a bit off-script here and there. Not much, but a couple of the jokes were not scripted (some were, though).

Most importantly, I had a lot of fun. The first couple of minutes are always the hardest. Will they like it? Is everything working fine, or will the laptop blow up (or just run out of battery) on stage? But after that, I am having a blast. One of the best comments I had was that I really take my personality with me onto the stage. I try to do that, as I want to show people that I am having fun doing this.

To sum up, I simply try to not waste everybody’s time for an hour and 15 minutes by being prepared.

Further reading:

Here are a few blog posts and online courses that I recommend to anyone serious about public speaking at tech conferences. They helped me a lot.


Upgrading to SQL Server 2014 – Campus Days talk (slides in Danish)

Below you will find slides, demos, and references for my Campus Days talk on “Upgrading to SQL Server 2014”. The slides are in Danish, as it was a Danish conference.



Demos can be found on GitHub.

Further reading / references


Integration Services (SSIS) for the DBA

Below you will find slides, demos, and references for my PASS Summit 2014 talk on “Integration Services (SSIS) for the DBA”.



Demos can be found on GitHub.

Further reading / references
