You can find some great information, and an overwhelming load of not-so-good information, about optimizing tempdb performance all over the Internet. This post has one purpose: to warn you of the consequences you will face by blindly following one of those "best practices". Namely, if somebody tells you, without even checking what your storage is and how tempdb is being used, that you should have as many tempdb files as processors, watch out. This recommendation comes from ancient, SQL Server 2000 times, and since then everything has changed:
1. Modern servers have a lot more CPUs than those used more than 10 years ago.
2. SQL Server 2005 introduced, among many other improvements, an important optimization: it may cache temporary objects (temporary tables and table variables) across invocations of a routine (like a stored procedure or a trigger). Because an IAM page and one data page of such a temporary object are kept in cache, they do not need to be allocated and deallocated every time you execute the routine.
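To illustrate, temporary-object caching applies to routines like the hypothetical stored procedure below (the procedure and table names are mine). After the first execution, SQL Server can keep one IAM page and one data page of the temp table cached, so later calls skip the allocation work. Note that caching is disabled if the routine performs DDL on the temp table after creating it, or uses named constraints:

```sql
-- Hypothetical example: #Buffer is eligible for caching because the
-- procedure performs no post-creation DDL on it and names no constraints.
CREATE PROCEDURE dbo.usp_CacheFriendly
AS
BEGIN
    CREATE TABLE #Buffer (Id int NOT NULL, Payload varchar(100));

    INSERT INTO #Buffer (Id, Payload) VALUES (1, 'example');

    -- On subsequent executions the table's allocation pages
    -- come from the cache instead of being allocated in tempdb.
    SELECT Id, Payload FROM #Buffer;
END;
```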
This is why the recent recommendation changed to this:
1. If you have less than 8 cores, the number of tempdb files should be the same as the number of cores (cores, not logical processors).
2. Otherwise, the number of files should be 8.
3. If you are experiencing allocation page contention, add more files in 4-file chunks and keep monitoring for this type of contention. Also consider enabling trace flag 1118 before a large number of files is added (enabling this flag means that SQL Server will allocate full, uniform extents to each tempdb object).
You can also find a slightly different recommendation: that the number of tempdb files should be equal to ½ or ¼ of the number of cores.
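Acting on point 3 above might look roughly like this. A common pattern for spotting allocation page contention is to look for PAGELATCH waits on tempdb allocation pages (tempdb is database_id 2; page 1 of a file is the PFS page, page 2 the GAM, page 3 the SGAM), and then, if contention is confirmed, enable the trace flag:

```sql
-- Sketch: look for latch waits on tempdb allocation pages.
-- Resource descriptions like 2:1:1, 2:1:2, 2:1:3 point at
-- PFS/GAM/SGAM contention in tempdb file 1.
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH%'
  AND resource_description LIKE '2:%';

-- Enable uniform extent allocations for tempdb objects.
DBCC TRACEON (1118, -1);
```

To make trace flag 1118 survive restarts, it would normally be added as a startup parameter (-T1118) rather than enabled with DBCC TRACEON.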
Let's see how this recommendation works for directly attached storage (DAS) when all tempdb files are on the same physical disk.
Fact 1. tempdb is a shared resource, which means it can be monopolized by just a single offending session. As a result, the whole SQL Server instance may be brought to its knees.
Fact 2. tempdb performance is strictly correlated with the performance of the I/O subsystem it uses.
Fact 3. There can be only one filegroup for tempdb. Hence, SQL Server always uses the proportional fill algorithm combined with round robin for this database. As a result, all you can do to optimize tempdb I/O performance is to create files of equal size and keep the same amount of data in each of them.
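In practice, keeping the files balanced means giving every data file the same initial size and the same autogrowth increment, along these lines (file names, paths, and sizes below are illustrative only):

```sql
-- Illustrative sketch: equal initial sizes and equal growth increments,
-- so proportional fill keeps the same amount of data in each file.
ALTER DATABASE tempdb MODIFY FILE
    (NAME = tempdev,  SIZE = 1024MB, FILEGROWTH = 256MB);

ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'D:\tempdb2.ndf',
     SIZE = 1024MB, FILEGROWTH = 256MB);
```

If the files have unequal sizes, proportional fill directs more allocations to the larger file, which defeats the purpose of splitting tempdb in the first place.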
Fact 4. Even very efficient algorithms (like the round robin) have some overhead. Therefore, distributing data across many files is a little bit slower than writing the same amount of data to only one file.
Fact 5. Writing to or reading from a single file will never be slower than simultaneously writing to or reading from many files located on the same DAS magnetic disk (either SATA or SCSI/SAS). The problem is that the opposite is often true: if your disk cannot efficiently handle many simultaneous I/O requests (DAS drives usually can't), its performance can drop by a factor of 10.
Seeing is believing
The test is fairly simple: I am going to simultaneously execute the same query and measure the total execution time. This query is written in such a way that every time it executes, its sort spills to tempdb. As I am using a 4-core, 8-logical-processor Intel i7, tempdb will be split into at most 4 files. This test will be repeated four times:
1. For a single tempdb file (the data and the log file are on the same 10,000 RPM SATA drive).
2. For tempdb divided into 4 equal-size files (all data files and the log file are on the same 10,000 RPM SATA drive).
3. For a single tempdb file (the data and the log file are on the same SSD drive).
4. For tempdb divided into 4 equal-size files (all data files and the log file are on the same SSD drive).
Before each test, SQL Server is restarted: this is the only way to enforce tempdb configuration changes and ensure that tempdb files are equally filled with data.
The first configuration looks like this:
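A single data file plus the log, both on the same drive. You can inspect the layout with a query along these lines (a sketch; sizes and paths will of course differ on your machine):

```sql
-- Sketch: current tempdb file layout (size is stored in 8 KB pages)
SELECT name, physical_name,
       size * 8 / 1024 AS size_mb,
       type_desc
FROM tempdb.sys.database_files;
```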
The query I'm going to use comes from Ramesh Meyyappan; if you are not familiar with it, go to http://www.sqlworkshops.com and see his webcasts for additional explanations:
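I won't reproduce Ramesh's exact query here, but as a generic illustration (my own sketch, not his query), any query whose sort memory grant is too small for the rows being sorted will spill to tempdb. A padded, wide sort over a large generated row set is one way to get that shape:

```sql
-- Hypothetical spill-inducing query: the wide Padding column inflates
-- the sort's row size, so the memory grant tends to be insufficient
-- and the sort spills to tempdb (visible as a Sort Warning).
SELECT TOP (1000000)
       a.object_id,
       a.name,
       REPLICATE('x', 1000) AS Padding
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b
ORDER BY a.name, b.column_id;
```

Whether a given query actually spills depends on the memory available on your instance, so treat this only as a template for experimentation.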
The execution time was 25 seconds.
Now let me change tempdb configuration and restart SQL Server:
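The change itself is just adding three more data files of the same size as the first one, roughly like this (names, paths, and sizes are illustrative, not the exact values from my test machine):

```sql
-- Sketch: split tempdb into 4 equal-size data files on the same drive
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'D:\tempdb2.ndf', SIZE = 1024MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev3, FILENAME = 'D:\tempdb3.ndf', SIZE = 1024MB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev4, FILENAME = 'D:\tempdb4.ndf', SIZE = 1024MB);
```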
This time the same queries needed 30 seconds to execute.
Now let me drop additional tempdb files and move both primary and log file to the SSD drive:
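In T-SQL terms this amounts to something like the following (paths are illustrative; REMOVE FILE requires the files to be empty, which the restart takes care of for tempdb, and the MODIFY FILE relocations only take effect after the instance is restarted):

```sql
-- Sketch: drop the extra data files added in the previous test
ALTER DATABASE tempdb REMOVE FILE tempdev2;
ALTER DATABASE tempdb REMOVE FILE tempdev3;
ALTER DATABASE tempdb REMOVE FILE tempdev4;

-- Point the remaining data file and the log at the SSD
ALTER DATABASE tempdb MODIFY FILE
    (NAME = tempdev, FILENAME = 'S:\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE
    (NAME = templog, FILENAME = 'S:\templog.ldf');
```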
After the restart, the query was executed one more time. This time it completed in 6 seconds.
Finally, I split tempdb into 4 files again, but this time placed all of them on the SSD drive. The execution time stayed basically the same, at 6 seconds.
The following chart summarizes the results:
Lesson learned: to really benefit from best practices, you should not cherry-pick just one of them. For example, if you decide to ignore the one about placing tempdb on a dedicated, fast disk with many spindles, following the other one about splitting tempdb into multiple files may actually hurt you.
In the next part of this series we will see why splitting tempdb into multiple files is considered to be a best practice in the first place.
See you then