Max degree of parallelism and maintenance tasks

In this posts I would like to show you a huge drawback of another commonly seen “best practice” — the one about setting ‘max degree of parallelism’ server option to 1.

What does this recommendation come from?

As our machines grow bigger and bigger many database administrators and programmers find out that databases which used to perform well, tend to slow down to the unacceptable level. What makes things even worse, their performance becomes unpredictable — the same query can sometimes finish in seconds, but the other times in minutes.

OLTP databases, the ones in which a lot of small queries are executed by many simultaneous sessions, seem to be particularly vulnerable to those problems. Because they often manifest themselves as CXPACKET wait, the knee-jerk response too often is to disable parallelism at the SQL Server level by setting ‘max degree of parallelism’ to 1. And surprisingly, this seems to work — queries response time is lower and constant again.

The problem is that SQL Server by itself scales really well, and CXPACKET wait means nothing more than parallel operations taking place. So, instead of doing further analysis, i.e. examining query plans and correlating CXPACKET waits with PAGEIOLATCH_SH ones (common pattern that implies large scans and possible lack of indices), we just shut down parallelism.

The hidden cost

Why am I questioning this “silver-bullet” solution? After all, simple and efficient solutions are quite rare and should be appreciated, shouldn’t they? Well, I’m doing this for three reasons:

1. Because it hides problems with database and/or client programs design — it is 21 century which means it is high time to accept and use the multiprocessor machines.

2. Because it hides the lack of database management, especially outdated or missing statistics.

3. Finally, because it can really slow down database maintenance tasks. And this is the reason I am going to show you right now.

Suppose, that one of your tables looks like this one — it is a table with more than 6 million rows that takes 1.3 GB. This table has a clustered, and 5 narrow non-clustered indices (the total space used by those additional indices is about 1.1 GB):

CREATE DATABASE [Test]
 ON  PRIMARY 
( NAME = N''Test'', FILENAME = N''C:Program FilesMicrosoft SQL ServerMSSQL11.MSSQLSERVERMSSQLDATATest.mdf'' , SIZE = 512MB , FILEGROWTH = 50MB )
 LOG ON 
( NAME = N''Test_log'', FILENAME = N''C:Program FilesMicrosoft SQL ServerMSSQL11.MSSQLSERVERMSSQLDATATest_log.ldf'' , SIZE = 150MB , FILEGROWTH = 25MB);
GO

ALTER DATABASE [Test] SET RECOVERY SIMPLE;
GO

USE Test
GO

CREATE TABLE [dbo].[Test](
    [ProductKey] [int] NOT NULL,
    [OrderDateKey] [int] NOT NULL,
    [DueDateKey] [int] NOT NULL,
    [ShipDateKey] [int] NOT NULL,
    [ResellerKey] [int] NOT NULL,
    [EmployeeKey] [int] NOT NULL,
    [PromotionKey] [int] NOT NULL,
    [CurrencyKey] [int] NOT NULL,
    [SalesTerritoryKey] [int] NOT NULL,
    [SalesOrderNumber] [nvarchar](20) NOT NULL,
    [SalesOrderLineNumber] [tinyint] NOT NULL,
    [RevisionNumber] [tinyint] NULL,
    [OrderQuantity] [smallint] NULL,
    [UnitPrice] [money] NULL,
    [ExtendedAmount] [money] NULL,
    [UnitPriceDiscountPct] [float] NULL,
    [DiscountAmount] [float] NULL,
    [ProductStandardCost] [money] NULL,
    [TotalProductCost] [money] NULL,
    [SalesAmount] [money] NULL,
    [TaxAmt] [money] NULL,
    [Freight] [money] NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [CustomerPONumber] [nvarchar](25) NULL,
    [OrderDate] [datetime] NULL,
    [DueDate] [datetime] NULL,
    [ShipDate] [datetime] NULL);
GO

INSERT INTO [dbo].[Test]
SELECT S1.*
FROM [AdventureWorksDW2012].[dbo].[FactResellerSales] S1;
GO 100

USE [Test];
GO

CREATE CLUSTERED INDEX IdxCl
ON [dbo].[Test] ([SalesOrderNumber]);

CREATE INDEX IdxNcProductKey
ON [dbo].[Test] ([ProductKey]);

CREATE INDEX IdxNcEmployeeKey
ON [dbo].[Test] ([EmployeeKey]);

CREATE INDEX IdxNcDates
ON [dbo].[Test] ([OrderDateKey], [DueDateKey]);

CREATE INDEX IdxNcPriceQuantity
ON [dbo].[Test] ([UnitPrice],[OrderQuantity]);

CREATE INDEX IdxNcCosts
ON [dbo].[Test] ([TotalProductCost],[ProductStandardCost]);

EXEC sp_helpindex ''Test'';
GO

SELECT idx.name,idx.index_id,page_count*8/1024 AS SizeMB,index_depth
FROM sys.indexes AS idx
JOIN sys.dm_db_index_physical_stats(db_id(),object_id(''Test''),NULL,NULL,NULL) as ph
    ON idx.index_id=ph.index_id
WHERE idx.object_id =object_id(''Test'');
GO
------
index_name        index_description                    index_keys
IdxCl             clustered located on PRIMARY         SalesOrderNumber
IdxNcCosts        nonclustered located on PRIMARY      TotalProductCost, ProductStandardCost
IdxNcDates        nonclustered located on PRIMARY      OrderDateKey, DueDateKey
IdxNcEmployeeKey  nonclustered located on PRIMARY      EmployeeKey
IdxNcPriceQuantity nonclustered located on PRIMARY     UnitPrice, OrderQuantity
IdxNcProductKey    nonclustered located on PRIMARY     ProductKey

----
name                index_id    SizeMB    index_depth
IdxCl                1            1339    4
IdxNcProductKey      2            199     3
IdxNcEmployeeKey     3            199     3
IdxNcDates           4            223     3 
IdxNcPriceQuantity   5            235     3
IdxNcCosts           6            270     4

 

To give you a whole picture, the database is running in the simple recovery model (so, index rebuild will be minimally logged), the machine has 16 GB of RAM, one 4-core processor with hyper threading enabled and the database files are on a disk that can handle about 50 000 IOPS.

What I am going to test is the index rebuild with the default ’max degree of parallelism’ settings (0) and with the most restrictive one (1). We will start with the default configuration:

SELECT value_in_use
FROM sys.configurations
WHERE name = ''max degree of parallelism'';
GO
-----
value_in_use
0

 

All the indices for this table were recreated in 13 seconds (the statement was executed twice, and the result of the first execution was discarded — I don’t want to measure the disk read throughput).

ALTER INDEX ALL 
ON [dbo].[Test]
REBUILD;

 

Now it is time to set ’max degree of parallelism’ to 1 and repeat the same statements:

EXEC sp_configure ''show advanced options'',1;
RECONFIGURE;
EXEC sp_configure ''max degree of parallelism'',1;
RECONFIGURE;
GO

ALTER INDEX ALL 
ON [dbo].[Test]
REBUILD;

 

By disabling parallelism at the server level I managed to increase index rebuild time to 35 seconds (yes, it was almost three times worse). Let me use the power of visualization to convince you that disabling parallelism may hurt you really bad:

image

Don’t forget to reset server options when you are done:

EXEC sp_configure ''max degree of parallelism'',0;
EXEC sp_configure ''show advanced options'',0;
RECONFIGURE;

 

The better solutions

OK, suppose that you are experiencing high CXPACKET wait, and you are not so sure any more about disabling parallelism. What can you do? There are some solutions:

1. Consider disabling hyper threading, especially for OLTP databases. As a side effect you will increase the cache available for each processor (the amount of cache stays the same, but the number of logical processors will be lower).

2. Make sure that appropriate indices exist and statistics are up-to-date.

3. Consider setting ‘cost threshold for parallelism’ to the higher value. The default value (5) has not changed for more than 15 years and now many queries that should not be executed in parallel have higher estimated cost.

4. Consider setting MAXDOP for the problematic queries.

5. If you are using Enterprise Edition, consider using Resource Governor to limit parallelism for user queries.

Share this article ...

Google Plus
Ihren LinkedIn Kontakten zeigen



This entry was posted in Administration and tagged , , , by Marcin Szeliga. Bookmark the permalink.
Marcin Szeliga

About Marcin Szeliga

Since 2006 invariably awarded Microsoft Most Valuable Professional title in the SQL category. A consultant, lecturer, authorized Microsoft trainer with 15 years’ experience, and a database systems architect. He prepared Microsoft partners for the upgrade to SQL Server 2008 and 2012 versions within the Train to Trainers program. A speaker at numerous conferences, including Microsoft Technology Summit, SQL Saturday, SQL Day, Microsoft Security Summit, Heroes Happen {Here}, as well as at user groups meetings. The author of many books and articles devoted to SQL Server.

Comments are closed.