Wednesday, June 23, 2021

Update Stats Sample Rate does not work

Create/Update statistics allows users to speicfy the sample rate. However, the sample rate may not work as you expected in some scenarios.



1.Tables that have not greater than 1024 pages

I'm going to use the table Production.Product  in AdventureWorks2019 to demonstrate

use AdventureWorks2019


create statistics IProductID on Production.Product(ProductID) with sample 20 percent


dbcc show_statistics('Production.Product','IProductID')


In this script, Sample rate is set to 20%.  However, the 'DBCC how_statistics' shows that the 'Rows' equals to 'Rows Sampled', which means it's 100% sampled.





Because for table with less than 1024 pages in the clustered index(if the table is heap, we count the indexid 0) of table,  SQL Server ignores the sample specified and  always  use 100% sampled.

In this case, the Prodcution.Product only has 15 pages, hence it's always 100% sampled.



Please note, sample 0 is an exception. If you specify 0, SQL Server does not create histogram.



2.Tables that have more than 1024 pages

SQL Server guarantee that at least 1024 pages will be sampled. If the sample rate specified is less than 1024, SQL Server will replace it with 1024 pages. If it’s greater than 1024 pages,  SQL Server will use following formula as sample rate:1024/Total Pages.



3.What if sample rate is not specified?

If the pages is greater than 1024, SQL Server picks up from smaller one from following two

  1. TotalPages
  2. 15*power(Rows,0.55)/TotalRows*TotalPages+1024)


For large table, the rows in the 1024 pages can be ignored.

A table has 1,000,000, then 15*power(1000000.0,0.55) =29929 rows will be sampled, almost 29929/1000000=2.9%

A table has 10,000,000, then 15*power(10000000.0,0.55) =106192 rows will be sampled, almost 106192/1000000=1.06%


Posted at