Different approaches of counting number of rows in a table

Today, I received an email from the developer asking if there is a better way instead of the COUNT (*) Transact-SQL statement, to count the number of records in a table. My reply to his question is yes, there are several methods to get this information from SQL Server. However, none of these approaches are perfect, and has its own disadvantages. In this blog post, I will show different methods of count number of rows in a table.

Before discussing the different approaches, I must emphasize that the COUNT (*) statement gives you the true count of the total number of rows in a table. The COUNT (*) statement performs the full table scan on heap table and cluster index scan on tables with clustered index, to get the exact count of the records in a table. Because of this, it can get slower as the table gets bigger, as effectively it is counting each row separately, which includes the rows that contains null values. For more information about COUNT (*), see MSDN resource here.

 Let’s take a look at different approaches of counting number of records in a table:

Approach 1: Counting rows using sys.partitions catalog view

As we know that we can use sys.partitions catalog view to check the structure of the table. This catalog returns one row for each partition of all tables and most types of indexes (except Full-Text, Spatial, and XML are not included in this view) in the database. We can join the sys.partitions catalog view with sys.tables catalog view to quickly get row count for all tables in the database. Here is the sample sys.partitions query, which I use to get the row count for all tables in the database:

SET NOCOUNT ON;
SET STATISTICS IO ON;

-- Ensure a USE [databasename] statement has been executed first.
SELECT SCHEMA_NAME(t.[schema_id]) AS [table_schema]
      ,OBJECT_NAME(p.[object_id]) AS [table_name]
      ,SUM(p.[rows]) AS [row_count]
FROM [sys].[partitions] p
INNER JOIN [sys].[tables] t ON p.[object_id] = t.[object_id]
WHERE p.[index_id] < 2
GROUP BY p.[object_id]
	,t.[schema_id]
ORDER BY 1, 2 ASC
OPTION (RECOMPILE);

Here is output when I run it against AdventureWork2012 system database:

01

Although it is one of the fastest ways to count the number of rows in a table, however, according to sys.partitions documentation, the count is not always accurate. I personally could not find anything on the MSDN that tells me the cases where the count will not be accurate. The only situation in which I found count is not accurate is when I run this query while DML operations are in progress against the table for which I am counting.

Approach 2: Counting table rows using sys.dm_db_partition_stats dynamic management view

Like sys.partitions, we can use sys.dm_db_partition_stats dynamic management view to count the number of rows in a table. This dynamic management view contains row-count information for every partition and displays the information about the space used to store and manage different data allocation unit types.  According to MSDN, the row_count column of sys.dm_db_partition_stats dynamic management view is approximate value, and Microsoft never reveals cases where counts are not accurate. However, similar to sys.partitions catalog view, the only situation in which I found count is not accurate is when I use it while DML operations are in progress against the table for which I am counting. Here is my query, which combines sys.dm_db_partition_stats dynamic management view and sys.tables catalog view, to get the row count for all tables in the database:

SET NOCOUNT ON;
SET STATISTICS IO ON;

-- Ensure a USE [databasename] statement has been executed first.
SELECT SCHEMA_NAME(t.[schema_id]) AS [table_schema]
      ,t.[name] AS [table_name]
      ,SUM(ps.[row_count]) AS [row_count]
FROM [sys].[tables] t
INNER JOIN [sys].[dm_db_partition_stats] ps
     ON ps.[object_id] = t.[object_id]
WHERE [index_id] < 2
GROUP BY t.[name]
	,t.[schema_id]
ORDER BY 1, 2 ASC
OPTION (RECOMPILE);

Here is output when I run it against AdventureWork2012 system database:

02

When I compared STATISTICS IO output of both queries, I noticed that the sys.dm_db_partition_stats dynamic management view query version is slightly faster than sys.partitions catalog view query version. Examine the STATISTICS IO output of both queries that shows that SQL Server is doing a less work to retrieve the row count information using sys.dm_db_partition_stats dynamic management view query version.

STATISTICS IO output for sys.partitions version of query:

Table 'syssingleobjrefs'. Scan count 75, logical reads 151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 77, logical reads 175, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 768, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysrowsets'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

STATISTICS IO output for sys.dm_db_partition_stats version of query:

Table 'syssingleobjrefs'. Scan count 75, logical reads 151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 76, logical reads 164, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 384, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Approach 3: Counting table rows using sp_spaceused system stored procedure

We can use the sp_spaceused system-stored procedure to return space usage information about a database or a table within a database. The syntax for this procedure is:

sp_spaceused [@objectname=['objectname']]
[,[@updateusage=['TRUE|FASLE']]]

You use the objectname parameter to enter the name of a table, indexed view, or Service Broker queue to retrieve space usage information specific to that object. If you set the updateusage to TRUE, DBCC UPDATEUSAGE updates usage information. The default value for this parameter is FALSE. You can run the command without any parameters to retrieve information about the current database:

EXEC [sp_spaceused]

This returns the following information:

  • Current database name
  • Current database size
  • Unallocated space
  • Reserved space
  • Space used by data
  • Space used by indexes
  • Unused space

To view information for a table in the current database, you must include the table name. For example, to look at information for the Person.Address table in the AdventureWorks2012 database, you would run:

EXEC [sp_spaceused] 'Person.Address'

This would return:

  • Table name
  • Number of rows in the table
  • Reserved space
  • Space used by data
  • Space used by indexes
  • Unused space

Here is the sample output in SSMS:

03

It is not necessary to include the parameter names (@parametername =) when running this command.

Finally, here is my script that uses sp_spaceused system-stored procedure to return the table usage information for all database tables, which includes count of rows in the table:

SET NOCOUNT ON;

-- Ensure a USE [databasename] statement has been executed first.
DECLARE @Database		            [nvarchar] (256)
       ,@TSQLCommand01              [nvarchar] (MAX)

SET @Database = DB_NAME()

IF OBJECT_ID(N'TempDb.dbo.#Table_Size_Info') IS NOT NULL
	DROP TABLE #Table_Size_Info

CREATE TABLE #Table_Size_Info (
	 [ID] [int] IDENTITY(1, 1) PRIMARY KEY
	,[ObjectName] [sysname]
	,[NumRows] [bigint]
	,[Reserved] [varchar](30)
	,[Data] [varchar](30)
	,[IndexSize] [varchar](30)
	,[Unused] [varchar](30)
	,[ObjectType] [char](1)
	)

SET @TSQLCommand01 = N''
SET @TSQLCommand01 = N'USE' + SPACE(1) + QUOTENAME(@Database) + N';' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'DECLARE @SQLStatementID02 [smallint] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectSchema         [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectName           [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectFullName [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectType           [char](1)' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'DECLARE @AllObjects TABLE ( [ID] [int] IDENTITY(1, 1) PRIMARY KEY , [ObjectSchema] [sysname] , [ObjectName] [sysname] , [ObjectType] [char](1) , [Completed] [bit] );' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'INSERT INTO @AllObjects ([ObjectSchema], [ObjectName], [ObjectType], [Completed])' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'SELECT  [TABLE_SCHEMA] , [TABLE_NAME] , N''T'' , 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'FROM    [INFORMATION_SCHEMA].[TABLES]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'WHERE   [TABLE_TYPE] = N''BASE TABLE''' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'AND CHARINDEX(N'''''''' , [TABLE_NAME]) = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'ORDER BY [TABLE_SCHEMA], [TABLE_NAME]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'SELECT @SQLStatementID02 = MIN([ID]) FROM @AllObjects WHERE [Completed] = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'WHILE @SQLStatementID02 IS NOT NULL' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'BEGIN' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SELECT @CurrentObjectSchema = [ObjectSchema] , @CurrentObjectName = [ObjectName] , @CurrentObjectType = [ObjectType]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'FROM @AllObjects WHERE [ID] = @SQLStatementID02' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SET @CurrentObjectFullName = QUOTENAME(@CurrentObjectSchema) + ''.'' + QUOTENAME(@CurrentObjectName)' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'INSERT INTO #Table_Size_Info ([ObjectName] , [NumRows] , [Reserved] , [Data] , [IndexSize] , [Unused] )' + CHAR(13) + N'EXEC [sp_spaceused] @CurrentObjectFullName' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'UPDATE #Table_Size_Info SET [ObjectName] = @CurrentObjectFullName , [ObjectType] = @CurrentObjectType WHERE [ID] = SCOPE_IDENTITY();' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'UPDATE @AllObjects' + CHAR(13) + N'SET [Completed] = 1' + CHAR(13) + N'WHERE [ID] = @SQLStatementID02' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SELECT @SQLStatementID02 = MIN([ID]) FROM @AllObjects WHERE [Completed] = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'END' + CHAR(13)

EXEC [sp_executesql] @TSQLCommand01

SELECT *
FROM #Table_Size_Info
GO

Here is the output when I executed against AdventureWorks2012 database:

04

I have also compared STATISTICS IO output of this query with above two queries and noticed that it is much slower than other two queries that I shared in this blog post, but still much faster than equivalent SELECT COUNT(*) version.

I hope you liked this post :-) . Please feel free to use comment section, If you have any question regarding this post. You can also drop me question via twitter (@BasitAali).

7 thoughts on “Different approaches of counting number of rows in a table

  1. These methods are all very fast, but also all flawed. All of them take their information from system objects that are used to track at a physical level, not at a logical level. In other words, they don’t care at all about transactions and consistency.

    Try opening a second window in SSMS. In that window, type
    BEGIN TRAN;
    DELETE FROM HumanResources.EmployeePayHistory;
    go
    ROLLBACK TRAN;
    go

    Select and execute the first part (up to the first go), but do not execute the rollback yet.
    Now switch to another window in SSMS and try counting rows using any of the methods outlined in this post. All of them will report zero rows. Now switch to the first window and execute the ROLLBACK. Logically, the delete has never been executed and the table has never been empty.

    The same can happen with a data modification that rolls back immediately due to a constraint violation. The time slot of the incorrect results is much smaller of course. But to simulate high concurrency, you can have one window with a loop that continuously tries to delete from a table that is referenced by a foregin key constraint, then have another window with a loop that continuously checks the row count in both tables and outputs a message to the console when it sees an impossible number (non-zereo rows in the child table, but an empty parent table). Leave it running for a few minutes, then check the results.

    These methods are all fine if you need a good estimate of the number of rows in the table, e.g. for sizing purposes. But if you need the exact number as input to a business decision making process, SELECT COUNT(*) is the only way to go. (And remember to NOT use the NOLOCK hint or the READ UNCOMMITTED transaction isolation level; they suffer from the exact same problems)

    Like

  2. I have used almost all of the above techniques to count the rows in a table but unfortunately they don’t always give you accurate result. It relies on the statistics of the table. So if the statistics of the table is not accurate then it will give you false result. So I don’t trust these techniques.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s