An Overview on SQL Server Checkpoints

checkpoint

Commercial database systems like SQL server have many recovery mechanisms to restore data hardware or software failure. Checkpoints are part of such recovery mechanism.SQL Server Database Engine is programmed to perform changes or modifications to database pages in buffer cache (memory) first and after some processing the modification are written on the disk files. This “processing” involves Checkpoint which writes all the dirty pages existing on Buffer Cache to Physical disk. It also enters log records from buffer log to physical file. Checkpoint in SQL Server was introduced to reduce the time required for recovery during an unexpected shutdown or system failure. Database Engine issues a checkpoint for each database on regular intervals.

Various factors and conditions (depending upon the Recovery Interval settings done) commit a checkpoint to issue. It can be an execution of ALTER DATABASE command, manual execution of CHECKPOINT, Server clean shutdown, or even in case SQL database is in SIMPLE mode and its log is 70% full.

Talking about Checkpoints, a regular definition states that once a transaction is made checkpoint writes about it. But this is not true, checkpoint writes about all the pages which have been changed (marked dirty) since the last checkpoint. It doesn’t depend on transaction, whether it is committed or not. tempdb is an exception where data pages are not written to disk as a part of checkpoint. Below section will elaborate in detail, what exactly happens when checkpoint is triggered;

Working Operation of CHECKPOINT

Changes done in memory of respective databases are checked as per last checkpoint and then all the dirty pages of databases are written to the disk. It is independent of the state of the transaction which has made the changes. Then, all the log records along with the most recent log record describing changes made in the database are written to disk first before page is written to disk assuring the recovery can be done through this write-ahead logging.

Sequential entries are made to the log from all the transactions. And it is not possible to write selective records to disk. Thus, when a user writes a dirty page to disk even if having only single log record creating problem in it, all the log records prior to this log record will be written on the page.Later, generation of log records with information about checkpoints takes place.The LSN (Log Sequence Number) of checkpoint is recorded in boot page of that database in the dbi_checkptLSN field along with other critical information. If SIMPLE Recovery mode is assigned, the VLFs in log are checked if it is possible to mark them inactive. These tasks are independent of what type of CHECKPOINT has been made (Manual or Automatic).

Categories of SQL Server Checkpoints

Automatic

Automatic Checkpoint is the most common one, and it issues automatically in the background as per the settings done in Recovery Interval server configuration option. This Recovery Interval parameter is defined at server level. By default, this parameter value is 0(zero) in which target recovery interval is 1 minute. Automatic checkpoints are throttled on the basis of number of outstanding writes and on the fact whether Database Engine senses any rise in write latency above 20 milliseconds.

The following is the query to define the [recovery interval]:

USE [master];
GO

EXEC [sp_configure] '[recovery interval]', 'seconds'
GO;

Indirect

Indirect Checkpoints were added in SQL Server 2012 and this also runs in the background but the difference is it runs on the basis of user-specified target time for recovery for respective databases. If user has used ALTER DATABASE to set TARGET_RECOVERY_TIME as >0, it will be used overriding the Recovery Interval specified at server level completely, avoiding Automatic Checkpoint for that Database. It has been observed that Indirect checkpoint are faster and provide more predictable recovery time as compared to automatic checkpoints.

Recovery time required for database recovery is reduced when indirect checkpoints are preferred. This is done by factoring in cost of random I/O during a REDO operation. These checkpoints also reduceassociated I/O thwarting by continuously writing dirty pages to disk.

Here is the syntax of the query to produce indirect checkpoint:

USE [master];
GO

ALTER DATABASE … SET TARGET_RECOVERY_TIME = target_recovery_time { SECONDS | MINUTES}
GO;

Manual

As the name defines, this command runs like any other T-SQL statement and once issued it will run to its completion. It must be noted that Manual Checkpoint will run for the current database only. Checkpoint_Duration can also be defined in seconds at database level, which defines the time to complete checkpoint and is optional.

Here is the syntax to issue manual checkpoint:

USE [master];
GO

CHECKPOINT [ checkpoint_duration ]
GO;

Internal

User cannot control these Internal Checkpoints. This is issued by various server operations like; backup & database-snapshot creation ensuring that the images taken synchronize with the state of log. Following events will follow up with generation of internal checkpoints;

When databases are added or removed with ALTER DATABASE command. It is triggered when backup of a database is taken or database snapshot is created explicitly or internally. A clean Shutdown (Shutdown with NOWAIT) will also trigger internal checkpoint. Changes in Recovery Model from Full\Bulk-logged to Simple will also initiate internal checkpoint.

How Recovery-Interval Stimulates Recovery Performance?

Normally, default values are enough to provide optimum recovery performance.But changing the recovery interval can be practiced to improve performance. You can opt to change the interval time in some conditions like; if the recovery process takes longer than 1 minute when long-running transactions are not rolled back. Frequent checkpoints are ruining performance, as under high frequency of running checkpoints performance of SQL Server can drop due to heavy I/O activity.

In case recovery interval setting has to be changed, it is recommended to increase it slowly with small increments. You can examine the relative recovery performance with gradual increase in recovery interval. This will help you get effective results.

Conclusion:

Checkpoints are useful repository of information and serves best for recovery of SQL server databases. This article shows Checkpoint’s importance, relevance and categories of checkpoints. Practicing a better checkpoint method will not only enhance the server performance but it also renders a better recovery plan. Recovery Interval can be chosen as per I/O activities and significance of the databases; it must be up-to-date and at the same time should not hamper the performance of server. One thing must be considered that the checkpoint log records are never overwritten by successive checkpoints. It will be only overwritten when log wraps and VLFs (Virtual Log Files) are re-used.

About the author:

This is the guest post by Andrew Jackson, a US based SQL Server DBA, who is currently working for SysTools Group, a company that provides its services across a diverse range that includes data recovery, digital forensics, and cloud backup. 

He can be contacted through his blog or via LinkedIn.