Friday, August 20, 2021

How-to determine the timestamp of the last SQL MI failover

In the previous blog posts we have shared the scripts on how to determine your SQL Managed Instance Service Tier virtual hardware properties, and starting with this one we shall be focusing on the High Availability aspect of the Azure SQL MI.


If you are interested in other posts on how-to discover different aspects of SQL MI - please visit the http://aka.ms/sqlmi-howto, which serves as a place holder for the series.

To start the focus on the high availability, let us review the currently available architectural models and the respective service tiers.

Currently, there are two high availability architectural models:

  • Standard availability model that is based on a separation of compute and storage. It relies on high availability and reliability of the remote storage tier. This architecture targets budget-oriented business applications that can tolerate some performance degradation during maintenance activities.
  • Premium availability model that is based on a cluster of database engine processes. It relies on the fact that there is always a quorum of available database engine nodes. This architecture targets mission critical applications with high IO performance, high transaction rate and guarantees minimal performance impact to your workload during maintenance activities.

 

For the currently offered service tiers of SQL Managed Instance we are mapping Standard availability model to General Purpose service tier and the Premium availability model to the Business Critical service tier with a total of 4 replicas within Availability Group – 1 primary and 3 secondary replicas.

For More information regarding the high availability architectural models can be found in our official documentation - High availability - Azure SQL Database and SQL Managed Instance | Microsoft Docs

 

Failover Types

SQL MI Failover takes place under different conditions and can be either forced or unforced.


Unforced Failovers can happen during a maintenance process, such as a host or a database upgrade (and those timings can be controlled by the customer with the Maintenance Window feature), or in case of a power, hardware, or software problem.

Regardless of a reason for an unforced failover, we guarantee high availability of 99.99% through the SLA.

 

The second type of the failovers are the forced ones or also known as user-initiated ones – these are the ones that have been provoked by the user or user-application.  

This scenario of user-initiated failovers applies tor the cases when a SQL MI service tier has been changed (upgraded or downgraded to a different service tier) or any potential SKU upgrade or downgrade (such as changing the number of CPU vCores) – all those operations will provoke a user-initiated failover.

Another possibility for the user-initiated failovers are the Manual Failovers and for more information about this feature please consult the following documentation - Manually initiate a failover on SQL Managed Instance - Azure SQL Managed Instance | Microsoft Docs.

 

The timestamp of the last SQL MI failover

The right way to consult the timestamp of the last  SQL Managed Instance failover is through viewing the timestamp of the current instance start. This information can be found in the sys.dm_os_sys_info DMV under the column [sqlserver_start_time] containing the local system date and time when the respective replica was started.

 

The following query below provides the information on the last failover timestamp as well as on the number of hours that has passed since:

 

select sqlserver_start_time as LastInstanceStart, DATEDIFF(HOUR, sqlserver_start_time, GETDATE()) as HoursSinceFailover
       from sys.dm_os_sys_info;

 

NikoNeugebauer_0-1629470665421.png

On my test Business Critical instance, I have recently tested user-initiated failover and this reflects with a very recent SQL MI start timestamp, and just a little bit under 2 days – 47 hours before the query was issued, but your own SQL Managed Instance will have quite different values, depending on the above mentioned factors, such as the last patching time of the Managed Instance.

 

This little query can be very useful when you are troubleshooting your SQL MI – determining if a failover took place recently can help explaining some of the temporary performance effects, such as an empty Buffer Pool or an empty [tempdb] database.

Posted at https://sl.advdat.com/3D6tF2c