Tuesday, August 31, 2021

SQL Server service that uses a gMSA fails to start

   

A SQL Server service that uses a gMSA fails to start with the error “the request failed, or the service did not respond in a timely fashion. Consult the event log or other applicable error logs for details”.

 

 

In this post we explain an interesting issue that we encountered when SQL Server uses a group Managed Service Account (gMSA) as its service account.

 

 

Troubleshooting:

 

When we tried to start SQL Server under the gMSA account, the service could not start because of a timeout. One possible reason is that the service account is not configured properly or cannot be authenticated by the domain controllers.

 


 

 

When we checked the Windows Services applet (services.msc), we found that the SQL Server service was stuck in the “Starting” state.


 

 

The SQL Server service runs under the context of a gMSA, and a gMSA relies on the Microsoft Key Distribution Service (KDS, KdsSvc). In our case, however, the "Microsoft Key Distribution Service" service was not started on the domain controller. The following steps explain why a service that uses a gMSA needs the Microsoft Key Distribution Service:

 

  1. During startup, Windows enumerates all automatic services and tries to start them.
  2. When Windows tries to start a service that is configured to use a group Managed Service Account (gMSA), the Service Control Manager (SCM) tries to log on by using the account information for the service.
  3. The logon request is sent to the Local Security Authority process (lsass.exe, LSASS) that is running on the computer.
  4. LSASS receives the request. While handling the request, LSASS tries to do a Lightweight Directory Access Protocol (LDAP) search for the msDS-ManagedPassword attribute.
  5. When the LDAP request is performed on a domain controller, the LDAP query can be sent back to the local server, where it is handled by a different thread in LSASS, which is the same process that issued the query.
  6. The LDAP server thread calls in to the Microsoft Key Distribution Service Provider (kdscli.dll), which tries to find the RPC endpoint of the server component, the Microsoft Key Distribution Service (KdsSvc), through the RPC endpoint mapper (EPM).
  7. Because the KdsSvc service is set to be triggered as soon as one of these RPC queries occurs, the service should start (in theory). However, because the SCM is still busy starting the first service and it can start only one service at a time, KdsSvc never gets started, and the SCM hangs.
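
The blocking described in the steps above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how the real Service Control Manager is implemented: the function name simulate_scm and its parameters are hypothetical, and MSSQLSERVER is used purely as an example service name. It shows a starter that handles one service at a time and therefore cannot start a helper service on demand while it is still busy with the service that needs it.

```python
def simulate_scm(auto_start, on_demand_needs, already_running=()):
    """Toy model of a service starter that handles one service at a time.

    auto_start:       services the starter tries to start, in order.
    on_demand_needs:  maps a service to a helper service that must be
                      started on demand before the first one can finish.
    already_running:  services that were running before startup began.

    Returns (running, blocked), where blocked is None when everything
    started, or a (service, helper) pair describing the hang.
    """
    running = list(already_running)
    for service in auto_start:
        helper = on_demand_needs.get(service)
        if helper is not None and helper not in running:
            # The starter is busy with `service`, so it cannot start
            # `helper`; `service` in turn waits for `helper`. No progress.
            return running, (service, helper)
        running.append(service)
    return running, None


# Hypothetical scenario: the gMSA-based service needs KdsSvc, which is not running.
stuck = simulate_scm(["MSSQLSERVER"], {"MSSQLSERVER": "KdsSvc"})
# Same scenario, but KdsSvc was already running beforehand.
ok = simulate_scm(["MSSQLSERVER"], {"MSSQLSERVER": "KdsSvc"},
                  already_running=["KdsSvc"])

print(stuck)  # ([], ('MSSQLSERVER', 'KdsSvc'))
print(ok)     # (['KdsSvc', 'MSSQLSERVER'], None)
```

The second call shows why the problem disappears once KdsSvc is already running: the gMSA logon no longer needs the starter to do two things at once.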

 

On the domain controller side, we observed that the KDS service itself failed to start. When we started the KDS service manually, we saw the following error:

 

C:\>net start kdssvc
The Microsoft Key Distribution Service service is starting.
The Microsoft Key Distribution Service service could not be started.

A system error has occurred.

System error 1064 has occurred.

An exception occurred in the service when handling the control request

 

We also observed that the domain controller was in the Computers container instead of the Domain Controllers OU.

 

 


 

 

 

 

Cause:

 

This issue occurs because KDS assumes that domain controllers are located in the Domain Controllers OU, not in another OU or in the Computers container. We moved the domain controller (DC) back to the Domain Controllers OU and then started the KDS service.
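
As a quick sanity check, the location of a DC can be read from its distinguished name. The sketch below is a minimal illustration, assuming you already have the DC's distinguished name as a string; the function name and the sample names (DC01, contoso.com) are hypothetical, and the check assumes the DC object sits directly under the Domain Controllers OU.

```python
def is_in_domain_controllers_ou(distinguished_name):
    """Return True if the object's immediate parent is OU=Domain Controllers.

    Simplification: the DN is split on plain commas, so escaped commas
    inside name components are not handled.
    """
    parts = [part.strip() for part in distinguished_name.split(",")]
    if len(parts) < 2:
        return False
    # parts[0] is the object itself (CN=...), parts[1] is its parent.
    return parts[1].lower() == "ou=domain controllers"


# Hypothetical distinguished names, for illustration only.
print(is_in_domain_controllers_ou(
    "CN=DC01,OU=Domain Controllers,DC=contoso,DC=com"))  # True
print(is_in_domain_controllers_ou(
    "CN=DC01,CN=Computers,DC=contoso,DC=com"))           # False
```

A DC whose name fails this check matches the misplacement described above and is a candidate for being moved back.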

 


 

 

 

C:\>net start kdssvc
The Microsoft Key Distribution Service service is starting.
The Microsoft Key Distribution Service service was started successfully.

 

The SQL Server service is now able to start with both service accounts, and the issue has been resolved.

 


 

 

 

Please refer to:

https://support.microsoft.com/en-za/help/4294429/service-using-gmsa-account-doesn-t-start-on-windows-server-2012-r2-dc

 

Resolution:

 

Move the domain controller (DC) back to the Domain Controllers OU and start the KDS service.

 

 

 

 

Recommendation:

 

From an AD perspective, we always recommend not moving DCs out of the Domain Controllers OU, because the Default Domain Controllers Policy linked to that OU assigns many different user rights. Moving a DC to another OU or container may cause unexpected errors. If the KDS service is already up and running when the DC is moved to a different OU, the issue does not appear immediately. However, when the machine is later rebooted, the service will not start correctly because of the behavior described above.

 

Please refer to:

https://support.microsoft.com/en-us/help/3094486/kds-doesn-t-start-or-kds-root-key-isn-t-created-in-windows-server-2012

 

Fallback Option:

 

The fallback option for the SQL Server service is to use the NT Service\MSSQLSERVER account.

 

 

 

 

Author:  Saniya Samreen – ARR Support Engineer, SQL Server on Azure VM, Microsoft

Reviewer: Joseph Pilov – Escalation Engineer, SQL Server, Microsoft

 

 

Posted at https://sl.advdat.com/2WEpmLe