Monday, March 7, 2022

How to connect ADF to SQL Server on Azure VM using Private Endpoint across different VNETs

Introduction:

In this blog article, we walk through the technical steps to connect Azure Data Factory to SQL Server on an Azure virtual machine across different subscriptions and VNets.

 

Use Case:

A customer wants to connect Azure Data Factory in one subscription to SQL Server on an Azure virtual machine (SQL VM) in another subscription. The architecture diagram below illustrates the setup.

Ahmed_S_Mahmoud_0-1646662095708.png

Solution:

 

The setup consists of four main steps:

-- VNet peering

-- Data Factory private endpoint

-- Private DNS zone resolution

-- Self-hosted integration runtime

 

  • VNet Peering
Ahmed_S_Mahmoud_0-1646665318638.png

 

Set up VNet peering between the two VNets by following the steps in this article:

Connect virtual networks with VNet peering - tutorial - Azure portal | Microsoft Docs

 

More information can be found at:

Azure Virtual Network peering | Microsoft Docs
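Peered virtual networks must have non-overlapping address spaces, so it can save time to check the two ranges before creating the peering. The following is a minimal sketch of such a check using only the Python standard library. The function name `find_overlaps` and the example prefixes are hypothetical and are not taken from the setup in this article.

```python
import ipaddress


def find_overlaps(vnet_a_prefixes, vnet_b_prefixes):
    """Return every (prefix_a, prefix_b) pair whose address ranges overlap."""
    overlaps = []
    for a in vnet_a_prefixes:
        net_a = ipaddress.ip_network(a)
        for b in vnet_b_prefixes:
            net_b = ipaddress.ip_network(b)
            if net_a.overlaps(net_b):
                overlaps.append((a, b))
    return overlaps


if __name__ == "__main__":
    # Hypothetical address spaces for the two VNets (assumptions, not real values).
    adf_side_vnet = ["10.0.0.0/16"]
    sql_vm_vnet = ["10.1.0.0/16"]
    conflicts = find_overlaps(adf_side_vnet, sql_vm_vnet)
    if conflicts:
        print("Overlapping prefixes, peering would be rejected:", conflicts)
    else:
        print("No overlap found between the two address spaces.")
```

An empty result means the address spaces are disjoint. Any returned pair identifies the prefixes that would need to change before the VNets can be peered.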

 

  • Data Factory Private Endpoint
Ahmed_S_Mahmoud_2-1646662643317.png

 

After changing the Data Factory network access to private endpoint, set up a private endpoint for Azure Data Factory by following these steps:

Azure Private Link for Azure Data Factory - Azure Data Factory | Microsoft Docs

 

  • DNS resolution
Ahmed_S_Mahmoud_3-1646662744703.png

 

To connect privately with your private endpoint, you need a DNS record. We recommend that you integrate your private endpoint with a private DNS zone. You can also use your own DNS servers or create DNS records in the hosts files on your virtual machines.

 

You can link a private DNS zone to one or more virtual networks by creating virtual network links. You can also enable the autoregistration feature to automatically manage the life cycle of the DNS records for the virtual machines that get deployed in a virtual network.

Ahmed_S_Mahmoud_1-1646666216558.png

 


 

More information can be found at:

What is an Azure DNS private zone | Microsoft Docs

 

Note: If you are using a custom DNS server on your network, clients must be able to resolve the FQDN of the Data Factory endpoint to the private endpoint IP address. Configure your DNS server to delegate your private link subdomain to the private DNS zone for the VNet, or configure the A records for 'DataFactoryA.{region}.datafactory.azure.net' with the private endpoint IP address. For more information on configuring your own DNS server to support private endpoints, refer to the following articles:

Name resolution for resources in Azure virtual networks | Microsoft Docs

What is a private endpoint? | Microsoft Docs
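To confirm that a client resolves the Data Factory FQDN to the private endpoint IP address, a small script run on that client can help. The sketch below uses only the Python standard library. The function names and the example IP address are hypothetical, and the resolver is passed in as a parameter so that the comparison logic can be exercised without network access.

```python
import socket


def resolve_ipv4(fqdn):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives for fqdn."""
    infos = socket.getaddrinfo(fqdn, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


def points_to_private_endpoint(fqdn, expected_ip, resolver=resolve_ipv4):
    """True if fqdn resolves to exactly the expected private endpoint IP address."""
    return resolver(fqdn) == [expected_ip]


# Example call (hypothetical values; substitute your factory's FQDN and the IP
# address shown on the private endpoint's network interface):
#   points_to_private_endpoint(
#       "DataFactoryA.westeurope.datafactory.azure.net", "10.0.0.5")
```

If the function returns False, the client is most likely still resolving the public address, which points to a missing virtual network link, a missing A record, or a stale DNS cache.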

 

  • Self-Hosted Integration Runtime
Ahmed_S_Mahmoud_2-1646667347311.png

Download and install the self-hosted integration runtime (SHIR) on the Azure VM.

Download Microsoft Integration Runtime from Official Microsoft Download Center

 

Configure and register the SHIR with Azure Data Factory.

Create a self-hosted integration runtime - Azure Data Factory & Azure Synapse | Microsoft Docs

Ahmed_S_Mahmoud_1-1646667230381.png

Make sure the self-hosted node is connected to the cloud service.
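If the node does not show as connected, a basic TCP reachability test from the SHIR machine can narrow down whether the network path is the problem. The following is a minimal sketch using the Python standard library. The helper name `can_connect` is hypothetical, and the ports in the example comments are assumptions: 443 for outbound HTTPS to Data Factory, and 1433 as the SQL Server default port, which your SQL VM may have changed.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example calls from the SHIR machine (hypothetical host names and addresses):
#   can_connect("DataFactoryA.westeurope.datafactory.azure.net", 443)
#   can_connect("10.1.0.4", 1433)  # private IP of the SQL VM in the peered VNet
```

A successful TCP connection only shows that the route, peering, and firewall rules allow the traffic. It does not validate credentials or the SHIR registration itself.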

 

Troubleshooting:

Issue #1:

The self-hosted integration runtime fails with an error like the one below: "The access to data factory from public network is blocked"

 

Ahmed_S_Mahmoud_5-1646662786864.png

Possible Solution:

Add the private endpoint IP address to the hosts file, or make sure that DNS resolution returns the private endpoint IP address.

When you set up your ADF to connect to the VNet using a private endpoint, DNS changes occur. If you are using a custom DNS server on your network, clients must be able to resolve the FQDN of the Data Factory endpoint to the private endpoint IP address. Configure your DNS server to delegate your private link subdomain to the private DNS zone for the VNet, or configure the A records for 'DataFactoryA.{region}.datafactory.azure.net' with the private endpoint IP address. More information at:

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-private-link#dns-changes-for-private-endpoints
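If you use the hosts file approach, it is easy to mistype the entry or leave it commented out. The sketch below parses hosts-file text and reports which IP address a given FQDN maps to. It uses only the Python standard library. The function name, the region in the FQDN, and the IP address are hypothetical values for illustration.

```python
def lookup_hosts_entry(hosts_text, fqdn):
    """Return the IP address mapped to fqdn in hosts-file text, or None if absent."""
    wanted = fqdn.lower()
    for raw_line in hosts_text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        # Host names are case-insensitive; the first matching line wins.
        if wanted in (name.lower() for name in names):
            return ip
    return None


if __name__ == "__main__":
    # Hypothetical hosts-file content; the region and IP address are assumptions.
    sample = (
        "# Private endpoint for Data Factory\n"
        "10.0.0.5    DataFactoryA.westeurope.datafactory.azure.net\n"
    )
    print(lookup_hosts_entry(sample, "DataFactoryA.westeurope.datafactory.azure.net"))
```

On the SHIR machine, you could read the real file (on Windows, C:\Windows\System32\drivers\etc\hosts) and pass its contents as `hosts_text`.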

 

Tip:

You might need to flush the DNS cache after updating DNS or the hosts file (C:\Windows\System32\drivers\etc\hosts), using the command:
ipconfig /flushdns

 

Additional References:

Azure Private Link for Azure Data Factory - Azure Data Factory | Microsoft Docs

Access on-premises SQL Server from Data Factory Managed VNet using Private Endpoint - Azure Data Factory | Microsoft Docs

 

Thanks to @abhishekshaha  and @Marcio Lipi for helping to create this content.

 

I hope you find this article helpful. If you have any feedback, please do not hesitate to provide it in the comment section below.

 

Ahmed S. Mazrouh

Posted at https://sl.advdat.com/3IT03rD