Thursday, April 7, 2022

Guidance for copying blobs between storage accounts with network restrictions


Many users need to copy blobs between two storage accounts. The reasons vary: data backup, storage account migration, business requirements, and so on. Azure Storage supports copying a blob from one storage account to another directly, instead of downloading it to a local disk or buffer and uploading it again. In addition, many users must allow only selected networks to access the source and destination storage accounts to meet security or compliance requirements. In this blog, we will introduce how to copy blobs directly with AzCopy and how to do so with network restrictions in place on both the source and destination storage accounts.


Copy blobs between storage accounts directly

A direct copy does not rely on the network bandwidth of your local computer. It can leverage the performance of the storage accounts and the Azure backbone network to achieve better throughput than downloading and uploading again. If the source and destination storage accounts are in the same region, there is no bandwidth charge.


To copy blobs between storage accounts directly, use AzCopy with the following syntax.





azcopy copy 'https://<source-storage-account-name>.blob.core.windows.net/<container-name>/<blob-path><SAS-token>' 'https://<destination-storage-account-name>.blob.core.windows.net/<container-name>/<blob-path>'





Please note: if you provide authorization credentials by using Azure Active Directory (Azure AD), you can omit the SAS token only from the destination URL. The source URL must still have a SAS token appended.
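To make the URL layout concrete, here is a minimal Python sketch that assembles the command from its parts and prints it without running it. The helper name build_copy_command, the account, container, and blob names, and the SAS value are all hypothetical placeholders. It also assumes the public Azure blob endpoint suffix (.blob.core.windows.net) and that the destination is authorized through Azure AD, so only the source URL carries a SAS token.

```python
import shlex


def build_copy_command(src_account, src_container, src_blob, src_sas,
                       dst_account, dst_container, dst_blob):
    """Return the argv list for a direct account-to-account `azcopy copy`.

    Hypothetical helper for illustration; it only builds strings.
    """
    # A SAS token is a URL query string, so it must begin with "?".
    sas = src_sas if src_sas.startswith("?") else "?" + src_sas
    source = (f"https://{src_account}.blob.core.windows.net/"
              f"{src_container}/{src_blob}{sas}")
    # No SAS token on the destination: this assumes AzCopy is already
    # authorized against the destination account with Azure AD.
    destination = (f"https://{dst_account}.blob.core.windows.net/"
                   f"{dst_container}/{dst_blob}")
    return ["azcopy", "copy", source, destination]


# Placeholder values only; substitute your own accounts and a real SAS token.
command = build_copy_command(
    "mysourceaccount", "mycontainer", "myTextFile.txt",
    "sv=<placeholder-sas-token>",
    "mydestinationaccount", "mycontainer", "myTextFile.txt",
)

# Print a shell-quoted command line for review; nothing is executed here.
print(shlex.join(command))
```

Building the command as a list and quoting it with shlex.join keeps characters such as "&" in the SAS token from being interpreted by the shell when the printed line is pasted into a terminal.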


Reference: Copy blobs between Azure storage accounts with AzCopy v10 | Microsoft Docs 


However, this direct copy cannot be enabled simply by adding the storage service to the firewall allowlist of the source and destination storage accounts, because the request is sent from the storage backend with a private IP address, and these IP addresses are dynamic.


There are two supported scenarios:


Scenario 1: The client accesses the storage accounts through their public endpoints. In this scenario, the client’s public IP address or VNet must be added to the allowlist of both the source and destination storage accounts.



Scenario 2: The client’s VNet has private links configured, and the client accesses the storage accounts through private endpoints. In this scenario, no firewall allowlist entry is needed.



Here is the full process of this mechanism:

  1. The client sends a Put Block From URL request to the destination storage.
  2. After receiving the request, the destination storage tries to get the blocks from the given URL, which points to the source storage. However, because the destination storage is not allowed by the source firewall, it gets a 403 Forbidden error.
  3. After getting the 403, the destination storage sends another Get Blob request on behalf of the client. If the client has access to the source storage, the destination gets the blocks from the source with response code 206 and returns success to the client.
  4. After receiving success for its requests, the client sends Put Block List to the destination storage to commit the blocks and finish the process.
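
The four steps above can be sketched as a toy in-memory model. This is not Azure SDK code and makes no network calls: the class names, the caller labels such as "destination-backend", and the block size are hypothetical, and the firewall is reduced to a set of allowed callers. The 403 and 206 status codes follow the description above; using 201 for a successful write is an assumption of this sketch.

```python
class SourceStorage:
    """Toy source account: blob bytes plus a firewall allowlist of callers."""

    def __init__(self, data, allowed_callers):
        self.data = data
        self.allowed_callers = set(allowed_callers)

    def get_blob_range(self, caller, offset, length):
        # The firewall rejects any caller that is not on the allowlist.
        if caller not in self.allowed_callers:
            return 403, b""
        return 206, self.data[offset:offset + length]


class DestinationStorage:
    """Toy destination account holding staged and committed blocks."""

    def __init__(self):
        self.staged = {}
        self.committed = b""

    def put_block_from_url(self, client, source, block_id, offset, length):
        # Step 2: the first attempt comes from the destination backend,
        # which is not on the source firewall allowlist.
        status, chunk = source.get_blob_range("destination-backend", offset, length)
        if status == 403:
            # Step 3: retry on behalf of the client.
            status, chunk = source.get_blob_range(client, offset, length)
        if status != 206:
            return status
        self.staged[block_id] = chunk
        return 201

    def put_block_list(self, block_ids):
        # Step 4: commit the staged blocks in the order the client lists them.
        self.committed = b"".join(self.staged[b] for b in block_ids)
        return 201


def copy_blob(client, source, destination, block_size=4):
    """Drive the copy from the client side; return True on success."""
    block_ids = []
    for offset in range(0, len(source.data), block_size):
        block_id = f"block-{offset:08d}"
        # Step 1: the client asks the destination to pull one block.
        status = destination.put_block_from_url(
            client, source, block_id, offset, block_size)
        if status != 201:
            return False
        block_ids.append(block_id)
    destination.put_block_list(block_ids)
    return True


# Only "client-a" is on the source allowlist in this example.
source = SourceStorage(b"hello, blob storage", allowed_callers={"client-a"})
allowed_destination = DestinationStorage()
blocked_destination = DestinationStorage()
allowed_result = copy_blob("client-a", source, allowed_destination)
blocked_result = copy_blob("client-b", source, blocked_destination)
print(allowed_result, blocked_result)
```

In this model the copy succeeds only when the client itself is allowed by the source, which mirrors why both scenarios above depend on the client's network access rather than on the destination storage's address.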