Thursday, July 1, 2021

SharePoint Online Multiple Files (Folder) Copy with Http Connector

ADF does not directly support copying a folder/multiple files from SharePoint Online, but there are workarounds to achieve this. Two additional steps needed here as compared to single file copy are:

  1. Get the list of files:
    • User can maintain the file names in a text file manually, OR
    • Use Web Activity to call SharePoint Rest API to get the list of files.
  2. ForEach Activity to loop the list of relative file names and pass the file name to Copy Activity (Base URL changes a bit as compared to single file copy)

 

Below is how the pipeline flow would look like:

 

Web1 – Get the access token from SPO

Web2 – Get the list of files from SPO folder

ForEach1 – Loop the list of file names

Copy1 – Copy data with HTTP connector as source

 

RoshnaNazir_0-1624533576973.png

 

RoshnaNazir_1-1624533576976.png

 

Step1:

Grab Access token from SPO

Copy file from SharePoint Online leverages AAD/service principal authentication and SharePoint API to retrieve files.

 

  1. Register SharePoint Application and Grant permission - https://docs.microsoft.com/en-us/azure/storage/common/storage-auth-aad-app?tabs=dotnet#register-your-application-with-an-azure-ad-tenant

         a) Register AAD Application

      1. On Azure Portal, go to AAD app registration page: https://portal.azure.com/#blade/Microsoft_AAD_IAM/ActiveDirectoryMenuBlade/RegisteredApps 
      2. New Registration à Enter your App name
      3. Go to "Certificates & secrets", create new client secret, you can set the expire to 1Y/2Y/Never

RoshnaNazir_2-1624533576980.png

       b) Grant SharePoint site permission to your registered App (need site owner permission on SharePoint)

 

RoshnaNazir_3-1624533576986.png

 

Full details on how to register app and also granting permissions is mentioned in prerequisites here - https://docs.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list#prerequisites

 

     c) Create an ADF Pipeline. Start with creating a Web Activity to get the access token

Headers:

  • Content-Type: application/x-www-form-urlencoded
  • Body: grant_type=client_credentials&client_id=[Client-ID]@[Tenant-ID]&client_secret=[Client-Secret]&resource=00000003-0000-0ff1-ce00-000000000000/[Tenant-Name].sharepoint.com@[Tenant-ID

 

Debug run to check if the activity succeeds and also check the activity output to see if it returns the access token in the payload. You can also verify the same using Postman client to check if the token is valid.

 

RoshnaNazir_4-1624533576989.png

 

 

Step 2:

Get the list of Files

 

  1. Create another Web Activity to get the list of files

Headers:

  • Authorization: @{concat('Bearer ', activity('WebActivity1Name').output.access_token)}
  • Accept: application/json

          

 Debug run to see if the activity succeeds, and check it shows the list of files under the folder in the output.

 

RoshnaNazir_5-1624533576991.png

 

 

 

Step 3:

Loop the list of relative file names

 

  1. Create a ForEach Activity with inner Copy activity
  • Items: @activity('WebActivity2Name').output.value

RoshnaNazir_6-1624533576992.png

 

RoshnaNazir_7-1624533576993.png

 

 

RoshnaNazir_8-1624533576994.png

 

Step 4:

Create Copy activity

  1. New dataset -> HTTP -> Binary type:

    a) HTTP linked service

 

    b) Configure copy activity HTTP source

         Dataset properties:

  • Name: RelativeURL (Any name)
  • Value: @{item().ServerRelativeUrl}
  • Request method: GET
  • Additional header: “Authorization: Bearer <accessToken>” (accessToken is generated in Step1)

 

Tip: You can test with a static access token gotten from the previous Web activity output first. You can also use expression (add dynamic content): @{concat('Authorization: Bearer ',activity('WebActivityName').output.access_token)}

 

RoshnaNazir_9-1624533576996.png

 

     c) Configure Linked Service properties

  • Name: FileName (Any Name)
  • Value: @dataset().RelativeURL

RoshnaNazir_10-1624533576998.png

 

RoshnaNazir_11-1624533577002.png

 

2. Create Copy sink as below

 

RoshnaNazir_12-1624533577003.png

 

 

 Successful pipeline run as follows:              

 

RoshnaNazir_13-1624533577007.png

 

Thanks to @Jijo Puthooran for helping me in authoring this blog.

Posted at https://sl.advdat.com/3h5eLjU