Wednesday, April 20, 2022

Getting started with your first Microsoft Azure Synapse Analytics project

Azure Synapse Analytics is a limitless analytics service that brings big data analytics and an enterprise data warehouse into a unified platform. The Azure Synapse platform provides a unified experience to help you discover and explore your data quickly, so that you can find meaningful insights at scale.

 

Let’s create our first Azure Synapse workspace to dive deeper into this enterprise-grade platform.

 

Creating an Azure Synapse workspace

Follow these steps to create your first Azure Synapse workspace:

  1. Go to the Azure portal, provide your credentials, and log in.
  2. Click on the Create a resource link on the home page.
  3. Select Azure Synapse Analytics from the list of all available resources and click on Create.

Figure 1: Azure Synapse Analytics page in Azure Marketplace

  4. Provide some basic details to create your workspace:
    1. Subscription: Select your subscription. If you have several subscriptions in your Azure account, select the specific one in which you want to create the Azure Synapse workspace.
    2. Resource group: Select a resource group for your Azure Synapse workspace. If you do not already have a resource group, click on Create new right below the text field for the resource group.
    3. Workspace name: Provide an appropriate name for the workspace that you are going to create.

Figure 2: Providing various details in the Create Synapse workspace window

    4. Region: You can see many options in the dropdown. Select the most appropriate region for your Azure Synapse workspace.
    5. Select Data Lake Storage Gen2: This will be the primary storage account for the workspace, holding catalog data and metadata associated with the workspace.
    6. Account name: You can select an account from the dropdown or create a new one. Only Data Lake Gen2 accounts with a hierarchical namespace enabled appear in the dropdown. If you click on Create new instead, a Data Lake Gen2 account with a hierarchical namespace enabled is created for you.
    7. File system name: Again, you can select a file system from the dropdown or create a new one. To create a new file system, click on Create new and provide an appropriate name for it.
  5. Click on Next: Security > to configure security options and networking settings for your workspace. Provide SQL administrator credentials that can be used for administrator access to the workspace's SQL pools.
  6. Click on Tags to assign name-value pairs to this resource.
  7. Go to the next page, review the summary, and click on Create once you have verified all the details.
  8. In your Azure Synapse workspace in the Azure portal, click on Launch Synapse Studio.
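
Before opening the portal, it can help to collect the values the steps above ask for. The following Python snippet is only a minimal checklist sketch, not an Azure API: the class name, field names, and sample values are hypothetical, and treating every field except tags as required is an assumption of this sketch.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WorkspaceSettings:
    """Values requested in the Create Synapse workspace window (illustrative names)."""

    subscription: str = ""
    resource_group: str = ""
    workspace_name: str = ""
    region: str = ""
    storage_account_name: str = ""  # Data Lake Storage Gen2 account
    file_system_name: str = ""
    sql_admin_login: str = ""
    tags: dict[str, str] = field(default_factory=dict)  # optional name-value pairs

    def missing(self) -> list[str]:
        """Return the names of the text fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name != "tags" and not getattr(self, f.name)
        ]


# Hypothetical sample values; replace them with your own.
settings = WorkspaceSettings(
    subscription="my-subscription",
    resource_group="rg-analytics",
    workspace_name="contoso-synapse",
    region="westeurope",
)
print(settings.missing())
```

Any field names printed by the last line are the details you still need to decide on before clicking on Create.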

 

Exploring Synapse Studio

Synapse Studio is an integrated tool to access your Azure Synapse workspace. You can access all components using Synapse Studio. As you can see in the following image, you are able to ingest, explore, analyze, and visualize data in Azure Synapse. Synapse Studio gives you the flexibility to connect to your Power BI workspace directly without navigating to the Power BI portal:

Figure 3: Synapse Studio Home page

Synapse Studio is a free web tool provided by Azure Synapse for data engineers, data scientists, and report developers. It also enables you to manage and monitor all the resources created under your Synapse workspace.

 

If you are new to Azure Synapse, it is highly recommended that you explore all the resources available under Knowledge center in Synapse Studio. Go to Browse gallery in Knowledge center and go through the available templates, datasets, notebooks, SQL scripts, and pipelines to get yourself acquainted with Azure Synapse. To learn more about Knowledge center, you can go through the Explore the Synapse Knowledge center documentation.

 

Creating SQL pools, Apache Spark pools, and Data Explorer pools

You can create three different types of pools within your Synapse workspace: SQL pools, Apache Spark pools, and Data Explorer pools:

  • The serverless SQL pool, named Built-in, is immediately available for your workspace. Dedicated SQL pools can be configured to adapt to team or organizational requirements and constraints.
  • Apache Spark pools can be tuned to run different kinds of Apache Spark workloads using specific configurations, libraries, permissions, and so on.
  • Data Explorer pools can be used to run near real-time analytics on large volumes of logs and time series data streaming from applications, websites, IoT devices, and more.

Under the Manage hub of Synapse Studio, choose the compute engine that best fits your business scenario:

Figure 4: SQL pools under the Manage hub of Synapse Studio

 

Endpoints for connecting to different pools in Azure Synapse

Different endpoints can be used to connect to various resources within Azure Synapse.

These are generic URLs. Wherever a URL contains the * character, replace it with your Azure Synapse workspace name:

  • https://web.azuresynapse.net: This will redirect you to your Azure Synapse workspace and you need to fill in Azure Active Directory, Subscription, and Workspace name to access your workspace:

Figure 5: Providing the details to access your Azure Synapse workspace

  • https://*.dev.azuresynapse.net: This is a development endpoint. You can find the URL for this endpoint in the Overview section of your Azure Synapse workspace within the Azure portal:

Figure 6: Azure Synapse workspace in the Azure portal highlighting different endpoints

  • https://*.database.windows.net: This endpoint can be used to access the provisioned SQL pool from any application; you just need to provide your Azure Synapse workspace name in place of *.
  • https://*-ondemand.database.windows.net: This endpoint is similar to the preceding endpoint. However, this endpoint will connect to a serverless SQL pool in Azure Synapse. Make sure you replace * with your Azure Synapse workspace name.
  • https://*.sql.azuresynapse.net: This endpoint can also be used to access the provisioned SQL pool from any application. This is also known as a dedicated SQL endpoint. You can find the endpoint available in your Synapse workspace within the Azure portal.
  • https://*-ondemand.sql.azuresynapse.net: This endpoint can also be used to access a serverless SQL pool in Azure Synapse. This is also known as a serverless SQL endpoint, and you can find the endpoint available in your Synapse workspace within the Azure portal.
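
The substitution of the workspace name into these patterns can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class and function names and the workspace name contoso-synapse are hypothetical, and the ODBC driver name and authentication mode in the connection string are assumptions that depend on what is installed and permitted in your environment. SQL client tools generally expect a bare host name as the server value, so the sketch returns the SQL endpoints without the https:// prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynapseEndpoints:
    """Endpoints derived from a workspace name, following the patterns listed above."""

    workspace: str  # replaces the * character in each pattern

    @property
    def development(self) -> str:
        # Development endpoint, shown in the Overview section of the workspace.
        return f"https://{self.workspace}.dev.azuresynapse.net"

    @property
    def dedicated_sql(self) -> str:
        # Dedicated SQL endpoint, as a bare host name for SQL clients.
        return f"{self.workspace}.sql.azuresynapse.net"

    @property
    def serverless_sql(self) -> str:
        # Serverless SQL endpoint, as a bare host name for SQL clients.
        return f"{self.workspace}-ondemand.sql.azuresynapse.net"


def odbc_connection_string(server: str, database: str = "master") -> str:
    """Build an ODBC connection string (driver and auth mode are assumptions)."""
    return (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server=tcp:{server},1433;"
        f"Database={database};"
        "Encrypt=yes;"
        "Authentication=ActiveDirectoryInteractive;"
    )


endpoints = SynapseEndpoints("contoso-synapse")  # hypothetical workspace name
print(endpoints.development)
print(odbc_connection_string(endpoints.serverless_sql))
```

The resulting string could be passed to an ODBC client library of your choice. The sketch only builds strings and does not open a connection.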

Now that you have created your Synapse workspace and know how to navigate to Synapse Studio, you are all set to build your enterprise-grade analytics solution. Go through the different hubs of Synapse Studio (Data, Develop, Integrate, Monitor, and Manage) to get yourself acquainted with all the supported capabilities.

 

Conclusion

Unlike an on-premises data warehouse, Azure Synapse provides you with a cloud-based data warehouse that is integrated with big data analytics under one unified platform. And it takes just a few clicks to set up your environment. You can also generate meaningful insights using Power BI within Synapse Studio itself. The Integrate hub provides you with code-free data integration capabilities within Synapse Studio.

To learn more:

Posted at https://sl.advdat.com/3K0gBxC