Azure Data Factory and Azure Synapse Analytics pipelines include the powerful code-free data transformation feature called Mapping Data Flows. In this post, I'll show you how you can use data flows to visually transform hierarchies into relational data for downstream consumption by destinations that require data to be shaped like a table.
The data I'm going to work with is a JSON source and I want to grab the coordinates and type properties from the geometry field. Note that the coordinates property inside the geometry structure is an array.
The way to form this into a relational structure is by building a mapping data flow that first reshapes that data into your desired column format and then takes the array and unrolls that into a series of denormalized rows.
The FlattenStructure transformation in the diagram above is a Derived Column. To turn your structure into a relational table, just pick the name of the struct in the Derived Column and use a column pattern to define which hierarchy you wish to flatten.
What this will produce is an updated projection in your data flow with the 2 properties inside of the geometry structure now as separate columns.
That leaves the coordinates array which requires the Flatten transformation. Arrays are trickier when moving from hierarchical to relational because each value in that array will need to be added to existing rows outside of that array, resulting in a denormalized schema.
In the Flatten transformation, you will pick the coordinates array for unroll and then define the target projection.
That's it! You now have a fully flattened set of data from your JSON structures that can be consumed by relational targets.
Click on these documentation links to learn more about Mapping Data Flows in ADF and Synapse: Flatten transformation, column patterns in Derived Column.
Posted at https://sl.advdat.com/3DQxuZv