Data engineering is one of the most involved, dull, and repetitive tasks in the data industry today. Bringing up the subject is a sure way to clear the room at a party or make your friends groan in despair. The work involves a lot of complicated tools, many, many lines of code, and a not-insignificant amount of gnashing of teeth. But what if I told you there is a way to do both data engineering and data exploration quickly, easily, and cheaply on most data, without even having to import it into a database? What if I told you that Azure Synapse has an almost magical tool that can connect directly to flat files and make them available as nice, well-behaved tables? I'm talking about Azure Synapse serverless SQL pools, a great tool for both initial exploration and surprisingly complex data engineering. I'll walk you through what it is and how to set it up, and give you a few examples of basic data exploration as well as somewhat more complex data engineering. Oh, and did I mention you can store the end result in several different formats as well?
Resources:
Andy Cutler’s blog on serverless pools: https://www.datahai.co.uk/tag/sql-serverless/
Andy Mallon’s blog on bucketing with T-SQL: https://am2.co/2019/10/how-to-create-date-buckets-in-t-sql/
Posted at https://sl.advdat.com/2ZtxPT6