Advanced Data Solutions : Azure API Management networking explained

Hi,

Microsoft recently announced that Azure Private Link was in preview for Azure API Management. I thought this was a good opportunity to recap and compare the different options from a network perspective. I will also explain how VIP (Virtual IP) and DIP (Dynamic IP) work because the official doc is not 100% accurate on the topic.

Introducing the different pricing tiers

As you might know, APIM comes with Developer, Consumption, Basic, Standard and Premium tiers. Microsoft also refers to an Isolated (in preview) tier on its pricing page but I do not know what's behind it, so let's stick to the other tiers. I will also park the Consumption tier, since there is nothing to consider network-wise other than the fact that it has no instance-specific IPs that you can control.

From a network perspective, only the DEV and PREMIUM tiers integrate with a VNET. The DEV tier allows customers to experiment with VNET integration but is not covered by any SLA, while the PREMIUM tier is covered by an SLA but comes with high costs (about 2500€/month for the first gateway unit, a few hundred euros for each additional unit). So, often, customers end up using Premium solely for VNET integration, not for the extra features it offers (such as global distribution).

A while ago, Microsoft also introduced the self-hosted gateway, which, as the name indicates, allows you to host a gateway wherever you want (on-premises, in another Cloud, etc.), but this also requires the Premium tier. Moreover, the self-hosted gateway targets use cases where you want to avoid roundtrips between non-Azure data centers and Azure. The exception is the control plane, which is used to download the gateway config (policies, etc.). The self-hosted gateway can still work "offline" should the control plane be unavailable. I will not cover the self-hosted gateway in this blog post, but at least, that was a glimpse of it.

While the BASIC and STANDARD tiers come with their own SLAs and might be sufficient for some customers, they cannot integrate with a VNET. The recent announcement from Microsoft, stating that Azure Private Link had become available for all the pricing tiers (except Consumption), could be seen as a game changer, but is it really? That's what we'll see in the next sections, but let me first introduce the DIP & VIP concepts.
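To recap the tiers before moving on, here is a small Python sketch that encodes only what is stated above. The `TierNetworking` type and the `tiers_for` helper are hypothetical names made up for this illustration, not part of any Azure SDK, and the Private Link column reflects the preview announcement.

```python
from typing import NamedTuple, Optional


class TierNetworking(NamedTuple):
    sla: Optional[bool]        # None = not discussed in this post
    vnet_integration: bool     # can the instance be injected into a VNET?
    private_link: bool         # inbound private endpoint (in preview)


# Values taken from the text above; nothing here is queried from Azure.
TIERS = {
    "Developer":   TierNetworking(sla=False, vnet_integration=True,  private_link=True),
    "Basic":       TierNetworking(sla=True,  vnet_integration=False, private_link=True),
    "Standard":    TierNetworking(sla=True,  vnet_integration=False, private_link=True),
    "Premium":     TierNetworking(sla=True,  vnet_integration=True,  private_link=True),
    "Consumption": TierNetworking(sla=None,  vnet_integration=False, private_link=False),
}


def tiers_for(need_vnet: bool, need_sla: bool) -> list[str]:
    """Return the tiers that satisfy the given requirements."""
    return [
        name
        for name, tier in TIERS.items()
        if (tier.vnet_integration or not need_vnet)
        and (tier.sla is True or not need_sla)
    ]


if __name__ == "__main__":
    print(tiers_for(need_vnet=True, need_sla=True))
    print(tiers_for(need_vnet=False, need_sla=True))
```

Asking for both VNET integration and an SLA leaves Premium as the only candidate, which is exactly the cost problem described above.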

Introducing DIP & VIP

The DIP (Dynamic IP) is a type of IP that is allocated by APIM itself to its underlying machines, to communicate with backend systems. A DIP is subject to change because it is not static, and is Microsoft's internal kitchen. Azure will take any available IP from the subnet in which APIM lives when it needs to allocate a new DIP. If you have several gateway units, you're likely to have several DIPs. If you share a single subnet with multiple APIM instances, they will all allocate their DIPs as they see fit.

On the other hand, a VIP (Virtual IP) is not subject to change unless a major event occurs, such as the accidental deletion of the IP or of the instance, the integration with another VNET, or a switch from Internal to External integration. Depending on the network scenario, APIM may have one public VIP and one private VIP. In a nutshell, VIPs are more reliable than DIPs because they are expected not to change. Public VIPs can be used in firewall rules, but only under certain circumstances, while private VIPs and DIPs should never be used in firewall rules. Instead, you should always whitelist the subnet in which APIM lives. We will look at how this works in more depth in the next section, but rule of thumb number 1 is: always use a dedicated subnet for a single APIM instance if you want to be able to make a clear segregation between environments (DEV, TEST, ACC, PRD if you stick to DTAP). Let's see the different network topologies.
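Rule of thumb number 1 can be sketched from the backend's point of view. This is a minimal Python illustration using the standard `ipaddress` module; the subnet range `10.0.1.0/24` and the function name `is_from_apim` are assumptions made up for the example, not values from a real deployment.

```python
import ipaddress

# Hypothetical dedicated subnet hosting a single APIM instance.
APIM_SUBNET = ipaddress.ip_network("10.0.1.0/24")


def is_from_apim(caller_ip: str, subnet=APIM_SUBNET) -> bool:
    """True if the caller IP belongs to the APIM subnet.

    Matching on the whole subnet keeps the rule valid when Azure
    allocates a new DIP, which a rule pinned to one DIP would not.
    """
    return ipaddress.ip_address(caller_ip) in subnet


if __name__ == "__main__":
    print(is_from_apim("10.0.1.7"))    # a DIP in the APIM subnet
    print(is_from_apim("10.0.1.200"))  # another DIP after a scale-out
    print(is_from_apim("10.0.2.7"))    # an IP from a different subnet
```

If several APIM instances shared that subnet, the same check would accept all of them, which is why a dedicated subnet per instance matters for environment segregation.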

Introducing network topologies

In one of my recent articles, 10 shades of public API hosting in Azure, I already introduced a few concepts, but I was targeting a broader topic: web landing zones. Here I will focus on the API gateway only. Let me start with the first and easiest topology.

Using non-VNET APIM

As stated earlier, you may perfectly well use the Basic, Standard and even Consumption tiers if you do not need to integrate with a VNET. Figure 1 shows how simple this could be:

Figure 1: APIM - No VNET

In this topology, the API gateway is exposed to the internet. Anyone having the public VIP and/or the public DNS name can start talking to it. All backends see the public VIP as the caller IP. You can typically use that public VIP in Azure Web App's access restriction rules to make sure backend services are called by that specific gateway. Note that, as stated before, the Consumption tier does not have such a VIP. In this topology, there is no DIP (or at least not that we can see). Let us now see the exact opposite topology.

Integrating APIM with a VNET in internal mode

In this topology, only available in DEV & PREMIUM tiers, the API gateway is not exposed to the internet. The only way to expose the gateway is to use a reverse-proxy with or without a WAF module (preferably with a WAF). The instance still has a system public VIP, but it is used for outbound traffic only. The gateway is part of a private network perimeter and can be reached by all the systems that have access to its private inbound VIP. Figure 2 illustrates this topology:

Figure 2 - APIM in VNET - Internal mode

For the sake of simplicity, I didn't go through a full Hub & Spoke setup with an NVA and custom route tables. I only use Azure's system routes but the public/private VIP and DIP behavior is the same with the whole shebang.

Figure 2 covers three scenarios:

The gateway calls an internet-facing backend: the backend sees the public VIP
The gateway calls an internal backend within the same VNET as the APIM instance: the backend sees a DIP
The gateway calls an internal backend in a peered VNET, different from APIM's VNET: the backend sees a DIP

The last scenario is where the Microsoft doc is inaccurate IMHO, as they state that a DIP is only used within the same VNET. So, with such a topology, remember rule of thumb number 1.
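The three scenarios of Figure 2 boil down to a tiny decision rule. The Python sketch below only restates what I observed in my lab for a VNET-integrated instance; the `Backend` enum and the `source_seen_by_backend` function are hypothetical names for this illustration, not something Azure exposes.

```python
from enum import Enum


class Backend(Enum):
    INTERNET = "internet-facing"
    SAME_VNET = "same VNET as APIM"
    PEERED_VNET = "peered VNET"


def source_seen_by_backend(backend: Backend) -> str:
    """Which APIM address the backend sees as the caller IP.

    Observed behavior for a VNET-integrated instance: only traffic
    leaving towards the internet is sourced from the public VIP.
    """
    if backend is Backend.INTERNET:
        return "public VIP"
    # Same VNET and peered VNET behave the same way: the caller is a DIP.
    return "DIP"


if __name__ == "__main__":
    for backend in Backend:
        print(f"{backend.value}: {source_seen_by_backend(backend)}")
```

Since private backends only ever see a DIP, which may change, the subnet is the only stable thing to allow on their side.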

Integrating APIM with a VNET in external mode

This topology is very similar to the previous one except that there is no more private VIP. Figure 3 illustrates it.

Figure 3 - APIM in VNET - External mode

The same scenarios as with the previous topology are covered and the same behavior is observed. So, here again, rule of thumb number 1 applies. Let us now see the Private-Link-enabled APIM.

Understanding APIM with private link

I will not explain how the whole private link story works because it is already broadly covered by the official documentation. However, when I saw this announcement, I was really puzzled about its added value. To make sure there was no magic trick (specific to APIM), I quickly created a lab to test the thing out. After my tests (03/2022), I'm still puzzled.

In a nutshell, private link works like this:

You enable it for a public PaaS service. When enabling private link, you must choose the subnet in which the private endpoint (private IP) must be created. Note that you can have multiple private IPs for a single PaaS instance. There is a whole DNS setup to do to make this work fine, which I'll skip as it is not relevant here.
After having performed the previous step, you end up with a private inbound IP. With some services (such as Azure SQL, Azure Storage, etc.) you can keep using both the public endpoint and your private endpoint(s). With some others, such as Azure App Service, enabling private link automatically denies all public traffic. For APIM, the public VIP is still accessible by default but can be disabled through an API call. I'm not sure whether it is already possible to disable it in one go using ARM or Terraform.
For whatever service (thus including APIM), private link only impacts the inbound traffic, not the outbound one.

What the last point means is that you can access a PaaS service privately, but it does not mean that this service can, in turn, access other private backends. A good example of this is Azure Web Apps. If you enable private link for a given web app, you will be able to access it privately, but you also need to leverage the VNET integration feature for the outbound traffic, so that the web app can, in turn, talk to private backends such as an Azure SQL database.
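The inbound/outbound split can be captured in a toy model. This Python sketch is purely illustrative: the `PaasInstance` class and its fields are hypothetical and do not correspond to any Azure API, they only encode the rule that private link governs inbound traffic while something else (VNET integration) must govern outbound traffic.

```python
from dataclasses import dataclass


@dataclass
class PaasInstance:
    private_endpoint: bool = False       # private link: inbound only
    public_access_enabled: bool = True   # public endpoint still open?
    vnet_integration: bool = False       # outbound path to private networks

    def accepts_inbound(self, via_private_endpoint: bool) -> bool:
        """Can a caller reach the instance through the given entry point?"""
        if via_private_endpoint:
            return self.private_endpoint
        return self.public_access_enabled

    def can_reach_backend(self, backend_is_private: bool) -> bool:
        """Private link plays no role here: only VNET integration helps."""
        return self.vnet_integration or not backend_is_private


if __name__ == "__main__":
    # Private endpoint enabled and public access disabled, no VNET integration.
    instance = PaasInstance(private_endpoint=True, public_access_enabled=False)
    print(instance.accepts_inbound(via_private_endpoint=True))
    print(instance.accepts_inbound(via_private_endpoint=False))
    print(instance.can_reach_backend(backend_is_private=False))
    print(instance.can_reach_backend(backend_is_private=True))
```

An instance configured this way is reachable privately only, yet still cannot call a private backend, because nothing in the private link setup touches the outbound side.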

Figure 4 illustrates this topology:

Figure 4 - Private-Link-enabled APIM

So, if you patch your instance, you can isolate it from the internet, but it can only talk to public-facing backends. Given the fact that the primary role of an API gateway is to proxy traffic between API consumers and actual backends, I would have expected Microsoft to give us something to manage outbound traffic. It turns out that with a Private-Link-enabled APIM (in 03/2022), we end up with a private inbound IP for our gateway but we can only talk to public backends because there is not (yet?) a way to manage the outbound traffic. So, this is a niche scenario, where you only want to allow trusted parties (first or third) to consume your gateway, at an affordable cost. In conclusion, this is in no way comparable with a true VNET integration. Note that I numbered rule of thumb number 1 because here is another rule of thumb (number 2): the Cloud is a moving target, so if you read this blog post a few months after its publication, check carefully that what I stated here is still valid.

Posted at https://sl.advdat.com/3uOdhQB

Sunday, April 3, 2022
