Azure Policy lets us audit settings inside a virtual machine using Guest Configuration. However, at this time we can’t remediate those machines because the feature is not yet available. This means that although we can see that a virtual machine is non-compliant, there is little we can do to fix it from the policy blade itself.
One of the built-in Guest Configuration policies can audit whether specific software is installed on a Windows machine. This could be a full software program or a specific agent. But then how do we install the software based on the non-compliant policy result?
Thanks to Azure Policy state change events, we can now detect when a resource changes its compliance state, and we can subscribe to these events using an Event Grid subscription. I’ve used Event Grid in a previous post, but this time I’m going to use an Azure Automation runbook and some PowerShell to install the missing software package (PowerShell 7).
Pre-Requisite Deployment
I’m going to need several different resources to make this all work, so I’ve scripted everything up as Bicep templates and PowerShell scripts to run the deployment. All the files are in the GitHub repository. You can download them, and the only modification needed is the names of the resources in deploy.ps1.
After updating those names, run the script. It completes the following steps.
- Deploy a new storage account.
- Deploy a container into the storage account called software where the MSI file is placed.
- Deploy a new automation account which is assigned a managed identity. This feature is currently in preview and simplifies the previous approach to giving permissions to an automation service principal.
- Deploy a couple of variables into the automation account which are used by the runbook.
- Assign Contributor permission to the automation account managed identity.
- Install all the Az modules required by the runbook (this does take a bit of time to complete).
- Assign the policy below to the resource group. This will install the Guest Configuration agent which is a pre-requisite for the software installation policy.
- Create a system topic to listen to the policy state changes.
It takes a while to deploy the initial template – but be patient. There is some output logging so you can see what the rest of the script does.
- Download the PowerShell 7 MSI and upload it to the storage account.
- Publish the runbook to the automation account.
- Create a webhook for the runbook.
- Deploy an Event Grid subscription and the software installation policy.
The policies will be deployed to the resource group...
The software installation file is ready in the storage account...
And the Event Grid subscription is listening for policy events...
I’ve adjusted the filters for the events which I’m interested in – it should only fire the webhook when the software installation policy returns a non-compliant result.
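To make the filtering concrete, here is a minimal local sketch of the kind of check such a subscription performs. The real filtering happens inside Event Grid, not in PowerShell. The function name Test-PolicyEventFilter, the sample event and the placeholder IDs are all hypothetical, and the sketch assumes the filter is on the policy state event types plus the complianceState and policyAssignmentId fields of the event data, which may differ from the exact filters in deploy.ps1.

```powershell
# Hypothetical helper: mimics an Event Grid filter locally so the logic is easy to follow.
function Test-PolicyEventFilter {
    param(
        [Parameter(Mandatory)] [object] $GridEvent,
        [Parameter(Mandatory)] [string] $PolicyAssignmentId
    )

    # Assumed event types: a policy state was created or changed.
    $wantedTypes = @(
        'Microsoft.PolicyInsights.PolicyStateCreated',
        'Microsoft.PolicyInsights.PolicyStateChanged'
    )

    if ($GridEvent.eventType -notin $wantedTypes) { return $false }
    if ($GridEvent.data.complianceState -ne 'NonCompliant') { return $false }

    # Only react to the software installation policy assignment.
    # PowerShell string comparison with -eq is case-insensitive, which suits resource IDs.
    return ($GridEvent.data.policyAssignmentId -eq $PolicyAssignmentId)
}

# Sample event with placeholder IDs (not captured from a real subscription).
$sample = @'
{
  "eventType": "Microsoft.PolicyInsights.PolicyStateChanged",
  "subject": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/SoftwareInstallation/providers/Microsoft.Compute/virtualMachines/vm01",
  "data": {
    "complianceState": "NonCompliant",
    "policyAssignmentId": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/SoftwareInstallation/providers/Microsoft.Authorization/policyAssignments/software-install"
  }
}
'@ | ConvertFrom-Json

Test-PolicyEventFilter -GridEvent $sample -PolicyAssignmentId $sample.data.policyAssignmentId
```

Filtering at the subscription rather than in the runbook means the webhook is not called for compliant results or for other policies, so no automation jobs are started for events we don't care about.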
Testing the Process
And now for some testing. I’ll create a standard Windows Server in the resource group by just going through the wizard – when complete my machine will not have PowerShell 7 installed (simply because it isn't there out of the box).
Checking the Apps and Features on the server...
Things are going to start moving in this virtual machine, and at some point that software installation policy is going to return a non-compliant result. This can happen either before or after the Guest Configuration agents are installed; it doesn’t really matter which. The Guest Configuration extension will eventually install and check for the installed application. It generally takes around 30 minutes for policy evaluation to complete, but you can trigger an evaluation using PowerShell at any time by running:
Start-AzPolicyComplianceScan -ResourceGroupName SoftwareInstallation
The documentation calls out that state change events are only fired after the evaluation is complete. From my testing this took around 10 minutes, so you have to be patient.
While you wait, I’ll explain the runbook that is going to be run. The steps involved in this one are…
- Strip down the subject from the Event Grid event – the schema can be found here.
- Create a script object using a here-string and write that out to a script file on the runbook worker.
- Call the Invoke-AzVMRunCommand cmdlet on the virtual machine and run the script that is now on the runbook worker.
When it is eventually called, the extension runs the script, which downloads PowerShell from the storage account and installs it.
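The runbook steps above can be sketched roughly as follows. This is not the runbook from the repository. The helper names (ConvertFrom-ResourceId, New-InstallScript, Invoke-SoftwareInstall) are hypothetical, and the sketch assumes that the event subject is the full resource ID of the virtual machine, that the runbook has already signed in (for example with the managed identity), and that $MsiUrl is a URL the VM can download from, such as a blob URL with a SAS token.

```powershell
# Hypothetical helper: split an Azure resource ID (the event "subject") into its parts.
function ConvertFrom-ResourceId {
    param([Parameter(Mandatory)] [string] $ResourceId)

    $parts = $ResourceId.Trim('/') -split '/'
    # Expected layout: subscriptions/<id>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    if ($parts.Count -lt 8) { throw "Unexpected resource ID: $ResourceId" }

    [pscustomobject]@{
        SubscriptionId = $parts[1]
        ResourceGroup  = $parts[3]
        ResourceType   = "$($parts[5])/$($parts[6])"
        Name           = $parts[7]
    }
}

# Hypothetical helper: build the script that will run inside the VM.
function New-InstallScript {
    param([Parameter(Mandatory)] [string] $MsiUrl)

    # Expandable here-string: $MsiUrl is filled in now, while the
    # backtick-escaped variables are left to be resolved inside the VM.
    @"
`$msi = Join-Path `$env:TEMP 'PowerShell-7.msi'
Invoke-WebRequest -Uri '$MsiUrl' -OutFile `$msi -UseBasicParsing
Start-Process msiexec.exe -ArgumentList '/i', `$msi, '/quiet', '/norestart' -Wait
"@
}

# How the pieces could fit together in the runbook. This part needs the Az modules
# and a signed-in context, so it is wrapped in a function rather than run on load.
function Invoke-SoftwareInstall {
    param(
        [Parameter(Mandatory)] [object] $WebhookData,
        [Parameter(Mandatory)] [string] $MsiUrl
    )

    # Event Grid posts a JSON array of events in the webhook request body.
    $gridEvents = $WebhookData.RequestBody | ConvertFrom-Json

    foreach ($gridEvent in $gridEvents) {
        $vm = ConvertFrom-ResourceId -ResourceId $gridEvent.subject

        # Write the generated script to a temporary file on the runbook worker.
        $scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) "install-$($vm.Name).ps1"
        New-InstallScript -MsiUrl $MsiUrl | Set-Content -Path $scriptPath

        # Run the script inside the VM.
        Invoke-AzVMRunCommand -ResourceGroupName $vm.ResourceGroup -VMName $vm.Name `
            -CommandId 'RunPowerShellScript' -ScriptPath $scriptPath
    }
}
```

Keeping the parsing and script generation in small functions makes them easy to try locally, for example by calling ConvertFrom-ResourceId with the subject of a captured event, before wiring them up to a webhook.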
Back to the process: my software installation policy has returned a non-compliant result for my virtual machine. As I said before, you need to wait until the evaluation cycle is complete before an event is fired.
I’ve captured the policy event that was generated by the Azure platform and it is below – note how the fields correspond to our filters and the subject contains the affected resource id.
Now when I check the automation account I can see the job has been run and there are no errors in the runbook output which is a good sign...
Finally, when I log on to the server, I can see the application has installed…
The Guest Configuration service runs on its own timer and sends reports back to a guest assignment object. Azure Policy then performs its evaluation based on these objects, so there is an inherent delay before a resource becomes compliant. However, now that the extensions and software are installed, this resource will eventually report back as compliant to the guest assignment object and finally to the policy.
Well, there it is: a way to use Azure Policy and state change events to trigger automation and remediate guest configuration policies. You could use this to install multiple agents on your virtual machines without affecting existing DSC configurations or custom script extensions. As always, some caveats with the testing:
- My test cases are small and are no substitute for your own testing.
- This is hosted on GitHub – if there are issues or you make changes, please submit a PR for review.
Posted at https://sl.advdat.com/35xURHD