Simple Azure VM Start/Stop Chaining using only Tags, Event Grid and Azure Functions

When you are migrating VMs from on-premise to Azure, you always have to evaluate the needed availability of several VMs. Your decisions in terms of VM size, storage tiers, and pricing options do rely on this evaluation. In my current migration of on-prem Remote Desktop Services to Azure Virtual Desktop, we have a Remote App that is used quite irregularly. Sometimes once per week, sometimes one to two days, and sometimes not a single time in a week. So we will go with Pay as you Go for these needed VM’s. We can deal with this behavior easily in Azure Virtual Desktop (planned shutdown and start on connect), but that’s only the frontend VM. In my scenario, I have some additional backend VMs which hold some services needed for the running application (licenseservice and some webservices for the DMS integration). We don’t need to run the backend VMs if nobody uses the frontend application, so I want to link the running state of these VMs with each other. 

The Frontend VM will be triggered by AVD’s “Starts on Connect” feature, and the needed backend server will be automatically started and deallocated depending on the Frontend VM.

I know there are solutions using EventGrid + Logic App + Azure Automation. But as you may already know, serverless Azure Functions are simply more efficient in terms of scaling and pricing.

In my setup how-to, I decided to simplify the setup with two single VMs. It shouldn’t be hard for someone to adjust, because in the end, you only have to tag the dependent VMs with the same value.
So let’s start …

Create some VMs for testing

We created two resource groups for testing. In my example, I created one named “Lab_init” and one “Lab_triggered” ? This way, we can define which VM can trigger the start process by putting them into this resource group.

Now we create 2 VMS, one in our “Lab_init” resource group and one in our “Lab_triggered” resource group.

I’m going with Ubuntu this time, but it doesn’t really matter. We only want to start and stop, so go with whatever you prefer.

Next, we need to tag our VM’s. The value can be whatever we want, but it has to match on all VM’s that we want to trigger. The code of our function (we’ll get to this later) loop through all subscriptions and search for VM’s with the same value in the bootbinding tag.

Setup Azure Function App

Now we get to the funny part.

We will create a new Azure Function App. (Serverless tier is good enough for our needs 🙂 )
Because our Function App needs to start/stop our Azure VMs across multiple subscriptions, we need a Managed Identity.
Add the Virtual Machine Contributor role for every subscription where you place VMs which needs to be triggered.
Our Azure Function App needs some modules to do its job. We have to add these to the requirements.psd1 file.
Note: You shouldn’t add the full Az module, as it’s quite large. Only add the submodules you really need.
Now we create our function and select “Azure Event Grid trigger”!
We enter the following code for our function:
param($eventGridEvent, $TriggerMetadata)

# Make sure to pass hashtables to Out-String so they're logged correctly
# $eventGridEvent | Out-String | Write-Host

$tAction = ($ -split "/")[-2]
$tVmName = ($ -split "/")[-1]
$tSubscriptionId = $

# preflight check
Write-Host "Check trigger action"
if (($tAction -ne "start") -and ($tAction -ne "deallocate")) {
    Write-Warning "Unsupported action: [$tAction], we stop here"
Write-Host "##################### Triggerinformation #####################"
Write-Host "Vm: $tVmName"
Write-Host "Action: $tAction"
Write-Host "Subscription: $tSubscriptionId"

Write-Host "Get information about trigger vm"
$context = Set-AzContext -SubscriptionId $tSubscriptionId

if ($context.Subscription.Id -ne $tSubscriptionId) {
    # break if no access
    throw "Azure Function have no access to subscription with id [$tSubscriptionId], check permissions of managed identity"

$tVm = Get-AzVM -Name $tVmName
$bindingGroup = $tVm.Tags.bootbinding

if (!$bindingGroup) {
    Write-Warning "No tag with bootbinding found for [$tVmName], check your tagging"

# main
Write-Host "Query all subscriptions"
$subscriptions = Get-AzSubscription

foreach ($sub in $subscriptions) {

    Write-Host "Set context to subscription [$($sub.Name)] with id [$($]"
    $context = Set-AzContext -SubscriptionId $

    if (!$context) {

        # break if no access
        Write-Warning "Azure Function have no access to subscription with id [$tSubscriptionId], check permissions of managed identity"

    # get vms with bootbinding tag
    $azVMs = Get-AzVM -Status -ErrorAction SilentlyContinue |  Where-Object { ($_.Tags.bootbinding -eq $bindingGroup) -and ($_.Name -ne $tVmName) }
    if ($azVMs) {
        $azVMs | ForEach-Object {
            Write-Host "VM [$($_.Name)] is in same binding-group, perform needed action "
            $vmSplatt = @{
                Name              = $_.Name
                ResourceGroupName = $_.ResourceGroupName
                NoWait            = $true
            switch ($tAction) {
                start {
                    Write-Host "Start VM"
                    $_.PowerState -ne 'VM running' ? (Start-AzVM @vmSplatt | Out-Null) : (Write-Warning "$($_.Name) is already running")
                deallocate {
                    Write-Host "Stop VM"
                    $_.PowerState -ne 'VM deallocated' ? (Stop-AzVM @vmSplatt -Force | Out-Null) : (Write-Warning "$($_.Name) is already running")
                Default {}

Setup event grid

Thankfully, we can use an “Event Grid System Topic” for our solution, so we don’t have to code anything here. You can think of a Topic as the source, where we want to react to events that occur.
Because we want to react to events in our “Lab_init” resource group, we select Resource Groups as Types and select “Lab_init” as the resource group.
If we want to trigger something, we have to create an “Event Subscription”
First, we give our Event Subscription a name and an endpoint. The endpoint defines what we want to trigger.
We dont want to call our function on every event in the dependent resource group, so we make some adjustments to filter for specific events. Otherwise, we have unnecessary function calls and have to filter the event in your function code, which is not good practice if we really don’t need to, because there is no other solution. In the Basic section, we reduce invocations to only successfully completed events.
In the Filter section of our Event Subscription we should also add some string filtering for the subject. This helps us only trigger our function if the event is triggered by the Microsoft.Compute provider on a virtual machine.

Validate Setup

Now let’s test our configuration

We start our “initVM”
In our Topic view, we see that some events are received by our Topic and also that some events are matched by our advanced filter.
Same informations four our “Event Subscription”
And we can also check our function output.

Log into our VMs

Check initVM
Check triggeredVM

As you can see, there is most likely a time difference of 3 minutes between the boottimes, so keep that in mind. In my AVD scenario, it doesn’t really matter, because we have some buffer until the user logs in and starts the application. We never had problems with that.

Hope it can be usefull for somebody, feel free to a adjust

Leave a Reply