Availability
Lab Objective
In this hands-on lab, you will learn how to:
- Deploy VMs to availability zones across three separate datacenters for maximum resilience
- Create availability sets with fault and update domains for traditional high availability
- Configure load balancers to distribute traffic across multiple VM instances
- Test failover scenarios by stopping VMs and observing traffic redirection
- Set up basic monitoring to track VM availability and performance
Scenario: Deploy a web application across multiple availability zones with automatic failover to ensure high availability.
Pre-Provisioned Environment
| Resource Type | Resource Name | Purpose |
|---|---|---|
| Resource Group | Availability-Lab-RG | Contains all lab resources |
| Virtual Network | availability-vnet | Network for VMs (10.0.0.0/16) |
| Load Balancer | avail-lb[unique] | Distributes traffic across VMs |
| Public IP | avail-pip[unique] | External access to load balancer |
Lab Exercises
Part 1: Deploy VMs to Availability Zones
Step 1: Create First VM in Zone 1
- Navigate to Virtual machines → Create
- Select `Availability-Lab-RG` resource group
- Set VM name: `WebVM-Zone1`
- Choose Windows Server 2019 Datacenter
- Select size: Standard_B2s
- Set administrator username: `azureuser`
- Set administrator password: `Password123!`
Step 2: Configure Availability Zone
- Click “Availability options”
- Select “Availability zone”
- Choose “Zone 1”
- Click “Next: Disks”
Step 3: Configure Networking
- Select virtual network: `availability-vnet`
- Create new subnet: `zone1-subnet` (10.0.1.0/24)
- Select “Create new” public IP
- Click “Review + create” → Create
Step 4: Create VMs in Other Zones
- Repeat the process to create:
  - `WebVM-Zone2` in Availability Zone 2 (subnet: 10.0.2.0/24)
  - `WebVM-Zone3` in Availability Zone 3 (subnet: 10.0.3.0/24)
Verify: Check that all three VMs are created in different availability zones.
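The verification above can also be scripted. The following is a minimal sketch, not part of the original lab: `Test-ZoneSpread` is a hypothetical helper that accepts any objects exposing `Name` and `Zones` properties (the shape `Get-AzVM` returns) and reports whether every VM is pinned to a zone and no two VMs share one. The demo uses stand-in objects so it runs without an Azure session.

```powershell
# Hypothetical helper (not part of the lab): returns $true only when every VM
# is zonal and no two VMs share an availability zone.
function Test-ZoneSpread {
    param(
        [Parameter(Mandatory)]
        [object[]]$Vm   # objects with Name and Zones properties
    )
    $seen = @{}
    foreach ($v in $Vm) {
        # Drop null/empty entries so a regional (non-zonal) VM yields an empty array
        $zones = @($v.Zones | Where-Object { $_ })
        if ($zones.Count -eq 0) { return $false }
        $zone = [string]$zones[0]
        if ($seen.ContainsKey($zone)) { return $false }   # zone already used
        $seen[$zone] = $v.Name
    }
    return $true
}

# Demo with stand-in objects that mimic the lab's three VMs
$sample = @(
    [pscustomobject]@{ Name = 'WebVM-Zone1'; Zones = @('1') },
    [pscustomobject]@{ Name = 'WebVM-Zone2'; Zones = @('2') },
    [pscustomobject]@{ Name = 'WebVM-Zone3'; Zones = @('3') }
)
Test-ZoneSpread -Vm $sample

# Against the real lab (requires Connect-AzAccount first):
# Test-ZoneSpread -Vm (Get-AzVM -ResourceGroupName 'Availability-Lab-RG' |
#     Where-Object Name -like 'WebVM-Zone*')
```

A `$false` result means at least one VM is regional or two VMs landed in the same zone, which would defeat the purpose of Part 1.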
Part 2: Configure Load Balancer
Step 1: Add VMs to Backend Pool
- Navigate to Load balancers → `avail-lb[unique]`
- Go to Settings → Backend pools
- Click the existing backend pool
- Add all three VMs (WebVM-Zone1, WebVM-Zone2, WebVM-Zone3)
- Click “Save”
Step 2: Configure Health Probe
- Go to Settings → Health probes
- Click “Add”
- Set name: `web-health-probe`
- Set protocol: HTTP
- Set port: 80
- Set path: `/`
- Click “OK”
Step 3: Create Load Balancing Rule
- Go to Settings → Load balancing rules
- Click “Add”
- Set name: `web-lb-rule`
- Set frontend port: 80
- Set backend port: 80
- Select health probe: `web-health-probe`
- Click “OK”
Verify: Load balancer is configured with all VMs in backend pool.
Part 3: Install Web Server on VMs
Step 1: Connect to First VM
- Go to Virtual machines → `WebVM-Zone1`
- Click “Connect” → “RDP”
- Download RDP file and connect using `azureuser` / `Password123!`
Step 2: Install IIS
- Open Server Manager
- Click “Add roles and features”
- Select “Web Server (IIS)” role
- Complete installation
- Open browser and verify IIS default page loads
Step 3: Customize Web Page
- Navigate to `C:\inetpub\wwwroot`
- Edit `iisstart.htm`
- Add text: “This is WebVM-Zone1”
- Save the file
Step 4: Repeat for Other VMs
- Connect to `WebVM-Zone2` and `WebVM-Zone3`
- Install IIS on both VMs
- Customize pages to show “WebVM-Zone2” and “WebVM-Zone3”
Verify: Each VM shows its unique identifier when accessed directly.
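Repeating the Server Manager steps on three VMs is tedious, so the same result can be sketched in PowerShell. This is an illustrative alternative, not the lab's prescribed method: `New-IdentityPage` and `Install-LabWebServer` are hypothetical helper names, and the sketch replaces `iisstart.htm` with a minimal page instead of editing the default one. `Install-WindowsFeature` is the standard Windows Server cmdlet for adding roles.

```powershell
# Hypothetical helper: builds a minimal page that identifies the VM.
function New-IdentityPage {
    param(
        [Parameter(Mandatory)]
        [string]$VmName
    )
    return "<html><body><h1>This is $VmName</h1></body></html>"
}

# Hypothetical helper: installs IIS and writes the identity page.
# Defined here but not invoked; run it on each VM in an elevated session.
function Install-LabWebServer {
    param(
        [Parameter(Mandatory)]
        [string]$VmName
    )
    Install-WindowsFeature -Name Web-Server -IncludeManagementTools
    # Assumption: overwriting the default page is acceptable for this lab
    New-IdentityPage -VmName $VmName |
        Set-Content -Path 'C:\inetpub\wwwroot\iisstart.htm' -Encoding UTF8
}

# Preview the page content without touching IIS
New-IdentityPage -VmName 'WebVM-Zone1'

# On the VM itself:
# Install-LabWebServer -VmName 'WebVM-Zone1'
```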
Part 4: Test Load Balancer
Step 1: Access Load Balancer IP
- Go to Load balancers → `avail-lb[unique]`
- Copy the public IP address
- Open browser and navigate to the IP address
- Refresh page multiple times
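Expected: You should see different VM pages as the load balancer distributes traffic. If the page never changes, try a new private browser window: browsers reuse open connections, and the load balancer keeps an existing connection on the same VM.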
Step 2: Test Failover
- Stop one VM (WebVM-Zone1)
- Refresh browser page multiple times
- Observe that traffic goes only to the remaining two VMs
Step 3: Restart VM and Verify
- Start WebVM-Zone1 again
- Wait 2-3 minutes for the health probe to detect that the VM is healthy again
- Refresh browser to see all three VMs responding again
Verify: Load balancer automatically removes failed VMs and adds them back when healthy.
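Refreshing a browser gives only a rough impression of how traffic is spread. As a sketch under stated assumptions, the responses can be sampled and tallied from PowerShell instead. `Get-BackendTally` and `Get-LoadBalancerSample` are hypothetical helpers; the tally assumes each page contains the text `WebVM-ZoneN` added in Part 3. `-DisableKeepAlive` forces a new connection per request so each one can be routed independently.

```powershell
# Hypothetical helper: counts how many response bodies came from each VM,
# based on the "WebVM-ZoneN" marker added to the pages in Part 3.
function Get-BackendTally {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string[]]$Body
    )
    $tally = [ordered]@{}
    foreach ($b in $Body) {
        if ($b -match 'WebVM-Zone\d') { $name = $Matches[0] } else { $name = 'unknown' }
        if ($tally.Contains($name)) {
            $tally[$name] = $tally[$name] + 1
        } else {
            $tally[$name] = 1
        }
    }
    return $tally
}

# Hypothetical helper: requests the page several times over fresh connections.
# Defined here but not invoked, because it needs the lab's public IP.
function Get-LoadBalancerSample {
    param(
        [Parameter(Mandatory)]
        [string]$Url,
        [int]$Count = 12
    )
    1..$Count | ForEach-Object {
        (Invoke-WebRequest -Uri $Url -UseBasicParsing -DisableKeepAlive -TimeoutSec 5).Content
    }
}

# Demo with canned responses
Get-BackendTally -Body @('This is WebVM-Zone1', 'This is WebVM-Zone2', 'This is WebVM-Zone1')

# Against the lab (replace the placeholder with the load balancer's public IP):
# Get-BackendTally -Body (Get-LoadBalancerSample -Url 'http://<public-ip>/')
```

Running the tally before and after stopping `WebVM-Zone1` should show its count dropping to zero while the other two VMs absorb the requests.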
Part 5: Create Availability Set
Step 1: Create Availability Set
- Navigate to Availability sets → Create
- Set name: `web-availability-set`
- Select resource group: `Availability-Lab-RG`
- Set fault domains: 2
- Set update domains: 5
- Click “Create”
Step 2: Create VM in Availability Set
- Create new VM: `AvailSetVM1`
- Choose availability options: “Availability set”
- Select `web-availability-set`
- Use same credentials and settings as before
- Create the VM
Step 3: Add Second VM to Set
- Create another VM: `AvailSetVM2`
- Select same availability set: `web-availability-set`
- Complete creation
Verify: Both VMs are in the availability set with different fault domains.
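To build intuition for how fault and update domains spread VMs, the sketch below models placement as a simple round-robin over the domain counts chosen in Step 1. This is a simplified model and an assumption, not a description of the platform's actual algorithm: Azure decides the real placement. `Get-DomainPlacement` is a hypothetical helper.

```powershell
# Hypothetical helper: simplified round-robin model of domain assignment.
# Assumption: VMs are placed sequentially; the platform makes the real decision.
function Get-DomainPlacement {
    param(
        [Parameter(Mandatory)]
        [int]$VmCount,
        [int]$FaultDomains = 2,    # value chosen in Step 1
        [int]$UpdateDomains = 5    # value chosen in Step 1
    )
    for ($i = 0; $i -lt $VmCount; $i++) {
        [pscustomobject]@{
            Vm           = "AvailSetVM$($i + 1)"
            FaultDomain  = $i % $FaultDomains
            UpdateDomain = $i % $UpdateDomains
        }
    }
}

# With two fault domains, a third VM shares a fault domain with the first
Get-DomainPlacement -VmCount 3 | Format-Table

# Actual placement can be read from the VM's instance view, for example:
# (Get-AzVM -ResourceGroupName 'Availability-Lab-RG' -Name 'AvailSetVM1' -Status).PlatformFaultDomain
```

Under this model the lab's two VMs land in fault domains 0 and 1, which is what the verification step expects to see.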
Part 6: Monitor Availability
Step 1: Enable VM Insights
- Go to Virtual machines → `WebVM-Zone1`
- Click “Insights” under Monitoring
- Click “Enable”
- Select Log Analytics workspace or create new
- Wait for deployment to complete
Step 2: View Availability Metrics
- Go to Monitor → Metrics
- Select resource: `WebVM-Zone1`
- Add metric: “VM Availability”
- Set time range: Last 4 hours
Step 3: Create Simple Alert
- Go to Monitor → Alerts
- Click “Create alert rule”
- Select resource: `WebVM-Zone1`
- Add condition: VM Availability less than 1
- Create action group with email notification
- Name the alert: “VM Down Alert”
- Create the alert
Verify: Alert is created and will notify when VM becomes unavailable.
Troubleshooting Guide
Common Availability Issues
| Issue | Symptoms | Possible Cause | Solution |
|---|---|---|---|
| VMs not distributing across zones | All instances in single zone despite configuration | Capacity constraints or SKU limitations | Verify VM size availability in all zones and consider different SKU |
| Load balancer not detecting failures | Traffic continues to unhealthy instances | Health probe misconfiguration or endpoint issues | Check health probe settings and ensure application health endpoint responds |
| Availability set deployment failures | Cannot add VM to existing availability set | Mixing managed and unmanaged disks | Ensure consistent disk types (managed disks recommended) |
| Zone-redundant services unavailable | Options grayed out during resource creation | Region or service limitations | Verify regional support for zone-redundant services and select supported region |
| High latency between zones | Poor application performance across zones | Cross-zone network latency or configuration | Optimize application architecture for zone-aware design and minimize cross-zone traffic |
Availability Configuration Checklist
| Component | Requirement | Status | Notes |
|---|---|---|---|
| VM placement | Distributed across zones or fault domains | ✅ | Ensures protection against infrastructure failures |
| Load balancer configuration | Health probes and backend pools configured | ✅ | Enables automatic traffic distribution and failover |
| Storage resilience | Zone-redundant or geo-redundant storage | ✅ | Protects against data loss during infrastructure failures |
| Network redundancy | Multiple paths and zone-redundant components | ✅ | Eliminates network single points of failure |
| Monitoring and alerting | Availability metrics and alerts configured | ✅ | Provides visibility and proactive issue detection |
Best Practices
| Scenario | Recommendation | Benefit |
|---|---|---|
| Mission-critical applications | Deploy across availability zones with zone-redundant services | Provides highest level of availability (99.99% SLA) |
| Cost-sensitive workloads | Use availability sets for basic high availability | Balances cost and availability for standard applications |
| Stateful applications | Implement data replication across zones or regions | Ensures data consistency and availability during failures |
| Global applications | Deploy across multiple regions with Azure Traffic Manager | Provides geographic redundancy and disaster recovery |
| Maintenance planning | Align update domains with maintenance windows | Minimizes business impact during planned maintenance |
Optional Advanced Exercises
For users wanting more technical depth, try these exercises:
Advanced Exercise 1: Automated Availability Deployment
```powershell
# Deploy high availability infrastructure using PowerShell automation
Connect-AzAccount

# Set up variables for multi-zone deployment
$resourceGroup = "Availability-Lab-RG"
$location = "East US"
$vmSize = "Standard_B2s"
$availabilityZones = @("1", "2", "3")

# Prompt for the administrator credentials once and reuse them for every VM
$credential = Get-Credential
$vnet = Get-AzVirtualNetwork -ResourceGroupName $resourceGroup -Name "availability-vnet"

# Create one VM per availability zone
foreach ($zone in $availabilityZones) {
    $vmName = "AutoVM-Zone$zone"
    $subnetName = "zone${zone}-subnet"   # assumes the subnets created in Part 1 exist
    $subnet = $vnet.Subnets | Where-Object Name -eq $subnetName

    # New-AzVM needs a network interface attached to the VM configuration
    $nic = New-AzNetworkInterface -ResourceGroupName $resourceGroup -Location $location -Name "${vmName}-nic" -SubnetId $subnet.Id

    # Create VM configuration (the zone is set here, on the configuration)
    $vmConfig = New-AzVMConfig -VMName $vmName -VMSize $vmSize -Zone $zone
    $vmConfig = Set-AzVMOperatingSystem -VM $vmConfig -Windows -ComputerName $vmName -Credential $credential
    $vmConfig = Set-AzVMSourceImage -VM $vmConfig -PublisherName "MicrosoftWindowsServer" -Offer "WindowsServer" -Skus "2019-Datacenter" -Version "latest"
    $vmConfig = Add-AzVMNetworkInterface -VM $vmConfig -Id $nic.Id

    # Create VM with zone placement
    New-AzVM -ResourceGroupName $resourceGroup -Location $location -VM $vmConfig -Verbose
    Write-Host "Deployed VM $vmName to Availability Zone $zone"
}

# Add each VM to the load balancer's existing backend pool (the one used in Part 2)
$loadBalancer = Get-AzLoadBalancer -ResourceGroupName $resourceGroup | Where-Object Name -like "avail-lb*" | Select-Object -First 1
$backendPool = $loadBalancer.BackendAddressPools[0]
foreach ($zone in $availabilityZones) {
    $vm = Get-AzVM -ResourceGroupName $resourceGroup -Name "AutoVM-Zone$zone"
    $nic = Get-AzNetworkInterface -ResourceId $vm.NetworkProfile.NetworkInterfaces[0].Id
    $nic.IpConfigurations[0].LoadBalancerBackendAddressPools = $backendPool
    Set-AzNetworkInterface -NetworkInterface $nic
}

Write-Host "High availability deployment completed successfully"
```
Advanced Exercise 2: Availability Set Optimization
```powershell
# Create optimized availability set with proximity placement groups
$resourceGroup = "Availability-Lab-RG"
$location = "East US"

# Prompt for the administrator credentials once and reuse them for every VM
$credential = Get-Credential
# Assumption: the VMs attach to the first subnet of the lab's virtual network
$vnet = Get-AzVirtualNetwork -ResourceGroupName $resourceGroup -Name "availability-vnet"

# Create proximity placement group for low latency
$ppg = New-AzProximityPlacementGroup -ResourceGroupName $resourceGroup -Name "web-ppg" -Location $location -ProximityPlacementGroupType "Standard"

# Create optimized availability set
$availSet = New-AzAvailabilitySet -ResourceGroupName $resourceGroup -Name "optimized-availability-set" -Location $location -PlatformFaultDomainCount 3 -PlatformUpdateDomainCount 5 -ProximityPlacementGroupId $ppg.Id -Sku "Aligned"

# Build the VM configurations with optimized placement
$vmConfigs = @()
for ($i = 1; $i -le 3; $i++) {
    $vmName = "OptimizedVM$i"
    # New-AzVM needs a network interface attached to the VM configuration
    $nic = New-AzNetworkInterface -ResourceGroupName $resourceGroup -Location $location -Name "${vmName}-nic" -SubnetId $vnet.Subnets[0].Id
    $vmConfig = New-AzVMConfig -VMName $vmName -VMSize "Standard_D2s_v3" -AvailabilitySetId $availSet.Id -ProximityPlacementGroupId $ppg.Id
    $vmConfig = Set-AzVMOperatingSystem -VM $vmConfig -Windows -ComputerName $vmName -Credential $credential
    $vmConfig = Set-AzVMSourceImage -VM $vmConfig -PublisherName "MicrosoftWindowsServer" -Offer "WindowsServer" -Skus "2019-Datacenter" -Version "latest"
    $vmConfig = Add-AzVMNetworkInterface -VM $vmConfig -Id $nic.Id
    $vmConfigs += $vmConfig
}

# Deploy all VMs concurrently. -AsJob keeps the typed configuration object and the
# current Azure context, which a plain Start-Job script block would not.
$jobs = foreach ($config in $vmConfigs) {
    New-AzVM -ResourceGroupName $resourceGroup -Location $location -VM $config -AsJob
}

# Wait for all deployments to complete
$jobs | Wait-Job | Receive-Job

Write-Host "Optimized availability set deployment completed"
```
Advanced Exercise 3: Availability Monitoring and Alerting
```powershell
# Set up comprehensive availability monitoring
$resourceGroup = "Availability-Lab-RG"
$workspaceName = "availability-workspace"
$location = "East US"

# Create Log Analytics workspace for centralized monitoring (PerGB2018 is the pay-as-you-go SKU)
$workspace = New-AzOperationalInsightsWorkspace -ResourceGroupName $resourceGroup -Name $workspaceName -Location $location -Sku "PerGB2018"
$workspaceKey = (Get-AzOperationalInsightsWorkspaceSharedKey -ResourceGroupName $resourceGroup -Name $workspaceName).PrimarySharedKey

# Install the monitoring agent on all VMs
# Note: MicrosoftMonitoringAgent is the legacy Log Analytics agent; Microsoft recommends
# the Azure Monitor Agent for new deployments.
$vms = Get-AzVM -ResourceGroupName $resourceGroup
foreach ($vm in $vms) {
    Set-AzVMExtension -ResourceGroupName $resourceGroup -VMName $vm.Name -Location $vm.Location -Name "MicrosoftMonitoringAgent" -Publisher "Microsoft.EnterpriseCloud.Monitoring" -ExtensionType "MicrosoftMonitoringAgent" -TypeHandlerVersion "1.0" -Settings @{"workspaceId" = $workspace.CustomerId.ToString()} -ProtectedSettings @{"workspaceKey" = $workspaceKey}
}

# Create the action group (assumes Az.Monitor 5.0 or later, which uses receiver objects)
$emailReceiver = New-AzActionGroupEmailReceiverObject -Name "Operations" -EmailAddress "ops@company.com"
$actionGroup = New-AzActionGroup -ResourceGroupName $resourceGroup -Name "AvailabilityAlerts" -GroupShortName "AvailAlert" -Location "Global" -EmailReceiver $emailReceiver -Enabled

# VM availability alert: one rule per VM
$vmCriteria = New-AzMetricAlertRuleV2Criteria -MetricName "VmAvailabilityMetric" -TimeAggregation Average -Operator LessThan -Threshold 0.99
foreach ($vm in $vms) {
    Add-AzMetricAlertRuleV2 -ResourceGroupName $resourceGroup -Name "VM-Availability-Alert-$($vm.Name)" -TargetResourceId $vm.Id -WindowSize "00:05:00" -Frequency "00:01:00" -Condition $vmCriteria -Severity 2 -ActionGroupId $actionGroup.Id
}

# Load balancer health alert
# Assumption: VipAvailability (data path availability) is reported on a 0-100 scale,
# so the threshold is 99 rather than 0.99.
$loadBalancer = Get-AzLoadBalancer -ResourceGroupName $resourceGroup | Where-Object Name -like "avail-lb*" | Select-Object -First 1
$lbCriteria = New-AzMetricAlertRuleV2Criteria -MetricName "VipAvailability" -TimeAggregation Average -Operator LessThan -Threshold 99
Add-AzMetricAlertRuleV2 -ResourceGroupName $resourceGroup -Name "LB-Health-Alert" -TargetResourceId $loadBalancer.Id -WindowSize "00:05:00" -Frequency "00:01:00" -Condition $lbCriteria -Severity 1 -ActionGroupId $actionGroup.Id

Write-Host "Comprehensive availability monitoring configured successfully"
```
Key Takeaways
After completing this lab, you should understand:
- Availability zones provide the highest level of resiliency within a region by distributing resources across physically separate datacenters; the 99.99% SLA applies when two or more VMs run across two or more zones
- Availability sets protect against local failures within a single datacenter using fault and update domains; the 99.95% SLA applies when two or more VMs are deployed in the same set
- Load balancer configuration is critical for automatic failover and traffic distribution across available instances during failures
- Health probes and monitoring enable proactive detection of issues and automatic traffic redirection to healthy instances
- Proper capacity planning ensures adequate resources remain available during zone or domain failures to maintain performance
- Cost optimization requires balancing availability requirements with infrastructure expenses based on business criticality and SLA targets
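The capacity-planning takeaway can be made concrete with a little arithmetic. As a rough sketch, assuming equally sized instances and evenly distributed load, the highest steady-state utilization that still leaves room to absorb a zone failure is the fraction of zones that survive. `Get-MaxSafeUtilization` is a hypothetical helper introduced for illustration.

```powershell
# Hypothetical helper: maximum steady-state utilization per instance that
# still lets the surviving zones carry the full load after a failure.
# Assumption: identical instances and an even traffic split across zones.
function Get-MaxSafeUtilization {
    param(
        [Parameter(Mandatory)]
        [int]$Zones,
        [int]$FailedZones = 1
    )
    if ($FailedZones -ge $Zones) {
        throw "At least one zone must survive."
    }
    return ($Zones - $FailedZones) / $Zones
}

# Three zones, one failure tolerated: each VM should run at roughly 67% or less
Get-MaxSafeUtilization -Zones 3

# Two zones, one failure tolerated: each VM should run at 50% or less
Get-MaxSafeUtilization -Zones 2
```

For the three-VM deployment in this lab, the result suggests keeping each VM at about two-thirds of its capacity so that the remaining two can handle the traffic when one zone is down.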
Availability Decision Matrix
Availability Strategy Selection
| Criteria | Single VM | Availability Set | Availability Zones | Multi-Region | Recommendation |
|---|---|---|---|---|---|
| SLA requirement | 99.9% (with premium SSD or Ultra disks) | 99.95% | 99.99% | 99.995%+ | Choose based on business SLA requirements |
| Cost impact | Lowest | Medium | Higher | Highest | Availability sets for cost-sensitive workloads |
| Complexity | Simple | Moderate | Moderate | High | Start simple, evolve based on requirements |
| Regional resilience | None | None | High | Maximum | Zones for regional disasters, multi-region for global |
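The SLA percentages in the table are easier to compare when converted to allowed downtime. The sketch below is illustrative arithmetic only, assuming a 30-day month; `Get-AllowedDowntimeMinutes` is a hypothetical helper, and actual SLA calculations follow the terms of the official SLA documents.

```powershell
# Hypothetical helper: converts an SLA percentage into the downtime it
# permits over a period. Assumption: a 30-day month unless -Days is given.
function Get-AllowedDowntimeMinutes {
    param(
        [Parameter(Mandatory)]
        [double]$SlaPercent,
        [int]$Days = 30
    )
    $totalMinutes = $Days * 24 * 60
    return [math]::Round($totalMinutes * (1 - $SlaPercent / 100), 2)
}

foreach ($sla in 99.9, 99.95, 99.99) {
    [pscustomobject]@{
        SlaPercent      = $sla
        MinutesPerMonth = Get-AllowedDowntimeMinutes -SlaPercent $sla
    }
}
# 99.9%  -> 43.2 minutes per month
# 99.95% -> 21.6 minutes per month
# 99.99% -> 4.32 minutes per month
```

Moving from an availability set to availability zones therefore cuts the permitted monthly downtime from about 22 minutes to about 4.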
Failure Protection Coverage
| Failure Type | Single VM | Availability Set | Availability Zone | Multi-Region |
|---|---|---|---|---|
| Hardware failure | ❌ No protection | ✅ Protected | ✅ Protected | ✅ Protected |
| Rack failure | ❌ No protection | ✅ Protected | ✅ Protected | ✅ Protected |
| Datacenter failure | ❌ No protection | ❌ No protection | ✅ Protected | ✅ Protected |
| Regional disaster | ❌ No protection | ❌ No protection | ❌ No protection | ✅ Protected |
| Planned maintenance | ❌ Downtime required | ✅ Zero downtime | ✅ Zero downtime | ✅ Zero downtime |
Cost vs Availability Trade-offs
| Scenario | Optimization Strategy | Expected Outcome |
|---|---|---|
| Development/Testing | Single VMs with automated backups | 70-80% cost savings with acceptable availability for non-production |
| Standard Production | Availability sets with load balancing | Balanced cost and availability with 99.95% SLA |
| Mission-Critical | Availability zones with zone-redundant services | Premium availability (99.99% SLA) with 20-30% cost increase |
| Global Enterprise | Multi-region deployment with traffic management | Maximum availability (99.995%+) with 2-3x cost increase |
Additional Resources
- Azure Virtual Machine Availability Documentation - Comprehensive guide to VM availability options and best practices
- Availability Zones Documentation - Complete reference for zone-based deployments and supported services
- Load Balancer High Availability Guide - Configuration patterns for load balancer availability and failover
- SLA Reference Guide - Official SLA commitments and availability calculations for Azure services
- High Availability Architecture Patterns - Design patterns and best practices for resilient applications