Chapter Navigation:
- 📚 Course Home: AZD For Beginners
- 📖 Current Chapter: Chapter 7 - Troubleshooting & Debugging
- ⬅️ Previous: Debugging Guide
- ➡️ Next Chapter: Chapter 8: Production & Enterprise Patterns
- 🤖 Related: Chapter 2: AI-First Development
This guide covers common issues encountered when deploying AI solutions with AZD, along with fixes and debugging techniques specific to Azure AI services:
- Microsoft Foundry Models Service Issues
- Azure AI Search Problems
- Container Apps Deployment Issues
- Authentication and Permission Errors
- Model Deployment Failures
- Performance and Scaling Issues
- Cost and Quota Management
- Debugging Tools and Techniques
## Microsoft Foundry Models Service Issues

### Issue: Region Availability

Symptoms:

```text
Error: The requested resource type is not available in the location 'westus'
```

Causes:

- Microsoft Foundry Models not available in the selected region
- Quota exhausted in preferred regions
- Regional capacity constraints

Solutions:

- Check region availability:

```bash
# List available regions for OpenAI
az cognitiveservices account list-skus \
  --kind OpenAI \
  --query "[].locations[]" \
  --output table
```

- Update the AZD configuration:

```yaml
# azure.yaml - Force a specific region
infra:
  provider: bicep
  path: infra
  module: main
  parameters:
    location: "eastus2" # Known working region
```

- Use alternative regions:

```bicep
// infra/main.bicep - Multi-region fallback
@allowed([
  'eastus2'
  'francecentral'
  'canadaeast'
  'swedencentral'
])
param openAiLocation string = 'eastus2'
```
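Where automation is useful, the region check above can be scripted. The helper below is a hypothetical sketch (the function name and preference list are ours, not part of AZD): it normalizes the location names reported by `az cognitiveservices account list-skus` and returns the first preferred region that is actually available.

```python
# Hypothetical helper: pick the first preferred region that offers the
# SKU, based on the location list returned by the
# `az cognitiveservices account list-skus` query shown above.
PREFERRED_REGIONS = ["eastus2", "swedencentral", "francecentral", "canadaeast"]

def pick_openai_region(available_locations: list[str],
                       preferred: list[str] = PREFERRED_REGIONS) -> str:
    """Return the first preferred region with availability, or raise."""
    # az reports display names like "East US 2"; normalize for comparison
    normalized = {loc.replace(" ", "").lower() for loc in available_locations}
    for region in preferred:
        if region in normalized:
            return region
    raise RuntimeError(f"No preferred region available; got: {sorted(normalized)}")
```

The chosen region could then be fed to `azd env set AZURE_LOCATION <region>` before provisioning.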
### Issue: Insufficient Quota

Symptoms:

```text
Error: Deployment failed due to insufficient quota
```

Solutions:

- Check current quota:

```bash
# Check quota usage
az cognitiveservices usage list \
  --name YOUR_OPENAI_RESOURCE \
  --resource-group YOUR_RG
```

- Request a quota increase:

```bash
# Submit a quota increase request
az support tickets create \
  --ticket-name "OpenAI Quota Increase" \
  --description "Need increased quota for production deployment" \
  --severity "minimal" \
  --problem-classification "/providers/Microsoft.Support/services/quota_service_guid/problemClassifications/quota_service_problemClassification_guid"
```

- Optimize model capacity:

```bicep
// Reduce initial capacity (assumes an existing `openAi` account resource)
resource deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAi
  name: 'chat-model'
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4.1-mini'
      version: '2024-07-18'
    }
  }
  sku: {
    name: 'Standard'
    capacity: 1 // Start with minimal capacity
  }
}
```
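Before requesting more quota, it helps to estimate how much capacity a deployment actually needs. The sketch below assumes the commonly documented sizing for Standard deployments, where the `capacity` value is expressed in units of roughly 1,000 tokens per minute (TPM); verify the unit size for your specific model before relying on it.

```python
import math

# Assumption: 1 capacity unit ~ 1,000 TPM for Standard deployments.
# Check the quota documentation for your model before using this number.
TPM_PER_CAPACITY_UNIT = 1000

def estimate_capacity(requests_per_minute: float,
                      avg_tokens_per_request: float,
                      headroom: float = 1.2) -> int:
    """Estimate the deployment `capacity` value with a safety margin."""
    tpm_needed = requests_per_minute * avg_tokens_per_request * headroom
    return max(1, math.ceil(tpm_needed / TPM_PER_CAPACITY_UNIT))
```

For example, 10 requests/minute at ~500 tokens each with 20% headroom works out to a capacity of 6.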
### Issue: Unsupported API Version

Symptoms:

```text
Error: The API version '2023-05-15' is not available for OpenAI
```

Solutions:

- Use a supported API version:

```python
# Use the latest supported version
AZURE_OPENAI_API_VERSION = "2024-02-15-preview"
```

- Check API version compatibility:

```bash
# List supported API versions
az rest --method get \
  --url "https://management.azure.com/providers/Microsoft.CognitiveServices/operations?api-version=2023-05-01" \
  --query "value[?name.value=='Microsoft.CognitiveServices/accounts/read'].properties.serviceSpecification.metricSpecifications[].supportedApiVersions[]"
```
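When several API versions are in play, a small helper can pick the newest one from the list returned by the compatibility check above. This is an illustrative sketch (the function is ours, not part of any SDK); it sorts version strings of the form `YYYY-MM-DD` or `YYYY-MM-DD-preview` by date, preferring GA over preview on a tie.

```python
from datetime import date

def pick_api_version(supported: list[str]) -> str:
    """Pick the newest supported API version.

    Versions look like '2024-02-01' (GA) or '2024-02-15-preview'.
    Newest date wins; on a date tie, GA beats preview.
    """
    def key(version: str) -> tuple:
        parts = version.split("-")
        release_date = date(int(parts[0]), int(parts[1]), int(parts[2]))
        is_ga = len(parts) == 3  # no '-preview' suffix
        return (release_date, is_ga)
    return max(supported, key=key)
```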
## Azure AI Search Problems

### Issue: Semantic Search Tier Requirements

Symptoms:

```text
Error: Semantic search requires Basic tier or higher
```

Solutions:

- Upgrade the pricing tier:

```bicep
// infra/main.bicep - Use Basic tier
resource searchService 'Microsoft.Search/searchServices@2023-11-01' = {
  name: searchServiceName
  location: location
  sku: {
    name: 'basic' // Minimum tier for semantic search
  }
  properties: {
    replicaCount: 1
    partitionCount: 1
    hostingMode: 'default'
    semanticSearch: 'standard'
  }
}
```

- Disable semantic search (development):

```bicep
// For development environments
resource searchService 'Microsoft.Search/searchServices@2023-11-01' = {
  name: searchServiceName
  location: location
  sku: {
    name: 'free'
  }
  properties: {
    semanticSearch: 'disabled'
  }
}
```
### Issue: Index Creation Permissions

Symptoms:

```text
Error: Cannot create index, insufficient permissions
```

Solutions:

- Verify the search service keys:

```bash
# Get the search service admin key
az search admin-key show \
  --service-name YOUR_SEARCH_SERVICE \
  --resource-group YOUR_RG
```

- Check the index schema:

```python
# Validate the index schema before creation
from azure.search.documents.indexes.models import SearchIndex

def validate_index_schema(index_definition: SearchIndex) -> None:
    """Validate the index schema before creation."""
    required_fields = ['id', 'content']
    field_names = [field.name for field in index_definition.fields]
    for required in required_fields:
        if required not in field_names:
            raise ValueError(f"Missing required field: {required}")
```

- Use a managed identity:

```bicep
// Grant search permissions to the managed identity
resource searchContributor 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: searchService
  name: guid(searchService.id, containerApp.id, searchIndexDataContributorRole)
  properties: {
    principalId: containerApp.identity.principalId
    // Search Index Data Contributor
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '8ebe5a00-799e-43f5-93ac-243d3dce84a7')
    principalType: 'ServicePrincipal'
  }
}
```
## Container Apps Deployment Issues

### Issue: Container Image Build Failures

Symptoms:

```text
Error: Failed to build container image
```

Solutions:

- Check the Dockerfile syntax:

```dockerfile
# Dockerfile - Python AI app example
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

- Validate dependencies:

```text
# requirements.txt - Pin versions for stability
fastapi==0.104.1
uvicorn==0.24.0
openai==1.3.7
azure-identity==1.14.1
azure-keyvault-secrets==4.7.0
azure-search-documents==11.4.0
azure-cosmos==4.5.1
```

- Add a health check:

```python
# main.py - Add a health check endpoint
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health_check():
    return {"status": "healthy"}
```
### Issue: Container Startup Timeout

Symptoms:

```text
Error: Container failed to start within timeout period
```

Solutions:

- Increase the startup probe tolerance:

```bicep
resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
  properties: {
    template: {
      containers: [
        {
          name: 'main'
          image: containerImage
          resources: {
            cpu: json('0.5')
            memory: '1Gi'
          }
          probes: [
            {
              type: 'startup'
              httpGet: {
                path: '/health'
                port: 8000
              }
              initialDelaySeconds: 30
              periodSeconds: 10
              timeoutSeconds: 5
              failureThreshold: 10 // Allow more time for AI models to load
            }
          ]
        }
      ]
    }
  }
}
```

- Optimize model loading:

```python
# Lazy-load models to reduce startup time
from contextlib import asynccontextmanager

from fastapi import FastAPI

class ModelManager:
    def __init__(self):
        self._client = None

    async def get_client(self):
        if self._client is None:
            self._client = await self._initialize_client()
        return self._client

    async def _initialize_client(self):
        # Initialize the AI client here
        pass

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    app.state.model_manager = ModelManager()
    yield
    # Shutdown
    pass

app = FastAPI(lifespan=lifespan)
```
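The payoff of the lazy-loading pattern above is that expensive client construction happens once, on first use, instead of blocking the startup probe. Here is a minimal, dependency-free sketch of the same pattern (the `LazyClient` class is illustrative, not part of any SDK):

```python
import asyncio

class LazyClient:
    """Minimal stand-in for the lazy-initialization pattern above:
    the expensive setup runs once, on first use, not at process start."""
    init_count = 0  # counts how many times setup actually ran

    def __init__(self):
        self._client = None

    async def get_client(self):
        if self._client is None:
            # Simulate expensive setup (model load, auth handshake, ...)
            LazyClient.init_count += 1
            self._client = object()
        return self._client

async def main():
    manager = LazyClient()
    a = await manager.get_client()
    b = await manager.get_client()
    return a is b, LazyClient.init_count

# Both calls return the same client and setup ran exactly once
same, inits = asyncio.run(main())
```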
## Authentication and Permission Errors

### Issue: Managed Identity Authentication Failures

Symptoms:

```text
Error: Authentication failed for Microsoft Foundry Models Service
```

Solutions:

- Verify role assignments:

```bash
# Check current role assignments
az role assignment list \
  --assignee YOUR_MANAGED_IDENTITY_ID \
  --scope /subscriptions/YOUR_SUBSCRIPTION/resourceGroups/YOUR_RG
```

- Assign the required roles:

```bicep
// Required role assignments for AI services
var cognitiveServicesOpenAIUserRole = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd')
var searchIndexDataContributorRole = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '8ebe5a00-799e-43f5-93ac-243d3dce84a7')

resource openAiRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: openAi
  name: guid(openAi.id, containerApp.id, cognitiveServicesOpenAIUserRole)
  properties: {
    principalId: containerApp.identity.principalId
    roleDefinitionId: cognitiveServicesOpenAIUserRole
    principalType: 'ServicePrincipal'
  }
}
```

- Test authentication:

```python
# Test managed identity authentication
from azure.core.exceptions import ClientAuthenticationError
from azure.identity.aio import DefaultAzureCredential  # async credential

async def test_authentication():
    try:
        async with DefaultAzureCredential() as credential:
            token = await credential.get_token("https://cognitiveservices.azure.com/.default")
            print(f"Authentication successful: {token.token[:10]}...")
    except ClientAuthenticationError as e:
        print(f"Authentication failed: {e}")
```
### Issue: Key Vault Access Denied

Symptoms:

```text
Error: The user, group or application does not have secrets get permission
```

Solutions:

- Grant Key Vault permissions (access policies):

```bicep
resource keyVaultAccessPolicy 'Microsoft.KeyVault/vaults/accessPolicies@2023-07-01' = {
  parent: keyVault
  name: 'add'
  properties: {
    accessPolicies: [
      {
        tenantId: subscription().tenantId
        objectId: containerApp.identity.principalId
        permissions: {
          secrets: ['get', 'list']
        }
      }
    ]
  }
}
```

- Use RBAC instead of access policies:

```bicep
resource keyVaultSecretsUserRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: keyVault
  name: guid(keyVault.id, containerApp.id, 'Key Vault Secrets User')
  properties: {
    principalId: containerApp.identity.principalId
    // Key Vault Secrets User
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '4633458b-17de-408a-b874-0445c86b69e6')
    principalType: 'ServicePrincipal'
  }
}
```
## Model Deployment Failures

### Issue: Model Version Not Available

Symptoms:

```text
Error: Model version 'gpt-4-32k' is not available
```

Solutions:

- Check available models:

```bash
# List available models
az cognitiveservices account list-models \
  --name YOUR_OPENAI_RESOURCE \
  --resource-group YOUR_RG \
  --query "[].{name:model.name, version:model.version}" \
  --output table
```

- Use model fallbacks:

```bicep
// Model deployment with fallback
@description('Primary model configuration')
param primaryModel object = {
  name: 'gpt-4.1-mini'
  version: '2024-07-18'
}

@description('Fallback model configuration')
param fallbackModel object = {
  name: 'gpt-4.1'
  version: '2024-08-06'
}

// Try the primary model first; switch to the fallback if unavailable
resource primaryDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAi
  name: 'chat-model'
  properties: {
    model: primaryModel
  }
  sku: {
    name: 'Standard'
    capacity: 10
  }
}
```

- Validate the model before deployment:

```python
# Pre-deployment model validation
import os

import httpx

AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_API_KEY"]

async def validate_model_availability(model_name: str, version: str) -> bool:
    """Check whether a model is available before deployment."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{AZURE_OPENAI_ENDPOINT}/openai/models?api-version=2024-02-15-preview",
                headers={"api-key": AZURE_OPENAI_API_KEY}
            )
            models = response.json()
            return any(
                model["id"] == f"{model_name}-{version}"
                for model in models.get("data", [])
            )
    except Exception:
        return False
```
## Performance and Scaling Issues

### Issue: Slow Response Times

Symptoms:

- Response times over 30 seconds
- Timeout errors
- Poor user experience

Solutions:

- Implement request timeouts:

```python
# Configure explicit timeouts
import httpx

client = httpx.AsyncClient(
    timeout=httpx.Timeout(
        connect=5.0,
        read=30.0,
        write=10.0,
        pool=10.0
    )
)
```

- Add response caching:

```python
# Redis cache for AI responses
import redis.asyncio as redis

class ResponseCache:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)

    async def get_cached_response(self, query_hash: str) -> str | None:
        """Get a cached response if available."""
        cached = await self.redis.get(f"ai_response:{query_hash}")
        return cached.decode() if cached else None

    async def cache_response(self, query_hash: str, response: str, ttl: int = 3600):
        """Cache an AI response with a TTL."""
        await self.redis.setex(f"ai_response:{query_hash}", ttl, response)
```

- Configure auto-scaling:

```bicep
resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
  properties: {
    template: {
      scale: {
        minReplicas: 2
        maxReplicas: 20
        rules: [
          {
            name: 'http-requests'
            http: {
              metadata: {
                concurrentRequests: '5' // Scale aggressively for AI workloads
              }
            }
          }
          {
            name: 'cpu-utilization'
            custom: {
              type: 'cpu'
              metadata: {
                type: 'Utilization'
                value: '60' // Lower threshold for AI apps
              }
            }
          }
        ]
      }
    }
  }
}
```
### Issue: Memory Limits Exceeded

Symptoms:

```text
Error: Container killed due to memory limit exceeded
```

Solutions:

- Increase the memory allocation:

```bicep
resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
  properties: {
    template: {
      containers: [
        {
          name: 'main'
          resources: {
            cpu: json('1.0')
            memory: '2Gi' // Increase for AI workloads
          }
        }
      ]
    }
  }
}
```

- Optimize memory usage:

```python
# Memory-efficient request handling
import gc

import psutil

class MemoryOptimizedAI:
    def __init__(self):
        self.max_memory_percent = 80

    async def process_request(self, request):
        """Process a request with memory monitoring."""
        # Check memory usage before processing
        memory_percent = psutil.virtual_memory().percent
        if memory_percent > self.max_memory_percent:
            gc.collect()  # Force garbage collection

        result = await self._process_ai_request(request)

        # Clean up after processing
        gc.collect()
        return result
```
## Cost and Quota Management

### Issue: Unexpected Costs

Symptoms:

- Azure bill higher than expected
- Token usage exceeding estimates
- Budget alerts triggered

Solutions:

- Implement cost controls:

```python
# Token usage tracking
class TokenTracker:
    def __init__(self, monthly_limit: int = 100000):
        self.monthly_limit = monthly_limit
        self.current_usage = 0

    async def track_usage(self, prompt_tokens: int, completion_tokens: int) -> int:
        """Track token usage against a monthly limit."""
        total_tokens = prompt_tokens + completion_tokens
        self.current_usage += total_tokens
        if self.current_usage > self.monthly_limit:
            raise RuntimeError("Monthly token limit exceeded")
        return total_tokens
```

- Set up cost alerts:

```bicep
resource budgetAlert 'Microsoft.Consumption/budgets@2023-05-01' = {
  name: 'ai-workload-budget'
  properties: {
    timePeriod: {
      startDate: '2024-01-01'
      endDate: '2024-12-31'
    }
    timeGrain: 'Monthly'
    amount: 500 // $500 monthly limit
    category: 'Cost'
    notifications: {
      Actual_GreaterThan_80_Percent: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 80
        contactEmails: ['admin@company.com']
        contactRoles: ['Owner']
      }
    }
  }
}
```

- Optimize model selection:

```python
# Cost-aware model selection
MODEL_COST_TIERS = {
    'gpt-4.1-mini': 'low',
    'gpt-4.1': 'high'
}

def select_model_by_cost(complexity: str, budget_remaining: float) -> str:
    """Select a model based on task complexity and remaining budget."""
    if complexity == 'simple' or budget_remaining < 10:
        return 'gpt-4.1-mini'
    return 'gpt-4.1'
```
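Token counting can also feed a rough cost estimate. The prices below are placeholders, not real Azure OpenAI rates; substitute current figures from the pricing page for your region and model before using anything like this:

```python
# Placeholder per-1K-token prices in USD - NOT real prices. Look up the
# current Azure OpenAI pricing page before relying on these numbers.
PRICING_PER_1K = {
    "gpt-4.1-mini": {"prompt": 0.0004, "completion": 0.0016},
    "gpt-4.1": {"prompt": 0.002, "completion": 0.008},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Rough cost estimate for one request, in USD."""
    rates = PRICING_PER_1K[model]
    return (prompt_tokens / 1000) * rates["prompt"] + \
           (completion_tokens / 1000) * rates["completion"]
```

Combined with `TokenTracker`, this lets budget checks operate in dollars rather than raw tokens.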
## Debugging Tools and Techniques

### AZD Debugging Commands

```bash
# Enable verbose logging
azd up --debug

# Check deployment status
azd show

# View application logs (opens the monitoring dashboard)
azd monitor --logs

# View live metrics
azd monitor --live

# Check environment variables
azd env get-values
```

If you deployed agents using `azd ai agent init`, these additional tools are available:
```bash
# Ensure the agents extension is installed
azd extension install azure.ai.agents

# Re-initialize or update an agent from a manifest
azd ai agent init -m agent-manifest.yaml --project-id <foundry-project-id>

# Use the MCP server to let AI tools query project state
azd mcp start

# Generate infrastructure files for review and audit
azd infra generate
```

Tip: Use `azd infra generate` to write IaC to disk so you can inspect exactly what resources were provisioned. This is invaluable when debugging resource configuration issues. See the AZD AI CLI reference for full details.
### Application-Level Debugging

- Structured logging:

```python
import json
import logging

# Configure structured logging for AI applications
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

def log_ai_request(model: str, tokens: int, latency: float, success: bool):
    """Log AI request details as structured JSON."""
    logger.info(json.dumps({
        'event': 'ai_request',
        'model': model,
        'tokens': tokens,
        'latency_ms': latency,
        'success': success
    }))
```
- Health check endpoints:

```python
import os

from azure.identity.aio import DefaultAzureCredential
from azure.search.documents.indexes.aio import SearchIndexClient
from fastapi import FastAPI
from openai import AsyncAzureOpenAI

app = FastAPI()

AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
AZURE_SEARCH_ENDPOINT = os.environ["AZURE_SEARCH_ENDPOINT"]

@app.get("/debug/health")
async def detailed_health_check():
    """Comprehensive health check for debugging."""
    checks = {}

    # Check OpenAI connectivity
    try:
        client = AsyncAzureOpenAI(azure_endpoint=AZURE_OPENAI_ENDPOINT,
                                  api_version="2024-02-15-preview")
        await client.models.list()
        checks['openai'] = {'status': 'healthy'}
    except Exception as e:
        checks['openai'] = {'status': 'unhealthy', 'error': str(e)}

    # Check the Search service
    try:
        search_client = SearchIndexClient(
            endpoint=AZURE_SEARCH_ENDPOINT,
            credential=DefaultAzureCredential()
        )
        indexes = [name async for name in search_client.list_index_names()]
        checks['search'] = {'status': 'healthy', 'indexes': indexes}
    except Exception as e:
        checks['search'] = {'status': 'unhealthy', 'error': str(e)}

    return checks
```
- Performance monitoring:

```python
import json
import logging
import time
from functools import wraps

logger = logging.getLogger(__name__)

def monitor_performance(func):
    """Decorator to monitor async function performance."""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start_time = time.time()
        success = False
        try:
            result = await func(*args, **kwargs)
            success = True
            return result
        finally:
            latency = (time.time() - start_time) * 1000
            logger.info(json.dumps({
                'function': func.__name__,
                'latency_ms': latency,
                'success': success
            }))
    return wrapper
```
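To see the decorator pattern in action, here is a self-contained sketch wired to a stand-in coroutine (the decorator body is repeated so this block runs on its own; `fake_ai_call` is illustrative, not a real model call):

```python
import asyncio
import json
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def monitor_performance(func):
    """Log latency and success for an async function (same pattern as above)."""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.time()
        success = False
        try:
            result = await func(*args, **kwargs)
            success = True
            return result
        finally:
            latency = (time.time() - start) * 1000
            logger.info(json.dumps({
                'function': func.__name__,
                'latency_ms': latency,
                'success': success,
            }))
    return wrapper

@monitor_performance
async def fake_ai_call(delay: float) -> str:
    # Stand-in for a real model call
    await asyncio.sleep(delay)
    return "done"

# Runs the wrapped coroutine and emits one structured log line
result = asyncio.run(fake_ai_call(0.01))
```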
### Common Error Codes

| Error Code | Description | Solution |
|---|---|---|
| 401 | Unauthorized | Check API keys and managed identity configuration |
| 403 | Forbidden | Verify RBAC role assignments |
| 429 | Rate Limited | Implement retry logic with exponential backoff |
| 500 | Internal Server Error | Check model deployment status and logs |
| 503 | Service Unavailable | Verify service health and regional availability |
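For the 429 row in particular, a retry wrapper with exponential backoff and jitter is the standard remedy. This sketch assumes the failing callable raises an exception exposing a `status_code` attribute, as Azure and OpenAI Python SDK errors generally do:

```python
import asyncio
import random

async def call_with_backoff(func, max_retries: int = 5, base_delay: float = 1.0):
    """Retry an async callable on HTTP 429 with exponential backoff.

    `func` is any zero-argument coroutine function; non-429 errors and the
    final failed attempt are re-raised unchanged.
    """
    for attempt in range(max_retries):
        try:
            return await func()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == max_retries - 1:
                raise
            # Exponential backoff (1x, 2x, 4x base_delay, ...) plus jitter
            # so concurrent clients do not retry in lockstep
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
```

Wrapping a chat-completion call in `call_with_backoff` turns transient rate limiting into a short delay instead of a user-facing failure.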
## Next Steps

- Review the AI Model Deployment Guide for deployment best practices
- Complete Production AI Practices for enterprise-ready solutions
- Join the Microsoft Foundry Discord for community support
- Submit issues to the AZD GitHub repository for AZD-specific problems

## Additional Resources

- Microsoft Foundry Models Service Troubleshooting
- Container Apps Troubleshooting
- Azure AI Search Troubleshooting
- Azure Diagnostics Agent Skill - install Azure troubleshooting skills in your editor:

```bash
npx skills add microsoft/github-copilot-for-azure
```
- 📖 Reference: Azure Developer CLI Troubleshooting