Monday Cloud Tip: Azure Monitor Log Analytics KQL Queries That Actually Help

Your weekly dose of actionable cloud wisdom to start the week right

The Problem

It’s 2 PM, your application is misbehaving, users are complaining, and you’re staring at Azure Monitor with millions of log entries. You know the answer is in there somewhere, but writing KQL queries feels like learning a foreign language whilst your service burns down around you.

The Solution

Master these essential KQL (Kusto Query Language) queries that solve real troubleshooting scenarios. Instead of drowning in logs, you’ll pinpoint issues in minutes and look like a hero whilst doing it.

Essential Troubleshooting Queries:

1. Find Application Errors in the Last Hour

// Quick error hunt - your go-to starting query
AppServiceHTTPLogs
| where TimeGenerated > ago(1h)
| where ScStatus >= 400
| summarize ErrorCount = count() by ScStatus, CsUriStem
| order by ErrorCount desc
| take 20

2. Slow API Endpoints Analysis

// Find your performance bottlenecks
AppServiceHTTPLogs
| where TimeGenerated > ago(4h)
| where TimeTaken > 5000  // TimeTaken is in milliseconds, so this keeps requests over 5 seconds
| summarize 
    AvgTime = avg(TimeTaken),
    MaxTime = max(TimeTaken),
    Count = count()
    by CsUriStem
| order by AvgTime desc
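
Because the query above filters to slow requests before averaging, AvgTime only describes the slow tail. As a rough complement, this sketch computes percentiles across all requests per endpoint. The 20-request floor is an arbitrary placeholder to hide rarely hit endpoints, not a recommended value.

```kusto
// Latency distribution per endpoint across ALL requests, not just slow ones
AppServiceHTTPLogs
| where TimeGenerated > ago(4h)
| summarize
    P50 = percentile(TimeTaken, 50),
    P95 = percentile(TimeTaken, 95),
    Requests = count(),
    SlowRequests = countif(TimeTaken > 5000)  // same 5-second cut-off as above
    by CsUriStem
| where Requests >= 20  // placeholder floor: ignore endpoints with very little traffic
| order by P95 desc
| take 20
```

An endpoint with a high P95 and a low P50 is usually slow only some of the time, which is a different problem from one that is slow for everyone.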

3. Failed Login Attempts Detection

// Security monitoring - spot brute force attempts
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0"  // Non-successful logins
| summarize 
    FailedAttempts = count(),
    UniqueIPs = dcount(IPAddress)
    by UserPrincipalName
| where FailedAttempts > 10
| order by FailedAttempts desc
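
The query above groups by user, which suits brute force against a single account. If you suspect one source trying many accounts instead, a variation grouped by IP address may help. The threshold of 5 users is an assumption for illustration; tune it to your tenant.

```kusto
// Same failed sign-ins, pivoted by source IP to spot one address hitting many accounts
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0"  // Non-successful logins
| summarize
    FailedAttempts = count(),
    TargetedUsers = dcount(UserPrincipalName)
    by IPAddress
| where TargetedUsers > 5  // placeholder threshold
| order by TargetedUsers desc
```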

4. Resource Health Check

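// Failed management operations by resource group - a quick health signal
AzureActivity
| where TimeGenerated > ago(2h)
| where ActivityStatusValue in~ ("Failure", "Failed")  // ActivityStatus is the legacy column
| summarize Count = count() by ResourceGroup, OperationNameValue
| order by Count desc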

5. Database Connection Issues

// Spot database connectivity problems
AzureDiagnostics
| where TimeGenerated > ago(1h)
| where Category == "SQLSecurityAuditEvents"
| where statement_s contains "login failed"
| summarize FailedConnections = count() by client_ip_s, server_principal_name_s
| order by FailedConnections desc

6. Memory and CPU Correlation

// Find resource correlation issues
Perf
| where TimeGenerated > ago(2h)
| where (ObjectName == "Processor" and CounterName == "% Processor Time")
    or (ObjectName == "Memory" and CounterName == "Available MBytes")
| summarize avg(CounterValue) by Computer, CounterName, bin(TimeGenerated, 5m)
| render timechart

Why It Matters

  • Mean Time to Resolution: Cut debugging time from hours to minutes
  • Proactive Monitoring: Spot issues before they become outages
  • Data-Driven Decisions: Use actual data, not gut feelings
  • Team Collaboration: Share queries that work across your team

Try This Week

  1. Bookmark useful queries – Save these in your Azure Monitor query library
  2. Create custom alerts – Turn successful queries into proactive monitoring
  3. Build a troubleshooting runbook – Document your team’s go-to queries
  4. Practice with historical data – Run queries against last week’s logs to learn patterns
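
For step 2, an alert-friendly query usually boils down to a number per time bucket that you can compare against a threshold. Here is a minimal sketch based on the error query from earlier. The threshold of 25 is a made-up placeholder, and the time window is assumed to come from the alert rule's evaluation period rather than from the query itself.

```kusto
// Alert-shaped query: returns rows only when server errors exceed the threshold
AppServiceHTTPLogs
| where ScStatus >= 500
| summarize ServerErrors = count() by bin(TimeGenerated, 5m)
| where ServerErrors > 25  // placeholder threshold: base it on your normal error volume
```

Run it against last week's logs first (step 4) to see how often it would have fired before you wire it to a pager.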

Advanced: Create Reusable Functions

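// Define a reusable function inline. A let statement must be followed by a query,
// and render belongs at the end of the calling query, not inside the function body.
// To reuse it across sessions, save the function body as "QuickErrorSummary".
let QuickErrorSummary = (timeRange: timespan) {
    AppServiceHTTPLogs
    | where TimeGenerated > ago(timeRange)
    | where ScStatus >= 400
    | summarize Count = count() by ScStatus, bin(TimeGenerated, 5m)
};
QuickErrorSummary(2h)
| render timechart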

KQL Tips for Faster Debugging

  • Start narrow: Begin with specific time ranges and expand if needed
  • Use take liberally: Add | take 100 to limit results whilst exploring
  • Summarise early: Use summarize to spot patterns instead of scrolling through rows
  • Visualise when possible: render timechart or render columnchart reveals trends
  • Save successful queries: Build your personal troubleshooting toolkit
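
As a rough illustration of how these tips chain together, here are two separate queries you might run back to back. The 15-minute and 1-hour windows are arbitrary starting points.

```kusto
// Query 1: start narrow and sample a few rows to learn what the columns look like
AppServiceHTTPLogs
| where TimeGenerated > ago(15m)
| take 100

// Query 2: widen the window, summarise instead of scrolling, then visualise
AppServiceHTTPLogs
| where TimeGenerated > ago(1h)
| summarize Requests = count() by ScStatus, bin(TimeGenerated, 5m)
| render timechart
```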

Common Troubleshooting Scenarios

  • Performance degradation: Check response times and resource usage correlation
  • Authentication issues: Review sign-in logs and failed connection attempts
  • Deployment problems: Monitor activity logs during release windows
  • Intermittent errors: Look for patterns in error frequency and timing
  • Resource exhaustion: Correlate performance counters with application metrics
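
For the intermittent errors scenario, raw error counts can mislead when traffic varies, so a rate per time bucket is often easier to read. This is one possible sketch; the 15-minute bin and the choice of 5xx as "error" are assumptions you may want to change.

```kusto
// Server error rate over the last day, to reveal recurring spikes
AppServiceHTTPLogs
| where TimeGenerated > ago(24h)
| summarize
    Total = count(),
    Errors = countif(ScStatus >= 500)
    by bin(TimeGenerated, 15m)
| extend ErrorRatePct = round(100.0 * Errors / Total, 2)
| project TimeGenerated, ErrorRatePct
| render timechart
```

Spikes that line up with the top of the hour or with release windows tend to point at scheduled jobs or deployments rather than user traffic.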

Pro Tips

  • Use workbooks: Combine multiple queries into dashboards for ongoing monitoring
  • Set up alerts: Convert your best troubleshooting queries into proactive alerts
  • Share with the team: Export successful queries and build a team knowledge base
  • Learn incrementally: Master one query type per week rather than trying everything at once

Hidden Gem: The search operator looks across every table when you’re not sure where to look: search "error message" will find that needle in the haystack. It scans a lot of data, so pair it with a short time range.
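
A small sketch of how that might look in practice: search adds a $table column to its output, so you can count hits per table first and then query the interesting table directly. The search text and the one-hour window are placeholders.

```kusto
// Find which tables mention the text, limited to the last hour to keep the scan small
search "error message"
| where TimeGenerated > ago(1h)
| summarize Hits = count() by $table
| order by Hits desc
```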


Got a KQL query that saved your bacon during an incident? Share it with me – real-world troubleshooting stories make the best Monday tips!