Your weekly dose of actionable cloud wisdom to start the week right
The Problem
It’s 2 PM, your application is misbehaving, users are complaining, and you’re staring at Azure Monitor with millions of log entries. You know the answer is in there somewhere, but writing KQL queries feels like learning a foreign language whilst your service burns down around you.
The Solution
Master these essential KQL (Kusto Query Language) queries that solve real troubleshooting scenarios. Instead of drowning in logs, you’ll pinpoint issues in minutes and look like a hero whilst doing it.
Essential Troubleshooting Queries:
1. Find Application Errors in the Last Hour
// Quick error hunt - your go-to starting query
AppServiceHTTPLogs
| where TimeGenerated > ago(1h)
| where ScStatus >= 400
| summarize ErrorCount = count() by ScStatus, CsUriStem
| order by ErrorCount desc
| take 20
2. Slow API Endpoints Analysis
// Find your performance bottlenecks
AppServiceHTTPLogs
| where TimeGenerated > ago(4h)
| where TimeTaken > 5000 // Requests taking over 5 seconds
| summarize
    AvgTime = avg(TimeTaken),
    MaxTime = max(TimeTaken),
    Count = count()
    by CsUriStem
| order by AvgTime desc
3. Failed Login Attempts Detection
// Security monitoring - spot brute force attempts
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0" // Non-successful logins
| summarize
    FailedAttempts = count(),
    UniqueIPs = dcount(IPAddress)
    by UserPrincipalName
| where FailedAttempts > 10
| order by FailedAttempts desc
4. Resource Health Check
// Quick service health overview
AzureActivity
| where TimeGenerated > ago(2h)
| where ActivityStatusValue == "Failure" // Legacy schema: ActivityStatus == "Failed"
| summarize Count = count() by ResourceGroup, OperationName
| order by Count desc
5. Database Connection Issues
// Spot database connectivity problems
AzureDiagnostics
| where TimeGenerated > ago(1h)
| where Category == "SQLSecurityAuditEvents"
| where statement_s contains "login failed"
| summarize FailedConnections = count() by client_ip_s, server_principal_name_s
| order by FailedConnections desc
6. Memory and CPU Correlation
// Find resource correlation issues
Perf
| where TimeGenerated > ago(2h)
| where (ObjectName == "Processor" and CounterName == "% Processor Time")
    or (ObjectName == "Memory" and CounterName == "Available MBytes")
| summarize avg(CounterValue) by Computer, CounterName, bin(TimeGenerated, 5m)
| render timechart
Why It Matters
- Mean Time to Resolution: Cut debugging time from hours to minutes
- Proactive Monitoring: Spot issues before they become outages
- Data-Driven Decisions: Use actual data, not gut feelings
- Team Collaboration: Share queries that work across your team
Try This Week
- Bookmark useful queries – Save these in your Azure Monitor query library
- Create custom alerts – Turn successful queries into proactive monitoring
- Build a troubleshooting runbook – Document your team’s go-to queries
- Practice with historical data – Run queries against last week’s logs to learn patterns
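To make the "create custom alerts" step concrete, here is a minimal sketch of an alert-ready query. It returns rows only when something is wrong, which is the shape a log alert rule typically evaluates. The 15-minute window and the threshold of 50 are assumptions for illustration, so tune them to your own traffic.

```kql
// Sketch of an alert query: one row per endpoint that breaches the threshold.
// The window (15m) and threshold (50) are illustrative assumptions.
AppServiceHTTPLogs
| where TimeGenerated > ago(15m)
| where ScStatus >= 500                 // Server-side errors only
| summarize ErrorCount = count() by CsUriStem
| where ErrorCount > 50                 // No rows returned means nothing to alert on
```

An alert rule built on this would fire whenever the query returns at least one row.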
Advanced: Create Reusable Functions
// Save this as a function called "QuickErrorSummary"
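// A let function lives only for the query it is defined in.
// To reuse it everywhere, save the body as a function called "QuickErrorSummary".
let QuickErrorSummary = (timeRange: timespan) {
    AppServiceHTTPLogs
    | where TimeGenerated > ago(timeRange)
    | where ScStatus >= 400
    // Cast the status to a string so the chart groups by it instead of plotting it
    | summarize Count = count() by bin(TimeGenerated, 5m), Status = tostring(ScStatus)
};
// Call it, then chart the result; render belongs at the end of the calling query
QuickErrorSummary(2h)
| render timechart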
KQL Tips for Faster Debugging
- Start narrow: Begin with specific time ranges and expand if needed
- Use take liberally: Add | take 100 to limit results whilst exploring
- Summarise early: Use summarize to spot patterns instead of scrolling through rows
- Visualise when possible: render timechart or render columnchart reveals trends
- Save successful queries: Build your personal troubleshooting toolkit
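The tips above can be sketched as a two-step workflow: peek at a small sample first, then summarise and chart. The time windows are assumptions, and the two queries are meant to be run separately.

```kql
// Step 1: start narrow and sample raw rows to learn the shape of the data
AppServiceHTTPLogs
| where TimeGenerated > ago(15m)
| take 100

// Step 2: widen the window, summarise early, then visualise
AppServiceHTTPLogs
| where TimeGenerated > ago(1h)
| summarize Requests = count() by bin(TimeGenerated, 5m), Status = tostring(ScStatus)
| render timechart
```

Casting ScStatus to a string lets the chart draw one series per status code rather than treating the code as a value to plot.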
Common Troubleshooting Scenarios
- Performance degradation: Check response times and resource usage correlation
- Authentication issues: Review sign-in logs and failed connection attempts
- Deployment problems: Monitor activity logs during release windows
- Intermittent errors: Look for patterns in error frequency and timing
- Resource exhaustion: Correlate performance counters with application metrics
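For the intermittent errors scenario, one possible approach is to chart the error rate over time rather than the raw count, so that busy periods do not mask a flaky endpoint. This is a sketch only; the 24-hour window and 15-minute buckets are assumed values.

```kql
// Sketch: percentage of requests failing with a 5xx status, per 15-minute bucket
AppServiceHTTPLogs
| where TimeGenerated > ago(24h)
| summarize
    Total = count(),
    Errors = countif(ScStatus >= 500)
    by bin(TimeGenerated, 15m)
| extend ErrorRatePct = round(100.0 * Errors / Total, 2)
| project TimeGenerated, ErrorRatePct
| render timechart
```

Regular spikes at the same time each hour or day often point at scheduled jobs or deployments, which is a useful lead to check against your activity logs.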
Pro Tips
- Use workbooks: Combine multiple queries into dashboards for ongoing monitoring
- Set up alerts: Convert your best troubleshooting queries into proactive alerts
- Share with the team: Export successful queries and build a team knowledge base
- Learn incrementally: Master one query type per week rather than trying everything at once
Hidden Gem: The search operator can search across ALL tables when you’re not sure where to look: search "error message" will find that needle in the haystack.
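A minimal sketch of that search in practice: rather than reading every hit, count matches per table to learn where to dig next. The phrase "connection timeout" is a placeholder, not something taken from a real log. Searching every table can be slow on a large workspace, so keep the time filter tight.

```kql
// Sketch: which tables mention the phrase in the last hour?
// "connection timeout" is a placeholder search term.
search "connection timeout"
| where TimeGenerated > ago(1h)
| summarize Hits = count() by $table
| order by Hits desc
```

Once you know the table, switch to a targeted query against it, which will usually run much faster than a workspace-wide search.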
Got a KQL query that saved your bacon during an incident? Share it with me – real-world troubleshooting stories make the best Monday tips!








