OpsPilot - AI Skill Microsoft Hackathon project

OpsPilot is an autonomous, multi-agent SRE command center built on Azure AI Foundry for the Microsoft Agents League Hackathon (Reasoning Agents track). It compresses production incident triage from hours of manual, high-pressure work into an evidence-backed, sub-5-minute investigation — and escalates to a frontier reasoning model only when it needs to. When an incident fires, a Commander agent dispatches specialized AI agents — Metrics, Logs, and Deployment — that investigate in parallel. A Time Machine agent builds a causal timeline, a Root Cause agent produces a confidence-scored hypothesis, and a Recommendation agent proposes ranked, executable fixes — all streamed live to an Azure-Portal-style command center. ⭐ The differentiator — confidence-gated reasoning escalation: OpsPilot computes a combined confidence score and only routes to the o4-mini reasoning model when confidence is low. Frontier reasoning stays off the hot path unless it's actually needed — controlling cost and latency by design. 🔧 Built with: • Azure AI Foundry / Azure OpenAI — GPT-4o, GPT-4o-mini, o4-mini • LangGraph multi-agent orchestration (StateGraph) • FastAPI + Python 3.12 backend • React + TypeScript + Fluent UI v9 frontend • Azure Container Apps · Docker · GitHub Actions (GHCR) ⏱ Chapters (adjust timestamps to match your video): 0:00 Intro 0:30 The problem with incident response today 1:15 OpsPilot dashboard tour 2:30 Triggering an autonomous investigation 3:45 Specialist agents working in parallel 5:00 Confidence-scored root cause + reasoning escalation 6:15 Ranked remediations & one-click rollback 7:15 Analytics & wrap-up 🔗 Links: • GitHub (source + run instructions): https://github.com/rewyekha/OpsPilot • Portfolio: https://reyaskhan.me Built by Reyas Khan. #MicrosoftAgentsLeague #AzureAIFoundry #AIAgents #MultiAgent #SRE #DevOps #AIOps #LangGraph #Azure #IncidentResponse #ReasoningAgents