Warpspeed 2025 to Riquell: The Future of On-Call Without Burnout

The Beginning: A Shot in the Dark

When we first heard about Warpspeed 2025, an agentic AI hackathon organized by Devfolio and Lightspeed India in Bangalore, we knew we had to be there. The numbers were intimidating - more than 2,000 registrations for a hackathon where only around 65 teams would make it to the final round. But sometimes, the best adventures begin with the longest odds.

Our team came together almost serendipitously: Akash Singh, Harsh Kumar Gupta, Himanshu, and I decided to take on this challenge together. Shubhang Sinha joined us later as our AI engineer, bringing additional expertise to strengthen our technical foundation. What we didn't know then was that we were about to turn a shared frustration into a grand prize-winning solution.

The Spark of an Idea: DreamOps

The idea for DreamOps didn't come from a brainstorming session or market research - it came from pain. Real, 3 AM, production-is-down, your-phone-is-buzzing pain.

Akash Singh, being a DevOps engineer himself, had lived through countless nights of being jolted awake by PagerDuty alerts. Picture this: It's 3 AM, your production database is down, users are angry, and you're stumbling in the dark trying to diagnose what went wrong while half-asleep. This was Akash's reality, and the reality of thousands of on-call engineers worldwide.

The problems were clear:

Constant sleep interruptions and alert fatigue
Manual log analysis across multiple systems under pressure
30-60 minutes of stressful debugging for common issues
Inconsistent remediation quality when exhausted
Burnout from repetitive tasks that could be automated

So Akash proposed a solution: DreamOps - an AI-powered on-call partner that would handle routine incidents automatically, letting engineers actually sleep through the night.

Special Thanks to Point Blank Club

Before diving into the technical journey, I want to express our heartfelt gratitude to all the seniors at Point Blank Club who took the time to validate our idea in the early stages. Your insights, feedback, and encouragement gave us the confidence to push forward with DreamOps. The validation from experienced developers and mentors was invaluable in shaping our approach and believing in the potential impact of our solution.

The Technical Challenge: Building the Impossible in 24 Hours

What we were attempting was ambitious - building an AI agent that could automatically triage and resolve infrastructure issues using Claude AI and advanced integrations. For many of us, including me, this was uncharted territory.

I'll be honest - I didn't know enough about MCP (Model Context Protocol) or AI agents when we started. But that's the beauty of hackathons - they push you beyond your comfort zone and force rapid learning under pressure.

The Architecture We Built

DreamOps became an intelligent incident response platform with these core components:

AI-First Architecture: Claude AI integration for advanced reasoning and root cause analysis
Model Context Protocol (MCP): Seamless integration with 10+ tools
Confidence Scoring: Only auto-executes actions with ≥80% confidence
Risk Assessment: Categorizes commands as low/medium/high risk
Production-Ready Stack: Python FastAPI backend, Next.js frontend
Deep Integrations: Kubernetes, PagerDuty, Grafana, GitHub, Slack, Notion
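
To make the confidence scoring and risk assessment items above concrete, here is a minimal sketch of how such a gate might look. Only the 80% threshold and the low/medium/high categories come from the description above; the names `ProposedAction` and `should_auto_execute`, and the rule that only low-risk actions run automatically, are illustrative assumptions rather than the actual DreamOps code.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# The 80% threshold is taken from the text; the rest of this policy is assumed.
CONFIDENCE_THRESHOLD = 0.80


@dataclass(frozen=True)
class ProposedAction:
    command: str       # remediation command suggested by the agent
    confidence: float  # agent's confidence in the fix, from 0.0 to 1.0
    risk: Risk         # risk category assigned to the command


def should_auto_execute(action: ProposedAction) -> bool:
    """Return True only for confident, low-risk actions (assumed policy)."""
    return action.confidence >= CONFIDENCE_THRESHOLD and action.risk is Risk.LOW
```

Under this assumed policy, a 92%-confidence, low-risk restart would run on its own, while anything high-risk would be held back no matter how confident the agent is.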

How It Actually Works

When PagerDuty sends an alert, our AI agent:

Instantly analyzes the incident with full Kubernetes context
Diagnoses root cause using logs, metrics, and documentation
Executes remediation commands automatically (with safety checks)
Only escalates truly complex issues that need human intervention

We even implemented what we playfully called "YOLO Mode" - when enabled, DreamOps autonomously executes remediation commands for common issues like pod crashes, memory issues, and deployment failures. Don't worry though, every action is risk-assessed and confidence-scored!
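
The flow described above can be sketched as a single function that either remediates or escalates. This is a simplified illustration, not the real agent: `Diagnosis`, `handle_alert`, and the injected `diagnose`, `execute`, and `escalate` callables are hypothetical names, and the actual analysis step (Claude plus Kubernetes context) is reduced to a pluggable function.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Diagnosis:
    root_cause: str
    command: str               # proposed remediation command
    confidence: float          # 0.0 to 1.0
    passed_safety_check: bool  # outcome of the risk assessment on `command`


def handle_alert(
    alert: Dict[str, str],
    diagnose: Callable[[Dict[str, str]], Diagnosis],
    execute: Callable[[str], None],
    escalate: Callable[[Dict[str, str], Diagnosis], None],
    yolo_mode: bool = False,
) -> str:
    """Analyze an alert, then remediate automatically or escalate to a human."""
    diagnosis = diagnose(alert)
    auto_ok = (
        yolo_mode
        and diagnosis.passed_safety_check
        and diagnosis.confidence >= 0.80  # threshold from the text
    )
    if auto_ok:
        execute(diagnosis.command)
        return "remediated"
    # Anything uncertain or unsafe goes to the on-call engineer.
    escalate(alert, diagnosis)
    return "escalated"
```

Passing the three steps in as callables keeps the decision logic testable without a live cluster or a PagerDuty account.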

The Team Behind the Magic

Let me properly introduce the incredible team that made this possible:

Akash Singh - The visionary and lead developer who conceived DreamOps from his real-world DevOps pain points. His deep understanding of infrastructure challenges was the foundation of our solution.
Harsh Kumar Gupta - Our full-stack developer who worked across both frontend and backend systems to create a cohesive user experience.
Himanshu - Our backend developer who focused on the core server infrastructure and data processing pipelines.
Myself - As the AI engineer, I managed the backend systems, alert processing, and all the complex integrations despite initially being unfamiliar with MCPs and AI agents. The learning curve was steep but rewarding.
Shubhang Sinha - Our additional AI engineer who joined us later, bringing specialized knowledge in machine learning and AI systems that helped refine our agent's capabilities.

Each team member brought unique strengths, but more importantly, we shared the same vision of making on-call duty humane again.

The Results That Blew Everyone Away

The numbers spoke for themselves:

At least 80% faster incident resolution (2-5 minutes instead of 30-60 minutes)
2-4 hours saved per on-call shift
Zero 3 AM wake-up calls for routine issues
Consistent remediation quality regardless of time of day
90% reduction in middle-of-night escalations

Victory at Warpspeed 2025

After 24 hours of intense building, debugging, and refining, we presented DreamOps to the judges. The moment they announced us as the Grand Prize winners was surreal.

The official announcement read: "Grand Prize goes to DreamOps by Akash Singh, Inchara J, Himanshu Singh, and Harsh Kumar Gupta. DreamOps is an AI agent that tackles late-night debugging. It automatically triages and resolves common programming issues, cutting debugging time from 30-60 minutes to just 2-5 minutes. Engineers can now rest easy while AI handles routine problems, escalating only complex ones."

We won $3,000 USD, but more importantly, we had validation that we'd solved a problem that resonated with every engineer in the room.

Beyond the Hackathon: Evolution to Riquell

Winning Warpspeed 2025 was just the beginning. What started as DreamOps has evolved into Riquell - a more sophisticated AI copilot that helps DevOps and SRE teams find and fix production issues faster, without needing to write complex scripts or dig through dozens of dashboards.

The hackathon judges were blown away by our approach to solving a problem that every engineer in the room had experienced. While other teams built incremental improvements, we reimagined incident response from the ground up with AI at the core.

Technical Evolution: From Prototype to Production

Since the hackathon, we've rebuilt our architecture with significant improvements.

How Riquell Works:
Right now, when a pager alert fires, engineers have to jump between logs, metrics, and tracing tools to figure out what went wrong. It's a stressful, manual process that can take hours and often hits at the worst possible time.

Riquell connects directly to systems like PagerDuty and starts triaging incidents the moment an alert comes in. It pulls real-time telemetry, routes signals to specialized AI agents for logs, metrics, and traces, and uses retrieval-augmented generation along with a system knowledge graph to understand the full context of the issue.
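
As a rough illustration of the routing step described above, the sketch below dispatches each incoming signal to an agent chosen by signal kind. The agent functions here are trivial stand-ins for what would be LLM-backed agents, and every name (`logs_agent`, `AGENTS`, `route_signals`) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def logs_agent(payload: str) -> str:
    # Stand-in for an LLM-backed log analysis agent.
    return f"logs agent saw {payload.count('ERROR')} ERROR occurrence(s)"


def metrics_agent(payload: str) -> str:
    # Stand-in for an agent that would query and interpret metrics.
    return f"metrics agent received: {payload}"


def traces_agent(payload: str) -> str:
    # Stand-in for an agent that would inspect distributed traces.
    return f"traces agent received: {payload}"


# One specialized agent per signal kind named in the text.
AGENTS: Dict[str, Callable[[str], str]] = {
    "logs": logs_agent,
    "metrics": metrics_agent,
    "traces": traces_agent,
}


def route_signals(signals: List[Tuple[str, str]]) -> List[str]:
    """Send each (kind, payload) signal to its agent and collect the findings."""
    findings = []
    for kind, payload in signals:
        agent = AGENTS.get(kind)
        if agent is None:
            findings.append(f"no agent for signal kind '{kind}'")
        else:
            findings.append(agent(payload))
    return findings
```

A registry keyed by signal kind keeps the router small and makes adding a new agent a one-line change.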

Three-Tiered Resolution System:
Once the issue is analyzed, Riquell offers three resolution modes depending on confidence level and risk:

YOLO Mode: For low-risk, high-confidence issues like pod crashes or restarts, Riquell acts on its own. There's a built-in rollback mechanism to undo changes if the fix doesn't stabilize things.
Approval Mode: Riquell prepares the complete fix and shows it to the engineer first. Once approved, it executes the steps automatically.
Human-in-the-loop Mode: For more complex cases, Riquell guides the engineer step-by-step, offering context-rich suggestions and reasoning.

All of this happens inside the tools teams already use, with overlays added to existing dashboards to simplify investigation and resolution.
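
The three resolution modes and the rollback behaviour described above could be sketched as follows. The 0.80 cut-off echoes the DreamOps threshold mentioned earlier; the 0.50 cut-off, the string risk labels, and the names `choose_mode` and `apply_with_rollback` are assumptions made purely for illustration.

```python
from enum import Enum
from typing import Callable


class Mode(Enum):
    YOLO = "yolo"
    APPROVAL = "approval"
    HUMAN_IN_THE_LOOP = "human_in_the_loop"


def choose_mode(confidence: float, risk: str) -> Mode:
    """Pick a resolution mode from confidence (0.0 to 1.0) and risk label."""
    if risk == "low" and confidence >= 0.80:
        return Mode.YOLO
    # Assumed middle tier: a plausible fix exists, but a human signs off first.
    if risk in ("low", "medium") and confidence >= 0.50:
        return Mode.APPROVAL
    return Mode.HUMAN_IN_THE_LOOP


def apply_with_rollback(
    apply: Callable[[], None],
    is_stable: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Apply a fix, then undo it if the system does not stabilize."""
    apply()
    if is_stable():
        return True
    rollback()
    return False
```

Separating mode selection from execution means the thresholds can be tuned per team without touching the code that actually runs commands.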

Advanced Tech Stack:

Frontend: Next.js for the SaaS interface and real-time dashboards
Backend: Go as the primary language for backend systems, with Python FastAPI for the incident response workflow that receives PagerDuty webhooks
AI & LLM Stack:
- Claude AI as the primary high-capability reasoning engine for sophisticated root cause analysis and remediation planning
- Agno framework used to build the AI agent
- Vector-based search implemented to enable semantic search of on-call notes and incident data
- Knowledge graph and RAG: the codebase is turned into a knowledge graph, making it easier for the agent to make edits and suggest relevant knowledge base articles
- Reinforcement learning combined with RAG to lessen reliance on the vector database in production, reducing cost and complexity
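
To show the shape of the vector-based search mentioned in the list above, here is a small self-contained sketch that ranks on-call notes by cosine similarity. The `embed` function is a toy bag-of-words stand-in, so it only matches shared words; a real deployment would swap in a learned embedding model and a vector store, but the ranking logic would look much the same. All names and sample notes are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, notes: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Return the top_k notes most similar to the query, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(note)), note) for note in notes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


notes = [
    "pod crashloop after oom kill restart deployment",
    "rotate expired tls certificate on ingress",
    "database connection pool exhausted",
]
best_score, best_note = search("pod oom restart", notes, top_k=1)[0]
```

Here `best_note` is the first note, since it is the only one sharing words with the query.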

Deep Integration Ecosystem:
Our MCP server integrations include:

AWS ECS/EKS: Deployment infrastructure
PagerDuty: Source of incident alerts that trigger workflows via webhook
Grafana: Gathering context and validating alerts with quantitative data
Kubernetes: Investigating live status of services, describing pods, and pulling logs
GitHub: Correlating production issues with recent code modifications
Notion: Knowledge base for runbooks and architectural diagrams
Datadog: Performance monitoring and tracing
Atlassian: Issue tracking and team collaboration
Slack: Team communication and notifications
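
As one example of what these integrations enable, the GitHub correlation mentioned above can be pictured as filtering recent changes by service and time window. The sketch below works on plain in-memory records rather than calling the GitHub API; the `Commit` type, the `suspect_commits` helper, the six-hour lookback, and the sample SHAs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Commit:
    sha: str
    merged_at: datetime
    service: str  # service the change was deployed to


def suspect_commits(
    incident_at: datetime,
    service: str,
    commits: List[Commit],
    lookback: timedelta = timedelta(hours=6),
) -> List[Commit]:
    """Return commits to `service` merged within `lookback` before the incident."""
    window_start = incident_at - lookback
    hits = [
        c for c in commits
        if c.service == service and window_start <= c.merged_at <= incident_at
    ]
    # Newest first: the most recent change is usually the first suspect.
    return sorted(hits, key=lambda c: c.merged_at, reverse=True)
```

In practice the commit list would come from the GitHub integration and the incident time from the PagerDuty alert.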

Continuous Learning:
Riquell also learns continuously. It observes how incidents are handled, gathers feedback from engineers, and uses reinforcement learning to improve its accuracy and decision-making over time.
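
A very simplified way to picture this feedback loop is a per-action success estimate that moves toward 1 when a fix works and toward 0 when it does not. The sketch below uses an exponential moving average, which is a deliberately basic stand-in and not the reinforcement learning approach used in Riquell; `FeedbackTracker`, the learning rate, and the 0.5 prior are assumptions.

```python
from collections import defaultdict
from typing import DefaultDict


class FeedbackTracker:
    """Tracks a running success estimate for each remediation action."""

    def __init__(self, learning_rate: float = 0.2, prior: float = 0.5) -> None:
        self.learning_rate = learning_rate
        # Unseen actions start at the prior until feedback arrives.
        self.scores: DefaultDict[str, float] = defaultdict(lambda: prior)

    def record(self, action: str, succeeded: bool) -> float:
        """Nudge the action's score toward 1.0 on success, 0.0 on failure."""
        reward = 1.0 if succeeded else 0.0
        old = self.scores[action]
        self.scores[action] = old + self.learning_rate * (reward - old)
        return self.scores[action]
```

Scores like these could feed back into the confidence used to decide whether an action is safe to automate.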

Future Technical Innovations

The roadmap ahead includes some exciting technical challenges we're exploring:

Predictive Incident Prevention:

Machine learning models that analyze historical patterns and system metrics to predict issues before they occur
Anomaly detection algorithms that can identify subtle drift in system behavior
Proactive scaling and resource optimization based on predicted load patterns
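
As a small illustration of the anomaly detection idea above, the sketch below flags points that sit far from the mean of the preceding window, measured in standard deviations (a rolling z-score). This is one common baseline technique chosen for illustration, not a description of the planned models; the function name, window size, and threshold are assumptions.

```python
from statistics import mean, stdev
from typing import List


def find_anomalies(
    values: List[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Return indices whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat window has no z-score; skip it in this sketch.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical latency samples with one obvious spike at index 6.
latencies = [10, 11, 10, 12, 11, 10, 50, 11]
spikes = find_anomalies(latencies)
```

Here `spikes` contains only index 6; the samples after the spike are not flagged because the spike itself inflates the window's spread.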

Advanced Observability Integration:

Planned deep integration with observability tools such as New Relic, Pyroscope, and OpenTelemetry
Custom instrumentation that provides richer context to our AI agents
Real-time correlation between business metrics and infrastructure health

Multi-Cloud Intelligence:

Cross-cloud incident correlation and resolution across AWS, GCP, and Azure
Cloud-agnostic infrastructure abstractions for universal deployment
Cost-impact analysis that factors incident resolution into infrastructure spending

What We Learned

This experience taught us several valuable lessons:

Real problems make the best products - Our solution resonated because it addressed genuine pain points that every engineer in the room had experienced
Technical challenges are surmountable - Even without deep expertise in certain areas, determination and rapid learning can bridge gaps
Team diversity is strength - Each member's unique background contributed to our comprehensive solution
Validation matters - Getting feedback from experienced developers (shoutout to Point Blank Club seniors!) helped refine our approach

Looking Forward

Riquell isn't stopping at the hackathon victory. We continue pushing the boundaries of what's possible when AI meets DevOps, exploring new frontiers in intelligent infrastructure management.

We started this as a hackathon project called DreamOps, won the Warpspeed hackathon, and have been building ever since. Now, Riquell is becoming a full product aimed at making incident response faster, safer, and a lot less stressful.

The Real Victory

Yes, we won $3,000 and the Grand Prize at Lightspeed Warpspeed 2025. But honestly? The real win is what we built and the problem we're solving.

We've created something that lets engineers sleep through the night instead of being woken up by routine production issues. We've built a platform that transforms the most stressful part of being a developer into something manageable and automated. From DreamOps to Riquell, we're continuously evolving our approach to make incident response not just faster, but fundamentally more intelligent.

Conclusion

From more than 2,000 registrations to around 65 finalist teams to Grand Prize winners - the journey of Warpspeed 2025 taught us that with the right idea, the right team, and enough determination, you can build something that truly matters.

To the Point Blank Club seniors who believed in our idea from the early stages, to the judges who recognized the potential of DreamOps, and to every engineer who has ever been woken up by a 3 AM alert - this one's for you.

Riquell is real. It's happening. And it's just the beginning of making on-call duty humane again.

Because 3 AM debugging sessions should be a thing of the past. ✨

Connect with the team:

Akash Singh - Lead Developer & Visionary
Harsh Kumar Gupta - Full-Stack Developer
Himanshu - Backend Developer
Inchara J - AI Engineer (Backend & Integrations)
Shubhang Sinha - AI Engineer

The future of incident response is here. Ready to dream easy while AI takes care of your on-call duty?