Overview
OpsBot is an operational AI assistant for maintenance, safety, compliance, and day-to-day operating procedures in Google Chat. The product brings governed knowledge directly into the chat interface employees already use, with source-backed answers and safe fallback behavior when evidence is weak.
My role: led product and technical delivery, from scope and source governance to RAG architecture, Google Chat integration, evaluation traces, runtime observability, and deployment.
Business Impact
| Metric | Value |
|---|---|
| Target population | 3,972+ collaborators across store and support roles |
| Knowledge demand | ~12,000 monthly views on the operating knowledge base |
| Instant resolution target | 85% of first-line questions answered immediately |
| Time reallocated | 14,500+ hours/year across stores and regional support |
| ROI hypothesis | ~€540k in productivity reallocation |
Product Design
What the assistant answers
- Maintenance procedures and troubleshooting steps
- Safety processes, checklists, and crisis guidance
- Compliance questions and operating standards
- FAQ-style operational questions from the approved corpus
What makes it usable in the field
- Google Chat-first UX instead of yet another separate tool
- Grounded retrieval so answers come from approved sources instead of freeform guessing
- Safe fallback behavior when evidence is too weak
- Guardrails for banned or out-of-scope topics
- Incremental source sync so the corpus can stay fresh as documents evolve
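The guardrail and fallback behavior listed above can be sketched as a small decision function. This is a minimal illustration, not the project's implementation: the banned topics, the `MIN_EVIDENCE_SCORE` threshold, the message texts, and the `Evidence` shape are all hypothetical. It also assumes retrieval scores are normalized so that higher means more relevant.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical values for illustration; the source does not state them.
BANNED_TOPICS = ("salary", "medical diagnosis")
MIN_EVIDENCE_SCORE = 0.6

FALLBACK_MESSAGE = (
    "I could not find an approved procedure for this. "
    "Please contact your regional support team."
)
OUT_OF_SCOPE_MESSAGE = "This topic is outside what I can help with."


@dataclass(frozen=True)
class Evidence:
    source_title: str
    text: str
    score: float  # assumed relevance in [0, 1], higher is better


def is_out_of_scope(question: str) -> bool:
    """Naive keyword guardrail; a real policy check would be richer."""
    lowered = question.lower()
    return any(topic in lowered for topic in BANNED_TOPICS)


def decide_answer(
    question: str,
    evidence: Sequence[Evidence],
    generate: Callable[[str, Sequence[Evidence]], str],
) -> str:
    """Refuse banned topics, fall back on weak evidence, else generate."""
    if is_out_of_scope(question):
        return OUT_OF_SCOPE_MESSAGE
    strong = [e for e in evidence if e.score >= MIN_EVIDENCE_SCORE]
    if not strong:
        # No sufficiently relevant passage: do not let the model guess.
        return FALLBACK_MESSAGE
    return generate(question, strong)
```

Passing `generate` as a callable keeps the policy logic testable without calling a model; only evidence above the threshold ever reaches the generator.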
Technical Architecture
The runtime keeps the DAISI-style operational discipline, but simplifies the product into a focused single-assistant RAG flow rather than a broader multi-agent setup.
Core runtime
- FastAPI for the webhook runtime
- Cloud Run for the serving layer
- Google Chat as the user channel
- Gemini for answer generation
- Vertex AI RAG Engine for grounded retrieval
- Cloud SQL PostgreSQL for conversation memory and checkpoint state
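As a rough illustration of the boundary between Google Chat and the webhook runtime, the sketch below normalizes an incoming message event and builds a reply addressed to the same thread. It assumes a simplified event shape (`type`, `space`, `user`, `message.text`, `message.argumentText`, `message.thread.name`); `ChatRequest`, `normalize_event`, and `build_reply` are hypothetical names, and the real payload carries more fields.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ChatRequest:
    space: str
    thread: Optional[str]
    user: str
    text: str


def normalize_event(event: dict[str, Any]) -> Optional[ChatRequest]:
    """Reduce a Chat event to the fields the assistant needs.

    Returns None for non-message events or empty messages.
    """
    if event.get("type") != "MESSAGE":
        return None
    message = event.get("message", {})
    # Prefer argumentText (message text without the app mention) when present.
    text = (message.get("argumentText") or message.get("text") or "").strip()
    if not text:
        return None
    return ChatRequest(
        space=event.get("space", {}).get("name", ""),
        thread=message.get("thread", {}).get("name"),
        user=event.get("user", {}).get("name", ""),
        text=text,
    )


def build_reply(request: ChatRequest, answer: str) -> dict[str, Any]:
    """Build a reply body that targets the thread of the original message."""
    reply: dict[str, Any] = {"text": answer}
    if request.thread:
        reply["thread"] = {"name": request.thread}
    return reply
```

In a FastAPI route, the dictionary from `build_reply` would typically be returned as the JSON response body. Keeping both functions free of framework code makes them easy to unit test.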
Source layer
- Google Docs
- Google Sheets
- Google Drive exports
- PDFs
- Approved web pages
Platform operations
- Terraform for infrastructure delivery
- Cloud Run Jobs + Scheduler for background workloads
- MLflow for traces, evaluations, and feedback analysis
- Structured source registry to define what enters the corpus and how it is synced
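One way to picture a structured source registry is as a list of typed entries that a scheduled job filters before syncing. The sketch below is an assumption about what such a registry could look like; the field names, the `ALLOWED_KINDS` values, and the sync rule are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence

# Hypothetical source kinds mirroring the source layer listed above.
ALLOWED_KINDS = {"google_doc", "google_sheet", "drive_export", "pdf", "web_page"}


@dataclass(frozen=True)
class SourceEntry:
    source_id: str
    kind: str
    owner: str                 # who to notify about quality issues
    sync_every: timedelta      # how often the source should be refreshed
    sensitive: bool = False    # sensitive documents never enter the corpus
    last_synced: Optional[datetime] = None


def is_eligible(entry: SourceEntry) -> bool:
    """Only known, non-sensitive sources may enter the corpus."""
    return entry.kind in ALLOWED_KINDS and not entry.sensitive


def due_for_sync(registry: Sequence[SourceEntry], now: datetime) -> list[SourceEntry]:
    """Return eligible entries never synced or past their sync interval."""
    due = []
    for entry in registry:
        if not is_eligible(entry):
            continue
        if entry.last_synced is None or now - entry.last_synced >= entry.sync_every:
            due.append(entry)
    return due
```

Because eligibility is decided in the registry, excluding a sensitive document is a data change rather than a code change.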
Architecture diagrams
The page splits the architecture into three readable views: the system map, the request flow, and the source sync loop.
System architecture
This view keeps the core responsibilities separate: Google Chat for the user channel, Cloud Run and FastAPI for the webhook runtime, Gemini and Vertex AI RAG Engine for grounded answering, governed documents for the corpus, and Cloud SQL / MLflow / scheduled jobs for operations.
Request flow
A user asks in Google Chat. The webhook normalizes the event, applies policy checks, retrieves evidence from the approved corpus, generates a grounded answer, persists useful state, logs the trace, and replies in the same chat thread. If evidence is weak, the assistant falls back instead of inventing a procedure.
Source sync loop
The source registry controls what can enter the corpus. Scheduled jobs export documents, parse and normalize content, enrich chunks with metadata, refresh the Vertex AI RAG corpus, and feed quality issues back to the source owners.
Stack
| Category | Technologies |
|---|---|
| LLM / RAG | Gemini, Vertex AI RAG Engine |
| Backend | Python 3.11, FastAPI, Pydantic |
| Memory | Cloud SQL PostgreSQL |
| Cloud | GCP, Cloud Run, Cloud Run Jobs, Cloud Scheduler, GCS |
| Knowledge sources | Google Docs, Google Sheets, Google Drive, PDFs, approved web pages |
| Observability | MLflow, structured tracing, feedback evaluation |
| Infrastructure | Terraform, Docker |
| Quality | Pytest, Ruff, Mypy, CI automation |
Delivery Scope
Lot 1 focus
- Maintenance workflows
- Safety procedures
- Compliance and regulatory operational content
Hard constraints that shaped the MVP
- Keep the assistant simple and governed in V1
- Prefer direct ingestion from Google Workspace sources over unnecessary live integrations
- Exclude sensitive documents from the corpus instead of overbuilding permission logic too early
- Treat the document pipeline as a first-class problem: parsing quality, OCR, metadata, chunking, and sync matter as much as the prompt layer
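To make the chunking and metadata point concrete, here is a minimal sketch of a word-window chunker that attaches source metadata to each chunk. It is a generic technique shown for illustration, not the project's pipeline or the chunking that Vertex AI RAG Engine applies. The `Chunk` fields and the default sizes are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_id: str   # which registered source the text came from
    section: str     # heading or section label, useful for citations
    index: int       # position of the chunk within the section
    text: str


def chunk_document(
    source_id: str,
    section: str,
    text: str,
    max_words: int = 120,
    overlap: int = 20,
) -> list[Chunk]:
    """Split text into overlapping word windows, each tagged with metadata."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    chunks: list[Chunk] = []
    step = max_words - overlap
    start = 0
    while start < len(words):
        window = words[start:start + max_words]
        chunks.append(Chunk(source_id, section, len(chunks), " ".join(window)))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

The overlap keeps a step that straddles a window boundary retrievable from at least one chunk, and the metadata lets an answer cite the source and section it came from.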
Why this project matters
OpsBot earns trust by being useful, grounded, and operationally maintainable. The interesting part is not only the model choice: it is the combination of governed content, safe answer behavior, runtime observability, and delivery discipline that makes the assistant viable beyond a prototype.