Overview
AI Product Photo Detector is a complete MLOps system -- not just a model, but a full production pipeline for binary classification of real vs AI-generated product images. It is built around EfficientNet-B0 with Grad-CAM explainability and trained on the CIFAKE dataset (CIFAR-10 real images vs Stable Diffusion AI-generated counterparts). The project covers every stage of the ML lifecycle: data versioning, multi-environment training, experiment tracking, model registry, containerized serving, infrastructure-as-code, CI/CD automation, and production monitoring.
Results
| Metric | Value |
|---|---|
| Accuracy | 92.8% |
| F1-Score | 93.1% |
| Precision | 92.5% |
| Recall | 93.7% |
| Inference Latency | 131ms |
Key Features
| Feature | Description |
|---|---|
| ML Model | EfficientNet-B0 with Grad-CAM explainability for interpretable predictions |
| API | FastAPI with JWT authentication, rate limiting, batch and explain endpoints |
| Training | 3 modes: Local/Docker, Google Colab, and Vertex AI |
| Monitoring | Prometheus + Grafana with 20+ metrics, 13 alert rules, and drift detection |
| Infrastructure | Terraform (5 modules, 2 environments) + Docker (5 services) + Cloud Run |
| CI/CD | GitHub Actions with 5 workflows and automated quality gates |
| Testing | 316 tests, 70%+ coverage across 3 levels (unit, integration, E2E) |
System Architecture
The pipeline follows a linear flow from data to production monitoring:
Data (DVC + GCS) → Training (PyTorch) → Registry (MLflow) → Package (Docker) → Infra (Terraform) → CI/CD (GitHub Actions) → Deploy (Cloud Run) → Serve (FastAPI) → Monitor (Prometheus + Grafana)
Each stage is automated and version-controlled. DVC tracks dataset versions in Google Cloud Storage, MLflow records every training run, Docker packages the model and API, Terraform provisions cloud infrastructure, GitHub Actions orchestrates testing and deployment, and Prometheus collects production metrics.
Model Architecture
The classification pipeline is built on EfficientNet-B0 with transfer learning:
- Base Model -- EfficientNet-B0 pretrained on ImageNet (5.3M parameters)
- Custom Head -- Dropout (0.3) + Linear layer for binary classification
- Explainability -- Grad-CAM generates visual heatmaps highlighting the regions driving each prediction
Training Configuration
| Parameter | Value |
|---|---|
| Dataset | CIFAKE (CIFAR-10 + Stable Diffusion) |
| Architecture | EfficientNet-B0 |
| Optimizer | AdamW (weight decay) |
| Scheduler | CosineAnnealingLR |
| Best Learning Rate | 3e-4 (Run 3) |
| Parameters | 5.3M (transfer learning) |
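The optimizer and scheduler rows above can be sketched as follows. The weight-decay value and epoch count are illustrative assumptions (the table does not state them), and a plain linear layer stands in for the detector:

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 20  # assumed epoch count for the sketch

model = torch.nn.Linear(8, 2)  # stand-in for the EfficientNet-B0 detector
optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)

lrs = []
for _ in range(EPOCHS):
    # ... one training epoch over the CIFAKE loader would run here ...
    optimizer.step()   # placeholder step so the scheduler order is valid
    scheduler.step()
    lrs.append(optimizer.param_groups[0]["lr"])
# The learning rate anneals from 3e-4 toward 0 over the cosine cycle.
```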
MLOps Pipeline
Data Versioning
DVC (Data Version Control) manages dataset versions with remote storage on Google Cloud Storage. Dataset changes are tracked alongside code in Git, ensuring full reproducibility.
Experiment Tracking
MLflow manages the complete experiment lifecycle:
- Metrics Logging -- Loss, accuracy, F1-score, precision, recall per epoch
- Model Registry -- Version control with staging/production promotion
- Artifact Storage -- Model weights, confusion matrices, training curves, Grad-CAM samples
- Hyperparameter Tracking -- Learning rate, batch size, augmentation config, optimizer settings
Training Modes
| Mode | Environment | Use Case |
|---|---|---|
| Local / Docker | Local machine or Docker container | Development and quick iterations |
| Google Colab | Cloud notebook with free GPU | Prototyping with GPU acceleration |
| Vertex AI | Google Cloud managed training | Production training at scale |
Grad-CAM Explainability
Every prediction can include a Grad-CAM heatmap overlay via the /predict/explain endpoint, showing which image regions contributed most to the classification decision.
API and Serving
FastAPI serves the model through a production-grade endpoint architecture, with a JWT authentication pipeline and configurable rate limiting to prevent abuse in production:
| Endpoint | Method | Description |
|---|---|---|
| /predict | POST | Single image classification with confidence score |
| /predict/batch | POST | Batch prediction for multiple images |
| /predict/explain | POST | Prediction with Grad-CAM heatmap overlay |
| /health | GET | Health check and model status |
| /metrics | GET | Prometheus-compatible metrics |
Infrastructure
Terraform
Infrastructure is defined as code across 5 modules managing 2 environments (dev and prod):
| Module | Purpose |
|---|---|
| Cloud Run | Serverless container deployment |
| IAM | Service accounts and permissions |
| Monitoring | Alert policies and notification channels |
| Networking | VPC and firewall rules |
| Storage | GCS buckets for data and artifacts |
Docker Compose
The local development stack runs 5 services:
| Service | Port | Purpose |
|---|---|---|
| FastAPI | 8000 | Inference API |
| MLflow | 5000 | Experiment tracking UI |
| Streamlit | 8501 | Web prediction interface |
| Prometheus | 9090 | Metrics collection |
| Grafana | 3000 | Monitoring dashboards |
Cloud Run Deployment
Production deployment targets Google Cloud Run with auto-scaling, HTTPS, and custom domain configuration. Total infrastructure cost is kept under $0.50/month through Cloud Run's scale-to-zero capability and efficient resource allocation.
CI/CD
GitHub Actions automates the full build-test-deploy cycle with 5 workflows:
CI Pipeline
- Ruff -- Python linting and formatting checks
- mypy -- Static type checking
- pytest -- 316 tests across unit, integration, and E2E levels
- CodeQL -- Security vulnerability scanning
- Docker -- Container build and validation
CD Pipeline
- Auto-deploy -- Triggered on merge to main after CI passes
- Smoke Test -- Post-deployment health and prediction validation
- Rollback -- Automatic rollback on failed smoke tests
- Quality Gate -- Model accuracy ≥ 0.85 and F1 ≥ 0.80 enforced before deployment
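The quality gate reduces to a simple threshold check; in CI it would read the metrics exported by the training run (the function name and dict keys here are hypothetical):

```python
# Thresholds from the quality gate above.
MIN_ACCURACY = 0.85
MIN_F1 = 0.80

def passes_quality_gate(metrics: dict) -> bool:
    """Return True only if both deployment thresholds are met."""
    return (
        metrics["accuracy"] >= MIN_ACCURACY
        and metrics["f1"] >= MIN_F1
    )

# The reported run (92.8% accuracy, 93.1% F1) clears both gates.
```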
Monitoring
Prometheus Metrics
20+ custom metrics tracked in production:
- Request metrics -- Latency histograms, throughput counters, error rates
- Model metrics -- Prediction confidence distribution, class balance
- System metrics -- Memory usage, active requests, queue depth
- Drift metrics -- Feature distribution shifts detected via statistical tests
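A sketch of how a few such metrics might be registered with `prometheus_client`; the metric names and label sets are illustrative, not the project's actual keys:

```python
from prometheus_client import Counter, Gauge, Histogram, generate_latest

# Request metrics: throughput counter with a status label, latency histogram.
REQUESTS = Counter("predict_requests_total", "Prediction requests", ["status"])
LATENCY = Histogram("predict_latency_seconds", "Prediction latency")
# Model metric: confidence distribution via histogram buckets.
CONFIDENCE = Histogram(
    "prediction_confidence", "Model confidence",
    buckets=[0.5, 0.7, 0.9, 0.99, 1.0],
)
# System metric: in-flight request gauge.
ACTIVE = Gauge("active_requests", "In-flight requests")

def record_prediction(latency_s: float, confidence: float, ok: bool = True):
    REQUESTS.labels(status="ok" if ok else "error").inc()
    LATENCY.observe(latency_s)
    CONFIDENCE.observe(confidence)

record_prediction(0.131, 0.97)
exposition = generate_latest().decode()  # what the /metrics endpoint serves
```

Drift detection would layer statistical tests on top of these series rather than live in the exposition itself.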
Alert Rules
13 alert rules configured across categories:
- Availability -- Service down, health check failures
- Performance -- Latency spikes, error rate thresholds
- Model -- Prediction drift, confidence anomalies
- Infrastructure -- Resource saturation, container restarts
Grafana Dashboards
Pre-configured dashboards for API performance, model behavior, and infrastructure health with automatic alerting via notification channels.
Cost Management
The entire production deployment runs at under $0.50/month:
| Resource | Cost |
|---|---|
| Cloud Run | ~$0.00 (scale-to-zero, free tier) |
| GCS Storage | ~$0.01 (small dataset and artifacts) |
| Container Registry | ~$0.10 (image storage) |
| Monitoring | ~$0.00 (free tier) |
| Total | < $0.50/month |
Tech Stack
| Category | Technologies |
|---|---|
| ML / DL | PyTorch, EfficientNet-B0, torchvision, Grad-CAM |
| API | FastAPI, Uvicorn, Pydantic, JWT auth |
| MLOps | MLflow, DVC, Model Registry, Vertex AI |
| Monitoring | Prometheus, Grafana, structlog, drift detection |
| Frontend | Streamlit |
| Infrastructure | Terraform, Docker, Docker Compose, Google Cloud Run |
| CI/CD | GitHub Actions (5 workflows), CodeQL, quality gates |
| Testing | pytest (316 tests), Ruff, mypy, 70%+ coverage |
| Cloud | Google Cloud Platform (GCS, Cloud Run, Vertex AI, IAM) |
| Dataset | CIFAKE (CIFAR-10 + Stable Diffusion) |
