Context
Selected from among 7,000+ global applicants for the Mistral AI Worldwide Hackathon in London (Feb 28 - Mar 1, 2026). Organized by Mistral AI and Iterate, and sponsored by Weights & Biases, NVIDIA, Amazon Web Services (AWS), ElevenLabs, and Hugging Face.
Fine-Tuning Track (sponsored by W&B) — 48 hours to fine-tune open-source Mistral models and build a working application.
Project
Ecotopia is an interactive political simulation where the player is mayor of a city facing ecological collapse. Free-text speeches are analyzed by specialized fine-tuned models:
- Structured information extraction — Political promise NER, promise-type categorization, contradiction detection (example output sketch after this list)
- Conditional text generation — Contextualized citizen reactions based on game state, citizen profiles, and trust history
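The promise-extraction model returns structured JSON that the game engine parses directly. Below is a minimal sketch of what a parsed completion might look like; the schema (`promises`, `text`, `type`, `deadline`, `contradicts`) is an illustrative assumption, not the exact format used in Ecotopia.

```python
# Hypothetical output schema for the promise-extraction model; field names
# are illustrative, not the exact schema used in the game.
import json

sample_completion = """
{
  "promises": [
    {"text": "close the coal plant by 2030",
     "type": "energy",
     "deadline": "2030",
     "contradicts": []}
  ]
}
"""

def parse_promises(completion: str) -> list[dict]:
    """Parse a model completion; raises a ValueError subclass if the JSON is malformed."""
    data = json.loads(completion)
    return data["promises"]

print(parse_promises(sample_completion))
```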
Fine-Tuning
4 Mistral models fine-tuned via QLoRA (NF4 4-bit quantization, LoRA r=16, alpha=32) on 690 synthetic examples generated via Amazon Bedrock, in under 10 minutes per model (a configuration sketch follows the table):
| Task | Models | Training Examples |
|---|---|---|
| Promise Extraction | Ministral 8B, Nemo 12B | 300 (3 difficulty tiers) |
| Citizen Reactions | Ministral 8B, Small 24B | 390 |
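A minimal training sketch matching the settings above (NF4 4-bit quantization, LoRA r=16, alpha=32) using the Hugging Face peft/trl stack with W&B logging. The base checkpoint, dataset file, target modules, and remaining hyperparameters are assumptions for illustration, not the exact hackathon configuration.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_model = "mistralai/Ministral-8B-Instruct-2410"  # assumed checkpoint

# NF4 4-bit quantization, as stated above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto")

# LoRA adapter with r=16, alpha=32; target modules are an assumption.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Assumed JSONL file with a "text" column holding prompt + target completion.
dataset = load_dataset("json", data_files="promises_train.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    args=SFTConfig(
        output_dir="ministral-8b-promises",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        learning_rate=2e-4,
        report_to="wandb",  # experiment tracking on Weights & Biases
    ),
)
trainer.train()
```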
Results
Our fine-tuned 8B SLMs outperform the base Mistral Large across the entire structured-output pipeline at 10x lower latency; without fine-tuning, Mistral Large scores 0% valid JSON on citizen reactions (see the metric sketch below).
Specializing small models on narrow, well-defined tasks enables real-time applications where latency and output-format reliability are hard constraints.
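The "valid JSON" figure refers to the share of completions that parse as JSON and expose the expected field. A hedged sketch of such a metric, with the required key name as an illustrative assumption:

```python
import json

def valid_json_rate(completions: list[str], required_key: str = "reaction") -> float:
    """Fraction of completions that parse as JSON objects containing `required_key`."""
    valid = 0
    for completion in completions:
        try:
            parsed = json.loads(completion)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict) and required_key in parsed:
            valid += 1
    return valid / len(completions) if completions else 0.0

# One valid structured reply, one chatty non-JSON reply -> 0.5
print(valid_json_rate(['{"reaction": "The market district cheers the new tram line."}',
                       "Sure! Here's a reaction: ..."]))
```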
Architecture
- Inference: Hugging Face Inference Endpoints with a custom handler (4-bit BitsAndBytes); a handler sketch follows this list
- Backend: Spring Boot 3.5 + Spring AI
- Frontend: Phaser 3 (TypeScript, pixel art)
- Tracking: Weights & Biases (experiment tracking, evaluation, automated report)
- Data: PostgreSQL + synthetic training data generated via Amazon Bedrock; a data-generation sketch also follows below
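On Hugging Face Inference Endpoints, a custom handler is a `handler.py` exposing an `EndpointHandler` class. Below is a hedged sketch of loading the fine-tuned model in 4-bit with BitsAndBytes, as described above; the generation parameters and response format are illustrative assumptions.

```python
# handler.py: sketch of a custom Inference Endpoints handler with 4-bit loading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


class EndpointHandler:
    def __init__(self, path: str = ""):
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForCausalLM.from_pretrained(
            path, quantization_config=bnb_config, device_map="auto")

    def __call__(self, data: dict) -> dict:
        prompt = data["inputs"]
        params = data.get("parameters", {})
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        output_ids = self.model.generate(
            **inputs,
            max_new_tokens=params.get("max_new_tokens", 256),
            do_sample=True,
            temperature=params.get("temperature", 0.2),
        )
        # Return only the newly generated tokens, not the echoed prompt.
        completion = self.tokenizer.decode(
            output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
        return {"generated_text": completion}
```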
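Synthetic training examples can be drawn from Amazon Bedrock with the `converse` API. A minimal sketch follows; the model ID, prompt, and output file are assumptions, and the actual prompts and volumes behind the 690 examples are not shown here.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region

prompt = (
    "Write a short mayoral speech about urban ecology, then list the political "
    "promises it contains as JSON with fields: text, type, deadline."
)

response = bedrock.converse(
    modelId="mistral.mistral-large-2407-v1:0",  # assumed model ID
    messages=[{"role": "user", "content": [{"text": prompt}]}],
    inferenceConfig={"maxTokens": 1024, "temperature": 0.8},
)
example = response["output"]["message"]["content"][0]["text"]

# Append the generated example to the training file used for fine-tuning.
with open("promises_train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps({"text": example}) + "\n")
```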
