Hakim Vision
A computer-vision research project that generates synthetic Baloot card scenes, trains YOLO/RT-DETR detectors, and exports quantized models (ONNX, CoreML, TFLite) for the Hakim AR mobile companion. Part of an open-source, Arabic-first Baloot AI platform.
Hakim Vision is the computer-vision pillar of Hakim, an open Baloot AI platform. It generates synthetic scenes of the 32-card Baloot deck on textured backgrounds, trains modern object detectors (YOLO11, RT-DETRv2) on them, and exports quantized models (ONNX, CoreML, TFLite) that detect cards in real time on iPhone. Those detections drive play-by-play strategy hints in AR.
What it solves
- Card-game AI for Baloot lacked public research infrastructure and open models.
- No modern detector pipeline for the 32-card Baloot deck; existing tools assumed standard 52-card decks.
- Mobile AR for physical Baloot tables requires inference in under 30 ms; round-trips to cloud APIs are too slow.
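To make the 32-card versus 52-card difference concrete, here is a minimal sketch of a detector class map for the Baloot deck (ranks 7 through ace in four suits). The naming scheme ("10H", "AS") and the names `RANKS`, `SUITS`, `CLASS_NAMES`, and `CLASS_IDS` are assumptions for illustration, not the project's actual identifiers.

```python
from itertools import product

# Ranks and suits of the 32-card deck used in Baloot (7 through ace).
RANKS = ("7", "8", "9", "10", "J", "Q", "K", "A")
SUITS = ("S", "H", "D", "C")  # spades, hearts, diamonds, clubs

# Hypothetical naming scheme: rank followed by a suit letter, e.g. "10H".
CLASS_NAMES = [f"{rank}{suit}" for suit, rank in product(SUITS, RANKS)]

# Map each class name to the integer ID a detector would predict.
CLASS_IDS = {name: idx for idx, name in enumerate(CLASS_NAMES)}
```

A detector head sized for these 32 classes never spends capacity on the twenty cards (2 through 6 in each suit) that do not appear at a Baloot table.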
Impact
- Synthetic scenes of the 32-card Baloot deck
- Detectors: YOLO11 and RT-DETRv2
- Export targets: ONNX, CoreML, TFLite (under 30 ms inference)

Architecture
Data flow
- Card asset library (32-card Baloot deck)
- Synthetic scene generation (OpenCV + textures)
- Augmentation pipeline (scale, rotate, light)
- Labeled scenes → dataset (tar shards)
- Train YOLO11 / RT-DETRv2
- Validation on held-out synthetic set
- Export to ONNX / CoreML / TFLite
- Mobile app (Hakim AR companion)
- Real-time card detection on iPhone < 30ms
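The held-out validation step in the flow above could be implemented in many ways. One simple option, sketched here, assigns each scene to train or validation from a hash of its ID, so the split stays stable as more scenes are generated. The function names, the ID format, and the 10% fraction are assumptions for illustration.

```python
import hashlib


def is_validation_scene(scene_id: str, val_fraction: float = 0.1) -> bool:
    """Assign a scene to the held-out set based on a hash of its ID."""
    digest = hashlib.sha256(scene_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < val_fraction


def split_scenes(
    scene_ids: list[str], val_fraction: float = 0.1
) -> tuple[list[str], list[str]]:
    """Return (train_ids, val_ids); the same ID always lands in the same set."""
    train = [s for s in scene_ids if not is_validation_scene(s, val_fraction)]
    val = [s for s in scene_ids if is_validation_scene(s, val_fraction)]
    return train, val
```

Because the assignment depends only on the scene ID, regenerating or extending the dataset does not move existing scenes between the two sets.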
Engineering decisions
Synthetic data generation over manual labeling
Generate unlimited labeled scenes of the 32-card Baloot deck on textured backgrounds via OpenCV. Eliminates the annotation bottleneck and enables fast iteration.
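Labels come for free in this approach because the generator already knows where it placed each card. The sketch below shows only that labeling step: it samples card placements and writes YOLO-format label lines (`class cx cy w h`, normalized to the scene size). Pixel compositing with OpenCV is left out, overlaps between cards are not handled, and `Placement`, `random_placements`, and the card and scene sizes are all hypothetical.

```python
import math
import random
from dataclasses import dataclass

NUM_CLASSES = 32  # one class per card in the Baloot deck


@dataclass(frozen=True)
class Placement:
    class_id: int
    cx: float         # card centre in pixels
    cy: float
    width: float      # card size in pixels before rotation
    height: float
    angle_deg: float  # rotation about the card centre


def rotated_bbox(p: Placement) -> tuple[float, float]:
    """Width and height of the axis-aligned box around the rotated card."""
    theta = math.radians(p.angle_deg)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    return (p.width * cos_t + p.height * sin_t,
            p.width * sin_t + p.height * cos_t)


def yolo_label(p: Placement, scene_w: int, scene_h: int) -> str:
    """One YOLO label line: class ID, then centre and box size in [0, 1]."""
    box_w, box_h = rotated_bbox(p)
    return (f"{p.class_id} {p.cx / scene_w:.6f} {p.cy / scene_h:.6f} "
            f"{box_w / scene_w:.6f} {box_h / scene_h:.6f}")


def random_placements(rng: random.Random, n_cards: int, scene_w: int,
                      scene_h: int, card_w: float = 100.0,
                      card_h: float = 150.0) -> list[Placement]:
    """Sample distinct cards at random positions and angles inside the scene."""
    # A margin of half the card diagonal keeps any rotation inside the scene.
    margin = math.hypot(card_w, card_h) / 2
    return [
        Placement(class_id,
                  rng.uniform(margin, scene_w - margin),
                  rng.uniform(margin, scene_h - margin),
                  card_w, card_h, rng.uniform(0.0, 360.0))
        for class_id in rng.sample(range(NUM_CLASSES), n_cards)
    ]
```

For example, `yolo_label(Placement(3, 320, 240, 100, 150, 0), 640, 480)` returns `"3 0.500000 0.500000 0.156250 0.312500"`.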
Modern typed Python package (not a notebook)
Replaced the original 2018 notebook with a production-grade package: mypy --strict, pytest suite, uv lockfile, CI/CD. Reproducible, testable, and maintainable.
On-device export for < 30ms mobile inference
Train once, export to ONNX/CoreML/TFLite. Models run directly on iPhone for real-time card detection without round-trips to the cloud.
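A latency target like this is only meaningful if it is measured. The sketch below is one way to check an exported model against the 30 ms budget: time repeated calls after a warm-up and compare the 95th percentile to the budget. `run_once` is a hypothetical callable wrapping a single inference on whatever runtime is in use, and the warm-up count, run count, and choice of percentile are assumptions. Timings on a development machine are only a proxy; the real figure has to come from the target iPhone.

```python
import statistics
import time
from typing import Callable

BUDGET_MS = 30.0  # latency budget stated for the mobile app


def measure_latency_ms(run_once: Callable[[], object],
                       warmup: int = 5, runs: int = 50) -> list[float]:
    """Time repeated calls to run_once and return each duration in ms."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def within_budget(timings_ms: list[float],
                  budget_ms: float = BUDGET_MS) -> bool:
    """Compare the 95th-percentile latency against the budget."""
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    p95 = statistics.quantiles(timings_ms, n=20)[-1]
    return p95 < budget_ms
```

Checking a high percentile instead of the mean guards against occasional slow frames, which are what a user notices in an AR overlay.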
Part of Hakim — a four-pillar open Baloot platform
Hakim Vision feeds card detections to hakim-engine (rules) and hakim-agent (strategy). The AR companion displays hints from hakim-coach (LLM commentary) overlaid on the physical table.
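The hand-off between pillars is not specified here, so the following is only a sketch of what a typed detection record and a clean-up step might look like before results reach the rules engine. `Detection`, `visible_cards`, and the 0.5 threshold are hypothetical. The sketch assumes that, since each card appears once in the deck, repeated detections of the same class refer to the same physical card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    card: str          # class name, e.g. "10H"
    confidence: float  # detector score in [0, 1]
    box: tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def visible_cards(detections: list[Detection],
                  min_confidence: float = 0.5) -> list[Detection]:
    """Keep the most confident detection per card above a threshold."""
    best: dict[str, Detection] = {}
    for det in detections:
        if det.confidence < min_confidence:
            continue  # drop weak detections
        current = best.get(det.card)
        if current is None or det.confidence > current.confidence:
            best[det.card] = det
    # Most confident cards first.
    return sorted(best.values(), key=lambda d: d.confidence, reverse=True)
```

Downstream consumers then receive at most one entry per card, which keeps the rules and strategy layers free of detector-specific noise handling.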