- Role: Concept, Engineering, Automation
- Timeline: 2025
- Project type: AI agent / Experiment
- Status: Active
Context
Numerai is a data tournament where participants submit machine-learning models on anonymised financial data. Continuously evaluating and improving those models is tedious, repetitive work: exactly the kind of task worth automating.
Approach
A Python agent pulls the current model scores, summarises them, and hands them to Claude for analysis. Claude reads the trends, proposes adjustments, and explains its reasoning.
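The summarise-and-hand-off step might look roughly like the sketch below. It is illustrative only: `summarise_scores`, `build_prompt`, the chosen statistics, and the prompt wording are assumptions, not the project's actual code, and the scores are taken as a plain list of per-round numbers that has already been fetched. The call to Claude itself is left out.

```python
from statistics import mean


def summarise_scores(scores):
    """Reduce a list of per-round scores (oldest first) to a few trend statistics.

    Hypothetical helper: the real agent may track different metrics.
    """
    if not scores:
        raise ValueError("need at least one score")
    half = len(scores) // 2
    # Compare the newer half of the history against the older half.
    older = scores[:half] or scores
    recent = scores[half:]
    return {
        "rounds": len(scores),
        "mean": round(mean(scores), 4),
        "latest": scores[-1],
        "trend": round(mean(recent) - mean(older), 4),
    }


def build_prompt(model_name, summary, prior_findings):
    """Assemble the text handed to the LLM (assumed wording)."""
    lines = [
        f"Model: {model_name}",
        f"Rounds scored: {summary['rounds']}",
        f"Mean score: {summary['mean']}",
        f"Latest score: {summary['latest']}",
        f"Trend (recent half minus older half): {summary['trend']}",
        "",
        "Findings from earlier runs:",
    ]
    lines += [f"- {finding}" for finding in prior_findings] or ["- none yet"]
    lines += ["", "Analyse the trend, propose adjustments, and explain your reasoning."]
    return "\n".join(lines)


if __name__ == "__main__":
    summary = summarise_scores([0.01, 0.02, 0.03, 0.04])
    print(build_prompt("example_model", summary, ["Smaller feature set was more stable"]))
```

Sending a compact summary instead of the raw score history keeps the prompt small and makes each run's input easy to log and compare.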
Instead of starting from scratch each time, the agent writes findings into a persistent JSON knowledge store, so every run builds on the last. GitHub Actions triggers the loop on a schedule, with no servers to run.
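A persistent JSON knowledge store of this kind can be sketched in a few lines. The file layout (a top-level `runs` list), the function names, and the fields of each finding are assumptions made for illustration; the project's real schema is not described here.

```python
import json
from pathlib import Path


def load_store(path):
    """Return the stored knowledge, or an empty store on the first run."""
    path = Path(path)
    if not path.exists():
        return {"runs": []}
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def record_run(path, finding):
    """Append one run's finding and write the store back to disk.

    Writes to a temporary file first and then swaps it in, so an
    interrupted run does not leave a half-written JSON file behind.
    """
    path = Path(path)
    store = load_store(path)
    store["runs"].append(finding)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(store, indent=2), encoding="utf-8")
    tmp.replace(path)
    return store


if __name__ == "__main__":
    # Hypothetical finding; the field names are illustrative.
    record_run("knowledge.json", {"date": "2025-01-01", "note": "Trend flat, no change proposed"})
    print(len(load_store("knowledge.json")["runs"]), "runs recorded")
```

In a scheduled workflow the runner's filesystem is discarded after each job, so a file like this only persists if the job saves it somewhere durable, for example by committing it back to the repository.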
Outcome
A self-running experiment that shows how an LLM can sit as a reasoning layer on top of a data pipeline. It runs entirely on free tiers (€0 operating cost) and documents its own evolution over time.
Stack
- Python
- Claude API
- GitHub Actions
- Numerai API
- JSON Knowledge Store
