A little AI that learns, live, how to hunt, while the flies learn how to escape.
▶ watch it live
An open field with two kinds of moving dots:
Nothing here is hand-scripted. Both sides start out clueless and figure everything out by trial and error.
This is reinforcement learning. Each side has its own small neural network (an "actor-critic"). Every moment, the network looks at the situation (where's the nearest fly, how fast is it moving, where are the walls) and picks a direction to move. Then it gets a reward:
Over millions of tiny attempts, each network nudges itself toward choices that paid off. That's learning: no rules written by a human, just rewards.
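For the curious, here is roughly what "nudging toward choices that paid off" can look like in code. This is only a minimal sketch, not the project's implementation: it swaps the PyTorch neural network for plain linear weights in pure Python, and the four-direction action set, the `TinyActorCritic` class, the `train_toy` helper, and the toy reward (only action 2 pays) are all invented for illustration.

```python
import math
import random

N_ACTIONS = 4  # hypothetical action set: up, down, left, right


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyActorCritic:
    """Linear actor and critic, standing in for the small neural network."""

    def __init__(self, n_features, lr=0.1, gamma=0.9):
        self.actor = [[0.0] * n_features for _ in range(N_ACTIONS)]
        self.critic = [0.0] * n_features
        self.lr = lr
        self.gamma = gamma

    def policy(self, obs):
        """Actor: probability of each action in this situation."""
        logits = [sum(w * x for w, x in zip(row, obs)) for row in self.actor]
        return softmax(logits)

    def value(self, obs):
        """Critic: how much reward it expects from this situation."""
        return sum(w * x for w, x in zip(self.critic, obs))

    def act(self, obs, rng=random):
        return rng.choices(range(N_ACTIONS), weights=self.policy(obs))[0]

    def update(self, obs, action, reward, next_obs, done):
        """One-step advantage actor-critic update."""
        target = reward + (0.0 if done else self.gamma * self.value(next_obs))
        advantage = target - self.value(obs)  # better or worse than expected?
        probs = self.policy(obs)
        # Critic: move the value estimate toward the observed target.
        for i, x in enumerate(obs):
            self.critic[i] += self.lr * advantage * x
        # Actor: raise the chosen action's score if the advantage is positive,
        # lower it if negative (gradient of log softmax).
        for k in range(N_ACTIONS):
            grad = (1.0 if k == action else 0.0) - probs[k]
            for i, x in enumerate(obs):
                self.actor[k][i] += self.lr * advantage * grad * x
        return advantage


def train_toy(steps=2000, seed=0):
    """Toy task invented for this sketch: only action 2 earns a reward of 1."""
    rng = random.Random(seed)
    agent = TinyActorCritic(n_features=1)
    obs = [1.0]  # one constant feature, so there is only one "situation"
    for _ in range(steps):
        action = agent.act(obs, rng)
        reward = 1.0 if action == 2 else 0.0
        agent.update(obs, action, reward, next_obs=obs, done=True)
    return agent
```

Calling `train_toy().policy([1.0])` returns the learned probabilities for the four actions. Nothing in the code says "pick action 2"; the preference only emerges from the rewards, which is the same idea the hunter and the flies rely on at a much larger scale.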
Because it's an arms race. The instant the hunter gets good, the flies are also getting better at dodging, so the target keeps moving. Every gain has to be earned against a smarter opponent. That's why the two lines on the timeline seesaw instead of shooting straight to 100%.
The clock on screen counts simulated time, not real time. One training step equals a couple of simulated seconds, so after a lot of steps the display reads days or weeks of "fly time", even though far less real time has passed. The estimated time shows roughly how long until the hunter reaches 95% accuracy at the current pace.
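The arithmetic behind a clock like this is simple enough to sketch. The version below is a guess at the idea, not the page's actual code: the value of 2.0 simulated seconds per step is an assumption standing in for "a couple", the function names are hypothetical, and the estimate is a plain straight-line extrapolation of recent progress.

```python
SIM_SECONDS_PER_STEP = 2.0  # assumption: "a couple of simulated seconds"


def format_sim_time(steps, sim_seconds_per_step=SIM_SECONDS_PER_STEP):
    """Convert a training-step count into a days/hours/minutes/seconds label."""
    total = int(steps * sim_seconds_per_step)
    days, rem = divmod(total, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d {hours:02d}h {minutes:02d}m {seconds:02d}s"


def eta_seconds(acc_then, acc_now, elapsed_real_seconds, target=0.95):
    """Real seconds until `target` accuracy, assuming the recent pace holds.

    Returns 0.0 if the target is already met, and None if accuracy is flat
    or falling (no meaningful estimate at the current pace).
    """
    if acc_now >= target:
        return 0.0
    if elapsed_real_seconds <= 0:
        return None
    rate = (acc_now - acc_then) / elapsed_real_seconds  # accuracy per second
    if rate <= 0:
        return None
    return (target - acc_now) / rate


print(format_sim_time(43_200))        # 43,200 steps of 2 s each is one full day
print(eta_seconds(0.50, 0.60, 3600))  # gained 10 points in an hour, 35 to go
```

Because the two sides keep outsmarting each other, the pace is rarely steady, so an estimate built this way will jump around and sometimes vanish entirely when the hunter has a bad stretch.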
Yes. It trains 24/7 on a little machine at home, and saves its brain and full history to a file. So you can check in any day and watch the curve keep climbing from where it left off.
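Picking up "from where it left off" comes down to writing the state to disk safely and reading it back on startup. The page does not say which file format it uses, so the sketch below is an assumption throughout: it stores plain lists as JSON using only the standard library, and `save_checkpoint` and `load_checkpoint` are hypothetical names. A PyTorch project would more typically save a model's `state_dict` with `torch.save`.

```python
import json
import os
import tempfile


def save_checkpoint(path, weights, history):
    """Write brain and history to a temp file, then swap it into place.

    The swap (os.replace) is atomic, so a crash or power cut mid-save
    cannot leave a half-written checkpoint behind.
    """
    state = {"weights": weights, "history": history}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or a fresh empty one if no file exists yet."""
    if not os.path.exists(path):
        return {"weights": None, "history": []}
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

On startup a trainer built this way would call `load_checkpoint`, continue from the stored weights and history, and call `save_checkpoint` every so often while it runs.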
No. This is a view-only window. Tweaking the world (speed, number of flies, resetting the brains) is locked behind a private admin login. You get to watch the science happen.
Built with Python · PyTorch · FastAPI. Predator & prey are independent A2C agents in a continuous 2-D world with physical obstacles.