🪰 Fly Hunt

A little AI that learns, live, how to hunt — while the flies learn how to escape.

What am I looking at?

An open field with two kinds of moving dots, plus obstacles:

Lil Nesbes (green) — the hunters. They try to catch flies.
Flies (white) — the prey. They try not to get caught.
Obstacles (grey) — walls both sides must move around.

Nothing here is hand-scripted. Both sides start out clueless and figure everything out by trial and error.

How does it learn?

This is reinforcement learning. Each side has its own small neural network (an "actor-critic"). Every moment, the network looks at the situation (for a hunter: where the nearest fly is, how fast it is moving, where the walls are) and picks a direction to move. Then it gets a reward:

The hunter earns reward for getting closer and a big bonus for a catch.
The fly earns reward for staying alive and opening distance, and a penalty when caught.
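
The two reward rules above can be sketched in a few lines of Python. This is only an illustration: the function names and every number (the survival reward, the catch bonus, the penalty) are assumptions, not the values Fly Hunt actually uses.

```python
import math

# All constants are illustrative assumptions, not Fly Hunt's real settings.
CATCH_BONUS = 10.0       # hunter's one-off bonus for a catch
CAUGHT_PENALTY = -10.0   # fly's one-off penalty for being caught
SURVIVAL_REWARD = 0.01   # fly's small reward for each step it stays alive


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def step_rewards(hunter_before, fly_before, hunter_after, fly_after, caught):
    """Return (hunter_reward, fly_reward) for one simulation step."""
    # How much the gap shrank during this step (negative if it grew).
    closed = distance(hunter_before, fly_before) - distance(hunter_after, fly_after)

    hunter_reward = closed                   # reward for getting closer
    fly_reward = SURVIVAL_REWARD - closed    # reward for surviving and opening distance

    if caught:
        hunter_reward += CATCH_BONUS         # big bonus on top of the closing reward
        fly_reward = CAUGHT_PENALTY          # the penalty replaces the fly's step reward
    return hunter_reward, fly_reward
```

Because one side's "closer" is the other side's "further away", the same distance change pushes the two rewards in opposite directions.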

Over millions of tiny attempts, each network nudges itself toward choices that paid off. That's learning — no rules written by a human, just rewards.
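
What "paid off" means can be made concrete. In an advantage actor-critic method such as A2C, the critic predicts how much reward to expect, and each choice is scored by how much better or worse things turned out than predicted (the "advantage"). Choices with a positive advantage are made more likely. The sketch below shows only that scoring step in plain Python; the function names and the discount value `gamma` are assumptions, and the real project would do this on PyTorch tensors.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap=0.0):
    """Total future reward from each step onward, with later rewards discounted.

    `bootstrap` is the critic's estimate of what follows the last step
    (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99, bootstrap=0.0):
    """Actual return minus the critic's prediction, per step.

    Positive: the choice worked out better than expected.
    Negative: it worked out worse.
    """
    returns = discounted_returns(rewards, gamma, bootstrap)
    return [ret - value for ret, value in zip(returns, values)]
```

For example, with `gamma=0.5`, rewards `[1, 1, 1]` and critic predictions `[1, 1, 1]`, the advantages are `[0.75, 0.5, 0.0]`: the first two choices beat the prediction, the last one matched it.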

Why does it take so long?

Because it's an arms race. While the hunter is getting better at chasing, the flies are getting better at dodging, so the target keeps moving. Every gain has to be earned against a smarter opponent. That's why the two lines on the timeline seesaw instead of shooting straight to 100%.

Hunt accuracy — how often a chase ends in a catch. The hunter's skill.

Prey evasion — how long flies survive on average. The flies' skill.
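
As a rough sketch, both numbers could be computed from a log of finished chases like this. The record layout (`caught`, `survived_seconds`) is a hypothetical one chosen for the example, not the project's actual data format.

```python
def hunt_accuracy(chases):
    """Fraction of chases that ended in a catch (0.0 to 1.0)."""
    if not chases:
        return 0.0
    catches = sum(1 for chase in chases if chase["caught"])
    return catches / len(chases)


def prey_evasion(chases):
    """Average simulated seconds a fly survived per chase."""
    if not chases:
        return 0.0
    total = sum(chase["survived_seconds"] for chase in chases)
    return total / len(chases)


# Hypothetical log of four chases.
log = [
    {"caught": True, "survived_seconds": 4.0},
    {"caught": False, "survived_seconds": 12.0},
    {"caught": True, "survived_seconds": 6.0},
    {"caught": False, "survived_seconds": 10.0},
]
print(hunt_accuracy(log))  # 0.5
print(prey_evasion(log))   # 8.0
```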

What's "sim-time"?

The clock on screen counts simulated time, not real time. One training step equals a couple of simulated seconds, so after many steps the display reads days or weeks of "fly time", even though far less real time has passed. The estimated time is a rough projection of how long the hunter will need to reach 95% accuracy at the current pace.
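
The arithmetic behind both readouts might look like the sketch below. It assumes exactly 2 simulated seconds per training step and a simple straight-line projection for the estimate; the real conversion factor and estimation method are not stated here, so treat both as placeholders.

```python
SIM_SECONDS_PER_STEP = 2.0  # assumption: "a couple of" simulated seconds


def format_sim_time(steps):
    """Convert a training-step count into a days/hours/minutes string."""
    total = int(steps * SIM_SECONDS_PER_STEP)
    days, rest = divmod(total, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes = rest // 60
    return f"{days}d {hours}h {minutes}m"


def eta_to_target(acc_then, acc_now, elapsed, target=0.95):
    """Project how long until accuracy reaches `target` at the current pace.

    `elapsed` is the time between the two accuracy readings; the result is
    in the same unit. Returns None when accuracy is flat or falling, since
    no finite estimate exists at that pace.
    """
    if acc_now >= target:
        return 0.0
    if elapsed <= 0:
        return None
    rate = (acc_now - acc_then) / elapsed
    if rate <= 0:
        return None
    return (target - acc_now) / rate


print(format_sim_time(1_000_000))  # 23d 3h 33m
```

Since the two skill lines seesaw, a projection like this would jump around, and at times no estimate is available at all.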

Is it always running?

Yes. It trains 24/7 on a little machine at home, and saves its brain and full history to a file. So you can check in any day and watch the curves pick up from where they left off.

Can I change anything?

No — this is a view-only window. Tweaking the world (speed, number of flies, resetting the brains) is locked behind a private admin login. You get to watch the science happen. 🙂

Built with Python · PyTorch · FastAPI. Predator & prey are independent A2C agents in a continuous 2-D world with physical obstacles.

▶ watch it live