blog record — Jan 2026

Multi-Bot Traffic Sim

Three TurtleBots, one mini-city, zero deadlocks. Intent broadcast + behaviour trees.

stack

ROS2TurtleBotBehaviour TreesHybrid A*DWA

01motivation

The polite-halt failure

The interesting failure in multi-agent nav isn't crashing, it's waiting. Two bots hit a four-way stop, both detect each other, both back off, both re-plan, both approach again, both back off. Nothing crashes. Nothing moves. The throughput metric quietly bleeds to zero.

This is the symmetric standoff, the dual of the obvious failure where neither yields and they collide. Any purely local yielding policy sits on a knife edge between the two. The MAPF literature has been chewing on this for twenty years; Stern et al.'s 2019 survey^[1] catalogs the complete planners (CBS, ICTS, M*) that resolve conflicts by jointly searching the configuration space.

Those planners are expensive, brittle to noisy perception, and assume a central coordinator with a global map. I had three small diff-drive robots, intermittent comms, and a perception stack that periodically forgot stop signs existed. I needed something cheaper.

The fix was small: don't broadcast position, broadcast intent. If every bot publishes "I will be at intersection X in 2.3 seconds with priority 1," the symmetry breaks before the encounter, and yielding becomes a one-line rule instead of two control loops fighting each other.

fig 01 — chartwith vs without eta_intersection

fig 01Symmetric standoff without ETA, clean pass with it.

02architecture

One topic between bots

Each bot runs an identical stack. The only thing crossing the wire is one 50 Hz ROS 2 topic, /fleet/intent.

fig 02 — chart3 bots · 50 hz topic

fig 02Per-bot stack plus the cross-bot /fleet/intent topic. Best-effort DDS, 50 Hz.

The BT eats both local perception and the fleet intent stream. Everything downstream is a normal Nav2-style planner-controller. The planners don't know other robots exist; that knowledge lives entirely in the BT.

03protocol

Intent broadcast, v3

Three iterations before I landed on something small enough to be cheap and rich enough to be useful.

v1 was just (robot_id, pose). Every bot had to run an internal model of every other bot. Wasteful and inconsistent.

v2 added next_waypoint and velocity. Better, but with a junction every ~3 m, neighbours still had to guess which intersection mattered.

v3 pushes the inference into the publisher, where it's cheap (the bot already knows its plan):

msg// excerpt

std_msgs/Header header

uint8   robot_id              # 0..N-1
geometry_msgs/Pose2D pose     # current pose in map frame
geometry_msgs/Pose2D next_waypoint

string  intersection_id       # "" if not approaching one
float32 eta_intersection      # seconds; -1 if N/A
uint8   priority              # 0=lowest, 255=highest; static per robot for now

uint8 STATE_CRUISING    = 0
uint8 STATE_APPROACHING = 1
uint8 STATE_YIELDING    = 2
uint8 STATE_CROSSING    = 3
uint8 state

fleet_msgs/msg/Intent.msg

ETAs are arc length on the global plan over a smoothed velocity estimate. Noisy — ±0.4 s at 3 m out — but the BT only needs the ordering, not the absolute value.

04decisions

Behavior tree instead of FSM

Started with a finite state machine. Worked for two bots. With three I had a state per pairwise interaction and the transition table became something I no longer trusted. Rewrote it as a behavior tree on BehaviorTree.CPP, mostly because Colledanchise and Ögren^[2] argue BTs compose better past a handful of states: subtrees reuse, fallback semantics are explicit, you can hot-swap a branch without rederiving the transition graph.

                       [Root Sequence]
                              │
                ┌─────────────┴─────────────┐
                │                           │
       [DetectIntersection?]         [DecideRightOfWay]
                                            │
                                       (Fallback ?)
                                            │
              ┌─────────────────┬───────────┴───────────┐
              │                 │                       │
     [NoConflict → Go]   [HigherPriority → Go]   [EarlierETA → Go]
                                                        │
                                                  (else → Yield)


Fallback (?):  succeed on first child that succeeds
Sequence (→): succeed only if all children succeed

fig 03Right-of-way subtree. Fallback (?) tries children left-to-right; sequence (→) runs all in order.

The decision in one line: go if no conflict, or if I outrank my neighbour, or if I get there first; otherwise yield. The third clause did most of the work (see Results).

The payoff: when I added the traffic-light rule three weeks in, it was a new subtree under the root, not a rewrite of every transition. The yield/go subtree didn't even know traffic lights existed.

05motion

Hybrid A* plus DWA

TurtleBots are diff-drive but the simulated bodies have non-trivial turning radii at my run velocities, so I treat them as non-holonomic. Plain grid A* produces in-place rotations the controller can't track when threading a curb cutout.

Global: Hybrid A*. State is (x, y, θ); expansions are short kinematically-feasible arcs at a fixed steering set. Heuristic is the max of Reeds-Shepp distance (ignoring obstacles) and 2D Dijkstra (ignoring kinematics). Plans take 40–120 ms on my laptop for a 20 m route.

Local: DWA. Hybrid A* gives me a path; DWA picks the next (v, ω) from the dynamic window of reachable velocities. Late-detected jaywalkers get dodged here — the global plan never sees them.

The DWA objective, classic Fox et al.^[3]:

G (v, ω) = σ \ i g (α \cdot e x t h e a d in g (v, ω) + \ e t a \cdot e x t c l e a r an ce (v, ω) + γ \cdot e x t v e l oc i t y (v, ω) \ i g)

(DWA)

heading rewards alignment with the next pose, clearance rewards distance to the nearest costmap obstacle, velocity rewards forward speed (so the bot doesn't stop just to maximize clearance). I run α=0.6, β=0.3, γ=0.1 after a small grid search. Jaywalker-heavy scenarios wanted higher β but then the bots drove like learner drivers in open intersections.

BT and DWA interaction matters: when the BT decides to yield, it doesn't stop the controller. It clamps the dynamic window's upper bound to a crawl (0.05 m/s) and biases clearance up. The bot keeps creeping, which makes resuming smooth and — crucially — keeps publishing a non-stale ETA. A stopped robot with eta = ∞ confuses every neighbour's BT.

06perception

YOLO for signs and jaywalkers

YOLOv8n^[4] fine-tuned on ~2,400 hand-labelled Gazebo frames. Classes: stop_sign, yield_sign, traffic_light_(red/yellow/green), pedestrian. ~40 minutes on a 4070, mAP@0.5 around 0.88, pedestrians noticeably worse (0.79) than signs (0.93+).

Signs are easy. They don't move, they're geometrically distinct, you can run a high confidence threshold (0.7) because a missed frame is recovered on the next. Pedestrians are the actual problem. A jaywalker can appear from behind a parked car and be in the bot's path in under 400 ms. Drop the threshold and you get phantom pedestrians spawning from textured walls and the bot panic-stops mid-intersection. Raise it and you miss the real person.

What worked: a per-track temporal vote. I run a cheap IOU tracker and only commit to a pedestrian-in-path once I've seen the track in 3 of the last 5 frames at confidence ≥ 0.45. That's ~150 ms of latency, which the DWA's ~1.2 s horizon absorbs comfortably. Below threshold the BT treats the detection as advisory and only nudges clearance weights up.

fig 04 — chart3 of last 5 @ conf ≥ 0.45

fig 04Per-track vote commits the real jaywalker, rejects the phantom.

07results

200 runs in a mini-city

Setup: a 30 m × 30 m Gazebo mini-city with 6 intersections (4 stop-controlled, 2 signalized), 3 TurtleBot3 Burgers, 8 scripted pedestrians crossing on a Poisson process tuned to a slow college campus. Each run is up to 5 minutes with a randomized 4-waypoint tour per bot. 200 seeds.

Baseline: same stack minus /fleet/intent and the right-of-way subtree. Each bot uses only its LIDAR costmap and "slow if obstacle in conflict zone."

Metric	Baseline	Ours
Safe arrivals (no collision, no timeout)	71%	94%
Mean time-to-goal (s)	78.1	61.4
Intersection deadlocks per run	1.46	0.18 (~8× reduction)
Pedestrian near-misses*	0.31	0.22
Hard stops (>1.5 m/s²)	4.8	3.1

*near-miss = pedestrian within 0.4 m of bot footprint.

The remaining 6% of failures cluster around dense pedestrian crossings, not vehicle-vehicle conflicts. In 11 of 12 failures the bot saw the pedestrian, yielded correctly, and then a second pedestrian entered before the first cleared and the bot timed out waiting. That's a policy hole — the BT has no notion of "creep in once the leading pedestrian passes."

08ablation

The surprising field

I expected priority to do most of the work. It's the most explicit conflict-resolution signal in the message. It didn't.

Field-by-field ablation:

Field removed	Safe-arrival drop	Deadlocks/run
`priority`	94% → 91%	0.18 → 0.27
`next_waypoint`	94% → 88%	0.18 → 0.41
`eta_intersection`	94% → 76%	0.18 → 1.12

fig 05 — chart200 runs · deadlocks per 5-min run

fig 05Removing eta_intersection collapses ~80% of the deadlock-reduction gain.

eta_intersection does roughly 80% of the deadlock reduction on its own. Without an ETA, the BT has to fall back on priority or pose-distance, and both are too coarse. Two bots approaching at different speeds but similar distances will both think the other is ahead. Both yield. Standoff.

With an ETA the asymmetry is explicit and the "yield only if your neighbour gets there first" rule fires cleanly. Priority becomes a tiebreaker. Obvious in retrospect — temporal ordering is what makes traffic legible to humans too — but I had to watch a lot of bots gently bow to each other for ten seconds at a time before I believed it.

09next

What I'd push on next

Heterogeneous fleet. All three bots are identical. Mix in a slow load-carrier and a fast scout and the ETA ordering breaks; static priority probably needs to become negotiated or learned.

Comms dropout. Best-effort DDS on one Wi-Fi network is generous. 30–50% packet loss and 200 ms spikes would stress it. My guess: the BT needs a stale-intent timeout and a conservative fallback, which is a clean subtree addition.

Real hardware. Gazebo cameras are too clean and the LIDAR too honest. Real TurtleBots^[5] with an OAK-D will surface a different failure distribution and the temporal-vote window will need re-tuning.

Longer term I want to know whether the intent topic should be replaced by a learned message — let a small policy decide what to broadcast. That's a much bigger project and probably needs more than one undergrad's worth of evenings.

/ footnotes

[1]Stern et al., Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks, SoCS 2019 — arxiv.org/abs/1906.08291. ↩
[2]Colledanchise & Ögren, Behavior Trees in Robotics and AI: An Introduction, CRC Press, 2018. Library and reference implementation at behaviortree.dev. ↩
[3]Fox, Burgard, Thrun, The Dynamic Window Approach to Collision Avoidance, IEEE R&A Magazine, 1997 — ri.cmu.edu/fox_1997. ↩
[4]Ultralytics YOLOv8 docs — docs.ultralytics.com. ↩
[5]TurtleBot3 platform and ROS 2 stack — turtlebot.com and ros.org. ↩