A DimOS × Hogwarts Mischief

The Wrath of Filch

Send the hound to hide the phones — before the caretaker takes them forever.

Team
Jamjam
Harold  ·  香菇鸡丁  ·  Titian  ·  字泉
Kai  ·  Gale  ·  Pollux
DimOS · Unitree Go2 · YOLO-E · WebRTC · A* Planner
github.com/jamjamDimos/dimos/tree/jamjam_ui
Mischief managed.
Once upon a curfew

It's the witching hour at Hogwarts.

Across the dorms, students still scroll their phones beneath the covers. Argus Filch prowls the corridors — and any phone he finds is gone forever.

Tonight, a Unitree Go2 is your familiar.
Save the night.

Team Jamjam · 02 · The Mischief
The Quest

Win Conditions

To Beat Filch by Dawn
  • Spot every student who appears on the parchment.
  • Click their face — send the hound to hide their phone.
  • Reach them all before the bell tolls dawn.

One robot. Seven souls. A finite number of seconds.

Under the parchment

How it works

camera (/color_image) → YOLO-E (open-vocab + BoT-SORT) → BBoxSelection (sole writer) → TargetLock (debounce + reassoc)
Map click (/free space) → A* Planner (nav_cmd_vel)
MovementManager (tele ▷ nav · L2+B above all) → Go2 WebRTC (real / MuJoCo)

One source of truth. Rerun click and web click both feed the same BBoxSelectionModule.
Exclusivity. Picking a face cancels nav; clicking a point cancels follow — one gesture, one outcome.
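
A minimal sketch of the single-writer idea, not the project's code: `SelectionStore`, `Selection`, and the `source` labels are hypothetical names. Only the fact that Rerun clicks and web clicks converge on one selection module comes from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Selection:
    track_id: int  # tracker ID of the clicked face
    source: str    # "rerun" or "web", kept only for logging


class SelectionStore:
    """Hypothetical stand-in for BBoxSelectionModule: the only writer of the selection."""

    def __init__(self) -> None:
        self._current: Optional[Selection] = None

    def select(self, track_id: int, source: str) -> Selection:
        # Every frontend funnels through this one method, so the last click wins.
        self._current = Selection(track_id, source)
        return self._current

    def clear(self) -> None:
        self._current = None

    @property
    def current(self) -> Optional[Selection]:
        return self._current


store = SelectionStore()
store.select(track_id=3, source="rerun")  # click in the Rerun viewer
store.select(track_id=5, source="web")    # later click in the web UI
print(store.current)
```

Because neither frontend keeps its own copy of the selection, there is nothing to reconcile when the two disagree.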

Robustness, not just a demo

Two state machines worth showing

TargetLock FSM

unselected → locked     (on bbox set)
locked     → searching  (on id miss)
searching  → locked     (on re-found)
searching  → lost       (on timeout)

searching rescues brief occlusions — the lock returns the moment the person reappears.
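
The four transitions above can be sketched as a small state machine. This is an illustration under assumptions: the class and method names are hypothetical, the 2-second search timeout is an invented value, and the debounce and re-association logic named in the architecture slide is left out.

```python
from enum import Enum


class LockState(Enum):
    UNSELECTED = "unselected"
    LOCKED = "locked"
    SEARCHING = "searching"
    LOST = "lost"


class TargetLock:
    def __init__(self, search_timeout_s: float = 2.0) -> None:
        self.state = LockState.UNSELECTED
        self.search_timeout_s = search_timeout_s  # assumed value, not from the slides
        self._search_started = 0.0

    def select(self) -> None:
        """'bbox set': the user picked a target."""
        self.state = LockState.LOCKED

    def update(self, target_visible: bool, now: float) -> LockState:
        """Advance the FSM with one tracker observation taken at time `now` (seconds)."""
        if self.state is LockState.LOCKED and not target_visible:
            # 'id miss': start the search window
            self.state = LockState.SEARCHING
            self._search_started = now
        elif self.state is LockState.SEARCHING:
            if target_visible:
                self.state = LockState.LOCKED  # 're-found'
            elif now - self._search_started >= self.search_timeout_s:
                self.state = LockState.LOST    # 'timeout'
        return self.state


lock = TargetLock(search_timeout_s=2.0)
lock.select()
observations = [(True, 0.0), (False, 0.5), (True, 1.0), (False, 1.5), (False, 4.0)]
states = [lock.update(visible, t).value for visible, t in observations]
print(states)  # ['locked', 'searching', 'locked', 'searching', 'lost']
```

The brief miss at 0.5 s recovers as soon as the target is seen again; only the long gap after 1.5 s ends in lost.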

MovementManager Priority

1. L2 + B hardware damp — above the stack
2. tele_cmd_vel wins on any keypress
3. nav_cmd_vel resumes after 1 s silence
4. Follow task ⇒ idle on stop_movement

The human is always able to interrupt the autonomous loop — by design, by layer.
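
A minimal sketch of rules 2 and 3, assuming a simplified two-value velocity command. `MovementArbiter` and its method names are hypothetical; the L2 + B hardware damp sits above software and is not modelled here.

```python
from typing import Optional, Tuple

Twist = Tuple[float, float]  # (linear m/s, angular rad/s), simplified for the sketch


class MovementArbiter:
    def __init__(self, teleop_hold_s: float = 1.0) -> None:
        self.teleop_hold_s = teleop_hold_s          # the "1 s silence" window
        self._last_teleop: Optional[float] = None   # time of the latest keypress

    def on_teleop(self, cmd: Twist, now: float) -> Twist:
        # Teleop always passes through and restarts the hold window.
        self._last_teleop = now
        return cmd

    def on_nav(self, cmd: Twist, now: float) -> Optional[Twist]:
        # Drop planner output while the human has been active recently.
        if self._last_teleop is not None and now - self._last_teleop < self.teleop_hold_s:
            return None
        return cmd


arbiter = MovementArbiter()
arbiter.on_teleop((0.5, 0.0), now=10.0)
suppressed = arbiter.on_nav((0.3, 0.1), now=10.4)  # inside the window
resumed = arbiter.on_nav((0.3, 0.1), now=11.2)     # after 1 s of silence
print(suppressed, resumed)
```

Returning None rather than a zero command leaves the decision of what to send the robot to the caller, which is one possible design and not necessarily what the real MovementManager does.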

On top of jamjam/jamjam

What jamjam adds

01 · APP
The Marauder's Map
A new dimos/apps/ namespace. Parchment web UI on :7782 — click a face, click a point, drive with WASD. ~1.7k lines of HTML+JS, fully offline.
02 · PERCEPTION
World-frame distance estimation
The naïve LiDAR-into-bbox approach broke on real data (sparse cloud, missing edge depth). We pivoted to TF reprojection + Detection3DPC.from_2d over the voxel map: robot↔target distance now lives in world frame, and the follower is about 10× more robust.
03 · TOOLING
Auto-logged runner
scripts/run-blueprint.sh · SIM=mujoco / REPLAY=1 / ROBOT_IP=… · auto-rotated logs · mjpython on macOS for MuJoCo's GL context.
50 files added · 1 commit, additive · 0 jamjam files modified
Designed to feel like magic

The Parchment

A scroll unfurls. Characters ignite in candle-gold, then settle into ink. One quill cursor. One soundtrack.

Cinematic intro
Scroll unfurls in 1.6 s. 4 story beats reveal letter-by-letter with a gold-to-ink glow. Skippable in one key.
Click anything, anywhere
Face on the map → track. Free space → navigate. Gallery face → swap target. Same gesture, same outcome.
Looped soundtrack
Potion Latch on autoplay-with-mute → unmuted on the first interaction. The only autoplay shape every browser actually allows.
Where jamjam earned its scars

The bugs worth telling

① Distance estimation broke on real data

The obvious approach — shoot LiDAR rays into the bbox — looked solid until sparse point clouds left holes at the person's edges. The follower stuttered, sometimes lunged.

We pivoted: take the bbox center, push it back through the TF tree into world frame, then ask the voxel map (Detection3DPC.from_2d) where the target actually sits in 3D.

Robot ↔ target distance in world coordinates. Edge-case point clouds no longer derail the follow. An order of magnitude more robust.
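
One way to picture the pivot, as a sketch under assumptions. The internals of Detection3DPC.from_2d are not shown in this deck, so a coarse ray march through a set of occupied voxels stands in for it here. The pinhole intrinsics, the camera-to-world rotation, and every function name below are hypothetical.

```python
import math
from typing import Optional, Sequence, Set, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_ray(u: float, v: float, fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Unit ray through pixel (u, v) in the camera frame (pinhole model, z forward)."""
    x, y, z = (u - cx) / fx, (v - cy) / fy, 1.0
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


def rotate(rotation: Sequence[Sequence[float]], vec: Vec3) -> Vec3:
    """Apply a 3x3 camera-to-world rotation (what a TF lookup would supply)."""
    return tuple(sum(rotation[i][j] * vec[j] for j in range(3)) for i in range(3))


def first_occupied_voxel(origin: Vec3, direction: Vec3, occupied: Set[Tuple[int, int, int]],
                         voxel_size: float, max_range: float = 10.0) -> Optional[Vec3]:
    """March along the ray in half-voxel steps; return the centre of the first hit."""
    step = voxel_size / 2.0
    t = 0.0
    while t <= max_range:
        point = [o + t * d for o, d in zip(origin, direction)]
        key = tuple(math.floor(c / voxel_size) for c in point)
        if key in occupied:
            return tuple((k + 0.5) * voxel_size for k in key)
        t += step
    return None


# Toy scene: camera at the world origin, looking along +z, one occupied voxel ahead.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
robot_xyz = (0.0, 0.0, 0.0)
ray_world = rotate(identity, pixel_to_ray(320.0, 240.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0))
target = first_occupied_voxel(robot_xyz, ray_world, occupied={(0, 0, 8)}, voxel_size=0.25)
distance = math.dist(robot_xyz, target)
print(target, distance)
```

The point of the sketch is that the distance comes from the accumulated map along one ray, so a sparse single scan with holes at the person's edges no longer decides the answer.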

② Three drivers, one wheel

tele_cmd_vel, the bbox follower, and the planner all want to write cmd_vel. Letting any two of them fire simultaneously is how robots run into things.

Our rule lives in two places, by design:

✦ Server-side select publishes stop_movement=True before adopting the new face.
✦ Server-side navigate clears the bbox selection before the planner takes the wheel.
✦ MovementManager suppresses nav_cmd_vel for 1 s after any teleop keypress.

One gesture, one outcome. The autonomous loop never fights the human.
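
The ordering in the two server-side rules can be sketched like this. `GestureServer` is a hypothetical name and the `events` list stands in for published messages; only the order of operations is taken from the slide.

```python
from typing import List, Optional, Tuple


class GestureServer:
    """Hypothetical handlers for the two click gestures."""

    def __init__(self) -> None:
        self.selected_face: Optional[int] = None
        self.nav_goal: Optional[Tuple[float, float]] = None
        self.events: List[str] = []

    def select(self, face_id: int) -> None:
        # Stop whatever is moving first, then adopt the new face.
        self.events.append("stop_movement=True")
        self.nav_goal = None
        self.selected_face = face_id
        self.events.append(f"follow face {face_id}")

    def navigate(self, x: float, y: float) -> None:
        # Drop the follow target before the planner takes the wheel.
        self.selected_face = None
        self.events.append("clear bbox selection")
        self.nav_goal = (x, y)
        self.events.append(f"plan to ({x}, {y})")


server = GestureServer()
server.select(4)
server.navigate(1.5, -2.0)
print(server.events)
```

After the second gesture the server holds a navigation goal and no selected face, so the follower and the planner never both have a reason to publish.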

The hound awaits

Try it · in 30 seconds

git clone -b jamjam_ui \
  https://github.com/jamjamDimos/dimos.git
cd dimos

# MuJoCo simulation — no hardware needed
SIM=mujoco scripts/run-blueprint.sh
open http://localhost:7782/
Filch as an NPC
A red wandering sprite. Whoever reaches the phone first wins.
Multi-dog Houses
One Go2 per House. Race or cooperate.
Offline LLM voices
qwen3:4b already pre-downloaded — per-character lines.

Mischief managed.

Team Jamjam: Harold · 香菇鸡丁 · Titian · 字泉 · Kai · Gale · Pollux
09 · Fin.