Hi everyone,
After working on a serious project for a few weeks, I wanted to build something fun with Metis and Voyager.
A tiny tribute for The Big Bang Theory and Sheldon Cooper fans.
Built a real-time Rock Paper Scissors Lizard Spock game that runs entirely on-device using the Axelera Metis AIPU (214 TOPS).
How it works:
- YOLOv10n trained on the HaGRID dataset (1M+ hand gesture images, 34 gesture classes) detects and classifies hand gestures directly from a live camera feed
- Model is quantized and compiled to run on the Metis AIPU — no GPU, no cloud, just the edge accelerator
- The game maps detected gestures to moves: fist=Rock, open palm=Paper, peace sign=Scissors, grip=Lizard, vulcan/4 fingers=Spock
- You play against Sheldon Cooper (complete with Bazinga quotes and reaction images)
The flow:
- Show a gesture to the camera and hold it steady (0.5s)
- 3-2-1 countdown
- Your gesture is locked, Sheldon picks randomly
- Result screen with win logic, flavor text, and Sheldon's reaction
Tech stack:
- Axelera Metis M2 AIPU on RK3588 board
- Voyager SDK (GStreamer pipeline + AIPU inference)
- YOLOv10n (HaGRID gestures) — exported to ONNX, quantized with per_tensor_min_max, compiled for Metis
- OpenCV for display overlay
- POV / RTSP camera input
- Python, fully local, no round trips to the cloud
What I learned:
- YOLOv10 has attention ops (MatMul/Softmax) that need per_tensor_min_max quantization to compile on Metis
- Tried hand keypoint models first (YOLO11n-pose-hands, 21 keypoints) — keypoints worked, but gesture classification from the POV camera angle was unreliable. Geometry-based pose detection fails when perspective flattens finger distances
- Switching to a gesture classification model (HaGRID) that directly outputs gesture labels was the right call — much more robust than trying to infer gestures from keypoint geometry
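The foreshortening problem is easy to see with toy numbers: a finger tilted away from the image plane projects at roughly cos(tilt) of its true length, so an extended finger seen at a steep POV angle can appear shorter than a half-curled finger seen head-on. The lengths and angles below are illustrative, not measured:

```python
import math

def projected_length(true_length: float, tilt_deg: float) -> float:
    """Apparent length of a segment tilted away from the image plane."""
    return true_length * math.cos(math.radians(tilt_deg))

# Toy numbers: an extended finger (length 1.0) at a steep POV angle
# vs. a half-curled finger (0.55) seen nearly head-on.
extended_at_60 = projected_length(1.0, 60)   # 0.5
curled_at_10 = projected_length(0.55, 10)    # ~0.54

# The extended finger projects SHORTER than the curled one, so any
# threshold on apparent keypoint distances misclassifies the gesture.
print(extended_at_60 < curled_at_10)  # True
```

A classifier trained directly on gesture images sidesteps this, since it learns appearance under the angles present in the training data rather than relying on metric finger geometry.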

