Hi Community!
I’ve made significant strides since my last update, though the journey involved some unexpected pivots.
Initially, I focused on a hand keypoint-based pipeline, but its real-world performance fell short of my expectations. Even with custom cropping, the model struggled to accurately identify keypoints from a 3-meter distance. On top of that, my gesture recognition algorithm added further latency, and the heavy CPU overhead of the dynamic cropping made the system feel sluggish and unresponsive.
Realizing I needed a more robust approach, I took a few steps back to the design phase. I decided to transition from keypoint detection to a dedicated gesture recognition model. I integrated a high-quality model from GitHub, trained on a massive dataset far beyond my own processing capacity, and successfully deployed it using the Voyager SDK.
The results are like a dream. To optimize performance further, I rebuilt the entire pipeline on the high-level API, which eliminated my previous cropping issues altogether. As a result, the system is now faster, more reliable, and significantly more efficient.
The most exciting part? I’ve finally linked the system with Home Assistant! I’ve developed the integration logic to translate gestures into real-world actions. Currently, I can seamlessly control my lamps (adjusting brightness, toggling power, and cycling through colors) and manage my LG Smart TV (power, volume, and mute functions).
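For anyone curious how the gesture-to-action bridge can work, here is a minimal sketch using Home Assistant's REST API. The gesture labels, entity IDs, URL, and token below are placeholder assumptions for illustration, not my actual configuration:

```python
import json
import urllib.request

# Placeholder values -- replace with your own Home Assistant URL and token.
HASS_URL = "http://homeassistant.local:8123"
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

# Hypothetical gesture labels mapped to (domain, service, payload) triples.
GESTURE_ACTIONS = {
    "thumbs_up": ("light", "turn_on", {"entity_id": "light.desk_lamp"}),
    "thumbs_down": ("light", "turn_off", {"entity_id": "light.desk_lamp"}),
    "fist": ("media_player", "volume_mute",
             {"entity_id": "media_player.lg_tv", "is_volume_muted": True}),
}


def resolve_gesture(gesture):
    """Look up the Home Assistant service call for a recognized gesture."""
    return GESTURE_ACTIONS.get(gesture)


def call_service(domain, service, payload):
    """POST a service call to Home Assistant's REST API."""
    req = urllib.request.Request(
        f"{HASS_URL}/api/services/{domain}/{service}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def handle_gesture(gesture):
    """Dispatch a gesture to its mapped action, ignoring unknown labels."""
    action = resolve_gesture(gesture)
    if action is not None:
        return call_service(*action)
    return None
```

The `POST /api/services/<domain>/<service>` endpoint with a long-lived access token is standard Home Assistant; keeping the gesture-to-service mapping in a plain dictionary makes it easy to add new commands without touching the recognition code.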
What’s next:
- Expanding the library of supported features and commands.
- Refactoring the codebase and finalizing documentation.
The full project reveal is just around the corner, scheduled for the end of this week. Stay tuned!

