
What if you could run 24 simultaneous YOLO streams on a single M.2 card, 6x more than what most hardware can handle? Welcome to the performance revolution that's redefining what's possible at the edge.

The computer vision world runs on Ultralytics YOLO models. From retail loss prevention to smart city surveillance, these models power the visual intelligence behind modern applications. But every YOLO developer knows the challenge: scaling real-time inference across multiple video streams while maintaining performance and staying within power budgets.

The Axelera® AI Metis® platform addresses this multi-stream challenge head-on, delivering purpose-built performance for YOLO workloads at the edge. Developers who attended YOLO Vision London got a preview of the upcoming YOLO26 release running on a new Metis form factor that is coming soon. Because both Ultralytics and Axelera AI prioritize ease of use, the new model was compiled and running within the same day. We're excited to share more in the coming months!

Multi-Stream YOLO Performance That Actually Works

Independent testing by HotTech Vision and Analysis reveals what matters most to YOLO developers: consistent multi-stream performance across model variants. Their comprehensive evaluation tested YOLOv5s, YOLOv5m, YOLOv7, YOLOv8s, and YOLOv8l across multiple platforms using 14 simultaneous 30fps video streams.

Key findings for YOLO developers:

Consistent Performance: Metis maintained leading performance across all YOLO variants tested, with the PCI Express card delivering 313 FPS on YOLOv5s and the M.2 module achieving 178 FPS, outperforming the competition by substantial margins. (Our own testing showed results up to 875 FPS on the PCIe card and 827 FPS on the M.2.)

Energy Efficiency: Testing showed sub-1 Joule per frame performance on most YOLO models, with YOLOv5s achieving 0.743 J/frame on the PCIe card and 1.035 J/frame on the M.2 module.

Model Flexibility: From lightweight YOLOv8s to complex YOLOv8l, Metis handled model scaling better than alternatives, maintaining efficiency even as computational demands increased.
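As a quick sanity check on those efficiency numbers, the energy-per-frame figures quoted above convert directly into frames processed per joule and into the energy cost of sustaining a single real-time stream. This sketch uses only the YOLOv5s figures cited in the HotTech evaluation:

```python
# Energy figures quoted for YOLOv5s in the HotTech evaluation.
PCIE_J_PER_FRAME = 0.743   # Metis PCIe card
M2_J_PER_FRAME = 1.035     # Metis M.2 module

def frames_per_joule(j_per_frame: float) -> float:
    """Invert energy-per-frame to get frames processed per joule."""
    return 1.0 / j_per_frame

def joules_per_minute(j_per_frame: float, fps: float = 30.0) -> float:
    """Energy needed to process one real-time stream for 60 seconds."""
    return j_per_frame * fps * 60.0

print(f"PCIe: {frames_per_joule(PCIE_J_PER_FRAME):.2f} frames/J, "
      f"{joules_per_minute(PCIE_J_PER_FRAME):.0f} J per minute of a 30 fps stream")
print(f"M.2:  {frames_per_joule(M2_J_PER_FRAME):.2f} frames/J, "
      f"{joules_per_minute(M2_J_PER_FRAME):.0f} J per minute of a 30 fps stream")
```

In round numbers, the PCIe card processes about 1.35 frames per joule on YOLOv5s, and a single 30 fps stream costs roughly 1.3 kJ per minute of video.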

Real-World Multi-Stream Applications

The performance translates directly to practical deployments across industries:

| Application/Use Case | Streams Supported | Model Types Tested | Efficiency Cited |
| --- | --- | --- | --- |
| Retail checkout/loss prevention | 1–24 | Multiple YOLO | Lowest; <1 J/frame vs. peers |
| Security/surveillance | 1–24 | YOLOv5m, YOLOv8l | Maintains 2–3x throughput of rivals |

Retail Operations: Multi-camera checkout monitoring and loss prevention systems benefit from processing up to 24 streams simultaneously while maintaining real-time performance.

Security & Surveillance: Campus and urban deployments can process up to 24 streams for real-time threat detection and monitoring.
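The 24-stream figure follows directly from the throughput numbers cited earlier: 24 real-time feeds at 30 fps require 720 inferences per second in aggregate, which fits within the up-to-875 FPS YOLOv5s result from our own PCIe testing. A back-of-the-envelope check:

```python
def required_fps(streams: int, fps_per_stream: float = 30.0) -> float:
    """Aggregate inference rate needed to keep every stream real time."""
    return streams * fps_per_stream

def max_streams(device_fps: float, fps_per_stream: float = 30.0) -> int:
    """How many real-time streams a given aggregate throughput can sustain."""
    return int(device_fps // fps_per_stream)

print(required_fps(24))   # 720.0 inferences/s for 24 x 30 fps feeds
print(max_streams(875))   # 29 streams at the 875 FPS YOLOv5s figure
```

The same arithmetic works for any model variant: divide the device's measured FPS on that model by the per-stream frame rate to get the real-time stream ceiling.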

Developer Experience Built for YOLO

Axelera AI’s Voyager® SDK provides native YOLO support that respects developer time.

The platform includes:

Direct Model Support: Ultralytics YOLOv5, YOLOv8, and YOLO11 variants integrate without conversion pipelines or compatibility workarounds.

Multi-Stream Ready: Sample applications demonstrate multi-stream inference out of the box, not as an afterthought requiring custom development.

Performance Optimization: Built-in quantization and pipeline optimization ensure spec-sheet performance that translates into real deployments.
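The Voyager SDK's sample applications implement multi-stream inference for you; purely as an illustration of the underlying pattern (standard-library Python only, with a hypothetical `run_model` stand-in where the real SDK inference call would go), a shared-queue multi-stream loop looks roughly like this:

```python
import queue
import threading
from collections import Counter

NUM_STREAMS = 4        # e.g. 4 cameras; the same pattern scales to 24
FRAMES_PER_STREAM = 30

def run_model(frame):
    """Hypothetical stand-in for the accelerator's inference call."""
    return {"stream": frame["stream"], "detections": []}

def produce(stream_id: int, frames: queue.Queue) -> None:
    """Simulate one camera pushing decoded frames into a shared queue."""
    for i in range(FRAMES_PER_STREAM):
        frames.put({"stream": stream_id, "index": i})

def main() -> Counter:
    frames: queue.Queue = queue.Queue()
    results: Counter = Counter()
    producers = [threading.Thread(target=produce, args=(s, frames))
                 for s in range(NUM_STREAMS)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    # A single consumer models one accelerator serving every stream.
    while not frames.empty():
        out = run_model(frames.get())
        results[out["stream"]] += 1
    return results

if __name__ == "__main__":
    print(main())  # per-stream counts of frames processed
```

In a real deployment the producers would be video decode pipelines and `run_model` would be the SDK's inference entry point; the point of the pattern is that one shared engine drains frames from all streams.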

"Unlike the competing platforms, we did not have to do any additional development work to handle multiple simultaneous inputs, output postprocessing, or data saving." – HotTech evaluation

Looking Ahead: Expanding Deployment Options

Innovation continues across both hardware and software fronts to meet evolving deployment needs. The recently announced Axelera® AI Metis® M.2 Max delivers double the memory capacity and bandwidth of the original M.2 module, representing the latest advancement in the hardware lineup.

New form factors beyond M.2 and PCIe cards provide additional integration flexibility for diverse edge scenarios, while upcoming SDK releases continue optimizing Ultralytics YOLO performance and expanding model support for the anticipated YOLO26 launch. Both companies are continuously improving their products to deliver better performance for developers.

Solving High-Resolution Challenges with Tiled Inference

One challenge YOLO developers encounter comes from high-resolution cameras covering a wide area at distance, such as an 8K security camera overlooking a large crowd. Because Ultralytics YOLO models are typically trained on 640x640 inputs and expect that shape at inference time, downscaling an 8K or other high-resolution source to fit can hurt detection accuracy, especially for small or distant objects.

The Voyager SDK supports tiled inference: it automatically breaks high-resolution images into smaller tiles, runs inference on each tile at full resolution, then reassembles the results. Developers get the detection accuracy they expect on high-resolution video streams without sacrificing the performance benefits of running inference at the model's trained resolution.
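To make the idea concrete, here is a minimal, dependency-free sketch of the tiling arithmetic. This is not the Voyager SDK's actual implementation, and `detect_tile` is a hypothetical stand-in for a 640x640 model call: the frame is split into overlapping 640-pixel tiles, each tile is inferred at full resolution, and each box is offset back into full-frame coordinates.

```python
TILE = 640      # model input size the tiles are cut to
OVERLAP = 64    # overlap so objects on tile seams are still seen whole

def tile_origins(size: int, tile: int = TILE, overlap: int = OVERLAP):
    """Top-left offsets of tiles covering `size` pixels; last tile clamped."""
    step = tile - overlap
    origins = list(range(0, max(size - tile, 0) + 1, step))
    if origins[-1] + tile < size:   # make sure the far edge is covered
        origins.append(size - tile)
    return origins

def detect_tile(x0: int, y0: int):
    """Hypothetical per-tile detector; a real one would crop the frame at
    (x0, y0) and run the model. Returns boxes in tile-local coordinates."""
    return [(100, 120, 180, 200, "person", 0.9)]

def tiled_inference(width: int, height: int):
    """Run the stand-in detector per tile and map boxes to frame coords."""
    boxes = []
    for y0 in tile_origins(height):
        for x0 in tile_origins(width):
            for (x1, y1, x2, y2, cls, conf) in detect_tile(x0, y0):
                boxes.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, cls, conf))
    return boxes
```

A production version would also deduplicate boxes in the overlap regions (e.g. with non-maximum suppression), which the SDK's reassembly step handles for you.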

Your Next YOLO Deployment

Multi-stream computer vision no longer requires choosing between performance and efficiency. Independent testing validates that purpose-built AI acceleration can deliver both simultaneously.

For YOLO developers planning multi-stream deployments, the question isn't whether you can achieve real-time performance across multiple feeds - it's how quickly you can get there.

Ready to test multi-stream YOLO performance in your environment? The Voyager SDK provides everything needed to evaluate real-world performance with your models and video streams.
