Question

Cascaded Pipeline

  • January 11, 2026
  • 2 replies
  • 29 views

Hi all!

I’m currently trying to set up a simple pipeline with YOLO pose + tracker.
The attached YAML works fine, but how do I link the tracker results back to the tracked object?
It seems the tracker's metadata does not contain any reference to the tracked object from stage 1.

I feel like I'm missing something obvious here...

axelera-model-format: 1.0.0

name: yolo11lpose-coco-tracker

description: yolo11l pose estimation with ByteTrack tracking

pipeline:
  - keypoint_detections:
      model_name: yolo11lpose-coco-onnx
      input:
        type: image
      preprocess:
        - letterbox:
            height: ${{input_height}}
            scaleup: true
            width: ${{input_width}}
        - torch-totensor:
      inference:
        handle_all: false
      postprocess:
        - decodeyolopose:
            box_format: xywh
            conf_threshold: 0.65
            max_nms_boxes: 3000
            nms_iou_threshold: 0.45
            nms_top_k: 300
            normalized_coord: false
  - tracking:
      model_name: tracker
      input:
        source: full
        color_format: RGB
      cv_process:
        - tracker:
            algorithm: bytetrack
            bbox_task_name: keypoint_detections
            min_width: 20
            min_height: 20
            history_length: 30
            algo_params:
              frame_rate: 20
              track_buffer: 30

models:
  yolo11lpose-coco-onnx:
    class: AxONNXModel
    class_path: $AXELERA_FRAMEWORK/ax_models/base_onnx.py
    weight_path: weights/yolo11l-pose.onnx
    weight_url: https://media.axelera.ai/artifacts/model_cards/weights/yolo/keypoint_detection/yolo11l-pose.onnx
    weight_md5: a0c2124d8dfec01a427cc8b11bae0255
    task_category: KeypointDetection
    input_tensor_layout: NCHW
    input_tensor_shape: [1, 3, 640, 640]
    input_color_format: RGB
    num_classes: 1
    dataset: CocoDataset-keypoint-COCO2017
    extra_kwargs:
      compilation_config:
        quantization_scheme: per_tensor_min_max
        ignore_weight_buffers: false
  tracker:
    model_type: CLASSICAL_CV
    task_category: ObjectTracking

datasets:
  CocoDataset-keypoint-COCO2017:
    class: KptDataAdapter
    class_path: $AXELERA_FRAMEWORK/ax_datasets/objdataadapter.py
    data_dir_name: coco
    label_type: COCO2017

operators:
  decodeyolopose:
    class: DecodeYoloPose
    class_path: $AXELERA_FRAMEWORK/ax_models/decoders/yolopose.py

Any hints?

For now I use an extra IoU step to match the bboxes from the tracker against those from the keypoint detection, but that's really just a workaround...
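For reference, the IoU workaround I use looks roughly like this (a minimal, self-contained sketch; the helper names and the [x1, y1, x2, y2] box format are my own illustration, not Voyager SDK API):

```python
def iou_xyxy(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_tracks_to_detections(det_boxes, track_boxes, iou_threshold=0.5):
    """Greedy one-to-one matching: each detection takes the best unused
    track whose IoU exceeds the threshold.

    det_boxes: sequence of boxes; track_boxes: dict {track_id: box}.
    Returns {detection_index: track_id}.
    """
    mapping, used = {}, set()
    for i, det in enumerate(det_boxes):
        best_id, best_iou = None, iou_threshold
        for tid, tbox in track_boxes.items():
            if tid in used:
                continue
            score = iou_xyxy(det, tbox)
            if score > best_iou:
                best_id, best_iou = tid, score
        if best_id is not None:
            mapping[i] = best_id
            used.add(best_id)
    return mapping
```

Greedy matching is fine here because the tracker boxes are near-identical to the detector boxes; for crowded scenes a global assignment (e.g. Hungarian algorithm) would be more robust.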

 

2 replies

  • Axelera Team
  • January 13, 2026

Hi FreezerJohn,

how do you access the results at the moment? Could you share part of your Python code so that I can better understand your requirements?


  • Author
  • Ensign
  • January 15, 2026

Hi Sascha,

sure! So from my understanding, in the pipeline above the tracker tracks the results from the first stage, as configured with bbox_task_name: keypoint_detections.

In code, I access frame_result.meta, which contains entries of type CocoBodyKeypointsMeta and tracking_meta, as expected.

My problem is: as soon as there is more than one person in a frame, I do not know which tracker result belongs to which keypoint detection. There seems to be no link in the metadata, and the indices are in random order.
Both bounding boxes are of course quite similar, so with an IoU calculation I can link them, but that feels like an unnecessary extra step on my side.

 

#!/usr/bin/env python3
"""
Minimal script to demonstrate the tracker / keypoint correlation issue.
Uses public image: https://www.ultralytics.com/images/bus.jpg
"""

import os
import sys
import urllib.request
from pathlib import Path

if not os.environ.get('AXELERA_FRAMEWORK'):
    sys.exit("Please activate voyager-sdk venv")

import numpy as np
from axelera.app.stream import create_inference_stream

IMAGE_URL = "https://www.ultralytics.com/images/bus.jpg"
IMAGE_PATH = "/tmp/bus.jpg"

if not Path(IMAGE_PATH).exists():
    urllib.request.urlretrieve(IMAGE_URL, IMAGE_PATH)

stream = create_inference_stream(network="yolo11lpose-coco-tracker", sources=[IMAGE_PATH])

for frame_result in stream:
    meta = frame_result.meta
    tracker = meta['tracking']
    kp_det = meta['keypoint_detections']

    print("keypoint_detections.boxes (indexed 0,1,2...):")
    for i, box in enumerate(kp_det.boxes):
        print(f"  Detection[{i}]: {box}")

    print("\ntracker.boxes (keyed by track_id):")
    for track_id, box in tracker.boxes.items():
        print(f"  Track[{track_id}]: {box}")

    print("\nPROBLEM: Indices don't match, must use IoU to correlate:")
    for det_idx in range(len(kp_det.boxes)):
        for track_id, track_box in tracker.boxes.items():
            if np.allclose(track_box, kp_det.boxes[det_idx], atol=1):
                print(f"  Detection[{det_idx}] -> Track[{track_id}]")
                break

    break  # single image source: only the first frame is needed

stream.stop()

Output:

Init ByteTrack!
keypoint_detections.boxes (indexed 0,1,2...):
Detection[0]: [221 407 345 860]
Detection[1]: [668 394 809 882]
Detection[2]: [ 49 398 247 904]
Detection[3]: [ 0 549 78 873]

tracker.boxes (keyed by track_id):
Track[4]: [ 0 549 78 873]
Track[3]: [ 49 398 247 904]
Track[2]: [668 394 809 882]
Track[1]: [221 407 345 860]

PROBLEM: Indices don't match, must use IoU to correlate:
Detection[0] -> Track[1]
Detection[1] -> Track[2]
Detection[2] -> Track[3]
Detection[3] -> Track[4]
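The same correlation can be done in one shot with a vectorised pairwise IoU matrix instead of the nested loops above. This is a sketch using the boxes printed in the output; the [x1, y1, x2, y2] box format and the function name are my own assumptions, not SDK functionality:

```python
import numpy as np

def iou_matrix(dets, tracks):
    """Pairwise IoU between two sets of [x1, y1, x2, y2] boxes."""
    dets = np.asarray(dets, dtype=float)
    tracks = np.asarray(tracks, dtype=float)
    # Intersection corners via broadcasting: result shape (n_dets, n_tracks)
    x1 = np.maximum(dets[:, None, 0], tracks[None, :, 0])
    y1 = np.maximum(dets[:, None, 1], tracks[None, :, 1])
    x2 = np.minimum(dets[:, None, 2], tracks[None, :, 2])
    y2 = np.minimum(dets[:, None, 3], tracks[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    union = area_d[:, None] + area_t[None, :] - inter
    return np.where(union > 0, inter / union, 0.0)

# Boxes copied from the output above
dets = [[221, 407, 345, 860], [668, 394, 809, 882],
        [49, 398, 247, 904], [0, 549, 78, 873]]
tracks = {4: [0, 549, 78, 873], 3: [49, 398, 247, 904],
          2: [668, 394, 809, 882], 1: [221, 407, 345, 860]}

track_ids = list(tracks)  # fixes the column order of the matrix
iou = iou_matrix(dets, list(tracks.values()))
for det_idx, col in enumerate(iou.argmax(axis=1)):
    if iou[det_idx, col] >= 0.5:  # only accept confident matches
        print(f"Detection[{det_idx}] -> Track[{track_ids[col]}]")
```

On this frame it reproduces the Detection[0] -> Track[1] … Detection[3] -> Track[4] mapping, but it is still the same client-side workaround; a track_id carried in the detection metadata would make it unnecessary.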