I’ve used the yolov5s.pt file provided by the official YOLOv5 repository and converted it to ONNX with this command:
python3 export.py --weights yolov5s.pt --imgsz 640 --batch-size 1 --include onnx --opset 17
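As a sanity check, I can read the declared shapes straight from the ONNX file with the plain onnx package (nothing Axelera-specific, this only confirms what the exporter wrote):

import onnx

model = onnx.load("yolov5s.onnx")
onnx.checker.check_model(model)

# Print the declared input/output names and shapes (dim_value is 0 for dynamic dims).
for value in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_value for d in value.type.tensor_type.shape.dim]
    print(value.name, dims)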
I then tried to compile it with this command:
compile -i /home/axelera/Vision.AxeleraTesting/data/yolov5s.onnx -o /home/axelera/Vision.AxeleraTesting/data/compile --overwrite
I got this output:
09:49:45 [INFO] Dump used CLI arguments to: /home/vintecc/axelera/Vision.AxeleraTesting/data/compile/cli_args.json
09:49:45 [INFO] Dump used compiler configuration to: /home/vintecc/axelera/Vision.AxeleraTesting/data/compile/conf.json
09:49:45 [INFO] Input model has static input shape(s): ((1, 3, 640, 640),). Use it for quantization.
09:49:45 [INFO] Data layout of the input model: NCHW
09:49:45 [INFO] Using dataset of size 100 for calibration.
09:49:45 [INFO] In case of compilation failures, turn on 'save_error_artifact' and share the archive with Axelera AI.
09:49:45 [INFO] Quantizing '' using QToolsV2.
09:49:45 [INFO] ONNX model validation can be turned off by setting 'validate_operators' to 'False'.
09:49:49 [INFO] Checking ONNX model compatibility with the constraints of opset 17.
Calibrating... ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨ | 100% | 8.22it/s | 100it |
09:50:04 [INFO] Exporting '' using GraphExporterV2.
/home/vintecc/.cache/axelera/venvs/3252ae77/lib/python3.10/site-packages/torch/nn/functional.py:3734: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/home/vintecc/.cache/axelera/venvs/3252ae77/lib/python3.10/site-packages/torch/jit/_trace.py:976: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
  module._c._create_method_from_trace(
09:50:12 [INFO] Quantization finished.
09:50:12 [INFO] Quantization took: 27.38 seconds.
09:50:12 [INFO] Export quantized model manifest to JSON file: /home/vintecc/axelera/Vision.AxeleraTesting/data/compile/quantized_model_manifest.json
09:50:12 [INFO] Lower input model to target device...
09:50:12 [INFO] In case of compilation failures, turn on 'save_error_artifact' and share the archive with Axelera AI.
09:50:12 [INFO] Lowering '' to target 'device' in 'multiprocess' mode for 1 AIPU core(s) using 100.0% of available AIPU resources.
09:50:12 [INFO] Running LowerFrontend...
09:50:31 [INFO] Running FrontendToMidend...
09:50:33 [INFO] Running LowerMidend...
09:50:37 [INFO] Running MidendToTIR...
09:50:58 [INFO] Running LowerTIR...
09:51:54 [INFO] LowerTIR succeeded to fit buffers into memory after iteration 0/4.
Pool usage: {L1: alloc:3,978,752B avail:4,194,304B over:0B util:94.86%, L2: alloc:23,074,304B avail:32,309,248B over:0B util:71.42%, DDR: alloc:3,840,256B avail:1,040,187,392B over:0B util:0.37%}
Overflowing buffer IDs: set()
09:51:55 [INFO] Running TirToAtex...
09:51:57 [INFO] Running LowerATEX...
09:52:01 [INFO] Running AtexToArtifact...
09:52:02 [INFO] Lowering finished!
09:52:02 [INFO] Compilation took: 109.53 seconds.
09:52:02 [INFO] Passes report was generated and saved to: /home/vintecc/axelera/Vision.AxeleraTesting/data/compile/compiled_model/pass_benchmark_report.json
09:52:02 [INFO] Lowering finished. Export model manifest to JSON file: /home/vintecc/axelera/Vision.AxeleraTesting/data/compile/compiled_model_manifest.json
09:52:02 [INFO] Total time: 137.26 seconds.
09:52:02 [INFO] Done.
Looking at the output, the compiler picks up the correct input shape, (1, 3, 640, 640). However, when I open the compiled model with the runtime API, I get different input/output shapes. This piece of code:
input_infos, output_infos = model.inputs(), model.outputs()
Gives me this:
Input: [TensorInfo(shape=(1, 644, 656, 4), dtype=<class 'numpy.int8'>, name='var_input_ifd0', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=1.0, zero_point=0)]
Output:
[TensorInfo(shape=(1, 40, 40, 256), dtype=<class 'numpy.int8'>, name='output_mmio_var', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=1.0, zero_point=0),
 TensorInfo(shape=(1, 20, 20, 256), dtype=<class 'numpy.int8'>, name='output_mmio_var_1', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=1.0, zero_point=0),
 TensorInfo(shape=(1, 80, 80, 256), dtype=<class 'numpy.int8'>, name='output_mmio_var_2', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=1.0, zero_point=0)]
The input should be (1, 3, 640, 640) and the output should be (1, 25200, 85). The three outputs do look like the raw YOLOv5 detection heads (80×80, 40×40, 20×20): 256 channels would be 3 anchors × 85 values = 255 rounded up to 256, and 3 × (80·80 + 40·40 + 20·20) = 25200 matches the number of rows I expect.
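Here is how I currently assume the mapping works, based only on the TensorInfo fields above (a pure NumPy sketch; the helper names are mine, and the NHWC layout, where the padding goes, and the 255 → 256 channel padding are all guesses on my part):

import numpy as np

def prepare_input(img_chw, scale=1.0, zero_point=0):
    # My guess at producing the (1, 644, 656, 4) int8 input from a normalized
    # float32 (3, 640, 640) image: NCHW -> NHWC, pad 640->644 rows, 640->656
    # columns and 3->4 channels (trailing padding is an assumption), then
    # affine-quantize with the scale/zero_point reported by TensorInfo.
    x = np.transpose(img_chw, (1, 2, 0))                  # (640, 640, 3)
    x = np.pad(x, ((0, 4), (0, 16), (0, 1)))              # (644, 656, 4)
    x = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return x[np.newaxis]                                  # (1, 644, 656, 4)

def merge_outputs(heads, scale=1.0, zero_point=0):
    # My guess at recombining the three heads into the usual YOLOv5
    # (1, 25200, 85) tensor: dequantize, drop the assumed 256th padding
    # channel (255 = 3 anchors x 85), flatten, and concatenate large-to-small.
    # Whether the runtime's channel ordering actually matches this reshape
    # is exactly what I would like to have confirmed.
    flat = []
    for h in sorted(heads, key=lambda a: -a.shape[1]):    # 80x80, 40x40, 20x20
        h = (h.astype(np.float32) - zero_point) * scale
        n, height, width, _ = h.shape
        flat.append(h[..., :255].reshape(n, height * width * 3, 85))
    return np.concatenate(flat, axis=1)                   # 3*(6400+1600+400) = 25200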
Could you assist me in how I should use the compiled model?