NPU Development Guide
The SoC on the Quectel Pi H1 is equipped with a Qualcomm® Hexagon™ Processor (NPU), a hardware accelerator designed specifically for AI inference. To run model inference on the NPU, the Qualcomm® AI Runtime SDK (QAIRT) is required to port pre-trained models. Qualcomm® provides a series of SDKs to help developers port their AI models to the NPU:
- Model quantization library: AIMET
- Model porting SDK: QAIRT
- Model application library: QAI-APP-BUILDER
- Online model conversion library: QAI-HUB
Preparation
Create Python Execution Environment
sudo apt install python3-numpy
python3.10 -m venv venv-quecpi-alpha-ai
. venv-quecpi-alpha-ai/bin/activate
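After activating the virtual environment, a quick sanity check can catch setup problems early. This is a minimal sketch that only confirms the interpreter version and that numpy imports; note that a plain venv does not see the apt-installed numpy unless it was created with `--system-site-packages` or numpy was installed into the venv with pip.

```python
import sys

import numpy as np

# The commands above create the venv with python3.10; an older
# interpreter here is a sign the wrong python was used for venv.
print(f"Python {sys.version_info.major}.{sys.version_info.minor}, numpy {np.__version__}")
```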
Download the test program
Download ai-test.zip, extract it, and switch into the extracted directory
unzip ai-test.zip
cd ai-test
Execute AI inference
Run the program to load the model and the test dataset
./qnn-net-run --backend ./libQnnHtp.so \
--retrieve_context resnet50_aimet_quantized_6490.bin \
--input_list test_list.txt --output_dir output_bin
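In typical qnn-net-run usage, the file passed via `--input_list` is a plain text input list: each line names the preprocessed raw input tensor for one inference (multi-input models use `input_name:=path` entries). A sketch that builds such a list, assuming the preprocessed images live in `./images` as in the output shown later:

```python
from pathlib import Path

# Collect preprocessed .raw input tensors into an input list for
# qnn-net-run, one file path per line (one inference per line).
raw_files = sorted(Path("./images").glob("*.raw"))
Path("test_list.txt").write_text("".join(f"{p}\n" for p in raw_files))
print(f"wrote {len(raw_files)} entries to test_list.txt")
```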
View Results
Execute the script to view the results
python3 show_resnet50_classifications.py \
--input_list test_list.txt -o output_bin/ \
--labels_file imagenet_classes.txt
Script output:
Classification results
./images/ILSVRC2012_val_00003441.raw [acoustic guitar]
./images/ILSVRC2012_val_00008465.raw [trifle]
./images/ILSVRC2012_val_00010218.raw [tabby]
./images/ILSVRC2012_val_00044076.raw [proboscis monkey]
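Under the hood, the classification step amounts to an argmax over each raw output plus a label lookup. The sketch below assumes each output is a flat float32 logit vector whose entries line up with the lines of imagenet_classes.txt; the actual file layout under output_bin/ depends on the model's output tensor names.

```python
import numpy as np

def top1_label(raw_output: str, labels_file: str) -> str:
    # Assumption: the output tensor is a flat float32 vector of class
    # scores in the same order as the labels file (one label per line).
    logits = np.fromfile(raw_output, dtype=np.float32)
    with open(labels_file) as f:
        labels = [line.strip() for line in f]
    return labels[int(np.argmax(logits))]
```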
Test image collection
NPU software stack
QAIRT
The QAIRT (Qualcomm® AI Runtime) SDK is an integrated package of Qualcomm® AI software, including Qualcomm® AI Engine Direct, the Qualcomm® Neural Processing SDK, and Qualcomm® Genie. QAIRT gives developers all the tools required for porting and deploying AI models on Qualcomm® hardware accelerators, as well as the runtimes for executing models on the CPU, GPU, and NPU.
Supported inference backends
- CPU
- GPU
- NPU

SoC Architecture Comparison Table
| SoC | dsp_arch | soc_id |
|---|---|---|
| QCS6490 | v68 | 35 |
| QCS9075 | v73 | 77 |
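When a conversion or runtime tool asks for these values, the table can be kept as a small lookup. The values below are copied from the comparison table above; the field names are simply the column headers.

```python
# SoC -> DSP architecture and SoC id, copied from the comparison table.
SOC_INFO = {
    "QCS6490": {"dsp_arch": "v68", "soc_id": 35},
    "QCS9075": {"dsp_arch": "v73", "soc_id": 77},
}

print(SOC_INFO["QCS6490"]["dsp_arch"])  # → v68
```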
AIMET
AIMET (AI Model Efficiency Toolkit) is a quantization toolkit for deep learning models in frameworks such as PyTorch and ONNX. AIMET improves the performance of deep learning models by reducing their computational load and memory usage. With AIMET, developers can iterate quickly to find the quantization configuration that best balances accuracy and latency. Quantized models exported by AIMET can be compiled and deployed on the Qualcomm NPU with QAIRT, or run directly with ONNX Runtime.

QAI-APPBUILDER
Quick AI Application Builder (QAI AppBuilder) helps developers use the Qualcomm® AI Runtime SDK to deploy AI models and design AI applications on Qualcomm® SoC platforms equipped with the Qualcomm® Hexagon™ Processor (NPU). It wraps the model deployment APIs in a simplified set of interfaces for loading models onto the NPU and performing inference. QAI AppBuilder greatly reduces the complexity of model deployment and provides multiple demos that developers can reference when designing their own AI applications.

QAI-Hub
Qualcomm® AI Hub (QAI-Hub) is a one-stop cloud platform for model conversion, providing online model compilation, quantization, performance profiling, inference, and download services. Qualcomm® AI Hub automatically handles the transformation from pre-trained models to device-ready runtimes, and automatically provisions devices in the cloud for on-device profiling and inference. Qualcomm® AI Hub Models (QAI-Hub-Models), built on the cloud services provided by QAI-Hub, lets developers quantize, compile, run, profile, and download the models in its model list on cloud devices from the command line.
