

ONNX-TensorRT: TensorRT backend for ONNX


TensorRT Backend For ONNX

Parses ONNX models for execution with TensorRT.

See also the TensorRT documentation.

For the list of recent changes, see the changelog.

For a list of commonly seen issues and questions, see the FAQ.

For business inquiries, please contact [email protected]

For press and other inquiries, please contact Hector Marinez at [email protected]

Supported TensorRT Versions

Development on the master branch targets TensorRT 7.2.2, the latest version, with full-dimensions and dynamic shape support.

For previous versions of TensorRT, refer to their respective branches.

Full Dimensions + Dynamic Shapes

Building INetwork objects in full dimensions mode with dynamic shape support requires calling the following API:


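C++

const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
builder->createNetworkV2(explicitBatch);  // builder is an existing nvinfer1::IBuilder*

Python

import tensorrt
explicit_batch = 1 << int(tensorrt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
builder.create_network(explicit_batch)  # builder is an existing tensorrt.Builder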

For examples of usage of these APIs, see sampleONNXMNIST and sampleDynamicReshape.
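The value passed at network creation is a bit mask: the bit at position i is set for the creation flag whose enum value is i. The following minimal Python sketch illustrates that computation without requiring TensorRT to be installed. `CreationFlag` and `creation_mask` are hypothetical names used only for illustration; the enum values are assumed to mirror those of `tensorrt.NetworkDefinitionCreationFlag` in TensorRT 7 and should be checked against the installed headers.

```python
from enum import IntEnum


class CreationFlag(IntEnum):
    """Hypothetical stand-in for tensorrt.NetworkDefinitionCreationFlag."""
    EXPLICIT_BATCH = 0      # assumed enum value
    EXPLICIT_PRECISION = 1  # assumed enum value


def creation_mask(*flags: CreationFlag) -> int:
    """Combine creation flags into a bit mask, one bit per flag."""
    mask = 0
    for flag in flags:
        mask |= 1 << int(flag)
    return mask


# Same shape as `1 << int(tensorrt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)`.
explicit_batch = creation_mask(CreationFlag.EXPLICIT_BATCH)
print(explicit_batch)
```

With the real bindings, the equivalent value would be passed to `builder.create_network(...)`.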

Supported Operators

Current supported ONNX operators are found in the operator support matrix.




Building

For building within Docker, we recommend setting up and using the Docker containers as instructed in the main TensorRT repository to build the onnx-tensorrt library.

Once you have cloned the repository, you can build the parser libraries and executables by running:

cd onnx-tensorrt
mkdir build && cd build
cmake .. -DTENSORRT_ROOT=<path_to_trt> && make -j
# Ensure that you update your LD_LIBRARY_PATH to pick up the location of the newly built library:
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH

For building only the libraries, append -DBUILD_LIBRARY_ONLY=1 to the CMake build command.

Executable Usage

ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable:

onnx2trt my_model.onnx -o my_engine.trt

ONNX models can also be converted to human-readable text:

onnx2trt my_model.onnx -t my_model.onnx.txt

ONNX models can also be optimized by ONNX's optimization libraries (added by dsandler). To optimize an ONNX model and output a new one, use -m to specify the output model name and -O to specify a semicolon-separated list of optimization passes to apply:

onnx2trt my_model.onnx -O "pass_1;pass_2;pass_3" -m my_model_optimized.onnx

List all available optimization passes by running:

onnx2trt -p

See more usage information by running:

onnx2trt -h
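When onnx2trt is driven from a script, the options above can be assembled into an argument list rather than a shell string, which avoids having to quote the semicolons in the pass list. The following is a minimal sketch under that assumption; `onnx2trt_command` is a hypothetical helper, not part of onnx-tensorrt, and it only uses the `-o`, `-t`, `-O` and `-m` options shown in this section.

```python
import shlex
from typing import List, Optional, Sequence


def onnx2trt_command(
    model: str,
    engine_out: Optional[str] = None,
    text_out: Optional[str] = None,
    passes: Sequence[str] = (),
    optimized_out: Optional[str] = None,
) -> List[str]:
    """Build an onnx2trt argument list from the options described above."""
    cmd = ["onnx2trt", model]
    if engine_out:
        cmd += ["-o", engine_out]          # serialized TensorRT engine
    if text_out:
        cmd += ["-t", text_out]            # human-readable text dump
    if passes:
        cmd += ["-O", ";".join(passes)]    # semicolon-separated pass list
    if optimized_out:
        cmd += ["-m", optimized_out]       # optimized ONNX model
    return cmd


cmd = onnx2trt_command(
    "my_model.onnx",
    passes=["pass_1", "pass_2", "pass_3"],
    optimized_out="my_model_optimized.onnx",
)
# Print a shell-safe rendering; pass `cmd` itself to subprocess.run to execute it.
print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run(cmd, check=True)` once the onnx2trt executable is on the PATH.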

Python Modules

Python bindings for the ONNX-TensorRT parser are packaged in the shipped .whl files. Install them with:

python3 -m pip install <tensorrt_install_dir>/python/tensorrt-7.x.x.x-cp<python_ver>-none-linux_x86_64.whl

TensorRT 7.2.2 supports ONNX release 1.6.0. Install it with:

python3 -m pip install onnx==1.6.0
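Because the supported ONNX release is pinned, it can be useful to confirm which version is actually installed before running the backend. The sketch below is one way to do that with the standard library; `installed_version` and `matches_required` are hypothetical helper names, and the exact-match comparison is an assumption about how strictly the pin should be enforced.

```python
from importlib import metadata
from typing import Optional

REQUIRED_ONNX = "1.6.0"  # release named in the text above


def installed_version(package: str = "onnx") -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_required(installed: Optional[str], required: str = REQUIRED_ONNX) -> bool:
    """True only when a version is installed and equals the required one."""
    return installed is not None and installed.strip() == required


found = installed_version()
if matches_required(found):
    print(f"onnx {found} matches the required release")
else:
    print(f"onnx {found or 'is not installed'}; expected {REQUIRED_ONNX}")
```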

The ONNX-TensorRT backend can be installed by running:

python3 setup.py install

ONNX-TensorRT Python Backend Usage

The TensorRT backend for ONNX can be used in Python as follows:

import onnx
import onnx_tensorrt.backend as backend
import numpy as np

model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device='CUDA:1')
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)
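The `device='CUDA:1'` argument follows the ONNX backend convention of a device type optionally followed by `:` and a device index, so the example above targets the GPU with index 1. The sketch below only illustrates that string format; `DeviceSpec` and `parse_device` are hypothetical names and are not part of onnx or onnx_tensorrt.

```python
from typing import NamedTuple


class DeviceSpec(NamedTuple):
    kind: str   # device type, e.g. "CUDA" or "CPU"
    index: int  # device ordinal; 0 when the string omits it


def parse_device(device: str) -> DeviceSpec:
    """Split a 'TYPE' or 'TYPE:INDEX' device string into its parts."""
    kind, sep, index = device.partition(":")
    if not kind:
        raise ValueError(f"missing device type in {device!r}")
    # int() raises ValueError for a malformed or empty index such as "CUDA:".
    return DeviceSpec(kind, int(index) if sep else 0)


print(parse_device("CUDA:1"))
print(parse_device("CUDA"))
```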

C++ Library Usage

The model parser library, libnvonnxparser.so, has its C++ API declared in the header NvOnnxParser.h.



Tests

After installation (or inside the Docker container), ONNX backend tests can be run as follows:

Real model tests only:

python onnx_backend_test.py OnnxBackendRealModelTest

All tests:

python onnx_backend_test.py

You can use the -v flag to make output more verbose.

Pre-trained Models

Pre-trained models in ONNX format can be found at the ONNX Model Zoo.
