TensorRT 8 Is Out. Here Is What You Need to Know.


Whether you are building a machine learning model for research or for a business function, the whole point of creating a model is to perform inference. Currently, TensorRT provides one of the most performant ways to do just that on NVIDIA GPUs, and TensorRT 8 takes it to the next level. In this article you will discover the latest capabilities of TensorRT 8.

Image by Engin Akyurt from Pixabay

When you create and train a machine learning model, it can technically perform inference right away, but chances are it is far from optimized for it. Depending on the framework you use, this difference may be hidden from you for simplicity, or it may be more explicit.

What’s New In TensorRT 8?

TensorRT 8 comes with significant advancements over TensorRT 7. Let’s look at three major ones and what they mean for you.

Quantization Aware Training

Image by PIRO4D from Pixabay

Quantization in machine learning is not a new concept. It is generally good practice, and in many cases some degree of quantization is necessary to achieve significant speed gains and a much lower memory footprint.
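
To make the memory argument concrete, here is a minimal sketch of symmetric int8 quantization in plain Python. It is only an illustration of the general idea, not TensorRT's API and not quantization aware training itself. The function names (quantize_int8, dequantize) and the sample weights are made up for this example, and the single per-tensor scale is an assumption chosen for simplicity.

```python
from array import array


def quantize_int8(values):
    """Symmetric per-tensor quantization (illustrative, hypothetical helper).

    Maps floats to integers in [-127, 127] using one shared scale factor.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = array(
        "b",  # signed 8-bit integers, 1 byte each
        (max(-127, min(127, round(v / scale))) for v in values),
    )
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


# Made-up example weights stored as 32-bit floats (4 bytes each).
weights = array("f", [0.5, -1.27, 0.003, 1.0, -0.25])

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

fp32_bytes = len(weights) * weights.itemsize
int8_bytes = len(q_weights) * q_weights.itemsize

print("int8 values:", list(q_weights))
print("storage in bytes (fp32 vs int8):", fp32_bytes, "vs", int8_bytes)
print("largest round-trip error:",
      max(abs(w - r) for w, r in zip(weights, restored)))
```

The int8 copy needs a quarter of the storage of the float32 original, at the cost of a rounding error of at most half the scale per value. Small weights suffer most (0.003 collapses to 0 here), which is the kind of accuracy loss that makes it worthwhile to account for quantization during training.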

