TensorRT OSS release corresponding to TensorRT 8.2.1.8 GA release.
- Updates since [TensorRT 8.2.0 EA release](https://github.com/NVIDIA/TensorRT/releases/tag/8.2.0-EA).
- Please refer to the [TensorRT 8.2.1 GA release notes](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/tensorrt-8.html#rel-8-2-1) for more information.
- Removed duplicate constant layer checks that caused some performance regressions
- Fixed expand dynamic shape calculations
- Added parser-side checks for `Scatter` layer support
- Sample updates
- Added [TensorFlow Object Detection API converter samples](samples/python/tensorflow_object_detection_api), including Single Shot Detector, Faster R-CNN and Mask R-CNN models
- Multiple enhancements in HuggingFace transformer demos
- Added multi-batch support
- Fixed resultant performance regression in batchsize=1
- Fixed T5 large/T5-3B accuracy issues
- Added [notebooks](demo/HuggingFace/notebooks) for T5 and GPT-2
- Added CPU benchmarking option
- Deprecated `kSTRICT_TYPES` (strict type constraints). Equivalent behaviour is now achieved by setting `PREFER_PRECISION_CONSTRAINTS`, `DIRECT_IO`, and `REJECT_EMPTY_ALGORITHMS`
- Removed `sampleMovieLens`
- Renamed `sampleReformatFreeIO` to `sampleIOFormats`
- Added `idleTime` option for samples to control QPS
- Specified default value for `precisionConstraints`
- Fixed reporting of TensorRT build version in trtexec
- Fixed `combineDescriptions` typo in trtexec/tracer.py
- Fixed usages of `kDIRECT_IO`
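
As a rough illustration of the `kSTRICT_TYPES` deprecation listed above, the sketch below sets the three replacement flags on a builder config. The helper name `apply_strict_types_equivalent` is hypothetical. The sketch assumes that the flags are reachable under these names on whatever flag enum is passed in (for the Python API, presumably `tensorrt.BuilderFlag`) and that the config object exposes a `set_flag` method.

```python
# Flag names taken from the changelog entry; assumed to be members of the
# flag enum that the caller passes in.
REPLACEMENT_FLAGS = (
    "PREFER_PRECISION_CONSTRAINTS",
    "DIRECT_IO",
    "REJECT_EMPTY_ALGORITHMS",
)


def apply_strict_types_equivalent(config, flag_enum):
    """Set the three flags that together stand in for the deprecated strict-types flag.

    `config` is any object with a `set_flag(flag)` method (e.g. a builder config).
    `flag_enum` is the enum holding the flags (e.g. `tensorrt.BuilderFlag`).
    Returns the list of flag members that were set, in order.
    """
    applied = []
    for name in REPLACEMENT_FLAGS:
        # Raises AttributeError if the installed version does not define the flag.
        flag = getattr(flag_enum, name)
        config.set_flag(flag)
        applied.append(flag)
    return applied
```

With TensorRT installed, a call might look like `apply_strict_types_equivalent(builder.create_builder_config(), trt.BuilderFlag)`, where `builder` is an existing `trt.Builder`. Passing the enum in explicitly keeps the helper importable on machines without TensorRT.
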
- Plugin updates
- `EfficientNMS` plugin support extended to TF-TRT and to clang builds
- Sanitize header definitions for BERT fused MHA plugin
- Separate C++ and cu files in `splitPlugin` to avoid PTX generation (required for CUDA enhanced compatibility support)
- Enable C++14 build for plugins
- ONNX tooling updates
- [onnx-graphsurgeon](tools/onnx-graphsurgeon/CHANGELOG.md) upgraded to v0.3.14
- [Polygraphy](tools/Polygraphy/CHANGELOG.md) upgraded to v0.33.2
- [pytorch-quantization](tools/pytorch-quantization) toolkit upgraded to v2.1.2
- Build and container fixes
- Add `SM86` target to default `GPU_ARCHS` for platforms with cuda-11.1+
- Remove deprecated `SM_35` and add `SM_60` to default `GPU_ARCHS`
- Skip CUB builds for cuda 11.0+ [#1455](https://github.com/NVIDIA/TensorRT/pull/1455)
- Fixed cuda-10.2 container build failures in Ubuntu 20.04
- Add native ARM server build container
- Install devtoolset-8 for updated g++ version in CentOS7
- Added a note on supporting C++14 builds for CentOS7
- Fixed docker build for large UIDs [#1373](https://github.com/NVIDIA/TensorRT/issues/1373)
- Updated README instructions for Jetpack builds
- Demo enhancements
- Updated Tacotron2 instructions and added CPU benchmarking
- Fixed issues in demoBERT python notebook
- Documentation updates
- Updated Python documentation for `add_reduce`, `add_top_k`, and `ISoftMaxLayer`
- Renamed default GitHub branch to `main` and updated hyperlinks
> NOTE: On CentOS7, the default g++ version does not support C++14. For native builds (not using the CentOS7 build container), first install devtoolset-8 to obtain the updated g++ toolchain as follows: