2021-11-06 v0.7.0.0
Because many changes have accumulated since the last release, the version number jumps straight to v0.7.0.0.
Compared to the previous release (v0.6.2.0), this release brings improved performance for SCRFD-based detectors.
Below is a performance comparison on an Nvidia RTX 2080 Super GPU for the `scrfd_10g_gnkps` detector paired with
the `glintr100` recognition model (all tests use `src/api_trt/test_images/Stallone.jpg`, 1 face per image):
| Num workers | Client threads | FPS v0.6.2.0 | FPS v0.7.0.0 | Speed-up |
|---|---|---|---|---|
| 1 | 1 | 56 | 103 | 83.9% |
| 1 | 30 | 72 | 128 | 77.7% |
| 6 | 30 | 145 | 179 | 23.4% |
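
The Speed-up column appears to be the relative FPS gain of v0.7.0.0 over v0.6.2.0. A minimal sketch of that calculation, assuming this definition (the function name is illustrative and not part of the project); the results agree with the table to within rounding of the last digit:

```python
# FPS figures copied from the table above:
# (num workers, client threads, FPS v0.6.2.0, FPS v0.7.0.0)
BENCHMARKS = [
    (1, 1, 56, 103),
    (1, 30, 72, 128),
    (6, 30, 145, 179),
]


def speedup_percent(fps_old: float, fps_new: float) -> float:
    """Relative throughput gain of the new release over the old one, in percent."""
    return (fps_new / fps_old - 1.0) * 100.0


if __name__ == "__main__":
    for workers, threads, old, new in BENCHMARKS:
        gain = speedup_percent(old, new)
        print(f"{workers} workers, {threads} client threads: {gain:.1f}%")
```
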
Additions:
- Added experimental support for msgpack serializer: reduces network traffic for embeddings by about 2x.
- Output names are no longer required for detection models when building a TRT engine: the correct output order is now
extracted from the ONNX models.
- Detection models can now be exported to a TRT engine with batch size > 1. The inference code doesn't support it yet,
but such engines can now be used in Triton Inference Server without issues.
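
To illustrate why msgpack shrinks embedding payloads, here is a rough sketch that compares a JSON-encoded float list with the same list in msgpack's binary layout. To stay dependency-free it hand-rolls only the two msgpack types needed (`array 16` and `float 64`); real code would use the `msgpack` package instead. The 512-element random vector and both helper functions are assumptions made for illustration, not the project's serializer code.

```python
import json
import random
import struct


def pack_float_array(values):
    """Encode a flat list of floats as a msgpack 'array 16' of 'float 64' items."""
    if len(values) > 0xFFFF:
        raise ValueError("array 16 holds at most 65535 items")
    # 0xdc marks 'array 16', followed by a big-endian uint16 item count.
    header = b"\xdc" + struct.pack(">H", len(values))
    # 0xcb marks 'float 64', followed by a big-endian IEEE 754 double.
    body = b"".join(b"\xcb" + struct.pack(">d", v) for v in values)
    return header + body


def unpack_float_array(data):
    """Decode the output of pack_float_array back into a list of floats."""
    if data[0] != 0xDC:
        raise ValueError("expected an 'array 16' header")
    (count,) = struct.unpack(">H", data[1:3])
    values, offset = [], 3
    for _ in range(count):
        if data[offset] != 0xCB:
            raise ValueError("expected a 'float 64' item")
        (value,) = struct.unpack(">d", data[offset + 1:offset + 9])
        values.append(value)
        offset += 9
    return values


# Hypothetical 512-dimensional embedding filled with random values.
rng = random.Random(0)
embedding = [rng.uniform(-1.0, 1.0) for _ in range(512)]

json_bytes = json.dumps(embedding).encode("utf-8")
msgpack_bytes = pack_float_array(embedding)

if __name__ == "__main__":
    print(f"JSON:    {len(json_bytes)} bytes")
    print(f"msgpack: {len(msgpack_bytes)} bytes")
    print(f"ratio:   {len(json_bytes) / len(msgpack_bytes):.2f}x")
```

Each value costs 9 bytes in this layout, while its JSON text form typically needs around 20 characters, which is roughly in line with the ~2x reduction noted above.
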
Model Zoo:
- Added support for WebFace600k based recognition models from the InsightFace repo: `w600k_r50` and `w600k_mbf`
- Added md5 check for models to allow automatic re-download if models have changed.
- All `scrfd` based models now support the batch dimension.
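
The md5 check for models could look roughly like the sketch below: hash the local file and trigger a re-download when the file is missing or its checksum differs from the expected one. The function names and the idea of passing the expected checksum as a string are assumptions for illustration; the project's actual implementation may differ.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, expected_md5):
    """True if the model file is missing or its md5 differs from the expected value."""
    path = Path(path)
    if not path.is_file():
        return True
    return file_md5(path) != expected_md5.lower()
```

A caller would run `needs_download(model_path, expected_md5)` before loading a model and fetch the file again whenever it returns `True`.
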
Improvements:
- 1.5x-2x faster SCRFD re-implementation with Numba: 4.5 ms vs 10 ms for the `lumia.jpg` example with
`scrfd_10g_gnkps` and threshold = 0.3 (432 faces detected).
- Moved the image normalization step to GPU with the help of CuPy (4x lower data transfer from CPU to GPU, about 6%
inference speedup, and some computations offloaded from CPU).
- 4.5x faster `face_align.norm_crop` implementation with the help of Numba and removal of unused computations
(cropping 432 faces from the `lumia.jpg` example took 45 ms vs 205 ms).
- Face crops are now extracted only when needed, i.e. when face data or embeddings are requested, improving
detection-only performance.
- Added Numba njit cache to reduce subsequent start-up time.
- Logged timings are now rounded to ms for better readability.
- Minor refactoring
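
The 4x lower CPU-to-GPU transfer mentioned in the CuPy item is consistent with uploading the raw 8-bit image and normalizing it on the GPU, instead of uploading an already normalized float32 tensor. A small arithmetic sketch under that assumption (the 640x640 RGB input size is hypothetical and depends on the detector configuration):

```python
# Assumed detector input: 640x640 RGB (hypothetical, configuration dependent).
HEIGHT, WIDTH, CHANNELS = 640, 640, 3

BYTES_UINT8 = 1    # raw image pixels, one byte per channel value
BYTES_FLOAT32 = 4  # normalized tensor, four bytes per channel value


def transfer_bytes(height, width, channels, bytes_per_value):
    """Bytes copied from host to device for one image of the given shape."""
    return height * width * channels * bytes_per_value


raw_upload = transfer_bytes(HEIGHT, WIDTH, CHANNELS, BYTES_UINT8)
normalized_upload = transfer_bytes(HEIGHT, WIDTH, CHANNELS, BYTES_FLOAT32)

if __name__ == "__main__":
    print(f"uint8 upload:   {raw_upload} bytes")
    print(f"float32 upload: {normalized_upload} bytes")
    print(f"reduction:      {normalized_upload // raw_upload}x")
```

The ratio is independent of the image size, since it only reflects the per-value storage width.
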
Fixes:
- The gender/age estimation model is currently not supported, so it is now excluded from the model preparation step.