INT8 vs FP32 Comparison on Select Networks and Platforms

The table below shows the throughput speed-up factor gained by switching from the FP32 representation of an OpenVINO™ supported model to its INT8 representation.

Values are throughput speed-up, FP16-INT8 vs FP32.

| OpenVINO benchmark model name | Dataset | Intel® Core™ i7-8700T | Intel® Xeon® Gold 5218T | Intel® Xeon® Platinum 8270 | Intel® Core™ i7-1065G7 | Intel® Core™ i5-1145G7E |
|---|---|---|---|---|---|---|
| bert-large-uncased-whole-word-masking-squad-0001 | SQuAD | 1.6 | 2.5 | 2.0 | N/A | 2.8 |
| brain-tumor-segmentation-0001-MXNET | BraTS | 1.5 | 1.7 | 1.6 | 1.9 | 1.8 |
| deeplabv3-TF | VOC 2012 Segmentation | 1.4 | 2.4 | 2.6 | 2.8 | 2.9 |
| densenet-121-TF | ImageNet | 1.6 | 3.2 | 3.2 | 3.0 | 3.2 |
| facenet-20180408-102900-TF | LFW | 2.0 | 3.6 | 3.5 | 3.2 | 3.5 |
| faster_rcnn_resnet50_coco-TF | MS COCO | 1.7 | 3.5 | 3.4 | 3.6 | 3.6 |
| googlenet-v1-TF | ImageNet | 1.8 | 3.6 | 3.7 | 3.5 | 3.6 |
| inception-v3-TF | ImageNet | 1.8 | 3.8 | 4.0 | 3.7 | 3.7 |
| mobilenet-ssd-CF | VOC2012 | 1.5 | 3.0 | 3.3 | 3.1 | 3.3 |
| mobilenet-v1-1.0-224-TF | ImageNet | 1.5 | 3.2 | 3.9 | 2.9 | 3.2 |
| mobilenet-v2-1.0-224-TF | ImageNet | 1.3 | 2.7 | 3.8 | 2.2 | 2.5 |
| mobilenet-v2-pytorch | ImageNet | 1.4 | 2.6 | 3.6 | 2.3 | 2.4 |
| resnet-18-pytorch | ImageNet | 1.9 | 3.7 | 3.8 | 3.6 | 3.6 |
| resnet-50-pytorch | ImageNet | 1.8 | 3.6 | 3.8 | 3.5 | 3.6 |
| resnet-50-TF | ImageNet | 1.8 | 3.5 | 3.8 | 3.4 | 4.0 |
| squeezenet1.1-CF | ImageNet | 1.6 | 2.9 | 3.2 | 3.0 | 3.2 |
| ssd_mobilenet_v1_coco-tf | VOC2012 | 1.6 | 3.0 | 3.4 | 3.1 | 3.3 |
| ssd300-CF | MS COCO | 1.8 | 3.7 | 3.6 | 3.8 | 4.0 |
| ssdlite_mobilenet_v2-TF | MS COCO | 1.4 | 2.3 | 3.1 | 2.4 | 2.6 |
| yolo_v3-TF | MS COCO | 1.8 | 3.8 | 3.9 | 3.7 | 3.8 |
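
As a rough illustration of how a speed-up factor like the ones above can be read, the sketch below divides an INT8 throughput by an FP32 throughput. This page does not spell out the exact measurement procedure, so the ratio definition, the function name `speedup`, and the throughput numbers are all assumptions chosen for illustration, not measured data.

```python
def speedup(fp32_fps: float, int8_fps: float) -> float:
    """Return the throughput speed-up factor of INT8 over FP32.

    Assumes both throughputs are measured in the same unit
    (for example, frames per second) on the same platform.
    """
    if fp32_fps <= 0:
        raise ValueError("FP32 throughput must be positive")
    return int8_fps / fp32_fps


# Hypothetical throughput values, not taken from any benchmark run.
fp32_fps = 100.0
int8_fps = 360.0

factor = speedup(fp32_fps, int8_fps)
print(f"INT8 is {factor:.1f}x faster than FP32")
```

Under this reading, a factor of 1.0 would mean no gain from INT8, and values below 1.0 would mean INT8 is slower on that platform.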

The following table shows the absolute accuracy drop, calculated as the difference in accuracy between the FP32 representation of a model and its INT8 representation.

Values are absolute accuracy drop, %.

| OpenVINO benchmark model name | Dataset | Metric name | Intel® Core™ i9-10920X CPU @ 3.50GHz (VNNI) | Intel® Core™ i9-9820X CPU @ 3.30GHz (AVX512) | Intel® Core™ i7-8700 CPU @ 3.20GHz (AVX2) |
|---|---|---|---|---|---|
| bert-large-uncased-whole-word-masking-squad-0001 | SQuAD | F1 | 0.65 | 0.57 | 0.83 |
| brain-tumor-segmentation-0001-MXNET | BraTS | Dice-index@Mean@Overall Tumor | 0.08 | 0.08 | 0.9 |
| deeplabv3-TF | VOC 2012 Segmentation | mean_iou | 0.73 | 0.73 | 1.11 |
| densenet-121-TF | ImageNet | acc@top-1 | 0.74 | 0.74 | 0.76 |
| facenet-20180408-102900-TF | LFW | pairwise_accuracy_subsets | 0.02 | 0.02 | 0.02 |
| faster_rcnn_resnet50_coco-TF | MS COCO | coco_precision | 0.21 | 0.21 | 0.20 |
| googlenet-v1-TF | ImageNet | acc@top-1 | 0.03 | 0.03 | 0.01 |
| inception-v3-TF | ImageNet | acc@top-1 | 0.03 | 0.01 | 0.01 |
| mobilenet-ssd-CF | VOC2012 | mAP | 0.35 | 0.34 | 0.34 |
| mobilenet-v1-1.0-224-TF | ImageNet | acc@top-1 | 0.27 | 0.20 | 0.20 |
| mobilenet-v2-1.0-224-TF | ImageNet | acc@top-1 | 0.45 | 0.94 | 0.94 |
| mobilenet-v2-pytorch | ImageNet | acc@top-1 | 0.35 | 0.63 | 0.63 |
| resnet-18-pytorch | ImageNet | acc@top-1 | 0.26 | 0.25 | 0.25 |
| resnet-50-pytorch | ImageNet | acc@top-1 | 0.18 | 0.19 | 0.19 |
| resnet-50-TF | ImageNet | acc@top-1 | 0.15 | 0.15 | 0.10 |
| squeezenet1.1-CF | ImageNet | acc@top-1 | 0.66 | 0.66 | 0.64 |
| ssd_mobilenet_v1_coco-tf | VOC2012 | COCO mAP | 0.24 | 0.24 | 3.07 |
| ssd300-CF | MS COCO | COCO mAP | 0.06 | 0.06 | 0.05 |
| ssdlite_mobilenet_v2-TF | MS COCO | COCO mAP | 0.14 | 0.14 | 0.47 |
| yolo_v3-TF | MS COCO | COCO mAP | 0.20 | 0.20 | 0.36 |
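
The absolute accuracy drop described above is a plain difference in percentage points, which the following minimal sketch makes concrete. The function name `absolute_accuracy_drop` and the two accuracy values are hypothetical; the sign convention (FP32 minus INT8) is taken from the description of the table, so a positive result means the INT8 model is less accurate.

```python
def absolute_accuracy_drop(fp32_metric: float, int8_metric: float) -> float:
    """Return the FP32-minus-INT8 difference of a metric, in percentage points.

    Both inputs are assumed to be the same metric (for example, acc@top-1)
    expressed in percent and evaluated on the same dataset.
    """
    return fp32_metric - int8_metric


# Hypothetical top-1 accuracies in percent, not taken from the table.
fp32_top1 = 76.00
int8_top1 = 75.25

drop = absolute_accuracy_drop(fp32_top1, int8_top1)
print(f"Absolute accuracy drop: {drop:.2f}%")
```

Note that this is an absolute difference, not a relative one: a drop of 0.75 here means 0.75 percentage points, regardless of the FP32 baseline.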

For more complete information about performance and benchmark results, visit www.intel.com/benchmarks and see the Optimization Notice and Legal Information.