This tutorial describes how to run post-training quantization on the MobileNet v2 model from the PyTorch framework. It covers all steps from model preparation and validation of the full-precision model to quantization and benchmarking of the resulting performance gain. All steps below rely on the tools and sample configuration files distributed with the Intel® Distribution of OpenVINO™ toolkit.
In the instructions below,
<INSTALL_DIR> is the directory where the Intel® Distribution of OpenVINO™ toolkit is installed, and
<POT_DIR> is the Post-Training Optimization Tool directory.
Sample configuration files are located in the
- Override the paths to the model, dataset, and annotations inside the configuration file.
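For reference, an Accuracy Checker configuration with these paths overridden might look like the following sketch. All paths, the model name, and the dataset name here are placeholders, not the actual values shipped with the sample configuration files:

```yaml
models:
  - name: mobilenet-v2          # placeholder model name
    launchers:
      - framework: dlsdk
        model: <path to model>/mobilenet-v2.xml      # overridden model path
        weights: <path to model>/mobilenet-v2.bin    # overridden weights path
        adapter: classification
    datasets:
      - name: imagenet           # placeholder dataset name
        data_source: <path to dataset images>        # overridden dataset path
        annotation: <path to annotations>            # overridden annotations path
```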
Evaluate the accuracy of the full-precision model:
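A typical invocation uses the `accuracy_check` command-line tool from the Accuracy Checker package; the configuration file name below is an assumption, not the actual sample file name:

```shell
# Evaluate the FP32 model against the dataset described in the YAML config.
# The YAML file name is a placeholder; use the sample config for MobileNet v2.
accuracy_check -c mobilenet_v2_pytorch.yml
```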
The expected result is 71.82% accuracy on the top-1 metric.
- Override the paths to the model and the AccuracyChecker YAML configuration file.
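A POT configuration referencing these paths might look like the following sketch. All paths are placeholders, and the algorithm parameters shown are illustrative defaults rather than the values from the distributed sample config:

```json
{
  "model": {
    "model_name": "mobilenet-v2",
    "model": "<path to model>/mobilenet-v2.xml",
    "weights": "<path to model>/mobilenet-v2.bin"
  },
  "engine": {
    "config": "<path to AccuracyChecker config>/mobilenet_v2_pytorch.yml"
  },
  "compression": {
    "algorithms": [
      {
        "name": "DefaultQuantization",
        "params": {
          "preset": "performance",
          "stat_subset_size": 300
        }
      }
    ]
  }
}
```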
Run the POT to get the quantized model. The resulting model will be placed in a subfolder under the
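A minimal invocation, assuming the JSON configuration file name used above (a placeholder, not the actual sample file name):

```shell
# Quantize the model according to the JSON config and
# evaluate the quantized model (-e) with the Accuracy Checker engine.
pot -c mobilenet_v2_pytorch_int8.json -e
```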
The expected result is 71.42% accuracy on the top-1 metric on a VNNI-based CPU. Note: results can differ on CPUs with different instruction sets.
To observe the performance speedup after quantization, run benchmark_app on the original and quantized models:
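A sketch of the two benchmarking runs; both model paths are placeholders for the FP32 IR and the quantized IR produced by the POT:

```shell
# Measure throughput of the original FP32 model.
benchmark_app -m <path to original model>/mobilenet-v2.xml

# Measure throughput of the INT8 model produced by the POT.
benchmark_app -m <path to quantized model>/mobilenet-v2.xml
```

Comparing the reported throughput (FPS) of the two runs shows the speedup gained from quantization.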