This tutorial describes an example of running post-training quantization for the MobileNet v2 model from the PyTorch framework, particularly with the DefaultQuantization algorithm. The example covers the following steps:

- Downloading the MobileNet v2 PyTorch model and converting it to the OpenVINO™ IR format
- Benchmarking the full-precision model
- Preparing the ImageNet validation dataset
- Validating the accuracy of the full-precision model
- Quantizing the model with the DefaultQuantization algorithm and validating the accuracy of the quantized model
- Benchmarking the quantized model
All the steps are based on the tools and sample configuration files distributed with the Intel® Distribution of OpenVINO™ toolkit.
The example has been verified on Ubuntu 18.04 with Python 3.6 installed.
If you encounter issues while running the example, refer to the POT Frequently Asked Questions for help.
In the instructions below, <INSTALL_DIR> is the directory where the OpenVINO™ toolkit is installed, <POT_DIR> is the Post-training Optimization Toolkit directory (<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit), and <EXAMPLE_DIR> is the working directory where the example is executed.
In <EXAMPLE_DIR>, download the MobileNet v2 PyTorch model using the Model Downloader tool from the Open Model Zoo repository:
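A typical invocation is sketched below. The exact location of downloader.py depends on your OpenVINO™ release; it ships with the Open Model Zoo tools, and the path shown here is an assumption.

```sh
# Run from <EXAMPLE_DIR>; the downloader path may differ in your installation.
python3 <INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
    --name mobilenet-v2-pytorch \
    --output_dir .
```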
After that, the original full-precision model is located in <EXAMPLE_DIR>/public/mobilenet-v2-pytorch/.
Convert the model to the OpenVINO™ Intermediate Representation (IR) format using the Model Converter tool:
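A sketch of the command, assuming converter.py sits next to downloader.py in the Open Model Zoo tools (the path may differ between releases):

```sh
# Run from <EXAMPLE_DIR>; converts the downloaded PyTorch model to ONNX and then to IR.
# Requires the Model Optimizer from the OpenVINO installation to be available.
python3 <INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader/converter.py \
    --name mobilenet-v2-pytorch \
    --download_dir . \
    --output_dir .
```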
After that, the full-precision model in the IR format is located in <EXAMPLE_DIR>/public/mobilenet-v2-pytorch/FP32/.
For more information about the Model Optimizer, refer to its documentation.
Check the performance of the full-precision model in the IR format using the Deep Learning Benchmark tool:
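A sketch of the command, assuming the Python benchmark_app.py that ships with the toolkit (the path and entry point may differ between releases):

```sh
# Run from <EXAMPLE_DIR>; measures latency and throughput of the FP32 IR on CPU.
python3 <INSTALL_DIR>/deployment_tools/tools/benchmark_tool/benchmark_app.py \
    -m ./public/mobilenet-v2-pytorch/FP32/mobilenet-v2-pytorch.xml
```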
Note that the results might differ depending on the characteristics of your machine. On a machine with an Intel® Core™ i9-10920X CPU @ 3.50GHz, the output is similar to the following:
For more information about the Benchmark Tool, refer to its documentation.
To perform accuracy validation as well as quantization of a model, a dataset must be prepared. This example uses a real dataset, ImageNet.
To download images:
1. Go to the ImageNet website and click the Signup button in the right upper corner, provide your data, and wait for a confirmation email.
2. Log in and go to the Download tab.
3. Select Download Original Images.
4. You will be redirected to the Terms of Access page. If you agree to the Terms, continue by clicking Agree and Sign.
5. Download the validation images from the Download as one tar file section.
6. Unpack the downloaded archive into <EXAMPLE_DIR>/ImageNet/.
Note that the registration process might take quite a long time.
Note that the dataset contains 50,000 images and takes around 6.5 GB of disk space.
To download the annotation file:
1. Download the annotation archive.
2. Unpack val.txt from the archive into <EXAMPLE_DIR>/ImageNet/.
After that, the <EXAMPLE_DIR>/ImageNet/ dataset folder should contain image files such as ILSVRC2012_val_00000001.JPEG and the val.txt annotation file.
Create a new file in <EXAMPLE_DIR> and name it mobilenet_v2_pytorch.yaml. This is the Accuracy Checker configuration file. Fill mobilenet_v2_pytorch.yaml with the Accuracy Checker configuration.
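A minimal sketch of such a configuration is shown below. The overall structure (models/launchers/datasets) follows the Accuracy Checker format; the preprocessing values, reader, and metric settings are assumptions typical for MobileNet v2 and should be checked against the Open Model Zoo accuracy-check configuration for mobilenet-v2-pytorch.

```yaml
models:
  - name: mobilenet_v2_pytorch
    launchers:
      - framework: dlsdk          # run the IR through the OpenVINO inference engine
        device: CPU
        adapter: classification
    datasets:
      - name: classification_dataset
        data_source: ./ImageNet                  # folder with the validation images
        annotation_conversion:
          converter: imagenet
          annotation_file: ./ImageNet/val.txt    # annotation prepared in the previous step
        reader: pillow_imread
        preprocessing:                           # assumed MobileNet v2 preprocessing
          - type: resize
            size: 256
            aspect_ratio_scale: greater
            use_pillow: True
          - type: crop
            size: 224
            use_pillow: True
          - type: normalization
            mean: 123.675, 116.28, 103.53
            std: 58.624, 57.12, 57.375
        metrics:
          - name: accuracy@top1
            type: accuracy
            top_k: 1
```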
Here data_source: ./ImageNet is the dataset folder and annotation_file: ./ImageNet/val.txt is the annotation file prepared in the previous step. For more information about the Accuracy Checker configuration file, refer to the Accuracy Checker Tool documentation.
Evaluate the accuracy of the full-precision model in the IR format by executing the following command in <EXAMPLE_DIR>:
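A sketch of the command, assuming the accuracy_check entry point installed with the toolkit is available on your PATH:

```sh
# Run from <EXAMPLE_DIR>; uses the configuration file created above.
accuracy_check -c mobilenet_v2_pytorch.yaml
```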
The result should be about 71.81% for the top-1 accuracy metric on a VNNI-based CPU.
Note that the results might differ on CPUs with different instruction sets.
Create a new file in <EXAMPLE_DIR> and name it mobilenet_v2_pytorch_int8.json. This is the POT configuration file. Fill mobilenet_v2_pytorch_int8.json with the quantization configuration.
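A minimal sketch of such a configuration is shown below. The model_name matches the results folder mentioned later in this example; the preset and stat_subset_size values are illustrative assumptions (JSON does not allow comments, so the hedging lives here).

```json
{
    "model": {
        "model_name": "mobilenetv2",
        "model": "./public/mobilenet-v2-pytorch/FP32/mobilenet-v2-pytorch.xml",
        "weights": "./public/mobilenet-v2-pytorch/FP32/mobilenet-v2-pytorch.bin"
    },
    "engine": {
        "config": "./mobilenet_v2_pytorch.yaml"
    },
    "compression": {
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300
                }
            }
        ]
    }
}
```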
: where "model": "./public/mobilenet-v2-pytorch/FP32/mobilenet-v2-pytorch.xml"
and "weights": "./public/mobilenet-v2-pytorch/FP32/mobilenet-v2-pytorch.bin"
specify the full-precision model in the IR format, "config": "./mobilenet_v2_pytorch.yaml"
is the Accuracy Checker configuration file, and "name": "DefaultQuantization"
is the algorithm name.
Perform model quantization by executing the following command in <EXAMPLE_DIR>:
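A sketch of the command, assuming the pot entry point installed with the toolkit is available on your PATH; the -e flag requests accuracy evaluation of the quantized model right after quantization:

```sh
# Run from <EXAMPLE_DIR>; quantizes the model and then evaluates its accuracy.
pot -c mobilenet_v2_pytorch_int8.json -e
```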
The quantized model is placed into a subfolder whose name contains the current date and time, under the ./results/mobilenetv2_DefaultQuantization/ directory. The accuracy validation of the quantized model is performed right after the quantization. The result should be about 71.556% for the top-1 accuracy metric on a VNNI-based CPU.
Note that the results might differ on CPUs with different instruction sets.
Check the performance of the quantized model using the Deep Learning Benchmark tool:
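As before, a sketch of the command, assuming the Python benchmark_app.py shipped with the toolkit:

```sh
# Run from <EXAMPLE_DIR>; <INT8_MODEL> is the .xml file of the quantized model.
python3 <INSTALL_DIR>/deployment_tools/tools/benchmark_tool/benchmark_app.py -m <INT8_MODEL>
```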
where <INT8_MODEL> is the path to the quantized model.
Note that the results might differ depending on the characteristics of your machine. On a machine with an Intel® Core™ i9-10920X CPU @ 3.50GHz, the output is similar to the following: