# Post-Training Optimization Toolkit

## Introduction

Post-training Optimization Toolkit (POT) is designed to accelerate the inference of deep learning models by applying special post-training methods, such as post-training quantization, without model retraining or fine-tuning. Therefore, the tool does not require a training dataset or a training pipeline. To apply post-training algorithms from the POT, you need:

• A full precision model, FP32 or FP16, converted into the OpenVINO™ Intermediate Representation (IR) format
• A representative calibration dataset covering a use case scenario, for example, 300 images

The tool aims to fully automate the model transformation process without changing the model structure. The POT is available only in the Intel® distribution of OpenVINO™ toolkit and is not open-sourced. For details about the low-precision flow in OpenVINO™, see the Low Precision Optimization Guide.

Post-training Optimization Toolkit includes a standalone command-line tool and a Python* API that provide the following key features:

• Two post-training 8-bit quantization algorithms: fast DefaultQuantization and precise AccuracyAwareQuantization.
• Global optimization of post-training quantization parameters using the Tree-Structured Parzen Estimator.
• Symmetric and asymmetric quantization schemes. For details, see the Quantization section.
• Compression for different hardware targets such as CPU and GPU.
• Per-channel quantization for Convolutional and Fully-Connected layers.
• Multiple domains: Computer Vision, Recommendation Systems.
• Ability to implement a custom optimization pipeline via the supported API.
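To make the symmetric and asymmetric schemes listed above concrete, here is a simplified per-tensor 8-bit quantization sketch in plain Python. It illustrates the general technique only and is not POT's internal implementation:

```python
# Simplified per-tensor 8-bit quantization (illustrative only, not POT code).

def quantize_symmetric(values, num_bits=8):
    """Symmetric scheme: zero-point is fixed at 0, integer range is [-127, 127]."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

def quantize_asymmetric(values, num_bits=8):
    """Asymmetric scheme: a zero-point shifts the range to cover [min, max]."""
    qmax = 2 ** num_bits - 1                  # 255 for 8 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)
    return [round(v / scale) + zero_point for v in values], scale, zero_point

weights = [-0.4, 0.0, 0.3, 1.0]               # toy weight tensor
q_sym, s_sym = quantize_symmetric(weights)    # [-51, 0, 38, 127]
q_asym, s_asym, zp = quantize_asymmetric(weights)  # [0, 73, 128, 255], zp = 73
```

The asymmetric scheme spends its full integer range on the observed [min, max] interval, which can help for tensors with skewed value distributions, such as post-ReLU activations.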

For benchmarking results collected for the models optimized with the POT, see INT8 vs FP32 Comparison on Select Networks and Platforms.

Further documentation assumes that you are familiar with basic deep learning concepts, such as model inference, dataset preparation, and model optimization, as well as with the OpenVINO™ toolkit and its components, such as Model Optimizer and Accuracy Checker.

## Install and Set Up Post-Training Optimization Tool

In the instructions below, <INSTALL_DIR> is the directory where the Intel® distribution of OpenVINO™ toolkit is installed. POT is distributed as a part of the OpenVINO™ release package, but to use it as a command-line tool, you need to install it and its dependencies, namely Model Optimizer and Accuracy Checker, separately. POT source files are available in the <INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit directory after the OpenVINO™ installation. It is recommended to create a separate Python* environment before installing OpenVINO™ and its components. To set up the POT in your environment, follow the steps below:

1. Before using the POT, set up the Model Optimizer and Accuracy Checker components:
   • To install the Model Optimizer:
     1. Go to `<INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites`.
     2. Run the script to configure the Model Optimizer:
        ```sh
        sudo ./install_prerequisites.sh
        ```
   • To install the Accuracy Checker:
     1. Go to `<INSTALL_DIR>/deployment_tools/open_model_zoo/tools/accuracy_checker`.
     2. Run the `setup.py` script:
        ```sh
        python3 setup.py install
        ```
2. Set up the POT:
   1. Go to `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit`.
   2. Run the `setup.py` script:
      ```sh
      python3 setup.py install
      ```

Now the POT is available in the command line via the `pot` alias. To verify the installation, run `pot -h`.

## Use Post-Training Optimization Command-Line Tool

Before running the POT, convert your pretrained model into the OpenVINO™ IR format with the Model Optimizer. In addition, it is highly recommended to use the Accuracy Checker to make sure that the model can be successfully inferred and achieves accuracy similar to that of the reference model from the original framework.
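As an illustration of the recommended accuracy check, the sketch below shows a minimal Accuracy Checker configuration for a classification model. The model name, file names, and dataset entry are hypothetical, and the layout assumes the standard Accuracy Checker YAML schema:

```yaml
# Hypothetical Accuracy Checker configuration sketch (names are placeholders).
models:
  - name: sample_classifier            # hypothetical model name
    launchers:
      - framework: dlsdk               # OpenVINO (Inference Engine) launcher
        model: sample_classifier.xml   # IR produced by the Model Optimizer
        weights: sample_classifier.bin
        adapter: classification
    datasets:
      - name: imagenet_subset          # hypothetical dataset entry
        data_source: ./calibration_images
        annotation: ./annotation.pickle
        metrics:
          - type: accuracy
            top_k: 1
```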

To run the command-line Post-training Optimization Tool:

1. Activate the Python environment in the command-line shell where the POT and the Accuracy Checker were installed.
2. Set up the OpenVINO™ environment in the command-line shell with the following script:
   ```sh
   source <INSTALL_DIR>/bin/setupvars.sh
   ```
3. Prepare a configuration file for the POT using the examples in the configs folder. To simplify this step, use the Accuracy Checker configuration file for the floating-point model and refer to it when necessary. See Post-Training Optimization Best Practices.
4. Launch the command-line tool with the configuration file:
   ```sh
   pot -c <path_to_config_file>
   ```
   For all available usage options, use the -h, --help arguments or refer to the Command-Line Arguments section below.

5. By default, the results are saved in a separate output subfolder inside the results folder, which is created in the directory where the tool is run. Use the -e option to evaluate the accuracy directly from the tool.
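The configuration file mentioned in step 3 can be sketched as follows. The model and file names are hypothetical, and the structure mirrors the DefaultQuantization examples shipped in the configs folder:

```json
{
    "model": {
        "model_name": "sample_classifier",
        "model": "./sample_classifier.xml",
        "weights": "./sample_classifier.bin"
    },
    "engine": {
        "config": "./accuracy_checker_config.yml"
    },
    "compression": {
        "target_device": "CPU",
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300
                }
            }
        ]
    }
}
```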

See the How to Run Examples tutorial about how to run a particular example of 8-bit quantization with the POT.

### Command-Line Arguments

The following command-line options are available to run the tool:

| Argument | Description |
| --- | --- |
| `-h`, `--help` | Optional. Show help message and exit. |
| `-c CONFIG`, `--config CONFIG` | Path to a config file with task- or model-specific parameters. |
| `-e`, `--evaluate` | Optional. Evaluate the model on the whole dataset after optimization. |
| `--output-dir OUTPUT_DIR` | Optional. A directory where results are saved. Default: `./results`. |
| `-sm`, `--save-model` | Optional. Save the original full-precision model. |
| `-d`, `--direct-dump` | Optional. Save results directly to the output directory without additional subfolders. |
| `--log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG}` | Optional. Log level to print. Default: INFO. |
| `--progress-bar` | Optional. Disable command-line logging and enable the progress bar. |
| `--stream-output` | Optional. Switch model quantization progress display to a multiline mode. Use with third-party components. |
| `--keep-uncompressed-weights` | Optional. Keep Convolution, Deconvolution and FullyConnected weights uncompressed. Use with third-party components. |