After you have used the Model Optimizer to create an Intermediate Representation (IR), use the Inference Engine to infer input data.
The Inference Engine is a C++ library with a set of C++ classes to infer input data (images) and get a result. The C++ library provides an API to read the Intermediate Representation, set the input and output formats, and execute the model on devices.
To learn about how to use the Inference Engine API for your application, see the Integrating Inference Engine in Your Application documentation.
A complete API Reference is included in the full offline package documentation: open `index.html` from the documentation folder in `<INSTALL_DIR>` in an Internet browser, where `<INSTALL_DIR>` is the OpenVINO toolkit installation directory.
The Inference Engine uses a plugin architecture. An Inference Engine plugin is a software component that contains a complete implementation for inference on a particular Intel® hardware device: CPU, GPU, VPU, FPGA, and so on. Each plugin implements the unified API and also provides additional hardware-specific APIs.
Your application must link to the core Inference Engine library: `libinference_engine.so` on Linux or `inference_engine.dll` on Windows. The required C++ header files are located in the `include` directory. This library contains the classes to read the Intermediate Representation (`InferenceEngine::CNNNetReader`), manipulate network information (`InferenceEngine::CNNNetwork`), and execute the network and pass inputs and outputs (`InferenceEngine::ExecutableNetwork`, `InferenceEngine::InferRequest`).
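As a minimal sketch (assuming the library and headers are already on your build paths), the following shows an application pulling in the main Inference Engine header and listing the devices for which plugins are available on the machine:

```cpp
#include <inference_engine.hpp>  // main Inference Engine header

#include <iostream>
#include <string>

int main() {
    // The Core object discovers and manages the device plugins internally.
    InferenceEngine::Core core;

    // Print the names of the devices a plugin can be loaded for (e.g. CPU, GPU).
    for (const std::string& device : core.GetAvailableDevices()) {
        std::cout << device << std::endl;
    }
    return 0;
}
```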
For each supported target device, the Inference Engine provides a plugin, a DLL/shared library that contains a complete implementation for inference on that particular device. The following plugins are available:
| Plugin | Device types |
|--------|--------------|
| CPU | Intel® Xeon® with Intel® AVX2 and AVX512, Intel® Core™ Processors with Intel® AVX2, Intel® Atom® Processors with Intel® SSE |
| GPU | Intel® Processor Graphics, including Intel® HD Graphics and Intel® Iris® Graphics |
| FPGA | Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA, Intel® Vision Accelerator Design with an Intel® Arria 10 FPGA (Speed Grade 1), Intel® Vision Accelerator Design with an Intel® Arria 10 FPGA (Speed Grade 2) |
| MYRIAD | Intel® Movidius™ Neural Compute Stick powered by the Intel® Movidius™ Myriad™ 2, Intel® Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X |
| GNA | Intel® Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel® Pentium® Silver processor J5005, Intel® Celeron® processor J4005, Intel® Core™ i3-8121U processor |
| HETERO | Automatic splitting of a network inference between several devices (for example, if a device doesn't support certain layers) |
| MULTI | Simultaneous inference of the same network on several devices in parallel |
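The plugin to use is selected by the device name you pass to the API; HETERO and MULTI are addressed by appending a prioritized device list after a colon. A hedged sketch, assuming a `CNNNetwork` object named `network` has already been read from an IR (the device combinations below are examples only):

```cpp
InferenceEngine::Core core;

// Single-device inference: the CPU plugin handles the whole network.
auto exec_cpu = core.LoadNetwork(network, "CPU");

// HETERO: layers unsupported by the first device fall back to the next one.
auto exec_hetero = core.LoadNetwork(network, "HETERO:FPGA,CPU");

// MULTI: the same network runs on several devices in parallel.
auto exec_multi = core.LoadNetwork(network, "MULTI:MYRIAD,CPU");
```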
The table below shows the plugin libraries and dependencies for Linux and Windows platforms.
| Plugin | Library name for Linux | Dependency libraries for Linux | Library name for Windows | Dependency libraries for Windows |
|--------|------------------------|--------------------------------|--------------------------|----------------------------------|
| MYRIAD | `libmyriadPlugin.so` | No dependencies | `myriadPlugin.dll` | No dependencies |
| HETERO | `libHeteroPlugin.so` | Same as for selected plugins | `HeteroPlugin.dll` | Same as for selected plugins |
| MULTI | `libMultiDevicePlugin.so` | Same as for selected plugins | `MultiDevicePlugin.dll` | Same as for selected plugins |
Make sure those libraries are in your computer's path or in the place you pointed to in the plugin loader. Make sure each plugin's related dependencies can be found by the dynamic loader: on Linux, add them to `LD_LIBRARY_PATH`; on Windows, add them to `PATH`. On Linux, use the `bin/setupvars.sh` script to set the environment variables. On Windows, run the `bin\setupvars.bat` batch file to set the environment variables.
To learn more about supported devices and corresponding plugins, see the Supported Devices chapter.
The common workflow contains the following steps:
1. Using the `InferenceEngine::CNNNetReader` class, read an Intermediate Representation file into an object of the `InferenceEngine::CNNNetwork` class. This class represents the network in the host memory.
2. Create an `InferenceEngine::Core` object to work with different devices; all device plugins are managed internally by the `Core` object. Pass per-device loading configurations specific to a device (`InferenceEngine::Core::SetConfig`) and register extensions for that device (`InferenceEngine::Core::AddExtension`).
3. Call the `InferenceEngine::Core::LoadNetwork()` method with a specific device name (for example, `CPU` or `GPU`) to compile and load the network on the device. Pass in the per-target load configuration for this compilation and load operation.
4. With the network loaded, you have an `InferenceEngine::ExecutableNetwork` object. Use this object to create an `InferenceEngine::InferRequest` in which you signal the input buffers to use for input and output. Specify a device-allocated memory and copy it into the device memory directly, or tell the device to use your application memory to save a copy.
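Put together, the steps above look roughly like the sketch below. It is not a complete application: the model file names (`model.xml`, `model.bin`), the device name, and the input/output blob names are placeholders you would replace with your own, and error handling is omitted.

```cpp
#include <inference_engine.hpp>

int main() {
    using namespace InferenceEngine;

    // 1. Read the IR into a CNNNetwork object (host-memory representation).
    CNNNetReader reader;
    reader.ReadNetwork("model.xml");   // topology (placeholder file name)
    reader.ReadWeights("model.bin");   // weights  (placeholder file name)
    CNNNetwork network = reader.getNetwork();

    // 2. Create the Core object; device plugins are managed internally.
    Core core;

    // 3. Compile and load the network on a specific device.
    ExecutableNetwork executable = core.LoadNetwork(network, "CPU");

    // 4. Create an inference request and attach the input data.
    InferRequest request = executable.CreateInferRequest();
    Blob::Ptr input = request.GetBlob("input");    // placeholder input name
    // ... fill the input blob with your data here ...

    // Run inference synchronously and read back the result.
    request.Infer();
    Blob::Ptr output = request.GetBlob("output");  // placeholder output name
    return 0;
}
```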
For more details on the Inference Engine API, refer to the Integrating Inference Engine in Your Application documentation.