Below, a Layer is a CNN building block implemented in the training framework, for example, Convolution in Caffe*. A Kernel is defined as the corresponding implementation in the Inference Engine.
Refer to the Model Optimizer documentation for details on how the mapping between framework layers and Inference Engine kernels is registered. In short, you can plug your own kernel implementations into the Inference Engine and map them to the layers in the original framework.
The rest of this section covers custom kernels and how to integrate them into the Inference Engine.
Example of Custom Kernels Support in the Samples
Every sample uses the Inference Engine API to load custom kernels depending on the device type. Specifically, for the CPU, this is a shared library that exports a certain interface that registers the kernels. For GPU or MYRIAD, it is an .xml file that lists the kernels along with the parameters that the kernels accept and how these map to the specific IR values.
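For illustration, the device-dependent loading logic in the samples looks roughly like the sketch below. The FLAGS_d, FLAGS_l, and FLAGS_c names mirror the -d, -l, and -c command-line options of the samples; treat this as an assumption-laden outline rather than exact sample code:
// -l: shared library with custom CPU kernels; -c: .xml file with custom GPU/MYRIAD kernel descriptions.
if (FLAGS_d == "CPU" && !FLAGS_l.empty()) {
    auto extension_ptr = InferenceEngine::make_so_pointer<InferenceEngine::IExtension>(FLAGS_l);
    plugin.AddExtension(extension_ptr);
} else if (!FLAGS_c.empty()) {
    plugin.SetConfig({{InferenceEngine::PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}});
}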
Example Custom Kernels
The extension folder in the samples directory comes with a few real examples of CPU-targeted kernels, such as DetectionOutput (used in SSD*). A number of GPU-targeted kernels are also added to the binaries upon samples compilation, so that the sample applications can load them easily. Refer to the cldnn_global_custom_kernels folder in the GPU plugin installation directory.
How to Implement Custom GPU Layers
You need to provide the kernel code in OpenCL C and a configuration file that connects the kernel and its parameters to the parameters of the layer.
There are two options for using a custom layer configuration file. The first is to include a section with your kernels into the global auto-loading cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml file. The second is to provide a separate configuration file and load it using the IInferencePlugin::SetConfig() method with the PluginConfigParams::KEY_CONFIG_FILE key and the configuration file name as the value, before loading the network that features the custom layers:
InferencePlugin plugin(plugin_ptr);
plugin.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, "<path to the xml file>"}});
For details on the configuration parameters and the OpenCL kernel, refer to the tutorial available online.
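For orientation, a GPU kernel binding follows the same general CustomLayer structure as the MYRIAD example in the next section. The sketch below is illustrative only: the layer name, kernel entry point, and file names are hypothetical, and the authoritative schema is the cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml file shipped with the GPU plugin and the online tutorial:
<CustomLayer name="MyReLU" type="SimpleGPU" version="1">
    <Kernel entry="my_relu_kernel">
        <Source filename="my_relu_kernel.cl"/>
        <Define name="neg_slope" type="float" param="negative_slope" default="0.0"/>
    </Kernel>
    <Buffers>
        <Tensor arg-index="0" type="input" port-index="0" format="BFYX"/>
        <Tensor arg-index="1" type="output" port-index="0" format="BFYX"/>
    </Buffers>
    <WorkSizes global="X,Y,B*F"/>
</CustomLayer>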
How to Implement Custom MYRIAD Layers
- The OpenCL™ toolchain for MYRIAD supports offline compilation only, so the OpenCL C code must first be compiled using the standalone clc compiler with the following command:
./clc --strip-binary-header custom_layer.cl -o custom_layer.bin
- Write a configuration file with a kernel parameter description and bindings. For example, given the following OpenCL kernel signature:
__kernel void reorg_nhwc(__global const half *src, __global half *out, int w, int h, int c, int stride);
The configuration file for this kernel might be the following:
<CustomLayer name="ReorgYolo" type="MVCL" version="1">
<Kernel entry="reorg_nhwc">
<Source filename="reorg.bin"/>
</Kernel>
<Parameters>
<Tensor arg-name="src" type="input" port-index="0" format="BYXF"/>
<Tensor arg-name="out" type="output" port-index="0" format="BYXF"/>
<Scalar arg-name="w" type="int" port-index="0" source="I.X" />
<Scalar arg-name="h" type="int" port-index="0" source="I.Y" />
<Scalar arg-name="c" type="int" port-index="0" source="I.F" />
<Scalar arg-name="stride" type="int" source="stride" />
</Parameters>
<WorkSizes dim="input,0" global="(Y+7)/8*8,1,1" local="8,1,1"/>
</CustomLayer>
Each custom layer is described with a CustomLayer node. The following nodes and attributes are mandatory:
- Root node CustomLayer:
  - Attribute name is the name of the IE layer to bind the kernel with.
  - Attributes type and version. Leave them as MVCL and 1 for now.
- Sub-node Kernel:
  - Attribute entry is the name of your kernel function as written in the source file (reorg_nhwc in the example above).
  - Node Source with attribute filename is a path to the compiled binary relative to this XML binding file.
- Sub-node Parameters describes the parameter bindings.
- Sub-node WorkSizes describes the local and global work group sizes and the source for dimension deduction as a direction,port pair. The example above describes a work group relative to the dimensions of the input tensor that comes through port 0 in the IR. Any simple math expression with +, -, *, / and () over B (batch), Y (height), X (width) and F (channels) is supported for the global and local work group configuration; a second illustrative binding is shown right after this list.
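For instance, a hypothetical WorkSizes binding that launches one work item per element of the tensor on output port 0 (the expression syntax is the same as in the example above; the values here are purely illustrative) might look like this:
<WorkSizes dim="output,0" global="X,Y,F*B" local="1,1,1"/>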
The parameter description format is the following:
- Tensor and Scalar nodes are supported.
- Each Tensor node must contain the following attributes:
  - arg-name, which is the name of the kernel parameter in the kernel signature
  - type, which is input or output, as in the IR
  - port-index, which is the number of the input/output port, as in the IR
  - format, which specifies the channel order in the tensor. Optional repacks are generated if the custom layer format is not compatible with the formats of the neighboring layers.
- Each Scalar node must contain the following attributes:
  - arg-name, which is the name of the kernel parameter in the kernel signature.
  - type, which is int or float and is used for correct argument extraction from the IR parameters.
  - source, which might contain the name of the parameter in the IR file or an input/output (I/O, In/On, where n is a port number) followed by a dimension: B (batch), Y (height), X (width) or F (channels); see the sketch right after this list.
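For example, hypothetical Scalar bindings that pass the width of the tensor on output port 0 and a float parameter named alpha from the IR to the kernel (both argument names are illustrative) could look like this:
<Scalar arg-name="out_w" type="int" port-index="0" source="O.X"/>
<Scalar arg-name="alpha" type="float" source="alpha"/>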
- Provide a separate configuration file and load it using the InferencePlugin::SetConfig() method with the PluginConfigParams::KEY_CONFIG_FILE key and the configuration file name as the value, before loading the network that features the custom layers:
InferencePlugin plugin(plugin_ptr);
plugin.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, "<path to the xml file>"}});
Optionally, you can set the path to the custom layers description by passing a pair of VPU_CUSTOM_LAYERS and /path/to/your/customLayers.xml as a network configuration:
std::map<std::string, std::string> networkConfig;
networkConfig["VPU_CUSTOM_LAYERS"] = "/path/to/your/customLayers.xml";
IECALL(myriad->LoadNetwork(exeNetwork, cnnNetwork, networkConfig, &resp));
NOTE: If both native and custom layer implementations are present, the custom kernel takes priority over the native implementation.
How to Implement Custom CPU Layers
This is a brief version of the full-blown Custom Layers tutorial available online.
- Create your custom layer factory class CustomLayerFactory:
class CustomLayerFactory {
};
- Inherit it from the abstract class InferenceEngine::ILayerImplFactory.
- Create a constructor, a virtual destructor, and a data member to keep the layer info:
class CustomLayerFactory: public InferenceEngine::ILayerImplFactory {
public:
    explicit CustomLayerFactory(const CNNLayer *layer): cnnLayer(*layer) {}
private:
    CNNLayer cnnLayer;
};
- Overload and implement the abstract methods getShapes and getImplementations of the InferenceEngine::ILayerImplFactory class:
public:
    StatusCode getShapes(const std::vector<TensorDesc>& inShapes, std::vector<TensorDesc>& outShapes, ResponseDesc *resp) noexcept override {
        if (inShapes.size() != 1) {
            std::string errorMsg = "Incorrect input shapes!";
            errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
            return GENERAL_ERROR;
        }
        // This custom layer produces one output of the same shape as its input.
        outShapes.clear();
        outShapes.emplace_back(inShapes[0]);
        return OK;
    }
    StatusCode getImplementations(std::vector<ILayerImpl::Ptr>& impls, ResponseDesc *resp) noexcept override {
        // Register a single implementation; more variants (for example, vectorized ones) can be added here.
        impls.push_back(ILayerImpl::Ptr(new CustomLayerImpl(&cnnLayer)));
        return OK;
    }
};
- Create your custom layer implementation class CustomLayerImpl:
class CustomLayerImpl {
};
- Because the layer uses the execute method to change data, inherit it from the abstract class InferenceEngine::ILayerExecImpl and overload and implement the abstract methods of this class:
class CustomLayerImpl: public ILayerExecImpl {
public:
    explicit CustomLayerImpl(const CNNLayer *layer): cnnLayer(*layer) {}
    StatusCode getSupportedConfigurations(std::vector<LayerConfig>& conf, ResponseDesc *resp) noexcept override;
    StatusCode init(LayerConfig& config, ResponseDesc *resp) noexcept override;
    StatusCode execute(std::vector<Blob::Ptr>& inputs, std::vector<Blob::Ptr>& outputs, ResponseDesc *resp) noexcept override;
private:
    CNNLayer cnnLayer;
};
- First of all, implement getSupportedConfigurations, which returns all supported configurations for this implementation. To specify data formats, you can use InferenceEngine::TensorDesc. Refer to Inference Engine Memory Primitives for instructions on how to do this:
StatusCode CustomLayerImpl::getSupportedConfigurations(std::vector<LayerConfig>& conf, ResponseDesc *resp) noexcept {
    try {
        if (cnnLayer.insData.size() != 1 || cnnLayer.outData.empty())
            THROW_IE_EXCEPTION << "Incorrect number of input/output edges!";
        LayerConfig config;
        DataPtr dataPtr = cnnLayer.insData[0].lock();
        if (!dataPtr)
            THROW_IE_EXCEPTION << "Cannot get input data!";
        // Describe the input with a plain (non-blocked) layout; it may not be constant.
        DataConfig dataConfig;
        dataConfig.inPlace = -1;
        dataConfig.constant = false;
        SizeVector order;
        for (size_t i = 0; i < dataPtr->getTensorDesc().getDims().size(); i++) {
            order.push_back(i);
        }
        dataConfig.desc = TensorDesc(dataPtr->getTensorDesc().getPrecision(),
                                     dataPtr->getTensorDesc().getDims(),
                                     {dataPtr->getTensorDesc().getDims(), order});
        config.inConfs.push_back(dataConfig);
        // The output can reuse the input memory: inPlace = 0 refers to input port 0.
        DataConfig outConfig;
        outConfig.constant = false;
        outConfig.inPlace = 0;
        order.clear();
        for (size_t i = 0; i < cnnLayer.outData[0]->getTensorDesc().getDims().size(); i++) {
            order.push_back(i);
        }
        outConfig.desc = TensorDesc(cnnLayer.outData[0]->getTensorDesc().getPrecision(),
                                    cnnLayer.outData[0]->getDims(),
                                    {cnnLayer.outData[0]->getDims(), order});
        config.outConfs.push_back(outConfig);
        config.dynBatchSupport = 0;
        conf.push_back(config);
        return OK;
    } catch (InferenceEngine::details::InferenceEngineException& ex) {
        std::string errorMsg = ex.what();
        errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
        return GENERAL_ERROR;
    }
}
- Implement the init and execute methods. init is required to get the selected configuration and check its parameters:
StatusCode CustomLayerImpl::init(LayerConfig& config, ResponseDesc *resp) noexcept {
    StatusCode rc = OK;
    if (config.dynBatchSupport) {
        config.dynBatchSupport = 0;
        rc = NOT_IMPLEMENTED;
    }
    for (auto& input : config.inConfs) {
        if (input.inPlace >= 0) {
            input.inPlace = -1;
            rc = NOT_IMPLEMENTED;
        }
        // Padded or offset memory is not supported by this implementation.
        for (auto& offset : input.desc.getBlockingDesc().getOffsetPaddingToData()) {
            if (offset) {
                return GENERAL_ERROR;
            }
        }
        if (input.desc.getBlockingDesc().getOffsetPadding()) {
            return GENERAL_ERROR;
        }
        for (size_t i = 0; i < input.desc.getBlockingDesc().getOrder().size(); i++) {
            if (input.desc.getBlockingDesc().getOrder()[i] != i) {
                // Only the plain layout or a layout blocked by channels is accepted.
                if (i != 4 || input.desc.getBlockingDesc().getOrder()[i] != 1)
                    return GENERAL_ERROR;
            }
        }
    }
    for (auto& output : config.outConfs) {
        if (output.inPlace < 0) {
            // The output is not in-place: it uses memory separate from the input.
        }
        for (auto& offset : output.desc.getBlockingDesc().getOffsetPaddingToData()) {
            if (offset) {
                return GENERAL_ERROR;
            }
        }
        if (output.desc.getBlockingDesc().getOffsetPadding()) {
            return GENERAL_ERROR;
        }
        for (size_t i = 0; i < output.desc.getBlockingDesc().getOrder().size(); i++) {
            if (output.desc.getBlockingDesc().getOrder()[i] != i) {
                if (i != 4 || output.desc.getBlockingDesc().getOrder()[i] != 1)
                    return GENERAL_ERROR;
            }
        }
    }
    return rc;
}
StatusCode CustomLayerImpl::execute(std::vector<Blob::Ptr>& inputs, std::vector<Blob::Ptr>& outputs, ResponseDesc *resp) noexcept {
    if (inputs.size() != 1 || outputs.empty()) {
        std::string errorMsg = "Incorrect number of input or output edges!";
        errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
        return GENERAL_ERROR;
    }
    const float* src_data = inputs[0]->buffer();
    float* dst_data = outputs[0]->buffer();
    // This example layer squares every element, working in-place if the buffers coincide.
    for (size_t o = 0; o < outputs[0]->size(); o++) {
        if (dst_data == src_data) {
            dst_data[o] *= dst_data[o];
        } else {
            dst_data[o] = src_data[o] * src_data[o];
        }
    }
    return OK;
}
- Create a factory for your own primitives, inherited from the abstract class InferenceEngine::IExtension.
- Implement the utility methods Unload, Release and SetLogCallback:
class CustomExtension : public InferenceEngine::IExtension {
public:
    // Can be used to release resources held by the extension.
    void Unload() noexcept override {
    }
    // Is called when the extension is no longer needed.
    void Release() noexcept override {
        delete this;
    }
    // Can be used to forward extension log messages to the application.
    void SetLogCallback(InferenceEngine::IErrorListener& listener) noexcept override {}
};
- Implement the utility method GetVersion:
public:
    void GetVersion(const InferenceEngine::Version *& versionInfo) const noexcept override {
        static const InferenceEngine::Version ExtensionDescription = {
            {1, 0},            // extension API version
            "1.0",             // extension build number
            "CustomExtension"  // extension description message
        };
        versionInfo = &ExtensionDescription;
    }
};
- Implement the main extension methods:
public:
    StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
        std::string type_name = "CustomLayer";
        types = new char *[1];
        size = 1;
        types[0] = new char[type_name.size() + 1];
        std::copy(type_name.begin(), type_name.end(), types[0]);
        types[0][type_name.size()] = '\0';
        return OK;
    }
    StatusCode getFactoryFor(ILayerImplFactory *&factory, const CNNLayer *cnnLayer, ResponseDesc *resp) noexcept override {
if (cnnLayer->type != "CustomLayer") {
std::string errorMsg = std::string("Factory for ") + cnnLayer->type + " wasn't found!";
errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
return NOT_FOUND;
}
factory = new CustomLayerFactory(cnnLayer);
return OK;
}
};
- To use your custom layers, you need to compile the code as a shared library. After that, you can use the AddExtension method of the general plugin interface to load your primitives:
auto extension_ptr = make_so_pointer<InferenceEngine::IExtension>("<shared lib path>");
plugin.AddExtension(extension_ptr);
For more details, refer to the sources of the samples.
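For context, a minimal end-to-end sketch of loading a network that uses the custom layer might look like the following. The model file names are placeholders, and the snippet assumes the classic InferencePlugin/CNNNetReader API used throughout this section:
// Hypothetical usage: locate the plugin, register the extension, then load the network.
InferencePlugin plugin(PluginDispatcher({""}).getPluginByDevice("CPU"));
auto extension_ptr = make_so_pointer<InferenceEngine::IExtension>("<shared lib path>");
plugin.AddExtension(extension_ptr);
CNNNetReader networkReader;
networkReader.ReadNetwork("model.xml");   // placeholder IR topology file
networkReader.ReadWeights("model.bin");   // placeholder IR weights file
auto executable_network = plugin.LoadNetwork(networkReader.getNetwork(), {});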
See Also