The following section provides information on how to distribute a model across all 8 VPUs to maximize performance.
Programming a C++ Application for the Accelerator
Declare a Structure to Track Requests
The structure should hold:
- A pointer to an inference request.
- An ID to keep track of the request.
struct Request {
    InferenceEngine::InferRequest::Ptr inferRequest;  // pointer to the inference request
    int frameidx;                                     // ID used to track the request
};
Declare a Vector of Requests
std::vector<Request> request(numRequests);
Declare and initialize two mutex variables:
- One for each request.
- One for signaling when all 8 requests are done.
Declare a Condition Variable
The condition variable signals when all 8 in-flight requests have completed.
For inference requests, use the asynchronous Inference Engine (IE) API calls:
request[i].inferRequest = executable_network.CreateInferRequestPtr();
request[i].inferRequest->StartAsync();
Create a Lambda Function
The lambda function parses and displays the inference results. Register it as each request's completion callback:
request[i].inferRequest->SetCompletionCallback(callback);
Additional Resources