Dataset Types

Below is the list of dataset types available to use in the DL Workbench:

Your dataset does not have to contain images from the official databases that define these types, such as ImageNet or Pascal VOC, but it must follow one of the supported dataset formats.

Use case                            Datasets/dataset formats
Classification                      ImageNet, unannotated*
Object detection                    Pascal VOC, MS COCO, unannotated*
Segmentation (semantic, instance)   MS COCO, Pascal VOC, CSS, unannotated*
Facial landmark detection           VGGFace2, unannotated*
Face recognition                    LFW, unannotated*
Super resolution                    Common Super-Resolution (CSR), unannotated*
Style transfer                      ImageNet, Pascal VOC, MS COCO, CSS, CSR, unannotated*
Inpainting                          ImageNet, Pascal VOC, MS COCO, CSS, CSR, unannotated*

* See the Not Annotated Dataset section below.

ImageNet

ImageNet is a dataset for classification models. DL Workbench supports only the format of the ImageNet validation dataset published in 2012.

Download ImageNet Dataset

To download images from ImageNet, you need to have an account and agree to the Terms of Access. Follow the steps below:

  1. Go to the ImageNet homepage.
  2. If you have an account, click Login. Otherwise, click Signup in the upper-right corner, provide your data, and wait for a confirmation email.
  3. Once you receive the confirmation email and log in, go to the Download page.
  4. Select Download Original Images.
  5. This redirects you to the Terms of Access page. If you agree to the Terms, continue by clicking Agree and Sign.
  6. Click one of the links in the Download as one tar file section to select it.
  7. Save the archive as /home/<user>/Work/imagenet.zip on Linux* or macOS*, or as C:\Work\imagenet.zip on Windows*.
  8. Download the archive with annotations.
  9. Unarchive both imagenet.zip and caffe_ilsvrc12.tar.gz. Place the val.txt file from caffe_ilsvrc12 inside the imagenet folder.
  10. Zip the contents of the imagenet folder.
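The final re-zipping step can be sketched in Python. The folder and archive names below are illustrative stand-ins for the paths used in this guide; the key detail is that the archive must contain val.txt and the images at the top level, not inside a parent folder.

```python
# Sketch: assemble imagenet.zip with a flat layout (val.txt plus images
# at the top level of the archive). Paths here are illustrative; point
# `src` at your unpacked imagenet folder instead.
import tempfile
import zipfile
from pathlib import Path

def pack_imagenet(folder: Path, archive: Path) -> list[str]:
    """Zip the contents of `folder`, not the folder itself."""
    names = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.iterdir()):
            zf.write(path, arcname=path.name)  # flat: no parent folder inside
            names.append(path.name)
    return names

# Demonstration with a synthetic folder standing in for the real dataset.
tmp = Path(tempfile.mkdtemp())
src = tmp / "imagenet"
src.mkdir()
(src / "val.txt").write_text("0001.jpg 65\n")
(src / "0001.jpg").write_bytes(b"placeholder")  # stand-in for a real image
packed = pack_imagenet(src, tmp / "imagenet.zip")
print(packed)  # → ['0001.jpg', 'val.txt']
```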

ImageNet Structure

The final imagenet.zip archive must follow the structure below:

|-- imagenet.zip
    |-- val.txt
    |-- 0001.jpg
    |-- 0002.jpg
    ...
    |-- n.jpg

The annotation file is organized as follows:

0001.jpg <label ID>
0002.jpg <label ID>
...
n.jpg <label ID>
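A minimal sketch of reading this annotation file, assuming the `<image> <label ID>` layout shown above; the sample label IDs here are invented.

```python
# Sketch: parse an ImageNet-style val.txt into {image_name: label_id}.
sample = """0001.jpg 65
0002.jpg 970
0003.jpg 230"""

def parse_annotations(text: str) -> dict[str, int]:
    labels = {}
    for line in text.splitlines():
        name, label_id = line.split()  # "<image> <label ID>" per line
        labels[name] = int(label_id)
    return labels

print(parse_annotations(sample)["0002.jpg"])  # → 970
```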

NOTE: The dataset is considerably large. To save time when loading it into the DL Workbench, follow the instructions to cut the dataset.

Pascal Visual Object Classes (Pascal VOC)

Pascal VOC datasets are used to train classification, object detection, and semantic segmentation models. DL Workbench supports validation on Pascal VOC datasets for object detection, semantic segmentation, image inpainting, and style transfer. DL Workbench supports only the format of the VOC validation datasets published in 2007, 2010, and 2012.

Download Pascal VOC Dataset

To download test data from Pascal VOC, you need to have an account. Follow the steps below:

  1. Go to the PASCAL Visual Object Classes Homepage.
  2. Click PASCAL VOC Evaluation Server under the Pascal VOC data sets heading.
  3. If you have an account, click Login in the upper-left corner. Otherwise, click Registration, provide your data, and wait for a confirmation email.
  4. Once you receive the confirmation email and log in, go to the Pascal VOC Challenges 2005-2012 page.
  5. Select a challenge, for example, The VOC2008 Challenge. On the challenge page, go to the Development Kit section.
  6. Download the training/validation_data file.

NOTE: The dataset is considerably large. To save time when loading it into the DL Workbench, follow the instructions to cut the dataset.

Pascal VOC Structure

Pascal VOC datasets consist of several folders containing annotation files and image indices. Each image file must have a corresponding annotation file.

A Pascal VOC dataset archive is organized as follows:

|-- VOCdevkit
    |-- VOC
        |-- Annotations
            |-- 0001.xml
            |-- 0002.xml
            ...
            |-- n.xml
        |-- ImageSets
            |-- Layout
                |-- test.txt
            |-- Main
                |-- 0001_test.txt
                |-- 0002_test.txt
                ...
                |-- n_test.txt
            |-- Segmentation
                |-- test.txt
        |-- JPEGImages
            |-- 0001.jpg
            |-- 0002.jpg
            ...
            |-- n.jpg
        |-- SegmentationClass
            |-- 0001.png
            |-- 0002.png
            ...
            |-- n.png
        |-- SegmentationObject
            |-- 0001.png
            |-- 0002.png
            ...
            |-- n.png
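A sketch of reading one annotation file from the Annotations folder with the standard library XML parser. The sample XML below is hand-written and shows only the fields the sketch reads; real VOC annotation files contain additional fields.

```python
# Sketch: read object labels and bounding boxes from a Pascal VOC
# annotation file (sample XML is hand-written for illustration).
import xml.etree.ElementTree as ET

sample_xml = """<annotation>
  <filename>0001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc(xml_text: str) -> list[tuple[str, tuple[int, ...]]]:
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

print(parse_voc(sample_xml))  # → [('dog', (48, 240, 195, 371))]
```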

Common Objects in Context (COCO)

COCO dataset is used for object detection, instance segmentation, person keypoints detection, stuff segmentation, and caption generation. DL Workbench supports validation on COCO datasets for object detection, instance segmentation, image inpainting, and style transfer. DL Workbench supports only the format of the COCO validation datasets published in 2014 and 2017.

Download COCO Dataset

To use a dataset from the COCO website, download the annotations archive and the images archive separately.

NOTE: The dataset is considerably large. To save time when loading it into the DL Workbench, follow the instructions to cut the dataset.

COCO Structure

COCO dataset is organized as follows:

|-- val
    |-- 0001.jpg
    |-- 0002.jpg
    ...
    |-- n.jpg
|-- annotations
    |-- instances_val.json

The JSON file with annotations is organized as follows:

{
    "info": <info>,
    "images": [<images>],
    "licenses": [<licenses>],
    "annotations": [<annotations>]
}
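A sketch of inspecting the annotation file with the standard library JSON parser. The dict literal below mimics the instances_val.json layout described above; the field values are invented for illustration.

```python
# Sketch: count COCO annotations per image from an instances-style JSON.
# The sample mirrors the top-level keys listed above; values are made up.
import json
from collections import defaultdict

sample = json.dumps({
    "info": {"year": 2017},
    "images": [{"id": 1, "file_name": "0001.jpg"}],
    "licenses": [],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18, "bbox": [61, 22, 50, 60]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [12, 8, 30, 90]},
    ],
})

def annotations_per_image(text: str) -> dict[int, int]:
    data = json.loads(text)
    counts = defaultdict(int)
    for ann in data["annotations"]:
        counts[ann["image_id"]] += 1  # each annotation references its image
    return dict(counts)

print(annotations_per_image(sample))  # → {1: 2}
```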

Common Semantic Segmentation (CSS)

CSS is an OpenVINO™ dataset type designed to simplify the structure used by Pascal VOC. DL Workbench supports validation on CSS datasets for semantic segmentation, image inpainting, and style transfer.

CSS Structure

A CSS dataset archive consists of folders with images and masks, and a JSON file with meta information:

|-- dataset_meta.json
|-- Images
    |-- 0001.jpg
    |-- 0002.jpg
    ...
    |-- n.jpg
|-- Masks
    |-- 0001.png
    |-- 0002.png
    ...
    |-- n.png

The JSON meta information file is organized as follows:

{
    "label_map": {<map>},
    "background_label": "<label>",
    "segmentation_colors": [<colors>]
}
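A sketch of loading the meta file. The keys follow the layout above; the label names and RGB colors are invented, and the check that colors align one-to-one with the label map is a plausible sanity check rather than a requirement stated in this guide.

```python
# Sketch: load CSS dataset_meta.json; sample label names/colors are invented.
import json

meta_text = json.dumps({
    "label_map": {"0": "background", "1": "road", "2": "car"},
    "background_label": "0",
    "segmentation_colors": [[0, 0, 0], [128, 64, 128], [0, 0, 142]],
})

meta = json.loads(meta_text)
# Assumed sanity check: one RGB color per label_map entry.
assert len(meta["segmentation_colors"]) == len(meta["label_map"])
background_name = meta["label_map"][meta["background_label"]]
print(background_name)  # → background
```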

Common Super-Resolution (CSR)

CSR is an OpenVINO™ dataset type for super-resolution, image-inpainting, and style-transfer models.

CSR Structure

The archive consists of three separate folders for high-resolution images, low-resolution images, and upsampled low-resolution images:

|-- HR
    |-- 0001.jpg
    |-- 0002.jpg
    ...
    |-- n.jpg
|-- LR
    |-- 0001.jpg
    |-- 0002.jpg
    ...
    |-- n.jpg
|-- upsampled
    |-- 0001.png
    |-- 0002.png
    ...
    |-- n.png
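A sketch of checking a CSR archive before packing it. The folder names come from the structure above; the assumption that all three folders share the same file stems is a plausible consistency check, not a rule quoted from this guide, and the files below are synthetic placeholders.

```python
# Sketch: check that HR, LR, and upsampled folders contain matching stems.
# Folder names follow the CSR structure; sample files are synthetic.
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
for folder, suffix in (("HR", ".jpg"), ("LR", ".jpg"), ("upsampled", ".png")):
    d = root / folder
    d.mkdir()
    for stem in ("0001", "0002"):
        (d / f"{stem}{suffix}").touch()  # placeholder image files

def matching_stems(root: Path) -> bool:
    stems = [{p.stem for p in (root / f).iterdir()}
             for f in ("HR", "LR", "upsampled")]
    return stems[0] == stems[1] == stems[2]

print(matching_stems(root))  # → True
```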

Labeled Faces in the Wild (LFW)

LFW is used for face recognition. DL Workbench supports only LFW validation datasets.

Download LFW Dataset

  1. Create an empty LFW folder with two subdirectories: Images and Annotations.
  2. Download the lfw.tgz archive with images. Unarchive it and place the extracted content in the Images folder.
  3. Download the pairs.txt annotation file. Place the file in the Annotations folder.
  4. Archive the LFW folder.

LFW Structure

An LFW dataset archive consists of folders with images and annotations. The Images folder contains separate folders with photographs of a particular person. The Annotations folder contains two files: pairs.txt and landmarks.txt. The pairs.txt file is required, while landmarks.txt is optional.

|-- LFW
    |-- Images
        |-- Person_1
            |-- Person_1_0001.jpg
            ...
            |-- Person_1_n.jpg
        ...
        |-- Person_N
            |-- Person_N_0001.jpg
            ...
            |-- Person_N_n.jpg
    |-- Annotations
        |-- pairs.txt
        |-- landmarks.txt
  • The pairs.txt file follows the structure represented below. The file is split into sets of randomly selected persons to provide randomization for accuracy measurements.

    • The first line contains the number of sets and the number of images in a set. This information is necessary for accuracy measurements.
    • Lines with correct pairs: two images of one person.
    • Lines with incorrect pairs: two images of two different persons.

    Blocks of lines with correct and incorrect pairs alternate to represent different sets. Below is an example of the beginning of an annotation file: the numbers of sets and images come first, followed by the lines for the first set.

10 300
Person_1 2 4
Person_2 3 6
Person_2 4 5
...
Person_N 2 3
Person_1 1 Person_15 1
Person_1 2 Person_43 1
Person_2 1 Person_89 2
...
Person_300 1 Person_21 1

Then the lines for the second set begin:

Person_301 2 4
Person_302 3 6
...
Person_600 2 3
Person_301 1 Person_334 1
Person_302 1 Person_570 2
...
Person_600 1 Person_416 1
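The pairs.txt layout above can be sketched as a small parser: a three-field line is a correct pair (two images of one person), a four-field line is an incorrect pair (images of two different persons), and the first line carries the set counts. The sample lines are shortened from the example above.

```python
# Sketch: parse LFW-style pairs.txt into correct and incorrect pairs.
sample = """10 300
Person_1 2 4
Person_2 3 6
Person_1 1 Person_15 1
Person_2 1 Person_89 2"""

def parse_pairs(text: str):
    lines = text.splitlines()
    n_sets, per_set = map(int, lines[0].split())  # header: set counts
    correct, incorrect = [], []
    for line in lines[1:]:
        parts = line.split()
        if len(parts) == 3:  # "person a b": two images of one person
            person, a, b = parts
            correct.append(((person, int(a)), (person, int(b))))
        else:  # "p1 a p2 b": images of two different persons
            p1, a, p2, b = parts
            incorrect.append(((p1, int(a)), (p2, int(b))))
    return n_sets, per_set, correct, incorrect

n_sets, per_set, correct, incorrect = parse_pairs(sample)
print(n_sets, per_set, len(correct), len(incorrect))  # → 10 300 2 2
```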
  • The landmarks.txt file contains coordinates of five facial landmarks found in an image:

    • Left eye
    • Right eye
    • Nose
    • Left mouth corner
    • Right mouth corner

    Each line consists of the relative path to an image and two coordinates in pixels of each landmark in the same order as in the list above:

    Person_1/Person_1_0001.jpg 102 114 146 111 122 133 104 158 148 155
    Person_2/Person_2_0001.jpg 107 113 147 117 126 139 110 158 136 159
    Person_2/Person_2_0002.jpg 103 112 145 111 125 137 107 156 146 160
    ...
    Person_N/Person_N_0002.jpg 102 114 146 105 127 133 113 159 152 153
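A sketch of splitting one landmarks.txt line into the image path and five (x, y) points, in the order listed above. The sample line is taken from the example.

```python
# Sketch: parse one landmarks.txt line into a path plus five (x, y) points,
# ordered: left eye, right eye, nose, left mouth corner, right mouth corner.
line = "Person_1/Person_1_0001.jpg 102 114 146 111 122 133 104 158 148 155"

def parse_landmarks(line: str):
    parts = line.split()
    path, coords = parts[0], list(map(int, parts[1:]))
    points = list(zip(coords[0::2], coords[1::2]))  # pair up x with y
    return path, points

path, points = parse_landmarks(line)
print(points[2])  # nose → (122, 133)
```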

NOTE: There are no requirements for image and folder names. However, the names that you use for images and folders must match the names in the annotations.

Visual Geometry Group Face 2 (VGGFace2)

VGGFace2 is used for facial landmark detection.

VGGFace2 is currently not available for download. Consider creating your own dataset with the same structure and annotations as described below.

VGGFace2 Structure

A VGGFace2 dataset archive consists of folders with images and annotations. The Images folder contains separate folders with images of a particular person. The Annotations folder contains two files: loose_bb_test.csv and loose_landmark_test.csv.

|-- VGGFace2
    |-- Images
        |-- 0001
            |-- 0001_01.jpg
            |-- 0002_01.jpg
            |-- 0002_02.jpg
            |-- 0003_02.jpg
            ...
            |-- nnnn_01.jpg
        |-- 0002
        ...
        |-- nnnn
    |-- Annotations
        |-- loose_bb_test.csv
        |-- loose_landmark_test.csv
  • The loose_bb_test.csv file contains the pixel coordinates of a bounding box in each image:
    NAME_ID X Y W H
    0001/0001_01.jpg 60 60 79 111
    0001/0002_01.jpg 107 113 147 117
    0001/0002_02.jpg 103 112 145 111
    ...
    NNNN/nnnn_01.jpg 102 114 146 105
  • The loose_landmark_test.csv file contains coordinates of five facial landmarks found in an image:

    • Left eye
    • Right eye
    • Nose
    • Left mouth corner
    • Right mouth corner

    Each line consists of the relative path to an image and two coordinates in pixels of each landmark in the same order as in the list above:

NAME_ID P1X P1Y P2X P2Y P3X P3Y P4X P4Y P5X P5Y
0001/0001_01.jpg 75 110 103 104 90 133 85 149 114 144
0001/0002_01.jpg 195 212 279 206 237 273 208 318 283 312
0001/0002_02.jpg 289 232 400 233 345 322 289 373 394 378
...
NNNN/nnnn_01.jpg 83 90 111 87 103 111 86 129 111 126
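The two annotation files can be sketched as CSV reads. The .csv extension suggests comma-separated values, which this sketch assumes even though the columns are shown space-separated above; the sample rows are taken from the examples.

```python
# Sketch: read VGGFace2-style annotation CSVs keyed by NAME_ID.
# Assumes comma-separated files with the headers shown above.
import csv
import io

bb_csv = "NAME_ID,X,Y,W,H\n0001/0001_01.jpg,60,60,79,111\n"
lm_csv = ("NAME_ID,P1X,P1Y,P2X,P2Y,P3X,P3Y,P4X,P4Y,P5X,P5Y\n"
          "0001/0001_01.jpg,75,110,103,104,90,133,85,149,114,144\n")

def read_rows(text: str) -> dict[str, list[int]]:
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {row[0]: [int(v) for v in row[1:]] for row in reader}

boxes = read_rows(bb_csv)
landmarks = read_rows(lm_csv)
print(boxes["0001/0001_01.jpg"])  # → [60, 60, 79, 111]
```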

NOTE: When you download an original dataset, it includes loose_bb_train.csv and loose_landmarks_train.csv files. Remove these files before importing the dataset into the DL Workbench.

NOTE: There are no requirements for image and folder names. However, the names that you use for images and folders must match the names in the annotations.

Not Annotated Dataset

Not annotated datasets are sets of images without annotations. Models in projects that use not annotated datasets can be calibrated only with the Default Calibration method and cannot be used for accuracy measurements.

Download Not Annotated Dataset

Download the Landscape Pictures dataset without annotations.

Not Annotated Dataset Structure

The archive is organized as follows:

|-- 0001.jpg
|-- 0002.jpg
...
|-- n.jpg

See Also