Obtain Datasets

A dataset is a set of testing and validation images. The dataset format depends on the task that a model is trained to perform. In DL Workbench, you can work with annotated and not annotated validation datasets.

Upload Annotated Datasets

Note

Sample datasets must consist of a small sampling of images and be in ImageNet, Pascal Visual Object Classes (Pascal VOC), Common Objects in Context (COCO), Common Semantic Segmentation, Labeled Faces in the Wild (LFW), Visual Geometry Group Face 2 (VGGFace2), Wider Face, Open Images or unannotated format. To learn more about the formats, refer to Dataset Types.

On the Create Project page, go to the Select a Validation Dataset tab and click Import Dataset:

_images/dataset_import.png

To import a new dataset, click the Upload Dataset tab. Upload a .zip or .tar.gz archive with your dataset and specify the Dataset Name:

_images/import_annotated_dataset.png
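
The archive itself can be prepared with any tool. As a minimal sketch, assuming the dataset files are already collected in one local folder, the standard Python function `shutil.make_archive` packs that folder into a .zip or .tar.gz file. The helper name `pack_dataset` and its arguments are hypothetical. The internal layout the archive must follow depends on the dataset format and is not shown here.

```python
import shutil
from pathlib import Path


def pack_dataset(dataset_dir, archive_base, fmt="zip"):
    """Pack dataset_dir into an archive and return the archive path.

    fmt="zip" produces archive_base.zip, fmt="gztar" produces
    archive_base.tar.gz. Both names are illustrative assumptions.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {root}")
    # make_archive appends the extension and returns the final file name.
    return shutil.make_archive(str(archive_base), fmt, root_dir=str(root))
```

For example, `pack_dataset("my_images", "my_dataset")` would create `my_dataset.zip` from a folder named `my_images`. Keep the output archive outside the folder being packed.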

Click Import. You are automatically directed back to the datasets table, where you can see the status of the import and select a dataset:

_images/dataset_selection.png

Upload Not Annotated Datasets

You can upload your own images to create a not annotated dataset. If you do not have enough images or want to enlarge the dataset, use augmentation methods, which increase the size of a dataset by generating modified copies of the images.

On the Create Project page, go to the Select a Validation Dataset tab and click Import Dataset:

_images/dataset_import.png

The Create Dataset page opens, where you can add your own images and specify the dataset name:

_images/import_dataset_page.png

After you click Import, you are redirected to the Create Project page where you can check the import status.

_images/dataset_selection.png

Augmentation

Apply different augmentation types to create variations of your images and improve the model performance. Extending your validation dataset also helps to avoid possible overfitting of a calibrated model. Augmentation methods include different image modifications, such as horizontal and vertical flips, random erase, noise injection, and color transformations.

Horizontal Flip

A horizontal flip mirrors the image from left to right by reversing the order of its pixel columns. Usually it does not change the object category.

_images/horizontal_flip_closeup.png
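
As an illustration of the operation only, and not of how DL Workbench implements it, the sketch below flips a tiny grayscale image stored as a list of pixel rows. The function name `horizontal_flip` and the list-of-rows representation are assumptions made for this example.

```python
def horizontal_flip(image):
    """Return a copy of image mirrored left to right.

    image is a list of rows; each row is a list of pixel values.
    """
    # Reversing every row reverses the order of the pixel columns.
    return [row[::-1] for row in image]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
print(horizontal_flip(image))  # [[3, 2, 1], [6, 5, 4]]
```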

Vertical Flip

A vertical flip turns the image upside down by reversing the order of its pixel rows. Use this method only when an upside-down image still makes sense for the selected images and model task. Otherwise, it can cause recognition issues.

_images/vertical_flip.png
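
A minimal sketch of the operation, assuming a grayscale image stored as a list of pixel rows (the function name `vertical_flip` is hypothetical and this is not DL Workbench code):

```python
def vertical_flip(image):
    """Return a copy of image turned upside down.

    image is a list of rows; each row is a list of pixel values.
    """
    # Reversing the list of rows moves the top row to the bottom.
    # row[:] copies each row so the input image is left untouched.
    return [row[:] for row in image[::-1]]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
print(vertical_flip(image))  # [[4, 5, 6], [1, 2, 3]]
```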

Random Erase

Random Erase randomly selects a rectangular section of the image and replaces its pixels with random values. Note that this augmentation method might erase an object that is particularly important for your use case, so apply it only when it suits the selected images and model task.

_images/random_erase.png
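
The idea can be sketched in a few lines. This is an illustration under stated assumptions, not the DL Workbench implementation: the image is a grayscale list of rows with values from 0 to 255, and `max_fraction` is a hypothetical parameter that caps the rectangle size relative to the image.

```python
import random


def random_erase(image, max_fraction=0.5, rng=None):
    """Return a copy of image with one random rectangle overwritten.

    max_fraction (assumed parameter) limits the rectangle height and
    width to that share of the image height and width.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    # Pick the rectangle size, at least 1 x 1 pixel.
    erase_h = rng.randint(1, max(1, int(height * max_fraction)))
    erase_w = rng.randint(1, max(1, int(width * max_fraction)))
    # Pick the top-left corner so the rectangle stays inside the image.
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    result = [row[:] for row in image]
    for y in range(top, top + erase_h):
        for x in range(left, left + erase_w):
            result[y][x] = rng.randint(0, 255)  # random replacement value
    return result


gray = [[100] * 8 for _ in range(8)]
erased = random_erase(gray, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the result reproducible, which helps when you compare runs.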

Noise Injection

Noise Injection adds a matrix of random values to the image. The noise appears as random black and white pixels spread across the image. This method helps to avoid overfitting when your model concentrates on image patterns that occur frequently but may not be useful.

_images/noise_injection.png
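
Random black and white pixels of this kind are commonly called salt-and-pepper noise. The sketch below shows one way to produce them, assuming a grayscale image stored as a list of rows with values from 0 to 255. The function name `inject_noise` and the `amount` parameter (the share of pixels to replace) are hypothetical, and DL Workbench may generate its noise differently.

```python
import random


def inject_noise(image, amount=0.05, rng=None):
    """Return a copy of image with random black and white pixels.

    amount (assumed parameter) is the probability that a pixel is
    replaced by black (0) or white (255).
    """
    if rng is None:
        rng = random.Random()
    result = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < amount:
                new_row.append(rng.choice((0, 255)))  # black or white
            else:
                new_row.append(pixel)  # keep the original value
        result.append(new_row)
    return result


gray = [[128] * 20 for _ in range(20)]
noisy = inject_noise(gray, amount=0.05, rng=random.Random(0))
```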

Color Transformations

Color Transformations change the brightness and contrast of the image. You can select one or several presets with changed parameters. A brightness preset specifies whether the augmented image will be lighter (+20%) or darker (-20%). Contrast is the degree to which light and dark colors in the image differ. You can make the contrast of the augmented image higher (+20%) or lower (-20%).

_images/color_transformations.png
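
The exact formulas behind the presets are not stated here. As a hedged sketch, one common interpretation scales every pixel value by 1.2 or 0.8 for brightness, and scales the distance of every pixel from mid-gray by the same factors for contrast. The function names, the `pivot` value of 128, and the list-of-rows grayscale image are assumptions made for this example.

```python
def clamp(value):
    """Round value and keep it inside the 0-255 pixel range."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(image, change=0.2):
    """Scale all pixels: change=0.2 is +20%, change=-0.2 is -20%."""
    factor = 1.0 + change
    return [[clamp(p * factor) for p in row] for row in image]


def adjust_contrast(image, change=0.2, pivot=128):
    """Stretch or shrink pixel distances from the pivot (mid-gray)."""
    factor = 1.0 + change
    return [[clamp(pivot + (p - pivot) * factor) for p in row] for row in image]


image = [[50, 100], [150, 200]]
print(adjust_brightness(image, 0.2))   # [[60, 120], [180, 240]]
print(adjust_brightness(image, -0.2))  # [[40, 80], [120, 160]]
print(adjust_contrast(image, 0.2))     # [[34, 94], [154, 214]]
print(adjust_contrast(image, -0.2))    # [[66, 106], [146, 186]]
```

With higher contrast, dark pixels get darker and light pixels get lighter. With lower contrast, all values move toward the pivot.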

After clicking Import, you are redirected to the Create Project page where you can check the import status. To remove an imported dataset from the list, click the bin icon in the Action column.

_images/custom_dataset_imported.png

All images were taken from ImageNet, Pascal Visual Object Classes, and Common Objects in Context datasets for demonstration purposes only.