Smart Mining & Manufacturing: Data Access and Pre-processing for Computer Vision Applications – Common Challenges

Recent developments in deep learning have significantly improved our ability to detect defects. In the Automated Vision-Based Inspection and Defect Detection post, we presented a webinar on how to perform anomaly detection using a one-class Support Vector Machine (SVM) on image data. In the next post, we showed how to both detect and localise a defect using a Deep Neural Network (DNN), specifically a Variational Autoencoder (VAE).

In this article, you will learn how to use MATLAB® to address some of the common challenges faced when developing deep learning-based approaches to detect and localise different types of anomalies. It covers data access and the pre-processing techniques used in our previous posts, including denoising, registration and intensity adjustment. Specifically, we discuss the four most common questions about data preparation:


  1. How do I access large data that might not fit into memory?
  2. How do I preprocess data and get the right features?
  3. How do I label my data faster?
  4. What if I have an imbalanced dataset or don’t have enough data?


1. How do I access large data that might not fit in memory?

When training deep neural networks, we work with large amounts of data, and often we cannot load all of it into memory. One way MATLAB handles this problem is with datastores. Rather than loading all the data into memory, a datastore loads data only as you need it, acting as a pointer to the data. MATLAB provides datastores for images, audio and files, and you can create a custom datastore if the existing types do not suit your data format.
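
The load-on-demand behaviour can be sketched with an image datastore. This is a minimal sketch rather than the workflow from our previous posts: it writes four tiny placeholder images to a temporary folder (an assumption made only so the example is self-contained) and then reads them back one at a time.

```matlab
% Create a small throwaway dataset so the sketch is self-contained.
% Assumption: a temporary folder stands in for a real inspection dataset.
dataDir = tempname;
mkdir(dataDir);
for k = 1:4
    placeholder = uint8(50 * k * ones(32, 32));   % flat grey image
    imwrite(placeholder, fullfile(dataDir, sprintf('img%d.png', k)));
end

% The datastore stores the file list, not the pixel data.
imds = imageDatastore(dataDir);
numFiles = numel(imds.Files);

% Read one image per iteration, so only one image is in memory at a time.
meanIntensity = zeros(numFiles, 1);
k = 0;
while hasdata(imds)
    img = read(imds);
    k = k + 1;
    meanIntensity(k) = mean(img(:));
end
```

With a real dataset you would point imageDatastore at your own image folder instead of the temporary one.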

Another feature you can use is bigimage, which allows out-of-core processing of very large images. A bigimage object represents a big image as smaller blocks of data that can be loaded and processed independently. This lets you process an image at multiple resolution levels (image pyramids), and read, write and set blocks in arbitrary regions of the image.

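The idea of processing an image as independent blocks can be sketched as follows. Note that this sketch does not use the bigimage methods: it swaps in blockproc, a different Image Processing Toolbox function, and a small synthetic in-memory image, purely to show what "one block at a time" means. The block size and the inversion operation are arbitrary choices for illustration.

```matlab
% Synthetic 256-by-256 horizontal gradient standing in for a very large image.
img = uint8(repmat(0:255, 256, 1));

% The function receives a block struct; the pixel values are in its data field.
% Here each block is simply inverted (an arbitrary illustrative operation).
invertBlock = @(blockStruct) 255 - blockStruct.data;

% Process the image in independent 64-by-64 blocks and reassemble the result.
out = blockproc(img, [64 64], invertBlock);
```

Because each block is handled on its own, the same pattern extends to images that are too large to hold in memory as a whole.
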

2. How do I preprocess data and get the right features?

MATLAB has a large number of interactive apps for technical computing tasks. The Image Processing apps let you automate common image processing workflows: you can interactively segment image data, compare image registration techniques and batch-process large datasets. They also let you explore images, 3D volumes and videos; adjust contrast; create histograms; and manipulate regions of interest.

Color Thresholder App

The Color Thresholder app lets you manipulate the colour components of an image in different colour spaces. You can view the image as a point cloud in the RGB, HSV, YCbCr and L*a*b* colour spaces. For colour-based segmentation, select the colour space that provides the best colour separation. Using this app, you can create a segmentation mask for a colour image.

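A colour-based mask of the kind the app produces can also be written by hand. The following is a minimal sketch, assuming a synthetic half-red, half-blue image and hypothetical HSV thresholds; in practice you would load your own image with imread and choose the thresholds interactively in the app.

```matlab
% Synthetic test image: left half red, right half blue.
rgb = zeros(64, 64, 3, 'uint8');
rgb(:, 1:32, 1) = 200;
rgb(:, 33:64, 3) = 200;

% Convert to HSV; all three channels are scaled to the range [0, 1].
hsv = rgb2hsv(rgb);
hue = hsv(:, :, 1);
sat = hsv(:, :, 2);
val = hsv(:, :, 3);

% Hypothetical thresholds for "red". Hue wraps around, so red sits near 0 and 1.
mask = (hue <= 0.05 | hue >= 0.95) & sat >= 0.5 & val >= 0.3;

% Apply the mask: pixels outside it are set to black.
segmented = rgb;
segmented(repmat(~mask, [1 1 3])) = 0;
```
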

Image Region Analyzer App

The Image Region Analyzer app measures a set of properties, such as area, eccentricity and major axis length, for each connected component (i.e. an object or region) and displays them in a table. You can use it to measure the properties of objects and to filter objects based on those properties.

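The same measurements are available programmatically. This is a minimal sketch, assuming a synthetic binary image with three objects and a hypothetical minimum area of 50 pixels as the filter criterion.

```matlab
% Synthetic binary image with three separate objects.
bw = false(100, 100);
bw(10:19, 10:19) = true;    % 10-by-10 square, area 100
bw(40:79, 30:89) = true;    % 40-by-60 rectangle, area 2400
bw(90, 95) = true;          % single isolated pixel, area 1

% Measure properties of each connected component and return them as a table.
stats = regionprops('table', bw, 'Area', 'Eccentricity', 'MajorAxisLength');

% Keep only objects whose area is at least minArea (hypothetical threshold).
minArea = 50;
largeOnly = bwareafilt(bw, [minArea Inf]);
```

Here the filter removes the isolated pixel and keeps the square and the rectangle.
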

Registration Estimator App

The Registration Estimator app lets you explore various algorithms for registering misaligned images. Aligning the images first lets the model being trained focus on the surface features of the device being assessed, rather than on its position or orientation.

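One of the approaches you can explore is intensity-based registration, which can be sketched in code as below. The assumptions are loud ones: the demo image cameraman.tif stands in for a part under inspection, the misalignment (a 5 degree rotation and a small shift) is simulated, and a rigid transform is assumed to be sufficient. This is not necessarily the configuration used in our previous posts.

```matlab
% Reference image and a simulated misaligned copy of it.
fixed = imread('cameraman.tif');
moving = imrotate(fixed, 5, 'bilinear', 'crop');   % rotate, keep original size
moving = imtranslate(moving, [4 -3]);              % shift in x and y

% Default optimiser and metric for images from the same sensor.
[optimizer, metric] = imregconfig('monomodal');

% Estimate a rigid transform (rotation plus translation) and resample.
registered = imregister(moving, fixed, 'rigid', optimizer, metric);

% Mean absolute difference to the reference, before and after registration.
% errAfter is expected to be lower than errBefore if the optimiser converges.
errBefore = mean(abs(double(moving(:)) - double(fixed(:))));
errAfter = mean(abs(double(registered(:)) - double(fixed(:))));
```
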

3. How do I label my data faster?

MATLAB has several interactive tools that accelerate data labelling for image, signal, video and automotive datasets. You can then use the labelled data to train or validate algorithms such as image classifiers, object detectors, semantic segmentation networks and other deep learning applications.


4. What if I have an imbalanced dataset or don’t have enough data?

Instead of collecting new data, a very common strategy is to generate additional data from your original dataset. This process is called data augmentation. Image applications are a good example of a domain where augmentation makes sense and can significantly improve results.

Image augmentation applies random geometric transformations (scaling, rotation, translation, etc.) to the images in the dataset. In addition, you can perform colour transformations such as hue jitter and contrast jitter. The augmented images can then be written to disk as new files and added to your dataset to minimise the effects of having a small or unbalanced dataset.

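The augmentation steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the demo image peppers.png stands in for one image from your dataset, the transformation ranges and the number of copies are hypothetical, and the output goes to a temporary folder. imageDataAugmenter requires Deep Learning Toolbox and jitterColorHSV requires Image Processing Toolbox.

```matlab
% Random geometric transformations (all ranges are hypothetical).
augmenter = imageDataAugmenter( ...
    'RandRotation', [-20 20], ...        % degrees
    'RandScale', [0.8 1.2], ...
    'RandXTranslation', [-10 10], ...    % pixels
    'RandYTranslation', [-10 10]);

img = imread('peppers.png');             % stand-in for one dataset image

% Assumption: augmented copies are written to a temporary folder.
outDir = tempname;
mkdir(outDir);

numCopies = 5;
for k = 1:numCopies
    % Geometric augmentation, then hue and contrast jitter.
    augImg = augment(augmenter, img);
    augImg = jitterColorHSV(augImg, 'Hue', [-0.05 0.05], 'Contrast', [0.8 1.2]);
    imwrite(augImg, fullfile(outDir, sprintf('aug_%02d.png', k)));
end

writtenFiles = dir(fullfile(outDir, '*.png'));
```

For an imbalanced dataset, one option is to generate more augmented copies for the under-represented classes than for the others.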