Sasken - Blog

Advanced Architectures for Image Recognition

Written by blog | Oct 5, 2017 8:55:16 AM

Today, Machine Learning, Artificial Intelligence, Deep Learning, Big Data, and Analytics are talked about quite often. However, Advanced Architecture is fading from the conversation and is not given much significance.

When researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) created an app called Pic2Recipe that identified recipes from pictures of food, everyone appreciated the effort put into AI and Deep Learning. However, very few paid attention to the Advanced Architecture behind it.

Advanced Architecture is required for solving problems related to images and voice. One should understand the concept of computer vision before moving on to Deep Learning’s advanced architectures.

Deep Learning has been used by Google in its voice and image recognition algorithms, and by Netflix and Amazon to decide what people want to watch or buy next. It has also been used by researchers and solution-providing companies to make predictions. Because neural networks are flexible, Deep Learning covers a varied set of models, and that flexibility makes it possible to build a full-fledged end-to-end model.

Computer vision is the building of artificial systems that can gather information from images and automatically extract visual information from multi-dimensional data. It focuses on the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. Computer vision includes tasks such as Object Recognition, Classification, Localization, Identification, Detection, Content-based Image Retrieval, Image Segmentation, and more.

Advanced Architecture is used for solving complex problems, especially those related to image recognition. We have moved towards the study of Deep Learning and Advanced Architecture. These architectures use ReLU as the nonlinearity function, along with data augmentation techniques that comprise various reflections, patch extractions, and image translations.
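
To make the terms above a little more concrete, here is a minimal sketch in plain Python of ReLU and of the three augmentation techniques mentioned (reflection, patch extraction, and translation). It treats an image as a list of rows of pixel values. The function names are illustrative assumptions and are not taken from any particular library. Real pipelines would normally use a deep learning framework instead.

```python
import random


def relu(values):
    """ReLU nonlinearity: negative values become 0, the rest pass through."""
    return [max(0.0, v) for v in values]


def reflect_horizontal(image):
    """Reflection: mirror every row of the image left to right."""
    return [list(reversed(row)) for row in image]


def extract_patch(image, top, left, size):
    """Patch extraction: cut a size x size patch starting at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def random_patch(image, size, rng=random):
    """Pick the patch position at random, as is common during training."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return extract_patch(image, top, left, size)


def translate_right(image, shift, fill=0):
    """Translation: shift the image to the right, padding the left with `fill`."""
    width = len(image[0])
    return [([fill] * shift + row)[:width] for row in image]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    print(relu([-1.5, 0.0, 2.0]))
    print(reflect_horizontal(image))
    print(extract_patch(image, 1, 1, 2))
    print(translate_right(image, 1))
```

Each augmentation produces a slightly different view of the same picture, so the network sees more variety than the original training set contains.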

The following are eight important architectures and their brief descriptions:
1. Generative Adversarial Network - GAN
A GAN uses neural networks to generate entirely new images that are not present in the training data set but are realistic enough to belong to it.

2. GoogLeNet or Inception Network
In this architecture, along with going deeper (it contains 22 layers), the researchers also introduced a novel approach called the Inception module. GoogLeNet was the winner of ImageNet 2014, where it proved to be a powerful model.

3. R-CNN or Regions with CNN features
R-CNN attempts to draw a bounding box around every object present in the image. It then recognizes which object is in each box.

4. Residual Networks or ResNet
ResNet consists of multiple subsequent residual modules, which are the basic building blocks of the ResNet architecture. A residual module has two options: it can either perform a set of functions on the input or skip this step altogether.

5. ResNeXt
ResNeXt builds upon the concepts of Inception and ResNet to bring about a new and improved architecture.

6. SegNet
SegNet consists of a sequence of processing layers (encoders) followed by a corresponding set of decoders for a pixel-wise classification.

7. SqueezeNet
SqueezeNet architecture is useful in low bandwidth scenarios like mobile platforms. [12]

8. Visual Geometry Group Net or VGG Net
VGG Net is pyramidal in shape: the bottom layers, which are closer to the image, are wide, and the top layers are deep.
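
As a rough illustration of the residual module described under ResNet (item 4), the sketch below assumes the usual formulation in which the module's output is F(x) + x, where F is a small stack of layers and x is carried around it by a skip connection. The two-layer F, the helper names, and the tiny weight matrices are assumptions made for this example only. They do not represent the actual ResNet configuration.

```python
def relu(values):
    """ReLU nonlinearity applied element by element."""
    return [max(0.0, v) for v in values]


def linear(values, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_module(x, weights1, bias1, weights2, bias2):
    """Return relu(F(x) + x), where F is two linear layers with a ReLU between."""
    hidden = relu(linear(x, weights1, bias1))
    fx = linear(hidden, weights2, bias2)
    # Skip connection: the input is added back to the transformed signal.
    return relu([f + xi for f, xi in zip(fx, x)])


if __name__ == "__main__":
    x = [1.0, 2.0]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    no_bias = [0.0, 0.0]
    # With all-zero weights F(x) is 0, so the module just passes x through.
    print(residual_module(x, zeros, no_bias, zeros, no_bias))
```

When the weights of F are all zero, the module hands its (non-negative) input through unchanged, which corresponds to the "skip this step" option. With non-zero weights it performs its set of functions and adds the result to the input.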

Author: Mohit Sharma, Manager – Digital Transformation Services