Image Overview

An image is made up of pixels. Each pixel has an RGB value associated with it, with each channel ranging from 0 to 255. Nearly any color can be produced by combining red, green and blue in some proportion. For example, the RGB values of red and yellow are [255,0,0] and [255,255,0].
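As a small illustration of the idea above, an image can be held as a grid of [R, G, B] triples. This is only a sketch using plain Python lists; the 2x2 image, the extra blue and white pixels, and the `to_grayscale` helper are assumptions made for the example.

```python
# A tiny 2x2 RGB image: image[row][col] = [R, G, B], each channel 0-255.
RED = [255, 0, 0]
YELLOW = [255, 255, 0]
BLUE = [0, 0, 255]
WHITE = [255, 255, 255]

image = [
    [RED, YELLOW],
    [BLUE, WHITE],
]

def to_grayscale(pixel):
    """Collapse one RGB pixel to a single value with an unweighted average."""
    r, g, b = pixel
    return (r + g + b) // 3

height = len(image)
width = len(image[0])
gray = [[to_grayscale(pixel) for pixel in row] for row in image]

print(height, width)   # image dimensions in pixels
print(image[0][1])     # the yellow pixel
print(gray)            # one value per pixel instead of three
```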

Filter/Kernel Overview

Suppose we want to find the edges of an image or blur it. There are DSP filters that we can apply to get these results, and we may have seen them on our mobile phones. Similarly, convolution filters can be applied to any image to extract and store its features. Convolution filters are initialized with random values and are later adjusted by the backpropagation algorithm.
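To make the filter idea concrete, here is a minimal sketch that slides two hand-picked 3x3 kernels over a small grayscale image. The kernels (a Prewitt-style vertical edge detector and a mean blur), the test image and the `apply_filter` helper are assumptions chosen for illustration; in a CNN the kernel values would be learned rather than fixed.

```python
def apply_filter(image, kernel):
    """Slide a square kernel over a 2-D image one pixel at a time (no padding).

    Like most CNN libraries, this does not flip the kernel.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]

# 4x6 image: dark on the left (0), bright on the right (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(4)]

# Responds where brightness changes from left to right.
vertical_edge = [[-1, 0, 1] for _ in range(3)]

# Replaces each pixel by the mean of its 3x3 neighbourhood.
blur = [[1 / 9] * 3 for _ in range(3)]

edges = apply_filter(image, vertical_edge)
blurred = apply_filter(image, blur)

print(edges)    # large values only around the dark/bright boundary
print(blurred)  # the sharp step becomes a gradual ramp
```

The edge kernel returns zero in the flat regions and a large value at the boundary, while the blur kernel smooths the boundary out.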

Convolution and Stride

Stride is the number of pixels by which the filter shifts over the input matrix. When the stride is 1, the filter moves 1 pixel at a time. When the stride is 2, the filter moves 2 pixels at a time, and so on.

Output size = ((N - F + 2P) / Stride) + 1

Here N is the input image size (N = 4 for a 4X4 image), F is the filter size (F = 2 for a 2X2 filter) and P is the padding. Refer to Example-1 for a convolution without padding and Example-2 for one with padding.
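The output-size formula can be checked against an actual strided, padded convolution. The sketch below assumes a square input and filter, zero padding, and floor division when the stride does not divide evenly; the 4X4 and 2X2 sizes come from the text, but the pixel values and helper names are made up for the example.

```python
def output_size(n, f, padding=0, stride=1):
    """((N - F + 2P) / Stride) + 1, rounded down."""
    return (n - f + 2 * padding) // stride + 1

def pad(image, p):
    """Surround a 2-D list with p rows and columns of zeros."""
    width = len(image[0]) + 2 * p
    top = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in image]
    bottom = [[0] * width for _ in range(p)]
    return top + middle + bottom

def convolve(image, kernel, padding=0, stride=1):
    """Slide a square kernel over a square image with the given stride."""
    image = pad(image, padding)
    n, f = len(image), len(kernel)
    out = []
    for i in range(0, n - f + 1, stride):
        row = []
        for j in range(0, n - f + 1, stride):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(f) for b in range(f)))
        out.append(row)
    return out

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 1],
          [1, 1]]

no_padding = convolve(image, kernel, padding=0, stride=1)
strided = convolve(image, kernel, padding=0, stride=2)
padded = convolve(image, kernel, padding=1, stride=1)

print(len(no_padding), output_size(4, 2, padding=0, stride=1))
print(len(strided), output_size(4, 2, padding=0, stride=2))
print(len(padded), output_size(4, 2, padding=1, stride=1))
```

In each case the side length of the computed output agrees with the value the formula predicts.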

Example-1

Example-2

Pooling Layers

Pooling layers reduce the spatial size of the feature maps, and with it the number of parameters in later layers, when the images are too large. Max pooling takes the largest element from each window of the rectified feature map. Average pooling takes the average of the values in each window. Sum pooling takes the sum of all elements in each window.
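The three pooling variants differ only in how each window is reduced to one number. Here is a minimal sketch, assuming a 2x2 window with stride 2; the feature-map values and the `pool` helper are illustrative.

```python
def pool(feature_map, reducer, size=2, stride=2):
    """Reduce each size x size window of a 2-D feature map with `reducer`."""
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reducer(window))
        out.append(row)
    return out

def mean(values):
    return sum(values) / len(values)

feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 4, 3, 8]]

max_pooled = pool(feature_map, max)
avg_pooled = pool(feature_map, mean)
sum_pooled = pool(feature_map, sum)

print(max_pooled)  # largest value in each window
print(avg_pooled)  # average of each window
print(sum_pooled)  # sum of each window
```

Each variant turns the 4x4 map into a 2x2 map, so the layers that follow have a quarter as many inputs to handle.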

ReLU Operation

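The purpose of ReLU (Rectified Linear Unit) is to introduce non-linearity into the network, since the real-world data we want the network to learn is mostly non-linear. ReLU computes f(x) = max(0, x): negative values are replaced by zero and non-negative values pass through unchanged.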
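A short sketch of ReLU applied element by element to a feature map. The sample values are made up, and the last lines show one way to see that the function is not linear.

```python
def relu(x):
    """Rectified Linear Unit: keep positive values, zero out the rest."""
    return max(0, x)

feature_map = [[-3, 2],
               [0, -1.5]]

rectified = [[relu(value) for value in row] for row in feature_map]
print(rectified)

# A linear function f would satisfy f(a + b) == f(a) + f(b).
a, b = -1, 1
print(relu(a + b), relu(a) + relu(b))  # the two results differ
```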

Convolution Layers

This is the core building block of a CNN. The parameters of this layer consist of a set of learnable filters. During a forward pass, we slide (convolve) each filter across the width and height of the input, covering its full depth, and compute the dot product between the filter and the input at each position. The output for each filter is an activation map. We stack these activation maps to produce the output of the layer.
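The forward pass described above can be sketched in plain Python. This is a simplified illustration, assuming stride 1, no padding and no bias term; the sizes (a 5x5x3 input and four 3x3x3 filters) and the name `conv_layer_forward` are chosen for the example.

```python
import random

def conv_layer_forward(volume, filters):
    """Forward pass of a convolution layer (stride 1, no padding, no bias).

    volume:  nested list of shape [H][W][C]
    filters: list of K filters, each of shape [F][F][C]
    returns: nested list of shape [H-F+1][W-F+1][K], one activation map per filter
    """
    h, w, c = len(volume), len(volume[0]), len(volume[0][0])
    f = len(filters[0])
    out_h, out_w = h - f + 1, w - f + 1
    output = [[[0.0] * len(filters) for _ in range(out_w)] for _ in range(out_h)]
    for k, filt in enumerate(filters):
        for i in range(out_h):
            for j in range(out_w):
                # Dot product between the filter and the patch under it.
                output[i][j][k] = sum(
                    volume[i + a][j + b][ch] * filt[a][b][ch]
                    for a in range(f) for b in range(f) for ch in range(c)
                )
    return output

random.seed(0)
H, W, C, F, K = 5, 5, 3, 3, 4

volume = [[[random.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
# Filters start out random; training would adjust them with backpropagation.
filters = [[[[random.uniform(-1, 1) for _ in range(C)] for _ in range(F)]
            for _ in range(F)] for _ in range(K)]

activation = conv_layer_forward(volume, filters)
print(len(activation), len(activation[0]), len(activation[0][0]))
```

The output depth equals the number of filters, because each filter contributes one activation map to the stack.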

LeNet-5 (1998)

LeNet-5 is a 7-layer convolutional network developed by LeCun in 1998 to classify digits. It was used by several banks to recognize handwritten numbers on cheques. The input is a 32X32 pixel grayscale image.
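To see how a 32X32 input shrinks through the network, the sketch below applies the usual output-size rule, ((N - F) / Stride) + 1 with no padding, layer by layer. The layer names, filter sizes and map counts are not given in this text; they follow the commonly cited description of LeNet-5 and should be read as assumptions.

```python
def out_size(n, f, stride=1, padding=0):
    """Output side length of a convolution or pooling window."""
    return (n - f + 2 * padding) // stride + 1

# (name, window size, stride, number of feature maps)
# Assumed layout: commonly cited LeNet-5 description.
spatial_layers = [
    ("C1 convolution", 5, 1, 6),
    ("S2 pooling", 2, 2, 6),
    ("C3 convolution", 5, 1, 16),
    ("S4 pooling", 2, 2, 16),
    ("C5 convolution", 5, 1, 120),
]

size = 32  # 32X32 grayscale input
shapes = []
for name, window, stride, maps in spatial_layers:
    size = out_size(size, window, stride)
    shapes.append((name, size, size, maps))

# The last two layers are fully connected.
dense_layers = [("F6 fully connected", 84), ("Output", 10)]

for name, height, width, maps in shapes:
    print(f"{name}: {height}x{width}x{maps}")
for name, units in dense_layers:
    print(f"{name}: {units} units")
```

Counting the five spatial layers and the two fully connected layers gives the seven layers mentioned above.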

CNN model using the Keras and TensorFlow libraries

# Import the Keras building blocks used by the model
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Initialise the CNN as a linear stack of layers
classifier = Sequential()

# Step 1 - Convolution: 8 filters of size 3x3 over a 64x64 RGB input
classifier.add(Conv2D(8, (3, 3), input_shape=(64, 64, 3), activation='relu'))

# Step 2 - Pooling: 2x2 max pooling halves the width and height
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# Second convolution and pooling block
classifier.add(Conv2D(8, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# Step 3 - Flattening: turn the feature maps into a single vector
classifier.add(Flatten())

# Step 4 - Full connection: one hidden layer, then a single sigmoid unit
# for binary classification
classifier.add(Dense(units=32, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))

# Compile the CNN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
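The shapes and parameter counts of the model above can be worked out by hand without running Keras. This is only a sketch: it assumes the Keras defaults of no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size for MaxPooling2D. The helper `conv_out` is introduced just for this calculation.

```python
def conv_out(n, f, stride=1, padding=0):
    """Output side length: ((N - F + 2P) / Stride) + 1, rounded down."""
    return (n - f + 2 * padding) // stride + 1

size, channels = 64, 3   # input_shape = (64, 64, 3)
params = []

# Conv2D(8, (3, 3)): each filter has 3*3*channels weights plus one bias.
params.append((3 * 3 * channels + 1) * 8)
size, channels = conv_out(size, 3), 8        # 62 x 62 x 8

# MaxPooling2D(2, 2): no parameters, halves width and height.
size = conv_out(size, 2, stride=2)           # 31 x 31 x 8

# Second Conv2D(8, (3, 3)) and pooling.
params.append((3 * 3 * channels + 1) * 8)
size = conv_out(size, 3)                     # 29 x 29 x 8
size = conv_out(size, 2, stride=2)           # 14 x 14 x 8

# Flatten, then Dense(32) and Dense(1): inputs * units weights plus biases.
flat = size * size * channels
params.append((flat + 1) * 32)
params.append((32 + 1) * 1)

total = sum(params)
print(flat, params, total)
```

If those defaults apply, the total should agree with the parameter count reported by `classifier.summary()`.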

MLAI

Friday 16 April 2021

CNN (convolutional neural network) Basic