We all love to see beautiful images, but have you ever wondered how a computer sees an image? In this tutorial, we will explain how images are stored in a computer, using an example from the MNIST dataset to understand how computers interpret images.
Let’s get started,
A computer interprets any grayscale image as an array: a grid of values where each cell is called a pixel, and each pixel holds a numerical value.
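To make this concrete, here is a minimal sketch using NumPy. The tiny 5×5 "image" below is made up for illustration; each entry is one pixel's brightness in 8-bit grayscale, where 0 is black and 255 is white.

```python
import numpy as np

# A tiny 5x5 "image": each entry is one pixel's brightness,
# 0 = black, 255 = white (8-bit grayscale).
image = np.array([
    [  0,   0, 255,   0,   0],
    [  0, 255, 255, 255,   0],
    [255, 255, 255, 255, 255],
    [  0, 255, 255, 255,   0],
    [  0,   0, 255,   0,   0],
], dtype=np.uint8)

print(image.shape)   # (5, 5) -> a grid of 25 pixels
print(image[2, 2])   # 255 -> the brightest possible pixel
```

An MNIST digit works the same way, just with a 28 × 28 grid instead of 5 × 5.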
To use this grayscale image in our Machine Learning model, we first have to normalize the pixel values.
Pre-Processing the Data
Normalization is a method used in Machine Learning to bring the features of a dataset onto the same scale. When you normalize a feature, all of its values fall in the range 0 to 1, which helps your algorithm train better. We normalize because Neural Networks rely on gradient calculations, and similarly scaled inputs keep those gradients well behaved. Data normalization is also often done by subtracting the mean (the average of all pixel values) from each pixel and then dividing the result by the standard deviation of all the pixel values. Sometimes you'll see an approximation here, where a mean and standard deviation of 0.5 are used to center the pixel values.
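The mean/standard-deviation variant and the 0.5 approximation can be sketched as follows; the small pixel array here is invented purely for illustration.

```python
import numpy as np

# A made-up handful of pixel values for illustration.
pixels = np.array([[0, 64], [128, 255]], dtype=np.float32)

# Subtract the mean and divide by the standard deviation.
standardized = (pixels - pixels.mean()) / pixels.std()

# The common approximation: scale to [0, 1] first, then
# center with an assumed mean and std of 0.5.
scaled = pixels / 255.0
approx = (scaled - 0.5) / 0.5   # values now lie in [-1, 1]

print(round(float(standardized.mean()), 6))  # ~0 after centering
print(float(approx.min()), float(approx.max()))
```

After the exact version, the values have zero mean; after the 0.5 approximation, they lie in the range −1 to 1.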
Here, our pixel values range from 0 to 255. To scale them into the range 0 to 1, we divide each pixel value by 255 to get our new pixel values.
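This scaling step is a one-liner; the sample values below are chosen only to make the arithmetic easy to check.

```python
import numpy as np

# A few sample 8-bit pixel values.
pixels = np.array([0, 51, 102, 255], dtype=np.float32)

# Divide by the maximum possible value to map [0, 255] onto [0, 1].
normalized = pixels / 255.0
print(normalized)  # [0.  0.2 0.4 1. ]
```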
Converting the image array into a vector form. We cannot feed the 2-D array values directly to our model; the array has to be flattened into a vector.
Ex: 28 × 28 = 784 pixel values
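The flattening step can be sketched like this; the random image stands in for a real MNIST digit, which has the same 28 × 28 shape.

```python
import numpy as np

# An MNIST-sized image: 28 x 28 grayscale pixels (random here for illustration).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Flatten the 2-D grid into a 1-D vector the model can consume.
vector = image.flatten()
print(vector.shape)  # (784,)
```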
This is how computers interpret images, and by applying pre-processing techniques like normalization and flattening, we can use them in our machine learning models.