Recognizing handwritten digits is a classic machine learning task: trivial for humans, yet a real challenge for computers. In this blog post, we’ll work through handwritten digit recognition using the famous MNIST dataset, using Keras and TensorFlow to build and train a neural network that classifies handwritten digits.
# Importing necessary libraries
from keras.datasets import mnist
import matplotlib.pyplot as plt
First, we’ll load the MNIST dataset using Keras and preprocess the images to prepare them for training. The dataset contains 60,000 training images and 10,000 test images, each a 28×28 grayscale picture of a single digit. Visualizing a sample image gives a feel for the task at hand.
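# Load the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Display an image
plt.imshow(X_train[0], cmap="gray")
plt.show()
# Flatten to 784-vectors, scale pixels to [0, 1] (assumed), and one-hot encode the labels
from keras.utils import to_categorical
X_train = X_train.reshape(-1, 784).astype("float32") / 255
X_test = X_test.reshape(-1, 784).astype("float32") / 255
Y_train, Y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)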
Next, we’ll construct our neural network architecture using Keras. The network is simple yet powerful: a densely connected hidden layer of 512 units with ReLU activation, followed by a 10-unit softmax output layer that produces one probability per digit class.
from keras.models import Sequential
from keras.layers import Dense
# Create the model
model = Sequential()
model.add(Dense(512, input_shape=(784,), activation="relu"))
model.add(Dense(10, activation="softmax"))
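To get a feel for the size of this network, the sketch below counts the trainable parameters of each dense layer by hand. The helper name `dense_param_count` is hypothetical and not part of Keras; it only applies the usual rule that a dense layer has one weight per input-unit pair plus one bias per unit.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes taken from the model above: 784 inputs -> 512 hidden -> 10 outputs
hidden_params = dense_param_count(784, 512)
output_params = dense_param_count(512, 10)
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

Almost all of the parameters sit in the hidden layer, so changing its width is the quickest way to grow or shrink this model.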
With our neural network architecture in place, we’ll proceed to train the model using the training data. We’ll monitor the training process, evaluate the model’s performance on validation data (here, the MNIST test set), and track accuracy and loss to gauge its effectiveness.
# Compile the model
model.compile(loss="categorical_crossentropy", metrics=['accuracy'], optimizer='adam')
# Train the model
hist = model.fit(X_train, Y_train, batch_size=64, epochs=10, verbose=1, validation_data=(X_test, Y_test))
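The `categorical_crossentropy` loss compares a one-hot label with the probabilities coming out of the softmax layer. The following is a minimal plain-Python sketch of that idea, not Keras’s actual implementation; the function names and the example scores are assumptions made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps guards against taking log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_prob))

# Made-up scores for illustration: the network strongly favours class 3
logits = [0.0] * 10
logits[3] = 5.0
probs = softmax(logits)

loss_correct = categorical_crossentropy(one_hot(3), probs)  # true label is 3
loss_wrong = categorical_crossentropy(one_hot(7), probs)    # true label is 7
print(loss_correct, loss_wrong)
```

The loss is small when the network puts most of its probability on the true class and large when it does not, which is the signal the optimizer uses during training.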
Once the model is trained, we’ll analyze its performance on unseen test data and visualize the training and validation metrics using plots. We’ll explore avenues for fine-tuning the model to improve its accuracy and generalization capabilities.
# Plotting model accuracy
plt.plot(hist.history['accuracy'])
plt.plot(hist.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
# Plotting model loss
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
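Besides looking at the plots, the same curves can be read programmatically. The sketch below assumes a dictionary with the same `loss` and `val_loss` keys as `hist.history`; the helper names are hypothetical and the numbers are placeholders, not results from this model.

```python
def best_epoch(history: dict) -> int:
    """Return the 1-based epoch with the lowest validation loss."""
    val_loss = history["val_loss"]
    return min(range(len(val_loss)), key=val_loss.__getitem__) + 1

def overfitting_gap(history: dict) -> list:
    """Per-epoch difference between validation loss and training loss."""
    return [v - t for t, v in zip(history["loss"], history["val_loss"])]

# Placeholder values shaped like hist.history; NOT results from the model above
example_history = {
    "loss":     [0.40, 0.20, 0.12, 0.08, 0.05],
    "val_loss": [0.30, 0.18, 0.15, 0.16, 0.19],
}

print(best_epoch(example_history))
print(overfitting_gap(example_history))
```

A validation loss that starts rising while the training loss keeps falling is the usual sign of overfitting, and the epoch returned by `best_epoch` is a reasonable point to stop training.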
After training the model, it’s crucial to assess its performance on unseen data to ensure its generalization ability. Testing the model on the test dataset provides valuable insights into its real-world effectiveness in classifying handwritten digits. By evaluating metrics such as loss and accuracy on the test set, we can ascertain how well the model performs on new, previously unseen examples.
import random
import numpy as np
# Predict class probabilities for the flattened test images
pred = model.predict(X_test)
# Reload the raw 28x28 test images for display
(_, _), (x_test, y_test) = mnist.load_data()
fig = plt.figure(figsize=(10,10))
fig.suptitle("Some examples of images of the dataset", fontsize=16)
# Show 25 random test images with the predicted and correct labels
for i in range(25):
    j = random.randrange(0, len(pred))
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_test[j], cmap=plt.cm.binary)
    plt.xlabel(f' Pred: {np.argmax(pred[j])} Corr: {y_test[j]} ')
plt.show()
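Spot-checking a handful of images is useful, but a single number for the whole test set is easier to compare between runs. Here is a minimal plain-Python sketch of test accuracy and a confusion matrix computed from predicted probabilities; the helper names and the toy data are assumptions for illustration only.

```python
def argmax(values):
    """Index of the largest value (plain-Python stand-in for np.argmax)."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy(pred_probs, true_labels):
    """Fraction of samples whose most probable class equals the true label."""
    correct = sum(argmax(p) == t for p, t in zip(pred_probs, true_labels))
    return correct / len(true_labels)

def confusion_matrix(pred_probs, true_labels, num_classes):
    """matrix[t][p] counts samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred_probs, true_labels):
        matrix[t][argmax(p)] += 1
    return matrix

# Tiny made-up example with 3 classes (the real task has 10)
toy_pred = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.1, 0.1, 0.8]]
toy_true = [0, 1, 2, 2]

print(accuracy(toy_pred, toy_true))
print(confusion_matrix(toy_pred, toy_true, 3))
```

With the arrays from this post, the same accuracy can be obtained with `np.mean(np.argmax(pred, axis=1) == y_test)`, and Keras’s `model.evaluate(X_test, Y_test)` reports the loss and accuracy in one call.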
This model, trained on the MNIST dataset and implemented using Keras and TensorFlow, ranked 229th in a Kaggle competition with an accuracy of 99.3%.
In conclusion, this blog post walks through building a neural network for handwritten digit recognition using Keras and TensorFlow. By following the step-by-step guide and experimenting with different network architectures and parameters, readers can deepen their understanding of deep learning and apply the same techniques to similar classification tasks. The code for the model above is in my GitHub repository.