<h1 style="text-align:center;">Neural Networks using PyTorch: Recognition of Handwritten Digits</h1>
<p style="text-align:center;">
Nazar Khan
<br>CVML Lab
<br>University of The Punjab
</p>

PyTorch is a flexible deep learning framework with easy-to-use tools for building and training neural networks. In this tutorial, we’ll train a small network to recognize handwritten digits by walking through the following steps:

1. Loading and preparing data (using `torchvision`).
2. Defining the neural network architecture.
3. Specifying the loss function and optimizer.
4. Training the model.
5. Evaluating the model.
6. Visualizing performance.



### Step 0: Installing PyTorch
First, make sure you have PyTorch and Torchvision installed. You can install them by running:

```bash
pip install torch torchvision
```

Install the Torchmetrics and Seaborn packages as well:

```bash
pip install seaborn torchmetrics
```

I work on Ubuntu. For me, `torch` and `torchvision` did not run correctly. There was an issue with the Math Kernel Library (MKL). I was able to solve it by making a new Python environment with Python version 3.12 and using OpenBLAS instead of MKL. I named this new environment `cvml`. I used the following commands:

```bash
conda create --name cvml python=3.12
conda activate cvml
conda install numpy matplotlib scikit-learn seaborn ipython ipykernel blas=*=openblas -c conda-forge
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip3 install torchmetrics
```
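
To confirm that the installation works, a quick sanity check such as the following can help. It is only a sketch: it prints the installed version, runs one small matrix product (the kind of operation that exercises the underlying BLAS library), and reports whether a GPU is visible. The variable names are illustrative.

```python
import torch

# Print the installed PyTorch version
print("PyTorch version:", torch.__version__)

# A small matrix product; if the linear algebra backend is broken,
# problems often show up on operations like this one
x = torch.rand(3, 3)
y = x @ x
print("Matrix product shape:", tuple(y.shape))

# Report whether a CUDA-capable GPU is visible (False is fine for this tutorial)
print("CUDA available:", torch.cuda.is_available())
```

If this script runs without errors, the rest of the tutorial should run on the CPU.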

**<center>If you encounter any problems installing or successfully running `torch` or `torchvision`, please post on the Google Classroom immediately.</center>**

### Step 1: Loading the Data
In this tutorial, we’ll use the `MNIST` dataset, which consists of 28x28 grayscale images of handwritten digits (0-9).
The `MNIST` dataset is available as part of the larger `EMNIST` dataset. Please read [https://biometrics.nist.gov/cs_links/EMNIST/Readme.txt](https://biometrics.nist.gov/cs_links/EMNIST/Readme.txt) carefully to understand the dataset. You **must always** understand the data that you're working with.

`torchvision` provides utilities to load and preprocess datasets.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
from torchmetrics import ConfusionMatrix
import seaborn as sns

# Set device to GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Define transformations for the dataset (normalize the images)
transform = transforms.Compose([
    transforms.ToTensor(),  # Convert to tensor
    transforms.Normalize((0.5,), (0.5,))  # Normalize to range [-1, 1]
])

# Download and load the 'mnist' split of the EMNIST dataset
train_dataset = torchvision.datasets.EMNIST(root='./data', split='mnist', train=True, download=True, transform=transform)
test_dataset = torchvision.datasets.EMNIST(root='./data', split='mnist', train=False, download=True, transform=transform)

# Data loaders
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)

# Visualize some training images
def visualize_samples():
    examples = iter(train_loader)
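    example_data, example_targets = next(examples)  # Fetch one batch of images and labels

    for i in range(6):
        plt.subplot(2, 3, i+1)
        # EMNIST images are stored transposed, so swap the two image axes for display
        plt.imshow(torch.transpose(example_data[i][0], 1, 0), cmap='gray')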
        plt.title(f'Label: {example_targets[i].item()}')
        plt.axis('off')
    plt.show()

visualize_samples()
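
To see what the two transforms do numerically, here is a small sketch of the arithmetic. `ToTensor()` scales pixel values to the range [0, 1], and `Normalize((0.5,), (0.5,))` then computes `(x - mean) / std` per channel. The pixel values below are made up for illustration and are not taken from the dataset.

```python
import torch

# Hypothetical pixel intensities after ToTensor(), i.e. already in [0, 1]
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

# Same mean and std as in transforms.Normalize((0.5,), (0.5,))
mean, std = 0.5, 0.5

# Normalize computes (x - mean) / std, mapping [0, 1] onto [-1, 1]
normalized = (pixels - mean) / std
print(normalized)

# The mapping can be inverted to recover the original intensities
restored = normalized * std + mean
print(restored)
```

Black pixels (0) map to -1, mid-gray (0.5) maps to 0, and white pixels (1) map to 1.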



### Step 2: Defining the Neural Network Architecture
Now, let’s define a simple feedforward neural network using `nn.Module`. In PyTorch, `nn.Module` is the base class for all neural network components, such as layers and models. It provides the foundational structure for building and organizing neural networks in PyTorch, handling parameter management, forward passes, and modularity.

Our network will have an input layer of 784 values (one per pixel of the flattened 28x28 image), one hidden layer of 128 ReLU units, and an output layer of 10 units, one for each digit class (0-9).

In [None]:
class SimpleNeuralNet(nn.Module):
    def __init__(self, input_size=28*28, hidden_size=128, num_classes=10):
        super(SimpleNeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # First fully connected layer
        self.relu = nn.ReLU()  # Activation function
        self.fc2 = nn.Linear(hidden_size, num_classes)  # Second fully connected layer (output)

    def forward(self, x):
        x = x.view(-1, 28*28)  # Flatten the input
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
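
To check the layer sizes, the following sketch builds an equivalent network with `nn.Sequential`, lists its parameter tensors, and pushes a dummy batch through it. The name `sketch_model` and the random input are assumptions made for illustration; the layer sizes match the defaults of `SimpleNeuralNet`.

```python
import torch
import torch.nn as nn

# An equivalent stack of layers: flatten, 784 -> 128, ReLU, 128 -> 10
sketch_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# nn.Module tracks every learnable tensor (weights and biases) for us
for name, param in sketch_model.named_parameters():
    print(name, tuple(param.shape))

# 784*128 + 128 weights and biases in the first layer, 128*10 + 10 in the second
total_params = sum(p.numel() for p in sketch_model.parameters())
print("Total parameters:", total_params)

# A random batch shaped like MNIST input: 64 images, 1 channel, 28x28 pixels
dummy_batch = torch.randn(64, 1, 28, 28)
out_shape = sketch_model(dummy_batch).shape
print("Output shape:", tuple(out_shape))
```

Each image yields 10 output scores, one per digit class.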

### Step 3: Specifying the Loss Function and Optimizer
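Training in PyTorch requires a loss function and an optimizer. We’ll use **cross-entropy loss** for classification and the **Adam** optimizer. Note that `nn.CrossEntropyLoss` expects raw, unnormalized scores (logits), which is why the network's `forward` method ends with a linear layer and no softmax.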



In [None]:
# Instantiate the model, loss function, and optimizer
model = SimpleNeuralNet().to(device)
criterion = nn.CrossEntropyLoss()  # Loss function for multi-class classification
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer
print("Hypothesis\n", model)
print("\nObjective Function:\n", criterion)
print("\nOptimizer:\n", optimizer)
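
As a small worked example of the loss, the sketch below compares `nn.CrossEntropyLoss` with a manual computation: the negative log-softmax score of the correct class, averaged over the batch. The logits and targets are invented values for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical raw scores (logits) for 2 samples and 3 classes
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])  # Correct class index for each sample

# Built-in loss: applies log-softmax internally, then averages over the batch
loss = nn.CrossEntropyLoss()(logits, targets)

# Manual version: pick the log-probability of the correct class per sample
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(loss.item(), manual.item())
```

The two numbers agree, which shows that the network should output raw scores rather than probabilities when this loss is used.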

### Step 4: Training the Model
We’ll now train the model by iterating through the training data, performing forward passes, computing the loss, backpropagating to obtain gradients, and updating the weights with the optimizer. After each epoch, we print the training loss averaged over all batches.

In [None]:
# Function to train the neural network
def train_model(model, train_loader, criterion, optimizer, num_epochs=5):
    for epoch in range(num_epochs):
        model.train()  # Set model to training mode
        running_loss = 0.0
        for batch_idx, (inputs, labels) in enumerate(train_loader):
            inputs, labels = inputs.to(device), labels.to(device)
            
            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            # Backward pass and optimization
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
        
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}')

# Train the model
train_model(model, train_loader, criterion, optimizer, num_epochs=5)
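
The training loop above reports only the training loss. If you also want a loss and accuracy on held-out data after each epoch, a helper along these lines could be called at the end of every epoch. This is a minimal sketch and not part of the original notebook: the name `evaluate_epoch`, the tiny `demo_model`, and the random data are assumptions made so that the block runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate_epoch(model, loader, criterion, device):
    """Return (average loss, accuracy in percent) over all samples in loader."""
    model.eval()  # Evaluation mode
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():  # No gradients needed for evaluation
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            # Weight each batch loss by its size so the average is per sample
            total_loss += criterion(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    return total_loss / total, 100.0 * correct / total

# Demo on random data (assumption: stands in for a real validation loader)
torch.manual_seed(0)
demo_device = torch.device('cpu')
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(demo_device)
demo_data = TensorDataset(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,)))
demo_loader = DataLoader(demo_data, batch_size=8)

val_loss, val_acc = evaluate_epoch(demo_model, demo_loader, nn.CrossEntropyLoss(), demo_device)
print(f'Validation loss: {val_loss:.4f}, accuracy: {val_acc:.2f}%')
```

In practice, the validation data should be a portion of the training set that is held out from training, so that the test set stays untouched until the final evaluation.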



### Step 5: Evaluating the Model
After training, we can evaluate the model’s performance on the test dataset to see how well it generalizes to unseen data.

We will compute the *accuracy* of the predicted labels.

We will plot the *confusion matrix* as well. The entry at row $i$ and column $j$ of the confusion matrix shows how many times a sample from class $i$ was predicted as belonging to class $j$. Ideally, the confusion matrix should contain non-zero values only on the main diagonal.
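
As a tiny worked example of this definition, the sketch below builds a confusion matrix by hand for 3 classes and 6 samples. The labels are invented for illustration; rows index the true class and columns index the predicted class.

```python
import torch

# Hypothetical ground-truth and predicted labels for 6 samples, 3 classes
actual = torch.tensor([0, 0, 1, 1, 2, 2])
predicted = torch.tensor([0, 1, 1, 1, 2, 0])
num_classes = 3

# Entry [i, j] counts samples of true class i that were predicted as class j
conf = torch.zeros(num_classes, num_classes, dtype=torch.long)
for t, p in zip(actual.tolist(), predicted.tolist()):
    conf[t, p] += 1
print(conf)

# Correct predictions lie on the main diagonal
accuracy = conf.diag().sum().item() / conf.sum().item()
print(f'Accuracy: {accuracy:.2f}')
```

Here `conf[0, 1]` is 1 because one sample of class 0 was predicted as class 1, and `conf[2, 0]` is 1 because one sample of class 2 was predicted as class 0.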



In [None]:
def evaluate_model(model, test_loader):
    model.eval()  # Set model to evaluation mode
    correct = 0
    total = 0
    pred = []
    actual = []
    with torch.no_grad():  # Disable gradient calculation for evaluation
        for inputs, labels in test_loader:  # Pick a batch of test samples
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)  # Forward pass on the whole batch
            _, predicted = torch.max(outputs, 1)  # Predicted class = index of the largest score
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
            pred.append(predicted)
            actual.append(labels)

    acc = 100 * correct / total
    print(f'Accuracy of the model on the test images: {acc:.2f}%')
    
    # Compute the confusion matrix on the CPU (the metric object lives on the CPU)
    num_classes = model.fc2.out_features
    confmat = ConfusionMatrix(task="multiclass", num_classes=num_classes)
    conf_matrix = confmat(torch.cat(pred, dim=0).cpu(), torch.cat(actual, dim=0).cpu())

    # Plotting the confusion matrix
    plt.figure(figsize=(8, 6))
    sns.heatmap(conf_matrix.numpy(), annot=True, fmt="d", cmap="Blues", cbar=False,
                xticklabels=[f'{i}' for i in range(num_classes)],
                yticklabels=[f'{i}' for i in range(num_classes)])
    plt.xlabel("Predicted")
    plt.ylabel("Ground Truth")
    plt.title("Confusion Matrix")
    plt.show()

# Evaluate the model
evaluate_model(model, test_loader)

After looking at your confusion matrix, please answer the following questions.
- How many times was 4 misclassified as 9?
- How many times was 9 misclassified as 4?

### Step 6: Visualizing Results
We can visualize some predictions made by the model on the test dataset.

In [None]:
def visualize_predictions(model, test_loader):
    examples = iter(test_loader)
    example_data, example_targets = next(examples)  # Fetch one batch of test images and labels
    
    with torch.no_grad():
        example_data = example_data.to(device)
        outputs = model(example_data)
        _, predicted = torch.max(outputs, 1)
    
    # Plot 6 test images along with their predicted and true labels
    for i in range(6):
        plt.subplot(2, 3, i+1)
        # Swap the image axes for display (EMNIST images are stored transposed)
        plt.imshow(torch.transpose(example_data[i][0], 1, 0).cpu(), cmap='gray')
        plt.title(f'Pred: {predicted[i].item()}, True: {example_targets[i].item()}')
        plt.axis('off')
    plt.show()

# Visualize predictions
visualize_predictions(model, test_loader)
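
The call `torch.max(outputs, 1)` returns both the largest score in each row and its index; the index is the predicted class. The following sketch illustrates this on invented scores and also shows how `torch.softmax` could turn the scores into probabilities if you want to display the model's confidence.

```python
import torch

# Hypothetical network outputs (logits) for 2 samples and 3 classes
outputs = torch.tensor([[0.1, 2.3, -0.5],
                        [1.7, 0.2, 0.9]])

# torch.max along dim 1 returns (largest value, index of largest value) per row
values, predicted = torch.max(outputs, 1)
print(predicted)

# argmax gives the same indices when only the class is needed
same = torch.equal(predicted, outputs.argmax(dim=1))
print(same)

# Softmax converts each row of scores into probabilities that sum to 1
probs = torch.softmax(outputs, dim=1)
print(probs.sum(dim=1))
```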

### Conclusion
This tutorial provided a step-by-step guide to building a simple neural network in PyTorch:

- Loading the `EMNIST` dataset using `torchvision`.
- Defining a simple neural network using `nn.Module`.
- Specifying a loss function and optimizer.
- Training the model using backpropagation.
- Evaluating the model’s accuracy on test data.
- Analyzing classification results using the confusion matrix.
- Visualizing some predictions made by the trained model.

PyTorch makes it easy to experiment with different neural network architectures and modify training procedures. You can further explore by adding more layers, using different activation functions, or experimenting with different datasets.
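
As one possible starting point for such experiments, here is a sketch of a deeper variant with two hidden layers and a different activation function. The class name `DeeperNeuralNet`, the hidden sizes, and the choice of `LeakyReLU` are illustrative assumptions, not recommendations from this tutorial.

```python
import torch
import torch.nn as nn

class DeeperNeuralNet(nn.Module):  # Hypothetical name, for illustration only
    def __init__(self, input_size=28 * 28, hidden_sizes=(256, 128), num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),  # Flatten each image to a vector
            nn.Linear(input_size, hidden_sizes[0]),
            nn.LeakyReLU(),  # A different activation function
            nn.Linear(hidden_sizes[0], hidden_sizes[1]),  # Extra hidden layer
            nn.LeakyReLU(),
            nn.Linear(hidden_sizes[1], num_classes),  # Output scores (logits)
        )

    def forward(self, x):
        return self.net(x)

# Check the output shape on a random batch shaped like MNIST input
dummy_batch = torch.randn(4, 1, 28, 28)
logits = DeeperNeuralNet()(dummy_batch)
print(tuple(logits.shape))
```

Note that `evaluate_model` reads the number of classes from `model.fc2.out_features`, so it would need a small change (for example, passing `num_classes` explicitly) before it can be used with a model structured like this one.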