Assignment 1: Handwritten Number Classification with PyTorch

EC332 Machine Learning
Department of Computer Science
University of the Punjab
Author: Nazar Khan

Assignment Objective:

Your task is to build and train one or more deep learning models in PyTorch to classify handwritten numbers from an imbalanced dataset available at this link. Because the dataset is imbalanced, you are encouraged to use data augmentation and other strategies to improve model performance. You may also use additional training data from another dataset of handwritten digits if that helps.

Dataset Description:

The dataset is split into two sub-folders: train and test.
Each sub-folder is further organized into sub-folders named 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5 (16 classes in total), with each sub-folder containing images of handwritten instances of the corresponding number.
The dataset is imbalanced: some classes have many more images than others, and some have very few examples.

Your Tasks:

Data Loading and Preprocessing (15 marks):
- Write a PyTorch dataset class to load the data.
- Use appropriate transforms for data augmentation (e.g., slight rotation, translation, scaling) to mitigate the effects of class imbalance and improve model generalization.
Data Handling (10 marks):
- Split the training set into training and validation sets.
- Ensure that the splits maintain the class distributions.
Model Development (25 marks):
- Implement a Convolutional Neural Network (CNN) in PyTorch.
- You can use pre-trained models (like ResNet or VGG) with modifications for this classification task or design your own architecture.
Training the Model (25 marks):
- Use appropriate techniques to address the class imbalance:
  - Weighted loss functions.
  - Oversampling or undersampling.
- Train the model using the training set and validate it using the validation set.
- Use an optimizer like Adam or SGD with appropriate learning rate scheduling.
Evaluation (15 marks):
- Evaluate your model on the test set.
- Report metrics such as accuracy, precision, recall, and F1-score for each class.
- Include a confusion matrix to visualize classification performance across classes.
Analysis and Reflection (10 marks):
- Discuss the impact of class imbalance on your results.
- Highlight the effectiveness of your data augmentation strategies.
- Suggest possible improvements or extensions to the work.
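
The two imbalance techniques listed under Training the Model (a weighted loss and oversampling) can be sketched as follows. This is a minimal illustration, assuming the training labels are available as a list of integer class indices; the helper name `make_imbalance_tools` and the inverse-frequency weighting formula are illustrative choices, not requirements of the assignment.

```python
from collections import Counter

import torch
from torch import nn
from torch.utils.data import WeightedRandomSampler

def make_imbalance_tools(labels, num_classes):
    """Build a class-weighted loss and an oversampling sampler (hypothetical helper)."""
    counts = Counter(labels)
    # Inverse-frequency class weights; classes with no samples get weight 0.
    class_weights = torch.tensor(
        [len(labels) / (num_classes * counts[c]) if counts[c] else 0.0
         for c in range(num_classes)],
        dtype=torch.float,
    )
    criterion = nn.CrossEntropyLoss(weight=class_weights)

    # Per-sample weights: samples from rare classes are drawn more often.
    sample_weights = torch.tensor([1.0 / counts[y] for y in labels], dtype=torch.double)
    sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
    return criterion, sampler
```

The sampler is passed to the training DataLoader through its `sampler` argument, which cannot be combined with `shuffle=True`. Applying a weighted loss and oversampling together may over-correct, so it is worth comparing them separately on the validation set.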

Submission Instructions:

Submit a Jupyter Notebook file (Assignment.ipynb) containing:
- Code for all the tasks listed above.
- Clear and concise explanations for each step.
- Plots/visualizations of model performance and sample predictions.
Include a requirements.txt file with all dependencies used.
Save the model and submit the model weights (model.pth).

Bonus Tasks (Optional, +10 marks):

Use a pre-trained model and fine-tune it for this problem.
Implement an advanced technique for handling class imbalance (e.g., focal loss or SMOTE).
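
As one possible reading of the focal-loss bonus, the sketch below implements the multi-class form FL = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The class name `FocalLoss`, the default `gamma=2.0`, and the optional per-class `weight` tensor are illustrative assumptions; targets are assumed to be integer class indices.

```python
import torch
from torch import nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    def __init__(self, gamma=2.0, weight=None):
        super().__init__()
        self.gamma = gamma
        self.weight = weight  # optional tensor of per-class weights (assumption)

    def forward(self, logits, targets):
        log_probs = F.log_softmax(logits, dim=1)
        # Log-probability and probability assigned to each sample's true class.
        log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        # Confidently classified samples (pt near 1) contribute little to the loss.
        loss = -((1.0 - pt) ** self.gamma) * log_pt
        if self.weight is not None:
            loss = loss * self.weight.to(logits.device)[targets]
        return loss.mean()
```

With `gamma=0` and no class weights this reduces to ordinary cross-entropy; larger values of `gamma` shift the training signal toward hard, often minority-class, examples.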

Starter Code Snippets:

Custom Dataset Class:

from torchvision import transforms
from torch.utils.data import Dataset
from PIL import Image
import os

class HandwrittenDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []

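        # Sort class folders numerically so label indices are reproducible
        # (os.listdir returns entries in arbitrary order).
        self.classes = sorted(
            (d for d in os.listdir(root_dir)
             if os.path.isdir(os.path.join(root_dir, d))),
            key=float,
        )
        for class_idx, class_name in enumerate(self.classes):
            label_path = os.path.join(root_dir, class_name)
            for img in sorted(os.listdir(label_path)):
                self.image_paths.append(os.path.join(label_path, img))
                # Store an integer class index: the classification loss and
                # accuracy check need class indices, not floats such as 0.5.
                self.labels.append(class_idx)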

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        label = self.labels[idx]
        image = Image.open(image_path).convert("RGB")

        if self.transform:
            image = self.transform(image)

        return image, label
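
Stratified Train/Validation Split (sketch):

The Data Handling task asks for a validation split that preserves the class distribution. One way to do this, sketched below under the assumption that the labels are available as a plain list (such as the dataset's `labels` attribute), is to split the indices of each class separately. The function name `stratified_split` and the `val_fraction` and `seed` parameters are illustrative, not part of the assignment.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices) with roughly the same class mix in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one training sample per class, even for very rare classes.
        n_val = min(int(round(len(indices) * val_fraction)), len(indices) - 1)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)
```

The two index lists can be passed to `torch.utils.data.Subset` to build the training and validation datasets. Subsets of one dataset object share its transform, so validation images would also be augmented unless a second dataset instance without augmentation is used for validation.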

Data Augmentation:

# Define the RandomAffine transformation
transform = transforms.Compose([
    transforms.RandomAffine(
        degrees=15,          # Random rotation between -15 to +15 degrees
        translate=(0.1, 0.1),  # Randomly translate up to 10% of width and height
        scale=(0.9, 1.1),     # Randomly scale between 90% to 110%
        shear=10              # Randomly shear up to 10 degrees
    ),
    transforms.ToTensor(),  # Convert image to PyTorch tensor
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Normalize each RGB channel to [-1, 1]
])
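
Example CNN Architecture (sketch):

For the Model Development task, a small custom CNN could look like the sketch below. It assumes 3-channel inputs (the dataset class above converts images to RGB) and 16 output classes (the folders 0 through 7.5 in steps of 0.5). The class name `SmallCNN`, the layer widths, and the dropout rate are illustrative choices, not a prescribed architecture. Adaptive pooling is used so that the network does not depend on a particular input size.

```python
import torch
from torch import nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # collapse spatial dimensions to 1x1
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.3),
            # Raw logits; nn.CrossEntropyLoss applies log-softmax internally.
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

A network this small is mainly a baseline; a pre-trained backbone with its final layer replaced by a 16-way classifier is the other option the task mentions.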

Model Training Loop:

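import torch
from torch import nn, optim

# Use the GPU when available; the model must be moved to the same device
# with model.to(device) before training.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")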

def train_model(model, dataloaders, criterion, optimizer, num_epochs=10):
    for epoch in range(num_epochs):
        print(f"Epoch {epoch+1}/{num_epochs}")
        print("-" * 10)

        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()
            else:
                model.eval()

            running_loss = 0.0
            corrects = 0

            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                optimizer.zero_grad()

                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    _, preds = torch.max(outputs, 1)

                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                running_loss += loss.item() * inputs.size(0)
                corrects += (preds == labels).sum().item()

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = corrects / len(dataloaders[phase].dataset)

            print(f"{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}")
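
Per-Class Evaluation Metrics (sketch):

For the Evaluation task, per-class precision, recall, and F1-score can all be derived from the confusion matrix. The sketch below uses plain Python lists of integer class indices; the function names are illustrative, and library routines such as scikit-learn's `classification_report` and `confusion_matrix` are a reasonable alternative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i][j] counts samples whose true class is i and predicted class is j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum: TP + FP
        actual = sum(cm[c])                          # row sum: TP + FN
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics
```

On an imbalanced test set, overall accuracy can look good while rare classes score poorly, which is why the per-class figures matter here.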

Grading Rubric:

Data Loading and Preprocessing: 15 marks
Data Handling: 10 marks
Model Development: 25 marks
Training: 25 marks
Evaluation: 15 marks
Analysis and Reflection: 10 marks
Bonus Tasks: 10 marks (optional)

This assignment combines foundational concepts in PyTorch with practical problem-solving, emphasizing the handling of class imbalance and the application of data augmentation techniques.