Deep Learning (CS-563)

Department of Computer Science
University of the Punjab

Instructor: Nazar Khan
Semester: Fall 2025
Credit Hours: 3
Level: Graduate
Location: FCIT, Allama Iqbal Campus, AlKhwarizmi Lecture Theater
Class Times: Tue. and Thurs., 8:15 AM - 9:45 AM

Google Classroom
Textbook: Bishop
Deep Learning: Foundations and Concepts
Chris Bishop and Hugh Bishop
Springer Nature, 2024
Reference 1: Murphy
Machine Learning: A Probabilistic Perspective
Kevin P. Murphy
MIT Press, 2012
PDF
Reference 2: DRL
Deep Reinforcement Learning
Aske Plaat
Springer Nature, 2022
Preprint

The ability of biological brains to sense, perceive, analyse and recognise patterns can only be described as stunning. Furthermore, they can learn from new examples. Mankind's understanding of exactly how biological brains operate is embarrassingly limited. However, numerous 'practical' techniques exist that give machines the 'appearance' of being intelligent. This is the domain of statistical pattern recognition and machine learning. Instead of attempting to mimic the complex workings of a biological brain, this course explains mathematically well-founded techniques for analysing patterns and learning from them.

Artificial Neural Networks, extremely simplified models of the human brain, have existed for almost 75 years. However, the last 25 years have seen a tremendous unlocking of their potential, driven by a collection of network architectures and training techniques that have come to be known as Deep Learning. As a result, Deep Learning has overtaken its parent fields of Neural Networks, Machine Learning and Artificial Intelligence, and is quickly becoming must-have knowledge in many academic disciplines as well as in industry.

This course is a mathematically involved introduction to the wonderful world of deep learning. It will prepare students for further study and research in Pattern Recognition, Machine Learning, Computer Vision, Data Analysis, Natural Language Processing, Speech Recognition, Machine Translation, Autonomous Driving and other areas that attempt to solve Artificial Intelligence (AI) problems.

Lectures

# Topics Slides Videos Recitations Readings Miscellaneous

1

  • Course Details
  • Introduction to Machine Learning
  • Introduction to Neural Computations

Introduction to Deep Learning

Introduction to Neural Computations

Video


2

  • Mathematical Modelling of Neural Computations
    • McCulloch & Pitts Neurons
    • Hebbian Learning
    • Rosenblatt's Perceptron
  • XOR Problem
  • Multilayer Perceptrons

History of Neural Computation

Video

Recitation 1

Quiz 1
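The McCulloch & Pitts model from this lecture fits in a few lines. The sketch below is an illustrative aside, not part of the course materials: a unit fires when enough of its binary inputs are active, so AND and OR differ only in their thresholds, while XOR cannot be realised by any single such unit.

```python
# A McCulloch-Pitts neuron: binary inputs and a hard threshold.
def mp_neuron(inputs, threshold):
    """Fire (1) iff the number of active inputs meets the threshold."""
    return 1 if sum(inputs) >= threshold else 0

# AND over two inputs needs both active; OR needs at least one.
AND = lambda a, b: mp_neuron([a, b], threshold=2)
OR = lambda a, b: mp_neuron([a, b], threshold=1)
```

No single threshold separates XOR's positive cases from its negative ones, which is exactly the limitation that motivates multilayer perceptrons.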

3

  • Universal Approximation Theorem for Multilayer Perceptrons
    • For Boolean functions
    • For classification boundaries
    • For continuous functions

MLPs and Universal Approximation Theorem

Video1

Video2


4

  • Training a perceptron
    • Minimization
    • Gradient Descent
    • Perceptron learning rule

Perceptron Training

Video 1

Video 2

Recitation 2


Quiz 2
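The perceptron learning rule covered in this lecture can be sketched as follows (an illustrative toy, with made-up data for the OR function; not taken from the course slides):

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=100):
    """Rosenblatt's rule: on each mistake, nudge w by lr * y_i * x_i.
    A constant bias column is appended to X; labels y are in {-1, +1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # bias trick
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi    # perceptron update
                mistakes += 1
        if mistakes == 0:            # converged on separable data
            break
    return w

# Linearly separable toy data: the OR function with +/-1 labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([-1, 1, 1, 1])
w = train_perceptron(X, y)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

On separable data like this the update provably stops after finitely many mistakes; on the XOR labelling it would cycle forever.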

5

  • Loss Functions and Activation Functions
    • Loss Functions for Regression
      • Univariate
      • Multivariate
    • Loss Functions for Classification
      • Binary
      • Multiclass
    • Activation Functions
      • Linear
      • Logistic Sigmoid
      • Softmax

Loss Functions and Activation Functions for Machine Learning

Video
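For the multiclass case in this lecture, softmax converts scores to probabilities and cross-entropy penalises low probability on the true class. A minimal sketch (generic, not from the course slides), using the standard max-subtraction trick for numerical stability:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, onehot):
    """Multiclass cross-entropy loss for a single prediction."""
    return -np.sum(onehot * np.log(p + 1e-12))

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
p = softmax(logits)                   # probabilities summing to 1
loss = cross_entropy(p, np.array([1.0, 0.0, 0.0]))
```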

6

  • Training Neural Networks
    • Forward Propagation
    • Backward Propagation

Training Neural Networks: Forward and Backward Propagation

Video

Recitation 3

Quiz 3
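Forward and backward propagation through a tiny one-hidden-layer network can be sketched in a few lines (an illustrative aside with arbitrary random weights, not course code); the finite-difference comparison at the end is the standard sanity check on the chain-rule gradients:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# One sigmoid hidden layer, scalar linear output, squared-error loss.
x = rng.normal(size=3)
t = 1.0
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=4)

def loss(W1, W2):
    h = sigmoid(W1 @ x)   # forward: hidden activations
    y = W2 @ h            # forward: linear output
    return 0.5 * (y - t) ** 2

# Backward pass: apply the chain rule layer by layer.
h = sigmoid(W1 @ x)
y = W2 @ h
dy = y - t                            # dL/dy
gW2 = dy * h                          # dL/dW2
dh = dy * W2                          # dL/dh
gW1 = np.outer(dh * h * (1 - h), x)   # dL/dW1, using sigmoid'(a) = h(1-h)

# Numerical check of one entry of gW1 by finite differences.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
num = (loss(W1p, W2) - loss(W1, W2)) / eps
```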

7

  • Backpropagation and Vanishing Gradients
    • Numerical derivative check
    • Efficiency of backpropagation
    • Vanishing gradient problem
    • Activation functions for Deep Learning
      • Tanh
      • ReLU
      • Leaky ReLU
      • ELU

Backpropagation and Vanishing Gradients

Video
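The deep-learning activations listed above are one-liners; the ReLU family keeps a non-vanishing gradient for positive inputs, which is what mitigates the vanishing-gradient problem. A generic vectorised sketch (not course code):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)          # zero for negatives, identity otherwise

def leaky_relu(z, a=0.01):
    return np.where(z > 0, z, a * z)   # small slope keeps negatives alive

def elu(z, a=1.0):
    return np.where(z > 0, z, a * (np.exp(z) - 1))  # smooth negative saturation

z = np.array([-2.0, 0.0, 3.0])
```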

8

  • Gradient Descent Variations - I
    • Problems with vanilla gradient descent
    • First-order methods
      • Resilient Propagation (Rprop)
    • Second-order methods
      • Taylor series approximation
      • Newton's Method for finding stationary points
      • Quickprop

Variations of Gradient Descent

Video

Recitation 4

Quiz 4

Assignment 1: Backpropagation for MLPs.

9

  • Gradient Descent Variations - II
    • Momentum-based first-order methods
      • Momentum
      • Nesterov Accelerated Gradient
      • RMSprop
      • ADAM

Momentum-based Gradient Descent

Video
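A single ADAM update combines the ideas from this lecture: a running average of the gradient (momentum) and of its square (RMSprop-style scaling), each bias-corrected because both averages start at zero. A minimal sketch on a toy quadratic, with the usual default hyperparameters (illustrative, not course code):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One ADAM update for parameters w given gradient g at step t (1-based)."""
    m = b1 * m + (1 - b1) * g          # running mean of the gradient
    v = b2 * v + (1 - b2) * g * g      # running mean of its square
    m_hat = m / (1 - b1 ** t)          # bias correction for zero init
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimise f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
m = np.zeros(2)
v = np.zeros(2)
for t in range(1, 1001):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```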

10

  • Automatic Differentiation
    • Analytic vs Automatic Differentiation
    • Linear Regression via Automatic Differentiation
    • Logistic Regression via Automatic Differentiation

Automatic Differentiation

Notes

Video
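Deep-learning frameworks use reverse-mode automatic differentiation, but the core idea is easiest to see in forward mode with dual numbers: every value carries (value, derivative), every operation updates both via the chain rule, and the result is exact rather than a numerical approximation. An illustrative sketch (not from the course notes):

```python
# Forward-mode automatic differentiation with dual numbers.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def derivative(f, x):
    """Seed the input derivative with 1 and read off df/dx."""
    return f(Dual(x, 1.0)).dot

# d/dx (3x^2 + 2x) at x = 4 is 6*4 + 2 = 26.
g = derivative(lambda x: 3 * x * x + 2 * x, 4.0)
```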

11

  • Regularization - I
    • Primer on ML
      • Capabilities of polynomials
      • Everything contains noise
      • Overfitting vs Generalisation
    • Regularization Methods
      • Weight Penalty
      • Early Stopping
      • Data Augmentation
      • Label Smoothing

Regularization

Video

Recitation 5

12

  • Regularization - II
    • Dropout
    • BatchNorm

Dropout and BatchNorm

Video
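Dropout as used in practice is "inverted" dropout: units are zeroed with probability p during training and the survivors are scaled by 1/(1-p), so expected activations match test time, where the layer is simply the identity. A generic sketch (illustrative, not course code):

```python
import numpy as np

def dropout(h, p_drop, train=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero units w.p. p_drop during training and
    rescale survivors so the expected activation is unchanged."""
    if not train or p_drop == 0.0:
        return h
    mask = rng.random(h.shape) >= p_drop   # True = keep the unit
    return h * mask / (1.0 - p_drop)

h = np.ones(1000)
out = dropout(h, p_drop=0.5)   # survivors become 2.0, the rest 0.0
```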

13

  • Convolutional Neural Networks
    • Convolution
    • Neurons as detectors
    • Pooling
    • Forward Propagation
    • Covariance of CNNs

Convolutional Neural Networks

Video

Recitation 6
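The convolution at the heart of a CNN layer (strictly, the cross-correlation most frameworks implement) is a dot product slid over the image; with an edge-detecting kernel, the "neurons as detectors" view becomes concrete. A minimal 'valid' sketch (illustrative, not course code):

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image
    and take a dot product at every position."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A [-1, 1] kernel fires where intensity increases left-to-right.
img = np.zeros((5, 5))
img[:, 2:] = 1.0                       # dark left half, bright right half
edge = conv2d(img, np.array([[-1.0, 1.0]]))
```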

14

  • Variations of Convolutional Neural Networks - I
    • 1x1 Convolutions
    • Depthwise Separable Convolutions
    • Transposed Convolutions

Variations of Convolutional Neural Networks

Video

15

  • Variations of Convolutional Neural Networks - II
    • Unpooling
    • Fully Convolutional Networks
    • ResNet

Variations of Convolutional Neural Networks

Video

Recitation 7

16

  • Recurrent Neural Networks (RNN)
    • Static vs. Dynamic Inputs
    • Temporal, sequential and time-series data
    • Folding in space
    • Folding in time
    • Unfolding in time
    • Forward propagation in RNN

Recurrent Neural Networks

Video
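Unfolding in time is easiest to see in code: the same weight matrices are reused at every time step, and each hidden state feeds the next. A vanilla-RNN forward pass with arbitrary random weights (an illustrative sketch, not course code):

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, h0):
    """Forward propagation through time: h_t = tanh(Wxh x_t + Whh h_{t-1}).
    The same Wxh and Whh are shared across all time steps."""
    h = h0
    states = []
    for x in xs:                       # one update per time index
        h = np.tanh(Wxh @ x + Whh @ h)
        states.append(h)
    return states

rng = np.random.default_rng(0)
Wxh = rng.normal(size=(4, 3))          # input-to-hidden weights
Whh = rng.normal(size=(4, 4))          # hidden-to-hidden (recurrent) weights
xs = [rng.normal(size=3) for _ in range(5)]
states = rnn_forward(xs, Wxh, Whh, np.zeros(4))
```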

17

  • RNN variants, benefits and stability
    • Bidirectional RNN
    • Some problems are inherently recurrent
    • Exploding gradients

RNN variants, benefits and stability

Video

18

  • Long Short-Term Memory (LSTM)
    • RNN cell and its weakness
    • Building blocks of the LSTM cell
    • The LSTM cell
    • How does the LSTM cell remember the past?
    • Variants
      • Peephole connections
      • Coupled forget and input gates
      • Gated Recurrent Unit (GRU)

Long Short-Term Memory

Video

19

  • Language Modelling
    • Modelling input text as numeric vectors
    • Text generation
    • Language translation
    • Beam Search

Language Modelling

Video

20

  • Attention
    • Attention-based decoder for
      • Language translation
      • Image captioning
      • Handwritten text recognition

Attention

Video

21

  • Transformers
    • Encoding with attention
      • Self-attention
      • Residual connection
      • Layer-norm
      • Parallelism by removing recurrence
      • Multiheaded self-attention
      • Positional encoding
    • Self-attention based Decoder
      • Encoder-decoder attention

Transformers
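The self-attention at the core of the transformer encoder is softmax(QK^T / sqrt(d)) V, computed for all positions at once, which is what removes the recurrence and enables parallelism. A single-head sketch with arbitrary random projections (illustrative, not course code):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over all tokens."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv       # queries, keys, values
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # all pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)   # stable softmax
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)     # each row is a distribution
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))                # 6 tokens, model width 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
```

Multi-headed attention simply runs several such maps with smaller widths in parallel and concatenates the outputs.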

22

  • Transformers - II
    • Self-attention based Decoder
      • Encoder-decoder attention

Transformers

No Recitation

23

  • Generative Adversarial Networks (GANs)
    • Generative vs. Discriminative Models
    • Adversarial Learning
    • Applications
    • GAN Training
      • Objective Functions
      • Training Procedure
      • Stability and Mode-Collapse
      • Tips & Tricks

Generative Adversarial Networks

Video

24

  • Graph Neural Networks (GNNs) - I
    • Euclidean vs. Non-Euclidean Domains
    • Permutation Invariance
    • Permutation Equivariance
    • Learning on Sets
    • Learning on Graphs

Graph Neural Networks

Video

Recitation 11

25

  • Graph Neural Networks (GNNs) - II
    • GNN Layers
      • Convolutional (GCN)
      • Attention (GAT)
      • Message Passing (MPN)
    • Multilayer GNN
    • Example: 3-layer vanilla GNN
      • Node Prediction
      • Graph Prediction

Graph Neural Networks

Video

26

  • Deep Q-Learning

Deep Q-Learning

27

  • Conclusion
    • What was covered?
    • What were the general principles?
    • What was not covered?

Conclusion

Video

  • Final Exam

Grading

          Assignments   Project   Quizzes   Midterm   Final
Weight    12%           8%        5%        35%       40%