Machine Learning (CS 667)
Spring 2015

Dr. Nazar Khan

The ability of biological brains to sense, perceive, analyse and recognise patterns can only be described as stunning. Furthermore, they have the ability to learn from new examples. Mankind's understanding of exactly how biological brains operate is embarrassingly limited.

However, numerous 'practical' techniques exist that give machines the 'appearance' of being intelligent. This is the domain of statistical pattern recognition and machine learning. Instead of attempting to mimic the complex workings of a biological brain, this course aims to explain mathematically well-founded techniques for analysing patterns and learning from them.

This course is an extension of CS 567 -- Machine Learning and is therefore a mathematically involved introduction to the field of pattern recognition and machine learning. It will prepare students for further study and research in Pattern Recognition, Machine Learning, Computer Vision, Data Analysis and other areas attempting to solve Artificial Intelligence (AI) type problems.

Pre-requisite(s): CS 567 -- Machine Learning

Text: Pattern Recognition and Machine Learning by Christopher M. Bishop (2006)

Lectures:
Monday       2:30 pm - 3:55 pm    Al Khwarizmi Lecture Theater
Wednesday    2:30 pm - 3:55 pm    Al Khwarizmi Lecture Theater

Office Hours:
Thursday    2:00 pm - 6:00 pm

Programming Environment: Matlab

Grading Scheme/Criteria:
Assignments and Quizzes    15%

Theoretical Assignments

  • Assignment 1
     Chapter 4 exercises 4.4, 4.5, 4.7--4.18, 4.20.
  • Assignment 2
     Chapter 5 exercises 5.1--5.10.
  • Assignment 3
     Chapter 9 exercises 9.1--9.4, 9.8, 9.9, 9.24, 9.25.
  • Assignment 4
     Chapter 12 exercises 12.1, 12.3.

Programming Assignments


Implement a Convolutional Neural Network and train it to recognise hand-written digits from the MNIST dataset. (Due: Monday, June 1st, 2015)


  1. Linear Models for Classification
    • Discriminant Functions
      • Least Squares Classification -- y(x) = f(wᵀx)
      • Fisher's Linear Discriminant -- J(w) = (wᵀS_B w) / (wᵀS_W w)
      • Perceptron -- y(x) = step(wᵀφ(x))
    • Probabilistic Generative Models -- model posterior p(C_k|x) via class-conditional p(x|C_k) and prior p(C_k)
    • Probabilistic Discriminative Models -- model posterior p(C_k|x) directly
  2. Neural Networks
    • Back-propagation
    • Regularization Techniques
      • Early stopping
      • Weight decay
      • Training with transformed data
    • Convolutional Neural Networks
  3. Kernel Methods and Support Vector Machines
    • Dual formulations -- parametric to non-parametric
    • Maximising the margin -- hard constraints
    • Improving generalisation -- soft constraints
  4. Latent Variable Models
    • K-means Clustering -- alternating optimization
    • Gaussian Mixture Models
    • Expectation Maximisation (EM) Algorithm
  5. Principal Component Analysis
  6. Combining Models
    • Committees
    • Bagging -- Bootstrap Aggregation
    • Boosting
    • Decision Trees
    • Mixtures of Linear Regression Models
    • Mixtures of Logistic Regression Models
  7. Graphical Models
  8. Learning over Sequential Data
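Fisher's linear discriminant (topic 1) maximises J(w) = (wᵀS_B w)/(wᵀS_W w), whose maximiser is w ∝ S_W⁻¹(m₂ − m₁). A minimal sketch on toy two-class data (the course uses Matlab; Python/NumPy is used here purely for illustration, and the data is assumed):

```python
import numpy as np

# Toy two-class data (assumed for illustration).
rng = np.random.default_rng(0)
X1 = rng.normal([0, 0], 0.5, size=(50, 2))
X2 = rng.normal([3, 1], 0.5, size=(50, 2))

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter: S_W = sum_k sum_n (x_n - m_k)(x_n - m_k)^T
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
# Fisher direction: w proportional to S_W^{-1} (m2 - m1).
w = np.linalg.solve(S_W, m2 - m1)
w /= np.linalg.norm(w)

# Projecting onto w separates the class means well.
print((X1 @ w).mean(), (X2 @ w).mean())
```

Unlike least squares, this criterion explicitly trades between-class separation against within-class spread.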
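The perceptron rule y(x) = step(wᵀφ(x)) (topic 1) adds tₙφₙ for every misclassified point and converges on linearly separable data. A minimal sketch with φ(x) = [1, x] and assumed toy data (NumPy rather than Matlab, for illustration only):

```python
import numpy as np

# Toy linearly separable data with targets in {-1, +1} (assumed).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
t = np.array([1, 1, -1, -1])

Phi = np.hstack([np.ones((len(X), 1)), X])  # phi(x) = [1, x]: bias feature
w = np.zeros(Phi.shape[1])

# Perceptron learning: add t_n * phi_n for each misclassified point.
for epoch in range(100):
    errors = 0
    for phi_n, t_n in zip(Phi, t):
        if t_n * (w @ phi_n) <= 0:   # misclassified (or on the boundary)
            w += t_n * phi_n
            errors += 1
    if errors == 0:                  # converged: all points on correct side
        break

print(np.sign(Phi @ w))  # matches the targets t
```

The stopping test relies on separability; on non-separable data the loop would simply hit the epoch cap.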
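Weight decay (topic 2) adds a penalty (λ/2)‖w‖² to the error. For a linear-in-parameters model with sum-of-squares error this yields the closed form w = (λI + ΦᵀΦ)⁻¹Φᵀt, which shrinks the weights relative to the unregularised fit. A sketch on an assumed toy regression problem (NumPy, for illustration):

```python
import numpy as np

# Toy noisy data from sin(pi * x) (assumed for illustration).
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 30)
t = np.sin(np.pi * x) + rng.normal(0, 0.1, 30)

Phi = np.vander(x, 10)  # degree-9 polynomial design matrix (assumed basis)
lam = 1e-3              # weight-decay coefficient lambda

# Regularised solution: w = (lam*I + Phi^T Phi)^{-1} Phi^T t
w = np.linalg.solve(lam * np.eye(Phi.shape[1]) + Phi.T @ Phi, Phi.T @ t)
# Unregularised least-squares fit, for comparison.
w0 = np.linalg.lstsq(Phi, t, rcond=None)[0]

print(np.linalg.norm(w), np.linalg.norm(w0))  # decay shrinks the weights
```

In a neural network the same λ‖w‖² term is simply added to the back-propagated error, with no closed form.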
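K-means clustering (topic 4) alternates two optimisations: assign each point to its nearest centre, then move each centre to the mean of its assigned points. A minimal sketch on two assumed, well-separated toy blobs (NumPy, for illustration):

```python
import numpy as np

# Two well-separated toy blobs around (0,0) and (4,4) (assumed).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(4, 0.3, (40, 2))])

mu = X[[0, 40]].copy()  # initialise one centre per blob (assumed init)
for _ in range(20):
    # Assignment step: squared distance from every point to every centre.
    d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    z = d.argmin(axis=1)
    # Update step: each centre becomes the mean of its assigned points.
    mu = np.array([X[z == k].mean(axis=0) for k in range(2)])

print(mu)  # centres near the true blob means
```

Each step can only decrease the total within-cluster distortion, so the alternation converges (to a local optimum that depends on initialisation).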
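The EM algorithm for a Gaussian mixture (topic 4) replaces K-means' hard assignments with responsibilities γₙₖ in the E step, then re-estimates π, μ, σ² from the weighted data in the M step. A sketch for a two-component 1-D mixture on assumed toy data (NumPy, for illustration):

```python
import numpy as np

# Toy 1-D data from two Gaussians at -2 and +2 (assumed).
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 0.5, 100), rng.normal(2, 0.5, 100)])

pi = np.array([0.5, 0.5])   # mixing coefficients
mu = np.array([-1.0, 1.0])  # initial means (assumed init)
var = np.array([1.0, 1.0])  # initial variances

for _ in range(50):
    # E step: responsibilities gamma_nk = pi_k N(x_n|mu_k,var_k) / normaliser
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    gamma = pi * dens
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M step: re-estimate parameters from responsibility-weighted data.
    Nk = gamma.sum(axis=0)
    mu = (gamma * x[:, None]).sum(axis=0) / Nk
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    pi = Nk / len(x)

print(mu)  # means approach the true component centres
```

Each iteration is guaranteed not to decrease the data log-likelihood, which is the key property of EM.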
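Principal Component Analysis (topic 5) takes the principal components to be the eigenvectors of the data covariance matrix, ordered by eigenvalue (variance explained). A sketch on assumed toy data whose dominant variance lies along (1, 1) (NumPy, for illustration):

```python
import numpy as np

# Toy data with most variance along the direction (1, 1) (assumed).
rng = np.random.default_rng(2)
t = rng.normal(size=200)
X = np.column_stack([t, t]) + rng.normal(0, 0.1, (200, 2))

Xc = X - X.mean(axis=0)        # centre the data
S = Xc.T @ Xc / len(Xc)        # sample covariance matrix
vals, vecs = np.linalg.eigh(S) # eigh returns eigenvalues in ascending order
u1 = vecs[:, -1]               # first principal component (largest eigenvalue)

print(u1)  # approximately +/- (1,1)/sqrt(2)
```

Projecting onto the leading eigenvectors gives the linear subspace that retains the most variance, which is the basis for the dimensionality-reduction uses of PCA.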