Machine Learning (CS 567)
Fall 2014

Dr. Nazar Khan

The ability of biological brains to sense, perceive, analyse and recognise patterns can only be described as stunning. Furthermore, they can learn from new examples. Yet our understanding of exactly how biological brains operate remains embarrassingly limited.

However, numerous 'practical' techniques do exist that give machines the 'appearance' of being intelligent. This is the domain of statistical pattern recognition and machine learning. Instead of attempting to mimic the complex workings of a biological brain, this course aims to explain mathematically well-founded techniques for analysing patterns and learning from them.

Accordingly, this course is a mathematically involved introduction to the field of pattern recognition and machine learning. It will prepare students for further study and research in Pattern Recognition, Machine Learning, Computer Vision, Data Analysis and other areas that attempt to solve Artificial Intelligence (AI) problems.

Text: Pattern Recognition and Machine Learning by Christopher M. Bishop (2006)

Lectures:
Monday       2:30 pm - 4:00 pm    Al Khwarizmi Lecture Theater
Wednesday    2:30 pm - 4:00 pm    Al Khwarizmi Lecture Theater

Office Hours:
Thursday     2:00 pm - 6:00 pm

Programming Environment: MATLAB

Grading Scheme/Criteria:
Assignments and Quizzes    15%
Project                    10%
Mid-Term                   35%
Final                      40%

Assignments

IMPORTANT: Please send me an email at nazarkhan@pucit.edu.pk so that I have everyone's email address. The subject of the email should be "CS567 ML".

  • Assignment 1 (Wednesday, October 29, 2014)
     First 13 exercises of Chapter 1 (excluding exercise 1.4)
     Due: Monday, November 10, 2014 before class
  • Assignment 2 (Monday, December 1, 2014)
     Chapter 1 exercises 1.21-1.35 (excluding exercise 1.30).
     Due: Monday, December 8, 2014 before class
  • Assignment 3 (Saturday, December 27, 2014)
     Chapter 1 exercises 1.29, 1.30 and 1.36-1.41.
     Due: Monday, January 12, 2015 before class
  • Assignment 4 (Due: Monday, 19th January, 2015 before class)
  • Assignment 5 (Due: Sunday, 8th February, 2015 11:59 pm)
  • Homework session 1 (Wednesday, 28th January, 2015): Chapter 2 exercises
  • Homework session 2 (Thursday, 5th February, 2015): Chapter 2 exercises continued

Content

  1. Lectures 1 to 4 (Introduction)
    • Introduction
    • Curve Fitting (Over-fitting vs. Generalization)
    • Regularized Curve Fitting (see the MATLAB sketch below)
    • Probability
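    For illustration, here is a minimal MATLAB sketch of (regularized) polynomial curve fitting; the synthetic sin(2*pi*x) data, sample size and lambda below are illustrative assumptions, not prescribed lecture settings:

      % Fit a degree-M polynomial to noisy samples of sin(2*pi*x),
      % without and with L2 regularization.
      N = 10; M = 9; lambda = exp(-18);                % sample size, degree, regularization weight
      x = linspace(0, 1, N)';                          % inputs
      t = sin(2*pi*x) + 0.3*randn(N, 1);               % noisy targets
      Phi = bsxfun(@power, x, 0:M);                    % N-by-(M+1) polynomial design matrix
      w_ml  = Phi \ t;                                 % plain least squares (overfits for M = 9)
      w_reg = (Phi'*Phi + lambda*eye(M+1)) \ (Phi'*t); % regularized least squares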
  2. Lectures 5 to 8 (Background Mathematics)
    • Gaussian Distribution
    • Fitting a Gaussian Distribution to Data (see the sketch below)
    • Probabilistic Curve Fitting (Maximum Likelihood Estimation)
    • Bayesian Curve Fitting (Maximum A Posteriori Estimation)
    • Model Selection (Cross Validation)
    • Calculus of variations
    • Lagrange Multipliers
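    For illustration, here is a minimal MATLAB sketch of fitting a 1-D Gaussian to data by maximum likelihood; the data vector is an illustrative assumption:

      data = [2.1 1.9 2.4 2.2 1.8 2.6 2.0];            % illustrative observations
      mu_ml  = mean(data);                             % ML estimate of the mean
      var_ml = mean((data - mu_ml).^2);                % ML estimate of the variance (biased, 1/N)
      xs = linspace(min(data)-1, max(data)+1, 200);    % grid for plotting the fitted density
      p  = exp(-(xs - mu_ml).^2 ./ (2*var_ml)) / sqrt(2*pi*var_ml);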
  3. Lectures 9 to 13 (Decision Theory and Information Theory)
    • Decision Theory
      • Minimising number of misclassifications
      • Minimising expected loss
      • Benefits of knowing posterior distributions
      • Generative vs. Discriminative Models vs. Discriminant Functions
      • Loss functions for regression problems
    • Information Theory
      • Information ∝ 1/Probability (information content h(x) = -log p(x))
      • Entropy = expected information (measure of uncertainty)
        • Maximum Entropy Discrete Distribution (Uniform)
        • Maximum Entropy Continuous Distribution (Gaussian)
      • Jensen's Inequality
      • Relative Entropy (KL divergence; see the sketch below)
      • Mutual Information
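    For illustration, here is a minimal MATLAB sketch of entropy and KL divergence for discrete distributions; the two distributions are illustrative and must be strictly positive for the logarithms to be finite:

      p = [0.5 0.25 0.125 0.125];                      % example distribution
      q = [0.25 0.25 0.25 0.25];                       % uniform over the same 4 outcomes
      H    = -sum(p .* log2(p));                       % entropy in bits (1.75 here)
      Hmax = log2(numel(p));                           % the uniform distribution attains the maximum (2 bits)
      KL   = sum(p .* log2(p ./ q));                   % KL(p||q) >= 0, with equality iff p == q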
  4. Lectures 14 to 17 (Probability Distributions and Density Estimation)
    • Density Estimation is fundamentally ill-posed
    • Probability Distributions
      • Bernoulli
      • Binomial
      • Beta
      • Multinomial
      • Dirichlet
      • Gaussian
    • Completing-the-square
    • Sequential Learning via Conjugate Priors (see the Beta-Bernoulli sketch below)
    • Density Estimation Methods
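    For illustration, here is a minimal MATLAB sketch of sequential learning with the Beta-Bernoulli conjugate pair; the prior hyperparameters and coin flips are illustrative:

      a = 2; b = 2;                                    % Beta(a,b) prior over the heads probability
      flips = [1 0 1 1 0 1 1 1];                       % observed flips (1 = heads)
      for n = 1:numel(flips)                           % conjugacy: each flip just increments a or b
          if flips(n) == 1, a = a + 1; else, b = b + 1; end
      end
      mu_post = a / (a + b);                           % posterior mean of the heads probability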
  5. Linear Models for Regression
    • Least-squares estimation (see the sketch below)
    • Design matrix
    • Pseudoinverse
    • Regularized least-squares estimation
    • Linear regression for multivariate targets
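    For illustration, here is a minimal MATLAB sketch of least-squares regression via the design matrix and its pseudoinverse; the synthetic data and lambda are illustrative:

      x = (0:0.1:1)';                                  % inputs
      t = 1 + 2*x + 0.05*randn(size(x));               % noisy targets from a linear model
      Phi = [ones(size(x)) x];                         % design matrix with basis functions 1 and x
      w = pinv(Phi) * t;                               % Moore-Penrose pseudoinverse solution
      lambda = 0.1;                                    % ridge variant of the same fit:
      w_reg = (Phi'*Phi + lambda*eye(2)) \ (Phi'*t);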
  6. Linear Models for Classification
    • Least-squares
    • Fisher's Linear Discriminant (FLD)
    • Perceptron (see the sketch below)
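    For illustration, here is a minimal MATLAB sketch of the perceptron learning rule on linearly separable data; the toy data set is an illustrative assumption:

      X = [1 1; 2 3; 3 3; -1 -1; -2 -3; -3 -2];        % two linearly separable classes
      T = [1; 1; 1; -1; -1; -1];                       % targets in {-1, +1}
      X = [ones(size(X,1),1) X];                       % prepend a bias feature
      w = zeros(3, 1);
      for epoch = 1:100
          errors = 0;
          for n = 1:size(X,1)
              if T(n) * (X(n,:) * w) <= 0              % misclassified (or on the boundary)
                  w = w + T(n) * X(n,:)';              % pull the weights toward this point
                  errors = errors + 1;
              end
          end
          if errors == 0, break; end                   % converged: all points classified correctly
      end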
  7. Neural Networks
  8. Clustering
  9. Dimensionality Reduction
  10. Support Vector Machines