Machine Learning (CS 667)
The ability of biological brains to sense, perceive, analyse and recognise patterns can only be described as stunning. Moreover, they can learn from new examples. Yet our understanding of exactly how biological brains operate is embarrassingly limited.
There do, however, exist numerous 'practical' techniques that give machines the 'appearance' of being intelligent. This is the domain of statistical pattern recognition and machine learning. Instead of attempting to mimic the complex workings of a biological brain, this course explains mathematically well-founded techniques for analysing patterns and learning from them.
This course is an extension of CS 567 -- Machine Learning and is therefore a mathematically involved introduction to the field of pattern recognition and machine learning. It will prepare students for further study and research in Pattern Recognition, Machine Learning, Computer Vision, Data Analysis and other areas that tackle Artificial Intelligence (AI) type problems.
Pre-requisite(s): CS 567 -- Machine Learning
Pattern Recognition and Machine Learning by Christopher M. Bishop (2006)
| Day | Time | Venue |
|-----------|-------------------|------------------------------|
| Monday | 2:30 pm - 3:55 pm | Al Khwarizmi Lecture Theater |
| Wednesday | 2:30 pm - 3:55 pm | Al Khwarizmi Lecture Theater |
| Thursday | 2:00 pm - 6:00 pm | |
Programming Environment: Matlab
|Assignments and Quizzes||15%|
- Assignment 1
Chapter 4 exercises 4.4, 4.5, 4.7--4.18, 4.20.
- Assignment 2
Chapter 5 exercises 5.1--5.10.
- Assignment 3
Chapter 9 exercises 9.1--9.4, 9.8, 9.9, 9.24, 9.25.
- Assignment 4
Chapter 12 exercises 12.1, 12.3.
Implement a Convolutional Neural Network and train it to recognise hand-written digits from the MNIST dataset. (Due: Monday, June 1st, 2015)
- Linear Models for Classification
- Discriminant Functions
- Least Squares Classification -- y(x)=f(w'x)
- Fisher's Linear Discriminant -- J(w) = (w'*S_B*w) / (w'*S_W*w)
- Perceptron -- y(x)=step(w'φ(x))
- Probabilistic Generative Models -- model posterior p(C_k|x) via class-conditional p(x|C_k) and prior p(C_k)
- Probabilistic Discriminative Models -- model posterior p(C_k|x) directly
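The perceptron listed above can be sketched in a few lines. This illustrative version is in Python rather than the course's Matlab, and the toy AND data set, bias handling and epoch limit are assumptions, not course material:

```python
def perceptron_train(X, t, epochs=100):
    """Perceptron learning: for each misclassified point, update
    w <- w + t_n * x_n, where targets t_n are +1/-1 and each x
    carries a leading bias component of 1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            activation = sum(wi * xi for wi, xi in zip(w, x))
            predicted = 1 if activation >= 0 else -1
            if predicted != target:
                # update only on mistakes
                w = [wi + target * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:          # converged: every point classified correctly
            break
    return w

# AND-like linearly separable toy data; first component is the bias input
X = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
t = [-1, -1, -1, 1]
w = perceptron_train(X, t)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1 for x in X]
```

The update is applied only to misclassified points, and the algorithm is guaranteed to terminate when the classes are linearly separable.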
- Neural Networks
- Regularization Techniques
- Early stopping
- Weight decay
- Training with transformed data
- Convolutional Neural Networks
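Of the regularisation techniques listed, weight decay is the easiest to show in code: adding a penalty (λ/2)‖w‖² to the error gives the gradient an extra λw term that shrinks weights at every step. A minimal Python sketch (the course uses Matlab; the one-parameter model, data, learning rate and λ values are made up for illustration):

```python
def fit_weight(xs, ys, lam, lr=0.01, steps=2000):
    """Gradient descent on E(w) = 0.5*sum_n (w*x_n - y_n)^2 + 0.5*lam*w^2.
    The extra lam*w term in the gradient is the 'weight decay'."""
    w = 0.0
    for _ in range(steps):
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) + lam * w
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                   # noiseless data: unpenalised fit is w = 2
w_plain = fit_weight(xs, ys, lam=0.0)
w_decay = fit_weight(xs, ys, lam=10.0)  # penalised weight is pulled toward 0
```

With the penalty, the minimiser moves from w = 2 to (Σxy)/(Σx² + λ) = 28/24 ≈ 1.17, the closed-form ridge solution for this one-parameter model.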
- Kernel Methods and Support Vector Machines
- Dual formulations -- parametric to non-parametric
- Maximising the margin -- hard constraints
- Improving generalisation -- soft constraints
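The move from parametric to non-parametric in the dual formulation can be illustrated with a kernelised perceptron: the weight vector is replaced by one coefficient per training point, and the data enter only through a kernel k(x, x'). A Python sketch (Matlab is the course environment; the RBF kernel, γ = 1 and the XOR data are illustrative assumptions):

```python
import math

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def kernel_perceptron(X, t, epochs=20):
    """Dual form: instead of a weight vector, keep one coefficient a_n per
    training point; prediction is sign(sum_n a_n * t_n * k(x_n, x))."""
    a = [0] * len(X)
    for _ in range(epochs):
        errors = 0
        for i, x in enumerate(X):
            s = sum(a[n] * t[n] * rbf(X[n], x) for n in range(len(X)))
            pred = 1 if s > 0 else -1
            if pred != t[i]:
                a[i] += 1            # dual update: bump this point's coefficient
                errors += 1
        if errors == 0:
            break
    return a

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
t = [-1, 1, 1, -1]                   # XOR: not linearly separable in input space
a = kernel_perceptron(X, t)
preds = [1 if sum(a[n] * t[n] * rbf(X[n], x) for n in range(len(X))) > 0 else -1
         for x in X]
```

The primal perceptron from the linear-models chapter cannot fit XOR, but in the RBF-induced feature space the classes become separable, so the dual algorithm converges.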
- Latent Variable Models
- K-means Clustering -- alternating optimization
- Gaussian Mixture Models
- Expectation Maximisation (EM) Algorithm
- Principal Component Analysis
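K-means is the simplest example of the alternating optimisation that foreshadows EM: the assignment step plays the role of the E step, the mean update the M step. A compact Python sketch (the toy 1-D data and starting centres are made up; the course environment is Matlab):

```python
def kmeans(points, centers, iters=20):
    """K-means: alternate assigning each point to its nearest centre
    with re-estimating each centre as the mean of its cluster."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
            clusters[nearest].append(p)
        # empty clusters keep their previous centre
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]   # two well-separated 1-D clusters
centers = kmeans(points, [0.0, 5.0])
```

Each sweep can only lower the within-cluster sum of squares, so the loop converges; on this data it recovers the cluster means 1.0 and 10.0.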
- Combining Models
- Bagging -- Bootstrap Aggregation
- Decision Trees
- Mixtures of Linear Regression Models
- Mixtures of Logistic Regression Models
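Bagging can be sketched end-to-end with decision stumps as the base model. Everything here (Python rather than Matlab, the 1-D stump, toy data, committee size and seed) is an illustrative assumption:

```python
import random

def train_stump(data):
    """Fit a 1-D decision stump on data = [(x, label)] with labels +/-1:
    try every midpoint threshold and both orientations, keep the fewest errors."""
    xs = sorted({x for x, _ in data})
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for th in thresholds:
        for lo, hi in [(-1, 1), (1, -1)]:
            errs = sum(1 for x, y in data if (lo if x < th else hi) != y)
            if best is None or errs < best[0]:
                best = (errs, th, lo, hi)
    _, th, lo, hi = best
    return lambda x: lo if x < th else hi

def bagged_committee(X, y, n_models=15, seed=1):
    """Bagging (bootstrap aggregation): train each stump on a resample of
    the data drawn with replacement, then combine by majority vote."""
    rng = random.Random(seed)
    data = list(zip(X, y))
    stumps = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_models)]
    return lambda x: 1 if sum(s(x) for s in stumps) > 0 else -1

X = [0.0, 0.5, 1.0, 5.0, 5.5, 6.0]
y = [-1, -1, -1, 1, 1, 1]
committee = bagged_committee(X, y)
preds = [committee(x) for x in X]
```

Each stump sees a different bootstrap resample, so committee members differ; combining them (here by majority vote) reduces the variance of the final model relative to any single stump.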
- Learning over Sequential Data