Machine Learning (CS 667)
The ability of biological brains to sense, perceive, analyse and recognise patterns can only be described as stunning. Moreover, they can learn from new examples. Yet our understanding of exactly how biological brains operate is embarrassingly limited.
There do, however, exist numerous 'practical' techniques that give machines the 'appearance' of being intelligent. This is the domain of statistical pattern recognition and machine learning. Instead of attempting to mimic the complex workings of a biological brain, this course explains mathematically well-founded techniques for analysing patterns and learning from them.
This course is an extension of CS 567 -- Machine Learning and is therefore a mathematically involved introduction to the field of pattern recognition and machine learning. It will prepare students for further study and research in Pattern Recognition, Machine Learning, Computer Vision, Data Analysis and other areas that tackle Artificial Intelligence (AI) type problems.
Pre-requisite(s): CS 567 -- Machine Learning
Pattern Recognition and Machine Learning by Christopher M. Bishop (2006)
| Day | Time | Venue |
|-----------|-------------------|------------------------------|
| Monday | 2:30 pm - 3:55 pm | Al Khwarizmi Lecture Theater |
| Wednesday | 2:30 pm - 3:55 pm | Al Khwarizmi Lecture Theater |
| Thursday | 2:00 pm - 6:00 pm | |
Programming Environment: Matlab
|Assignments and Quizzes||15%|
- Assignment 1
Chapter 4 exercises 4.4, 4.5, 4.7--4.18, 4.20.
- Assignment 2
Chapter 5 exercises 5.1--5.10.
- Assignment 3
Chapter 9 exercises 9.1--9.4, 9.8, 9.9, 9.24, 9.25.
- Assignment 4
Chapter 12 exercises 12.1, 12.3.
Implement a Convolutional Neural Network and train it to recognise hand-written digits from the MNIST dataset. (Due: Monday, June 1st, 2015)
- Linear Models for Classification
- Discriminant Functions
- Least Squares Classification -- y(x)=f(w'x)
- Fisher's Linear Discriminant -- J(w) = (w'*S_B*w) / (w'*S_W*w)
- Perceptron -- y(x)=step(w'φ(x))
- Probabilistic Generative Models -- model posterior p(C_k|x) via class-conditional p(x|C_k) and prior p(C_k)
- Probabilistic Discriminative Models -- model posterior p(C_k|x) directly
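The perceptron listed above can be sketched in a few lines. This illustrative version is in Python rather than the course's Matlab, and the toy AND data set, bias handling and epoch limit are assumptions, not course material:

```python
def perceptron_train(X, t, epochs=100):
    """Perceptron learning: for each misclassified point, update
    w <- w + t_n * x_n, where targets t_n are +1/-1 and each x
    carries a leading bias component of 1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            activation = sum(wi * xi for wi, xi in zip(w, x))
            predicted = 1 if activation >= 0 else -1
            if predicted != target:
                # update only on mistakes
                w = [wi + target * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:          # converged: every point classified correctly
            break
    return w

# AND-like linearly separable toy data; first component is the bias input
X = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
t = [-1, -1, -1, 1]
w = perceptron_train(X, t)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1 for x in X]
```

The update is applied only to misclassified points, and the algorithm is guaranteed to terminate when the classes are linearly separable.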
- Neural Networks
- Regularization Techniques
- Early stopping
- Weight decay
- Training with transformed data
- Convolutional Neural Networks
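Of the regularisation techniques listed, weight decay is the easiest to show in code: adding a penalty (λ/2)‖w‖² to the error gives the gradient an extra λw term that shrinks weights at every step. A minimal Python sketch (the course uses Matlab; the one-parameter model, data, learning rate and λ values are made up for illustration):

```python
def fit_weight(xs, ys, lam, lr=0.01, steps=2000):
    """Gradient descent on E(w) = 0.5*sum_n (w*x_n - y_n)^2 + 0.5*lam*w^2.
    The extra lam*w term in the gradient is the 'weight decay'."""
    w = 0.0
    for _ in range(steps):
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) + lam * w
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                   # noiseless data: unpenalised fit is w = 2
w_plain = fit_weight(xs, ys, lam=0.0)
w_decay = fit_weight(xs, ys, lam=10.0)  # penalised weight is pulled toward 0
```

With the penalty, the minimiser moves from w = 2 to (Σxy)/(Σx² + λ) = 28/24 ≈ 1.17, the closed-form ridge solution for this one-parameter model.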
- Kernel Methods and Support Vector Machines
- Dual formulations -- parametric to non-parametric
- Maximising the margin -- hard constraints
- Improving generalisation -- soft constraints
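The move from parametric to non-parametric in the dual formulation can be illustrated with a kernelised perceptron: the weight vector is replaced by one coefficient per training point, and the data enter only through a kernel k(x, x'). A Python sketch (Matlab is the course environment; the RBF kernel, γ = 1 and the XOR data are illustrative assumptions):

```python
import math

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def kernel_perceptron(X, t, epochs=20):
    """Dual form: instead of a weight vector, keep one coefficient a_n per
    training point; prediction is sign(sum_n a_n * t_n * k(x_n, x))."""
    a = [0] * len(X)
    for _ in range(epochs):
        errors = 0
        for i, x in enumerate(X):
            s = sum(a[n] * t[n] * rbf(X[n], x) for n in range(len(X)))
            pred = 1 if s > 0 else -1
            if pred != t[i]:
                a[i] += 1            # dual update: bump this point's coefficient
                errors += 1
        if errors == 0:
            break
    return a

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
t = [-1, 1, 1, -1]                   # XOR: not linearly separable in input space
a = kernel_perceptron(X, t)
preds = [1 if sum(a[n] * t[n] * rbf(X[n], x) for n in range(len(X))) > 0 else -1
         for x in X]
```

The primal perceptron from the linear-models chapter cannot fit XOR, but in the RBF-induced feature space the classes become separable, so the dual algorithm converges.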
- Latent Variable Models
- K-means Clustering -- alternating optimization
- Gaussian Mixture Models
- Expectation Maximisation (EM) Algorithm
- Principal Component Analysis
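K-means is the simplest example of the alternating optimisation that foreshadows EM: the assignment step plays the role of the E step, the mean update the M step. A compact Python sketch (the toy 1-D data and starting centres are made up; the course environment is Matlab):

```python
def kmeans(points, centers, iters=20):
    """K-means: alternate assigning each point to its nearest centre
    with re-estimating each centre as the mean of its cluster."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
            clusters[nearest].append(p)
        # empty clusters keep their previous centre
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]   # two well-separated 1-D clusters
centers = kmeans(points, [0.0, 5.0])
```

Each sweep can only lower the within-cluster sum of squares, so the loop converges; on this data it recovers the cluster means 1.0 and 10.0.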
- Combining Models
- Bagging -- Bootstrap Aggregation
- Decision Trees
- Mixtures of Linear Regression Models
- Mixtures of Logistic Regression Models
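Bagging can be sketched end-to-end with decision stumps as the base model. Everything here (Python rather than Matlab, the 1-D stump, toy data, committee size and seed) is an illustrative assumption:

```python
import random

def train_stump(data):
    """Fit a 1-D decision stump on data = [(x, label)] with labels +/-1:
    try every midpoint threshold and both orientations, keep the fewest errors."""
    xs = sorted({x for x, _ in data})
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for th in thresholds:
        for lo, hi in [(-1, 1), (1, -1)]:
            errs = sum(1 for x, y in data if (lo if x < th else hi) != y)
            if best is None or errs < best[0]:
                best = (errs, th, lo, hi)
    _, th, lo, hi = best
    return lambda x: lo if x < th else hi

def bagged_committee(X, y, n_models=15, seed=1):
    """Bagging (bootstrap aggregation): train each stump on a resample of
    the data drawn with replacement, then combine by majority vote."""
    rng = random.Random(seed)
    data = list(zip(X, y))
    stumps = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_models)]
    return lambda x: 1 if sum(s(x) for s in stumps) > 0 else -1

X = [0.0, 0.5, 1.0, 5.0, 5.5, 6.0]
y = [-1, -1, -1, 1, 1, 1]
committee = bagged_committee(X, y)
preds = [committee(x) for x in X]
```

Each stump sees a different bootstrap resample, so committee members differ; combining them (here by majority vote) reduces the variance of the final model relative to any single stump.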
- Learning over Sequential Data