CS570 Computer Vision
Nazar Khan

Human beings (and even animals) "look" at the real world and extract remarkably accurate information with remarkable efficiency. Computers can fail catastrophically at this task! In this course we look into why "Vision" is a difficult problem to solve, and we go through successful, mathematically well-founded techniques used to solve it.

This course applies mathematical concepts from Linear Algebra and Calculus. Students would therefore do well to brush up on their Linear Algebra, Calculus and programming skills before taking this class. The techniques learned here are also useful in other areas such as Image Processing, Machine Learning, Artificial Intelligence and Computer Graphics.

CS 570 is a graduate course worth 3 credit hours.

Lectures: Monday and Wednesday, 10:15 a.m. - 11:45 a.m. Room 14, FCIT, Allama Iqbal (Old) Campus
Google Classroom:
Office Hours: Monday, 2:30 p.m. - 3:30 p.m. or by appointment

Grading:

  Assignments   35%
  Project       15%
  Quizzes        5%
  Tests          5%
  Mid-Term      15%
  Final         25%

Prerequisites

  1. Ability to code
  2. Basic Calculus (Differentiation, Partial derivatives, Chain rule)
  3. Linear Algebra (Vectors, Matrices, Dot-product, Orthogonality, Eigenvectors)

Books and Other Resources

No single book will be followed as the primary text. Helpful online and offline resources include:


Image Processing

  1. Digital Image Processing by R. C. Gonzalez, R. E. Woods.

Computer Vision

  1. Introductory techniques for 3D Computer Vision by Trucco and Verri
  2. Image Processing, Analysis, and Machine Vision by Sonka, Hlavac and Boyle
  3. Computer Vision: Algorithms and Applications by Richard Szeliski (http://szeliski.org/Book/)
  4. Multiple View Geometry in Computer Vision by Hartley and Zisserman
  5. Computer Vision: A Modern Approach by Forsyth and Ponce
  6. Computer Vision by Linda G. Shapiro, George C. Stockman

Lectures

# | Topics | Slides | Videos | Recitations | Readings | Miscellaneous

1

Introduction

  • What is Computer Vision?
  • Computer Vision vs. Biological Vision -- The Grand Deception!
  • Applications of Computer Vision
  • Course Details

Introduction to Computer Vision

Video


2

Background Math I

  • Notation
  • Inner Product
  • Euclidean Norm
  • Outer Product
  • Matrix and Vector Calculus

Background Mathematics

Video

Friday November 26: Recitation 1

Take-home Quiz 1

3

Background Math II

  • Matrices as Linear Operators
  • Eigenvectors
  • Optimisation
  • Constrained Optimisation
  • Taylor Series Approximation
  • Cartesian vs. Image axis

Background Mathematics

Video

4

Image Filtering

  • Convolution
  • Properties of convolution
  • Mean Filtering
  • Gaussian Filtering
  • Non-linear filtering cannot be performed via convolution
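As a rough illustration of the convolution, mean filtering and Gaussian filtering topics listed above, here is a minimal NumPy sketch. The function names `convolve2d` and `gaussian_kernel`, the zero padding at the border, and the restriction to odd kernel sizes are assumptions made for this example; they are not taken from the course material.

```python
import numpy as np

def convolve2d(image, kernel):
    """Direct 2D convolution with zero padding (odd-sized kernels only).

    The output has the same shape as `image`.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel; correlation does not
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def gaussian_kernel(size, sigma):
    """Separable Gaussian kernel, normalised so that its entries sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-ax**2 / (2.0 * sigma**2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

# Mean filter: every weight is 1/9, so each output pixel is a local average.
mean_kernel = np.full((3, 3), 1.0 / 9.0)

# A single bright pixel is spread evenly over its 3x3 neighbourhood.
image = np.zeros((5, 5))
image[2, 2] = 9.0
smoothed = convolve2d(image, mean_kernel)
blurred = convolve2d(image, gaussian_kernel(5, sigma=1.0))
```

Both kernels sum to 1, so the total brightness of the impulse image is preserved as long as the kernel support stays inside the image.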

Image Filtering

Video

Friday December 3: Recitation 2

  • Working with images
  • Basic image handling using
    • NumPy
    • PIL
    • OpenCV
  • Filtering
    • Point operations
      • Image Thresholding
    • Neighborhood operations
      • Convolution
      • Averaging Filter
      • Gaussian Blurring
      • Median Blurring
      • Bilateral Filtering
  • Histogram equalization
  • Video
  • Recitation 2

Take-home Quiz 2

5

Derivative Approximations

  • Gradient magnitude and angle via partial derivatives
    • atan vs. atan2
  • Derivative approximations via Taylor's formula
    • 1st-derivative using central difference
    • 2nd-derivative using central difference
    • 1st-derivative using forward difference
    • 1st-derivative using backward difference
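The difference formulas named above can be sketched in a few lines of NumPy. The helper names and the test function f(x) = x² are chosen only for illustration. The last lines show why `atan2` is preferred over `atan` for gradient angles: `atan` only sees the ratio of the two partial derivatives and so cannot distinguish opposite directions.

```python
import numpy as np

def forward_diff(f, h):
    """(f(x+h) - f(x)) / h at the interior samples; error is O(h)."""
    return (f[2:] - f[1:-1]) / h

def backward_diff(f, h):
    """(f(x) - f(x-h)) / h at the interior samples; error is O(h)."""
    return (f[1:-1] - f[:-2]) / h

def central_diff(f, h):
    """(f(x+h) - f(x-h)) / (2h) at the interior samples; error is O(h^2)."""
    return (f[2:] - f[:-2]) / (2.0 * h)

def second_central_diff(f, h):
    """(f(x+h) - 2 f(x) + f(x-h)) / h^2 at the interior samples."""
    return (f[2:] - 2.0 * f[1:-1] + f[:-2]) / h**2

x = np.linspace(0.0, 1.0, 101)
h = x[1] - x[0]
f = x**2                      # f'(x) = 2x, f''(x) = 2
interior = x[1:-1]

d_central = central_diff(f, h)          # exact for quadratics
d_forward = forward_diff(f, h)          # equals 2x + h for this f
d_second = second_central_diff(f, h)

# Gradient pointing to the lower left: both partial derivatives negative.
gx, gy = -1.0, -1.0
angle_atan = np.arctan(gy / gx)         # pi/4: the sign information is lost
angle_atan2 = np.arctan2(gy, gx)        # -3*pi/4: the correct quadrant
magnitude = np.hypot(gx, gy)
```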

Derivative Approximations

Video

6

Derivative Filtering & Edge Detection

  • Convolution masks for derivative filtering
    • Smoothing before computing derivatives
    • Forward, Backward and Central Difference
    • Derivative of Gaussian Filter
    • Sobel operator
  • Naive Edge Detection
  • The Canny Edge Detector
    • Gradient computation
    • Non-maxima suppression
    • Hysteresis thresholding
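A minimal sketch of the naive edge detection idea listed above: compute Sobel derivatives, take the gradient magnitude, and threshold it. This is not the Canny detector (non-maxima suppression and hysteresis thresholding are left out), and the helper names, the edge-replicating border handling and the threshold value are assumptions made for this example.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Correlate `image` with a 3x3 kernel, replicating the border pixels."""
    rows, cols = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros((rows, cols))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def sobel_edges(image, threshold):
    """Return gradient magnitude, gradient angle and a thresholded edge map."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)
    return magnitude, angle, magnitude > threshold

# Vertical step edge between columns 2 and 3.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
magnitude, angle, edges = sobel_edges(image, threshold=1.0)
```

The thresholded map marks the two columns next to the step, which shows why naive detection produces thick edges and why a thinning step such as non-maxima suppression is useful.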

Derivative Filtering & Edge Detection

Video

Saturday December 11: Recitation 3

  • Edge Detection
    • Sobel Filter
    • Prewitt’s Filter
    • Laplacian Filter
    • Canny Edge Detector
  • Video Processing
  • Applications of Edge Detection
    • Road Lane Detection
    • Edge-based Face Recognition
  • Video
  • Recitation 3

Take-home Quiz 3


Assignment 1

7

The Structure Tensor

  • Comparing patches -- SSD
  • Quadratic form for SSD
  • Taylor's approximation in SSD
  • Weighted SSD via Gaussian convolution
  • Structure Tensor -- Geometry and Algebra

Structure Tensor

Video

8

Corner Detection

  • Harris Corner Detector
  • Rohr Corner Detector
  • Scale Space and Gaussian Pyramids
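The Harris detector can be sketched from the structure tensor as follows. This is a simplified illustration under stated assumptions: the tensor entries are summed over a square box window instead of being smoothed with a Gaussian, derivatives come from `np.gradient`, and k = 0.04 is a commonly used value rather than one taken from the course. The names `box_sum` and `harris_response` are hypothetical.

```python
import numpy as np

def box_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window around each pixel (zero padded)."""
    rows, cols = a.shape
    padded = np.pad(a, r, mode="constant")
    out = np.zeros((rows, cols))
    for di in range(2 * r + 1):
        for dj in range(2 * r + 1):
            out += padded[di:di + rows, dj:dj + cols]
    return out

def harris_response(image, r=1, k=0.04):
    """Harris cornerness det(J) - k * trace(J)^2 of the structure tensor J."""
    iy, ix = np.gradient(image.astype(float))   # derivatives along rows, columns
    jxx = box_sum(ix * ix, r)
    jxy = box_sum(ix * iy, r)
    jyy = box_sum(iy * iy, r)
    det = jxx * jyy - jxy**2
    trace = jxx + jyy
    return det - k * trace**2

# Bright square in the lower right: one corner at row 6, column 6.
image = np.zeros((12, 12))
image[6:, 6:] = 1.0
response = harris_response(image)
corner = np.unravel_index(np.argmax(response), response.shape)
```

Flat regions give a response of zero, straight edges give a negative response, and the response is largest at the corner of the square.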

Corner Detection

Video

Saturday December 18: Recitation 4

  • Corner Detection
    • Harris corner detector
    • Shi-Tomasi corner detector
  • Scale-space via Gaussian pyramid
  • Video
  • Recitation 4

Take-home Quiz 4


Assignment 2

9

Local Image Descriptors I

  • Sum-of-Squared-Differences (SSD)
  • Normalized Cross Correlation (NCC)
  • SIFT Keypoint Detection
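The two patch similarity measures listed above can be written compactly. The sketch below assumes equally sized grayscale patches and uses the zero-mean form of NCC; the function names and sample values are illustrative only.

```python
import numpy as np

def ssd(p, q):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    d = p.astype(float) - q.astype(float)
    return float(np.sum(d * d))

def ncc(p, q):
    """Zero-mean normalized cross correlation in [-1, 1]; larger is better.

    Undefined (division by zero) when either patch is constant.
    """
    p = p.astype(float) - p.mean()
    q = q.astype(float) - q.mean()
    return float(np.sum(p * q) / np.sqrt(np.sum(p * p) * np.sum(q * q)))

patch = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 10.0]])
brighter = 2.0 * patch + 5.0      # same structure, different gain and offset

ssd_same = ssd(patch, patch)
ssd_brighter = ssd(patch, brighter)
ncc_brighter = ncc(patch, brighter)
```

SSD penalises the brightness change heavily, while NCC is invariant to such affine intensity changes, which is the usual argument for preferring it when illumination varies.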

Local Image Descriptors

Video

10

Local Image Descriptors II

  • SIFT Descriptor
  • Matching SIFT Descriptors

Local Image Descriptors

Video

Friday December 24: Recitation 5

  • Feature Extraction
    • SIFT (Scale Invariant Feature Transform)
    • SURF (Speeded-Up Robust Features)
    • ORB (Oriented FAST and Rotated BRIEF)

  • Feature Matching
    • SIFT robustness to Rotation, Scale, and Blurriness
  • Object Matching in Video
  • Video
  • Recitation 5

Take-home Quiz 5

11

Hough Transform

  • Slope-intercept representation of lines
  • Polar representation of lines
  • Line detection
  • Circle detection
  • Speed-up via gradient information
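Line detection with the polar representation ρ = x cos θ + y sin θ can be sketched as a voting procedure. The accumulator resolution (1 pixel in ρ, 1 degree in θ) and the function name `hough_lines` are assumptions for this example, and the gradient-based speed-up is not included.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space using rho = x cos(theta) + y sin(theta)."""
    rows, cols = edges.shape
    diag = int(np.ceil(np.hypot(rows, cols)))        # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)   # [0, pi)
    accumulator = np.zeros((2 * diag + 1, n_theta), dtype=int)
    theta_idx = np.arange(n_theta)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # Each edge pixel votes once for every theta: a sinusoid in (rho, theta).
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + diag, theta_idx] += 1
    return accumulator, thetas, diag

# Horizontal line y = 7 in a 40 x 100 edge map.
edges = np.zeros((40, 100), dtype=bool)
edges[7, :] = True
accumulator, thetas, diag = hough_lines(edges)
rho_idx, best_theta_idx = np.unravel_index(np.argmax(accumulator), accumulator.shape)
best_rho = rho_idx - diag
best_theta_deg = np.rad2deg(thetas[best_theta_idx])
```

For this input the strongest accumulator cell corresponds to ρ = 7 and θ = 90 degrees, i.e. the line y = 7.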

Hough Transform

Video

12

Deep Learning -- I

  • Deep Learning is unavoidable
  • Machine Learning
  • Deep Learning
  • The Artificial Neuron
  • Neural Networks
  • Loss Functions

Deep Learning

Video

Friday January 14: Recitation 6

  • Hough Transform
    • Line Detection
    • Circle Detection
    • Applications
      • Red Blood Cells Estimation
      • Iris Detection
      • Traffic Signs Detection
    • Video
    • Recitation 6

Take-home Quiz 6

Assignment 3

Project Proposal

13

Deep Learning -- II

  • Multiclass Classification
  • Activation Functions
  • Regularization

Deep Learning

Video

14

Convolutional Neural Network (CNN)

  • Convolutional Filters
  • Subsampling
  • Fully Connected Layers
  • 1x1 Convolutions
  • Depthwise separable Convolutions
  • Transposed Convolutions
  • Unpooling
  • Fully Convolutional Networks
  • Semantic Segmentation
  • Residual Blocks

Convolutional Neural Network (CNN)

Video

Friday January 21: Recitation 7

  • Keras basics
  • About train, val and test sets
  • Building blocks in Keras
  • Classifying cats vs dogs using CNN
  • Video
  • Recitation 7

  • Mid-Term Exam

15

Object Detection, Classification and Segmentation via Mask R-CNN

  • Detection vs. Classification vs. Segmentation
  • Overall Architecture

Object Detection, Classification and Segmentation via Mask R-CNN

Video

16

Object Detection, Classification and Segmentation via Mask R-CNN

  • Feature Pyramid Network (FPN)
  • Region Proposal Network (RPN)
  • RoIAlign
  • Classification
  • Bounding Box Regression
  • Instance Segmentation

Object Detection, Classification and Segmentation via Mask R-CNN

Video

Friday February 4: Recitation 8

17

2D Spatial Transformations

  • Matrix ≡ Linear Transformation
  • Scaling, Shear, Rotation
  • Translation is not linear in ℝ²
  • Homogeneous Coordinates make translation linear in ℙ²
  • 2D Affine Transformation (Scaling, Shear, Rotation, Translation)
    • 6 degrees of freedom
  • 2D Projective Transformation (Homography)
    • 8 degrees of freedom
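To make the role of homogeneous coordinates concrete, the sketch below builds 3x3 matrices for translation, rotation and scaling and composes them by matrix multiplication. The helper names are hypothetical. Dividing by the third coordinate is a no-op for affine matrices but is needed for general homographies.

```python
import numpy as np

def translation(tx, ty):
    """Translation becomes a matrix product once points carry a third coordinate 1."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def scaling(sx, sy):
    return np.diag([sx, sy, 1.0])

def apply_transform(H, points):
    """Apply a 3x3 matrix H to an N x 2 array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]    # back from homogeneous coordinates

# Rotate by 90 degrees about the origin, then translate by (2, 3).
A = translation(2.0, 3.0) @ rotation(np.pi / 2)
points = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
moved = apply_transform(A, points)           # (1, 0) -> (2, 4), (0, 1) -> (1, 3)
```

The last row of an affine matrix is fixed to (0, 0, 1), leaving 6 free entries; a homography is a general 3x3 matrix defined up to scale, leaving 8.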

2D Spatial Transformations

Video

18

Estimating and Applying Transformations

  • Recovering best affine transformation from correspondences
  • Recovering best projective transformation from correspondences -- Direct Linear Transform (DLT)
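A basic version of the Direct Linear Transform can be sketched as follows: each correspondence contributes two linear equations in the nine entries of H, and the solution is the right singular vector with the smallest singular value. This sketch omits coordinate normalisation, which is usually recommended for real data, and it assumes the bottom-right entry of H is non-zero when rescaling. The function names and the example matrix are made up for illustration.

```python
import numpy as np

def apply_homography(H, points):
    """Map N x 2 points through H and divide by the third coordinate."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def estimate_homography_dlt(src, dst):
    """Estimate H with dst ~ H src from N >= 4 correspondences (N x 2 arrays)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h1 . p) / (h3 . p) and v = (h2 . p) / (h3 . p), p = (x, y, 1).
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)      # null vector of A (least squares if noisy)
    return H / H[2, 2]            # fix the arbitrary scale (assumes H[2, 2] != 0)

H_true = np.array([[1.1, 0.2, 0.5],
                   [-0.1, 0.9, 0.3],
                   [0.1, 0.2, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.2]])
dst = apply_homography(H_true, src)
H_est = estimate_homography_dlt(src, dst)
```

With noise-free correspondences the estimate matches `H_true` up to floating-point error; with noisy data the same code returns the algebraic least-squares solution.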

Estimating Transformations

Video

Friday February 11: Recitation 9

  • Affine Transformation
    • Scaling
    • Shearing
    • Translation
    • Rotation
    • Estimation of affine transform
  • Projective transformation (Homography)
    • Perspective correction
    • Virtual Billboard
    • Add sponsors to sponsor-board in a video
  • Video
  • Recitation 9

Take-home Quiz 7

19

... DLT continued

Image Warping

  • Problems with naïve warping
  • Inverse transformation
  • Bilinear Interpolation
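Inverse warping with bilinear interpolation can be sketched as below: instead of pushing source pixels forward (which leaves holes), every output pixel is mapped back through the inverse transformation and the source image is sampled at that real-valued location. Function names are hypothetical, the image is assumed to be single-channel, and output pixels whose source location falls outside the image are simply left at zero.

```python
import numpy as np

def bilinear(image, x, y):
    """Sample `image` at real-valued (x, y); x runs along columns, y along rows."""
    rows, cols = image.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    a, b = x - x0, y - y0                      # fractional offsets in [0, 1)
    top = (1 - a) * image[y0, x0] + a * image[y0, x1]
    bottom = (1 - a) * image[y1, x0] + a * image[y1, x1]
    return (1 - b) * top + b * bottom

def warp_inverse(image, H_inv, out_shape):
    """For every output pixel, look up its source location via H_inv."""
    out = np.zeros(out_shape)
    for yo in range(out_shape[0]):
        for xo in range(out_shape[1]):
            xs, ys, ws = H_inv @ np.array([xo, yo, 1.0])
            xs, ys = xs / ws, ys / ws
            if 0 <= xs <= image.shape[1] - 1 and 0 <= ys <= image.shape[0] - 1:
                out[yo, xo] = bilinear(image, xs, ys)
    return out

image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])
centre_value = bilinear(image, 0.5, 0.5)       # average of all four pixels

# Shift the image right by half a pixel: the inverse maps x to x - 0.5.
H_inv = np.array([[1.0, 0.0, -0.5],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
shifted = warp_inverse(image, H_inv, image.shape)
```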

Image Warping

Video

20

Robust Estimation

  • Outliers
  • Least-squares models are sensitive to outliers
  • Robust Estimation via RANSAC
    • Line Fitting
    • Homography Estimation
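Robust line fitting with RANSAC can be sketched as follows: repeatedly fit a line to two random points, count the points within a distance tolerance, keep the largest consensus set, and refit on it. The iteration count, tolerance, seed and the final total-least-squares refit are choices made for this example, not values from the course.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.1, seed=0):
    """Return (point on line, unit direction, inlier mask) for N x 2 points."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, direction = points[i], points[j] - points[i]
        norm = np.linalg.norm(direction)
        if norm == 0:
            continue                                   # degenerate sample
        normal = np.array([-direction[1], direction[0]]) / norm
        distances = np.abs((points - p) @ normal)      # point-to-line distances
        inliers = distances < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set: principal direction of the centred inliers.
    inlier_points = points[best_inliers]
    centroid = inlier_points.mean(axis=0)
    _, _, vt = np.linalg.svd(inlier_points - centroid)
    return centroid, vt[0], best_inliers

# 30 points on y = 2x + 1 plus 8 gross outliers.
xs = np.linspace(0.0, 5.0, 30)
line_points = np.column_stack([xs, 2.0 * xs + 1.0])
outliers = np.array([[0.0, 5.0], [1.0, 8.0], [2.0, 0.0], [3.0, 12.0],
                     [4.0, 2.0], [5.0, 4.0], [1.0, -3.0], [4.0, 15.0]])
points = np.vstack([line_points, outliers])

centroid, direction, inliers = ransac_line(points)
slope = direction[1] / direction[0]
intercept = centroid[1] - slope * centroid[0]
```

An ordinary least-squares fit to all 38 points would be pulled towards the outliers; here only the consensus set influences the final line.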

RANSAC

Video

Friday February 18: Recitation 10

  • Robust line fitting using RANSAC
  • Object tracking in videos
  • Image stitching
  • Video
  • Recitation 10

21

Optic Flow -- Local I

  • Gray value constancy
  • Linearized Optic Flow Constraint (OFC)
  • Aperture Problem
  • Normal Flow
  • Local Method of Lucas & Kanade
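The local method of Lucas and Kanade can be sketched for a single patch: assume one flow vector (u, v) for the whole patch and solve the linearized optic flow constraint Ix u + Iy v + It = 0 in the least-squares sense. In this sketch the neighbourhood is the entire input array with uniform weights, and the synthetic test pattern is invented for illustration.

```python
import numpy as np

def lucas_kanade_patch(frame1, frame2):
    """Estimate a single flow vector (u, v) for the whole patch."""
    f1 = frame1.astype(float)
    f2 = frame2.astype(float)
    iy, ix = np.gradient(f1)          # spatial derivatives (rows = y, columns = x)
    it = f2 - f1                      # temporal derivative (forward difference)
    # Normal equations J [u, v]^T = b, where J is the summed structure tensor.
    J = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    b = -np.array([np.sum(ix * it), np.sum(iy * it)])
    # J is singular for flat or purely one-directional patches (aperture problem).
    return np.linalg.solve(J, b)

def pattern(x, y):
    """Smooth synthetic image with intensity variation in both directions."""
    return np.sin(0.2 * x) + np.sin(0.15 * y)

ys, xs = np.mgrid[0:40, 0:40].astype(float)
true_flow = np.array([0.2, -0.1])                     # (u, v) in pixels
frame1 = pattern(xs, ys)
frame2 = pattern(xs - true_flow[0], ys - true_flow[1])  # content moved by (u, v)
flow = lucas_kanade_patch(frame1, frame2)
```

Because the constraint is a first-order Taylor approximation, the estimate is only approximately equal to the true displacement, and the approximation degrades for larger motions.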

Optic Flow -- Local

Video

22

Optic Flow -- Local II

  • Local Method of Lucas & Kanade
  • Flow Classification via Structure Tensor
  • Flow Visualization

Optic Flow -- Local

Video

Friday February 25: Recitation 11

Take-home Quiz 8

Assignment 4

23

Optic Flow -- Global

  • Global Method of Horn & Schunck
    • Data Term
    • Smoothness Term
    • Regularization Parameter
  • Functions versus Functionals
  • Calculus of Variations
    • Euler-Lagrange Equations
  • Fixed-point Iterations

Optic Flow -- Global

Video

Take-home Quiz 9

24

Camera Geometry

  • Camera Obscura
  • Pinhole Camera Model
  • Camera Matrix
    • Intrinsic Parameters
    • Projection
    • Extrinsic Parameters
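The pinhole camera matrix can be sketched as P = K [R | t], where K holds the intrinsic parameters and (R, t) the extrinsic ones. The sketch assumes square pixels and zero skew; the focal length, principal point and 3D points are example values, and the function names are hypothetical.

```python
import numpy as np

def camera_matrix(f, cx, cy, R, t):
    """Build the 3x4 projection matrix P = K [R | t]."""
    K = np.array([[f, 0.0, cx],
                  [0.0, f, cy],
                  [0.0, 0.0, 1.0]])           # intrinsics: focal length, principal point
    Rt = np.hstack([R, t.reshape(3, 1)])      # extrinsics: world -> camera coordinates
    return K @ Rt

def project(P, X):
    """Project N x 3 world points to N x 2 pixel coordinates."""
    homogeneous = np.hstack([X, np.ones((len(X), 1))])
    x = homogeneous @ P.T
    return x[:, :2] / x[:, 2:3]               # perspective division by depth

# Camera at the world origin looking along the +Z axis.
P = camera_matrix(f=500.0, cx=320.0, cy=240.0, R=np.eye(3), t=np.zeros(3))
world_points = np.array([[0.0, 0.0, 2.0],
                         [1.0, 0.5, 2.0],
                         [1.0, 0.5, 4.0]])
pixels = project(P, world_points)
```

A point on the optical axis lands on the principal point, and doubling the depth of a point halves its offset from the principal point.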

Camera Geometry

Video

25

Camera Anatomy

  • Camera Center
  • Why do parallel lines meet in images?
    • Points at infinity
    • Vanishing points
  • What do the columns of P tell us?
  • What do the rows of P tell us?

Camera Anatomy

Video

Friday March 11: Recitation 12

26

  • Conclusion
    • What was covered?
    • What were the general principles?
    • What was not covered?

Conclusion

  • Final Exam