Computer Vision (CS 565)
Fall 2018

Dr. Nazar Khan (firstnamelastname attherateof pucit dot edu dot pk)

Human beings (and even animals) "look" at the real world and extract highly accurate information from it with remarkable efficiency. Computers can fail catastrophically at this task! In this course we examine why "Vision" is a difficult problem and go through successful, mathematically well-founded techniques for solving it.

This course applies mathematical concepts from Linear Algebra and Calculus. Students will therefore do well to brush up on their Linear Algebra, Calculus and programming skills before taking this class. The techniques learned here are also useful in other areas such as Image Processing, Machine Learning, Artificial Intelligence and Computer Graphics.

Lectures:
Monday, Wednesday 8:15 am - 09:45 am, Al Khwarizmi Lecture Theater

Office Hours:
Monday, Wednesday 11:30 am - 01:00 pm

Programming Environment: Python

Grading:
Assignments+Project 20%
Quizzes 5%
Mid-Term 35%
Final 40%

  1. Theoretical assignments have to be submitted before the lecture on the due date.
  2. There will be no make-up for any missed quiz.
  3. Make-up for a mid-term or final exam will be allowed only under exceptional circumstances provided that the instructor has been notified beforehand.
  4. The instructor reserves the right to deny requests for any make-up quiz or exam.
  5. The worst quiz score will be dropped.
  6. The worst assignment score will be dropped.

Assignments:

  • Assignment 1 (Due: Wednesday, December 05, 2018 before 5:30 PM)
  • Assignment 2 (Due: Wednesday, December 12, 2018 before 5:30 PM)
  • Assignment 3: Implement the Hough Transform and overlay detected lines over the input image. (Due: Monday, January 21, 2019 before 5:30 PM)
  • Assignment 4: Estimate Affine transform from N correspondences. Repeat for Homography. Warp input image by a transformation matrix. (Due: Wednesday, January 23, 2019 before 5:30 PM)

Grades:
Grading sheet (Accessible only through your PUCIT email account)

Content:

  1. Introduction
    • What is Computer Vision?
    • Computer Vision vs. Biological Vision -- The Grand Deception!
    • Applications of Computer Vision.
  2. Background Mathematics
    • Cartesian vs. Image axis
    • Taylor's formula
    • Matrix and Vector calculus
    • Eigenvectors
    • Constrained optimisation
    • Singular Value Decomposition (SVD)
  3. Image Processing
    • Image Filtering
      • Convolution
      • Properties of convolution
      • Mean Filtering
      • Gaussian Filtering
      • Non-linear filtering cannot be performed via convolution
  4. Edge Detection
    • Gradient magnitude and angle via partial derivatives
      • atan vs. atan2
    • Derivative approximations via Taylor's formula
    • Convolution masks for derivative filtering
      • Forward, Backward and Central Difference
      • Sobel operator
      • Derivative of Gaussian Filter
    • Naive Edge Detection
    • The Canny Edge Detector
      • Gradient computation
      • Non-maxima suppression
      • Hysteresis thresholding
  5. Corner Detection
    • Structure Tensor -- Geometry and Algebra
    • Harris Corner Detector
    • Rohr Corner Detector
    • Scale Space and Gaussian Pyramids
  6. Local Image Descriptors
    • Sum-of-Squared-Differences (SSD)
    • Normalized Cross Correlation (NCC)
    • SIFT
      • Keypoint Detection
      • Descriptor
    • Matching SIFT Descriptors
    • OpenCV Examples
  7. Hough Transform
    • Slope-intercept representation of lines
    • Polar representation of lines
    • Line detection
    • Circle detection
    • Speed-up via gradient information
  8. 2D Spatial Transformations
    • Matrix ≡ Linear Transformation
    • Scaling, Shear, Rotation
    • Translation is not linear in ℝ^2
    • Homogeneous Coordinates make translation linear in ℙ^2
    • 2D Affine Transformation (Scaling, Shear, Rotation, Translation)
      • 6 degrees of freedom
      • Recovering best affine transformation from correspondences
      • Affine Image Warping
    • 2D Projective Transformation (Homography)
      • 8 degrees of freedom
      • Recovering best projective transformation from correspondences -- Direct Linear Transform (DLT)
      • Projective Image Warping
  9. Optic Flow
    • Gray value constancy
    • Linearized Optic Flow Constraint (OFC)
    • Aperture Problem
    • Normal Flow
    • Local Method of Lucas & Kanade
    • Global Method of Horn & Schunck
  10. Robust Estimation via RANSAC
    • Outliers
    • Least-squares models are sensitive to outliers
    • Robust Estimation via RANSAC
    • Robust Line Fitting via RANSAC
    • Robust Homography Estimation in OpenCV
  11. Camera Geometry
    • Pinhole Camera Geometry
    • Camera Matrix = Intrinsic × Projection × Extrinsic
    • Camera Models
    • Camera Matrix Anatomy
  12. Camera Calibration
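
Topic 8 above states that translation is not linear in ℝ^2 but becomes linear once points are written in homogeneous coordinates. The following is a minimal sketch of that claim in plain Python, not course material: the helper names (matvec, translate), the offset t and the sample points p and q are all assumptions chosen for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def translate(p, t):
    """Translate the 2D point p by the offset t."""
    return [p[0] + t[0], p[1] + t[1]]

# Illustrative values (assumptions, not from the course).
t = [3, -2]
p, q = [1, 2], [4, 5]

# A linear map f must satisfy f(p + q) == f(p) + f(q).
p_plus_q = [p[0] + q[0], p[1] + q[1]]
lhs = translate(p_plus_q, t)
fp, fq = translate(p, t), translate(q, t)
rhs = [fp[0] + fq[0], fp[1] + fq[1]]
translation_is_linear_in_R2 = (lhs == rhs)  # False: the offset t is added twice in rhs

# In homogeneous coordinates the same translation is a single 3x3 matrix product.
T = [[1, 0, t[0]],
     [0, 1, t[1]],
     [0, 0, 1]]
p_h = [p[0], p[1], 1]        # append w = 1 to lift p into homogeneous coordinates
moved_h = matvec(T, p_h)     # translated point, still with w = 1
moved = moved_h[:2]          # drop w to return to ordinary 2D coordinates
```

Because the translation is now a matrix, it can be composed with scaling, shear and rotation matrices by ordinary matrix multiplication, which is what makes a single 3x3 affine matrix possible.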