About Me

I am currently an Assistant Professor at PUCIT. I obtained a Ph.D. in Computer Science from UCF under Marshall Tappen. Before UCF, I was a research associate at the Computer Vision Lab at LUMS. I completed my Master's degree in Computer Science at Universitaet des Saarlandes in Saarbruecken, Germany, and my Bachelor's in Computer Science at LUMS.

Phone: +92 111-923-923 Ext: 521
Email: nazarkhan at pucit.edu.pk
Address:
Ground Floor, Graduate Block
PUCIT, Punjab University, Old Campus,
The Mall, Lahore, Pakistan


Teaching

The 4 Point Lecture

Spring 2014: Probability and Statistics MA-250, Computer Vision CS-565
Fall 2014: Machine Learning CS-567, Computer Vision SE-461
Spring 2015: Advanced Machine Learning CS-667
Fall 2015: Machine Learning CS-567, Computer Vision CS-565/SE-461
Spring 2016: Advanced Machine Learning CS-667
Fall 2016: Machine Learning CS-567, Computer Vision CS-565/CS-465
Spring 2017: Linear Algebra MA-310, Advanced Machine Learning CS-667
Fall 2017: Machine Learning CS-567, Computer Vision CS-565/CS-465


Service


Reviewer for IEEE Transactions on Image Processing
Reviewer for 3D Research

Current Students


  • Saadia Shahzad (Incremental Ellipse Detection). Co-supervised with Dr. Zubair Nawaz.
  • Naila Hamid (Perceptual Line Segment Extraction).
  • Asma Shaukat (Machine Learning for Word and Sentence Similarity).
  • Tauseef Iftikhar (Probabilistic Graphical Models for Map Stitching).
  • Nausheen Qaiser (Camera-based Vehicular Speed Estimation).
  • Tayaba Anjum (Learning for Handwritten Text Recognition).
  • Waqas Tariq (Click-free Video-based Document Capture).
  • Sania Ashraf (Machine Learning for Expression Synthesis).
  • Omer Farooq (Fast 1D Hough Transform for Ellipse Detection).
  • Hussnain Haider (Handwritten, Offline Mathematical Expression Recognition).
  • Malik Masood (Automatic Map Parsing, Perceptual Line Segment Extraction).
  • Arbish Akram (Face Synthesis).
  • Ayesha Rafique (Adversarial Learning).

Current Research


Click-free, Video-based Document Capture
with Waqas Tariq

We propose a click-free method for video-based digitization of multi-page documents. The work is targeted at the non-commercial, low-volume, home user. The document is viewed through a mounted camera, and the user is only required to turn pages manually while the system automatically extracts the video frames representing stationary document pages. This is in contrast to traditional document conversion approaches such as photocopying and scanning, which can be time-consuming, repetitive, and redundant, and can lead to document deterioration.
The main contributions of our work are i) a 3-step method for automatic extraction of unique, stable and clear document pages from video, ii) a manually annotated dataset of 37 videos consisting of 763 page-turn events covering a large variety of documents, and iii) a soft, quantitative evaluation criterion that is highly correlated with the hard F1-measure. The criterion is motivated by the need to counter the subjectivity in human-marked ground truth for videos. On our dataset, we report an F1-measure of 0.91 and a soft score of 0.94 for the page extraction task.

W. Tariq and N. Khan, Click-Free, Video-Based Document Capture Methodology and Evaluation, CBDAR 2017.
[Paper] [Bib] [Dataset]
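The stationarity idea can be illustrated with a short sketch. The snippet below flags runs of consecutive low-motion frames as candidate stationary pages using mean absolute frame differencing; the thresholds and the function name stable_page_runs are illustrative only, and this is not the paper's actual 3-step method, which additionally enforces uniqueness and clarity of the extracted pages.

    import cv2
    import numpy as np

    def stable_page_runs(video_path, diff_thresh=2.0, min_run=15):
        # Flag runs of consecutive low-motion frames: a stationary page
        # shows up as a long run of nearly identical frames, while page
        # turns show up as high-difference gaps between runs.
        cap = cv2.VideoCapture(video_path)
        prev, runs, start, idx = None, [], None, 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.GaussianBlur(
                cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (5, 5), 0)
            if prev is not None:
                motion = float(np.mean(cv2.absdiff(gray, prev)))
                if motion < diff_thresh:
                    if start is None:
                        start = idx          # a stable run begins
                else:
                    if start is not None and idx - start >= min_run:
                        runs.append((start, idx))
                    start = None             # motion breaks the run
            prev, idx = gray, idx + 1
        if start is not None and idx - start >= min_run:
            runs.append((start, idx))
        cap.release()
        return runs      # each run is one candidate stationary page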
Word Pair Similarity
with Asma Shaukat

We present a novel approach for computing similarity of English word pairs. While many previous approaches compute cosine similarity of individually computed word embeddings, we compute a single embedding for the word pair that is suited for similarity computation. Such embeddings are then used to train a machine learning model. Testing results on MEN and WordSim-353 datasets demonstrate that for the task of word pair similarity, computing word pair embeddings is better than computing word embeddings only.

A. Shaukat and N. Khan, New Word Pair Level Embeddings to Improve Word Pair Similarity, WML Workshop at ICDAR 2017.
[Paper] [Bib]
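As a rough illustration of the idea, the sketch below contrasts the cosine-similarity baseline with a model trained on a pair-level feature. The concatenation-plus-interaction construction, the Ridge regressor, and the toy data are assumptions made for the example; the paper's actual pair embedding differs.

    import numpy as np
    from sklearn.linear_model import Ridge

    def cosine(u, v):
        # Baseline: cosine similarity of the two individual embeddings.
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

    def pair_embedding(u, v):
        # One plausible pair-level feature: concatenation plus elementwise
        # interactions. The paper's actual construction may differ.
        return np.concatenate([u, v, u * v, np.abs(u - v)])

    # Toy stand-ins for pretrained word vectors and human similarity ratings.
    rng = np.random.default_rng(0)
    E = {w: rng.normal(size=50) for w in ["king", "queen", "car", "truck"]}
    pairs = [("king", "queen", 8.5), ("car", "truck", 8.0), ("king", "car", 1.5)]

    # Train a regressor on pair embeddings to predict the human rating.
    X = np.stack([pair_embedding(E[a], E[b]) for a, b, _ in pairs])
    y = np.array([s for _, _, s in pairs])
    model = Ridge(alpha=1.0).fit(X, y)

    for a, b, _ in pairs:
        print(a, b,
              "cosine:", round(cosine(E[a], E[b]), 2),
              "learned:", round(float(model.predict(
                  pair_embedding(E[a], E[b])[None])[0]), 2))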
LSM: Perceptually Accurate Line Segment Merging
with Naila Hamid

Existing line segment detectors tend to break up perceptually distinct line segments into multiple segments. We propose an algorithm for merging such broken segments to recover the original perceptually accurate line segments. The algorithm proceeds by grouping line segments on the basis of angular and spatial proximity. Then those line segment pairs within each group that satisfy unique, adaptive mergeability criteria are successively merged to form a single line segment. This process is repeated until no more line segments can be merged. We also propose a method for quantitative comparison of line segment detection algorithms. Results on the York Urban dataset show that our merged line segments are closer to human-marked ground-truth line segments compared to state-of-the-art line segment detection algorithms.
N. Hamid and N. Khan, LSM: Perceptually Accurate Line Segment Merging, Journal of Electronic Imaging, 25(6), 2016
[Project page] [PDF] [Bib] [Code]
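A minimal sketch of the merging step follows, assuming a fixed angular threshold and a length-scaled endpoint-gap test in place of the paper's adaptive mergeability criteria. Segments are endpoint pairs, and a merged segment is taken to span the farthest-apart endpoints of the pair.

    import numpy as np

    def angle(seg):
        # Undirected orientation of a segment in [0, pi).
        (x1, y1), (x2, y2) = seg
        return np.arctan2(y2 - y1, x2 - x1) % np.pi

    def length(seg):
        return float(np.linalg.norm(np.subtract(seg[1], seg[0])))

    def mergeable(s1, s2, max_angle=np.deg2rad(5)):
        # Angular proximity: compare undirected orientations.
        d = abs(angle(s1) - angle(s2))
        if min(d, np.pi - d) > max_angle:
            return False
        # Spatial proximity: smallest endpoint gap, scaled by the shorter
        # segment's length (a stand-in for the adaptive criteria).
        gap = min(np.linalg.norm(np.subtract(p, q)) for p in s1 for q in s2)
        return gap < 0.5 * min(length(s1), length(s2))

    def merge(s1, s2):
        # Replace the pair by its farthest-apart endpoint pair.
        pts = list(s1) + list(s2)
        i, j = max(((i, j) for i in range(4) for j in range(i + 1, 4)),
                   key=lambda ij: np.linalg.norm(
                       np.subtract(pts[ij[0]], pts[ij[1]])))
        return (pts[i], pts[j])

    # Two nearly collinear fragments of one perceptual segment.
    s1, s2 = ((0, 0), (10, 0.1)), ((11, 0.2), (25, 0.3))
    if mergeable(s1, s2):
        print(merge(s1, s2))   # -> ((0, 0), (25, 0.3))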
Incremental Ellipse Detection
with Saadia Shahzad, Zubair Nawaz, Jerome Kieffer and Claudio Ferrero

Projections of spherical/ellipsoidal objects appear as ellipses in 2-D images. Detection of these ellipses enables information about the objects to be extracted. In some applications, images contain ellipses with scattered data, i.e., portions of an ellipse can have significant gaps in between. We initially group pixels to obtain small connected regions. Then we use an incremental algorithm to grow these scattered regions into ellipses. In our proposed algorithm, we grow a region by selecting neighbours near the region and near its best-fit ellipse. After merging the neighbours into the original region, a new ellipse is fitted and the process repeats until convergence. We evaluate our method on the problem of detecting ellipses in X-ray diffraction images, where diffraction patterns appear as so-called Debye-Scherrer rings. Detection of these rings allows calibration of the experimental setup.
Manuscript submitted for publication, 2017
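The growth loop can be sketched as follows, using OpenCV's least-squares ellipse fit. The pixel-distance test against the current best-fit ellipse is an approximation chosen for the example, not the exact criterion of the submitted manuscript.

    import cv2
    import numpy as np

    def grow_region(seed_pts, all_pts, band=3.0, max_iter=50):
        # seed_pts, all_pts: (N, 2) arrays of edge-pixel coordinates.
        if len(seed_pts) < 5:                 # cv2.fitEllipse needs >= 5 points
            return None
        region = np.asarray(seed_pts, dtype=np.float32)
        ellipse = cv2.fitEllipse(region)
        for _ in range(max_iter):
            (cx, cy), (w, h), deg = ellipse
            t = np.deg2rad(deg)
            dx, dy = all_pts[:, 0] - cx, all_pts[:, 1] - cy
            # Coordinates in the ellipse's own frame.
            u = dx * np.cos(t) + dy * np.sin(t)
            v = -dx * np.sin(t) + dy * np.cos(t)
            r = np.sqrt((u / (w / 2 + 1e-9)) ** 2 + (v / (h / 2 + 1e-9)) ** 2)
            # Approximate pixel distance to the ellipse boundary.
            near = np.abs(r - 1.0) * min(w, h) / 2 < band
            grown = all_pts[near].astype(np.float32)
            if len(grown) <= len(region):     # converged: nothing new absorbed
                break
            region = grown
            ellipse = cv2.fitEllipse(region)  # refit to the grown region
        return region, ellipse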
Video-based Vehicular Statistics Estimation
with Nausheen Qaiser

A method for automatically estimating vehicular statistics from video. Accurately locating vehicles in a video becomes challenging when brightness varies and when vehicles occlude each other. We are working towards:
  • accurate tracking of vehicles in different scenarios such as
    • slow/fast-moving traffic,
    • stop and go traffic, and
    • light variation and occlusion
  • real-time vehicle tracking and speed computation.
Currently, the algorithm yields a 19.3% error in speed estimates on a video covering 3 lanes and containing about 80 vehicles, ranging from fast-moving to slow-moving and stop-and-go traffic.
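Once a vehicle has been tracked, the speed computation itself is straightforward. The sketch below converts per-frame centroid displacements to km/h, assuming a single meters-per-pixel calibration; a deployed system would instead rectify the road plane with a homography.

    import numpy as np

    def speed_kmh(track, fps, meters_per_pixel):
        # track: (N, 2) per-frame (x, y) pixel positions of one vehicle.
        # A single metric scale is assumed here for simplicity.
        track = np.asarray(track, dtype=float)
        step_px = np.linalg.norm(np.diff(track, axis=0), axis=1)
        step_mps = step_px * meters_per_pixel * fps       # metres/second
        # Median over the track suppresses jitter from noisy detections.
        return float(np.median(step_mps) * 3.6)           # km/h

    # Example: a vehicle moving ~4 px/frame at 25 fps, 0.1 m/px -> 36 km/h.
    track = [(100 + 4 * i, 240) for i in range(50)]
    print(speed_kmh(track, fps=25, meters_per_pixel=0.1))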
Automated Rural Map Parsing For Land Record Digitization
with Malik Masood

A framework for automated mauza-map stitching from digital images of Colonial-era, hand-drawn cadastral maps. The framework
  1. is automated,
  2. detects its own failures, and therefore
  3. can fall back to a semi-automated system.
In order to assist the stitching process, we also extract meta-data from the map.
[Figure: input image, result of the previous 1D method, and result of our method]
A Fast and Improved Hough Transform based Ellipse Detector using 1D Parametric Space
with Umar Farooq

There are many approaches to detecting ellipses in images. The standard Hough Transform (HT) based approach requires a five-dimensional accumulator array to gather votes for the five parameters of an ellipse. We propose a modified HT-based ellipse detector that requires only a 1D parametric space. It overcomes the weaknesses of previous 1D approaches, which include i) missed detections when multiple ellipses partially overlap, and ii) redundant and false detections. We overcome these weaknesses while also reducing execution time by exploiting the gradient information of edge pixels.
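For context, previous 1D approaches in the spirit of Xie and Ji reduce the accumulator to a single dimension by hypothesising pairs of edge points as the major-axis endpoints, after which only the semi-minor axis needs voting. The sketch below illustrates that baseline; our detector's gradient-based pruning of candidate pairs, which removes the redundant and false votes and cuts runtime, is omitted here.

    import numpy as np

    def fit_pair(p1, p2, pts, b_max=200):
        # p1, p2: hypothesised endpoints of the major axis. The center,
        # orientation and semi-major axis then follow directly, so only
        # the semi-minor axis b needs a (1D) accumulator.
        c = (p1 + p2) / 2.0
        a = np.linalg.norm(p2 - p1) / 2.0
        acc = np.zeros(b_max)
        for p in pts:
            d = np.linalg.norm(p - c)
            if d == 0 or d >= a:
                continue                      # only interior third points vote
            f = np.linalg.norm(p - p2)
            # Law of cosines in the triangle (center, p, p2).
            cos_tau = np.clip((a*a + d*d - f*f) / (2*a*d), -1.0, 1.0)
            denom = a*a - d*d * cos_tau**2
            if denom <= 0:
                continue
            b2 = a*a * d*d * (1 - cos_tau**2) / denom
            b = int(round(np.sqrt(b2)))
            if 0 < b < b_max:
                acc[b] += 1                   # the single 1D vote
        b_best = int(np.argmax(acc))
        return c, a, b_best, int(acc[b_best])  # center, semi-axes, support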
Automated Road Condition Monitoring
with Naila Hamid, Kashif Murtaza and Raqib Omer

A framework for automated road condition monitoring. The research emphasis is on simultaneously incorporating chromatic and geometric information. Accordingly, road detection is performed using both color and vanishing-point cues. The condition of the detected road area is then determined using a hierarchical classification scheme that compensates for the non-robustness of color as a feature.
Discriminative Dictionary Learning with Spatial Priors

While smoothness priors are ubiquitous in the analysis of visual information, dictionary learning for image analysis has traditionally relied on local evidence only. We present a novel approach to discriminative dictionary learning with neighborhood constraints. This is achieved by embedding dictionaries in a Conditional Random Field (CRF) and imposing label-dependent smoothness constraints on the resulting sparse codes at adjacent sites. This way, a smoothness prior is used while learning the dictionaries and not just during inference. This is in contrast with competing approaches that learn dictionaries without such a prior. Pixel-level classification results on the Graz02 bikes dataset demonstrate that dictionaries learned in our discriminative setting with neighborhood smoothness constraints can equal the state-of-the-art performance of bottom-up (i.e., superpixel-based) segmentation approaches.

N. Khan and M. F. Tappen, Discriminative Dictionary Learning with Spatial Priors, ICIP 2013. [Paper] [Presentation] (Top 10% paper)
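A simplified version of the coupled sparse-coding idea is sketched below: codes at adjacent sites are tied by a quadratic smoothness penalty and found by subgradient descent. The objective and optimiser are stand-ins chosen for the example; the paper instead embeds the dictionary in a CRF with label-dependent smoothness terms.

    import numpy as np

    def smooth_sparse_codes(X, D, edges, lam=0.1, gamma=0.5, lr=0.01, iters=200):
        # X: (d, n) data matrix, one column per image site.
        # D: (d, k) dictionary.  edges: list of (i, j) adjacent site pairs.
        # Minimises  sum_i ||x_i - D z_i||^2 + lam * ||z_i||_1
        #          + gamma * sum_{(i,j)} ||z_i - z_j||^2
        # by subgradient descent.
        k, n = D.shape[1], X.shape[1]
        Z = np.zeros((k, n))
        for _ in range(iters):
            grad = 2 * D.T @ (D @ Z - X) + lam * np.sign(Z)
            for i, j in edges:
                diff = Z[:, i] - Z[:, j]
                grad[:, i] += 2 * gamma * diff   # smoothness pulls neighbours
                grad[:, j] -= 2 * gamma * diff   # towards each other
            Z -= lr * grad
        return Z

    # Tiny example: 3 sites on a chain, random data and dictionary.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 3))
    D = rng.normal(size=(8, 16))
    Z = smooth_sparse_codes(X, D, edges=[(0, 1), (1, 2)])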



Previous Research


Stable Discriminative Dictionary Learning via Discriminative Deviation

Discriminative learning of sparse-code based dictionaries tends to be inherently unstable. We show that using a discriminative version of the deviation function to learn such dictionaries leads to a more stable formulation that can handle the reconstruction/discrimination trade-off in a principled manner. Results on the Graz02 and UCF Sports datasets validate the proposed formulation.

N. Khan and M. F. Tappen, Stable Discriminative Dictionary Learning via Discriminative Deviation, ICPR 2012. [Paper] [Presentation] (Acceptance Rate: 16.13%)
Correcting Cuboid Corruption for Action Recognition in Complex Environment

The success of recognizing periodic actions in single-person, simple-background datasets, such as Weizmann and KTH, has created a need for more difficult datasets to push the performance of action recognition systems. We identify a significant weakness in systems based on popular descriptors by creating a synthetic dataset from the Weizmann dataset. Experiments show that introducing complex backgrounds, stationary or dynamic, into the video causes a significant degradation in recognition performance. Moreover, this degradation cannot be fixed by fine-tuning the system or selecting better interest points. Instead, we show that the problem lies at the cuboid level and must be addressed by modifying the cuboids themselves.

S.Z. Masood, A. Nagaraja, N. Khan, J. Zhu, and M. F. Tappen. Correcting Cuboid Corruption for Action Recognition in Complex Environment. VECTaR2011 Workshop at ICCV 2011. [Paper]
Training Many-Parameter Shape-from-Shading Models Using a Surface Database

Shape-from-shading (SFS) methods tend to rely on models with few parameters because these parameters need to be hand-tuned. This limits the number of different cues that the SFS problem can exploit. In this paper, we show how machine learning can be applied to an SFS model with a large number of parameters. Our system learns a set of weighting parameters that use the intensity of each pixel in the image to gauge the importance of that pixel in the shape reconstruction process. We show empirically that this leads to a significant increase in the accuracy of the recovered surfaces. Our learning approach is novel in that the parameters are optimized with respect to the actual surfaces output by the system. In the first, offline phase, a hemisphere is rendered using a known illumination direction. The isophotes in the resulting reflectance map are then modelled using Gaussian mixtures to obtain a parametric representation of the isophotes. This Gaussian parameterization is then used in the second phase to learn intensity-based weights using a database of 3D shapes. The weights can also be optimized for a particular input image.

N. Khan and M.F. Tappen, Training Many-Parameter Shape-from-Shading Models Using a Surface Database, 3DIM 2009 Workshop at ICCV 2009. [Paper] [Presentation]
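The offline phase can be illustrated as follows: a Lambertian hemisphere is rendered under a known light direction and one isophote of the resulting image is modelled by a Gaussian mixture. The grid size, light direction, intensity level, and number of mixture components are arbitrary example choices, not the paper's settings.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Offline phase: render a Lambertian hemisphere under a known light.
    L = np.array([0.3, 0.2, 0.93])
    L /= np.linalg.norm(L)
    xs, ys = np.meshgrid(np.linspace(-1, 1, 200), np.linspace(-1, 1, 200))
    mask = xs**2 + ys**2 < 1.0
    zs = np.sqrt(np.clip(1 - xs**2 - ys**2, 0, None))
    normals = np.dstack([xs, ys, zs])           # unit normals on the sphere
    intensity = np.clip(normals @ L, 0, None) * mask

    # Model one isophote (pixels of near-constant intensity) with a
    # Gaussian mixture, giving a parametric curve representation.
    level, tol = 0.7, 0.02
    iso = (np.abs(intensity - level) < tol) & mask
    pts = np.column_stack(np.nonzero(iso)).astype(float)
    gmm = GaussianMixture(n_components=4, random_state=0).fit(pts)
    print("isophote at level", level, "modelled by Gaussian means:\n", gmm.means_)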
3D Pose Estimation Using Implicit Algebraic Surfaces

2D-3D pose estimation deals with estimating the relative position and orientation of a known 3D model from a 2D image of the model. Common explicit approaches to the problem involve registering the 3D model points to image data in order to reveal the optimal pose parameters. In contrast, this work presents an implicit approach by representing the 3D model and the image silhouette as zero-sets of implicit polynomials and then minimising the distance between image outline pixels and the zero-set of the silhouette equation to reveal the optimal pose parameters. This work deals with representing 3D models as implicit polynomials, then computing silhouette equations using elimination theory, and finally estimating pose parameters. (Work done under Bodo Rosenhahn at the Graphics department of the Max-Planck Institute for Computer Science.)

N. Khan, Implicit 2D-3D Pose Estimation, Master's Thesis, Universitaet des Saarlandes, 2006. [PDF] [Thesis]
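The core idea, minimising an implicit-polynomial residual over outline pixels, can be sketched in a 2D toy setting. Here a unit circle stands in for the silhouette polynomial and only a planar rotation and translation are optimised; the actual work recovers full 3D pose from silhouette equations derived via elimination theory.

    import numpy as np
    from scipy.optimize import minimize

    def F(p, pts):
        # Implicit quadratic F(x, y); a stand-in for the silhouette
        # polynomial obtained via elimination theory in the thesis.
        x, y = pts[:, 0], pts[:, 1]
        return p[0]*x*x + p[1]*x*y + p[2]*y*y + p[3]*x + p[4]*y + p[5]

    def pose_cost(theta, outline, poly):
        # Squared polynomial residual of the transformed outline pixels;
        # zero when every pixel lies on the zero-set of F.
        ang, tx, ty = theta
        R = np.array([[np.cos(ang), -np.sin(ang)],
                      [np.sin(ang),  np.cos(ang)]])
        return np.sum(F(poly, outline @ R.T + [tx, ty]) ** 2)

    # Unit circle as the model silhouette: x^2 + y^2 - 1 = 0.
    poly = np.array([1.0, 0.0, 1.0, 0.0, 0.0, -1.0])
    t = np.linspace(0, 2 * np.pi, 100)
    outline = np.column_stack([np.cos(t) + 0.3, np.sin(t) - 0.2])  # displaced

    res = minimize(pose_cost, x0=np.zeros(3), args=(outline, poly))
    print("recovered pose (angle, tx, ty):", res.x)  # tx, ty ~ (-0.3, 0.2)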