Computer Vision
16720 Coursework - Fall 2020
Spatial Pyramid Matching for Scene Classification
Built an image representation based on bags of visual words and used spatial pyramid matching for scene classification.
The program classifies images into 8 types of scenes. The following figure illustrates the overview of the bag-of-words approach implemented in this task:
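As a rough illustration of the spatial pyramid step, the sketch below concatenates weighted visual-word histograms computed over progressively finer grids and compares two such features by histogram intersection. It assumes a `wordmap` (one visual-word index per pixel) has already been computed; the function names, the three-layer default, and the layer weights (the usual spatial pyramid matching weights) are assumptions for illustration, not the coursework code.

```python
import numpy as np

def spatial_pyramid_histogram(wordmap, num_words, num_layers=3):
    """Concatenate weighted visual-word histograms over a spatial pyramid.

    wordmap: 2-D integer array, each entry a visual-word index in [0, num_words).
    Layer l splits the map into 2^l x 2^l cells.
    """
    finest = num_layers - 1
    features = []
    for layer in range(num_layers):
        cells = 2 ** layer
        # Assumed weighting: 2^(-L) for layer 0, 2^(l - L - 1) for layer l >= 1
        if layer == 0:
            weight = 2.0 ** (-finest)
        else:
            weight = 2.0 ** (layer - finest - 1)
        for rows in np.array_split(wordmap, cells, axis=0):
            for cell in np.array_split(rows, cells, axis=1):
                hist = np.bincount(cell.ravel(), minlength=num_words)
                features.append(weight * hist.astype(float))
    feature = np.concatenate(features)
    return feature / feature.sum()  # L1-normalise the full feature

def histogram_intersection(feature, train_features):
    """Similarity between one feature (K,) and a stack of features (N, K)."""
    return np.minimum(feature, train_features).sum(axis=1)
```

A test image could then be assigned the label of the training feature with the highest intersection score, which is one simple nearest-neighbour way to use this representation.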
Augmented Reality with Planar Homographies
- Implemented an AR application using planar homographies.
- The program finds correspondences between two images using the BRIEF descriptor.
- It then estimates the homography between the images.
- The images are warped to overlay a Harry Potter image onto a Computer Vision textbook cover.
- This is extended to video to produce the Augmented Reality application shown in the following video.
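The homography estimation step can be sketched with the direct linear transform (DLT). This is a minimal version that assumes matched point pairs are already available (for example from BRIEF matching); it omits coordinate normalisation and RANSAC outlier rejection, which a robust pipeline would typically add. The names `compute_homography` and `apply_homography` are hypothetical.

```python
import numpy as np

def compute_homography(x1, x2):
    """Return H (3x3) such that x1 ~ H @ x2 in homogeneous coordinates.

    x1, x2: (N, 2) arrays of matched points, N >= 4, not all collinear.
    """
    rows = []
    for (u, v), (x, y) in zip(x1, x2):
        # Two linear constraints per correspondence on the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale (assumes H[2, 2] is not zero)

def apply_homography(H, pts):
    """Map (N, 2) points through H and convert back from homogeneous form."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

With `H` in hand, the overlay image can be warped into the frame of the book cover and composited over it, frame by frame for video.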
Lucas-Kanade Tracking
- Implemented a simple Lucas-Kanade (LK) tracker with a single template.
- The tracker uses a pure translation warp to follow the template throughout the video.
- Corrected for template drift by updating the template as the video progresses.
- Implemented a motion subtraction method for tracking moving pixels in a scene.
- Studied efficient tracking using inverse composition.
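To make the translation-only tracker concrete, here is a minimal sketch of one frame-to-frame Lucas-Kanade update using the forward-additive formulation (not the inverse compositional variant, and without template-drift correction). The helper `bilinear`, the rectangle convention `(x1, y1, x2, y2)`, and the iteration limits are assumptions for illustration.

```python
import numpy as np

def bilinear(img, xs, ys):
    """Sample img at float coordinates (xs, ys), clamping to the image border."""
    h, w = img.shape
    xs = np.clip(xs, 0, w - 1.001)
    ys = np.clip(ys, 0, h - 1.001)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    ax, ay = xs - x0, ys - y0
    top = (1 - ax) * img[y0, x0] + ax * img[y0, x0 + 1]
    bottom = (1 - ax) * img[y0 + 1, x0] + ax * img[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bottom

def lucas_kanade_translation(template_img, current_img, rect,
                             num_iters=100, tol=1e-3):
    """Estimate p = (dx, dy) that moves rect from template_img to current_img.

    rect = (x1, y1, x2, y2) with inclusive pixel bounds (assumed convention).
    """
    x1, y1, x2, y2 = rect
    xs, ys = np.meshgrid(np.arange(x1, x2 + 1, dtype=float),
                         np.arange(y1, y2 + 1, dtype=float))
    template = bilinear(template_img.astype(float), xs, ys)
    current = current_img.astype(float)
    grad_y, grad_x = np.gradient(current)  # derivatives along rows, columns
    p = np.zeros(2)
    for _ in range(num_iters):
        wx, wy = xs + p[0], ys + p[1]
        error = (template - bilinear(current, wx, wy)).ravel()
        # For a pure translation the warp Jacobian is the identity, so the
        # steepest-descent images are just the warped image gradients.
        A = np.stack([bilinear(grad_x, wx, wy).ravel(),
                      bilinear(grad_y, wx, wy).ravel()], axis=1)
        dp, *_ = np.linalg.lstsq(A, error, rcond=None)
        p += dp
        if np.linalg.norm(dp) < tol:
            break
    return p
```

The inverse compositional variant mentioned above gains its efficiency by computing the gradients and the Hessian once on the template instead of on every warped frame.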
3D Reconstruction
- Created a 3D reconstruction of an object from a stereo pair of images.
- Estimated the Fundamental Matrix using eight-point and seven-point algorithms.
- Calculated the Essential Matrix using the Fundamental Matrix and calibrated camera intrinsics.
- Used triangulation to obtain a metric 3D reconstruction from 2D correspondences.
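The pipeline above can be sketched as follows: a normalised eight-point estimate of the fundamental matrix, the essential matrix obtained from it and the intrinsics, and linear triangulation given two camera matrices. This is a simplified sketch under stated assumptions: the function names and the single `scale` normalisation are illustrative, and the seven-point algorithm, nonlinear refinement, and the selection of the correct camera pose from the essential matrix are left out.

```python
import numpy as np

def eight_point(pts1, pts2, scale):
    """Estimate F with x2^T F x1 = 0 from N >= 8 correspondences.

    pts1, pts2: (N, 2) pixel coordinates; scale: e.g. max(image width, height).
    """
    T = np.diag([1.0 / scale, 1.0 / scale, 1.0])
    x1 = np.hstack([pts1, np.ones((len(pts1), 1))]) @ T.T
    x2 = np.hstack([pts2, np.ones((len(pts2), 1))]) @ T.T
    # Each row holds the products x2_i * x1_j, matching F flattened row-major
    A = np.einsum('ni,nj->nij', x2, x1).reshape(-1, 9)
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value
    u, s, vt = np.linalg.svd(F)
    F = u @ np.diag([s[0], s[1], 0.0]) @ vt
    return T.T @ F @ T  # undo the normalisation

def essential_matrix(F, K1, K2):
    """E = K2^T F K1 for calibrated cameras with intrinsics K1, K2."""
    return K2.T @ F @ K1

def triangulate(P1, pts1, P2, pts2):
    """Linear triangulation of (N, 2) correspondences given 3x4 cameras."""
    points = []
    for (u1, v1), (u2, v2) in zip(pts1, pts2):
        A = np.array([u1 * P1[2] - P1[0],
                      v1 * P1[2] - P1[1],
                      u2 * P2[2] - P2[0],
                      v2 * P2[2] - P2[1]])
        _, _, vt = np.linalg.svd(A)
        X = vt[-1]
        points.append(X[:3] / X[3])  # back to inhomogeneous coordinates
    return np.array(points)
```

In practice the second camera matrix would be built from one of the four poses encoded in the essential matrix, keeping the one that places the triangulated points in front of both cameras.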
Neural Networks for Recognition
- Implemented a fully connected neural network that recognizes handwritten letters in an image, trained on the NIST36 dataset and reaching a test accuracy of around 76%.
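For orientation, a forward pass through a small fully connected classifier might look like the sketch below. The architecture is an assumption, not the coursework configuration: one sigmoid hidden layer, a softmax output over 36 classes, Xavier-style uniform initialisation, and flattened input vectors whose length is chosen by the caller. Training (backpropagation and gradient descent) is omitted.

```python
import numpy as np

def init_layer(in_size, out_size, rng):
    """Xavier-style uniform initialisation for one fully connected layer."""
    bound = np.sqrt(6.0 / (in_size + out_size))
    W = rng.uniform(-bound, bound, (in_size, out_size))
    return W, np.zeros(out_size)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtract the row-wise maximum for numerical stability
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, params):
    """X: (batch, in_size). Returns class probabilities (batch, num_classes)."""
    W1, b1, W2, b2 = params
    hidden = sigmoid(X @ W1 + b1)
    return softmax(hidden @ W2 + b2)

def cross_entropy_and_accuracy(probs, labels):
    """Mean cross-entropy loss and accuracy for integer class labels."""
    n = len(labels)
    loss = -np.log(probs[np.arange(n), labels]).mean()
    accuracy = (probs.argmax(axis=1) == labels).mean()
    return loss, accuracy
```

For example, with hypothetical sizes of 1024 inputs and 64 hidden units, `params` would be built from `init_layer(1024, 64, rng)` and `init_layer(64, 36, rng)`.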
Photometric Stereo
- Rendering the n-dot-l lighting
- Calibrated Photometric Stereo - Lighting directions are given
- Uncalibrated Photometric Stereo - No lighting directions are given
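As a sketch of the first two items, the code below renders a surface with the Lambertian dot-product shading model and then solves calibrated photometric stereo by least squares. The array layouts (`I` as lights by pixels, `L` as 3 by lights) and the function names are assumptions; shadows and specular highlights are ignored. The uncalibrated case, where the lighting must also be recovered, is not covered here.

```python
import numpy as np

def render_n_dot_l(normals, light, albedo=1.0):
    """Lambertian shading: intensity = albedo * max(n . l, 0).

    normals: (H, W, 3) unit surface normals; light: (3,) lighting direction.
    """
    return albedo * np.clip(normals @ light, 0.0, None)

def calibrated_photometric_stereo(I, L):
    """Recover per-pixel albedo and unit normals from known lighting.

    I: (num_lights, num_pixels) intensities; L: (3, num_lights) directions.
    Returns albedos (num_pixels,) and normals (3, num_pixels).
    """
    # Lambertian model: I = L^T B, where B = albedo * normal (pseudonormals)
    B, *_ = np.linalg.lstsq(L.T, I, rcond=None)
    albedos = np.linalg.norm(B, axis=0)
    normals = B / np.maximum(albedos, 1e-12)  # guard against zero albedo
    return albedos, normals
```

At least three non-coplanar lighting directions are needed for the least-squares system to determine each pseudonormal.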