Course Creators and Instructors
This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, multiview geometry including stereo, motion estimation and tracking, and classification. We’ll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment (e.g. panoramas), tracking, and action recognition.
The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the problem sets. All algorithms work perfectly in the slides. But remember what Yogi Berra said: “In theory there is no difference between theory and practice. In practice there is.” (Einstein said something similar but who knows more about real life?)
In general we will not make use of image and vision libraries until first understanding (and often coding) the basic methods. You should be comfortable writing code that reflects mathematics, coding a variety of data structures, and comparing them to evaluate different hypotheses.
- Students should have a good working knowledge of Matlab (or an open source equivalent), or experience with mathematical programming in Python or C++. The course will use Matlab in lecture demonstrations, and Matlab (or an equivalent) will be better suited to the problem sets.
- This course has more math than many CS courses: Linear algebra, vector calculus, and linear algebra, probability, and linear algebra. (Get the hint?)
- No prior knowledge of vision is assumed. Some experience with programming with images is helpful. Experience with any image/signal processing will also be informative.
If you answer "no" to any of the following questions, it may be beneficial to refresh your knowledge of some material prior to taking CS 4495:
- 1. Do you remember what a Gaussian distribution is and what its parametric form looks like?
- 2. If you had a 2D array of numbers and you wanted to compute the derivative in the x direction, could you do that? How about the magnitude of the “gradient”?
- 3. If you had to draw a line on an image with your own code, could you do that (i.e., no libraries)?
- 4. If you wanted to convert a color image into a monochrome version (gray scale), would you know how to compute it?
- 5. Are you comfortable with code that works in theory but in practice the results are poorer than you expect? And do you enjoy fiddling (that’s the technical term) with parameters of your algorithm to get it to work on real images?
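As a rough self-check for questions 1 through 4, here is a minimal sketch in plain Python (no image libraries). It is only one way to answer each question, not the course's reference solution. All function names are hypothetical, the central-difference derivative and Bresenham line drawing are choices made for this illustration, and the grayscale weights are the common ITU-R BT.601 luma weights, which the course does not specify.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Question 1: 1D Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gradient(img):
    """Question 2: central differences in x (columns) and y (rows).

    img is a list of rows of numbers. Border pixels are left at 0
    to keep the sketch short. Returns (gx, gy, magnitude).
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            gy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    mag = [[math.hypot(gx[r][c], gy[r][c]) for c in range(w)] for r in range(h)]
    return gx, gy, mag

def draw_line(img, x0, y0, x1, y1, value=1):
    """Question 3: Bresenham's integer line algorithm; modifies img in place.

    Assumes both endpoints lie inside the image (no clipping).
    """
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        img[y0][x0] = value
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # step in y
            err += dx
            y0 += sy

def to_gray(rgb):
    """Question 4: weighted sum of R, G, B (assumed BT.601 luma weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]
```

If each of these functions looks like something you could have written yourself, you are probably in good shape on the first four questions. Question 5 is the one the problem sets will test.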
The grades will be based upon your performance on the problem sets. These involve modest amounts of coding and then a reasonable amount of hair pulling to get it to work on real images. You will be asked to turn in your code as well as resulting images.
Required Course Readings
The occasional text for this course is “Computer Vision: A Modern Approach (2nd edition)” by Forsyth and Ponce. I say occasional because the lectures (and the slides that drive them) are a pretty good set of source material. Occasionally you may need to look at the text to help you decipher the occasional unintelligible lecture.
Minimum Technical Requirements
- At least 32GB of available disk space and ability to install additional (free) software. For course projects, you will need to install the Oracle VirtualBox VM and run a Linux virtual machine that contains the setup for the project. Although it is possible to install the software for the projects natively in Linux, such a setup will not be supported.
- Browser and connection speed: An up-to-date version of Chrome or Firefox is strongly recommended. We also support Internet Explorer 9 and the desktop versions of Internet Explorer 10 and above (not the metro versions). A connection of 2+ Mbps is recommended; the minimum is 0.768 Mbps download speed.
- Operating system:
  - PC: Windows XP or higher with latest updates installed
  - Mac: OS X 10.6 or higher with latest updates installed
  - Linux: Any recent distribution that has the supported browsers installed
Academic Integrity
All Georgia Tech students are expected to uphold the Georgia Tech Academic Honor Code. You should read this information, as we take cheating very seriously. All Georgia Tech faculty (including the instructor for this course) are required to report cases of academic dishonesty to the Dean of Students' office at Georgia Tech. Penalties for cheating, unauthorized collaboration, etc. can be severe and are decided by the Dean of Students' office, not by the instructor.
Projects are to be done individually. You may help each other with algorithms and general computation, but your code must be your own. At the university I describe this as whiteboard-level collaboration. You can also help someone debug *their own* code. But do not copy and paste from each other.
Also, you may not use code from the internet. Some of the problem sets will have explicit solutions posted online. DO NOT USE THIS CODE or you will learn nothing.