Homography
Homography is a mathematical transformation that maps points from one plane to another, and in the context of the Kivy hardware, it is used to calibrate the system by establishing a precise link between the real-world coordinates of the user’s hand (as seen by the camera) and the virtual screen coordinates projected onto the table. Because the camera and projector view the surface from different angles, raw input from the camera is geometrically distorted; homography corrects this distortion by applying a 3×3 matrix computed during a calibration step, allowing the system to accurately interpret gestures and positions in real time as if the hand were interacting directly with the UI.
Homography Paper
The math for homography can be downloaded here:
149 KBYou can also browse the homography paper below.
Introduction
In interactive systems where a camera observes hand positions and a projector displays targets, it is often necessary to map coordinates from the camera space (where the user’s finger is detected) to the projector space (where the interface is projected). This mapping can be achieved using a homography matrix, assuming both the projector plane and camera-captured hand landmarks lie approximately on the same physical plane (e.g., a table surface).
Homography Overview
A homography is a transformation that relates two planes in projective space. It is represented as a matrix that maps homogeneous coordinates from one image (or plane) to another:
where:
- are the input coordinates (e.g., from the camera).
- are the output coordinates (e.g., on the projector).
- is the homography matrix.
Homography Matrix Structure
The homography matrix has 8 degrees of freedom (up to scale) and is typically written as:
Constructing the System of Equations
Given 4 pairs of corresponding points , we can derive the constraints needed to solve for .
From the homography relation:
Multiply both sides by the denominator to eliminate the fraction:
Rearranging, we obtain two linear equations per point:
Matrix Formulation
Stacking the equations for all 4 point pairs yields a linear system , where is a 9-element vector:
Each point pair contributes two rows to matrix :
Solving the System
The system is homogeneous. To find a nontrivial solution, we compute the Singular Value Decomposition (SVD) of :
The solution is the last column of (corresponding to the smallest singular value). Reshaping into a matrix gives the homography matrix .
Application: Hand-to-Projector Mapping
In the calibration process:
- The user touches 4 projected dots on a flat surface.
- The camera captures the index fingertip positions in the image.
- The system knows the corresponding projected points .
- A homography is computed to map camera points to projector points.
- Later, any new finger position from the camera can be mapped to the projector using:
Conclusion
The homography provides a mathematically sound and computationally efficient way to map points between the camera and projector planes. With just four well-chosen calibration points, we can enable precise interaction in spatial augmented reality systems using only a webcam and a projector.