A Comparison of Binary Feature Vector Matching Algorithms
Feature Matching Algorithm Comparison
This post offers a comparison of the following binary feature matching algorithms:
- Brute force Hamming distance comparison
- Hamming distance embedding Binary Search Tree (HBST)
- FLANN with Multi-Probe Locality Sensitive Hashing (LSH)
- FLANN with k-d trees
The feature detection and matching C++ code can be viewed here. For all algorithms except for HBST, the OpenCV implementations are used. For HBST, srrg_hbst is used.
Procedure
First, /$N=500/$ ORB features are detected in the two following images:


Then, the features are matched with each of the algorithms. A distance threshold is applied to reject matches with large distances between the matched descriptor vectors. Here is a sample of /$N=500/$ features matched with HBST (only 104 features were successfully matched after distance thresholding was applied):

This is repeated with /$N*2/$ features all the way up to /$N=64,000/$ features.
Results
Matching speed
The matching time versus /$N/$ detected features is plotted below:
For /$N<2,500/$ detected features, Brute force Hamming distance matching is faster than all of the other algorithms because it does not have to build an index. But for values of /$N>=2,500/$, the other algorithms win out because the time to build an index is balanced out with the speed of the search using the index.
For values of /$2,500<=N<=16,000/$, FLANN with multi-probe LSH is the fastest, followed closely by HBST. But for values of /$N>16,000/$, HBST is consistently the fastest matching algorithm, followed by FLANN with multi-probe LSH, then FLANN with k-d trees.
Number of matches
The number of matches versus /$N/$ detected features is plotted below:
Because it performs a complete search, Brute-force Hamming consistently provides the most matches that pass the distance threshold, followed very closely by FLANN with multi-probe LSH.
HBST provides fewer matches than both of the top performing matchers, but what it lacks in number of matches it makes up in speed for large values of /$N/$ detected features.
FLANN with k-d trees fails to make accurate matches when filtering with a distance threshold.
Conclusion
For small values of /$N<2,500/$ features to match, use Brute-force Hamming distance matching.
For large values of /$N>=2,500/$ features to match, use either HBST or FLANN with multi-probe LSH.