A Comparison of Binary Feature Vector Matching Algorithms

December 11, 2024 1 minute read

Feature Matching Algorithm Comparison

This post offers a comparison of the following binary feature matching algorithms:

Brute force Hamming distance comparison
Hamming distance embedding Binary Search Tree (HBST)
FLANN with Multi-Probe Locality Sensitive Hashing (LSH)
FLANN with k-d trees

The feature detection and matching C++ code can be viewed here. For all algorithms except for HBST, the OpenCV implementations are used. For HBST, srrg_hbst is used.

Procedure

First, /$N=500/$ ORB features are detected in the two following images:

Then, the features are matched with each of the algorithms. A distance threshold is applied to reject matches with large distances between the matched descriptor vectors. Here is a sample of /$N=500/$ features matched with HBST (only 104 features were successfully matched after distance thresholding was applied):

This is repeated with /$N*2/$ features all the way up to /$N=64,000/$ features.

Results

Matching speed

The matching time versus /$N/$ detected features is plotted below:

For /$N<2,500/$ detected features, Brute force Hamming distance matching is faster than all of the other algorithms because it does not have to build an index. But for values of /$N>=2,500/$, the other algorithms win out because the time to build an index is balanced out with the speed of the search using the index.

For values of /$2,500<=N<=16,000/$, FLANN with multi-probe LSH is the fastest, followed closely by HBST. But for values of /$N>16,000/$, HBST is consistently the fastest matching algorithm, followed by FLANN with multi-probe LSH, then FLANN with k-d trees.

Number of matches

The number of matches versus /$N/$ detected features is plotted below:

Because it performs a complete search, Brute-force Hamming consistently provides the most matches that pass the distance threshold, followed very closely by FLANN with multi-probe LSH.

HBST provides fewer matches than both of the top performing matchers, but what it lacks in number of matches it makes up in speed for large values of /$N/$ detected features.

FLANN with k-d trees fails to make accurate matches when filtering with a distance threshold.

Conclusion

For small values of /$N<2,500/$ features to match, use Brute-force Hamming distance matching.

For large values of /$N>=2,500/$ features to match, use either HBST or FLANN with multi-probe LSH.

Alonzo Lopez