Abstract
We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task—the accuracy of the reconstructed camera pose—as our primary metric. Our pipeline’s modular structure allows easy integration, configuration, and combination of different methods and heuristics. This is demonstrated by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the conducted experiments reveal unexpected properties of structure from motion pipelines that can help improve their performance, for both algorithmic and learned methods. Data and code are online (https://github.com/ubc-vision/image-matching-benchmark), providing an easy-to-use and flexible framework for the benchmarking of local features and robust estimation methods, both alongside and against top-performing methods. This work provides a basis for the Image Matching Challenge (https://image-matching-challenge.github.io).
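The benchmark scores methods by the angular error of the recovered relative camera pose rather than by intermediate matching metrics. As an illustrative sketch only (not the benchmark's actual API; function names are ours), the two standard angular errors, rotation error and translation-direction error, can be computed as follows with NumPy:

```python
import numpy as np

def rotation_error_deg(R_gt, R_est):
    """Angular difference between two rotation matrices, in degrees."""
    # trace(R_gt^T R_est) = 1 + 2*cos(angle) for the residual rotation.
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def translation_error_deg(t_gt, t_est):
    """Angle between translation directions, in degrees.

    The essential matrix only determines translation up to scale and sign,
    so we compare directions and take the absolute dot product.
    """
    cos_angle = abs(np.dot(t_gt, t_est)) / (
        np.linalg.norm(t_gt) * np.linalg.norm(t_est))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
```

A pose is then typically counted as correct if both angles fall below a threshold, and accuracy is aggregated over a range of thresholds (mean average accuracy).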
Notes
In (Barroso-Laguna et al. 2019) the models are converted to TensorFlow—we use the original PyTorch version.
Time measured on ‘1-standard-2’ VMs on Google Cloud Compute: 2 vCPUs with 7.5 GB of RAM and no GPU.
References
Aanaes, H., Dahl, A. L., & Steenstrup-Pedersen, K. (2012). Interesting interest points. International Journal of Computer Vision, 97, 18–35.
Aanaes, H., & Kahl, F. (2002). Estimation of deformable structure and motion. In Vision and modelling of dynamic scenes workshop.
Agarwal, S., Snavely, N., Simon, I., Seitz, S., & Szeliski, R. (2009). Building Rome in one day. In International conference on computer vision.
Alahi, A., Ortiz, R., & Vandergheynst, P. (2012). FREAK: Fast retina keypoint. In Conference on computer vision and pattern recognition.
Alcantarilla, P. F., Nuevo, J., & Bartoli, A. (2013). Fast explicit diffusion for accelerated features in nonlinear scale spaces. In British machine vision conference.
Aldana-Iuit, J., Mishkin, D., Chum, O., & Matas, J. (2019). Saddle: Fast and repeatable features with good coverage. Image and Vision Computing, 92, 103807.
Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., & Sivic, J. (2016). NetVLAD: CNN architecture for weakly supervised place recognition. In Conference on computer vision and pattern recognition.
Arandjelovic, R. & Zisserman, A. (2012). Three things everyone should know to improve object retrieval. In Conference on computer vision and pattern recognition.
Badino, H., Huber, D., & Kanade, T. (2011). The CMU visual localization data set. http://3dvis.ri.cmu.edu/data-sets/localization.
Balntas, V. (2018). SILDa: A multi-task dataset for evaluating visual localization. https://research.scape.io/silda/.
Balntas, V., Lenc, K., Vedaldi, A., & Mikolajczyk, K. (2017). HPatches: A benchmark and evaluation of handcrafted and learned local descriptors. In Conference on computer vision and pattern recognition.
Balntas, V., Li, S., & Prisacariu, V. (2018). RelocNet: Continuous metric learning relocalisation using neural nets. In European conference on computer vision.
Balntas, V., Riba, E., Ponsa, D., & Mikolajczyk, K. (2016). Learning local feature descriptors with triplets and shallow convolutional neural networks. In British machine vision conference.
Barath, D., & Matas, J. (2018). Graph-cut RANSAC. In Conference on computer vision and pattern recognition.
Barath, D., Matas, J., & Noskova, J. (2019). MAGSAC: Marginalizing sample consensus. In Conference on computer vision and pattern recognition.
Barroso-Laguna, A., Riba, E., Ponsa, D., & Mikolajczyk, K. (2019). Key.Net: Keypoint detection by handcrafted and learned CNN filters. In International conference on computer vision.
Baumberg, A. (2000). Reliable feature matching across widely separated views. In Conference on computer vision and pattern recognition.
Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded up robust features. In European conference on computer vision.
Beaudet, P. R. (1978). Rotationally invariant image operators. In Proceedings of the 4th international joint conference on pattern recognition (pp. 579–583). Kyoto.
Bellavia, F., & Colombo, C. (2020). Is there anything new to say about SIFT matching? International Journal of Computer Vision, 2020, 1–20.
Bian, J.-W., Wu, Y.-H., Zhao, J., Liu, Y., Zhang, L., Cheng, M.-M., & Reid, I. (2019). An evaluation of feature matchers for fundamental matrix estimation. In British machine vision conference.
Brachmann, E., & Rother, C. (2019). Neural-guided RANSAC: learning where to sample model hypotheses. In International conference on computer vision.
Bradski, G. (2000). The OpenCV library. Dr. Dobb’s Journal of Software Tools, 120, 122–125.
Brown, M., Hua, G., & Winder, S. (2011). Discriminative learning of local image descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 43–57.
Brown, M., & Lowe, D. (2007). Automatic panoramic image stitching using invariant features. International Journal of Computer Vision, 74, 59–73.
Bui, M., Baur, C., Navab, N., Ilic, S., & Albarqouni, S. (2019). Adversarial networks for camera pose regression and refinement. In International conference on computer vision.
Chum, O., & Matas, J. (2005). Matching with PROSAC—progressive sample consensus. In Conference on computer vision and pattern recognition.
Chum, O., Matas, J., & Kittler, J. (2003). Locally optimized RANSAC. In Pattern recognition.
Chum, O., Werner, T., & Matas, J. (2005). Two-view geometry estimation unaffected by a dominant plane. In Conference on computer vision and pattern recognition.
Cui, H., Gao, X., Shen, S., & Hu, Z. (2017). HSfM: Hybrid structure-from-motion. In Conference on computer vision and pattern recognition.
Dang, Z., Yi, K. M., Hu, Y., Wang, F., Fua, P., & Salzmann, M. (2018). Eigendecomposition-free training of deep networks with zero eigenvalue-based losses. In European conference on computer vision.
Detone, D., Malisiewicz, T., & Rabinovich, A. (2017). Toward geometric deep SLAM. Preprint arXiv:1707.07410.
Detone, D., Malisiewicz, T., & Rabinovich, A. (2018). SuperPoint: Self-supervised interest point detection and description. In CVPR workshop on deep learning for visual SLAM.
Dong, J., Karianakis, N., Davis, D., Hernandez, J., Balzer, J., & Soatto, S. (2015). Multi-view feature engineering and learning. In Conference on computer vision and pattern recognition.
Dong, J. & Soatto, S. (2015). Domain-size pooling in local descriptors: DSP-SIFT. In Conference on computer vision and pattern recognition.
Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., & Sattler, T. (2019). D2-Net: A trainable CNN for joint detection and description of local features. In Conference on computer vision and pattern recognition.
Ebel, P., Mishchuk, A., Yi, K. M., Fua, P., & Trulls, E. (2019). Beyond Cartesian representations for local descriptors. In International conference on computer vision.
Fischler, M., & Bolles, R. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
Gay, P., Bansal, V., Rubino, C., & Bue, A. D. (2017). Probabilistic structure from motion with objects (PSfMO). In International conference on computer vision.
Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In Conference on computer vision and pattern recognition.
Hartley, R. (1997). In defense of the eight-point algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6), 580–593.
Hartley, R., & Zisserman, A. (2000). Multiple view geometry in computer vision. Cambridge: Cambridge University Press.
Hartley, R. I. (1994). Projective reconstruction and invariants from multiple images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(10), 1036–1041.
He, K., Lu, Y., & Sclaroff, S. (2018). Local descriptors optimized for average precision. In Conference on computer vision and pattern recognition.
Heinly, J., Schoenberger, J., Dunn, E., & Frahm, J.-M. (2015). Reconstructing the world in six days. In Conference on computer vision and pattern recognition.
Jacobs, N., Roman, N., & Pless, R. (2007). Consistent temporal variations in many outdoor scenes. In Conference on computer vision and pattern recognition.
Kendall, A., Grimes, M., & Cipolla, R. (2015). Posenet: A convolutional network for real-time 6-DOF camera relocalization. In International conference on computer vision.
Krishna Murthy, J., Iyer, G., & Paull, L. (2019). gradSLAM: Dense SLAM meets automatic differentiation.
Lenc, K., Gulshan, V., & Vedaldi, A. (2011). VLBenchmarks. http://www.vlfeat.org/benchmarks/.
Leutenegger, S., Chli, M., & Siegwart, R. Y. (2011). BRISK: Binary robust invariant scalable keypoints. In International conference on computer vision.
Li, Z., & Snavely, N. (2018). MegaDepth: Learning single-view depth prediction from internet photos. In Conference on computer vision and pattern recognition.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
Luo, Z., Shen, T., Zhou, L., Zhang, J., Yao, Y., Li, S., Fang, T., & Quan, L. (2019). ContextDesc: Local descriptor augmentation with cross-modality context. In Conference on computer vision and pattern recognition.
Luo, Z., Shen, T., Zhou, L., Zhu, S., Zhang, R., Yao, Y., Fang, T., & Quan, L. (2018). Geodesc: Learning local descriptors by integrating geometry constraints. In European conference on computer vision.
Lynen, S., Zeisl, B., Aiger, D., Bosse, M., Hesch, J., Pollefeys, M., Siegwart, R., & Sattler, T. (2019). Large-scale, real-time visual-inertial localization revisited. Preprint.
Maddern, W., Pascoe, G., Linegar, C., & Newman, P. (2017). 1 year, 1000 km: The Oxford RobotCar dataset. International Journal of Robotics Research, 36(1), 3–15.
Matas, J., Chum, O., Urban, M., & Pajdla, T. (2004). Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10), 761–767.
Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.
Mikolajczyk, K., Schmid, C., & Zisserman, A. (2004). Human detection based on a probabilistic assembly of robust part detectors. In European conference on computer vision.
Mishchuk, A., Mishkin, D., Radenovic, F., & Matas, J. (2017). Working hard to know your neighbor’s margins: Local descriptor learning loss. In Advances in neural information processing systems.
Mishkin, D., Matas, J., & Perdoch, M. (2015). MODS: Fast and robust method for two-view matching. Computer Vision and Image Understanding, 141, 81–93.
Mishkin, D., Radenovic, F., & Matas, J. (2018). Repeatability is not enough: Learning affine regions via discriminability. In European conference on computer vision.
Muja, M. & Lowe, D. G. (2009). Fast approximate nearest neighbors with automatic algorithm configuration. In International conference on computer vision.
Mukundan, A., Tolias, G., & Chum, O. (2019). Explicit spatial encoding for deep local descriptors. In Conference on computer vision and pattern recognition.
Mur-Artal, R., Montiel, J., & Tardós, J. (2015). ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5), 1147–1163.
Nister, D. (2003). An efficient solution to the five-point relative pose problem. In Conference on computer vision and pattern recognition.
Noh, H., Araujo, A., Sim, J., Weyand, T., & Han, B. (2017). Large-scale image retrieval with attentive deep local features. In International conference on computer vision.
Ono, Y., Trulls, E., Fua, P., & Yi, K. M. (2018). LF-Net: Learning local features from images. In Advances in neural information processing systems.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Pizer, S. M., Amburn, E. P., Austin, J. D., Cromartie, R., Geselowitz, A., Greer, T., ter Haar Romeny, B., Zimmerman, J. B., & Zuiderveld, K. (1987). Adaptive histogram equalization and its variations. Computer Vision, Graphics, and Image Processing, 39(3), 355–368.
Pritchett, P., & Zisserman, A. (1998). Wide baseline stereo matching. In International conference on computer vision (pp. 754–760).
Pultar, M., Mishkin, D., & Matas, J. (2019). Leveraging outdoor webcams for local descriptor learning. In Computer vision winter workshop.
Qi, C., Su, H., Mo, K., & Guibas, L. (2017). Pointnet: Deep learning on point sets for 3D classification and segmentation. In Conference on computer vision and pattern recognition.
Radenovic, F., Tolias, G., & Chum, O. (2016). CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples. In European conference on computer vision.
Ranftl, R. & Koltun, V. (2018). Deep fundamental matrix estimation. In European conference on computer vision.
Revaud, J., Weinzaepfel, P., de Souza, C. R., Pion, N., Csurka, G., Cabon, Y., & Humenberger, M. (2019). R2D2: Repeatable and reliable detector and descriptor. In Advances in neural information processing systems.
Rosten, E., Porter, R., & Drummond, T. (2010). Faster and better: A machine learning approach to corner detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 105–119.
Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011). ORB: An efficient alternative to SIFT or SURF. In International conference on computer vision.
Sarlin, P., DeTone, D., Malisiewicz, T., & Rabinovich, A. (2020). SuperGlue: Learning feature matching with graph neural networks. In Conference on computer vision and pattern recognition.
Sattler, T., Leibe, B., & Kobbelt, L. (2012). Improving image-based localization by active correspondence search. In European conference on computer vision.
Sattler, T., Maddern, W., Toft, C., Torii, A., Hammarstrand, L., Stenborg, E., Safari, D., Okutomi, M., Pollefeys, M., Sivic, J., Kahl, F., & Pajdla, T. (2018). Benchmarking 6DOF outdoor visual localization in changing conditions. In Conference on computer vision and pattern recognition.
Sattler, T., Weyand, T., Leibe, B., & Kobbelt, L. (2012). Image retrieval for image-based localization revisited. In British machine vision conference.
Sattler, T., Zhou, Q., Pollefeys, M., & Leal-Taixe, L. (2019). Understanding the limitations of CNN-based absolute camera pose regression. In Conference on computer vision and pattern recognition.
Savinov, N., Seki, A., Ladicky, L., Sattler, T., & Pollefeys, M. (2017). Quad-networks: Unsupervised learning to rank for interest point detection. In Conference on computer vision and pattern recognition.
Schönberger, J., & Frahm, J. (2016). Structure-from-motion revisited. In Conference on computer vision and pattern recognition.
Schönberger, J., Hardmeier, H., Sattler, T., & Pollefeys, M. (2017). Comparative evaluation of hand-crafted and learned local features. In Conference on computer vision and pattern recognition.
Schönberger, J., Zheng, E., Pollefeys, M., & Frahm, J. (2016). Pixelwise view selection for unstructured multi-view stereo. In European conference on computer vision.
Shi, Y., Zhu, J., Fang, Y., Lien, K., & Gu, J. (2019). Self-supervised learning of depth and ego-motion with differentiable bundle adjustment. Preprint.
Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., & Moreno-Noguer, F. (2015). Discriminative learning of deep convolutional feature point descriptors. In International conference on computer vision.
Strecha, C., Hansen, W., Van Gool, L., Fua, P., & Thoennessen, U. (2008). On benchmarking camera calibration and multi-view stereo for high resolution imagery. In Conference on computer vision and pattern recognition.
Sturm, J., Engelhard, N., Endres, F., Burgard, W., & Cremers, D. (2012). A benchmark for the evaluation of RGB-D SLAM systems. In International conference on intelligent robots and systems.
Sun, W., Jiang, W., Trulls, E., Tagliasacchi, A., & Yi, K. M. (2020). ACNe: Attentive context normalization for robust permutation-equivariant learning. In Conference on computer vision and pattern recognition.
Taira, H., Okutomi, M., Sattler, T., Cimpoi, M., Pollefeys, M., Sivic, J., et al. (2019). InLoc: indoor visual localization with dense matching and view synthesis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1744–1756.
Tang, C., & Tan, P. (2019). Ba-Net: dense bundle adjustment network. In International conference on learning representations.
Tateno, K., Tombari, F., Laina, I., & Navab, N. (2017). CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction. In Conference on computer vision and pattern recognition.
Thomee, B., Shamma, D., Friedland, G., Elizalde, B., Ni, K., Poland, D., et al. (2016). YFCC100M: the new data in multimedia research. Communications of the ACM, 59, 64–73.
Tian, Y., Fan, B., & Wu, F. (2017). L2-Net: Deep learning of discriminative patch descriptor in Euclidean space. In Conference on computer vision and pattern recognition.
Tian, Y., Yu, X., Fan, B., Wu, F., Heijnen, H., & Balntas, V. (2019). SOSNet: Second order similarity regularization for local descriptor learning. In Conference on computer vision and pattern recognition.
Tolias, G., Avrithis, Y., & Jégou, H. (2016). Image search with selective match kernels: Aggregation across single and multiple images. International Journal of Computer Vision, 116(3), 247–261.
Torr, P., & Zisserman, A. (2000). MLESAC: A new robust estimator with application to estimating image geometry. Computer Vision and Image Understanding, 78, 138–156.
Triggs, B., Mclauchlan, P., Hartley, R., & Fitzgibbon, A. (2000). Bundle adjustment—A modern synthesis. In Vision algorithms: Theory and practice (pp. 298–372).
Vedaldi, A., & Fulkerson, B. (2010). Vlfeat: An open and portable library of computer vision algorithms. In Proceedings of the 18th ACM international conference on multimedia, MM’10 (pp. 1469–1472).
Verdie, Y., Yi, K. M., Fua, P., & Lepetit, V. (2015). TILDE: A temporally invariant learned detector. In Conference on computer vision and pattern recognition.
Vijayanarasimhan, S., Ricco, S., Schmid, C., Sukthankar, R., & Fragkiadaki, K. (2017). SFM-Net: Learning of structure and motion from video. Preprint.
Wei, X., Zhang, Y., Gong, Y., & Zheng, N. (2018). Kernelized subspace pooling for deep local descriptors. In Conference on computer vision and pattern recognition.
Wei, X., Zhang, Y., Li, Z., Fu, Y., & Xue, X. (2020). DeepSFM: Structure from motion via deep bundle adjustment. In European conference on computer vision.
Wu, C. (2013). Towards linear-time incremental structure from motion. In International conference on 3D vision.
Yi, K. M., Trulls, E., Lepetit, V., & Fua, P. (2016). LIFT: Learned invariant feature transform. In European conference on computer vision.
Yi, K. M., Trulls, E., Ono, Y., Lepetit, V., Salzmann, M., & Fua, P. (2018). Learning to find good correspondences. In Conference on computer vision and pattern recognition.
Yoo, A. B., Jette, M. A., & Grondona, M. (2003). SLURM: Simple Linux utility for resource management. In Workshop on job scheduling strategies for parallel processing (pp. 44–60). Berlin: Springer.
Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. In Conference on computer vision and pattern recognition.
Zhang, J., Sun, D., Luo, Z., Yao, A., Zhou, L., Shen, T., Chen, Y., Quan, L., & Liao, H. (2019). Learning two-view correspondences and geometry using order-aware network. In International conference on computer vision.
Zhang, X., Yu, F. X., Karaman, S., & Chang, S.-F. (2017). Learning discriminative and transformation covariant local feature detectors. In Conference on computer vision and pattern recognition.
Zhao, C., Cao, Z., Li, C., Li, X., & Yang, J. (2019). NM-Net: Mining reliable neighbors for robust feature correspondences. In Conference on computer vision and pattern recognition.
Zhou, Q., Sattler, T., Pollefeys, M., & Leal-Taixe, L. (2020). To learn or not to learn: Visual localization from essential matrices. In International conference on robotics and automation.
Zhu, S., Zhang, R., Zhou, L., Shen, T., Fang, T., Tan, P., & Quan, L. (2018). Very large-scale global SfM by distributed motion averaging. In Conference on computer vision and pattern recognition.
Zitnick, C., & Ramnath, K. (2011). Edge foci interest points. In International conference on computer vision.
Additional information
Communicated by Konrad Schindler.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant “Deep Visual Geometry Machines” (RGPIN-2018-03788), by systems supplied by Compute Canada, and by Google’s Visual Positioning Service. DM and JM were supported by OP VVV funded Project CZ.02.1.01/0.0/0.0/16 019/0000765 “Research Center for Informatics”. DM was also supported by CTU student Grant SGS17/185/OHK3/3T/13 and by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry for Digital and Economic Affairs, and the Province of Upper Austria in the frame of the COMET center SCCH. AM was supported by the Swiss National Science Foundation.
Cite this article
Jin, Y., Mishkin, D., Mishchuk, A. et al. Image Matching Across Wide Baselines: From Paper to Practice. Int J Comput Vis 129, 517–547 (2021). https://doi.org/10.1007/s11263-020-01385-0