[5] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60: 91-110.
[6] BAY H, ESS A, TUYTELAARS T, et al. Speeded-Up Robust Features (SURF)[J]. Computer Vision and Image Understanding, 2008, 110(3): 346-359.
[7] LI Q L, WANG G Y, LIU J G, et al. Robust scale-invariant feature matching for remote sensing image registration[J]. IEEE Geoscience and Remote Sensing Letters, 2009, 6(2): 287-291.
[8] XU W H, ZHONG S, ZHANG W J, et al. A new orientation estimation method based on rotation invariant gradient for feature points[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(5): 791-795.
[9] CHEN J, TIAN J, LEE N, et al. A partial intensity invariant feature descriptor for multimodal retinal image registration[J]. IEEE Transactions on Biomedical Engineering, 2010, 57(7): 1707-1718.
[10] YE Y X, SHAN J, HAO S Y, et al. A local phase based invariant feature for remote sensing image matching[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2018, 142: 205-221.
[11] SIMONYAN K, VEDALDI A, ZISSERMAN A. Learning local feature descriptors using convex optimisation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(8): 1573-1585.
[12] ALTWAIJRY H, VEIT A, BELONGIE S J. Learning to detect and match keypoints with deep architectures[C]//The 27th British Machine Vision Conference. York: [s.n.], 2016: 49.1-49.12.
[13] DAI J, ZHANG J K, NGUYEN T. Explicit learning of feature orientation estimation[C]//International Conference on Image Processing (ICIP). Taipei: IEEE, 2019: 4245-4249.
[14] SUN J M, SHEN Z H, WANG Y, et al. LoFTR: detector-free local feature matching with Transformers[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 8918-8927.
[15] DONG Q L, CAO C J, FU Y W. Incremental Transformer structure enhanced image inpainting with masking positional encoding[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans: IEEE, 2022: 11348-11358.
[16] HO J, KALCHBRENNER N, WEISSENBORN D, et al. Axial attention in multidimensional Transformers[EB/OL]. (2019-12-20)[2024-01-08]. https://api.semanticscholar.org/CorpusID:209323787.
[17] WU K, PENG H W, CHEN M H, et al. Rethinking and improving relative position encoding for vision transformer[C]//IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 10013-10021.
[18] LI J Y, HU Q W, AI M Y. RIFT: multi-modal image matching based on radiation-variation insensitive feature transform[J]. IEEE Transactions on Image Processing, 2020, 29: 3296-3310.
[19] CAI W X, JIN K, HOU J Y, et al. VDD: varied drone dataset for semantic segmentation[EB/OL]. (2023-08-27)[2024-01-08]. https://doi.org/10.48550/arXiv.2305.13608.