• Laser & Optoelectronics Progress
  • Vol. 59, Issue 22, 2215009 (2022)
Lingzhi Yu* and Xifan Zhang
Author Affiliations
  • School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
    DOI: 10.3788/LOP202259.2215009
    Lingzhi Yu, Xifan Zhang. Cross-Domain Spatial Co-Attention Network for Sketch-Based Image Retrieval[J]. Laser & Optoelectronics Progress, 2022, 59(22): 2215009

    Abstract

    Sketch-based image retrieval uses hand-drawn sketches as input to retrieve corresponding natural images, allowing users to draw and find desired natural images when no accurate query images are available. Edge maps are commonly used as an intermediate modality to bridge the domain gap between sketches and natural images. However, existing methods ignore the inherent relationship between edge maps and natural images. Based on the assumption that natural images and their corresponding edge maps have similar key regions, this paper proposes a deep learning model based on a cross-domain spatial co-attention network. The proposed model derives a shared spatial attention mask from the fused feature of the edge map and the natural image, and it combines the loss function with an auxiliary classifier for end-to-end training. The proposed method effectively extracts the features of sketches and natural images, achieving mean average precision (mAP) values of 0.933 and 0.799 on the Sketchy and TU-Berlin datasets, respectively, and outperforming most existing representative sketch-based image retrieval methods.
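    $$M_{S} = \sigma\big(\theta([F_{L,\mathrm{Avg}}^{S};\ F_{L,\mathrm{Max}}^{S}])\big)$$
    $$F_{L}^{S\prime} = F_{L}^{S} \otimes M_{S}$$
    $$F_{L}^{IE} = [F_{L}^{I};\ F_{L}^{E}]$$
    $$M_{IE} = \sigma\big(\theta([F_{L,\mathrm{Avg}}^{IE};\ F_{L,\mathrm{Max}}^{IE}])\big)$$
    $$F_{L}^{I\prime} = F_{L}^{I} \otimes M_{IE}$$
    $$F_{L}^{E\prime} = F_{L}^{E} \otimes M_{IE}$$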
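The co-attention step above can be sketched in NumPy: the image and edge-map features are concatenated along the channel axis, pooled over channels by average and maximum, passed through a convolution θ and a sigmoid σ, and the resulting single mask is applied to both branches. This is only an illustrative sketch, not the authors' implementation. The function names, the (C, H, W) layout, the naive convolution, and the kernel size are all assumptions; the source does not state how θ is parameterized.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, kernel):
    """Naive zero-padded cross-correlation: (C, H, W) input, (C, k, k) kernel, (H, W) output.

    Stands in for theta; an odd kernel size k is assumed.
    """
    _, h, w = x.shape
    k = kernel.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[:, i:i + k, j:j + k] * kernel)
    return out


def spatial_mask(feat, kernel):
    """sigma(theta([Avg; Max])): channel-wise pooling, convolution, sigmoid."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # shape (2, H, W)
    return sigmoid(conv2d_same(pooled, kernel))


def co_attention(f_img, f_edge, kernel):
    """Apply one shared mask M_IE to both the image and the edge-map features."""
    fused = np.concatenate([f_img, f_edge], axis=0)  # F_L^{IE} = [F_L^I; F_L^E]
    mask = spatial_mask(fused, kernel)               # M_IE, shape (H, W)
    return f_img * mask, f_edge * mask, mask         # broadcast over channels


# Hypothetical shapes: 8 channels, 6x6 feature maps, 7x7 kernel.
rng = np.random.default_rng(0)
f_img = rng.normal(size=(8, 6, 6))
f_edge = rng.normal(size=(8, 6, 6))
kernel = rng.normal(size=(2, 7, 7)) * 0.1
f_img_att, f_edge_att, mask = co_attention(f_img, f_edge, kernel)
```

The sketch branch uses the same `spatial_mask` function on its own features, without the concatenation step.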
    $$F_{H}^{IE} = \mathrm{fuse}(F_{H}^{I},\ F_{H}^{E})$$
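    $$B_{S}^{\prime} = \psi_{S}(F_{H}^{S})$$
    $$B_{I}^{\prime} = \psi_{IE}(F_{H}^{IE})$$
    $$C_{S} = \varphi_{S}(B_{S}^{\prime})$$
    $$C_{I} = \varphi_{I}(B_{I}^{\prime})$$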
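    $$L_{\mathrm{OriginTri}} = \max\big(0,\ \|F_{S}(S) - F_{I}(I^{+})\|_{2} - \|F_{S}(S) - F_{I}(I^{-})\|_{2} + m\big)$$
    $$L_{\mathrm{Intra\text{-}class}} = \|F_{S}(S) - F_{I}(I^{+})\|_{2}$$
    $$L_{\mathrm{Triplet}} = L_{\mathrm{OriginTri}} + L_{\mathrm{Intra\text{-}class}}$$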
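As a minimal sketch of the triplet term above, the following NumPy function computes the margin-based triplet loss plus the intra-class distance for a batch of embeddings. The function name, the default margin, and averaging over the batch are assumptions made for illustration; the source does not give the margin value or the reduction.

```python
import numpy as np


def triplet_with_intra_class(f_s, f_pos, f_neg, margin=1.0):
    """L_Triplet = L_OriginTri + L_Intra-class, averaged over the batch (assumed).

    f_s, f_pos, f_neg: arrays of shape (batch, dim) holding sketch, positive-image,
    and negative-image embeddings. The margin default is a placeholder value.
    """
    d_pos = np.linalg.norm(f_s - f_pos, axis=1)          # ||F_S(S) - F_I(I+)||_2
    d_neg = np.linalg.norm(f_s - f_neg, axis=1)          # ||F_S(S) - F_I(I-)||_2
    origin = np.maximum(0.0, d_pos - d_neg + margin)     # L_OriginTri
    intra = d_pos                                        # L_Intra-class
    return float(np.mean(origin + intra))


# Sketch at the origin, positive at distance 5, negative at distance 1.
loss = triplet_with_intra_class(
    np.array([[0.0, 0.0]]), np.array([[3.0, 4.0]]), np.array([[0.0, 1.0]])
)
print(loss)  # max(0, 5 - 1 + 1) + 5 = 10.0
```

The intra-class term keeps pulling the sketch toward its positive image even after the margin is satisfied, which the hinge term alone would not do.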
    $$L_{\mathrm{cls}} = \mathrm{CrossEntropy}(C_{S}, Y_{S}) + \mathrm{CrossEntropy}(C_{I}, Y_{I})$$
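    $$f_{q}(x) = \begin{cases} 1, & x \ge 0.5 \\ 0, & x < 0.5 \end{cases}$$
    $$B = f_{q}(B^{\prime})$$
    $$L_{q} = \|B_{S}^{\prime} - f_{q}(B_{S}^{\prime})\|_{1} + \|B_{I}^{\prime} - f_{q}(B_{I}^{\prime})\|_{1}$$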
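The quantization step above can be illustrated with a few lines of NumPy: real-valued codes are thresholded at 0.5 to obtain binary codes, and the L1 distance between each code and its binarized version is the quantization loss. This sketch assumes the real-valued codes lie in [0, 1] (for example, sigmoid outputs); the function names are hypothetical.

```python
import numpy as np


def quantize(b_prime):
    """f_q: map each entry to 1 if it is >= 0.5, else 0."""
    return (b_prime >= 0.5).astype(np.float64)


def quantization_loss(b_s, b_i):
    """L_q: L1 gap between real-valued codes and their binarized versions."""
    return float(
        np.abs(b_s - quantize(b_s)).sum() + np.abs(b_i - quantize(b_i)).sum()
    )


b_s = np.array([0.9, 0.2, 0.5])   # binarizes to [1, 0, 1]
b_i = np.array([0.4, 0.6])        # binarizes to [0, 1]
l_q = quantization_loss(b_s, b_i)  # (0.1 + 0.2 + 0.5) + (0.4 + 0.4) = 1.6
```

The loss is zero only when every entry is already exactly 0 or 1, so minimizing it pushes the real-valued codes toward binary values.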
    $$L = L_{\mathrm{Triplet}} + \alpha L_{\mathrm{cls}} + \beta L_{q}$$
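    $$P_{A} = \frac{1}{N^{+}} \sum_{k=1}^{N_{g}} \frac{N_{k}^{+}}{k} \times p_{\mathrm{position}}(k)$$
    $$p_{\mathrm{position}}(k) = \begin{cases} 1, & \text{if returned image at position } k \text{ is positive} \\ 0, & \text{otherwise} \end{cases}$$
    $$P_{mA} = \frac{1}{N_{q}} \sum_{i=1}^{N_{q}} P_{A}(i)$$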
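The evaluation metric above can be sketched in plain Python. Each query yields a ranked list of 0/1 relevance flags, average precision sums the precision at every position that holds a positive, and mAP averages this over queries. The sketch assumes the ranked list covers the whole gallery, so the number of positives N⁺ equals the sum of the flags; returning 0.0 for a query with no positives is a convention chosen here, not something stated in the source.

```python
def average_precision(relevance):
    """AP for one query; relevance[k-1] is 1 if the image at rank k is positive."""
    n_pos = sum(relevance)          # N+, assuming the list covers the full gallery
    if n_pos == 0:
        return 0.0                  # assumed convention for queries without positives
    hits, total = 0, 0.0
    for k, rel in enumerate(relevance, start=1):
        hits += rel                 # N_k^+: positives among the top k results
        total += (hits / k) * rel   # precision at k, counted only at positive ranks
    return total / n_pos


def mean_average_precision(all_relevance):
    """mAP: mean of per-query AP values."""
    return sum(average_precision(r) for r in all_relevance) / len(all_relevance)


ap_a = average_precision([1, 0, 1, 0])   # (1/1 + 2/3) / 2
ap_b = average_precision([0, 1])         # (1/2) / 1
m_ap = mean_average_precision([[1, 0, 1, 0], [0, 1]])
```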