Fig. 1. Overall architecture of the proposed method
Fig. 2. Multi-scale reconstruction losses between the decoding layers of the network and the ground truth
Fig. 3. MFG module architecture
Fig. 4. Six mask images selected for quantitative comparison
Fig. 5. Qualitative comparison between the proposed method and other methods on three datasets. The first two rows display images from the CelebA-HQ dataset, the third and fourth rows show images from the Paris StreetView dataset, and the last three rows present images from the Places2 dataset. Different masks were used for testing in each image set. GT represents the ground truth.
Fig. 6. Comparison of inpainting results between networks with and without the MFG module
Fig. 7. Visualization of the confidence level distribution
Fig. 8. Comparison of object removal results between our method and two other methods in different scenarios (GT represents the ground truth, and Mask represents the mask image)
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MAE↓ | GC | 0.445 | 0.389 | 0.518 | 0.643 | 0.753 | 0.972 |
| MAE↓ | MADF | 0.965 | 0.618 | 0.806 | 0.731 | 0.824 | 0.937 |
| MAE↓ | MEDFE | 0.883 | 0.353 | 0.527 | 0.769 | 0.869 | 0.831 |
| MAE↓ | PIC | 0.734 | 0.391 | 0.480 | 0.473 | 0.678 | 0.734 |
| MAE↓ | RFR | 0.317 | 0.481 | 0.293 | 0.594 | 0.559 | 0.720 |
| MAE↓ | Ours | 0.129 | 0.238 | 0.146 | 0.483 | 0.663 | 0.654 |
Table 1. Tested on the CelebA-HQ dataset
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MAE↓ | GC | 0.173 | 0.349 | 0.877 | 0.746 | 1.280 | 2.467 |
| MAE↓ | MADF | 0.194 | 0.983 | 1.206 | 1.571 | 1.459 | 1.732 |
| MAE↓ | MEDFE | 0.137 | 0.581 | 0.791 | 0.862 | 0.822 | 1.612 |
| MAE↓ | PIC | 0.218 | 0.923 | 1.114 | 0.743 | 0.938 | 3.018 |
| MAE↓ | RFR | 0.151 | 0.835 | 0.429 | 0.697 | 0.616 | 1.260 |
| MAE↓ | Ours | 0.142 | 0.328 | 0.457 | 0.650 | 0.679 | 1.026 |
| PSNR↑ | GC | 32.69 | 36.03 | 36.86 | 37.12 | 38.27 | 37.46 |
| PSNR↑ | MADF | 29.43 | 28.49 | 34.82 | 35.61 | 31.54 | 34.46 |
| PSNR↑ | MEDFE | 34.74 | 33.28 | 35.94 | 34.87 | 34.04 | 33.12 |
| PSNR↑ | PIC | 31.67 | 30.74 | 31.11 | 33.68 | 32.87 | 31.68 |
| PSNR↑ | RFR | 35.98 | 35.08 | 35.37 | 39.39 | 39.92 | 38.91 |
| PSNR↑ | Ours | 34.42 | 36.28 | 38.12 | 37.58 | 39.23 | 38.67 |
Table 2. Tested on the Paris StreetView dataset
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PSNR↑ | GC | 27.38 | 28.24 | 26.24 | 24.48 | 21.82 | 22.13 |
| PSNR↑ | MADF | 26.75 | 29.94 | 24.05 | 26.39 | 27.26 | 21.19 |
| PSNR↑ | MEDFE | 36.48 | 27.12 | 27.93 | 24.73 | 24.46 | 23.42 |
| PSNR↑ | PIC | 32.65 | 34.56 | 24.67 | 26.68 | 25.14 | 22.64 |
| PSNR↑ | RFR | 32.82 | 30.08 | 27.18 | 27.73 | 26.89 | 25.12 |
| PSNR↑ | Ours | 37.10 | 36.76 | 28.43 | 27.81 | 27.72 | 24.50 |
| SSIM↑ | GC | 0.941 | 0.942 | 0.837 | 0.855 | 0.831 | 0.893 |
| SSIM↑ | MADF | 0.827 | 0.964 | 0.739 | 0.819 | 0.728 | 0.823 |
| SSIM↑ | MEDFE | 0.924 | 0.975 | 0.796 | 0.836 | 0.675 | 0.733 |
| SSIM↑ | PIC | 0.963 | 0.926 | 0.868 | 0.872 | 0.741 | 0.816 |
| SSIM↑ | RFR | 0.921 | 0.981 | 0.839 | 0.896 | 0.829 | 0.848 |
| SSIM↑ | Ours | 0.976 | 0.983 | 0.921 | 0.887 | 0.846 | 0.883 |
| FID↓ | GC | 7.28 | 18.74 | 19.72 | 18.34 | 28.79 | 53.51 |
| FID↓ | MADF | 8.97 | 23.09 | 22.91 | 25.08 | 27.06 | 51.83 |
| FID↓ | MEDFE | 7.13 | 16.36 | 18.47 | 17.26 | 30.27 | 58.64 |
| FID↓ | PIC | 7.43 | 15.67 | 18.41 | 17.93 | 27.30 | 55.43 |
| FID↓ | RFR | 8.37 | 16.52 | 17.29 | 18.16 | 31.78 | 58.46 |
| FID↓ | Ours | 6.22 | 16.44 | 17.49 | 16.43 | 21.15 | 54.03 |
| LPIPS↓ | GC | 0.014 | 0.051 | 0.079 | 0.061 | 0.097 | 0.153 |
| LPIPS↓ | MADF | 0.024 | 0.045 | 0.083 | 0.078 | 0.080 | 0.127 |
| LPIPS↓ | MEDFE | 0.015 | 0.060 | 0.041 | 0.050 | 0.091 | 0.157 |
| LPIPS↓ | PIC | 0.017 | 0.053 | 0.076 | 0.064 | 0.086 | 0.107 |
| LPIPS↓ | RFR | 0.018 | 0.031 | 0.061 | 0.051 | 0.087 | 0.113 |
| LPIPS↓ | Ours | 0.016 | 0.023 | 0.059 | 0.047 | 0.064 | 0.096 |
Table 1 (continued). Tested on the CelebA-HQ dataset
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FID↓ | GC | 14.36 | 24.07 | 33.58 | 39.67 | 53.41 | 78.30 |
| FID↓ | MADF | 24.15 | 30.86 | 45.26 | 52.78 | 56.55 | 62.04 |
| FID↓ | MEDFE | 19.07 | 24.37 | 34.32 | 41.39 | 49.86 | 51.34 |
| FID↓ | PIC | 16.54 | 22.98 | 37.31 | 54.24 | 65.78 | 75.98 |
| FID↓ | RFR | 17.32 | 25.56 | 38.14 | 53.06 | 64.05 | 84.17 |
| FID↓ | Ours | 15.72 | 21.53 | 32.65 | 40.64 | 46.49 | 53.79 |
| LPIPS↓ | GC | 0.075 | 0.077 | 0.120 | 0.167 | 0.221 | 0.340 |
| LPIPS↓ | MADF | 0.146 | 0.236 | 0.287 | 0.249 | 0.243 | 0.203 |
| LPIPS↓ | MEDFE | 0.057 | 0.076 | 0.133 | 0.168 | 0.237 | 0.221 |
| LPIPS↓ | PIC | 0.068 | 0.093 | 0.151 | 0.211 | 0.251 | 0.238 |
| LPIPS↓ | RFR | 0.064 | 0.047 | 0.134 | 0.146 | 0.241 | 0.245 |
| LPIPS↓ | Ours | 0.033 | 0.039 | 0.096 | 0.161 | 0.217 | 0.205 |
Table 3. Tested on the Places2 dataset
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SSIM↑ | GC | 0.921 | 0.964 | 0.840 | 0.946 | 0.862 | 0.761 |
| SSIM↑ | MADF | 0.547 | 0.753 | 0.608 | 0.882 | 0.817 | 0.829 |
| SSIM↑ | MEDFE | 0.833 | 0.956 | 0.937 | 0.917 | 0.849 | 0.846 |
| SSIM↑ | PIC | 0.870 | 0.911 | 0.864 | 0.868 | 0.790 | 0.710 |
| SSIM↑ | RFR | 0.981 | 0.948 | 0.919 | 0.927 | 0.869 | 0.879 |
| SSIM↑ | Ours | 0.977 | 0.973 | 0.948 | 0.938 | 0.878 | 0.907 |
| FID↓ | GC | 8.26 | 8.82 | 18.10 | 22.75 | 37.32 | 48.57 |
| FID↓ | MADF | 12.08 | 16.74 | 20.49 | 34.11 | 47.07 | 49.23 |
| FID↓ | MEDFE | 6.74 | 9.39 | 19.23 | 23.08 | 46.97 | 53.30 |
| FID↓ | PIC | 14.43 | 18.22 | 26.82 | 36.94 | 48.35 | 68.65 |
| FID↓ | RFR | 8.09 | 9.17 | 15.79 | 18.46 | 34.11 | 48.76 |
| FID↓ | Ours | 7.19 | 8.61 | 16.84 | 21.09 | 32.46 | 47.34 |
| LPIPS↓ | GC | 0.031 | 0.024 | 0.052 | 0.085 | 0.154 | 0.214 |
| LPIPS↓ | MADF | 0.397 | 0.187 | 0.213 | 0.196 | 0.202 | 0.270 |
| LPIPS↓ | MEDFE | 0.042 | 0.021 | 0.059 | 0.055 | 0.094 | 0.206 |
| LPIPS↓ | PIC | 0.034 | 0.089 | 0.106 | 0.167 | 0.240 | 0.215 |
| LPIPS↓ | RFR | 0.017 | 0.028 | 0.067 | 0.059 | 0.132 | 0.194 |
| LPIPS↓ | Ours | 0.022 | 0.026 | 0.048 | 0.054 | 0.118 | 0.143 |
Table 2 (continued). Tested on the Paris StreetView dataset
| Metric | Method / Mask ratio | 1%~10% | 10%~20% | 20%~30% | 30%~40% | 40%~50% | 50%~60% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MAE↓ | GC | 0.127 | 0.186 | 0.556 | 0.711 | 1.568 | 2.247 |
| MAE↓ | MADF | 0.230 | 0.856 | 0.528 | 0.985 | 1.703 | 1.833 |
| MAE↓ | MEDFE | 0.153 | 0.181 | 0.625 | 0.916 | 1.057 | 1.156 |
| MAE↓ | PIC | 0.098 | 0.230 | 0.697 | 0.968 | 1.360 | 1.416 |
| MAE↓ | RFR | 0.080 | 0.173 | 0.761 | 1.041 | 1.419 | 1.898 |
| MAE↓ | Ours | 0.074 | 0.169 | 0.507 | 0.771 | 1.007 | 1.169 |
| PSNR↑ | GC | 27.82 | 28.56 | 29.15 | 23.07 | 20.26 | 19.36 |
| PSNR↑ | MADF | 24.93 | 22.68 | 23.44 | 21.60 | 20.84 | 21.74 |
| PSNR↑ | MEDFE | 30.27 | 28.10 | 26.30 | 24.13 | 22.70 | 20.38 |
| PSNR↑ | PIC | 30.41 | 27.45 | 28.81 | 23.15 | 21.49 | 19.75 |
| PSNR↑ | RFR | 24.44 | 25.68 | 27.47 | 20.63 | 18.56 | 14.42 |
| PSNR↑ | Ours | 30.38 | 28.81 | 29.76 | 23.18 | 23.36 | 21.63 |
| SSIM↑ | GC | 0.906 | 0.826 | 0.769 | 0.755 | 0.673 | 0.590 |
| SSIM↑ | MADF | 0.692 | 0.607 | 0.574 | 0.539 | 0.621 | 0.628 |
| SSIM↑ | MEDFE | 0.923 | 0.860 | 0.768 | 0.725 | 0.663 | 0.617 |
| SSIM↑ | PIC | 0.913 | 0.853 | 0.754 | 0.675 | 0.693 | 0.532 |
| SSIM↑ | RFR | 0.948 | 0.868 | 0.781 | 0.627 | 0.505 | 0.516 |
| SSIM↑ | Ours | 0.962 | 0.877 | 0.806 | 0.780 | 0.716 | 0.676 |
Table 3 (continued). Tested on the Places2 dataset
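| Output | Structure design | MAE↓ | PSNR↑ | SSIM↑ | FID↓ | LPIPS↓ |
| --- | --- | --- | --- | --- | --- | --- |
| Smooth structure | With MFG module | 0.316 | 27.95 | 0.906 | 21.27 | 0.063 |
| Smooth structure | Without MFG module | 1.581 | 21.62 | 0.565 | 55.64 | 0.235 |
| Inpainting result | With MFG module | 0.207 | 26.52 | 0.947 | 18.41 | 0.042 |
| Inpainting result | Without MFG module | 1.661 | 20.08 | 0.640 | 42.97 | 0.175 |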
Table 4. Quantitative analysis of ablation experiments
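
The tables report each metric per mask-ratio range. As a rough illustration of how a test image might be assigned to one of those ranges and scored with MAE and PSNR, the following is a minimal sketch, not the evaluation code behind the reported numbers. The function names, the convention that a mask value of 1 marks a missing pixel, and the [0, 1] pixel range are all assumptions made for this example; the scale of the MAE values in the tables is not stated here, so the sketch does not try to reproduce it.

```python
import numpy as np

# Mask-ratio ranges matching the table columns (as fractions of the image area).
BUCKETS = [(0.01, 0.10), (0.10, 0.20), (0.20, 0.30),
           (0.30, 0.40), (0.40, 0.50), (0.50, 0.60)]


def mask_ratio(mask):
    """Fraction of pixels marked as missing (assumed convention: 1 = hole)."""
    return float(np.mean(np.asarray(mask) > 0.5))


def bucket_label(ratio):
    """Return the table column for a mask ratio, or None if it is out of range."""
    for lo, hi in BUCKETS:
        # The last range is treated as closed so that exactly 60% is included.
        if lo <= ratio < hi or (hi == BUCKETS[-1][1] and ratio == hi):
            return f"{int(round(lo * 100))}%~{int(round(hi * 100))}%"
    return None


def mae(pred, gt):
    """Mean absolute error between two images of the same shape."""
    pred, gt = np.asarray(pred, dtype=np.float64), np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))


def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio in dB; max_val is the assumed pixel range."""
    pred, gt = np.asarray(pred, dtype=np.float64), np.asarray(gt, dtype=np.float64)
    mse = np.mean((pred - gt) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Averaging these per-image scores within each range would give one row of a table like those above. SSIM, FID, and LPIPS need dedicated implementations and are left out of this sketch.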