• Opto-Electronic Engineering
  • Vol. 43, Issue 2, 69 (2016)
ZHANG Quan1,2,3,4, BAO Hua1,3, RAO Changhui1,3, and PENG Zhenming2
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
  • 4[in Chinese]
    DOI: 10.3969/j.issn.1003-501x.2016.02.012
    ZHANG Quan, BAO Hua, RAO Changhui, PENG Zhenming. Realization and Application of Two-dimensional Fast Fourier Transform Algorithm Based on GPU[J]. Opto-Electronic Engineering, 2016, 43(2): 69

    Abstract

    NVIDIA, the inventor of the GPU, provides the CUFFT library for computing the Fast Fourier Transform (FFT). Even after several generations of updates, CUFFT still leaves room for improvement, and it is not suited to kernel fusion on the GPU, a technique that reduces memory accesses and increases Instruction Level Parallelism (ILP). We develop our own custom GPU FFT implementation based on the well-known Cooley-Tukey algorithm. We analyze the relationship between coalesced memory access and GPU occupancy and derive the optimal thread block configuration. The results show that the proposed method improves computational efficiency by a factor of 1.27 relative to CUFFT 6.5 for 512×512 double-precision complex data. The method is then applied to the computation of the optical transfer function (OTF) with a kernel fusion strategy, where it improves computational efficiency by a factor of about 1.5 relative to the conventional method using CUFFT.
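For readers unfamiliar with the algorithm named in the abstract, the following is a minimal CPU-side sketch of a two-dimensional FFT built from a radix-2 Cooley-Tukey transform applied first along rows and then along columns. It is an illustration only and is not the authors' GPU implementation: the function names `fft1d` and `fft2d`, the recursive formulation, and the row-column decomposition are assumptions made for this example, and none of the thread block, memory coalescing, or kernel fusion optimizations described in the abstract are modeled.

```python
import cmath


def fft1d(x):
    """Radix-2 decimation-in-time Cooley-Tukey FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor exp(-2*pi*i*k/n) combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(a):
    """2D FFT by the row-column method: 1D FFT of every row, then every column."""
    rows = [fft1d(list(r)) for r in a]
    # zip(*rows) transposes, so each column can be transformed as a 1D sequence.
    cols = [fft1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed as [row frequency][column frequency].
    return [list(r) for r in zip(*cols)]
```

Calling `fft2d` on a 512×512 list of complex values corresponds to the problem size quoted in the abstract, although this pure-Python version is intended for checking correctness on small inputs rather than for performance.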