• Chinese Optics Letters
  • Vol. 14, Issue 12, 121101 (2016)
Min-Chul Lee1, Kotaro Inoue1, Cheol-Su Kim2, and Myungjin Cho3,*
Author Affiliations
  • 1Department of Computer Science and Electronics, Kyushu Institute of Technology, Fukuoka 820-8502, Japan
  • 2Department of Electrical Energy and Computer Engineering, Gyeongju University, 188 Taejongro, Gyeongju City, Kyeongsangbuk-do 38065, Republic of Korea
  • 3Department of Electrical, Electronic, and Control Engineering, IITC, Hankyong National University, 327 Chungang-ro, Anseong-si, Gyonggi-do 456-749, Republic of Korea
    DOI: 10.3788/COL201614.121101
    Min-Chul Lee, Kotaro Inoue, Cheol-Su Kim, Myungjin Cho. Regeneration of elemental images in integral imaging for occluded objects using a plenoptic camera[J]. Chinese Optics Letters, 2016, 14(12): 121101

    Abstract

    In this Letter, we propose an elemental image regeneration method for three-dimensional (3D) integral imaging of occluded objects using a plenoptic camera. In conventional occlusion removal techniques, the information in the occlusion layers may be lost; the elemental images then have cracked parts, and the visual quality of the reconstructed 3D image is degraded. These cracked parts, however, can be interpolated from adjacent elemental images. Therefore, in this Letter, we improve the visual quality of reconstructed 3D images by interpolating and regenerating virtual elemental images from adjacent elemental images after removing the occlusion layers. To validate our proposed method, we carry out optical experiments and calculate performance metrics such as the mean square error (MSE) and the peak signal-to-noise ratio (PSNR).

    Integral imaging, first proposed by Lippmann in 1908[1], has been used to develop next-generation three-dimensional (3D) imaging and display techniques. To obtain and visualize 3D images, two main steps are required: pickup and reconstruction. In pickup, rays from 3D objects are captured through a lenslet array on an image sensor such as a charge-coupled device (CCD). The captured rays form multiple two-dimensional (2D) images with different perspectives of the 3D objects, which are referred to as elemental images. In the reconstruction or display stage, these elemental images are printed or displayed on a display device, such as a liquid crystal display (LCD), through the same lenslet array used in pickup, so 3D images can be observed without special viewing glasses. Integral imaging does not require the coherent light source used in holography. In addition, it provides full color, full parallax, and continuous viewing points of 3D objects. In particular, it provides the depth information of 3D objects with a passive imaging system. Therefore, it can be applied to occlusion removal techniques for 3D objects[2–11].

    Since integral imaging obtains multi-view information of 3D objects, a depth map may be generated from the parallax between elemental images captured at different viewing points. Occlusions may then be removed by classifying objects and occlusion layers using the depth map and the elemental images. However, this method has two main problems. The first is that the resolutions of the elemental images and the depth map are very low in lenslet-array-based integral imaging. The second is that information in the elemental images may be lost during occlusion removal.

    In this Letter, to solve these problems, we propose an elemental image regeneration method for 3D integral imaging of occluded objects using a plenoptic camera. A plenoptic camera, which is a modified version of an integral imaging system, records the light field (the positions and directions of rays) by placing the main imaging lens in front of the lenslet array. It can capture a high-resolution depth map and an all-in-focus image in a single shot, and it simplifies the process of the conventional occlusion removal technique. To record elemental images with high resolution, in this Letter we use the synthetic aperture integral imaging (SAII) of Jang et al.[12].

    SAII can capture elemental images with the same resolution as the image sensor by replacing the lenslet array with a camera array, thereby improving the resolution of the elemental images. Finally, the cracked parts of the elemental images can be interpolated from adjacent elemental images to enhance them. Since elemental images carry multi-view information, it is possible to interpolate the cracked parts from adjacent elemental images, which can be carried out by inverse computational integral imaging reconstruction (CIIR)[2,13–16].

    First, we present our proposed method. A light field is a vector function of the positions and angles of rays. In general, a light field is defined in five dimensions, consisting of the 3D spatial coordinates and 2D angles; this is referred to as a 5D light field. However, light intensity along a ray is invariant in an optical system, according to the brightness invariance principle. Therefore, the 5D light field can be reduced to a 4D light field. A plenoptic camera records this 4D light field, so it can adjust the position of the focal plane of an image or estimate a depth map.

    The 4D light field function is shown in Fig. 1(a). Rays from the objects are recorded as their intersection points with two 2D planes. That is, L(x, y, u, v) is parameterized by the coordinates (x, y) and (u, v) on the XY and UV planes, respectively; this is equivalent to recording the intersection coordinates on one of the 2D planes together with the angles about the two axes. The concept of the 4D light field for a plenoptic camera is illustrated in Fig. 1(b). The difference between a plenoptic camera and a conventional camera is the lenslet array placed between the main imaging lens and the image sensor. In a conventional camera, rays are recorded only at the coordinates of the image sensor (i.e., 2D information). In a plenoptic camera, however, the intersection coordinates of rays on the two planes can be found by imaging the object rays through both the main lens and the lenslet array.


    Figure 1. 4D light field function. (a) Overview and (b) plenoptic camera.

    A plenoptic camera can reconstruct an image focused at an arbitrary position from the recorded 4D light field. This technique is called refocusing, and it can be implemented by shifting a virtual image sensor plane. It is very simple to carry out by shifting and averaging the sub-aperture images, as shown in Fig. 2[17]. It is similar to CIIR, but its equations are different because it uses the light field function.


    Figure 2. Sub-aperture images.
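    To make the notion of sub-aperture images concrete, the following is a minimal numpy sketch (ours, not the Letter's code) that rearranges an already rectified plenoptic lenslet image into a 4D light field. The function name, the array layout, and the assumption that each microlens covers exactly pv × pu sensor pixels are ours; real Lytro decoding additionally involves demosaicking and hexagonal-grid rectification.

```python
import numpy as np

def extract_subaperture_images(lenslet_img, pu, pv):
    """Rearrange an aligned plenoptic (lenslet) image into sub-aperture images.

    lenslet_img: (Y*pv, X*pu) grayscale array; each microlens covers a
                 contiguous pv x pu block of pixels.
    Returns a 4D light field indexed as L[v, u, y, x], so that L[v, u] is the
    sub-aperture image seen from aperture position (u, v).
    """
    H, W = lenslet_img.shape
    Y, X = H // pv, W // pu
    # Split each microlens block apart: axis order (y, v, x, u) -> (v, u, y, x).
    return lenslet_img.reshape(Y, pv, X, pu).transpose(1, 3, 0, 2)
```

    For example, `L = extract_subaperture_images(raw, pu=14, pv=14)` would assume a microlens pitch of about 14 pixels, which is only an approximation for the camera used later in this Letter.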

    For simplicity, let us consider the movement of the 2D virtual image sensor plane X, as shown in Fig. 3. The X plane is placed at a distance F from the U plane, and the light field L(x, u) consists of the rays passing through coordinate u on the U plane and coordinate x on the X plane. For refocusing, when the X plane is moved to the X′ plane at a distance F′ from the U plane, the light field L′ recorded on the X′ plane can be described as a movement of the x coordinate of L. With the expansion coefficient α = F′/F, L′ can be written as follows[18]:

    $$L'(x', u) = L\left(u + \frac{x' - u}{\alpha},\ u\right). \tag{1}$$

    This equation can be extended to the 4D light field:

    $$L'(x', y', u, v) = L\left(u + \frac{x' - u}{\alpha},\ v + \frac{y' - v}{\alpha},\ u,\ v\right). \tag{2}$$

    Moving the virtual image sensor plane is the same as moving the position of the XY plane in the recording coordinates of the light field. It is well known that a 2D image can be obtained from a 4D light field by integrating the light field. Therefore, the image E_F at distance F can be described as follows:

    $$E_F(x, y) = \frac{1}{F^2} \iint L(x, y, u, v)\, \mathrm{d}u\, \mathrm{d}v. \tag{3}$$

    By substituting Eq. (2) into Eq. (3), we obtain the following equation:

    $$E_{F'}(x', y') = \frac{1}{\alpha^2 F^2} \iint L\left[u\left(1 - \frac{1}{\alpha}\right) + \frac{x'}{\alpha},\ v\left(1 - \frac{1}{\alpha}\right) + \frac{y'}{\alpha},\ u,\ v\right] \mathrm{d}u\, \mathrm{d}v. \tag{4}$$

    From Eq. (4), it can be seen that the image can be reconstructed by shifting and averaging the x, y coordinates of the light field L and expanding the image by the expansion coefficient α. That is, when L_(u,v) denotes the sub-aperture image in the uth column and the vth row, the 2D image can be obtained by shifting and averaging sub-aperture images instead of the elemental images of integral imaging. Therefore, Eq. (4) can be rewritten as[18]

    $$E_{F'}(x', y') = \frac{1}{\alpha^2 F^2} \iint L_{(u,v)}\left[u\left(1 - \frac{1}{\alpha}\right) + \frac{x'}{\alpha},\ v\left(1 - \frac{1}{\alpha}\right) + \frac{y'}{\alpha}\right] \mathrm{d}u\, \mathrm{d}v. \tag{5}$$


    Figure 3. Refocusing.
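    As an illustration of Eq. (5), here is a shift-and-average refocusing sketch in Python. It is our own minimal implementation, not the Letter's code: it assumes grayscale sub-aperture images stored as L[v, u, y, x], uses bilinear interpolation for the sub-pixel shifts, and omits the global 1/α magnification of the output coordinates.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift

def refocus(L, alpha):
    """Shift-and-average refocusing in the spirit of Eq. (5).

    L:     4D light field indexed L[v, u, y, x] (grayscale sub-aperture images).
    alpha: expansion coefficient F'/F; alpha = 1 reproduces the captured focus.
    """
    V, U, Y, X = L.shape
    vc, uc = (V - 1) / 2.0, (U - 1) / 2.0   # shifts taken relative to the central view
    acc = np.zeros((Y, X))
    for v in range(V):
        for u in range(U):
            # Sampling L_(u,v) at x' + u(1 - 1/alpha) is a translation of the
            # sub-aperture image by minus that offset.
            dy = -(v - vc) * (1.0 - 1.0 / alpha)
            dx = -(u - uc) * (1.0 - 1.0 / alpha)
            acc += subpixel_shift(L[v, u], (dy, dx), order=1, mode='nearest')
    return acc / (U * V)
```

    Choosing α slightly larger or smaller than 1 moves the virtual focal plane backward or forward; the averaged stack is the refocused image of Eq. (5) up to the omitted magnification.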

    The depth map can be estimated from the light field. This function is included in the Lytro software, but the algorithm has not been made public. Thus, in this Letter, we present our own depth map estimation. The depth map is a 16-bit grayscale image whose brightness is determined by the Lambda parameter used for refocusing: LambdaMin corresponds to brightness 0, and LambdaMax to brightness $2^{16}-1$. To estimate the physical distance from these Lambda values, a calibration process is required because the brightness of the depth map does not directly give the physical distance.
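    As a small illustration of this mapping (our sketch, not the Letter's code; the numeric LambdaMin and LambdaMax values are Lytro software parameters that the Letter does not give), a 16-bit depth-map intensity can be converted to a refocus Lambda as follows:

```python
def intensity_to_lambda(t, lambda_min, lambda_max):
    """Linearly map a 16-bit depth-map intensity t (0 .. 2**16 - 1) onto the
    refocus parameter range [lambda_min, lambda_max].  The Letter bypasses
    Lambda and calibrates intensity directly against physical distance."""
    return lambda_min + (t / (2 ** 16 - 1)) * (lambda_max - lambda_min)
```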

    Regeneration of elemental images has two main stages: occlusion removal, and interpolation of the cracked parts caused by occlusion removal. Table 1 lists the system parameters and their definitions, and the images at each stage are shown in Fig. 4.


    Figure 4. Images at each stage (k = 3, l = 3): (a) EI, (b) D, (c) OL, (d) OREI, (e) VEI, and (f) REI.

    Parameter   Definition
    C           Size of the image sensor
    D           Depth map
    EI          Elemental image
    M           Movement of the virtual object
    N           Number of pixels in each elemental image
    OL          Occlusion layer
    OREI        Occlusion-removed elemental image
    REI         Regenerated elemental image
    Sg          Number of shifted pixels for regeneration
    Sr          Number of shifted pixels for reconstruction
    Th          Threshold for the occlusion layer
    VEI         Virtual elemental image
    d           Reconstruction distance
    f           Focal length of the camera lens
    p           Moving gap between image sensors

    Table 1. Definition of Parameters

    Occlusion removal can be implemented by thresholding the depth map. Let the depth map be D and the occlusion layer be OL. Then OL can be written as

    $$OL^{(k,l)}(x,y) = \begin{cases} 1, & D^{(k,l)}(x,y) > Th \\ 0, & \text{otherwise}, \end{cases} \tag{6}$$

    where OL^(k,l) is the occlusion layer of the kth column and the lth row, (x, y) is the pixel position, and Th is the threshold value for occlusion removal. Then, occlusions can be removed from the elemental images. Let the elemental image be EI and the elemental image with occlusions removed be OREI. OREI is written as

    $$OREI^{(k,l)}(x,y) = \begin{cases} EI^{(k,l)}(x,y), & OL^{(k,l)}(x,y) = 1 \\ 0, & \text{otherwise}. \end{cases} \tag{7}$$

    Since the elemental image has many zero-brightness pixels after occlusion removal, its visual quality may be degraded. Thus, in this Letter, the elemental image is interpolated and regenerated using adjacent elemental images. Regeneration is carried out by shifting adjacent elemental images and generating a virtual elemental image.
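    In array form, Eqs. (6) and (7) amount to a threshold mask and an elementwise product. A minimal numpy sketch (ours, with hypothetical function names) follows:

```python
import numpy as np

def occlusion_mask(D, th):
    """Eq. (6): OL = 1 where the depth-map intensity lies beyond the threshold Th."""
    return (D > th).astype(np.uint8)

def remove_occlusion(EI, OL):
    """Eq. (7): keep elemental-image pixels where OL = 1, zero out the rest.

    EI may be (Y, X) grayscale or (Y, X, 3) color; OL is a (Y, X) binary mask."""
    mask = OL if EI.ndim == 2 else OL[..., None]
    return EI * mask
```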

    OREI can be sliced by the intensities of the depth map as follows:

    $$Sliced^{(k,l)}(x,y,t) = \begin{cases} OREI^{(k,l)}(x,y), & D^{(k,l)}(x,y) = t \\ 0, & \text{otherwise} \end{cases} \quad (Th < t \le t_{\max}), \tag{8}$$

    where t is the intensity of the depth map and t_max is its maximum intensity. Then, the movement of the elemental images is calculated as depicted in Fig. 5(a). Let (k, l) be the coordinates of the elemental image currently being regenerated and (m, n) be the coordinates of the elemental image used for interpolation. The movements of the elemental image in the x and y directions, M_x and M_y, are written as

    $$M_x(k,m) = (k-m)p, \qquad M_y(l,n) = (l-n)p, \tag{9}$$

    where p is the distance between cameras in SAII. Using these movements, the numbers of shifted pixels for each elemental image, as shown in Fig. 5(b), can be written as follows:

    $$S_{gx}(k,m,t) = \frac{EI_x f M_x(k,m)}{c_x z(t)}, \qquad S_{gy}(l,n,t) = \frac{EI_y f M_y(l,n)}{c_y z(t)}, \tag{10}$$

    where EI_x and EI_y are the numbers of pixels of the image sensor in the x and y directions, and z(t) is the function that transforms the intensity of the depth map into the physical distance. This function depends on the specifications of the plenoptic camera and on the calibration method. Thus, the virtual elemental image VEI can be written as

    $$VEI^{(k,l)}(x,y) = \frac{1}{O(x,y)} \sum_{t=Th+1}^{t_{\max}} \sum_{m=1}^{K} \sum_{n=1}^{L} Sliced^{(m,n)}\big(x + S_{gx}(k,m,t),\ y + S_{gy}(l,n,t),\ t\big), \tag{11}$$

    where O(x, y) is the superposition matrix of CIIR. Equation (11) is the inverse of CIIR. Finally, the regenerated elemental image is obtained from OREI and VEI as follows:

    $$REI^{(k,l)}(x,y) = \begin{cases} OREI^{(k,l)}(x,y), & OL^{(k,l)}(x,y) = 1 \\ VEI^{(k,l)}(x,y), & \text{otherwise}. \end{cases} \tag{12}$$

    To verify our proposed method, we carried out computer simulations. The parameters are described in Table 1. In CIIR, the numbers of shifted pixels for each elemental image, S_rx and S_ry, are

    $$S_{rx} = \frac{N_x p f}{c_x d}, \qquad S_{ry} = \frac{N_y p f}{c_y d}. \tag{13}$$

    Finally, the reconstructed 3D image at distance d is obtained by the following equation:

    $$I(x,y,d) = \frac{1}{O(x,y)} \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} REI^{(k,l)}(x - kS_{rx},\ y - lS_{ry}). \tag{14}$$

    Next, we show the experimental results. The depth map from the Lytro software does not represent the physical depth directly. Thus, we place a reference object at a fixed distance and measure its distance by stereo matching, so that we can estimate the relation between the intensity of the depth map and the physical depth. In the pickup stage, a LYTRO ILLUM is used, where the resolution of the camera is 2022(H) × 1404(V) pixels, the focal length of the camera lens is f = 70 mm, the refocus range is 400–750 mm, and the distance between cameras is p = 10 mm. When the shift S (in pixels) between two elemental images is known, the depth d can be calculated using the following equation:

    $$d = \frac{N_x p f}{c_x S} = \frac{2022 \times 10 \times 70}{36 \times S}. \tag{15}$$


    Figure 5. Overview of the algorithm. (a) Movement of elemental images and (b) shifted pixels for regeneration.
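    To make the regeneration pipeline concrete, here is a compact numpy sketch of Eqs. (8)–(14). It is our own illustration, not the authors' code: it assumes grayscale images, rounds the shifts of Eq. (10) to integer pixels, and takes the pixel counts EI_x and N_x to be the array width X (and likewise in y); `shift2d`, `regenerate`, and `ciir_reconstruct` are hypothetical helper names.

```python
import numpy as np

def shift2d(img, dy, dx):
    """Zero-fill translation such that out[y, x] = img[y + dy, x + dx]."""
    out = np.zeros_like(img)
    h, w = img.shape
    ys, yd = (dy, 0) if dy >= 0 else (0, -dy)
    xs, xd = (dx, 0) if dx >= 0 else (0, -dx)
    out[yd:h - ys, xd:w - xs] = img[ys:h - yd, xs:w - xd]
    return out

def regenerate(k, l, EI, D, OL, th, p, f, cx, cy, z):
    """Sketch of Eqs. (8)-(12): regenerate elemental image (k, l) by filling the
    holes left by occlusion removal with depth-sliced, shifted pixels from the
    adjacent elemental images.

    EI, D, OL: (K, L, Y, X) stacks of elemental images, depth maps, and masks.
    z(t):      calibrated intensity-to-distance function, cf. Eq. (16)."""
    K, L, Y, X = D.shape
    OREI = EI * OL                                   # Eq. (7)
    acc = np.zeros((Y, X))
    count = np.zeros((Y, X))                         # superposition matrix O(x, y)
    for t in range(th + 1, int(D.max()) + 1):
        zt = z(t)
        for m in range(K):
            for n in range(L):
                # Eqs. (9)-(10): camera movements (k - m)p and (l - n)p become
                # pixel shifts Sg = pixels * f * M / (sensor size * z(t)).
                sgx = int(round(X * f * (k - m) * p / (cx * zt)))
                sgy = int(round(Y * f * (l - n) * p / (cy * zt)))
                sliced = np.where(D[m, n] == t, OREI[m, n], 0.0)       # Eq. (8)
                acc += shift2d(sliced, sgy, sgx)                        # Eq. (11)
                count += shift2d((D[m, n] == t).astype(float), sgy, sgx)
    VEI = np.divide(acc, count, out=np.zeros_like(acc), where=count > 0)
    return np.where(OL[k, l] == 1, OREI[k, l], VEI)                     # Eq. (12)

def ciir_reconstruct(REI, d, p, f, cx, cy):
    """Eqs. (13)-(14): reconstruct the plane image at distance d by shifting and
    superimposing the regenerated elemental images (REI: (K, L, Y, X))."""
    K, L, Y, X = REI.shape
    srx = int(round(X * p * f / (cx * d)))
    sry = int(round(Y * p * f / (cy * d)))
    acc = np.zeros((Y, X))
    count = np.zeros((Y, X))
    for kk in range(K):
        for ll in range(L):
            acc += shift2d(REI[kk, ll], -ll * sry, -kk * srx)
            count += shift2d(np.ones((Y, X)), -ll * sry, -kk * srx)
    return np.divide(acc, count, out=np.zeros_like(acc), where=count > 0)
```

    The per-pixel `count` array plays the role of the superposition matrix O(x, y), normalizing regions where different numbers of shifted images overlap.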

    Table 2 and Fig. 6 show the measurement results. These depths are converted to the intensity t of the depth map. As shown in Fig. 6, the relation between the intensity of the depth map and the physical depth is linear, so a linear approximation can be found by the least squares method. The resulting experimental equation for the transformation between the intensity of the depth map and the physical depth is

    $$z(t) = 7.33t - 522.1. \tag{16}$$

    This equation is used to calculate S_gx and S_gy in the regeneration of the elemental images.


    Figure 6. Linear approximation results.

              Near     Middle   Far
    S (px)    105      71       57
    d (mm)    374.4    553.8    689.8
    t         118      149      164

    Table 2. Calibration Results
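    The least-squares fit behind Eq. (16) can be reproduced in a few lines of numpy. Note that the Letter's coefficients may rest on more calibration points than the three tabulated above, so this sketch illustrates the procedure rather than the exact numbers:

```python
import numpy as np

# Calibration points from Table 2: depth-map intensity t vs. measured depth d (mm).
t = np.array([118.0, 149.0, 164.0])
d = np.array([374.4, 553.8, 689.8])

slope, intercept = np.polyfit(t, d, 1)   # least-squares line d = slope * t + intercept
print(f"z(t) = {slope:.2f} * t {intercept:+.1f}  (mm)")
```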

    In our experiment, there are two pickup scenarios: with occlusion and without occlusion. The 3D objects without occlusion are used to calculate the mean square error (MSE) and the peak signal-to-noise ratio (PSNR) as follows:

    $$\mathrm{MSE} = E\big[(\mathrm{Ref} - I_{\mathrm{input}})^2\big], \tag{17}$$

    $$\mathrm{PSNR} = 20 \log_{10}\!\left(\frac{\mathrm{MAX}_I}{\sqrt{\mathrm{MSE}}}\right), \tag{18}$$

    where E[·] is the expectation operator, Ref is the reference image, I_input is the reconstruction result, and MAX_I is the maximum pixel intensity of the image. The occlusion is placed in front of the left shoulder of the object.
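    A direct numpy transcription of Eqs. (17) and (18) is given below (our sketch; the default `max_i` assumes 8-bit images):

```python
import numpy as np

def mse_psnr(ref, img, max_i=255.0):
    """Eqs. (17)-(18): mean square error and peak signal-to-noise ratio."""
    diff = ref.astype(np.float64) - img.astype(np.float64)
    mse = np.mean(diff ** 2)                      # E[(Ref - I_input)^2]
    psnr = 20.0 * np.log10(max_i / np.sqrt(mse))  # in dB
    return mse, psnr
```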

    Figure 7 shows the reconstructed 3D images at the reconstruction distance d = 620 mm for each method. The reconstructed 3D image obtained from the elemental images without occlusion, shown in Fig. 7(a), is the reference for the MSE and PSNR. Figures 7(b) and 7(c) show the reconstructed 3D images obtained from the elemental images after conventional occlusion removal and by our proposed method, respectively. As shown in the enlarged views, the characters "BF-37" on the shoulder of the object can be easily recognized in Fig. 7(f). To evaluate the visual quality of the reconstructed 3D images, we calculate the MSE and PSNR, as shown in Fig. 8. Our proposed method obtains clearly better results: the MSE is improved by 60%, and the PSNR by 7 dB.


    Figure 7. Experimental results at d = 620 mm: (a) original, (b) conventional occlusion removal, (c) proposed method, and (d)–(f) enlarged views of (a)–(c), respectively.


    Figure 8. Image quality evaluated by the MSE and PSNR.

    In this Letter, we have proposed a regeneration technique for the elemental images of integral imaging using a plenoptic camera after occlusion removal. In conventional methods, image information may be lost after occlusion removal; in our proposed method, by contrast, this information is interpolated from adjacent elemental images. However, our method has some limitations. The visual quality of the regenerated elemental images depends on the accuracy of the depth map and of the calibration. In addition, our method uses an averaging process for the 3D reconstruction, so high spatial frequencies may be lost. We will address these problems in future work.
