An efficient parallel processing technique and a scalable architecture for the lifting-based Discrete Wavelet Transform (DWT) are proposed for high-speed implementation. Within any given lifting stage, the operations applied to different input samples are independent of one another; this independence is exploited to increase parallelism and thereby the data processing capacity of the system. In particular, a new high-throughput architecture for the 1-D (5, 3) DWT is introduced. The throughput of the proposed design can be flexibly extended to 2J samples per clock cycle, which provides a new alternative for high-speed implementation of the DWT. Theoretical analysis and experimental results demonstrate the effectiveness of the proposed design.
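To make the stage-level independence concrete, the following is a minimal software sketch of one level of a 1-D (5, 3) lifting DWT. It is not the proposed hardware architecture. It assumes the reversible integer form of the (5, 3) filter with whole-sample symmetric boundary extension, an even-length input, and hypothetical function names `dwt53_forward` and `dwt53_inverse`; none of these details are stated in the abstract. The point it illustrates is that every output of the predict stage depends only on the input samples, and every output of the update stage depends only on predict outputs, so all samples within one stage could be computed at the same time.

```python
def dwt53_forward(x):
    """One level of an integer (5, 3) lifting DWT (assumed reversible variant).

    Returns (s, d): the low-pass (approximation) and high-pass (detail)
    coefficients. The input must have even length >= 2.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("input length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    half = n // 2

    # Predict stage: each d[i] uses only input samples, so all `half`
    # values are mutually independent and could be computed in parallel.
    # The right boundary uses symmetric extension (x[n] mirrors x[n-2]).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]

    # Update stage: each s[i] uses only predict outputs, so this stage is
    # likewise parallel across i. The left boundary mirrors d[-1] onto d[0].
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def dwt53_inverse(s, d):
    """Invert dwt53_forward by undoing the update, then the predict stage."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6, 7, 8]
    low, high = dwt53_forward(samples)
    print(low, high)
    print(dwt53_inverse(low, high) == samples)
```

In this sketch each list comprehension corresponds to one lifting stage. A hardware design that evaluates several iterations of such a loop per clock cycle raises throughput in proportion, which is the kind of parallelism the abstract describes.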