The Discrete Wavelet Transform (DWT) has been studied and developed in various scientific and engineering fields. Its multi-resolution and locality properties suit applications that require progressive capture of high-frequency details. However, performance may degrade drastically when the data volume is very large. The multi-resolution sub-band encoding provided by the DWT enables higher compression ratios and progressive transformation of signals. The widespread use of the DWT has motivated the development of fast DWT algorithms and their tuning on many kinds of computer systems. However, the transform comes at the expense of additional computational complexity. Achieving real-time or interactive compression and decompression speed therefore requires a fast implementation of the DWT that leverages emerging parallel hardware. Recent consumer-level multicore hardware is equipped with Single Instruction, Multiple Data (SIMD) capability. In this study, a Parallel Discrete Wavelet Transform has been developed with a novel Adaptive Load Balancing Algorithm (ALBA). The DWT is parallelized, partitioned, mapped, and scheduled on single-core and multicore processors. The parallel DWT is implemented in C# for single-core and Intel quad-core processors, and in a combination of C and CUDA for the GPU. This yields a significant performance gain on a consumer-level PC without extra cost.
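The partition-and-map idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' ALBA or their C#/CUDA implementation: it assumes the simplest wavelet (orthonormal Haar), a static split of image rows into near-equal contiguous blocks, and Python threads as stand-ins for cores. All function names (`haar_step`, `haar_dwt`, `partition`, `parallel_row_dwt`) are hypothetical and chosen only for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level DWT: repeatedly transform the approximation band."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # finest level first
    return approx, details


def partition(n_rows, n_workers):
    """Static partitioning: contiguous row blocks of near-equal size."""
    base, extra = divmod(n_rows, n_workers)
    blocks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        if size:
            blocks.append(range(start, start + size))
        start += size
    return blocks


def parallel_row_dwt(image, n_workers=4):
    """Map each row block to a worker and apply one Haar level per row."""
    def work(block):
        return [haar_step(image[r]) for r in block]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map yields results in block order, so row order is preserved.
        chunks = pool.map(work, partition(len(image), n_workers))
        return [pair for chunk in chunks for pair in chunk]
```

Because of CPython's global interpreter lock, the threads here show only the structure of the decomposition and give no real speedup. An adaptive scheme such as the one the abstract names would presumably adjust block sizes at run time rather than fix them up front, but the source gives no details, so none are assumed here.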
Published in: Science Journal of Circuits, Systems and Signal Processing (Volume 2, Issue 2)
DOI: 10.11648/j.cssp.20130202.11
Page(s): 22-28
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2013. Published by Science Publishing Group
Keywords: Discrete Wavelet Transform, Multicore, GPUs
APA Style
Mohammad Wadood Majid, Golrokh Mirzaei, Mohsin M. Jamali. (2013). Novel Implementation of Recursive Discrete Wavelet Transform for Real Time Computation with Multicore Systems on Chip (SOC). Science Journal of Circuits, Systems and Signal Processing, 2(2), 22-28. https://doi.org/10.11648/j.cssp.20130202.11
@article{10.11648/j.cssp.20130202.11,
  author = {Mohammad Wadood Majid and Golrokh Mirzaei and Mohsin M. Jamali},
  title = {Novel Implementation of Recursive Discrete Wavelet Transform for Real Time Computation with Multicore Systems on Chip (SOC)},
  journal = {Science Journal of Circuits, Systems and Signal Processing},
  volume = {2},
  number = {2},
  pages = {22-28},
  doi = {10.11648/j.cssp.20130202.11},
  url = {https://doi.org/10.11648/j.cssp.20130202.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.cssp.20130202.11},
  year = {2013}
}
TY - JOUR
T1 - Novel Implementation of Recursive Discrete Wavelet Transform for Real Time Computation with Multicore Systems on Chip (SOC)
AU - Mohammad Wadood Majid
AU - Golrokh Mirzaei
AU - Mohsin M. Jamali
Y1 - 2013/04/02
PY - 2013
N1 - https://doi.org/10.11648/j.cssp.20130202.11
DO - 10.11648/j.cssp.20130202.11
T2 - Science Journal of Circuits, Systems and Signal Processing
JF - Science Journal of Circuits, Systems and Signal Processing
JO - Science Journal of Circuits, Systems and Signal Processing
SP - 22
EP - 28
PB - Science Publishing Group
SN - 2326-9073
UR - https://doi.org/10.11648/j.cssp.20130202.11
VL - 2
IS - 2
ER -