Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems.
Published in |
American Journal of Networks and Communications (Volume 3, Issue 5-1)
This article belongs to the Special Issue Parallel Computer and Parallel Algorithms |
DOI | 10.11648/j.ajnc.s.2014030501.12 |
Page(s) | 15-28 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2014. Published by Science Publishing Group |
Parallel Computer, NOW, Grid, Parallel Algorithm (PA), Matrix PA, Decomposition, Performance Modeling, Optimization, Issoeficiency Function, Numerical Integration, Discrete Fast Fourier Transform (DFFT), Overhead Function
[1] | Abderazek A. B., Multicore systems on chip - Practical Software/Hardware design, Imperial college press, pp. 200, 2010 |
[2] | Arora S., Barak B., Computational complexity - A modern Approach, Cambridge University Press, pp. 573, 2009 |
[3] | Casanova H., Legrand A., Robert Y., Parallel algorithms, CRC Press, 2008 |
[4] | Desel J., Esperza J., Free Choise Petri Nets, Cambridge University Press, UK, pp. 256, 2005 |
[5] | Dubois M., Annavaram M., Stenstrom P., Parallel Computer Organization and Design, pp. 560, 2012 |
[6] | Dubhash D.P., Panconesi A., Concentration of measure for the analysis of randomized algorithms, Cambridge University Press, UK, 2009 |
[7] | Edmonds J., How to think about algorithms, Cambridge University Press, UK, pp. 472, 2010 |
[8] | Gelenbe E., Analysis and synthesis of computer systems, Imperial College Press, pp. 324, 2010 |
[9] | Hager G., Wellein G., Introduction to High Performance Computing for Scientists and Engineers, pp. 356, July 2010 |
[10] | Hanuliak P., Analytical method of performance prediction in parallel algorithms, The Open Cybernetics and Systemics Journal, Vol. 6, Bentham, UK, pp. 38-47, 2012 |
[11] | Hanuliak M., Unified analytical models of parallel and distributed computing, American J. of Networks and Communication, Science PG, Vol. 3, No. 1, USA, pp. 1-12, 2014 |
[12] | Hanuliak M., Hanuliak I., To the correction of analytical models for computer based communication systems, Kybernetes, Vol. 35, No. 9, UK, pp. 1492-1504, 2006 |
[13] | Hanuliak M., Performance modeling of Nov and Grid parallel computers, AD ALTA – Vol. 3, issue 2, Hradec Kralove, Czech republic, pp. 91-96, 2013 |
[14] | Harchol-BalterMor, Performance modeling and design of computer systems, Cambridge University Press, UK, pp. 576, 2013 |
[15] | Hwang K. and coll., Distributed and Parallel Computing, Morgan Kaufmann, pp. 472, 2011 |
[16] | John L. K., Eeckhout L., Performance evaluation and benchmarking, CRC Press, 2005 |
[17] | Kshemkalyani A. D., Singhal M., Distributed Computing, University of Illinois, Cambridge University Press, UK, pp. 756, 2011 |
[18] | Kirk D. B., Hwu W. W., Programming massively parallel processors, Morgan Kaufmann, pp. 280, 2010 |
[19] | Kostin A., Ilushechkina L., Modeling and simulation of distributed systems, Imperial College Press, pp. 440, 2010 |
[20] | Kumar A., Manjunath D., Kuri J., Communication Networking , Morgan Kaufmann, pp. 750, 2004 |
[21] | Kushilevitz E., Nissan N., Communication Complexity, Cambridge University Press, UK, pp. 208, 2006 |
[22] | Kwiatkowska M., Norman G., and Parker D., PRISM 4.0: Verification of Probabilistic Real-time Systems, In Proc. of 23rd CAV’11, Vol. 6806, Springer, pp. 585-591, 2011 |
[23] | Le Boudec Jean-Yves, Performance evaluation of computer and communication systems, CRC Press, pp. 300, 2011 |
[24] | McCabe J., D., Network analysis, architecture, and design (3rd edition), Elsevier/ Morgan Kaufmann, pp. 496, 2010 |
[25] | Meerschaert M., Mathematical modeling (4-th edition), Elsevier, pp. 384, 2013 |
[26] | Misra Ch. S.,Woungang I., Selected topics in communication network and distributed systems, Imperial college press, pp. 808, 2010 |
[27] | Miller S., Probability and Random Processes, 2nd edition, Academic Press, Elsevier Science, pp. 552, 2012 |
[28] | Peterson L. L., Davie B. C., Computer networks – a system approach, Morgan Kaufmann, pp. 920, 2011 |
[29] | Resch M. M., Supercomputers in Grids, Int. J. of Grid and HPC, No.1, pp. 1-9, 2009 |
[30] | Riano l., McGinity T.M., Quantifying the role of complexity in a system´s performance, Evolving Systems, Springer Verlag, pp. 189–198, 2011 |
[31] | Ross S. M., Introduction to Probability Models, 10th edition, Academic Press, Elsevier Science, pp. 800, 2010 |
[32] | Takahashi D., Kanada Y.: High-performance radix-2, 3 and 5 parallel 1-D complex FFT algorithms for distributed-memory parallel computers, J. of Supercomputing, 15, Kluwer Academic Publishers, The Netherlands, pp. 207-228, 2000 |
[33] | Wang L., Jie Wei., Chen J., Grid Computing: Infrastructure, Service, and Application, CRC Press, 2009 www pages |
[34] | www.top500.org |
[35] | www.intel.com |
[36] | www.spec.org. |
APA Style
Peter Hanuliak, Juraj Hanuliak. (2014). Complex Performance Modeling of Parallel Algorithms. American Journal of Networks and Communications, 3(5-1), 15-28. https://doi.org/10.11648/j.ajnc.s.2014030501.12
ACS Style
Peter Hanuliak; Juraj Hanuliak. Complex Performance Modeling of Parallel Algorithms. Am. J. Netw. Commun. 2014, 3(5-1), 15-28. doi: 10.11648/j.ajnc.s.2014030501.12
AMA Style
Peter Hanuliak, Juraj Hanuliak. Complex Performance Modeling of Parallel Algorithms. Am J Netw Commun. 2014;3(5-1):15-28. doi: 10.11648/j.ajnc.s.2014030501.12
@article{10.11648/j.ajnc.s.2014030501.12, author = {Peter Hanuliak and Juraj Hanuliak}, title = {Complex Performance Modeling of Parallel Algorithms}, journal = {American Journal of Networks and Communications}, volume = {3}, number = {5-1}, pages = {15-28}, doi = {10.11648/j.ajnc.s.2014030501.12}, url = {https://doi.org/10.11648/j.ajnc.s.2014030501.12}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajnc.s.2014030501.12}, abstract = {Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems.}, year = {2014} }
TY - JOUR T1 - Complex Performance Modeling of Parallel Algorithms AU - Peter Hanuliak AU - Juraj Hanuliak Y1 - 2014/07/31 PY - 2014 N1 - https://doi.org/10.11648/j.ajnc.s.2014030501.12 DO - 10.11648/j.ajnc.s.2014030501.12 T2 - American Journal of Networks and Communications JF - American Journal of Networks and Communications JO - American Journal of Networks and Communications SP - 15 EP - 28 PB - Science Publishing Group SN - 2326-8964 UR - https://doi.org/10.11648/j.ajnc.s.2014030501.12 AB - Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems. VL - 3 IS - 5-1 ER -