Parallel Pipelined Hardware Acceleration of Fast Fourier Transforms on FPGA

Simi Zerine Sleeba, Karthika B.R., Krishna Sudhan, Lakshmi Priya Unni, Lin John

Abstract


The Fast Fourier Transform (FFT) is widely used in digital signal processing ap-plications and particularly for implementing convolution operation for real-time object detection using CNN. This paper proposes an efficient hardware architecture for Radix-2 FFT computation, implemented on an FPGA, employing multiple parallel and pipelined stages of butterfly units. The proposed architecture utilises Block RAM to store inputs and twiddle factor values to compute the transform. The hardware for the proposed architecture is synthesised on a Zync Ultrascale FPGA and its performance is evaluated using parameters such as critical path delay, throughput, device utilisation and power consumption.The performance of the proposed parallel pipelined architecture for 8 point FFT, measured in FFTOPS, is found to be 67% higher than the non-pipelined architecture. Performance comparison with the state-of-the-art parallel pipelined methods confirm the acceleration achieved by the proposed FFT architecture. A comprehensive comparison of the proposed hardware with the synthesised version of the FFT IP core bundled with the Vivado Design suite is also presented in the paper.

Keywords


Fast Fourier Transform; Hardware accelerator ;Block RAM ;Field Programmable Gate Array ;Parallel Pipelining;

References


A. Chahardahcherik, Y. S. Kavian, O. Strobel, and R. Rejeb. (2011) ’Implementing FFT algorithms on FPGA’, International Journal of Computer Science and Network Security, Vol. 11, No. 11, pp.148—156.

W. Cooley and J. W. Tukey(1965) ’An algorithm for the machine calculation of complex Fourier series’, Mathematics of Computation, Vol. 19, No. 90, pp.297–301.

K. Harikrishna, T. R. Rao, and V. A. Labay (2011) ’FPGA implementation of FFT algorithm for IEEE 802.16 e (mobile wimax)’, International Journal of Computer Theory and Engineering, Vol. 3, No. 2, pp.197–203.

I. Az, S. Sahin and M. A. Cavuslu (2007) ‘Implementation of Fast Fourier and Inverse Fast Fourier Transforms in FPGA’, Proceedings of 3rd International Symposium on Electrical, Electronic and Computer Engineering , pp. 7–10.

V. Kitsakis, K. Nakos, D. Reisis, and N. Vlassopoulos (2018) ’Parallel memory accessing for FFT architectures’, Journal of Signal Processing Systems, Vol. 90, No. 11, pp.1593–1607.

T.Y. Sun and Y.H. Yu (2009) ’Memory usage reduction method for FFT implementations on DSP based Embedded System’, Proceedings of IEEE 13th International Symposium on Consumer Electronics, pp.812—815.

B. S. C. Varma,, K. Paul, and M. Balakrishnan (2013) ’Accelerating 3D-FFT using hard embedded blocks in FPGAs’, Proceedings of the 26th International Conference on VLSI Design and 2013 12th International Conference on Embedded Systems, pp.92–97.

G. P. Kumar, M. S. Chandra, K. S. Prasanna, and M. Mahesh (2021) ’Parallel memory accessing for FFT architectures’, Journal of Signal Processing Systems, Vol. 90, No. 11, pp.1593–1607.

I.S.Uzun, A.Amira and A.Bouridane (2005) ’FPGA Implementations of Fast Fourier Transforms for Real-Time Signal and Image Processing’, IEE Proceedings - Vision, Image, and Signal Processing,Vol. 152, No. 3, pp. 283–295.

R.V.Benadikar and V. Jayasree (2018) ’Comparative Study Of FPGA Implementation Of Parallel 1-D FFT Algorithm Using Radix-2 And Radix-4 Butterfly Elements’, IOSR Journal of VLSI and Signal Processing, Vol. 8, No. 4, pp. 23–47.

J. Y. Kim and M. H. Sunwoo (2004) ’Design of Address Generation Unit for audio DSP’, IEEE International Symposium on Intelligent Signal Processing and Communication Systems, pp. 616–619.

Z. Kaya, M. Garrido and J. Takala (2023) ’Memory-Based FFT Architecture With Optimized Number of Multiplexers and Memory Usage’, IEEE Transactions on Circuits and Systems-II:Express Briefs, Vol. 70, No. 8, pp.3084–3088.

Y.N. Chang and K. K. Parhi (2003) ’An efficient pipelined FFT architecture’, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 50, No. 6, pp. 322–325.

P.K.Godi, B.T.Krishna and P. Kottipalli (2020) ’Design optimisation of multiplier-free parallel pipelined FFT on field programmable gate array’, IET Circuits, Devices and Systems,Vol.14, No. 7, pp. 995-1000.

M.D. Hameed Pasha and P. Radhika (2018) ’A Pipelined FFT Architecture to Process Two Independent Data Streams’, Journal of Emerging Technologies and Innovative Research,Vol.5, No. 1, pp.802-807.

S. Josue Saenz, J. J. Raygoza P., E. C. Becerra A., S. O. Cisneros and J. R. Dominguez, ’FPGA design and implementation of radix-2 Fast Fourier Transform algorithm with 16 and 32 points’, 2015 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), Mexico, 2015, pp. 1-6.

A. Changela, M. Zaveri and D. Verma,’FPGA implementation of high-performance, resource-efficient Radix-16 CORDIC rotator based FFT algorithm’, Integration,Vol. 73,pp. 89-100.

M.S.Gilan and B. Maham, ’Optimized power and speed of Split-Radix, Radix-4 and Radix-2 FFT structures’. EURASIP Journal of Advanced Signal Processing, 81(2024).

H.N. Nguyen, S.A. Khan, C. Kim and J.Kim (2018) ’A pipelined processor using an optimal hybrid rotation scheme for complex multiplication: Design, FPGA implementation and Analysis’, Electronics,Vol.7, No. 8, pp. 137.

A.S. Raghuvanshi, A. Kumar and S. Gavel (2021) ’Throughput Radix8 Based FFT Architecture’, Proceeding of Fifth International Conference on Microelectronics, Computing and Communication Systems, Lecture Notes in Electrical Engineering, vol 748. Springer, Singapore.

M. Garrido(2022). ’A Survey on Pipelined FFT Hardware Architectures’, Journal of Signal Processing Systems, 94(11), 1345-1364.

Carl Ingemarsson and Oscar Gustafsson (2018). ’SFF—The Single-Stream FPGA-Optimized Feedforward FFT Hardware Architecture’. Journal of Signal Processing Systems, 90(11),1583–1592.

Hsu, S. C., Shen-ju-huang, Chen,.S. G., Lin, S. C. and Garrido, M. (2020). ’A 128-point multi-path SC FFT architecture’. Proceedings of the IEEE International Symposium of Circuits and Systems (pp. 1–5).

M. Garrido, S.J. Huang and S.G. Chen (2018).’Feedforward FFT Hardware Architectures based on Rotator Allocation’, IEEE Transactions on Circuits and Systems Part I, 65(2), 581–592.

Z. Kaya and M. Garrido (2024) ’Optimised 4 Parallel 1024-point MSC FFT’, IEEE Access, Vol.12,84110-84121.

G.-T. Deng, M. Garrido, S.-G. Chen, and S.-J. Huang (2023), “Radix-2ˆk MSC FFT architectures,” IEEE Access, vol. 11, pp. 81497–81510, 2023.

N. Shirazi, P.M. Athanas, and A.L. Abbott (1995). ’Implementation of a 2-D fast Fourier transform on an FPGA based custom computing machine Field-Programmable Logic and Applications, Lecture Notes in Computer Science, vol 975.

J. S. Kim, C. -L. Yu, L. Deng, S. Kestur, V. Narayanan and C. Chakrabarti, ’FPGA architecture for 2D Discrete Fourier Transform based on 2D decomposition for large-sized data’, IEEE Workshop on Signal Processing Systems, Tampere, Finland,121-126.

D. Jeon, M. Seok, C. Chakrabarti, D. Blaauw and D. Sylvester, ’A super-pipelined energy efficient subthreshold 240 MS/s FFT core in 65 nm CMOS’, IEEE Journal on Solid-State Circuits, 47(1), 23– 34.

V.Nair, M. Chatterjee, N. Tavakoli, A.Siami Namin and C. Snoeyink(2020) ’Optimizing CNN using Fast Fourier Transformation for Object Recognition’.arXiv 234-239. 10.1109/ICMLA51294.2020.00046.

Y. Zhang, F. Li, H.Xu, X. Li, X.Jiang, S. Efficient Convolutional Neural Networks Utilizing Fine-Grained Fast Fourier Transforms. Electronics 2024, 13, 3765.


Full Text: PDF

Refbacks

  • There are currently no refbacks.


 

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

Creative Commons Licence

This work is licensed under a Creative Commons Attribution 4.0 International License.

web analytics
View IJEEI Stats