Design and Performance Analysis of a Fast 4-Way Set Associative Cache Controller using Tree Pseudo Least Recently Used Algorithm

Mohamed Alfian Al-Zikry Hazlan, Teddy Surya Gunawan, Mashkuri Yaacob, Mira Kartiwi, Fatchul Arifin

Abstract


In modern computing, cache memory bridges the speed gap between fast processors and slower main memory. This study presents a cache controller for a 4-way set associative cache, written in VHDL and structured as a Finite State Machine. The controller manages a 256-byte cache whose blocks are 128 bits (16 bytes) each, organized into four sets of four lines. Cache replacement follows the Tree Pseudo Least Recently Used (PLRU) algorithm, chosen to optimize cache performance. The controller was simulated in ModelSim together with a main memory segmented into four 1 KB banks, and the resulting timing diagram was used to validate the design's functionality. The timing diagram showed the expected logic outputs. Read hits completed in three cycles, read misses in five and a half cycles, and both write hits and write misses in three and a half cycles. These results indicate that the proposed controller can improve cache memory efficiency and help bridge the processor-memory speed gap, balancing the complexity of set-associative mapping against the need for optimized performance.
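
The cache geometry and replacement policy described in the abstract can be illustrated with a small software model. The following Python sketch is not the authors' VHDL design. It only assumes the figures stated above (16-byte blocks, four sets, four ways), which imply a 4-bit block offset and a 2-bit set index. The names `split_address` and `TreePLRUSet`, the bit convention (each tree bit points toward the next victim), and the choice to always fill the PLRU victim on a miss are illustrative assumptions.

```python
BLOCK_BYTES = 16   # 128-bit blocks, as stated in the abstract
NUM_SETS = 4       # 256 bytes / 16 bytes per block / 4 ways
NUM_WAYS = 4
OFFSET_BITS = 4    # log2(BLOCK_BYTES)
INDEX_BITS = 2     # log2(NUM_SETS)


def split_address(addr):
    """Split a byte address into (tag, set index, block offset)."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class TreePLRUSet:
    """One 4-way set with a 3-bit Tree PLRU state (hypothetical model).

    bits[0] is the root: 0 means the victim is in ways 0-1, 1 means ways 2-3.
    bits[1] selects between way 0 (0) and way 1 (1).
    bits[2] selects between way 2 (0) and way 3 (1).
    """

    def __init__(self):
        self.bits = [0, 0, 0]
        self.tags = [None] * NUM_WAYS

    def touch(self, way):
        """Mark a way as recently used by pointing the tree away from it."""
        if way < 2:
            self.bits[0] = 1          # next victim comes from ways 2-3
            self.bits[1] = 1 - way    # point at the other way of the left pair
        else:
            self.bits[0] = 0          # next victim comes from ways 0-1
            self.bits[2] = 3 - way    # point at the other way of the right pair

    def victim(self):
        """Follow the tree bits to the pseudo least recently used way."""
        if self.bits[0] == 0:
            return self.bits[1]
        return 2 + self.bits[2]

    def access(self, tag):
        """Return True on a hit; on a miss, replace the PLRU victim."""
        if tag in self.tags:
            self.touch(self.tags.index(tag))
            return True
        way = self.victim()
        self.tags[way] = tag
        self.touch(way)
        return False
```

In this model, four consecutive misses on an empty set fill ways 0, 2, 1 and 3 in that order, so every way is used before anything is evicted. Only three state bits per set are needed, compared with the full ordering that true LRU has to track, which is the usual reason for preferring Tree PLRU in hardware.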

Keywords


Cache controller; VHDL-designed; Finite State Machine; 4-way set associative cache; Tree PLRU algorithm


Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

This work is licensed under a Creative Commons Attribution 4.0 International License.
