
A Hybrid Cam-Ram Architecture for Increasing Processor Performance




MAHALAKSHMI.V
Research Student, Dept. of ECE
VPMM Engineering College for Women
Virudhunagar, India
maha.heana@gmail.com

AJEETHA CHERIN.D
Research Student, Dept. of ECE
VPMMECW
Virudhunagar, India
ajeetha233@gmail.com

SNEHA.S
Assistant Professor, Dept. of ECE
VPMM Engineering College for Women
Virudhunagar, India
snehsselvam@gmail.com

Abstract-In a RAM-based memory, a command passed from the CPU requires an address mapping to be performed when data are recovered. This can take considerably longer than CAM-based address fetching. In this paper, we implement a new hybrid of RAM and CAM for renaming and recovering data, together with a new CAM architecture that improves processing speed. Modern superscalar processors implement register renaming using either random access memory (RAM) or content-addressable memory (CAM) tables. The design of these structures should address both access time and misprediction recovery penalty. Although direct-mapped RAMs provide faster access times, CAMs are more appropriate for avoiding recovery penalties. The presence of associative ports in CAMs, however, prevents them from scaling with the number of physical registers and pipeline width, negatively impacting performance, area, and energy consumption at the rename stage. In this paper, we present a new hybrid RAM-CAM register renaming scheme, which combines the best of both approaches. In steady state, a RAM provides fast and energy-efficient access to register mappings. On misspeculation, a low-complexity CAM enables immediate recovery. Experimental results show that in a four-way state-of-the-art superscalar processor, the new approach provides almost the same performance as an ideal CAM-based renaming scheme, while dissipating only between 17% and 26% of the original energy and, in some cases, consuming less energy than purely RAM-based renaming schemes. Overall, the silicon area required to implement the hybrid RAM-CAM scheme does not exceed the area required by conventional renaming mechanisms.

Key words- Content Addressable Memory, Match line (ML), Search Line, Low Power, High Speed.

I. INTRODUCTION

Content-addressable memory (CAM) is a special type of computer memory used in certain very-high-speed searching applications. A memory map is a data structure that indicates how memory is laid out; the term can have different meanings in different parts of the operating system. In register renaming, a logical source register is renamed by using its identifier to obtain the current mapping. This lookup is faster and more energy-efficient with a RAM structure.
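The difference between the two lookup styles can be illustrated with a minimal Python sketch. It is not the renaming hardware evaluated in this paper: the class names, the table sizes, and the initial identity mapping are assumptions made only for illustration, and the loop in the CAM model stands in for comparisons that real hardware performs in parallel.

```python
# Hypothetical sizes, chosen only for illustration.
NUM_LOGICAL = 8
NUM_PHYSICAL = 16


class RamRenameTable:
    """RAM-style map table: one entry per logical register, read by index."""

    def __init__(self):
        # Assumption: logical register i initially maps to physical register i.
        self.table = list(range(NUM_LOGICAL))

    def lookup(self, logical):
        return self.table[logical]  # a single indexed read

    def rename(self, logical, physical):
        self.table[logical] = physical


class CamRenameTable:
    """CAM-style map table: one entry per physical register, searched by content."""

    def __init__(self):
        self.logical = [None] * NUM_PHYSICAL
        self.valid = [False] * NUM_PHYSICAL
        for i in range(NUM_LOGICAL):
            self.logical[i] = i
            self.valid[i] = True

    def lookup(self, logical):
        # Models the associative search: the key is compared with every entry.
        for physical in range(NUM_PHYSICAL):
            if self.valid[physical] and self.logical[physical] == logical:
                return physical
        raise KeyError(logical)

    def rename(self, logical, physical):
        old = self.lookup(logical)
        self.valid[old] = False  # the previous mapping is no longer current
        self.logical[physical] = logical
        self.valid[physical] = True
```

In this sketch both tables return the same mapping after a rename; they differ only in how many entries must be examined to find it.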

The RAM is directly indexed by the source register, whereas in the CAM this register is compared against all current mappings. Content-addressable memory is an application-specific memory that allows its entire contents to be searched within a single clock cycle. Binary CAM performs exact-match searches, while the more powerful ternary CAM (TCAM) allows pattern matching through the use of "don't cares". Don't cares act as wildcards during a search and are particularly attractive for implementing longest-prefix-match searches in routing tables. Dynamic storage of ternary data requires a refresh operation and an embedded DRAM process, while static storage of ternary data requires considerable layout area.
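As a rough software illustration of ternary matching, the sketch below represents each stored word as a string over '0', '1', and 'x', where 'x' is a don't care. The function names and the example routing entries are hypothetical; a real TCAM evaluates all rows at once and resolves multiple matches with a priority encoder rather than a loop.

```python
def ternary_match(stored, key):
    """Return True if the binary key matches the stored ternary pattern."""
    return len(stored) == len(key) and all(
        s == "x" or s == k for s, k in zip(stored, key)
    )


def longest_prefix_match(entries, key):
    """Return the value of the matching entry with the most specified bits.

    entries is a list of (pattern, value) pairs; returns None on no match.
    """
    best_value = None
    best_length = -1
    for pattern, value in entries:
        if ternary_match(pattern, key):
            prefix_length = len(pattern) - pattern.count("x")
            if prefix_length > best_length:
                best_value, best_length = value, prefix_length
    return best_value


# Hypothetical 8-bit routing table.
ROUTES = [
    ("10xxxxxx", "port A"),
    ("1011xxxx", "port B"),
    ("xxxxxxxx", "default"),
]
```

With these entries, a key beginning with 1011 matches all three patterns, and the most specific one is selected.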

Because DRAM-based main memory significantly increases the power and cost budget of a computer system, new memory technologies such as phase-change RAM (PRAM), ferroelectric RAM (FRAM), and magnetic RAM (MRAM) have been proposed to replace DRAM. Among these, PRAM is the most promising candidate for large-scale main memory because of its high density and low power consumption. In previous research, a hybrid main memory of DRAM and PRAM has been adopted to make up for the latency and endurance limits of PRAM. On the other hand, a large amount of main memory is used as page cache to hide disk access latency. Many page caching algorithms, such as LRU, LIRS, and CLOCK-Pro, have been developed and show good performance, but they only consider main memory with uniform access latency and unlimited endurance. They cannot be directly applied to a hybrid main memory architecture with PRAM.

In this paper, we propose a new page caching algorithm for hybrid main memory. It is designed to overcome the long latency and low endurance of PRAM. On the basis of the LRU replacement algorithm, we propose page monitoring and migration schemes that keep read-bound pages in PRAM.

MEMORY MANAGEMENT

Memory management is the act of managing computer memory. Its essential requirement is to provide ways to dynamically allocate portions of memory to programs at their request and to free them for reuse when they are no longer needed. This is critical to the computer system.

A page, or virtual page, is a fixed-length contiguous block of virtual memory. It is the smallest unit of data for memory allocation performed by the operating system on behalf of a program, and for transfer between main memory and any auxiliary store, such as a hard disk drive.

Virtual memory allows a page that does not currently reside in main memory to be addressed and used. If a program tries to access a location in such a page, an exception called a page fault is generated. The hardware or operating system is notified and loads the required page from the auxiliary store (hard disk) automatically.

A program addressing the memory has no knowledge of a page fault or a process following it. Thus a program can address more (virtual) RAM than physically exists in the computer. Virtual memory is a scheme that gives users the illusion of working with a large block of contiguous memory space when in actuality most of their work is on auxiliary storage (disk). Fixed-size blocks (pages) or variable-size blocks of the job are read into main memory as needed.
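The page-fault mechanism can be sketched in a few lines of Python. This is a toy model under stated assumptions, not an operating-system implementation: the class name, the 4096-byte page size, the dictionary used as the auxiliary store, and the first-in-first-out eviction are all illustrative choices, and writes are not modelled.

```python
PAGE_SIZE = 4096  # hypothetical page size in bytes


class DemandPagedMemory:
    """Toy model: pages are loaded from a backing store on first access."""

    def __init__(self, num_frames, backing_store):
        self.num_frames = num_frames        # physical frames available
        self.backing_store = backing_store  # page number -> bytes (the "disk")
        self.resident = {}                  # page number -> bytes in main memory
        self.page_faults = 0

    def read(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.resident:
            # Page fault: the caller never sees this, it only gets the data.
            self.page_faults += 1
            if len(self.resident) >= self.num_frames:
                # Assumption: evict the page that was loaded earliest (FIFO).
                oldest = next(iter(self.resident))
                del self.resident[oldest]
            self.resident[page] = self.backing_store[page]
        return self.resident[page][offset]
```

A program using `read` can address every page in the backing store even when `num_frames` is much smaller, which is the illusion of a large contiguous memory described above.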

This paper features: 1) a compact TCAM cell based on a novel 4T static storage cell and 2) a match sensing scheme that increases speed and reduces power consumption by limiting the voltage swing of the match-lines (MLs) and minimizing the switching activity of the search-lines (SLs).

One method for avoiding soft errors in CAMs is to use DRAM cells, which have high soft-error immunity, as the storage instead of SRAM cells. Using DRAM cells, however, results in increased design complexity and fabrication costs. Another method for reducing the soft-error rate (SER) in a CAM implements an embedded DRAM (e-DRAM) block alongside an SRAM-based CAM; the e-DRAM block, which includes ECC circuitry, is used to continuously write correct data into the CAM. Thus, in the worst case, any soft error in the CAM is overwritten.

  1. EXISTING APPROACHES

    The existing work reduces overall power consumption and increases search speed by using a power-gated match-line (ML) sense amplifier. The sense amplifier reduces the power consumption of the content-addressable memory (CAM). The existing system uses the following steps to determine the power level on each match line during the compare operation: initialize memory, extract counting of 1s, extract parity bit, detect ML sensing line, and check power level.

    Fig. 2 contains two separate metal rails, VDDML and VDDL. The row-based match-line sense amplifier consists of a delay loop and a gated power transistor. The power-gated transistor, denoted Px, turns itself off after the comparison result is obtained.

    The comparison unit consists of transistors M1 to M4, and the SRAM unit consists of cross-coupled inverters. The content-addressable memory and the priority encoder are powered by VDDML and VDDL, respectively.

    In the initialize memory stage (Fig. 3), the CAM memory architecture is set up first; it provides the condition under which the ML sensing line detects the power level. Here the complete content information for the data addresses is supplied.

    Fig. 2. CAM architecture

    In Fig. 2(a), two types of units are present: the comparison unit and the SRAM unit.

    Fig. 2(a). Power control unit for four rows

    Fig. 3. Block diagram of CAM

    In psychology, memory is the process by which information is encoded, stored, and retrieved. Encoding allows information from the outside world to reach our senses in the form of chemical and physical stimuli. In this first stage, we must change the information so that we may put the memory into the encoding process. Storage is the second memory stage or process; it entails maintaining information over periods of time. Finally, the third process is the retrieval of information that we have stored: we must locate it and return it to our consciousness. Some retrieval attempts may be effortless due to the type of information.

    In the extract counting of 1s stage, the number of 1s in each stored word is counted and kept in the counting bits segment. The search input data undergo the same process. The counts are compared first, and the search is then carried out for the words with a matching number of ones.
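    The counting stage can be sketched in software as a pre-filter that is applied before the full word comparison. The class and method names below are hypothetical, and the sketch only models the search logic, not the match-line circuitry; the returned `full_compares` value is included to show how many rows survive the filter.

    ```python
    def ones_count(word):
        """Number of 1 bits in a non-negative integer word."""
        return bin(word).count("1")


    class OnesCountCam:
        """Toy CAM model that stores a ones count next to every word."""

        def __init__(self, words):
            self.words = list(words)
            self.counts = [ones_count(w) for w in self.words]  # counting bits segment

        def search(self, key):
            """Return (matching row numbers, number of full word comparisons)."""
            key_count = ones_count(key)
            matches = []
            full_compares = 0
            for row, (word, count) in enumerate(zip(self.words, self.counts)):
                if count != key_count:
                    continue  # filtered out by the count, no full comparison
                full_compares += 1
                if word == key:
                    matches.append(row)
            return matches, full_compares
    ```

    Rows whose count differs from that of the search key can never match, so skipping them does not change the result of the search.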

    In the extract parity bit stage, the parity bit of each stored word is extracted and kept in the parity bit segment. Comparing and checking are done in the same way as for the counting bits. Parity checking, here applied to the content-addressable memory (CAM), is the storing of a redundant parity bit representing the parity (odd or even) of a small amount of computer data (typically one byte) stored in random access memory, and the subsequent comparison of the stored and the computed parity to detect whether a data error has occurred. The parity bit was originally stored in additional individual memory chips; with the introduction of plug-in modules such as DIMMs and SIMMs, memory became available in non-parity and parity versions, the latter with an extra bit per byte, storing 9 bits for every 8 bits of actual data.
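    A minimal sketch of parity generation and checking is given below. It assumes even parity (the stored bit makes the total number of ones even); the function names are hypothetical and the choice of even rather than odd parity is an assumption, since the text allows either.

    ```python
    def parity_bit(word):
        """Even-parity bit: 1 when the word holds an odd number of ones."""
        return bin(word).count("1") & 1


    def store_with_parity(word):
        """Return the word together with its parity bit, as it would be stored."""
        return word, parity_bit(word)


    def parity_ok(word, stored_parity):
        """Recompute the parity on read and compare it with the stored bit."""
        return parity_bit(word) == stored_parity
    ```

    A single flipped bit changes the recomputed parity and is detected, whereas two flipped bits cancel out and pass the check, which is the usual limitation of a single parity bit.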

    In the detect match-line (ML) sensing line stage, the matching line in the register is extracted and control is applied to each ML sensing line. Each ML sensing line has a control block at its output, which compares the power level and issues a command to the power supply.

    At the end of each ML there is a control block containing a CMOS arrangement. This block checks the power level of the ML by connecting the ML to a gate terminal. If the power level exceeds the threshold level, the block triggers the power supply to turn off. In this case, the number of ones is calculated and collected from the data array. These counts are stored in another array, and the compare operation is then performed until the output data are obtained. This operation reduces the size of the data array, and therefore the power and delay. The power is reduced by using one logic circuit.

  2. PROPOSED SYSTEM

    In our proposed technique, we implement a new form of CAM architecture to reduce the existing drawbacks. When searching for data in the CAM, we first check the parity bit of the input data and of the content in memory. We then count the number of ones in the parity bit data to form an index for fetching data. This index is then passed to the RAM to rename or update the data. With this architecture, the fetched data are correct, and the new CAM architecture speeds up data fetching.

    RAM is an acronym for random access memory, a type of computer memory that can be accessed randomly; that is, any byte of memory can be accessed without touching the preceding bytes. RAM is the most common type of memory found in computers and other devices, such as printers. In common usage, the term RAM is synonymous with main memory, the memory available to programs. For example, a computer with 8 MB of RAM has approximately 8 million bytes of memory that programs can use. In contrast, read-only memory (ROM) refers to special memory used to store programs that boot the computer and perform diagnostics. Most personal computers have a small amount of ROM (a few thousand bytes). In fact, both types of memory (ROM and RAM) allow random access. To be precise, therefore, RAM should be referred to as read/write RAM and ROM as read-only RAM.

    The following modules are used to evaluate the performance of the processor: initialize memory, extract counting of 1s, extract parity bit, detect ML sensing line, and update page. The update page module updates the searched address and page information in the RAM. In Fig. 4, the input data are passed to the address calculation, which counts the number of ones in the input.

    The address of the data is stored in the index. The decoder searches the table on the basis of the index values, and the data array stores the data retrieved by the decoder. Because the RAM and the CAM use different addressing, page mapping is used to create the addresses for the RAM.

    If the size of the data array is reduced, the size of the CAM array decreases accordingly. The search time for the input data is therefore reduced, that is, the search speed of the CAM increases, and the search operation also consumes less power. The proposed system offers high search speed and low power for computer networks, routers, and data compressors. The main advantages of this architecture are high speed, low power, and area efficiency.

    Fig. 4. Flow diagram

    Fig. 4 shows the flow diagram of the CAM-RAM architecture for increasing processor performance.
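    The flow can be approximated in software as follows. This is a sketch under several assumptions rather than the implemented design: the class and method names are hypothetical, the index is taken to be the number of ones in the input word, the page mapping is modelled as a per-index list of (word, RAM address) pairs, and the parity check and match-line sensing steps are omitted.

    ```python
    from collections import defaultdict


    class HybridCamRam:
        """Toy model: a ones-count index selects a small CAM segment,
        and a page mapping translates the matching row to a RAM address."""

        def __init__(self):
            self.segments = defaultdict(list)  # index -> [(word, ram_address)]
            self.ram = []                      # RAM data array

        @staticmethod
        def index_of(word):
            # Address calculation: count the ones in the input word.
            return bin(word).count("1")

        def _find(self, word):
            # Only the segment selected by the index is searched.
            for stored, address in self.segments.get(self.index_of(word), []):
                if stored == word:
                    return address
            return None

        def update_page(self, word, data):
            """Insert a new entry, or update the RAM contents of an existing one."""
            address = self._find(word)
            if address is None:
                address = len(self.ram)
                self.ram.append(data)
                self.segments[self.index_of(word)].append((word, address))
            else:
                self.ram[address] = data

        def search(self, word):
            """Return the RAM data mapped to the word, or None if absent."""
            address = self._find(word)
            return None if address is None else self.ram[address]
    ```

    Because each search touches only one segment, the number of word comparisons depends on the size of that segment rather than on the size of the whole table.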

  3. RESULTS AND COMPARISON

Fig. 5. On-chip power by function

Fig. 6. On-chip power over Vccint

Fig. 7. Static current by supply

The CAM-RAM architecture has a higher search speed than the existing system. It is used to increase processor performance in terms of low power, high speed, and area.

Fig. 8. On-chip typical vs. max. power

Fig. 9. On-chip power

Figs. 6 to 9 show the performance of the proposed system. The proposed system has a power consumption of 42.38 and also reduces the size of the processor.

III. CONCLUSION

In this project, we implemented a hybrid combination of RAM and CAM architectures. In the proposed method, we changed the CAM architecture to speed up the process. Experimental results show better performance than the traditional CAM architecture. We presented a renaming mechanism consisting of a RAM table and a low-complexity CAM table, as a hybrid design that took the best of both approaches. Experimental results showed that a two-way hybrid approach achieved small performance slowdowns (about 2% and 1% for integer and floating-point benchmarks, respectively) with respect to a four-way CAM-based renaming mechanism that was able to recover in one clock cycle. These small slowdowns were accompanied by a drastic reduction of the original associative searches carried out in the CAM-based approach to only 8% and 3%. Hybrid designs also reduced the dynamic energy by 16% and 12% with respect to the original CAM consumption, closing the dynamic energy consumption gap between CAM and RAM approaches.

ACKNOWLEDGMENT

The authors would like to thank the staff and their parents for their support in the completion of this project.


V. Mahalakshmi received the Bachelor's degree in Electronics and Communication Engineering from St. Xavier's Catholic College of Engineering, India, and is currently pursuing her Master's degree in Electronics and Communication Engineering at VPMM Engineering College for Women, India. She is interested in VLSI design.

D. Ajeetha Cherin received the Bachelor's degree in Electronics and Communication Engineering from Kamaraj College of Engineering and Technology, India, and is currently pursuing her Master's degree in Electronics and Communication Engineering at VPMM Engineering College for Women, India.

S. Sneha received the Bachelor's degree in Electronics and Communication Engineering from VPMM Engineering College for Women, India, and received her Master's degree from Kalasalingam University, India. She is currently an Assistant Professor with the Department of Electronics and Communication Engineering, VPMM Engineering College for Women, India.
