# Networking Application of CAM with Improved Process Variation Tolerance Using Paritybits and MLSA

# <sup>1</sup>T. Leena and <sup>2</sup>AsstProff. Mr. P. Nagarajan

<sup>1</sup>PG Scholar Department Of Electronics And Communication Engineering Affiliated To Anna University <sup>2</sup>Research Scholar Department Of Electronics And Communication Engineering Affiliated To Anna University

Abstract: A content-addressable memory (CAM) provides a high-speed search function in a single clock cycle. CAMs can accelerate applications ranging from local area networks and database management to file-storage management, pattern recognition, and artificial intelligence. Networking applications require low power and high speed. In this work we propose a high-speed, low-power CAM for data compression. CAMs can be used in data compression techniques as a translation table. Since a good portion of a compression algorithm's time is spent searching and maintaining these data structures, replacing them with a hardware search engine can greatly increase the throughput of the algorithm. A parity bit and an effective gated-power technique allocate less power to match decisions involving a larger number of mismatched bits. This scheme results in a significant CAM power reduction and provides a performance advantage over other memory search algorithms. It reduces the peak current and the average power consumption (by 36%), boosts the search speed (by 39%), and improves process variation tolerance.

*Index Terms:* Data compression, content addressable memory (CAM), match-line, low power design, memory architecture, leakage currents.

#### I. Introduction

Unlike standard computer memory (random access memory or RAM), in which the user supplies a memory address and the RAM returns the data word stored at that address, a CAM is designed such that the user supplies a data word and the CAM searches its entire memory to see if that data word is stored anywhere in it. If the data word is found, the CAM returns a list of one or more storage addresses where the word was found (and in some architectures, it also returns the data word, or other associated pieces of data). Thus, a CAM is the hardware embodiment of what in software terms would be called an associative array. The data word recognition unit was proposed by Dudley Allen Buck in 1955. A major interface definition for CAMs and other Network Search Elements (NSEs) was specified in an Interoperability Agreement called the Look-Aside Interface (LA-1 and LA-1B), developed by the Network Processing Forum, which later merged with the Optical Internetworking Forum (OIF). Numerous devices conforming to the LA interface agreement have been produced by Integrated Device Technology, Cypress Semiconductor, IBM, Broadcom and others. On December 11, 2007, the OIF published the serial lookaside (SLA) interface agreement. Because CAM is designed to search its entire memory in a single operation, it is much faster than RAM in virtually all search applications. There are cost disadvantages to CAM, however.

CAM requires additional circuitry compared with RAM. This additional circuitry increases the physical size of the CAM chip, which increases manufacturing cost. It also increases power dissipation, since every comparison circuit is active on every clock cycle. Consequently, CAM is only used in specialized applications where the required search speed cannot be achieved using a less costly method.

One successful early implementation was a General Purpose Associative Processor IC and System. A CAM cell serves two basic functions: bit storage (as in RAM) and bit comparison (unique to CAM). Three types of core cells are used: the NAND cell, the NOR cell, and the hybrid cell [2].

#### II. CAM Basics

We now take a more detailed look at CAM architecture. A small model is shown in Fig. 1. It consists of core cells, searchlines, matchlines, matchline sense amplifiers (MLSAs), and an encoder. The search data register holds the input word from the user, which is then compared with the contents of the memory bank through the searchlines. The matchline sense amplifiers sense the voltage variations on the matchlines. The encoder identifies the location of the matching entry [2].



Fig.1 Basic CAM Structure
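The search path described above (search data register, one matchline per stored word, encoder) can be mimicked in software. The following is only a minimal behavioral sketch: the class name `SimpleCAM`, its methods, and the choice of a lowest-address priority encoder are illustrative assumptions, and the loop is sequential whereas the hardware compares all words in parallel.

```python
from typing import List, Optional


class SimpleCAM:
    """Behavioral stand-in for a binary CAM (hypothetical, for illustration)."""

    def __init__(self, words: List[int]):
        # Each list index plays the role of a storage address (one row).
        self.words = list(words)

    def matchlines(self, search_word: int) -> List[bool]:
        # One matchline per row: True when the stored word equals the search word.
        return [stored == search_word for stored in self.words]

    def search(self, search_word: int) -> Optional[int]:
        # Encoder stage (assumed priority encoder): lowest matching address.
        for addr, hit in enumerate(self.matchlines(search_word)):
            if hit:
                return addr
        return None


cam = SimpleCAM([0b1010, 0b0110, 0b1111, 0b0110])
print(cam.matchlines(0b0110))
print(cam.search(0b0110))
```

The list returned by `matchlines` corresponds to the raw matchline outputs, while `search` reduces them to a single address in the way an encoder would.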

#### III. Search Speed Boosting

An auxiliary bit is used to boost the search speed of the CAM. The parity-bit-based CAM design is shown in Fig. 2(b); it consists of the original data segment and an extra one-bit segment derived from the actual data bits. In the conventional CAM, additional bits are used to filter out some mismatched CAM words before the actual comparison. These additional bits are derived from the data bits. In Fig. 2(a), the number of '1's in each stored word is counted and kept in the count-bit segment [1].



Fig.2. Conceptual view of (a) conventional CAM, (b) parity-bit-based CAM

When a search operation starts, the number of "1"s in the search word is counted and stored. These additional bits are compared first, and only those words that have the same number of "1"s (e.g., the second and the fourth) are turned on in the second sensing stage for further comparison. Deriving the additional bits in this scheme reduces the search speed. To increase the search speed of the CAM, parity bits are used instead. In Fig. 2(b), the word consists of the original data segment and an extra one-bit segment derived from the actual data bits. In this scheme the odd and even parity are calculated. These parity bits reduce the sensing delay and boost the search speed.
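The two auxiliary-bit derivations can be compared with a short sketch. All function names and the 4-bit example words below are illustrative assumptions, not taken from the design. The sketch shows the ones-count filter of the conventional scheme, and the arithmetic property of a parity bit that a word differing from the search word in exactly one data bit also differs in the parity bit. How the circuit exploits that extra mismatch is not modeled here.

```python
from typing import List


def ones_count(word: int) -> int:
    # Number of '1' bits in the word (the precomputed count segment).
    return bin(word).count("1")


def parity_bit(word: int) -> int:
    # Even-parity bit: 1 when the word holds an odd number of '1's.
    return ones_count(word) & 1


def precompute_filter(stored: List[int], search_word: int) -> List[int]:
    # Conventional scheme: only rows with the same ones count go on
    # to the second sensing stage.
    target = ones_count(search_word)
    return [addr for addr, w in enumerate(stored) if ones_count(w) == target]


def mismatches_with_parity(stored_word: int, search_word: int) -> int:
    # Mismatching bit positions in the data segment plus the parity segment.
    data_mismatches = ones_count(stored_word ^ search_word)
    parity_mismatch = parity_bit(stored_word) ^ parity_bit(search_word)
    return data_mismatches + parity_mismatch


stored = [0b1010, 0b0110, 0b1111, 0b0011]
print(precompute_filter(stored, 0b0110))
print(mismatches_with_parity(0b0110, 0b0111))
```

With the parity segment included, a single data-bit difference is seen as two mismatching bits, whereas an exact match still produces none.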

#### IV. Effective Gated Power Technique

The CAM architecture with an effective gated-power technique is depicted in Fig. 3. It is organised into words (rows) and bits (columns). It uses a P-type NOR CAM cell and an ML structure. Transistors M1-M4 act as the comparison unit, and the cross-coupled inverters act as the SRAM storage. These are powered by two separate metal rails, namely $V_{DDML}$ and $V_{DD}$. The $V_{DDML}$ rail is controlled by a power transistor $P_x$. Leakage current is one of the sources of power dissipation in low-power VLSI design, and the charging and discharging of the matchline makes the leakage stronger.



Fig.3. CAM architecture with an effective gated power technique.

As shown in Fig. 3, the gated power transistor is controlled by a feedback loop that automatically turns off the power supply when the voltage on the matchline reaches a certain threshold level. With the introduction of the power transistor $P_x$, the driving strength in the 1-mismatch case is about 10% weaker than that of the conventional design, and sensing is thus slower. However, when this sense amplifier is combined with the parity bit scheme, the overall search delay is improved by 39% [1]. This CAM architecture therefore offers both low-power and high-speed operation, and is used here for networking applications.
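The feedback behavior can be pictured with a very coarse discrete-time sketch. It rests on several assumptions that are not stated in the text: the ML starts at 0 V, each mismatched cell adds a fixed voltage step per time step, and the feedback loop cuts the supply as soon as the ML reaches the threshold. The function name and every numeric value are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def sense_matchline(mismatches: int,
                    vdd_mv: int = 1000,
                    vth_mv: int = 500,
                    step_mv: int = 50,
                    max_steps: int = 100) -> Tuple[int, int, bool]:
    """Return (steps elapsed, final ML voltage in mV, mismatch detected).

    Toy model: all values are illustrative assumptions, not circuit data.
    """
    ml_mv = 0  # assumed: ML initialised to ground before the compare phase
    for step in range(1, max_steps + 1):
        # Assumed: every mismatched cell contributes one step of charge.
        ml_mv = min(vdd_mv, ml_mv + step_mv * mismatches)
        if ml_mv >= vth_mv:
            # Feedback loop: threshold reached, gated supply is turned off.
            return step, ml_mv, True
    # ML never reached the threshold: the word is treated as a match.
    return max_steps, ml_mv, False


for n in (0, 1, 2, 4):
    print(n, sense_matchline(n))
```

In this toy model a row with more mismatches reaches the threshold sooner, so its supply is cut earlier, and the 1-mismatch row is the slowest mismatch to resolve.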

#### V. Data Compression

Data compression removes redundancy that resides in a given piece of information, producing an equivalent but shorter message. CAM is well suited for data compression because the movement of packets through local- or wide-area networks requires some form of address translation. Since a good portion of a compression algorithm's time is spent searching and maintaining these data structures, replacing them with a hardware search engine can greatly increase the throughput of the algorithm. In a data compression application, a CAM lookup is performed after each word of the original data is presented (Figure 4). If the code corresponding to the word bit pattern in the input register is found, then the appropriate symbol or token is output and the input register is flushed. If the code is not found in the CAM, then another word is shifted in. A CAM will generate a result in a single transaction regardless of table size or length of search list. This makes CAM an ideal candidate for data compression schemes that use sparsely populated tables as part of their algorithm.



Figure 4: CAM in Data Compression is Used to Replace Common Sequences with Tokens
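The lookup loop described above can be sketched as follows, with a Python dictionary standing in for the CAM translation table. The function name `compress`, the `max_len` bound on the input register, and the fallback of emitting a literal symbol when no code is found within that bound are assumptions added to make the sketch terminate; the text does not specify them.

```python
from typing import Dict, List, Union


def compress(data: str, table: Dict[str, int], max_len: int) -> List[Union[int, str]]:
    """Replace sequences found in the table with tokens (illustrative sketch)."""
    out: List[Union[int, str]] = []
    i = 0
    while i < len(data):
        register = ""  # input register, empty after each flush
        hit = False
        for j in range(i, min(i + max_len, len(data))):
            register += data[j]          # shift another word in
            if register in table:        # lookup (dict stands in for the CAM)
                out.append(table[register])
                i = j + 1                # flush the register
                hit = True
                break
        if not hit:
            # Assumed fallback: pass the oldest symbol through as a literal.
            out.append(data[i])
            i += 1
    return out


table = {"ab": 1, "cde": 2}
print(compress("abxcdeab", table, max_len=3))  # [1, 'x', 2, 1]
```

Each `register in table` test corresponds to one CAM transaction, which in hardware takes the same time regardless of how many entries the table holds.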

#### VI. Results And Discussion

In this section, the performance of the proposed design is evaluated using the conventional circuit and those in [5], [6] as references. In [5], the power consumption is limited by the amount of charge injected into the ML at the beginning of the search.



Fig.5. One-mismatch ML waveforms of the original and proposed architectures with parity bit during the search operation.

In [6], a similar concept is utilized with a positive feedback loop to boost the sensing speed. Both designs are very power efficient. As will be shown later, the proposed design consumes slightly more power than [5] and [6] but is more robust against PVT variations.

We investigate the ability of the four designs to work at low supply voltage by re-implementing the designs in [5], [6] and the conventional one in the same 65-nm technology. Designs in [5] and [6] demonstrate poor adaptability to voltage scaling.



Fig.6. Simulated transient current on a row of 128 CAM cells during the compare cycle of the conventional CAM



Fig.7. Simulated transient current on a row of 128 CAM cells during the compare cycle of the proposed CAM

In contrast, when the supply voltage scales to 0.5 V, both the proposed and the conventional design work well. First, the search energy of the four designs under consideration is presented in Fig. 6. It can be seen that at 1 V supply voltage, [5] and [6] have the lowest energy consumption per search, followed by the proposed design. However, they cease to work when the supply voltage scales down below 0.9 V.

Between the conventional and the proposed design, the proposed design consumes 62% less power at any supply voltage value.

Second, the sensing delay comparison is shown in Fig. 9 where the proposed design has 39% improvement when compared to the conventional design and is the fastest design.



Fig.8. ML sensing delay of the four designs in consideration against temperature variations. Sensing delay is defined as the sensing delay of the 1-mismatch ML, i.e., the worst-case scenario.

This figure also suggests that sensing delay increases dramatically when the supply voltage enters the near-subthreshold region. Finally, the corresponding leakage currents of the four designs against voltage scaling are shown in Fig. 10. The proposed design is the second-best circuit after the conventional design. Both of them have about 20% and 37% lower leakage current when compared to [5] and [6] at 1 V, respectively. This feature confirms that the proposed design is more suitable for ultra-low power applications in 65-nm CMOS process and beyond.

## **Dynamic Power Consumption**

Because the power-gating transistor is turned off after the output is obtained at the sense amplifier, the proposed technique yields a lower average power consumption. This is mainly due to the reduced voltage swing on the ML bus. Another contributing factor is that the new design does not need to precharge the SL buses, because the EN signal turns off transistor $P_x$ of each row; this in turn saves 50% of the power on the SL buses. Fig. 8 illustrates the average energy consumption (divided into ML power and SL power) of the proposed design as compared to the other three benchmark designs, including all the power overhead of the control circuitry.



Fig.9. Standby leakage current of the existing and proposed designs under consideration against supply voltage scaling from 1 V to 0.5 V.

Since [5], [6], and the proposed design do not pre-charge the SLs before each compare cycle, their SL energy consumption is only half of that of the conventional circuit. As for the ML energy, at 1 V supply voltage the proposed design dissipates only 0.41 fJ/search/bit, while the conventional design dissipates 1.148 fJ/search/bit. Our ML energy consumption is higher than that of [5] (by 10.8%) and [6] (by 32%) but, as will be shown below, our proposed design is much more robust against process and environment variations.
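As a quick check on the two ML energy figures quoted above, the relative ML energy saving at 1 V can be worked out directly. The symbols $E_{\text{conv}}$ and $E_{\text{prop}}$ are labels introduced here for the conventional and proposed ML energies. The result covers ML energy only, so it need not coincide with the overall power reduction quoted elsewhere in this section.

$$
\frac{E_{\text{conv}} - E_{\text{prop}}}{E_{\text{conv}}} = \frac{1.148 - 0.41}{1.148} = \frac{0.738}{1.148} \approx 0.643
$$

That is, the proposed design uses roughly 64% less ML energy per search per bit than the conventional design at 1 V.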



Fig.10.Simulation result for the conventional precomputation CAM

In the conventional precomputation CAM, the number of ones present on the data lines is counted and then used in the further comparison stage. The power consumption is very high and the search speed is very low.



Fig.11.Simulation result for the parity bit based CAM

Here the odd parity and even parity are calculated and then used in the further comparison stage. The power consumption is 36% lower than that of the conventional precomputation CAM, and the search speed is 39% higher.

#### VII. Conclusion

We proposed a low-power, high-speed CAM for a networking application, data compression. Nowadays data compression has achieved worldwide acceptance. Data compression techniques can be effectively applied to diverse data types, including written natural language text, computer source and object code, bit-maps, numerical data, graphics, CAD data, map and terrain data, speech, music, scientific and instrument data, fax and half-tone data, gray-scale and color images, medical data and imagery, video, animation, and space data. A low-power CAM can act as an address translator in a data compression technique. In this work, calculating odd and even parity boosts the search speed. An effective gated power transistor automatically turns off the power supply and thus saves power. The design provides several major advantages, namely reduced peak current (and thus IR drop), reduced average power consumption (36%), boosted search speed (39%), and improved process variation tolerance.

## References

- [1]. Anh-Tuan Do, Shoushun Chen, Zhi-Hui Kong, and Kiat Seng Yeo, "A High Speed Low Power CAM With a Parity Bit and Power-Gated ML Sensing."
- [2]. Arsovski and A. Sheikholeslami, "A mismatch-dependent power allocation technique for match-line sensing in content-addressable memories," IEEE J. Solid-State Circuits, vol. 38, no. 11, pp. 1958–1966, Nov. 2003.
- [3]. S. Baeg, "Low-power ternary content-addressable memory design using a segmented match line," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 6, pp. 1485–1494, Jul. 2008.
- [4]. A. T. Do, S. S. Chen, Z. H. Kong, and K. S. Yeo, "A low-power CAM with efficient power and delay trade-off," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2011, pp. 2573–2576.


- [5]. N. Mohan and M. Sachdev, "Low-leakage storage cells for ternary content addressable memories," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 5, pp. 604-612, May 2009.
- [6]. N. Mohan, W. Fung, D. Wright, and M. Sachdev, "A low-power ternary CAM with positive-feedback match-line sense amplifiers," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 3, pp. 566–573, Mar. 2009

- [7]. K. Pagiamtzis and A. Sheikholeslami, "Content addressable memory (CAM) circuits and architectures: A tutorial and survey," IEEE J. Solid-State Circuits, vol. 41, no. 3, pp. 712–727, Mar. 2006.
- [8]. K. Pagiamtzis and A. Sheikholeslami, "A low-power content addressable memory (CAM) using pipelined hierarchical search scheme," IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1512–1519, Sep. 2004.
- [9]. Tyshchenko and A. Sheikholeslami, "Match sensing using match-line stability in content addressable memories (CAM)," IEEE J. Solid-State Circuits, vol. 43, no. 9, pp. 1972-1981, Sep. 2008.
- [10]. Midas Peng and Sherri Azgomi, "Content-addressable memory (CAM) and its network applications," Altera International Ltd.