

Journal of Engineering Science and Technology Review 9 (5) (2016) 145 - 149

Research Article

www.jestr.org

# Design and Simulation of 6T SRAM Cell Architectures in 32nm Technology

### G. Apostolidis, D. Balobas and N. Konofaos\*

Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece

Received 30 June 2015; Accepted 25 January 2016

### Abstract

A comparative study of various 6T SRAM cell layouts at the 32 nm node is presented, covering four symmetric topologies. The comparison comprises two conventional cells, a thin cell, which is the current industry standard, and a recently proposed ultra-thin cell. The evaluation is based on area efficiency, power dissipation and read/write delay, all of which are calculated with BSIM4 level simulations. The thin cell appears to be the best topology in both power/delay performance and area.

Keywords: 6T SRAM cell, memory array, 32 nm, layout design, power dissipation, area, read/write delay.

#### 1. Introduction

SRAMs are widely used as cache memories in microprocessors because of their high-speed operation and low power dissipation. The standard architecture of the 6T (six-transistor) SRAM cell continues to play a major role in nearly all VLSI systems due to its short access times and full compatibility with logic process technology [1].

As technology scaling continues, the lithographic challenges in printing and controlling the dimensions within the same printed layer in orthogonal directions have become increasingly difficult. This has led to restrictions in layout orientation and shape for printed layers requiring tight dimensional control [2]. Modern trends for 32 nm CMOS technology and beyond pose serious challenges in circuit design and require memory cells to occupy the smallest possible area while they meet lithographic constraints for achieving lower power dissipation, good operating stability, shorter response times and high performance. The optimal 6T layout topology depends on optimizing these factors [3].

In this work we present a comparative study of four topologies, designed under the 32 nm rules. The layouts are presented with detailed information on transistor sizing and the implementation of the interconnections. Furthermore, simulations demonstrate the performance of each topology regarding area, power dissipation, and read/write delay.

#### 2. Standard 6T SRAM cell

The standard cell comprises six transistors, as shown in Figure 1. The memory cell consists of two nMOS access transistors (A1 and A2), located at the ends of the circuit, and a pair of cross-coupled inverters. The nMOS elements of the latch (D1 and D2) are the driver transistors, while the pMOS elements (P1 and P2) are the pull-up transistors. The access transistors turn on when the word line is raised for a read or write operation, connecting the cell to the bit lines (Bit line, ~Bit line).

\* E-mail address: nkonofao@csd.auth.gr

The cell has three different operation modes. In the standby state, the word line is not asserted, so the access transistors are turned off. The cell therefore cannot be accessed, and the two cross-coupled inverters keep reinforcing each other as long as they are connected to the supply, so the data are held in the latch. The read operation starts by precharging the bit lines high and then allowing them to float. Afterwards, the word line is asserted, turning on both access transistors. The data stored in the nodes are driven onto the bit lines. A voltage difference develops between the bit lines and a sense amplifier detects the value stored in the cell.



Fig. 1. The standard 6T SRAM cell

During the write operation, the bit lines are driven to complementary voltage levels and then the word line is raised. The data to be written into the cell are driven onto the bit lines and one of the storage nodes is discharged through its access transistor. The cross-coupled inverters then raise the voltage on the opposite storage node and latch the cell. Thus, the new data overpower the cross-coupled inverters. The central challenges in SRAM design are minimizing the cell size and ensuring that the circuitry holding the state is weak enough to be overpowered during a write cycle (writability), yet strong enough not to be disturbed during a read cycle (read stability) [4].
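
The three operation modes described above can be summarized with a purely logic-level sketch. It is an illustration only: the class name `Sram6TCell` and the helper `sense` are hypothetical, and the model ignores all analog behaviour (transistor strengths, bit line capacitance, timing) that the simulations in this paper actually capture.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Sram6TCell:
    """Logic-level model of a 6T cell (hypothetical helper, not from the paper)."""

    data: int = 0  # value on the Data node; ~Data is its complement

    def standby(self) -> int:
        # Word line low: access transistors are off and the latch holds its state.
        return self.data

    def read(self) -> tuple[int, int]:
        # Both bit lines start precharged high and floating. Once the word
        # line is raised, the storage node holding 0 pulls its bit line low.
        bit, bit_bar = 1, 1
        if self.data == 0:
            bit = 0
        else:
            bit_bar = 0
        return bit, bit_bar

    def write(self, bit: int, bit_bar: int) -> None:
        # The bit lines must carry complementary levels before the word line rises.
        if bit not in (0, 1) or bit_bar != 1 - bit:
            raise ValueError("bit lines must be complementary 0/1 levels")
        self.data = bit  # the new data overpower the latch


def sense(bit: int, bit_bar: int) -> int:
    # Idealized sense amplifier: resolves the bit line difference to a logic value.
    return 1 if bit > bit_bar else 0
```

For example, after `cell.write(1, 0)` the call `sense(*cell.read())` returns 1, and the stored value is unchanged by the read, which is the idealized form of read stability.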

ISSN: 1791-2377 © 2016 Eastern Macedonia and Thrace Institute of Technology. All rights reserved.

#### 3. Cell Categories

According to the categorization made by Ishida et al. [5], the 6T SRAM cells are divided into four types that result from the different placement of the two inverters constituting the core of the 6T cell. The first type consists of two sub-types, making a total of five basic cells. Among the conventional types 1-3, type 1b [6] presents characteristics suitable for deep nanoscaling, while type 2 [7] is the most popular design, having been widely used until the 90 nm generation. Due to the increasing lithography limitations of new technology nodes, the type 2 cell was replaced by the lithographically friendly type 4 cell [8], also known as the thin cell [9], which has been the industry standard since 65 nm [4]. The cell is long and skinny, reducing the critical bit line capacitance at the expense of longer word lines. Ishida's categorization has recently been expanded to include a type 5 category, introducing the type 5 ultra-thin cell [10], which, compared to the thin cell, is said to offer lower bit line capacitance, reduced metal complexity and a notchless design for improved resistance to alignment-induced device mismatch, thus adapting to the increasing scaling and lithographic restrictions.



Fig. 2. Summary of 6T SRAM cell layout topologies

The cell categories and corresponding types are described in Figure 2. The cells examined and compared in this work are: type 1b, type 2, type 4 and type 5.

#### 4. Design features of cell layouts

The layouts of the examined cell types were implemented using a standard 3-metal CMOS n-well process at the 32 nm technology node. To ensure both read stability and writability, the transistors must satisfy certain dimensional constraints. Additionally, in order to attain good layout density, the transistors must be as small as possible. In general, driver transistors must be stronger than access transistors (read stability) and access transistors must be stronger than pull-up transistors (writability) [11]. Hence, the W/L ratios that we set for all cells are the following: 6/2 for driver transistors, 4/2 for access transistors and 3/2 for pull-up transistors.
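
As a quick check of the sizing rule, the sketch below computes the two strength ratios implied by the stated W/L values. The names "cell ratio" and "pull-up ratio" follow common SRAM usage rather than this paper's wording, and the comparison is purely geometric: it does not account for the lower mobility of pMOS devices, which weakens the pull-up transistors further.

```python
from fractions import Fraction

# W/L ratios stated in the text (width and length in the same layout units).
DRIVER = Fraction(6, 2)
ACCESS = Fraction(4, 2)
PULL_UP = Fraction(3, 2)


def cell_ratio(driver: Fraction, access: Fraction) -> Fraction:
    # Driver strength relative to access strength; > 1 favours read stability.
    return driver / access


def pull_up_ratio(pull_up: Fraction, access: Fraction) -> Fraction:
    # Pull-up strength relative to access strength; < 1 favours writability.
    return pull_up / access


cr = cell_ratio(DRIVER, ACCESS)
pr = pull_up_ratio(PULL_UP, ACCESS)
print(f"cell ratio = {float(cr)}, pull-up ratio = {float(pr)}")
# prints "cell ratio = 1.5, pull-up ratio = 0.75"
```

Both conditions in the text hold for the chosen sizes: the drivers are 1.5 times as wide as the access transistors, and the pull-ups are 0.75 times as wide.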

For signal routing, three metal layers are used. The connections within the core (latch) of the cells are implemented with metal-1 wires and polysilicon gates, while input and output routing paths consist of metal-2 and metal-3 wires. The Data and ~Data nodes represent the cell outputs. In the type 1b cell, metal-1 wires are used for the supply voltage (Vdd) and ground (Vss), metal-2 wires for the bit lines and a metal-3 wire for the word line. In the type 2 cell, metal-1 wires are used for the supply voltage and ground, metal-2 wires for the bit lines, and the word line propagates through a long polysilicon line. In the type 4 cell, metal-1 wires are used for the ground, a metal-3 wire for the word line and metal-2 wires for the bit lines and supply voltage. In the type 5 cell, metal-2 wires are used for the ground, supply voltage and bit lines, while the word line is a metal-3 wire. Contacts connect the cells to the various metal layers, the n-wells and the p-substrate. The layouts of the cells are illustrated in Figures 3, 4, 5 and 6, respectively.



Fig. 4. Layout of type 2 cell



Fig. 5. Layout of type 4 thin-cell



Fig. 6. Layout of type 5 ultra-thin cell

#### 5. Array layout design and area efficiency

Continuous downscaling of CMOS technology intensifies the effort towards more compact structures and smaller circuit elements, in order to increase the capacity per unit of silicon area and to decrease the delay through shorter signal routes. To evaluate the cells presented above in memory arrays, we used each cell type to design a 4x4 (16-bit) SRAM array. The layouts of all arrays are shown in Figure 8. Every array is implemented with the maximum area efficiency that the corresponding cell can provide, given the design rules followed. Hence, some cells are flipped horizontally or vertically in order to partially merge and overlap with adjacent cells. This results in different cells sharing the same polysilicon, diffusion or n-well areas, as well as metal wires and contacts. Furthermore, n-well taps and substrate contacts may be shared among multiple cells for additional area efficiency. As shown in Figure 7, the type 4 array occupies the smallest area, 3.186 μm², which is 14.9%, 32.6% and 36% less than the type 2, type 5 and type 1b arrays, respectively. Table 1 summarizes the width, height, area and bit density of the examined SRAM arrays.

Table 1. Area and bit density of 16-bit SRAM arrays

| Arrays (16-bit) | Width (μm) | Height (μm) | Area (μm²) | Bit density (μm²/bit) |
|-----------------|------------|-------------|------------|-----------------------|
| Type 1b         | 2.340      | 2.130       | 4.984      | 0.312                 |
| Type 2          | 1.590      | 2.355       | 3.744      | 0.234                 |
| Type 4          | 2.655      | 1.200       | 3.186      | 0.199                 |
| Type 5          | 4.380      | 1.080       | 4.730      | 0.296                 |
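
The area figures can be reproduced from the array dimensions alone. The following sketch recomputes area, area per bit and the relative saving of the smallest array from the widths and heights in Table 1; the variable names are illustrative, and small differences in the last digit come from rounding.

```python
# Array width and height in micrometres, taken from Table 1.
ARRAYS = {
    "Type 1b": (2.340, 2.130),
    "Type 2": (1.590, 2.355),
    "Type 4": (2.655, 1.200),
    "Type 5": (4.380, 1.080),
}
BITS = 16  # each array is 4x4

# Area in square micrometres and area per stored bit.
area = {name: w * h for name, (w, h) in ARRAYS.items()}
area_per_bit = {name: a / BITS for name, a in area.items()}

# Percentage by which the smallest array undercuts each of the others.
smallest = min(area, key=area.get)
saving_pct = {
    name: 100.0 * (a - area[smallest]) / a
    for name, a in area.items()
    if name != smallest
}

for name in ARRAYS:
    print(f"{name}: {area[name]:.3f} um^2, {area_per_bit[name]:.3f} um^2/bit")
for name, pct in saving_pct.items():
    print(f"{smallest} is {pct:.1f}% smaller than {name}")
```

The computed savings are about 14.9% (type 2), 32.6% (type 5) and 36.1% (type 1b), in line with the values quoted in the text.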



Fig. 7. Area of 16-bit SRAM arrays

#### 6. Simulations and Results

All SRAM cells, as well as the 16-bit SRAM arrays, are simulated at five different operating voltages (0.8, 0.7, 0.6, 0.5 and 0.4 V), at room temperature (27 °C) and at a frequency of 1 GHz. For all designs and simulations, a BSIM4 level model for low-leakage nMOS and pMOS transistors at 32 nm is used.

#### 6.1. Write delay

Write delay is defined as the time interval between the assertion of the word line and the recording of the new data in the bit-cell nodes. Two cases need to be taken into account, and we calculate and present their average value: writing '1' when the bit-cell contains '0' and writing '0' when the bit-cell contains '1'. There is no delay when writing the value that the bit-cell already holds.
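
A minimal sketch of how such a delay could be extracted from simulated waveforms is given below. It assumes that both the word line assertion and the recording of the new data are measured at the 50% point of the supply voltage; the paper does not state its measurement threshold, so this choice, as well as the function names and the synthetic sample waveform, are assumptions made for illustration.

```python
def crossing_time(times, volts, level):
    """Return the first time the sampled waveform crosses `level`.

    Uses linear interpolation between adjacent samples.
    """
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 == level:
            return times[i]
        if (v0 - level) * (v1 - level) < 0:
            frac = (level - v0) / (v1 - v0)
            return times[i] + frac * (times[i + 1] - times[i])
    if volts[-1] == level:
        return times[-1]
    raise ValueError("waveform never crosses the requested level")


def write_delay(times, word_line, node, vdd):
    # Assumption: both events are taken at Vdd/2 (not specified in the paper).
    half = vdd / 2
    return crossing_time(times, node, half) - crossing_time(times, word_line, half)


def average_write_delay(delay_write_1, delay_write_0):
    # The reported figure is the mean of the two data-changing write cases.
    return (delay_write_1 + delay_write_0) / 2


# Synthetic example (times in ps, voltages in V), not simulation data.
t = [0.0, 10.0, 20.0, 30.0]
wl = [0.0, 0.8, 0.8, 0.8]    # word line rises between 0 and 10 ps
node = [0.0, 0.0, 0.8, 0.8]  # storage node rises between 10 and 20 ps
delay = write_delay(t, wl, node, vdd=0.8)
```

With this synthetic input the word line crosses 0.4 V at 5 ps and the node at 15 ps, so `delay` evaluates to 10 ps.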

According to the simulations, the type 4 cell achieves the lowest write delay at all supply voltages. In contrast, the type 5 cell has a slightly higher write delay than the other cells. The write delay simulation results are presented in Table 2.

Table 2. Write delay of SRAM cells at each supply voltage

| Cells   | 0.4 V | 0.5 V   | 0.6 V  | 0.7 V  | 0.8 V  |
|---------|-------|---------|--------|--------|--------|
| Type 1b | 21 ps | 12 ps   | 8.5 ps | 7.5 ps | 6.5 ps |
| Type 2  | 20 ps | 11.5 ps | 8.5 ps | 7.5 ps | 6 ps   |
| Type 4  | 17 ps | 10 ps   | 7.5 ps | 6 ps   | 5.5 ps |
| Type 5  | 22 ps | 13 ps   | 10 ps  | 8 ps   | 7 ps   |

#### 6.2. Read delay

To calculate the delay of the read operation, an external circuit has to be used for signal sensing. In this simulation, we use a large-signal sensing method, specifically a pair of HI-skew inverters connected to the bit lines. The layout of the inverter pair is shown in Figure 9. The transistor sizes for the inverters are: $W_p = 7\lambda$, $W_n = 3\lambda$, $L_p = L_n = 2\lambda$.

The read delay is therefore defined as the time interval between the assertion of the word line and the rise of the ~Output node when reading '0', or the rise of the Output node when reading '1'. The average delay of the two cases is considered.

The simulations show that all cells have similar read delays, since the reading speed largely depends on the external sensing circuit, which is identical in all cases. The read delay simulation results are presented in Table 3.



Fig. 9. Layout of HI-skew inverter pair

Table 3. Read delay of SRAM cells at each supply voltage

| Cells   | 0.4 V | 0.5 V | 0.6 V | 0.7 V | 0.8 V |
|---------|-------|-------|-------|-------|-------|
| Type 1b | 21 ps | 11 ps | 8 ps  | 6 ps  | 6 ps  |
| Type 2  | 20 ps | 11 ps | 8 ps  | 6 ps  | 5 ps  |
| Type 4  | 20 ps | 11 ps | 8 ps  | 6 ps  | 5 ps  |
| Type 5  | 20 ps | 11 ps | 8 ps  | 6 ps  | 5 ps  |

#### 6.3. Power dissipation

In order to calculate the average power dissipation of the cells, suitable bit sequences are applied to the bit lines so that all possible transitions are covered. More specifically, the repeating sequence of operations that each cell performs is: write '0' (writing 0 when data = 1), write '0' (writing 0 when data = 0), read (reading 0), write '1' (writing 1 when data = 0), write '1' (writing 1 when data = 1), read (reading 1).
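
The coverage property of this sequence can be checked mechanically. The sketch below replays the six operations on a one-bit state, starting from a stored '1' as the first step implies, and records which (stored, written) pairs and which read values occur; the representation of operations as tuples is an illustrative choice.

```python
# The repeating per-cell sequence from the text: (operation, value to write).
SEQUENCE = [
    ("write", 0),
    ("write", 0),
    ("read", None),
    ("write", 1),
    ("write", 1),
    ("read", None),
]


def replay(sequence, initial):
    """Replay the operations and log what each one sees in the cell."""
    stored = initial
    writes, reads = [], []
    for op, value in sequence:
        if op == "write":
            writes.append((stored, value))  # (data before write, data written)
            stored = value
        else:
            reads.append(stored)
    return writes, reads, stored


writes, reads, final = replay(SEQUENCE, initial=1)
print(writes)  # [(1, 0), (0, 0), (0, 1), (1, 1)]
print(reads)   # [0, 1]
print(final)   # 1
```

All four write cases and both read cases appear exactly once, and the cell ends in the state it started in, so the sequence can repeat indefinitely with the same coverage.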

Furthermore, all memory arrays are simulated under the same input combination, which comprises a sequence of four write cycles, four read cycles and another four write and four read cycles, for a total of 16 ns. The rows are written and then read consecutively. Specific 4-bit words are used so that the input sequence is identical in every array's simulation, thus yielding comparable results.
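
The timing of this array stimulus follows directly from the 1 GHz operating frequency. The sketch below builds the schedule under two assumptions that the text does not spell out: each operation takes exactly one clock cycle, and the four rows are accessed in the order 0 to 3. The 4-bit data words themselves are not given in the paper and are therefore left out.

```python
CYCLE_NS = 1.0  # one clock period at 1 GHz, in nanoseconds
ROWS = 4        # a 4x4 array stores one 4-bit word per row


def array_schedule(rows=ROWS, passes=2):
    """Write every row, then read every row, repeated `passes` times."""
    schedule = []
    for _ in range(passes):
        schedule.extend(("write", row) for row in range(rows))
        schedule.extend(("read", row) for row in range(rows))
    return schedule


schedule = array_schedule()
total_ns = len(schedule) * CYCLE_NS
print(len(schedule), "cycles,", total_ns, "ns")  # prints "16 cycles, 16.0 ns"
```

Sixteen one-cycle operations at 1 GHz account for the stated 16 ns simulation window.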

For all power simulations, the input sequences are set so that no external circuitry is needed for addressing, precharging, etc. The measurements for the cells and the corresponding arrays are shown in Table 4 and Table 5, respectively. These results are also plotted in Figures 10 and 11. The thin cell (type 4) presents the lowest power dissipation in all cases. The ultra-thin cell (type 5) presents the highest power dissipation in all cell simulations, as well as in the array simulations at operating voltages of 0.8 and 0.7 V. The type 5 array improves relative to the others as the supply voltage is scaled down, however, showing comparable and in some cases lower dissipation than the type 1b and type 2 arrays at 0.6, 0.5 and 0.4 V.

Table 4. Power dissipation of SRAM cells at each supply voltage

| Cells   | 0.4 V | 0.5 V | 0.6 V | 0.7 V | 0.8 V |
|---------|-------|-------|-------|-------|-------|
| Type 1b | 14 nW | 25 nW | 37 nW | 52 nW | 71 nW |
| Type 2  | 13 nW | 24 nW | 36 nW | 51 nW | 69 nW |
| Type 4  | 12 nW | 22 nW | 32 nW | 45 nW | 62 nW |
| Type 5  | 15 nW | 27 nW | 40 nW | 56 nW | 77 nW |

Table 5. Power dissipation of SRAM arrays

| Arrays (16-bit) | 0.4 V  | 0.5 V  | 0.6 V  | 0.7 V  | 0.8 V  |
|-----------------|--------|--------|--------|--------|--------|
| Type 1b         | 112 nW | 180 nW | 270 nW | 392 nW | 560 nW |
| Type 2          | 109 nW | 176 nW | 260 nW | 369 nW | 522 nW |
| Type 4          | 99 nW  | 158 nW | 236 nW | 343 nW | 492 nW |
| Type 5          | 109 nW | 174 nW | 262 nW | 438 nW | 599 nW |
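
The ranking statements made about the power results can be checked against the tabulated values. The sketch below encodes the cell and array measurements (in nW) and reports, for a given topology, the supply voltages at which it has the lowest or highest dissipation; the helper names are illustrative.

```python
VOLTAGES = (0.4, 0.5, 0.6, 0.7, 0.8)  # supply voltages in volts

# Power dissipation in nW, one entry per supply voltage.
CELL_NW = {
    "Type 1b": (14, 25, 37, 52, 71),
    "Type 2": (13, 24, 36, 51, 69),
    "Type 4": (12, 22, 32, 45, 62),
    "Type 5": (15, 27, 40, 56, 77),
}
ARRAY_NW = {
    "Type 1b": (112, 180, 270, 392, 560),
    "Type 2": (109, 176, 260, 369, 522),
    "Type 4": (99, 158, 236, 343, 492),
    "Type 5": (109, 174, 262, 438, 599),
}


def extreme(table, column, pick):
    # `pick` is the built-in min or max; returns the topology name.
    return pick(table, key=lambda name: table[name][column])


def voltages_where(table, name, pick):
    # Supply voltages at which `name` is the extreme entry of the table.
    return [v for i, v in enumerate(VOLTAGES) if extreme(table, i, pick) == name]


print(voltages_where(CELL_NW, "Type 4", min))   # [0.4, 0.5, 0.6, 0.7, 0.8]
print(voltages_where(ARRAY_NW, "Type 4", min))  # [0.4, 0.5, 0.6, 0.7, 0.8]
print(voltages_where(CELL_NW, "Type 5", max))   # [0.4, 0.5, 0.6, 0.7, 0.8]
print(voltages_where(ARRAY_NW, "Type 5", max))  # [0.7, 0.8]
```

The output matches the discussion: type 4 is lowest everywhere, type 5 is highest for every cell simulation, and for the arrays type 5 is highest only at 0.7 and 0.8 V.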

#### 7. Conclusions

Various types of 6T SRAM cell layout architectures and the corresponding 16-bit arrays have been implemented and compared at the 32 nm node, in terms of area, power dissipation and read/write delay. The thin cell topology has proved to be the best design in all aspects. The recently proposed ultra-thin cell provides a more lithographically friendly alternative to the thin cell but introduces a significant penalty in area and power/delay performance, presenting overall worse results than the conventional designs.



Fig. 8. Layouts of 16 bit SRAM arrays



Fig. 10. Power dissipation of SRAM cells



Fig. 11. Power dissipation of SRAM arrays

This paper was presented at the Pan-Hellenic Conference on Electronics and Telecommunications (PACET), which took place on May 8-9, 2015, in Ioannina, Greece.

#### References

1. S.-M. Kang and Y. Leblebici, "CMOS Digital Integrated Circuits: Analysis and Design", McGraw Hill, 2002.
2. Changhwan Shin, "Advanced MOSFET Designs and Implications for SRAM Scaling", Electrical Engineering and Computer Sciences, University of California at Berkeley, pp. 1-3, May 2012, Technical Report Number: UCB/EECS-2012-50.
3. B.H. Calhoun, Yu Cao, Xin Li, Ken Mai, L.T. Pileggi, R.A. Rutenbar, K.L. Shepard, "Digital circuit design challenges and opportunities in the era of nanoscale CMOS," Proceedings of the IEEE, vol. 96, issue 2, pp. 343–365, February 2008.
4. Neil H. E. Weste, David Money Harris, "CMOS VLSI Design: A Circuits and Systems Perspective, 4th Edition", Pearson Education, 2011.
5. M. Ishida, T. Kawakami, A. Tsuji, N. Kawamoto, M. Motoyoshi, N. Ouchi, "A novel 6T-SRAM cell technology designed with rectangular patterns scalable beyond 0.18 μm generation and desirable for ultra high speed operation," in: IEEE Int. Electron Devices Meet. IEDM, pp. 201-204, 1998.
6. M. Helm, et al., "A Low Cost, Microprocessor Compatible, 18.4 μm<sup>2</sup>, 6-T Bulk Cell Technology for High Speed SRAMs," Symp. on VLSI Tech., p. 65, 1993.
7. Y. Sambonsugi, T. Maruyama, K. Yano, H. Sakaue, H. Yamamoto, E. Kawamura, S. Ohkubo, Y. Tamura, T. Sugii, "A Perfect Process Compatible 2.491 μm<sup>2</sup> Embedded SRAM Cell Technology for 0.13 μm Generation CMOS Logic LSIs," Symp. on VLSI Tech., p. 62, 1998.
8. K. Osada et al., "Universal-VDD 0.65-2.0-V 32-kB cache using a voltage-adapted timing-generation scheme and a lithographically symmetrical cell," JSSC, vol. 36, no. 11, pp. 1738–1744, Nov. 2001.
9. M. Khare et al., "A high performance 90nm SOI technology with 0.992 μm<sup>2</sup> 6T-SRAM cell", Proc. Intl. Electron Devices Meeting, pp. 407–410, 2002.
10. Randy W. Mann, Benton H. Calhoun, "New category of ultrathin notchless 6T SRAM cell layout topologies for sub-22nm", Department of Electrical and Computer Engineering, University of Virginia, 11th Int'l Symposium on Quality Electronic Design, IEEE, 2010.
11. E. Grossar, M. Stucchi, K. Maex, W. Dehaene, "Read stability and write-ability analysis of SRAM cells for nanometer technologies," IEEE Journal of Solid-State Circuits, vol. 41, no. 11, pp. 2577-2588, 2006.