# High Performance Bridge Style Full Adder Structure

Omid Kavehei<sup>a</sup>, Said F. Al-Sarawi<sup>a</sup>, Derek Abbott<sup>a</sup>, and Keivan Navi<sup>b</sup>

<sup>a</sup>Center for High Performance Integrated Technologies and Systems (CHiPTec), The University of Adelaide, SA 5005, Australia.

<sup>b</sup>Micro&Nano-electronic Research Center, Shahid Beheshti University, Tehran, Iran.

#### **ABSTRACT**

Adders are the core element of arithmetic circuits such as subtracters, multipliers, and dividers. Adders can be optimized at the device, circuit, architectural, and algorithmic levels. In this paper we present a new optimized full adder circuit structure that provides improved performance compared to the standard and mirror adder structures. The performance of this adder is investigated in terms of power, delay, energy, and yield. This paper also proposes a novel simulation setup that is suitable for analyzing full adder cells at high frequency. The simulation results take into account the process variations of a 90 nm CMOS process and are based on post-layout simulation using Cadence and Synopsys tools.

**Keywords:** Full Adder, Bridge Style Adders, Process Variation, Parametric Yield.

## 1. INTRODUCTION

Adders play an important role in many applications, such as digital signal processing, multi-operand addition systems, multipliers, and many complex arithmetic circuits. Many studies have aimed to reduce the area, delay, power, and energy of adder cells.<sup>1–5</sup> Many of these studies have investigated different topologies for implementing the cells, focusing on the delay-power trade-off without considering process variation.<sup>1, 2</sup> As technology scales down, the challenges faced by designers become more complex. Variability, leakage power dissipation, and thermal aspects are some of the major challenges in nanometer circuit and system design.
<sup>6,7</sup> Today, circuit designers have to consider not only the trade-off between delay and power, but also the impact of process and environmental variations on the delay-energy and parametric yield trade-off. Consequently, circuit delay, dynamic and static power consumption, and frequency are no longer deterministic and must be modeled statistically. To summarize, several performance criteria are considered in the design and evaluation of adder cells: design scalability, area, wiring complexity, delay, power consumption, and robustness with respect to voltage and transistor scaling as well as process variation.<sup>1,8</sup> As a matter of fact, circuit robustness is becoming a key aspect of nanometer scale VLSI, where the variation ranges of many process and environment parameters will increase dramatically.<sup>1,8,9</sup> CMOS design is robust against voltage scaling and transistor sizing; however, using lower supply voltages can have a large impact on performance.<sup>10</sup> Thus, simulations have to be performed in the presence of random parameter variations. In this paper, we consider these key issues in the simulations.

The paper is organized as follows. Background studies are briefly discussed in Section 2. The mirror CMOS full adder and bridge style full adder cells are discussed in Section 3. Section 4 describes and compares the full adder cells. The proposed simulation setup and statistical analysis are also discussed in Section 4. Finally, Section 5 concludes this paper.

## 2. BACKGROUND STUDIES

A full adder can be implemented by direct implementation of the sum and carry functions using a conventional<sup>8</sup> or complex/bridge style<sup>4</sup> mirror CMOS structure. The conventional mirror full adder has been used for many years as a basic 1-bit full adder cell in digital signal processing and other applications. It is also commonly used as an accepted benchmark for evaluating research outcomes in the area of adder design.
<sup>1, 2, 4, 5</sup> This structure has 28 transistors, which can increase energy consumption and chip area. On the other hand, the mirror adder has a symmetric interconnection pattern for its pull-up and pull-down networks, which improves performance compared to the conventional structure. Both the conventional and mirror full adder structures can be found in Ref. 8. However, there is no symmetry between the implementations of the carry and sum functions.

Further author information: (Send correspondence to Omid Kavehei) Omid Kavehei: E-mail: omid@eleceng.adelaide.edu.au; Said Al-Sarawi: E-mail: alsarawi@eleceng.adelaide.edu.au

Smart Structures, Devices, and Systems IV, edited by Said Fares Al-Sarawi, Vijay K. Varadan, Neil Weste, Kourosh Kalantar-Zadeh, Proc. of SPIE Vol. 7268, 72680D · © 2008 SPIE · CCC code: 0277-786X/08/\$18 · doi: 10.1117/12.813924

A preliminary investigation into implementing a fully symmetric 1-bit full adder cell that retains the advantages of the mirror structure was carried out in Ref. 5. The proposed class of CMOS circuits, called "bridge", has limited advantages in terms of transistor count and power consumption. Therefore, another type of bridge style full adder was presented in Ref. 4, where a dense and symmetric structure is proposed. The main advantage of these novel structures is a significant reduction in the number of transistors (4 fewer transistors). After layout implementation, it can be seen that one of the proposed full adders, called "full bridge", is symmetric between the carry and sum layout implementations, which means that the NMOS (PMOS) transistors in the carry and sum functions share the same interconnection pattern. This symmetry is very important from the manufacturing process point of view.
<sup>8</sup> This paper investigates the parametric yield and the energy-delay-voltage trade-off of the proposed full adder structures to assess the reliability of the adder cells in the presence of process variation. This paper also compares the area and input parasitic capacitances of the different full adder cells.

## 3. SYMMETRIC SUM AND CARRY FULL ADDER CELL

Different logic styles can be evaluated from different points of view. In other words, it is the design constraints imposed by the application that give each logic style its place in cell library development. Even a style that is appropriate for one function may not be suitable for another. For example, the static approach is robust against noise and therefore provides reliable operation. Ease of design, however, is not always attained. The CMOS design style is not area-efficient for complex gates with large fan-ins; hence, care must be taken when a static logic style is selected to realize a logic function. The pseudo-NMOS technique is straightforward, yet it compromises noise margin and suffers from static power dissipation. The pass transistor logic (PTL) style is a popular method for implementing specific circuits such as multiplexers and XOR-based circuits, like adders. Dynamic logic circuits realize fast, small, and complex gates; however, this advantage is gained at the cost of parasitic effects such as charge sharing, noise sensitivity, and leakage power dissipation. In general, none of the mentioned styles can compete with the CMOS style in robustness and stability.<sup>1, 4, 8</sup>

Fig. 1 shows the conventional (mirror) CMOS full adder with 28 transistors.<sup>8</sup> This structure is called CFA (Conventional Full-Adder). The bridge style focuses on meshes and connects each pair of adjacent meshes by a transistor, named the *bridge transistor*.
Bridge transistors make it possible to share transistors between two different paths to create a new path from the supply rails to the output. These transistors must be arranged so that they not only preserve the correctness of the circuit, but also keep the pull-up and pull-down networks mutually exclusive.<sup>4</sup> Sizing is an important issue for these transistors. More information about these transistors can be found in Ref. 5.

As mentioned above, the full bridge style adder cell has a fully symmetric structure. As shown in Fig. 2, the topologies of the carry and sum are exactly the same (the only difference is in the assignment of inputs to transistors). This full bridge style adder cell is named FB (Full Bridge). FB also provides a fully symmetric layout, as shown in Fig. 2. Another full adder structure, called HB (Half Bridge), is obtained by replacing the carry part of FB with the carry section of CFA, which reduces the number of branches from the supply voltage or ground rails to the output. This adder cell and its layout are shown in Fig. 3.

Table 1 shows the clear improvement of the bridge schemes over the mirror cell in terms of transistor count, area, and overall input parasitic capacitance. Reducing the overall parasitic capacitance at the inputs affects the total capacitance at the output nodes and the switching speed, especially in low voltage operation (Fig. 6). It is worth noting that the output capacitances of the three circuits are approximately the same. At first glance, these improvements indicate promise for better results in the post-layout simulation phase. The values in Table 1 were extracted from post-layout simulations for a 90 nm CMOS process.

The similarity between the carry and sum designs of the full bridge (FB) full adder cell is very promising for introducing a standard cell library with a structure like (bridge) carry or (bridge) sum.
With such a library, a 1-bit full adder is implemented using only four gates (two inverters + two bridge gates). It is also interesting to extend the application of this structure to the field-programmable gate array (FPGA) area. The wide range of adder applications in digital signal processing (DSP) and the potential mentioned for standard cell design and FPGAs show the significance of the proposed cell.

Figure 1. Conventional CMOS full-adder, CFA, schematic and layout

Table 1. Transistor count, layout area, and input parasitic capacitance comparisons for a full-adder cell

| Parameters | Input | CFA | FB | HB |
|---|---|---|---|---|
| No. of Transistors | | 28 | 24 (14%) | 24 (14%) |
| Layout area ($\mu m^2$) | | 38.30 | 36.32 (5%) | 38.02 |
| Input parasitic capacitance (fF) | A | 2.34 | 1.59 (32%) | 1.77 |
| | B | 2.48 | 1.69 (32%) | 1.74 |
| | C | 1.99 | 1.50 (25%) | 1.57 |

## 4. PERFORMANCE EVALUATION AND COMPARISON OF THE CIRCUITS

Recently, variability-aware design has been gaining attention in the field of IC design and has become one of the main challenges for future CMOS design.<sup>11</sup> Process variations result from non-uniform conditions during the fabrication process. Variations in process parameters such as doping profiles, effective gate length, oxide thickness, and diffusion depths are different instances of process variation. Variations in electrical parameters, such as threshold voltage and sheet resistance, are caused by uncertainty in the manufacturing process. Generally, there are two types of process variations: inter-die (or die-to-die, D2D) and intra-die (or within-die, WID). WID variations play an increasingly important role in meeting power and frequency constraints.<sup>12, 13</sup> This paper investigates the impact of global (D2D) variation on the three full adder cells.
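A global (D2D) Monte Carlo draw can be sketched as follows. The means and $3\sigma/\mu$ spreads are the ones listed in Table 2; the Gaussian distribution, the dictionary keys, and the function names are assumptions made for illustration and are not the authors' simulation flow. The point of the sketch is that under D2D variation one sampled parameter set is shared by every transistor on a die, whereas WID variation would require an independent draw per device.

```python
import random

# (mean, 3*sigma/mean in %) per parameter, copied from Table 2.
# The key names are hypothetical labels chosen for this sketch.
TABLE2 = {
    "L_INT_nm": (7.50, 20),
    "W_INT_nm": (5.00, 20),
    "N_DEP_nmos_1e18_cm3": (2.02, 15),
    "N_DEP_pmos_1e18_cm3": (1.53, 20),
    "V_TO_nmos_V": (0.41, 7),
    "V_TO_pmos_V": (-0.36, 8),
}


def sample_die(rng):
    """Draw one parameter set shared by all devices on a die (D2D)."""
    die = {}
    for name, (mean, three_sigma_pct) in TABLE2.items():
        # Convert the 3*sigma/mean percentage into an absolute sigma.
        sigma = abs(mean) * three_sigma_pct / 100.0 / 3.0
        # Assumption: each parameter is Gaussian and independent.
        die[name] = rng.gauss(mean, sigma)
    return die


def sample_dies(n, seed=0):
    """Return n independent dies; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [sample_die(rng) for _ in range(n)]


dies = sample_dies(1000)
```

Each entry of `dies` would then parameterize one post-layout simulation run of a cell; collecting the delay, power, and energy over all runs gives the sample means and standard deviations.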
It is well known that a ring oscillator is used as a common test bench for analyzing the impact of process variation. A ring oscillator comprises a chain of an odd number of cascaded inverters. It drives the inverters at the highest possible frequency. Consequently, the power consumption of the inverters reaches its maximum level. Fig. 4 proposes a ring structure for simulating the full adder cells at their maximum operating frequency, where En1 and En2 are control signals that switch between active and standby modes. A similar topology is briefly mentioned in Ref. 3. By using the two control signals, En1 and En2, and choosing the appropriate input vector for standby mode, the static power dissipation of the cells can be simulated in standby mode (for four possible input vectors, ABC: 000, 100, 011, 111). This can be achieved by implementing a combination of NAND and NOR circuits.

Figure 2. Full Bridge, FB, CMOS full-adder schematic and layout

The simulation results for these structures take into account the process variations of a 90 nm CMOS process and are based on post-layout simulation using Cadence and Synopsys tools. This study also shows that the proposed adder cells are the most efficient in terms of delay, power, and energy in the presence of process variations. The technology specifications for the variation sources are collected in Table 2.<sup>14</sup>

**Table 2.** Technology specifications, where $\mathbf{L}_{\mathrm{INT}}$, $\mathbf{W}_{\mathrm{INT}}$, $\mathbf{V}_{\mathrm{TO}}$, and $\mathbf{N}_{\mathrm{DEP}}$ are channel length offset, channel width offset, threshold voltage at zero bias, and doping concentration, respectively.

| Parameters | | $\mu$ (mean) | $3\sigma/\mu$ (%) |
|---|---|---|---|
| $\mathbf{L}_{\mathrm{INT}}$ (nm) | | 7.50 | 20 |
| $\mathbf{W}_{\mathrm{INT}}$ (nm) | | 5.00 | 20 |
| $\mathbf{N}_{\mathrm{DEP}}$ ($10^{18}/\mathrm{cm}^3$) | NMOS | 2.02 | 15 |
| | PMOS | 1.53 | 20 |
| $\mathbf{V}_{\mathrm{TO}}$ (V) | NMOS | 0.41 | 7 |
| | PMOS | -0.36 | 8 |

With regard to the $3\sigma/\mu$ values for delay that are reported later, the adder cells show 15% to 40% variation about the mean. Obviously, many different device parameters, such as the effective channel length and width, the surface potential and junction depth, the built-in voltage of the source/drain junctions, and the threshold voltage, vary with the variation of these parameters. Fig. 5.a and 5.b show $\mu_{\rm delay}$ and $\sigma_{\rm delay}$ of the cells at different nominal $V_{\rm dd}$ (for 90 nm technology), respectively.

Figure 3. Half Bridge, HB, style CMOS full-adder, schematic and layout

Figure 4. Proposed simulation-setup

The results of these simulations are given in Table 3.a, 3.b, and 3.c. Fig. 5 shows that both $\mu_{\rm delay}$ and $\sigma_{\rm delay}$ increase dramatically as the supply voltage decreases. Fig. 5 and Table 3 show that the bridge style cells reduce the mean value of delay. Between the two bridge candidates, FB also shows better performance than HB.

The average power dissipation in CMOS circuits is proportional to $V_{\rm dd}^2$, frequency, and output capacitance. Thus, one approach to reducing the power consumption of CMOS circuits is to reduce the supply voltage. Note that the switching power dissipation is actually proportional to $V_{\rm dd} \times V_{\rm swing}$. Here, $V_{\rm swing}$ is the output voltage swing, defined as the maximum peak voltage that the output can produce. In CMOS circuits it is equal to the supply voltage.
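The proportionality stated above can be written compactly using the standard first-order switching-power expression. The activity factor $\alpha$ and the load symbol $C_{L}$ are notation introduced here for illustration and do not appear in the paper.

$$
P_{\mathrm{sw}} = \alpha \, C_{L} \, V_{\mathrm{dd}} \, V_{\mathrm{swing}} \, f
\quad\Longrightarrow\quad
P_{\mathrm{sw}} = \alpha \, C_{L} \, V_{\mathrm{dd}}^{2} \, f
\quad \text{when } V_{\mathrm{swing}} = V_{\mathrm{dd}}
$$

With full-swing outputs, halving $V_{\rm dd}$ at a fixed frequency and load therefore reduces this term by a factor of four.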
Some design approaches, like pass-transistor logic (PTL) design, reduce the power consumption by using non-full-swing outputs. As shown in Fig. 5, using a low $V_{\rm dd}$ significantly increases both delay and delay variability, and hence significantly increases the overall performance yield loss; nevertheless, according to Ref. 15 and Ref. 16, voltage scaling can still provide an improvement in power consumption with respect to the performance yield. In Ref. 17 the authors demonstrate that $V_{\rm dd}/2$ is the best choice to satisfy both low-power and high-performance goals. Based on Ref. 17, if $V_{\rm dd}$ is scaled down to $V_{\rm dd}/2$, FB shows a considerable improvement in both power-delay product (PDP) and energy-delay product (EDP). These results are shown in Fig. 6.a and 6.b. As can be seen in Fig. 6.b, the EDP of FB continues to decrease sharply from 0.8 V down to 0.6 V, in contrast to the other cells. This means that FB is a good candidate for a high-performance design with regard to the total power consumption.

**Figure 5.** Delay mean, $\mu$, and standard deviation, $\sigma$, for the full adder cells

Figure 6. PDP and EDP of the cells

Parametric yield loss is another important feature of the design. High-speed and low-power circuit designs suffer from leakage (static) power dissipation and low performance, respectively. As mentioned earlier, with advanced technologies, the circuit switching speed can be significantly increased at the cost of increased power dissipation. In other words, leakage power dissipation dominates the total power consumption in advanced technologies.<sup>7</sup> Therefore, it can be argued that a high-speed circuit is a circuit with a large amount of leakage current, particularly in nanometer circuits. On the contrary, in a low-power circuit, the frequency constraint becomes a bottleneck for further reduction in power consumption.

**Table 3.** Mean ($\mu$) and standard deviation ($\sigma$) values for the adder cells in the presence of process variations, considering $V_{\rm dd}$ down-scaling

(a) CFA

| | $V_{\rm dd}$ (V) | 0.5 | 0.6 | 0.8 | 1.0 | 1.2 |
|---|---|---|---|---|---|---|
| Delay (ns) | $\mu_{\mathbf{D}}$ | 3.178 | 1.765 | 0.917 | 0.696 | 0.571 |
| | $\sigma_{\mathbf{D}}$ | 0.426 | 0.177 | 0.066 | 0.039 | 0.030 |
| Energy (fJ) | $\mu_{\mathbf{E}}$ | 22.32 | 32.50 | 60.63 | 96.95 | 149.3 |
| | $\sigma_{\mathbf{E}}$ | 1.117 | 1.492 | 2.281 | 3.033 | 5.504 |
| Power ($\mu$W) | $\mu_{\mathbf{P}}$ | 0.979 | 1.881 | 4.730 | 9.368 | 15.64 |
| | $\sigma_{\mathbf{P}}$ | 0.096 | 0.147 | 0.240 | 0.402 | 0.592 |
| Yield (%) | | 73 | 80 | 87 | 92 | 93 |

(b) FB

| | $V_{\rm dd}$ (V) | 0.5 | 0.6 | 0.8 | 1.0 | 1.2 |
|---|---|---|---|---|---|---|
| Delay (ns) | $\mu_{\mathbf{D}}$ | 3.147 | 1.170 | 0.800 | 0.646 | 0.531 |
| | $\sigma_{\mathbf{D}}$ | 0.445 | 0.183 | 0.063 | 0.034 | 0.027 |
| Energy (fJ) | $\mu_{\mathbf{E}}$ | 24.49 | 36.57 | 68.80 | 109.2 | 168.6 |
| | $\sigma_{\mathbf{E}}$ | 1.350 | 1.707 | 2.866 | 3.625 | 6.083 |
| Power ($\mu$W) | $\mu_{\mathbf{P}}$ | 0.942 | 1.864 | 4.832 | 9.336 | 15.77 |
| | $\sigma_{\mathbf{P}}$ | 0.094 | 0.151 | 0.278 | 0.436 | 0.669 |
| Yield (%) | | 72 | 78 | 85 | 94 | 95 |

(c) HB

| | $V_{\rm dd}$ (V) | 0.5 | 0.6 | 0.8 | 1.0 | 1.2 |
|---|---|---|---|---|---|---|
| Delay (ns) | $\mu_{\mathbf{D}}$ | 3.151 | 1.711 | 0.864 | 0.652 | 0.533 |
| | $\sigma_{\mathbf{D}}$ | 0.425 | 0.179 | 0.058 | 0.037 | 0.027 |
| Energy (fJ) | $\mu_{\mathbf{E}}$ | 25.15 | 37.15 | 68.16 | 110.0 | 168.1 |
| | $\sigma_{\mathbf{E}}$ | 1.353 | 1.587 | 2.521 | 3.407 | 5.935 |
| Power ($\mu$W) | $\mu_{\mathbf{P}}$ | 1.034 | 1.996 | 5.037 | 9.882 | 16.44 |
| | $\sigma_{\mathbf{P}}$ | 0.104 | 0.148 | 0.266 | 0.421 | 0.659 |
| Yield (%) | | 76 | 77 | 87 | 92 | 93 |

Power and frequency constraints are defined as multiples of their nominal values. Parametric yield is determined as the percentage of designs that satisfy the specified frequency and power constraints. These constraints can also be called *yield cut-offs*. The power constraint is the low yield cut-off and the frequency constraint is the high yield cut-off. Based on the preceding discussion, the parametric yield window has a two-sided constraint. However, in this study, the effect of leakage power dissipation on the parametric yield is neglected, so the yield window is defined as the timing yield.<sup>14</sup> The impact of leakage power dissipation on the parametric yield could be investigated, together with a set of leakage reduction techniques, as future work. In this study we consider $0.9 \times f_{\rm nom}$ as the frequency constraint, where $f_{\rm nom}$ is the nominal frequency.<sup>7</sup> Fig. 7 shows the timing/performance yield of the cells *vs* supply voltage. As shown in the figure, the curves are very close together at the different supply voltages, which confirms the result of Ref. 14. Hence, the EDP and PDP of larger circuits will be significantly affected by using the proposed cells, without any significant degradation in the timing yield; it also seems possible that the parametric yield loss in large circuits, like multipliers, would be significantly improved by using the cells.

**Figure 7.** Performance yield of the full adder cells vs $V_{\rm dd}$

## 5. CONCLUSIONS AND FUTURE WORKS

This paper investigates the bridge style full adder cells in the presence of process variations. All of the cells have also been embedded in the novel simulation setup to extract the results at a high operating frequency. The standard deviations ($\sigma$) and means ($\mu$) of delay, power, and energy have been extracted using Monte-Carlo analysis. These results show that both the energy-delay and power-delay products are significantly affected by using the full bridge style adder cell, without any considerable degradation in timing yield. Hence, energy saving in larger designs will be significantly improved with respect to the parametric yield loss. Considering global and local variations, leakage power reduction techniques, and large circuits is left as future work. It is also interesting to investigate the performance and parametric yield of such designs.

## **REFERENCES**

1. R. Zimmermann and W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," *IEEE Journal of Solid-State Circuits* **32**(7), pp. 1079–1090, 1997.
2. M. Alioto and G. Palumbo, "Impact of Supply Voltage Variations on Full Adder Delay: Analysis and Comparison," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* **14**(12), pp. 1322–1335, 2006.
3. A. Beaumont-Smith and N. Burgess, "A GaAs 32-bit Adder," in *Proc. of IEEE Symposium Computer Arithmetic*, pp. 10–17, 1997.
4. O. Kavehei, M. Azghadi, K. Navi, and A. Mirbaha, "Design of Robust and High-Performance 1-bit CMOS Full Adder for Nanometer Design," in *Proc. of ISVLSI*, pp. 10–15, 2008.
5. K. Navi, O. Kavehie, M. Rouholamini, A. Sahafi, and S. Mehrabi, "A Novel CMOS Full Adder," in *Proc. of the 20th International Conference on VLSI Design*, pp. 303–307, IEEE Computer Society Washington, DC, USA, 2007.
6. B. Amelifard, F. Fallah, and M. Pedram, "Leakage Minimization of SRAM Cells in a Dual-$V_t$ and Dual-$t_{ox}$ Technology," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* **16**(7), pp. 851–860, 2008.
7. D. Sylvester, K. Agarwal, and S. Shah, "Variability in nanometer CMOS: Impact, analysis, and minimization," *Integration, the VLSI Journal* **41**(3), pp. 319–339, 2008.
8. N. Weste and K. Eshraghian, *Principles of CMOS VLSI design: a systems perspective*, Addison-Wesley, 1985. Ch. 8.
9. J. D. Meindl, "Gigascale integration: is the sky the limit?," *IEEE Circuits and Devices Magazine* **12**(6), pp. 19–24, 1996.
10. S. Hanson, B. Zhai, K. Bernstein, D. Blaauw, A. Bryant, L. Chang, K. K. Das, W. Haensch, E. J. Nowak, and D. M. Sylvester, "Ultralow-voltage, minimum-energy CMOS," *IBM Journal of Research and Development* **50**(4), pp. 469–490, 2006.
11. H. Qin, Y. Cao, D. Markovic, A. Vladimirescu, and J. Rabaey, "Standby supply voltage minimization for deep submicron SRAM," *Microelectronics Journal* **36**(9), pp. 789–800, 2005.
12. A. Srivastava, D. Sylvester, and D. Blaauw, "Power Minimization Using Simultaneous Gate-Sizing, Dual-$V_{\rm dd}$, and Dual-$V_{\rm th}$ Assignment," in *Proc. of ACM/IEEE Design Automation Conference*, pp. 783–787, June 2004.
13. R. Rao, A. Devgan, D. Blaauw, and D. Sylvester, "Parametric Yield Estimation Considering Leakage Variability," in *Proc. of 41st ACM/IEEE Design Automation Conference*, pp. 7–11, 2004.
14. Y. Cao, H. Qin, R. Wang, P. Friedberg, A. Vladimirescu, and J. Rabaey, "Yield optimization with energy-delay constraints in low-power digital circuits," in *Electron Devices and Solid-State Circuits, 2003 IEEE Conference on*, pp. 285–288, 2003.
15. B. Zhai, D. Blaauw, D. Sylvester, and K. Flautner, "The limit of dynamic voltage scaling and insomniac dynamic voltage scaling," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* **13**, pp. 1239–1252, Nov. 2005.
16. E. Courses and T. Surveys, "Low Power and Power Management for CMOS: An EDA Perspective," *IEEE Transactions on Electron Devices* **55**(1), pp. 186–196, 2008.
17. B. Zhai, D. Blaauw, D. Sylvester, and K. Flautner, "Theoretical and practical limits of dynamic voltage scaling," in *Proc. of 41st ACM/IEEE Design Automation Conference*, pp. 868–873, ACM New York, NY, USA, 2004.