# Low Power Serial-Parallel Bootstrapped Dynamic Shift Register

Leo Lee$^{a,b}$, Said Al-Sarawi$^{a,b}$ and Derek Abbott$^{a,b}$

<sup>a</sup>Centre for Biomedical Engineering (CBME) and Department of Electrical and Electronic Engineering, The University of Adelaide, SA 5005, Australia

<sup>b</sup>The Centre for High Performance Integrated Technologies and Systems (CHiPTec) and Department of Electrical and Electronic Engineering, The University of Adelaide, SA 5005, Australia

## ABSTRACT

In this paper a new low-power, area-efficient serial-to-parallel shift register design is presented. Each stage of the register contains only 4 transistors and uses a capacitive bootstrapping technique to offset the threshold voltage drop of the MOSFETs. We shall refer to this logic family as Non-Ratioed Bootstrap Logic (NRBL). The intended target applications are smart sensor arrays and image sensors, where the design serves as the select registers that control the photodiode array.

Keywords: Dynamic Shift Register, CMOS, VLSI, Smart Sensors, Image Sensors, Integrated Circuits

## 1. INTRODUCTION

The shift register is perhaps the most common structure for enabling the movement of a data bit in VLSI systems.<sup>1</sup> Shift registers have many applications, ranging from producing a time delay with serial-in serial-out shift registers to converting serial data to parallel data with serial-in parallel-out shift registers. Serial-in parallel-out shift registers are commonly used in image sensors to control the analogue multiplexer, which transmits data from the photodiode pixel to the A/D converter, where it is then processed into an image, as shown in Fig. 1.

There has been recent interest in using capacitors to reduce the size of VLSI circuits.<sup>2</sup> Here we present a different capacitive approach to produce a very compact serial-parallel shift register. We demonstrate functionality with 4 transistors per stage and no dc power rails.
There are a number of applications where both low power and compactness are required. These include dense optical smart sensors<sup>3</sup> and gate control of solid-state quantum computers.<sup>4</sup> A special feature of this shift register is that the output amplitude can be externally controlled by adjusting the height of the 2-phase non-overlapping clock; this feature is important for these target applications.

As shown in Fig. 2, each stage of the low-power serial-to-parallel dynamic shift register consists of four non-ratioed, minimum-sized n-channel transistors and can be implemented in any standard CMOS process. A bootstrapping capacitor is used to overcome the threshold voltage drop of the n-channel pass transistor. The register has no DC power supply and is driven entirely by the 2-phase clock, leading to low power dissipation. The clocks must be non-overlapping; if this requirement is not met, the register will not operate correctly. Non-overlapping clocks are also advantageous because fixed pattern noise (FPN) due to capacitive clock feedthrough can be more easily controlled.

Power consumption is an important parameter in modern VLSI design as electronic devices increasingly become portable. It therefore needs to be minimised to keep the battery size practical for portable electronic devices.

Further author information: (Send correspondence to Leo Lee.)

Leo Lee: Email: llee@eleceng.adelaide.edu.au, Telephone: +61 8 83036296, Fax: +61 8 83034360

Said Al-Sarawi: Email: alsarawi@eleceng.adelaide.edu.au

Derek Abbott: Email: dabbott@eleceng.adelaide.edu.au

Figure 1. Photo Diode Array. The row select register is required to drive the gates of the photodiode address transistors on each row; it therefore has a larger load to drive than the column select register, which is only required to drive the gate of one transistor in the analog multiplexer.

## 2. OPERATION OF THE SHIFT REGISTER

With reference to Fig.
2, the operation of the circuit is as follows. The drain of each enable transistor is pulsed every clock period. The output of each shift element remains low until a high is enabled by the previous stage or the serial I/P. When a high appears at the gate of the enable transistor, the output is charged up by the clock pulse. The output pulse does not drop by the threshold voltage, $V_t$, of the n-transistor because of the bootstrap action of the capacitor $C_b$. $C_b$ pulls up the gate of the enable transistor by $V_{boot}$, holding it at $V_{Clk} + V_{boot}$,

$$V_{boot} = \frac{C_p}{C_b + C_p} V_{Clk} \tag{1}$$

where $C_p$ is the parasitic capacitance to ground at the input and $V_{Clk}$ is the clock high level.

As the output is charged up to $V_{Clk}$, $V_{Clk} - V_t$ appears at the gate of the reset transistor and of the following enable transistor via the pass transistor. As Clk 1 starts going low, the output drops accordingly. When the output is $2V_t$ below $V_{Clk}$, the reset transistor turns on and resets the bootstrap capacitor. This causes the enable transistor to turn off when the output is at this level. When the next clock phase (Clk 2) arrives, the discharge transistor turns on and completes the discharge of the output. The bootstrap capacitor of the next stage is not charged, as the gate of the pass transistor is now at a lower potential than its source. The same cycle repeats for the subsequent stages of the shift register. Note that the connections of the two clocks alternate from stage to stage.

The functions of the transistors are summarised as follows:

- Enable transistor: Connects the clock to the output when enabled by the previous stage.
- Reset transistor: Resets the bootstrap capacitor.
- Pass transistor: (a) passes the initial current to charge up the bootstrap capacitor of the next stage, (b) turns off when the output discharges, so that the bootstrap capacitor of the next stage does not discharge, (c) provides a threshold voltage drop so that the reset transistor is kept hard off until the output potential drops.
- Discharge transistor: Completes the discharge of the output when the next clock phase arrives.

Proc. of SPIE Vol. 4935

## 3. STRUCTURE OF THE SHIFT REGISTER

Each stage of the shift register is non-inverting and consists of 4 transistors, with an extra transistor at the input and at the output of the shift register, as shown in Fig. 3. The added transistor at the input stage is required to clock the serial input pulse. The added transistor at the output stage is required to discharge the last bootstrap capacitor.

Fig. 4 shows the layout of four stages of the shift register. The two clock lines were designed with an interdigitated structure such that the contacts for both clock lines are at the same level in the layout. This symmetry is important as it balances the parasitic capacitance on the clock lines. This allows the same bootstrap capacitance value to be used for each stage and also simplifies clock buffer design. Furthermore, if the shift register is used to address an imaging array, as in Fig. 1, the balanced clock lines will minimise odd/even type fixed-pattern noise on the output.

Note that the bootstrap capacitance is formed by a gate oxide layer sandwiched between poly and diffusion layers, so there is the question of which way around to place the capacitor. Fig. 4 shows the poly terminal connected to the gate of the enable transistor and the diffusion terminal connected to an output stage.
This orientation has two advantages: (a) it reduces the number of contact cuts by maintaining contiguous layers and (b) the diffusion is kept away from the sensitive node at the gate of the enable transistor (this is important for applications where the shift register might be exposed to light).

Figure 2. One stage of the shift register, with only 4 transistors and a capacitor $C_b$ to provide the bootstrapping action that overcomes the threshold voltage drop of the MOSFET. The register has no dc supply and is driven entirely by two non-overlapping clocks.

Figure 3. Multiple stages of the NRBL shift register, driven by 2 non-overlapping clocks. The shift register also requires two extra transistors, one before the first stage for the serial input and one at the last stage to discharge the last bootstrap capacitance. Note the alternate clock connections between the stages.

Figure 4. Layout of the dynamic shift register in a standard $0.25\,\mu\mathrm{m}$ CMOS process, where only n-channel transistors are required. Four shift stages are shown and each stage occupies a pitch of $2.04\,\mu\mathrm{m}$.

## 4. BOOTSTRAP ANALYSIS

Here we show that a first-order circuit analysis of the bootstrapped enable transistor is fairly straightforward and roughly agrees with SPICE simulations. We begin the analysis by considering the rising clock edge in Fig. 5. Now,

$$\begin{aligned} V_G &= V_{Clk} + V_{boot} \\ \text{where} \ \ V_{boot} &= \eta V_{Clk} \ \ \text{and} \ \ \eta = \frac{C_p}{C_p + C_{boot}}. \end{aligned}$$

At $t \to \infty$, $V_{GS} = V_G - V_S = (V_{Clk} + V_{boot}) - V_{Clk} = V_{boot}$.

Due to the bootstrapping action, $V_S$ follows $V_D$ and so $V_{DS}$ is small. This is maintained because $\eta$ is designed such that $V_{boot} \ge V_t$ to keep the transistor on. As saturation is defined by $V_{DS} > (V_{GS} - V_t)$, we can assume non-saturation due to the small $V_{DS}$, and hence we must use $i_{DS} = K\{(V_{GS} - V_t)V_{DS} - \frac{V_{DS}^2}{2}\}$.
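As a rough illustration of the non-saturation argument, the following sketch checks the region condition and evaluates the non-saturation expression for the enable transistor. It takes $V_{boot} = 0.46$ V and $V_t = 0.41$ V, the values quoted in the simulation section; the gain factor `K` and the small `V_DS` are assumed placeholders, not values from the paper, and the function names are hypothetical.

```python
def is_saturated(v_gs: float, v_ds: float, v_t: float) -> bool:
    """Saturation condition used in the text: V_DS > (V_GS - V_t)."""
    return v_ds > (v_gs - v_t)


def triode_current(k: float, v_gs: float, v_ds: float, v_t: float) -> float:
    """Non-saturation drain current: K * ((V_GS - V_t) * V_DS - V_DS**2 / 2)."""
    return k * ((v_gs - v_t) * v_ds - v_ds ** 2 / 2.0)


V_BOOT = 0.46   # V, bootstrap voltage quoted in the simulation section
V_T = 0.41      # V, threshold voltage quoted in the simulation section
K = 200e-6      # A/V^2, ASSUMED gain factor (not given in the paper)
V_DS = 0.01     # V, ASSUMED small drain-source voltage

# At t -> infinity the gate-source voltage equals V_boot.
v_gs = V_BOOT
if is_saturated(v_gs, V_DS, V_T):
    print("saturated: the non-saturation expression does not apply")
else:
    i_ds = triode_current(K, v_gs, V_DS, V_T)
    print(f"non-saturated, i_DS = {i_ds:.3e} A")
```

With these figures the overdrive $V_{GS} - V_t$ is only 0.05 V, so the non-saturation assumption holds only while $V_{DS}$ stays below that margin.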
The circuit equations are:

$$i_b = C_p \dot{V}_G \tag{2}$$

$$i_{ds} - i_b = C_L \dot{V}_S \tag{3}$$

$$i_b = C_{boot}(\dot{V}_G - \dot{V}_S). \tag{4}$$

Assuming $V_{DS}$ is very small,

$$g(V_G - V_S - V_t) = i_{ds} \tag{5}$$

where $g$ is a linear approximation, and let

$$g = \text{constant} = \frac{K \times V_{Clk}}{2}. \tag{6}$$

Figure 5. Circuit analysis of the enable transistor when it receives a rising clock edge.

From Eqs. (2) and (4),

$$\begin{aligned} C_{p}\dot{V}_{G} &= C_{boot}(\dot{V}_{G} - \dot{V}_{S}) \\ \dot{V}_{S} &= \dot{V}_{G}\left(1 - \frac{C_{p}}{C_{boot}}\right) \\ \therefore \dot{V}_{S} &= \gamma \dot{V}_{G} \ \text{ where } \ \gamma = \left(1 - \frac{C_{p}}{C_{boot}}\right). \end{aligned}$$

Now substituting Eqs. (2) and (5) into Eq. (3),

$$g(V_{G} - V_{S} - V_{t}) - C_{p}\dot{V}_{G} = C_{L}\dot{V}_{S}.$$

Then differentiating with respect to $t$,

$$C_{L}\ddot{V}_{S} + C_{p}\ddot{V}_{G} = g\dot{V}_{G} - g\dot{V}_{S}.$$

Then substituting $\dot{V}_{S} = \gamma\dot{V}_{G}$,

$$\begin{aligned} \gamma C_{L}\ddot{V}_{G} + C_{p}\ddot{V}_{G} &= g\dot{V}_{G} - \gamma g\dot{V}_{G} \\ \therefore \ddot{V}_{G} &= \frac{g - \gamma g}{\gamma C_{L} + C_{p}}\dot{V}_{G} \\ \therefore \frac{1}{\tau} &= \frac{gC_{p}}{C_{L}C_{boot} - C_{L}C_{p} + C_{p}C_{boot}} \\ \therefore \tau &= \frac{C_{boot}}{g} \left\{ \frac{C_{L}}{C_{p}} - \frac{C_{L}}{C_{boot}} + 1 \right\}. \end{aligned} \tag{7}$$

From Eq. (7), the expected operating frequency of the shift register can be calculated using $f = \frac{1}{2\pi\tau}$. The calculated frequency was 116 MHz, which is just over double the maximum frequency obtained by SPICE. This could be due to $g$ being slightly overestimated, as we picked a mid-point value rather than calculating an integrated average value.

## 5. SIMULATION

Simulation of the dynamic shift register was performed with HSPICE using an industrial $0.25\,\mu\mathrm{m}$ CMOS process at a 2.5 V clock voltage. The value of the bootstrap capacitance is determined by the value of $V_{boot}$ required, as shown in Eq. (1), and was chosen to be 15.46 fF.
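Eq. (1) and Eq. (7) are simple enough to evaluate numerically. The sketch below is a minimal illustration, not the authors' calculation: only the 15.46 fF bootstrap capacitance and the 2.5 V clock come from the text, while `C_P`, `C_L` and `K` are assumed placeholder values and the function names are hypothetical, so the printed frequency should not be expected to reproduce the 116 MHz quoted above.

```python
import math


def boot_voltage(c_p: float, c_boot: float, v_clk: float) -> float:
    """Eq. (1): V_boot = C_p / (C_boot + C_p) * V_Clk."""
    return c_p / (c_boot + c_p) * v_clk


def time_constant(c_l: float, c_p: float, c_boot: float, g: float) -> float:
    """Eq. (7): tau = (C_boot / g) * (C_L / C_p - C_L / C_boot + 1)."""
    return (c_boot / g) * (c_l / c_p - c_l / c_boot + 1.0)


def operating_frequency(tau: float) -> float:
    """f = 1 / (2 * pi * tau)."""
    return 1.0 / (2.0 * math.pi * tau)


V_CLK = 2.5          # V, clock high level (from the text)
C_BOOT = 15.46e-15   # F, bootstrap capacitance (from the text)
C_P = 3.5e-15        # F, ASSUMED parasitic capacitance
C_L = 100e-15        # F, ASSUMED load capacitance
K = 200e-6           # A/V^2, ASSUMED transistor gain factor

g = K * V_CLK / 2.0  # Eq. (6): mid-point linear approximation
tau = time_constant(C_L, C_P, C_BOOT, g)

print(f"V_boot = {boot_voltage(C_P, C_BOOT, V_CLK):.3f} V")
print(f"tau    = {tau:.3e} s")
print(f"f      = {operating_frequency(tau) / 1e6:.1f} MHz")
```

The form of Eq. (7) makes the trade-off visible: provided $C_p < C_{boot}$, a larger $C_L$ or a smaller $g$ lengthens $\tau$ and lowers $f$.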
Using a bootstrap capacitance value of 15.46 fF, $\eta$ is calculated to be 0.185, so $V_{boot} = \eta \times 2.5\ \mathrm{V} = 0.46$ V. This provides the voltage necessary to overcome the 0.41 V threshold voltage drop of the transistor. The shift register was simulated with a load inverter equivalent in size to 128 minimum-size load transistors. Fig. 6 shows the operation of the shift register with two non-overlapping clocks at 50 MHz. When the input pulse is clocked at the serial I/P, the output pulses of each stage are as shown in Fig. 6.

Figure 6. Simulation results of the NRBL shift register. The top graph shows the non-overlapping clock pulses at 50 MHz after passing through the clock distribution circuit. The second graph shows an input pulse. The last graph shows the operation of 10 shift register stages.

### 5.1. Distributed Clock Tree

When one counts up all the capacitance in the registers of a large CMOS design, it can lead to large power dissipation when driven in a short time and at a high repetition rate. Long clock lines also introduce substantial resistance, and hence an RC delay in the clock lines. To make the simulations more realistic, a distributed clock tree is used to distribute the capacitance to be driven and to minimise clock skew. A cascaded inverter design was used to size the inverter stages of the clock tree. In order to minimise the total delay, a width factor of $e$ (the base of natural logarithms) is used. Fig. 7 shows an example of a clock distribution circuit.

### 5.2. Power Dissipation

The power dissipation of the dynamic shift register is minimal, as only one stage is on at a time. The total power dissipation is therefore mainly due to the charging and discharging of the capacitance that each stage drives, $C_L$. This is equal to $C_L V_{Clk}^2 f$, where $f$ is the clock frequency. The calculated power dissipation is 0.02 mW; however, this does not take into account the power dissipation due to the distributed clock circuit. Fig.
8 shows simulated results of power dissipation versus operating voltage, with a load of 128 minimum-size transistors and with no load.

Table 1 shows a comparison of the NRBL shift register with two other shift registers: a standard dynamic shift register and Shibata's shift register. The standard dynamic shift register is a simple pass transistor and inverter arrangement; however, two stages are required for a non-inverting output. The shift register proposed by Shibata uses a D flip-flop arrangement. From Table 1 it can be seen that the NRBL shift register has the lowest power dissipation and transistor count.

Figure 7. Distributed clock tree design.

Figure 8. Power dissipation vs clock voltage. This graph represents the average power dissipation of a single stage of the shift register, simulated with full load and with no load.

| | NRBL Shift Reg | Dynamic Shift Reg<sup>1</sup> | Shibata's Shift Reg<sup>10</sup> |
|---|---|---|---|
| Max Speed | 50 MHz | 50 MHz | 100 MHz |
| Transistors/Stage | 4 | 6 | D-FF ($\sim 15$) |
| No. Pwr Supplies | 0 | 1 | 1 |
| Pwr Dissipation per Stage (mW) | 0.27 | 0.50 | 1.5 |

Table 1. Comparison of different shift register designs. Note that the NRBL shift register contains only 4 transistors per stage, requires no dc power supply and has the lowest power dissipation. The power dissipation value obtained for the NRBL shift register was under full load, at 50 MHz.

### 5.3. Temperature Simulation

The temperature dependence of the transistors must be considered when assessing shift register performance. The simulations were performed over the industrially specified temperature range of $-40^{\circ}$C to $85^{\circ}$C. From the simulations, the shift register was found to function correctly within the specified temperature range.

### 5.4. Process Variation Simulations

The fabrication process is a long sequence of chemical reactions that result in device characteristics that follow a normal or Gaussian distribution.<sup>8</sup> Since the dynamic shift register discussed in this paper consists only of n-type transistors, only the fast-n/slow-p and slow-n/fast-p boundary combinations are considered. From the simulated results, the shift register was shown to function correctly under process variations. The simulated outputs are similar to those shown in Fig. 6.

## 6. CONCLUSION

A 4-transistor-per-stage serial-to-parallel dynamic shift register has been designed and simulated with HSPICE using a 2.5 V, $0.25\,\mu\mathrm{m}$ CMOS process. It has been shown to drive a fan-out of 128 with very low power dissipation. Although the proposed dynamic shift register works at a low clock speed of 50 MHz, the performance requirements of the shift register are low power and small area. These requirements can be traded off to increase the register speed.

## 7. ACKNOWLEDGMENTS

Funding from the ARC and the Sir Ross and Sir Keith Smith Fund is gratefully acknowledged.

## REFERENCES

1. C. Mead and L. Conway, *Introduction to VLSI Systems*, Addison-Wesley, Reading, Massachusetts, first ed., 1980.
2. P. Celinski, J.F. López, S. Al-Sarawi and D. Abbott, "Low power, high speed, charge recycling CMOS threshold logic gate," *Electronics Letters* **37**(17), pp. 1067–1077, 2001.
3. A. Moini, A. Bouzerdoum, K. Eshraghian, A. Yakovleff, X.T. Nguyen, A. Blanksby, R. Beare, D. Abbott and R.E. Bogner, "An insect vision-based motion detection chip," *IEEE Journal of Solid-State Circuits* **32**(2), pp. 279–284, 1997.
4. J. Ng and D. Abbott, "Introduction to solid-state quantum computation for engineers," *Microelectronic Journal* **33**, pp. 171–177, 2002.
5. S. Lu, "A safe single-phase clocking scheme for CMOS circuits," *IEEE Journal of Solid-State Circuits* **23**(1), pp. 280–283, 1988.
6. S. Ohba, M. Nakai, H. Ando, S. Hanamura, K. Satoh, K. Takahashi, M. Kubo, S. Shimada and T. Fujita, "MOS area sensor. II. low-noise MOS area sensor with antiblooming photodiodes," *IEEE Transactions on Electron Devices* **ED-27**(8), pp. 1682–1687, 1980.
7. N. Sklavos, P. Kitsos, N. Zervas and O. Koufopavlou, "A new low and high speed bidirection shift register architecture," tech. rep., University of Patras, Patras, Greece, 2000.
8. N.H.E. Weste and K. Eshraghian, *Principles of CMOS VLSI Design, A System Perspective*, Addison-Wesley, Reading, Massachusetts, second ed., 1993.
9. J.M. Rabaey, *Digital Integrated Circuits, A Design Perspective*, Prentice Hall, Upper Saddle River, NJ, first ed., 1996.
10. N. Shibata, M. Watanabe and Y. Tanabe, "A current-sensed high-speed and low-power first-in-first-out using a wordline/bitline-swapped dual-port SRAM cell," *IEEE Journal of Solid-State Circuits* **37**(6), pp. 735–750, 2002.