Tuesday, December 24, 2013

Faulty Clock Gating: How "Not" to Gate the Clock

You would come across a plethora of technical literature on clock gating and it's associated techniques. It does not come as a surprise because clock gating is the most commonly employed design technique to save dynamic power. However, many implementations are faulty, in the sense that while they indeed gate the clock, but the result in an overall increased dynamic power consumption. We would discuss one such common technique, which obviates all the power saving benefits of clock gating. You are advised to use your discretion before using it.
The basic rationale behind clock gating:
  • Even when the output of a flip-flop is not toggling, owing to the transitions (and hence charging/discharging of nodes) in the internal circuitry of the flop-flop, it still continues to dissipate dynamic power when it is being fed by a clock signal.
  • When the input of the flip-flop is not toggling or would not toggle, one can effectively gate the clock to that flip-flop for that particular time and save dynamic power. 
One logical implementation for the above problem statement (and this is indeed the implementation employed in many technical papers and patents) is depicted below:
Let's take a look at the above implementation. The XOR gate between the D input and the Q output of the flip-flop has been used as the enable signal for the clock gate CGIC. The logical explanation behind this is: when the output of the flop is same as input, which would be detected by XOR'ing the two, one can gate the clock to the clock gate.
Example: Let's say initially Q =1. Now D = 1, which means that t he output of the flop is destined to stay at "1" for the next cycle as well. XOR'ing these two signals: Q XOR D = 0, EN = 0 would gate the clock to the flip-flop. So, would that save power? Well, one would expect it that way. Let's take a look at why it would result in an increased power dissipation.

The circuit shown above is a trap! The actual circuit would be something like the one shown below:

  • As evident from the above figure, the  XOR gate would continue to toggle for the entire time period of the clock and would become stable only "setup time" before the next clock edge. And during this entire duration, it would continue dissipating dynamic power. You might argue here that the power dissipated must be less than the power dissipated by an idle flop receiving clock. Well, that might be true for some technology, but XOR gate is the most bulky gate (among all primitive gates) and I would say that this power, if not less, would at least be comparable to that of an idle flop receiving a clock signal.
  • Secondly, the circuit above uses a CGIC. Note that CGIC comprises off one latch and an AND gate, while a flop comprises  of two latches. The internal circuitry of the CGIC would continue to charge/discharge and hence dissipate power.
The sum of the above two power dissipation would over-shadow the benefits one was expecting in the first place, and hence it  is a common design trap. Beware of it.