Wednesday, May 1, 2013

PIPELINING & RISCs

Pipelining, what is it?

Pipelining and parallelism are 2 methods used to achieve concurrency.
Pipelining increases concurrency by dividing a computation into a number of steps.
Pipelining is a key implementation technique used to build fast processors that can be seen in RISC architecture. It allows the execution of multiple instructions to overlap in time.In a non-pipelined processing, by contrast, the next data/instruction is processed after the entire processing of the previous data/instruction is complete.

---Parallelism is the use of multiple resources to increase concurrency

                                         A Two stage Instruction Pipeline will look as so

-If we assume that the fetch and execute stages require the same amount of time, and
-If the computer has two hardware units, one for fetching instructions and the other for executing them (what is the implication?).
The fetch and execute operations can each be completed in one clock cycle


PIPELINE PERFORMANCE

n:number of instructions
k: stages in pipeline
t : cycletime (time in seconds needed to advance a set of instructions one stage through the pipeline)
Tk: total time for pipelining
T0 : total time without pipelining 



The formula for calculating total time for equal stages

For n instructions, with k equal stages and delay of t for each stage
Total time with no pipelining, T0 = nkt
Total time with pipelining, Tk = (k + (n-1))t

The formula for calculating speedup 
Speedup of a k-stage pipeline for n instructions :
  S   = T0 / Tk
             = nk t  / ((k+(n-1)) t
            à k (for large n)
Note: the equations for T0  and Tk  are used here where at the time for every stage is equal. If the time t is not equal, it means that each stage from 1 to k would have the time(s), like    ,    ….  
Example 1: Based on the 2 stage timing diagram
With k=2 stage, n=9:
T0 = 9*2*1 = 18, Tk=(2+(9-1)) *1=10
S= 18/10 = 1.8 times
Example 2: Based on the 6 stage timing diagram
With k=6 stage, n=9:
T0 = 9*6*1= 54, Tk=(6+(9-1))*1=14
S= 54/14 = 3.857 times
We can see from here indeed increasing the number of stages with the other values being same, speedup improves.

Example 3: Based on the 3 unequal stages example
With k=3 stage (uneven), n=4:
To = n.     = 4*(3+3+2) = 32
Tk =            + (n-1).   
   = (3+3+2)+ (3)(3)
    = 8 + 9
    = 17
S = 32/17 = 1.88 



Calculating throughput
Pipelined Throughput= n/Tk (n)
Non-pipeline Throughput = n/To (n)
Where n= total no of instructions
To (n) is the Total Time for Non-pipelining, Tk (n) is the Total Time for Pipelining
Both throughputs is in instructions/per unit time(s) 


Example 1: Based on the 6 stage timing diagram

n= 9, k=6
Non Pipelined Throughput   = 9/To (9),
       = 9/ 9*6*1
      = 9/54
       = 0.17
Pipelined Throughout   = 9/ Tk (9)
             =9/((6+(9-1))*1)
                          =9/14
                                     = 0.64
Example 2: Based on the 3 unequal stages example



With k=3 stage (uneven), n=4:
Non Pipelined Throughput   = 4/To (4)
                                            = 4/(4*(3+3+2) )
       = 4/32
                    = 0.125
Pipelined Throughout   =4/ Tk (4)
             = 4/ ((3+3+2)+ (3)(3))
              = 4/17
                                    = 0.235
in order to increase speedup the best way is to increase the number of stages.

Factors that limits the performance of pipelining
Unequal duration/delay of stages
Conditional branch instruction or interrupts. 
Ex:
Instruction 3 is a conditional branch to instruction 15
No instructions completed during time units 9-12. This is performance penalty incurred because we could not anticipate the branch
Flushing of pipeline
Pipelined operation cannot be maintained in the presence of branch or jump instructions.

There are 3 types of hazzards to limitation of pipelining
-Resource hazards : HW cannot support this combination of instructions (single person to fold and put clothes away, washer-drier)
-Data hazards: Instruction depends on result of prior instruction still in the pipeline
-Data dependencies example
A = B + C
D = E + A
C = G x H

-Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).
How may the hazards be prevented?


No comments:

Post a Comment