Wednesday, May 1, 2013

PIPELINING & RISCs

Pipelining, what is it?

Pipelining and parallelism are two methods used to achieve concurrency.
Pipelining increases concurrency by dividing a computation into a number of steps.
Pipelining is a key implementation technique used to build fast processors and can be seen in RISC architectures. It allows the execution of multiple instructions to overlap in time. In non-pipelined processing, by contrast, the next data/instruction is processed only after the processing of the previous data/instruction is entirely complete.

Parallelism, by contrast, is the use of multiple resources to increase concurrency.

A two-stage instruction pipeline (fetch stage and execute stage) would look as follows:

-If we assume that the fetch and execute stages require the same amount of time, and
-if the computer has two hardware units, one for fetching instructions and the other for executing them (what is the implication?),
then the fetch and execute operations can each be completed in one clock cycle.
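To make the overlap concrete, here is a minimal sketch (not part of the original notes) that prints the timing of such a two-stage pipeline, assuming one clock cycle per stage and separate fetch and execute units:

def two_stage_timeline(n):
    """Show which instruction is in the Fetch and Execute stage each cycle."""
    total_cycles = 2 + (n - 1)                       # k + (n - 1) cycles with k = 2
    for cycle in range(1, total_cycles + 1):
        fetching = f"F(I{cycle})" if cycle <= n else "-"
        executing = f"E(I{cycle - 1})" if cycle >= 2 else "-"
        print(f"cycle {cycle}: {fetching:7} {executing}")

two_stage_timeline(4)
# cycle 1: F(I1)   -
# cycle 2: F(I2)   E(I1)
# cycle 3: F(I3)   E(I2)
# cycle 4: F(I4)   E(I3)
# cycle 5: -       E(I4)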


PIPELINE PERFORMANCE

n: number of instructions
k: number of stages in the pipeline
t: cycle time (time in seconds needed to advance a set of instructions one stage through the pipeline)
Tk: total time with pipelining
T0: total time without pipelining



The formula for calculating total time for equal stages

For n instructions, with k equal stages and delay of t for each stage
Total time with no pipelining, T0 = nkt
Total time with pipelining, Tk = (k + (n-1))t

The formula for calculating speedup 
Speedup of a k-stage pipeline for n instructions :
S = T0 / Tk
  = nkt / ((k + (n-1))t)
  → k (for large n)
Note: the equations for T0 and Tk above assume that every stage takes the same time t. If the stage times are not equal, each stage i from 1 to k has its own time ti, so T0 = n(t1 + t2 + ... + tk) and Tk = (t1 + t2 + ... + tk) + (n-1)*max(ti), where max(ti) is the time of the slowest stage.
Example 1: Based on the 2 stage timing diagram
With k=2 stages, n=9 and t=1:
T0 = 9*2*1 = 18, Tk=(2+(9-1)) *1=10
S= 18/10 = 1.8 times
Example 2: Based on the 6 stage timing diagram
With k=6 stages, n=9 and t=1:
T0 = 9*6*1= 54, Tk=(6+(9-1))*1=14
S= 54/14 = 3.857 times
We can see that increasing the number of stages, with the other values kept the same, improves the speedup.

Example 3: Based on the 3 unequal stages example
With k=3 stages (unequal), n=4 and stage times t1=3, t2=3, t3=2:
T0 = n(t1 + t2 + t3) = 4*(3+3+2) = 32
Tk = (t1 + t2 + t3) + (n-1)*max(ti)
   = (3+3+2) + (3)(3)
   = 8 + 9
   = 17
S = 32/17 = 1.88
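A minimal Python sketch of these formulas (illustrative only, not from the original notes); it reproduces the three speedup results above:

def totals_equal(n, k, t):
    """Total times when all k stages take the same time t."""
    t0 = n * k * t                     # no pipelining: each instruction passes through all k stages alone
    tk = (k + (n - 1)) * t             # pipelining: fill the pipe once, then finish one instruction per cycle
    return t0, tk

def totals_unequal(n, stage_times):
    """Total times when each stage i has its own time t_i."""
    t0 = n * sum(stage_times)                            # n * (t1 + t2 + ... + tk)
    tk = sum(stage_times) + (n - 1) * max(stage_times)   # first instruction, then one result per slowest-stage time
    return t0, tk

def speedup(t0, tk):
    return t0 / tk

print(speedup(*totals_equal(9, 2, 1)))          # 1.8    (Example 1)
print(speedup(*totals_equal(9, 6, 1)))          # ~3.857 (Example 2)
print(speedup(*totals_unequal(4, [3, 3, 2])))   # ~1.88  (Example 3)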



Calculating throughput
Pipelined throughput = n / Tk(n)
Non-pipelined throughput = n / T0(n)
where n = total number of instructions,
T0(n) is the total time without pipelining and Tk(n) is the total time with pipelining.
Both throughputs are measured in instructions per unit time.


Example 1: Based on the 6 stage timing diagram

n = 9, k = 6, t = 1
Non-pipelined throughput = 9 / T0(9) = 9 / (9*6*1) = 9/54 ≈ 0.17
Pipelined throughput = 9 / Tk(9) = 9 / ((6+(9-1))*1) = 9/14 ≈ 0.64
Example 2: Based on the 3 unequal stages example



With k=3 stages (unequal), n=4:
Non-pipelined throughput = 4 / T0(4) = 4 / (4*(3+3+2)) = 4/32 = 0.125
Pipelined throughput = 4 / Tk(4) = 4 / ((3+3+2) + (3)(3)) = 4/17 ≈ 0.235
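A quick check of both throughput examples in Python (a minimal sketch using the same illustrative numbers as above):

def throughput(n, total_time):
    return n / total_time              # instructions per unit time

print(throughput(9, 54), throughput(9, 14))   # non-pipelined ~0.17 vs pipelined ~0.64 (6 equal stages)
print(throughput(4, 32), throughput(4, 17))   # non-pipelined 0.125 vs pipelined ~0.235 (3 unequal stages)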
To increase the speedup, the most direct way is to increase the number of stages.

Factors that limit the performance of pipelining
Unequal duration/delay of stages
Conditional branch instructions or interrupts.
Example:
Instruction 3 is a conditional branch to instruction 15.
No instructions are completed during time units 9-12; this is the performance penalty incurred because we could not anticipate the branch.
Flushing of the pipeline
Pipelined operation cannot be maintained in the presence of branch or jump instructions.

There are three types of hazards that limit pipelining:
-Resource hazards: the hardware cannot support this combination of instructions (like a single person having to fold and put clothes away while running the washer-dryer)
-Data hazards: an instruction depends on the result of a prior instruction that is still in the pipeline
-Data dependency example (a hazard-detection sketch follows this list):
A = B + C
D = E + A
C = G x H

-Control hazards: caused by the delay between fetching instructions and making decisions about changes in control flow (branches and jumps).
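As a small illustration of the data-hazard case, the sketch below (a hypothetical representation, not from the notes) flags the read-after-write dependency in the sequence above, where D = E + A needs A before A = B + C has left the pipeline:

# Each instruction is written as (destination, list of source operands).
instructions = [
    ("A", ["B", "C"]),   # A = B + C
    ("D", ["E", "A"]),   # D = E + A  -- reads A, which is still being computed
    ("C", ["G", "H"]),   # C = G x H
]

for (prev_dest, _), (curr_dest, curr_sources) in zip(instructions, instructions[1:]):
    if prev_dest in curr_sources:
        print(f"RAW data hazard: {curr_dest} needs {prev_dest} before the pipeline has written it")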
How may the hazards be prevented?


INPUT OUTPUT INTERFACE

                                                                I/O INTERFACE




I/O modules are another key element of the computer system: each module acts as an interface to the system bus or central switch and controls one or more peripheral devices.
In other words, an I/O module handles the communication between the peripherals and the bus.


 Why are the peripherals not connected to the system bus directly?
o    Peripherals have different ways of operating; it is not practical to ask the processor to control all of the different devices.
o    The data transfer rate of most devices is much slower than that of the memory or processor, so it is not practical to link the high-speed bus directly to these devices.
o    Some devices may be faster than the bus, so a direct link is again not practical.
o    Peripherals may use different data formats and word lengths than the computer they are attached to.


Functions of an I/O module



I/O Operations
Three techniques are possible for I/O operation:
1.       Programmed I/O
o    Data are exchanged between the processor and the I/O module.
o    When the processor is executing a program and gets an instruction related to I/O, it issues a command to the respective I/O module.
o    The I/O module will perform the requested operation.
o    The I/O module will not alert or interrupt the processor once the operation is done.
o    It is the responsibility of the processor to keep checking the status of the I/O module to see if the operation is complete (see the polling sketch after this list).

2.     Interrupt driven I/O
o    This method overcomes CPU waiting time.
o    The processor issues an I/O command to a module and then goes on to do some other useful work.
o    The I/O module will then interrupt the processor to request service when it is ready to exchange data with the processor.
o    The processor then executes the data transfer and then resumes its processing.

3.       Direct Memory Access (DMA)
o    When large volumes of data are to be moved between memory and the peripherals, an efficient technique is Direct Memory Access (DMA).
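A minimal sketch of the programmed I/O idea (the module and its status flag are made-up stand-ins, not a real device API): the processor issues the command and then does nothing but poll the status until the operation completes.

class IOModule:
    """Hypothetical I/O module with a command register and a busy flag."""
    def __init__(self):
        self.busy = False
        self._polls_left = 0
    def issue_command(self, command):
        self.busy = True               # the module starts working on the request
        self._polls_left = 3           # pretend it finishes after a few status checks
    def status_ready(self):
        if self.busy:
            self._polls_left -= 1
            if self._polls_left <= 0:
                self.busy = False      # operation done, but the module never interrupts the CPU
        return not self.busy

module = IOModule()
module.issue_command("read block 0")
while not module.status_ready():       # programmed I/O: the CPU busy-waits on the status
    pass
print("transfer complete (the CPU did no other work while waiting)")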

DMA Function
o    DMA involves an additional module on the system bus.
o    The DMA module is capable of imitating the processor and taking over control of the system from the processor.
o    It needs to do this to transfer data to and from memory over the system bus.


External Interface
·         Used to connect devices to the I/O module.
·         A common characteristic of an interface is whether it is serial or parallel.
·         Parallel Interface
o   Multiple lines connect the peripheral and the I/O module. So multiple bits are transferred at once.
o   Used in high-speed peripherals such as hard disks, printers, and scanners.
·         Serial Interface
o   There is only one line to transmit data and bits are transferred one at a time.
o   Used in slower devices such as the mouse and keyboard.
  











BUSES


BUS



what is a BUS?
A bus is a communication pathway connecting two or more devices.
If you take a motherboard and look at its underside, you can see lines connecting the components; those lines are buses.



How does a Bus work?
·         To send data to another device
o   First the module must get permission to use the bus; only then can it send the data.
·         To request data from another device
o   First the module must get permission to use the bus, then send a request signal to the other device, and finally wait for that device to send the data.

For more information on the bus, take a look at the mind map below.







The system bus consists of:
Data bus - carries data, instructions and addresses between main memory and the ALU.
Control bus - carries signals such as interrupt, timing and acknowledgement to and from other components.
Address bus - used to transfer an address from the PC to memory.



BUS HIERARCHY



The Characteristics of a BUS




The performance of a bus can be measured using these factors:
o   Transfer time
§  The amount of time taken for data to be delivered in a single transaction
o   Bandwidth
§  Measures the capacity of the bus: how much data the bus can send at one time
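As a rough worked example (the numbers are illustrative, not from the notes): a 32-bit wide bus that completes one transfer per cycle at 100 MHz moves 4 bytes per transfer, so its peak bandwidth is about 400 MB/s.

bus_width_bits = 32           # number of data lines (illustrative)
clock_rate_hz = 100_000_000   # assume one transfer per clock cycle (illustrative)

bytes_per_transfer = bus_width_bits // 8
bandwidth = bytes_per_transfer * clock_rate_hz
print(bandwidth / 1_000_000, "MB/s")    # 400.0 MB/s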








Instruction Sets: Addressing Modes and Formats



Addressing Modes and Formats



Immediate Addressing


Direct Addressing








Indirect Addressing 







Register Addressing






Register Indirect Addressing

The operand is in the memory cell pointed to by the contents of register R.
Advantages and limitations are basically the same as for indirect addressing.
The address-space limitation of the address field is overcome by having that field refer to a word-length location containing an address.
Uses one less memory reference than indirect addressing.
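A minimal sketch of register indirect addressing (the register names, addresses and contents are made up for illustration):

memory = {0x200: 42}            # the operand lives at address 0x200
registers = {"R1": 0x200}       # R1 holds the operand's address

effective_address = registers["R1"]      # EA = (R1): one register read, no extra memory reference
operand = memory[effective_address]      # single memory reference fetches the operand
print(hex(effective_address), operand)   # 0x200 42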



Displacement Addressing

Combines the capabilities of direct addressing and register indirect addressing
EA = A + (R)
The address fields hold two values:
A = base value
R = register that holds displacement
or vice versa
Most common uses:
Relative addressing
Base-register addressing
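A minimal sketch of the EA = A + (R) rule (illustrative values): using the program counter as R gives relative addressing, while using a base register gives base-register addressing.

memory = {0x10C: "operand"}
registers = {"PC": 0x100}       # R = PC here, so this is relative addressing

A = 0x00C                        # displacement taken from the instruction's address field
effective_address = A + registers["PC"]                     # EA = A + (R)
print(hex(effective_address), memory[effective_address])    # 0x10c operand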








Memory







One type of memory is RAM, which itself has two types: dynamic RAM and static RAM.




Another type of memory is ROM.




INTRODUCTION TO CACHE

CPU requests contents of memory location
Check cache for this data
If present, get from cache (fast) -> known as Cache Hit
If not present, read required block from main memory to cache -> known as Cache Miss
Then deliver from cache to CPU
Cache includes tags to identify which block of main memory is in each cache slot
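A minimal sketch of this lookup sequence, assuming a tiny direct-mapped cache where a block maps to slot (block number mod number of slots) and the tag records which block currently occupies the slot (sizes and contents are illustrative):

NUM_SLOTS = 4
cache = [None] * NUM_SLOTS                                    # each entry holds (tag, block data)
main_memory = {block: f"data of block {block}" for block in range(16)}

def read(block):
    slot = block % NUM_SLOTS
    tag = block // NUM_SLOTS
    if cache[slot] is not None and cache[slot][0] == tag:
        print(f"block {block}: cache hit (slot {slot})")      # present: deliver from cache (fast)
    else:
        print(f"block {block}: cache miss, filling slot {slot}")
        cache[slot] = (tag, main_memory[block])               # read the block from main memory into the cache
    return cache[slot][1]                                     # then deliver from the cache to the CPU

read(5)    # miss: block 5 goes to slot 1 with tag 1
read(5)    # hit: same block, tag matches
read(9)    # miss: block 9 also maps to slot 1 but has tag 2, so it replaces block 5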



A typical view of the cache organization