## Sim #3 – 8 CPU + Arbiter Simulation

- Simulations can be used to obtain quantitative results for values that have no closed-form solution or that are difficult to predict
- Sim #3 targets a simulation of 8 CPUs plus an arbiter on a shared bus.





## Fixed Priority vs Round Robin Priority


- A priority scheme is the method for selecting a CPU when bus requests arrive simultaneously
- A fixed priority scheme always uses the same priority, based on the bus request number
  - The arbiter in this simulation assigns CPU#0 the highest priority and CPU#7 the lowest
  - The disadvantage of fixed priority is that the lowest-priority CPU can starve under high bus contention
  - The advantage is simplicity
- A round robin scheme rotates priority after every IO transaction
  - The idea is that each CPU spends equal time at the highest priority
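The two schemes can be compared with a small behavioral sketch. This is Python for illustration only, not the VHDL arbiter from the lab archive. The function names are hypothetical, and the rotation rule (the CPU after the last winner becomes the highest priority) is an assumption; the slides only say that priority rotates after every IO transaction.

```python
def fixed_priority(requests):
    """Return the lowest-numbered requesting CPU, or None if nobody requests."""
    for cpu, req in enumerate(requests):
        if req:
            return cpu
    return None


def round_robin(requests, top):
    """Return the first requesting CPU at or after `top`, wrapping around."""
    n = len(requests)
    for offset in range(n):
        cpu = (top + offset) % n
        if requests[cpu]:
            return cpu
    return None


def next_top(winner, n):
    """Assumed rotation rule: the CPU after the last winner gets top priority."""
    return (winner + 1) % n


# Heavy contention: all 8 CPUs request the bus on every transaction.
requests = [True] * 8

fixed_wins = [fixed_priority(requests) for _ in range(8)]

top, rr_wins = 0, []
for _ in range(8):
    winner = round_robin(requests, top)
    rr_wins.append(winner)
    top = next_top(winner, len(requests))

print(fixed_wins)  # CPU 0 wins every transaction, so CPU 7 starves
print(rr_wins)     # every CPU wins exactly once
```

Under constant contention the fixed scheme never grants the bus to CPU#7, while the rotating scheme serves each CPU once per eight transactions.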

2/6/2002

BR




## A Question

- For transfer size = 8 clocks, at what bus utilization does the difference in bus transfers between the highest and lowest priority CPUs exceed 20% in a fixed priority scheme?
- To answer the above question, need to simulate the system at different levels of bus utilization
  - The more IO requests a CPU makes, the higher the bus utilization
- Must measure the bus utilization for a fixed number of clocks
- Must record the number of IO transfers that each CPU makes during the simulation
- The Sim#3 assignment lists other questions that must be answered
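Once a sweep of runs is available, the 20% question reduces to a search over the recorded data. The sketch below is in Python with hypothetical names. It assumes the difference is measured relative to the highest-priority CPU's transfer count, which the slides do not specify. The numbers in `runs` are made-up placeholders to exercise the function, not simulation results.

```python
def pct_difference(high_ios, low_ios):
    """Shortfall of the low-priority CPU relative to the high-priority CPU, in percent.

    Assumption: the reference value is the high-priority CPU's count.
    """
    return 100.0 * (high_ios - low_ios) / high_ios


def first_crossing(runs, threshold=20.0):
    """Return the lowest bus utilization whose difference exceeds `threshold`.

    runs: iterable of (bus_util_percent, cpu0_ios, cpu7_ios) tuples, any order.
    """
    for util, high, low in sorted(runs):
        if pct_difference(high, low) > threshold:
            return util
    return None


# Placeholder data only (NOT results from the lab models).
runs = [(15, 25, 24), (60, 100, 90), (85, 150, 110)]
print(first_crossing(runs))
```

The answer can only be as fine as the spacing of the simulated utilization points, so more runs near the crossing give a tighter estimate.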


## CPU, Arbiter Model Generics

- The ZIP archive attached to the lab contains the arbiter, CPU, testbench, and configuration models
- Arbiter generic ROUND\_ROBIN controls whether the rrobin or fixed priority scheme is used
- CPU generics:
  - RND\_SEED: a number between 1 and 50 that is used to select a starting random seed value contained in the rnd2 package
  - CPU\_ID: identifies this CPU and is the number placed on the address bus when this CPU is bus master
  - CLK\_MAX: when the total number of clocks seen thus far equals this value, the CPU should halt all activity and assert its active output to the 'Z' value. The active signal in the testbench has a weak pullup ('H') on it; when this signal transitions from '0' to 'H', all CPUs have stopped.
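The way the shared active signal behaves can be sketched by modeling signal resolution for just the three values involved. This Python sketch covers only '0', 'H', and 'Z' (a forcing '0' overrides a weak 'H', which overrides 'Z'); it is not the full nine-value std\_logic resolution. It also assumes that a running CPU drives '0' on its active output, which the '0' to 'H' transition implies but the slides do not state outright.

```python
# Relative drive strength for the three values used on the active signal.
STRENGTH = {"Z": 0, "H": 1, "0": 2}


def resolve(drivers):
    """Resolve the signal value from all drivers (subset of std_logic rules)."""
    return max(drivers, key=STRENGTH.__getitem__)


PULLUP = "H"  # the weak pullup in the testbench

# Assumption: a running CPU drives '0'; a halted CPU releases the line to 'Z'.
some_running = [PULLUP, "Z", "0", "Z"]
all_halted = [PULLUP, "Z", "Z", "Z"]

print(resolve(some_running))  # '0': at least one CPU is still active
print(resolve(all_halted))    # 'H': every CPU has stopped
```

The signal therefore acts as a wired indicator: it stays at '0' until the last CPU reaches CLK\_MAX.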


## CPU request rate Generic

- The request rate generic controls the number of IO requests a CPU makes: the higher this number, the more IO requests the CPU should make.
  - The more IO requests, the higher the bus utilization
- The CPU model has a finite state machine; the local state represents the clocks in which the CPU is not making an IO request
  - The more clocks spent in the local state, the fewer IO requests are made
- Declare a boolean array called req\_array that has 2000 elements
  - For each clock spent in the local state, increment a pointer (index) into req\_array
  - If req\_array[index] = TRUE, then make an IO request
  - Initialize req\_array such that request\_rate number of values are TRUE, and use a random number generator to pick these locations in req\_array.
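The req\_array mechanism can be sketched behaviorally as follows. This is Python rather than VHDL, and it uses Python's `random` module in place of the rnd2 package. Wrapping the index back to 0 at the end of the array is an assumption, and the walk treats every clock as a local-state clock (it ignores clocks spent waiting for or using the bus).

```python
import random

ARRAY_SIZE = 2000  # number of elements in req_array, per the slide


def make_req_array(request_rate, seed):
    """Return a boolean list with exactly `request_rate` TRUE entries."""
    rng = random.Random(seed)
    req_array = [False] * ARRAY_SIZE
    # Sampling without replacement guarantees distinct locations.
    for index in rng.sample(range(ARRAY_SIZE), request_rate):
        req_array[index] = True
    return req_array


req_array = make_req_array(request_rate=5, seed=1)

# Walk the array for 10000 local-state clocks (simplification: no bus time).
index = 0
requests = 0
for _ in range(10000):
    if req_array[index]:
        requests += 1  # the CPU would issue an IO request here
    index = (index + 1) % ARRAY_SIZE  # assumed wrap-around

print(sum(req_array), requests)
```

If the random generator can return the same location twice, the initialization has to retry on collisions; otherwise fewer than request\_rate entries end up TRUE.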



## Collecting Statistics, Printing Results

- You will need to add a VHDL package of your own that defines the shared variables needed to collect any statistics required to answer the questions
- Also need to print out a report once the specified total number of clocks has been reached (I will use these numbers as a rough sanity check on your model)

Sample report:

    # CPU0 TClks: 10000, TIOs: 25, TLatency: 57, LatPerIO: 2.280000e+00
    # CPU1 TClks: 10000, TIOs: 24, TLatency: 78, LatPerIO: 3.250000e+00
    # CPU2 TClks: 10000, TIOs: 24, TLatency: 55, LatPerIO: 2.291667e+00
    # CPU3 TClks: 10000, TIOs: 24, TLatency: 64, LatPerIO: 2.666667e+00
    # CPU4 TClks: 10000, TIOs: 24, TLatency: 59, LatPerIO: 2.458333e+00
    # CPU5 TClks: 10000, TIOs: 22, TLatency: 107, LatPerIO: 4.863636e+00
    # CPU6 TClks: 10000, TIOs: 24, TLatency: 63, LatPerIO: 2.625000e+00
    # CPU7 TClks: 10000, TIOs: 24, TLatency: 57, LatPerIO: 2.375000e+00
    # TransferSize: 8 ReqRate: 5 %busUtil: 15%
    # AvgIOs: 23 AvgTotalLatency: 67 AvgLatencyPerIO: 2.913043e+00
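The per-CPU report line is just the accumulated counters plus one derived value. A minimal Python sketch of that calculation follows; the function name is hypothetical, and in the actual lab the counters would live in shared variables in your own VHDL package.

```python
def report_line(cpu_id, total_clocks, total_ios, total_latency):
    """Format one per-CPU report line; latency per IO is derived from the totals."""
    # Guard against a CPU that made no IO requests at all.
    lat_per_io = total_latency / total_ios if total_ios else 0.0
    return (f"CPU{cpu_id} TClks: {total_clocks}, TIOs: {total_ios}, "
            f"TLatency: {total_latency}, LatPerIO: {lat_per_io:e}")


line = report_line(0, 10000, 25, 57)
print(line)  # CPU0 TClks: 10000, TIOs: 25, TLatency: 57, LatPerIO: 2.280000e+00
```

Checking that TLatency divided by TIOs reproduces LatPerIO on every line is a quick way to catch counter bugs.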


## Plots

- Plotting average IO latency vs. bus utilization, and IO transfers vs. bus utilization, provides the answers to the questions.
- May need to change scale on Bus Utilization axis to get higher resolution in some cases.

[Plot legend: CPU0\_3, CPU7\_3, CPU0\_X]

## Sanity Checks

- Please do simple sanity checking on your statistics
- Bus utilization < 100%
- For low request rates, there is little bus contention, so:
  - Latency per IO should be close to 2
  - Number of IO transfers made by each CPU should be close to (Total Clocks)/2000 \* req\_rate
  - Bus utilization will be close to (Number of IOs per CPU \* Transfer size \* # of CPUs)/Total\_clocks \* 100%
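The low-contention estimates can be worked through numerically. The Python sketch below (function names are hypothetical) plugs in the parameters of the sample report: 10000 clocks, request rate 5, transfer size 8, and 8 CPUs.

```python
def expected_ios(total_clocks, req_rate, array_size=2000):
    """Expected IO transfers per CPU when contention is low."""
    return total_clocks / array_size * req_rate


def expected_bus_util(ios_per_cpu, transfer_size, num_cpus, total_clocks):
    """Expected bus utilization in percent when contention is low."""
    return ios_per_cpu * transfer_size * num_cpus / total_clocks * 100.0


ios = expected_ios(total_clocks=10000, req_rate=5)
util = expected_bus_util(ios, transfer_size=8, num_cpus=8, total_clocks=10000)
print(ios, util)  # 25 IOs per CPU and about 16% utilization
```

These estimates sit close to the sample report, which shows 22 to 25 IOs per CPU and 15% bus utilization.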


## Regression Testing

- Multiple simulation runs have to be performed with different values of request\_rate, transfer\_size and priority scheme.
  - This is known as regression testing, and it should be automated to save time
  - Automation is usually done via an external scripting language such as Perl
- The zip archive contains a Perl script called *sim3\_sol.pl* that can be used for this.
  - Look at the comments in the Perl script for usage directions
  - The script reads a template file called sim3/cfg\_tb.template that contains placeholders for model generic values and produces a new cfg\_tb.vhd file with actual values substituted for the model generic values
  - The number of simulation runs is determined by the parameter specification in sim3\_sol.pl; feel free to modify this script to suit your needs or write your own in your favorite scripting language.
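If you write your own driver instead of using the Perl script, the template-substitution step might look like the Python sketch below. The placeholder names (`$request_rate`, `$transfer_size`, `$round_robin`), the generic names in the template text, and the inline template itself are all hypothetical; the real sim3/cfg\_tb.template may use a different placeholder syntax, so check it before adapting this.

```python
from itertools import product
from string import Template

# Hypothetical stand-in for the contents of the template file.
TEMPLATE = Template(
    "-- generated configuration fragment (placeholder names are hypothetical)\n"
    "generic map (REQUEST_RATE => $request_rate,\n"
    "             TRANSFER_SIZE => $transfer_size,\n"
    "             ROUND_ROBIN => $round_robin)\n"
)


def render_configs(request_rates, transfer_sizes, schemes):
    """Return one (parameters, rendered_text) pair per parameter combination."""
    configs = []
    for rate, size, rr in product(request_rates, transfer_sizes, schemes):
        text = TEMPLATE.substitute(
            request_rate=rate,
            transfer_size=size,
            round_robin=str(rr).lower(),  # VHDL boolean literal: true/false
        )
        configs.append(((rate, size, rr), text))
    return configs


configs = render_configs([5, 50], [8], [False, True])
print(len(configs))  # 4 combinations: 2 rates x 1 size x 2 schemes
```

A full driver would write each rendered text to cfg\_tb.vhd, recompile, run the simulator, and collect the report before moving to the next combination.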


## Report, Model Checkoff

- Include your graphs and answers to questions in a file called 'report.pdf'.
  - If you need to expand portions of the graph to get the required answers, then do so.
  - I don't expect answers past one decimal place (e.g., 3.5). I do expect answers with at least this fidelity ("about 4" is not acceptable).
  - You need to illustrate, either via the graphs or model numerical output, how you got your answers.
  - If you give me an answer without justification, I will count it as wrong.
- I will run your simulation with my own values for request\_rate, transfer\_size, and priority scheme and examine your model output.
  - I don't expect your numbers to match mine exactly, but they should be reasonably close.
