Last Name: 1) Complete the following classification of computer systems. | 79 | | Instruction Streams | | |-----------------|----------|---------------------|----------| | | | Single | Multiple | | Data<br>Streams | Single | SISD | SIMD | | | Multiple | MI 5D | MIMD | - 2) According to Amdahl's law, what would the maximum speed up of a parallel system be with a code which 85% of it can be parallelized and remaining has to be executed sequentially, under - a. 10 processors - b. 100 processors - c. Infinite number of processors Comment on what would an appropriate number of processors that should be dedicated to such a program? $$s = \frac{1}{0.15 + .85} = 4.26$$ $$c. s = \frac{1}{.15} = 6.67$$ Last Name: The figure below depicts the memory system. - a. How many virtual pages numbers does the system have and how many bytes is each page? b. Identify type of cache architecture diret block 512e 24 - b. Identify type of cache architecture c. Complete the cache architecture by connecting the dash lines to appropriate physical - address. - d. Identify how many bits is each connection marked by "?". | T | ~ | 7 4 | 4 | - | |-----|------|-----|---|---| | H . | 1 +1 | - 4 | 1 | 7 | | | | | | | # Quiz #22 Dr. Izadi First NAME: Last Name: 1) Design a 2 way set associative cache with the following parameters: • Address size: 32 bits • Cache size: 16 KB Word size: 4 Bytes Complete the partially drawn diagram and indicate offset, index, and tag bits. Also, show the number of cache row entries. 2) The following is a series of address references given as word addresses: 8, 4, 20, 4, 16, 8, 4, 16, 4, 20, 4, 24. Assume two-way set associative cache with a word size of 1 byte and a total cache size of 8 bytes. Show the hits and misses and final cache contents and the final cache content. Assume FIFO (first in first out) replacement strategy. 0000 | Location | Hit/Miss? | |----------|-----------| | 8 | M | | 4 | M | | 20 | M | | 4 | 1+ | | 16 | M | | 8 | M | | 4 | M | | 16 | M | | 4 | H | | 20 | M | | 4 | M<br>M | | 24 | M | | ۸ . ۸. | 16 | 5 | |--------|------|------| | AIA | 8208 | 8464 | | 01 | | | | 10 | | | | )) | | | 8268164 41642624 - 3) Assume an instruction cache miss rate for an application is 1% and the data cache miss rate of 2%. Assume further that our CPU is running at 4 GHz and has a CPI of 2 without any memory stalls. The main memory access time is 50 ns. - a. Determine the overall CPI with the indicated misses, provided the frequency of all loads and stores in the application is 30%. - b. Suppose we like to add a second level cache with an access time of 4 ns, which has an instruction miss rate of .4% and data cache miss rate of .6%. Determine the overall CPI. b. $$CPI = 2 + 16 * . 01 + 16 * . 02 * . 3$$ $+200 * . 004 + 200 * . 006 * . 3$ $= 3.4$ | | ~ 4 | | ~ | |-----|-----|---|---| | EG | 1 / | | | | 1 7 | | - | | Quiz #21 Dr. Izadi First NAME: Last Name: ### 15 Points - 1) Design a direct-mapped cache with the following parameters: - Address size: 32 bits Total cache size: 16 KB • Cache block: 4 word = 16 bytes Indicate bit values for - a. Offset 2 + 2 = 4 - b. Index 10 - c. Tag 18 d. The number of cache row entries 1024 #### 20 Points 2) The following is a series of address references given as word addresses: 5, 4, 20, 4, 21, 15, 5, 19, 4, 20, 4, 22. Assume direct map with a word size of 1 byte and a block size of two bytes and a total cache size of 8 bytes. Show the hits and misses and final cache | A0=1 | Ao=o | |---------|------| | 9/17 | 816 | | 27,421, | 4218 | | | | A4 A3/A2A1 | 10 | |----------|-----------|------------|----| | Location | Hit/Miss? | 0 1100 | 0 | | 8 | W | 0 1/10 0 | 0 | | 4 | M | 10/10 | 0 | | 20 | M | | | | 4 | M | (1) | 0 | | 8 | H | >0110 | 1 | | 12 | M | 10/00 | 0 | | 4 | M | 310 | | | 16 | M | , | | | 4 | H | | | | 20 | M | | | | 4 | M | | | | 16 | Н | | | ## 15 Points 3) Assume an instruction cache miss rate for an application is 4% and the data cache miss rate of 6%. Assume further that our CPU has a CPI of 2 without any memory stalls and the miss penalty is 25 cycles for all misses. Determine the overall CPI with the indicated misses, provided the frequency of all loads and stores in the application is 20%. $$CPI = 2 + 4\% * 25 + 6\% * 20\% * 25$$ $$= 2 + 1 + 0.3$$ $$= 3.3$$ 1. | Index | v | Tag | |-------|---|-------------------| | 000 | Υ | 01 <sub>two</sub> | | 001 | N | | | 010 | N | | | 011 | Υ | 00 <sub>two</sub> | | 100 | N | | | 101 | Υ | 11 <sub>two</sub> | | 110 | Υ | 00 <sub>two</sub> | | 111 | N | | - a. After a request to address 10100, is the cache request a hit or miss? Why? - Miss V = 0b. Which memory location does cache reference 110 belong? In none, specify. c. Which memory location does cache reference 010 belong? If none, specify/ $\text{Non}\mathcal{L}$ # 15 Points - 2. Design a direct-mapped cache with the following parameters - Address size: 32 bitsCache data size: 16 KB - · Cache block: 1 word 20 Points 3) The following is a series of address references given as word addresses: 20, 4, 8, 16, 8, 12, 4, 20, 4. Assume direct map with a word size of 4 bytes and a total size of 4 words. | Show the hits and mi | 1 (* 1 | 1 | 41 C 1 | 1 | |----------------------|--------------------|-------------------|------------|----------------| | Show the nite and mi | ceae and final cac | he contents Show | the final | rache content | | Show the mis and mi | sses and imai cac | ne contents. Show | the illiai | cache content. | | | | | | | | | 32 16 8 4 21 | 8 | |-----------|--------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Hit/Miss? | 0 10 1 XX | Indox Cache | | \ \ \ | | 00 16 | | $\wedge$ | 2001 | 01/204284 | | M | +0010 | 01 204284 | | W | 00,00 | 10 8 | | 8 | | 1 12 | | $\sim$ | 3000 | ~ <b>j.</b> | | H | ->000 \ | | | M | 70101 | | | W | | | | | ₩ | Hit/Miss? M M O 0 0 1 M O 100 M O 100 H M O 100 H O 100 H O 100 | Last Name: # 1) Given this instruction sequence, Assume the instructions to be invoked on an exception begin like this: 80000180<sub>hex</sub> sw \$26, 1000(\$0) 80000184<sub>hex</sub> sw \$27, 1004(\$0) Show what happens in the pipeline if an overflow exception occurs in the add instruction. 2) Show how the following loop can be scheduled on a static two-issue pipeline for MIPS? Reorder the instructions to avoid as many pipeline stalls as possible without affecting the overall execution. Computer the overall IPC. $$\pm PC = \frac{7}{5} = 1.4$$ • Dr. Izadi First NAME: Last Name: Assume the following sequence of instructions - 32 and \$10, \$4, \$8 - 36 sub \$4, \$8, \$6 - 40 beq \$2, \$4, 2 - 44 and \$8, \$2, \$5 - 48 or \$9, \$2, \$6 - 52 add \$11, \$4, \$2 - 56 slt \$15, \$6, \$7 - a. Using the following diagram, assuming (\$2)=0x90 (\$4)=0x30, show the next three cycling steps: b. Repeat part a. for (\$2)=0x30 (\$4)=0x30 Last Name: For the code below, a. On the diagram, mark and identify all the data dependencies in the code given below and identify which dependencies will cause data hazards without forwarding hardware. b. Assuming there is no special hardware that is added for forwarding. Add "nop" instructions to the code to avoid the data hazards. c. How many clock cycles does it take to execute the code in part b. d. Using forwarding, clearly show how it can be used to resolve data hazards. If a bubble needs to be added, simply make a marking as below. (use the next page for e. How many clock cycles does part d take? 10 f. Indicate what each stage will do during the 5th clock cycle. Last Name: For the code below, a. On the diagram, mark and identify all the data dependencies in the code given below and identify which dependencies will cause data hazards without forwarding hardware. b. Assuming there is no special hardware that is added for forwarding. Add "nop" instructions to the code to avoid the data hazards. c. How many clock cycles does it take to execute the code in part b. 17 CLK cycles d. Using forwarding, clearly show how it can be used to resolve data hazards. f. Indicate what each stage will do during the 5<sup>th</sup> clock cycle. Register write: WR \$6 Memory: bypasss resultof \$8 or \$6 ALU: subtract intermediate values \$6-\$3 Register Read: read \$8 \$ \$7 | 2000 | 0.00 | | |------|------|-----| | | 700 | 142 | | H | TI 6 | 14/ | | | | | Quiz #15 Dr. Izadi First NAME: Last Name:\_ Using multiple copies of the following diagram, show the active stages for execution of the following sequence of instruction set lw \$s1, (\$t3)0x12 add \$s2, \$a1, \$t2 | EGC44 | 7 | |-------|-----| | EUC44 | . Z | Quiz #14 Last Name: Dr. Izadi Problem 1 (25 Pt.) a. Highlight and annotate the active data path with what they carry for: beq \$8, \$9, 0x12 Assume prior to the execution, PC = 0X4040, \$8 = 0X9040, \$9 = 0X9040. b. Complete the control branch circuitry to allow beg to be executed and fill the following table with the associated control signal values: | Instruction | PCSrc | RegDst | AND REAL PROPERTY AND ADDRESS OF THE PARTY | Memto-<br>Reg | STATE OF THE PARTY | A CONTRACTOR OF | DESCRIPTION OF THE PROPERTY | Concession of Contraction | ALUControl | |-------------|-------|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------|------------| | beq | 1 | X | 0 | X | 0 | X | S | | 110 | 0001 0010 00 Problem 2 (25 Pt.) a. Highlight and annotate the active data path with what they carry for: sw \$2,0x34 (\$7) Assume prior to the execution, PC= 0X4044, \$2 = 0X9040, \$7= 0X8480, Mem[0x8480]= 0X 142A, Mem[0x8514] = 0X 3482, Mem[0x84B4]= 0x AC12 b. Fill the following table with the associated control signal values: | Instruction | PCSrc | RegDst | ALUSrc | Memto-<br>Reg | | STATE OF THE PARTY | \$1000 pt (0.000) | Branch | ALUControl | |-------------|-------|--------|--------|---------------|---|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|--------|------------| | sw | 10 | X | | X | 0 | 0 | V | 0 | 010 | c. Determine the contents after the execution $\$2 = 0 \times 9040 \$7 = 0 \times 848$ $9 \times 9040$ | T | ЭC | 1 1 | 10 | |----|----|-----|-----| | н. | ¥( | 44 | . / | | | | | | Quiz #13 Last Name: Dr. Izadi Problem 1 Highlight the active data path for: \$2, 0X44(\$7) Annotate the different segments of highlighted datapath with what they carry Assume the instruction is at memory location 0X4000, (\$2) = 0X19, and (\$7) = 0X8000, and Mem[0x8000] = 0x40 and Mem[0x8044] = 0x24 After execution: After execution: $$2 = 0 \times 14$ $$7 = 0 \times 8000$ Mem[0x8000]= Mem[0x8044] = PC= $0 \times 400 \times 400 \times 1000$ Problem 2 Highlight the active data path for: ori 64 \$2, \$7, 0x32 0010010 00010110 130110110 ->0X3( Annotate the different segments of highlighted datapath with what they carry Assume the instruction is at memory location 0X2004, (\$2) = 0X9, and (\$7) = 0X16 After execution: $$2 = 0 \times 36$ $$7 = 0 \times 6$ PC= $0 \times 2008$ | EGC442 | 1/ 1/ | Quiz #12 | Dr. Izadi | |-------------|-------|------------|-----------| | First NAME: | Ke/ | Last Name: | | Show the data path for the following instruction set: - lw rt, d16(rs), - R-type, Make sure only use the components that are essential and cross out the components that are not used. You may need additional components such as multiplexers. First NAME: \_\_\_\_Key \_\_\_ Las Name: 1. Determine the g<sub>i</sub> and p<sub>i</sub> values of the following two 4 bit numbers. What is c<sub>4</sub>? $$\begin{array}{lll} g_i = a_i \ b_i \\ p_i = a_i + b_i \\ g_0 = 1 & p_0 = 1 \\ g_1 = 0 & p_1 = 1 \\ g_2 = 0 & p_2 = 1 \\ g_3 = 0 & p_3 = 1 \end{array}$$ $$c_4 = g_3 + p_3 g_2 + p_3 p_2 \ g_1 + p_3 \ p_2 \ p_1 \ g_0 + p_3 \ p_2 \ p_1 p_0 \ c_0$$ $$c_4 = 0 + 0 + 0 + 1 + c_0$$ = 1 - 2. One simple way to model time for logic is to assume each AND and OR gate takes the same time for a signal to pass through it. Time is estimated by simply counting the number of gates along the longest path through a piece of logic. Compare the number of gate delays for the critical paths of the following 16-bit adders - a. Ripple carry $$16 \times 2 = 32$$ b. two-level carry lookahead $$\begin{array}{l} p_i\!=\!1\ g_i\!=\!1\ gate \\ G_0\!=\!{}_{g3}\!+\!p_3g_2\!+\!p_3p_2\ g_1\!+\!p_3\ p_2\ p_1\ g_0\!=\!2\ gate\ delay \\ P_0\!=\!p_3\!-\!p_2\!-\!p_4p_0\!-\!-\!1\ gate\ dely \\ C4\!=\!G3+P3G2+P3P2\ G1+P3\ P2\ P1\ G0+P3\ P2\ P1P0\ c_0\!=\!2 \\ Total\ delay\ =\!5 \end{array}$$ c. Carry lookahead at level one, and ripple carry between 4-bit modules $p_i = 1$ $g_i = 1$ gate $$c_4 = g_3 + p_3g_2 + p_3p_2$$ $g_1 + p_3$ $p_2$ $p_1$ $g_0 + p_3$ $p_2$ $p_1p_0$ $c_0 = 2$ gate delay Total level $1 = 3$ gate delay Ripple 4 times = $4 \times 3 = 12$ 1. Design the least significant and most significant modules of a 16 bit ALU with the following functionality. Show the carry, zero, overflow and sign flags for the ALU. ### 25 Points 1) For the third multiplication algorithm, as depicted the following diagram Figure 1: A multiplier Assuming the following data: $$\begin{array}{c} 1001 \rightarrow \text{Multiplicand} \\ \times 1101 \rightarrow \text{Multiplier} \\ \rightarrow \text{Products} \end{array}$$ Show step by step how the algorithm performs the multiplication. At each step show the intermediate Multiplier, Multiplicand, and Product. 10 Points 2. Show the IEEE 754 binary representation of the number -65.25ten in single precision: 15 Points 3. Convert the single precision binary floating-point representation to decimal. Last Name: ### 20 Points - 1) Given 10011100110001111100010111100010, is it - a. An assembly code? - b. A signed 2's complement number? - c. A unsigned number? - d. A Machine code? - e. Could be b., c. or d? - f. None of the above ### 20 Points 2) The largest product resulting from a multiplication of a 16-bit multiplicand and a 16bit multiplier is 32 bits long. ### 20 Points 3) Determine the indicated flags for the following data and operation. 1011001 0101101 Result = 0 0 1 0 0 CY = 1Sign = O Overflow = Zero = O #### 40 Points 4) For the first multiplication algorithm, as depicted the following diagram Show step by step how the algorithm performs the multiplication. At each step show the intermediate Multiplier, Multiplicand, and Product. ① poduct 0000 0000 ② multiplier a 101 add product 0000 1011 multiplier to 10 no add multiplier to 10100 multiplier to 10100 multiplier pool add multiplier pool add product 00110111 multiplicant 1011000 To product 00110111 multiplicant 1011000 no odd multiplier 0000 no odd product 0011 0111 → 55 0000 1011 15 Pt. 1) A program calls a procedure SWAP two swap variable x and y. Currently, x is in \$s1, y is in \$s2. How might the program pass the parameter values to Power? Last Name: | a. | add | \$s2, \$a2, \$0 | |------|-----|-----------------| | | add | \$s1, \$a1, \$0 | | (b.) | add | \$a2, \$s2, \$0 | | 9 | add | \$a1, \$s1, \$0 | | c. | add | \$v2, \$s2, \$0 | d. None of the above. add \$v1, \$s1, \$0 #### 35 Pt. 2) Write a leaf procedure SWAP (x, y) to exchange variable x and y. Assume x and y are properly assigned MIPS registers. 35 Pt. Write a leaf procedure SWAP (x, y, k) to exchange contents of x[k] and y[k]. Assume &x[0] is in \$a1, &y[0] is in \$a2, and k is in \$a3. Swap: 511 \$13,\$93,2 add \$1,\$01,\$03 add \$2,\$02,\$03 lw \$t0,\$C\$02] lw \$t1,\$C\$0] Sw \$t1,\$C\$0] Sw \$t1,\$C\$0] 4) A main program calls a SWAP procedure using the instruction: jal swap. That instruction is at address 0x2F000. Upon execution of jal, what happens to \$ra? Va= 0x 25004 | EGC442<br>First NAM | E: K | 2Y | Quiz 6<br>Last Na | me: | Dr | . Izadi<br>——— | | |---------------------|--------------------------------|---------------------------|-----------------------------------------|-----------------|------------------------|----------------|------| | | | | | | | | 64 | | 45 Pt. | | | | | | | | | 5.0 | | owing initial | l values, determine t | he resulting va | alue for the g | iven | | | | ration. | | | | | | | | | | | 0 0000 0000 1110 1 | | | | | | | | | 0 0000 0000 1101 1 | | | | | | \$s1 | = 1100 101 | 10 1100 001 | 1 1111 0101 1100 1 | 010 | | | | | and | 110010 | 10 1100 0000 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 010 | | | | | | For the foll | owing instru | action: | | | | | | | and \$t1, \$s | 1, \$a2 | | | | | | | ) | Only put do | own the valu | ie of the register th | at changes in | hex | | | | | \$s1: | | \$a2: | \$ | iti:OX O C | 000 | OOCA | | | | | | | | | | | | For the foll or \$11,\$s1, | owing instru<br>\$a2 | action: | | | | | | | Only put do | own the valu | ie of the register th | at changes in | hex | | | | | \$s1: | | \$a2: | \$ | ttl: OX C | AC3 | F5DF | | | | | | | | | | | | For the foll<br>nori \$1], \$s | owing instru<br>1, 0x0000 | action: | | | | | | | Only put do | own the valu | ie of the register th | 1.50 | | | | | | \$s1: | | \$a2: | \$ | t1: <mark>0\(35</mark> | 3 C O | A35 | | | 1100 | 0101 | 0011 1100 | 0000 | 1010 | 0011 | dol | Assume \$1 has 0x34 and \$s2 has 0x2E. Given this code: ox 49 ox 9 e \$s3, \$s4, Else add \$s0, \$s1, \$s2 j Exit Else: sub \$s0, \$s1, \$s2 Exit: 10 Pt. 2) If \$s3 is 0x49 and \$s4 is 0x49, which instruction executes after bne? add \$5\$,\$51,\$52 10 Pt. 3) If the first instruction were beg rather than bne, what instruction should then appear immediately after beq? 50h \$50,\$51,\$52 $$if(i!=j)$$ \$91 \$02 \$5 where i, j, f, g, h are assigned to \$\$1, \$\$2, \$\$1, \$\$2 registers, respectively. be6 \$51, \$52, EXIT add \$12, 501, 502 sub \$t2,\$t2, \$51 # 1-5, each 5 points, 6, 25 points Assume \$s3 has 5004, and words addressed 5000..5002 have the data shown: 5000: 0x99 5001: 0x77 5002: 0x23 5003:0x23 2 5004:0x6E 5005:0x34 5006:0x13 5007:0x34 - 1) What address will be computed by: 1. Sto -4(\$s3) 5004-465000 to=0199772323 - 2) What value will be put in \$11 by: 5004-2=502 lw \$t1, -2(\$s3) \$t1=0X23236E34 - 3) What value will be put in \$t2 by: 5004+0=5004 lw \$t2, 0(\$s3) \$tz=0x6E 34 13 34 - 4) An array A has a base address of 2000. What is the address of A[2]? 2000 + 4\*2 = 2008 5) Assuming \$s3 has 5000, is the following an acceptable instruction? . 5003 is not multiple of 4 lw \$t0, 3(\$s3) Assume the base address of array A is in register \$s2. Write a sequence of instructions to perform the following operations: A[2] ← A[0] + A[1] # You must show your complete work for full credit! . 1. Consider the following performance measurements for a program: | | Computer A | Computer B | |--------------|---------------------|--------------------| | Number of | 4 × 10 <sup>8</sup> | 3 ×10 <sup>8</sup> | | Instructions | | | | Clock rate | 4 GHZ → 125/15 | 2 GHz → 5 n 5 | | CPI | 2 | 1 | a. Which computer is faster for that program? $$EA = 4 \times 10^{9} \times 2 \times .29 \times 2 \times 10^{9} \times 10^{9} = .2 \text{ sec}$$ $$EB = 3 \times 10^{9} \times 1 \times .5 \times 10^{9} = .15 \text{ sec}$$ b. Which computer has a higher MIPS $$MIPSA = \frac{4 \times 10^{9}}{10^{6} \times .2} = 2000$$ $$MIPSB = \frac{3 \times 10^{9}}{10^{6} \times .15} = 2000$$ 2. A program runs in 120 seconds. Multiply and divide operations are responsible for 40 and 60 of those seconds, respectively. Would it be possible to run the program 2 times faster - a. By improving the multiplication alone. If yes, by what factor? - b. By improving the division alone. If yes, by what factor? - c. By improving the multiplication and division by the same factor. If yes, by 2,5 +1'me 5 $$\frac{120 = 40 + 60 + 20}{60 + 0 + 80}$$ a. $$\frac{100}{60 \pm 0 + 8^{\circ}}$$ $\frac{100}{40} = 2.5 \pm 1$ mes b. $60 \pm 0 + 60$ c. $60 = 40 + 20$ $$c. 60 = 40 + 20$$ - 3. Given: b = 2, c = 5, d = 1, add t, d, c5 $\longrightarrow$ $5+1 \longrightarrow 6$ sub a, t, $b^2$ Final value of a -4 and b 2 - 4. Order the assembly instructions to calculate the expression: a = b + c d + e - add - t0, b, c - sub - a, t0, d - add - 0, t0, e 2 Computer A's performance is 4 times as fast as the performance of computer B, which runs a given application in 20 seconds. How long will computer A take to run that application? $$t_A = \frac{t_B}{4}$$ $$t_A = \frac{20}{4} = 5 \text{ Sec}$$ 2. Our favorite program runs in 40 seconds on computer A, which has a 2 GHz clock. We are trying to help a computer designer build a computer B, which will run this program in 14 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.4 times as many clock cycles as computer A for this program. What is the CPU clock cycles for computer B? $$t_{A}=40$$ $t_{B}=2x10^{9}$ $N_{A}$ $t_{B}=14$ $t_{B}=2x10^{9}$ $t_{A}=40=8$ $t_{A}=40=8$ $t_{A}=40=8$ $t_{A}=40=8$ $t_{A}=40=8$ $t_{A}=40=1.4$ $t_{A}=40=1$ - Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 125 ps and a CPI of 4.0 a program with 2 X 106 instructions, and computer B has a clock cycle time of 250 ps and a CPI of 2.4 for the same program. - a. Which computer executes fewer clock cycles? a. Which computer executes fewer clock cycles? b. Which computer has the smaller execution time? $$\overrightarrow{A}$$ $TA = 125 PS$ $CPI_A = 4 N = 2\times10^6$ $TB = 250PS$ $CPI_B = 24 N = 2\times10^6$ $CLK_A = 4\times2\times10^6 = 8\times10^6$ $CLK_B = 2.4\times2\times10^6 = 4.8\times10^6$ $CLK_B = 2.4\times2\times10^6 = 4.8\times10^6$ $CPU_A = 8\times10^6\times125\times10^{-12} = 1 \text{ M/S}$ $CPU_A = 8\times10^6\times250\times10^{-2} = 1.2 \text{ M/S}$ 4. A compiler designer is trying to decide between two code sequences for a particular computer. The hardware designers have supplied the following facts: | | CPI fo | CPI for each instruction class | | | | | |-----|--------|--------------------------------|---|--|--|--| | | A | В | C | | | | | CPI | 1 | 2 | 3 | | | | For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts: | | Instruction counts for each instruction class | | | | | | |---------------|-----------------------------------------------|---|---|--|--|--| | Code sequence | Α | В | C | | | | | 1 | 1 | 3 | 2 | | | | | 2 | 4 | 2 | 1 | | | | 67 a. Which code sequence executes the largest number of instructions? aloz 7 b. Which code sequence executes the largest number of clock cycles? Cale 1 = |X| + 2X3 + 3X2 = |3|Cale 2 = |X| + 2X2 + 3X| = |1| Code 1 | GC4<br>rst N | 42<br>NAME: | Key | Quiz #2 Last Name: | Dr. Izadi | |--------------|------------------|----------------------------------------------------------------------------|------------------------------------------------------------|-------------------------| | 1. | Match<br>archite | | he closest analog of a great idea i | n computer | | • | Make | the Common Case | e Fast | | | • | Hiera | rchy of Memories | | | | • | Design | ı for Moore's Law | | | | • | Use A | bstraction to Simp | olify Design | | | | a. | A soccer player ru | ns not to where the ball is, but to | where the ball will be. | | | b. | A customer talks tagent's supervisor. | o a phone agent. If there's a probl | em, he talks to the | | | c. | | first designs a house with 5 rooms, windows, and flooring. | s, then designs room | | | d. | A college student weekend beach sp | rents an apartment closer to camp ot. | us than to her favorite | | 2. | The fo | rue | machine-language instruction: 10 | 00110010100000. | | 3. | The fo | ue | n assembly language instruction: | 000110010100000. | | 4. | | ge instructions like<br>ue | ssembly language instructions like 1000110010100000. | e add A,B to machine- | | 5. | C oo ad | is a high-level lang<br>00000001010001000<br>d \$2, \$4, \$2<br>mp = v[k]; | | (A) | | 6. | | kind of language is C? | |----|------------------------------------|---------------------------------------------------------------------| | | $\square$ M | achine | | | $\Box$ As | ssembly | | | $\mathbb{K}_{\mathrm{H}}$ | igh-level | | .7 | The fi | va componenta of a computer | | d' | THE RESERVE OF THE PERSON NAMED IN | ve components of a computer. ntrol | | q | • Inp | out . | | C | • Me | emory | | b | • Ou | tput | | 0 | • Da | tapath | | | a. | Writes data to memory. Ex: Keyboard. | | | b. | Reads data from memory. Ex: Display. | | | c. | Stores instructions and data. | | | d. | Sends signals that determine the operation of the other components. | | | e. | Performs computations. | | 8. | | egrated circuit is often called a | | | K c | | | | | CPU | | 9. | The C | PU chip physically occupies of the size of the iPad 2. | | | F-3 | nost | | | 1/2000 | small fraction | | 10 | | | | 10 | E-17 | U is also known as datapath | | | | control | | | | processor | | | /\ a | processor | 1. Convert (A4.C)<sub>16</sub> directly to base 2 and base 8. 244.6 2. Using a total of 8 bits, represent -65 in signed 2's complement format. 3. The given numbers are represented in signed 2's complement. Carry out the indicated operation and specify the indicated flags. | 11001010 | | |------------|--| | 11001010 | | | - 01111011 | | 10000101 110010 GCY=1 Carry = | Sign = 0Overflow = 4. Using gates and multiplexer(s), design a one-bit ALU that performs the following logical operations: | $S_1$ | $S_0$ | Function | |-------|-------|----------| | 0 | 0 | OR | | 0 | 1 | XOR | | 1 | 0 | NOT | | 1 | 1 | AND | - a. Utilizing the block diagram of 4- bit registers, multiplexers, decoders, and any other needed gates, design a register file with 4 4-bit registers such that it can allow reading of any two registers and writing of any one. - b. Indicated how many bits each line in the following block diagram represent.