Part 5 - Branches, Decisions, and Loops

Program flow: branch, jump, and set instructions; loops, and decisions.

Jump and Branch Instructions

The power of computers is their ability to repeat actions and their ability to alter their operation depending on data. Modern programming languages express these abilities using control structures. Repeated action (iteration) is done with a while structure. Alternative control paths (alternation) is done with an if-then-else structure.

The machine instructions of the processor do not have these structures, nor does assembly language. When you program in assembly language you must build these structures out of basic assembly instructions. These basic instructions are the jump instructions and the conditional branch instructions.

Chapter Topics:

Jump Instruction:
- j instruction (jump)
Conditional Branch Instructions:
- beq instruction (branch equal)
- bne instruction (branch not equal)

Question 1

When a program is executing, does each machine instruction have an address in main memory?

Answer

Yes. Each byte of main memory has an address.

How a jump works

When a program is executing, its instructions are located in main memory. The address of an instruction is the address of the first (the lowest addressed) byte of the four-byte instruction.

Each machine cycle executes one machine instruction. At the top of the machine cycle, the PC (program counter) contains the address of an instruction to fetch from memory. The instruction is fetched into the processor and is prepared for execution.

In the middle of the machine cycle the PC is incremented by four so that it points to the instruction that follows the one just fetched. Then the fetched instruction is executed and the cycle repeats. The machine cycle automatically executes instructions in sequence.

When a jump instruction executes (in the last step of the machine cycle), it puts a new address into the PC. Now the fetch at the top of the next machine cycle fetches the instruction at that new address. Instead of executing the instruction that follows the jump instruction in memory, the processor "jumps" to an instruction somewhere else in memory.

However, it takes an extra machine cycle before the change in the PC takes effect. Before the PC changes, the instruction that follows the jump instruction in memory is fetched and executed. After that instruction executes, the next instruction to execute is the one that was jumped to. The instruction that follows a jump instruction in memory is said to be in the branch delay slot.

The reason for this delay is that MIPS is pipelined. Normally, instructions are executed one after another in sequence. In order to gain speed, the processor cleverly fetches several sequential instructions and starts working on them all. When the machine cycle calls for one of these instructions to be executed, much of the work has already been done. These instructions are in an instruction pipe.

This means that the instruction in the branch delay slot has mostly been completed when the jump is executed. Rather than waste this effort, the instruction in the branch delay slot is allowed to finish. Only then is the PC changed by the jump instruction.

The instruction that follows a jump instruction in memory (in the branch delay slot) is always executed. Often this is a no-op instruction. After it executes, the next instruction to execute is the one that was the target of the jump instruction.

The SPIM simulator allows you to turn the pipeline feature off, but this is not an option with actual hardware. So, for now, leave this option on.

Question 2

(Review:) What does a no-op instruction do?

Answer

A no-op instruction is an instruction that has no effect. A common no-op instruction is sll $0,$0,0.

Altering the PC

Here is a sequence of instructions. The "load" and "add" represent typical instructions. The "jump" instruction shows the address we wish to put into the PC. (The actual MIPS instruction for this involves details that we are skipping for the moment.)

The last instruction, a no-op, fills the branch delay slot to give the PC time to change. Once started, the four instructions execute in an unending loop.

A loop structure is created with the jump instruction. The intent of the jump instruction is to put the address 0x00400000 into the PC. However, this effect is is not seen until after the instruction in the branch delay slot has executed.

Question 3

(Thought question:) Is there anything in hardware that guarantees that the target of a jump instruction is an instruction (and not data)?

Answer

No. A jump instruction causes the processor to fetch whatever bit pattern is stored in memory at the new address. This might not be part of the program.

Practice

Here is another schematic program. The instructions are just sketched in as place holders. Don't pay much attention to them, but look at the jump instruction and its target.

The target of the jump instruction is the address 0x00450000.

Question 4

Fill in the blanks.

Answer

The jump instruction

In our schematic programs, the "jump" instruction loaded the PC with a 32-bit address.

How does a 32-bit instruction specify a 32-bit address? Some of the instruction's bits must be used for the op-code. Here is the assembly language form of the jump instruction.

j    target     # after a delay of one machine cycle,
                # PC  <-- address of target

Here is the machine language form of the instruction:

   6              26
000010 00000000000000000000000000    -- fields of the instructuion

opcode         target                -- meaning of the fields

There is room in the instruction for a 26-bit address. The 26-bit target address field is transformed into a 32-bit address. This is done at run-time, as the jump instruction is executed.

Instructions always start on an address that is a multiple of four (they are word-aligned). So the low order two bits of a 32-bit instruction address are always "00". Shifting the 26-bit target left two places results in a 28-bit word-aligned address (the low-order two bits become "00".)

After the shift, we need to fill in the high-order four bits of the address. These four bits come from the high-order four bits in the PC. These are concatenated to the high-order end of the 28-bit address to form a 32-bit address.

For example, here is the machine language for the instruction that jumps to location 0x5B145188. Say that the instruction is located at address 0x56767250.

Question 5

While this is going on, what address is in the PC?

Answer

The PC contains the address of the instruction that follows the jump instruction. Recall that the machine cycle is "fetch - increment PC - execute" so the PC will have been incremented.

Most jumps (and branches) are local

Most jumps and branches are to nearby addresses. The target of the jump instruction and the instruction following the jump instruction are likely to be close together in memory. The high-order four bits of their addresses will be identical. So the high-order four bits of the PC are the same as needed for the target address.

Of course, an assembly language programmer must be careful to make sure that this is so. When a compiler translates a source program into machine language it also must pay attention to addresses. For the tiny programs you will write for this course the high four bits of the PC will always be the high four bits of the jump address.

A jump instruction can't jump to any arbitrary location in the full 32-bit address space. It must jump to an address within the following range:

wxyz 0000 0000 0000 0000 0000 0000 0000
                   .
                   .
                   .
wxyz 1111 1111 1111 1111 1111 1111 1100

Here, wxyz represents the high-order four bits of the PC. Almost always the jump instruction and the jump address are both within this range.

All these details may look terrible to you at this point. Don't worry: (1) its not as bad as it looks, and (2) usually the assembler does all the work. (But for now, you get to do the work).

Question 6

Some 32-bit processors have instructions of several lengths. Such processors can include the full 32-bit address in a jump instruction. But MIPS instructions are always 32 bits. Some sort of trick must be used with its jump instruction.

Here is a great idea! Why not implement the jump instruction without using an explicit op-code? Build the processor so that when execution encounters a 32-bit address it automatically jumps to that address.

Will this scheme work?

Answer

No. Addresses are 32-bit patterns, machine instructions are 32-bit patterns, and many data are 32-bit patterns. There is no way to tell them apart, except by context. Here is an example. Say that this bit pattern got fetched as an instruction: 0x00000000. Is this the address of the first byte of memory, or the sll $0,$0,0 instruction?

Jump practice

The following program illustrates how the jump instruction is constructed. For simplicity, all instructions other than the jump instruction are no-ops. The jump instruction jumps to the first instruction of the program. The very last instruction fills the delay slot.

The left-most six bits of the j machine instruction are the opcode. You need to decide if the next 26 bits are correct. Do this by filling in the blanks of the question. (Copy-and-paste works well for this.)

Question 7

Write the jump address 0x00400000 as 32 bits: _____
Write the 26-bit field of the jump instruction: _____
Shift it left two positions: _____
What are the high-order four bits of the PC? _____
Copy (4) to the left of (3): _____ _____
Is (5) the same as (1)? _____

Answer

Symbolic address

How do you know what the high-order four bits of the PC are? Well, since you have the address of the instruction following the jump instruction, you know that its address is in the PC. So the high order four bits come from that address.

Usually the high order four bits of the address of the jump instruction and of the one following it are the same. But in very rare instances they might be different (when adding four to the jump instruction's address causes a carry into the high order bits).

With some trickery, a 26-bit field can specify a 32-bit address. But it is a nuisance to figure out! If you were doing machine language programming, that is what you would have to do. But the assembler does the work for you. Here is a tiny program:

## jump.asm
##
      .text
      .globl  main

main: 
      sll    $0,$0,0
      sll    $0,$0,0
      sll    $0,$0,0
      sll    $0,$0,0
      j      main
      addiu $8,$8,1 
      
## End of file

It is similar to the previous example. The symbolic address main stands for the address of the first instruction. The instruction j main tells the assembler to assemble a machine instruction with the proper 26-bit field so that control is transferred to main.

The branch delay slot is filled with an instruction that increments register $8.

Question 8

After the loop has executed five times, what value will be in register $8? SPIM initializes all registers to zero when a program starts.

Answer

Five. Each time the loop executes, the instruction in the branch delay slot increments $8.

Assembled program

Here is a SPIM view of the program. When you run it, remember to set the value of the PC to 0x400000. To see the branch delay it should be enabled in the options menu.

The assembler constructed the same machine language jump instruction as we did by hand. (Compare with two pages back). Using the assembler and a symbolic address is much easier.

Question 9

Is the jump instruction all we need to construct a while loop?

Answer

No. You can build unending loops (infinite loops) with it. But additional instructions are needed for while loops and for if-then-else structures.

Conditional branches

A conditional branch instruction branches to a new address only if a certain condition is true. Usually the condition is about the values in two registers. Here is the beq (branch on equal) instruction:

beq  u,v,addr   # if register $u == register $v
                #     PC  <— addr (after a delay of one machine cycle.)
                # else
                #     no effect.

The bit patterns in two registers are compared. If the bit patterns are the same, the PC is changed to the branch address. There is a branch delay slot that follows this instruction (just as for a jump instruction).

Question 10

Will a 32-bit address fit inside the 32-bit beq instruction?

Answer

No. Same problem as with the j instruction.

Ifs and whiles

More trickery is used to create a 32-bit branch address out of the smaller sized field of the beq instruction. But let's skip that for now.

Branch instructions (including the beq instruction) are used to implement both loops and branches. At right is a flowchart showing an optional branch. Here is the assembly language for it:

        ...                # load values into $8 and $9 
        beq  $8,$9,cont    # branch if equal
        sll  $0,$0,0       # branch delay slot
        ...                # conditionally
        ...                # executed statements
        ...                # 
cont:   add   $10,$10,$11  # always executed

(Any instruction can be a target of a branch. The add instruction is here just as an example.)

Question 11

Must the contents of registers $8 and $9 in this example be regarded as numbers?

Answer

No. beq tests if the same 32-bit pattern is in each register. The pattern can represent anything.

Branch on not equal

Here is the bne (branch on not equal) instruction:

bne  u,v,addr   # if register $u != register $v
                #     PC  <— addr (after a delay of one machine cycle.)
                # else
                #     no effect.

Can a branch instruction implement a two-way decision (an if-then-else)?

Question 12

Yes.

Answer

tbdAnswer

Two-way decision

The two-way decision (alternation) is written in assembly language using both a conditional branch and a jump instruction. Look at this example carefully! It shows how a basic control structure is built out of small assembly instructions.

        ...                # load values into 
                           # $8 and $9 
        beq  $8,$9,equal   # branch if equal
        sll  $0,$0,0       # branch delay slot
        ...                #  
        ...                #  false branch
        ...                #  
        j    cont
        sll  $0,$0,0       # branch delay slot
equal:  ...                #
        ...                #  true branch
        ...                #
cont:   add   $10,$10,$11  # always executed

Of course, any of the conditional branch instructions may be used in the place of beq. If you want the "true" branch to come first and the "false" branch to come second (as in an if-then-else of Java or C), pick a different branch instruction.

Question 13

In an if-then-else structure, the two branches of control always come together at the first statement outside of the structure (the statement at cont (continue) in the example). Is this necessarily so in assembly language?

Answer

No.

Absolute value

You can build a real rats-nest of code with assembly language. Avoid this by implementing the structures of a high level language. Draw a flowchart of the program or rough it out in C or Java before coding. Put comments in the assembly source program before you add code to show how the code and the flowchart correspond.

The flowchart at right shows a program that calculates the absolute value of the integer at symbolic address "A". Here is a start on a program that follows that logic:

Assume that "A" starts at address 0x10000000. The lui instruction points the base register $10 at that address.

Question 14

Fill in the blanks.

Answer

The program below has the blanks filled.

Shifting the sign bit

To determine if "A" is negative, check if its sign bit is one. To do this, logically shift the sign bit into bit position 0 of a register. The register will be zero if "A" is positive. The program assumes that integer values are represented with two's complement.

The branch delay slot is filled with a no-op.

Question 15

Fill in the blanks.

Answer

The blanks are filled, below.

Store -A

The sign bit is shifted right 31 bit positions. This puts it in the low-order bit of the destination register ($9 in this case). To test if $9 is zero, use branch-on-equal with register $0 (which is always zero).

Now calculate -A and store it back into word "A". The instruction at done is a no-op.

Question 16

Fill the blanks.

Answer

The complete program is given below.

Complete program

Here is the complete program, suitable for copying to a text file and running with SPIM. Remember to initialize the PC (to 0x00400000). Execute the program step at a time by pushing F10.

## absVal.asm
##
## Calculate the absolute value of A
##
## Registers:
##   $8  --- A, two's comp. integer
##   $9  --- sign bit of A
##   $10 --- base register for .data section

        .text
        .globl  main

main: 
# Get A
        lui   $10,0x1000     #  Init base register
        lw    $8,0($10)      #  Load A
        sll   $0,$0,0

# Is A Negative?
        srl   $9,$8,31       #  Shift sign bit to position 0
        beq   $0,$9,done     #  sign bit == zero, done
        sll   $0,$0,0

# Store -A
        subu  $8,$0,$8       #  negate A
        sw    $8,0($10)      #  save it

done:   sll   $0,$0,0        #  target of the branch

        .data
A:      .word   -1

## End of File

Question 17

Would the program work if "A" were unsigned binary?

Answer

No.

Chapter quiz

Instructions: For each question, choose the single best answer. Make your choice by clicking on its button. You can change your answers at any time. When the quiz is graded, the correct answers will appear in the box after each question.

Question 1

What are the three steps in the machine cycle?

(A) increment the PC; fetch the instruction; execute the instruction
(B) fetch the instruction; execute the instruction; increment the PC
(C) execute the instruction; fetch the instruction; increment the PC
(D) fetch the instruction; increment the PC; execute the instruction

Answer

Question 2

What are the four bytes immediately following a jump instruction called?

(A) fetch delay slot
(B) pipeline delay slot
(C) branch delay slot
(D) PC advance slot

Answer

Question 3

What is a pipeline?

(A) Several words of data from memory are moved into the processor before instructions need them.
(B) Several sequential instructions are simultaneously prepared for execution while one instruction finishes its execution.
(C) A single instruction is divided into four phases and each phase is executed in one machine cycle.
(D) Multiple items of data are sent down the system bus like water in a pipe.

Answer

Question 4

Say that a sll instruction is located in memory at address 0x400100, and an add instruction is located in memory at address 0x400104. After the add instruction executes, what value will be in the PC?

(A) 0x400100
(B) 0x400104
(C) 0x400105
(D) 0x400108

Answer

Question 5

Say that a j (jump) instruction is located in memory at address 0x400100, and a sll instruction is located in memory at address 0x400104. After the j instruction executes, what value will be in the PC?

(A) 0x400100
(B) 0x400101
(C) 0x400102
(D) 0x400104

Answer

Question 6

Here is a schematic program loop.

What numbers go into the two blanks?

(A)
```
14
08
```
(B)
```
14
00
```
(C)
```
00
08
```
(D)
```
14
18
```

Answer

Question 7

Here is a 32-bit j instruction. The first 6 bits are the op-code.

000010 00 0001 0000 0000 0000 0000 1000

Here is the value of the PC while the target address is being constructed:

0000 1000 0001 0000 0000 1100 0110 1000

What address does the j put into the PC?

(A) 0000 00 0001 0000 0000 0000 0000 1000 00
(B) 0000 1000 0001 0000 0000 1100 0110 1000
(C) 0000 10 0001 0000 0000 1100 0110 1000 00
(D) 1000 00 0001 0000 0000 0000 0000 1000 00

Answer

Question 8

Examine the following program fragment. The program is to add $5 and $6 together only if they are not equal.

         ori   $5,$0,8            # load $5 with 8
         ori   $6,$0,9            # load $6 with 9
      
         ____  $5,$6,spot
      
         ____  $0,$0,0            # branch delay slot
      
         addu  $8,$5,$6           # $8 = $5 + $6      
      
spot:  

Pick instructions to fill the blanks.

(A) beq ; addu
(B) bne ; sll
(C) bne ; addu
(D) beq ; sll

Answer

Question 9

Here is an if-then-else structure. The code is to compare $10 and $11. If these registers contain the same bit pattern, set register $7 to 1. Otherwise set $7 to 0.

        ori   $10,$0,123
        ori   $11,$0,123
        
        ___   $10,$11,_____
        sll   $0,$0,0
        ori   $7,$0,0
        j     _____
        sll   $0,$0,0

equal:  ori   $7,$0,1

join:   .....

Which choices should fill the blanks?

(A) bne ; equal ; join
(B) beq ; join ; equal
(C) beq ; equal ; join
(D) bne ; join ; equal

Answer

Question 10

Say that registers $5 and $6 each contain an ASCII character in the low order byte. Can the beq instruction be used to compare the characters?

(A) Yes, because beq will recognize the character data and do a character comparison.
(B) No, because beq only works with two's complement integers.
(C) No, because beq only works with full 32-bit data.
(D) Yes, because beq compares bit patterns regardless of what they represent.

Answer

Programming exercises

For these programming exercises, use only those instructions that have been discussed so far in these notes:


`add`	`divu`	`mflo`	`sll`
`addi`	`j`	`mult`	`sra`
`addiu`	`lb`	`multu`	`srl`
`addu`	`lbu`	`nor`	`sub`
`and`	`lh`	`or`	`subu`
`andi`	`lhu`	`ori`	`sw`
`beq`	`lui`	`sb`	`xor`
`bne`	`lw`	`sh`	`xori`
`div`	`mfhi`

In the Simulator/Settings/MIPS menu of SPIM set Bare Machine ON, Accept Pseudo Instructions OFF, Enable Delayed Branches ON, Enable Delayed Loads ON, Enable Mapped I/O OFF, Load Exception Handler OFF.

Exercise 1 (*) non-ending loop

Write a program that computes the sum:

1 + 2 + 3 + 4 + 5 + ...

Do this by using the j instruction to implement a non-ending loop. Before the loop, initialize a register to zero to contain the sum, and initialize another register to one to be the counter. Inside the loop add the counter to the sum, increment the counter, and jump to the top of the loop.

Execute the program by single-stepping (by pushing F10). After you have done this enough to confirm that the program works, look at SPIM's menu and select Simulator and Multiple Step. Enter a number of steps (such as 40) into the window and click "OK". Each step is the execution of one machine cycle. Once you see how this works you can do the same thing more easily by pushing F11.

Exercise 2 (*) non-ending loop with overflow

Write a program that adds $8 to itself inside a non-ending loop. Initialize $8 before the loop is entered. Use the add instruction so that when overflow is detected the program ends with a trap.

Now change the add instruction to addu. Now when overflow occurs, nothing happens. Run the program and observe the difference.

Exercise 3 (**) - counting loop

Write a program that computes the sum:

1 + 2 + 3 + 4 + 5 + ... + 98 + 99 + 100

Do this, as above, by using the j instruction to implement a non-ending loop. Before the loop, initialize a register to zero to contain the sum, initialize another register to one to be the counter, and another register to 101₁₀. Inside the loop add the counter to the sum, increment the counter, and jump to the top of the loop.

However, now, at the top of the loop put in a beq instruction that branches out of the loop when the counter equals 101₁₀. The target of the branch should be something like this:

exit:    j    exit      #  sponge for excess machine cycles
         sll   $0,$0,0

Now run the program by setting the value of the PC to 0x400000 (as usual) and entering 500 or so for the number of Multiple Steps (F11). Your program will not need so many steps, but that is OK. The excess steps will be used up repeatedly executing the statment labeled "exit".

Exercise 4 (**) - Fibonacci series

Write a program that computes terms of the Fibonacci series, defined as:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...

Each term in the series is the sum of the preceeding two terms. So, for example, the term 13 is the sum of the terms 5 and 8.

Write the program as a counting loop that terminates when the first 100 terms of the series have been computed. Use a register for the current term and a register for the previous term. Each execution of the loop computes a new current term and then copies the old current term to the previous term register.

Notice how few machine language instruction this program takes. It would be hard for a compiler to produce such compact code from a program in a high level language. Of course, our program is not doing any IO.

Set Instructions and more Branch Instructions

This chapter describes two more conditional branch instructions and two conditional set instructions. Set instructions set a register to 1 or 0 depending on whether a condition is true or false.

Chapter Topics:

More Conditional Branch Instructions (one operand register):
- bltz instruction (branch on less than zero)
- bgez instruction (branch on greater or equal to zero)
Conditional Set Instructions:
- slt instruction (set on less than)
- sltu instruction (set on less than, unsigned)
- slti instruction (set on less than, immediate)
- sltiu instruction (set on less than, immediate unsigned)

These new instructions help you implement loops and branches. But this is still a difficult task. Study this chapter to gain insight into what compilers do for you.

Question 1

What two conditional branch instructions did the previous chapter discuss?

Answer

The previous branch instructions compare the bit patterns in two registers:

beq  u,v,addr   # Branch if register $u == register $v
                # A branch delay slot follows the instruction. 

bne  u,v,addr   # Branch if register $u != register $v
                # A branch delay slot follows the instruction. 

Nothing is assumed about what the bit patterns represent. The processor can easily determine if two patterns are the same, or not, no matter how the patterns are being used.

Branch on less than zero, branch on greater than zero

MIPS has several ways to implement relational operators. Here are two more conditional branch instructions. These instructions compare the contents of a register to zero. The register's contents is assumed to be in two's complement representation.

bltz   s,label        #  Branch if the two's comp. integer
                      #  in register s is < 0
                      #  A branch delay slot follows the instruction.
bgez   s,label        #  Branch if the two's comp. integer
                      #  in register s is >= 0
                      #  A branch delay slot follows the instruction.

The first instruction branches if the integer is strictly less than zero. The other branches if the integer is greater than or equal to zero.

Both of these instructions are followed by a branch delay slot. This means that the instruction in that slot will always execute, and the branch (if it happens) will not happen until after that instruction finishes.

Question 2

Here is a statement in C (or in Java):

if ( !( a < 0 ) ) { ... }

Rewrite it without using the not operator (the !).

if ( a _____ 0 ) { ... }

Answer

if ( !( a < 0 ) ) { ... } 

if ( a >= 0  ) { ... }

The two operators < and >= cover all possible integer values.

Set on less than

The set instructions are used to implement relational operators. However, they do not in themselves alter the flow of control. They set a register to 1 or 0 to show the relation between two values. The slt instruction is used with two's complement integers:

                   #  $s and $t contain
                   #  signed integers
                   #
slt   d,s,t        #  if ( $s < $t )
                   #    d = 1
                   #  else
                   #    d = 0

The sltu instruction is used with unsigned integers:

                   #  $s and $t contain
                   #  unsigned integers
                   #
sltu  d,s,t        #  if ( $s < $t )
                   #    d = 1
                   #  else
                   #    d = 0

Question 3

A. After the following code executes, what value is in register $7?

addiu    $5,$0,-25
addiu    $6,$0,25
slt      $7,$5,$6

B. After the following code executes, what value is in register $7?

addiu    $5,$0,-25
addiu    $6,$0,25
sltu     $7,$5,$6

Answer

A. After the following code executes, what value is in register $7?

addiu    $5,$0,-25
addiu    $6,$0,25
slt      $7,$5,$6

Register $7 will be set to 1, because (when the patterns in registers $5 and $6 are regarded as signed values) $5 < $6.

B. After the following code executes, what value is in register $7?

addiu    $5,$0,-25
addiu    $6,$0,25
sltu     $7,$5,$6

Register $7 will be set to 0, because (when the patterns in registers $5 and $6 are regarded as unsigned values) $5 is not less than $6. If the bit pattern in register $5 is regarded as unsigned, it looks like a large positive value.

Set on less than immediate

The other two set instructions compare an operand register with an immediate value in the instruction. There is a version for two's complement:

                   #  $s and imm contain two's comp. integers
                   #  -32768 <= imm <= 32767
                   #
slti  d,s,imm      #  if ( $s < imm )
                   #    d = 1
                   #  else
                   #    d = 0

And a version for unsigned integers:

                   #  $s and imm contain unsigned integers
                   #  0 <= imm <= 32767
                   #
sltiu  d,s,imm     #  if ( $s < imm )
                   #    d = 1
                   #  else
                   #    d = 0
                   #

In both, the immediate field of the machine instruction is 16 bits wide. However, the sltiu instruction can only be used with small integers 0 <= imm <= 32767 (and another range which is not important for this course.)

Question 4

After the following code executes, what value is in register $7?

addiu    $6,$0,25
slti     $7,$6,37

Answer

Temperature range tester

Say that you are writing a control program for a robot spray painter. The allowed temperature range for the paint is 30 degrees to 55 degrees Celsius. The device driver for the temperature sensor puts the temperature in register $2.

Your program will test if the unsigned integer in register $2 is in range. If it is in range, register $3 is set to 1, otherwise to 0.

(Sensors often deliver unsigned data. Pretend that our sensor delivers unsigned data in the range of 0 to 100 in register $2.)

Question 5

Sketch a flow chart for this program.

Answer

With assembly language it is essential to make a plan before coding. See below.

Start on the program

The flowchart for the program is below. The first box sets a flag to a default value in advance of the test that might change it. This is a common trick.

Here is an outline of the program:

The range test is in two parts. The first part (in this program) tests if temp is less than or equal to 55. However, the machine instruction is "set on less than". If temp is out of range a branch is taken to out. The branch is followed by a no-op for the branch delay.

Question 6

Fill in the blanks.

Answer

See below.

More blanks

The immediate operand used in the set instruction is changed to 56 to implement "less than or equal to 55". Notice that the assembly language uses decimal numbers for temperatures. This is fine. The assembler translates the decimal representation of the source file into the correct bit pattern for the machine instruction.

The next part of the program tests if temp is less than 30. Be careful with the branch instruction so that it branches for the correct condition.

Question 7

Fill in the blanks.

Answer

The completed program follows.

Complete program

Here is the complete program, suitable to copy to a file and to run with SPIM. When you run it, set the PC to 0x400000 (as usual) and also use the set value menu to set $2 to a temperature. Run the program with different temperatures and check that $3 is set correctly.

## tempRange.asm
##
## Check that   30 <= temp <= 55
## Set flag to 1 if in range, to 0 if out of range
##
## Registers:
## $2 --- temperature
## $3 --- in/out range indicator flag
## $8 --- scratch
flowchart
      .text
      .globl  main

# Set range indicator to 1
main:   ori     $3,$0,1         #  set to 1

# Test  30 <= temp <= 55
        sltiu   $8,$2,56        #  $8=1 if temp <= 55
        beq     $8,$0,out       #  0? out of range
        sll     $0,$0,0         #  delay

        sltiu   $8,$2,30        #  $8=1 if  temp < 30
        beq     $8,$0,cont      #  0? in range
        sll     $0,$0,0         #  delay

# Out of Range: clear range indicator to 0
out:    ori     $3,$0,0         #  clear flag to 0

cont:   sll     $0,$0,0         #  target for the jump

Question 8

Could the no-op instructions (the sll) be removed without affecting the program?

Answer

No.

Delay slot bug

The program can be made slightly shorter by removing the no-op instruction filling the first delay slot. The instruction following it (the sltiu) will always execute, sometimes uselessly, but never will do damage.

# Set range indicator to 1
        ori     $3,$0,1         #  set to 1

# Test  30 <= temp <= 55
        sltiu   $8,$2,56        #  $8=1 if temp <= 55
        beq     $8,$0,out       #  0? out of range
        sll     $0,$0,0         #  delay

        sltiu   $8,$2,30        #  $8=1 if  temp < 30
        beq     $8,$0,cont      #  0? in range
        sll     $0,$0,0         #  delay

# Out of Range: clear range indicator to 0
out:
        ori     $3,$0,0         #  clear to 0

cont:   sll     $0,$0,0         #  target for the jump
## End of file

The second no-op, however, is essential. If it is missing, the next instruction, the ori sets the flag to zero regardless of the branch instruction. This is a common bug, and can be very frustrating because sometimes the result is correct.

Question 9

(Review:) What other type of instruction is followed by a delay slot?

Answer

Load instructions.

Counting loop

A common type of program loop is one that is controlled by an integer that counts up from a initial value to an upper limit. Such a loop is called a counting loop. The integer is called a loop control variable. Loops are implemented with the conditional branch, jump, and conditional set instructions.

A loop has three parts that must be correct:

The counter must be initialized.
The test must end the loop on the correct count.
The counter must be increased.

It is easy to get these wrong in a high-level programming language. It is remarkably easy to get them wrong in assembly language.

Usually you want a top-driven loop such as the one at right, where the test is performed at the top before control enters the loop body. Be clear about the loop you want before you program it, because assembly language allows any sort of weird loop.

Question 10

Is the following loop (in C) correct in all three parts? It is intended to execute 10 times starting at zero.

int j;
   
j = 0;
while (j < 10)
{
    . . .
    j++;
}

Answer

The loop is correct. (Although j changes to 10 at the bottom of the last iteration, this is the normal way for loops to work).

Assembly language loop

Here is an assembly version of the counting loop, without the branch delay slots filled:

#
#  branch delay slots not filled
#
init:   ori    $8,$0,0        # count = 0

test:   sltiu  $9,$8,10       # count < 10
        beq    $9,$0,endLp

        . . .                 # do stuff

        addiu  $8,$8,1        # count++ ;
        j      test

endLp:  sll    $0,$0,0        # branch target

Question 11

Find and fill in the branch delay slots.

Answer

See below.

Complete loop

Here is the loop with the branch delay slots filled. One could be clever and eliminate the last no-op, but let's not.

#
#  branch delay slots  filled
#
init:   
        ori    $8,$0,0        # count = 0

test:   sltiu  $9,$8,10       # count < 10
        beq    $9,$0,endLp    # end loop if count >= 10
        sll    $0,$0,0        # delay

                              # do stuff

        addiu  $8,$8,1        # count++ ;
        j      test
        sll    $0,$0,0        # delay

endLp:  sll    $0,$0,0        # branch target

The no-op at endLp is not filling a branch delay slot. It is there as a convenient target for the branch instruction.

With a few assembly language directives, the code is ready to run. Step through the code and watch $8 (count) increase from 0 to 0xA.

Question 12

Examine the program. How could you modify it to compute the sum of the integers 0 through 9?

Answer

See below.

Summing program

The loop is already correct for the problem. Don't change it. Computing the sum is done by adding just two statements:

##
##  Sum of integers 0 .. 9
##
## Registers:
##  $8  --- loop control
##  $9  --- scratch
## $10  --- sum

init:   ori    $10,$0,0       # sum = 0
        ori    $8,$0,0        # count = 0
test:   sltiu  $9,$8,10       # count < 10
        beq    $9,$0,endLp    # end loop if count >= 10
        sll    $0,$0,0        # delay

        addu   $10,$10,$8     # sum += count

        addiu  $8,$8,1        # count++ ;
        j      test
        sll    $0,$0,0        # delay

endLp:  sll    $0,$0,0        # jump target

Question 13

Of all the slls in the program, which one is the most dangerous to remove?

Answer

The first one.

Chapter quiz

Question 1

Examine the following program fragment:

        ori   $8,$0,13
        ori   $9,$0,1
        bltz  $8,target
        sll   $0,$0,0
        ori   $9,$0,0
      
target: sll   $0,$0,0       # arbitrary instruction

What value is found in $9 when control reaches target?

(A) 0
(B) 1
(C) 4
(D) 13

Answer

Question 2

Trick Question: Examine the following program fragment:

        ori   $8,$0,-57
        ori   $9,$0,1
        bltz  $8,target
        ori   $9,$0,0       # think about the delay
                            # slot
target: sll   $0,$0,0       # arbitrary instruction

What value is found in $9 when control reaches target?

(A) 0
(B) 1
(C) 3
(D) 4

Answer

Question 3

Examine the following program fragment:

        ori   $8,$0,13
        ori   $9,$0,1
        bgez  $8,target
        sll   $0,$0,0
        ori   $9,$0,0
      
target: sll   $0,$0,0       # arbitrary instruction

What value is found in $9 when control reaches target?

(A) 0
(B) 1
(C) 4
(D) 13

Answer

Question 4

Examine the following program fragment (slightly different from the previous):

        ori   $8,$0,13
        bgez  $8,target
        ori   $9,$0,1
        ori   $9,$0,0
      
target: sll   $0,$0,0       # arbitrary instruction

What value is found in $9 when control reaches target?

(A) 0
(B) 1
(C) 4
(D) 13

Answer

Question 5

Examine the following program fragment:

        addiu    $3,$0,-13
        addiu    $7,$0,23
        ?????

Pick the instruction to replace ????? that will set register $10 to one.

(A) sltu $3,$7,$10
(B) slt $10,$7,$3
(C) slt $10,$3,$7
(D) sltu $10,$3,$7

Answer

Question 6

Examine the following program fragment:

        addiu    $3,$0,-13
        slti     $5,$3,-8

What value is in $5 after both instructions exectute?

(A) 0
(B) 1
(C) -8
(D) -13

Answer

Question 7

Examine the following program fragment:

        ori      $3,$0,25
        slti     $5,$3,53

What value is in $5 after both instructions exectute?

(A) 0
(B) 1
(C) 25
(D) 53

Answer

Question 8

(Very Tricky:) Examine the following program fragment:

        addiu    $3,$0,-1
        slti     $5,$3,17

What value is in $5 after both instructions exectute? (If your answer is incorrect, run the program with SPIM and examine the registers. SPIM does not "know" how the bit pattern got into $3, it's just a pattern and the slti instruction acts on it mechanically.)

(A) 0
(B) 1
(C) -8
(D) -13

Answer

Question 9

Which style of implementing a counting loop is usually easiest to understand?

(A) data driven loop
(B) bottom driven loop
(C) conditional driven
(D) top driven loop

Answer

Question 10

Examine the following program fragment:

       ori    $5,$0,5       # initialize count
       ori    $8,$0,0       # initialize accumulator
       
test:  bltz   $5,done
       sll    $0,$0,0
       addu   $8,$8,$5      # add count to accumulator
       addiu  $5,$5,-1
       j      test
       sll    $0,$0,0
       
done:  sll    $0,$0,0

How many times is the addu instruction executed?

(A) 0
(B) 5
(C) 6
(D) 7

Answer

Programming exercises

For these programming exercises, use only those instructions that have been discussed so far in these notes:


`add`	`div`	`mflo`	`slt`, `slti`
`addi`	`divu`	`mult`	`sltu`, `sltiu`
`addiu`	`j`	`multu`	`sra`
`addu`	`lb`	`nor`	`srl`
`and`	`lbu`	`or`	`sub`
`andi`	`lh`	`ori`	`subu`
`beq`	`lhu`	`sb`	`sw`
`bgez`	`lui`	`sh`	`xor`
`bltz`	`lw`	`sll`	`xori`
`bne`	`mfhi`

In the Settings menu of SPIM set Bare Machine ON, Allow Pseudo Instructions OFF, Load Trap File OFF, Delayed Branches ON, Delayed Loads ON, Mapped IO OFF, Quiet OFF.

Run the programs by setting the value of the PC to 0x400000 and then single stepping (pushing F10) or by multiple stepping (push F11 and enter a number of steps). Observing the results in the SPIM window.

Exercise 1 (*) - sum of evens; sum of odds

Write a program that computes three sums:

+ 2 + 3 + 4 + ... + 99 + 100

+ 3 + 5 + 7 + ... + 97 + 99

+ 4 + 6 + 8 + ... + 98 + 100

Use a register (say $8) for the sum of evens, a register (say $9) for the sum of odds, and another (say $10) for the sum of all numbers.

Do this with one counting loop. The loop body contains logic to add the count to the proper sums.

Exercise 2 (*) - significant bits

With an ori instruction, initialize $8 to a bit pattern that represents a positive integer. Now the program determines how many significant bits are in the pattern. The significant bits are the leftmost one bit and all bits to its right. So the bit pattern:

0000 0000 0010 1001 1000 1101 0111 1101

... has 22 significant bits. (To load register $8 with the above pattern, 0x00298D7D, use an ori followed by a left shift followed by another ori.)

Exercise 3 (**) - allowed ranges

A temperature in $8 is allowed to be within either of two ranges: 20 <= temp <= 40 and 60 <= temp <= 80. Write a program that sets a flag (register $3) to 1 if the temperature is in an allowed range and to 0 if the temperature is not in an allowed range.

It would be helpful to draw a flowchart before you start programming.

Exercise 4 (****) - median of three

Write a program that computes the median of three values in memory. After it has been found, store the median in memory.

      .data
A:    .word 23
B:    .word 98
C:    .word 17    

The median of three integers is greater than or equal to one integer and less than or equal to the other. With the above three integers the median is "23". Assume that the data changes from run to run. Here is some more possible data:

      .data
A:    .word 9
B:    .word 98
C:    .word 9    

With the new data, the median is "9".

Structured Programming

This chapter discusses two topics of great interest in computer science: What machine instructions are needed in a processor, and: How to build programs that work. The MIPS instructions you have looked at so far are more than enough for a processor. The structured programming approach is used to build reliable programs.

Chapter Topics:

Computing power
The minimum set of machine instructions
Throughput
Complex instructions sets vs Reduced instruction sets
Structured programming
The minimum set of control structures

Much of the chapter is about the following question, which, oddly, is usually ignored in introductory courses.

Question 1

When is one processor more powerful than another?

Answer

When the one processor can compute something that the other one can't.

But that answer is somewhat vague.

Computing power

So far, these chapters have covered several kinds of instructions:

Bit-wise logic instructions
Integer arithmetic instructions
Memory access instructions
Instructions that conditionally alter program flow

You might wonder how many instructions one processor must have in order to be as powerful as another. The answer is: the above set of instructions is more than enough. But the idea of computer "power" is somewhat vague. Sometimes people use it to mean "speed" and sometimes to mean "what a processor can compute." Usually it means a fuzzy combination of both. Let us use the following definition:

Computing Power: Two processors have the same computing power if they can run the same programs (after translation into each processor's machine language) and produce the same results.

For example, say that two processors have the same power. Then if one processor can run a particular program, then the other one must be able to run it, and both processors produce the same result. This must be true for all programs (after appropriate compilation into the machine language for each processor).

Sometimes the result a program produces depends on the compiler. For example, different compilers for C use different numbers of bits for the data type int. But that is an effect of the compiler, not of the processor. All that matters for processor "power" is that it is possible to translate identical programs into machine language appropriate for each processor and that these machine language programs produce the same result on each processor.

Processor speed is left out of the definition. It is helpful to regard computing power and processor speed as separate aspects. Memory and peripherals (such as graphics boards) are also left out.

Question 2

Can a program that uses 64-bit integers run on a processor that has 32-bit words?

Answer

Yes. Arithmetic with integers of any size can be implemented no matter what word size the processor uses. This is done by using both the integer arithmetic instructions of the processor (if any) and bit manipulation instructions.

Equal power processors

For example: 16-bit Intel microprocessors can run programs that use 64-bit integer arithmetic. This is done by using several 16-bit machine operations for each 64-bit operation.

As a more modern example: Pentium-1 and Pentium-4 processors can run the same programs. One (the P-4) has a faster machine cycle than the other. And one (the P-4) has more types of machine instructions than the other. If you have a C program that computes something, both processors can run it, and whatever it computes comes out the same on both (assuming appropriate compilers). The running time would be far longer on the P-1 than on the P-4, but running time is not part of the definition.

In 1952 the SWAC digital computer was programmed to find perfect numbers. A perfect number is an integer whose integer divisors sum up to the number. For example, 6 is perfect because 6 is divided by 1, 2, and 3 and 6 = 1 + 2 + 3. Other perfect numbers are 28, 496, and 8128.

After much computation SWAC found the perfect number 33550336.

Nice article about the SWAC: www.computerhistory.org

Question 3

Will a modern Pentium processor find the same perfect numbers as the SWAC?

Answer

Of course. My 1.7 GHz processor quickly finds 33550336.

Minimum instruction set

But, according to the definition of computing power, my very fast processor is no more powerful than the SWAC.

Processors must control their peripheral devices, and send and receive information from them. They do this by reading and writing bit patterns on the system bus, and by reading and writing special memory addresses that are assigned to peripherals. So any processor that can read and write memory can also control hard disks and graphics boards. No special kind of processor "power" is needed for that. The multimedia instructions that have been added to recent processor chips do not add extra computing power. Of course, they do add convenience and speed.

What machine instructions must a processor absolutely have?

Important Fact: All processors that support the fundamental machine instructions of bit manipulation, conditional branching, and memory access have the same computing power. All processors have these instructions (and many more). All processors are equivalent in computing power (in the sense of the previous definition).

Arithmetic (both integer and floating point) can be done with bit manipulation instructions, so arithmetic instructions are not fundamental (but are almost always included in a processor).

Above a certain minimum set of instructions adding new instructions does not add to the computing power of a processor. (To learn more about this topic, take a course in Foundations of Computation or in Mathematical Logic).

Question 4

(Thought question:) Why do most processors have many more instructions than the minimum set needed for full computing power?

Answer

For convenience and throughput.

Convenience and throughput

Convenience is important. You likely have been frustrated trying to find the right MIPS instructions to use in your programs. Sometimes it feels like fitting square pegs into round holes. How much more convenient to have a wide selection of pegs, even if they don't add computing power!

Convenience is not just for human programmers. Compilers and other systems software output machine code. Compilers are easier to write, and are less buggy when there is a rich set of instructions.

Throughput is how much computing a processor (or full computer system) can perform in a unit of time. Say that a processor can perform 50 million instructions in one second. The more computing each instruction does, the greater the throughput. So most processors have instructions that do much more than the bare minimum.

Question 5

Which of the following improvements increases the throughput of a computer system?

Faster machine cycle (500 MHz to 1000 MHz).
More bits on the system bus (32 bits to 64 bits).
More main memory (128 Meg to 512 Meg).
Larger hard disk (20 Gig to 40 Gig).
Faster data transfer rate between the hard disk and memory (40 MBps to 80 MBps).
Bigger monitor (17" to 21").
Many, big, complex machine instructions.

Answer

Faster machine cycle (500 MHz to 1000 MHz).
- Yes. More instructions done per second.
More bits on the system bus (32 bits to 64 bits).
- Yes. More data moved per transfer operation.
More main memory (128 Meg to 512 Meg).
- Yes. Data can be kept in fast main memory rather than slow disk memory.
Larger hard disk (20 Gig to 40 Gig).
- No, assuming that all data and software fit on the smaller size.
Faster data transfer rate of hard disk (40 MBps to 80 MBps).
- Yes. Processor spends less time waiting for data.
Bigger monitor (17" to 21").
- No.
Many, complex machine instructions.
- Maybe, maybe not.

CISC

A CISC (Complex Instruction Set Computer) processor has many instructions, some of them complex instructions that do a lot per machine cycle. The Intel processors are CISC. A RISC (Reduced Instruction Set Computer) processor has few instructions, most of them simple. It may take several instructions to do the equivalent of a CISC instruction. The MIPS processors are RISC.

If everything else were the same, CISC would have greater throughput. However, a larger silicon chip is needed to implement the many complex instructions. This means that data must move through greater distances, and that takes more time. So the machine cycle must be slower. Also, the instructions themselves take more time to execute (usually more than several RISC instructions). Fine tuning the chip for speed is difficult when there are many instruction types. The simple instructions are compromised by the complex ones. But the simple instructions are the most frequently executed!

Worse, it is hard for a compiler to make use of complex instructions. They frequently compile programs into machine programs that use only simple instructions. The MMX (multi-media extension) instructions added to Pentium chips (in 1997) are not output by any compiler. (In 2015 compilers still have trouble with these instructions.)

Question 6

If you are writing a computer game and wish to use MMX instructions, what must you do?

Answer

You must use assembly language to write some functions of the game.

RISC

Reduced Instruction Set Computers (RISC) have a small number of instructions. The idea of RISC is that it is better to have a small, sleek, fast instruction set than to have a big collection of poorly coordinated, ungainly, complex instructions.

If a small set of instructions is implemented on a chip, the chip now has room for more general purpose registers, a larger on-chip memory cache, an instruction pipeline, and other features that add speed.

For the last decade or so RISC chips have been ahead of CISC chips. But modern CISC chips now include many of the features of RISC chips (such as cache and pipelining). Modern CISC chips have greater throughput than older RISC chips. Consumer and office computers use CISC chips to be compatible with existing Windows software. High-end workstations and recently designed systems (such as embedded systems) typically use RISC.

Question 7

Examine (for a moment) the following program excerpt. (Trick question:) Is it correct?

start:   ori   $8,$0,4      # $8 = 4
         ori   $9,$0,12     # $9 = 12
         addu  $10,$8,$9    # $10 = 12+4 = 16
         sll   $10,$10,2    # $10 = 16*4 = 64

Answer

start:   ori   $8,$0,4      # $8 = 4
         ori   $9,$0,12     # $9 = 12
         addu  $10,$8,$9    # $10 = 12+4 = 16
         sll   $10,$10,2    # $10 = 16*4 = 64

Ordinarily, you would say "yes", assuming that the comments are correct.

Bug-free software

The answer assumes that execution starts at start. What if execution started at the addu instruction? Registers $8 and $9 would probably contain different numbers. That could happen if the following code were somewhere in the program:

ori   $8,$0,99     # $8 = 99
ori   $9,$0,43     # $9 = 43
j     start+8      # jump to the second statement after start

start is a symbolic address that stands for the first instruction's run time address. start+8 stands for the address 8 bytes away. The jump instruction transfers control to that address.

Question 8

Is there a way to prevent distant statements from jumping into the middle of a block?

Answer

No.

Bug-free blocks

Bugs happen when control jumps into the middle of a block of code. If the block was written to execute from start to finish, then the starting statements must execute or there will be errors. Don't jump into blocks, or bugs will happen.

This is one of the ideas of structured programming. A block of code is a list of instructions that has one entry point (where execution starts) and one exit point (where execution leaves the block). The entry point has well defined entry conditions. The exit point has well defined exit conditions.

For the block to execute correctly, execution must start at the entry point, and the entry conditions must be met. When execution leaves the block, the exit conditions are true (if the block itself is bug free).

Question 9

Are Java, C, and C++ structured languages?

Answer

Yes, all modern high level programming languages are structured.

Structure rule one: code block

If the entry conditions are correct, but the exit conditions are wrong, the bug must be in the block. This is not true if execution is allowed to jump into a block. The bug might be anywhere in the program. Debugging under these conditions is much harder.

Rule 1 of Structured Programming: A code block is structured. In flow charting terms, a box with a single entry point and single exit point is structured.

This may look obvious, but that is the idea. Structured programming is a way of making it obvious that program is correct.

In assembly language there is no syntax for showing program blocks. You think about them when you design the program, and when you draw a flowchart. But in coding you just follow your design.

Question 10

How is a code block implemented in assembly language?

Answer

You must write statements which you intend to have executed in sequence. Then you must be careful not to jump into the middle of them.

Implementing rule one

Statements automatically execute in sequence. There is no language support for enforcing the single entry/single exit idea. You must carefully follow the rule as you program.

You might think that the rule is followed if you only jump to labeled statements. But a statement in the middle of a block might have a label, as in the following:

start:   ori   $8,$0,4      # $8 = 4
         ori   $9,$0,12     # $9 = 12
midblk:  addu  $10,$8,$9    # $10 = 12+4 = 16
         sll   $10,$10,2    # $10 = 16*4 = 64

         ....

         ori   $8,$0,99     # $8 = 99
         ori   $9,$0,43     # $9 = 43
         j     midblk       # jump to the second statement after start

Question 11

Is there a syntax for defining code blocks in high-level languages like Pascal, C, or Java?

Answer

Yes: in Pascal begin and end delimit a block. In C and Java { and } delimit a block.

Sequence

Pascal, C, and Java are structured languages. A block of statements in those languages has one entry point and one exit point.

What about two blocks in a row? Say that the exit conditions of block 1 are the correct entry conditions for block 2. Then the two blocks can follow in sequence.

The set of two blocks can be regarded as one big block. If the entry conditions for block 1 are correct, then the exit conditions for block 1 are correct, then the entry conditions for block 2 are correct, then (finally) the exit conditions for block 2 are correct.

In terms of the big block, if the entry conditions for the big block are correct, then the exit conditions for the big block are also correct.

Question 12

Look at the big block (in dotted lines). If the big block entry conditions are correct, what do you know about the big block exit conditions?

Answer

If the big block entry conditions are correct, then the big block exit conditions are correct.

Structure rule two: sequence

A sequence of blocks is correct if the exit conditions of each block matches the entry conditions of the following block. Execution enters each block at the block's entry point, and leaves through the block's exit point. The whole sequence can be regarded as a single block, with an entry point and an exit point.

Rule 2 of Structured Programming: Two or more code blocks in sequence are structured.

The assembly language implementation of this rule is the same as rule one: a programmer must consciously follow the rule. This means that there must be no jumps elsewhere in the code to points inside the blocks.

Question 13

Are if-then-else structures possible in assembly language?

Answer

Yes.

Structure rule three: alternation

If-then-else is sometimes called alternation (because there are alternative choices). In structured programming, each choice is a code block. If alternation is arranged as in the flowchart at right, then there is one entry point (at the top) and one exit point (at the bottom). The structure should be coded so that if the entry conditions are satisfied, then the exit conditions are fulfilled (just like a code block).

Rule 3 of Structured Programming: The alternation of two code blocks is structured.

An example of an entry condition for an alternation structure is: register $8 contains a signed integer. The exit condition might be: register $8 contains the absolute value of the signed integer. The branch structure is used to fulfill the exit condition.

Question 14

Can the condition which is tested (in the diamond-shaped box) be complicated?

Answer

Sure. It might involve many assembly statements.

Implementing alternation

There is no explicit support for alternation in assembly language. You must do it in pieces, as in the following. Say that registers $8 and $9 hold the values a and b.

        ...                #   
if:     beq  $8,$9,yes     # is a==b ?
        nop                #    no
        ...                #    false block
        ...                #    
        ...                #  
        j    endif
        nop
yes:    ...                #   yes
        ...                #   true block
        ...                #
endif:  nop                # always executed

Notice that the order of the true block and the false block is the opposite of what is usual in high level languages.

This is not the only way alternation can be implemented. You can put the true block first if you want (but then the branch statement must be changed). Also, testing the condition might be complicated and might involve several branches and other instructions that the flowchart does not show.

Question 15

Is an if-endif (single alternative) structured?

Answer

Yes. It is regarded as if-then-else with an empty else-branch.

Structure rule four: iteration

Iteration (while-loop) is arranged as at right. It also has one entry point and one exit point. The entry point has conditions that must be satisfied and the exit point has conditions that will be fulfilled. There are no jumps into the structure from external points of the code.

Rule 4 of Structured Programming: The iteration of a code block is structured.

As with the other structures, iteration is built out of assembly language pieces. There are various ways to do it. The following is typical

while:  bltz   $8,endWh

        . . .

        j      while
        sll    $0,$0,0

endWh:  sll    $0,$0,0

The particular instructions that make up the test depend on circumstances.

Question 16

In a structured language (such as Pascal, C, or Java) can an alternation structure be put inside an iteration structure? (can the body of a while-loop contain an if-then-else?)

Answer

Of course.

Looks like a code block

In all of the structures so far: there has been an entry point at the top and an exit point at the bottom. There are no jumps into or out of the structure. The entry point has entry conditions that are expected to be satisfied, and exit conditions that are fulfilled if they are. Each of the structures can be considered to be a code block.

It is OK for a structure to have jumps within it (in assembly language, this is how some structures are implemented). From the outside, each structure looks like a code block.

Question 17

Could a code block acting as a loop body be replaced by an alternation structure?

Answer

Yes

Structure rule five: nesting structures

In flowcharting terms, any code block can be expanded into any of the structures. Or, going the other direction, if there is a portion of the flowchart that has a single entry point and a single exit point, it can be summarized as a single code block.

Rule 5 of Structured Programming: A structure (of any size) that has a single entry point and a single exit point is equivalent to a code block.

For example, say that you are designing a program to go through a list of signed integers calculating the absolute value of each one. You might (1) first regard the program as one block, then (2) sketch in the iteration required, and finally (3) put in the details of the loop body.

Or, you might go the other way. Once the absolute value code is working, you can regard it as a single code block to be used as a component of a larger program.

Question 18

(Thought question:) Would it be a good idea to implement a complicated code block as a subroutine (as a function)?

Answer

Yes. We will do this in a later chapter.

Computational power of structured programming

You might think that these rules are OK for ensuring stable code, but that they are too restrictive. Some power must be lost. But nothing is lost. Two researchers, Böhm and Jacopini, proved that any program can be written in a structured style. No computing power is lost by restricting control flow to the forms of structured programming.

The other control structures you may know, such as case, do-until, do-while, and for are not needed. However, they are sometimes convenient, and are usually regarded as part of structured programming. In assembly language they add little convenience (since you have to implement them yourself). I recommend using only the structures discussed in this chapter.

Any program you need to write in assembly can be written in a structured style. You should think about a problem and the means of solving it in exactly the way you would using a high-level, structured language. Once you have the solution designed, the implementation becomes mere "coding". Coding is the near-mechanical translation of a design into code. Design and coding should not be done simultaneously (in an ideal world, where all projects are on time, and dot-coms never fail).

Question 19

(Thought question:) Can a structured assembly language program sometimes be speeded up by using non-structured control?

Answer

Sometimes. But you should write the structured version first. Keep it, so that when the speedy version turns out to be buggy you have something that works.

Top-down design

The absolute value problem (two pages back) is an example of Top-down structured design. When using this design method you start with just one program block, and then replace that block with one of the structures: sequence, alternation, or iteration. You then refine the new flowchart by replacing one of its blocks with one of the same structures. You continue replacing blocks with structures until the flowchart has enough detail to work as a program design.

For example, say that you want to design a program that adds up integers entered by the user. The initial design might look like the picture.

(Actually, you would probably do this step conceptually and not actually draw the flowchart.)

Question 20

Think about how the flowchart could be refined to show more detail. Replace the code block with just one structure.

Answer

See below. Usually programs need an initialization step, a middle step that does most of the work, and a final step to finish the work.

Three steps

The initialization step for our problem initializes the sum to zero.

The chart is still not detailed enough for programming. It needs some detail about how all those integers are to be read in. A loop is needed.

Question 21

Replace one of the blocks with an iteration structure.

Answer

The iteration will be in the middle step (see below).

Final design

After initialization, the program loops as long as there is data. When it reaches the end of the data, it writes out the answer and exits.

There are still some details left to specify, such as how the end of data is determined. This might be done with a special sentinel value, or by prompting the user for a "yes" or "no" response. But the over-all design for the program will be the same.

Question 22

Does this flowchart look familiar?

Answer

Yes. It looks about the same as the previous absolute value problem.

Universal flow chart

Many programs have this over-all design. Almost always there is an initialization step, then a step that processes data in a loop, then finally a wrap-up step. This is so common, you can nearly always start with the this design for any program.

A good first design step is to determine what the main loop should do. Of course, there will likely be nested loops and ifs, possibly several levels down. But if you have correctly designed the outer loop you have made great progress.

Question 23

Do these design ideas work only for assembly language?

Answer

No. The ideas of structured programming work for any programming language.

How this can help you

In almost all problems (in life or in programming) it helps to have a high-level view of what you are trying to achieve. A high level view keeps you focused on the task and directs your efforts. If you don't have a high-level view, you will probably think only in terms of what to do next, and you may never reach your goal. You might flounder forever, making small moves that don't add up to progress. This is like wandering aimlessly through a murky woods without a map. Or like trying to write a novel by picking one word at a time. Not likely to succeed.

In programming, a high-level view is given by a top-level structured flowchart (very often the Universal Flow Chart, modified to fit your task). Structuring forces the chart to be a genuine plan and not merely a poorly-conceived collection of boxes and lines. Any program can be described by a structured flow chart. If you show your program in a chart using a small number of boxes, it will be a top-level plan.

Now you can more sensibly focus on smaller tasks, and then yet smaller tasks, using the same method. This method of divide and conquer is a very powerful technique.

Question 24

Does your life need more structure?

Answer

Yes!

Chapter quiz

Question 1

Some microprocessors have a 64-bit word size. Can these microprocessors compute things that 32-bit microprocessors can not?

(A) No. All microprocessors have the fundamental operations it takes to have equal computing power.
(B) No. Two 32-bit microprocessors can be wired together to get the same power as a 64-bit microprocessors.
(C) Yes. They can compute with much larger numbers than microprocessors with smaller word sizes.
(D) Yes. They can have many more machine operations by using 64-bit machine instructions.

Answer

Question 2

Is speed considered to be a part of computing power?

(A) No — since the speed of a microprocessor is easily affected by the rest of the computer system.
(B) No — Computing power is concerned only with what can be computed, not how long it takes.
(C) Yes — faster microprocessors have more computing power.
(D) Yes — but if two microprocessors have equal speed, the one with the most instructions is more powerful.

Answer

Question 3

Does a microprocessor need special instructions for input/output and for graphics?

(A) No. This is done by using ordinary load and store operations with special addresses that have been assigned to the devices.
(B) No. This is done by loading a storing special registers within the microprocessor.
(C) Yes. Special I/O and graphics machine instructions are used, but these are not included in the definition of computing power.
(D) Yes. Special I/O and graphics machine instructions are used, and these are included in the definition of computing power.

Answer

Question 4

How much computing power can be expected of a modern microprocessor?

(A) The more money a microprocessor costs, the more computing power it will have.
(B) Microprocessors show great increases in computing power every year, so more recent ones have more power than old ones.
(C) Within the same family, microprocessors have the same computing power. But microprocessors in different families cannot be compared.
(D) All past and present general purpose microprocessors are equal in computing power.

Answer

Question 5

What is throughput?

(A) ... another word for computing power.
(B) ... how fast a computer system runs.
(C) ... how much computing a computer system can perform in a unit of time.
(D) ... how much data a mass storage system can store.

Answer

Question 6

What does RISC stand for?

(A) Regularized Instruction System Chip
(B) Reduced Information System Computing
(C) Registers Implemented with Silicon Chips
(D) Reduced Instruction Set Computer

Answer

Question 7

In structured programming, what is a block?

(A) A block is a section of code with just one entry point and just one exit point.
(B) A block is a section of code with one or more entry points and just one exit point.
(C) A block is a section of code with one or more entry points and one or more exit points.
(D) A block is a sequential section of code.

Answer

Question 8

Which one of the following statements is true?

(A) Code blocks in sequence are not structured.
(B) Programing done in assembly language is automatically structured.
(C) Two or more code blocks in sequence are structured.
(D) Object oriented languages are not structured.

Answer

Question 9

Is the alternation of code blocks structured?

(A) No.
(B) Yes.

Answer

Question 10

Is it possible to write a program in assembly language that can compute something that can't be computed by a program in a structured language?

(A) No. Unstructured assembly language and structured high level language have equal computational power.
(B) No. But the high level language can compute many more things than can be done in assembly language.
(C) Yes. Assembly language gives access to many operations not possible in any high level language.
(D) Yes. Assembly language has many more control structures and therefore more computational power.

Answer

Programming exercises

For these programming exercises, use only those instructions that have been discussed so far in these notes:


`add`	`div`	`mflo`	`slt`, `slti`
`addi`	`divu`	`mult`	`sltu`, `sltiu`
`addiu`	`j`	`multu`	`sra`
`addu`	`lb`	`nor`	`srl`
`and`	`lbu`	`or`	`sub`
`andi`	`lh`	`ori`	`subu`
`beq`	`lhu`	`sb`	`sw`
`bgez`	`lui`	`sh`	`xor`
`bltz`	`lw`	`sll`	`xori`
`bne`	`mfhi`

In the Settings menu of SPIM set Bare Machine ON, Allow Pseudo Instructions OFF, Delayed Branches ON, Delayed Loads ON, Mapped IO OFF, Load Exception Handler OFF.

Run programs by single stepping (pushing F10) or by clicking run and allowing control to go beyond the program. Implement while loops by branching to a no-op (sll $0,$0,0) at the end of the loop when the loop finishes.

Exercise 1 (*) - sequential memory locations

Write a program that stores the number 0 in the first four bytes of the .data section, then stores the number 1 in the next four bytes, then stores the number 2 in the four bytes after that and so on. Do this for numbers 0 through 24.

Of course you will do this in a counting loop. The address in the data section is contained in a base register. Increment the base register each time a number is stored.

The data section of your program should look like

         .data
array:   .space    100

Exercise 2 (***) - perfect number verifier

A perfect number is an integer whose integer divisors sum up to the number. For example, 6 is perfect because 6 is divided by 1, 2, and 3 and 6 = 1 + 2 + 3. Other perfect numbers are 28 and 496.

Write a program that determines if an integer stored in memory is a perfect number. Set a location in memory to 1 if the number is perfect, to 0 if not.

            .data
N:          .word      496
isperfect:  .word      0

It would be enormously helpful to first do this by hand with a couple of examples. Then draw a flowchart. Check the flowchart by following it with a few examples. Then write the program.

Exercise 3 (****) - perfect number searcher

This exercise continues exercise 2.

Write a program that searches for perfect numbers. It has an outer loop that counts upward from two to some limit. Each number is tested (as above) to see if it is a perfect number. If it is a perfect number it is stored at a memory location. Store perfect numbers at successive full-word memory locations. Look at exercise 1 for a way to do this.

Again, to save time and effort, create a flowchart first.

Exercise 4 (**) - prime number tester

Write a program that determines if an integer is prime. An integer is prime if it cannot be divided by any positive integer other than one and itself. The integer N to be tested will be in memory. Set the location isprime to 1 if N is prime, 0 otherwise.

          .data
N:        .word     223
isprime:  .word      0

To determine if N is prime, try to divide it with trial divisors from 2 up to N/2. If one of them divides N without a remainder, then N is not prime. You only have to try trial divisors up to N/2 because if N = trial*X, then trial = N/X and the only integer X less than 2 is 1.

Exercise 5 (***) - greatest common divisor

Write a program that computes the greatest common divisor (GCD) of two integers. The two integers and the GCD will be in memory.

        .data
N:      .word     21
M  :    .word     14
GCD:    .word      0

The GCD of two integers is the largest integer that divides them both with a remainder of zero.

GCD(21,14) = 7
GCD(14,21) = 7
GCD(27,36) = 9
GCD(25,50) = 25
GCD(19,36) = 1
GCD(12,55) = 1

Notice that GCD(X,Y) = GCD(Y,x). If the GCD of two integers is one, then the two integers are relatively prime. Two integers might be relatively prime without either one of them being prime.

If two integers have a common divisor, the difference of the two integers has the same divisor. To find the common divisor, keep subtracting one integer from the other until a common value is reached. Why is this? Say x and y have a common divisor, say d.

Then x = md and y = nd
Then (x – y) = md – nd = (m – n)d = kd
So the difference kd has the same divisor d

Your program can follow this algorithm:

get N
get M

while (N != M)
    if ( N > M )
        N = N - M;
    else
        M = M - N;

GCD =  N;

Load N and M integer registers and implement the algorithm using registers. There is no need to change the N and M in memory. When the loop finishes, store the resulting GCD into memory.

Exercise 6 (****) - Euler phi function

Write a program that computes the Euler phi function. The phi function is important in cryptography and internet security.

The phi function of an integer N is the number of positive integers less than N that do not share a common factor greater than one with N. Another way to say this is that phi(N) is the number of positive integers less than N that are relatively prime to N.

Two integers share a common factor if there is an integer greater than one that divides them both with no remainder. For example, 14 and 21 share a common factor of 7. Also, 7 and 21 share a common factor of 7. But 5 and 21 do not have a common factor.

Two integers have a factor in common if they have greatest commmon divisor greater than one. The greatest common divisor of 14 and 21 is 7. The greatest common divisior of 5 and 21 is 1.

So the phi function of N is the number of positive integers x less than N for which gcd(N,x) = 1.

Write a program that computes phi(N) for an integer N in the data section. Store the result back into the data section

        .data
N:      .word     21
phi:    .word      0

The logic of your program could be:

phi = 0;
trial = 1;
while ( trial < N)
{
    if ( gcd(N,trial) == 1 ) phi++;  
}

Programming Examples

This chapter contains two example programs that illustrate structured programming of loops and branches. It also discusses how character strings and integer arrays are represented in assembly language.

Chapter Topics:

null-terminated character strings
- String length program
Integer arrays
- Array summing program

A character string is a sequence of bytes containing ascii patterns.

Question 1

What is at the end of a null-terminated string?

Answer

A zero byte (also called a null byte).

Null-terminated string

A null-terminated string is a sequence of ASCII characters, one to a byte, followed by a zero byte (a null byte). null-terminated strings are common in C and C++. Here is how a string is declared in assembly language:

        .data
str:    .asciiz "Time is the ghost of space."

The characters are placed in memory in order, starting with the 'T'. The picture shows this. The blank bytes are filled with ASCII space (0x20). The last byte is filled with 0x00, represented with a slash.

Here is what it looks like in the data display of SPIM running on a Windows PC. The characters are placed in memory in sequential order beginning with 'T' (at address 0x10000000). This is hard to see in the display.

The right section of the SPIM display shows the characters in a string in the order they are stored in memory, one character per byte. However, in the hex display of memory, each group of four bytes is displayed in backwards order. The byte with the lowest address of a group is on the right; the byte with the highest address of the group is the leftmost byte. The red letters over each byte in the hex memory display have been added (by me) to show what each byte actually contains.

The hex memory display is aimed at displaying 32-bit little-endian integers. For 4-byte integers, this puts the most significant byte on the left, where it ordinarily goes when integers are printed. Unfortunately, this makes the hex display hard to read for data other than integers.

'T' ( 0x54 ) is at address 0x10000000.
'i' ( 0x69 ) is at address 0x10000001.
'm' ( 0x6d ) is at address 0x10000002.
'e' ( 0x65 ) is at address 0x10000003.
' ' ( 0x20 ) is at address 0x10000004.
'i' ( 0x69 ) is at address 0x10000005.
's' ( 0x73 ) is at address 0x10000006.
' ' ( 0x20 ) is at address 0x10000007.

In the red text, space (0x20) is shown as '_'. The null byte is at address 0x1000001B, followed by more nulls that are not part of the string.

Question 2

What character is at 0x10000008?

What character is at 0x10000010?

Answer

What character is at 0x10000008? t

What character is at 0x10000010? f

String length

The length of a null-terminated string is defined as the number of characters it contains not counting the null. To calculate length, start the count at zero. Then increment the count for each successive non-null byte. When you hit the null, stop counting.

The structured flow chart shows this procedure. It describes the algorithm in general terms. Assembly language details are left to the coding stage. Here is an outline of the program:

## strlen.asm
##
## Count the characters in a string
##
## Registers:
## $8 -- count
## 

        .text
        .globl  main
        
# Initialize
main:    ori     $8,$0,0     #  count = 0

# while not ch==null do
loop:    . . . 
         . . .
         . . .
         
         j    loop
         sll   $0,$0,0       # branch delay slot
         
# finish
done:    sll   $0,$0,0       # target for branch

         .data
string:  .asciiz  "Time is the ghost of space."

Question 3

Why is the null not counted in the length of a string? (Hint: think about what happens when two strings are concatenated.)

Answer

You would like:

length( string1 ) + length( string2 )   =  length( concat( string1, string2) )

If the null at the end of each string counted as part of its length, then this would be false, because the combined string has only one null (at the end) but there are two nulls in the two original strings.

Program continued

There are several ways to do what the second box of the chart calls for: "point at the first character". In a few chapters, doing this will be much easier. But, for now, remember that the data section starts at the address 0x10000000. The top halfword of this is 0x1000. This is loaded into the top halfword of the base register with lui.

Question 4

A few more statements have been added to get and test the current character. Fill in the blanks.

Answer

The answer is below.

Loop body

The address of the byte in the lbu instruction is computed by (displacement + $9). Since the base, $9, has the address of (points at) the character, the displacement is zero. The lbu loads the character into the low-order byte of $10. The upper three bytes are zero. The beq instruction tests if the entire register is zero, but, of course, that also tests if the low-order byte is zero.

Next, the program must increment the count, then move the base register (the character pointer) to the next byte of the string.

Question 5

Fill in the blanks.

Answer

The complete program is below.

Complete program

The base register is "moved" through the string by increasing the address by one for each byte. This is also called "moving a pointer". This program has an important concept on every line. Using these concepts is how you program!

## strlen.asm
##
## Count the characters in a string
##
## Registers:
##  $8 -- count
##  $9 -- pointer to the char
## $10 -- the char (in low order byte)

         .text
         .globl  main
         
# Initialize
main:    ori      $8,$0,0        #  count = 0
         lui      $9,0x1000      #  point at first char

# while not ch==null do
loop:    lbu      $10,0($9)      # get the char
         sll      $0,$0,0        # branch delay
           
         beq      $10,$0,done    # exit loop if char == null
         sll      $0,$0,0        # branch delay

         addiu    $8,$8,1        # count++
         addiu    $9,$9,1        # point at the next char

         j        loop
         sll      $0,$0,0        # branch delay slot

# finish
done:    sll      $0,$0,0        # target for branch

         .data
string:  .asciiz  "Time is the ghost of space."

The program is very close to the C standard library function int strlen(char*). As it is written, the program contains its own data, and it is not written as a function that can be called. In a few chapters, we will write functions that can be called and that use parameters.

Question 6

Does the program work correctly if the string is the null string (the string of length zero that consists of just a null byte)?

Answer

Yes.

Integer array

An array of integers is a sequence of integers in successive words of memory. The number of integers in the array is also a value kept in memory.

In assembly language, an array of integers is declared using the directive .word followed by a list of comma separated integers.

        .data
size:   .word  17
array:  .word  12, -1, 8, 0, 6, 85, -74, 23, 99, -30, 30, 95, 4, 7, 10, 28, 14

The .word directive puts the data in word-aligned locations.

Question 7

Why can't an array end with a null word, like with strings?

Answer

Because a null word, 0x00000000, is a legitimate (and common) integer.

Example program: sum of an array

Sometimes programmers try to end an array with a pattern that they think will never occur in the data, like 0x00000000, of 0xFFFFFFFF. This is an invitation for bugs. The pattern might truly never occur in correct data, but faulty data is common, and your program should deal with it gracefully.

The next example program uses an array of integers. The length of the array is given in the data. The program calculates three sums:

The sum of all integers in the array
The sum of the positive integers in the array
The sum of the negative integers in the array

You have probably seen this example before, done in Java or C.

Question 8

Sketch out the flowchart for the program (or write it in pseudo code).

Answer

See below.

Flowchart

The flow chart could work for any language. Language details are left out of the design. The chart is a structured flowchart.

An array entry is one integer stored in the array, sometimes called an array element.

Here is an outline of the program. So far, it implements the first box of the flowchart, plus part of the loop. The data for the array has been declared.

SPIM initializes registers to zero, but it is good practice to explicitly zero accumulators.

Question 9

Fill in the blanks.

Answer

See below.

Loading the array size

Now, you need to get the size of the array. The size is the first word of the data section. Recall that the data section starts at the address 0x10000000. Load the top half of that address into the base register. Then, load the size into register $15.

After that, point the base register at the first word of the array.

Question 10

Fill in the blanks.

Answer

See below.

Building the loop

Perhaps you added one to the base register, rather than four. The base register $9 needs to be increased by the size of a full word, four. Now work on getting the loop correct. The beq instruction branches out of the loop if the count has reached the end of the array.

At the bottom of the loop, the count is incremented and the base register is moved to the next array entry.

Question 11

Fill in the blanks.

Answer

See below.

Add element to the sum

At the bottom of the loop, count is increased by one. The pointer is increased by four.

Now work on the middle of the loop body. Get the current array element and add it to the sum of all integers.

Question 12

Fill in the blanks.

Answer

See below.

Is the element negative?

Next, set a flag (register $14) if the array entry is negative. Branch to the symbolic address neg if it is.

Question 13

Fill in the blanks.

Answer

See below.

Branches for positive and negative

Now, implement the true block and the false block. One adds the current entry to the sum of negatives, the other adds it to the sum of positives. Check that you don't have the blocks confused.

Question 14

Fill in the blanks.

Answer

See below.

Complete program

Here, is the complete program suitable for running. The data in the array has been simplified to make testing easier. Study how the code matches the flow chart.

## addIntArray.asm
##
## Sum all integers, the positive integers,
## and the negative integers in an array.

## Registers:
##  $8 -- count
##  $9 -- pointer to the array entry
## $10 -- current array entry
## $11 -- sum of all integers
## $12 -- sum of negative integers
## $13 -- sum of positive integers
## $14 -- pos. or neg. flag
## $15 -- length of the array

         .text
         .globl  main
# Initialize
main:    ori      $8,$0,0        #  count = 0
         ori      $11,$0,0       #  sum = 0
         ori      $12,$0,0       #  neg = 0
         ori      $13,$0,0       #  pos = 0
         lui      $9,0x1000      #  point at SIZE

         lw       $15,0($9)      #  get SIZE
         addiu    $9,$9,4        #  point to first entry

# while  count < SIZE do
loop:    beq      $8,$15,done 
         sll      $0,$0,0        #  branch delay

# get entry, add to sum
         lw       $10,0($9)      #  get entry
         sll      $0,$0,0        #  load delay
         addu     $11,$11,$10    #  add to sum

#  test neg. or pos.
         slti     $14,$10,0x0    #  set $14 if entry is neg
         bne      $14,$0,neg     #  branch if negative
         sll      $0,$0,0        #  branch delay
         addu     $13,$13,$10    #  positive: add to PLUS
         j        ifend
         sll      $0,$0,0        #  branch delay

neg:     addu     $12,$12,$10    #  negative: add to NEG

ifend:   addiu    $8,$8,1        # count++
         addiu    $9,$9,4        # point at next entry
         j        loop
         sll      $0,$0,0        #  branch delay

# finish
done:    sll      $0,$0,0        # target for branch

        .data
size:   .word  4
array:  .word  1, 2, -2, -1

Question 15

With the testing data (in the above program), what values are computed for the three sums?

Answer

$11 — sum of all: 0

$12 — sum of negative: -3

$13 — sum of positive: 3

When testing a program, it is convenient to use data for which the correct results are obvious.

Running the program

You can run the program by initializing the PC (to 0x00400000) and repeatedly pushing single step (F10). But when the array has many entries this is tedious. On the version of SPIM for windows you can do the following to start the program and run it to completion:

Initialize the PC (as usual).
Hit F10 once (or more, if you want).
Click on "Continue" in the "Simulator" menu.
The program will execute until the PC goes beyond the end of the program.
The simulator halts.

This procedure is not very elegant. Set a breakpoint at the last instruction if you want. Or set a breakpoint at the top of the loop so that the loop body executes once per "Continue".

If you click "Go" in the menu, the simulator tries to start with some code that is not there and halts. "Bare Machine" is set in the options menu, so the extra code is not there. Keep things this way for now, unless you want to experiment.

Question 16

Is a single run of the program enough to be certain that it is correct?

Answer

No. Even a simple program like this might have some bugs that only careful testing can uncover.

Chapter quiz

Question 1

How many bytes are used for an ASCII-encoded character?

(A) 1
(B) 2
(C) 4
(D) 8

Answer

Question 2

Which of the following puts a null-terminated string "Hello World" in memory?

(A) .ascii "Hello World"
(B) .text "Hello World"
(C) .word "Hello World"
(D) .asciiz "Hello World"

Answer

Question 3

How is a null-terminated string arranged in memory?

(A) Characters are grouped four to a full word from right to left. Extra space at the end is filled with zeros.
(B) Characters are put in sequential order with a null byte after the last character.
(C) Characters are put in sequential order one per word with a word of null after the last character.
(D) Characters are put in sequential order in memory. Space characters are replaced with null bytes.

Answer

Question 4

A character has been loaded into a register by using a lbu instruction. What does the register look like?

(A) The character is in the low-order byte. The three high-order bytes are zero.
(B) The character is in the low-order byte. The three high-order bytes are whatever they were just before the instruction.
(C) The character is in the high-order byte. The three low-order bytes are zero.
(D) The character is in the low-order byte. The three high-order bytes contain one-bits.

Answer

Question 5

How is a character pointer typically moved from one character to the next?

(A) It is incremented by four with a addiu instruction.
(B) It is incremented by one with a add instruction.
(C) It is incremented by one with a addiu instruction.
(D) It is changed with a move instruction.

Answer

Question 6

How is an array of integers typically implemented?

(A) The integers of the array are put in sequential words of memory. A word of zeros follows the last integer.
(B) The integers of the array are put in sequential words of memory. Another word of memory contains the length of the array.
(C) The integers of the array are put in sequential bytes of memory. The last byte holds the length of the array.
(D) Each integer of the array is assigned to one of the general purpose registers.

Answer

Question 7

What instruction can be used to load register $10 with the first address of the .data section?

(A) lui $10,0x1000
(B) lui $10,0x10000000
(C) ori $10,0x1000
(D) andi $10,0x4000

Answer

Question 8

How much should a base register be incremented by to move from one integer to the next in an array of integers?

(A) 1
(B) 2
(C) 4
(D) 8

Answer

Question 9

Say that register $8 contains an integer and that register $9 contains a sum. The integer is to be added to the sum only if it is positive. Which of the following code sequences does this?

(A)

        bltz   $8,noadd
        sll    $0,$0,0
        addu   $9,$9,$8
noadd:

(B)

        bltz   $8,noadd
        addu   $9,$9,$8
noadd:

(C)

        blgez  $8,noadd
        sll    $0,$0,0
        addu   $9,$9,$8
noadd:

(D)

        slt    $5,$8,$0
        beq    $5,$0,noadd
        sll    $0,$0,0
        addu   $9,$9,$8
noadd:

Answer

Question 10

When the SPIM simulator is set to "bare machine" to what value should the PC be initilized?

(A) 0x00000000
(B) 0x10000000
(C) 0x100000
(D) 0x400000

Answer

Programming exercises

For these programming exercises, use only those instructions that have been discussed so far in these notes:


`add`	`div`	`mflo`	`slt, slti`
`addi`	`divu`	`mult`	`sltu, sltiu`
`addiu`	`j`	`multu`	`sra`
`addu`	`lb`	`nor`	`srl`
`and`	`lbu`	`or`	`sub`
`andi`	`lh`	`ori`	`subu`
`beq`	`lhu`	`sb`	`sw`
`bgez`	`lui`	`sh`	`xor`
`bltz`	`lw`	`sll`	`xori`
`bne`	`mfhi`

In the Settings menu of SPIM set Bare Machine ON, Allow Pseudo Instructions OFF, Load Trap File OFF, Delayed Branches ON, Delayed Loads ON, Mapped IO OFF, Quiet OFF.

A carefully thought-out and debugged flow chart will cut in half the time it takes to do one of these exercises. Plus you will learn good coding techniques and gain insight into programming languages and language translation.

Exercise 1 (*) - to lower case

Declare a string in the data section:

          .data
string:   .asciiz    "ABCDEFG"

Write a program that converts the string to all lower case characters. Do this by adding 0x20 to each character in the string. (See Appendix F to figure out why this works.)

Assume that the data consists only of upper-case alphabetical characters, with no spaces or punctuation.

Exercise 2 (***) - capitalization

Declare a string in the data section:

          .data
string:   .asciiz    "in a  hole in the   ground there lived a hobbit"

Write a program that capitalizes the first letter of each word, so that after running your program the data will look like this:

          .data
string:   .asciiz    "In A  Hole In The   Ground There Lived A Hobbit"

Easy version: assume that the data consists only of lower case characters and spaces. There may, however, be several spaces in a row. Be sure to capitalize the first letter of the string.

Medium-hard version: assume that the data consists only of upper and lower case characters and spaces. Alter a character only if it is lower case and follows a space.

Exercise 3 (***) - space removal

Declare a string in the data section:

          .data
string:   .asciiz    "Is  this a dagger    which I see before me?"

Write a program that removes all the spaces from the string so that the resulting string looks like:

          .data
string:  .asciiz    "IsthisadaggerwhichIseebeforeme?"

Be sure to end the result string with a null after its final character.

Easy version: declare a second buffer to hold the result string. Transfer non-space characters from the input string to the result string.

Medium-hard version: Use only the buffer that holds the original string. Use two character pointers, one for the current character and another for its destination.

The logic for this can be tricky. Figure out how you would do it with characters arrays in C or Java before you try it with assembly. For testing, use a data string such as the following:

          .data
string:  .asciiz    "aaaa bbbb  cccc   dddd    eeee"

Exercise 4 (**) - array maximum and minimum

Declare an array of integers, something like:

        .data
size:   .word 8
array:  .word 23, -12, 45, -32, 52, -72, 8, 13

Write a program that determines the minimum and the maximum element in the array. Assume that the array has at least one element (in which case, that element will be both the minimum and maximum.) Leave the results in registers.

Exercise 5 (**) - ascending numbers

Declare an array of integers, something like:

        .data
size:   .word 10
array:  .word 2, 4, 7, 12, 34, 36, 42, 8, 57, 78

Write a program that determines if the numbers form an increasing sequence where each integer is greater than the one to its left. If so, it sets a register to 1, otherwise it sets the register to 0.

Of course, write the program to work with an array of any size, including 0. Arrays of size 0 and size 1 are considered to be ascending sequences. The array can contain elements that are positive, negative, or zero. Test the program on several sets of data.

Exercise 6 (*) - paired data

In this program data comes in pairs, say height and weight:

        .data
pairs:  .word 5                  # number of pairs
        .word 60, 90             # first pair: height, weight
        .word 65, 105
        .word 72, 256
        .word 68, 270
        .word 62, 115

Write a program that computes the average height and average weight. Leave the results in two registers.

Exercise 7 (**) - array reversal

Declare an array of integers in the usual way:

        .data
size :  .word 7                     # number of elements
        .word 1, 2, 3, 4, 5, 6, 7

Write a program that reverses the order of elements in the array. The result will be as if the array were declared:

        .data
size :  .word 7                     # number of elements
        .word 7, 6, 5, 4, 3, 2, 1

Test that the program works on arrays of several lengths, both even and odd lengths.

Jump and Branch Instructions​

How a jump works​

Altering the PC​

Practice​

The jump instruction​

Most jumps (and branches) are local​

Jump practice​

Symbolic address​

Assembled program​

Conditional branches​

Ifs and whiles​

Branch on not equal​

Two-way decision​

Absolute value​

Shifting the sign bit​

Store -A​

Complete program​

Chapter quiz​

Programming exercises​

Exercise 1 (*) non-ending loop​

Exercise 2 (*) non-ending loop with overflow​

Exercise 3 (**) - counting loop​

Exercise 4 (**) - Fibonacci series​

Set Instructions and more Branch Instructions​

Branch on less than zero, branch on greater than zero​

Set on less than​

Set on less than immediate​

Temperature range tester​

Start on the program​

More blanks​

Complete program​

Delay slot bug​

Counting loop​

Assembly language loop​

Complete loop​

Summing program​

Chapter quiz​

Programming exercises​

Exercise 1 (*) - sum of evens; sum of odds​

Exercise 2 (*) - significant bits​

Exercise 3 (**) - allowed ranges​

Exercise 4 (****) - median of three​

Structured Programming​

Computing power​

Equal power processors​

Minimum instruction set​

Convenience and throughput​

CISC​

RISC​

Bug-free software​

Bug-free blocks​

Structure rule one: code block​

Implementing rule one​

Sequence​

Structure rule two: sequence​

Structure rule three: alternation​

Implementing alternation​

Structure rule four: iteration​

Looks like a code block​

Structure rule five: nesting structures​

Computational power of structured programming​

Top-down design​

Three steps​

Final design​

Universal flow chart​

How this can help you​

Chapter quiz​

Programming exercises​

Exercise 1 (*) - sequential memory locations​

Exercise 2 (***) - perfect number verifier​

Exercise 3 (****) - perfect number searcher​

Exercise 4 (**) - prime number tester​

Exercise 5 (***) - greatest common divisor​

Exercise 6 (****) - Euler phi function​

Programming Examples​

Null-terminated string​

String length​

Program continued​

Loop body​

Complete program​

Jump and Branch Instructions

How a jump works

Altering the PC

Practice

The jump instruction

Most jumps (and branches) are local

Jump practice

Symbolic address

Assembled program

Conditional branches

Ifs and whiles

Branch on not equal

Two-way decision

Absolute value

Shifting the sign bit

Store -A

Complete program

Chapter quiz

Programming exercises

Exercise 1 (*) non-ending loop

Exercise 2 (*) non-ending loop with overflow

Exercise 3 (**) - counting loop

Exercise 4 (**) - Fibonacci series

Set Instructions and more Branch Instructions

Branch on less than zero, branch on greater than zero

Set on less than

Set on less than immediate

Temperature range tester

Start on the program

More blanks

Complete program

Delay slot bug

Counting loop

Assembly language loop

Complete loop

Summing program

Chapter quiz

Programming exercises

Exercise 1 (*) - sum of evens; sum of odds

Exercise 2 (*) - significant bits

Exercise 3 (**) - allowed ranges

Exercise 4 (****) - median of three

Structured Programming

Computing power

Equal power processors

Minimum instruction set

Convenience and throughput

CISC

RISC

Bug-free software

Bug-free blocks

Structure rule one: code block

Implementing rule one

Sequence

Structure rule two: sequence

Structure rule three: alternation

Implementing alternation

Structure rule four: iteration

Looks like a code block

Structure rule five: nesting structures

Computational power of structured programming

Top-down design

Three steps

Final design

Universal flow chart

How this can help you

Chapter quiz

Programming exercises

Exercise 1 (*) - sequential memory locations

Exercise 2 (***) - perfect number verifier

Exercise 3 (****) - perfect number searcher

Exercise 4 (**) - prime number tester

Exercise 5 (***) - greatest common divisor

Exercise 6 (****) - Euler phi function

Programming Examples

Null-terminated string

String length

Program continued

Loop body

Complete program