## Simple Data Structures

### Dr. Tim McGuire CS 272 Sam Houston State University

• Operand data takes one of three forms:
  • Immediate data -- stored directly in machine code
    • example: mov ax,5 ; 5 is an immediate value
    • immediate operands are always source operands
  • Register data -- held in processor registers
  • Memory data -- held in memory
    • the processor calculates the 16-bit effective address (offset)

| Addressing mode | Example |
| --- | --- |
| Register-indirect | mov ax,[bx] |
| Base | mov ax,[record + bp] |
| Indexed | mov ax,[array + si] |
| Base-indexed | mov ax,[recordArray + bx + si] |
| String | lodsw |
| I/O port | in ax,dx |

• Direct address references are usually relative to ds.
• To change this, use a segment override:
• mov ch, [es:OverByte]
• Other segment bases are possible:
• mov dh, [cs:CodeByte]
• mov dh, [ss:StackByte]
• mov dh, [ds:DataByte]
• An override occupies a byte of machine code which is inserted just before the affected instruction
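
To see what an override changes, here is a small Python model of real-mode 8086 address formation. The formula (segment × 16 + offset, truncated to 20 bits) is standard 8086 behavior but is not stated on these slides. The segment values, the offset of OverByte, and the name `physical_address` are invented for the illustration.

```python
# Minimal model of real-mode 8086 address formation (an illustration only).
# All register contents and offsets below are made-up example values.

def physical_address(segment, offset):
    """Return the 20-bit physical address for segment:offset."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical segment register contents.
segs = {"cs": 0x1000, "ds": 0x2000, "es": 0x3000, "ss": 0x4000}

over_byte = 0x0010  # assumed offset of the variable OverByte

# mov ch,[OverByte]     -> no override, so ds supplies the segment
default_addr = physical_address(segs["ds"], over_byte)

# mov ch,[es:OverByte]  -> the override makes es supply the segment
override_addr = physical_address(segs["es"], over_byte)

print(f"{default_addr:05X}h")   # 20010h
print(f"{override_addr:05X}h")  # 30010h
```

The offset is the same in both cases; only the segment base differs, so the two instructions read different bytes of memory.
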

### Register-Indirect Mode

• The offset address of the operand is contained in a register
• The register acts as a pointer to the memory location
• The operand format is
• [register]
• The register is bx, si, di, or bp
• For bx, si, or di, the segment register is ds
• For bp, ss has the segment number

### Example

• Suppose si = 0100h and the word at ds:0100h is 1234h
• To execute mov ax,[si] the CPU
  • examines si and obtains the offset address 0100h
  • uses the address ds:0100h to obtain the value 1234h
  • moves 1234h into ax
• This is not the same as mov ax,si, which simply moves the value of si (0100h) into ax
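
The difference between mov ax,[si] and mov ax,si can be mimicked with a toy Python model. This is only a sketch: memory is a dictionary keyed by (segment register name, offset), and the names `DEFAULT_SEGMENT` and `load_indirect` are invented for the illustration.

```python
# Toy model of register-indirect addressing. Memory is a dict keyed by
# (segment register name, offset) and holds 16-bit words.

# Default segment register for each legal pointer register.
DEFAULT_SEGMENT = {"bx": "ds", "si": "ds", "di": "ds", "bp": "ss"}

def load_indirect(regs, memory, pointer_reg):
    """Model mov ax,[pointer_reg]: fetch the word the register points to."""
    if pointer_reg not in DEFAULT_SEGMENT:
        raise ValueError(f"{pointer_reg} cannot be used as a pointer register")
    segment = DEFAULT_SEGMENT[pointer_reg]
    offset = regs[pointer_reg]
    return memory[(segment, offset)]

regs = {"ax": 0, "si": 0x0100}
memory = {("ds", 0x0100): 0x1234}

ax_indirect = load_indirect(regs, memory, "si")  # mov ax,[si]
ax_direct = regs["si"]                           # mov ax,si

print(f"{ax_indirect:04X}h")  # 1234h
print(f"{ax_direct:04X}h")    # 0100h
```

Asking the model for `load_indirect(regs, memory, "ax")` raises an error, mirroring the rule that ax cannot serve as a pointer register.
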

### Another example

Given bx = 1000h, si = 2000h, di = 3000h, [1000h] = 1BACh, [2000h] = 20FEh, [3000h] = 031Dh:

| Instruction | Source offset | Result |
| --- | --- | --- |
| mov bx,[bx] | 1000h | 1BACh |
| mov cx,[si] | 2000h | 20FEh |
| mov bx,[ax] | | illegal source register |
| inc [WORD di] | 3000h | 031Eh |


### WORD and BYTE operators

• Both operands of an instruction must be of the same type
• mov ax,1 is a word operation because ax is a 16-bit register
• mov bh,5 is a byte operation
• mov [bx],1 is illegal because the assembler can't tell whether the destination is a byte or a word
• if you want the destination to be a byte, use
mov [BYTE bx],1
• and if you want it to be a word, use
mov [WORD bx],1

### Based and Indexed Addressing Modes

• The operand's offset address is obtained by adding a number called a displacement to the contents of a register
• The displacement may be:
• the offset address of a variable, e.g., A
• a constant, e.g., -2
• the offset address of a variable plus or minus a constant, e.g., A + 4

### Syntax of an operand

• Any of the following expressions are equivalent:
• [register + displacement] (preferred form)
• [displacement + register]
• [register] + displacement
• displacement + [register]
• displacement[register]
• The register must be bx, bp, si, or di.
• If bx, si, or di is used, ds contains the segment number
• If bp is used, ss has the segment number
• The addressing is called based if bx or bp is used; it is called indexed if si or di is used
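
The rules above can be summarized in a short Python sketch. It is a model under assumptions, not assembler output: the function name `effective_address` and the offset chosen for the variable A are invented, and the truncation to 16 bits reflects the 16-bit effective address mentioned earlier.

```python
# Sketch of effective-address calculation for based and indexed modes.
# The offset is (register + displacement) truncated to 16 bits.

def effective_address(reg_name, reg_value, displacement=0):
    """Return (mode name, default segment, 16-bit offset)."""
    if reg_name not in ("bx", "bp", "si", "di"):
        raise ValueError("register must be bx, bp, si, or di")
    mode = "based" if reg_name in ("bx", "bp") else "indexed"
    segment = "ss" if reg_name == "bp" else "ds"
    offset = (reg_value + displacement) & 0xFFFF
    return mode, segment, offset

A = 0x0200  # assumed offset of a variable named A

indexed_example = effective_address("si", 4, A)         # [si + A]
stack_example = effective_address("bp", 0x0100, -2)     # [bp - 2]
based_example = effective_address("bx", 6, A + 4)       # [bx + A + 4]

for mode, segment, offset in (indexed_example, stack_example, based_example):
    print(f"{mode:8s} {segment}:{offset:04X}h")
```

A negative displacement such as -2 simply subtracts from the register value, and any carry out of 16 bits is discarded.
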

### Application of Index Mode

• Convert the lowercase letters in the string to uppercase using indexed addressing mode
msg     db      "this is a message"
        mov     cx,17           ; # chars in string
        xor     si,si           ; si indexes a char
top:    cmp     [si+msg],' '    ; blank?
        je      next            ; yes, skip over
        and     [si+msg],0DFh   ; no, convert to upper (hex constants must start with a digit)
next:   inc     si              ; index next byte
        loop    top             ; loop until done
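
For readers who want to trace the loop by hand, here is a line-by-line Python rendering of it. This is a sketch only: a bytearray stands in for the bytes at msg, and the variables si and cx play the roles of the registers.

```python
# Python rendering of the uppercase loop. A bytearray stands in for the
# bytes stored at msg; si is the index and cx is the loop counter.

msg = bytearray(b"this is a message")

cx = len(msg)                    # mov cx,17
si = 0                           # xor si,si
while cx > 0:
    if msg[si] != ord(" "):      # cmp [si+msg],' '  /  je next
        msg[si] &= 0xDF          # and [si+msg],0DFh  (clears bit 5)
    si += 1                      # inc si
    cx -= 1                      # loop top

print(msg.decode("ascii"))       # THIS IS A MESSAGE
```

The blank has to be skipped because 20h AND 0DFh is 00h, which would turn every space into a null byte. The mask works for lowercase letters because each one differs from its uppercase form only in bit 5.
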

• In base-indexed addressing mode the offset address is the sum of:
• the contents of a base register (bx or bp)
• the contents of an index register (si or di)
• optionally, a variable's offset address
• optionally, a positive or negative constant
• There are many valid ways to write the operand; some of them are:
• [base + index + variable + constant] (preferred)
• variable[base + index + constant]
• constant[base + index + variable]

### Use of Based, Indexed, and Base-Indexed Modes

• Based and indexed addressing modes are often used for array and string processing
• Base-indexed addressing mode can be used for two-dimensional arrays
• We will discuss these in greater detail later
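
As a preview, the sketch below shows one common way a two-dimensional array maps onto base-indexed addressing. The assumptions are mine, not the slides': row-major layout, word-sized elements, bx holding the byte offset of the row and si the byte offset within the row. The array dimensions and its offset are made up.

```python
# Sketch: locating element [row][col] of a two-dimensional word array with
# base-indexed addressing. Row-major layout is an assumption.

ELEMENT_SIZE = 2        # bytes per word (dw)
COLS = 4                # hypothetical array of 3 rows x 4 columns
ARRAY_OFFSET = 0x0300   # assumed offset of the array in the data segment

def element_offset(row, col):
    """Return the 16-bit offset used by [array + bx + si]."""
    bx = row * COLS * ELEMENT_SIZE   # base register: start of the row
    si = col * ELEMENT_SIZE          # index register: position in the row
    return (ARRAY_OFFSET + bx + si) & 0xFFFF

first = element_offset(0, 0)
last = element_offset(2, 3)

print(f"{first:04X}h")  # 0300h
print(f"{last:04X}h")   # 0316h
```

Stepping si by the element size walks along a row, while stepping bx by the row size moves down a column.
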

### Expressions and Operators

• Data-Defining Pseudo-Ops

| Pseudo-op | Defines |
| --- | --- |
| db | byte (characters) |
| dw | word (integers) |
| dd | doubleword (long integers) |
| dt | tenbytes (BCD numbers) |
| dp | pointer (48 bits in TASM, the same size as df) |
| df | far pointer (48 bits) |


• The DUP operator may be used to define arrays
arry dw 100 DUP (0) ;array of 100 words, all set to 0
str db 212 DUP (?)  ;array of 212 bytes, uninitialized

### String Variables

• ASCII$ string -- a series of ASCII characters ending in a dollar sign
• e.g. myString db "Hello, world!",'$'
• Use DOS function 09h (int 21h) to display ASCII$ strings
• ASCIIZ string -- ends in a null character
• e.g. myString db "Hello, world!",0
• Format used by C and other HLLs
• Must write own routines to handle ASCIIZ strings
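
The two conventions differ only in the terminating byte, which the following Python sketch makes concrete. The helper names `asciiz_length` and `ascii_dollar_text` are invented. They show the kind of scan a hand-written ASCIIZ routine performs and are not a DOS or library API.

```python
# Sketch of the two string conventions as byte scans over a buffer.

def asciiz_length(buffer, start=0):
    """Count bytes from start up to (not including) the terminating 0."""
    length = 0
    while buffer[start + length] != 0:
        length += 1
    return length

def ascii_dollar_text(buffer, start=0):
    """Return the characters up to (not including) the '$' terminator."""
    end = buffer.index(ord("$"), start)
    return buffer[start:end].decode("ascii")

asciiz = b"Hello, world!\x00"   # myString db "Hello, world!",0
dollar = b"Hello, world!$"      # myString db "Hello, world!",'$'

print(asciiz_length(asciiz))       # 13
print(ascii_dollar_text(dollar))   # Hello, world!
```

One consequence of the dollar-sign terminator is that an ASCII$ string cannot itself contain a dollar sign, since the scan stops at the first one.
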

### Local Labels

• A local label begins with two @ signs
• A local label is only visible within the code bracketed by global labels (or PROC .. ENDP)
• Example:
```
                jmp     There   ; jump to global label
@@10:
inc     ax
cmp     ax,10