Table of contents
1 x86 PC assembly tutorial
2 Basic information
3 Assembly in real mode
4 Protected Mode

x86 PC assembly tutorial

This is a tutorial, not a complete scientific description of how the x86 processor works.

This text is intended for those who want to gain an insight into programming real assembly language. Because x86 processors are so common, most of you should be able to assemble most of the code in this tutorial on your own computer.

This tutorial uses standard Intel syntax, not the AT&T syntax in which most Linux assembly programs are written. All the code in this tutorial is intended for ordinary PC computers!

If you want to assemble the code you find here, you will have to download the free Netwide Assembler (NASM). Download it from this webpage: http://nasm.sourceforge.net/

Read up on hexadecimal numbers for a better understanding of this tutorial.

Basic information

There are mainly two different modes in which an x86 processor can work: real mode and protected mode. Because of the need for backward compatibility, the processor always starts in real mode. Almost all modern operating systems work in protected mode. AMD64 processors add another mode, long mode.

This tutorial will start with a brief introduction to real mode assembly and then go on with a larger protected mode section.

Assembly in real mode

There are several 16-bit processor registers that are commonly used by the average application programmer. Each register is specialized for one task, and operations related to that task are often encoded in fewer bytes when the matching register is used (smaller code often runs faster). Here are the most used registers in real mode:

data registers
AX, the accumulator
BX, the base register
CX, the counter register
DX, the data register

address registers
SI, the source index register
DI, the destination index register
SP, the stack pointer register
BP, the stack base pointer register

All four data registers have 8-bit versions. There are two 8-bit registers "inside" each 16-bit register: ZH for the high 8 bits and ZL for the low 8 bits, where Z is the first letter of the 16-bit register. For example, AH is the 8-bit register that contains the same bits as the high 8 bits of AX, and CL is the 8-bit register that contains the low 8 bits of CX.

The eight 8-bit registers are AH, AL, BH, BL, CH, CL, DH and DL.
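
The overlap between the 8-bit and 16-bit registers can be seen in a tiny program. This is only a sketch, assuming it is assembled as a DOS .COM file in the same way as the example code later in this tutorial; the values are arbitrary.

```nasm
[ORG 0x100]
[BITS 16]
    mov ax, 0x1234      ; AX = 0x1234, so AH = 0x12 and AL = 0x34
    mov al, 0xFF        ; only the low byte changes: AX = 0x12FF
    mov ah, 0x00        ; only the high byte changes: AX = 0x00FF
    mov cl, al          ; CL = 0xFF; CH is left untouched

    mov ax, 0x4C00      ; DOS function 0x4C: exit with return code 0
    int 0x21
```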

Collectively the data and address registers are called the general registers.

segment registers (not part of the 8 general registers)
CS, the code segment register
DS, the data segment register
ES, an extra segment register
FS, another extra segment register (not implemented before the 80386)
GS, yet another extra segment register (not implemented before the 80386)
SS, the stack segment register

other registers (not part of the 8 general registers)
IP, the instruction pointer register
FLAGS, the flag register

The IP register points to where the processor currently executes the code (i.e. where in the program the processor "is".) The IP register cannot be accessed by the programmer directly.

The FLAGS register contains the current state of the processor. Each bit in this register is called a flag. Each flag can be either 1 or 0, set or not set. Some of the flags that the FLAGS register contains are carry, overflow, zero and single step. The flags are often used to control the execution flow of the program. "IF A = B THEN A = C" and the like require the use of the FLAGS register.
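
As an illustration, "IF A = B THEN A = C" could be written roughly like this. It is a minimal sketch, assuming A, B and C live in AX, BX and CX and that the program is assembled as a DOS .COM file; the label name not_equal is made up for the example.

```nasm
[ORG 0x100]
[BITS 16]
    mov ax, 5           ; A
    mov bx, 5           ; B
    mov cx, 9           ; C

    cmp ax, bx          ; computes AX - BX, throws the result away, sets the flags
    jne not_equal       ; zero flag clear: A != B, skip the assignment
    mov ax, cx          ; zero flag set: A = C
not_equal:

    mov ax, 0x4C00      ; DOS function 0x4C: exit with return code 0
    int 0x21
```

The cmp instruction never changes AX or BX. Its only effect is on the FLAGS register, and the conditional jump that follows reads the zero flag from there.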

The mnemonics used in real mode x86-assembly

They are:

aaa, aad, aam, aas, adc, add, and, call, cbw, clc, cld, cli, cmc, cmp, cmpsb, cmpsw, cwd, daa, das, dec, div, esc, hlt, idiv, imul, in, inc, int, into, iret, ja, jae, jb, jbe, jc, jcxz, je, jg, jge, jl, jle, jmp, jna, jnae, jnb, jnbe, jnc, jne, jng, jnge, jnl, jnle, jno, jnp, jns, jnz, jo, jp, jpe, jpo, js, jz, lahf, lds, lea, les, lock, lodsb, lodsw, loop, loope, loopne, loopnz, loopz, mov, movsb, movsw, mul, neg, nop, not, or, out, pop, popf, push, pushf, rcl, rcr, rep, repe, repne, repnz, repz, ret, rol, ror, sahf, sal, sar, sbb, scasb, scasw, shl, shr, stc, std, sti, stosb, stosw, sub, test, wait, xchg, xlat, xor

(copied from IA-32)
You will never use most of these instructions.

The real mode addressing model

This is quite simple, but still much hated by ordinary programmers. It uses two registers to point to one address: one segment register and one offset register. Any general register (see above) can hold an offset, although in 16-bit code only BX, BP, SI and DI can be used directly inside a memory operand such as [ds:bx].

The segment register is shifted 4 bits left and then added to the offset register. The formula looks like this: segment*0x10+offset.

For example, if DS contains the hexadecimal number 0xDEAD and DX contains the number 0xCAFE, they would together point to the memory address 0xDEAD * 0x10 + 0xCAFE = 0xEB5CE. One quick way to do this without a hexadecimal calculator is to append a zero to the hexadecimal number in the segment register and then add the content of the offset register to that number. The above would be 0xDEAD0 + 0xCAFE, which is quite easy to calculate in your head :-)
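
The same calculation can be done by the processor itself. Because the result needs 20 bits, this sketch keeps it in the register pair DX:AX; it assumes a DOS .COM program and uses the numbers from the example above.

```nasm
[ORG 0x100]
[BITS 16]
    mov ax, 0xDEAD      ; the segment value
    mov dx, 0x10
    mul dx              ; DX:AX = AX * 0x10 = 0x000D:0xEAD0
    add ax, 0xCAFE      ; add the offset: AX = 0xB5CE, carry flag set
    adc dx, 0           ; propagate the carry: DX = 0x000E
                        ; DX:AX = 0x000E:0xB5CE, i.e. address 0xEB5CE

    mov ax, 0x4C00      ; DOS function 0x4C: exit with return code 0
    int 0x21
```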

Usually, the two registers (the segment- and the offset-register) are written like this to denote that they are together pointing to some memory address: segment-register:offset-register. For example: DS:DX, CS:IP, SS:SP, DS:SI and ES:DI.

There are some special combinations of segment registers and general registers that point to interesting things:

CS:IP points to the address where the processor will fetch its next byte of code.
SS:SP points to the location of the last item pushed onto the stack.
DS:SI is often used to point to data that is about to be copied to ES:DI
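
A short sketch of the DS:SI to ES:DI convention: the string instruction movsb copies one byte from DS:SI to ES:DI and advances both offsets. The example assumes a DOS .COM program, where DS and ES already point to the program's own segment at startup; the labels src, dst and srclen are made up for the example.

```nasm
[ORG 0x100]
[BITS 16]
    cld                 ; clear the direction flag: SI and DI count upward
    mov si, src         ; DS:SI -> source bytes
    mov di, dst         ; ES:DI -> destination buffer
    mov cx, srclen      ; CX = number of bytes to copy
    rep movsb           ; copy CX bytes from DS:SI to ES:DI

    mov ax, 0x4C00      ; DOS function 0x4C: exit with return code 0
    int 0x21

src:    db "hello"
srclen  equ $ - src
dst:    times srclen db 0
```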

The PC memory layout

  0-3FF        IVT (Interrupt Vector Table)
  400-5FF      BDA (BIOS Data Area)
  600-9FFFF    ordinary application RAM <-- the place where our programs will be, probably :)
  A0000-BFFFF  Video memory
  C0000-EFFFF  Optional ROMs (the VGA ROM is usually located at C0000)
  F0000-FFFFF  BIOS ROM

That means that we have almost 640 kB of application RAM.

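Memory above 0xFFFFF is called extended memory. Its first 65,520 bytes (0x100000-0x10FFEF) are called the "high memory area"; this is the only part of extended memory that real mode code can address, using segment 0xFFFF with the A20 line enabled.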

Interrupts in real mode

An interrupt is what it sounds like: the processor stops what it is doing, runs a handler routine, and then continues where it left off. There are two kinds of interrupts, software and hardware interrupts. Typical software interrupts in real mode are interrupt 0x21 (the ISR that handles this interrupt gets the function number and all the parameters from the program and then executes the selected DOS function) and int3 (the breakpoint interrupt, often used to enter some sort of software debugger). A hardware interrupt occurs when some external circuit decides that it needs attention from the CPU. For example, every tick of the system timer triggers interrupt 0x08, and the BIOS handler for that interrupt in turn calls interrupt 0x1C so that programs can hook the timer tick.
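
Calling a software interrupt from a program looks like this. The sketch assumes a DOS .COM program and uses two well-known functions of interrupt 0x21: function 0x09 (print a "$"-terminated string pointed to by DS:DX) and function 0x4C (exit). The label msg is made up for the example.

```nasm
[ORG 0x100]
[BITS 16]
    mov ah, 0x09        ; function number goes in AH: 0x09 = print string
    mov dx, msg         ; parameter: DS:DX points to the string
    int 0x21            ; the processor jumps to the ISR for interrupt 0x21

    mov ax, 0x4C00      ; function 0x4C: exit with return code 0
    int 0x21

msg: db "Hello from INT 0x21!", 13, 10, "$"
```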

At the very beginning of the memory lies the Interrupt Vector Table (IVT). The IVT contains pointers to all the Interrupt Service Routines (ISRs).

The pointers to the different ISRs wired to the interrupts are saved in this format:

[offset_0][segment_0][offset_1][segment_1][... ...][offset_255][segment_255]
(each integer (that is: the offset or segment-pointers) is 16 bits wide)

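There are 256 different interrupts, each with its own 4-byte pointer. The pointer for interrupt n therefore starts at address n*4; for example, the pointer for interrupt 0x1C starts at 0x1C * 4 = 0x70.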
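
Reading one entry of the table could look like the following sketch. It assumes a DOS .COM program and picks interrupt 0x1C only as an example; afterwards DX:AX holds the segment:offset of the ISR.

```nasm
[ORG 0x100]
[BITS 16]
    xor ax, ax
    mov es, ax          ; ES = 0: the IVT starts at 0000:0000
    mov bx, 0x1C * 4    ; entry n starts at n*4, so INT 0x1C is at 0x70
    cli                 ; keep the entry from changing between the two reads
    mov ax, [es:bx]     ; first word: offset of the ISR
    mov dx, [es:bx+2]   ; second word: segment of the ISR
    sti
                        ; DX:AX = segment:offset of the INT 0x1C handler

    mov ax, 0x4C00      ; DOS function 0x4C: exit with return code 0
    int 0x21
```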

Example code

[ORG 0x100]
[BITS 16]
   jmp installISR
;***********************
ISRstart:
   cli               ; disable interrupts
   push ax           ; save all the registers so that no one notices
   push ds           ; our presence.
   push di
   mov ax, 0xb800    ; point to text-video-memory
   mov ds, ax
   xor di, di
loop1:
   mov al,[ds:di]    ; check one letter
   cmp al, "t"       ; was it "t"?
   jne search        ; if not, search for it...
   mov al,[ds:di+2]  ; check next letter
   cmp al, "h"       ; was it "h"?
   jne search        ; if not, search for it...
   mov al,[ds:di+4]  ; check the last letter
   cmp al, "e"       ; was it "e"?
   jne search        ; if not, search for it...
                     ; if we are here, we have found a "the"

                     ; replace "the" with "=O)"
   mov al, "="
   mov [ds:di], al   ; replace "t" with "="
   mov al, "O"
   mov [ds:di+2], al ; replace "h" with "O"
   mov al, ")"
   mov [ds:di+4], al ; and finally replace "e" with ")"
   jmp loop1         ; is there any more "the"?
end:
   pop di            ; restore all registers
   pop ds            ; and no one will notice our presence! ;)
   pop ax
   sti               ; and reenable the interrupts
   iret              ; and return to whatever misc. activity the
                     ; computer was doing..
search:
   cmp di, 0x1f40    ; did we search all the letters?
   jae end           ; yes: stop searching!
   add di, 2         ; no: select the next letter and return
   jmp loop1         ; to the find-more-"the"-strings-procedure
ISRend:
ISRlen EQU ISRend-ISRstart
;***********************
installISR:
   xor ax, ax
   mov es, ax        ; es = 0
   cli               ; no timer tick while the pointer is half written
   mov di, 0x70      ; 0:0070 (INT 0x1C offset)
   mov ax, ISRstart  ; get our start-IP
   mov [es:di], ax   ; write it to the IVT interrupt pointer for 0x1C
   mov di, 0x72      ; 0:0072 (INT 0x1C segment)
   mov ax, cs        ; get our code segment
   mov [es:di], ax   ; write it to the IVT interrupt segment for 0x1C
   sti
   mov ax,0x3100     ; select DOS-function "TSR_Install", errorcode 0.
                     ; DX = number of 16-byte paragraphs to keep, counted
                     ; from the start of the PSP up to ISRend
   mov dx, (ISRend - $$ + 0x100 + 15) / 16
   int 0x21          ; make DOS reserve the piece of code and exit back to the shell.
; here, the computer will continue to work as if nothing happened..
; but trying to write "the" anywhere on the screen =O) ...it doesn't work.
;***********************

Assemble with "nasm filename -o filename.com". (see NASM)

Protected Mode

Protected mode is the mode in which most modern operating systems run their code. When the computer boots, it first enters real mode; the operating system is responsible for switching into protected mode.

The 80286 processor had a 16-bit protected mode, which is of historic interest only. We will concentrate on the 32-bit protected mode, found on the 80386 processor and all its successors.

There are new registers in protected mode:

All the general application registers (AX, BX, CX, DX, SI, DI, SP and BP) are extended to a total of 32 bits. To denote that you intend the 32-bit register instead of the low 16-bit part, add an E before the name of the register:

EAX, EBX, ECX, EDX, ESI, EDI, ESP and EBP

The segment registers remain 16 bits wide and do not change names.

CS, DS, ES, FS, GS and SS

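The segment registers do not work like they did in real mode. Instead, each one holds a selector, which picks a descriptor out of a table pointed to by the GDTR or LDTR register. The descriptor defines where the segment starts, how large it is and how it may be accessed.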

The FLAGS- and IP-registers are also extended to 32 bits.

EIP, EFLAGS

There are also new registers. They are useful only to system programmers.

CR0, CR2, CR3, CR4, GDTR, LDTR, IDTR and TR

They are used by the operating system and cannot be accessed by ordinary user programs.

There are also some other registers, like the test registers, debug registers, MMX, MSR, XMM and others.

How to switch to protected mode:

* load GDTR with the pointer to the GDT.
* load IDTR with the pointer to the IDT OR disable interrupts.
* set the PE bit in the CR0 register.
* make a far jump to the 32-bit code.
* reload the data segment registers (DS, ES, FS, GS and SS) with valid data selectors.
* initialize TR with the selector of a valid TSS.
* optional: load LDTR with the selector of the LDT.
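
The steps above can be sketched as a small boot sector. This is only an illustration under several assumptions, not a complete implementation: it assumes the BIOS has loaded the code at address 0x7C00, it keeps interrupts disabled instead of building an IDT, and it leaves out the TSS and the LDT. The labels (start, pm_entry, hang, gdt, gdt_end, gdt_ptr) and the flat GDT layout with the selectors 0x08 and 0x10 are choices made for this example.

```nasm
[ORG 0x7C00]
[BITS 16]
start:
    cli                     ; no IDT yet, so keep interrupts disabled
    xor ax, ax
    mov ds, ax              ; DS = 0 so that [gdt_ptr] is addressed correctly
    lgdt [gdt_ptr]          ; load GDTR with the limit and base of the GDT
    mov eax, cr0
    or eax, 1               ; set the PE bit (bit 0 of CR0)
    mov cr0, eax
    jmp 0x08:pm_entry       ; far jump: loads CS with the code selector

[BITS 32]
pm_entry:
    mov ax, 0x10            ; data selector (third GDT entry)
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    mov esp, 0x7C00         ; stack grows downward from below this code
    mov word [0xB8000], 0x0750  ; grey "P" in the top-left corner of the screen
hang:
    jmp hang                ; nothing more to do in this sketch

gdt:
    dq 0                    ; entry 0: the mandatory null descriptor
    dq 0x00CF9A000000FFFF   ; selector 0x08: 32-bit code, base 0, limit 4 GB
    dq 0x00CF92000000FFFF   ; selector 0x10: 32-bit data, base 0, limit 4 GB
gdt_end:
gdt_ptr:
    dw gdt_end - gdt - 1    ; limit: size of the GDT minus 1
    dd gdt                  ; base address of the GDT

    times 510-($-$$) db 0   ; pad to 510 bytes
    dw 0xAA55               ; boot sector signature
```

Both descriptors describe a flat segment that starts at address 0, so an offset in protected mode is the same number as the memory address. The sketch only touches memory below 1 MB, which is why it does not deal with the A20 line.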

SNAFU (System Normal, All F*cked Up) is an operating system written in x86 assembly. It shows quite clearly how an operating system works at a very basic level. SNAFU has been released into the public domain. http://home.swipnet.se/smaffy/snafu/

Menuet is also an operating system written in pure x86 assembly. It is larger than SNAFU and has almost left the development phase. Menuet OS is distributed under the GNU General Public License. http://www.menuetos.org