Unit - 1
80386DX – Basic programming model and application instruction set
1. The Global Descriptor Table (GDT) is a data structure used by Intel x86-family processors, starting with the 80286, to define the characteristics of the various memory areas used during program execution, including the base address, the size, and access privileges such as executability and writability.
2. The GDT can hold things other than segment descriptors as well. Every 8-byte entry in the GDT is a descriptor, but these descriptors can be references not only to memory segments but also to Task State Segment (TSS), Local Descriptor Table (LDT), or Call Gate structures in memory. The last ones, Call Gates, are particularly important for transferring control between x86 privilege levels although this mechanism is not used on most modern operating systems.
3. There is also a Local Descriptor Table (LDT). Multiple LDTs can be defined in the GDT, but only one is current at any one time: usually associated with the current Task. While the LDT contains memory segments which are private to a specific program, the GDT contains global segments. The x86 processors have facilities for automatically switching the current LDT on specific machine events, but no facilities for automatically switching the GDT.
4. Every memory access which a program can perform always goes through a segment. On the 80386 processor and later, because of 32-bit segment offsets and limits, it is possible to make segments cover the entire addressable memory, which makes segment-relative addressing transparent to the user.
5. In order to reference a segment, a program must use its index inside the GDT or the LDT. This index, together with one bit that chooses between the GDT and the LDT and a two-bit requested privilege level, forms a 16-bit segment selector (or selector). The selector must generally be loaded into a segment register to be used. Apart from the machine instructions which allow one to set or get the position of the GDT and of the Interrupt Descriptor Table (IDT) in memory, every machine instruction referencing memory has an implicit segment register, occasionally two. Most of the time this segment register can be overridden by adding a segment prefix before the instruction.
6. Loading a selector into a segment register automatically reads the GDT or the LDT and stores the properties of the segment inside the processor itself. Subsequent modifications to the GDT or LDT will not be effective unless the segment register is reloaded.
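The selector layout described above can be sketched in a few lines of Python. This is only an illustration: the names Selector, decode_selector and descriptor_offset are hypothetical, and the sketch assumes the standard layout of index in bits 3-15, table indicator in bit 2 and requested privilege level in bits 0-1.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int   # entry number inside the descriptor table
    table: str   # "GDT" or "LDT"
    rpl: int     # requested privilege level, 0..3


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a selector is 16 bits wide")
    return Selector(
        index=value >> 3,                          # bits 3..15
        table="LDT" if value & 0b100 else "GDT",   # bit 2, the table indicator
        rpl=value & 0b11,                          # bits 0..1
    )


def descriptor_offset(value: int) -> int:
    """Byte offset of the selected 8-byte descriptor from the table start."""
    return decode_selector(value).index * 8


print(decode_selector(0x08))   # second GDT entry, ring 0
print(decode_selector(0x0F))   # second LDT entry, ring 3
```

Because every descriptor is 8 bytes long, a selector with its low three bits cleared is numerically equal to the byte offset of its descriptor, which is why selectors such as 0x08 and 0x10 appear so often.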
1.1.1 Example
Shown below is an assembly implementation of a GDT which opens up all 4 GB of available memory: base = 0x00000000, effective segment limit = 0xffffffff (a 20-bit limit field of 0xfffff counted in 4 KB units).

; Offset 0x0
.null_descriptor:
    dq 0

; Offset 0x8
.code:                  ; cs should point to this descriptor
    dw 0xffff           ; segment limit, bits 0-15
    dw 0                ; base, bits 0-15
    db 0                ; base, bits 16-23
    db 0x9a             ; access byte
    db 11001111b        ; high 4 bits: flags; low 4 bits: limit bits 16-19 (the limit is 20 bits wide)
    db 0                ; base, bits 24-31

; Offset 0x10
.data:                  ; ds, ss, es, fs, and gs should point to this descriptor
    dw 0xffff           ; segment limit, bits 0-15
    dw 0                ; base, bits 0-15
    db 0                ; base, bits 16-23
    db 0x92             ; access byte
    db 11001111b        ; high 4 bits: flags; low 4 bits: limit bits 16-19 (the limit is 20 bits wide)
    db 0                ; base, bits 24-31
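As a cross-check of the byte layout in the example, the following Python sketch packs the same fields into an 8-byte descriptor. The function name encode_descriptor and its parameters are assumptions made for illustration; only the field order and the values 0x9a, 0x92 and 1100b come from the listing above.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack base, 20-bit limit, access byte and 4 flag bits into 8 bytes."""
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                            # limit bits 0-15
        base & 0xFFFF,                             # base bits 0-15
        (base >> 16) & 0xFF,                       # base bits 16-23
        access & 0xFF,                             # access byte
        ((flags & 0xF) << 4) | (limit >> 16),      # flags, then limit bits 16-19
        (base >> 24) & 0xFF,                       # base bits 24-31
    )


code_segment = encode_descriptor(0, 0xFFFFF, 0x9A, 0b1100)
data_segment = encode_descriptor(0, 0xFFFFF, 0x92, 0b1100)
print(code_segment.hex(), data_segment.hex())
```

The two results differ only in the access byte, which matches the assembly listing: the code and data descriptors describe the same 4 GB range with different access rights.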
Key takeaways
- The GDT can hold things other than segment descriptors as well. Every 8-byte entry in the GDT is a descriptor, but these descriptors can be references not only to memory segments but also to Task State Segment (TSS), Local Descriptor Table (LDT), or Call Gate structures in memory.
- The selector must generally be loaded into a segment register to be used.
- The x86 processors have facilities for automatically switching the current LDT on specific machine events, but no facilities for automatically switching the GDT.
1. A Local Descriptor Table (LDT) is a memory table used in the x86 architecture in protected mode and containing memory segment descriptors, just like the GDT: address start in linear memory, size, executability, writability, access privilege, actual presence in memory, etc.
2. LDTs are the siblings of the Global Descriptor Table (GDT), and each defines up to 8192 memory segments accessible to programs. Note that, unlike in the GDT, the zeroth entry is a valid entry and can be used like any other LDT entry.
3. Also note that unlike the GDT, the LDT cannot be used to store certain system entries: TSSs or LDTs.
Key takeaways
- LDTs are the siblings of the Global Descriptor Table (GDT), and each defines up to 8192 memory segments accessible to programs.
1.3.1 Introduction:
1. The Interrupt Descriptor Table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table. The IDT is used by the processor to determine the correct response to interrupts and exceptions.
2. The details in the description below apply specifically to the x86 architecture and the AMD64 architecture.
3. The Interrupt Descriptor Table, or IDT, tells the processor which Interrupt Service Routine (ISR) to call to handle either an exception or an 'int' opcode (in assembly). IDT entries are also used to dispatch Interrupt Requests (IRQs), which a device raises whenever it has completed a request and needs to be serviced.
4. Each IDT entry is similar to a GDT entry. Both hold a base address, both hold an access flag, and both are 64 bits long. The major difference between these two types of descriptors is in the meanings of these fields. In an IDT, the base address specified in the descriptor is actually the address of the Interrupt Service Routine that the processor should call when this interrupt is 'raised' (called).
5. An IDT entry doesn't have a limit; instead it has a segment selector that you need to specify. The selector must refer to the same segment that the given ISR is located in. This allows the processor to give control to the kernel through an interrupt that has occurred when the processor is in a different ring (like when an application is running).
6. The access flags of an IDT entry are also similar to those of a GDT entry. There is a field to say whether the descriptor is actually present. There is a field for the Descriptor Privilege Level (DPL) that gives the highest-numbered (least privileged) ring allowed to invoke the given interrupt with a software 'int' instruction.
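The fields just described (handler offset, segment selector, present bit, DPL and gate type) can be packed into the 8-byte entry as in the sketch below. The function name encode_gate, the sample handler address 0x00123456 and the selector 0x10 are made up for illustration; the sketch assumes the 32-bit gate layout in which type 0xE marks an interrupt gate and 0xF a trap gate.

```python
import struct


def encode_gate(handler: int, selector: int, dpl: int = 0,
                present: bool = True, gate_type: int = 0xE) -> bytes:
    """Pack a 32-bit interrupt (0xE) or trap (0xF) gate into 8 bytes."""
    if not 0 <= handler <= 0xFFFFFFFF:
        raise ValueError("handler offset must fit in 32 bits")
    if not 0 <= dpl <= 3:
        raise ValueError("DPL is a ring number from 0 to 3")
    type_attr = (int(present) << 7) | (dpl << 5) | (gate_type & 0x1F)
    return struct.pack(
        "<HHBBH",
        handler & 0xFFFF,     # ISR offset, bits 0-15
        selector & 0xFFFF,    # code segment selector of the ISR
        0,                    # reserved byte
        type_attr,            # present bit, DPL, gate type
        handler >> 16,        # ISR offset, bits 16-31
    )


gate = encode_gate(0x00123456, 0x10)
low, high = struct.unpack("<II", gate)
print(hex(low), hex(high))
```

Read back as two doublewords, the entry has the selector in the upper half of the first word and the attribute value 0x8e00 in the lower half of the second, with the handler offset split between the two remaining halves.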
1.3.2 Preliminary Initialization of the IDT
1. The IDT is initialized and used by the BIOS routines while the computer still operates in Real Mode. Once Linux takes over, however, the IDT is moved to another area of RAM and initialized a second time, since Linux does not use any BIOS routines.
2. The idt variable points to the IDT, while the IDT itself is stored in the idt_table table, which includes 256 entries. The 6-byte idt_descr variable stores both the size of the IDT and its address and is used only when the kernel initializes the idtr register with the lidt assembly language instruction.
3. During kernel initialization, the setup_idt( ) assembly language function starts by filling all 256 entries of idt_table with the same interrupt gate, which refers to the ignore_int( ) interrupt handler.
4. Example program:
setup_idt:
        lea ignore_int, %edx
        movl $(__KERNEL_CS << 16), %eax
        movw %dx, %ax           /* selector = 0x0010 = cs */
        movw $0x8e00, %dx       /* interrupt gate, dpl=0, present */
        lea idt_table, %edi
        mov $256, %ecx
rp_sidt:
        movl %eax, (%edi)
        movl %edx, 4(%edi)
        addl $8, %edi
        dec %ecx
        jne rp_sidt
        ret
The ignore_int( ) interrupt handler, which is in assembly language, may be viewed as a null handler that executes the following actions:
1. Saves the content of some registers in the stack
2. Invokes the printk( ) function to print an “Unknown interrupt” system message
3. Restores the register contents from the stack
4. Executes an iret instruction to restart the interrupted program
The ignore_int( ) handler should never be executed. The occurrence of “Unknown interrupt” messages on the console or in the log files denotes either a hardware problem (an I/O device is issuing unforeseen interrupts) or a kernel problem (an interrupt or exception is not being handled properly). Following this preliminary initialization, the kernel makes a second pass in the IDT to replace some of the null handlers with meaningful trap and interrupt handlers. Once this is done, the IDT includes a specialized interrupt, trap, or system gate for each different exception issued by the control unit and for each IRQ recognized by the interrupt controller.
Key takeaways
- IDT entries are also used to dispatch Interrupt Requests (IRQs), which a device raises whenever it has completed a request and needs to be serviced.
- Once Linux takes over, however, the IDT is moved to another area of RAM and initialized a second time, since Linux does not use any BIOS routines.
The following data types are directly supported and thus implemented by one or more 80386 machine instructions; these data types are briefly described here.
1. Bit (boolean value), bit field (group of up to 32 bits) and bit string (up to 4 Gbit in length).
2. 8-bit integer (byte), either signed (range −128..127) or unsigned (range 0..255).
3. 16-bit integer, either signed (range −32,768..32,767) or unsigned (range 0..65,535).
4. 32-bit integer, either signed (range −2^31..2^31−1) or unsigned (range 0..2^32−1).
5. 64-bit integer, either signed (range −2^63..2^63−1) or unsigned (range 0..2^64−1).
6. Offset, a 16- or 32-bit displacement referring to a memory location (using any addressing mode).
7. Pointer, a 16-bit selector together with a 16- or 32-bit offset.
8. Character (8-bit character code).
9. String, a sequence of 8-, 16- or 32-bit words (up to 4 Gbyte in length).[15]
10. BCD, decimal digits (0..9) represented by unpacked bytes.
11. Packed BCD, two BCD digits in one byte (range 0..99).
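Two of the encodings in this list are easy to get wrong by hand: the signed interpretation of a bit pattern and packed BCD. The Python sketch below illustrates both; the helper names to_signed, pack_bcd and unpack_bcd are hypothetical and are not part of any 80386 tool.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def pack_bcd(number: int) -> int:
    """Store two decimal digits in one byte, tens digit in the high nibble."""
    if not 0 <= number <= 99:
        raise ValueError("packed BCD holds 0..99 in one byte")
    return ((number // 10) << 4) | (number % 10)


def unpack_bcd(byte: int) -> int:
    """Recover the decimal value from one packed BCD byte."""
    high, low = (byte >> 4) & 0xF, byte & 0xF
    if high > 9 or low > 9:
        raise ValueError("nibble is not a decimal digit")
    return high * 10 + low


print(to_signed(0xFF, 8), to_signed(0x8000, 16))
print(hex(pack_bcd(42)), unpack_bcd(0x99))
```

The same byte 0xFF therefore reads as 255 when unsigned and as −1 when signed, while a packed BCD byte such as 0x42 is read digit by digit as 42.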
Key takeaways
- Offset, a 16- or 32-bit displacement referring to a memory location (using any addressing mode).
- BCD, decimal digits (0..9) represented by unpacked bytes.
1. The 80386 contains a total of sixteen registers that are of interest to the applications programmer. These registers may be grouped into three basic categories:
2. General registers. These eight 32-bit general-purpose registers are used primarily to contain operands for arithmetic and logical operations.
3. Segment registers. These special-purpose registers permit systems software designers to choose either a flat or segmented model of memory organization. These six registers determine, at any given time, which segments of memory are currently addressable.
4. Status and instruction registers. These special-purpose registers are used to record and alter certain aspects of the 80386 processor state.
1.5.1 General Registers
1. The general registers of the 80386 are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI. These registers are used interchangeably to contain the operands of logical and arithmetic operations. They may also be used interchangeably for operands of address computations (except that ESP cannot be used as an index operand).
2. All of the general-purpose registers are available for addressing calculations and for the results of most arithmetic and logical calculations; however, a few functions are dedicated to certain registers. By implicitly choosing registers for these functions, the 80386 architecture can encode instructions more compactly. The instructions that use specific registers include: double-precision multiply and divide, I/O, string instructions, translate, loop, variable shift and rotate, and stack operations.
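Each of EAX, EBX, ECX and EDX also exposes its low 16 bits (AX, BX, CX, DX) and the two bytes of that half (AH/AL and so on) as registers in their own right. A small Python model, with the hypothetical class name Reg32 and attribute names e, x, h and l, shows how these views overlap.

```python
class Reg32:
    """Model of a register such as EAX with its AX, AH and AL views."""

    def __init__(self, value: int = 0):
        self.e = value & 0xFFFFFFFF          # full 32-bit register (EAX)

    @property
    def x(self) -> int:                      # low 16 bits (AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, value: int) -> None:
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def l(self) -> int:                      # bits 0-7 (AL)
        return self.e & 0xFF

    @l.setter
    def l(self, value: int) -> None:
        self.e = (self.e & 0xFFFFFF00) | (value & 0xFF)

    @property
    def h(self) -> int:                      # bits 8-15 (AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, value: int) -> None:
        self.e = (self.e & 0xFFFF00FF) | ((value & 0xFF) << 8)


eax = Reg32(0x12345678)
print(hex(eax.x), hex(eax.h), hex(eax.l))
eax.l = 0xFF                                 # writing AL leaves the other bits alone
print(hex(eax.e))
```

Writing through one of the narrower views changes only the bits that view covers; the rest of the 32-bit register keeps its previous contents.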
1.5.2 Segment Registers
1. The segment registers of the 80386 give systems software designers the flexibility to choose among various models of memory organization. Implementation of memory models is the subject of Part II -- Systems Programming. Designers may choose a model in which applications programs do not need to modify segment registers, in which case applications programmers may skip this section.
2. Complete programs generally consist of many different modules, each consisting of instructions and data. However, at any given time during program execution, only a small subset of a program's modules are actually in use.
3. The 80386 architecture takes advantage of this by providing mechanisms to support direct access to the instructions and data of the current module's environment, with access to additional segments on demand.
1.5.3 Stack Implementation
Stack operations are facilitated by three registers:
1. The stack segment (SS) register. Stacks are implemented in memory. A system may have a number of stacks that is limited only by the maximum number of segments. A stack may be up to 4 gigabytes long, the maximum length of a segment. Only one stack is directly addressable at a time: the one located by SS. This is the current stack, often referred to simply as "the" stack. SS is used automatically by the processor for all stack operations.
2. The stack pointer (ESP) register. ESP points to the top of the push-down stack (TOS). It is referenced implicitly by PUSH and POP operations, subroutine calls and returns, and interrupt operations.
3. The stack-frame base pointer (EBP) register. EBP is the best choice of register for accessing data structures, variables and dynamically allocated work space within the stack. EBP is often used to access elements on the stack relative to a fixed point on the stack rather than relative to the current TOS. It typically identifies the base address of the current stack frame established for the current procedure.
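The interplay of ESP and EBP can be sketched with a toy stack in Python. This is a simplified model under stated assumptions: the class name Stack, the 64-byte size and the sample values are invented, the segment base is taken as 0, and only 32-bit pushes and pops are modelled.

```python
import struct


class Stack:
    """Toy model of the current stack: a byte array addressed through ESP."""

    def __init__(self, size: int = 64):
        self.mem = bytearray(size)
        self.esp = size                  # empty stack: ESP is just past the last byte

    def push(self, value: int) -> None:
        if self.esp < 4:
            raise OverflowError("stack segment exhausted")
        self.esp -= 4                    # the stack grows toward lower addresses
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self) -> int:
        if self.esp + 4 > len(self.mem):
            raise IndexError("pop from an empty stack")
        (value,) = struct.unpack_from("<I", self.mem, self.esp)
        self.esp += 4
        return value

    def read(self, address: int) -> int:
        (value,) = struct.unpack_from("<I", self.mem, address)
        return value


stack = Stack()
stack.push(7)          # argument pushed by the caller
stack.push(0x1000)     # return address, as a CALL would push it
stack.push(0)          # the procedure saves the caller's EBP ...
ebp = stack.esp        # ... and makes EBP the fixed base of its frame
stack.push(99)         # a local variable below the frame base

print(stack.read(ebp + 8), stack.read(ebp - 4))
```

ESP moves with every push and pop, whereas the offsets from EBP to the argument (+8) and to the local variable (−4) stay fixed for the lifetime of the frame, which is why EBP is preferred for addressing them.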
1.5.4 Flags Register
1. The flags register is a 32-bit register named EFLAGS. The flags control certain operations and indicate the status of the 80386.
2. The low-order 16 bits of EFLAGS is named FLAGS and can be treated as a unit. This feature is useful when executing 8086 and 80286 code, because this part of EFLAGS is identical to the FLAGS register of the 8086 and the 80286.
3. The flags may be considered in three groups: the status flags, the control flags, and the systems flags.
1.5.5 Status Flags
The status flags of the EFLAGS register allow the results of one instruction to influence later instructions. The arithmetic instructions use OF, SF, ZF, AF, PF, and CF. The SCAS (Scan String), CMPS (Compare String), and LOOP instructions use ZF to signal that their operations are complete. There are instructions to set, clear, and complement CF before execution of an arithmetic instruction.
1.5.6 Control Flag
The control flag DF of the EFLAGS register controls string instructions. DF (Direction Flag, bit 10): setting DF causes string instructions to auto-decrement; that is, to process strings from high addresses to low addresses. Clearing DF causes string instructions to auto-increment, or to process strings from low addresses to high addresses.
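The status flags and DF occupy fixed bit positions in EFLAGS, so they can be picked out of a register value with shifts and masks. The Python sketch below assumes the standard positions (CF 0, PF 2, AF 4, ZF 6, SF 7, DF 10, OF 11); the names FLAG_BITS, decode_flags and string_step are hypothetical.

```python
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "DF": 10, "OF": 11}


def decode_flags(eflags: int) -> dict:
    """Return the value (0 or 1) of each status flag and of DF."""
    return {name: (eflags >> bit) & 1 for name, bit in FLAG_BITS.items()}


def string_step(eflags: int, element_size: int) -> int:
    """Amount added to ESI/EDI after one string element is processed."""
    direction_flag = (eflags >> FLAG_BITS["DF"]) & 1
    return -element_size if direction_flag else element_size


print(decode_flags(0x445))
print(string_step(0x400, 4), string_step(0x000, 4))
```

With DF set, a doubleword string instruction steps its index registers by −4 per element; with DF clear it steps them by +4.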
1.5.7 Instruction Pointer
The instruction pointer register (EIP) contains the offset address, relative to the start of the current code segment, of the next sequential instruction to be executed. The instruction pointer is not directly visible to the programmer; it is controlled implicitly by control-transfer instructions, interrupts, and exceptions.
Key takeaways
- Status and instruction registers. These special-purpose registers are used to record and alter certain aspects of the 80386 processor state.
- General registers are used interchangeably to contain the operands of logical and arithmetic operations.
- The segment registers of the 80386 give systems software designers the flexibility to choose among various models of memory organization.
- The stack pointer (ESP) register. ESP points to the top of the push-down stack (TOS). It is referenced implicitly by PUSH and POP operations, subroutine calls and returns, and interrupt operations.
- The flags control certain operations and indicate the status of the 80386.
- The arithmetic instructions use OF, SF, ZF, AF, PF, and CF. The SCAS (Scan String), CMPS (Compare String), and LOOP instructions use ZF to signal that their operations are complete.
- Clearing DF causes string instructions to auto-increment, or to process strings from low addresses to high addresses.
- The instruction pointer is not directly visible to the programmer; it is controlled implicitly by control-transfer instructions, interrupts, and exceptions.
1. The information encoded in an 80386 instruction includes a specification of the operation to be performed, the type of the operands to be manipulated, and the location of these operands. If an operand is located in memory, the instruction must also select, explicitly or implicitly, which of the currently addressable segments contains the operand.
2. 80386 instructions are composed of various elements and have various formats. The exact format of instructions is shown in Appendix B; the elements of instructions are described below. Of these instruction elements, only one, the opcode, is always present. The other elements may or may not be present, depending on the particular operation involved and on the location and type of the operands. The elements of an instruction, in order of occurrence are as follows:
3. Prefixes -- one or more bytes preceding an instruction that modify the operation of the instruction. The following types of prefixes can be used by applications programs:
a. Segment override -- explicitly specifies which segment register an instruction should use, thereby overriding the default segment-register selection used by the 80386 for that instruction.
b. Address size -- switches between 32-bit and 16-bit address generation.
c. Operand size -- switches between 32-bit and 16-bit operands.
d. Repeat -- used with a string instruction to cause the instruction to act on each element of the string.
4. Opcode -- specifies the operation performed by the instruction. Some operations have several different opcodes, each specifying a different variant of the operation.
5. Register specifier -- an instruction may specify one or two register operands. Register specifiers may occur either in the same byte as the opcode or in the same byte as the addressing-mode specifier.
6. Addressing-mode specifier -- when present, specifies whether an operand is a register or memory location; if in memory, specifies whether a displacement, a base register, an index register, and scaling are to be used.
7. SIB (scale, index, base) byte -- when the addressing-mode specifier indicates that an index register will be used to compute the address of an operand, an SIB byte is included in the instruction to encode the base register, the index register, and a scaling factor.
8. Displacement -- when the addressing-mode specifier indicates that a displacement will be used to compute the address of an operand, the displacement is encoded in the instruction. A displacement is a signed integer of 32, 16, or eight bits. The eight-bit form is used in the common case when the displacement is sufficiently small. The processor extends an eight-bit displacement to 16 or 32 bits, taking into account the sign.
9. Immediate operand -- when present, directly provides the value of an operand of the instruction. Immediate operands may be 8, 16, or 32 bits wide. In cases where an eight-bit immediate operand is combined in some way with a 16- or 32-bit operand, the processor automatically extends the size of the eight-bit operand, taking into account the sign.
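To make the addressing-mode specifier, SIB byte and displacement concrete, the Python sketch below splits the bytes 8B 44 B3 08, which encode MOV EAX, [EBX + ESI*4 + 8]. The function names are hypothetical, and decode_example deliberately handles only this one shape (an opcode followed by a ModR/M byte with mod = 01 and r/m = 100, an SIB byte and an 8-bit displacement).

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_modrm(byte: int):
    """Return the (mod, reg, r/m) fields of an addressing-mode byte."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def split_sib(byte: int):
    """Return (scale factor, index number, base number) of an SIB byte."""
    return 1 << (byte >> 6), (byte >> 3) & 7, byte & 7


def sign_extend8(byte: int) -> int:
    """Extend an 8-bit displacement, taking the sign into account."""
    return byte - 0x100 if byte & 0x80 else byte


def decode_example(code: bytes) -> dict:
    opcode, modrm, sib, disp8 = code
    mod, reg, rm = split_modrm(modrm)
    if (mod, rm) != (0b01, 0b100):
        raise NotImplementedError("sketch handles only mod=01 with an SIB byte")
    scale, index, base = split_sib(sib)
    if index == 0b100:
        raise NotImplementedError("index field 100 means no index register")
    return {
        "opcode": opcode,
        "reg": REG32[reg],
        "base": REG32[base],
        "index": REG32[index],
        "scale": scale,
        "disp": sign_extend8(disp8),
    }


print(decode_example(bytes([0x8B, 0x44, 0xB3, 0x08])))
```

A displacement byte of 0xF8 in the same position would decode as −8 rather than 248, which is the sign extension described for eight-bit displacements.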
Key takeaways
- If an operand is located in memory, the instruction must also select, explicitly or implicitly, which of the currently addressable segments contains the operand.
- Of these instruction elements, only one, the opcode, is always present. The other elements may or may not be present, depending on the particular operation involved and on the location and type of the operands.
1. An instruction can act on zero or more operands, which are the data manipulated by the instruction. An example of a zero-operand instruction is NOP (no operation). An operand can be in any of these locations:
- In the instruction itself (an immediate operand)
- In a register (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP in the case of 32-bit operands; AX, BX, CX, DX, SI, DI, SP, or BP in the case of 16-bit operands; AH, AL, BH, BL, CH, CL, DH, or DL in the case of 8-bit operands; the segment registers; or the EFLAGS register for flag operations)
- In memory
- At an I/O port
2. Immediate operands and operands in registers can be accessed more rapidly than operands in memory since memory operands must be fetched from memory. Register operands are available in the CPU. Immediate operands are also available in the CPU, because they are prefetched as part of the instruction.
For most instructions, one of the two explicitly specified operands (either the source or the destination) can be either in a register or in memory. The other operand must be in a register or be an immediate source operand. Thus, the explicit two-operand instructions of the 80386 permit operations of the following kinds:
1. Register-to-register
2. Register-to-memory
3. Memory-to-register
4. Immediate-to-register
5. Immediate-to-memory
1.7.1 Immediate Operands
1. Certain instructions use data from the instruction itself as one (and sometimes two) of the operands. Such an operand is called an immediate operand. The operand may be 32, 16, or 8 bits long. For example:
SHR PATTERN, 2
One byte of the instruction holds the value 2, the number of bits by which to shift the variable PATTERN.
TEST PATTERN, 0FFFF00FFH
A doubleword of the instruction holds the mask that is used to test the variable PATTERN.
1.7.2 Register Operands
1. Operands may be located in one of the 32-bit general registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP), in one of the 16-bit general registers (AX, BX, CX, DX, SI, DI, SP, or BP), or in one of the 8-bit general registers (AH, BH, CH, DH, AL, BL, CL, or DL).
2. The 80386 has instructions for referencing the segment registers (CS, DS, ES, SS, FS, GS). These instructions are used by applications programs only if systems designers have chosen a segmented memory model.
3. The 80386 also has instructions for referring to the flag register. The flags may be stored on the stack and restored from the stack. Certain instructions change the commonly modified flags directly in the EFLAGS register. Other flags that are seldom modified can be modified indirectly via the flags image in the stack.
1.7.3 Memory Operands
1. Data-manipulation instructions that address operands in memory must specify (either directly or indirectly) the segment that contains the operand and the offset of the operand within the segment. However, for speed and compact instruction encoding, segment selectors are stored in the high speed segment registers.
2. Therefore, data-manipulation instructions need to specify only the desired segment register and an offset in order to address a memory operand.
3. An 80386 data-manipulation instruction that accesses memory uses one of the following methods for specifying the offset of a memory operand within its segment:
a. Most data-manipulation instructions that access memory contain a byte that explicitly specifies the addressing method for the operand. A byte, known as the modR/M byte, follows the opcode and specifies whether the operand is in a register or in memory. If the operand is in memory, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, a displacement. When an index register is used, the modR/M byte is also followed by another byte that identifies the index register and scaling factor. This addressing method is the most flexible.
b. A few data-manipulation instructions implicitly use specialized addressing methods:
- For a few short forms of MOV that implicitly use the EAX register, the offset of the operand is coded as a doubleword in the instruction. No base register, index register, or scaling factor is used.
- String operations implicitly address memory via DS:ESI (MOVS, CMPS, OUTS, LODS) or via ES:EDI (MOVS, CMPS, INS, STOS, SCAS).
- Stack operations implicitly address operands via SS:ESP registers; e.g., PUSH, POP, PUSHA, PUSHAD, POPA, POPAD, PUSHF, PUSHFD, POPF, POPFD, CALL, RET, IRET, IRETD, exceptions, and interrupts.
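The offset computed by the general addressing method is base + index × scale + displacement, truncated to 32 bits. The Python sketch below models just that arithmetic; the function name effective_address, the register dictionary and the sample register values are assumptions for illustration, and segment translation is left out.

```python
MASK32 = 0xFFFFFFFF


def effective_address(regs: dict, base: str = None, index: str = None,
                      scale: int = 1, disp: int = 0) -> int:
    """Offset within the segment: base + index*scale + disp, modulo 2**32."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("the scaling factor is 1, 2, 4 or 8")
    if index == "ESP":
        raise ValueError("ESP cannot be used as an index register")
    offset = disp
    if base is not None:
        offset += regs[base]
    if index is not None:
        offset += regs[index] * scale
    return offset & MASK32


regs = {"EBX": 0x1000, "ESI": 3, "EBP": 0x2000}
print(hex(effective_address(regs, base="EBX", index="ESI", scale=4, disp=8)))
print(hex(effective_address(regs, base="EBP", disp=-4)))
```

The first call corresponds to an operand written as [EBX + ESI*4 + 8], a typical way to reach element ESI of a doubleword array that starts 8 bytes into a structure at EBX; the second shows a negative displacement from EBP.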
Key takeaways
- An instruction can act on zero or more operands, which are the data manipulated by the instruction
- Immediate operands and operands in registers can be accessed more rapidly than operands in memory since memory operands must be fetched from memory.
- One byte of the instruction holds the value 2, the number of bits by which to shift the variable PATTERN.
- The flags may be stored on the stack and restored from the stack. Certain instructions change the commonly modified flags directly in the EFLAGS register
- If the operand is in memory, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, a displacement.
1.8.1 Introduction
The 80386 has two mechanisms for interrupting program execution:
1. Exceptions are synchronous events that are the responses of the CPU to certain conditions detected during the execution of an instruction.
2. Interrupts are asynchronous events typically triggered by external devices needing attention.
3. Interrupts and exceptions are alike in that both cause the processor to temporarily suspend its present program execution in order to execute a program of higher priority. The major distinction between these two kinds of interrupts is their origin.
4. An exception is always reproducible by re-executing with the program and data that caused the exception, whereas an interrupt is generally independent of the currently executing program.
5. Application programmers are not normally concerned with servicing interrupts; interrupt handling is mainly a topic for systems programmers. Certain exceptions, however, are of interest to applications programmers, and many operating systems give applications programs the opportunity to service these exceptions. However, the operating system itself defines the interface between the applications programs and the exception mechanism of the 80386.
1.8.2 Table highlights
The following table highlights the exceptions that may be of interest to applications programmers.
Vector Number    Description
0                Divide Error
1                Debug Exceptions
2                NMI Interrupt
3                Breakpoint
4                INTO Detected Overflow
5                BOUND Range Exceeded
6                Invalid Opcode
7                Coprocessor Not Available
8                Double Exception
9                Coprocessor Segment Overrun
10               Invalid Task State Segment
11               Segment Not Present
12               Stack Fault
13               General Protection
14               Page Fault
15               (reserved)
16               Coprocessor Error
17-31            (reserved)
1. A divide error exception results when the instruction DIV or IDIV is executed with a zero denominator or when the quotient is too large for the destination operand. (Refer to Chapter 3 for a discussion of DIV and IDIV.)
2. The debug exception may be reflected back to an applications program if it results from the trap flag (TF).
3. A breakpoint exception results when the instruction INT 3 is executed. This instruction is used by some debuggers to stop program execution at specific points.
4. An overflow exception results when the INTO instruction is executed and the OF (overflow) flag is set (after an arithmetic operation that set the OF flag). (Refer to Chapter 3 for a discussion of INTO.)
5. A bounds check exception results when the BOUND instruction is executed and the array index it checks falls outside the bounds of the array. (Refer to Chapter 3 for a discussion of the BOUND instruction.)
6. Invalid opcodes may be used by some applications to extend the instruction set. In such a case, the invalid opcode exception presents an opportunity to emulate the opcode.
7. The "coprocessor not available" exception occurs if the program contains instructions for a coprocessor, but no coprocessor is present in the system.
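The two conditions that raise a divide error can be modelled in Python for the 32-bit form of DIV, which divides the 64-bit value in EDX:EAX by the source operand. The names DivideError and div32 are hypothetical; the exception class merely stands in for vector 0.

```python
class DivideError(Exception):
    """Stands in for exception vector 0 in this model."""


def div32(edx: int, eax: int, divisor: int):
    """Unsigned divide of EDX:EAX by divisor; returns (quotient, remainder)."""
    if divisor == 0:
        raise DivideError("zero denominator")
    dividend = ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise DivideError("quotient too large for EAX")
    return quotient, remainder      # the hardware leaves these in EAX and EDX


print(div32(0, 100, 7))
try:
    div32(1, 0, 1)                  # 2**32 / 1 does not fit in 32 bits
except DivideError as error:
    print("divide error:", error)
```

The second case is the one that surprises newcomers: the divisor is not zero, yet the exception is raised because the quotient cannot be stored in the destination register.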
Key takeaways
- Interrupts and exceptions are alike in that both cause the processor to temporarily suspend its present program execution in order to execute a program of higher priority.
- Certain exceptions, however, are of interest to applications programmers, and many operating systems give applications programs the opportunity to service these exceptions.
- Invalid opcodes may be used by some applications to extend the instruction set. In such a case, the invalid opcode exception presents an opportunity to emulate the opcode.
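Data movement instructions move data from one place, called the source operand, to another place, called the destination operand. Data movement instructions can be grouped into loads, stores, moves, and immediate loads. Note that the mnemonics tabulated in this section (lb, lw, sw, move, li, and so on) follow MIPS assembler conventions rather than 80386 ones; on the 80386 these transfers are written mainly with MOV, together with MOVSX and MOVZX for sign- and zero-extending loads and LEA for loading an address.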
1. Load instructions move data from memory to registers.
2. Store instructions move data from registers to memory.
3. Move instructions move data from one register to another.
4. Immediate load instructions move constants, including addresses, to registers.
1. Load Instructions
Load instructions move data from memory to registers.
Instruction        Operation
lb Rd, addr        load byte, sign extended
lbu Rd, addr       load byte, zero extended
lh Rd, addr        load halfword, sign extended
lhu Rd, addr       load halfword, zero extended
lw Rd, addr        load word
l.d Fd, addr       load double
l.s Fd, addr       load float
2. Store Instructions
Store instructions move data from registers to memory.
Instruction        Operation
sb Rs, addr        store byte
sh Rs, addr        store halfword
sw Rs, addr        store word
s.d Fs, addr       store double
s.s Fs, addr       store float
3. Move Instructions
Move instructions move data from one register to another.
Instruction        Operation
mfc0 Rd, Cs        load from control register
mfc1 Rd, Fs        load from floating point register
move Rd, Rs        move integer data
move.d Fd, Fs      move double
move.s Fd, Fs      move float
mtc0 Rs, Cd        store to control register
mtc1 Rs, Fd        store to floating point register
4. Immediate Load Instructions
Immediate load instructions move constants to registers.
Instruction        Operation
la Rd, label       load address
li Rd, const       load immediate
li.d Fd, const     load double immediate
li.s Fd, const     load float immediate
Key takeaways
- Data movement instructions can be grouped into loads, stores, moves, and immediate loads.
1.10.1 Introduction
1. The arithmetic instructions of the 80386 processor simplify the manipulation of numeric data that is encoded in binary. Operations include the standard add, subtract, multiply, and divide as well as increment, decrement, compare, and change sign.
2. Both signed and unsigned binary integers are supported. The binary arithmetic instructions may also be used as one step in the process of performing arithmetic on decimal integers.
3. Many of the arithmetic instructions operate on both signed and unsigned integers. These instructions update the flags ZF, CF, SF, and OF in such a manner that subsequent instructions can interpret the results of the arithmetic as either signed or unsigned.
4. CF contains information relevant to unsigned integers; SF and OF contain information relevant to signed integers. ZF is relevant to both signed and unsigned integers; ZF is set when all bits of the result are zero.
5. If the integer is unsigned, CF may be tested after one of these arithmetic operations to determine whether the operation required a carry or borrow of a one-bit in the high-order position of the destination operand.
6. CF is set if a one-bit was carried out of the high-order position (addition instructions ADD, ADC, AAA, and DAA) or if a one-bit was carried (i.e. borrowed) into the high-order bit (subtraction instructions SUB, SBB, AAS, DAS, CMP, and NEG).
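How one result can be read as either signed or unsigned becomes clearer when the four flags are computed explicitly. The Python sketch below models 32-bit addition and subtraction; the function names add32 and sub32 are hypothetical, and only CF, ZF, SF and OF are modelled.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int, carry_in: int = 0):
    """32-bit addition; returns (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in
    result = full & MASK32
    flags = {
        "CF": int(full > MASK32),                    # carry out of bit 31
        "ZF": int(result == 0),
        "SF": result >> 31,                          # copy of the sign bit
        "OF": ((a ^ result) & (b ^ result)) >> 31,   # operands agree in sign, result differs
    }
    return result, flags


def sub32(a: int, b: int, borrow_in: int = 0):
    """32-bit subtraction a - b; returns (result, flags)."""
    a &= MASK32
    b &= MASK32
    result = (a - b - borrow_in) & MASK32
    flags = {
        "CF": int(a < b + borrow_in),                # a borrow was needed
        "ZF": int(result == 0),
        "SF": result >> 31,
        "OF": ((a ^ b) & (a ^ result)) >> 31,        # operands differ in sign, result flips
    }
    return result, flags


print(add32(0xFFFFFFFF, 1))
print(add32(0x7FFFFFFF, 1))
print(sub32(0, 1))
```

Adding 1 to 0xFFFFFFFF sets CF but not OF: it overflows as an unsigned sum yet is simply −1 + 1 = 0 when signed. Adding 1 to 0x7FFFFFFF does the opposite, setting OF but not CF.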
1.10.2 Addition and Subtraction Instructions
1. ADD (Add Integers) replaces the destination operand with the sum of the source and destination operands. It sets CF if the unsigned sum overflows, that is, if there is a carry out of the high-order bit.
2. ADC (Add Integers with Carry) sums the operands, adds one if CF is set, and replaces the destination operand with the result. If CF is cleared, ADC performs the same operation as the ADD instruction. An ADD followed by multiple ADC instructions can be used to add numbers longer than 32 bits.
3. INC (Increment) adds one to the destination operand. INC does not affect CF. Use ADD with an immediate value of 1 if an increment that updates carry (CF) is needed.
4. SUB (Subtract Integers) subtracts the source operand from the destination operand and replaces the destination operand with the result. If a borrow is required, the CF is set. The operands may be signed or unsigned bytes, words, or doublewords.
5. SBB (Subtract Integers with Borrow) subtracts the source operand from the destination operand, subtracts 1 if CF is set, and returns the result to the destination operand. If CF is cleared, SBB performs the same operation as SUB. SUB followed by multiple SBB instructions may be used to subtract numbers longer than 32 bits.
6. DEC (Decrement) subtracts 1 from the destination operand. DEC does not update CF. Use SUB with an immediate value of 1 to perform a decrement that affects carry.
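The ADD/ADC and SUB/SBB chaining described above can be sketched as follows. This is an illustrative Python model only; the function names and the little-endian list-of-words representation are assumptions, not part of the instruction set.

```python
MASK32 = 0xFFFFFFFF

def add_multiword(a_words, b_words):
    """Add two equal-length little-endian lists of 32-bit words.

    The first iteration behaves like ADD (carry starts at 0); every later
    iteration behaves like ADC, consuming the carry of the previous word.
    """
    carry, out = 0, []
    for a, b in zip(a_words, b_words):
        total = a + b + carry
        out.append(total & MASK32)
        carry = total >> 32              # models CF after the word addition
    return out, carry

def sub_multiword(a_words, b_words):
    """Subtract b from a word by word: SUB first, then SBB with the borrow."""
    borrow, out = 0, []
    for a, b in zip(a_words, b_words):
        diff = a - b - borrow
        out.append(diff & MASK32)
        borrow = int(diff < 0)           # models CF after the word subtraction
    return out, borrow
```

Adding the 64-bit values 0x00000001_FFFFFFFF and 0x00000000_00000001 as `add_multiword([0xFFFFFFFF, 1], [1, 0])` gives `([0, 2], 0)`: the carry out of the low word is what ADC adds into the high word.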
1.10.3 Multiplication Instructions
1. The 80386 has separate multiply instructions for unsigned and signed operands. MUL operates on unsigned numbers, while IMUL operates on signed integers as well as unsigned.
2. MUL (Unsigned Integer Multiply) performs an unsigned multiplication of the source operand and the accumulator. If the source is a byte, the processor multiplies it by the contents of AL and returns the double-length result to AH and AL.
3. If the source operand is a word, the processor multiplies it by the contents of AX and returns the double-length result to DX and AX. If the source operand is a double word, the processor multiplies it by the contents of EAX and returns the 64-bit result in EDX and EAX. MUL sets CF and OF when the upper half of the result is nonzero; otherwise, they are cleared.
4. IMUL (Signed Integer Multiply) performs a signed multiplication operation. IMUL has three variations:
4.1 A one-operand form. The operand may be a byte, word, or double word located in memory or in a general register. This instruction uses EAX and EDX as implicit operands in the same way as the MUL instruction.
4.2 A two-operand form. One of the source operands may be in any general register while the other may be either in memory or in a general register. The product replaces the general-register operand.
4.3 A three-operand form; two are source and one is the destination operand. One of the source operands is an immediate value stored in the instruction; the second may be in memory or in any general register. The product may be stored in any general register. The immediate operand is treated as signed. If the immediate operand is a byte, the processor automatically sign-extends it to the size of the second operand before performing the multiplication.
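To make the double-length result concrete, here is a hedged Python sketch of the doubleword one-operand forms. The names `mul32`, `imul32`, and `to_signed32` are illustrative assumptions. The flag rule for IMUL is not stated in the text above; the sketch assumes the usual behaviour that CF and OF are cleared when the upper half is merely the sign extension of the lower half.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mul32(eax, src):
    """Unsigned multiply: return (EDX, EAX, CF/OF) for the 64-bit product."""
    product = (eax & MASK32) * (src & MASK32)
    edx, lo = product >> 32, product & MASK32
    return edx, lo, int(edx != 0)        # flags set when upper half is nonzero

def imul32(eax, src):
    """Signed multiply, one-operand form: return (EDX, EAX, CF/OF)."""
    product = to_signed32(eax) * to_signed32(src)
    edx, lo = (product >> 32) & MASK32, product & MASK32
    # assumption: flags are set only when EAX alone cannot hold the product
    return edx, lo, int(product != to_signed32(lo))
```

With EAX = 0xFFFFFFFF and a source of 2, `mul32` returns EDX = 1, EAX = 0xFFFFFFFE with the flags set (the operand is read as 4294967295), whereas `imul32` returns EDX = 0xFFFFFFFF, EAX = 0xFFFFFFFE with the flags clear (the operand is read as -1).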
1.10.4 Division Instructions
1. The 80386 has separate division instructions for unsigned and signed operands. DIV operates on unsigned numbers, while IDIV operates on signed integers as well as unsigned. In either case, an exception (interrupt zero) occurs if the divisor is zero or if the quotient is too large for AL, AX, or EAX.
2. DIV (Unsigned Integer Divide) performs an unsigned division of the accumulator by the source operand. The dividend (the accumulator) is twice the size of the divisor (the source operand); the quotient and remainder have the same size as the divisor, as the following table shows.
| Size of Source Operand (Divisor) | Dividend | Quotient | Remainder |
|---|---|---|---|
| Byte | AX | AL | AH |
| Word | DX:AX | AX | DX |
| Doubleword | EDX:EAX | EAX | EDX |
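The register pairing for DIV and the two exception conditions can be sketched in Python as follows. This is an illustrative model; `div` and `DivideError` are hypothetical names standing in for the instruction and for interrupt zero.

```python
class DivideError(Exception):
    """Stands in for the divide exception (interrupt zero)."""

def div(dividend_hi, dividend_lo, divisor, bits):
    """Model unsigned DIV for a divisor of `bits` width (8, 16, or 32).

    The dividend is twice as wide as the divisor: AH:AL, DX:AX, or EDX:EAX.
    Returns (quotient, remainder), each `bits` wide.
    """
    if divisor == 0:
        raise DivideError("divisor is zero")
    dividend = (dividend_hi << bits) | dividend_lo
    quotient, remainder = divmod(dividend, divisor)
    if quotient >= (1 << bits):
        raise DivideError("quotient too large for AL, AX, or EAX")
    return quotient, remainder
```

For instance, `div(0, 100, 7, 8)` returns `(14, 2)` (quotient for AL, remainder for AH), while `div(1, 0, 1, 32)` raises `DivideError` because the quotient 2^32 does not fit in EAX.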
Key takeaways
- Operations include the standard add, subtract, multiply, and divide as well as increment, decrement, compare, and change sign.
- ADC (Add Integers with Carry) sums the operands, adds one if CF is set, and replaces the destination operand with the result.
- SBB (Subtract Integers with Borrow) subtracts the source operand from the destination operand, subtracts 1 if CF is set, and returns the result to the destination operand.
- MUL (Unsigned Integer Multiply) performs an unsigned multiplication of the source operand and the accumulator.
- DIV operates on unsigned numbers, while IDIV operates on signed integers as well as unsigned.
1.11.1 Introduction
1. Decimal arithmetic is performed by combining the binary arithmetic instructions (already discussed in the prior section) with the decimal arithmetic instructions. The decimal arithmetic instructions are used in one of the following ways:
2. To adjust the results of a previous binary arithmetic operation to produce a valid packed or unpacked decimal result.
3. To adjust the inputs to a subsequent binary arithmetic operation so that the operation will produce a valid packed or unpacked decimal result.
4. These instructions operate only on the AL or AH registers. Most utilize the AF flag.
1.11.2 Packed BCD Adjustment Instructions
1. DAA (Decimal Adjust after Addition) adjusts the result of adding two valid packed decimal operands in AL. DAA must always follow the addition of two pairs of packed decimal numbers (one digit in each half-byte) to obtain a pair of valid packed decimal digits as results. The carry flag is set if carry was needed.
2. DAS (Decimal Adjust after Subtraction) adjusts the result of subtracting two valid packed decimal operands in AL. DAS must always follow the subtraction of one pair of packed decimal numbers (one digit in each half-byte) from another to obtain a pair of valid packed decimal digits as results. The carry flag is set if a borrow was needed.
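As an illustration of the adjustment step, the following Python sketch models an ADD of two packed-BCD bytes followed by DAA. The function name `bcd_add` is an assumption, and the adjustment rule used (add 6 to fix the low digit, add 60H to fix the high digit) is the commonly documented one rather than something spelled out in the text above.

```python
def bcd_add(a, b):
    """Add two packed-BCD bytes as ADD followed by DAA would; return (AL, CF)."""
    raw = a + b
    al = raw & 0xFF
    cf = raw >> 8                                # carry out of bit 7 from ADD
    af = int((a & 0x0F) + (b & 0x0F) > 0x0F)     # carry out of bit 3 from ADD
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:                    # low digit is not a valid BCD digit
        al = (al + 0x06) & 0xFF
    if old_al > 0x99 or old_cf:                  # high digit needs correcting
        al = (al + 0x60) & 0xFF
        cf = 1
    else:
        cf = 0
    return al, cf
```

`bcd_add(0x38, 0x45)` returns `(0x83, 0)`, matching 38 + 45 = 83, and `bcd_add(0x58, 0x47)` returns `(0x05, 1)`, matching 58 + 47 = 105 with the hundreds digit signalled through the carry flag.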
1.11.3 Unpacked BCD Adjustment Instructions
1. AAA (ASCII Adjust after Addition) changes the contents of register AL to a valid unpacked decimal number, and zeros the top 4 bits. AAA must always follow the addition of two unpacked decimal operands in AL. The carry flag is set and AH is incremented if a carry is necessary.
2. AAS (ASCII Adjust after Subtraction) changes the contents of register AL to a valid unpacked decimal number, and zeros the top 4 bits. AAS must always follow the subtraction of one unpacked decimal operand from another in AL. The carry flag is set and AH decremented if a borrow is necessary.
3. AAM (ASCII Adjust after Multiplication) corrects the result of a multiplication of two valid unpacked decimal numbers. AAM must always follow the multiplication of two decimal numbers to produce a valid decimal result. The high order digit is left in AH, the low order digit in AL.
4. AAD (ASCII Adjust before Division) modifies the numerator in AH and AL to prepare for the division of two valid unpacked decimal operands so that the quotient produced by the division will be a valid unpacked decimal number. AH should contain the high-order digit and AL the low-order digit. This instruction adjusts the value and places the result in AL. AH will contain zero.
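The unpacked adjustments are simple enough to model directly. The sketch below is illustrative Python, with hypothetical function names mirroring the mnemonics; each function takes and returns register values rather than touching real registers.

```python
def aaa(ah, al, af=0):
    """After adding two unpacked digits in AL: return (AH, AL, CF)."""
    if (al & 0x0F) > 9 or af:
        al += 6
        ah = (ah + 1) & 0xFF         # carry into the next decimal digit
        cf = 1
    else:
        cf = 0
    return ah, al & 0x0F, cf         # top 4 bits of AL are zeroed

def aam(al):
    """After multiplying two unpacked digits: return (AH, AL) = (tens, units)."""
    return divmod(al, 10)

def aad(ah, al):
    """Before division: fold the digits AH:AL into a binary value in AL."""
    return 0, (ah * 10 + al) & 0xFF  # AH becomes zero
```

For example, 7 + 8 leaves 0FH in AL, and `aaa(0, 0x0F)` returns `(1, 5, 1)`, the digits of 15. Likewise 7 x 9 leaves 63 in AL, `aam(63)` returns `(6, 3)`, and `aad(6, 3)` turns those digits back into `(0, 63)` ready for a binary divide.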
Key takeaways
- Decimal arithmetic is performed by combining the binary arithmetic instructions (already discussed in the prior section) with the decimal arithmetic instructions.
- DAS must always follow the subtraction of one pair of packed decimal numbers (one digit in each half-byte) from another to obtain a pair of valid packed decimal digits as results.
- The carry flag is set and AH is incremented if a carry is necessary.
1.12.1 Introduction
The group of logical instructions includes:
1. The Boolean operation instructions
2. Bit test and modify instructions
3. Bit scan instructions
4. Rotate and shift instructions
5. Byte set on condition
1.12.2 Boolean Operation Instructions
1. The logical operations are AND, OR, XOR, and NOT.
2. NOT (Not) inverts the bits in the specified operand to form a one's complement of the operand. The NOT instruction is a unary operation that uses a single operand in a register or memory. NOT has no effect on the flags.
3. The AND, OR, and XOR instructions perform the standard logical operations "and", "(inclusive) or", and "exclusive or". These instructions can use the following combinations of operands:
3.1 Two register operands.
3.2 A general register operand with a memory operand.
3.3 An immediate operand with either a general register operand or a memory operand.
4. AND, OR, and XOR clear OF and CF, leave AF undefined, and update SF, ZF, and PF.
1.12.3 Bit Test and Modify Instructions
1. This group of instructions operates on a single bit which can be in memory or in a general register. The location of the bit is specified as an offset from the low-order end of the operand. The value of the offset either may be given by an immediate byte in the instruction or may be contained in a general register.
2. These instructions first assign the value of the selected bit to CF, the carry flag. Then a new value is assigned to the selected bit, as determined by the operation. OF, SF, ZF, AF, PF are left in an undefined state. Table 3-1 defines these instructions.
Table 3-1. Bit Test and Modify Instructions
| Instruction | Effect on CF | Effect on Selected Bit |
|---|---|---|
| BT (Bit Test) | CF := BIT | (none) |
| BTS (Bit Test and Set) | CF := BIT | BIT := 1 |
| BTR (Bit Test and Reset) | CF := BIT | BIT := 0 |
| BTC (Bit Test and Complement) | CF := BIT | BIT := NOT(BIT) |
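The two-step behaviour (copy the bit to CF, then modify it) can be sketched as follows. This is an illustrative Python model; `bit_op` and its string-valued `op` parameter are assumptions chosen for brevity.

```python
def bit_op(value, index, op):
    """Model BT, BTS, BTR, or BTC on `value`; return (new_value, CF)."""
    mask = 1 << index                # offset counted from the low-order end
    cf = (value >> index) & 1        # step 1: CF := selected bit
    if op == "BTS":
        value |= mask                # BIT := 1
    elif op == "BTR":
        value &= ~mask               # BIT := 0
    elif op == "BTC":
        value ^= mask                # BIT := NOT(BIT)
    elif op != "BT":
        raise ValueError(f"unknown operation: {op}")
    return value, cf
```

Starting from the pattern 1010 (binary), `bit_op(0b1010, 1, "BTC")` returns `(0b1000, 1)`: CF receives the old value of bit 1 and the bit itself is complemented.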
1.12.4 Bit Scan Instructions
1. These instructions scan a word or doubleword for a one-bit and store the index of the first set bit into a register. The bit string being scanned may be either in a register or in memory. The ZF flag is set if the entire word is zero (no set bits are found); ZF is cleared if a one-bit is found. If no set bit is found, the value of the destination register is undefined.
2. BSF (Bit Scan Forward) scans from low-order to high-order (starting from bit index zero).
3. BSR (Bit Scan Reverse) scans from high-order to low-order (starting from bit index 15 of a word or index 31 of a doubleword).
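A short Python sketch of the two scans is given below. The function names follow the mnemonics; returning `None` for the destination when no bit is set is an assumption used here to represent the "undefined" result.

```python
def bsf(value, bits=32):
    """Bit Scan Forward: return (index of lowest set bit, ZF)."""
    value &= (1 << bits) - 1
    if value == 0:
        return None, 1                         # destination undefined, ZF set
    return (value & -value).bit_length() - 1, 0

def bsr(value, bits=32):
    """Bit Scan Reverse: return (index of highest set bit, ZF)."""
    value &= (1 << bits) - 1
    if value == 0:
        return None, 1                         # destination undefined, ZF set
    return value.bit_length() - 1, 0
```

For the pattern 101000 (binary), `bsf` returns index 3 and `bsr` returns index 5, both with ZF cleared.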
1.12.5 Shift and Rotate Instructions
The shift and rotate instructions reposition the bits within the specified operand. These instructions fall into the following classes:
1. Shift instructions
2. Double shift instructions
3. Rotate instructions
Key takeaways
- The NOT instruction is a unary operation that uses a single operand in a register or memory.
- The AND, OR, and XOR instructions perform the standard logical operations "and", "(inclusive) or", and "exclusive or".
- The location of the bit is specified as an offset from the low-order end of the operand.
- The value of the offset either may be given by an immediate byte in the instruction or may be contained in a general register.
- BSF (Bit Scan Forward) scans from low-order to high-order (starting from bit index zero).
- BSR (Bit Scan Reverse) scans from high-order to low-order (starting from bit index 15 of a word or index 31 of a doubleword).
1.13.1 Introduction
The 80386 provides both conditional and unconditional control transfer instructions to direct the flow of execution. Conditional control transfers depend on the results of operations that affect the flag register. Unconditional control transfers are always executed.
1.13.2 Unconditional Transfer Instructions
1. JMP, CALL, RET, INT and IRET instructions transfer control from one code segment location to another. These locations can be within the same code segment (near control transfers) or in different code segments (far control transfers).
2. The variants of these instructions that transfer control to other segments are discussed in a later section of this chapter. If the model of memory organization used in a particular 80386 application does not make segments visible to applications programmers, intersegment control transfers will not be used.
1.13.3 Jump Instruction
1. JMP (Jump) unconditionally transfers control to the target location. JMP is a one-way transfer of execution; it does not save a return address on the stack.
2. The JMP instruction always performs the same basic function of transferring control from the current location to a new location. Its implementation varies depending on whether the address is specified directly within the instruction or indirectly through a register or memory.
3. A direct JMP instruction includes the destination address as part of the instruction. An indirect JMP instruction obtains the destination address indirectly through a register or a pointer variable.
1.13.4 Call Instruction
1. CALL (Call Procedure) activates an out-of-line procedure, saving on the stack the address of the instruction following the CALL for later use by a RET (Return) instruction.
2. CALL places the current value of EIP on the stack. The RET instruction in the called procedure uses this address to transfer control back to the calling program.
3. CALL instructions, like JMP instructions, have relative, direct, and indirect versions.
1.13.5 Return and Return-From-Interrupt Instruction
1. RET (Return From Procedure) terminates the execution of a procedure and transfers control through a back-link on the stack to the program that originally invoked the procedure. RET restores the value of EIP that was saved on the stack by the previous CALL instruction.
2. RET instructions may optionally specify an immediate operand. By adding this constant to the new top-of-stack pointer, RET effectively removes any arguments that the calling program pushed on the stack before the execution of the CALL instruction.
3. IRET (Return From Interrupt) returns control to an interrupted procedure. IRET differs from RET in that it also pops the flags from the stack into the flags register. The flags are stored on the stack by the interrupt mechanism.
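The interplay of CALL, RET, and the optional RET immediate can be sketched with a toy stack model. The Python below covers near transfers only; the class name `Cpu`, the dictionary used as memory, and the addresses in the usage note are assumptions for illustration.

```python
class Cpu:
    """Toy model of EIP, ESP, and a stack of 32-bit entries (near transfers only)."""

    def __init__(self, esp=0x1000):
        self.mem = {}                # address -> 32-bit value
        self.esp = esp
        self.eip = 0

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, return_address):
        self.push(return_address)    # address of the instruction after CALL
        self.eip = target

    def ret(self, imm=0):
        self.eip = self.pop()        # restore the saved EIP
        self.esp += imm              # discard arguments pushed by the caller
```

If a caller pushes two 4-byte arguments, calls a procedure at 0x400 from an instruction whose successor is at 0x105, and the procedure ends with `ret(8)`, then EIP returns to 0x105 and ESP returns to its value before the arguments were pushed.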
Key takeaways
- Conditional control transfers depend on the results of operations that affect the flag register.
- If the model of memory organization used in a particular 80386 application does not make segments visible to applications programmers, intersegment control transfers will not be used.
- Implementation varies depending on whether the address is specified directly within the instruction or indirectly through a register or memory.
- The RET instruction in the called procedure uses this address to transfer control back to the calling program.
- RET restores the value of EIP that was saved on the stack by the previous CALL instruction.
The instructions in this category operate on strings rather than on logical or numeric values. The power of 80386 string operations derives from the following features of the architecture:
a. A set of primitive string operations
1. MOVS -- Move String
2. CMPS -- Compare string
3. SCAS -- Scan string
4. LODS -- Load string
5. STOS -- Store string
b. Indirect, indexed addressing, with automatic incrementing or decrementing of the indexes.
Indexes:
1. ESI -- Source index register
2. EDI -- Destination index register
c. Control flag:
1. DF -- Direction flag
Control flag instructions:
1. CLD -- Clear direction flag instruction
2. STD -- Set direction flag instruction
d. Repeat prefixes
1. REP -- Repeat while ECX not zero
2. REPE/REPZ -- Repeat while equal or zero
3. REPNE/REPNZ -- Repeat while not equal or not zero
The primitive string operations operate on one element of a string. A string element may be a byte, a word, or a doubleword. The string elements are addressed by the registers ESI and EDI. After every primitive operation ESI and/or EDI are automatically updated to point to the next element of the string. If the direction flag is zero, the index registers are incremented; if one, they are decremented. The amount of the increment or decrement is 1, 2, or 4 depending on the size of the string element.
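The combination of a primitive, the index registers, the direction flag, and the REP prefix can be sketched as follows. This Python model ignores segmentation and treats memory as one flat `bytearray`; the function name `rep_movs` and its parameters are illustrative assumptions.

```python
def rep_movs(mem, esi, edi, ecx, size=1, df=0):
    """Model REP MOVS on a flat bytearray; return the final (ESI, EDI, ECX).

    size is the element width in bytes (1, 2, or 4); df is the direction flag.
    """
    step = -size if df else size                 # DF=0 increments, DF=1 decrements
    while ecx != 0:                              # REP: repeat while ECX is not zero
        mem[edi:edi + size] = mem[esi:esi + size]  # one MOVS primitive
        esi += step
        edi += step
        ecx -= 1
    return esi, edi, ecx
```

With `mem = bytearray(b"HELLO" + bytes(5))`, the call `rep_movs(mem, 0, 5, 5)` copies five byte elements forward, leaves `mem` equal to `b"HELLOHELLO"`, and returns `(5, 10, 0)`.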
Key takeaways
- A string element may be a byte, a word, or a doubleword.
- The string elements are addressed by the registers ESI and EDI. After every primitive operation ESI and/or EDI are automatically updated to point to the next element of the string.
1. The instructions in this section provide machine-language support for functions normally found in high-level languages. These instructions include ENTER and LEAVE, which simplify the programming of procedures.
2. ENTER (Enter Procedure) creates a stack frame that may be used to implement the scope rules of block-structured high-level languages. A LEAVE instruction at the end of a procedure complements an ENTER at the beginning of the procedure to simplify stack management and to control access to variables for nested procedures.
3. The ENTER instruction includes two parameters. The first parameter specifies the number of bytes of dynamic storage to be allocated on the stack for the routine being entered. The second parameter corresponds to the lexical nesting level (0-31) of the routine. (Note that the lexical level has no relationship to either the protection privilege levels or to the I/O privilege level.)
4. The specified lexical level determines how many sets of stack frame pointers the CPU copies into the new stack frame from the preceding frame. This list of stack frame pointers is sometimes called the display. The first word of the display is a pointer to the last stack frame. This pointer enables a LEAVE instruction to reverse the action of the previous ENTER instruction by effectively discarding the last stack frame.
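A rough Python model of the stack-frame bookkeeping may help here. It is a sketch under stated assumptions: 32-bit stack entries, a dictionary standing in for memory, a hypothetical class name `Frame`, and the commonly documented ENTER sequence (push EBP, copy level-1 display pointers, push the new frame pointer, then reserve storage).

```python
class Frame:
    """Toy model of ESP/EBP for ENTER and LEAVE; mem maps addresses to values."""

    def __init__(self, esp=0x1000, ebp=0x2000):
        self.mem, self.esp, self.ebp = {}, esp, ebp

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def enter(self, size, level=0):
        level %= 32
        self.push(self.ebp)              # back-link to the previous frame
        frame_ptr = self.esp
        if level > 0:
            for _ in range(1, level):    # copy enclosing frame pointers
                self.ebp -= 4
                self.push(self.mem[self.ebp])
            self.push(frame_ptr)         # add the new frame to the display
        self.ebp = frame_ptr
        self.esp -= size                 # reserve dynamic storage

    def leave(self):
        self.esp = self.ebp              # discard locals and display
        self.ebp = self.pop()            # restore the caller's frame pointer
```

Starting from ESP = 0x1000 and EBP = 0x2000, `enter(16, 0)` leaves EBP = 0xFFC and ESP = 0xFEC, and a following `leave()` restores both registers to their starting values.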
Key takeaways
- A LEAVE instruction at the end of a procedure complements an ENTER at the beginning of the procedure to simplify stack management and to control access to variables for nested procedures.
- This list of stack frame pointers is sometimes called the display. The first word of the display is a pointer to the last stack frame.
The flag control instructions provide a method for directly changing the state of bits in the flag register.
1.16.1 Carry and Direction Flag Control Instructions
1. The carry flag instructions are useful in conjunction with rotate-with-carry instructions RCL and RCR. They can initialize the carry flag, CF, to a known state before execution of a rotate that moves the carry bit into one end of the rotated operand.
2. The direction flag control instructions are specifically included to set or clear the direction flag, DF, which controls the left-to-right or right-to-left direction of string processing. If DF=0, the processor automatically increments the string index registers, ESI and EDI, after each execution of a string primitive. If DF=1, the processor decrements these index registers. Programmers should use one of these instructions before any procedure that uses string instructions to ensure that DF is set properly.
Flag control instructions:
| Flag Control Instruction | Effect |
|---|---|
| STC (Set Carry Flag) | CF := 1 |
| CLC (Clear Carry Flag) | CF := 0 |
| CMC (Complement Carry Flag) | CF := NOT(CF) |
| CLD (Clear Direction Flag) | DF := 0 |
| STD (Set Direction Flag) | DF := 1 |
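To show why initializing CF matters before a rotate through carry, the following Python sketch models a 32-bit RCL. The function name `rcl32` is an assumption; the `cf` argument plays the role of the value left in the carry flag by STC or CLC.

```python
def rcl32(value, cf, count=1):
    """Rotate a 32-bit value left through the carry flag; return (value, CF)."""
    for _ in range(count):
        new_cf = (value >> 31) & 1               # bit 31 moves into CF
        value = ((value << 1) | cf) & 0xFFFFFFFF # old CF enters at bit 0
        cf = new_cf
    return value, cf
```

After STC (cf=1), `rcl32(0x80000000, 1)` returns `(0x00000001, 1)`; after CLC (cf=0), the same rotate returns `(0x00000000, 1)`. The low-order bit of the result therefore depends entirely on the state of CF before the rotate.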
Key takeaways
- The carry flag instructions are useful in conjunction with rotate-with-carry instructions RCL and RCR.
- If DF=0, the processor automatically increments the string index registers, ESI and EDI, after each execution of a string primitive. If DF=1, the processor decrements these index registers.
1. A numerics coprocessor (e.g., the 80387 or 80287) extends the instruction set of the base architecture to support high-precision integer and floating-point calculations. This extended instruction set includes arithmetic, comparison, transcendental, and data transfer instructions. The coprocessor also contains a set of useful constants to enhance the speed of numeric calculations.
2. A program contains instructions for the coprocessor in line with the instructions for the CPU. The system executes these instructions in the same order as they appear in the instruction stream. The coprocessor operates concurrently with the CPU to provide maximum throughput for numeric calculations.
3. The 80386 also has features to support emulation of the numerics coprocessor when the coprocessor is absent. The software emulation of the coprocessor is transparent to application software but requires more time for execution.
4. ESC (Escape) is a 5-bit sequence that begins the opcodes that identify floating point numeric instructions. The ESC pattern tells the 80386 to send the opcode and addresses of operands to the numerics coprocessor. The numerics coprocessor uses the escape instructions to perform high-performance, high-precision floating point arithmetic that conforms to the IEEE floating point standard 754.
5. WAIT (Wait) is an 80386 instruction that suspends program execution until the 80386 CPU detects that the BUSY pin is inactive. This condition indicates that the coprocessor has completed its processing task and that the CPU may obtain the results.
Key takeaways
- The coprocessor extends the instruction set of the base architecture to support high-precision integer and floating-point calculations.
- The software emulation of the coprocessor is transparent to application software but requires more time for execution.
1. The segment registers of the 80386 give systems software designers the flexibility to choose among various models of memory organization. Implementation of memory models is the subject of Part II -- Systems Programming. Designers may choose a model in which applications programs do not need to modify segment registers, in which case applications programmers may skip this section.
2. Complete programs generally consist of many different modules, each consisting of instructions and data. However, at any given time during program execution, only a small subset of a program's modules are actually in use. The 80386 architecture takes advantage of this by providing mechanisms to support direct access to the instructions and data of the current module's environment, with access to additional segments on demand.
3. At any given instant, six segments of memory may be immediately accessible to an executing 80386 program. The segment registers CS, DS, SS, ES, FS, and GS are used to identify these six current segments. Each of these registers specifies a particular kind of segment, as characterized by the associated mnemonics ("code," "data," or "stack"). Each register uniquely determines one particular segment, from among the segments that make up the program, that is to be immediately accessible at highest speed.
4. The segment containing the currently executing sequence of instructions is known as the current code segment; it is specified by means of the CS register. The 80386 fetches all instructions from this code segment, using as an offset the contents of the instruction pointer. CS is changed implicitly as the result of intersegment control-transfer instructions (for example, CALL and JMP), interrupts, and exceptions.
5. Subroutine calls, parameters, and procedure activation records usually require that a region of memory be allocated for a stack. All stack operations use the SS register to locate the stack. Unlike CS, the SS register can be loaded explicitly, thereby permitting programmers to define stacks dynamically.
6. The DS, ES, FS, and GS registers allow the specification of four data segments, each addressable by the currently executing program. Accessibility to four separate data areas helps programs efficiently access different types of data structures; for example, one data segment register can point to the data structures of the current module, another to the exported data of a higher-level module, another to a dynamically created data structure, and another to data shared with another task. An operand within a data segment is addressed by specifying its offset either directly in an instruction or indirectly via general registers.
7. Depending on the structure of data (e.g., the way data is parceled into one or more segments), a program may require access to more than four data segments. To access additional segments, the DS, ES, FS, and GS registers can be changed under program control during the course of a program's execution. This simply requires that the program execute an instruction to load the appropriate segment register prior to executing instructions that access the data.
8. The processor associates a base address with each segment selected by a segment register. To address an element within a segment, a 32-bit offset is added to the segment's base address. Once a segment is selected (by loading the segment selector into a segment register), a data manipulation instruction only needs to specify the offset. Simple rules define which segment register is used to form an address when only an offset is specified.
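The base-plus-offset calculation in the last point can be written out as a one-line model. The Python sketch below is illustrative only: the function name, the dictionary of segment bases, and the base values in the usage note are assumptions, and limit and privilege checks are deliberately left out.

```python
def linear_address(segment_bases, segment, offset):
    """Add a 32-bit offset to the base of the selected segment (no limit check)."""
    return (segment_bases[segment] + offset) & 0xFFFFFFFF
```

With hypothetical bases `{"DS": 0x00010000, "SS": 0x00200000}`, the operand at offset 0x1234 in the data segment lies at `linear_address(bases, "DS", 0x1234)`, which is 0x00011234. Once the segment register is loaded, an instruction only has to supply the offset.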
Key takeaways
- The 80386 architecture takes advantage of this by providing mechanisms to support direct access to the instructions and data of the current module's environment, with access to additional segments on demand.
- Accessibility to four separate data areas helps programs efficiently access different types of data structures.
- To address an element within a segment, a 32-bit offset is added to the segment's base address. Once a segment is selected (by loading the segment selector into a segment register), a data manipulation instruction only needs to specify the offset.
1. Write Back and Invalidate Cache (wbinvd) [486 only]
Syntax: wbinvd
Example: Write back and invalidate the cache.
    wbinvd
2. Invalidate (invd) [486 only]
Syntax: invd
Example: Invalidate the entire cache.
    invd
3. Invalidate Page (invlpg) [486 only]
Syntax: invlpg mem32
Example: Invalidate a single entry in the translation lookaside buffer.
    invlpg 5(%ebx)
4. LOCK Prefix (lock)
Syntax: lock
Operation: LOCK# -> NEXT Instruction
Description:
4.1 The LOCK# signal is asserted during execution of the instruction following the lock prefix. This signal can be used in a multiprocessor system to ensure exclusive use of shared memory while LOCK# is asserted. The bts instruction is the read-modify-write sequence used to implement test-and-set.
4.2 The lock prefix works only with the instructions listed here. If a lock prefix is used with any other instructions, an undefined opcode trap is generated.
    bt, bts, btr, btc                     m, r/imm
    xchg                                  r, m
    xchg                                  m, r
    add, or, adc, sbb, and, sub, xor      m, r/imm
    not, neg, inc, dec                    m
4.3 Memory field alignment does not affect the integrity of lock. If a different 80386 processor is concurrently executing an instruction that has any of the following characteristics, locked access is not guaranteed:
    - The instruction does not follow a lock prefix.
    - The instruction is not on the previous list of acceptable instructions.
    - The instruction specifies a memory operand that has only a partial overlap with the destination operand.
Example: lock
5. No Operation (nop)
Syntax: nop
Operation: NO OPERATION
Description: No operations are performed by nop. The xchgl %eax, %eax instruction is an alias for the nop instruction.
Example: nop
6. Halt (hlt)
Syntax: hlt
Operation: HLT -> ENTER HALT STATE
Description: hlt puts the 80386 in a HALT state by stopping instruction execution. Execution is resumed by an NMI or an enabled interrupt. After a halt, if an interrupt is used to continue execution, the saved CS:EIP or CS:IP value points to the next instruction (after the hlt). The hlt instruction is privileged.
7. Address Prefix
Syntax: addr16
8. Data Prefix
Syntax: data16
Key takeaways
- Execution is resumed by an nmi or an enabled interrupt.
- If a different 80386 processor is concurrently executing an instruction that has a characteristic listed here, locked access is not guaranteed.
- The LOCK # signal is asserted during execution of the instruction following the lock prefix. This signal can be used in a multiprocessor system to ensure exclusive use of shared memory while LOCK # is asserted.