X86 Opcode and Instruction Reference

MazeGen, 2008-12-17

This reference is intended to be a precise opcode and instruction set reference for x86 (including x86-64). Its goal is an exact description of all parameters and attributes of each instruction.

Quick navigation

coder32, coder32-abc, geek32, geek32-abc

coder64, coder64-abc, geek64, geek64-abc

coder, coder-abc, geek, geek-abc (these contain the x86-32 and x64 instructions together).

Unlike other references, the primary source of this reference is an XML document, which guarantees a clear structure of the information and makes it possible to extract various data, e.g. a list of instructions from requested groups.
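As a rough illustration of the kind of extraction a structured XML source allows, the Python sketch below pulls the mnemonics of one group out of a tiny inline document. The element and attribute names (`reference`, `entry`, `opcode`, `mnem`, `grp1`, `grp2`) and the sample data are assumptions made up for this example; the real source XML may be organised differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified sample; NOT the actual schema of the source XML.
SAMPLE = """
<reference>
  <entry opcode="11"><syntax><mnem>ADC</mnem></syntax><grp1>gen</grp1><grp2>arith</grp2></entry>
  <entry opcode="74"><syntax><mnem>JZ</mnem></syntax><grp1>gen</grp1><grp2>branch</grp2></entry>
  <entry opcode="A5"><syntax><mnem>MOVS</mnem></syntax><grp1>gen</grp1><grp2>datamov</grp2></entry>
</reference>
"""


def mnemonics_in_group(xml_text, grp1, grp2):
    """Return the mnemonics of all entries whose grp1/grp2 match."""
    root = ET.fromstring(xml_text)
    found = []
    for entry in root.iter("entry"):
        if entry.findtext("grp1") == grp1 and entry.findtext("grp2") == grp2:
            found.append(entry.findtext("syntax/mnem"))
    return found


print(mnemonics_in_group(SAMPLE, "gen", "arith"))   # ['ADC']
```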

The reference is primarily based on Intel's documentation, Intel being the originator of the x86 architecture. However, it also describes undocumented instructions and, where relevant, points out major differences in instruction behaviour on the AMD architecture. Support for vendor-specific instructions (Cyrix, NexGen, etc.) is not planned.

HTML Editions

The following editions are currently available. The coder series is intended for everyday use and contains these editions: coder32, coder64 and coder (sorted by opcode), and coder32-abc, coder64-abc and coder-abc (sorted by instruction mnemonic). The geek series is intended for deeper study of the instruction set of the x86 architectures. It consists of these editions: geek32, geek64 and geek (sorted by opcode), and geek32-abc, geek64-abc and geek-abc (sorted by mnemonic). More on the purpose and use of this series follows a little further below.

Don't be confused by the geek(-abc) and coder(-abc) editions. Both contain the instruction set of the x86-32 as well as the x86-64 architecture. Unless you have a particular reason to use them (such as seeing the differences between the architectures), the other editions will probably suit you better.

The coder32 and geek32 editions cover only the x86-32 architecture. The same logic applies to the coder64 and geek64 editions: these cover only the x86-64 architecture.

The following chart illustrates the differences between editions for the current release:

Edition		coder	coder32	coder64	geek	geek32	geek64
Supported Architectures		both	pure x86-32	pure x86-64	both	pure x86-32	pure x86-64
Operand Codes		traditional	traditional	traditional	special	special	special
Abandoned Instructions		no	no	no	yes	yes	yes
Opcode Bitfields Information		no	no	no	yes	yes	yes
Instruction Extension Indicated		yes	yes	yes	yes	yes	yes
Instruction Group Indicated		no	no	no	yes	yes	yes
Present Instructions	general	yes	yes	yes	yes	yes	yes
	system	yes	yes	yes	yes	yes	yes
	x87 FPU	yes	yes	yes	yes	yes	yes
	MMX	yes	yes	yes	yes	yes	yes
	SSE	yes	yes	yes	yes	yes	yes
	SSE2	yes	yes	yes	yes	yes	yes
	SSE3	yes	yes	yes	yes	yes	yes
	SSSE3	yes	yes	yes	yes	yes	yes
	Itanium	no	no	no	yes	yes	yes

Briefly about the purpose of the geek editions

The geek editions contain as complete information from the source XML file as possible. That is why they are not very easy to read. You will appreciate them only if you need to know the x86 instruction set in real depth, or if you are studying the source XML and need to visualize it better.

These editions use special codes to denote operands (described in the Instruction Operand Codes chapter below). If you are seeing them for the first time, they may look strange and unclear. The reason for using them is that they carry much more information than the more commonly used codes. One example is the operand combination rAX, imm16/32, for instance in the instruction ADD rAX, imm16/32 in the coder64 edition. We can read from it that the destination operand is ax, eax or rax, and the source is imm16 or imm32. The problem arises when we look closer at what exactly happens with the combination rax, imm32. Someone who is just getting acquainted with the x64 architecture does not know how the immediate value is extended to the full 64 bits before the addition. The corresponding geek edition answers this question: ADD rAX, Ivds in the geek64 edition. The immediate here has the code Ivds. The code I means "Immediate", v means "word" or "doubleword" (imm16 or imm32). The most important part is ds, which means "doubleword, sign-extended to 64 bits for 64-bit operand size". So the immediate value is first sign-extended to 64 bits.
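The sign extension described by the ds code can be sketched in a few lines of Python. This is only a model of the arithmetic (flags are ignored), and the function names `sign_extend` and `add_rax_imm32` are invented for this illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend(value, from_bits, to_bits=64):
    """Sign-extend the low `from_bits` bits of `value` to `to_bits` bits."""
    value &= (1 << from_bits) - 1           # keep only the encoded bits
    if value & (1 << (from_bits - 1)):      # top bit set: negative in two's complement
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)     # wrap into an unsigned to_bits-wide value


def add_rax_imm32(rax, imm32):
    """Model the result of ADD RAX, imm32: the immediate is sign-extended first."""
    return (rax + sign_extend(imm32, 32)) & MASK64


print(hex(sign_extend(0x80000000, 32)))               # 0xffffffff80000000
print(hex(add_rax_imm32(0x100000000, 0xFFFFFFFF)))    # 0xffffffff
```

In the second call the encoded immediate 0xFFFFFFFF acts as -1, so the result is RAX minus one rather than RAX plus 4294967295.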

As for the Itanium-specific instructions, they are included only for interest, to point out that the corresponding opcodes are already in use.

Hyperlinking to a specific opcode

If you want to link to a specific opcode (in any edition), e.g. 0FA0 PUSH FS, it is easily done like this:

ref.x86asm.net/geek.html#x0FA0 (try it)

It works similarly for opcode extensions, e.g. 83 /7 CMP:

ref.x86asm.net/coder32.html#x83_7 (try it)
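The two links above suggest a simple pattern: an `x` followed by the opcode bytes in hexadecimal, plus `_` and the extension digit when there is one. The helper below is a minimal sketch built only from those two examples; the function name `opcode_link` is hypothetical, and the pattern is an inference that may not cover every entry.

```python
def opcode_link(edition, opcode_hex, extension=None):
    """Build a link to an opcode, following the pattern of the two examples above."""
    fragment = "x" + opcode_hex.upper()
    if extension is not None:
        if not 0 <= extension <= 7:
            raise ValueError("opcode extension must be in the range 0..7")
        fragment += "_%d" % extension
    return "ref.x86asm.net/%s.html#%s" % (edition, fragment)


print(opcode_link("geek", "0FA0"))        # ref.x86asm.net/geek.html#x0FA0
print(opcode_link("coder32", "83", 7))    # ref.x86asm.net/coder32.html#x83_7
```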

Browsers, printing

For viewing, I find Firefox the best. Opera 9 appears slower. Internet Explorer 6 and 7 do not support some CSS properties, so the reference looks slightly different in them.

Full printing support is available only as part of the benefits.

This is what a printed copy looks like:

Using the HTML editions

Because the HTML editions can look quite complicated at first sight, here is a short guide on how to work with them. The samples come from the coder32 edition, because it is easier to use than the geek editions.

Example: the ADC instruction

Let's start with a well-known instruction, say ADC. In this edition we find something like the following:

|pf|0F|po|so|flds|o|proc|st|m|rl|l|mnemonic|op1     |op2   |op3|op4|iext|grp1|grp2 |tested f|modif f |def f   |undef f|f values|description, notes|
|  |  |11|  |    |r|    |  | |  |L|ADC     |r/m16/32|r16/32|   |   |    |gen |arith|.......c|o..szapc|o..szapc|       |        |Add with Carry    |

The first column, pf (Prefix), is empty, which means the instruction has no fixed prefix.

The next column, 0F, is reserved solely for the 0F prefix of multi-byte opcodes, so it is empty.

The following column, po (Primary Opcode), holds the opcode value itself.

Because the instruction has no appended opcode byte, the so (Secondary Opcode) column is empty too.

The opcode carries no dedicated bit fields, so the flds (Opcode Fields) column is empty.

The o (Register/Opcode Field) column here contains the value "r", which, simply put, indicates that the instruction has a "full" ModR/M byte (without an opcode extension).

Because this instruction has been supported since the 8086 processor, the proc (Introduced with Processor) column is empty.

The instruction is officially documented, so the st column is empty as well.

The ADC instruction can run at every privilege level, so the rl (Ring Level) column is empty.

The x column (labelled l in the sample above) contains the value "L", which indicates that the instruction can be used with the LOCK prefix.

The following three columns, mnemonic, op1 and op2, give the instruction syntax itself. The destination operand of this instruction is set in boldface, which always means that the instruction modifies it.

The iext column is empty, because the instruction does not belong to any instruction set extension (Instruction Extension Group).

The grp1 and grp2 columns classify the instruction as a general arithmetic instruction.

The ADC instruction is affected by the CF flag, which is expressed by the tested f column.

The instruction modifies (overwrites) all status flags, which are therefore listed in the next column, modif f.

All of these flags are defined (the instruction does not leave any of them in an arbitrary state), so the same flags also appear in the next column, def f, and the undef f column must be empty.

None of the flags is ever set to a fixed value; their values depend on the input operands, so the f values column is empty.

The description, notes column contains only a general description of the instruction.
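The flag columns use a fixed eight-character pattern, odiszapc, in which a dot marks an absent flag. A small Python sketch can turn such a cell into a set of flag names; the helper name `flags_in` is an invention for this example, and the cell values are copied from the ADC row above.

```python
FLAG_PATTERN = "odiszapc"   # OF, DF, IF, SF, ZF, AF, PF, CF, in the order used by the reference


def flags_in(cell):
    """Return the set of flag names present in a flag cell such as 'o..szapc'."""
    cell = cell.strip()
    if not cell:                          # an empty cell means no flags at all
        return set()
    if len(cell) != len(FLAG_PATTERN):
        raise ValueError("expected %d characters, got %r" % (len(FLAG_PATTERN), cell))
    return {
        letter.upper() + "F"
        for letter, shown in zip(FLAG_PATTERN, cell)
        if shown.lower() == letter        # a dot (or anything else) means "not present"
    }


tested = flags_in(".......c")
modified = flags_in("o..szapc")
defined = flags_in("o..szapc")
undefined = flags_in("")

print(sorted(tested))                       # ['CF']
print(modified == defined | undefined)      # True
```

For this row every modified flag is either defined or undefined. That is an observation about the sample, not a rule the reference states for all rows.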

Example: opcode extension

Some (not many) opcodes are tied to the opcode extension field in the ModR/M byte. In these cases the opcode is effectively extended by three bits. In most cases a different extension of the same opcode means a more or less different instruction. An example is opcode F6, of which we show the last three opcode extensions:

|pf|0F|po|so|flds|o|proc|st|m|rl|l|mnemonic|op1|op2|op3 |op4 |iext|grp1|grp2 |tested f|modif f |def f   |undef f |f values|description, notes|
----------------------------------------------------------------------------------------------------------------------------------------------
|  |  |F6|  |    |5|    |  | |  | |IMUL    |AX |AL |r/m8|    |    |gen |arith|        |o..szapc|o......c|...szap.|        |Signed Multiply   |
----------------------------------------------------------------------------------------------------------------------------------------------
|  |  |F6|  |    |6|    |  | |  | |DIV     |AL |AH |AX  |r/m8|    |gen |arith|        |o..szapc|        |o..szapc|        |Unsigned Divide   |
----------------------------------------------------------------------------------------------------------------------------------------------
|  |  |F6|  |    |7|    |  | |  | |IDIV    |AL |AH |AX  |r/m8|    |gen |arith|        |o..szapc|        |o..szapc|        |Signed Divide     |

The opcode extension can hold values from 0 through 7. These values are shown in the o (Register/Opcode Field) column. In this example we have therefore picked the values 5, 6 and 7.
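The opcode extension occupies the middle three bits of the ModR/M byte (the reg/opcode field). The sketch below splits a ModR/M byte and looks up the three F6 extensions from the sample; the function names are hypothetical, and extensions /0 through /4 are deliberately left out because the sample does not show them.

```python
# Only the three F6 extensions shown in the sample above.
F6_EXTENSIONS = {5: "IMUL", 6: "DIV", 7: "IDIV"}


def split_modrm(modrm):
    """Split a ModR/M byte into its (mod, reg/opcode, r/m) fields."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModR/M must be a single byte")
    return (modrm >> 6) & 0b11, (modrm >> 3) & 0b111, modrm & 0b111


def f6_mnemonic(modrm):
    """Return the mnemonic selected by the opcode extension, or None if not listed."""
    _, extension, _ = split_modrm(modrm)
    return F6_EXTENSIONS.get(extension)


print(split_modrm(0xF3))    # (3, 6, 3)
print(f6_mnemonic(0xF3))    # DIV
```

The byte sequence F6 F3 therefore carries extension 6, which selects DIV, with a register operand (mod = 3, r/m = 3).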

This sample also shows that operands which are not written explicitly (the operands AL, AH and AX) are set in italics. It also shows that the DIV and IDIV instructions always destroy the values of all status flags: both the modif f column and the undef f column contain these flags.

Example: one opcode, multiple syntaxes

Some opcodes are represented by several instructions with the same meaning but different syntax. (This does not concern the case where the opcode is tied to the opcode extension field in the ModR/M byte, which yields instructions with different meanings.) The best-known case is the conditional jumps, e.g. JZ/JE, where we see something like this:

|pf|0F|po|so|flds|o|proc|st|m|rl|l|mnemonic|op1     |op2|op3|op4|iext|grp1|grp2  |tested f|modif f|def f|undef f|f values|description, notes             |
|  |  |74|  |    | |    |  | |  | |JZ      |rel8    |   |   |   |    |gen |branch|....z...|       |     |       |        |Jump short if zero/equal (ZF=1)|
|  |  |  |  |    | |    |  | |  | |JE      |rel8    |   |   |   |    |    |      |        |       |     |       |        |                               |

Multiple syntaxes are indicated by additional rows in the mnemonic column and in the operand columns.

A more complicated example is the MOVS/MOVSW/MOVSD instruction:

|pf|0F|po|so|flds|o|proc|st|m|rl|l|mnemonic|op1   |op2   |op3|op4|iext|grp1|grp2   |tested f|modif f|def f|undef f|f values|description, notes             |
------------------------------------------------------------------------------------------------------------------------------------------------------------
|  |  |A5|  |    | |    |  | |  | |MOVS    |m16   |m16   |   |   |    |gen |datamov|.d......|       |     |       |        |Move Data from String to String|
|  |  |  |  |    | |    |  | |  | |MOVSW   |m16   |m16   |   |   |    |    |string |        |       |     |       |        |                               |
------------------------------------------------------------------------------------------------------------------------------------------------------------
|  |  |A5|  |    | |03+ |  | |  | |MOVS    |m16/32|m16/32|   |   |    |gen |datamov|.d......|       |     |       |        |Move Data from String to String|
|  |  |  |  |    | |    |  | |  | |MOVSD   |m32   |m32   |   |   |    |    |string |        |       |     |       |        |                               |

Here the notation is further complicated by the fact that, starting with the 80386 processor (thanks to the new 32-bit operands), the syntax was enriched with the MOVSD mnemonic and the syntax of MOVS changed. The four possible syntaxes therefore have to be split into two pairs.

Other examples with multiple syntaxes include PUSHA/PUSHAD, SHL/SAL and SLDT.

Example: the undocumented SETALC instruction

All the main editions also contain a few undocumented instructions (from the point of view of the Intel manuals). In this reference, undocumented does not equal invalid. All the undocumented instructions mentioned work correctly within their scope. One example is the SETALC instruction:

|pf|0F|po|so|flds|o|proc|st|m|rl|l|mnemonic|op1|op2|op3|op4|iext|grp1|grp2   |tested f|modif f |def f|undef f|f values|description, notes                           |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
|  |  |D6|  |    | |02+ |D⁵| |  | |undefined               |    |    |       |        |        |     |       |        |Undefined and Reserved; Does not Generate #UD|
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
|  |  |D6|  |    | |02+ |U⁶| |  | |SALC    |AL |   |   |   |    |gen |datamov|.......c|        |     |       |        |Set AL If Carry                              |
|  |  |  |  |    | |    |  | |  | |SETALC  |AL |   |   |   |    |    |       |        |        |     |       |        |                                             |

In this case the officially documented meaning is listed first, which the st column indicates with the value "D". Because this is not a common meaning, the column also carries a reference to the description of where this opcode is documented. The mnemonic column indicates with the value "undefined" (set in italics, which here always means that it is not an original mnemonic) that the documented meaning of this opcode is "undefined and reserved", as the last column also says.

Next comes the undocumented meaning of the opcode: the st column holds the value "U". Every undocumented meaning should carry a reference to the description of the source where it is unofficially documented (there are exceptions), as it does in this case.
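The table describes SALC only as "Set AL If Carry" with CF as the tested flag. As commonly described elsewhere, the instruction fills AL with ones when CF is set and clears it otherwise. The Python model below is a sketch of that behaviour under this assumption; the function name `salc` is used here purely for illustration.

```python
def salc(carry_flag):
    """Model of SALC: return the value left in AL (0xFF if CF is set, else 0x00)."""
    # Assumption: AL becomes all ones or all zeros; no flags are modified,
    # which matches the empty modif f column in the row above.
    return 0xFF if carry_flag else 0x00


print(hex(salc(True)))     # 0xff
print(hex(salc(False)))    # 0x0
```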

Other examples of undocumented instructions: INT1/ICEBP and TEST.

Description of the individual columns

Quick navigation:

pf Prefix
0F 0F Prefix
po Primary Opcode
so Secondary Opcode
flds Opcode Fields
o Register/Opcode Field
proc Introduced with Processor
st Documentation Status
m Mode of Operation
rl Ring Level
x Lock Prefix/FPU Push/FPU Pop
mnemonic Instruction Mnemonic
op1, op2, … Instruction Operands
iext Instruction Extension Group
grp1, grp2, grp3 Main Group, Sub-group, Sub-sub-group
tested f, modif f, def f, undef f Tested, Modified, Defined, and Undefined Flags
f values Flags Values
description, notes

Name	Meaning	Description, Examples
pf	Prefix	Fixed extraordinary prefix, which may change the semantics of the Primary Opcode. Usually used in case of waiting x87 FPU instructions, and many SSE instructions. `F390 PAUSE`, `9BD9/7 FSTCW`, `F30F10 MOVSS`
`0F`	`0F` Prefix	Dedicated for `0F` Prefix. `two-byte opcodes`
po	Primary Opcode	Basic opcode. Second opcode byte in case of two- and three-byte opcodes. For coder's editions: `+r` means a register code, from 0 through 7, added to the value. `50 PUSH`
so	Secondary Opcode	Fixed appended value to the primary opcode. It is used in some special cases, x87 FPU instructions and for new three-byte instructions. `D40A AAM`, `D50A AAD`, `D9E8 FLD1`, three-byte escape `0F38`
flds	Opcode Fields	This column is present only in geek's editions. It contains the binary fields present in the Primary Opcode. These are: `+r` means a register code, from 0 through 7, added to the basic value of the Primary Opcode. `40 INC` The following fields are case-sensitive: if a letter of the code is in lower case, the corresponding bit is cleared; otherwise it is set. `w` means bit `w` (bit index 0, operand size) is present; may be combined with bits `d` or `s`. `04 ADD` `s` means bit `s` (bit index 1, Sign-extend) is present; may be combined with bit `w`. `6B IMUL` `d` means bit `d` (bit index 1, Direction) is present; may be combined with bit `w`. `00 ADD` `tttn` means bit field `tttn` (4 bits, bit index 0, condition). Used only with conditional instructions. `70 JO` `sr` means segment register specifier - a code of one of the original four segment registers (2 bits, bit index 3). See also `S2` addressing method. `06 PUSH` `sre` means segment register specifier - a code of any segment register (3 bits, bit index 0 or 3). See also `S30` and `S33` addressing methods. `0FA0 PUSH` `mf` means bit field MF (2 bits, bit index 1, memory format); used only with x87 FPU instructions coded with second floating-point instruction format. `DA/0 FIADD`
o	Register/ Opcode Field	The value of the opcode extension (values from 0 through 7). `group 80` `r` indicates that the ModR/M byte contains a register operand and an r/m operand. `00 ADD`
proc	Introduced with Processor	Indicates the instruction's introductory processor: `00`: 8086 `01`: 80186 `02`: 80286 `03`: 80386 `04`: 80486 `P1`: Pentium (1) `PX`: Pentium with MMX `PP`: Pentium Pro `P2`: Pentium II `P3`: Pentium III `P4`: Pentium 4 `C1`: Core (1) `C2`: Core 2 `IT`: Itanium (only geek's editions) The opcodes that are not forward-compatible (the ones which have been abandoned) are present only in geek's editions. If the processor marking is a range (e.g., `03-04`), it means that the instruction is unsupported in later processors. `0F24 MOV` `+` (e.g., `00+`) means the instruction is supported in all later processors and also in 64-bit mode, if the next row doesn't explicitly say otherwise. `06 PUSH ES` `++` (e.g., `P4++`) has the same meaning, but only in the later steppings of the processor (e.g., SSE3 instruction extensions). `0FA2 CPUID` If this column is empty: In case of 32-bit editions, it means `00+` (8086 and all later processors). In case of 64-bit editions, it means `P4++` (P4, later stepping, and all later processors), because Intel 64 Architecture is available since a later stepping of the Pentium 4 processor.
st	Document. Status	Indicates how the instruction is documented in the Intel manuals: `D` means fully documented. If it may be unclear, it can contain a reference to a description of the chapter of the Intel manual in which it is documented. `D6` `M` means documented only marginally. `66 (SSE2)` `U` means not documented at all. It should contain a reference to a description of the source. Note that in this reference, undocumented doesn't equal invalid. All mentioned undocumented instructions should work well in their scope. `D6 SALC` If this column is empty, it means `D` (documented with no further notes).
m	Mode of Operation	Indicates the mode in which the instruction is valid. Virtual-8086 Mode is not taken into account. `R` applies to real, protected and 64-bit mode. SMM is not taken into account. `P` applies to protected and 64-bit mode. SMM is not taken into account. `group 0F00` `E` applies to 64-bit mode. SMM is not taken into account. `63 MOVSXD` `S` applies to SMM. `0FAA RSM` If this column is empty, it means `R`. For 64-bit editions, the `E` code indicates in most cases that the semantics of the opcode is specific to 64-bit mode.
rl	Ring Level	The ring level from which the instruction is valid (3 or 0); `f` indicates that the level depends on one or more flags, and it should contain a reference to the description of that flag, if the flag is not too complex. If this column is empty, it means ring 3. `INT`, `INS`, `RDTSC`
x	Lock Prefix	`L` indicates that the instruction is basically valid with `F0 LOCK` prefix. `00 ADD`
x	FPU Push/ FPU Pop	The following codes apply only to x87 FPU instructions (none of them can use the `LOCK` prefix). `s` indicates that the opcode performs an additional push of a value to the register stack. `D9 /0 FLD` `p` indicates that the opcode performs an additional pop of the register stack. `D9 /3 FSTP` `P` indicates the same as `p`, but pops twice. `DA /5 FUCOMPP`
mnemonic	Instr. Mnemonic	The instruction mnemonic itself. If there is no mnemonic, it holds additional information about the mnemonic or instruction: If the mnemonic is set in italics, there is no official mnemonic and the present one is only a suggested one. `D4 AMX`, `D5 ADX`, `0FB9 UD` no mnemonic means that there is no mnemonic for the opcode. `66` invalid means that the opcode is invalid. This option is not used everywhere the opcode is invalid, but only in some cases. `06 (64-bit mode)` undefined means that the behaviour of the instruction is undefined according to the official documentation. `D6` nop means that the opcode is treated as an integer `NOP` instruction. It should contain a reference to a description of the source. `no mnemonic nop` null means that the prefix has no meaning (no operation). `26 (64-bit mode)` If there is a mnemonic, it can hold additional attributes of the instruction: nop means that the instruction is treated as an integer `NOP` instruction (except `NOP` instructions themselves). It should contain a reference to a description of the source. `DBE0 FNENI`
mnemonic	Instr. Mnemonic	Only geek's editions: alias means that the opcode is an alias to another opcode. The attribute should be a reference to that instruction. `group 82`, `C0 /6 SAL` part alias means not a true alias. It should contain a reference to the description of the differences between the referenced instructions. `F1 INT1`
op1, op2, ...	Instr. Operands	Instruction operands. Geek's editions use special operand codes, explained in the Instruction Operand Codes chapter below. If an operand is set in italics, it is an implicit operand, which is not written explicitly. If an operand is set in boldface, it is modified by the instruction.
iext	Instr. Extension Group	The instruction extension group with which the opcode was released: `MMX` MMX Technology `SSE1` Streaming SIMD Extensions (1) `SSE2` Streaming SIMD Extensions 2 `SSE3` Streaming SIMD Extensions 3 `SSSE3` Supplemental Streaming SIMD Extensions 3
grp1, grp2, grp3	Main Group, Sub-group, Sub-sub-group	These columns are present only in geek's editions. They classify the instruction into groups. These groups don't match the instruction groups given by the Intel manual (I found them too loose). One instruction may fit into more than one group. prefix: segreg segment register branch cond conditional x87fpu control (only `WAIT`); obsol obsolete: control; gen general: datamov data movement stack (implies `rSP` destination operand and `SS:[rSP]` source or destination operand) conver type conversion arith arithmetic binary decimal logical shftrot shift&rotate bit bit manipulation branch cond conditional break interrupt string inout I/O flgctrl flag control segreg segment register manipulation control; system: branch trans transitional (implies sensitivity to operand-size attribute) stack (implies `rSP` destination operand and `SS:[rSP]` source or destination operand); x87fpu x87 FPU: datamov data movement arith basic arithmetic compar comparison trans transcendental ldconst load constant control conv conversion; sm x87 FPU and SIMD state management; `MMX` instruction extensions technology groups: datamov data movement arith packed arithmetic compar comparison conver conversion logical shift unpack unpacking; `SSE1` instruction extensions groups: simdfp SIMD single-precision floating-point datamov data movement arith packed arithmetic compar comparison logical shunpck shuffle&unpacking conver conversion instructions simdint 64-bit SIMD integer word WORD operation mxcsrsm `MXCSR` state management cachect cacheability control fetch prefetch order instruction ordering; `SSE2` instruction extensions groups: pcksclr packed and scalar double-precision floating-point datamov data movement conver conversion arith packed arithmetic compar comparison logical shunpck shuffle&unpacking pcksp packed single-precision floating-point simdint 128-bit SIMD integer datamov data movement arith packed arithmetic shunpck shuffle&unpacking shift cachect cacheability control order instruction ordering; `SSE3` instruction extensions groups: simdfp SIMD single-precision floating-point (SIMD packed) datamov data movement arith packed arithmetic cachect cacheability control sync agent synchronization; `SSSE3` instruction extensions group: simdint SIMD integer
tested f, modif f, def f, undef f	Tested, Modified, Defined, and Undefined Flags	For `rFlags` register, indicates these flags using odiszapc pattern. Present flag fits in with the appropriate group. `group C0` For x87 FPU flags, indicates these flags using 1234 x87 FPU flag pattern. Present flag fits in with the appropriate group. `DB/7 FSTP` Note that if a flag is present in both Defined and Undefined column, the flag fits in under further conditions, which are not described by this reference.
f values	Flags Values	For `rFlags` register, indicates the values of flags, which are always set or cleared, using case-sensitive odiszapc flag pattern. Lower-case flag means cleared flag, upper-case means set flag. `STC` For x87 FPU flags, indicates these flags using 1234 x87 FPU flag pattern. Present flag holds its value. `DBE3 FNINIT`
description, notes		Short description of the opcode. For now, the descriptions are very general. They may be improved in the future.
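The case-sensitive flds notation for the `w`, `d` and `s` bits, and the `tttn` condition field, can be modelled with a few bit operations. In the sketch below the function names are hypothetical, the caller has to say which fields an opcode actually carries (that information comes from the table, not from the byte), and the order in which the letters are joined is an assumption.

```python
def field_letter(name, bit_set):
    """Upper case when the bit is set, lower case when it is cleared."""
    return name.upper() if bit_set else name.lower()


def dw_fields(opcode, second_bit=None):
    """Render the w bit (index 0) and, optionally, the d or s bit (index 1).

    second_bit is 'd', 's', or None when the opcode has only the w bit.
    """
    text = ""
    if second_bit is not None:
        text += field_letter(second_bit, opcode & 0b10)
    return text + field_letter("w", opcode & 0b01)


def condition_code(opcode):
    """Return the 4-bit tttn field (bit index 0) of a conditional opcode."""
    return opcode & 0b1111


print(dw_fields(0x00, "d"))    # dw
print(dw_fields(0x6B, "s"))    # SW
print(condition_code(0x74))    # 4
```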

Instruction Operand Codes

These codes are based on the official codes used in the Intel manual Instruction Set Reference, N-Z for the Pentium 4 processor, revision 17. The reason for using this older revision is that its codes are the most descriptive. Unfortunately, the codes were changed in later revisions of the manual. For the purposes of this reference the set of codes was further modified and extended, mainly so that operands could also be encoded for 64-bit mode. Ideally it would be best to create entirely new codes, but I am afraid they would hardly gain wide acceptance.

Whether a code is original, added, or modified is described by the State column below.

The codes labelled "Geek" in these tables are the ones used in the HTML geek editions and also in the source XML document. The codes labelled "Coder" are the alternative codes used in the HTML coder editions; they are also used in the Intel manual when describing instructions.

Addressing method codes

The following abbreviations are used for addressing methods:

Geek	State	Description
Coder	State	Description
`A`	Original	Direct address. The instruction has no ModR/M byte; the address of the operand is encoded in the instruction; no base register, index register, or scaling factor can be applied (for example, far `JMP` (`EA`)).
`ptr`	Original
`BA`	Added	Memory addressed by `[rAX]` (only `0F01C8 MONITOR`).
`m`	Added	Memory addressed by `[rAX]` (only `0F01C8 MONITOR`).
`BB`	Added	Memory addressed by `DS:[eBX+AL]`, or by `[rBX+AL]` in 64-bit mode, where `RBX` is promoted by `REX.W` (only `XLAT`). (This code changed from single `B` in revision 1.00)
`m`	Added
`C`	Original	The reg field of the ModR/M byte selects a control register (only `MOV` (`0F20`, `0F22`)).
`CRn`	Original
`D`	Original	The reg field of the ModR/M byte selects a debug register (only `MOV` (`0F21`, `0F23`)).
`DRn`	Original
`E`	Original	A ModR/M byte follows the opcode and specifies the operand. The operand is either a general-purpose register or a memory address. If it is a memory address, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, or a displacement.
`r/m`	Original
`ES`	Added	(Implies original `E`). A ModR/M byte follows the opcode and specifies the operand. The operand is either an x87 FPU stack register or a memory address. If it is a memory address, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, or a displacement.
`STi/m`	Added
`EST`	Added	(Implies original `E`). A ModR/M byte follows the opcode and specifies the x87 FPU stack register.
`STi`	Added
`F`	Original	rFLAGS register.
-	Original	rFLAGS register.
`G`	Original	The reg field of the ModR/M byte selects a general register (for example, `AX` (`000`)).
`r`	Original
`H`	Added	The r/m field of the ModR/M byte always selects a general register, regardless of the mod field (for example, `MOV` (`0F20`)).
r	Added
`I`	Original	Immediate data. The operand value is encoded in subsequent bytes of the instruction.
`imm`	Original
`J`	Original	The instruction contains a relative offset to be added to the instruction pointer register (for example, `JMP` (`E9`), `LOOP`).
`rel`	Original
`M`	Original	The ModR/M byte may refer only to memory: mod != 11bin (`BOUND`, `LEA`, `CALLF`, `JMPF`, `LES`, `LDS`, `LSS`, `LFS`, `LGS`, `CMPXCHG8B`, `CMPXCHG16B`, `F20FF0 LDDQU`).
`m`	Original
`N`	Original	The R/M field of the ModR/M byte selects a packed quadword MMX technology register.
`mm`	Original
`O`	Original	The instruction has no ModR/M byte; the offset of the operand is coded as a word, double word or quad word (depending on address size attribute) in the instruction. No base register, index register, or scaling factor can be applied (only `MOV` (`A0`, `A1`, `A2`, `A3`)).
`moffs`	Original
`P`	Original	The reg field of the ModR/M byte selects a packed quadword MMX technology register.
`mm`	Original
`Q`	Original	A ModR/M byte follows the opcode and specifies the operand. The operand is either an MMX technology register or a memory address. If it is a memory address, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, and a displacement.
`mm/m64`	Original
`R`	Original	The mod field of the ModR/M byte may refer only to a general register (only `MOV` (`0F20`-`0F24`, `0F26`)).
`r`	Original
`S`	Original	The reg field of the ModR/M byte selects a segment register (only `MOV` (`8C`, `8E`)).
`Sreg`	Original
`T`	Original	The reg field of the ModR/M byte selects a test register (only `MOV` (`0F24`, `0F26`)).
`TRn`	Original
`U`	Original	The R/M field of the ModR/M byte selects a 128-bit XMM register.
`xmm`	Original
`V`	Original	The reg field of the ModR/M byte selects a 128-bit XMM register.
`xmm`	Original
`W`	Original	A ModR/M byte follows the opcode and specifies the operand. The operand is either a 128-bit XMM register or a memory address. If it is a memory address, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, and a displacement
`xmm/m`	Original
`X`	Original	Memory addressed by the `DS:eSI` or by `RSI` (only `MOVS`, `CMPS`, `OUTS`, and `LODS`). In 64-bit mode, only 64-bit (`RSI`) and 32-bit (`ESI`) address sizes are supported. In non-64-bit mode, only 32-bit (`ESI`) and 16-bit (`SI`) address sizes are supported.
`m`	Original
`Y`	Original	Memory addressed by the `ES:eDI` or by `RDI` (only `MOVS`, `CMPS`, `INS`, `STOS`, and `SCAS`). In 64-bit mode, only 64-bit (`RDI`) and 32-bit (`EDI`) address sizes are supported. In non-64-bit mode, only 32-bit (`EDI`) and 16-bit (`DI`) address sizes are supported.
`m`	Original
`YD`	Added	Memory addressed by the `DS:eDI` or by `RDI` (only `0FF7 MASKMOVQ` and `660FF7 MASKMOVDQU`).
`m`	Added
`Z`	Added	The instruction has no ModR/M byte; the three least-significant bits of the opcode byte select a general-purpose register.
`r`	Added
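The `Z` method (written `+r` in the coder editions) packs the register code into the low three bits of the opcode byte. The sketch below decodes the `50 PUSH` family this way; the function names are hypothetical, and the register names are the 32-bit ones, with no REX prefix taken into account.

```python
# Register codes 0..7 for 32-bit operand size, no REX prefix.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_plus_r(opcode):
    """Split an opcode byte into (base opcode, register code)."""
    return opcode & 0xF8, opcode & 0x07


def decode_push(opcode):
    """Decode one of the eight 50+r PUSH opcodes; return None for anything else."""
    base, reg = split_plus_r(opcode)
    if base != 0x50:
        return None
    return "PUSH " + REG32[reg]


print(decode_push(0x53))    # PUSH EBX
print(decode_push(0x58))    # None
```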

The following abbreviations are used for addressing methods only in the case of direct segment registers and are accessible only in HTML geek's editions as the segment register's title. As for the source XML document, they are used within the address attribute of syntax/dst or syntax/src elements. All of them are added:

`S2`	The two bits at bit index three of the opcode byte select one of the original four segment registers (for example, `PUSH ES`).
`S30`	The three least-significant bits of the opcode byte select segment register `SS`, `FS`, or `GS` (for example, `LSS`).
`S33`	The three bits at bit index three of the opcode byte select segment register `FS` or `GS` (for example, `PUSH FS`).
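The three segment-register methods differ only in where the register code sits inside the opcode byte. A minimal Python sketch, assuming the usual segment register numbering (ES = 0, CS = 1, SS = 2, DS = 3, FS = 4, GS = 5) and assuming that for two-byte opcodes the "opcode byte" is the byte after `0F`; the function names are invented for this example.

```python
SEGMENT_REGISTERS = ["ES", "CS", "SS", "DS", "FS", "GS"]


def _name(code):
    """Map a segment register code to its name; codes 6 and 7 have none."""
    return SEGMENT_REGISTERS[code] if code < len(SEGMENT_REGISTERS) else None


def seg_s2(opcode_byte):
    """S2: two bits at bit index 3."""
    return _name((opcode_byte >> 3) & 0b11)


def seg_s30(opcode_byte):
    """S30: three least-significant bits."""
    return _name(opcode_byte & 0b111)


def seg_s33(opcode_byte):
    """S33: three bits at bit index 3."""
    return _name((opcode_byte >> 3) & 0b111)


print(seg_s2(0x06))     # ES  (06 PUSH ES)
print(seg_s30(0xB2))    # SS  (0FB2 LSS)
print(seg_s33(0xA0))    # FS  (0FA0 PUSH FS)
```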

Operand type codes

The following abbreviations are used for operand types:

Geek	State	Description
Coder	State	Description
`a`	Original	Two one-word operands in memory or two double-word operands in memory, depending on operand-size attribute (only `BOUND`).
`16/32&16/32`	Original
`b`	Original	Byte, regardless of operand-size attribute.
`8`	Original	Byte, regardless of operand-size attribute.
`bcd`	Added	Packed-BCD. Only x87 FPU instructions (for example, `FBLD`).
`80dec`	Added
`bs`	Added; simplified `bsq`	Byte, sign-extended to the size of the destination operand.
`8`	Added; simplified `bsq`	Byte, sign-extended to the size of the destination operand.
`bsq`	Original; replaced by `bs`	(Byte, sign-extended to 64 bits.)
-	Original; replaced by `bs`	(Byte, sign-extended to 64 bits.)
`bss`	Original	Byte, sign-extended to the size of the stack pointer (for example, `PUSH` (`6A`)).
`8`	Original
`c`	Original	Byte or word, depending on operand-size attribute. (unused even by Intel?)
?	Original
`d`	Original	Doubleword, regardless of operand-size attribute.
`32`	Original	Doubleword, regardless of operand-size attribute.
`di`	Added	Doubleword-integer. Only x87 FPU instructions (for example, `FIADD`).
`32int`	Added
`dq`	Original	Double-quadword, regardless of operand-size attribute (for example, `CMPXCHG16B`).
`128`	Original
`dqp`	Added; combines `d` and `qp`	Doubleword, or quadword, promoted by `REX.W` in 64-bit mode (for example, `MOVSXD`, `MOV` (`0F21`, `0F23`)).
`32/64`	Added; combines `d` and `qp`
`dr`	Added	Double-real. Only x87 FPU instructions (for example, `FADD`).
`64real`	Added
`ds`	Original	Doubleword, sign-extended to 64 bits (for example, `CALL` (`E8`), `MOVSXD`).
`32`	Original
`e`	Added	x87 FPU environment (for example, `FSTENV`).
`14/28`	Added	x87 FPU environment (for example, `FSTENV`).
`er`	Added	Extended-real. Only x87 FPU instructions (for example, `FLD`).
`80real`	Added
`p`	Original	32-bit or 48-bit pointer, depending on operand-size attribute (for example, `CALLF` (`9A`)).
`16:16/32`	Original
`pi`	Original	Quadword MMX technology data.
(`64`)	Original	Quadword MMX technology data.
`pd`	Original	128-bit packed double-precision floating-point data.
	Original	128-bit packed double-precision floating-point data.
`ps`	Original	128-bit packed single-precision floating-point data.
(`128`)	Original	128-bit packed single-precision floating-point data.
`psq`	Added	64-bit packed single-precision floating-point data.
`64`	Added	64-bit packed single-precision floating-point data.
`pt`	Original; replaced by `ptp`	(80-bit far pointer.)
-	Original; replaced by `ptp`	(80-bit far pointer.)
`ptp`	Added	32-bit or 48-bit pointer, depending on operand-size attribute, or 80-bit far pointer, promoted by `REX.W` in 64-bit mode (for example, `CALLF` (`FF /3`)).
`16:16/32/64`	Added
`q`	Original	Quadword, regardless of operand-size attribute (for example, `CALL` (`FF /2`)).
`64`	Original
`qi`	Added	Qword-integer. Only x87 FPU instructions (for example, `FILD`).
`64int`	Added
`qp`	Original	Quadword, promoted by `REX.W` (for example, `IRETQ`).
`64`	Original	Quadword, promoted by `REX.W` (for example, `IRETQ`).
`s`	Changed to	6-byte pseudo-descriptor, or 10-byte pseudo-descriptor in 64-bit mode (for example, `SGDT`).
-	Changed from	6-byte pseudo-descriptor.
`sd`	Original	Scalar element of 128-bit packed double-precision floating-point data.
-	Original
`si`	Original	Doubleword integer register (e.g., `eax`). (unused even by Intel?)
?	Original
`sr`	Added	Single-real. Only x87 FPU instructions (for example, `FADD`).
`32real`	Added
`ss`	Original	Scalar element of 128-bit packed single-precision floating-point data.
-	Original
`st`	Added	x87 FPU state (for example, `FSAVE`).
`94/108`	Added	x87 FPU state (for example, `FSAVE`).
`stx`	Added	x87 FPU and SIMD state (only `FXSAVE` and `FXRSTOR`).
`512`	Added	x87 FPU and SIMD state (only `FXSAVE` and `FXRSTOR`).
`t`	Original; replaced by `ptp`	10-byte far pointer.
-	Original; replaced by `ptp`	10-byte far pointer.
`v`	Original	Word or doubleword, depending on operand-size attribute (for example, `INC` (`40`), `PUSH` (`50`)).
`16/32`	Original
`vds`	Added; combines `v` and `ds`	Word or doubleword, depending on operand-size attribute, or doubleword, sign-extended to 64 bits for 64-bit operand size.
`16/32`	Added; combines `v` and `ds`
`vq`	Original	Quadword (default) or word if operand-size prefix is used (for example, `PUSH` (`50`)).
`64/16`	Original
`vqp`	Added; combines `v` and `qp`	Word or doubleword, depending on operand-size attribute, or quadword, promoted by `REX.W` in 64-bit mode.
`16/32/64`	Added; combines `v` and `qp`
`vs`	Original	Word or doubleword sign extended to the size of the stack pointer (for example, `PUSH` (`68`)).
`16/32`	Original
`w`	Original	Word, regardless of operand-size attribute (for example, `ENTER`).
`16`	Original
`wi`	Added	Word-integer. Only x87 FPU instructions (for example, `FIADD`).
`16int`	Added
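
As a rough illustration of how a few of the geek operand type codes map to concrete sizes, here is a minimal sketch. It covers only a subset of the codes, and it assumes the caller supplies the effective operand-size attribute (16 or 32) as it stands before `REX.W` is applied; the function name `operand_bits` is hypothetical.

```python
# Codes whose size does not depend on the operand-size attribute.
FIXED_BITS = {"b": 8, "w": 16, "d": 32, "q": 64, "dq": 128}


def operand_bits(type_code: str, opsize: int, rex_w: bool = False) -> int:
    """Resolve an operand type code to a size in bits (subset of codes only)."""
    if opsize not in (16, 32):
        raise ValueError("opsize must be 16 or 32")
    if type_code in FIXED_BITS:
        return FIXED_BITS[type_code]
    if type_code == "v":    # word or doubleword
        return opsize
    if type_code == "vqp":  # v, or quadword promoted by REX.W
        return 64 if rex_w else opsize
    if type_code == "dqp":  # doubleword, or quadword promoted by REX.W
        return 64 if rex_w else 32
    if type_code == "vq":   # quadword by default, word with operand-size prefix
        return 16 if opsize == 16 else 64
    raise ValueError(f"type code not covered by this sketch: {type_code}")


print(operand_bits("vqp", 32, rex_w=True))  # 64
print(operand_bits("vq", 16))               # 16
```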

The following abbreviations are used for operand types and are accessible only in the HTML geek editions as the operand's code title. They were introduced to indicate a dependency on the address-size attribute instead of the operand-size attribute. In the source XML document, they are used within the address attribute of syntax/dst or syntax/src elements. All of them are added:

`va`	Word or doubleword, according to address-size attribute (only `REP` and `LOOP` families).
`dqa`	Doubleword or quadword, according to address-size attribute (only `REP` and `LOOP` families).
`wa`	Word, according to address-size attribute (only `JCXZ` instruction).
`da`	Doubleword, according to address-size attribute (only `JECXZ` instruction).
`qa`	Quadword, according to address-size attribute (only `JRCXZ` instruction).
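
The address-size-dependent codes can be read as constraints on which count register an instruction may use. The sketch below encodes the table above; the names `ALLOWED_BITS`, `COUNTER`, and `counter_register` are hypothetical.

```python
# Address sizes (in bits) permitted by each address-size-dependent type code.
ALLOWED_BITS = {
    "va": (16, 32),
    "dqa": (32, 64),
    "wa": (16,),
    "da": (32,),
    "qa": (64,),
}

# Count register used by the REP, LOOP, and JCXZ families for each address size.
COUNTER = {16: "CX", 32: "ECX", 64: "RCX"}


def counter_register(type_code: str, address_bits: int) -> str:
    """Return the count register, or raise if the size is invalid for the code."""
    if address_bits not in ALLOWED_BITS[type_code]:
        raise ValueError(f"{type_code} does not allow {address_bits}-bit addressing")
    return COUNTER[address_bits]


print(counter_register("va", 16))   # CX
print(counter_register("dqa", 64))  # RCX
```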

Source XML Document

A description of the structure of the source XML reference is available only as part of the benefits. Some of it can also be gleaned from the comments in the DTD.

Current Status

In this version, the reference is almost complete. It contains general, system, x87 FPU, MMX, SSE, SSE1, SSE2, SSE3, and SSSE3 instructions (both one-byte and two-byte). The remaining instructions should be added gradually.

The current version is in beta, which means that backward-incompatible changes to the XML structure may still occur.

Future Plans

Many new editions are planned for the future, for example editions containing only instructions from a particular group or instruction extension.

Why Contribute - Benefits

Contributors can gain access to benefits that are not available to passive users. The benefits include HTML editions that support printing, PDF editions, many XSL transformations, working drafts, articles, and other related files.

Why aren't all these benefits freely available?

A handful of contributors worked hard to help me complete this reference, so I have given them the benefits in this way. I think these benefits may encourage you to find your own way to become an active contributor.

How to Contribute

The following list shows specific ways to contribute:

Write an article or, if you have a blog, write a post about your experience with this reference. You can also write your article for x86asm.net
Because the source of this reference is an XML file, one could devise a dynamic application written in PHP that allows some interaction with the user, for instance by offering only the information the user selects

Note: From the perspective of project development, modifications to any of the HTML editions are nearly useless. An HTML edition is merely the result of transforming the source XML file, so all changes need to be made there.
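
To give a flavour of the kind of data extraction an XML source allows, here is a minimal Python sketch. The `SAMPLE` fragment is invented for illustration and is NOT the real structure of x86reference.xml: the element names `entry`, `syntax`, `dst`, `src`, `a`, and `t` are mentioned on this page, but their nesting, the `mnem` element, and the `entries` root are assumptions. Consult the DTD for the actual layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real document structure may differ.
SAMPLE = """
<entries>
  <entry>
    <syntax><mnem>PUSH</mnem><src><a>Z</a><t>vq</t></src></syntax>
  </entry>
  <entry>
    <syntax><mnem>MOVS</mnem><dst><a>Y</a><t>b</t></dst><src><a>X</a><t>b</t></src></syntax>
  </entry>
</entries>
"""


def operands_by_mnemonic(xml_text: str) -> dict:
    """Map each mnemonic to a list of (role, addressing code, type code)."""
    root = ET.fromstring(xml_text)
    result = {}
    for syntax in root.iter("syntax"):
        mnem = syntax.findtext("mnem")
        operands = [
            (op.tag, op.findtext("a"), op.findtext("t"))
            for op in syntax
            if op.tag in ("dst", "src")
        ]
        result.setdefault(mnem, []).extend(operands)
    return result


print(operands_by_mnemonic(SAMPLE)["MOVS"])
```

The same approach could feed a filter that shows only the instructions using a chosen addressing or type code.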

License

The license is not here to restrict the use of the reference. I only want to keep control over its development.

If you make improvements to the reference, whether to its source files (XML, DTD, XSL transformations) or to any derived file in any format, send these files to the author. The author reserves the right to use these files for any purpose.
Publishing the source files or any derived files in any format is possible only with the author's consent and under these conditions:
1. State the author's name
2. Include this hyperlink to the source: ref.x86asm.net
3. Include these license terms
You may not sell printed copies of any of the files (original or derived) of this reference, not even as part of another project.

Sources

The reference was compiled using the following sources:

Intel manuals

Sandpile.org

AMD manuals

Acknowledgements

Thanks to everyone who has visibly left a mark on the development:

Christian Ludloff: maintainer of the great Sandpile.org site, one of the important sources for this project

Martin Mocko a.k.a. vid: many design ideas for HTML editions

Anthony Lopes: great XML and XSL contributions

Aquila: many great contributions

EliCZ: bug reports, design ideas

Cephexin: many great contributions to XML

Miloslav Ponkrác: helped with PHP and JavaScripts on this site

William Whistler: valuable reviews and bug reports

Download

Here are all the main files of the reference in one place. Many more files are available only as part of the benefits.

x86reference.xml 355 kB

x86reference.dtd 18 kB

HTML edition files

coder.html	407 kB	coder-abc.html	377 kB
coder32.html	364 kB	coder32-abc.html	342 kB
coder64.html	362 kB	coder64-abc.html	334 kB
geek.html	478 kB	geek-abc.html	443 kB
geek32.html	426 kB	geek32-abc.html	402 kB
geek64.html	415 kB	geek64-abc.html	383 kB

Comments

Continue in the discussion forum.

My contact information can be found here.

Revisions

2008-12-17	1.01β	I forgot to upload the XML reference for the previous revision; it is included in this one. Bugfixes: `CALLF` (`FF /3`) and `JMPF` (`FF /5`): only a memory operand is allowed (reported by Fabio Fernandes); `PSRLD` (`0F72 /2`): typo in mnemonic (reported by Japheth); `PMADDUBSW` (`[66]0F3804`) description fixed. The following bugfixes affect the geek suite: Opcodes `FF /2`, `FF /3`, `FF /4`, `FF /5`, `FF /6` had an unfounded `W` opcode field (reported by William Whistler). The following changes and bugfixes affect mostly only the XML reference and DTD: Backward-incompatible change: Operand type `vaqp` removed, was wrong; Backward-incompatible change: New operand type `dqa` issued to replace removed `vaqp` for `REP` family operands and `LOOP` family operands in 64-bit mode; Backward-incompatible change: Decided not to indicate sign extension on `MOVSXD` operand; New attribute `escape` for `sec_opcd` element to indicate three-byte escapes `0F38XX` and `0F3AXX`; Removed all entities from DTD to make it ready to convert to XSD (suggested by Herbert Oppmann); Bugfix: all `@op_size` attributes removed from opcodes `FF /2`, `FF /3`, `FF /4`, `FF /5`, `FF /6` (reported by William Whistler); Bugfix: No (implicit) rFlags operand was declared correctly (reported by William Whistler); New implicit addressing method `F` for rFlags operand defined in DTD; Bugfix: Many `entry/@mod` and `syntax/@mod` attributes changed and fixed	MazeGen
2008-10-19	1.00β	News: All SSE, SSE2, SSE3, and SSSE3 instructions added (Aquila and Cephexin contributions); Alphabetically sorted editions (postfixed with -abc); On-line store improved, prices discounted; The HTML transformation process is not documented now. Bugfixes: `FDIVRP ST1, ST` secondary opcode was missing, it should be `F1`; `PAUSE` instruction came with SSE2; `PUSH FF/6`, `FCOMI`, `FCOMIP`, `FISTTP`, `FNSAVE`, `FSAVE` and `TAKEN` prefix description fixed. The following changes and bugfixes affect mostly only the XML reference: `CALL FF/2`, `CALLF FF/3`, `JMP FF/4`, `JMPF FF/5`, `PUSH FF/6`: the operand must be `src` instead of `dst`; Opcode `D9/3`, `doc_part_alias_ref` attribute fixed; All MMX instructions' operand codes fixed using `a` and `t` elements; Backward-incompatible change of `B` addressing code to `BB`; The `gen_notes` and `ring_notes` nodes are no longer present in the XML; All `id` attributes renamed to `xml:id`; New `sup` and `sub` child elements for `notes` node; New addressing code `BA`; New `particular` attribute for `entry` node	MazeGen
2008-05-15	0.40β	News: All MMX instructions added (Anthony Lopes contribution); HTML transformation process has changed; Support for printing from the public files is no longer available (i.e., PDF editions are no longer publicly available as well). Bugfixes: `CLTS` (0F06): valid only at ring 0; valid also in real mode (reported by Anthony Lopes, EliCZ); `STD` (FD): typo in mnemonic (reported by EliCZ, andrewl); `WRMSR` (0F30): confusing and unnecessary 64-bit operands (reported by EliCZ); `RDTSC` (0F31), `RDPMC` (0F33): unnecessary 64-bit entry (reported by EliCZ); `LAR` (0F02), `LSL` (0F03): valid only in protected mode (reported by EliCZ); `HLT` (F4), `SYSRET` (0F07), `SWAPGS` (0F01 /7): valid only at ring 0 (reported by EliCZ). The following changes and bugfixes affect mostly only the XML reference: `MOV` (A2, A3): `dst` must be `depend='no'`, `src` must not; `CMPS` (A6, A7): first `src` must not be `depend='no'`; `SCAS` (AE): no `dst` operand, both operands are `src`; `SCAS` (AF): first `src` operand must not be `depend='no'`; Attribute `proc_start/@post="no"` duplicated using `proc_end` element; Operand address and type codes split into `a` and `t` subelements (DTD changed along)	MazeGen
2008-03-11	0.30β	All x87 FPU instructions added, including the new ones; Column l renamed to x and its use extended; In the HTML and PDF editions, prefix values were moved to the pf column; The project was renamed to X86 Opcode and Instruction Reference	MazeGen
2007-11-29	0.21β	The HTML table is split into two parts, for one-byte and two-byte opcodes. This should help browsers render the reference faster and more easily. I hope it also helps Firefox render everything on the first attempt (without having to reload the page); Instructions that do not actually test all flags but only push them onto the stack (`PUSHF`, `INT`, and a few others) are fixed (suggested by Wolfgang Kern); A PDF edition for each HTML edition	MazeGen
2007-11-06	0.20β	Added the coder, coder32, coder64, geek32, and geek64 editions. All main project files updated. Project documentation extended.	MazeGen
2007-06-04	0.10β	First public release	MazeGen

(date formats conform to ISO 8601)