//===-- X86Disassembler.h - Disassembler for x86 and x86_64 -----*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// The X86 disassembler is a table-driven disassembler for the 16-, 32-, and
// 64-bit X86 instruction sets.  The main decode sequence for an assembly
// instruction in this disassembler is:
//
// 1. Read the prefix bytes and determine the attributes of the instruction.
//    These attributes, recorded in enum attributeBits
//    (X86DisassemblerDecoderCommon.h), form a bitmask.  The table CONTEXTS_SYM
//    provides a mapping from bitmasks to contexts, which are represented by
//    enum InstructionContext (ibid.).
//
// 2. Read the opcode, and determine what kind of opcode it is.  The
//    disassembler distinguishes four kinds of opcodes, which are enumerated in
//    OpcodeType (X86DisassemblerDecoderCommon.h): one-byte (0xnn), two-byte
//    (0x0f 0xnn), three-byte-38 (0x0f 0x38 0xnn), or three-byte-3a
//    (0x0f 0x3a 0xnn).  Mandatory prefixes are treated as part of the context.
//
// 3. Depending on the opcode type, look in one of four ClassDecision structures
//    (X86DisassemblerDecoderCommon.h).  Use the opcode class to determine which
//    OpcodeDecision (ibid.) to look the opcode up in.  Look up the opcode to
//    get a ModRMDecision (ibid.).
//
// 4. Some instructions, such as escape opcodes or extended opcodes, or even
//    instructions that have ModRM*Reg / ModRM*Mem forms in LLVM, need the
//    ModR/M byte to complete decoding.  The ModRMDecision's type is an entry
//    from ModRMDecisionType (X86DisassemblerDecoderCommon.h) that indicates
//    whether the ModR/M byte is required and how to interpret it.
//
// 5. After resolving the ModRMDecision, the disassembler has a unique ID
//    of type InstrUID (X86DisassemblerDecoderCommon.h).  Looking this ID up in
//    INSTRUCTIONS_SYM yields the name of the instruction and the encodings and
//    meanings of its operands.
//
// 6. For each operand, its encoding is an entry from OperandEncoding
//    (X86DisassemblerDecoderCommon.h) and its type is an entry from
//    OperandType (ibid.).  The encoding indicates how to read it from the
//    instruction; the type indicates how to interpret the value once it has
//    been read.  For example, a register operand could be stored in the R/M
//    field of the ModR/M byte, the REG field of the ModR/M byte, or added to
//    the main opcode.  This is orthogonal to its meaning (a GPR or an XMM
//    register, for instance).  Given this information, the operands can be
//    extracted and interpreted.
//
// 7. As the last step, the disassembler translates the instruction information
//    and operands into a format understandable by the client; in this case, an
//    MCInst for use by the MC infrastructure.
//
// The disassembler is divided broadly into two parts: the table emitter that
// emits the instruction decode tables discussed above during compilation, and
// the disassembler itself.  The table emitter is documented in more detail in
// utils/TableGen/X86DisassemblerEmitter.h.
//
// X86Disassembler.h contains the public interface for the disassembler,
//   adhering to the MCDisassembler interface.
// X86Disassembler.cpp contains the code responsible for step 7, and for
//   invoking the decoder to execute steps 1-6.
// X86DisassemblerDecoderCommon.h contains the definitions needed by both the
//   table emitter and the disassembler.
// X86DisassemblerDecoder.h contains the public interface of the decoder,
//   factored out into C for possible use by other projects.
// X86DisassemblerDecoder.c contains the source code of the decoder, which is
//   responsible for steps 1-6.
//
//===----------------------------------------------------------------------===//

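// Illustrative aside, not part of the LLVM sources: a minimal, self-contained
// sketch of the opcode classification in step 2 and a toy table lookup that
// stands in for steps 3-5.  Every name starting with "Toy" or "toy" is
// hypothetical, the three-entry table is invented for the example, and only
// the 0x66 prefix is recognized.  Unlike the real decoder, the sketch does not
// fold prefixes into a context before the lookup.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for OpcodeType (hypothetical; the real enum is declared in
// X86DisassemblerDecoderCommon.h).
enum ToyOpcodeType {
  TOY_ONEBYTE,      // 0xnn
  TOY_TWOBYTE,      // 0x0f 0xnn
  TOY_THREEBYTE_38, // 0x0f 0x38 0xnn
  TOY_THREEBYTE_3A  // 0x0f 0x3a 0xnn
};

struct ToyOpcode {
  bool valid = false;        // false if the bytes ended before the opcode
  bool opSizePrefix = false; // true if a 0x66 prefix was consumed
  ToyOpcodeType type = TOY_ONEBYTE;
  uint8_t opcode = 0;        // the final 0xnn byte
  size_t length = 0;         // bytes consumed, prefixes and escapes included
};

// Consume 0x66 prefixes, then classify the opcode by its escape bytes.
ToyOpcode toyReadOpcode(const std::vector<uint8_t> &bytes) {
  ToyOpcode result;
  size_t i = 0;
  while (i < bytes.size() && bytes[i] == 0x66) {
    result.opSizePrefix = true;
    ++i;
  }
  if (i < bytes.size() && bytes[i] == 0x0f) {
    ++i;
    result.type = TOY_TWOBYTE;
    if (i < bytes.size() && (bytes[i] == 0x38 || bytes[i] == 0x3a)) {
      result.type = bytes[i] == 0x38 ? TOY_THREEBYTE_38 : TOY_THREEBYTE_3A;
      ++i;
    }
  }
  if (i >= bytes.size())
    return result; // truncated input: valid stays false
  result.opcode = bytes[i++];
  result.length = i;
  result.valid = true;
  return result;
}

// Toy replacement for the generated decode tables: maps (type, opcode)
// directly to a mnemonic.  Returns an empty string for unknown opcodes.
std::string toyLookupName(const ToyOpcode &op) {
  static const std::map<std::pair<int, int>, std::string> table = {
      {{TOY_ONEBYTE, 0x90}, "NOP"},
      {{TOY_TWOBYTE, 0xa2}, "CPUID"},
      {{TOY_THREEBYTE_38, 0x00}, "PSHUFB"},
  };
  if (!op.valid)
    return "";
  auto it = table.find(std::make_pair(static_cast<int>(op.type),
                                      static_cast<int>(op.opcode)));
  return it == table.end() ? "" : it->second;
}
```

// In this sketch, the bytes 0x0f 0xa2 classify as a two-byte opcode and map to
// "CPUID" in the toy table; a lone 0x0f is reported as invalid.
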
#ifndef LLVM_LIB_TARGET_X86_DISASSEMBLER_X86DISASSEMBLER_H
#define LLVM_LIB_TARGET_X86_DISASSEMBLER_X86DISASSEMBLER_H

#include "X86DisassemblerDecoderCommon.h"
#include "llvm/MC/MCDisassembler.h"
#include <memory>

namespace llvm {

class MCInst;
class MCInstrInfo;
class MCSubtargetInfo;
class MemoryObject;
class raw_ostream;

namespace X86Disassembler {

/// X86GenericDisassembler - Generic disassembler for all X86 platforms.
///   Each platform class should only have to subclass this one, define its
///   own constructor, and provide a different DisassemblerMode value.
class X86GenericDisassembler : public MCDisassembler {
  std::unique_ptr<const MCInstrInfo> MII;
public:
  /// Constructor - Initializes the disassembler.
  ///
  X86GenericDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx,
                         std::unique_ptr<const MCInstrInfo> MII);

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &instr, uint64_t &size,
                              const MemoryObject &region, uint64_t address,
                              raw_ostream &vStream,
                              raw_ostream &cStream) const override;

private:
  DisassemblerMode fMode;
};

} // namespace X86Disassembler

} // namespace llvm

#endif
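
// Illustrative aside, not part of the LLVM sources: a minimal sketch of the
// pattern described in the X86GenericDisassembler comment, where a generic
// base class stores a mode and each platform class only supplies a
// constructor that passes a different mode value.  All "Toy" names are
// hypothetical stand-ins; they do not use MCDisassembler or the real
// DisassemblerMode enum, and addressBits() is invented for the example.

```cpp
// Toy stand-in for DisassemblerMode (hypothetical).
enum ToyMode { TOY_MODE_16BIT, TOY_MODE_32BIT, TOY_MODE_64BIT };

// Generic base: all behavior lives here and is driven by the stored mode.
class ToyGenericDisassembler {
public:
  explicit ToyGenericDisassembler(ToyMode mode) : fMode(mode) {}
  virtual ~ToyGenericDisassembler() = default;

  // Default address width, in bits, implied by the mode.
  unsigned addressBits() const {
    switch (fMode) {
    case TOY_MODE_16BIT:
      return 16;
    case TOY_MODE_32BIT:
      return 32;
    case TOY_MODE_64BIT:
      return 64;
    }
    return 0; // unreachable for valid ToyMode values
  }

private:
  ToyMode fMode;
};

// Platform classes differ only in the mode their constructor passes along.
class ToyX86_32Disassembler : public ToyGenericDisassembler {
public:
  ToyX86_32Disassembler() : ToyGenericDisassembler(TOY_MODE_32BIT) {}
};

class ToyX86_64Disassembler : public ToyGenericDisassembler {
public:
  ToyX86_64Disassembler() : ToyGenericDisassembler(TOY_MODE_64BIT) {}
};
```

// With these toy classes, ToyX86_64Disassembler().addressBits() yields 64
// without the subclass overriding any member function.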