Linux Kernel
3.7.1
#include <stddef.h>
#include <stdint.h>
Data Structures | |
union | iorpc_offset |
struct | iorpc_mem_attr |
union | iorpc_mem_buffer |
union | iorpc_interrupt |
union | iorpc_pollfd_setup |
union | iorpc_pollfd |
Macros | |
#define | IORPC_OPCODE(FORMAT, CODE) (((FORMAT) << 16) | (CODE)) |
#define | IORPC_MEM_BUFFER_FLAG_NT_HINT (1 << 0) |
#define | IORPC_MEM_BUFFER_FLAG_IO_PIN (1 << 1) |
enum gxio_err_e |
The various iorpc devices use error codes from -1100 to -1299.
This range is distinct from netio (-700 to -799), the hypervisor (-800 to -899), tilepci (-900 to -999), ilib (-1000 to -1099), gxcr (-1300 to -1399) and gxpci (-1400 to -1499).
GXIO_ERR_MAX |
Largest iorpc error number. |
GXIO_ERR_OPCODE |
Bad RPC opcode - possible version incompatibility. |
GXIO_ERR_INVAL |
Invalid parameter. |
GXIO_ERR_ALIGNMENT |
Memory buffer did not meet alignment requirements. |
GXIO_ERR_COHERENCE |
Memory buffers must be coherent and cacheable. |
GXIO_ERR_ALREADY_INIT |
Resource already initialized. |
GXIO_ERR_NO_SVC_DOM |
No service domains available. |
GXIO_ERR_INVAL_SVC_DOM |
Illegal service domain number. |
GXIO_ERR_MMIO_ADDRESS |
Illegal MMIO address. |
GXIO_ERR_INTERRUPT |
Illegal interrupt binding. |
GXIO_ERR_CLIENT_MEMORY |
Unreasonable client memory. |
GXIO_ERR_IOTLB_ENTRY |
No more IOTLB entries. |
GXIO_ERR_INVAL_MEMORY_SIZE |
Invalid memory size. |
GXIO_ERR_UNSUPPORTED_OP |
Unsupported operation. |
GXIO_ERR_DMA_CREDITS |
Insufficient DMA credits. |
GXIO_ERR_TIMEOUT |
Operation timed out. |
GXIO_ERR_NO_DEVICE |
No such device or object. |
GXIO_ERR_BUSY |
Device or resource busy. |
GXIO_ERR_IO |
I/O error. |
GXIO_ERR_PERM |
Permissions error. |
GXIO_TEST_ERR_REG_NUMBER |
Illegal register number. |
GXIO_TEST_ERR_BUFFER_SLOT |
Illegal buffer slot. |
GXIO_MPIPE_ERR_INVAL_BUFFER_SIZE |
Invalid buffer size. |
GXIO_MPIPE_ERR_NO_BUFFER_STACK |
Cannot allocate buffer stack. |
GXIO_MPIPE_ERR_BAD_BUFFER_STACK |
Invalid buffer stack number. |
GXIO_MPIPE_ERR_NO_NOTIF_RING |
Cannot allocate NotifRing. |
GXIO_MPIPE_ERR_BAD_NOTIF_RING |
Invalid NotifRing number. |
GXIO_MPIPE_ERR_NO_NOTIF_GROUP |
Cannot allocate NotifGroup. |
GXIO_MPIPE_ERR_BAD_NOTIF_GROUP |
Invalid NotifGroup number. |
GXIO_MPIPE_ERR_NO_BUCKET |
Cannot allocate bucket. |
GXIO_MPIPE_ERR_BAD_BUCKET |
Invalid bucket number. |
GXIO_MPIPE_ERR_NO_EDMA_RING |
Cannot allocate eDMA ring. |
GXIO_MPIPE_ERR_BAD_EDMA_RING |
Invalid eDMA ring number. |
GXIO_MPIPE_ERR_BAD_CHANNEL |
Invalid channel number. |
GXIO_MPIPE_ERR_BAD_CONFIG |
Bad configuration. |
GXIO_MPIPE_ERR_IQUEUE_EMPTY |
Empty iqueue. |
GXIO_MPIPE_ERR_RULES_EMPTY |
Empty rules. |
GXIO_MPIPE_ERR_RULES_FULL |
Full rules. |
GXIO_MPIPE_ERR_RULES_CORRUPT |
Corrupt rules. |
GXIO_MPIPE_ERR_RULES_INVALID |
Invalid rules. |
GXIO_MPIPE_ERR_CLASSIFIER_TOO_BIG |
Classifier is too big. |
GXIO_MPIPE_ERR_CLASSIFIER_TOO_COMPLEX |
Classifier is too complex. |
GXIO_MPIPE_ERR_CLASSIFIER_BAD_HEADER |
Classifier has bad header. |
GXIO_MPIPE_ERR_CLASSIFIER_BAD_CONTENTS |
Classifier has bad contents. |
GXIO_MPIPE_ERR_CLASSIFIER_INVAL_SYMBOL |
Classifier encountered invalid symbol. |
GXIO_MPIPE_ERR_CLASSIFIER_INVAL_BOUNDS |
Classifier encountered invalid bounds. |
GXIO_MPIPE_ERR_CLASSIFIER_INVAL_RELOCATION |
Classifier encountered invalid relocation. |
GXIO_MPIPE_ERR_CLASSIFIER_UNDEF_SYMBOL |
Classifier encountered undefined symbol. |
GXIO_TRIO_ERR_NO_MEMORY_MAP |
Cannot allocate memory map region. |
GXIO_TRIO_ERR_BAD_MEMORY_MAP |
Invalid memory map region number. |
GXIO_TRIO_ERR_NO_SCATTER_QUEUE |
Cannot allocate scatter queue. |
GXIO_TRIO_ERR_BAD_SCATTER_QUEUE |
Invalid scatter queue number. |
GXIO_TRIO_ERR_NO_PUSH_DMA_RING |
Cannot allocate push DMA ring. |
GXIO_TRIO_ERR_BAD_PUSH_DMA_RING |
Invalid push DMA ring index. |
GXIO_TRIO_ERR_NO_PULL_DMA_RING |
Cannot allocate pull DMA ring. |
GXIO_TRIO_ERR_BAD_PULL_DMA_RING |
Invalid pull DMA ring index. |
GXIO_TRIO_ERR_NO_PIO |
Cannot allocate PIO region. |
GXIO_TRIO_ERR_BAD_PIO |
Invalid PIO region index. |
GXIO_TRIO_ERR_NO_ASID |
Cannot allocate ASID. |
GXIO_TRIO_ERR_BAD_ASID |
Invalid ASID. |
GXIO_MICA_ERR_BAD_ACCEL_TYPE |
No such accelerator type. |
GXIO_MICA_ERR_NO_CONTEXT |
Cannot allocate context. |
GXIO_MICA_ERR_PKA_CMD_QUEUE_FULL |
PKA command queue is full, can't add another command. |
GXIO_MICA_ERR_PKA_RESULT_QUEUE_EMPTY |
PKA result queue is empty, can't get a result from the queue. |
GXIO_GPIO_ERR_PIN_UNAVAILABLE |
Pin not available. Either the physical pin does not exist, or it is reserved by the hypervisor for system usage. |
GXIO_GPIO_ERR_PIN_BUSY |
Pin busy. The pin exists, and is available for use via GXIO, but it has been attached by some other process or driver. |
GXIO_GPIO_ERR_PIN_UNATTACHED |
Cannot access unattached pin. One or more of the pins being manipulated by this call are not attached to the requesting context. |
GXIO_GPIO_ERR_PIN_INVALID_MODE |
Invalid I/O mode for pin. The wiring of the pin in the system is such that the I/O mode or electrical control parameters requested could cause damage. |
GXIO_ERR_MIN |
Smallest iorpc error number. |
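The error ranges described above can be turned into a small lookup. This is only a sketch: the range bounds are the ones quoted in the text, while the table, the struct, and the `example_error_owner()` helper are hypothetical names introduced for illustration and are not part of the header.

```c
#include <stddef.h>

/* Hypothetical table entry: one component and its error-code range. */
struct example_err_range {
    const char *owner;
    int first;   /* least negative code in the range */
    int last;    /* most negative code in the range */
};

/* Ranges as stated in the enum gxio_err_e description. */
static const struct example_err_range example_ranges[] = {
    { "netio",      -700,  -799 },
    { "hypervisor", -800,  -899 },
    { "tilepci",    -900,  -999 },
    { "ilib",       -1000, -1099 },
    { "iorpc",      -1100, -1299 },
    { "gxcr",       -1300, -1399 },
    { "gxpci",      -1400, -1499 },
};

/* Return the component that owns err, or NULL if no range matches. */
static const char *example_error_owner(int err)
{
    size_t n = sizeof(example_ranges) / sizeof(example_ranges[0]);
    for (size_t i = 0; i < n; i++) {
        if (err <= example_ranges[i].first && err >= example_ranges[i].last)
            return example_ranges[i].owner;
    }
    return NULL;
}
```

A caller could use such a helper to decide which component's error strings to consult when a negative return value comes back from a device call.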
enum iorpc_format_e |
Error codes and struct definitions for the IO RPC library.
The hypervisor's IO RPC component provides a convenient way for driver authors to proxy system calls between user space, Linux, and the hypervisor driver. The core of the system is a set of Python files that take ".idl" files as input and generate the following source code:
The IO RPC system also includes the Linux 'iorpc' driver, which proxies calls between the userspace library and the hypervisor driver. The Linux driver is almost entirely device agnostic; it watches for special flags indicating cases where a memory buffer address might need to be translated, etc. As a result, driver writers can avoid many of the problem cases related to registering hardware resources like memory pages or interrupts. However, the drivers must be careful to obey the conventions documented below in order to work properly with the generic Linux iorpc driver.
All iorpc-based drivers must support a notion of service domains. A service domain is basically an application context - state indicating resources that are allocated to that particular app which it may access and (perhaps) other applications may not access. Drivers can support any number of service domains they choose. In some cases the design is limited by the number of service domains supported by the IO hardware; in other cases the service domains are a purely software concept and the driver chooses a maximum number of domains based on how much state memory it is willing to preallocate.
For example, the mPIPE driver only supports as many service domains as are supported by the mPIPE hardware. This limitation is required because the hardware implements its own MMIO protection scheme to allow large MMIO mappings while still protecting small register ranges within the page that should only be accessed by the hypervisor.
In contrast, drivers with no hardware service domain limitations (for instance the TRIO shim) can implement an arbitrary number of service domains. In these cases, each service domain is limited to a carefully restricted set of legal MMIO addresses if necessary to keep one application from corrupting another application's state.
The driver's open routine is responsible for allocating a new service domain for each hv_dev_open() call. By convention, the return value from open() should be the service domain number on success, or GXIO_ERR_NO_SVC_DOM if no more service domains are available.
The implementations of hv_dev_pread() and hv_dev_pwrite() are responsible for validating the devhdl value passed up by the client. Since the device handle returned by hv_dev_open() should embed the positive service domain number, drivers should make sure that DRV_HDL2BITS(devhdl) is a legal service domain. If the client passes an illegal service domain number, the routine should return GXIO_ERR_INVAL_SVC_DOM. Once the service domain number has been validated, the driver can copy to/from the client buffer and call the dispatch_read() or dispatch_write() methods created by the RPC generator.
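The validation step described above might look roughly like the following. This is a sketch under stated assumptions: the caller is assumed to have already extracted the service domain number (for example via DRV_HDL2BITS(devhdl)), and the table type, `EXAMPLE_MAX_SVC_DOMS`, the placeholder error value, and `example_validate_svc_dom()` are all hypothetical names, not part of the real driver interface.

```c
#include <stdint.h>

/* Hypothetical limit chosen by the driver. */
#define EXAMPLE_MAX_SVC_DOMS 4

/* Placeholder value; the real code comes from enum gxio_err_e. */
#define EXAMPLE_GXIO_ERR_INVAL_SVC_DOM (-1107)

/* Hypothetical per-driver state: which service domains are open. */
struct example_svc_dom_table {
    uint8_t in_use[EXAMPLE_MAX_SVC_DOMS];
};

/*
 * Check the service domain number taken from the device handle.
 * Returns 0 if it names an open service domain, otherwise the
 * (placeholder) "illegal service domain" error.
 */
static int example_validate_svc_dom(const struct example_svc_dom_table *t,
                                    int svc_dom)
{
    if (svc_dom < 0 || svc_dom >= EXAMPLE_MAX_SVC_DOMS)
        return EXAMPLE_GXIO_ERR_INVAL_SVC_DOM;
    if (!t->in_use[svc_dom])
        return EXAMPLE_GXIO_ERR_INVAL_SVC_DOM;
    return 0;
}
```

Only after a check like this succeeds would the driver copy the client buffer and hand it to the generated dispatch routine.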
The hv_dev_close() implementation should reset all service domain state and put the service domain back on a free list for reallocation by a future application. In most cases, this will require executing a hardware reset or drain flow and denying any MMIO regions that were created for the service domain.
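One way to picture the open/close convention is a small free list of service domain numbers. The sketch below is illustrative only: the pool type, its functions, and the placeholder error value are hypothetical, and the hardware reset or drain performed on close is deliberately left out.

```c
#define EXAMPLE_MAX_SVC_DOMS 4

/* Placeholder value; the real code comes from enum gxio_err_e. */
#define EXAMPLE_GXIO_ERR_NO_SVC_DOM (-1106)

/* Hypothetical free list, kept as a stack of unused domain numbers. */
struct example_svc_dom_pool {
    int free_list[EXAMPLE_MAX_SVC_DOMS];
    int num_free;
};

/* Fill the stack so that domain 0 is handed out first. */
static void example_pool_init(struct example_svc_dom_pool *p)
{
    for (int i = 0; i < EXAMPLE_MAX_SVC_DOMS; i++)
        p->free_list[i] = EXAMPLE_MAX_SVC_DOMS - 1 - i;
    p->num_free = EXAMPLE_MAX_SVC_DOMS;
}

/* Open path: return a service domain number, or the error if none left. */
static int example_pool_alloc(struct example_svc_dom_pool *p)
{
    if (p->num_free == 0)
        return EXAMPLE_GXIO_ERR_NO_SVC_DOM;
    return p->free_list[--p->num_free];
}

/*
 * Close path: return the domain to the free list. A real driver would
 * first reset the domain's state; that step is omitted here.
 */
static void example_pool_free(struct example_svc_dom_pool *p, int svc_dom)
{
    if (p->num_free < EXAMPLE_MAX_SVC_DOMS)
        p->free_list[p->num_free++] = svc_dom;
}
```

Because valid service domain numbers are non-negative and the error code is negative, a single integer return value can carry either result, which matches the convention described for open().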
The .idl file syntax allows the creation of syscalls with special parameters that require permission checks or translations as part of the system call path. Because of limitations in the code generator, APIs are generally limited to just one of these special parameters per system call, and they are sometimes required to be the first or last parameter to the call. Special parameters include:
The MEM_BUFFER() datatype allows user space to "register" memory buffers with a device. Registering memory accomplishes two tasks: Linux keeps track of all buffers that might be modified by a hardware device, and the hardware device drivers bind registered buffers to particular hardware resources like ingress NotifRings. The MEM_BUFFER() idl syntax can take extra flags like ALIGN_64KB, ALIGN_SELF_SIZE, and FLAGS indicating that memory buffers must have certain alignment or that the user should be able to pass a "memory flags" word specifying attributes like nt_hint or IO cache pinning. The parser will accept multiple MEM_BUFFER() flags.
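As a rough illustration of the flags and alignment constraints mentioned above, the sketch below combines the two flag macros from this header and checks two alignment rules. The interpretation of ALIGN_64KB as "address is a multiple of 64 KB" and of ALIGN_SELF_SIZE as "address is a multiple of the buffer's own power-of-two size" is an assumption, and the helper names and the placeholder error value are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag macros as defined in this header. */
#define IORPC_MEM_BUFFER_FLAG_NT_HINT (1 << 0)
#define IORPC_MEM_BUFFER_FLAG_IO_PIN (1 << 1)

/* Placeholder value; the real code comes from enum gxio_err_e. */
#define EXAMPLE_GXIO_ERR_ALIGNMENT (-1103)

/* Hypothetical helper: build a "memory flags" word from two booleans. */
static unsigned int example_mem_flags(int nt_hint, int io_pin)
{
    unsigned int flags = 0;
    if (nt_hint)
        flags |= IORPC_MEM_BUFFER_FLAG_NT_HINT;
    if (io_pin)
        flags |= IORPC_MEM_BUFFER_FLAG_IO_PIN;
    return flags;
}

/* Assumed meaning of ALIGN_64KB: address is a multiple of 64 KB. */
static int example_check_align_64kb(uintptr_t addr)
{
    return (addr & (uintptr_t)0xffff) ? EXAMPLE_GXIO_ERR_ALIGNMENT : 0;
}

/*
 * Assumed meaning of ALIGN_SELF_SIZE: the buffer is aligned to its own
 * size. This sketch additionally requires a power-of-two size.
 */
static int example_check_align_self_size(uintptr_t addr, size_t size)
{
    if (size == 0 || (size & (size - 1)) != 0)
        return EXAMPLE_GXIO_ERR_ALIGNMENT;
    return (addr & (uintptr_t)(size - 1)) ? EXAMPLE_GXIO_ERR_ALIGNMENT : 0;
}
```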
Implementations must obey the following conventions when registering memory buffers via the iorpc flow. These rules are a result of the Linux driver implementation, which needs to keep track of how many times a particular page has been registered with the hardware so that it can release the page when all those registrations are cleared.
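The bookkeeping motive behind these rules, counting registrations per page so the page can be released once the last one is cleared, can be sketched as a simple reference counter. This is not the Linux driver's actual data structure; the tracker type, the fixed page count, and both functions are hypothetical.

```c
/* Hypothetical number of pages tracked by this toy example. */
#define EXAMPLE_NUM_PAGES 8

/* One registration count per page. */
struct example_page_tracker {
    unsigned int refs[EXAMPLE_NUM_PAGES];
};

/* Record one more registration; returns the new count, or -1 if invalid. */
static int example_register_page(struct example_page_tracker *t,
                                 unsigned int page)
{
    if (page >= EXAMPLE_NUM_PAGES)
        return -1;
    return (int)++t->refs[page];
}

/*
 * Clear one registration. Returns 1 if that was the last registration
 * (the page may now be released), 0 if registrations remain, and -1 if
 * the page is invalid or was not registered.
 */
static int example_unregister_page(struct example_page_tracker *t,
                                   unsigned int page)
{
    if (page >= EXAMPLE_NUM_PAGES || t->refs[page] == 0)
        return -1;
    return (--t->refs[page] == 0) ? 1 : 0;
}
```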
The INTERRUPT .idl datatype allows the client to bind hardware interrupts to a particular combination of IPI parameters - CPU, IPI PL, and event bit number. This data is passed via a special datatype so that the Linux driver can validate the CPU and PL and the HV generic iorpc code can translate client CPUs to real CPUs.
The POLLFD_SETUP .idl datatype allows the client to set up hardware interrupt bindings which are received by Linux but which are made visible to user processes as state transitions on a file descriptor; this allows user processes to use Linux primitives, such as poll(), to await particular hardware events. This data is passed via a special datatype so that the Linux driver may recognize the pollable file descriptor and translate it to a set of interrupt target information, and so that the HV generic iorpc code can translate client CPUs to real CPUs.
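From the user-process side, waiting on such a file descriptor is ordinary poll() usage. The sketch below uses only the standard POSIX poll() call; the helper name is hypothetical, and the assumption that the hardware event shows up as POLLIN on the descriptor is ours, not something the text states.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for the descriptor to become readable.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error.
 * Assumption: the event is reported as POLLIN.
 */
static int example_wait_for_event(int fd, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;   /* poll() itself failed */
    if (rc == 0)
        return 0;    /* timed out, nothing pending */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

How the descriptor is obtained and how its event state is reset afterwards are covered by the POLLFD_SETUP and POLLFD datatypes and are not shown here.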
The POLLFD .idl datatype allows manipulation of hardware interrupt bindings set up via the POLLFD_SETUP datatype; common operations are resetting the state of the requested interrupt events, and unbinding any bound interrupts. This data is passed via a special datatype so that the Linux driver may recognize the pollable file descriptor and translate it to an interrupt identifier previously supplied by the hypervisor as the result of an earlier pollfd_setup operation.
The BLOB .idl datatype allows the client to write an arbitrary length string of bytes up to the hypervisor driver. This can be useful for passing up large, arbitrarily structured data like classifier programs. The iorpc stack takes care of validating the buffer VA and CPA as the data passes up to the hypervisor. Unlike MEM_BUFFER(), the buffer is not registered - Linux does not bump page refcounts and the HV driver should not reuse the buffer once the system call is complete.
The iorpc_offset structure describes the formatting of the offset that is passed to pread() or pwrite() as part of the generated RPC code. When the user calls up to Linux, the RPC code fills in all the fields of the offset, including a 16-bit opcode, a 16-bit format indicator, and 32 bits of user-specified "sub-offset". The opcode indicates which syscall is being requested. The format indicates whether there is a "prefix struct" at the start of the memory buffer passed to pwrite(), and if so what data is in that prefix struct. These prefix structs are used to implement special datatypes like MEM_BUFFER() and INTERRUPT - we arrange to put data that needs translation and permission checks at the start of the buffer so that the Linux driver and generic portions of the HV iorpc code can easily access the data. The 32 bits of user-specified "sub-offset" are most useful for pread() calls where the user needs to also pass in a few bits indicating which register to read, etc.
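The field layout described above can be illustrated with plain shifts and masks. The IORPC_OPCODE macro is the one defined in this header; placing the combined opcode in the low 32 bits and the sub-offset in the high 32 bits of the 64-bit offset is an assumption made for this sketch (the real union may order its fields differently depending on endianness), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Macro as defined in this header: format in the high 16 bits. */
#define IORPC_OPCODE(FORMAT, CODE) (((FORMAT) << 16) | (CODE))

/*
 * Assumed layout for this sketch:
 *   bits  0..15  code (which call is requested)
 *   bits 16..31  format (which prefix struct, if any)
 *   bits 32..63  user-specified sub-offset
 */
static uint64_t example_make_offset(uint16_t format, uint16_t code,
                                    uint32_t sub_offset)
{
    uint32_t opcode = IORPC_OPCODE((uint32_t)format, (uint32_t)code);
    return ((uint64_t)sub_offset << 32) | opcode;
}

/* Extract the 16-bit code from an offset built as above. */
static uint16_t example_offset_code(uint64_t offset)
{
    return (uint16_t)(offset & 0xffff);
}

/* Extract the 16-bit format indicator. */
static uint16_t example_offset_format(uint64_t offset)
{
    return (uint16_t)((offset >> 16) & 0xffff);
}

/* Extract the 32-bit user-specified sub-offset. */
static uint32_t example_offset_sub_offset(uint64_t offset)
{
    return (uint32_t)(offset >> 32);
}
```

Keeping the format in a fixed position is what lets generic code detect a prefix struct without understanding the per-device code field.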
The Linux iorpc driver watches for system calls that contain prefix structs so that it can translate parameters and bump reference counts as appropriate. It does not (currently) have any knowledge of the per-device opcodes - it doesn't care what operation you're doing to mPIPE, so long as it can do all the generic book-keeping. The hv/iorpc.h header file defines all of the generic encoding bits needed to translate iorpc calls without knowing which particular opcode is being issued.
Implementing mmap() required adding some special iorpc syscalls that are only called by the Linux driver, never by userspace. These include get_mmio_base() and check_mmio_offset(). These routines are described in globals.idl and must be included in every iorpc driver. By providing these routines in every driver, Linux's mmap implementation can easily get the PTE bits it needs and validate the PA offset without needing to know the per-device opcodes to perform those tasks.
The iorpc code generator also supports generation of kernel code implementing the gxio APIs. This capability is currently used by the mPIPE network driver, and will likely be used by the TRIO root complex and endpoint drivers and perhaps an in-kernel crypto driver. Each driver that wants to instantiate iorpc calls in the kernel needs to generate a kernel version of the generated RPC code and (probably) copy any related gxio source files into the kernel. The mPIPE driver provides a good example of this pattern.
The iorpc_format_e values are codes indicating the translation services required within the RPC path. They indicate whether there is a translatable struct at the start of the RPC buffer and what information that struct contains.