The File I/O remote protocol extension (short: File-I/O) allows the target to use the host's file system and console I/O to perform various system calls. System calls on the target system are translated into a remote protocol packet to the host system, which then performs the needed actions and returns a response packet to the target system. This simulates file system operations even on targets that lack file systems.
The protocol is defined independently of the host and target systems. It uses its own representation of datatypes and values. Both gdb and the target's gdb stub are responsible for translating the system-dependent values into the unified protocol values when data is transmitted.
The communication is synchronous. A system call is possible only when gdb is waiting for a response to a C, c, S or s packet. While gdb handles the request for a system call, the target is stopped to allow deterministic access to the target's memory. Therefore File-I/O is not interruptible by target signals. It is, however, possible to interrupt File-I/O by a user interrupt (Ctrl-C).
The target's request to perform a host system call does not finish the latest C, c, S or s action. That is, after the system call finishes, the target resumes the previous activity (continue, step). No additional continue or step request from gdb is required.
    (gdb) continue
      <- target requests 'system call X'
      target is stopped, gdb executes system call
      -> GDB returns result
      ... target continues, GDB returns to wait for the target
      <- target hits breakpoint and sends a Txx packet
The protocol is only used for files on the host file system and for I/O on the console. Character or block special devices, pipes, named pipes, sockets and any other communication method on the host system are not supported by this protocol.
The File-I/O protocol uses the F packet both as request and as reply packet. Since a File-I/O system call can only occur when gdb is waiting for the continuing or stepping target, the File-I/O request is a reply that gdb has to expect as a result of a former C, c, S or s packet. This F packet contains all information needed to allow gdb to call the appropriate host system call:
A unique identifier for the requested system call.
All parameters to the system call. Pointers are given as addresses in the target memory address space. Pointers to strings are given as a pointer/length pair. Numerical values are given as they are. Numerical control values are given in a protocol-specific representation.
At that point gdb has to perform the following actions.
If parameter pointer values are given, which point to data needed as input to a system call, gdb requests this data from the target with a standard m packet request. This additional communication has to be expected by the target implementation and is handled as any other m packet.
gdb translates all values from protocol representation to host representation as needed. Datatypes are coerced into the host types.
gdb calls the system call.
It then coerces datatypes back to protocol representation.
If pointer parameters in the request packet point to buffer space into which a system call is expected to copy data, the data is transmitted to the target using an M or X packet. This packet has to be expected by the target implementation and is handled as any other M or X packet.
Eventually gdb replies with another F packet which contains all necessary information for the target to continue. This contains at least:
Return value.
errno, if it has been changed by the system call.
"Ctrl-C" flag.
After having done the needed type and value coercion, the target continues the latest continue or step action.
The F request packet has the following format:
    Fcall-id,parameter…
call-id is the identifier to indicate the host system call to be called. This is just the name of the function. parameter… are the parameters to the system call.
Parameters are hexadecimal integer values, either the real values in case of scalar datatypes, as pointers to target buffer space in case of compound datatypes and unspecified memory areas or as pointer/length pairs in case of string parameters. These are appended to the call-id, each separated from its predecessor by a comma. All values are transmitted in ASCII string representation, pointer/length pairs separated by a slash.
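As an illustration of the request format, the following C sketch builds two F request packets on the target side. The helper names fileio_format_write and fileio_format_open are hypothetical and not taken from any stub implementation, and the sketch assumes that all values are non-negative and fit the C types shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "Fwrite,fd,bufptr,count" into out.
   All values are written as hexadecimal ASCII without a 0x prefix.
   Returns the packet length, or -1 if out is too small. */
int fileio_format_write(char *out, size_t outlen,
                        unsigned int fd, unsigned long bufptr,
                        unsigned int count)
{
    assert(out != NULL);
    int n = snprintf(out, outlen, "Fwrite,%x,%lx,%x", fd, bufptr, count);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* Hypothetical helper: formats "Fopen,pathptr/len,flags,mode".
   The string parameter is a pointer/length pair separated by a slash. */
int fileio_format_open(char *out, size_t outlen,
                       unsigned long pathptr, unsigned int pathlen,
                       unsigned int flags, unsigned int mode)
{
    assert(out != NULL);
    int n = snprintf(out, outlen, "Fopen,%lx/%x,%x,%x",
                     pathptr, pathlen, flags, mode);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

With file descriptor 3, buffer address 0x1234 and a count of 6, the first helper produces Fwrite,3,1234,6.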
The F reply packet has the following format:
    Fretcode,errno,Ctrl-C flag
retcode is the return code of the system call as a hexadecimal value. errno is the errno set by the call, in protocol-specific representation. This parameter can be omitted if the call was successful. The Ctrl-C flag is only sent if the user requested a break. In this case, errno must be sent as well, even if the call was successful. The Ctrl-C flag itself consists of the character 'C':
    F0,0,C
or, if the call was interrupted before the host call was performed (4 being the protocol representation of EINTR):
    F-1,4,C
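The reply format can be made concrete with a small parser. The following C sketch shows how a target stub might split an F reply into its fields. The struct fileio_reply and the function fileio_parse_reply are hypothetical names, and the sketch assumes that errno is transmitted as a hexadecimal value like retcode, which the text above does not state explicitly.

```c
#include <assert.h>
#include <stdlib.h>

struct fileio_reply {   /* hypothetical holder for the parsed fields */
    long retcode;       /* return code of the system call */
    int  has_errno;     /* nonzero if an errno field was present */
    long err;           /* protocol errno, valid if has_errno is set */
    int  ctrl_c;        /* nonzero if the Ctrl-C flag was present */
};

/* Parses "Fretcode[,errno[,C]]".
   Returns 0 on success, -1 if the packet is malformed. */
int fileio_parse_reply(const char *pkt, struct fileio_reply *r)
{
    char *end;

    assert(pkt != NULL && r != NULL);
    r->retcode = 0;
    r->has_errno = 0;
    r->err = 0;
    r->ctrl_c = 0;

    if (pkt[0] != 'F')
        return -1;
    r->retcode = strtol(pkt + 1, &end, 16);   /* accepts a leading '-' */
    if (end == pkt + 1)
        return -1;                            /* no digits after 'F' */
    if (*end == '\0')
        return 0;                             /* retcode only */
    if (*end != ',')
        return -1;

    const char *p = end + 1;
    r->err = strtol(p, &end, 16);
    if (end == p)
        return -1;                            /* empty errno field */
    r->has_errno = 1;
    if (*end == '\0')
        return 0;
    if (end[0] == ',' && end[1] == 'C' && end[2] == '\0') {
        r->ctrl_c = 1;
        return 0;
    }
    return -1;
}
```

Parsing F-1,4,C with this sketch yields a retcode of -1, an errno of 4 and a set Ctrl-C flag, while F6 yields a retcode of 6 with no errno.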
Structured data that is transferred using a memory read or write, such as a struct stat, is expected to be in a protocol-specific format with all scalar multibyte datatypes being big endian. The target should perform this translation before the F packet is sent, and gdb should perform it before it transfers memory to the target. Transferred pointers to structured data should point to the already coerced data at any time.
A special case arises if the Ctrl-C flag is set in the gdb reply packet. In this case the target should behave as if it had received a break message. The meaning for the target is "system call interrupted by SIGINT". Consequently, the target should actually stop (as with a break message) and return to gdb with a T02 packet. In this case, it's important for the target to know in which state the system call was interrupted. Since this action is by design not an atomic operation, two cases have to be distinguished:
The system call hasn't been performed on the host yet.
The system call on the host has been finished.
These two states can be distinguished by the target by the value of the returned errno. If it's the protocol representation of EINTR, the system call hasn't been performed. This is equivalent to the EINTR handling on POSIX systems. In any other case, the target may presume that the system call has been finished -- successful or not -- and should behave as if the break message arrived right after the system call.
gdb must behave reliably. If the system call has not been called yet, gdb may send the F reply immediately, setting EINTR as errno in the packet. If the system call on the host has been finished before the user requests a break, the full action must be finished by gdb. This requires sending M or X packets as needed. The F packet may only be sent when either nothing has happened or the full action has been completed.
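The distinction between the two interrupted states can be sketched from the target's point of view. In the following C fragment, the enum and the function fileio_classify are hypothetical names chosen for illustration. The value 4 for EINTR is the protocol value from the errno table, and the inputs are assumed to come from an already parsed F reply.

```c
#include <assert.h>

#define FILEIO_EINTR 4   /* protocol value of EINTR */

enum fileio_outcome {
    FILEIO_COMPLETED,           /* no Ctrl-C: resume the continue or step */
    FILEIO_INTERRUPTED_BEFORE,  /* Ctrl-C, host call was never performed */
    FILEIO_INTERRUPTED_AFTER    /* Ctrl-C, host call finished first */
};

/* Decides how the target should treat an F reply.
   ctrl_c:      nonzero if the reply carried the 'C' flag
   has_errno:   nonzero if the reply carried an errno field
   proto_errno: errno in protocol representation */
enum fileio_outcome fileio_classify(int ctrl_c, int has_errno,
                                    long proto_errno)
{
    if (!ctrl_c)
        return FILEIO_COMPLETED;
    if (has_errno && proto_errno == FILEIO_EINTR)
        return FILEIO_INTERRUPTED_BEFORE;
    /* Any other errno: the call finished, successfully or not. */
    return FILEIO_INTERRUPTED_AFTER;
}
```

In both interrupted cases the target stops and reports T02. The outcome only tells it whether the system call's results are valid.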
By default and if not explicitly closed by the target system, the file descriptors 0, 1 and 2 are connected to the gdb console. Output on the gdb console is handled as any other file output operation (write(1, …) or write(2, …)). Console input is handled by gdb so that after the target's read request from file descriptor 0, all following typing is buffered until one of the following conditions is met:
The user presses Ctrl-C. The behaviour is as explained above: the read system call is treated as finished.
The user presses Enter. This is treated as end of input with a trailing line feed.
The user presses Ctrl-D. This is treated as end of input. No trailing character (in particular, no Ctrl-D) is appended to the input.
If the user has typed more characters than fit in the buffer given to the read call, the trailing characters are buffered in gdb until either another read(0, …) is requested by the target or debugging is stopped at the user's request.
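The buffering of surplus console input can be pictured with a small C sketch. The struct console_buf and the functions console_feed and console_read are hypothetical. They model only the rule that characters beyond the requested count stay buffered for the next read(0, …), not how gdb actually implements its console.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct console_buf {   /* hypothetical holder for typed-ahead input */
    char   data[256];
    size_t len;
};

/* Appends user input: terminated by '\n' after Enter, unterminated
   after Ctrl-D. Returns 0, or -1 if the input does not fit. */
int console_feed(struct console_buf *cb, const char *input, size_t n)
{
    assert(cb != NULL && input != NULL);
    if (n > sizeof cb->data - cb->len)
        return -1;
    memcpy(cb->data + cb->len, input, n);
    cb->len += n;
    return 0;
}

/* Serves a read(0, buf, count): hands over at most count bytes and
   keeps the remainder buffered for the next read request. */
size_t console_read(struct console_buf *cb, char *buf, size_t count)
{
    assert(cb != NULL && buf != NULL);
    size_t n = cb->len < count ? cb->len : count;
    memcpy(buf, cb->data, n);
    memmove(cb->data, cb->data + n, cb->len - n);
    cb->len -= n;
    return n;
}
```

If the user types hello followed by Enter and the target reads with a count of 4, the first read returns "hell" and a second read returns the remaining "o" and the line feed.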
A special case in this protocol is the library call isatty, which is implemented as its own call inside this protocol. It returns 1 to the target if the file descriptor given as parameter is attached to the gdb console, 0 otherwise. Implementing it through system calls would require implementing ioctl and would be more complex than needed.
The other special case in this protocol is the system call, which is implemented as its own call, too. gdb takes over the full task of calling the necessary host calls to perform the system call. The return value of system is simplified before it's returned to the target. Basically, the only signal transmitted back is EINTR in case the user pressed Ctrl-C. Otherwise the return value consists entirely of the exit status of the called command.
Due to security concerns, gdb refuses the system call by default. The user has to allow this call explicitly by entering
    set remote system-call-allowed 1
Disabling the system call is done by
    set remote system-call-allowed 0
The current setting is shown by typing
    show remote system-call-allowed
Synopsis:
    int open(const char *pathname, int flags);
    int open(const char *pathname, int flags, mode_t mode);
Request:
    Fopen,pathptr/len,flags,mode
flags is the bitwise or of the following values:
O_CREAT: If the file does not exist it will be created. The host rules apply as far as file ownership and time stamps are concerned.
O_EXCL: When used with O_CREAT, if the file already exists it is an error and open() fails.
O_TRUNC: If the file already exists and the open mode allows writing (O_RDWR or O_WRONLY is given) it will be truncated to length 0.
O_APPEND: The file is opened in append mode.
O_RDONLY: The file is opened for reading only.
O_WRONLY: The file is opened for writing only.
O_RDWR: The file is opened for reading and writing.
All other bits are silently ignored.
mode is the bitwise or of the following values:
S_IRUSR: User has read permission.
S_IWUSR: User has write permission.
S_IRGRP: Group has read permission.
S_IWGRP: Group has write permission.
S_IROTH: Others have read permission.
S_IWOTH: Others have write permission.
All other bits are silently ignored.
Return value: open returns the new file descriptor or -1 if an error occurred.
Errors:
EEXIST: pathname already exists and O_CREAT and O_EXCL were used.
EISDIR: pathname refers to a directory.
EACCES: The requested access is not allowed.
ENAMETOOLONG: pathname was too long.
ENOENT: A directory component in pathname does not exist.
ENODEV: pathname refers to a device, pipe, named pipe or socket.
EROFS: pathname refers to a file on a read-only filesystem and write access was requested.
EFAULT: pathname is an invalid pointer value.
ENOSPC: No space on device to create the file.
EMFILE: The process already has the maximum number of files open.
ENFILE: The limit on the total number of files open on the system has been reached.
EINTR: The call was interrupted by the user.
Synopsis:
    int close(int fd);
Request:
    Fclose,fd
Return value: close returns zero on success, or -1 if an error occurred.
Errors:
EBADF: fd isn't a valid open file descriptor.
EINTR: The call was interrupted by the user.
Synopsis:
    int read(int fd, void *buf, unsigned int count);
Request:
    Fread,fd,bufptr,count
Return value: On success, the number of bytes read is returned. Zero indicates end of file. If count is zero, read returns zero as well. On error, -1 is returned.
Errors:
EBADF: fd is not a valid file descriptor or is not open for reading.
EFAULT: buf is an invalid pointer value.
EINTR: The call was interrupted by the user.
Synopsis:
    int write(int fd, const void *buf, unsigned int count);
Request:
    Fwrite,fd,bufptr,count
Return value: On success, the number of bytes written is returned. Zero indicates nothing was written. On error, -1 is returned.
Errors:
EBADF: fd is not a valid file descriptor or is not open for writing.
EFAULT: buf is an invalid pointer value.
EFBIG: An attempt was made to write a file that exceeds the host-specific maximum file size allowed.
ENOSPC: No space on device to write the data.
EINTR: The call was interrupted by the user.
Synopsis:
    long lseek(int fd, long offset, int flag);
Request:
    Flseek,fd,offset,flag
flag is one of:
SEEK_SET: The offset is set to offset bytes.
SEEK_CUR: The offset is set to its current location plus offset bytes.
SEEK_END: The offset is set to the size of the file plus offset bytes.
Return value: On success, the resulting unsigned offset in bytes from the beginning of the file is returned. Otherwise, a value of -1 is returned.
Errors:
EBADF: fd is not a valid open file descriptor.
ESPIPE: fd is associated with the gdb console.
EINVAL: flag is not a proper value.
EINTR: The call was interrupted by the user.
Synopsis:
    int rename(const char *oldpath, const char *newpath);
Request:
    Frename,oldpathptr/len,newpathptr/len
Return value: On success, zero is returned. On error, -1 is returned.
Errors:
EISDIR: newpath is an existing directory, but oldpath is not a directory.
EEXIST: newpath is a non-empty directory.
EBUSY: oldpath or newpath is a directory that is in use by some process.
EINVAL: An attempt was made to make a directory a subdirectory of itself.
ENOTDIR: A component used as a directory in oldpath or newpath is not a directory. Or oldpath is a directory and newpath exists but is not a directory.
EFAULT: oldpathptr or newpathptr are invalid pointer values.
EACCES: No access to the file or the path of the file.
ENAMETOOLONG: oldpath or newpath was too long.
ENOENT: A directory component in oldpath or newpath does not exist.
EROFS: The file is on a read-only filesystem.
ENOSPC: The device containing the file has no room for the new directory entry.
EINTR: The call was interrupted by the user.
Synopsis:
    int unlink(const char *pathname);
Request:
    Funlink,pathnameptr/len
Return value: On success, zero is returned. On error, -1 is returned.
Errors:
EACCES: No access to the file or the path of the file.
EPERM: The system does not allow unlinking of directories.
EBUSY: The file pathname cannot be unlinked because it's being used by another process.
EFAULT: pathnameptr is an invalid pointer value.
ENAMETOOLONG: pathname was too long.
ENOENT: A directory component in pathname does not exist.
ENOTDIR: A component of the path is not a directory.
EROFS: The file is on a read-only filesystem.
EINTR: The call was interrupted by the user.
Synopsis:
    int stat(const char *pathname, struct stat *buf);
    int fstat(int fd, struct stat *buf);
Request:
    Fstat,pathnameptr/len,bufptr
    Ffstat,fd,bufptr
Return value: On success, zero is returned. On error, -1 is returned.
Errors:
EBADF: fd is not a valid open file.
ENOENT: A directory component in pathname does not exist or the path is an empty string.
ENOTDIR: A component of the path is not a directory.
EFAULT: pathnameptr is an invalid pointer value.
EACCES: No access to the file or the path of the file.
ENAMETOOLONG: pathname was too long.
EINTR: The call was interrupted by the user.
Synopsis:
    int gettimeofday(struct timeval *tv, void *tz);
Request:
    Fgettimeofday,tvptr,tzptr
Return value: On success, 0 is returned, -1 otherwise.
Errors:
EINVAL: tz is a non-NULL pointer.
EFAULT: tvptr and/or tzptr is an invalid pointer value.
Synopsis:
    int isatty(int fd);
Request:
    Fisatty,fd
Return value: Returns 1 if fd refers to the gdb console, 0 otherwise.
Errors:
EINTR: The call was interrupted by the user.
Synopsis:
    int system(const char *command);
Request:
    Fsystem,commandptr/len
Return value: The value returned is -1 on error and the return status of the command otherwise. Only the exit status of the command is returned, which is extracted from the host's system return value by calling WEXITSTATUS(retval). In case /bin/sh could not be executed, 127 is returned.
Errors:
EINTR: The call was interrupted by the user.
The integral datatypes used in the system calls are
    int, unsigned int, long, unsigned long, mode_t and time_t
int, unsigned int, mode_t and time_t are implemented as 32 bit values in this protocol.
long and unsigned long are implemented as 64 bit types.
Refer to Section D.7.12.5 Limits, for corresponding MIN and MAX values (similar to those in limits.h) to allow range checking on host and target.
time_t datatypes are defined as seconds since the Epoch.
All integral datatypes transferred as part of a memory read or write of a structured datatype, e.g. a struct stat, have to be given in big endian byte order.
Pointers to target data are transmitted as they are. An exception is made for pointers to buffers for which the length isn't transmitted as part of the function call, namely strings. Strings are transmitted as a pointer/length pair, both as hex values, e.g.
    1aaf/12
which is a pointer to data of length 18 bytes at position 0x1aaf. The length is defined as the full string length in bytes, including the trailing null byte. Example:
    "hello, world" at address 0x123456
is transmitted as
    123456/d
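The pointer/length encoding can be reproduced with a few lines of C. The helper fileio_format_string_arg below is a hypothetical name, and the sketch assumes that the caller already knows the target address of the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a string parameter as "pointer/length",
   both in hexadecimal. The length counts the trailing null byte.
   Returns the number of characters written, or -1 if out is too small. */
int fileio_format_string_arg(char *out, size_t outlen,
                             unsigned long addr, const char *s)
{
    assert(out != NULL && s != NULL);
    unsigned long len = (unsigned long)strlen(s) + 1;
    int n = snprintf(out, outlen, "%lx/%lx", addr, len);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

For "hello, world" at address 0x123456 the string has 12 characters plus the null byte, so the helper writes 123456/d.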
The buffer of type struct stat used by the target and gdb is defined as follows:
    struct stat {
        unsigned int  st_dev;      /* device */
        unsigned int  st_ino;      /* inode */
        mode_t        st_mode;     /* protection */
        unsigned int  st_nlink;    /* number of hard links */
        unsigned int  st_uid;      /* user ID of owner */
        unsigned int  st_gid;      /* group ID of owner */
        unsigned int  st_rdev;     /* device type (if inode device) */
        unsigned long st_size;     /* total size, in bytes */
        unsigned long st_blksize;  /* blocksize for filesystem I/O */
        unsigned long st_blocks;   /* number of blocks allocated */
        time_t        st_atime;    /* time of last access */
        time_t        st_mtime;    /* time of last modification */
        time_t        st_ctime;    /* time of last change */
    };
The integral datatypes conform to the definitions given in the appropriate section (refer to Section D.7.11.1 Integral datatypes, for details), so this structure is of size 64 bytes.
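The 64-byte figure and the big endian requirement can be checked with a short serialization sketch in C. The struct fio_stat, its fst_ field names and the function fio_stat_pack are hypothetical. The fields are renamed only to avoid clashing with macros that some host headers define for struct stat members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fio_stat {   /* hypothetical in-memory form of the fields above */
    uint32_t fst_dev, fst_ino, fst_mode, fst_nlink;
    uint32_t fst_uid, fst_gid, fst_rdev;
    uint64_t fst_size, fst_blksize, fst_blocks;
    uint32_t fst_atime, fst_mtime, fst_ctime;
};

#define FIO_STAT_SIZE 64   /* 7 * 4 + 3 * 8 + 3 * 4 bytes */

/* Stores the low `width` bytes of v at p, most significant byte first. */
static unsigned char *put_be(unsigned char *p, uint64_t v, int width)
{
    for (int i = width - 1; i >= 0; i--)
        *p++ = (unsigned char)(v >> (8 * i));
    return p;
}

/* Writes st in protocol representation and returns the byte count. */
size_t fio_stat_pack(const struct fio_stat *st,
                     unsigned char out[FIO_STAT_SIZE])
{
    unsigned char *p = out;

    assert(st != NULL && out != NULL);
    p = put_be(p, st->fst_dev, 4);
    p = put_be(p, st->fst_ino, 4);
    p = put_be(p, st->fst_mode, 4);
    p = put_be(p, st->fst_nlink, 4);
    p = put_be(p, st->fst_uid, 4);
    p = put_be(p, st->fst_gid, 4);
    p = put_be(p, st->fst_rdev, 4);
    p = put_be(p, st->fst_size, 8);
    p = put_be(p, st->fst_blksize, 8);
    p = put_be(p, st->fst_blocks, 8);
    p = put_be(p, st->fst_atime, 4);
    p = put_be(p, st->fst_mtime, 4);
    p = put_be(p, st->fst_ctime, 4);
    return (size_t)(p - out);
}
```

Packing field by field avoids any dependence on the compiler's struct padding or on the byte order of the machine running the code.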
The values of several fields have a restricted meaning and/or range of values.
st_dev: 0 for a file, 1 for the console.
st_ino: No valid meaning for the target. Transmitted unchanged.
st_mode: Valid mode bits are described in Appendix C. Any other bits have currently no meaning for the target.
st_uid: No valid meaning for the target. Transmitted unchanged.
st_gid: No valid meaning for the target. Transmitted unchanged.
st_rdev: No valid meaning for the target. Transmitted unchanged.
st_atime, st_mtime, st_ctime: These values have a host and file system dependent accuracy. Especially on Windows hosts the file systems don't support exact timing values.
The target gets a struct stat of the above representation and is responsible for coercing it to the target representation before continuing.
Note that due to size differences between the host and target representation of stat members, these members could be truncated on the target.
The buffer of type struct timeval used by the target and gdb is defined as follows:
    struct timeval {
        time_t tv_sec;   /* second */
        long   tv_usec;  /* microsecond */
    };
The integral datatypes conform to the definitions given in the appropriate section (refer to Section D.7.11.1 Integral datatypes, for details), so this structure is of size 12 bytes (4 bytes for tv_sec plus 8 bytes for tv_usec).
The following values are used for the constants inside of the protocol. gdb and target are responsible for translating these values before and after the call as needed.
All values are given in hexadecimal representation.
    O_RDONLY    0x0
    O_WRONLY    0x1
    O_RDWR      0x2
    O_APPEND    0x8
    O_CREAT     0x200
    O_TRUNC     0x400
    O_EXCL      0x800
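As a sketch of the translation step on the gdb side, the following C function maps protocol open flags to the flags of a POSIX host. The FILEIO_ macro names and the function fileio_flags_to_host are hypothetical. Rejecting the access mode value 3 is a choice made for this sketch, since the text does not define it.

```c
#include <assert.h>
#include <fcntl.h>

/* Protocol values of the open flags (hexadecimal). */
#define FILEIO_O_RDONLY 0x0
#define FILEIO_O_WRONLY 0x1
#define FILEIO_O_RDWR   0x2
#define FILEIO_O_APPEND 0x8
#define FILEIO_O_CREAT  0x200
#define FILEIO_O_TRUNC  0x400
#define FILEIO_O_EXCL   0x800

/* Translates protocol flags into host open() flags.
   Unknown bits are silently ignored. Returns -1 for the
   undefined access mode 3 (an assumption of this sketch). */
int fileio_flags_to_host(unsigned int proto)
{
    int host;

    switch (proto & 0x3) {
    case FILEIO_O_RDONLY: host = O_RDONLY; break;
    case FILEIO_O_WRONLY: host = O_WRONLY; break;
    case FILEIO_O_RDWR:   host = O_RDWR;   break;
    default:              return -1;
    }
    if (proto & FILEIO_O_APPEND) host |= O_APPEND;
    if (proto & FILEIO_O_CREAT)  host |= O_CREAT;
    if (proto & FILEIO_O_TRUNC)  host |= O_TRUNC;
    if (proto & FILEIO_O_EXCL)   host |= O_EXCL;
    return host;
}
```

A protocol value of 0x601, for example, corresponds to O_WRONLY | O_CREAT | O_TRUNC on the host, whatever numeric values the host assigns to those names.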
All values are given in octal representation.
    S_IFREG     0100000
    S_IFDIR     040000
    S_IRUSR     0400
    S_IWUSR     0200
    S_IXUSR     0100
    S_IRGRP     040
    S_IWGRP     020
    S_IXGRP     010
    S_IROTH     04
    S_IWOTH     02
    S_IXOTH     01
All values are given in decimal representation.
    EPERM           1
    ENOENT          2
    EINTR           4
    EBADF           9
    EACCES          13
    EFAULT          14
    EBUSY           16
    EEXIST          17
    ENODEV          19
    ENOTDIR         20
    EISDIR          21
    EINVAL          22
    ENFILE          23
    EMFILE          24
    EFBIG           27
    ENOSPC          28
    ESPIPE          29
    EROFS           30
    ENAMETOOLONG    91
    EUNKNOWN        9999
EUNKNOWN is used as a fallback error value if a host system returns any error value not in the list of supported error numbers.
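The errno translation with its EUNKNOWN fallback can be sketched as a simple lookup. The function name fileio_errno_from_host is hypothetical, and the sketch assumes a POSIX host whose errno.h defines all of the listed names. Passing 0 through unchanged is an assumption made here for the no-error case.

```c
#include <assert.h>
#include <errno.h>

#define FILEIO_EUNKNOWN 9999   /* fallback for unsupported host errors */

/* Maps a host errno value to its protocol representation. */
int fileio_errno_from_host(int host_errno)
{
    switch (host_errno) {
    case 0:            return 0;    /* no error (assumption of this sketch) */
    case EPERM:        return 1;
    case ENOENT:       return 2;
    case EINTR:        return 4;
    case EBADF:        return 9;
    case EACCES:       return 13;
    case EFAULT:       return 14;
    case EBUSY:        return 16;
    case EEXIST:       return 17;
    case ENODEV:       return 19;
    case ENOTDIR:      return 20;
    case EISDIR:       return 21;
    case EINVAL:       return 22;
    case ENFILE:       return 23;
    case EMFILE:       return 24;
    case EFBIG:        return 27;
    case ENOSPC:       return 28;
    case ESPIPE:       return 29;
    case EROFS:        return 30;
    case ENAMETOOLONG: return 91;
    default:           return FILEIO_EUNKNOWN;
    }
}
```

Matching on the symbolic names rather than on numbers matters because hosts are free to number their errors differently. ENAMETOOLONG is a typical case where the host value usually differs from the protocol value 91.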
Example sequence of a write call, file descriptor 3, buffer is at target address 0x1234, 6 bytes should be written:
    <- Fwrite,3,1234,6
    request memory read from target
    -> m1234,6
    <- XXXXXX
    return "6 bytes written"
    -> F6
Example sequence of a read call, file descriptor 3, buffer is at target address 0x1234, 6 bytes should be read:
    <- Fread,3,1234,6
    request memory write to target
    -> X1234,6:XXXXXX
    return "6 bytes read"
    -> F6
Example sequence of a read call, call fails on the host due to invalid file descriptor (EBADF):
    <- Fread,3,1234,6
    -> F-1,9
Example sequence of a read call, user presses Ctrl-C before syscall on host is called:
    <- Fread,3,1234,6
    -> F-1,4,C
    <- T02
Example sequence of a read call, user presses Ctrl-C after syscall on host is called:
    <- Fread,3,1234,6
    -> X1234,6:XXXXXX
    <- T02