4.2 Restrictions

Throughout the kernel there are access restrictions relating to jailed processes. Usually, these restrictions simply check whether the process is jailed and, if so, return an error. For example:

if (p->p_prison) 
        return EPERM;

4.2.1 SysV IPC

System V IPC is based on messages. Processes can send each other these messages, which tell them how to act. The functions which deal with messages are: msgsys, msgctl, msgget, msgsnd and msgrcv. Earlier, I mentioned that there were certain sysctls you could turn on or off in order to affect the behavior of Jail. One of these sysctls was jail_sysvipc_allowed. On most systems, this sysctl is set to 0. If it were set to 1, it would defeat the whole purpose of having a jail; privileged users from within the jail would be able to affect processes outside of the environment. The difference between a message and a signal is that a signal consists only of the signal number, whereas a message also carries data.

/usr/src/sys/kern/sysv_msg.c:

In each of these system calls, there is this conditional:

/usr/src/sys/kern/sysv_msg.c:
if (!jail_sysvipc_allowed && p->p_prison != NULL)
        return (ENOSYS);
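
The effect of this conditional can be tried out in userland. The following is only a sketch, not kernel code: struct model_prison, struct model_proc, model_jail_sysvipc_allowed and model_sysvipc_check() are hypothetical stand-ins for the kernel's prison and proc structures, the variable behind the sysctl, and the check at the top of each message system call.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct prison. */
struct model_prison {
        int pr_id;                      /* unused here; keeps the struct non-empty */
};

/* Hypothetical stand-in for struct proc; only p_prison matters here. */
struct model_proc {
        struct model_prison *p_prison;  /* NULL when the process is not jailed */
};

/* Models the kernel variable behind the sysctl; 0 is the usual setting. */
static int model_jail_sysvipc_allowed = 0;

/*
 * Returns ENOSYS when a jailed process makes a System V IPC call while
 * the sysctl is off, and 0 when the call would be allowed to proceed.
 */
static int
model_sysvipc_check(const struct model_proc *p)
{
        if (!model_jail_sysvipc_allowed && p->p_prison != NULL)
                return (ENOSYS);
        return (0);
}
```

With the model variable left at 0, a process whose p_prison is set gets ENOSYS while an unjailed process gets 0; setting the variable to 1 lets both through, which is why turning the sysctl on weakens the jail.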

Semaphore system calls allow processes to synchronize execution by doing a set of operations atomically on a set of semaphores. Basically, semaphores provide another way for processes to lock resources. However, a process waiting on a semaphore that is in use will sleep until the resources are relinquished. The following semaphore system calls are blocked inside a jail: semsys, semget, semctl and semop.

/usr/src/sys/kern/sysv_sem.c:

System V IPC allows for processes to share memory. Processes can communicate directly with each other by sharing parts of their virtual address space and then reading and writing data stored in the shared memory. These system calls are blocked within a jailed environment: shmdt, shmat, oshmctl, shmctl, shmget, and shmsys.

/usr/src/sys/kern/sysv_shm.c:

4.2.2 Sockets

Jail treats the socket(2) system call and related lower-level socket functions in a special manner. In order to determine whether a certain socket is allowed to be created, it first checks whether the sysctl jail_socket_unixiproute_only is set. If it is set, a jailed process may only create sockets whose family is PF_LOCAL, PF_INET or PF_ROUTE. For any other family, an error is returned.

/usr/src/sys/kern/uipc_socket.c:
int socreate(dom, aso, type, proto, p) 
... 
register struct protosw *prp; 
... 
{
        if (p->p_prison && jail_socket_unixiproute_only &&
            prp->pr_domain->dom_family != PF_LOCAL &&
            prp->pr_domain->dom_family != PF_INET &&
            prp->pr_domain->dom_family != PF_ROUTE)
                return (EPROTONOSUPPORT);
...
}
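
The family test can be exercised outside the kernel as well. This is a minimal sketch under stated assumptions: model_socket_unixiproute_only and model_socreate_check() are hypothetical names, and the jailed state is passed in as a plain integer instead of being read from a proc structure. The PF_* constants come from the system's own <sys/socket.h>.

```c
#include <errno.h>
#include <sys/socket.h>

/* Models the kernel variable behind the sysctl (hypothetical name). */
static int model_socket_unixiproute_only = 1;

/*
 * jailed: nonzero when the calling process is in a jail.
 * family: protocol family requested through socket(2), e.g. PF_INET.
 * Returns 0 if the socket may be created, EPROTONOSUPPORT otherwise.
 */
static int
model_socreate_check(int jailed, int family)
{
        if (jailed && model_socket_unixiproute_only &&
            family != PF_LOCAL &&
            family != PF_INET &&
            family != PF_ROUTE)
                return (EPROTONOSUPPORT);
        return (0);
}
```

For example, a jailed caller asking for PF_INET6 is refused with EPROTONOSUPPORT, while the same request from an unjailed caller, or from a jailed caller once the model variable is cleared, passes the check.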

4.2.3 Berkeley Packet Filter

The Berkeley Packet Filter provides a raw interface to data link layers in a protocol-independent fashion. The function bpfopen() opens an Ethernet device. A conditional in this function prevents any jailed process from using it.

/usr/src/sys/net/bpf.c: 
static int bpfopen(dev, flags, fmt, p) 
... 
{
        if (p->p_prison) 
                return (EPERM);
...
}

4.2.4 Protocols

There are certain protocols which are very common, such as TCP, UDP, IP and ICMP. IP and ICMP are on the same level: the network layer 2. Certain precautions prevent a jailed process from binding a protocol to a certain port; they apply only if the nam parameter is set. nam is a pointer to a sockaddr structure, which describes the address on which to bind the service. A more exact definition is that sockaddr "may be used as a template for referring to the identifying tag and length of each address"[2]. In the function in_pcbbind, sin is a pointer to a sockaddr_in structure, which contains the port, address, length and domain family of the socket which is to be bound. Basically, this prevents any jailed process from specifying the domain family.

/usr/src/sys/netinet/in_pcb.c:
int in_pcbbind(inp, nam, p)
...
        struct sockaddr *nam; 
        struct proc *p; 
{
        ... 
        struct sockaddr_in *sin;
        ...
        if (nam) {
                sin = (struct sockaddr_in *)nam;
                ...
                if (sin->sin_addr.s_addr != INADDR_ANY)
                        if (prison_ip(p, 0, &sin->sin_addr.s_addr))
                                return (EINVAL);
                ...
        }
...
}

You might be wondering what the function prison_ip() does. prison_ip() is given three arguments: the current process (represented by p), any flags, and an IP address. It returns 1 if the process is jailed and the IP address does not belong to its jail, or 0 otherwise. If the address is INADDR_ANY, it is replaced with the jail's own IP address. As you can see from the code, if the address does not belong to the jail, the protocol is not allowed to bind to it.

/usr/src/sys/kern/kern_jail.c:
int prison_ip(struct proc *p, int flag, u_int32_t *ip) {
        u_int32_t tmp;

        if (!p->p_prison)
                return (0);
        if (flag)
                tmp = *ip;
        else
                tmp = ntohl(*ip);

        if (tmp == INADDR_ANY) {
                if (flag)
                        *ip = p->p_prison->pr_ip;
                else
                        *ip = htonl(p->p_prison->pr_ip);
                return (0);
        }

        if (p->p_prison->pr_ip != tmp)
                return (1);
        return (0);
}
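
To see how the three outcomes of this function play out, here is a userland sketch of the same logic. It is an illustration only: struct model_prison, struct model_proc and model_prison_ip() are hypothetical stand-ins for the kernel structures and for prison_ip(), and the jail address is assumed to be stored in host byte order, as the ntohl() and htonl() calls in the kernel function suggest.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct prison. */
struct model_prison {
        uint32_t pr_ip;                 /* jail's address, host byte order */
};

/* Hypothetical stand-in for struct proc. */
struct model_proc {
        struct model_prison *p_prison;  /* NULL when the process is not jailed */
};

/*
 * flag != 0: *ip is in host byte order; flag == 0: network byte order.
 * Returns 1 only when a jailed process names an address other than the
 * jail's own. INADDR_ANY is rewritten in place to the jail's address.
 */
static int
model_prison_ip(const struct model_proc *p, int flag, uint32_t *ip)
{
        uint32_t tmp;

        if (p->p_prison == NULL)
                return (0);             /* unjailed: no restriction */
        tmp = flag ? *ip : ntohl(*ip);
        if (tmp == INADDR_ANY) {
                /* Wildcard bind: narrow it to the jail's address. */
                *ip = flag ? p->p_prison->pr_ip : htonl(p->p_prison->pr_ip);
                return (0);
        }
        return (p->p_prison->pr_ip != tmp);
}
```

Taking a jail whose address is 192.0.2.10 as an example, a wildcard address is rewritten to 192.0.2.10 and accepted, 192.0.2.10 itself is accepted unchanged, and any other address such as 192.0.2.99 makes the function return 1 so that the caller can refuse the bind.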

Jailed users are not allowed to bind services to an IP address which does not belong to the jail. The restriction is also written within the function in_pcbbind:

/usr/src/sys/netinet/in_pcb.c:
        if (nam) {
                ...
                lport = sin->sin_port;
                ...
                if (lport) {
                        ...
                        if (p && p->p_prison)
                                prison = 1;
                        if (prison &&
                            prison_ip(p, 0, &sin->sin_addr.s_addr))
                                return (EADDRNOTAVAIL);

4.2.5 Filesystem

Even root users within the jail are not allowed to set any file flags, such as the immutable, append-only, and no-unlink flags, if the securelevel is greater than 0.

/usr/src/sys/ufs/ufs/ufs_vnops.c:
int ufs_setattr(ap)
        ...
{
        ...
        if ((cred->cr_uid == 0) && (p->p_prison == NULL)) {
                if ((ip->i_flags
                    & (SF_NOUNLINK | SF_IMMUTABLE | SF_APPEND)) &&
                    securelevel > 0)
                        return (EPERM);
                ...
        }
        ...
}

This, and other documents, can be downloaded from ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/.

For questions about FreeBSD, read the documentation before contacting <questions@FreeBSD.org>.
For questions about this documentation, e-mail <doc@FreeBSD.org>.