Chapter 4. Development

Table of Contents

General
Why does lseek always fail?
What are "precomps"?
That's great, but how do I fix my precomps errors under Darwin?
How do I create a character device for Darwin?
Where are the major device numbers defined in Darwin?
How do I build fat (multiple architecture) binaries?
configure can't find my platform! How do I make it recognize Darwin?
How do I make lookupd invalidate its cache?
What is prebinding dynamic libraries?
From a network interface driver, how do I remove routes through that interface?
Does Darwin support AltiVec extensions?
What's the difference between all the kernel log functions?
What does the debug kernel argument do?
How can I determine how many processors are present?
How do I port shared libraries to Darwin?
How do I get a symbolic backtrace with my Kernel Module?
Why doesn't the link-editor find symbols in my static libraries?
How do I tell the difference between Darwin and OS X?
What is kld?
I'd like to become familiar with PowerPC assembly language. Any recommendations on good books for this topic?
Building the Kernel (xnu)
How do I build xnu, the Darwin kernel?
How do I build a "fat" kernel (i386 and ppc kernel)?
How do I install my new kernel?
My new kernel doesn't boot, what do I do now?
How do I configure my kernel?
After building xnu, is there anything else I should do?
Performance tuning
How can Shark be used to profile the code in a driver that exists solely in supervisor space?
When I run Shark on my code, it can't look up the symbols in my driver. Every line in the profile has the form "NO_SYMBOL_0xSOMEHEXADDR". How do I fix this?
Double-clicking on a trouble spot in Shark doesn't bring up the offending source code. How do I fix this?

General

Why does lseek always fail?

Under Darwin, you must always #include <unistd.h>; otherwise the compiler has no prototype for lseek and won't know how to handle the offset parameter. Specifically, off_t is a 64-bit number, but without the prototype the compiler assumes it to be a 32-bit number. Compiling with the -Wall flag will give you a warning about this.

What are "precomps"?

I had problems building some Darwin kernel modules, and it was always complaining about TrustedPrecomps.txt and would always die. Fortunately, Stan Shebs of Apple helped me out, and here was his reply:

"Precomps" are the version of precompiled headers implemented originally by NeXT. They're built and processed using a special preprocessor cpp-precomp that is like a normal C/ObjC preprocessor, except that it can read (and write) precompiled headers, which are binary files with predigested tokens and dependency info.

The theory of its operation is that it a) processes the precomp faster than the equivalent set of text files, and b) it knows what variables, decls, etc are actually used in the sources, and omits the decls of anything not actually used. So cpp-precomp can take in 100,000 lines of header plus 500 lines of source file, but put out maybe 2,000 lines of headers plus the 500 lines of source, which makes the main pass of the compiler go by more quickly.

In practice, compilation of Cocoa programs tends to speed up by a factor of four when using precomps, which is why we like them. On the downside, the dependency analysis requires full parsing and some semantic analysis, and that's why it's only been done for C and Objective-C; no C++.

Unfortunately, cpp-precomp is not included in Darwin. Fortunately, the compiler has been tweaked to not require it, although if you look closely at cc -v, you'll see a -smart option being passed to cpp, which is a cpp-precomp option that cpp has been modified to ignore.

Another cpp-precomp option is -precomp-trustfile, which specifies a file containing names of precomps that have already been validated (to make sure that real .h files aren't newer than the precompiled versions), thus making compiles go a little faster. As with -smart, regular cpp has no use for this option.

That's great, but how do I fix my precomps errors under Darwin?

Darwin doesn't support precomps, but some of the Makefiles distributed with 1.0.2 and earlier specify that precomps be used. You can fix this in either of two ways:

  • You can make a Makefile.postamble with the following line:

    PRECOMP_CFLAGS =

    This will cause no precomp flags to be passed to the compiler. This method was suggested by Ryan Nelson.

  • You can edit the original Makefile that is shipped with the system, and fix it globally. You'll have to edit /System/Developer/Makefiles/pb_makefiles/flags.make and comment out any line containing -precomp.

How do I create a character device for Darwin?

Good question.

Where are the major device numbers defined in Darwin?

Darwin defines all of its major device numbers in xnu/bsd/dev/{arch}/conf.c.

How do I build fat (multiple architecture) binaries?

The -arch flag specifies the desired architecture that the binary is to be built for. So, to build a binary for multiple architectures, specify multiple -arch arguments:

cc -arch i386 -arch ppc -o foo foo.c

In order to use this, you'll need a compiler capable of generating code for the architecture you're targeting, a fat dyld, crt1.o, libSystem.B.dylib and fat versions of whatever libraries you're going to link against. Mac OS X does not include fat versions of any binaries, and the gcc included with the DevTools is only capable of targeting the ppc architecture.

configure can't find my platform! How do I make it recognize Darwin?

The default config.guess and config.sub scripts that people ship in autoconf-enabled packages don't know about Darwin, so you need to replace them with versions that do recognize it. You can find updated versions of these scripts at gnu.org.

How do I make lookupd invalidate its cache?

Luke Howard sent in this bit of code that will make lookupd invalidate its cache:

int i, proc;
char str[32];
mach_port_t port;

port = _lookupd_port(0);
_lookup_link(port, "_invalidatecache", &proc);
_lookup_one(port, proc, NULL, 0, &str, &i);

Under Mac OS X, you'll have to use port_t instead of mach_port_t.

What is prebinding dynamic libraries?

Prebinding your dynamic libraries makes them depend on the system's current library versions and locations. This makes the executable faster when it is starting, but library versions and locations are liable to change between OS versions. When library versions and locations change, the prebinding information is no longer valid and is ignored. From that point on, the prebinding information is just taking up space.

You can, however, update your prebinding information with the redo_prebinding command.

From a network interface driver, how do I remove routes through that interface?

Here is a little snippet of code from Stefan Arentz that will delete all routes through an interface so that it can be safely unloaded from the system:

for (ifa = ifp->if_addrhead.tqh_first; ifa; ifa = ifa->ifa_link.tqe_next) {
  if (ifa->ifa_addr->sa_family == AF_INET) {
    log(LOG_INFO, "module_stop: removing routes to %s%d\n", ifp->if_name, ifp->if_unit);
    rtinit(ifa, RTM_DELETE, RTF_HOST);
    rtinit(ifa, RTM_DELETE, 0);
  }
}

Place this code just before the point where protocols and interfaces are detached.

Does Darwin support AltiVec extensions?

Yes, the gcc shipped with Darwin can build AltiVec code, which is enabled with the -fvec switch. Stan Shebs is working on merging this back into the FSF gcc tree, but it is a tremendous amount of work.

What's the difference between all the kernel log functions?

Justin Walker provided an excellent explanation on the development list:

Use of the various kernel print mechanisms: log(), kprintf(), printf()

SYSLOG: The kernel function log() posts messages to a kernel buffer where they are picked up by the syslogd daemon and in turn written to a destination specified in /etc/syslog.conf. The mechanisms are discussed in man syslog, man syslogd, and man syslog.conf.

KPRINTF: The kernel function kprintf() will normally be disabled, but can be enabled with an argument to the boot command (see below). If enabled, this function will display its output on the system's serial line (modem port).

PRINTF: The kernel function printf() differs from kprintf() in the way the printouts are handled. "Printf" strings will be placed in the system log file (/var/log/system.log) in all cases. If the kernel boots in "verbose" mode, they will display on the video console until the window server comes up. When the user has "logged in" as 'console', or is otherwise not using the window manager, the strings will also display on the video monitor.

What does the debug kernel argument do?

The following values affect how debugging support functions on the system. They can be set with the nvram command or in the "additional args" window of SystemDisk. Note that on later G3 and G4 systems you modify the boot-args variable, and on other systems the boot-command variable, whether you use the nvram command or the OpenFirmware console at boot time.

  • bit 0 - 1 early breakpoint

  • bit 1 - 1 enables debug print

  • bit 2 - 1 enables NMI (cmd-power)

  • bit 3 - 1 enables kprintfs

These bits are set in the variable debug, for example as follows (as root):

nvram boot-command='0 bootr debug=VVV'

assuming that boot-command was initially 0 bootr

Early Break Point: this will halt the system after the kernel is loaded, and as soon after the initial operation as the network device can support the debugger.

Debug Print: this enables debugger output to overwrite the window-manager's screen. Normally, output from the kernel is not displayed when the window manager is active. Note that this does not affect the output of printf(), which is only visible in "console mode".

NMI (cmd-power): this key sequence will invoke the debugger, causing it to await a debugger connection, or signal the debugger if already connected. As with all kernel debugging, there can be circumstances where this is ineffective.

kprintfs: normally, these are not enabled. When enabled, the output is sent to the serial line (modem port), at a speed of 38400 baud.

How can I determine how many processors are present?

You can use the Mach host_info call, declared in <mach/mach_host.h>.

How do I port shared libraries to Darwin?

Unfortunately, at the moment, there is no straightforward way. Since the de facto standard dl* functions are based on ELF and Darwin uses Mach-O binaries, there is no simple port. Darwin uses the NeXT dyld dynamic link editor, and there is a man page with plenty of good information. There are also higher level functions that are probably closer to what you want (NSObjectFileImage for example). Unfortunately, the man pages for these functions are not in the Darwin CVS tree.

For an example, you can look at the Python dynamic loading code, specifically the Python/dynload_next.c file.

How do I get a symbolic backtrace with my Kernel Module?

You need to generate a symbol file for the module, which GDB will read for presenting the symbolic information. Justin Walker explains:

kmodsyms -k kernel -o output-symbol-file kmod@loadaddr

Will dump a file (named whatever you chose as the arg to the -o flag). The kernel value needs to be (a copy of) the kernel running on your target system, and loadaddr is the address at which the module is loaded (e.g., 0x40cc000, below).

I believe that gdb itself doesn't have quite the same rules for relocation that kmods have, which is why we have the kmodsyms program. This can be executed either on the target system (and you then copy the symbol file to your debugging system) or on the debugging system (which must have a copy of the target system's kernel). Note that the kernel, in this case, does *not* have to be running on the debugging system; you just need a copy of the correct kernel on whatever system you run kmodsyms.

Why doesn't the link-editor find symbols in my static libraries?

If you're using Makefiles developed in a primarily Linux- (or Solaris-) based environment, you may have link commands that look like this:

cc -Lsome/path -Lother/path -lfoo -lbar... -o program file1.o file2.o...

To work properly across Linux, Darwin and other Unices, the link command needs to be rewritten like this:

cc -o program file1.o file2.o... -Lsome/path -Lother/path -lfoo -lbar...

The reason for putting the libraries last is that the Darwin link-editor loads files and libraries in the exact order given on the command line, and, in the case of static libraries, examines them once, looking for symbols that are undefined *at the moment the library is loaded*. If it doesn't find any currently undefined symbols in the library, it forgets all about the library.

In the case of the first cc command above, the link-editor hasn't looked at any of your .o files when it looks at the static libraries, so *no* symbols are currently undefined, and *all* of the libraries are forgotten. The link-editor needs to encounter your .o files before the static libraries, as in the second cc command, to function properly and find all the necessary symbols in your static libraries.

Contributed by Scott Hallock

How do I tell the difference between Darwin and OS X?

The short answer is that you shouldn't try. You should check for the specific functionality you're looking for. Just because the kernel used is from Mac OS X doesn't tell you if the libraries you're looking for are there.

What is kld?

kld is the kernel loader, a complete loader/linker in the kernel for loading modules early in the boot process.

I'd like to become familiar with PowerPC assembly language. Any recommendations on good books for this topic?

There are several books and papers available that cover the basics of PowerPC assembly. The following should get you started:

Searching for "powerpc assembly guide" can lead to many additional sites containing worthwhile information.