6.16 Using and Debugging DragonFly ACPI

Written by Nate Lawson. With contributions from Peter Schultz and Tom Rhodes.

ACPI is a fundamentally new way of discovering devices, managing power usage, and providing standardized access to various hardware previously managed by the BIOS. Progress is being made toward ACPI working on all systems, but bugs in some motherboards' ACPI Machine Language (AML) bytecode, incompleteness in DragonFly's kernel subsystems, and bugs in the Intel ACPI-CA interpreter continue to appear.

This document is intended to help you assist the DragonFly ACPI maintainers in identifying the root cause of problems you observe and debugging and developing a solution. Thanks for reading this and we hope we can solve your system's problems.

6.16.1 Submitting Debugging Information

Note: Before submitting a problem, be sure you are running the latest BIOS version and, if available, embedded controller firmware version.

For those of you that want to submit a problem right away, please send the following information to bugs

6.16.2 Background

ACPI is present in all modern computers that conform to the ia32 (x86), ia64 (Itanium), and amd64 (AMD) architectures. The full standard has many features including CPU performance management, power planes control, thermal zones, various battery systems, embedded controllers, and bus enumeration. Most systems implement less than the full standard. For instance, a desktop system usually only implements the bus enumeration parts while a laptop might have cooling and battery management support as well. Laptops also have suspend and resume, with their own associated complexity.

An ACPI-compliant system has various components. The BIOS and chipset vendors provide various fixed tables (e.g., FADT) in memory that specify things like the APIC map (used for SMP), config registers, and simple configuration values. Additionally, a table of bytecode (the Differentiated System Description Table DSDT) is provided that specifies a tree-like name space of devices and methods.

The ACPI driver must parse the fixed tables, implement an interpreter for the bytecode, and modify device drivers and the kernel to accept information from the ACPI subsystem. For DragonFly, Intel has provided an interpreter (ACPI-CA) that is shared with Linux and NetBSD®. The path to the ACPI-CA source code is src/sys/contrib/dev/acpica-unix-YYYYMMDD, where YYYYMMDD is the release date of the ACPI-CA source code. The glue code that allows ACPI-CA to work on DragonFly is in src/sys/dev/acpica5/Osd. Finally, drivers that implement various ACPI devices are found in src/sys/dev/acpica5, and architecture-dependent code resides in /sys/arch/acpica5.

6.16.3 Common Problems

For ACPI to work correctly, all the parts have to work correctly. Here are some common problems, in order of frequency of appearance, and some possible workarounds or fixes.

6.16.3.1 Suspend/Resume

ACPI has three suspend to RAM (STR) states, S1-S3, and one suspend to disk state (STD), called S4. S5 is ``soft off'' and is the normal state your system is in when plugged in but not powered up. S4 can actually be implemented two separate ways. S4BIOS is a BIOS-assisted suspend to disk. S4OS is implemented entirely by the operating system.

Start by checking sysctl hw.acpi for the suspend-related items. Here are the results for my Thinkpad:

hw.acpi.supported_sleep_state: S3 S4 S5
hw.acpi.s4bios: 0

This means that I can use acpiconf -s to test S3, S4OS, and S5. If s4bios was one (1), I would have S4BIOS support instead of S4 OS.

When testing suspend/resume, start with S1, if supported. This state is most likely to work since it doesn't require much driver support. No one has implemented S2 but if you have it, it's similar to S1. The next thing to try is S3. This is the deepest STR state and requires a lot of driver support to properly reinitialize your hardware. If you have problems resuming, feel free to email the bugs list but do not expect the problem to be resolved since there are a lot of drivers/hardware that need more testing and work.

To help isolate the problem, remove as many drivers from your kernel as possible. If it works, you can narrow down which driver is the problem by loading drivers until it fails again. Typically binary drivers like nvidia.ko, X11 display drivers, and USB will have the most problems while Ethernet interfaces usually work fine. If you can load/unload the drivers ok, you can automate this by putting the appropriate commands in /etc/rc.suspend and /etc/rc.resume. There is a commented-out example for unloading and loading a driver. Try setting hw.acpi.reset_video to zero (0) if your display is messed up after resume. Try setting longer or shorter values for hw.acpi.sleep_delay to see if that helps.

Another thing to try is load a recent Linux distribution with ACPI support and test their suspend/resume support on the same hardware. If it works on Linux, it's likely a DragonFly driver problem and narrowing down which driver causes the problems will help us fix the problem. Note that the ACPI maintainers do not usually maintain other drivers (e.g sound, ATA, etc.) so any work done on tracking down a driver problem should probably eventually be posted to the bugs list and mailed to the driver maintainer. If you are feeling adventurous, go ahead and start putting some debugging printf(3)s in a problematic driver to track down where in its resume function it hangs.

Finally, try disabling ACPI and enabling APM instead. If suspend/resume works with APM, you may be better off sticking with APM, especially on older hardware (pre-2000). It took vendors a while to get ACPI support correct and older hardware is more likely to have BIOS problems with ACPI.

6.16.3.2 System Hangs (temporary or permanent)

Most system hangs are a result of lost interrupts or an interrupt storm. Chipsets have a lot of problems based on how the BIOS configures interrupts before boot, correctness of the APIC (MADT) table, and routing of the System Control Interrupt (SCI).

Interrupt storms can be distinguished from lost interrupts by checking the output of vmstat -i and looking at the line that has acpi0. If the counter is increasing at more than a couple per second, you have an interrupt storm. If the system appears hung, try breaking to DDB (CTRL+ALT+ESC on console) and type show interrupts.

Your best hope when dealing with interrupt problems is to try disabling APIC support with hint.apic.0.disabled="1" in loader.conf.

6.16.3.3 Panics

Panics are relatively rare for ACPI and are the top priority to be fixed. The first step is to isolate the steps to reproduce the panic (if possible) and get a backtrace. Follow the advice for enabling options DDB and setting up a serial console (see Section 17.6.5.3) or setting up a dump(8) partition. You can get a backtrace in DDB with tr. If you have to handwrite the backtrace, be sure to at least get the lowest five (5) and top five (5) lines in the trace.

Then, try to isolate the problem by booting with ACPI disabled. If that works, you can isolate the ACPI subsystem by using various values of debug.acpi.disable. See the acpi(4) manual page for some examples.

6.16.3.4 System Powers Up After Suspend or Shutdown

First, try setting hw.acpi.disable_on_poweroff=``0'' in loader.conf(5). This keeps ACPI from disabling various events during the shutdown process. Some systems need this value set to ``1'' (the default) for the same reason. This usually fixes the problem of a system powering up spontaneously after a suspend or poweroff.

6.16.3.5 Other Problems

If you have other problems with ACPI (working with a docking station, devices not detected, etc.), please email a description to the mailing list as well; however, some of these issues may be related to unfinished parts of the ACPI subsystem so they might take a while to be implemented. Please be patient and prepared to test patches we may send you.

6.16.4 ASL, acpidump, and IASL

The most common problem is the BIOS vendors providing incorrect (or outright buggy!) bytecode. This is usually manifested by kernel console messages like this:

ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.LPC0.FIGD._STA] \\
(Node 0xc3f6d160), AE_NOT_FOUND

Often, you can resolve these problems by updating your BIOS to the latest revision. Most console messages are harmless but if you have other problems like battery status not working, they're a good place to start looking for problems in the AML. The bytecode, known as AML, is compiled from a source language called ASL. The AML is found in the table known as the DSDT. To get a copy of your ASL, use acpidump(8). You should use both the -t (show contents of the fixed tables) and -d (disassemble AML to ASL) options. See the Submitting Debugging Information section for an example syntax.

The simplest first check you can do is to recompile your ASL to check for errors. Warnings can usually be ignored but errors are bugs that will usually prevent ACPI from working correctly. To recompile your ASL, issue the following command:

# iasl your.asl

6.16.5 Fixing Your ASL

In the long run, our goal is for almost everyone to have ACPI work without any user intervention. At this point, however, we are still developing workarounds for common mistakes made by the BIOS vendors. The Microsoft interpreter (acpi.sys and acpiec.sys) does not strictly check for adherence to the standard, and thus many BIOS vendors who only test ACPI under Windows never fix their ASL. We hope to continue to identify and document exactly what non-standard behavior is allowed by Microsoft's interpreter and replicate it so DragonFly can work without forcing users to fix the ASL. As a workaround and to help us identify behavior, you can fix the ASL manually. If this works for you, please send a diff(1) of the old and new ASL so we can possibly work around the buggy behavior in ACPI-CA and thus make your fix unnecessary.

Here is a list of common error messages, their cause, and how to fix them:

6.16.5.1 _OS dependencies

Some AML assumes the world consists of various Windows versions. You can tell DragonFly to claim it is any OS to see if this fixes problems you may have. An easy way to override this is to set hw.acpi.osname=``Windows 2001'' in /boot/loader.conf or other similar strings you find in the ASL.

6.16.5.2 Missing Return statements

Some methods do not explicitly return a value as the standard requires. While ACPI-CA does not handle this, DragonFly has a workaround that allows it to return the value implicitly. You can also add explicit Return statements where required if you know what value should be returned. To force iasl to compile the ASL, use the -f flag.

6.16.5.3 Overriding the Default AML

After you customize your.asl, you will want to compile it, run:

# iasl your.asl

You can add the -f flag to force creation of the AML, even if there are errors during compilation. Remember that some errors (e.g., missing Return statements) are automatically worked around by the interpreter.

DSDT.aml is the default output filename for iasl. You can load this instead of your BIOS's buggy copy (which is still present in flash memory) by editing /boot/loader.conf as follows:

acpi_dsdt_load="YES"
acpi_dsdt_name="/boot/DSDT.aml"

Be sure to copy your DSDT.aml to the /boot directory.

6.16.6 Getting Debugging Output From ACPI

The ACPI driver has a very flexible debugging facility. It allows you to specify a set of subsystems as well as the level of verbosity. The subsystems you wish to debug are specified as ``layers'' and are broken down into ACPI-CA components (ACPI_ALL_COMPONENTS) and ACPI hardware support (ACPI_ALL_DRIVERS). The verbosity of debugging output is specified as the ``level'' and ranges from ACPI_LV_ERROR (just report errors) to ACPI_LV_VERBOSE (everything). The ``level'' is a bitmask so multiple options can be set at once, separated by spaces. In practice, you will want to use a serial console to log the output if it is so long it flushes the console message buffer.

Debugging output is not enabled by default. To enable it, add options ACPI_DEBUG to your kernel config if ACPI is compiled into the kernel. You can add ACPI_DEBUG=1 to your /etc/make.conf to enable it globally. If it is a module, you can recompile just your acpi.ko module as follows:

# cd /sys/dev/acpica5
&& make clean &&
make ACPI_DEBUG=1

Install acpi.ko in /boot/kernel and add your desired level and layer to loader.conf. This example enables debug messages for all ACPI-CA components and all ACPI hardware drivers (CPU, LID, etc.) It will only output error messages, the least verbose level.

debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
debug.acpi.level="ACPI_LV_ERROR"

If the information you want is triggered by a specific event (say, a suspend and then resume), you can leave out changes to loader.conf and instead use sysctl to specify the layer and level after booting and preparing your system for the specific event. The sysctls are named the same as the tunables in loader.conf.

6.16.7 References

More information about ACPI may be found in the following locations:

Contact the Documentation mailing list for comments, suggestions and questions about this document.