Those who know don't talk.
Those who talk don't know.
Sometimes, PHP "as is" simply isn't enough. Although these cases are rare for the average user, professional applications will soon push PHP to the edge of its capabilities, in terms of either speed or functionality. New functionality cannot always be implemented natively, because of language restrictions and the inconvenience of carrying around a huge library of default code appended to every single script, so another method is needed to overcome these shortcomings in PHP.
As soon as this point is reached, it's time to touch the heart of PHP and take a look at its core, the C code that makes PHP go.
This information is currently rather outdated; parts of it only cover early stages of the Zend Engine 1.0 API as it was used in early versions of PHP 4.
More recent information may be found in the various README files that come with the PHP source and in the Internals section on the Zend website.
"Extending PHP" is easier said than done. PHP has evolved to a full-fledged tool consisting of a few megabytes of source code, and to hack a system like this quite a few things have to be learned and considered. When structuring this chapter, we finally decided on the "learn by doing" approach. This is not the most scientific and professional approach, but the method that's the most fun and gives the best end results. In the following sections, you'll learn quickly how to get the most basic extensions to work almost instantly. After that, you'll learn about Zend's advanced API functionality. The alternative would have been to try to impart the functionality, design, tips, tricks, etc. as a whole, all at once, thus giving a complete look at the big picture before doing anything practical. Although this is the "better" method, as no dirty hacks have to be made, it can be very frustrating as well as energy- and time-consuming, which is why we've decided on the direct approach.
Note that even though this chapter tries to impart as much knowledge as possible about the inner workings of PHP, it's impossible to really give a complete guide to extending PHP that works 100% of the time in all cases. PHP is such a huge and complex package that its inner workings can only be understood if you make yourself familiar with it by practicing, so we encourage you to work with the source.
The name Zend refers to the language engine, PHP's core. The term PHP refers to the complete system as it appears from the outside. This might sound a bit confusing at first, but it's not that complicated (see below). To implement a Web script interpreter, you need three parts:
The interpreter part analyzes the input code, translates it, and executes it.
The functionality part implements the functionality of the language (its functions, etc.).
The interface part talks to the Web server, etc.
The following sections discuss where PHP can be extended and how it's done.
As shown above, PHP can be extended primarily at three points: external modules, built-in modules, and the Zend engine. The following sections discuss these options.
External modules can be loaded at script runtime using the function dl(). This function loads a shared object from disk and makes its functionality available to the script to which it's being bound. After the script is terminated, the external module is discarded from memory. This method has both advantages and disadvantages, as described in the following table:
Advantages | Disadvantages |
External modules don't require recompiling of PHP. | The shared objects need to be loaded every time a script is being executed (every hit), which is very slow. |
The size of PHP remains small by "outsourcing" certain functionality. | External additional files clutter up the disk. |
 | Every script that wants to use an external module's functionality has to specifically include a call to dl(), or the extension directive in php.ini needs to be modified (which is not always a suitable solution). |
Third parties might consider using the extension directive in php.ini to add their own external modules to PHP. These external modules are completely detached from the main package, which is a very handy feature in commercial environments. Commercial distributors can simply ship disks or archives containing only their additional modules, without the need to create fixed and solid PHP binaries that don't allow other modules to be bound to them.
Built-in modules are compiled directly into PHP and carried around with every PHP process; their functionality is instantly available to every script that's being run. Like external modules, built-in modules have advantages and disadvantages, as described in the following table:
Advantages | Disadvantages |
No need to load the module specifically; the functionality is instantly available. | Changes to built-in modules require recompiling of PHP. |
No external files clutter up the disk; everything resides in the PHP binary. | The PHP binary grows and consumes more memory. |
Of course, extensions can also be implemented directly in the Zend engine. This strategy is good if you need a change in the language behavior or require special functions to be built directly into the language core. In general, however, modifications to the Zend engine should be avoided. Changes here result in incompatibilities with the rest of the world, and hardly anyone will ever adapt to specially patched Zend engines. Modifications can't be detached from the main PHP sources and are overwritten by the next update from the "official" source repositories. Therefore, this method is generally considered bad practice and, due to its rarity, is not covered in this book.
Note: Prior to working through the rest of this chapter, you should retrieve clean, unmodified source trees of your favorite Web server and of PHP. We're working with Apache (available at http://httpd.apache.org/) and, of course, with PHP (available at http://www.php.net/ - does it need to be said?).
Make sure that you can compile a working PHP environment by yourself! We won't go into this issue here, however, as you should already have this most basic ability when studying this chapter.
Before we start discussing code issues, you should familiarize yourself with the source tree to be able to quickly navigate through PHP's files. This is a must-have ability to implement and debug extensions.
The following table describes the contents of the major directories.
Directory | Contents |
php-src | Main PHP source files and main header files; here you'll find all of PHP's API definitions, macros, etc. (important). Everything else is below this directory. |
php-src/ext | Repository for dynamic and built-in modules; by default, these are the "official" PHP modules that have been integrated into the main source tree. From PHP 4.0, it's possible to compile these standard extensions as dynamic loadable modules (at least, those that support it). |
php-src/main | This directory contains the main php macros and definitions. (important) |
php-src/pear | Directory for the PHP Extension and Application Repository. This directory contains core PEAR files. |
php-src/sapi | Contains the code for the different server abstraction layers. |
TSRM | Location of the "Thread Safe Resource Manager" (TSRM) for Zend and PHP. |
ZendEngine2 | Location of the Zend Engine files; here you'll find all of Zend's API definitions, macros, etc. (important). |
Discussing all the files included in the PHP package is beyond the scope of this chapter. However, you should take a close look at the following files:
php-src/main/php.h, located in the main PHP directory. This file contains most of PHP's macro and API definitions.
php-src/Zend/zend.h, located in the main Zend directory. This file contains most of Zend's macros and definitions.
php-src/Zend/zend_API.h, also located in the Zend directory, which defines Zend's API.
Zend is built using certain conventions; to avoid breaking its standards, you should follow the rules described in the following sections.
For almost every important task, Zend ships predefined macros that are extremely handy. The tables and figures in the following sections describe most of the basic functions, structures, and macros. The macro definitions can be found mainly in zend.h and zend_API.h. We suggest that you take a close look at these files after having studied this chapter. (Although you can go ahead and read them now, not everything will make sense to you yet.)
Resource management is a crucial issue, especially in server software. One of the most valuable resources is memory, and memory management should be handled with extreme care. Memory management has been partially abstracted in Zend, and you should stick to this abstraction for obvious reasons: Due to the abstraction, Zend gets full control over all memory allocations. Zend is able to determine whether a block is in use, automatically freeing unused blocks and blocks with lost references, and thus prevent memory leaks. The functions to be used are described in the following table:
Function | Description |
emalloc() | Serves as replacement for malloc(). |
efree() | Serves as replacement for free(). |
estrdup() | Serves as replacement for strdup(). |
estrndup() | Serves as replacement for strndup(). Faster than estrdup() and binary-safe. This is the recommended function to use if you know the string length prior to duplicating it. |
ecalloc() | Serves as replacement for calloc(). |
erealloc() | Serves as replacement for realloc(). |
To allocate resident memory that survives termination of the current script, you can use malloc() and free(). This should only be done with extreme care, however, and only in conjunction with demands of the Zend API; otherwise, you risk memory leaks.
The following directory and file functions should be used in Zend modules. They behave exactly like their C counterparts, but provide virtual working directory support on the thread level.
Zend Function | Regular C Function |
V_GETCWD() | getcwd() |
V_FOPEN() | fopen() |
V_OPEN() | open() |
V_CHDIR() | chdir() |
V_GETWD() | getwd() |
V_CHDIR_FILE() | Takes a file path as an argument and changes the current working directory to that file's directory. |
V_STAT() | stat() |
V_LSTAT() | lstat() |
Strings are handled a bit differently by the Zend engine than other values such as integers, Booleans, etc., which don't require additional memory allocation for storing their values. If you want to return a string from a function, introduce a new string variable to the symbol table, or do something similar, you have to make sure that the memory the string will be occupying has previously been allocated, using the aforementioned e*() functions for allocation. (This might not make much sense to you yet; just keep it somewhere in your head for now - we'll get back to it shortly.)
Complex types such as arrays and objects require different treatment. Zend features a single API for these types - they're stored using hash tables.
Note: To reduce complexity in the following source examples, we're only working with simple types such as integers at first. A discussion about creating more advanced types follows later in this chapter.
PHP 4 features an automatic build system that's very flexible. All modules reside in a subdirectory of the ext directory. In addition to its own sources, each module contains a config.m4 file for extension configuration (for background on M4, see http://www.gnu.org/software/m4/).
All these stub files are generated automatically, along with .cvsignore, by a little shell script named ext_skel that resides in the ext directory. As argument it takes the name of the module that you want to create. The shell script then creates a directory of the same name, along with the appropriate stub files.
Step by step, the process looks like this:
:~/cvs/php4/ext:> ./ext_skel --extname=my_module
Creating directory my_module
Creating basic files: config.m4 .cvsignore my_module.c php_my_module.h CREDITS EXPERIMENTAL tests/001.phpt my_module.php [done].

To use your new extension, you will have to execute the following steps:

1.  $ cd ..
2.  $ vi ext/my_module/config.m4
3.  $ ./buildconf
4.  $ ./configure --[with|enable]-my_module
5.  $ make
6.  $ ./php -f ext/my_module/my_module.php
7.  $ vi ext/my_module/my_module.c
8.  $ make

Repeat steps 3-6 until you are satisfied with ext/my_module/config.m4 and
step 6 confirms that your module is compiled into PHP. Then, start writing
code and repeat the last two steps as often as necessary.
The default config.m4, shown in Example #1, is a bit more complex:
Example #1 The default config.m4.
dnl $Id: build.xml 297078 2010-03-29 16:25:51Z degeberg $
dnl config.m4 for extension my_module

dnl Comments in this file start with the string 'dnl'.
dnl Remove where necessary. This file will not work
dnl without editing.

dnl If your extension references something external, use with:

dnl PHP_ARG_WITH(my_module, for my_module support,
dnl Make sure that the comment is aligned:
dnl [  --with-my_module             Include my_module support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(my_module, whether to enable my_module support,
dnl Make sure that the comment is aligned:
dnl [  --enable-my_module           Enable my_module support])

if test "$PHP_MY_MODULE" != "no"; then
  dnl Write more examples of tests here...

  dnl # --with-my_module -> check with-path
  dnl SEARCH_PATH="/usr/local /usr"     # you might want to change this
  dnl SEARCH_FOR="/include/my_module.h"  # you most likely want to change this
  dnl if test -r $PHP_MY_MODULE/; then # path given as parameter
  dnl   MY_MODULE_DIR=$PHP_MY_MODULE
  dnl else # search default path list
  dnl   AC_MSG_CHECKING([for my_module files in default path])
  dnl   for i in $SEARCH_PATH ; do
  dnl     if test -r $i/$SEARCH_FOR; then
  dnl       MY_MODULE_DIR=$i
  dnl       AC_MSG_RESULT(found in $i)
  dnl     fi
  dnl   done
  dnl fi
  dnl
  dnl if test -z "$MY_MODULE_DIR"; then
  dnl   AC_MSG_RESULT([not found])
  dnl   AC_MSG_ERROR([Please reinstall the my_module distribution])
  dnl fi

  dnl # --with-my_module -> add include path
  dnl PHP_ADD_INCLUDE($MY_MODULE_DIR/include)

  dnl # --with-my_module -> chech for lib and symbol presence
  dnl LIBNAME=my_module # you may want to change this
  dnl LIBSYMBOL=my_module # you most likely want to change this

  dnl PHP_CHECK_LIBRARY($LIBNAME,$LIBSYMBOL,
  dnl [
  dnl   PHP_ADD_LIBRARY_WITH_PATH($LIBNAME, $MY_MODULE_DIR/lib, MY_MODULE_SHARED_LIBADD)
  dnl   AC_DEFINE(HAVE_MY_MODULELIB,1,[ ])
  dnl ],[
  dnl   AC_MSG_ERROR([wrong my_module lib version or lib not found])
  dnl ],[
  dnl   -L$MY_MODULE_DIR/lib -lm -ldl
  dnl ])
  dnl
  dnl PHP_SUBST(MY_MODULE_SHARED_LIBADD)

  PHP_NEW_EXTENSION(my_module, my_module.c, $ext_shared)
fi
If you're unfamiliar with M4 files (now is certainly a good time to get familiar), this might be a bit confusing at first; but it's actually quite easy.
Note: Everything prefixed with dnl is treated as a comment and is not parsed.
The config.m4 file is responsible for parsing the command-line options passed to configure at configuration time. This means that it has to check for required external files and do similar configuration and setup tasks.
The default file creates two configuration directives in the configure script: --with-my_module and --enable-my_module. Use the first option when referring to external files (such as the --with-apache directive that refers to the Apache directory). Use the second option when the user simply has to decide whether to enable your extension. Regardless of which option you use, you should remove the other, unnecessary one; that is, if you're using --enable-my_module, you should remove support for --with-my_module, and vice versa.
By default, the config.m4 file created by ext_skel accepts both directives and automatically enables your extension. Enabling the extension is done by using the PHP_NEW_EXTENSION macro, as at the end of Example #1. To include your module in the PHP binary only when the user explicitly asks for it (by specifying --enable-my_module or --with-my_module), change the test for $PHP_MY_MODULE to = "yes":
if test "$PHP_MY_MODULE" = "yes"; then
  dnl Action..
  PHP_NEW_EXTENSION(my_module, my_module.c, $ext_shared)
fi
Note: Be sure to run buildconf every time you change config.m4!
We'll go into more details on the M4 macros available to your configuration scripts later in this chapter. For now, we'll simply use the default files.
We'll start with the creation of a very simple extension, which does nothing more than implement a function that returns the integer it receives as a parameter. Example #2 shows the source.
Example #2 A simple extension.
/* include standard header */
#include "php.h"

/* declaration of functions to be exported */
ZEND_FUNCTION(first_module);

/* compiled function list so Zend knows what's in this module */
zend_function_entry firstmod_functions[] =
{
    ZEND_FE(first_module, NULL)
    {NULL, NULL, NULL}
};

/* compiled module information */
zend_module_entry firstmod_module_entry =
{
    STANDARD_MODULE_HEADER,
    "First Module",
    firstmod_functions,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NO_VERSION_YET,
    STANDARD_MODULE_PROPERTIES
};

/* implement standard "stub" routine to introduce ourselves to Zend */
#if COMPILE_DL_FIRST_MODULE
ZEND_GET_MODULE(firstmod)
#endif

/* implement function that is meant to be made available to PHP */
ZEND_FUNCTION(first_module)
{
    long parameter;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &parameter) == FAILURE) {
        return;
    }

    RETURN_LONG(parameter);
}
This code contains a complete PHP module. We'll explain the source code in detail shortly, but first we'd like to discuss the build process. (This will allow the impatient to experiment before we dive into API discussions.)
Note: The example source makes use of some features introduced with the Zend version used in PHP 4.1.0 and above; it won't compile with older PHP 4.0.x versions.
There are basically two ways to compile modules:
Use the provided "make" mechanism in the ext directory, which also allows building of dynamic loadable modules.
Compile the sources manually.
The second method is good for those who (for some reason) don't have the full PHP source tree available, don't have access to all files, or just like to juggle with their keyboard. These cases should be extremely rare, but for the sake of completeness we'll also describe this method.
To compile the sample sources using the standard mechanism, copy all their subdirectories to the ext directory of your PHP source tree. Then run buildconf, which will create an updated configure script containing appropriate options for the new extension. By default, all the sample sources are disabled, so you don't have to fear breaking your build process.
After you run buildconf, configure --help shows the following additional modules:
--enable-array_experiments   BOOK: Enables array experiments
--enable-call_userland       BOOK: Enables userland module
--enable-cross_conversion    BOOK: Enables cross-conversion module
--enable-first_module        BOOK: Enables first module
--enable-infoprint           BOOK: Enables infoprint module
--enable-reference_test      BOOK: Enables reference test module
--enable-resource_test       BOOK: Enables resource test module
--enable-variable_creation   BOOK: Enables variable-creation module
The module shown earlier in Example #2 can be enabled with --enable-first_module or --enable-first_module=yes.
To compile your modules manually, you need the following commands:
Action | Command |
Compiling | cc -fpic -DCOMPILE_DL_FIRST_MODULE=1 -I/usr/local/include -I. -I.. -I../Zend -c -o <your_object_file> <your_c_file> |
Linking | cc -shared -L/usr/local/lib -rdynamic -o <your_module_file> <your_object_file(s)> |
Note: All include paths in the example are relative to the directory ext. If you're compiling from another directory, change the pathnames accordingly. Required items are the PHP directory, the Zend directory, and (if necessary), the directory in which your module resides.
The link command is also a plain vanilla command instructing linkage as a dynamic module.
You can include optimization options in the compilation command, although these have been omitted in this example (but some are included in the makefile template described in an earlier section).
Note: Compiling and linking manually as a static module into the PHP binary involves very long instructions and thus is not discussed here. (It's not very efficient to type all those commands.)
Depending on the build process you selected, you should either end up with a new PHP binary to be linked into your Web server (or run as CGI), or with an .so (shared object) file. If you compiled the example file first_module.c as a shared object, your result file should be first_module.so. To use it, you first have to copy it to a place from which it's accessible to PHP. For a simple test procedure, you can copy it to your htdocs directory and try it with the source in Example #3. If you compiled it into the PHP binary, omit the call to dl(), as the module's functionality is instantly available to your scripts.
For security reasons, you should not put your dynamic modules into publicly accessible directories. Even though it can be done and it simplifies testing, you should put them into a separate directory in production environments.
Example #3 A test file for first_module.so.
<?php
// remove next comment if necessary
// dl("first_module.so");
$param = 2;
$return = first_module($param);
print("We sent '$param' and got '$return'");
?>
Calling this PHP file should output the following:
We sent '2' and got '2'
If required, the dynamic loadable module is loaded by calling the dl() function. This function looks for the specified shared object, loads it, and makes its functions available to PHP. The module exports the function first_module(), which accepts a single parameter, converts it to an integer, and returns the result of the conversion.
If you've gotten this far, congratulations! You just built your first extension to PHP.
Actually, not much troubleshooting can be done when compiling static or dynamic modules. The only problem that could arise is that the compiler will complain about missing definitions or something similar. In this case, make sure that all header files are available and that you specified their path correctly in the compilation command. To be sure that everything is located correctly, extract a clean PHP source tree and use the automatic build in the ext directory with the fresh files; this will guarantee a safe compilation environment. If this fails, try manual compilation.
PHP might also complain about missing functions in your module. (This shouldn't happen with the sample sources if you didn't modify them.) If the names of external functions you're trying to access from your module are misspelled, they'll remain as "unlinked symbols" in the symbol table. During dynamic loading and linkage by PHP, they won't resolve because of the typing errors - there are no corresponding symbols in the main binary. Look for incorrect declarations in your module file or incorrectly written external references. Note that this problem is specific to dynamic loadable modules; it doesn't occur with static modules. Errors in static modules show up at compile time.
Now that you've got a safe build environment and you're able to include the modules into PHP files, it's time to discuss how everything works.
All PHP modules follow a common structure:
Header file inclusions (to include all required macros, API definitions, etc.)
C declaration of exported functions (required to declare the Zend function block)
Declaration of the Zend function block
Declaration of the Zend module block
Implementation of get_module()
Implementation of all exported functions
The only header file you really have to include for your modules is php.h, located in the PHP directory. This file makes all macros and API definitions required to build new modules available to your code.
Tip: It's good practice to create a separate header file for your module that contains module-specific definitions. This header file should contain all the forward definitions for exported functions and include php.h. If you created your module using ext_skel you already have such a header file prepared.
To declare functions that are to be exported (i.e., made available to PHP as new native functions), Zend provides a set of macros. A sample declaration looks like this:
ZEND_FUNCTION(my_function);
ZEND_FUNCTION declares a new C function that complies with Zend's internal API. This means that the function is of type void and accepts INTERNAL_FUNCTION_PARAMETERS (another macro) as parameters. Additionally, it prefixes the function name with zif_. The immediately expanded version of the above declaration would look like this:
void zif_my_function(INTERNAL_FUNCTION_PARAMETERS);
Expanding INTERNAL_FUNCTION_PARAMETERS as well results in the following:
void zif_my_function(int ht, zval *return_value, zval *this_ptr, int return_value_used, zend_executor_globals *executor_globals);
Since the interpreter and executor core have been separated from the main PHP package, a second API defining macros and function sets has evolved: the Zend API. As the Zend API now handles quite a few of the responsibilities that previously belonged to PHP, a lot of PHP functions have been reduced to macros aliasing to calls into the Zend API. The recommended practice is to use the Zend API wherever possible, as the old API is only preserved for compatibility reasons. For example, the types zval and pval are identical. zval is Zend's definition; pval is PHP's definition (actually, pval is an alias for zval now). As the macro INTERNAL_FUNCTION_PARAMETERS is a Zend macro, the above declaration contains zval. When writing code, you should always use zval to conform to the new Zend API.
The parameter list of this declaration is very important; you should keep these parameters in mind (the following table, Zend's Parameters to Functions Called from PHP, describes them).
Parameter | Description |
ht | The number of arguments passed to the Zend function. You should not touch this directly, but instead use ZEND_NUM_ARGS() to obtain the value. |
return_value | This variable is used to pass any return values of your function back to PHP. Access to this variable is best done using the predefined macros. For a description of these see below. |
this_ptr | Using this variable, you can gain access to the object in which your function is contained, if it's used within an object. Use the function getThis() to obtain this pointer. |
return_value_used | This flag indicates whether an eventual return value from this function will actually be used by the calling script. 0 indicates that the return value is not used; 1 indicates that the caller expects a return value. Evaluation of this flag can be done to verify correct usage of the function as well as speed optimizations in case returning a value requires expensive operations (for an example, see how array.c makes use of this). |
executor_globals | This variable points to global settings of the Zend engine. You'll find this useful when creating new variables, for example (more about this later). The executor globals can also be introduced to your function by using the macro TSRMLS_FETCH(). |
Now that you have declared the functions to be exported, you also have to introduce them to Zend. Introducing the list of functions is done by using an array of zend_function_entry. This array consecutively contains all functions that are to be made available externally, with the function's name as it should appear in PHP and its name as defined in the C source. Internally, zend_function_entry is defined as shown in Example #4.
Example #4 Internal declaration of zend_function_entry.
typedef struct _zend_function_entry {
    char *fname;
    void (*handler)(INTERNAL_FUNCTION_PARAMETERS);
    unsigned char *func_arg_types;
} zend_function_entry;
Entry | Description |
fname | Denotes the function name as seen in PHP (for example, fopen, mysql_connect, or, in our example, first_module). |
handler | Pointer to the C function responsible for handling calls to this function. For example, see the standard macro INTERNAL_FUNCTION_PARAMETERS discussed earlier. |
func_arg_types | Allows you to mark certain parameters so that they're forced to be passed by reference. You usually should set this to NULL. |
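In the example above, the declaration looks like this:
zend_function_entry firstmod_functions[] =
{
    ZEND_FE(first_module, NULL)
    {NULL, NULL, NULL}
};
The last entry in the list always has to be {NULL, NULL, NULL}. This end marker tells Zend where the list of exported functions stops.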
Note: You cannot use the predefined macros for the end marker, as these would try to refer to a function named "NULL"!
The macro ZEND_FE (short for 'Zend Function Entry') simply expands to a structure entry in zend_function_entry. Note that these macros introduce a special naming scheme to your functions - your C functions will be prefixed with zif_, meaning that ZEND_FE(first_module) will refer to a C function zif_first_module(). If you want to mix macro usage with hand-coded entries (not a good practice), keep this in mind.
Tip: Compilation errors that refer to functions named zif_*() relate to functions defined with ZEND_FE.
The following table, Macros for Defining Functions, lists all the macros that you can use to define functions.
Macro Name | Description |
ZEND_FE(name, arg_types) | Defines a function entry of the name name in zend_function_entry. Requires a corresponding C function. arg_types needs to be set to NULL. This macro uses automatic C function name generation by prefixing the PHP function name with zif_. For example, ZEND_FE(first_module, NULL) introduces a function first_module() to PHP and links it to the C function zif_first_module(). Use in conjunction with ZEND_FUNCTION. |
ZEND_NAMED_FE(php_name, name, arg_types) | Defines a function that will be available to PHP by the name php_name and links it to the corresponding C function name. arg_types needs to be set to NULL. Use this macro if you don't want the automatic name prefixing introduced by ZEND_FE. Use in conjunction with ZEND_NAMED_FUNCTION. |
ZEND_FALIAS(name, alias, arg_types) | Defines an alias named alias for name. arg_types needs to be set to NULL. Doesn't require a corresponding C function; refers to the alias target instead. |
PHP_FE(name, arg_types) | Old PHP API equivalent of ZEND_FE. |
PHP_NAMED_FE(runtime_name, name, arg_types) | Old PHP API equivalent of ZEND_NAMED_FE. |
Note: You can't use ZEND_FE in conjunction with PHP_FUNCTION, or PHP_FE in conjunction with ZEND_FUNCTION. However, it's perfectly legal to mix ZEND_FE and ZEND_FUNCTION with PHP_FE and PHP_FUNCTION when staying with the same macro set for each function to be declared. But mixing is not recommended; instead, you're advised to use the ZEND_* macros only.
The Zend module block is stored in the structure zend_module_entry and contains all the information needed to describe the contents of this module to Zend. Example #5 shows the internal definition of this structure.
Example #5 Internal declaration of zend_module_entry.
typedef struct _zend_module_entry zend_module_entry;

struct _zend_module_entry {
    unsigned short size;
    unsigned int zend_api;
    unsigned char zend_debug;
    unsigned char zts;
    char *name;
    zend_function_entry *functions;
    int (*module_startup_func)(INIT_FUNC_ARGS);
    int (*module_shutdown_func)(SHUTDOWN_FUNC_ARGS);
    int (*request_startup_func)(INIT_FUNC_ARGS);
    int (*request_shutdown_func)(SHUTDOWN_FUNC_ARGS);
    void (*info_func)(ZEND_MODULE_INFO_FUNC_ARGS);
    char *version;

    [ Rest of the structure is not interesting here ]
};
Entry | Description |
size, zend_api, zend_debug and zts | Usually filled with the "STANDARD_MODULE_HEADER", which fills these four members with the size of the whole zend_module_entry, the ZEND_MODULE_API_NO, whether it is a debug build or normal build (ZEND_DEBUG) and if ZTS is enabled (USING_ZTS). |
name | Contains the module name (for example, "File functions", "Socket functions", "Crypt", etc.). This name will show up in phpinfo(), in the section "Additional Modules." |
functions | Points to the Zend function block, discussed in the preceding section. |
module_startup_func | This function is called once upon module initialization and can be used to do one-time initialization steps (such as initial memory allocation, etc.). To indicate a failure during initialization, return FAILURE; otherwise, SUCCESS. To mark this field as unused, use NULL. To declare a function, use the macro ZEND_MINIT. |
module_shutdown_func | This function is called once upon module shutdown and can be used to do one-time deinitialization steps (such as memory deallocation). This is the counterpart to module_startup_func(). To indicate a failure during deinitialization, return FAILURE; otherwise, SUCCESS. To mark this field as unused, use NULL. To declare a function, use the macro ZEND_MSHUTDOWN. |
request_startup_func | This function is called once upon every page request and can be used to do one-time initialization steps that are required to process a request. To indicate a failure here, return FAILURE; otherwise, SUCCESS. Note: As dynamic loadable modules are loaded only on page requests, the request startup function is called right after the module startup function (both initialization events happen at the same time). To mark this field as unused, use NULL. To declare a function, use the macro ZEND_RINIT. |
request_shutdown_func | This function is called once after every page request and works as counterpart to request_startup_func(). To indicate a failure here, return FAILURE; otherwise, SUCCESS. Note: As dynamic loadable modules are loaded only on page requests, the request shutdown function is immediately followed by a call to the module shutdown handler (both deinitialization events happen at the same time). To mark this field as unused, use NULL. To declare a function, use the macro ZEND_RSHUTDOWN. |
info_func | When phpinfo() is called in a script, Zend cycles through all loaded modules and calls this function. Every module then has the chance to print its own "footprint" into the output page. Generally this is used to dump environmental or statistical information. To mark this field as unused, use NULL. To declare a function, use the macro ZEND_MINFO. |
version | The version of the module. You can use NO_VERSION_YET if you don't want to give the module a version number yet, but we really recommend that you add a version string here. Such a version string can look like this (in chronological order): "2.5-dev", "2.5RC1", "2.5" or "2.5pl3". |
Remaining structure elements | These are used internally and can be prefilled by using the macro STANDARD_MODULE_PROPERTIES_EX. You should not assign any values to them. Use STANDARD_MODULE_PROPERTIES_EX only if you use global startup and shutdown functions; otherwise, use STANDARD_MODULE_PROPERTIES directly. |
In our example, this structure is implemented as follows:
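zend_module_entry firstmod_module_entry = {
    STANDARD_MODULE_HEADER,
    "First Module",
    firstmod_functions,
    NULL,   /* module_startup_func */
    NULL,   /* module_shutdown_func */
    NULL,   /* request_startup_func */
    NULL,   /* request_shutdown_func */
    NULL,   /* info_func */
    NO_VERSION_YET,
    STANDARD_MODULE_PROPERTIES,
};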
For reference purposes, you can find a list of the macros involved in declaring startup and shutdown functions in Macros to Declare Startup and Shutdown Functions. These are not used in our basic example yet, but will be demonstrated later on. You should make use of these macros to declare your startup and shutdown functions, as these require special arguments to be passed (INIT_FUNC_ARGS and SHUTDOWN_FUNC_ARGS), which are automatically included in the function declaration when using the predefined macros. If you declare your functions manually and the PHP developers decide that a change in the argument list is necessary, you'll have to change your module sources to remain compatible.
Macro | Description |
---|---|
ZEND_MINIT(module) | Declares a function for module startup. The generated name will be zend_minit_<module> (for example, zend_minit_first_module). Use in conjunction with ZEND_MINIT_FUNCTION. |
ZEND_MSHUTDOWN(module) | Declares a function for module shutdown. The generated name will be zend_mshutdown_<module> (for example, zend_mshutdown_first_module). Use in conjunction with ZEND_MSHUTDOWN_FUNCTION. |
ZEND_RINIT(module) | Declares a function for request startup. The generated name will be zend_rinit_<module> (for example, zend_rinit_first_module). Use in conjunction with ZEND_RINIT_FUNCTION. |
ZEND_RSHUTDOWN(module) | Declares a function for request shutdown. The generated name will be zend_rshutdown_<module> (for example, zend_rshutdown_first_module). Use in conjunction with ZEND_RSHUTDOWN_FUNCTION. |
ZEND_MINFO(module) | Declares a function for printing module information, used when phpinfo() is called. The generated name will be zend_info_<module> (for example, zend_info_first_module). Use in conjunction with ZEND_MINFO_FUNCTION. |
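To see why the declaration macros shield your sources from changes in the argument list, consider the following minimal, self-contained sketch of the same token-pasting technique. All DEMO_* and demo_* names are hypothetical stand-ins invented for this illustration; they are not part of the Zend API, and the real argument lists live in the Zend headers.

```c
/* Hypothetical stand-in for INIT_FUNC_ARGS: the argument list is
 * defined in exactly one place. */
#define DEMO_INIT_ARGS int type, int module_number

/* Hypothetical stand-ins for ZEND_MINIT and ZEND_MINIT_FUNCTION. */
#define DEMO_MINIT(module)          demo_minit_##module
#define DEMO_MINIT_FUNCTION(module) int DEMO_MINIT(module)(DEMO_INIT_ARGS)

#define DEMO_SUCCESS 0
#define DEMO_FAILURE (-1)

/* Expands to: int demo_minit_firstmod(int type, int module_number) */
DEMO_MINIT_FUNCTION(firstmod)
{
    (void) type;  /* unused in this sketch */
    return module_number > 0 ? DEMO_SUCCESS : DEMO_FAILURE;
}

/* A function-pointer slot, comparable to module_startup_func. */
typedef int (*demo_startup_fn)(DEMO_INIT_ARGS);

/* The name-generating macro yields the function to store in the slot. */
demo_startup_fn demo_startup_slot = DEMO_MINIT(firstmod);
```

If the argument list ever changed, only DEMO_INIT_ARGS would need editing; the function definition and the slot assignment would stay as they are.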
This function is special to all dynamic loadable modules. Take a look at the creation via the ZEND_GET_MODULE macro first:
#if COMPILE_DL_FIRSTMOD
ZEND_GET_MODULE(firstmod)
#endif
The function implementation is surrounded by a conditional compilation statement. This is needed since the function get_module() is only required if your module is built as a dynamic extension. By specifying a definition of COMPILE_DL_FIRSTMOD in the compiler command (see above for a discussion of the compilation instructions required to build a dynamic extension), you can instruct your module whether you intend to build it as a dynamic extension or as a built-in module. If you want a built-in module, the implementation of get_module() is simply left out.
get_module() is called by Zend at load time of the module. You can think of it as being invoked by the dl() call in your script. Its purpose is to pass the module information block back to Zend in order to inform the engine about the module contents.
If you don't implement a get_module() function in your dynamic loadable module, Zend will compliment you with an error message when trying to access it.
Implementing the exported functions is the final step. The example function in first_module looks like this:
ZEND_FUNCTION(first_module)
{
    long parameter;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &parameter) == FAILURE) {
        return;
    }

    RETURN_LONG(parameter);
}
After the declaration, code for checking and retrieving the function's arguments, argument conversion, and return value generation follows (more on this later).
That's it, basically - there's nothing more to implementing PHP modules. Built-in modules are structured similarly to dynamic modules, so, equipped with the information presented in the previous sections, you'll be able to fight the odds when encountering PHP module source files.
Now, in the following sections, read on about how to make use of PHP's internals to build powerful extensions.
One of the most important issues for language extensions is accepting and dealing with data passed via arguments. Most extensions are built to deal with specific input data (or require parameters to perform their specific actions), and function arguments are the only real way to exchange data between the PHP level and the C level. Of course, there's also the possibility of exchanging data using predefined global values (which is also discussed later), but this should be avoided by all means, as it's extremely bad practice.
PHP doesn't make use of any formal function declarations; this is why call syntax is always completely dynamic and never checked for errors. Checking for correct call syntax is left to the user code. For example, it's possible to call a function using only one argument at one time and four arguments the next time - both invocations are syntactically absolutely correct.
Since PHP doesn't have formal function definitions with support for call syntax checking, and since PHP features variable arguments, sometimes you need to find out with how many arguments your function has been called. You can use the ZEND_NUM_ARGS macro in this case. In previous versions of PHP, this macro retrieved the number of arguments with which the function has been called based on the function's hash table entry, ht, which is passed in the INTERNAL_FUNCTION_PARAMETERS list. As ht itself now contains the number of arguments that have been passed to the function, ZEND_NUM_ARGS has been stripped down to a dummy macro (see its definition in zend_API.h). But it's still good practice to use it, to remain compatible with future changes in the call interface. Note: The old PHP equivalent of this macro is ARG_COUNT.
The following code checks for the correct number of arguments:
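if(ZEND_NUM_ARGS() != 2) WRONG_PARAM_COUNT;
If the function is not called with exactly two arguments, it exits with an error message. The snippet uses the macro WRONG_PARAM_COUNT, which generates a standard error message such as:
"Warning: Wrong parameter count for firstmodule() in /home/www/htdocs/firstmod.php on line 5"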
This macro prints a default error message and then returns to the caller. Its definition can also be found in zend_API.h and looks like this:
ZEND_API void wrong_param_count(void);
#define WRONG_PARAM_COUNT { wrong_param_count(); return; }
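Because the macro body contains a bare return statement, it leaves the function in which it is used, not just the macro. The following standalone sketch models that behavior; demo_wrong_param_count(), DEMO_WRONG_PARAM_COUNT, and demo_function() are hypothetical names made up for this illustration and only imitate the shape of the definition shown above.

```c
/* Counts how often the "error message" would have been printed. */
int demo_warnings = 0;

void demo_wrong_param_count(void)
{
    demo_warnings++;
}

/* Same shape as the definition above: report, then return from the
 * function that uses the macro. */
#define DEMO_WRONG_PARAM_COUNT { demo_wrong_param_count(); return; }

/* Writes 42 to *result only when called with exactly two arguments. */
void demo_function(int num_args, long *result)
{
    if (num_args != 2)
        DEMO_WRONG_PARAM_COUNT;  /* expands to { ...; return; }; */

    *result = 42;
}
```

Two consequences follow from this shape: the macro only fits functions returning void, and since it expands to a braced block followed by your semicolon, an else branch cannot directly follow it.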
Note: New parameter parsing API
This chapter documents the new Zend parameter parsing API introduced by Andrei Zmievski. It was introduced in the development stage between PHP 4.0.6 and 4.1.0.
Parsing parameters is a very common operation and it may get a bit tedious. It would also be nice to have standardized error checking and error messages. Since PHP 4.1.0, there is a way to do just that by using the new parameter parsing API. It greatly simplifies the process of receiving parameters, but it has a drawback in that it can't be used for functions that expect a variable number of parameters. But since the vast majority of functions do not fall into that category, this parsing API is recommended as the new standard way.
The prototype for the parameter parsing function looks like this:
int zend_parse_parameters(int num_args TSRMLS_DC, char *type_spec, ...);
zend_parse_parameters() also performs type conversions whenever possible, so that you always receive the data in the format you asked for. Any type of scalar can be converted to another one, but conversions between complex types (arrays, objects, and resources) and scalar types are not allowed.
If the parameters could be obtained successfully and there were no errors during type conversion, the function will return SUCCESS, otherwise it will return FAILURE. The function will output informative error messages, if the number of received parameters does not match the requested number, or if type conversion could not be performed.
Here are some sample error messages:
Here is the full list of type specifiers:
l - long
d - double
s - string (with possible null bytes) and its length
b - boolean
r - resource, stored in zval*
a - array, stored in zval*
o - object (of any class), stored in zval*
O - object (of class specified by class entry), stored in zval*
z - the actual zval*
| - indicates that the remaining parameters are optional. The storage variables corresponding to these parameters should be initialized to default values by the extension, since they will not be touched by the parsing function if the parameters are not passed.
/ - the parsing function will call SEPARATE_ZVAL_IF_NOT_REF() on the parameter it follows, to provide a copy of the parameter, unless it's a reference.
! - the parameter it follows can be of specified type or NULL (only applies to a, o, O, r, and z). If NULL value is passed by the user, the storage pointer will be set to NULL.
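To make the roles of the specifier characters more tangible, here is a simplified, standalone model that scans a type specification string and works out how many parameters it requires at minimum and accepts at maximum. demo_count_spec() is a hypothetical helper written for this illustration; it is not the engine's implementation and only recognizes the characters listed above.

```c
/* Scans a type specification and reports the required (min) and the
 * accepted (max) number of parameters.
 * Returns 0 on success, -1 if an unknown character is found. */
int demo_count_spec(const char *spec, int *min_args, int *max_args)
{
    int count = 0;
    int required = -1;  /* -1 means no '|' has been seen yet */
    const char *p;

    for (p = spec; *p != '\0'; p++) {
        switch (*p) {
        case 'l': case 'd': case 's': case 'b': case 'r':
        case 'a': case 'o': case 'O': case 'z':
            count++;            /* one script-level parameter each */
            break;
        case '|':
            required = count;   /* everything after this is optional */
            break;
        case '/':
        case '!':
            break;              /* modifiers do not add parameters */
        default:
            return -1;
        }
    }

    *max_args = count;
    *min_args = (required < 0) ? count : required;
    return 0;
}
```

For the specification "O|d", for instance, this model reports one required and two accepted parameters. Note that s and O each count as a single parameter here even though they consume two C arguments (string plus length, object plus class entry).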
The best way to illustrate the usage of this function is through examples:
/* Gets a long, a string and its length, and a zval. */
long l;
char *s;
int s_len;
zval *param;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "lsz", &l, &s, &s_len, &param) == FAILURE) {
    return;
}

/* Gets an object of class specified by my_ce, and an optional double. */
zval *obj;
double d = 0.5;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O|d", &obj, my_ce, &d) == FAILURE) {
    return;
}

/* Gets an object or null, and an array. If null is passed for object, obj will be set to NULL. */
zval *obj;
zval *arr;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O!a", &obj, my_ce, &arr) == FAILURE) {
    return;
}

/* Gets a separated array. */
zval *arr;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a/", &arr) == FAILURE) {
    return;
}

/* Get only the first three parameters (useful for varargs functions). */
zval *z;
zend_bool b;
zval *r;
if (zend_parse_parameters(3 TSRMLS_CC, "zbr!", &z, &b, &r) == FAILURE) {
    return;
}
Note that in the last example we pass 3 for the number of received parameters, instead of ZEND_NUM_ARGS(). This lets us receive the minimum number of parameters if our function expects a variable number of them. Of course, if you want to operate on the rest of the parameters, you will have to use zend_get_parameters_array_ex() to obtain them.
The parsing function has an extended version that allows for an additional flags argument that controls its actions.
int zend_parse_parameters_ex(int flags, int num_args TSRMLS_DC, char *type_spec, ...);
The only flag you can pass currently is ZEND_PARSE_PARAMS_QUIET, which instructs the function to not output any error messages during its operation. This is useful for functions that expect several sets of completely different arguments, but you will have to output your own error messages.
For example, here is how you would get either a set of three longs or a string:
long l1, l2, l3;
char *s;
int s_len;

if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC, "lll", &l1, &l2, &l3) == SUCCESS) {
    /* manipulate longs */
} else if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC, "s", &s, &s_len) == SUCCESS) {
    /* manipulate string */
} else {
    php_error(E_WARNING, "%s() takes either three long values or a string as argument", get_active_function_name(TSRMLS_C));
    return;
}
With all the abovementioned ways of receiving function parameters you should have a good handle on this process. For even more examples, look through the source code for extensions that are shipped with PHP - they illustrate every conceivable situation.
Note: Deprecated parameter parsing API
This API is deprecated and superseded by the new ZEND parameter parsing API.
After having checked the number of arguments, you need to get access to the arguments themselves. This is done with the help of zend_get_parameters_ex():
zval **parameter;

if(zend_get_parameters_ex(1, &parameter) != SUCCESS)
    WRONG_PARAM_COUNT;
zend_get_parameters_ex() accepts at least two arguments. The first argument is the number of arguments to retrieve (which should match the number of arguments with which the function has been called; this is why it's important to check for correct call syntax). The second argument (and all following arguments) are pointers to pointers to pointers to zvals. (Confusing, isn't it?) All these pointers are required because Zend works internally with **zval; to adjust a local **zval in our function, zend_get_parameters_ex() requires a pointer to it.
The return value of zend_get_parameters_ex() can either be SUCCESS or FAILURE, indicating (unsurprisingly) success or failure of the argument processing. A failure is most likely related to an incorrect number of arguments being specified, in which case you should exit with WRONG_PARAM_COUNT.
To retrieve more than one argument, you can use a similar snippet:
zval **param1, **param2, **param3, **param4;

if(zend_get_parameters_ex(4, &param1, &param2, &param3, &param4) != SUCCESS)
    WRONG_PARAM_COUNT;
zend_get_parameters_ex() only checks whether you're trying to retrieve too many parameters. If the function is called with five arguments, but you're only retrieving three of them with zend_get_parameters_ex(), you won't get an error but will get the first three parameters instead. Subsequent calls of zend_get_parameters_ex() won't retrieve the remaining arguments, but will get the same arguments again.
If your function is meant to accept a variable number of arguments, the snippets just described are sometimes suboptimal solutions. You have to create a line calling zend_get_parameters_ex() for every possible number of arguments, which is often unsatisfying.
For this case, you can use the function zend_get_parameters_array_ex(), which accepts the number of parameters to retrieve and an array in which to store them:
zval **parameter_array[4];
int argument_count;

/* get the number of arguments */
argument_count = ZEND_NUM_ARGS();

/* see if it satisfies our minimal request (2 arguments) */
/* and our maximal acceptance (4 arguments) */
if(argument_count < 2 || argument_count > 4)
    WRONG_PARAM_COUNT;

/* argument count is correct, now retrieve arguments */
if(zend_get_parameters_array_ex(argument_count, parameter_array) != SUCCESS)
    WRONG_PARAM_COUNT;
A very clever implementation of this can be found in the code handling PHP's fsockopen() located in ext/standard/fsock.c, as shown in PHP's implementation of variable arguments in fsockopen(). Don't worry if you don't know all the functions used in this source yet; we'll get to them shortly.
Example #6 PHP's implementation of variable arguments in fsockopen().
pval **args[5];
int *sock=emalloc(sizeof(int));
int *sockp;
int arg_count=ARG_COUNT(ht);
int socketd = -1;
unsigned char udp = 0;
struct timeval timeout = { 60, 0 };
unsigned short portno;
unsigned long conv;
char *key = NULL;
FLS_FETCH();

if (arg_count > 5 || arg_count < 2 || zend_get_parameters_array_ex(arg_count,args)==FAILURE) {
    CLOSE_SOCK(1);
    WRONG_PARAM_COUNT;
}

switch(arg_count) {
    case 5:
        convert_to_double_ex(args[4]);
        conv = (unsigned long) (Z_DVAL_PP(args[4]) * 1000000.0);
        timeout.tv_sec = conv / 1000000;
        timeout.tv_usec = conv % 1000000;
        /* fall-through */
    case 4:
        if (!PZVAL_IS_REF(*args[3])) {
            php_error(E_WARNING,"error string argument to fsockopen not passed by reference");
        }
        pval_copy_constructor(*args[3]);
        ZVAL_EMPTY_STRING(*args[3]);
        /* fall-through */
    case 3:
        if (!PZVAL_IS_REF(*args[2])) {
            php_error(E_WARNING,"error argument to fsockopen not passed by reference");
            return;
        }
        ZVAL_LONG(*args[2], 0);
        break;
}

convert_to_string_ex(args[0]);
convert_to_long_ex(args[1]);
portno = (unsigned short) Z_LVAL_PP(args[1]);

key = emalloc(Z_STRLEN_PP(args[0]) + 10);
fsockopen() accepts two, three, four, or five parameters. After the obligatory variable declarations, the function checks for the correct range of arguments. Then it uses a fall-through mechanism in a switch() statement to deal with all arguments. The switch() statement starts with the maximum number of arguments being passed (five). After that, it automatically processes the case of four arguments being passed, then three, by omitting the otherwise obligatory break keyword in all stages. After having processed the last case, it exits the switch() statement and does the minimal argument processing needed if the function is invoked with only two arguments.
This multiple-stage type of processing, similar to a stairway, allows convenient processing of a variable number of arguments.
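The stairway pattern is not specific to Zend. The following standalone sketch applies it to plain C strings, assuming a function that takes two to five arguments in the same spirit as fsockopen(). The names demo_sock_args and demo_parse_args() are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

typedef struct {
    const char *host;  /* argument 1, required */
    long port;         /* argument 2, required */
    int has_errno;     /* was argument 3 supplied? */
    int has_errstr;    /* was argument 4 supplied? */
    double timeout;    /* argument 5, defaults to 60 seconds */
} demo_sock_args;

/* Returns 0 on success, -1 if the argument count is out of range. */
int demo_parse_args(int arg_count, const char *args[], demo_sock_args *out)
{
    if (arg_count < 2 || arg_count > 5)
        return -1;

    /* defaults for everything that is optional */
    out->has_errno = 0;
    out->has_errstr = 0;
    out->timeout = 60.0;

    /* start at the highest step and walk down the stairs */
    switch (arg_count) {
    case 5:
        out->timeout = strtod(args[4], NULL);
        /* fall-through */
    case 4:
        out->has_errstr = 1;
        /* fall-through */
    case 3:
        out->has_errno = 1;
        break;
    }

    /* minimal processing that every call needs */
    out->host = args[0];
    out->port = strtol(args[1], NULL, 10);
    return 0;
}
```

Each case handles exactly one optional argument, so supporting a sixth argument would mean adding one more case at the top rather than touching the existing ones.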
To access arguments, it's necessary for each argument to have a clearly defined type. Again, PHP's extremely dynamic nature introduces some quirks. Because PHP never does any kind of type checking, it's possible for a caller to pass any kind of data to your functions, whether you want it or not. If you expect an integer, for example, the caller might pass an array, and vice versa - PHP simply won't notice.
To work around this, you have to use a set of API functions to force a type conversion on every argument that's being passed (see Argument Conversion Functions).
Note: All conversion functions expect a **zval as parameter.
Function | Description |
---|---|
convert_to_boolean_ex() | Forces conversion to a Boolean type. Boolean values remain untouched. Longs, doubles, and strings containing 0 as well as NULL values will result in Boolean 0 (FALSE). Arrays and objects are converted based on the number of entries or properties, respectively, that they have. Empty arrays and objects are converted to FALSE; otherwise, to TRUE. All other values result in a Boolean 1 (TRUE). |
convert_to_long_ex() | Forces conversion to a long, the default integer type. NULL values, Booleans, resources, and of course longs remain untouched. Doubles are truncated. Strings containing an integer are converted to their corresponding numeric representation, otherwise resulting in 0. Arrays and objects are converted to 0 if empty, 1 otherwise. |
convert_to_double_ex() | Forces conversion to a double, the default floating-point type. NULL values, Booleans, resources, longs, and of course doubles remain untouched. Strings containing a number are converted to their corresponding numeric representation, otherwise resulting in 0.0. Arrays and objects are converted to 0.0 if empty, 1.0 otherwise. |
convert_to_string_ex() | Forces conversion to a string. Strings remain untouched. NULL values are converted to an empty string. Booleans containing TRUE are converted to "1", otherwise resulting in an empty string. Longs and doubles are converted to their corresponding string representation. Arrays are converted to the string "Array" and objects to the string "Object". |
convert_to_array_ex(value) | Forces conversion to an array. Arrays remain untouched. Objects are converted to an array by assigning all their properties to the array table. All property names are used as keys, property contents as values. NULL values are converted to an empty array. All other values are converted to an array that contains the specific source value in the element with the key 0. |
convert_to_object_ex(value) | Forces conversion to an object. Objects remain untouched. NULL values are converted to an empty object. Arrays are converted to objects by introducing their keys as properties into the objects and their values as corresponding property contents in the object. All other types result in an object with the property scalar, having the corresponding source value as content. |
convert_to_null_ex(value) | Forces the type to become a NULL value, meaning empty. |
Note: You can find a demonstration of the behavior in cross_conversion.php on the accompanying CD-ROM.
Using these functions on your arguments will ensure type safety for all data that's passed to you. If the supplied type doesn't match the required type, PHP forces dummy contents on the resulting value (empty strings, arrays, or objects, 0 for numeric values, FALSE for Booleans) to ensure a defined state.
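The idea of forcing a defined state can be sketched without the engine. The following standalone model converts a small tagged value to a long along the lines described for convert_to_long_ex(); demo_value and demo_convert_to_long() are hypothetical names, the model covers only a few scalar types, and unlike the real function it does not own or free string buffers.

```c
#include <stdlib.h>

typedef enum {
    DEMO_NULL, DEMO_LONG, DEMO_DOUBLE, DEMO_BOOL, DEMO_STRING
} demo_type;

typedef struct {
    demo_type type;
    union {
        long lval;        /* DEMO_LONG and DEMO_BOOL */
        double dval;      /* DEMO_DOUBLE */
        const char *str;  /* DEMO_STRING, not owned by the value */
    } value;
} demo_value;

/* Afterwards v is always a DEMO_LONG, whatever it was before. */
void demo_convert_to_long(demo_value *v)
{
    long result = 0;

    switch (v->type) {
    case DEMO_NULL:
        result = 0;
        break;
    case DEMO_LONG:
    case DEMO_BOOL:
        result = v->value.lval;
        break;
    case DEMO_DOUBLE:
        result = (long) v->value.dval;  /* truncates toward zero */
        break;
    case DEMO_STRING:
        /* yields 0 when the string does not start with a number */
        result = strtol(v->value.str, NULL, 10);
        break;
    }

    v->type = DEMO_LONG;
    v->value.lval = result;
}
```

Whatever the caller passed, code following the conversion can read value.lval without checking the type again.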
Following is a quote from the sample module discussed previously, which makes use of the conversion functions:
zval **parameter;

if((ZEND_NUM_ARGS() != 1) || (zend_get_parameters_ex(1, &parameter) != SUCCESS)) {
    WRONG_PARAM_COUNT;
}

convert_to_long_ex(parameter);

RETURN_LONG(Z_LVAL_PP(parameter));
Example #7 PHP/Zend zval type definition.
typedef struct _zval_struct zval;
typedef zval pval;

typedef union _zvalue_value {
    long lval;                  /* long value */
    double dval;                /* double value */
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;              /* hash table value */
    struct {
        zend_class_entry *ce;
        HashTable *properties;
    } obj;
} zvalue_value;

struct _zval_struct {
    /* Variable information */
    zvalue_value value;         /* value */
    unsigned char type;         /* active type */
    unsigned char is_ref;
    short refcount;
};
Actually, pval (defined in php.h) is only an alias of zval (defined in zend.h), which in turn refers to _zval_struct. This is a most interesting structure. _zval_struct is the "master" structure, containing the value structure, type, and reference information. The substructure zvalue_value is a union that contains the variable's contents. Depending on the variable's type, you'll have to access different members of this union. For a description of both structures, see Zend zval Structure, Zend zvalue_value Structure and Zend Variable Type Constants.
Entry | Description |
---|---|
value | Union containing this variable's contents. See Zend zvalue_value Structure for a description. |
type | Contains this variable's type. For a list of available types, see Zend Variable Type Constants. |
is_ref | 0 means that this variable is not a reference; 1 means that this variable is a reference to another variable. |
refcount | The number of references that exist for this variable. For every new reference to the value stored in this variable, this counter is increased by 1. For every lost reference, this counter is decreased by 1. When the reference counter reaches 0, no references exist to this value anymore, which causes automatic freeing of the value. |
Entry | Description |
---|---|
lval | Use this property if the variable is of the type IS_LONG, IS_BOOL, or IS_RESOURCE. |
dval | Use this property if the variable is of the type IS_DOUBLE. |
str | This structure can be used to access variables of the type IS_STRING. The member len contains the string length; the member val points to the string itself. Zend stores strings as C strings with a trailing 0x00, but len does not count that terminator. |
ht | This entry points to the variable's hash table entry if the variable is an array. |
obj | Use this property if the variable is of the type IS_OBJECT. |
Constant | Description |
---|---|
IS_NULL | Denotes a NULL (empty) value. |
IS_LONG | A long (integer) value. |
IS_DOUBLE | A double (floating point) value. |
IS_STRING | A string. |
IS_ARRAY | Denotes an array. |
IS_OBJECT | An object. |
IS_BOOL | A Boolean value. |
IS_RESOURCE | A resource (for a discussion of resources, see the appropriate section below). |
IS_CONSTANT | A constant (defined) value. |
To access a long you access zval.value.lval, to access a double you use zval.value.dval, and so on. Because all values are stored in a union, trying to access data with incorrect union members results in meaningless output.
Accessing arrays and objects is a bit more complicated and is discussed later.
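Since reading the wrong union member yields meaningless data, it can help to route access through small helpers that consult the type field first. The sketch below uses a cut-down, hypothetical demo_typed_zval with made-up DEMO_IS_* tags; it mirrors only the layout idea of the structures above, not their actual definitions.

```c
/* Hypothetical type tags, loosely mirroring IS_LONG, IS_DOUBLE,
 * IS_STRING and IS_BOOL. */
enum { DEMO_IS_LONG, DEMO_IS_DOUBLE, DEMO_IS_STRING, DEMO_IS_BOOL };

typedef struct {
    union {
        long lval;
        double dval;
        struct {
            const char *val;
            int len;
        } str;
    } value;
    unsigned char type;  /* tells which union member is active */
} demo_typed_zval;

/* Stores the long in *out and returns 1 only if lval is the active
 * member; otherwise leaves *out alone and returns 0. */
int demo_get_long(const demo_typed_zval *z, long *out)
{
    if (z->type != DEMO_IS_LONG && z->type != DEMO_IS_BOOL)
        return 0;

    *out = z->value.lval;
    return 1;
}

/* Returns the string length, or -1 if str is not the active member. */
int demo_get_strlen(const demo_typed_zval *z)
{
    return z->type == DEMO_IS_STRING ? z->value.str.len : -1;
}
```

In real extension code you would normally run one of the conversion functions first, so such checks are mostly useful for values of unknown origin.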
If your function accepts arguments passed by reference that you intend to modify, you need to take some precautions.
What we haven't mentioned yet is that under the circumstances presented so far, you don't have write access to any zval containers designating function parameters that have been passed to you. Of course, you can change any zval containers that you created within your function, but you mustn't change any zvals that refer to Zend-internal data!
We've only discussed the so-called *_ex() API so far. You may have noticed that the API functions we've used are called zend_get_parameters_ex() instead of zend_get_parameters(), convert_to_long_ex() instead of convert_to_long(), etc. The *_ex() functions form the so-called new "extended" Zend API. They give a minor speed increase over the old API, but as a tradeoff are only meant for providing read-only access.
Because Zend works internally with references, different variables may reference the same value. Write access to a zval container requires this container to contain an isolated value, meaning a value that's not referenced by any other containers. If a zval container were referenced by other containers and you changed the referenced zval, you would automatically change the contents of the other containers referencing this zval (because they'd simply point to the changed value and thus change their own value as well).
zend_get_parameters_ex() doesn't care about this situation, but simply returns a pointer to the desired zval containers, whether they consist of references or not. Its corresponding function in the traditional API, zend_get_parameters(), immediately checks for referenced values. If it finds a reference, it creates a new, isolated zval container; copies the referenced data into this newly allocated space; and then returns a pointer to the new, isolated value.
This action is called zval separation (or pval separation). Because the *_ex() API doesn't perform zval separation, it's considerably faster, while at the same time disabling write access.
To change parameters, however, write access is required. Zend deals with this situation in a special way: Whenever a parameter to a function is passed by reference, it performs automatic zval separation. This means that whenever you're calling a function like this in PHP, Zend will automatically ensure that $parameter is being passed as an isolated value, rendering it to a write-safe state:
my_function(&$parameter);
But this is not the case with regular parameters! All other parameters that are not passed by reference are in a read-only state.
This requires you to make sure that you're really working with a reference - otherwise you might produce unwanted results. To check for a parameter being passed by reference, you can use the macro PZVAL_IS_REF. This macro accepts a zval* to check if it is a reference or not. Examples are given in Testing for referenced parameter passing.
Example #8 Testing for referenced parameter passing.
zval *parameter;

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &parameter) == FAILURE)
    return;

/* check for parameter being passed by reference */
if (!PZVAL_IS_REF(parameter)) {
    zend_error(E_WARNING, "Parameter wasn't passed by reference");
    RETURN_NULL();
}

/* make changes to the parameter */
ZVAL_LONG(parameter, 10);
You might run into a situation in which you need write access to a parameter that's retrieved with zend_get_parameters_ex() but not passed by reference. For this case, you can use the macro SEPARATE_ZVAL, which does a zval separation on the provided container. The newly generated zval is detached from internal data and has only a local scope, meaning that it can be changed or destroyed without implying global changes in the script context:
zval **parameter;

/* retrieve parameter */
zend_get_parameters_ex(1, &parameter);

/* at this stage, <parameter> still is connected */
/* to Zend's internal data buffers */

/* make <parameter> write-safe */
SEPARATE_ZVAL(parameter);

/* now we can safely modify <parameter> */
/* without implying global changes */
Note: As you can easily work around the lack of write access in the "traditional" API (with zend_get_parameters() and so on), this API seems to be obsolete, and is not discussed further in this chapter.
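The separation step described in this section can be modeled in a few lines of plain C. The sketch below reduces a container to a long and a reference count; demo_counted_zval and demo_separate() are hypothetical names for this illustration and only imitate the idea behind zval separation, not the engine's actual macro.

```c
#include <stdlib.h>

typedef struct {
    long lval;
    int refcount;  /* number of holders sharing this container */
} demo_counted_zval;

/* Makes *pp point to a container that nobody else shares. A container
 * with a single holder is left untouched. */
void demo_separate(demo_counted_zval **pp)
{
    demo_counted_zval *orig = *pp;
    demo_counted_zval *copy;

    if (orig->refcount <= 1)
        return;  /* already isolated, writing is safe */

    copy = malloc(sizeof *copy);
    if (copy == NULL)
        abort();  /* keep the sketch simple: no recovery path */

    copy->lval = orig->lval;  /* duplicate the contents */
    copy->refcount = 1;       /* the copy has exactly one holder */
    orig->refcount--;         /* we no longer hold the original */
    *pp = copy;
}
```

After the call, a write through the separated pointer changes only the private copy, which the caller is then responsible for releasing; the other holders keep seeing the old value.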
When exchanging data from your own extensions with PHP scripts, one of the most important issues is the creation of variables. This section shows you how to deal with the variable types that PHP supports.
To create new variables that can be seen "from the outside" by the executing script, you need to allocate a new zval container, fill this container with meaningful values, and then introduce it to Zend's internal symbol table. This basic process is common to all variable creations:
zval *new_variable;

/* allocate and initialize new container */
MAKE_STD_ZVAL(new_variable);

/* set type and variable contents here, see the following sections */

/* introduce this variable by the name "new_variable_name" into the symbol table */
ZEND_SET_SYMBOL(EG(active_symbol_table), "new_variable_name", new_variable);

/* the variable is now accessible to the script by using $new_variable_name */
The macro MAKE_STD_ZVAL allocates a new zval container using ALLOC_ZVAL and initializes it using INIT_ZVAL. As implemented in Zend at the time of this writing, initializing means setting the reference count to 1 and clearing the is_ref flag, but this process could be extended later - this is why it's a good idea to keep using MAKE_STD_ZVAL instead of only using ALLOC_ZVAL. If you want to optimize for speed (and you don't have to explicitly initialize the zval container here), you can use ALLOC_ZVAL, but this isn't recommended because it doesn't ensure data integrity.
ZEND_SET_SYMBOL takes care of introducing the new variable to Zend's symbol table. This macro checks whether the value already exists in the symbol table and converts the new symbol to a reference if so (with automatic deallocation of the old zval container). This is the preferred method if speed is not a crucial issue and you'd like to keep memory usage low.
Note that ZEND_SET_SYMBOL makes use of the Zend executor globals via the macro EG. By specifying EG(active_symbol_table), you get access to the currently active symbol table, dealing with the active, local scope. The local scope may differ depending on whether the function was invoked from within a function.
If you need to optimize for speed and don't care about optimal memory usage, you can omit the check for an existing variable with the same name and instead force insertion into the symbol table by using zend_hash_update():
zval *new_variable;
/* allocate and initialize new container */
MAKE_STD_ZVAL(new_variable);
/* set type and variable contents here, see the following sections */
/* introduce this variable by the name "new_variable_name" into the symbol table */
zend_hash_update(
    EG(active_symbol_table),
    "new_variable_name",
    strlen("new_variable_name") + 1,
    &new_variable,
    sizeof(zval *),
    NULL
);
The variables generated with the snippet above will always be of local scope, so they reside in the context in which the function has been called. To create new variables in the global scope, use the same method but refer to another symbol table:
zval *new_variable;
// allocate and initialize new container
MAKE_STD_ZVAL(new_variable);
//
// set type and variable contents here
//
// introduce this variable by the name "new_variable_name" into the global symbol table
ZEND_SET_SYMBOL(&EG(symbol_table), "new_variable_name", new_variable);
Note: The active_symbol_table variable is a pointer, but symbol_table is not. This is why you have to use EG(active_symbol_table) and &EG(symbol_table) as parameters to ZEND_SET_SYMBOL - it requires a pointer.
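The difference between the two arguments is purely a matter of C types. The following stand-alone sketch uses hypothetical stand-ins (model_hash_table, model_executor_globals, model_set_symbol()) rather than the real Zend structures, and assumes only what the note states: one member is stored by value, the other as a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real HashTable and executor globals are far richer. */
typedef struct { int element_count; } model_hash_table;

typedef struct {
    model_hash_table  symbol_table;         /* global table, stored by value */
    model_hash_table *active_symbol_table;  /* current scope, stored as a pointer */
} model_executor_globals;

/* Takes a pointer, just as ZEND_SET_SYMBOL expects a pointer to a symbol table. */
void model_set_symbol(model_hash_table *table)
{
    assert(table != NULL);
    table->element_count++;
}

/* Adds one variable to the active scope and one to the global scope. */
void model_create_variables(model_executor_globals *eg)
{
    model_set_symbol(eg->active_symbol_table); /* already a pointer */
    model_set_symbol(&eg->symbol_table);       /* address-of needed */
}
```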
Similarly, to get a more efficient version, you can hardcode the symbol table update:
zval *new_variable;
// allocate and initialize new container
MAKE_STD_ZVAL(new_variable);
//
// set type and variable contents here
//
// introduce this variable by the name "new_variable_name" into the global symbol table
zend_hash_update(
    &EG(symbol_table),
    "new_variable_name",
    strlen("new_variable_name") + 1,
    &new_variable,
    sizeof(zval *),
    NULL
);
Note: You can see that the global variable is actually not accessible from within the function. This is because it's not imported into the local scope using global $global_variable; in the PHP source.
Example #9 Creating variables with different scopes.
ZEND_FUNCTION(variable_creation)
{
    zval *new_var1, *new_var2;
    MAKE_STD_ZVAL(new_var1);
    MAKE_STD_ZVAL(new_var2);
    ZVAL_LONG(new_var1, 10);
    ZVAL_LONG(new_var2, 5);
    ZEND_SET_SYMBOL(EG(active_symbol_table), "local_variable", new_var1);
    ZEND_SET_SYMBOL(&EG(symbol_table), "global_variable", new_var2);
    RETURN_NULL();
}
Now let's get to the assignment of data to variables, starting with longs. Longs are PHP's integers and are very simple to store. Looking at the zval.value container structure discussed earlier in this chapter, you can see that the long data type is directly contained in the union, namely in the lval field. The corresponding type value for longs is IS_LONG (see Creation of a long.).
Example #10 Creation of a long.
zval *new_long;
MAKE_STD_ZVAL(new_long);
new_long->type = IS_LONG;
new_long->value.lval = 10;
Alternatively, you can use the macro ZVAL_LONG:
zval *new_long;
MAKE_STD_ZVAL(new_long);
ZVAL_LONG(new_long, 10);
Doubles are PHP's floats and are as easy to assign as longs, because their value is also contained directly in the union. The member in the zval.value container is dval; the corresponding type is IS_DOUBLE.
zval *new_double;
MAKE_STD_ZVAL(new_double);
new_double->type = IS_DOUBLE;
new_double->value.dval = 3.45;
Alternatively, you can use the macro ZVAL_DOUBLE:
zval *new_double;
MAKE_STD_ZVAL(new_double);
ZVAL_DOUBLE(new_double, 3.45);
Strings need slightly more effort. As mentioned earlier, all strings that will be associated with Zend's internal data structures need to be allocated using Zend's own memory-management functions. Referencing of static strings or strings allocated with standard routines is not allowed. To assign strings, you have to access the structure str in the zval.value container. The corresponding type is IS_STRING:
zval *new_string;
char *string_contents = "This is a new string variable";
MAKE_STD_ZVAL(new_string);
new_string->type = IS_STRING;
new_string->value.str.len = strlen(string_contents);
new_string->value.str.val = estrdup(string_contents);
The macro ZVAL_STRING does the same; its third argument indicates whether the string contents should be duplicated into Zend's internal memory:
zval *new_string;
char *string_contents = "This is a new string variable";
MAKE_STD_ZVAL(new_string);
ZVAL_STRING(new_string, string_contents, 1);
If you want to truncate the string at a certain position or you already know its length, you can use ZVAL_STRINGL(zval, string, length, duplicate), which accepts an explicit string length to be set for the new string. This macro is faster than ZVAL_STRING and also binary-safe.
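What "binary-safe" means here can be shown without any Zend code. The sketch below is a stand-alone model with hypothetical names (model_string, model_string_dup(), model_stringl_dup()) and uses malloc() in place of Zend's allocator. It contrasts a copy whose length is discovered with strlen() against a copy whose length is supplied by the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a length-counted string value. */
typedef struct {
    char  *val;
    size_t len;
} model_string;

/* Length discovered with strlen(): stops at the first NUL byte. */
model_string model_string_dup(const char *src)
{
    model_string s;
    s.len = strlen(src);
    s.val = malloc(s.len + 1);
    assert(s.val != NULL);
    memcpy(s.val, src, s.len);
    s.val[s.len] = '\0';
    return s;
}

/* Length supplied by the caller: embedded NUL bytes are preserved. */
model_string model_stringl_dup(const char *src, size_t len)
{
    model_string s;
    s.len = len;
    s.val = malloc(len + 1);
    assert(s.val != NULL);
    memcpy(s.val, src, len);
    s.val[len] = '\0';
    return s;
}
```

For the five bytes "ab\0cd", the first function stores a length of 2 and loses the tail, while the second keeps all five bytes. The explicit-length variant also skips the strlen() scan, which is where the speed advantage mentioned above comes from.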
To create empty strings, set the string length to 0 and use empty_string as contents:
new_string->type = IS_STRING;
new_string->value.str.len = 0;
new_string->value.str.val = empty_string;
Alternatively, you can use the macro ZVAL_EMPTY_STRING:
MAKE_STD_ZVAL(new_string);
ZVAL_EMPTY_STRING(new_string);
Booleans are created just like longs, but have the type IS_BOOL. Allowed values in lval are 0 and 1:
zval *new_bool;
MAKE_STD_ZVAL(new_bool);
new_bool->type = IS_BOOL;
new_bool->value.lval = 1;
Arrays are stored using Zend's internal hash tables, which can be accessed using the zend_hash_*() API. For every array that you want to create, you need a new hash table handle, which will be stored in the ht member of the zval.value container.
There's a whole API solely for the creation of arrays, which is extremely handy. To start a new array, you call array_init().
zval *new_array;
MAKE_STD_ZVAL(new_array);
array_init(new_array);
To add new elements to the array, you can use numerous functions, depending on what you want to do. Zend's API for Associative Arrays, Zend's API for Indexed Arrays, Part 1 and Zend's API for Indexed Arrays, Part 2 describe these functions. All functions return FAILURE on failure and SUCCESS on success.
Function | Description |
add_assoc_long(zval *array, char *key, long n) | Adds an element of type long. |
add_assoc_unset(zval *array, char *key) | Adds an unset element. |
add_assoc_bool(zval *array, char *key, int b) | Adds a Boolean element. |
add_assoc_resource(zval *array, char *key, int r) | Adds a resource to the array. |
add_assoc_double(zval *array, char *key, double d) | Adds a floating-point value. |
add_assoc_string(zval *array, char *key, char *str, int duplicate) | Adds a string to the array. The flag duplicate specifies whether the string contents have to be copied to Zend internal memory. |
add_assoc_stringl(zval *array, char *key, char *str, uint length, int duplicate) | Adds a string with the desired length length to the array. Otherwise, behaves like add_assoc_string(). |
add_assoc_zval(zval *array, char *key, zval *value) | Adds a zval to the array. Useful for adding other arrays, objects, streams, etc. |
Function | Description |
add_index_long(zval *array, uint idx, long n) | Adds an element of type long. |
add_index_unset(zval *array, uint idx) | Adds an unset element. |
add_index_bool(zval *array, uint idx, int b) | Adds a Boolean element. |
add_index_resource(zval *array, uint idx, int r) | Adds a resource to the array. |
add_index_double(zval *array, uint idx, double d) | Adds a floating-point value. |
add_index_string(zval *array, uint idx, char *str, int duplicate) | Adds a string to the array. The flag duplicate specifies whether the string contents have to be copied to Zend internal memory. |
add_index_stringl(zval *array, uint idx, char *str, uint length, int duplicate) | Adds a string with the desired length length to the array. This function is faster and binary-safe. Otherwise, behaves like add_index_string(). |
add_index_zval(zval *array, uint idx, zval *value) | Adds a zval to the array. Useful for adding other arrays, objects, streams, etc. |
Function | Description |
add_next_index_long(zval *array, long n) | Adds an element of type long. |
add_next_index_unset(zval *array) | Adds an unset element. |
add_next_index_bool(zval *array, int b) | Adds a Boolean element. |
add_next_index_resource(zval *array, int r) | Adds a resource to the array. |
add_next_index_double(zval *array, double d) | Adds a floating-point value. |
add_next_index_string(zval *array, char *str, int duplicate) | Adds a string to the array. The flag duplicate specifies whether the string contents have to be copied to Zend internal memory. |
add_next_index_stringl(zval *array, char *str, uint length, int duplicate) | Adds a string with the desired length length to the array. This function is faster and binary-safe. Otherwise, behaves like add_next_index_string(). |
add_next_index_zval(zval *array, zval *value) | Adds a zval to the array. Useful for adding other arrays, objects, streams, etc. |
All these functions provide a handy abstraction to Zend's internal hash API. Of course, you can also use the hash functions directly - for example, if you already have a zval container allocated that you want to insert into an array. This is done using zend_hash_update() for associative arrays (see Adding an element to an associative array.) and zend_hash_index_update() for indexed arrays (see Adding an element to an indexed array.):
Example #11 Adding an element to an associative array.
zval *new_array, *new_element;
char *key = "element_key";
MAKE_STD_ZVAL(new_array);
MAKE_STD_ZVAL(new_element);
array_init(new_array);
ZVAL_LONG(new_element, 10);
if (zend_hash_update(new_array->value.ht, key, strlen(key) + 1, (void *)&new_element, sizeof(zval *), NULL) == FAILURE) {
    // do error handling here
}
Example #12 Adding an element to an indexed array.
zval *new_array, *new_element;
int key = 2;
MAKE_STD_ZVAL(new_array);
MAKE_STD_ZVAL(new_element);
array_init(new_array);
ZVAL_LONG(new_element, 10);
if (zend_hash_index_update(new_array->value.ht, key, (void *)&new_element, sizeof(zval *), NULL) == FAILURE) {
    // do error handling here
}
To emulate the functionality of add_next_index_*(), you can use this:
zend_hash_next_index_insert(ht, zval **new_element, sizeof(zval *), NULL)
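How an explicit index update and a "next index" insert interact can be sketched with a toy array. This is a stand-alone model, not the Zend hash implementation: model_array and its functions are hypothetical, the capacity is fixed, and the model assumes the behavior PHP arrays show at script level, where an append uses the highest integer index so far plus one.

```c
#include <assert.h>

#define MODEL_MAX_SLOTS 16

/* Hypothetical model of an indexed array: fixed capacity, long values only. */
typedef struct {
    long          values[MODEL_MAX_SLOTS];
    int           used[MODEL_MAX_SLOTS];
    unsigned long next_free;   /* index the next append will use */
} model_array;

void model_array_init(model_array *a)
{
    unsigned long i;
    for (i = 0; i < MODEL_MAX_SLOTS; i++) {
        a->used[i] = 0;
    }
    a->next_free = 0;
}

/* Counterpart of an index update: stores at an explicit index.
 * Returns 0 on success, -1 if the index does not fit this toy model. */
int model_index_update(model_array *a, unsigned long idx, long value)
{
    if (idx >= MODEL_MAX_SLOTS) {
        return -1;
    }
    a->values[idx] = value;
    a->used[idx] = 1;
    if (idx >= a->next_free) {
        a->next_free = idx + 1;   /* appends continue after the highest index */
    }
    return 0;
}

/* Counterpart of a next-index insert: appends and reports the index used,
 * or returns -1 if the model is full. */
long model_next_index_insert(model_array *a, long value)
{
    unsigned long idx = a->next_free;
    if (model_index_update(a, idx, value) != 0) {
        return -1;
    }
    return (long)idx;
}
```

In this model, appending to an empty array uses index 0; after an explicit update at index 5, the next append lands at index 6 rather than at 1.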
Note: To return arrays from a function, use array_init() and all following actions on the predefined variable return_value (given as argument to your exported function; see the earlier discussion of the call interface). You do not have to use MAKE_STD_ZVAL on this.
Tip: To avoid having to write new_array->value.ht every time, you can use HASH_OF(new_array), which is also recommended for compatibility and style reasons.
Since objects can be converted to arrays (and vice versa), you might have already guessed that they have a lot of similarities to arrays in PHP. Objects are maintained with the same hash functions, but there's a different API for creating them.
To initialize an object, you use the function object_init():
zval *new_object;
MAKE_STD_ZVAL(new_object);
if (object_init(new_object) != SUCCESS) {
    // do error handling here
}
Function | Description |
add_property_long(zval *object, char *key, long l) | Adds a long to the object. |
add_property_unset(zval *object, char *key) | Adds an unset property to the object. |
add_property_bool(zval *object, char *key, int b) | Adds a Boolean to the object. |
add_property_resource(zval *object, char *key, long r) | Adds a resource to the object. |
add_property_double(zval *object, char *key, double d) | Adds a double to the object. |
add_property_string(zval *object, char *key, char *str, int duplicate) | Adds a string to the object. |
add_property_stringl(zval *object, char *key, char *str, uint length, int duplicate) | Adds a string of the specified length to the object. This function is faster than add_property_string() and also binary-safe. |
add_property_zval(zval *object, char *key, zval *container) | Adds a zval container to the object. This is useful if you have to add properties which aren't simple types like integers or strings but arrays or other objects. |
Resources are a special kind of data type in PHP. The term resources doesn't really refer to any special kind of data, but to an abstraction method for maintaining any kind of information. Resources are kept in a special resource list within Zend. Each entry in the list has a corresponding type definition that denotes the kind of resource to which it refers. Zend then internally manages all references to this resource. Access to a resource is never possible directly - only via a provided API. As soon as all references to a specific resource are lost, a corresponding shutdown function is called.
For example, resources are used to store database links and file descriptors. The de facto standard implementation can be found in the MySQL module, but other modules such as the Oracle module also make use of resources.
Note: In fact, a resource can be a pointer to anything you need to handle in your functions (e.g. pointer to a structure) and the user only has to pass a single resource variable to your function.
To create a new resource you need to register a resource destruction handler for it. Since you can store any kind of data as a resource, Zend needs to know how to free this resource when it's no longer needed. This works by registering your own resource destruction handler with Zend, which Zend then calls whenever your resource can be freed (whether manually or automatically). Registering your resource handler returns the resource type handle for that resource. This handle is needed whenever you want to access a resource of this type later, and is usually stored in a global static variable within your extension. There is no need to worry about thread safety here because you only register your resource handler once during module initialization.
The Zend function to register your resource handler is defined as:
ZEND_API int zend_register_list_destructors_ex(rsrc_dtor_func_t ld, rsrc_dtor_func_t pld, char *type_name, int module_number);
There are two different kinds of resource destruction handlers you can pass to this function: a handler for normal resources and a handler for persistent resources. Persistent resources are used, for example, for database connections. When registering a resource, one of these handlers must be given; for the other handler, just pass NULL.
zend_register_list_destructors_ex() accepts the following parameters:
ld | Normal resource destruction handler callback |
pld | Persistent resource destruction handler callback |
type_name | A string specifying the name of your resource. It's a good idea to choose a name that is unique within PHP, so that when the user calls, for example, var_dump($resource); the name of the resource type is shown as well. |
module_number | The module_number is automatically available in your PHP_MINIT_FUNCTION function and therefore you just pass it over. |
The resource destruction handler (either normal or persistent resources) has the following prototype:
void resource_destruction_handler(zend_rsrc_list_entry *rsrc TSRMLS_DC);
The rsrc argument points to a structure of the following type:
typedef struct _zend_rsrc_list_entry {
    void *ptr;
    int type;
    int refcount;
} zend_rsrc_list_entry;
Now that we know how to start things, let's define the resource we want to register with Zend. It is only a simple structure with two integer members:
typedef struct {
    int resource_link;
    int resource_type;
} my_resource;
Our resource destruction handler will probably look something like this:
void my_destruction_handler(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    // You most likely cast the void pointer to your structure type
    my_resource *my_rsrc = (my_resource *) rsrc->ptr;
    // Now do whatever needs to be done with your resource. Closing
    // files, sockets, freeing additional memory, etc.
    // Also, don't forget to actually free the memory for your resource too!
    do_whatever_needs_to_be_done_with_the_resource(my_rsrc);
}
Note: If your resource is a more complex structure that contains pointers to memory you allocated at runtime, you have to free that memory before freeing the resource itself!
Now that we have defined what our resource is and what our resource destruction handler looks like, we can carry out the remaining steps:
create a global variable within the extension holding the resource ID so it can be accessed from every function which needs it
define the resource name
write the resource destruction handler
and finally register the handler
// Somewhere in your extension, define the variable for your registered resources.
// If you wondered what 'le' stands for: it simply means 'list entry'.
static int le_myresource;
// It's nice to define your resource name somewhere
#define le_myresource_name "My type of resource"
[...]
// Now actually define our resource destruction handler
void my_destruction_handler(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    my_resource *my_rsrc = (my_resource *) rsrc->ptr;
    do_whatever_needs_to_be_done_with_the_resource(my_rsrc);
}
[...]
PHP_MINIT_FUNCTION(my_extension)
{
    // Note that 'module_number' is already provided through the
    // PHP_MINIT_FUNCTION() function definition.
    le_myresource = zend_register_list_destructors_ex(my_destruction_handler, NULL, le_myresource_name, module_number);
    // You can register additional resources, initialize
    // your global vars, constants, whatever.
    return SUCCESS;
}
To actually register a new resource, you can use either the zend_register_resource() function or the ZEND_REGISTER_RESOURCE() macro, both defined in zend_list.h. Although the arguments for both map 1:1, it's a good idea to always use the macro to stay upward compatible:
int ZEND_REGISTER_RESOURCE(zval *rsrc_result, void *rsrc_pointer, int rsrc_type);
rsrc_result | This is an already initialized zval * container. |
rsrc_pointer | Your resource pointer you want to store. |
rsrc_type | The type which you received when you registered the resource destruction handler. If you followed the naming scheme this would be le_myresource. |
What is really going on when you register a new resource is it gets inserted in an internal list in Zend and the result is just stored in the given zval * container:
rsrc_id = zend_list_insert(rsrc_pointer, rsrc_type);
if (rsrc_result) {
    rsrc_result->value.lval = rsrc_id;
    rsrc_result->type = IS_RESOURCE;
}
return rsrc_id;
RETURN_RESOURCE(rsrc_id)
Note: It is common practice that if you want to return the resource immediately to the user you specify the return_value as the zval * container.
Zend now keeps track of all references to this resource. As soon as all references to the resource are lost, the destructor that you previously registered for this resource is called. The nice thing about this setup is that you don't have to worry about memory leakages introduced by allocations in your module - just register all memory allocations that your calling script will refer to as resources. As soon as the script decides it doesn't need them anymore, Zend will find out and tell you.
Once the user has the resource, at some point they will pass it back to one of your functions. The value.lval inside the zval * container contains the key to your resource and can thus be used to fetch the resource with the ZEND_FETCH_RESOURCE macro:
ZEND_FETCH_RESOURCE(rsrc, rsrc_type, rsrc_id, default_rsrc_id, resource_type_name, resource_type)
rsrc | This is your pointer which will point to your previously registered resource. |
rsrc_type | This is the typecast argument for your pointer, e.g. my_resource *. |
rsrc_id | This is the address of the zval *container the user passed to your function, e.g. &z_resource if zval *z_resource is given. |
default_rsrc_id | This integer specifies the default resource ID if no resource could be fetched or -1. |
resource_type_name | This is the name of the requested resource. It's a string and is used when the resource can't be found or is invalid to form a meaningful error message. |
resource_type | The resource_type you got back when registering the resource destruction handler. In our example this was le_myresource. |
To force removal of a resource from the list, use the function zend_list_delete(). You can also force the reference count to increase if you know that you're creating another reference for a previously allocated value (for example, if you're automatically reusing a default database link). For this case, use the function zend_list_addref(). To search for previously allocated resource entries, use zend_list_find(). The complete API can be found in zend_list.h.
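The life cycle described in this section (register a destructor, insert a pointer, add and drop references, destructor runs when the last reference is gone) can be modeled in a few dozen lines. The sketch below is a stand-alone illustration, not the contents of zend_list.h: every model_* name is hypothetical, the tables have fixed sizes, and the model assumes that a delete drops exactly one reference.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_TYPES     8
#define MODEL_MAX_RESOURCES 32

typedef void (*model_dtor_t)(void *ptr);

typedef struct {
    void *ptr;
    int   type;
    int   refcount;   /* 0 means the slot is free */
} model_entry;

model_dtor_t model_dtors[MODEL_MAX_TYPES];
int          model_type_count = 0;
model_entry  model_list[MODEL_MAX_RESOURCES];
int          model_dtor_calls = 0;   /* lets callers observe destructor runs */

/* Example destructor: counts its calls and frees the payload. */
void model_free_payload(void *ptr)
{
    model_dtor_calls++;
    free(ptr);
}

/* Registers a destructor and returns a type handle (compare le_myresource). */
int model_register_dtor(model_dtor_t dtor)
{
    assert(dtor != NULL && model_type_count < MODEL_MAX_TYPES);
    model_dtors[model_type_count] = dtor;
    return model_type_count++;
}

/* Inserts a pointer with one reference; returns its ID, or -1 if full. */
int model_list_insert(void *ptr, int type)
{
    int id;
    assert(type >= 0 && type < model_type_count);
    for (id = 0; id < MODEL_MAX_RESOURCES; id++) {
        if (model_list[id].refcount == 0) {
            model_list[id].ptr = ptr;
            model_list[id].type = type;
            model_list[id].refcount = 1;
            return id;
        }
    }
    return -1;
}

/* Looks an entry up; a type mismatch or a dead ID yields NULL. */
void *model_list_find(int id, int type)
{
    if (id < 0 || id >= MODEL_MAX_RESOURCES) return NULL;
    if (model_list[id].refcount == 0 || model_list[id].type != type) return NULL;
    return model_list[id].ptr;
}

/* Records one more reference to an existing entry. */
void model_list_addref(int id)
{
    assert(id >= 0 && id < MODEL_MAX_RESOURCES && model_list[id].refcount > 0);
    model_list[id].refcount++;
}

/* Drops one reference; the destructor runs when the last one is gone. */
void model_list_delete(int id)
{
    assert(id >= 0 && id < MODEL_MAX_RESOURCES && model_list[id].refcount > 0);
    if (--model_list[id].refcount == 0) {
        model_dtors[model_list[id].type](model_list[id].ptr);
        model_list[id].ptr = NULL;
    }
}
```

A typical sequence in this model is: register model_free_payload once, insert a malloc()ed block, add a second reference, then delete twice. The destructor runs only on the second delete, after which a find on the same ID returns NULL.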
In addition to the macros discussed earlier, a few macros allow easy creation of simple global variables. These are nice to know in case you want to introduce global flags, for example. This is somewhat bad practice, but Table Macros for Global Variable Creation describes macros that do exactly this task. They don't need any zval allocation; you simply have to supply a variable name and value.
Macro | Description |
SET_VAR_STRING(name, value) | Creates a new string. |
SET_VAR_STRINGL(name, value, length) | Creates a new string of the specified length. This macro is faster than SET_VAR_STRING and also binary-safe. |
SET_VAR_LONG(name, value) | Creates a new long. |
SET_VAR_DOUBLE(name, value) | Creates a new double. |
Zend supports the creation of true constants (as opposed to regular variables). Constants are accessed without the typical dollar sign ($) prefix and are available in all scopes. Examples include TRUE and FALSE, to name just two.
To create your own constants, you can use the macros in Macros for Creating Constants. All the macros create a constant with the specified name and value.
You can also specify flags for each constant:
CONST_CS - This constant's name is to be treated as case sensitive.
CONST_PERSISTENT - This constant is persistent and won't be "forgotten" when the current process carrying this constant shuts down.
// register a new constant of type "long"
REGISTER_LONG_CONSTANT("NEW_MEANINGFUL_CONSTANT", 324, CONST_CS | CONST_PERSISTENT);
Macro | Description |
REGISTER_LONG_CONSTANT(name, value, flags) REGISTER_MAIN_LONG_CONSTANT(name, value, flags) | Registers a new constant of type long. |
REGISTER_DOUBLE_CONSTANT(name, value, flags) REGISTER_MAIN_DOUBLE_CONSTANT(name, value, flags) | Registers a new constant of type double. |
REGISTER_STRING_CONSTANT(name, value, flags) REGISTER_MAIN_STRING_CONSTANT(name, value, flags) | Registers a new constant of type string. The specified string must reside in Zend's internal memory. |
REGISTER_STRINGL_CONSTANT(name, value, length, flags) REGISTER_MAIN_STRINGL_CONSTANT(name, value, length, flags) | Registers a new constant of type string. The string length is explicitly set to length. The specified string must reside in Zend's internal memory. |
Sooner or later, you may need to assign the contents of one zval container to another. This is easier said than done, since the zval container doesn't contain only type information, but also references to places in Zend's internal data. For example, depending on their size, arrays and objects may be nested with lots of hash table entries. By assigning one zval to another, you avoid duplicating the hash table entries, using only a reference to them (at most).
To copy this complex kind of data, use the copy constructor. Copy constructors are typically defined in languages that support operator overloading, with the express purpose of copying complex types. If you define an object in such a language, you have the possibility of overloading the "=" operator, which is usually responsible for assigning the contents of the rvalue (result of the evaluation of the right side of the operator) to the lvalue (same for the left side).
Overloading means assigning a different meaning to this operator, and is usually used to assign a function call to an operator. Whenever this operator would be used on such an object in a program, this function would be called with the lvalue and rvalue as parameters. Equipped with that information, it can perform the operation it intends the "=" operator to have (usually an extended form of copying).
This same form of "extended copying" is also necessary for PHP's zval containers. Again, in the case of an array, this extended copying would imply re-creation of all hash table entries relating to this array. For strings, proper memory allocation would have to be assured, and so on.
Zend ships with such a function, called zval_copy_ctor() (the previous PHP equivalent was pval_copy_constructor()).
A most useful demonstration is a function that accepts a complex type as argument, modifies it, and then returns the argument:
zval *parameter;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &parameter) == FAILURE) {
    return;
}
// do modifications to the parameter here
// now we want to return the modified container:
*return_value = *parameter;
zval_copy_ctor(return_value);
The first part of the function is plain-vanilla argument retrieval. After the (left out) modifications, however, it gets interesting: The container of parameter is assigned to the (predefined) return_value container. Now, in order to effectively duplicate its contents, the copy constructor is called. The copy constructor works directly with the supplied argument, and the standard return values are FAILURE on failure and SUCCESS on success.
If you omit the call to the copy constructor in this example, both parameter and return_value would point to the same internal data, meaning that return_value would be an illegal additional reference to the same data structures. Whenever changes occurred in the data that parameter points to, return_value might be affected. Thus, in order to create separate copies, the copy constructor must be used.
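The difference between plain assignment and assignment followed by a copy constructor can be reproduced with an ordinary C structure that owns a buffer. The following is a stand-alone sketch with hypothetical names (model_value, model_copy_ctor(), model_dtor()); it does not use the Zend API and uses malloc() instead of Zend's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a container that owns a string buffer. */
typedef struct {
    char  *val;
    size_t len;
} model_value;

/* Creates a value that owns a private copy of src. */
model_value model_value_new(const char *src)
{
    model_value v;
    v.len = strlen(src);
    v.val = malloc(v.len + 1);
    assert(v.val != NULL);
    memcpy(v.val, src, v.len + 1);
    return v;
}

/* Copy constructor: replaces the shared buffer with a private duplicate.
 * It works in place on its argument, as the text describes. */
void model_copy_ctor(model_value *v)
{
    char *copy = malloc(v->len + 1);
    assert(copy != NULL);
    memcpy(copy, v->val, v->len + 1);
    v->val = copy;
}

/* Destructor: releases the buffer owned by the value. */
void model_dtor(model_value *v)
{
    free(v->val);
    v->val = NULL;
    v->len = 0;
}
```

After a plain structure assignment, both values point at the same buffer, so destroying both would free it twice. Calling model_copy_ctor() on the copy gives it its own buffer, and later changes to the original no longer show through.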
The copy constructor's counterpart in the Zend API, the destructor zval_dtor(), does the opposite of the constructor.
Returning values from your functions to PHP was described briefly in an earlier section; this section gives the details. Return values are passed via the return_value variable, which is passed to your functions as argument. The return_value argument consists of a zval container (see the earlier discussion of the call interface) that you can freely modify. The container itself is already allocated, so you don't have to run MAKE_STD_ZVAL on it. Instead, you can access its members directly.
To make returning values from functions easier and to prevent hassles with accessing the internal structures of the zval container, a set of predefined macros is available (as usual). These macros automatically set the corresponding type and value, as described in Predefined Macros for Returning Values from a Function and Predefined Macros for Setting the Return Value of a Function.
Note: The macros in Predefined Macros for Returning Values from a Function automatically return from your function; those in Predefined Macros for Setting the Return Value of a Function only set the return value and don't return from your function.
Macro | Description |
RETURN_RESOURCE(resource) | Returns a resource. |
RETURN_BOOL(bool) | Returns a Boolean. |
RETURN_NULL() | Returns nothing (a NULL value). |
RETURN_LONG(long) | Returns a long. |
RETURN_DOUBLE(double) | Returns a double. |
RETURN_STRING(string, duplicate) | Returns a string. The duplicate flag indicates whether the string should be duplicated using estrdup(). |
RETURN_STRINGL(string, length, duplicate) | Returns a string of the specified length; otherwise, behaves like RETURN_STRING. This macro is faster and binary-safe, however. |
RETURN_EMPTY_STRING() | Returns an empty string. |
RETURN_FALSE | Returns Boolean false. |
RETURN_TRUE | Returns Boolean true. |
Macro | Description |
RETVAL_RESOURCE(resource) | Sets the return value to the specified resource. |
RETVAL_BOOL(bool) | Sets the return value to the specified Boolean value. |
RETVAL_NULL() | Sets the return value to NULL. |
RETVAL_LONG(long) | Sets the return value to the specified long. |
RETVAL_DOUBLE(double) | Sets the return value to the specified double. |
RETVAL_STRING(string, duplicate) | Sets the return value to the specified string and duplicates it to Zend internal memory if desired (see also RETURN_STRING). |
RETVAL_STRINGL(string, length, duplicate) | Sets the return value to the specified string and forces the length to become length (see also RETVAL_STRING). This macro is faster and binary-safe, and should be used whenever the string length is known. |
RETVAL_EMPTY_STRING() | Sets the return value to an empty string. |
RETVAL_FALSE | Sets the return value to Boolean false. |
RETVAL_TRUE | Sets the return value to Boolean true. |
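The practical difference between the two macro families in the tables above is control flow. The stand-alone sketch below mimics it with two hypothetical macros, MODEL_RETVAL_LONG and MODEL_RETURN_LONG, operating on a hypothetical model_retval structure; these are illustrations only, not the definitions found in the Zend headers.

```c
#include <assert.h>

/* Hypothetical model of the return_value container (long values only). */
typedef struct { long lval; } model_retval;

/* Sets the return value and keeps executing. */
#define MODEL_RETVAL_LONG(n) do { return_value->lval = (n); } while (0)

/* Sets the return value and leaves the function immediately. */
#define MODEL_RETURN_LONG(n) do { MODEL_RETVAL_LONG(n); return; } while (0)

int model_cleanup_runs = 0;   /* counts how often the cleanup code ran */

/* Returns early: the cleanup line is never reached. */
void model_with_return(model_retval *return_value)
{
    MODEL_RETURN_LONG(1);
    model_cleanup_runs++;   /* unreachable */
}

/* Only sets the value: the cleanup line still runs before returning. */
void model_with_retval(model_retval *return_value)
{
    MODEL_RETVAL_LONG(2);
    model_cleanup_runs++;
}
```

The "set only" form is the one to reach for when a function still has work to do after deciding on its result, such as freeing temporary buffers.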
Complex types such as arrays and objects can be returned by using array_init() and object_init(), as well as the corresponding hash functions on return_value. Since these types cannot be constructed of trivial information, there are no predefined macros for them.
Often it's necessary to print messages to the output stream from your module, just as print() would be used within a script. PHP offers functions for most generic tasks, such as printing warning messages, generating output for phpinfo(), and so on. The following sections provide more details.
zend_printf() works like the standard printf(), except that it prints to Zend's output stream.
zend_error() can be used to generate error messages. This function accepts two arguments; the first is the error type (see zend_errors.h), and the second is the error message.
zend_error(E_WARNING, "This function has been called with empty arguments");
Error | Description |
E_ERROR | Signals an error and terminates execution of the script immediately. |
E_WARNING | Signals a generic warning. Execution continues. |
E_PARSE | Signals a parser error. Execution continues. |
E_NOTICE | Signals a notice. Execution continues. Note that by default the display of this type of error messages is turned off in php.ini. |
E_CORE_ERROR | Internal error by the core; shouldn't be used by user-written modules. |
E_COMPILE_ERROR | Internal error by the compiler; shouldn't be used by user-written modules. |
E_COMPILE_WARNING | Internal warning by the compiler; shouldn't be used by user-written modules. |
After creating a real module, you'll want to show information about the module in phpinfo() (in addition to the module name, which appears in the module list by default). PHP allows you to create your own section in the phpinfo() output with the ZEND_MINFO() function. This function should be placed in the module descriptor block (discussed earlier) and is always called whenever a script calls phpinfo().
PHP automatically prints a section in phpinfo() for you if you specify the ZEND_MINFO function, including the module name in the heading. Everything else must be formatted and printed by you.
Typically, you can print an HTML table header using php_info_print_table_start() and then use the standard functions php_info_print_table_header() and php_info_print_table_row(). As arguments, both take the number of columns (as integers) and the column contents (as strings). Source code and screenshot for output in phpinfo. shows a source example and its output. To print the table footer, use php_info_print_table_end().
Example #13 Source code and screenshot for output in phpinfo().
php_info_print_table_start();
php_info_print_table_header(2, "First column", "Second column");
php_info_print_table_row(2, "Entry in first row", "Another entry");
php_info_print_table_row(2, "Just to fill", "another row here");
php_info_print_table_end();
You can also print execution information, such as the current file being executed. The name of the function currently being executed can be retrieved using the function get_active_function_name(). This function returns a pointer to the function name and doesn't accept any arguments. To retrieve the name of the file currently being executed, use zend_get_executed_filename(). This function accesses the executor globals, which are passed to it using the TSRMLS_C macro. The executor globals are automatically available to every function that's called directly by Zend (they're part of the INTERNAL_FUNCTION_PARAMETERS described earlier in this chapter). If you want to access the executor globals in another function that doesn't have them available automatically, call the macro TSRMLS_FETCH() once in that function; this will introduce them to your local scope.
Finally, the line number currently being executed can be retrieved using the function zend_get_executed_lineno(). This function also requires the executor globals as arguments. For examples of these functions, see Printing execution information..
Example #14 Printing execution information.
zend_printf("The name of the current function is %s<br>", get_active_function_name(TSRMLS_C));
zend_printf("The file currently executed is %s<br>", zend_get_executed_filename(TSRMLS_C));
zend_printf("The current line being executed is %i<br>", zend_get_executed_lineno(TSRMLS_C));
Startup and shutdown functions can be used for one-time initialization and deinitialization of your modules. As discussed earlier in this chapter (see the description of the Zend module descriptor block), there are module and request startup and shutdown events.
The module startup and shutdown functions are called whenever a module is loaded and needs initialization; the request startup and shutdown functions are called every time a request is processed (meaning that a file is being executed).
For dynamic extensions, module and request startup/shutdown events happen at the same time.
Declaration and implementation of these functions can be done with macros; see the earlier section "Declaration of the Zend Module Block" for details.
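The ordering of these events can be modeled in a few lines of plain C. The following is only a sketch and does not use the PHP headers: the demo_lifecycle structure and every demo_* function are hypothetical stand-ins for the real startup and shutdown handlers, and the sketch assumes a module that is loaded once and then serves several requests.

```c
#include <assert.h>

/* Hypothetical bookkeeping structure; not part of the Zend API. */
typedef struct {
    int module_started;   /* 1 between module startup and module shutdown */
    int minit_calls;      /* module startup events seen */
    int rinit_calls;      /* request startup events seen */
    int rshutdown_calls;  /* request shutdown events seen */
    int mshutdown_calls;  /* module shutdown events seen */
} demo_lifecycle;

/* Stand-in for the module startup function: one-time initialization. */
void demo_minit(demo_lifecycle *lc)
{
    lc->module_started = 1;
    lc->minit_calls++;
}

/* Stand-in for the request startup function: runs before each script. */
void demo_rinit(demo_lifecycle *lc)
{
    assert(lc->module_started);  /* requests only arrive after module startup */
    lc->rinit_calls++;
}

/* Stand-in for the request shutdown function: runs after each script. */
void demo_rshutdown(demo_lifecycle *lc)
{
    lc->rshutdown_calls++;
}

/* Stand-in for the module shutdown function: one-time deinitialization. */
void demo_mshutdown(demo_lifecycle *lc)
{
    lc->module_started = 0;
    lc->mshutdown_calls++;
}

/* Fires the events in order: module startup once, then one request
   startup/shutdown pair per request, then module shutdown once. */
void demo_serve(demo_lifecycle *lc, int requests)
{
    int i;

    assert(requests >= 0);
    demo_minit(lc);
    for (i = 0; i < requests; i++) {
        demo_rinit(lc);
        /* the script itself would be executed here */
        demo_rshutdown(lc);
    }
    demo_mshutdown(lc);
}
```

In this model, resources that are expensive to set up belong in the module-level pair, while per-script state belongs in the request-level pair.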
You can call user functions from your own modules, which is very handy when implementing callbacks; for example, for array walking, searching, or simply for event-based programs.
User functions can be called with the function call_user_function_ex(). It requires a hash table for the function table you want to access, a pointer to an object (if you want to call a method), the function name, return value, number of arguments, argument array, and a flag indicating whether you want to perform zval separation.
ZEND_API int call_user_function_ex(HashTable *function_table, zval *object, zval *function_name, zval **retval_ptr_ptr, int param_count, zval **params[], int no_separation);
Note that you don't have to specify both function_table and object; either will do. If you want to call a method, you have to supply the object that contains this method, in which case call_user_function_ex() automatically sets the function table to this object's function table. Otherwise, you only need to specify function_table and can set object to NULL.
Usually, the default function table is the "root" function table containing all function entries. This function table is part of the compiler globals and can be accessed using the macro CG. To introduce the compiler globals to your function, call the macro TSRMLS_FETCH once.
The function name is specified in a zval container. This might be a bit surprising at first, but is quite a logical step, since most of the time you'll accept function names as parameters from calling functions within your script, which in turn are contained in zval containers again. Thus, you only have to pass your arguments through to this function. This zval must be of type IS_STRING.
The next argument consists of a pointer to the return value. You don't have to allocate memory for this container; the function will do so by itself. However, you have to destroy this container afterward (Example #15 does so with zval_ptr_dtor())!
Next is the parameter count as integer and an array containing all necessary parameters. The last argument specifies whether the function should perform zval separation - this should always be set to 0. If set to 1, the function consumes less memory but fails if any of the parameters need separation.
Example #15 shows a small demonstration of calling a user function. The code calls a function that's supplied to it as argument and directly passes this function's return value through as its own return value. Note the use of the constructor and destructor calls at the end - it might not be necessary to do it this way here (since they should be separate values, the assignment might be safe), but this is bulletproof.
Example #15 Calling user functions.
zval **function_name;
zval *retval;

if((ZEND_NUM_ARGS() != 1) || (zend_get_parameters_ex(1, &function_name) != SUCCESS))
{
    WRONG_PARAM_COUNT;
}

if((*function_name)->type != IS_STRING)
{
    zend_error(E_ERROR, "Function requires string argument");
}

TSRMLS_FETCH();

if(call_user_function_ex(CG(function_table), NULL, *function_name, &retval, 0, NULL, 0) != SUCCESS)
{
    zend_error(E_ERROR, "Function call failed");
}

zend_printf("We have %i as type\n", retval->type);

/* copy the result into our own return value, then release the original */
*return_value = *retval;
zval_copy_ctor(return_value);
zval_ptr_dtor(&retval);

<?php

dl("call_userland.so");

function test_function()
{
    echo "We are in the test function!\n";

    return 'hello';
}

$return_value = call_userland("test_function");

echo "Return value: '$return_value'";
?>
The above example will output:
We are in the test function!
We have 3 as type
Return value: 'hello'
PHP 4 features a redesigned initialization file support. It's now possible to specify default initialization entries directly in your code, read and change these values at runtime, and create message handlers for change notifications.
To create an .ini section in your own module, use the macros PHP_INI_BEGIN() to mark the beginning of such a section and PHP_INI_END() to mark its end. In between you can use PHP_INI_ENTRY() to create entries.
PHP_INI_BEGIN()
PHP_INI_ENTRY("first_ini_entry", "has_string_value", PHP_INI_ALL, NULL)
PHP_INI_ENTRY("second_ini_entry", "2", PHP_INI_SYSTEM, OnChangeSecond)
PHP_INI_ENTRY("third_ini_entry", "xyz", PHP_INI_USER, NULL)
PHP_INI_END()
The PHP_INI_ENTRY() macro accepts four parameters: the entry name, the entry value, its change permissions, and a pointer to a change-notification handler. Both the entry name and the value must be given as strings. The permissions are grouped into three sections: PHP_INI_SYSTEM allows a change only directly in the php.ini file; PHP_INI_USER allows a value to be overridden by a user at runtime using additional configuration files, such as .htaccess; and PHP_INI_ALL allows changes to be made without restrictions. There's also a fourth level, PHP_INI_PERDIR, whose behavior we have not yet been able to verify.
The fourth parameter consists of a pointer to a change-notification handler. Whenever one of these initialization entries is changed, this handler is called. Such a handler can be declared using the PHP_INI_MH macro:
PHP_INI_MH(OnChangeSecond); // handler for ini-entry "second_ini_entry"

// specify ini-entries here

PHP_INI_MH(OnChangeSecond)
{
    zend_printf("Message caught, our ini entry has been changed to %s<br>", new_value);

    return(SUCCESS);
}
#define PHP_INI_MH(name) int name(php_ini_entry *entry, char *new_value, uint new_value_length, void *mh_arg1, void *mh_arg2, void *mh_arg3)
The change-notification handlers should be used to cache initialization entries locally for faster access or to perform certain tasks that are required if a value changes. For example, if a constant connection to a certain host is required by a module and someone changes the hostname, automatically terminate the old connection and attempt a new one.
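The caching use case can be sketched in plain C. The following is a minimal model, not code that builds against the PHP headers: php_ini_entry, SUCCESS, and FAILURE are reduced to stand-ins, the PHP_INI_MH macro is reproduced from the definition above with uint spelled out as unsigned int, and OnChangeHost, demo_cached_host, and demo_reconnects are hypothetical names. The sketch also assumes the handler is always given a non-NULL value.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins so the sketch compiles without the PHP headers (assumptions). */
#define SUCCESS 0
#define FAILURE -1
typedef struct { const char *name; } php_ini_entry;

/* Handler signature as given in the text, with uint written as unsigned int. */
#define PHP_INI_MH(name) int name(php_ini_entry *entry, char *new_value, \
    unsigned int new_value_length, void *mh_arg1, void *mh_arg2, void *mh_arg3)

#define DEMO_HOST_MAX 64

/* Local cache of the entry, so other code need not look it up each time. */
char demo_cached_host[DEMO_HOST_MAX] = "";

/* Counts how often a reconnect would have been triggered. */
int demo_reconnects = 0;

PHP_INI_MH(OnChangeHost)
{
    (void)entry; (void)mh_arg1; (void)mh_arg2; (void)mh_arg3;

    assert(new_value != NULL);  /* assumption of this sketch */

    /* Reject values that do not fit into the local buffer. */
    if (new_value_length >= DEMO_HOST_MAX) {
        return FAILURE;
    }

    /* Copy the value instead of keeping the pointer, since the string
       passed to the handler is not owned by the module. */
    memcpy(demo_cached_host, new_value, new_value_length);
    demo_cached_host[new_value_length] = '\0';

    /* A real module would terminate the old connection and attempt
       a new one here. */
    demo_reconnects++;

    return SUCCESS;
}
```

Copying the value into module-owned memory follows the same reasoning as the note on INI_STR(): a pointer into internal data should not be held beyond the call.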
Access to initialization entries can also be handled with the macros shown in the following table, "Macros to Access Initialization Entries in PHP".
Macro | Description |
INI_INT(name) | Returns the current value of entry name as integer (long). |
INI_FLT(name) | Returns the current value of entry name as float (double). |
INI_STR(name) | Returns the current value of entry name as string. Note: This string is not duplicated, but instead points to internal data. Further access requires duplication to local memory. |
INI_BOOL(name) | Returns the current value of entry name as Boolean (defined as zend_bool, which currently means unsigned char). |
INI_ORIG_INT(name) | Returns the original value of entry name as integer (long). |
INI_ORIG_FLT(name) | Returns the original value of entry name as float (double). |
INI_ORIG_STR(name) | Returns the original value of entry name as string. Note: This string is not duplicated, but instead points to internal data. Further access requires duplication to local memory. |
INI_ORIG_BOOL(name) | Returns the original value of entry name as Boolean (defined as zend_bool, which currently means unsigned char). |
Finally, you have to introduce your initialization entries to PHP. This can be done in the module startup and shutdown functions, using the macros REGISTER_INI_ENTRIES() and UNREGISTER_INI_ENTRIES():
ZEND_MINIT_FUNCTION(mymodule)
{
    REGISTER_INI_ENTRIES();

    return SUCCESS;
}

ZEND_MSHUTDOWN_FUNCTION(mymodule)
{
    UNREGISTER_INI_ENTRIES();

    return SUCCESS;
}
You've learned a lot about PHP. You now know how to create dynamically loadable modules and statically linked extensions. You've learned how PHP and Zend deal with internal storage of variables and how you can create and access these variables. You know a good set of tool functions that handle routine tasks such as printing informational texts, automatically introducing variables to the symbol table, and so on.
Even though this chapter often had a mostly "referential" character, we hope that it gave you insight into how to start writing your own extensions. For the sake of space, we had to leave out a lot; we suggest that you take the time to study the header files and some modules (especially the ones in the ext/standard directory and the MySQL module, as these implement commonly known functionality). This will give you an idea of how other people have used the API functions - particularly those that didn't make it into this chapter.
The file config.m4 is processed by buildconf and must contain all the instructions to be executed during configuration. For example, these can include tests for required external files, such as header files, libraries, and so on. PHP defines a set of macros that can be used in this process, the most useful of which are described in the following table, "M4 Macros for config.m4".
Macro | Description |
AC_MSG_CHECKING(message) | Prints a "checking <message>" text during configure. |
AC_MSG_RESULT(value) | Gives the result to AC_MSG_CHECKING; should specify either yes or no as value. |
AC_MSG_ERROR(message) | Prints message as error message during configure and aborts the script. |
AC_DEFINE(name,value,description) | Adds #define to php_config.h with the value of value and a comment that says description (this is useful for conditional compilation of your module). |
AC_ADD_INCLUDE(path) | Adds a compiler include path; for example, used if the module needs to add search paths for header files. |
AC_ADD_LIBRARY_WITH_PATH(libraryname,librarypath) | Specifies an additional library to link. |
AC_ARG_WITH(modulename,description,unconditionaltest,conditionaltest) | Quite a powerful macro, adding the module with description to the configure --help output. PHP checks whether the option --with-<modulename> is given to the configure script. If so, it runs the script unconditionaltest (for example, --with-myext=yes), in which case the value of the option is contained in the variable $withval. Otherwise, it executes conditionaltest. |
PHP_EXTENSION(modulename, [shared]) | This macro is a must to call for PHP to configure your extension. You can supply a second argument in addition to your module name, indicating whether you intend compilation as a shared module. This will result in a definition at compile time for your source as COMPILE_DL_<modulename>. |
A set of macros was introduced into Zend's API that simplify access to zval containers (see the following table, "API Macros for Accessing zval Containers").
Macro | Refers to |
Z_LVAL(zval) | (zval).value.lval |
Z_DVAL(zval) | (zval).value.dval |
Z_STRVAL(zval) | (zval).value.str.val |
Z_STRLEN(zval) | (zval).value.str.len |
Z_ARRVAL(zval) | (zval).value.ht |
Z_LVAL_P(zval_p) | (*zval_p).value.lval |
Z_DVAL_P(zval_p) | (*zval_p).value.dval |
Z_STRVAL_P(zval_p) | (*zval_p).value.str.val |
Z_STRLEN_P(zval_p) | (*zval_p).value.str.len |
Z_ARRVAL_P(zval_p) | (*zval_p).value.ht |
Z_LVAL_PP(zval_pp) | (**zval_pp).value.lval |
Z_DVAL_PP(zval_pp) | (**zval_pp).value.dval |
Z_STRVAL_PP(zval_pp) | (**zval_pp).value.str.val |
Z_STRLEN_PP(zval_pp) | (**zval_pp).value.str.len |
Z_ARRVAL_PP(zval_pp) | (**zval_pp).value.ht |
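The three macro families differ only in how many levels of indirection they remove. The following self-contained sketch illustrates this with a simplified stand-in structure; demo_zval, demo_set_long(), and demo_strlen_of() are hypothetical names, the real zval definition lives in the Zend headers, and only a few of the macros are written out, following the expansions listed in the table.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the PHP 4 value union (assumption, not the
   real Zend definition; the hash table member is left out). */
typedef union {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
} demo_zvalue_value;

/* Simplified stand-in for the zval container. */
typedef struct {
    demo_zvalue_value value;
    unsigned char type;
} demo_zval;

/* Accessors for a container, a pointer to it, and a pointer to a pointer. */
#define Z_LVAL(z)          (z).value.lval
#define Z_STRVAL(z)        (z).value.str.val
#define Z_STRLEN(z)        (z).value.str.len
#define Z_LVAL_P(z_p)      (*(z_p)).value.lval
#define Z_STRLEN_P(z_p)    (*(z_p)).value.str.len
#define Z_LVAL_PP(z_pp)    (**(z_pp)).value.lval
#define Z_STRLEN_PP(z_pp)  (**(z_pp)).value.str.len

/* Writes a long through a single pointer, the way an extension
   function would fill in its return value. */
void demo_set_long(demo_zval *z_p, long n)
{
    assert(z_p != NULL);
    Z_LVAL_P(z_p) = n;
}

/* Reads a string length through a pointer to a pointer, the form in
   which zend_get_parameters_ex() hands over arguments. */
int demo_strlen_of(demo_zval **z_pp)
{
    assert(z_pp != NULL && *z_pp != NULL);
    return Z_STRLEN_PP(z_pp);
}
```

Because the value is a union, storing a string in a container overwrites a long stored there earlier; the macros only select a member and do not check the container's type.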