Chapter 10. Compiled (C) Language Functions

Functions written in C can be compiled into dynamically loadable objects (also called shared libraries), and used to implement user-defined SQL functions. The first time a user-defined function in a particular loadable object file is called in a backend session, the dynamic loader loads that object file into memory so that the function can be called. The CREATE FUNCTION command with the "LANGUAGE 'C'" clause is used to register a C function so that it can be invoked (or referenced) in other SQL commands. (Refer to the Red Hat Database SQL Guide and Reference for the syntax and other information.)

Calling Conventions for C-Language Functions

The calling convention relies on macros to suppress most of the complexity of passing arguments and results. The C declaration of a function is always
Datum funcname(PG_FUNCTION_ARGS)
In addition, the macro call
PG_FUNCTION_INFO_V1(funcname);
must appear in the same source file (conventionally it is written just before the function itself). This macro call is not needed for "internal"-language functions. However, it is required for dynamically-loaded functions.

Note

Two different calling conventions are currently accepted for C functions. The newer "version 1" calling convention is indicated by writing a PG_FUNCTION_INFO_V1() macro call for the function, as illustrated above. Lack of such a macro indicates an old-style ("version 0") function. The language name specified in CREATE FUNCTION is 'C' in either case. Old-style functions are now deprecated because of portability problems and lack of functionality, but they are still supported for compatibility reasons.

In a function, each actual argument is fetched using a PG_GETARG_xxx() macro that corresponds to the argument's datatype, and the result is returned using a PG_RETURN_xxx() macro for the return type.

Here are some examples:
#include "postgres.h"
#include <string.h>
#include "utils/geo_decls.h"
#include "fmgr.h"

/* By Value */

PG_FUNCTION_INFO_V1(add_one);
         
Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}

/* By Reference, Fixed Length */

PG_FUNCTION_INFO_V1(add_one_float8);

Datum
add_one_float8(PG_FUNCTION_ARGS)
{
    /* The macros for FLOAT8 hide 
     * its pass-by-reference nature 
     */
    float8   arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg + 1.0);
}

PG_FUNCTION_INFO_V1(makepoint);

Datum
makepoint(PG_FUNCTION_ARGS)
{
    /* Here, the pass-by-reference nature
     * of Point is not hidden            
    */
    Point  *pointx = PG_GETARG_POINT_P(0);
    Point  *pointy = PG_GETARG_POINT_P(1);
    Point  *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;
       
    PG_RETURN_POINT_P(new_point);
}

/* By Reference, Variable Length */

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text     *t = PG_GETARG_TEXT_P(0);
    /*
     * VARSIZE is the total size 
     *  of the struct in bytes.
     */
    text  *new_t = (text *) palloc(VARSIZE(t));
    VARATT_SIZEP(new_t) = VARSIZE(t);
    /*
     * VARDATA is a pointer to the 
     * data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t)-VARHDRSZ);    /* how many bytes */
    PG_RETURN_TEXT_P(new_t);
}

PG_FUNCTION_INFO_V1(concat_text);

Datum
concat_text(PG_FUNCTION_ARGS)
{
    text  *arg1 = PG_GETARG_TEXT_P(0);
    text  *arg2 = PG_GETARG_TEXT_P(1);
    int32 new_text_size = VARSIZE(arg1) + 
       VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    VARATT_SIZEP(new_text) = new_text_size;
    memcpy(VARDATA(new_text), VARDATA(arg1), 
           VARSIZE(arg1)-VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
    PG_RETURN_TEXT_P(new_text);
}

Supposing that the above code has been prepared in file funcs_new.c and compiled into a shared object, we could define the functions to PostgreSQL with commands like this:
CREATE FUNCTION add_one(int4) RETURNS int4
     AS 'PGROOT/tutorial/funcs_new.so' LANGUAGE 'c'
     WITH (isStrict);

-- note overloading of SQL function name add_one()
CREATE FUNCTION add_one(float8) RETURNS float8
     AS 'PGROOT/tutorial/funcs_new.so',
        'add_one_float8'
     LANGUAGE 'c' WITH (isStrict);

CREATE FUNCTION makepoint(point, point) RETURNS point
     AS 'PGROOT/tutorial/funcs_new.so' LANGUAGE 'c'
     WITH (isStrict);
                         
CREATE FUNCTION copytext(text) RETURNS text
     AS 'PGROOT/tutorial/funcs_new.so' LANGUAGE 'c'
     WITH (isStrict);

CREATE FUNCTION concat_text(text, text) RETURNS text
     AS 'PGROOT/tutorial/funcs_new.so' LANGUAGE 'c'
     WITH (isStrict);

Here PGROOT stands for the full path to the Red Hat Database source tree. Note that depending on your system, the filename for a shared object might not end in .so, but in .sl or something else; adapt accordingly.

Notice that we have specified the functions as "strict", meaning that the system should automatically assume a NULL result if any input value is NULL. By doing this, we avoid having to check for NULL inputs in the function code. Without this, we would have to check for NULLs explicitly.

At first glance, the coding conventions may appear to be just pointless obscurantism. However, they do offer a number of improvements, because the macros can hide unnecessary detail. An example is that in coding add_one_float8, we do not need to be aware that float8 is a pass-by-reference type. Another example is that the GETARG macros for variable-length types hide the need to deal with fetching "toasted" (compressed or out-of-line) values.

The macro PG_ARGISNULL(n) allows a function to test whether each input is NULL (of course, doing this is necessary only in functions not declared "strict"). As with the PG_GETARG_xxx() macros, the input arguments are counted beginning at zero. To return a NULL result, execute PG_RETURN_NULL(); this works in both strict and non-strict functions.

The function call conventions make it possible to return "set" results and implement trigger functions and procedural-language call handlers. The code is also portable because it does not break ANSI C restrictions on function call protocol.