1 The Erl_Interface Library User's Guide
The Erl_Interface library contains functions that help you integrate programs written in C and Erlang. The functions in Erl_Interface support the following:
- manipulation of data represented as Erlang data types
- conversion of data between C and Erlang formats
- encoding and decoding of Erlang data types for transmission or storage
- communication between C nodes and Erlang processes
- backup and restore of C node state to and from Mnesia
In the following sections, these topics are described:
- compiling your code for use with Erl_Interface
- initializing Erl_Interface
- encoding, decoding, and sending Erlang terms
- building terms and patterns
- pattern matching
- connecting to a distributed Erlang node
- using EPMD
- sending and receiving Erlang messages
- remote procedure calls
- global names
- the registry
1.1 Compiling and Linking Your Code
In order to use any of the Erl_Interface functions, include the following lines in your code:
#include "erl_interface.h"
#include "ei.h"
Determine where the top directory of your OTP installation is. You can find this out by starting Erlang and entering the following command at the Eshell prompt:
Eshell V4.7.4 (abort with ^G)
1> code:root_dir().
/usr/local/otp
To compile your code, make sure that your C compiler knows where to find erl_interface.h by specifying an appropriate -I argument on the command line, or by adding it to the CFLAGS definition in your Makefile. The correct value for this path is $OTPROOT/lib/erl_interface-Vsn/include, where $OTPROOT is the path reported by code:root_dir/0 in the above example, and Vsn is the version of the Erl_Interface application, so that the directory is, for example, erl_interface-3.2.3:
$ cc -c -I/usr/local/otp/lib/erl_interface-3.2.3/include myprog.c
When linking, you will need to specify the path to liberl_interface.a and libei.a with -L$OTPROOT/lib/erl_interface-3.2.3/lib, and you will need to specify the name of the libraries with -lerl_interface -lei. You can do this on the command line or by adding the flags to the LDFLAGS definition in your Makefile.
$ ld -L/usr/local/otp/lib/erl_interface-3.2.3/lib myprog.o -lerl_interface -lei -o myprog
Also, on some systems it may be necessary to link with some additional libraries (e.g. libnsl.a and libsocket.a on Solaris, or wsock32.lib on Windows) in order to use the communication facilities of Erl_Interface.
If you are using Erl_Interface functions in a threaded application based on POSIX threads or Solaris threads, then Erl_Interface needs access to some of the synchronization facilities in your threads package, and you will need to specify additional compiler flags in order to indicate which of the packages you are using. Define _REENTRANT and either STHREADS or PTHREADS. The default is to use POSIX threads if _REENTRANT is specified.
1.2 Initializing the Erl_Interface Library
Before calling any of the other Erl_Interface functions, you must call erl_init() exactly once to initialize the library. erl_init() takes two arguments; however, they are no longer used by Erl_Interface and should therefore be specified as erl_init(NULL,0).
1.3 Encoding, Decoding and Sending Erlang Terms
Data sent between distributed Erlang nodes is encoded in the Erlang external format. Consequently, you have to encode and decode Erlang terms into byte streams if you want to use the distribution protocol to communicate between a C program and Erlang.
The Erl_Interface library supports this activity. It has a number of C functions which create and manipulate Erlang data structures. The library also contains an encode and a decode function. The example below shows how to create and encode an Erlang tuple {tobbe,3928}:
ETERM *arr[2], *tuple;
unsigned char buf[BUFSIZ];   /* erl_encode() writes raw bytes */
int i;

arr[0] = erl_mk_atom("tobbe");
arr[1] = erl_mk_integer(3928);
tuple = erl_mk_tuple(arr, 2);
i = erl_encode(tuple, buf);  /* i is the number of bytes written to buf */
Alternatively, you can use erl_send() and erl_receive_msg(), which handle the encoding and decoding of messages transparently.
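To make the idea of a byte stream concrete, the following sketch builds the external-format bytes for {tobbe,3928} by hand, using only the standard C library. The helper encode_atom_int_tuple() is hypothetical and is not part of Erl_Interface. The tag values come from the Erlang external term format; the sketch assumes the legacy ATOM_EXT atom encoding, and the bytes that a particular library version emits for the same term may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag bytes defined by the Erlang external term format. */
enum {
    EXT_VERSION     = 131, /* leading version byte */
    EXT_INTEGER     = 98,  /* INTEGER_EXT: 32-bit big-endian signed integer */
    EXT_ATOM        = 100, /* ATOM_EXT: 2-byte length, then the name bytes */
    EXT_SMALL_TUPLE = 104  /* SMALL_TUPLE_EXT: 1-byte arity, then elements */
};

/* Hypothetical helper (not an Erl_Interface function): writes the
 * external-format bytes for the tuple {Atom, Int} into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_atom_int_tuple(unsigned char *buf, size_t cap,
                             const char *atom, int32_t value)
{
    size_t len = strlen(atom);
    size_t need = 1 + 2 + (3 + len) + 5; /* version, tuple, atom, integer */
    uint32_t u = (uint32_t)value;
    size_t n = 0;

    assert(buf != NULL && len <= 255);
    if (cap < need)
        return 0;

    buf[n++] = EXT_VERSION;
    buf[n++] = EXT_SMALL_TUPLE;
    buf[n++] = 2;                          /* arity */
    buf[n++] = EXT_ATOM;
    buf[n++] = (unsigned char)(len >> 8);  /* atom length, big-endian */
    buf[n++] = (unsigned char)(len & 0xff);
    memcpy(buf + n, atom, len);
    n += len;
    buf[n++] = EXT_INTEGER;
    buf[n++] = (unsigned char)((u >> 24) & 0xff);
    buf[n++] = (unsigned char)((u >> 16) & 0xff);
    buf[n++] = (unsigned char)((u >> 8) & 0xff);
    buf[n++] = (unsigned char)(u & 0xff);
    return n;
}
```

Calling encode_atom_int_tuple(buf, sizeof buf, "tobbe", 3928) fills 16 bytes: one version byte, two bytes of tuple header, eight bytes for the atom, and five bytes for the integer.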
Refer to the Reference Manual for a complete description of the following modules:
- the erl_eterm module for creating Erlang terms
- the erl_marshal module for encoding and decoding routines.
1.4 Building Terms and Patterns
The previous example can be simplified by using erl_format() to create an Erlang term.
ETERM *ep;
ep = erl_format("{~a,~i}", "tobbe", 3928);
Refer to the Reference Manual, the erl_format module, for a full description of the different format directives. The following example is more complex:
ETERM *ep;
ep = erl_format("[{name,~a},{age,~i},{data,~w}]",
                "madonna",
                21,
                erl_format("[{adr,~s,~i}]", "E-street", 42));
erl_free_compound(ep);
As in previous examples, it is your responsibility to free the memory allocated for Erlang terms. In this example, erl_free_compound() ensures that the complete term pointed to by ep is released. This is necessary, because the pointer from the second call to erl_format() is lost.
The following example shows a slightly different solution:
ETERM *ep,*ep2;
ep2 = erl_format("[{adr,~s,~i}]","E-street",42);
ep = erl_format("[{name,~a},{age,~i},{data,~w}]",
                "madonna", 21, ep2);
erl_free_term(ep);
erl_free_term(ep2);
In this case, you free the two terms independently. The order in which you free the terms ep and ep2 is not important, because the Erl_Interface library uses reference counting to determine when it is safe to actually remove objects.
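The following sketch shows, in plain C, why the order of the two calls does not matter under reference counting. It is an illustration of the general technique only: the term type and the functions term_new(), term_free() and term_live_count() are hypothetical and do not reflect how Erl_Interface implements ETERM internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted term; this is not the library's ETERM. */
typedef struct term {
    int refs;            /* number of owners still holding this term */
    int value;           /* payload */
    struct term *child;  /* optional nested term, may be NULL */
} term;

static int live_terms = 0;  /* terms allocated and not yet released */

/* Creates a term owned by the caller. If a child is given, the new
 * term becomes an additional owner of that child. */
term *term_new(int value, term *child)
{
    term *t = malloc(sizeof *t);
    assert(t != NULL);
    t->refs = 1;
    t->value = value;
    t->child = child;
    if (child != NULL)
        child->refs++;
    live_terms++;
    return t;
}

/* Drops one reference. The memory is released only when the last
 * owner lets go, and the term then drops its reference to its child. */
void term_free(term *t)
{
    if (t == NULL || --t->refs > 0)
        return;
    term_free(t->child);
    free(t);
    live_terms--;
}

int term_live_count(void)
{
    return live_terms;
}
```

With ep2 created first and then passed as the child of ep, freeing ep and then ep2, or ep2 and then ep, leaves term_live_count() at zero in both cases. In the second order the inner term simply stays alive until ep releases it.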
If you are not sure whether you have freed the terms properly, you can use the following function to see the status of the fixed term allocator:
long allocated, freed;

erl_eterm_statistics(&allocated,&freed);
printf("currently allocated blocks: %ld\n",allocated);
printf("length of freelist: %ld\n",freed);

/* really free the freelist */
erl_eterm_release();
Refer to the Reference Manual, the erl_malloc module for more information.
1.5 Pattern Matching
An Erlang pattern is a term that may contain unbound variables or "do not care" symbols. Such a pattern can be matched against a term and, if the match is successful, any unbound variables in the pattern will be bound as a side effect. The content of a bound variable can then be retrieved.
ETERM *pattern;
pattern = erl_format("{madonna,Age,_}");
erl_match() is used to perform pattern matching. It takes a pattern and a term and tries to match them. As a side effect any unbound variables in the pattern will be bound. In the following example, we create a pattern with a variable Age which appears at two positions in the tuple. The pattern match is performed as follows:
- erl_match() will bind the contents of Age to 21 the first time it reaches the variable
- the second occurrence of Age causes a test for equality, because Age is already bound to 21. The term at that position is also 21, so the test succeeds and the match continues until the end of the pattern.
- if the end of the pattern is reached, the match succeeds and you can retrieve the contents of the variable
ETERM *pattern, *term, *ep;   /* ep holds the content of Age */

pattern = erl_format("{madonna,Age,Age}");
term = erl_format("{madonna,21,21}");
if (erl_match(pattern, term)) {
  fprintf(stderr, "Yes, they matched: Age = ");
  ep = erl_var_content(pattern, "Age");
  erl_print_term(stderr, ep);
  fprintf(stderr,"\n");
  erl_free_term(ep);
}
erl_free_term(pattern);
erl_free_term(term);
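The binding and equality steps described above can be sketched without the library. The node type, the bindings table and the functions below are hypothetical and only illustrate the general matching technique; they are not how erl_match() is implemented. For brevity, a failed match leaves any bindings made so far in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { K_ATOM, K_INT, K_VAR, K_WILD, K_TUPLE } node_kind;

/* Hypothetical term/pattern node; this is not the library's ETERM. */
typedef struct node {
    node_kind kind;
    const char *name;      /* atom text or variable name */
    long num;              /* integer value */
    size_t arity;          /* number of tuple elements */
    struct node **elems;   /* tuple elements */
} node;

#define MAX_BINDINGS 8

typedef struct {
    const char *name[MAX_BINDINGS];
    const node *value[MAX_BINDINGS];
    size_t count;
} bindings;

/* Returns the term bound to name, or NULL if name is still unbound. */
const node *binding_lookup(const bindings *b, const char *name)
{
    size_t i;
    for (i = 0; i < b->count; i++)
        if (strcmp(b->name[i], name) == 0)
            return b->value[i];
    return NULL;
}

/* Structural equality between two terms that contain no variables. */
int term_equal(const node *a, const node *b)
{
    size_t i;
    if (a->kind != b->kind)
        return 0;
    switch (a->kind) {
    case K_ATOM:
        return strcmp(a->name, b->name) == 0;
    case K_INT:
        return a->num == b->num;
    case K_TUPLE:
        if (a->arity != b->arity)
            return 0;
        for (i = 0; i < a->arity; i++)
            if (!term_equal(a->elems[i], b->elems[i]))
                return 0;
        return 1;
    default:
        return 0;
    }
}

/* Returns 1 if pat matches term, binding unbound variables on the way. */
int pattern_match(const node *pat, const node *term, bindings *b)
{
    size_t i;
    const node *prev;

    switch (pat->kind) {
    case K_WILD:                 /* "do not care": matches anything */
        return 1;
    case K_VAR:
        prev = binding_lookup(b, pat->name);
        if (prev != NULL)        /* already bound: test for equality */
            return term_equal(prev, term);
        assert(b->count < MAX_BINDINGS);
        b->name[b->count] = pat->name;   /* first occurrence: bind */
        b->value[b->count] = term;
        b->count++;
        return 1;
    case K_TUPLE:
        if (term->kind != K_TUPLE || term->arity != pat->arity)
            return 0;
        for (i = 0; i < pat->arity; i++)
            if (!pattern_match(pat->elems[i], term->elems[i], b))
                return 0;
        return 1;
    default:                     /* atoms and integers match by equality */
        return term_equal(pat, term);
    }
}
```

Matching a pattern equivalent to {madonna,Age,Age} against {madonna,21,21} succeeds and leaves Age bound to 21, while matching it against {madonna,21,22} fails at the second occurrence of Age.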
Refer to the Reference Manual, the erl_match() function for more information.
1.6 Connecting to a Distributed Erlang Node
In order to connect to a distributed Erlang node you need to first initialize the connection routine with erl_connect_init(), which stores information such as the host name, node name, and IP address for later use:
int identification_number = 99;
int creation=1;
char *cookie="a secret cookie string"; /* An example */
erl_connect_init(identification_number, cookie, creation);
Refer to the Reference Manual, the erl_connect module for more information.
After initialization, you set up the connection to the Erlang node. Use erl_connect() to specify the Erlang node you want to connect to. The following example sets up the connection and should result in a valid socket file descriptor:
int sockfd;
char *nodename="[email protected]"; /* An example */
if ((sockfd = erl_connect(nodename)) < 0)
  erl_err_quit("ERROR: erl_connect failed");
erl_err_quit() prints the specified string and terminates the program. Refer to the Reference Manual, the erl_error module for more information.
1.7 Using EPMD
Epmd is the Erlang Port Mapper Daemon. Distributed Erlang nodes register with epmd on the localhost to indicate to other nodes that they exist and can accept connections. Epmd maintains a register of node and port number information, and when a node wishes to connect to another node, it first contacts epmd in order to find out the correct port number to connect to.
When you use erl_connect() to connect to an Erlang node, a connection is first made to epmd and, if the node is known, a connection is then made to the Erlang node.
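As an illustration of the lookup step, the sketch below builds the request that a node sends to epmd to ask for another node's port. The helper epmd_build_port_request() is hypothetical and not part of Erl_Interface, which performs this step for you inside erl_connect(). The layout (a two-byte big-endian length, the tag byte 122, then the node's short name without the host part) follows the PORT_PLEASE2_REQ message of the Erlang distribution protocol; epmd normally listens on TCP port 4369.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EPMD_PORT_PLEASE2_REQ 122  /* request tag from the epmd protocol */

/* Hypothetical helper (not an Erl_Interface function): builds a port
 * lookup request for the node whose short name is alive, for example
 * "xyz" for a node called xyz@somehost.
 * Returns the number of bytes written to buf, or 0 if it does not fit. */
size_t epmd_build_port_request(unsigned char *buf, size_t cap,
                               const char *alive)
{
    size_t len = strlen(alive);
    size_t body = 1 + len;          /* tag byte plus the name */

    assert(buf != NULL);
    if (body > 0xffff || cap < 2 + body)
        return 0;

    buf[0] = (unsigned char)(body >> 8);    /* length of what follows */
    buf[1] = (unsigned char)(body & 0xff);
    buf[2] = EPMD_PORT_PLEASE2_REQ;
    memcpy(buf + 3, alive, len);            /* name is not 0-terminated */
    return 2 + body;
}
```

A caller would write these bytes to a socket connected to epmd and then read the reply, which carries the port number if the node is registered.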
C nodes can also register themselves with epmd if they want other nodes in the system to be able to find and connect to them.
Before registering with epmd, you need to first create a listen socket and bind it to a port. Then:
int pub;
pub = erl_publish(port);
pub is a file descriptor now connected to epmd. Epmd monitors the other end of the connection, and if it detects that the connection has been closed, the node will be unregistered. So, if you explicitly close the descriptor or if your node fails, it will be unregistered from epmd.
Be aware that on some systems (such as VxWorks), a failed node will not be detected by this mechanism since the operating system does not automatically close descriptors that were left open when the node failed. If a node has failed in this way, epmd will prevent you from registering a new node with the old name, since it thinks that the old name is still in use. In this case, you must unregister the name explicitly:
erl_unpublish(node);
This will cause epmd to close the connection from the far end. Note that if the name was in fact still in use by a node, the results of this operation are unpredictable. Also, doing this does not cause the local end of the connection to close, so resources may be consumed.
1.8 Sending and Receiving Erlang Messages
Use one of the following two functions to send messages:
- erl_send()
- erl_reg_send()
As in Erlang, it is possible to send messages to a Pid or to a registered name. It is easier to send a message to a registered name because it avoids the problem of finding a suitable Pid.
Use one of the following two functions to receive messages:
- erl_receive()
- erl_receive_msg()
erl_receive() receives the message into a buffer, while erl_receive_msg() decodes the message into an Erlang term.
Example of Sending Messages
In the following example, {Pid, hello_world} is sent to a registered process my_server. The message is encoded by erl_reg_send():
extern const char *erl_thisnodename(void);
extern short erl_thiscreation(void);
#define SELF(fd) erl_mk_pid(erl_thisnodename(),fd,0,erl_thiscreation())

ETERM *arr[2], *emsg;
int sockfd, creation=1;

arr[0] = SELF(sockfd);
arr[1] = erl_mk_atom("hello_world");   /* matches the message described above */
emsg = erl_mk_tuple(arr, 2);

erl_reg_send(sockfd, "my_server", emsg);
erl_free_term(emsg);
The first element of the tuple that is sent is your own Pid. This enables my_server to reply. Refer to the Reference Manual, the erl_connect module for more information about send primitives.
Example of Receiving Messages
In this example, {Pid, Something} is received. The received Pid is then used to return {goodbye,Pid}:
ETERM *arr[2], *answer;
int sockfd,rc;
unsigned char buf[BUFSIZ];   /* receive buffer; BUFSIZ comes from <stdio.h> */
ErlMessage emsg;

if ((rc = erl_receive_msg(sockfd, buf, BUFSIZ, &emsg)) == ERL_MSG) {
  arr[0] = erl_mk_atom("goodbye");
  arr[1] = erl_element(1, emsg.msg);   /* the sender's Pid */
  answer = erl_mk_tuple(arr, 2);
  erl_send(sockfd, arr[1], answer);
  erl_free_term(answer);
  erl_free_term(emsg.msg);
  erl_free_term(emsg.to);
}
To provide robustness, a distributed Erlang node occasionally polls all its connected neighbours in an attempt to detect failed nodes or communication links. A node that receives such a message is expected to respond immediately with an ERL_TICK message. erl_receive() does this automatically; however, when it has done so, erl_receive() returns ERL_TICK to the caller without storing a message in the ErlMessage structure.
When a message has been received, it is the caller's responsibility to free the received message emsg.msg as well as emsg.to or emsg.from, depending on the type of message received.
Refer to the Reference Manual for additional information about the following modules:
- erl_connect
- erl_eterm.
1.9 Remote Procedure Calls
An Erlang node acting as a client to another Erlang node typically sends a request and waits for a reply. Such a request is included in a function call at a remote node and is called a remote procedure call. The following example shows how the Erl_Interface library supports remote procedure calls:
char modname[]=THE_MODNAME;
ETERM *reply,*ep;

ep = erl_format("[~a,[]]", modname);
if (!(reply = erl_rpc(fd, "c", "c", ep)))
  erl_err_msg("<ERROR> when compiling file: %s.erl !\n", modname);
erl_free_term(ep);

ep = erl_format("{ok,_}");
if (!erl_match(ep, reply))
  erl_err_msg("<ERROR> compiler errors !\n");
erl_free_term(ep);
erl_free_term(reply);
c:c/1 is called to compile the specified module on the remote node. erl_match() checks that the compilation was successful by testing for the expected ok.
Refer to the Reference Manual, the erl_connect module for more information about erl_rpc(), and its companions erl_rpc_to() and erl_rpc_from().
1.10 Using Global Names
A C node has access to names registered through the Erlang Global module. Names can be looked up, allowing the C node to send messages to named Erlang services. C nodes can also register global names, allowing them to provide named services to Erlang processes or other C nodes.
Erl_Interface does not provide a native implementation of the global service. Instead it uses the global services provided by a "nearby" Erlang node. In order to use the services described in this section, it is necessary to first open a connection to an Erlang node.
To see what names there are:
char **names;
int count;
int i;

names = erl_global_names(fd,&count);

if (names)
  for (i=0; i<count; i++)
    printf("%s\n",names[i]);

free(names);
erl_global_names() allocates and returns a buffer containing all the names known to global. count will be initialized to indicate how many names are in the array. The array of strings in names is terminated by a NULL pointer, so it is not necessary to use count to determine when the last name is reached.
It is the caller's responsibility to free the array. erl_global_names() allocates the array and all of the strings using a single call to malloc(), so free(names) is all that is necessary.
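The single-allocation layout can be sketched as follows. The function pack_names() is hypothetical and only illustrates how a NULL-terminated array of strings can live in one malloc() block, so that one free() releases everything; the actual layout used inside erl_global_names() is not documented here and may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies count strings into one malloc() block
 * laid out as [count pointers][NULL][string bytes...].
 * The caller releases everything with a single free(). */
char **pack_names(const char *const *src, int count)
{
    size_t bytes;
    char **names;
    char *p;
    int i;

    assert(src != NULL && count >= 0);

    /* Space for the pointer array, including the terminating NULL. */
    bytes = ((size_t)count + 1) * sizeof(char *);
    for (i = 0; i < count; i++)
        bytes += strlen(src[i]) + 1;

    names = malloc(bytes);
    if (names == NULL)
        return NULL;

    /* String storage starts right after the pointer array. */
    p = (char *)(names + count + 1);
    for (i = 0; i < count; i++) {
        size_t n = strlen(src[i]) + 1;
        memcpy(p, src[i], n);
        names[i] = p;
        p += n;
    }
    names[count] = NULL;
    return names;
}
```

Because the array ends with a NULL pointer, a caller can walk it with for (i = 0; names[i] != NULL; i++) without knowing the count, and then call free(names) once.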
To look up one of the names:
ETERM *pid;
char node[256];

pid = erl_global_whereis(fd,"schedule",node);
If "schedule" is known to global, an Erlang pid is returned that can be used to send messages to the schedule service. Additionally, node will be initialized to contain the name of the node where the service is registered, so that you can make a connection to it by simply passing the variable to erl_connect().
Before registering a name, you should already have registered your port number with epmd. This is not strictly necessary, but if you neglect to do so, then other nodes wishing to communicate with your service will be unable to find or connect to your process.
Create a pid that Erlang processes can use to communicate with your service:
ETERM *pid;

pid = erl_mk_pid(thisnode,14,0,0);
erl_global_register(fd,servicename,pid);
After registering the name, you should use erl_accept() to wait for incoming connections.
Do not forget to free pid later with erl_free_term()!
To unregister a name:
erl_global_unregister(fd,servicename);
1.11 The Registry
This section describes the use of the registry, a simple mechanism for storing key-value pairs in a C-node, as well as backing them up or restoring them from a Mnesia table on an Erlang node. More detailed information about the individual API functions can be found in the Reference Manual.
Keys are strings, i.e. 0-terminated arrays of characters, and values are arbitrary objects. Although integers and floating point numbers are treated specially by the registry, you can store strings or binary objects of any type as pointers.
To start, you need to open a registry:
ei_reg *reg;
reg = ei_reg_open(45);
The number 45 in the example indicates the approximate number of objects that you expect to store in the registry. Internally the registry uses hash tables with collision chaining, so there is no absolute upper limit on the number of objects that the registry can contain, but if performance or memory usage are important, then you should choose a number accordingly. The registry can be resized later.
You can open as many registries as you like (if memory permits).
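To show why the size given at open time is a hint and not a limit, here is a minimal sketch of a hash table with collision chaining. The table type and the table_open(), table_set(), table_get() and table_close() functions are hypothetical and are not the registry's implementation; the sketch stores only long values and uses a simple string hash chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chained hash table; this is not the ei_reg implementation. */
typedef struct entry {
    char *key;
    long value;
    struct entry *next;   /* next entry in the same bucket */
} entry;

typedef struct {
    size_t nbuckets;
    entry **buckets;
} table;

static size_t hash_key(const char *s)
{
    size_t h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* size_hint fixes the number of buckets, not the number of objects. */
table *table_open(size_t size_hint)
{
    table *t;
    assert(size_hint > 0);
    t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->buckets = calloc(size_hint, sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    t->nbuckets = size_hint;
    return t;
}

/* Stores value under key, replacing any existing value. 0 on success. */
int table_set(table *t, const char *key, long value)
{
    entry **slot = &t->buckets[hash_key(key) % t->nbuckets];
    entry *e;

    for (e = *slot; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    strcpy(e->key, key);
    e->value = value;
    e->next = *slot;      /* colliding keys are chained in the bucket */
    *slot = e;
    return 0;
}

/* Looks up key. Returns 0 and fills *out if found, -1 otherwise. */
int table_get(const table *t, const char *key, long *out)
{
    const entry *e = t->buckets[hash_key(key) % t->nbuckets];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) {
            *out = e->value;
            return 0;
        }
    return -1;
}

void table_close(table *t)
{
    size_t i;
    for (i = 0; i < t->nbuckets; i++) {
        entry *e = t->buckets[i];
        while (e != NULL) {
            entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
    }
    free(t->buckets);
    free(t);
}
```

Even a table opened with a single bucket accepts any number of keys. Every key then lands in the same chain, so lookups degrade to a linear search, which is why the hint matters for performance.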
Objects are stored and retrieved through set and get functions. In the following examples you see how to store integers, floats, strings and arbitrary binary objects:
struct bonk *b = malloc(sizeof(*b));
char *name = malloc(7);

ei_reg_setival(reg,"age",29);
ei_reg_setfval(reg,"height",1.85);

strcpy(name,"Martin");
ei_reg_setsval(reg,"name",name);

b->l = 42;
b->m = 12;
ei_reg_setpval(reg,"jox",b,sizeof(*b));
If you attempt to store an object in the registry and there is an existing object with the same key, the new value will replace the old one. This is done regardless of whether the new object and the old one have the same type, so you can, for example, replace a string with an integer. If the existing value is a string or binary, it will be freed before the new value is assigned.
Stored values are retrieved from the registry as follows:
long i;
double f;
char *s;
struct bonk *b;
int size;

i = ei_reg_getival(reg,"age");
f = ei_reg_getfval(reg,"height");
s = ei_reg_getsval(reg,"name");
b = ei_reg_getpval(reg,"jox",&size);
In all of the above examples, the object must exist and it must be of the right type for the specified operation. If you do not know the type of a given object, you can ask:
struct ei_reg_stat buf;
ei_reg_stat(reg,"name",&buf);
buf will be initialized to contain the attributes of the object.
Objects can be removed from the registry:
ei_reg_delete(reg,"name");
When you are finished with a registry, close it to remove all the objects and free the memory back to the system:
ei_reg_close(reg);
Backing Up the Registry to Mnesia
The contents of a registry can be backed up to Mnesia on a "nearby" Erlang node. You need to provide an open connection to the Erlang node (see erl_connect()). Also, Mnesia 3.0 or later must be running on the Erlang node before the backup is initiated:
ei_reg_dump(fd, reg, "mtab", dumpflags);
The example above will back up the contents of the registry to the specified Mnesia table "mtab". Once a registry has been backed up to Mnesia in this manner, additional backups will only affect objects that have been modified since the most recent backup, i.e. objects that have been created, changed or deleted. The backup operation is done as a single atomic transaction, so that the entire backup will be performed or none of it will.
In the same manner, a registry can be restored from a Mnesia table:
ei_reg_restore(fd, reg, "mtab");
This will read the entire contents of "mtab" into the specified registry. After the restore, all of the objects in the registry will be marked as unmodified, so a subsequent backup will only affect objects that you have modified since the restore.
Note that if you restore to a non-empty registry, objects in the table will overwrite objects in the registry with the same keys. Also, the entire contents of the registry is marked as unmodified after the restore, including any modified objects that were not overwritten by the restore operation. This may not be your intention.
Storing Strings and Binaries
When string or binary objects are stored in the registry it is important that a number of simple guidelines are followed.
Most importantly, the object must have been created with a single call to malloc() (or similar), so that it can later be removed by a single call to free(). Objects will be freed by the registry when it is closed, or when you assign a new value to an object that previously contained a string or binary.
You should also be aware that if you store binary objects that are context-dependent (e.g. containing pointers or open file descriptors), they will lose their meaning if they are backed up to a Mnesia table and subsequently restored in a different context.
When you retrieve a stored string or binary value from the registry, the registry maintains a pointer to the object and you are passed a copy of that pointer. You should never free an object retrieved in this manner because when the registry later attempts to free it, a runtime error will occur that will likely cause the C-node to crash.
You are free to modify the contents of an object retrieved this way. However, the registry will not be aware of the changes you make, so the modified object may be missed the next time you make a Mnesia backup of the registry contents. To avoid this, mark the object as dirty after any such changes with ei_reg_markdirty(), or pass appropriate flags to ei_reg_dump().
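The interaction between incremental backups and in-place changes can be illustrated with a small dirty-flag sketch. The object type and the functions object_set(), object_markdirty() and dump_dirty() are hypothetical stand-ins, and the saved field plays the role of the Mnesia table; none of this is the registry's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VALUE_MAX 16

/* Hypothetical registry object with a modification flag. */
typedef struct {
    const char *key;
    char value[VALUE_MAX];  /* live value, may be edited in place */
    char saved[VALUE_MAX];  /* stand-in for the copy in the backup table */
    int dirty;              /* set when the object changed since last dump */
} object;

/* Assigning through the set function marks the object as modified. */
void object_set(object *o, const char *v)
{
    assert(strlen(v) < VALUE_MAX);
    strcpy(o->value, v);
    o->dirty = 1;
}

/* Tells the store that the value was changed behind its back. */
void object_markdirty(object *o)
{
    o->dirty = 1;
}

/* Writes only the modified objects to the backup and marks them clean.
 * Returns the number of objects written. */
size_t dump_dirty(object *objs, size_t n)
{
    size_t i, written = 0;
    for (i = 0; i < n; i++) {
        if (objs[i].dirty) {
            strcpy(objs[i].saved, objs[i].value);
            objs[i].dirty = 0;
            written++;
        }
    }
    return written;
}
```

In this sketch, editing value directly after a dump leaves dirty at 0, so the next dump_dirty() call writes nothing and the backup keeps the old contents. Calling object_markdirty() first causes the change to be picked up.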