Appendix: Application Programming Interface

A simple API is available for embedding MHonArc within other Perl programs.


Initialization

Before calling any MHonArc routines, you must initialize the MHonArc library. The following code snippet shows how:

# Require MHonArc library
require 'mhamain.pl';
# Initialize MHonArc
mhonarc::initialize();

NOTE:

The mhonarc::initialize() routine should only be called once within your program.

NOTE:

If mhamain.pl is not in Perl's library search path, you will need to add its directory to the search path before calling require.
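
One way to extend the search path is to prepend the directory to @INC before the require. The following is a minimal sketch, not the only approach; the directory /usr/local/share/mhonarc is a hypothetical install location, so substitute the directory that actually holds mhamain.pl on your system.

```perl
use strict;
use warnings;

# Hypothetical location of the MHonArc libraries; adjust as needed.
my $mha_libdir = '/usr/local/share/mhonarc';

# Prepend the directory so Perl searches it first.
unshift @INC, $mha_libdir;

# Load and initialize MHonArc. The eval only keeps this sketch from
# dying when mhamain.pl cannot be loaded from the search path.
my $loaded = eval { require 'mhamain.pl'; 1 };
if ($loaded) {
    mhonarc::initialize();
} else {
    warn "Could not load mhamain.pl: $@";
}
```

A compile-time `use lib` statement naming the same directory should have the same effect on @INC.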


Processing Input

To instruct MHonArc to process input, use the following routine:

# Tell MHonArc to start processing
mhonarc::process_input();

When mhonarc::process_input() is called with no arguments, it parses @ARGV for command-line arguments. If you pass a list of arguments to mhonarc::process_input(), that list is processed as the command-line arguments instead. For example:

mhonarc::process_input(
    '-quiet',
    '-outdir', $archive_path,
    '-rcfile', $rcfile,
    $mailbox_filename
);

The return value of mhonarc::process_input() is the CPU time, in seconds, that MHonArc used. Example usage:

$cpu_time = mhonarc::process_input();

To determine the status of the processing, you can query the $mhonarc::CODE variable. The value of this variable reflects what the exit status of MHonArc would be if invoked from the shell. That is, if $mhonarc::CODE is equal to 0, no errors occurred during processing. A non-zero value indicates that some error occurred. Example usage:

mhonarc::process_input(
    '-quiet',
    '-outdir', $archive_path,
    '-rcfile', $rcfile,
    $mailbox_filename
);
if ($mhonarc::CODE) {
    # error code here
}

NOTE:

If $mhonarc::CODE is equal to 75, this indicates that MHonArc was unable to obtain a lock on the archive. This exit code is recognized by MTAs like sendmail to requeue a message and try to deliver it again later. This is useful when MHonArc is invoked by a sendmail alias.

It is okay to call mhonarc::process_input() multiple times within a single program. This is useful if your program wants to process multiple archives.
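
As an illustration of the points above, the sketch below processes several archives in one program and interprets $mhonarc::CODE after each call. The archive directories, mailbox paths, and the status_message helper are all hypothetical; only the meanings of 0, 75, and "non-zero" come from the text above. The loop is guarded so that the file still runs when MHonArc has not been loaded and initialized.

```perl
use strict;
use warnings;
no warnings 'once';    # $mhonarc::CODE is set by MHonArc, not by this file

# Translate a $mhonarc::CODE value into a short description.
# (Hypothetical helper, not part of MHonArc.)
sub status_message {
    my ($code) = @_;
    return 'ok'                          if $code == 0;
    return 'archive locked; retry later' if $code == 75;
    return "failed with status $code";
}

# Hypothetical archives: output directory => mailbox file.
my %archives = (
    '/var/www/archive/dev'   => '/var/mail/dev-list',
    '/var/www/archive/users' => '/var/mail/users-list',
);

# Process each archive in turn, reporting the status of each run.
if (defined &mhonarc::process_input) {
    for my $outdir (sort keys %archives) {
        mhonarc::process_input('-quiet', '-outdir', $outdir,
                               $archives{$outdir});
        print "$outdir: ", status_message($mhonarc::CODE), "\n";
    }
}
```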


Callbacks

Support is available for registering callbacks to be invoked while MHonArc is processing input. To register a callback, set the appropriate MHonArc variable to a routine reference (hard or symbolic). For example, to set the callback invoked when a message header is read, you can do something like the following:

$mhonarc::CBMessageHeadRead = \&my_callback_routine;

NOTE:

The mhasiteinit.pl site initialization library can be used to register callbacks. The advantage of using mhasiteinit.pl is that it is executed each time MHonArc is executed, so you do not have to create custom front-ends to MHonArc if all you want to do is register callbacks. See Installation and the example mhasiteinit.pl provided in the examples/ directory of the MHonArc distribution for more information about mhasiteinit.pl.

The following callbacks are supported by MHonArc:

$mhonarc::CBDbPreLoad

Invoked just before the database file is loaded.

Synopsis:

$do_load = &$mhonarc::CBDbPreLoad($pathname);

Arguments:

$pathname

Pathname to database file that will be loaded.

Return Value:

If a true value, MHonArc will load the database denoted by $pathname. If a false value, MHonArc will skip loading the database file.
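
A minimal sketch of such a callback follows. It assumes, purely for illustration, that you want to skip loading when the database file is unreadable or empty; the routine name db_preload_check is hypothetical.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Return true only when the database file is readable and non-empty.
sub db_preload_check {
    my ($pathname) = @_;
    return (-r $pathname && -s $pathname) ? 1 : 0;
}

$mhonarc::CBDbPreLoad = \&db_preload_check;
```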

$mhonarc::CBDbPreSave

Invoked before data is saved to the database file.

Synopsis:

$do_save = &$mhonarc::CBDbPreSave($pathname, $tmp_pathname);

Arguments:

$pathname

Pathname to database file that will be written to.

$tmp_pathname

Pathname of the temporary file that data will be written to before replacing $pathname. Data is written to a temporary file first to prevent I/O or other errors from leaving a corrupt database. If the data is written successfully, MHonArc renames $tmp_pathname to $pathname.

Return Value:

If a true value, MHonArc will save the data to $pathname. If a false value, MHonArc will skip writing the database file.
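
As one possible use, the sketch below keeps a backup copy of the existing database before MHonArc replaces it. The routine name and the ".bak" suffix are arbitrary choices for this example; File::Copy is a standard Perl module.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc
use File::Copy qw(copy);

# Copy the current database aside before it is replaced.
sub db_presave_backup {
    my ($pathname, $tmp_pathname) = @_;
    if (-e $pathname) {
        copy($pathname, "$pathname.bak")
            or warn "Backup of $pathname failed: $!\n";
    }
    return 1;    # true: let MHonArc go ahead and save the database
}

$mhonarc::CBDbPreSave = \&db_presave_backup;
```

Returning a true value even when the copy fails is a design choice of this sketch: a failed backup is reported but does not prevent the save.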

$mhonarc::CBDbSave

Invoked when data has been written to the database file.

Synopsis:

&$mhonarc::CBDbSave($db_fh);

Arguments:

$db_fh

Open filehandle to database file. This filehandle can be used to write custom information to the database file.

Note: Any data written to $db_fh must be legal Perl code.

Return Value:

N/A
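
The sketch below writes one extra variable to the database file. The package name MyExt and the variable $LastSaved are hypothetical, and the code assumes $db_fh can be used directly with print. In keeping with the note above, the line written is itself a legal Perl statement.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Record the time of the save as a Perl assignment statement.
sub db_save_extra {
    my ($db_fh) = @_;
    print {$db_fh} '$MyExt::LastSaved = ', time(), ";\n";
    return;
}

$mhonarc::CBDbSave = \&db_save_extra;
```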

$mhonarc::CBMailFolderRead

Invoked after a mail folder has been processed.

Synopsis:

&$mhonarc::CBMailFolderRead($filename);

Arguments:

$filename

Filename of mail folder. If $filename equals "-", then the folder represents standard input.

To determine if the mail folder is a mailbox file or a directory, the following can be done:

    if (-d $filename) {
        # MH-style directory
    } else {
        # UUCP-style mailbox file
    }

Return Value:

N/A.

$mhonarc::CBMessageConverted

Invoked after a mail message has been converted.

Synopsis:

&$mhonarc::CBMessageConverted(
    $fields_hash_ref, $mesg_info_hash_ref);

Arguments:

$fields_hash_ref

Reference to hash containing parsed message header. The structure of this hash is the same as described for the $mhonarc::CBMessageHeadRead callback.

$mesg_info_hash_ref

Reference to hash containing meta-information about the message converted. The following lists the keys that exist in the hash and what their values represent:

folder

The filename of the mail folder the message was read from. If the filename is "-", then the folder was read from standard input. For cases when only a single mail message is added to an archive from standard input, folder will be undefined.

file

The source filename of the message just converted. This key is only defined for MH-style mail folders (i.e. the mail folder is a directory). For mailbox-type folders, this key is set to undef. For cases when only a single mail message is added to an archive from standard input, file will equal "-".

The following table summarizes what the values of the $mesg_info_hash_ref hash will be based upon the type of input:

Input type             $mesg_info_hash_ref->{'folder'}   $mesg_info_hash_ref->{'file'}
MH-style folder        directory-name                    filename
Mailbox-style folder   filename                          undef
Single message         undef                             "-"

Return Value:

N/A.
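
The folder/file combinations summarized above can be told apart with a few defined() checks. The following sketch logs where each converted message came from; the routine names describe_source and log_converted are hypothetical, and writing to STDERR is just one choice of log destination.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Describe the message source based on the folder and file keys.
sub describe_source {
    my ($info) = @_;
    my ($folder, $file) = @{$info}{qw(folder file)};
    return 'single message from standard input' unless defined $folder;
    return "file $file in MH folder $folder"     if defined $file;
    return 'mailbox read from standard input'    if $folder eq '-';
    return "mailbox $folder";
}

# Callback: log the subject and source of each converted message.
sub log_converted {
    my ($fields_hash_ref, $mesg_info_hash_ref) = @_;
    my $subject = $fields_hash_ref->{'x-mha-subject'};
    $subject = '' unless defined $subject;
    print STDERR "Converted \"$subject\" from ",
        describe_source($mesg_info_hash_ref), "\n";
    return;
}

$mhonarc::CBMessageConverted = \&log_converted;
```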

$mhonarc::CBMessageBodyRead

Invoked after a mail message body has been converted.

Synopsis:

$boolean = &$mhonarc::CBMessageBodyRead(
    $fields_hash_ref, $html_text_ref, $files_array_ref);

Arguments:

$fields_hash_ref

Reference to hash containing parsed message header. The structure of this hash is the same as described for the $mhonarc::CBMessageHeadRead callback.

$html_text_ref

Reference to string containing the HTML markup for the body. Modifications to the referenced data will be reflected in the message page generated, so take care when modifying it.

If MHonArc was unable to convert the body of the message, the following expression will evaluate to true:

$$html_text_ref eq ""

If this is the case, you could set the value of $$html_text_ref to something else to customize the warning text MHonArc uses in the message page written.

$files_array_ref

Reference to array of files derived when the body was converted. Each file is typically relative to $mhonarc::OUTDIR, unless it is a full pathname. The mhonarc::OSis_absolute_path($filename) routine can be used to determine whether a file is an absolute pathname. Note that a file may designate a directory; this indicates that the directory, and all files in the directory, are derived.

Modifications to the array will affect the list of derived files MHonArc stores for the message. You can add files to the array if your routine creates files, and you can delete items if your routine removes files. CAUTION: The HTML markup typically contains links to derived files, so removing files could cause broken links unless $html_text_ref is modified to reflect the file deletions.

Return Value:

The return value is used by MHonArc to determine if the message should be excluded from any further processing. If the return value evaluates to true, then MHonArc will continue processing of the message. If the return value evaluates to false, the message will be excluded.
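
As an example of the arguments and return value described above, this sketch replaces the body text when MHonArc could not convert it and always keeps the message. The routine name body_fallback and the replacement markup are hypothetical.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Supply custom text when the body could not be converted.
sub body_fallback {
    my ($fields_hash_ref, $html_text_ref, $files_array_ref) = @_;
    if ($$html_text_ref eq '') {
        $$html_text_ref =
            "<p><em>The body of this message could not be converted.</em></p>\n";
    }
    return 1;    # true: continue processing the message
}

$mhonarc::CBMessageBodyRead = \&body_fallback;
```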

$mhonarc::CBMessageHeadRead

Invoked after a mail message header is read and before any other processing is done. Note that the function is called after any exclusion checks (CHECKNOARCHIVE and MSGEXCFILTER) are performed by MHonArc.

Synopsis:

$boolean = &$mhonarc::CBMessageHeadRead(
    $fields_hash_ref, $raw_header_txt);

Arguments:

$fields_hash_ref

Reference to hash containing parsed message header. Keys are the lowercase field names, and the values are references to arrays containing the values for each field. If a field is only declared once in the header, the array will only contain one item. For example, to access the raw subject text, do the following: $fields_hash_ref->{'subject'}[0];

The hash also contains special keys representing the values MHonArc has extracted when parsing the message header. The values of these keys are regular scalars and NOT array references. The following summarizes the keys made available:

x-mha-index
The assigned index given to the message by MHonArc.
x-mha-message-id
The message-id MHonArc extracted. Note that if the message did not specify a message ID, MHonArc auto-generates one.
x-mha-from
Who MHonArc thinks the message is from. This value is controlled by the FROMFIELDS resource.
x-mha-subject
The message subject that will be used by MHonArc. The value may be different from the raw subject text of the message since SUBJECTSTRIPCODE code will have been applied. If no subject is defined, then the value is the empty string.
x-mha-content-type
The content-type of the message MHonArc will use for the message.

For example, to access the subject text that MHonArc will use, do the following: $fields_hash_ref->{'x-mha-subject'};

$raw_header_txt

The raw header data of the message. This data may be useful if pattern matches are desired against header data.

Return Value:

The return value is used by MHonArc to determine if the message should be excluded from any further processing. If the return value evaluates to true, then MHonArc will continue processing of the message. If the return value evaluates to false, the message will be excluded.
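
The sketch below shows how both kinds of keys might be used to exclude messages. It assumes, for illustration only, that unwanted mail carries an X-Spam-Flag header set to YES or a subject starting with "[SPAM]"; neither convention comes from MHonArc, and the routine name skip_spam is hypothetical.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Return false to exclude a message, true to keep it.
sub skip_spam {
    my ($fields_hash_ref, $raw_header_txt) = @_;

    # Ordinary header fields are array references, one item per occurrence.
    my $flags = $fields_hash_ref->{'x-spam-flag'} || [];
    return 0 if grep { /^\s*yes\b/i } @$flags;

    # The x-mha-* keys are plain scalars.
    my $subject = $fields_hash_ref->{'x-mha-subject'};
    $subject = '' unless defined $subject;
    return 0 if $subject =~ /^\[SPAM\]/;

    return 1;
}

$mhonarc::CBMessageHeadRead = \&skip_spam;
```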

$mhonarc::CBRawMessageBodyRead

Invoked when the raw message body data is read from input, that is, before the message body has been converted.

Synopsis:

$boolean = &$mhonarc::CBRawMessageBodyRead(
    $fields_hash_ref, $body_data_ref);

Arguments:

$fields_hash_ref

Reference to hash containing parsed message header. The structure of this hash is the same as described for the $mhonarc::CBMessageHeadRead callback.

$body_data_ref

Reference to string containing the raw data of the message body. The referenced data can be modified to change what data MHonArc will process.

Return Value:

The return value is used by MHonArc to determine if the message should be excluded from any further processing. If the return value evaluates to true, then MHonArc will continue processing of the message. If the return value evaluates to false, the message will be excluded.
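
As a hedged illustration of modifying the raw body, the sketch below removes a trailing signature block from plain-text messages before MHonArc converts them. It assumes the common "-- " signature delimiter and only touches messages whose x-mha-content-type starts with text/plain, since the raw data of other types may be encoded. The routine name strip_signature is hypothetical.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

# Drop everything from the signature delimiter line to the end.
sub strip_signature {
    my ($fields_hash_ref, $body_data_ref) = @_;
    my $ctype = $fields_hash_ref->{'x-mha-content-type'};
    $ctype = '' unless defined $ctype;
    if ($ctype =~ m{^text/plain\b}i) {
        $$body_data_ref =~ s/^-- \r?\n.*\z//ms;
    }
    return 1;    # true: continue processing the message
}

$mhonarc::CBRawMessageBodyRead = \&strip_signature;
```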

$mhonarc::CBRcVarExpand

Invoked when a resource variable is to be expanded. With this callback, you can override and/or augment MHonArc's built-in resource variable expansion support.

Synopsis:

($result, $do_expand_again, $can_clip) =
    &$mhonarc::CBRcVarExpand($mha_index, $var_name, $arg_string);

Arguments:

$mha_index

The MHonArc index key of the current message.

$var_name

The variable name being expanded. For example, given the resource variable reference, "$VARIABLE$", $var_name would be equal to "VARIABLE".

$arg_string

The argument string for the resource variable reference. For example, given the resource variable reference, "$VARIABLE(arg1)$", $arg_string would be equal to "arg1".

Note: MHonArc generally uses the character ';' to separate multiple arguments to a resource variable. However, this is only convention, and if defining your own resource variable support via this callback, you can use whatever convention you like.

Return Value:

The return value is a list of values interpreted as follows:

$result

The result (or replacement text) for the variable. If the result is equal to undef, MHonArc's built-in expansion code will be invoked to expand $var_name. If $result is defined, then MHonArc's built-in expansion will be skipped.

$do_expand_again

If a true value, MHonArc will parse $result and expand any resource variables contained within. Note: $mhonarc::CBRcVarExpand will be called for each resource variable found.

$can_clip

If a true value, clipping is allowed to be performed. Clipping is done if a maximum length is specified in the resource variable reference.
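
The three return values can be seen together in the following sketch, which defines two custom resource variables. The variable names SITENAME and BANNER, the site name text, and the routine name expand_var are all hypothetical. Any other variable name is handed back to MHonArc's built-in expansion by returning undef as the result.

```perl
use strict;
use warnings;
no warnings 'once';    # the callback variable belongs to MHonArc

sub expand_var {
    my ($mha_index, $var_name, $arg_string) = @_;

    # $SITENAME$ expands to fixed text; no re-expansion, clipping allowed.
    if ($var_name eq 'SITENAME') {
        return ('Example List Archive', 0, 1);
    }

    # $BANNER(label)$ returns text that itself contains $SITENAME$,
    # so MHonArc is asked to expand the result again.
    if ($var_name eq 'BANNER') {
        my $label = (defined $arg_string && length $arg_string)
            ? $arg_string : 'Archive';
        return ("$label of " . '$SITENAME$', 1, 1);
    }

    # Anything else: let the built-in expansion handle it.
    return (undef, 0, 0);
}

$mhonarc::CBRcVarExpand = \&expand_var;
```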


Utility Routines

The following are various utility routines available for use by custom extensions and filters:

mhonarc::get_icon_url

Retrieve the icon URL for a given content-type as defined by the ICONS resource.

Synopsis:

$url = mhonarc::get_icon_url($content_type);

Arguments:

$content_type

MIME content-type to retrieve icon for.

Return Value:

URL to icon.

mhonarc::htmlize

Convert HTML special characters into entity references. The following table shows which characters are converted:

Character   Entity Reference
<           &lt;
>           &gt;
&           &amp;
"           &quot;

Synopsis:

# Create an htmlized version of a string
$html_text = mhonarc::htmlize($text);

# Htmlize in-place
mhonarc::htmlize($text_ref);

Arguments:

$text

Text to convert. If a reference, conversion is done in-place.

Return Value:

The converted htmlized text.
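
For readers who want to see the character mapping in action without loading MHonArc, the following standalone sketch applies the same four substitutions and mirrors the two calling styles shown in the synopsis. It is not MHonArc's implementation; the routine name sketch_htmlize is hypothetical, and real code should call mhonarc::htmlize.

```perl
use strict;
use warnings;

# The four conversions listed in the table above.
my %ENTITY = (
    '<' => '&lt;',
    '>' => '&gt;',
    '&' => '&amp;',
    '"' => '&quot;',
);

# Convert a string, or convert in place when given a scalar reference.
sub sketch_htmlize {
    my ($text) = @_;
    my $ref = ref($text) ? $text : \$text;
    # A single pass avoids re-encoding the "&" of an entity just produced.
    $$ref =~ s/([<>&"])/$ENTITY{$1}/g;
    return $$ref;
}

my $copy = sketch_htmlize('a < b & "c"');    # returns a converted copy
my $in_place = '<x>';
sketch_htmlize(\$in_place);                  # converts $in_place itself
```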

mhonarc::write_attachment

Saves data to a file with a specified content-type.

Synopsis:

require 'mhmimetypes.pl';

($filename, $url) =
    mhonarc::write_attachment($content_type, $data_ref, $options_hash_ref);

Arguments:

$content_type

The content-type of the data. The value should be a string in standard MIME content-type format. Examples: image/jpeg, application/postscript.

$data_ref

Reference to a scalar containing the data to write to disk.

$options_hash_ref

Reference to a hash of options for the routine. Every option may be omitted. The following options are available:

-dirpath
Pathname to directory to write file to. If not specified, the value of ATTACHMENTDIR is used, or OUTDIR if ATTACHMENTDIR is not defined.
-filename
Name of file to write to. If not specified, a random filename will be generated based on the value of $content_type.
-ext
Filename extension to use for file. If not specified, the extension will be based on the value of $content_type. The -filename option, if specified, supersedes this option.
-url
Base URL. If not specified the value of ATTACHMENTURL is used.

Return Value:

The return value is a list of values interpreted as follows:

$filename

The name of the file $data_ref was written to. $filename may contain pathname components. For filters, this value is suitable for use in the file return list.

$url

The URL that links to $filename. Calling code can use the URL within an HTML link.

Example:

The following illustrates the typical way of specifying options to mhonarc::write_attachment:

    # $data_ref must be a reference to the scalar holding the data
    ($filename, $url) =
        mhonarc::write_attachment($content_type, $data_ref, {
            '-dirpath'  => $path,
            '-filename' => $name,
            '-ext'      => $ext,
        });

readmail::MAILhead_get_disposition

Retrieve the content disposition of a message entity.

Synopsis:

require 'readmail.pl';

($disposition, $filename, $raw_filename, $html_name) =
    readmail::MAILhead_get_disposition($fields_hash_ref, $do_html);

Arguments:

$fields_hash_ref

Reference to hash containing parsed message header. The structure of this hash is the same as described for the $mhonarc::CBMessageHeadRead callback.

$do_html

If a true value, generate an HTMLized version of the filename designated in $fields_hash_ref for informational use within HTML markup.

Return Value:

The return value is a list of values interpreted as follows:

$disposition

The disposition of the entity. Generally, the value is either not defined, "attachment", or "inline".

$filename

Filename of entity as defined in $fields_hash_ref, but translated for safe usage. Any leading pathname component is removed and any unsafe characters are translated to underscores.

$raw_filename

Raw filename of entity as defined in $fields_hash_ref. This is only provided for informative uses and should not be used for creating files. Use $filename instead.

$html_name

Raw filename of entity converted to HTML. This return value is only provided if the $do_html argument is a true value. $html_name can be used for informative purposes in generated HTML by filters.

NOTE:

It is recommended that $html_name be used instead of HTMLizing $raw_filename directly since readmail::MAILhead_get_disposition does non-ASCII decoding and uses the CHARSETCONVERTERS resource.



$Date: 2005/07/08 02:04:05 $
MHonArc
Copyright © 2001,2005 Earl Hood, mhonarc@mhonarc.org