List Parsers

List Parsers are generated by the special predefined parser generator object list_p, which generates parsers recognizing list structures of the type

    item >> *(delimiter >> item) >> !end

where item is the expression matching a single list element, delimiter is the expression separating consecutive items, and end is an optional closing expression. As you can see, a parser generated by list_p does not recognize empty lists: it must find at least one item in the input stream to return a successful match. If you also wish to match an empty list, you can make your list_p optional with the optional operator (operator!). An example where this utility parser is helpful is parsing comma separated C/C++ strings, which can be easily formulated as:

    rule<> list_of_c_strings_rule
        =   list_p(confix_p('\"', *c_escape_char_p, '\"'), ',')
        ;

The confix_p and c_escape_char_p parser generators are described in the Confix Parsers and Escape Character Parsers sections, respectively.
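The list shape given at the top of this section, item >> *(delimiter >> item) >> !end, can be illustrated without Spirit. The following is only a hand-written sketch: match_digit_list is a hypothetical helper, and the choice of "a non-empty run of digits" as the item is an assumption made for the example. It shows that at least one item is required and that the end token is optional.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Spirit. Matches the shape
//     item >> *(delimiter >> item) >> !end
// where an item is assumed to be a non-empty run of decimal digits.
// Returns true if the whole input matches; `items` receives the items
// that were matched along the way.
bool match_digit_list(const std::string& in, char delimiter, char end,
                      std::vector<std::string>& items)
{
    items.clear();
    std::size_t pos = 0;

    // Matches one item starting at p; advances p only on success.
    auto match_item = [&](std::size_t& p) {
        std::size_t start = p;
        while (p < in.size() && in[p] >= '0' && in[p] <= '9') ++p;
        if (p == start) return false;
        items.push_back(in.substr(start, p - start));
        return true;
    };

    // item: an empty list is not accepted.
    if (!match_item(pos)) return false;

    // *(delimiter >> item)
    while (pos < in.size() && in[pos] == delimiter) {
        std::size_t next = pos + 1;
        if (!match_item(next)) break;   // backtrack: delimiter stays unconsumed
        pos = next;
    }

    // !end: the closing token may be present or absent.
    if (pos < in.size() && in[pos] == end) ++pos;

    return pos == in.size();
}
```

With ',' as delimiter and ';' as end token, both "1,22,333" and "1,22,333;" are accepted, while an empty input is rejected because no first item can be matched.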

The list_p parser generator object can be used to generate the following different types of List Parsers:

    list_p

Used by itself, list_p parses comma separated lists without special item formatting: everything between two commas is matched as an item. No end-of-list token is matched.

    list_p(delimiter)

Generates a list parser that recognizes lists with the given delimiter and matches everything between two delimiters as an item. No end-of-list token is matched.

    list_p(item, delimiter)

Generates a list parser that recognizes lists with the given delimiter and matches items based on the given item parser. No end-of-list token is matched.

    list_p(item, delimiter, end)

Generates a list parser that recognizes lists with the given delimiter, matches items based on the given item parser, and additionally recognizes an optional end expression.

All of the parameters to list_p can be single characters, strings or, if more complex parsing logic is required, auxiliary parsers, each of which is automatically converted to the corresponding parser type needed for successful parsing.

If the item parser is of the action_parser_category type (a parser with an attached semantic action), it needs special handling. This happens if the user writes something like:

    list_p(item[func], delim)

where item is the parser matching one item of the list sequence and func is a functor to be called after matching one item. Without special handling, the resulting code would parse the sequence as follows:

    (item[func] - delim) >> *(delim >> (item[func] - delim))

which in most cases is not what the user expects. (If this is what you expect, use one of the direct() generator functions of list_p, which inhibit refactoring of the item parser.) To make the list parser behave as expected:

    (item - delim)[func] >> *(delim >> (item - delim)[func])

the actor attached to the item parser has to be re-attached to the (item - delim) parser construct, which makes the resulting list parser 'do the right thing'. This refactoring is done with the help of the Refactoring Parsers. Additionally, special care must be taken if the item parser is of the unary_parser_category type, as for instance:

    list_p(*anychar_p, ',')

which without any refactoring would result in

        (*anychar_p - ch_p(','))
    >> *( ch_p(',') >> (*anychar_p - ch_p(',')) )

and would not give the expected result (the first *anychar_p would consume all input up to the end of the input stream). So this has to be refactored into:

       *(anychar_p - ch_p(','))
    >> *( ch_p(',') >> *(anychar_p - ch_p(',')) )

which gives the correct result.
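The difference between the two forms can be reproduced with a small hand-written emulation. This is a sketch only: item_unrefactored, item_refactored and parse_list are hypothetical names, not Spirit functions, and the emulation assumes that a - b fails only when b matches at the same position with a match at least as long as that of a.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; none of these names come from Spirit.
// An item function returns the number of characters matched at `pos`,
// or std::string::npos if the item does not match.
using ItemFn = std::size_t (*)(const std::string&, std::size_t);

// Emulates (*anychar_p - ch_p(',')): the star runs first and consumes the
// rest of the input. The difference only rejects the match when ',' matches
// at `pos` and is at least as long as what the star consumed.
std::size_t item_unrefactored(const std::string& in, std::size_t pos)
{
    const std::size_t star_len = in.size() - pos;
    const bool delim_matches = pos < in.size() && in[pos] == ',';
    if (delim_matches && star_len <= 1)
        return std::string::npos;
    return star_len;
}

// Emulates *(anychar_p - ch_p(',')): the difference is applied to every
// single character, so the star stops in front of the next ','.
std::size_t item_refactored(const std::string& in, std::size_t pos)
{
    std::size_t len = 0;
    while (pos + len < in.size() && in[pos + len] != ',')
        ++len;
    return len;
}

// Emulates item >> *(',' >> item) and collects the matched items.
std::vector<std::string> parse_list(const std::string& in, ItemFn item)
{
    std::vector<std::string> items;
    std::size_t pos = 0;

    std::size_t len = item(in, pos);
    if (len == std::string::npos) return items;
    items.push_back(in.substr(pos, len));
    pos += len;

    while (pos < in.size() && in[pos] == ',') {
        len = item(in, pos + 1);
        if (len == std::string::npos) break;
        items.push_back(in.substr(pos + 1, len));
        pos += 1 + len;
    }
    return items;
}
```

For the input "a,b,c", parse_list with item_unrefactored yields the single item "a,b,c", whereas parse_list with item_refactored yields the three items "a", "b" and "c".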

The case where the item parser combines the two problems mentioned above (i.e. the item parser is a unary parser with an attached action) is handled accordingly:

    list_p((*anychar_p)[func], ',')

will be parsed as expected:

        (*(anychar_p - ch_p(',')))[func]
    >> *( ch_p(',') >> (*(anychar_p - ch_p(',')))[func] )

The required refactoring is implemented with the help of the Refactoring Parsers.
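One observable effect of where the action is attached can be sketched by hand. In the emulation below, which does not use Spirit, the item is assumed to be "any single character" and the delimiter is ','. All names are hypothetical, and the vector of recorded characters stands in for func. With the action inside the difference, it fires as soon as the item matches, even if the difference then rejects the match. With the action outside, it fires only for accepted items.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; none of these names come from Spirit.
// `calls` records every character that is passed to the action.
using ItemFn = bool (*)(const std::string&, std::size_t, std::vector<char>&);

// Emulates (item[func] - delim): the action runs when item matches, before
// the difference has checked whether the delimiter matches as well.
bool item_action_inside(const std::string& in, std::size_t pos,
                        std::vector<char>& calls)
{
    if (pos >= in.size()) return false;   // item fails at end of input
    calls.push_back(in[pos]);             // func runs here
    return in[pos] != ',';                // difference rejects a delimiter
}

// Emulates (item - delim)[func]: the action runs only after the difference
// has accepted the match.
bool item_action_outside(const std::string& in, std::size_t pos,
                         std::vector<char>& calls)
{
    if (pos >= in.size() || in[pos] == ',') return false;
    calls.push_back(in[pos]);             // func runs here
    return true;
}

// Emulates item >> *(delim >> item) and returns the recorded action calls.
std::vector<char> run_list(const std::string& in, ItemFn item)
{
    std::vector<char> calls;
    std::size_t pos = 0;
    if (!item(in, pos, calls)) return calls;
    ++pos;
    while (pos < in.size() && in[pos] == ',' && item(in, pos + 1, calls))
        pos += 2;
    return calls;
}
```

For the input "a,,b", run_list with item_action_inside records 'a' and then ',', because the action has already run for the second comma before the difference rejects it. run_list with item_action_outside records only 'a'. For well-formed input such as "a,b", both variants record the same calls.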

Summary of List Parser refactorings

    You write it as:                    It is refactored to:

    list_p(item, delimiter)                (item - delimiter)
                                        >> *(delimiter >> (item - delimiter))

    list_p(item[func], delimiter)          (item - delimiter)[func]
                                        >> *(delimiter >> (item - delimiter)[func])

    list_p(*item, delimiter)               *(item - delimiter)
                                        >> *(delimiter >> *(item - delimiter))

    list_p((*item)[func], delimiter)       (*(item - delimiter))[func]
                                        >> *(delimiter >> (*(item - delimiter))[func])

The list_parser.cpp sample shows the usage of the list_p utility parser:

  1. parsing a simple ',' delimited list without item formatting
  2. parsing a CSV list (comma separated values - strings, integers or reals)
  3. parsing a token list (token separated values - strings, integers or reals) with an action parser directly attached to the item part of the list_p generated parser

This is part of the Spirit distribution.