Symfony\Component\DomCrawler\Crawler

class Crawler implements Countable, IteratorAggregate

Crawler eases navigation of a list of \DOMElement objects.

Methods

__construct(mixed $node = null, string $currentUri = null, string $baseHref = null)	Constructor.
clear()	Removes all the nodes.
add(DOMNodeList|DOMNode|array|string|null $node)	Adds a node to the current list of nodes.
addContent(string $content, null|string $type = null)	Adds HTML/XML content.
addHtmlContent(string $content, string $charset = 'UTF-8')	Adds HTML content to the list of nodes.
addXmlContent(string $content, string $charset = 'UTF-8')	Adds XML content to the list of nodes.
addDocument(DOMDocument $dom)	Adds a \DOMDocument to the list of nodes.
addNodeList(DOMNodeList $nodes)	Adds a \DOMNodeList to the list of nodes.
addNodes(array $nodes)	Adds an array of \DOMNode instances to the list of nodes.
addNode(DOMNode $node)	Adds a \DOMNode instance to the list of nodes.
Crawler eq(int $position)	Returns a node given its position in the node list.
array each(Closure $closure)	Calls an anonymous function on each node of the list.
Crawler slice(int $offset, int $length = null)	Slices the list of nodes by $offset and $length.
Crawler reduce(Closure $closure)	Reduces the list of nodes by calling an anonymous function.
Crawler first()	Returns the first node of the current selection.
Crawler last()	Returns the last node of the current selection.
Crawler siblings()	Returns the sibling nodes of the current selection.
Crawler nextAll()	Returns the next sibling nodes of the current selection.
Crawler previousAll()	Returns the previous sibling nodes of the current selection.
Crawler parents()	Returns the parent nodes of the current selection.
Crawler children()	Returns the child nodes of the current selection.
string|null attr(string $attribute)	Returns the attribute value of the first node of the list.
string nodeName()	Returns the node name of the first node of the list.
string text()	Returns the node value of the first node of the list.
string html()	Returns the first node of the list as HTML.
array extract(array $attributes)	Extracts information from the list of nodes.
Crawler filterXPath(string $xpath)	Filters the list of nodes with an XPath expression.
Crawler filter(string $selector)	Filters the list of nodes with a CSS selector.
Crawler selectLink(string $value)	Selects links by name or alt value for clickable images.
Crawler selectButton(string $value)	Selects a button by name or alt value for images.
Link link(string $method = 'get')	Returns a Link object for the first node in the list.
Link[] links()	Returns an array of Link objects for the nodes in the list.
Form form(array $values = null, string $method = null)	Returns a Form object for the first node in the list.
setDefaultNamespacePrefix(string $prefix)	Overrides the default namespace prefix used with XPath and CSS expressions.
registerNamespace(string $prefix, string $namespace)	No description
static string xpathLiteral(string $s)	Converts a string for use in XPath expressions.
DOMElement|null getNode(int $position)	No description
int count()	No description
ArrayIterator getIterator()	No description

Details

at line 67
`__construct(mixed $node = null, string $currentUri = null, string $baseHref = null)`

Constructor.

Parameters

mixed	$node	A Node to use as the base for the crawling
string	$currentUri	The current URI
string	$baseHref	The base href value

at line 78
`clear()`

Removes all the nodes.

at line 94
`add(DOMNodeList|DOMNode|array|string|null $node)`

Adds a node to the current list of nodes.

This method uses the appropriate specialized add*() method based on the type of the argument.

Parameters

DOMNodeList|DOMNode|array|string|null	$node	A node

Exceptions

InvalidArgumentException	When node is not the expected type.

at line 119
`addContent(string $content, null|string $type = null)`

Adds HTML/XML content.

If the charset is not set via the content type, it is assumed to be ISO-8859-1, which is the default charset defined by the HTTP 1.1 specification.

Parameters

string	$content	A string to parse as HTML/XML
null|string	$type	The content type of the string
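The charset rule above can be illustrated with a small sketch. It assumes the component was installed with Composer (the `vendor/autoload.php` path is an assumption) and passes the charset through the content type so that non-ASCII text is decoded as intended.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// "\u{e9}" is the UTF-8 encoded letter e with an acute accent.
$html = "<html><body><p>caf\u{e9}</p></body></html>";

$crawler = new Crawler();
// Declaring the charset in the content type avoids relying on the fallback charset.
$crawler->addContent($html, 'text/html; charset=UTF-8');

$text = $crawler->filterXPath('//p')->text();
echo $text, PHP_EOL;
```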

at line 169
`addHtmlContent(string $content, string $charset = 'UTF-8')`

Adds HTML content to the list of nodes.

The libxml errors are disabled when the content is parsed.

If you want to get parsing errors, be sure to enable internal errors via libxml_use_internal_errors(true) and then get the errors via libxml_get_errors(). Be sure to clear errors with libxml_clear_errors() afterward.

Parameters

string	$content	The HTML content
string	$charset	The charset
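The error-handling advice above might look like the following sketch. The markup and the autoloader path are assumptions made for illustration, and which errors libxml reports (if any) depends on the parser in use, so no particular output is implied.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Ask libxml to buffer errors; keep the previous setting so it can be restored.
$previous = libxml_use_internal_errors(true);

$crawler = new Crawler();
$crawler->addHtmlContent('<p>Unclosed <b>markup</p><unknown-tag>', 'UTF-8');

// Read whatever the parser buffered (an array of LibXMLError objects).
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the buffer and restore the earlier error mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```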

at line 224
`addXmlContent(string $content, string $charset = 'UTF-8')`

Adds XML content to the list of nodes.

The libxml errors are disabled when the content is parsed.

Parameters

string	$content	The XML content
string	$charset	The charset

at line 254
`addDocument(DOMDocument $dom)`

Adds a \DOMDocument to the list of nodes.

Parameters

DOMDocument	$dom	A \DOMDocument instance

at line 266
`addNodeList(DOMNodeList $nodes)`

Adds a \DOMNodeList to the list of nodes.

Parameters

DOMNodeList	$nodes	A \DOMNodeList instance

at line 280
`addNodes(array $nodes)`

Adds an array of \DOMNode instances to the list of nodes.

Parameters

array	$nodes	An array of \DOMNode instances

at line 292
`addNode(DOMNode $node)`

Adds a \DOMNode instance to the list of nodes.

Parameters

DOMNode	$node	A \DOMNode instance

at line 325
`Crawler eq(int $position)`

Returns a node given its position in the node list.

Parameters

int	$position	The position

Return Value

Crawler	A new instance of the Crawler with the selected node, or an empty Crawler if it does not exist.

at line 350
`array each(Closure $closure)`

Calls an anonymous function on each node of the list.

The anonymous function receives the node (wrapped in a Crawler instance) and its position as arguments.

Example:

$crawler->filter('h1')->each(function ($node, $i) {
    return $node->text();
});

Parameters

Closure	$closure	An anonymous function

Return Value

array	An array of values returned by the anonymous function

at line 368
`Crawler slice(int $offset, int $length = null)`

Slices the list of nodes by $offset and $length.

Parameters

int	$offset
int	$length

Return Value

Crawler	A Crawler instance with the sliced nodes

at line 382
`Crawler reduce(Closure $closure)`

Reduces the list of nodes by calling an anonymous function.

To remove a node from the list, the anonymous function must return false.

Parameters

Closure	$closure	An anonymous function

Return Value

Crawler	A Crawler instance with the selected nodes.
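As a sketch of the rule that returning false drops a node, the closure below removes one list item by its text. The markup and the autoloader path are assumptions made for illustration.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<ul><li>a</li><li>b</li><li>c</li></ul>');

// Returning false removes the node; any other return value keeps it.
$kept = $crawler->filterXPath('//li')->reduce(function (Crawler $node, $i) {
    return $node->text() !== 'b';
});

// Collect the remaining texts to inspect the result.
$texts = $kept->each(function (Crawler $node) {
    return $node->text();
});
print_r($texts);
```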

at line 399
`Crawler first()`

Returns the first node of the current selection.

Return Value

Crawler	A Crawler instance with the first selected node

at line 409
`Crawler last()`

Returns the last node of the current selection.

Return Value

Crawler	A Crawler instance with the last selected node
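The positional helpers eq(), slice(), first() and last() can be compared side by side in a short sketch. The list markup and the autoloader path are assumptions made for illustration.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<ul><li>one</li><li>two</li><li>three</li><li>four</li></ul>');
$items = $crawler->filterXPath('//li');

$second = $items->eq(1)->text();   // positions are zero-based
$middle = $items->slice(1, 2);     // two nodes, starting at position 1
$first = $items->first()->text();
$last = $items->last()->text();
$missing = $items->eq(10);         // out of range: an empty Crawler, not an exception

printf("%s %s %s %d %d\n", $first, $second, $last, count($middle), count($missing));
```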

at line 421
`Crawler siblings()`

Returns the sibling nodes of the current selection.

Return Value

Crawler	A Crawler instance with the sibling nodes

Exceptions

InvalidArgumentException	When current node is empty

at line 437
`Crawler nextAll()`

Returns the next sibling nodes of the current selection.

Return Value

Crawler	A Crawler instance with the next sibling nodes

Exceptions

InvalidArgumentException	When current node is empty

at line 453
`Crawler previousAll()`

Returns the previous sibling nodes of the current selection.

Return Value

Crawler	A Crawler instance with the previous sibling nodes

Exceptions

InvalidArgumentException	When current node is empty

at line 469
`Crawler parents()`

Returns the parent nodes of the current selection.

Return Value

Crawler	A Crawler instance with the parent nodes of the current selection

Exceptions

InvalidArgumentException	When current node is empty

at line 494
`Crawler children()`

Returns the child nodes of the current selection.

Return Value

Crawler	A Crawler instance with the child nodes

Exceptions

InvalidArgumentException	When current node is empty
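The traversal methods above all start from the first node of the current selection. The sketch below, built on assumed markup and an assumed autoloader path, shows what each one returns for the middle item of a three-item list and how the documented exception surfaces on an empty selection.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<ul><li id="a">A</li><li id="b">B</li><li id="c">C</li></ul>');
$b = $crawler->filterXPath('//li[@id="b"]');

// Helper closure (hypothetical, for this example only): list the id of every node.
$ids = function (Crawler $nodes) {
    return $nodes->each(function (Crawler $node) {
        return $node->attr('id');
    });
};

$siblings = $ids($b->siblings());     // every other <li> under the same parent
$next = $ids($b->nextAll());          // siblings after #b
$previous = $ids($b->previousAll());  // siblings before #b
$children = $crawler->filterXPath('//ul')->children();

// Traversing from an empty selection raises the documented exception.
$thrown = false;
try {
    $crawler->filterXPath('//ol')->children();
} catch (\InvalidArgumentException $e) {
    $thrown = true;
}
```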

at line 514
`string|null attr(string $attribute)`

Returns the attribute value of the first node of the list.

Parameters

string	$attribute	The attribute name

Return Value

string|null	The attribute value or null if the attribute does not exist

Exceptions

InvalidArgumentException	When current node is empty

at line 532
`string nodeName()`

Returns the node name of the first node of the list.

Return Value

string	The node name

Exceptions

InvalidArgumentException	When current node is empty

at line 548
`string text()`

Returns the node value of the first node of the list.

Return Value

string	The node value

Exceptions

InvalidArgumentException	When current node is empty

at line 564
`string html()`

Returns the first node of the list as HTML.

Return Value

string	The node HTML

Exceptions

InvalidArgumentException	When current node is empty
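The four accessors attr(), nodeName(), text() and html() all read from the first node of the list. A minimal sketch, assuming the markup and autoloader path shown:

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<p class="intro">Hi <b>there</b></p>');
$p = $crawler->filterXPath('//p');

$name = $p->nodeName();      // element name of the first node
$class = $p->attr('class');  // value of an existing attribute
$title = $p->attr('title');  // attribute is absent, so null is expected
$text = $p->text();          // text content, tags stripped
$html = $p->html();          // markup inside the <p> element

echo $name, ' | ', $class, ' | ', $text, ' | ', $html, PHP_EOL;
```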

at line 591
`array extract(array $attributes)`

Extracts information from the list of nodes.

You can extract attributes and/or the node value (_text).

Example:

$crawler->filter('h1 a')->extract(array('_text', 'href'));

Parameters

array	$attributes	An array of attributes

Return Value

array	An array of extracted values
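To make the shape of the returned array concrete, here is a sketch that extracts two values per node. The markup and autoloader path are assumptions; with two requested attributes, each matched node yields one inner array in the order the attributes were requested.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<h1><a href="/a">A</a></h1><h1><a href="/b">B</a></h1>');

// One row per matched <a>: its text first, then its href attribute.
$rows = $crawler->filterXPath('//h1/a')->extract(['_text', 'href']);

foreach ($rows as list($text, $href)) {
    echo $text, ' => ', $href, PHP_EOL;
}
```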

at line 625
`Crawler filterXPath(string $xpath)`

Filters the list of nodes with an XPath expression.

The XPath expression is evaluated in the context of the crawler, which is treated as a fake parent of the elements inside it. This means that a child selector "div" or "./div" will match only the div elements of the current crawler, not their children.

Parameters

string	$xpath	An XPath expression

Return Value

Crawler	A new instance of Crawler with the filtered list of nodes
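The fake-parent rule is easier to see with nested elements. In this sketch (markup and autoloader path are assumptions), the expression "div" selects the node already held by the crawler, and an explicit extra step is needed to reach the nested one.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<div id="outer"><div id="inner">text</div></div>');
$outer = $crawler->filterXPath('//div[@id="outer"]');

// "div" is tested against the crawler's own nodes, so it matches #outer itself.
$self = $outer->filterXPath('div');

// Step down one level explicitly to reach the nested element.
$nested = $outer->filterXPath('div/div');

echo $self->attr('id'), ' ', $nested->attr('id'), PHP_EOL;
```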

at line 648
`Crawler filter(string $selector)`

Filters the list of nodes with a CSS selector.

This method only works if you have installed the CssSelector Symfony Component.

Parameters

string	$selector	A CSS selector

Return Value

Crawler	A new instance of Crawler with the filtered list of nodes

Exceptions

RuntimeException	If the CssSelector component is not available

at line 667
`Crawler selectLink(string $value)`

Selects links by name or alt value for clickable images.

Parameters

string	$value	The link text

Return Value

Crawler	A new instance of Crawler with the filtered list of nodes

at line 682
`Crawler selectButton(string $value)`

Selects a button by name or alt value for images.

Parameters

string	$value	The button text

Return Value

Crawler	A new instance of Crawler with the filtered list of nodes

at line 701
`Link link(string $method = 'get')`

Returns a Link object for the first node in the list.

Parameters

string	$method	The method for the link (get by default)

Return Value

Link	A Link instance

Exceptions

InvalidArgumentException	If the current node list is empty
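selectLink() and link() are typically chained. The sketch below assumes the markup, the example.com URI and the autoloader path; it passes a current URI to the constructor so that the Link object can resolve the href to an absolute address.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// The second constructor argument is the URI the markup was fetched from.
$crawler = new Crawler('<a href="/about">About us</a>', 'http://example.com/blog/');

// Select the anchor by its text, then wrap the first match in a Link object.
$link = $crawler->selectLink('About us')->link();

$uri = $link->getUri();       // href resolved against the current URI
$method = $link->getMethod(); // HTTP method associated with the link

echo $method, ' ', $uri, PHP_EOL;
```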

at line 717
`Link[] links()`

Returns an array of Link objects for the nodes in the list.

Return Value

Link[]	An array of Link instances

at line 737
`Form form(array $values = null, string $method = null)`

Returns a Form object for the first node in the list.

Parameters

array	$values	An array of values for the form fields
string	$method	The method for the form

Return Value

Form	A Form instance

Exceptions

InvalidArgumentException	If the current node list is empty
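A form is usually reached through one of its buttons. The following sketch assumes the markup, the example.com URI and the autoloader path; it selects the submit button by its value and pre-fills one field through the $values argument.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = '<form action="/search">'
    .'<input type="text" name="q" value="">'
    .'<input type="submit" value="Search">'
    .'</form>';

$crawler = new Crawler($html, 'http://example.com/');

// Select the button by its value, then build the Form with an overridden field.
$form = $crawler->selectButton('Search')->form(['q' => 'symfony']);

$values = $form->getValues(); // field name => value pairs to be submitted
$method = $form->getMethod(); // the form has no method attribute here

print_r($values);
```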

at line 757
`setDefaultNamespacePrefix(string $prefix)`

Overrides the default namespace prefix used with XPath and CSS expressions.

Parameters

string	$prefix

at line 766
`registerNamespace(string $prefix, string $namespace)`

Parameters

string	$prefix
string	$namespace

at line 792
`static string xpathLiteral(string $s)`

Converts a string for use in XPath expressions.

Escaped characters are: quotes (") and apostrophe (').

Examples:

echo Crawler::xpathLiteral('foo " bar');
//prints 'foo " bar'

echo Crawler::xpathLiteral("foo ' bar");
//prints "foo ' bar"

echo Crawler::xpathLiteral('a\'b"c');
//prints concat('a', "'", 'b"c')

Parameters

string	$s	String to be escaped

Return Value

string	Converted string
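One plausible use of xpathLiteral() is embedding arbitrary text in an expression passed to filterXPath(). The sketch below assumes the markup and autoloader path; the search string contains both quote styles, which a hand-written XPath string literal could not express directly.

```php
<?php
// Assumption: symfony/dom-crawler is installed and the Composer autoloader is here.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<ul><li>plain</li><li>it\'s "quoted"</li></ul>');

// Text to search for, containing an apostrophe and double quotes.
$needle = 'it\'s "quoted"';

// Let xpathLiteral() build a safe literal instead of concatenating quotes by hand.
$xpath = '//li[text()='.Crawler::xpathLiteral($needle).']';
$matches = $crawler->filterXPath($xpath);

echo count($matches), PHP_EOL;
```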

at line 913
`DOMElement|null getNode(int $position)`

Parameters

int	$position

Return Value

DOMElement|null

at line 923
`int count()`

Return Value

int

at line 931
`ArrayIterator getIterator()`

Return Value

ArrayIterator
