MediaWiki  REL1_24
HtmlFormatter Class Reference

Public Member Functions

 __construct ($html)
 Constructor.
 filterContent ()
 Removes content we've chosen to remove.
 flatten ($elements)
 Adds one or more element names to the list to flatten (remove the tag, but not its content). Can accept undelimited regexes.
 flattenAllTags ()
 Instructs the formatter to flatten all tags.
 getDoc ()
 getText ($element=null)
 Performs final transformations and returns resulting HTML.
 remove ($selectors)
 Adds one or more selectors of content to remove.
 setRemoveMedia ($flag=true)
 Sets whether images/videos/sounds should be removed from output.

Static Public Member Functions

static wrapHTML ($html)
 Turns a chunk of HTML into a proper document.

Protected Member Functions

 onHtmlReady ($html)
 Override this in descendant class to modify HTML after it has been converted from DOM tree.
 parseItemsToRemove ()
 Transforms CSS-style selectors into an internal representation suitable for processing by filterContent().
 parseSelector ($selector, &$type, &$rawName)
 Helper function for parseItemsToRemove().

Protected Attributes

 $removeMedia = false

Private Member Functions

 fixLibXML ($html)
 libxml, in its usual pointlessness, converts many characters to entities; this function performs the reverse conversion.
 removeElements ($elements)
 Removes a list of elements from DOMDocument.

Private Attributes

DOMDocument $doc
 $elementsToFlatten = array()
 $html
 $itemsToRemove = array()

Detailed Description

Definition at line 23 of file HtmlFormatter.php.


Constructor & Destructor Documentation

HtmlFormatter::__construct ( html)

Constructor.

Parameters:
string $html Text to process

Definition at line 38 of file HtmlFormatter.php.


Member Function Documentation

HtmlFormatter::filterContent ( )

Removes content we've chosen to remove.

The text of the removed elements can be extracted with the getText method.

Returns:
array Array of removed DOMElements

Definition at line 134 of file HtmlFormatter.php.

HtmlFormatter::fixLibXML ( html) [private]

libxml, in its usual pointlessness, converts many characters to entities; this function performs the reverse conversion.

Parameters:
string $html
Returns:
string

Definition at line 236 of file HtmlFormatter.php.
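
The reverse conversion described above can be illustrated with a standalone sketch. This is not the MediaWiki implementation: the function name sketchDecodeEntities is hypothetical, and the approach (double-escaping the markup-significant entities, then decoding everything else with PHP's html_entity_decode) is an assumption chosen so that decoding cannot turn escaped text such as &lt; back into live markup.

```php
<?php
// Hypothetical helper, not part of MediaWiki: turn named entities such as
// &eacute; back into UTF-8 characters while keeping &amp;, &lt;, &gt; and
// &quot; escaped.
function sketchDecodeEntities(string $html): string
{
    // Double-escape the entities that must survive decoding.
    // strtr() with an array replaces in a single pass, so the
    // replacements are never themselves re-replaced.
    $protected = strtr($html, array(
        '&amp;'  => '&amp;amp;',
        '&lt;'   => '&amp;lt;',
        '&gt;'   => '&amp;gt;',
        '&quot;' => '&amp;quot;',
    ));

    // One decoding pass: "&amp;lt;" becomes "&lt;", "&eacute;" becomes "é".
    return html_entity_decode($protected, ENT_NOQUOTES | ENT_HTML5, 'UTF-8');
}

$decoded = sketchDecodeEntities('caf&eacute; &lt;b&gt; &amp; more');
echo $decoded, "\n";
```

Protecting the four special entities first is the design choice that matters here; a plain html_entity_decode call on its own would also decode &lt; and &gt;.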

HtmlFormatter::flatten ( elements)

Adds one or more element names to the list to flatten (remove the tag, but not its content). Can accept undelimited regexes.

Note that this interface may fail in surprising and unexpected ways because it relies on regexes, so it should not be relied on as an HTML markup security measure.

Parameters:
array | string $elements Name(s) of tag(s) to flatten

Definition at line 118 of file HtmlFormatter.php.
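
To make the regex caveat concrete, here is a minimal standalone sketch of regex-based flattening. It is an illustration only, not the code from HtmlFormatter.php: the function sketchFlatten and its pattern are assumptions, written to show how tag names given as undelimited regexes could be combined into one pattern, and how such a pattern can misfire.

```php
<?php
// Hypothetical helper, not part of MediaWiki: strip the opening and closing
// tags of the named elements while leaving their content in place.
// Each entry in $elements is an undelimited regex such as 'a' or 'h[1-6]'.
// Assumption: entries do not contain the '!' delimiter used below.
function sketchFlatten(string $html, array $elements): string
{
    $names = implode('|', $elements);
    // \b stops 'a' from also matching <abbr>; [^>]* skips the attributes.
    return preg_replace("!</?(?:$names)\\b[^>]*>!is", '', $html);
}

$flat = sketchFlatten('<p>See <a href="/x">this <b>link</b></a>.</p>', array('a', 'b'));
echo $flat, "\n"; // <p>See this link.</p>

// The pitfall: a '>' inside an attribute value ends the match too early,
// so part of the attribute leaks into the output.
$broken = sketchFlatten('<a title="a>b">x</a>', array('a'));
echo $broken, "\n"; // b">x
```

The second call shows why output produced this way should not be treated as sanitized.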

HtmlFormatter::flattenAllTags ( )

Instructs the formatter to flatten all tags.

Definition at line 125 of file HtmlFormatter.php.

HtmlFormatter::getDoc ( )

Returns:
DOMDocument DOM to manipulate

Reimplemented in MockHtmlFormatter.

Definition at line 63 of file HtmlFormatter.php.

HtmlFormatter::getText ( element = null)

Performs final transformations and returns resulting HTML.

Note that if you want to call this both without an element and with an element, you should call it without an element first. Specifying $element changes the underlying DOM, and the original tree cannot be recovered afterwards.

Parameters:
DOMElement | string | null $element ID of element to get HTML from, or null to get it from the whole tree
Returns:
string Processed HTML

Definition at line 265 of file HtmlFormatter.php.
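
A typical call sequence might look like the sketch below. It uses only the public methods listed on this page and assumes MediaWiki REL1_24 is loaded; the guard simply skips the calls elsewhere. Passing the input through wrapHTML() before constructing the formatter is an assumption based on that method's description, and the sample markup is invented for illustration.

```php
<?php
// Usage sketch, assuming the MediaWiki HtmlFormatter class is available.
// Outside MediaWiki the guard leaves $processed as null.
$processed = null;

if (class_exists('HtmlFormatter')) {
    // Invented sample input.
    $input = '<div id="toc">Contents</div><p>Body text <img src="a.png"></p>';

    // Assumption: wrap the fragment into a full document first.
    $formatter = new HtmlFormatter(HtmlFormatter::wrapHTML($input));

    $formatter->remove('#toc');        // queue a selector for removal
    $formatter->setRemoveMedia(true);  // also drop images/videos/sounds
    $formatter->filterContent();       // apply the queued removals

    // Call without an element first; a later call with an element would
    // change the underlying DOM.
    $processed = $formatter->getText();
}
```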

HtmlFormatter::onHtmlReady ( html) [protected]

Override this in descendant class to modify HTML after it has been converted from DOM tree.

Parameters:
string $html HTML to process
Returns:
string Processed HTML

Definition at line 56 of file HtmlFormatter.php.

HtmlFormatter::parseItemsToRemove ( ) [protected]

Transforms CSS-style selectors into an internal representation suitable for processing by filterContent().

Returns:
array

Definition at line 350 of file HtmlFormatter.php.

HtmlFormatter::parseSelector ( $selector, &$type, &$rawName ) [protected]

Helper function for parseItemsToRemove().

This function extracts the selector type and the raw name of a selector from a CSS-style selector string and assigns those values to parameters passed by reference. For example, if given '#toc' as the $selector parameter, it will assign 'ID' as the $type and 'toc' as the $rawName.

Parameters:
string $selector CSS selector to parse
string $type The type of selector (ID, CLASS, TAG_CLASS, or TAG)
string $rawName The raw name of the selector
Returns:
bool Whether the selector was successfully recognised

Definition at line 325 of file HtmlFormatter.php.
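
The by-reference contract described above can be sketched as a standalone function. This is not the MediaWiki code: sketchParseSelector is a hypothetical name, the checks are assumptions, and in particular keeping the whole "tag.class" string as the raw name for TAG_CLASS is a guess, since this page only specifies the '#toc' case.

```php
<?php
// Hypothetical helper, not part of MediaWiki: classify a selector and hand
// the results back through the by-reference parameters.
function sketchParseSelector(string $selector, &$type, &$rawName): bool
{
    if ($selector === '') {
        return false;
    }
    if ($selector[0] === '#') {
        $type = 'ID';
        $rawName = substr($selector, 1);
    } elseif ($selector[0] === '.') {
        $type = 'CLASS';
        $rawName = substr($selector, 1);
    } elseif (strpos($selector, '.') !== false) {
        $type = 'TAG_CLASS';
        $rawName = $selector; // assumption: keep "tag.class" whole
    } elseif (preg_match('/^[A-Za-z][A-Za-z0-9]*$/', $selector)) {
        $type = 'TAG';
        $rawName = $selector;
    } else {
        return false; // unsupported syntax, e.g. attribute selectors
    }
    return true;
}

$ok = sketchParseSelector('#toc', $type, $rawName);
var_dump($ok, $type, $rawName); // bool(true), "ID", "toc"
```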

HtmlFormatter::remove ( selectors)

Adds one or more selectors of content to remove.

A subset of CSS selector syntax is supported:

 <tag>
 <tag>.class
 .<class>
 #<id>

Parameters:
array | string $selectors Selector(s) of stuff to remove

Definition at line 105 of file HtmlFormatter.php.
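
To show what the four supported selector forms mean in DOM terms, the following standalone sketch maps each form to an XPath query and removes the matches with PHP's DOM extension. It is one possible reading and not how HtmlFormatter does it: sketchSelectorToXPath is a hypothetical helper, the sample markup is invented, and the selector text is inserted into the query without escaping.

```php
<?php
// Hypothetical helper, not part of MediaWiki: translate one supported
// selector form (<tag>, <tag>.class, .<class>, #<id>) into an XPath query.
function sketchSelectorToXPath(string $selector): string
{
    // Matches a whole class name inside a space-separated class attribute.
    $hasClass = function (string $class): string {
        return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
    };

    if ($selector[0] === '#') {
        return "//*[@id='" . substr($selector, 1) . "']";
    }
    if ($selector[0] === '.') {
        return '//*[' . $hasClass(substr($selector, 1)) . ']';
    }
    if (strpos($selector, '.') !== false) {
        list($tag, $class) = explode('.', $selector, 2);
        return "//$tag" . '[' . $hasClass($class) . ']';
    }
    return "//$selector";
}

$doc = new DOMDocument();
$doc->loadHTML('<!doctype html><html><head></head><body>'
    . '<div id="toc">Contents</div><p class="note">Aside</p><p>Kept</p>'
    . '</body></html>');
$xpath = new DOMXPath($doc);

foreach (array('#toc', 'p.note') as $selector) {
    // Copy the matches into an array so removal does not disturb iteration.
    $matches = iterator_to_array($xpath->query(sketchSelectorToXPath($selector)));
    foreach ($matches as $node) {
        $node->parentNode->removeChild($node);
    }
}

$remaining = trim($doc->getElementsByTagName('body')->item(0)->textContent);
echo $remaining, "\n"; // Kept
```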

HtmlFormatter::removeElements ( elements) [private]

Removes a list of elements from DOMDocument.

Parameters:
array | DOMNodeList $elements
Returns:
array Array of removed elements

Definition at line 213 of file HtmlFormatter.php.

HtmlFormatter::setRemoveMedia ( flag = true)

Sets whether images/videos/sounds should be removed from output.

Parameters:
bool $flag

Definition at line 90 of file HtmlFormatter.php.

static HtmlFormatter::wrapHTML ( html) [static]

Turns a chunk of HTML into a proper document.

Parameters:
string $html
Returns:
string

Definition at line 47 of file HtmlFormatter.php.

Referenced by HtmlFormatterTest\testTransform().


Member Data Documentation

DOMDocument HtmlFormatter::$doc [private]

Definition at line 26 of file HtmlFormatter.php.

HtmlFormatter::$elementsToFlatten = array() [private]

Definition at line 30 of file HtmlFormatter.php.

HtmlFormatter::$html [private]

Definition at line 28 of file HtmlFormatter.php.

HtmlFormatter::$itemsToRemove = array() [private]

Definition at line 29 of file HtmlFormatter.php.

HtmlFormatter::$removeMedia = false [protected]

Definition at line 31 of file HtmlFormatter.php.


The documentation for this class was generated from the following file: