MediaWiki  REL1_22
Sanitizer Class Reference

HTML sanitizer for MediaWiki.

Static Public Member Functions

static attributeWhitelist ($element)
 Fetch the whitelist of acceptable attributes for a given element name.
static checkCss ($value)
 Pick apart some CSS and check it for forbidden or unsafe structures.
static cleanUrl ($url)
static cleanUrlCallback ($matches)
static cssDecodeCallback ($matches)
static decCharReference ($codepoint)
static decodeChar ($codepoint)
 Return UTF-8 string for a codepoint if that is a valid character reference, otherwise U+FFFD REPLACEMENT CHARACTER.
static decodeCharReferences ($text)
 Decode any character references, numeric or named entities, in the text and return a UTF-8 string.
static decodeCharReferencesAndNormalize ($text)
 Decode any character references, numeric or named entities, in the text and normalize the resulting string.
static decodeCharReferencesCallback ($matches)
static decodeEntity ($name)
 If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the UTF-8 encoding of that character.
static decodeTagAttributes ($text)
 Return an associative array of attribute names and values from a partial tag string.
static encodeAttribute ($text)
 Encode an attribute value for HTML output.
static escapeClass ($class)
 Given a value, escape it so that it can be used as a CSS class and return it.
static escapeHtmlAllowEntities ($html)
 Given HTML input, escape with htmlspecialchars but un-escape entities.
static escapeId ($id, $options=array())
 Given a value, escape it so that it can be used in an id attribute and return it.
static fixTagAttributes ($text, $element)
 Take a tag soup fragment listing an HTML element's attributes and normalize it to well-formed XML, discarding unwanted attributes.
static getAttribsRegex ()
 Regular expression to match HTML/XML attribute pairs within a tag.
static hackDocType ()
 Hack up a private DOCTYPE with HTML's standard entity declarations.
static hexCharReference ($codepoint)
static mergeAttributes ($a, $b)
 Merge two sets of HTML attributes.
static normalizeCharReferences ($text)
 Ensure that any entities and character references are legal for XML and XHTML specifically.
static normalizeCharReferencesCallback ($matches)
static normalizeCss ($value)
 Normalize CSS into a format we can easily search for hostile input.
static normalizeEntity ($name)
 If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the equivalent numeric entity reference (except for the core &lt; &gt; &amp; &quot;).
static normalizeSectionNameWhitespace ($section)
 Normalizes whitespace in a section name, such as might be returned by Parser::stripSectionName(), for use in the ids that are used for section links.
static removeHTMLcomments ($text)
 Remove '<!--', '-->', and everything between.
static removeHTMLtags ($text, $processCallback=null, $args=array(), $extratags=array(), $removetags=array())
 Cleans up HTML, removes dangerous tags and attributes, and removes HTML comments.
static safeEncodeAttribute ($text)
 Encode an attribute value for HTML tags, with extra armoring against further wiki processing.
static safeEncodeTagAttributes ($assoc_array)
 Build a partial tag string from an associative array of attribute names and values as returned by decodeTagAttributes.
static setupAttributeWhitelist ()
 For each array key (an allowed HTML element), return an array of allowed attributes.
static stripAllTags ($text)
 Take a fragment of (potentially invalid) HTML and return a version with any tags removed, encoded as plain text.
static validateAttributes ($attribs, $whitelist)
 Take an array of attribute names and values and normalize or discard illegal values for the given whitelist.
static validateEmail ($addr)
 Does a string look like an e-mail address?
static validateTag ($params, $element)
 Takes attribute names and values for a tag and the tag name and validates that the tag is allowed to be present.
static validateTagAttributes ($attribs, $element)
 Take an array of attribute names and values and normalize or discard illegal values for the given element type.

Public Attributes

const CHAR_REFS_REGEX
 Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.
const EVIL_URI_PATTERN = '!(^|\s|\*/\s*)(javascript|vbscript)([^\w]|$)!i'
 Blacklist for evil uris like javascript: WARNING: DO NOT use this in any place that actually requires blacklisting for security reasons.
const XMLNS_ATTRIBUTE_PATTERN = "/^xmlns:[:A-Z_a-z-.0-9]+$/"

Static Private Member Functions

static armorLinksCallback ($matches)
 Regex replace callback for armoring links against further processing.
static getTagAttributeCallback ($set)
 Pick the appropriate attribute value from a match set from the attribs regex matches.
static normalizeAttributeValue ($text)
 Normalize whitespace and character references in an XML source-encoded text for an attribute value.
static normalizeWhitespace ($text)
static validateCodepoint ($codepoint)
 Returns true if a given Unicode codepoint is a valid character in XML.

Static Private Attributes

static $attribsRegex
 Lazy-initialised attributes regex, see getAttribsRegex()
static $htmlEntities
 List of all named character entities defined in HTML 4.01 (http://www.w3.org/TR/html4/sgml/entities.html), as well as &apos;, which is only defined starting in XHTML1.
static $htmlEntityAliases
 Character entity aliases accepted by MediaWiki.

Detailed Description

HTML sanitizer for MediaWiki.

Definition at line 31 of file Sanitizer.php.


Member Function Documentation

static Sanitizer::armorLinksCallback ( matches) [static, private]

Regex replace callback for armoring links against further processing.

Parameters:
$matches Array
Returns:
string

Definition at line 1172 of file Sanitizer.php.

References $matches.

static Sanitizer::attributeWhitelist ( element) [static]

Fetch the whitelist of acceptable attributes for a given element name.

Parameters:
$element String
Returns:
Array

Definition at line 1499 of file Sanitizer.php.

References array(), and setupAttributeWhitelist().

Referenced by validateTagAttributes().

static Sanitizer::checkCss ( value) [static]

Pick apart some CSS and check it for forbidden or unsafe structures.

Returns a sanitized string. This sanitized string will have character references and escape sequences decoded and comments stripped (unless it is itself one valid comment, in which case the value will be passed through). If the input is just too evil, only a comment complaining about evilness will be returned.

Currently, URL references, 'expression', and 'tps' are forbidden.

NOTE: Despite the fact that character references are decoded, the returned string may contain character references given certain clever input strings. These character references must be escaped before the return value is embedded in HTML.

Parameters:
string $value
Returns:
string

Definition at line 939 of file Sanitizer.php.

References $value, and normalizeCss().

Referenced by CoreParserFunctions\displaytitle(), SanitizerTest\testCssCommentsChecking(), and validateAttributes().

static Sanitizer::cleanUrl ( url) [static]
Parameters:
$url string
Returns:
mixed|string

Definition at line 1767 of file Sanitizer.php.

References $matches, array(), decodeCharReferences(), and list.

static Sanitizer::cleanUrlCallback ( matches) [static]
Parameters:
$matches array
Returns:
string

Definition at line 1814 of file Sanitizer.php.

References $matches.

static Sanitizer::cssDecodeCallback ( matches) [static]
Parameters:
$matches array
Returns:
String

Definition at line 965 of file Sanitizer.php.

References $matches, and codepointToUtf8().

static Sanitizer::decCharReference ( codepoint) [static]
Parameters:
$codepoint
Returns:
null|string

Definition at line 1369 of file Sanitizer.php.

References validateCodepoint().

Referenced by normalizeCharReferencesCallback().

static Sanitizer::decodeChar ( codepoint) [static]

Return UTF-8 string for a codepoint if that is a valid character reference, otherwise U+FFFD REPLACEMENT CHARACTER.

Parameters:
$codepoint Integer
Returns:
String
Access:
private

Definition at line 1466 of file Sanitizer.php.

References codepointToUtf8(), and validateCodepoint().

Referenced by decodeCharReferencesCallback().

static Sanitizer::decodeCharReferencesAndNormalize ( text) [static]

Decode any character references, numeric or named entities, in the text and normalize the resulting string.

(bug 14952)

This is useful for page titles, not for text to be displayed; MediaWiki allows HTML entities to escape normalization as a feature.

Parameters:
string $text (already normalized, containing entities)
Returns:
String (still normalized, without entities)

Definition at line 1429 of file Sanitizer.php.

References $count, $wgContLang, array(), and global.

Referenced by Title\newFromText().

static Sanitizer::decodeCharReferencesCallback ( matches) [static]
Parameters:
$matches String
Returns:
String

Definition at line 1447 of file Sanitizer.php.

References $matches, decodeChar(), and decodeEntity().

static Sanitizer::decodeEntity ( name) [static]

If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the UTF-8 encoding of that character.

Otherwise, returns pseudo-entity source (eg "&foo;")

Parameters:
$name String
Returns:
String

Definition at line 1482 of file Sanitizer.php.

References $name, and codepointToUtf8().

Referenced by decodeCharReferencesCallback().
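
The lookup-or-pass-through behaviour described above can be sketched as follows. This is a Python illustration, not the PHP in Sanitizer.php; the standard library table html.entities.name2codepoint stands in for the private $htmlEntities list (an assumption), and, like HTML 4, it does not define apos.

```python
from html.entities import name2codepoint  # HTML 4.01 named entities


def decode_entity(name: str) -> str:
    """Return the character for a known entity name, else the pseudo-entity source."""
    if name in name2codepoint:
        return chr(name2codepoint[name])
    # Unknown names are handed back untouched, e.g. "&foo;".
    return f"&{name};"
```

A known name such as eacute comes back as a single character, while an unknown name such as foo comes back as the literal text &foo;.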

static Sanitizer::decodeTagAttributes ( text) [static]

Return an associative array of attribute names and values from a partial tag string.

Attribute names are forced to lowercase, and character references are decoded to UTF-8 text.

Parameters:
$text String
Returns:
Array

Definition at line 1184 of file Sanitizer.php.

References $attribs, $value, array(), as, decodeCharReferences(), and getTagAttributeCallback().

Referenced by CoreParserFunctions\displaytitle(), fixTagAttributes(), Linker\makeKnownLinkObj(), SanitizerTest\testDecodeTagAttributes(), and validateTag().
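
A rough Python sketch of this kind of attribute parsing is shown below. The pattern is a deliberately simplified stand-in for the regex returned by getAttribsRegex() (an assumption, the real one is more lenient), and giving a valueless attribute its own name as the value follows the fixTagAttributes() description rather than anything stated for this method.

```python
import html
import re

# Simplified, hypothetical attribute pattern; not the real attribs regex.
ATTR_RE = re.compile(
    r"""
    ([A-Za-z_:][A-Za-z0-9_:.-]*)        # attribute name
    (?: \s*=\s*
        (?: "([^"]*)" | '([^']*)' | ([^\s"'>]+) )   # double-quoted, single-quoted or bare value
    )?
    """,
    re.VERBOSE,
)


def decode_tag_attributes(text: str) -> dict:
    """Parse a partial tag string into {lowercased name: decoded value}."""
    attribs = {}
    for match in ATTR_RE.finditer(text):
        name = match.group(1).lower()
        # Take whichever value group matched; fall back to the name itself
        # for attributes written without a value (assumption).
        value = next((g for g in match.groups()[1:] if g is not None), name)
        attribs[name] = html.unescape(value)
    return attribs
```

For example, the input `CLASS="a b" title='x &amp; y' width=10 disabled` yields the keys class, title, width and disabled, with the title value decoded to `x & y`.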

static Sanitizer::encodeAttribute ( text) [static]

Encode an attribute value for HTML output.

Parameters:
$text String
Returns:
HTML-encoded text fragment

Definition at line 1021 of file Sanitizer.php.

References array().

Referenced by Xml\expandAttributes(), and safeEncodeAttribute().

static Sanitizer::escapeClass ( class) [static]

Given a value, escape it so that it can be used as a CSS class and return it.

Todo:
For extra validity, input should be validated UTF-8.
See also:
http://www.w3.org/TR/CSS21/syndata.html Valid characters/format
Parameters:
$class String
Returns:
String

Definition at line 1144 of file Sanitizer.php.

References array().

Referenced by ChangeTags\formatSummaryRow(), SpecialStatistics\getGroupStats(), Skin\getPageClasses(), SkinTemplate\outputPage(), EnhancedChangesList\recentChangesBlockGroup(), EnhancedChangesList\recentChangesBlockLine(), and OldChangesList\recentChangesLine().

static Sanitizer::escapeHtmlAllowEntities ( html) [static]

Given HTML input, escape with htmlspecialchars but un-escape entities.

This allows (generally harmless) entities like &#160; to survive.

Parameters:
string $html to escape
Returns:
String: escaped input

Definition at line 1159 of file Sanitizer.php.

References $html, and decodeCharReferences().

Referenced by Linker\formatComment(), and wfMsgExt().

static Sanitizer::escapeId ( id, options = array() ) [static]

Given a value, escape it so that it can be used in an id attribute and return it.

This will use HTML5 validation if $wgExperimentalHtmlIds is true, allowing anything but ASCII whitespace. Otherwise it will use HTML 4 rules, which means a narrow subset of ASCII, with bad characters escaped with lots of dots.

To ensure we don't have to bother escaping anything, we also strip ', ", & even if $wgExperimentalHtmlIds is true. TODO: Is this the best tactic? We also strip # because it upsets IE, and % because it could be ambiguous if it's part of something that looks like a percent escape (which doesn't work reliably in fragments cross-browser).

See also:
http://www.w3.org/TR/html401/types.html#type-name Valid characters in the id and name attributes
http://www.w3.org/TR/html401/struct/links.html#h-12.2.3 Anchors with the id attribute
http://www.whatwg.org/html/elements.html#the-id-attribute HTML5 definition of id attribute
Parameters:
string $id id to escape
$options Mixed: string or array of strings (default is array()):
  • 'noninitial': This is a non-initial fragment of an id, not a full id, so don't pay attention if the first character isn't valid at the beginning of an id. Only matters if $wgExperimentalHtmlIds is false.
  • 'legacy': Behave the way the old HTML 4-based ID escaping worked even if $wgExperimentalHtmlIds is used, so we can generate extra anchors and links won't break.
Returns:
String

Definition at line 1100 of file Sanitizer.php.

References $options, array(), decodeCharReferences(), and global.

Referenced by Skin\addToSidebarPlain(), MonoBookTemplate\customBox(), Title\escapeFragmentForURL(), SpecialListGroupRights\execute(), VectorTemplate\execute(), Parser\guessSectionNameFromWikiText(), InfoAction\makeHeader(), CologneBlueTemplate\quickBar(), and validateAttributes().
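
To make the "escaped with lots of dots" behaviour of the HTML 4 mode concrete, here is a minimal Python sketch. It is an approximation under stated assumptions, not the PHP implementation: spaces become underscores, every UTF-8 byte outside a small safe set is written as a dot plus two hex digits, and an "x" is prepended when the result does not start with a letter. The exact safe set and the "x" prefix are assumptions of this sketch; the HTML5 mode is not covered.

```python
import html


def escape_id_html4(raw: str, noninitial: bool = False) -> str:
    """Approximate HTML 4 style id escaping (illustrative only)."""
    text = html.unescape(raw.replace(" ", "_"))
    pieces = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if byte < 128 and (char.isalnum() or char in "-_.:"):
            pieces.append(char)             # safe ASCII is kept as-is
        else:
            pieces.append(f".{byte:02X}")   # everything else becomes .XX
    escaped = "".join(pieces)
    # A full id must start with a letter; fragments ('noninitial') need not.
    if not noninitial and not escaped[:1].isalpha():
        escaped = "x" + escaped
    return escaped
```

With this sketch, "Foo bar" becomes "Foo_bar", and "2010 results" becomes "x2010_results" unless noninitial is set.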

static Sanitizer::fixTagAttributes ( text, element ) [static]

Take a tag soup fragment listing an HTML element's attributes and normalize it to well-formed XML, discarding unwanted attributes.

Output is safe for further wikitext processing, with escaping of values that could trigger problems.

  • Normalizes attribute names to lowercase
  • Discards attributes not on a whitelist for the given element
  • Turns broken or invalid entities into plaintext
  • Double-quotes all attribute values
  • Attributes without values are given their own name as the value
  • Duplicate attributes are discarded
  • Unsafe style attributes are discarded
  • Prepends space if there are attributes.
Parameters:
$text String
$element String
Returns:
String

Definition at line 1005 of file Sanitizer.php.

References decodeTagAttributes(), safeEncodeTagAttributes(), and validateTagAttributes().

Referenced by removeHTMLtags(), SanitizerTest\testAttributeSupport(), and SanitizerTest\testDeprecatedAttributesUnaltered().

static Sanitizer::getAttribsRegex ( ) [static]

Regular expression to match HTML/XML attribute pairs within a tag.

Allows some... latitude. Used in Sanitizer::fixTagAttributes and Sanitizer::decodeTagAttributes.

Definition at line 332 of file Sanitizer.php.

References $attribsRegex.

static Sanitizer::getTagAttributeCallback ( set) [static, private]

Pick the appropriate attribute value from a match set from the attribs regex matches.

Parameters:
$set Array
Exceptions:
MWException
Returns:
String

Definition at line 1239 of file Sanitizer.php.

Referenced by decodeTagAttributes().

static Sanitizer::hackDocType ( ) [static]

Hack up a private DOCTYPE with HTML's standard entity declarations.

PHP 4 seemed to know these if you gave it an HTML doctype, but PHP 5.1 doesn't.

Use for passing XHTML fragments to PHP's XML parsing functions

Returns:
String

Definition at line 1754 of file Sanitizer.php.

References $out, and as.

static Sanitizer::hexCharReference ( codepoint) [static]
Parameters:
$codepoint
Returns:
null|string

Definition at line 1382 of file Sanitizer.php.

References validateCodepoint().

Referenced by normalizeCharReferencesCallback().

static Sanitizer::mergeAttributes ( a, b ) [static]

Merge two sets of HTML attributes.

Conflicting items in the second set will override those in the first, except for 'class' attributes which will be combined (if they're both strings).

Todo:
implement merging for other attributes such as style
Parameters:
$a Array
$b Array
Returns:
array

Definition at line 808 of file Sanitizer.php.

References $out.

Referenced by Linker\linkAttribs(), Linker\makeKnownLinkObj(), and TraditionalImageGallery\toHTML().
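
The merge rule can be illustrated with a short Python sketch: later values win, except that two string 'class' values are combined. Dropping repeated class names while keeping their order is an assumption of this sketch and is not stated in the description above.

```python
def merge_attributes(a: dict, b: dict) -> dict:
    """Merge two attribute sets; b overrides a, but string classes combine."""
    out = {**a, **b}
    if isinstance(a.get("class"), str) and isinstance(b.get("class"), str):
        classes = (a["class"] + " " + b["class"]).split()
        # dict.fromkeys keeps first-seen order while removing duplicates.
        out["class"] = " ".join(dict.fromkeys(classes))
    return out
```

Merging {'class': 'a b', 'id': 'x'} with {'class': 'b c', 'id': 'y'} gives the class 'a b c' and the id 'y'.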

static Sanitizer::normalizeAttributeValue ( text) [static, private]

Normalize whitespace and character references in an XML source-encoded text for an attribute value.

See http://www.w3.org/TR/REC-xml/#AVNormalize for background, but note that we're not returning the value, but are returning XML source fragments that will be slapped into output.

Parameters:
$text String
Returns:
String

Definition at line 1272 of file Sanitizer.php.

References normalizeCharReferences().

static Sanitizer::normalizeCharReferences ( text) [static]

Ensure that any entities and character references are legal for XML and XHTML specifically.

Any stray bits will be &-escaped to result in a valid text fragment.

  a. named char refs can only be &lt; &gt; &amp; &quot;; others are numericized (this way we're well-formed even without a DTD)
  b. any numeric char refs must be legal chars, not invalid or forbidden
  c. use lower cased "&#x", not "&#X"
  d. fix or reject non-valid attributes

Parameters:
$text String
Returns:
String
Access:
private

Definition at line 1316 of file Sanitizer.php.

References array().

Referenced by CoreParserFunctions\displaytitle(), and normalizeAttributeValue().
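
Rules a to c can be sketched in Python as a single regex substitution. This is an illustration under assumptions, not the PHP code: html.entities.name2codepoint stands in for the private entity table, MediaWiki's entity aliases and the \x80-\xff byte range of CHAR_REFS_REGEX are left out, and rule d is not covered.

```python
import html
import re
from html.entities import name2codepoint

# Named ref | decimal ref | hex ref | stray ampersand (simplified pattern).
CHAR_REFS = re.compile(r"&([A-Za-z0-9]+);|&#([0-9]+);|&#[xX]([0-9A-Fa-f]+);|(&)")
CORE_ENTITIES = {"lt", "gt", "amp", "quot"}


def _is_valid_xml_char(cp: int) -> bool:
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def _normalize(match: re.Match) -> str:
    name, dec, hexa, _stray = match.groups()
    if name is not None:
        if name in CORE_ENTITIES:
            return f"&{name};"                      # rule a: keep the core four
        if name in name2codepoint:
            return f"&#{name2codepoint[name]};"     # rule a: numericize the rest
    elif dec is not None and _is_valid_xml_char(int(dec)):
        return f"&#{int(dec)};"                     # rule b: legal decimal ref
    elif hexa is not None and _is_valid_xml_char(int(hexa, 16)):
        return f"&#x{int(hexa, 16):x};"             # rule c: lower-case "&#x"
    # Anything else is a stray bit and gets &-escaped.
    return html.escape(match.group(0), quote=False)


def normalize_char_references(text: str) -> str:
    return CHAR_REFS.sub(_normalize, text)
```

Under these assumptions, `&eacute;` becomes `&#233;`, `&#X41;` becomes `&#x41;`, and both `&bogus;` and a bare `&` have their ampersand escaped.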

static Sanitizer::normalizeCharReferencesCallback ( matches) [static]
Parameters:
$matches String
Returns:
String

Definition at line 1326 of file Sanitizer.php.

References $matches, $ret, decCharReference(), hexCharReference(), and normalizeEntity().

static Sanitizer::normalizeCss ( value) [static]

Normalize CSS into a format we can easily search for hostile input.

  • decode character references
  • decode escape sequences
  • convert characters that IE6 interprets into ascii
  • remove comments, unless the entire value is one single comment
Parameters:
string $value the css string
Returns:
string normalized css

Definition at line 830 of file Sanitizer.php.

References $matches, $value, array(), decodeCharReferences(), StringUtils\delimiterReplace(), and utf8ToCodepoint().

Referenced by checkCss(), and UploadBase\checkSvgScriptCallback().

static Sanitizer::normalizeEntity ( name) [static]

If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the equivalent numeric entity reference (except for the core &lt; &gt; &amp; &quot;).

If the entity is a MediaWiki-specific alias, returns the HTML equivalent. Otherwise, returns HTML-escaped text of pseudo-entity source (eg &amp;foo;)

Parameters:
$name String
Returns:
String

Definition at line 1352 of file Sanitizer.php.

References $name, and array().

Referenced by normalizeCharReferencesCallback().

static Sanitizer::normalizeSectionNameWhitespace ( section) [static]

Normalizes whitespace in a section name, such as might be returned by Parser::stripSectionName(), for use in the ids that are used for section links.

Parameters:
$section String
Returns:
String

Definition at line 1297 of file Sanitizer.php.

References $section.

Referenced by ApiFeedWatchlist\createFeedItem(), Linker\formatAutocommentsCallback(), and Parser\guessSectionNameFromWikiText().

static Sanitizer::normalizeWhitespace ( text) [static, private]
Parameters:
$text string
Returns:
mixed

Definition at line 1282 of file Sanitizer.php.

Referenced by stripAllTags().

static Sanitizer::removeHTMLcomments ( text) [static]

Remove '<!--', '-->', and everything between.

To avoid leaving blank lines, when a comment is both preceded and followed by a newline (ignoring spaces), trim leading and trailing spaces and one of the newlines.

Access:
private
Parameters:
$text String
Returns:
string

Definition at line 607 of file Sanitizer.php.

References wfProfileIn(), and wfProfileOut().

Referenced by removeHTMLtags().
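
The blank-line rule is easiest to see in a small example. The Python sketch below uses one regex with two alternatives; it is a simplification that only removes properly terminated comments, and the names are hypothetical rather than taken from Sanitizer.php.

```python
import re

# Comment body that cannot run past the first "-->".
_BODY = r"(?:(?!-->).)*"

COMMENT_RE = re.compile(
    rf"(?<=\n) *<!--{_BODY}--> *\n"   # comment alone on its line: drop spaces and one newline
    rf"|<!--{_BODY}-->",              # any other comment: drop only the comment
    re.DOTALL,
)


def remove_html_comments(text: str) -> str:
    return COMMENT_RE.sub("", text)
```

A comment that sits on its own line disappears together with that line, so "a", newline, an indented comment, newline, "b" collapses to "a", newline, "b". A comment in the middle of a line leaves its surrounding spaces in place.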

static Sanitizer::removeHTMLtags ( text, processCallback = null, args = array(), extratags = array(), removetags = array() ) [static]

Cleans up HTML, removes dangerous tags and attributes, and removes HTML comments.

Access:
private
Parameters:
$text String
$processCallback Callback to do any variable or parameter replacements in HTML attribute values
array $args for the processing callback
array $extratags for any extra tags to include
array $removetags for any tags (default or extra) to exclude
Returns:
string

Definition at line 366 of file Sanitizer.php.

References $params, $t, $vars, array(), as, fixTagAttributes(), global, list, removeHTMLcomments(), validateTag(), wfProfileIn(), wfProfileOut(), wfRestoreWarnings(), and wfSuppressWarnings().

Referenced by CoreParserFunctions\displaytitle(), SanitizerTest\testRemoveHTMLtags(), and SanitizerTest\testRemovehtmltagsOnHtml5Tags().

static Sanitizer::safeEncodeAttribute ( text) [static]

Encode an attribute value for HTML tags, with extra armoring against further wiki processing.

Parameters:
$text String
Returns:
HTML-encoded text fragment

Definition at line 1042 of file Sanitizer.php.

References array(), encodeAttribute(), and wfUrlProtocols().

Referenced by safeEncodeTagAttributes().

static Sanitizer::safeEncodeTagAttributes ( assoc_array) [static]

Build a partial tag string from an associative array of attribute names and values as returned by decodeTagAttributes.

Parameters:
$assoc_array Array
Returns:
String

Definition at line 1220 of file Sanitizer.php.

References $attribs, $value, array(), as, and safeEncodeAttribute().

Referenced by CoreParserFunctions\displaytitle(), and fixTagAttributes().

static Sanitizer::setupAttributeWhitelist ( ) [static]

For each array key (an allowed HTML element), return an array of allowed attributes.

Returns:
Array

Definition at line 1511 of file Sanitizer.php.

References array(), and global.

Referenced by attributeWhitelist().

static Sanitizer::stripAllTags ( text) [static]

Take a fragment of (potentially invalid) HTML and return a version with any tags removed, encoded as plain text.

Warning: this return value must be further escaped for literal inclusion in HTML output as of 1.10!

Parameters:
string $text HTML fragment
Returns:
String

Definition at line 1734 of file Sanitizer.php.

References decodeCharReferences(), StringUtils\delimiterReplace(), and normalizeWhitespace().

Referenced by MWDebug\appendDebugInfoToApiResult(), and CoreParserFunctions\displaytitle().
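
The warning above matters because decoding happens after the tags are removed. The Python sketch below mirrors the three referenced steps (strip delimited tags, decode character references, normalize whitespace) in simplified form. The tag regex is a stand-in for StringUtils::delimiterReplace(), and the whitespace step is a guess at what the undocumented normalizeWhitespace() does; both are assumptions.

```python
import html
import re


def strip_all_tags(text: str) -> str:
    """Return plain text; the result is NOT safe to embed in HTML as-is."""
    text = re.sub(r"<[^>]*>", "", text)           # drop anything tag-shaped
    text = html.unescape(text)                    # decode character references
    return re.sub(r"\r\n|[ \r\n\t]", " ", text)   # each whitespace run member becomes a space


plain = strip_all_tags("<b>Tom &amp; Jerry</b>\n<i>&lt;3</i>")
safe_for_html = html.escape(plain)                # escape again before output
```

Here plain contains a literal & and <, which is why the value has to be escaped again before it is written into a page.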

static Sanitizer::validateAttributes ( attribs, whitelist ) [static]

Take an array of attribute names and values and normalize or discard illegal values for the given whitelist.

  • Discards attributes not on the given whitelist
  • Unsafe style attributes are discarded
  • Invalid id attributes are re-encoded
Parameters:
$attribs Array
array $whitelist list of allowed attribute names
Returns:
Array
Todo:

Check for legal values where the DTD limits things.

Check for unique id attribute :P

Definition at line 712 of file Sanitizer.php.

References $attribs, $out, $value, array(), as, checkCss(), escapeId(), global, and wfUrlProtocols().

Referenced by validateTagAttributes().

static Sanitizer::validateCodepoint ( codepoint) [static, private]

Returns true if a given Unicode codepoint is a valid character in XML.

Parameters:
$codepoint Integer
Returns:
Boolean

Definition at line 1396 of file Sanitizer.php.

Referenced by decCharReference(), decodeChar(), and hexCharReference().
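
For reference, "valid character in XML" can be read as the Char production of XML 1.0. The Python sketch below encodes that production and shows how a caller such as decodeChar() could fall back to U+FFFD. Treating the check as exactly the XML 1.0 Char ranges is an assumption about the PHP code, and the function names are hypothetical.

```python
REPLACEMENT_CHARACTER = "\ufffd"


def is_valid_xml_codepoint(cp: int) -> bool:
    """XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def decode_char(cp: int) -> str:
    """Return the character for a valid codepoint, else U+FFFD."""
    return chr(cp) if is_valid_xml_codepoint(cp) else REPLACEMENT_CHARACTER
```

Control characters such as U+0000, surrogates such as U+D800, and the non-characters U+FFFE and U+FFFF all fail the check.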

static Sanitizer::validateEmail ( addr) [static]

Does a string look like an e-mail address?

This validates an email address using the HTML5 specification found at http://www.whatwg.org/html/states-of-the-type-attribute.html#valid-e-mail-address which, as of 2011-01-24, says:

A valid e-mail address is a string that matches the ABNF production 1*( atext / "." ) "@" ldh-str *( "." ldh-str ) where atext is defined in RFC 5322 section 3.2.3, and ldh-str is defined in RFC 1034 section 3.5.

This function is an implementation of the specification as requested in bug 22449.

Client-side forms will use the same standard validation rules via JS or HTML 5 validation; additional restrictions can be enforced server-side by extensions via the 'isValidEmailAddr' hook.

Note that this validation doesn't 100% match RFC 2822, but is believed to be liberal enough for wide use. Some invalid addresses will still pass validation here.

Since:
1.18
Parameters:
string $addr E-mail address
Returns:
Bool

Definition at line 1846 of file Sanitizer.php.

References $result, array(), and wfRunHooks().

Referenced by Autopromote\checkCondition(), SanitizerValidateEmailTest\checkEmail(), ApiCreateAccount\execute(), EmailConfirmation\execute(), and WebInstaller_Name\submit().
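
The ABNF production quoted above translates fairly directly into a regular expression. The Python sketch below is one such translation, with atext taken from RFC 5322 section 3.2.3; it only illustrates the grammar and leaves out the 'isValidEmailAddr' hook, so it should not be read as the PHP implementation.

```python
import re

# atext per RFC 5322 section 3.2.3; the hyphen is last so it stays literal.
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# 1*( atext / "." ) "@" ldh-str *( "." ldh-str )
EMAIL_RE = re.compile(rf"[.{ATEXT}]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*")


def looks_like_email(addr: str) -> bool:
    return EMAIL_RE.fullmatch(addr) is not None
```

As the description warns, the grammar is liberal: a single-label domain such as a@b passes, while an address with a space or an empty domain label does not.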

static Sanitizer::validateTag ( params, element ) [static]

Takes attribute names and values for a tag and the tag name and validates that the tag is allowed to be present.

This DOES NOT validate the attributes, nor does it validate the tags themselves. This method only handles the special circumstances where we may want to allow a tag within content but ONLY when it has specific attributes set.

Parameters:
$params
$element
Returns:
bool

Definition at line 656 of file Sanitizer.php.

References $params, and decodeTagAttributes().

Referenced by removeHTMLtags().

static Sanitizer::validateTagAttributes ( attribs, element ) [static]

Take an array of attribute names and values and normalize or discard illegal values for the given element type.

  • Discards attributes not on a whitelist for the given element
  • Unsafe style attributes are discarded
  • Invalid id attributes are re-encoded
Parameters:
$attribs Array
$element String
Returns:
Array
Todo:

Check for legal values where the DTD limits things.

Check for unique id attribute :P

Definition at line 692 of file Sanitizer.php.

References $attribs, attributeWhitelist(), and validateAttributes().

Referenced by fixTagAttributes(), and CoreTagHooks\pre().


Member Data Documentation

Sanitizer::$attribsRegex [static, private]

Lazy-initialised attributes regex, see getAttribsRegex()

Definition at line 325 of file Sanitizer.php.

Referenced by getAttribsRegex().

Sanitizer::$htmlEntities [static, private]

List of all named character entities defined in HTML 4.01 (http://www.w3.org/TR/html4/sgml/entities.html), as well as &apos;, which is only defined starting in XHTML1.

Definition at line 58 of file Sanitizer.php.

Sanitizer::$htmlEntityAliases [static, private]
Initial value:
 array(
        'רלמ' => 'rlm',
        'رلم' => 'rlm',
    )

Character entity aliases accepted by MediaWiki.

Definition at line 317 of file Sanitizer.php.

const Sanitizer::CHAR_REFS_REGEX
Initial value:
        '/&([A-Za-z0-9\x80-\xff]+);
         |&\#([0-9]+);
         |&\#[xX]([0-9A-Fa-f]+);
         |(&)/x'

Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.

Definition at line 36 of file Sanitizer.php.

const Sanitizer::EVIL_URI_PATTERN = '!(^|\s|\*/\s*)(javascript|vbscript)([^\w]|$)!i'

Blacklist for evil uris like javascript:

WARNING: DO NOT use this in any place that actually requires blacklisting for security reasons. There are NUMEROUS[1] ways to bypass blacklisting; the only way to be secure from javascript: uri based xss vectors is to whitelist things that you know are safe and deny everything else.

[1]: http://ha.ckers.org/xss.html

Definition at line 50 of file Sanitizer.php.
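
To show both what the pattern catches and why the warning applies, here is the same expression tried from Python. The helper name is hypothetical, and Python's \w is Unicode-aware where PCRE's default is not, so edge cases may differ from the PHP constant.

```python
import re

# Same expression as EVIL_URI_PATTERN, without the PHP "!" delimiters.
EVIL_URI = re.compile(r"(^|\s|\*/\s*)(javascript|vbscript)([^\w]|$)", re.IGNORECASE)


def looks_evil(value: str) -> bool:
    return EVIL_URI.search(value) is not None
```

The pattern flags "javascript:alert(1)" in any letter case, yet a tab placed inside the scheme name ("java", tab, "script:") slips past it. That kind of bypass is the reason to rely on a whitelist rather than on this blacklist.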

const Sanitizer::XMLNS_ATTRIBUTE_PATTERN = "/^xmlns:[:A-Z_a-z-.0-9]+$/"

Definition at line 51 of file Sanitizer.php.


The documentation for this class was generated from the following file: