(PHP 4, PHP 5)
htmlentities — Convert all applicable characters to HTML entities
This function is identical to htmlspecialchars() in all ways, except that htmlentities() translates every character which has an HTML character entity equivalent into that entity.
To decode instead (the reverse operation), use html_entity_decode().
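As a rough illustration of that difference, the sketch below runs the same input through htmlspecialchars(), htmlentities(), and html_entity_decode(). The variable names are only illustrative, and the flags and charset are passed explicitly (an assumption made here so the result does not depend on the defaults of a particular PHP version).

```php
<?php
// "caf\xC3\xA9" is "café" encoded as UTF-8; the escape keeps this file ASCII-only.
$input = "caf\xC3\xA9 <b>";

// htmlspecialchars() converts only the HTML-special characters (<, >, &, quotes).
$special = htmlspecialchars($input, ENT_COMPAT, "UTF-8");

// htmlentities() additionally replaces the accented letter with its named entity.
$entities = htmlentities($input, ENT_COMPAT, "UTF-8");

// html_entity_decode() reverses the conversion.
$decoded = html_entity_decode($entities, ENT_COMPAT, "UTF-8");

echo $special, "\n";   // café &lt;b&gt;
echo $entities, "\n";  // caf&eacute; &lt;b&gt;
echo $decoded, "\n";   // café <b>
```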
The input string.
Like htmlspecialchars(), the optional second flags parameter lets you define what will be done with 'single' and "double" quotes. It takes one of three constants, defaulting to ENT_COMPAT, optionally combined with a fourth, ENT_IGNORE (since PHP 5.3.0):
Constant Name | Description |
---|---|
ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
ENT_QUOTES | Will convert both double and single quotes. |
ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
ENT_IGNORE | Silently discard invalid code unit sequences instead of returning an empty string. Added in PHP 5.3.0. This is provided for backwards compatibility; avoid using it as it may have security implications. |
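The three quote-handling constants in the table can be compared side by side. This is a minimal sketch with illustrative variable names; the charset is passed explicitly as an assumption so the result does not depend on the configured default.

```php
<?php
$input = "'single' and \"double\"";

// ENT_COMPAT: only double quotes are converted.
$compat = htmlentities($input, ENT_COMPAT, "UTF-8");

// ENT_QUOTES: both kinds of quotes are converted.
$quotes = htmlentities($input, ENT_QUOTES, "UTF-8");

// ENT_NOQUOTES: neither kind of quote is converted.
$noQuotes = htmlentities($input, ENT_NOQUOTES, "UTF-8");

echo $compat, "\n";    // 'single' and &quot;double&quot;
echo $quotes, "\n";    // &#039;single&#039; and &quot;double&quot;
echo $noQuotes, "\n";  // 'single' and "double"
```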
Like htmlspecialchars(), it takes an optional third argument charset which defines the character set used in the conversion. Presently, the ISO-8859-1 character set is used as the default.
The following character sets are supported in PHP 4.3.0 and later:
Charset | Aliases | Description |
---|---|---|
ISO-8859-1 | ISO8859-1 | Western European, Latin-1 |
ISO-8859-15 | ISO8859-15 | Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1(ISO-8859-1). |
UTF-8 | | ASCII compatible multi-byte 8-bit Unicode. |
cp866 | ibm866, 866 | DOS-specific Cyrillic charset. Supported since PHP 4.3.2. |
cp1251 | Windows-1251, win-1251, 1251 | Windows-specific Cyrillic charset. Supported since PHP 4.3.2. |
cp1252 | Windows-1252, 1252 | Windows-specific charset for Western European. |
KOI8-R | koi8-ru, koi8r | Russian. Supported since PHP 4.3.2. |
BIG5 | 950 | Traditional Chinese, mainly used in Taiwan. |
GB2312 | 936 | Simplified Chinese, national standard character set. |
BIG5-HKSCS | | Big5 with Hong Kong extensions, Traditional Chinese. |
Shift_JIS | SJIS, 932 | Japanese |
EUC-JP | EUCJP | Japanese |
Note: Any other character sets are not recognized and ISO-8859-1 will be used instead.
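The charset argument decides how the input bytes are interpreted, so it should match the actual encoding of the string. The sketch below (variable names are illustrative) feeds the letter "é" to htmlentities() in two encodings, once with the matching charset and once with the wrong one.

```php
<?php
$latin1 = "\xE9";      // "é" as a single ISO-8859-1 byte
$utf8   = "\xC3\xA9";  // "é" as a two-byte UTF-8 sequence

// Matching charset: both inputs yield the same entity.
$okLatin1 = htmlentities($latin1, ENT_COMPAT, "ISO-8859-1");
$okUtf8   = htmlentities($utf8, ENT_COMPAT, "UTF-8");

// UTF-8 bytes read as ISO-8859-1: each byte is treated as its own character.
$mismatch = htmlentities($utf8, ENT_COMPAT, "ISO-8859-1");

// A lone ISO-8859-1 byte is not valid UTF-8, so an empty string is returned.
$invalid = htmlentities($latin1, ENT_COMPAT, "UTF-8");

echo $okLatin1, "\n";  // &eacute;
echo $okUtf8, "\n";    // &eacute;
echo $mismatch, "\n";  // &Atilde;&copy;
var_dump($invalid);    // string(0) ""
```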
When double_encode is turned off, PHP will not encode existing HTML entities. The default is to convert everything.
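The effect of double_encode can be seen on a string that already contains an entity next to a bare ampersand. This is a small sketch with illustrative variable names; flags and charset are passed explicitly so that the fourth argument can be supplied.

```php
<?php
$input = "Tom &amp; Jerry & co";

// Default (double_encode = true): the existing "&amp;" is encoded again.
$encodedTwice = htmlentities($input, ENT_COMPAT, "UTF-8", true);

// double_encode = false: the existing entity is kept, only the bare "&" is converted.
$encodedOnce = htmlentities($input, ENT_COMPAT, "UTF-8", false);

echo $encodedTwice, "\n";  // Tom &amp;amp; Jerry &amp; co
echo $encodedOnce, "\n";   // Tom &amp; Jerry &amp; co
```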
Returns the encoded string.
Version | Description |
---|---|
5.3.0 | The constant ENT_IGNORE was added. |
5.2.3 | The double_encode parameter was added. |
4.1.0 | The charset parameter was added. |
4.0.3 | The flags parameter was added. |
Example #1 An htmlentities() example
<?php
$str = "A 'quote' is <b>bold</b>";
// Outputs: A 'quote' is &lt;b&gt;bold&lt;/b&gt;
echo htmlentities($str);
// Outputs: A &#039;quote&#039; is &lt;b&gt;bold&lt;/b&gt;
echo htmlentities($str, ENT_QUOTES);
?>
Example #2 Usage of ENT_IGNORE
<?php
$str = "\x8F!!!";
// Outputs an empty string
echo htmlentities($str, ENT_QUOTES, "UTF-8");
// Outputs "!!!"
echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
?>