MediaWiki  REL1_22
UtfNormalUtil.php File Reference

Some of these functions are adapted from places in MediaWiki.


Functions

 codepointToUtf8 ($codepoint)
 Return UTF-8 sequence for a given Unicode code point.
 escapeSingleString ($string)
 Escape a string for inclusion in a PHP single-quoted string literal.
 hexSequenceToUtf8 ($sequence)
 Take a series of space-separated hexadecimal numbers representing Unicode code points and return a UTF-8 string composed of those characters.
 utf8ToCodepoint ($char)
 Determine the Unicode code point of a single-character UTF-8 sequence.
 utf8ToHexSequence ($str)
 Take a UTF-8 string and return a space-separated series of hex numbers representing Unicode code points.

Detailed Description

Some of these functions are adapted from places in MediaWiki.

Should probably merge them for consistency.

Copyright © 2004 Brion Vibber <[email protected]> http://www.mediawiki.org/

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. http://www.gnu.org/copyleft/gpl.html

Definition in file UtfNormalUtil.php.


Function Documentation

codepointToUtf8 ( $codepoint )

Return UTF-8 sequence for a given Unicode code point.

May die if fed out-of-range data.

Parameters:
$codepoint: Integer
Returns:
String
Access:
public

Definition at line 36 of file UtfNormalUtil.php.

Referenced by Sanitizer\cssDecodeCallback(), Sanitizer\decodeChar(), Sanitizer\decodeEntity(), GenerateCollationData\generateFirstChars(), hexSequenceToUtf8(), EnhancedChangesList\spacerArrow(), and CleanUpTest\XtestAllChars().
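
The encoding step described above can be sketched as follows. This is a minimal illustration of standard UTF-8 encoding, not the code from UtfNormalUtil.php; the name sketchCodepointToUtf8 is hypothetical, and the sketch throws an exception on out-of-range input where the documented function may die.

```php
<?php
// Illustrative sketch only: sketchCodepointToUtf8 is a hypothetical name,
// not the function defined in UtfNormalUtil.php.
function sketchCodepointToUtf8( $codepoint ) {
	if ( $codepoint < 0 || $codepoint > 0x10FFFF ) {
		// Assumption: throw instead of terminating the script.
		throw new InvalidArgumentException( "Code point out of range: $codepoint" );
	}
	if ( $codepoint < 0x80 ) {
		// One byte: 0xxxxxxx
		return chr( $codepoint );
	}
	if ( $codepoint < 0x800 ) {
		// Two bytes: 110xxxxx 10xxxxxx
		return chr( 0xC0 | ( $codepoint >> 6 ) ) .
			chr( 0x80 | ( $codepoint & 0x3F ) );
	}
	if ( $codepoint < 0x10000 ) {
		// Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
		return chr( 0xE0 | ( $codepoint >> 12 ) ) .
			chr( 0x80 | ( ( $codepoint >> 6 ) & 0x3F ) ) .
			chr( 0x80 | ( $codepoint & 0x3F ) );
	}
	// Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
	return chr( 0xF0 | ( $codepoint >> 18 ) ) .
		chr( 0x80 | ( ( $codepoint >> 12 ) & 0x3F ) ) .
		chr( 0x80 | ( ( $codepoint >> 6 ) & 0x3F ) ) .
		chr( 0x80 | ( $codepoint & 0x3F ) );
}

// U+20AC (euro sign) encodes to the three bytes E2 82 AC.
echo bin2hex( sketchCodepointToUtf8( 0x20AC ) ), "\n"; // e282ac
```

The sketch does not reject surrogate code points (U+D800 to U+DFFF); whether the documented function does is not stated here.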

escapeSingleString ( $string )

Escape a string for inclusion in a PHP single-quoted string literal.

Parameters:
string $string: string to be escaped.
Returns:
String: escaped string.
Access:
public

Definition at line 134 of file UtfNormalUtil.php.

References array().
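
As a rough illustration of the escaping described above: inside a PHP single-quoted literal only the backslash and the single quote need escaping. The function name sketchEscapeSingleString is hypothetical and the body is a sketch under that assumption, not necessarily the code in UtfNormalUtil.php.

```php
<?php
// Illustrative sketch only: sketchEscapeSingleString is a hypothetical name.
function sketchEscapeSingleString( $string ) {
	// strtr() with an array replaces each key in a single pass, so
	// backslashes added for quotes are not escaped a second time.
	return strtr( $string, array(
		'\\' => '\\\\', // backslash    -> two backslashes
		'\'' => '\\\''  // single quote -> backslash + single quote
	) );
}

// The input is: It's a \ test
$literal = "'" . sketchEscapeSingleString( "It's a \\ test" ) . "'";
echo $literal, "\n"; // 'It\'s a \\ test'
```

Evaluating the generated literal as PHP source should give back the original string, which is what makes this kind of escaping suitable for code generation.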

hexSequenceToUtf8 ( $sequence )

Take a series of space-separated hexadecimal numbers representing Unicode code points and return a UTF-8 string composed of those characters.

Used by UTF-8 data generation and testing routines.

Parameters:
$sequence: String
Returns:
String
Access:
private

Definition at line 61 of file UtfNormalUtil.php.

References $n, as, and codepointToUtf8().

Referenced by GenerateNormalizerData\generateArabic(), and GenerateNormalizerData\generateMalayalam().
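
A minimal sketch of the conversion described above: split the input on spaces, parse each token as hexadecimal, and append the UTF-8 encoding of each code point. Both function names are hypothetical, the helper stands in for codepointToUtf8(), and the input is assumed to contain only valid code points separated by single spaces.

```php
<?php
// Hypothetical stand-in for codepointToUtf8(); assumes 0 <= $cp <= 0x10FFFF.
function sketchEncodeCodepoint( $cp ) {
	if ( $cp < 0x80 ) {
		return chr( $cp );
	}
	if ( $cp < 0x800 ) {
		return chr( 0xC0 | ( $cp >> 6 ) ) . chr( 0x80 | ( $cp & 0x3F ) );
	}
	if ( $cp < 0x10000 ) {
		return chr( 0xE0 | ( $cp >> 12 ) ) .
			chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) ) .
			chr( 0x80 | ( $cp & 0x3F ) );
	}
	return chr( 0xF0 | ( $cp >> 18 ) ) .
		chr( 0x80 | ( ( $cp >> 12 ) & 0x3F ) ) .
		chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) ) .
		chr( 0x80 | ( $cp & 0x3F ) );
}

// Illustrative sketch only: sketchHexSequenceToUtf8 is a hypothetical name.
function sketchHexSequenceToUtf8( $sequence ) {
	$utf = '';
	foreach ( explode( ' ', $sequence ) as $hex ) {
		// hexdec() parses the token as a hexadecimal number.
		$utf .= sketchEncodeCodepoint( hexdec( $hex ) );
	}
	return $utf;
}

echo sketchHexSequenceToUtf8( '48 69' ), "\n"; // Hi
```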

utf8ToCodepoint ( $char )

Determine the Unicode code point of a single-character UTF-8 sequence.

Does not check for invalid input data.

Parameters:
$char: String
Returns:
Integer
Access:
public

Definition at line 94 of file UtfNormalUtil.php.

Referenced by IcuCollation\getFirstLetter(), Sanitizer\normalizeCss(), and utf8ToHexSequence().
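
The decoding described above can be sketched as follows. The lead byte gives the sequence length and the high bits of the code point, and each continuation byte contributes six more bits. The name sketchUtf8ToCodepoint is hypothetical; like the documented function, the sketch does not validate its input, so malformed sequences produce meaningless results.

```php
<?php
// Illustrative sketch only: sketchUtf8ToCodepoint is a hypothetical name.
// Assumes $char holds exactly one well-formed UTF-8 character.
function sketchUtf8ToCodepoint( $char ) {
	$lead = ord( $char[0] );
	if ( $lead < 0x80 ) {
		// Single-byte (ASCII) character.
		return $lead;
	}
	if ( ( $lead & 0xE0 ) === 0xC0 ) {
		// 110xxxxx: one continuation byte follows.
		$cp = $lead & 0x1F;
		$extra = 1;
	} elseif ( ( $lead & 0xF0 ) === 0xE0 ) {
		// 1110xxxx: two continuation bytes follow.
		$cp = $lead & 0x0F;
		$extra = 2;
	} else {
		// 11110xxx assumed: three continuation bytes follow.
		$cp = $lead & 0x07;
		$extra = 3;
	}
	for ( $i = 1; $i <= $extra; $i++ ) {
		// Each continuation byte (10xxxxxx) carries six payload bits.
		$cp = ( $cp << 6 ) | ( ord( $char[$i] ) & 0x3F );
	}
	return $cp;
}

// The bytes E2 82 AC are the UTF-8 encoding of U+20AC (euro sign).
printf( "%X\n", sketchUtf8ToCodepoint( "\xE2\x82\xAC" ) ); // 20AC
```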

utf8ToHexSequence ( $str )

Take a UTF-8 string and return a space-separated series of hex numbers representing Unicode code points.

For debugging.

Parameters:
string $str: UTF-8 string.
Returns:
string
Access:
private

Definition at line 78 of file UtfNormalUtil.php.

References as, and utf8ToCodepoint().

Referenced by Digit2Html\execute().
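
A minimal sketch of the debugging dump described above: split the string into characters, decode each one, and join the hex values with spaces. Both function names are hypothetical, the helper stands in for utf8ToCodepoint(), and the four-digit lowercase output format is an assumption rather than something this page specifies. The input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical stand-in for utf8ToCodepoint(); no validation is performed.
function sketchDecodeChar( $char ) {
	$lead = ord( $char[0] );
	if ( $lead < 0x80 ) {
		return $lead;
	}
	// Keep only the payload bits of the lead byte, based on its prefix.
	if ( ( $lead & 0xE0 ) === 0xC0 ) {
		$cp = $lead & 0x1F;
	} elseif ( ( $lead & 0xF0 ) === 0xE0 ) {
		$cp = $lead & 0x0F;
	} else {
		$cp = $lead & 0x07;
	}
	// Every remaining byte is a continuation byte with six payload bits.
	for ( $i = 1; $i < strlen( $char ); $i++ ) {
		$cp = ( $cp << 6 ) | ( ord( $char[$i] ) & 0x3F );
	}
	return $cp;
}

// Illustrative sketch only: sketchUtf8ToHexSequence is a hypothetical name.
function sketchUtf8ToHexSequence( $str ) {
	$parts = array();
	// The /u modifier makes the empty pattern split between whole
	// UTF-8 characters instead of between individual bytes.
	foreach ( preg_split( '//u', $str, -1, PREG_SPLIT_NO_EMPTY ) as $char ) {
		// Assumed format: at least four lowercase hex digits per code point.
		$parts[] = sprintf( '%04x', sketchDecodeChar( $char ) );
	}
	return implode( ' ', $parts );
}

// "A", U+00E9 and U+20AC, written as raw bytes to avoid source-encoding issues.
echo sketchUtf8ToHexSequence( "A\xC3\xA9\xE2\x82\xAC" ), "\n"; // 0041 00e9 20ac
```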