PHP Cross Reference of MediaWiki-1.24.0 |
Unicode normalization routines. Copyright © 2004 Brion Vibber https://www.mediawiki.org/
File Size: | 790 lines (23 kb) |
Included or required: | 6 times |
Referenced: | 0 times |
Includes or requires: | 2 files includes/normal/UtfNormalData.inc includes/normal/UtfNormalDataK.inc |
UtfNormal:: (17 methods):
cleanUp()
toNFC()
toNFD()
toNFKC()
toNFKD()
loadData()
quickIsNFC()
quickIsNFCVerify()
NFC()
NFD()
NFKC()
NFKD()
fastDecompose()
fastCombiningSort()
fastCompose()
placebo()
replaceForNativeNormalize()
cleanUp( $string ) X-Ref |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition. Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. Not as fast as toNFC(). param: string $string a UTF-8 string return: string a clean, shiny, normalized UTF-8 string |
toNFC( $string ) X-Ref |
Convert a UTF-8 string to normal form C, canonical composition. Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form C |
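The "fast return for pure ASCII strings" can be sketched as follows. This is a minimal illustration, not the UtfNormal source: `sketchToNfc` is a hypothetical name, and the slow path assumes PHP's intl `Normalizer` class is available, falling back to returning the input unchanged when it is not.

```php
<?php
// Hypothetical helper for illustration; not part of UtfNormal.
function sketchToNfc( $string ) {
	// Fast path: no byte >= 0x80 means pure ASCII, which is already in NFC.
	if ( !preg_match( '/[\x80-\xff]/', $string ) ) {
		return $string;
	}
	// Slow path (assumption): delegate to the intl extension when present.
	if ( class_exists( 'Normalizer' ) ) {
		$normalized = Normalizer::normalize( $string, Normalizer::FORM_C );
		// normalize() returns false on failure; keep the input in that case.
		return $normalized === false ? $string : $normalized;
	}
	// Without intl this sketch cannot normalize; return the input as-is.
	return $string;
}
```

The byte-range test runs without the `u` modifier, so it scans raw bytes and never has to decode UTF-8 on the fast path.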
toNFD( $string ) X-Ref |
Convert a UTF-8 string to normal form D, canonical decomposition. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form D |
toNFKC( $string ) X-Ref |
Convert a UTF-8 string to normal form KC, compatibility composition. This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form KC |
toNFKD( $string ) X-Ref |
Convert a UTF-8 string to normal form KD, compatibility decomposition. This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form KD |
loadData() X-Ref |
Load the basic composition data if necessary |
quickIsNFC( $string ) X-Ref |
Returns true if the string is _definitely_ in NFC. Returns false if not or uncertain. param: string $string a valid UTF-8 string. Input is not validated. return: bool |
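A conservative "definitely NFC, otherwise uncertain" check might look like the sketch below. The function name `sketchQuickIsNfc` and the one-entry `$knownGood` whitelist are assumptions made for illustration; the real class presumably consults its loaded data tables instead.

```php
<?php
// Hypothetical whitelist of non-ASCII characters assumed stable under NFC.
// U+00E9 (e with acute) is included purely as an example.
$knownGood = array( "\xC3\xA9" => true );

// Hypothetical helper for illustration; not part of UtfNormal.
function sketchQuickIsNfc( $string, $knownGood ) {
	// Pure ASCII is always in NFC.
	if ( !preg_match( '/[\x80-\xff]/', $string ) ) {
		return true;
	}
	// Split into UTF-8 characters; the input is assumed to be valid.
	$chars = preg_split( '//u', $string, -1, PREG_SPLIT_NO_EMPTY );
	if ( $chars === false ) {
		return false; // invalid UTF-8: report "uncertain"
	}
	foreach ( $chars as $char ) {
		// Any multi-byte character outside the whitelist means "uncertain".
		if ( strlen( $char ) > 1 && !isset( $knownGood[$char] ) ) {
			return false;
		}
	}
	return true;
}
```

A `false` result here only means the caller should run full normalization; it does not prove the string is unnormalized.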
quickIsNFCVerify( &$string ) X-Ref |
Returns true if the string is _definitely_ in NFC. Returns false if not or uncertain. param: string $string a UTF-8 string, altered on output to be valid UTF-8 safe for XML. return: bool |
NFC( $string ) X-Ref |
param: string $string return: string |
NFD( $string ) X-Ref |
param: string $string return: string |
NFKC( $string ) X-Ref |
param: string $string return: string |
NFKD( $string ) X-Ref |
param: string $string return: string |
fastDecompose( $string, $map ) X-Ref |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us). Input is assumed to be *valid* UTF-8. Invalid code will break. param: string $string valid UTF-8 string param: array $map hash of expanded decomposition map return: string a UTF-8 string decomposed, not yet normalized (needs sorting) |
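Map-driven decomposition can be sketched as a single lookup per character, assuming "expanded" means every map entry is already fully decomposed. The names `sketchDecompose` and `$sketchMap` are hypothetical, and the map holds one illustrative entry rather than real Unicode data.

```php
<?php
// Hypothetical one-entry map: U+00E9 decomposes to "e" + U+0301.
$sketchMap = array( "\xC3\xA9" => "e\xCC\x81" );

// Hypothetical helper for illustration; not part of UtfNormal.
function sketchDecompose( $string, $map ) {
	$out = '';
	// Walk the string one UTF-8 character at a time (input assumed valid).
	foreach ( preg_split( '//u', $string, -1, PREG_SPLIT_NO_EMPTY ) as $char ) {
		// Replace mapped characters; copy everything else through unchanged.
		$out .= isset( $map[$char] ) ? $map[$char] : $char;
	}
	return $out;
}
```

As the description notes, the result of this step is not yet a normal form: combining marks still have to be sorted into canonical order.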
fastCombiningSort( $string ) X-Ref |
Sorts combining characters into canonical order. This is the final step in creating decomposed normal forms D and KD. param: string $string a valid, decomposed UTF-8 string. Input is not validated. return: string a UTF-8 string with combining characters sorted in canonical order |
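Canonical ordering can be illustrated by stably sorting each run of combining marks by combining class. This is a sketch under stated assumptions: `sketchCombiningSort`, `sketchFlushRun`, and the two-entry `$sketchClasses` table are hypothetical, whereas a real implementation would use the full Unicode combining-class data.

```php
<?php
// Hypothetical table of canonical combining classes (two entries only).
$sketchClasses = array(
	"\xCC\x81" => 230, // U+0301 COMBINING ACUTE ACCENT
	"\xCC\xA3" => 220, // U+0323 COMBINING DOT BELOW
);

// Sort one run of marks by class, using original position as a tie-break
// so that marks of equal class keep their relative order.
function sketchFlushRun( $run ) {
	usort( $run, function ( $a, $b ) {
		$byClass = $a[0] - $b[0];
		return $byClass !== 0 ? $byClass : $a[1] - $b[1];
	} );
	$out = '';
	foreach ( $run as $entry ) {
		$out .= $entry[2];
	}
	return $out;
}

// Hypothetical helper for illustration; not part of UtfNormal.
function sketchCombiningSort( $string, $classes ) {
	$out = '';
	$run = array();
	foreach ( preg_split( '//u', $string, -1, PREG_SPLIT_NO_EMPTY ) as $char ) {
		if ( isset( $classes[$char] ) ) {
			// Collect the mark as (class, position in run, character).
			$run[] = array( $classes[$char], count( $run ), $char );
		} else {
			// A non-combining character ends the current run.
			$out .= sketchFlushRun( $run ) . $char;
			$run = array();
		}
	}
	return $out . sketchFlushRun( $run );
}
```

The explicit position tie-break keeps the sort stable regardless of how `usort` treats equal elements in a given PHP version.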
fastCompose( $string ) X-Ref |
Produces canonically composed sequences, i.e. normal form C or KC. param: string $string a valid UTF-8 string in sorted normal form D or KD. return: string a UTF-8 string with canonical precomposed characters used |
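Table-driven composition can be sketched by merging a character with the mark that immediately follows it whenever the pair has a precomposed form. This simplified version deliberately ignores the blocking rules and algorithmic Hangul composition that full NFC requires; `sketchCompose` and the one-entry `$sketchPairs` table are hypothetical.

```php
<?php
// Hypothetical one-entry pair table: "e" + U+0301 composes to U+00E9.
$sketchPairs = array( "e\xCC\x81" => "\xC3\xA9" );

// Hypothetical helper for illustration; not part of UtfNormal.
function sketchCompose( $string, $pairs ) {
	$out = array();
	foreach ( preg_split( '//u', $string, -1, PREG_SPLIT_NO_EMPTY ) as $char ) {
		$last = count( $out ) - 1;
		if ( $last >= 0 && isset( $pairs[$out[$last] . $char] ) ) {
			// The previous character and this one have a precomposed form.
			$out[$last] = $pairs[$out[$last] . $char];
		} else {
			$out[] = $char;
		}
	}
	return implode( '', $out );
}
```

Because the composed character replaces the previous output slot, a later mark can compose with it again if the table contains that pair.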
placebo( $string ) X-Ref |
This is just used for the benchmark, comparing how long it takes to iterate through a string without really doing anything of substance. param: string $string return: string |
replaceForNativeNormalize( $string ) X-Ref |
Replaces some characters that we don't want but that most of the native normalize functions keep. param: string $string The string return: string The string with the character codes replaced. |
Generated: Fri Nov 28 14:03:12 2014 | Cross-referenced by PHPXref 0.7.1 |