PHP Cross Reference of MediaWiki-1.24.0 |
Unicode normalization routines. Copyright © 2004 Brion Vibber, https://www.mediawiki.org/
| File Size: | 790 lines (23 kb) |
| Included or required: | 6 times |
| Referenced: | 0 times |
| Includes or requires: | 2 files includes/normal/UtfNormalData.inc includes/normal/UtfNormalDataK.inc |
UtfNormal:: (17 methods):
cleanUp()
toNFC()
toNFD()
toNFKC()
toNFKD()
loadData()
quickIsNFC()
quickIsNFCVerify()
NFC()
NFD()
NFKC()
NFKD()
fastDecompose()
fastCombiningSort()
fastCompose()
placebo()
replaceForNativeNormalize()
| cleanUp( $string ) X-Ref |
| The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition. Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. Not as fast as toNFC(). param: string $string a UTF-8 string return: string a clean, shiny, normalized UTF-8 string |
| toNFC( $string ) X-Ref |
| Convert a UTF-8 string to normal form C, canonical composition. Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form C |
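The "fast return for pure ASCII strings" mentioned above can be illustrated with a short sketch. The function names `isPureAscii` and `normalizeWithFastPath` are hypothetical and are not part of UtfNormal; the sketch only assumes that a string with no bytes in the range 0x80-0xFF is plain ASCII and is therefore already in every normal form.

```php
<?php
// Hypothetical helper (not a UtfNormal method): true when the string
// contains only 7-bit ASCII bytes.
function isPureAscii(string $string): bool {
    // Any byte in 0x80-0xFF belongs to a multibyte UTF-8 sequence
    // (or to invalid data), so its absence means pure ASCII.
    return preg_match('/[\x80-\xff]/', $string) === 0;
}

// Hypothetical wrapper: return ASCII input untouched and hand everything
// else to a slower normalization callback supplied by the caller.
function normalizeWithFastPath(string $string, callable $slowPath): string {
    if ($string === '' || isPureAscii($string)) {
        return $string; // ASCII is unchanged by NFC, NFD, NFKC and NFKD
    }
    return $slowPath($string);
}
```

The point of the design is that a single byte-range scan is much cheaper than decoding and table lookups, so checking first pays off for text that is mostly ASCII.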
| toNFD( $string ) X-Ref |
| Convert a UTF-8 string to normal form D, canonical decomposition. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form D |
| toNFKC( $string ) X-Ref |
| Convert a UTF-8 string to normal form KC, compatibility composition. This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form KC |
| toNFKD( $string ) X-Ref |
| Convert a UTF-8 string to normal form KD, compatibility decomposition. This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings. param: string $string a valid UTF-8 string. Input is not validated. return: string a UTF-8 string in normal form KD |
| loadData() X-Ref |
| Load the basic composition data if necessary |
| quickIsNFC( $string ) X-Ref |
| Returns true if the string is _definitely_ in NFC. Returns false if not or uncertain. param: string $string a valid UTF-8 string. Input is not validated. return: bool |
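The "definitely in NFC, otherwise false" contract is a conservative quick check: any doubt yields false so that the caller falls back to full normalization. A minimal sketch of that idea follows. The function `quickCheckSketch` and the `$suspect` table are hypothetical; the sketch assumes the table lists every character that could make a string non-NFC, which the two-entry table here does not.

```php
<?php
// Hypothetical sketch of a conservative NFC quick check.
// $suspect maps characters that MIGHT break NFC (for example combining
// marks) to true. Assumption: a real table would be complete.
function quickCheckSketch(string $string, array $suspect): bool {
    // Pure ASCII is always in NFC.
    if (preg_match('/[\x80-\xff]/', $string) === 0) {
        return true;
    }
    // Split into UTF-8 characters; preg_split returns false on invalid input.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return false; // cannot decide, so report "uncertain"
    }
    foreach ($chars as $ch) {
        if (isset($suspect[$ch])) {
            return false; // possibly not NFC: let the caller normalize fully
        }
    }
    return true;
}

// Illustrative table only: two combining marks.
$suspect = ["\u{0301}" => true, "\u{0323}" => true];
var_dump(quickCheckSketch("e\u{0301}", $suspect));
```

A false result does not mean the string is wrong, only that the cheap check could not prove it is already normalized.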
| quickIsNFCVerify( &$string ) X-Ref |
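| Checks whether the string is _definitely_ in NFC, cleaning it in place along the way. Returns true if the string is definitely in NFC; returns false if not or uncertain. param: string $string a UTF-8 string, passed by reference and altered on output to be valid UTF-8 safe for XML. return: bool |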
| NFC( $string ) X-Ref |
| param: string $string return: string |
| NFD( $string ) X-Ref |
| param: string $string return: string |
| NFKC( $string ) X-Ref |
| param: string $string return: string |
| NFKD( $string ) X-Ref |
| param: string $string return: string |
| fastDecompose( $string, $map ) X-Ref |
| Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us). Input is assumed to be *valid* UTF-8. Invalid code will break. param: string $string valid UTF-8 string param: array $map hash of expanded decomposition map return: string a UTF-8 string decomposed, not yet normalized (needs sorting) |
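Map-driven decomposition as described above can be sketched in a few lines. The function `decomposeSketch` and the two-entry `$map` are hypothetical stand-ins, not the UtfNormal code or its data files. The sketch assumes the map is already fully expanded, so one pass is enough, and that the input is valid UTF-8.

```php
<?php
// Hypothetical sketch: replace each character by its decomposition if the
// map has one, otherwise copy it through unchanged.
function decomposeSketch(string $string, array $map): string {
    $out = '';
    // The //u pattern splits the string into whole UTF-8 characters.
    foreach (preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $out .= $map[$ch] ?? $ch;
    }
    return $out; // combining marks are not yet in canonical order
}

// Illustrative map only: two canonical decompositions.
$map = [
    "\u{00E9}" => "e\u{0301}", // e with acute -> e + combining acute
    "\u{00C5}" => "A\u{030A}", // A with ring  -> A + combining ring above
];
echo bin2hex(decomposeSketch("caf\u{00E9}", $map)), "\n";
```

Passing a canonical map or a compatibility map to the same routine is what lets one function serve both the D and KD forms.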
| fastCombiningSort( $string ) X-Ref |
| Sorts combining characters into canonical order. This is the final step in creating decomposed normal forms D and KD. param: string $string a valid, decomposed UTF-8 string. Input is not validated. return: string a UTF-8 string with combining characters sorted in canonical order |
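Canonical ordering can be sketched as a stable sort of each run of combining marks by combining class. The function names `combiningSortSketch` and `flushRun` and the small `$classes` table are hypothetical; only the class values shown (220 for U+0323, 230 for U+0301) come from Unicode data. Characters missing from the table are treated as class 0 (starters), which is an assumption of this sketch.

```php
<?php
// Hypothetical helper: emit one run of combining marks in canonical order.
// Each entry is [class, original position, character].
function flushRun(array $run): string {
    usort($run, function ($a, $b) {
        if ($a[0] !== $b[0]) {
            return $a[0] <=> $b[0]; // lower combining class first
        }
        return $a[1] <=> $b[1];     // keep original order for equal classes
    });
    $out = '';
    foreach ($run as $entry) {
        $out .= $entry[2];
    }
    return $out;
}

// Hypothetical sketch: starters pass through; marks between starters are sorted.
function combiningSortSketch(string $string, array $classes): string {
    $out = '';
    $run = [];
    foreach (preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $cc = $classes[$ch] ?? 0;
        if ($cc > 0) {
            $run[] = [$cc, count($run), $ch];
            continue;
        }
        $out .= flushRun($run) . $ch; // a starter ends the current run
        $run = [];
    }
    return $out . flushRun($run);
}

// Illustrative table only.
$classes = ["\u{0323}" => 220, "\u{0301}" => 230];
echo bin2hex(combiningSortSketch("a\u{0301}\u{0323}", $classes)), "\n";
```

The position tie-breaker keeps the sort stable on PHP versions where `usort` does not guarantee stability, which matters because marks of equal class must keep their relative order.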
| fastCompose( $string ) X-Ref |
| Produces canonically composed sequences, i.e. normal form C or KC. param: string $string a valid UTF-8 string in sorted normal form D or KD. return: string a UTF-8 string with canonical precomposed characters used |
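Composition over sorted, decomposed input can be sketched as pairing each character with the most recent starter, unless an intervening mark blocks it. The function `composeSketch` and both tables are hypothetical and tiny. The sketch ignores Hangul syllables and composition exclusions, so it is a simplified illustration of the general technique rather than a description of how fastCompose() works.

```php
<?php
// Hypothetical sketch of canonical composition.
// $pairs maps "starter + mark" to a precomposed character.
// $classes maps combining marks to their combining class (default 0).
function composeSketch(string $string, array $pairs, array $classes): string {
    $out = [];        // output characters
    $starterPos = -1; // index in $out of the last starter, -1 if none
    $lastClass = -1;  // class of the last uncomposed character after it
    foreach (preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $cc = $classes[$ch] ?? 0;
        // Not blocked only if no kept character since the starter has an
        // equal or higher class than this one.
        if ($starterPos >= 0 && $lastClass < $cc
            && isset($pairs[$out[$starterPos] . $ch])) {
            $out[$starterPos] = $pairs[$out[$starterPos] . $ch];
            continue;
        }
        if ($cc === 0) {
            $starterPos = count($out); // this character becomes the starter
            $lastClass = -1;
        } else {
            $lastClass = $cc;
        }
        $out[] = $ch;
    }
    return implode('', $out);
}

// Illustrative tables only.
$pairs = ["e\u{0301}" => "\u{00E9}", "a\u{0323}" => "\u{1EA1}"];
$classes = ["\u{0323}" => 220, "\u{0301}" => 230];
echo bin2hex(composeSketch("cafe\u{0301}", $pairs, $classes)), "\n";
```

Because the input is required to be in sorted form D or KD, the blocking test reduces to a single comparison against the class of the last mark that was kept.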
| placebo( $string ) X-Ref |
| This is just used for the benchmark, comparing how long it takes to iterate through a string without really doing anything of substance. param: string $string return: string |
| replaceForNativeNormalize( $string ) X-Ref |
| Function to replace some characters that we don't want but most of the native normalize functions keep. param: string $string The string return: string The string with the character codes replaced. |
| Generated: Fri Nov 28 14:03:12 2014 | Cross-referenced by PHPXref 0.7.1 |