Symbian
Symbian OS Library

SYMBIAN OS V9.3




How to escape encode and decode


Reserved and unreserved characters

Reserved characters

URIs use some characters for special purposes in defining their syntax. These are called reserved characters, for example ; / ? : & =. When these characters are not used in their special role inside a URI, they need to be encoded.

The following lists the reserved characters for different URI components as defined in TEscapeMode:

Escape mode          URI component    Reserved characters

EEscapeNormal        None             No reserved characters
EEscapeQuery         query            ;/?:&=+$,[]
EEscapePath          path             /;=?[]
EEscapeAuth          authority        /;:@?[]
EEscapeUrlEncoded    URL              ;/?:&=+$[]!'()~

Unsafe characters

Some characters can be misinterpreted within URIs for various reasons. These are called unsafe characters and must always be encoded. For example, the '#' character is used in URIs to indicate where a fragment identifier (a bookmark or anchor in HTML) begins.

Unreserved characters

Data characters that are allowed in a URI but do not have a reserved purpose are called "unreserved" characters. These include upper and lower case letters, decimal digits, and a limited set of punctuation marks and symbols. ASCII control characters, which are not printable, are not part of this set and must be encoded: for example, the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal).

EscapeUtils escape encodes and decodes unsafe data in a URI. It also supports conversion of Unicode data (16-bit descriptor) into UTF8 data (8-bit descriptor) and vice versa.

EscapeUtils provides the following functionality.



Escape decoding

EscapeUtils::EscapeDecodeL() escape decodes the data.

_LIT(KEscapeEncoded, "%20%3C%3E%23%25%22%7B%7D%7C%5C%5E%60"); //data to decode
HBufC16* decode = EscapeUtils::EscapeDecodeL(KEscapeEncoded); //contains a space followed by <>#%"{}|\^`
CleanupStack::PushL(decode);

//use decode here

CleanupStack::PopAndDestroy(decode);

This code escape decodes the data '%20%3C%3E%23%25%22%7B%7D%7C%5C%5E%60' to ' <>#%"{}|\^`' (a space followed by the eleven excluded characters).

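The URI must be split into its components before the escaped characters within the components can be safely decoded. Otherwise a decoded character, such as '/' from %2F, could be mistaken for a component delimiter.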
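To make the decoding rule concrete, here is a minimal sketch in standard C++ rather than Symbian descriptors. It is not the EscapeUtils implementation; the names escapeDecode and hexDigitValue are hypothetical, and the choice to copy a malformed '%' sequence through unchanged is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replaces every well-formed "%HH" triple with the octet it encodes.
// Assumption: a '%' not followed by two hex digits is copied unchanged.
std::string escapeDecode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hexDigitValue(in[i + 1]);
            int lo = hexDigitValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}
```

With this sketch, the path "a%2Fb/c" split on '/' first gives the two segments "a%2Fb" and "c", which decode to "a/b" and "c". Decoding the whole path first would give "a/b/c", which then splits into three segments.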



Escape encoding

The URI encoding of a character consists of a "%" symbol followed by two hexadecimal digits representing the octet code. For example, the space character has decimal code point 32 in the ISO-Latin set, and 32 decimal is 20 hexadecimal, so its URI encoded representation is "%20".
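The arithmetic above can be sketched in a few lines of standard C++. The function name escapeTriple is hypothetical, and the use of uppercase hex digits is an assumption chosen to match the encoded strings shown on this page.

```cpp
#include <string>

// Builds the escape triple "%HH" for one octet, using uppercase hex digits.
std::string escapeTriple(unsigned char octet) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string triple = "%";
    triple.push_back(kHex[octet >> 4]);    // high nibble
    triple.push_back(kHex[octet & 0x0F]);  // low nibble
    return triple;
}
```

For the space character (32 decimal) the high nibble is 2 and the low nibble is 0, so the sketch produces "%20".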

EscapeUtils::EscapeEncodeL() escape encodes the invalid and reserved characters in the data as escape triples. The reserved characters and the set of excluded characters specified by RFC 2396 (refer to the above table) form the entire set of excluded data.

The following code fragment checks for invalid and reserved characters in the authority component of a URI and returns the string with these characters escape encoded. For the other modes defined in TEscapeMode, refer to the table above.

//decode is the descriptor returned by EscapeDecodeL() above; it must not have been destroyed yet
HBufC16* encode = EscapeUtils::EscapeEncodeL(*decode, EscapeUtils::EEscapeAuth);
CleanupStack::PushL(encode); //encode contains %20%3C%3E%23%25%22%7B%7D%7C%5C%5E%60

//use encode here

CleanupStack::PopAndDestroy(encode);

Escape encoding is best applied to each component while a URI is being created from its components.
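As a rough illustration of how the excluded set and a mode-specific reserved set combine, the following standard C++ sketch encodes one component. It is not the EscapeUtils implementation: escapeEncode and kExcluded are hypothetical names, the reserved set is passed in by the caller, and treating the space, control characters and non-ASCII octets as always encoded is an assumption of this sketch.

```cpp
#include <string>

// Excluded characters, taken from the decoding example on this page.
static const std::string kExcluded = "<>#%\"{}|\\^`";

// Encodes every octet that is excluded, reserved for the component,
// a space or control character, or outside the ASCII range.
std::string escapeEncode(const std::string& in, const std::string& reserved) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        bool mustEscape = c <= 0x20 || c >= 0x7F
            || kExcluded.find(static_cast<char>(c)) != std::string::npos
            || reserved.find(static_cast<char>(c)) != std::string::npos;
        if (mustEscape) {
            out.push_back('%');
            out.push_back(kHex[c >> 4]);
            out.push_back(kHex[c & 0x0F]);
        } else {
            out.push_back(static_cast<char>(c));
        }
    }
    return out;
}
```

Called with the authority set from the table, escapeEncode("user@host", "/;:@?[]") gives "user%40host". Because '%' is itself excluded, passing already encoded data through this sketch encodes it a second time, which is one reason to encode each component exactly once while building the URI.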



Converting between Unicode and UTF8

Convert to UTF8

EscapeUtils::ConvertFromUnicodeToUtf8L() converts the Unicode data into UTF8 format.

_LIT16(KUnicode, "Unicode string"); //data to be converted

HBufC8* utf8 = EscapeUtils::ConvertFromUnicodeToUtf8L(KUnicode);

utf8 contains the UTF8 form of the string.
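For readers who want to see what such a conversion involves, here is a sketch in standard C++ that turns 16-bit (UTF-16) data into UTF8 octets. It is not the Symbian implementation; the name utf16ToUtf8 is hypothetical, and replacing an unpaired surrogate with U+FFFD is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Converts UTF-16 to UTF-8. Unpaired surrogates become U+FFFD (assumption).
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with a following low surrogate if present.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }
        if (cp < 0x80) {            // 1 octet
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {    // 2 octets
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {  // 3 octets
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                    // 4 octets
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Plain ASCII text such as "Unicode string" comes out unchanged, one octet per character, while a character such as U+20AC (the euro sign) becomes the three octets E2 82 AC.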

Convert to Unicode

EscapeUtils::ConvertToUnicodeFromUtf8L() converts the data from UTF8 format to Unicode.

_LIT8(KUtf8, "UTF-8 string"); // UTF8 string to be converted

HBufC16* unicode = EscapeUtils::ConvertToUnicodeFromUtf8L(KUtf8); // convert the string to Unicode

unicode contains the Unicode form of the string.
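The reverse direction can be sketched the same way in standard C++. Again this is an illustration, not the Symbian implementation: utf8ToUtf16 is a hypothetical name, and returning false with an empty output for malformed input (truncated sequences, bad continuation octets, overlong forms, encoded surrogates) is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Converts UTF-8 to UTF-16. On malformed input, clears `out` and returns false.
bool utf8ToUtf16(const std::string& in, std::u16string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;   // number of continuation octets expected
        char32_t minimum;    // smallest code point valid for this length
        if (lead < 0x80)                { cp = lead;        extra = 0; minimum = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minimum = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minimum = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minimum = 0x10000; }
        else { out.clear(); return false; }  // invalid lead octet
        if (in.size() - i <= extra) { out.clear(); return false; }  // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { out.clear(); return false; }
            cp = (cp << 6) | (c & 0x3F);
        }
        bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp < minimum || cp > 0x10FFFF || surrogate) { out.clear(); return false; }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return true;
}
```

Code points above U+FFFF do not fit in one 16-bit unit, so the sketch emits them as a surrogate pair, which is why the output can contain more 16-bit units than the input has characters.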

Call EscapeUtils::IsEscapeTriple() to check whether the input data contains an escape triple, for example %2a. If there is a triple, its value is calculated and returned through the output argument HexVal. If there is no escape triple, this argument is left unchanged.

_LIT(KEscapeTriple1, "%2a"); // input data containing escape triple
TInt HexVal = 0; // initialised because it is left unchanged if no triple is found
TBool isTriple = EscapeUtils::IsEscapeTriple(KEscapeTriple1, HexVal); // ETrue for "%2a"
//variable HexVal contains value 0x2a

After the call, HexVal holds 0x2a (42 decimal), the value of the escape triple.
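The check itself is small enough to sketch in standard C++. This is an illustration of the behaviour described above, not the EscapeUtils source: isEscapeTriple and hexValue are hypothetical names, and testing only the first three characters of the input is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Returns true if `data` begins with "%HH" and stores the value in hexVal.
// hexVal is left unchanged when there is no escape triple.
bool isEscapeTriple(const std::string& data, int& hexVal) {
    if (data.size() < 3 || data[0] != '%') return false;
    int hi = hexValue(data[1]);
    int lo = hexValue(data[2]);
    if (hi < 0 || lo < 0) return false;
    hexVal = hi * 16 + lo;
    return true;
}
```

For "%2a" the digits are 2 and 10, giving 2 * 16 + 10 = 42, which is 0x2a.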



See also

InetProtUtils Overview