This document explains how the data recognition framework is used internally in Symbian OS and by client applications to identify non-native file types.
Note that data recognizers do not handle data; they only try to identify its type. Once its type has been identified, the data can be passed to the application that best handles it. Applications specify the MIME types they support, and the priority (high, medium, low, last resort) at which they support them, using the datatype_list section in their registration file.
The plug-in recognizer architecture allows additional data recognizers to be created and added. The process of writing a new recognizer is described in a separate document, How to write a recognizer.
Before v9.1, data recognizers were plug-in DLLs with a .mdl extension and a UID2 of 0x10003A19. They were located in the \System\Recogs\ folder on any drive.
In v9.1 and onwards, recognizers are ECOM plug-ins, located under \sys\bin\. Each ECOM data recognizer is loaded by the application architecture (apparc) during the OS startup sequence. The ECOM recognizer scanning code searches for all ECOM plug-ins that specify the ECOM interface UID for data recognizers (0x101F7D87) and loads them, calling the implementation creation function.
CApaDataRecognizer is an internal class that represents the recognizer framework. This section summarizes how it works. Although CApaDataRecognizer is internal, clients can access most of its properties through the RApaLsSession API, described in the next section.
The recognizer framework maintains an up-to-date list of all data recognizers that exist in the system. The list is ordered by recognizer priority, which is explained later. When a file in the file system needs to be associated with an application, the framework opens the file and reads some data from the start of it into a buffer. It then calls the DoRecognizeL() function of each recognizer in the list in turn, passing it the filename of the unrecognized file and the buffer.
The implementation of DoRecognizeL() should use the information it is passed to try to decide what the data type is and to specify how confident it is of this decision. All data recognizers are derived from the abstract base class, CApaDataRecognizerType, whose CApaDataRecognizerType::iDataType and CApaDataRecognizerType::iConfidence members are used to report the data type and the confidence, respectively. The confidence rating can be any value between CApaDataRecognizerType::ECertain and CApaDataRecognizerType::ENotRecognized.
The framework uses an accepted confidence level. If, in the course of data recognition, any recognizer sets its iConfidence member to the accepted confidence or greater, then its iDataType value is immediately selected, without testing any other recognizers.
The framework's default accepted confidence value is CApaDataRecognizerType::ECertain, but this can be changed by calling RApaLsSession::SetAcceptedConfidence().
If the accepted confidence level is not reached by any recognizer, after all the recognizers have been tried, the recognizer that returned the highest confidence level is selected.
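The selection rule described above can be illustrated with a small model in portable C++. This is only a sketch, not the apparc implementation: `Confidence`, `Result`, `RecognizeFn` and `SelectDataType` are hypothetical names, the enum values are simplified stand-ins for the CApaDataRecognizerType confidence ratings, and the assumption that an earlier recognizer wins a tie is ours, not the source's.

```cpp
#include <functional>
#include <string>
#include <vector>

// Simplified stand-ins for the confidence ratings (hypothetical values).
enum Confidence { NotRecognized = 0, Unlikely, Possible, Probable, Certain };

// What one recognizer reports: a data type and how sure it is.
struct Result {
    std::string dataType;
    Confidence confidence;
};

// Model of a recognizer's DoRecognizeL(): receives the file name and buffer.
using RecognizeFn =
    std::function<Result(const std::string& fileName, const std::string& buffer)>;

// Tries each recognizer in list order (assumed already sorted by priority).
// Stops at the first result that reaches the accepted confidence; otherwise
// returns the result with the highest confidence seen.
Result SelectDataType(const std::vector<RecognizeFn>& recognizers,
                      const std::string& fileName,
                      const std::string& buffer,
                      Confidence accepted = Certain)
{
    Result best{"", NotRecognized};
    for (const auto& recognize : recognizers) {
        Result candidate = recognize(fileName, buffer);
        if (candidate.confidence >= accepted) {
            return candidate;  // accepted level reached: skip the rest
        }
        if (candidate.confidence > best.confidence) {
            best = candidate;  // strictly greater, so earlier entries win ties
        }
    }
    return best;
}
```

Lowering the accepted confidence in this model makes an earlier, less certain recognizer win outright, which mirrors the effect the text attributes to changing the accepted confidence level.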
Note: if recognition was unsuccessful (in other words, no recognizer set its iConfidence level higher than CApaDataRecognizerType::ENotRecognized), and any of the recognizers reported an error by leaving, then the framework function CApaDataRecognizer::RecognizeL() leaves with the error code reported by the first recognizer that reported an error.
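The error rule in this note can be modelled in portable C++ by treating a leave as a thrown exception. This is a hedged sketch rather than the real framework code: `LeaveError`, `RecognizeFn` and `RecognizeWithErrors` are hypothetical names, the error codes are illustrative, and it assumes the framework carries on with the remaining recognizers after one of them leaves.

```cpp
#include <functional>
#include <string>
#include <vector>

// Simplified stand-ins for the confidence ratings (hypothetical values).
enum Confidence { NotRecognized = 0, Possible, Probable, Certain };

struct Result {
    std::string dataType;
    Confidence confidence;
};

// Models a Symbian leave as a C++ exception carrying an error code.
struct LeaveError {
    int code;
};

using RecognizeFn = std::function<Result(const std::string& buffer)>;

// Tries every recognizer, remembering the first error reported. The error is
// only propagated if, at the end, nothing was recognized at all.
Result RecognizeWithErrors(const std::vector<RecognizeFn>& recognizers,
                           const std::string& buffer,
                           Confidence accepted = Certain)
{
    Result best{"", NotRecognized};
    bool sawError = false;
    int firstError = 0;
    for (const auto& recognize : recognizers) {
        try {
            Result candidate = recognize(buffer);
            if (candidate.confidence >= accepted) {
                return candidate;
            }
            if (candidate.confidence > best.confidence) {
                best = candidate;
            }
        } catch (const LeaveError& e) {
            if (!sawError) {  // keep only the first reported error
                sawError = true;
                firstError = e.code;
            }
        }
    }
    if (best.confidence == NotRecognized && sawError) {
        throw LeaveError{firstError};
    }
    return best;
}
```

In this model a failing recognizer does not mask a successful one: the stored error is discarded as soon as any recognizer reports a confidence above the lowest rating.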
After a recognizer has been selected, apparc attempts to find the application that can best handle the selected data type. Note that it is not guaranteed that an application will be available to handle a data type, even if it was successfully recognized.
The framework also owns:
a list that contains all the data types that the recognizers in the list claim to support. This is populated by calling each recognizer's implementation of CApaDataRecognizerType::SupportedDataTypeL().
a preferred buffer size. This value determines the amount of data passed to each recognizer's DoRecognizeL() function. It is set by the framework to the largest value returned by any recognizer's implementation of CApaDataRecognizerType::PreferredBufSize(). It can be retrieved by calling RApaLsSession::GetPreferredBufSize().
This section summarizes the APIs provided by RApaLsSession to allow clients to use the data recognition framework. More detailed information is provided by the reference documentation for the class.
RApaLsSession::RecognizeData() and RApaLsSession::RecognizeSpecificData() allow clients to do data recognition themselves, using the recognizer framework. With the RApaLsSession::RecognizeSpecificData() overloads, the data type is known in advance to the caller, so the framework only passes the data to recognizers whose implementation of SupportedDataTypeL() indicates support for that data type. The RecognizeData() overloads, on the other hand, pass the data to all recognizers.
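The difference between the two families of overloads comes down to which recognizers are consulted. The following portable C++ sketch models only that filtering step. `RecognizerInfo` and `RecognizersForType` are hypothetical names, and the list of supported types stands in for what each recognizer would report through SupportedDataTypeL().

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical description of one recognizer: a label plus the data types
// it claims to support.
struct RecognizerInfo {
    std::string name;
    std::vector<std::string> supportedTypes;
};

// Returns the names of the recognizers that would be consulted when the
// caller already knows the data type, preserving the list order.
std::vector<std::string> RecognizersForType(
    const std::vector<RecognizerInfo>& recognizers,
    const std::string& wantedType)
{
    std::vector<std::string> selected;
    for (const auto& r : recognizers) {
        const auto& types = r.supportedTypes;
        if (std::find(types.begin(), types.end(), wantedType) != types.end()) {
            selected.push_back(r.name);
        }
    }
    return selected;
}
```

A model of the RecognizeData() behaviour would skip this filter and consult every entry in the list.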
Note that if an attempt to get the modification details of a file that has not previously been cached fails (that is, the call to get the modification details returns an error), the file is omitted from the resulting recognition cache.
RApaLsSession::GetAcceptedConfidence() and SetAcceptedConfidence() get and set the framework's accepted confidence level. The framework's default value is CApaDataRecognizerType::ECertain.
The preferred buffer size (which determines the amount of data passed to each recognizer's DoRecognizeL() function) is set by the framework, but cannot be greater than the value set by calling RApaLsSession::SetMaxDataBufSize(). If the maximum buffer size is not set by the client, the default value is 256 bytes. The maximum buffer size can be retrieved using RApaLsSession::GetMaxDataBufSize(), and the preferred buffer size using RApaLsSession::GetPreferredBufSize().
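The relationship between the two buffer sizes can be summarized as "largest recognizer preference, capped by the maximum". The portable C++ sketch below models that arithmetic only. `PreferredBufSize` and `kDefaultMaxDataBufSize` are hypothetical names, and the input vector stands in for the values each recognizer would return.

```cpp
#include <algorithm>
#include <vector>

// Default maximum stated in the text when the client sets no maximum.
const int kDefaultMaxDataBufSize = 256;

// Takes the largest size any recognizer asks for, then caps it at the
// maximum data buffer size.
int PreferredBufSize(const std::vector<int>& recognizerSizes,
                     int maxDataBufSize = kDefaultMaxDataBufSize)
{
    int largest = 0;
    for (int size : recognizerSizes) {
        largest = std::max(largest, size);
    }
    return std::min(largest, maxDataBufSize);
}
```

Under this model, a recognizer that asks for 1024 bytes only receives that much once the maximum has been raised to at least 1024.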
RApaLsSession::GetSupportedDataTypesL() retrieves the total list of data types supported by all recognizers in the framework's list.
The various RApaLsSession::RecognizeFilesL() overloads are used to recognize all files contained in a specified directory. There are synchronous and asynchronous overloads, which can be used with or without a MIME type filter. The asynchronous versions can be cancelled using RApaLsSession::CancelRecognizeFiles().