VorteXML. The current Vortex language and engine [4] are geared primarily towards relational data. We are working to extend Vortex so that it can also work with XML-based data, thus providing a uniform paradigm both for converting raw data into higher-level semantic information and for analyzing that higher-level information (see [5]).
There are various ways in which a web site can pass data to the decision engine. For example, the web server might do some initial processing and cleaning to transform the HTML into XML (e.g., using XPath [6]) and then pass the XML to the VorteXML engine. Vortex's ability to specify various heuristics would be useful in converting that XML into higher-level semantics, especially for web sites that represent information in non-uniform ways.
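To illustrate what such heuristics might look like, the following minimal Python sketch tries several encodings of the same piece of information in priority order. It is not Vortex or VorteXML syntax; the element and attribute names (`product`, `price`, `cost`), the sample document, and the function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: four products that encode (or omit) a price differently.
SAMPLE = """<page>
  <product name="switch-a"><price>1200.50</price></product>
  <product name="switch-b" price="$950"/>
  <product name="switch-c"><cost> 300 </cost></product>
  <product name="switch-d"/>
</page>"""


def extract_price(product):
    """Apply heuristics in priority order; return a float or None."""
    # Heuristic 1: a <price> child element.
    text = product.findtext("price")
    # Heuristic 2: a price attribute on the product itself.
    if text is None:
        text = product.get("price")
    # Heuristic 3: a <cost> child element used as a synonym.
    if text is None:
        text = product.findtext("cost")
    if text is None:
        return None
    try:
        # Normalize whitespace and a leading currency symbol.
        return float(text.strip().lstrip("$"))
    except ValueError:
        return None


def extract_prices(xml_text):
    """Map each product name to its extracted price (None if not found)."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): extract_price(p) for p in root.iter("product")}
```

Calling `extract_prices(SAMPLE)` yields one uniform attribute per product regardless of how the page encoded it, which is the kind of higher-level value a decision engine could then reason over.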
Another approach is to annotate the HTML content produced by a web site by adding custom tags that client browsers ignore but that the decision engine can scan to extract the desired information. This would make extraction easier, since the decision engine would need to look at only a subset of the raw input.
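The annotation approach can be sketched as follows, using Python's standard `html.parser` module. The custom tag name `dfp-data`, its attributes, and the helper names are hypothetical; the text does not specify an annotation format, so this is only one possible shape.

```python
from html.parser import HTMLParser


class AnnotationExtractor(HTMLParser):
    """Collect the attributes and text of one custom annotation tag."""

    def __init__(self, tag="dfp-data"):  # tag name is an assumption
        super().__init__()
        self.tag = tag
        self.records = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == self.tag:
            self._current = {"attrs": dict(attrs), "text": ""}

    def handle_data(self, data):
        # Only text inside an open annotation tag is kept.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == self.tag and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.records.append(self._current)
            self._current = None


def extract_annotations(html, tag="dfp-data"):
    """Return the annotation records found in an HTML string."""
    parser = AnnotationExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.records


PAGE = (
    "<html><body><h1>IP Telephony</h1>"
    '<dfp-data kind="product" id="sw-100">IP switch</dfp-data>'
    "<p>Other text</p></body></html>"
)
```

Here `extract_annotations(PAGE)` returns a single record for the annotated product and skips the heading and paragraph, reflecting the point that only a subset of the raw input needs to be examined.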
There is also a move towards separating content from presentation on web sites. Under this model, for each customer request, a web site retrieves the actual content as an XML document and then applies an XSL stylesheet to transform it into an HTML document before shipping it to the client. In such a case, the web site could simply forward the XML content to the decision engine, eliminating the need to transform the HTML into XML.
Distributed Rules Processing. Section 4 described how DFP can be scaled, in an architectural sense, to environments with web server farms. Another challenge concerns scaling to large web sites, as found in the B2B sites of large corporations. These typically span multiple sub-organizations and multiple geographic locations. For example, at Lucent a customer might enter through the Lucent home page, which is supported by web servers in New Jersey, but then access product information about the latest IP telephony switches via web servers in Illinois. Furthermore, while some decision policies might be applicable to all customers, others might be relevant only to certain products. This means that the rule sets used at different locations may overlap without being identical. The challenge is to support the development of such overlapping rule sets, to have the appropriate rules apply to the pages being examined, and to pass relevant data between geographic locations as the customer's session moves between those locations.
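As a rough illustration of the problem (not a proposed design), overlapping rule sets can be modeled as a shared set of rules combined with location-specific ones, with the session context copied forward when the customer moves between locations. All rule names, location keys, and session fields below are hypothetical.

```python
# Rules that apply to every customer, wherever the page is served (assumed names).
GLOBAL_RULES = {"greet_returning_customer", "offer_live_help"}

# Rules that are relevant only at a particular location (assumed names).
LOCATION_RULES = {
    "new_jersey": {"promote_home_page_banner"},
    "illinois": {"recommend_ip_telephony_bundle"},
}


def rules_for(location):
    """Rule set in force at a location: the shared rules plus local ones."""
    return GLOBAL_RULES | LOCATION_RULES.get(location, set())


def hand_off(session, new_location):
    """Return a copy of the session context for the next location.

    The original session is left unchanged; the previous location is
    appended to the history so that later rules can refer to it.
    """
    moved = dict(session)
    moved["history"] = list(session.get("history", [])) + [session["location"]]
    moved["location"] = new_location
    return moved
```

In this sketch the rule sets for New Jersey and Illinois share the global rules but differ in their local ones, and `hand_off` stands in for whatever mechanism would carry relevant data between sites.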
Reliability. We are currently experimenting with the use of fault-tolerant CORBA to provide reliability for the Vortex engine, and thus for the DFP approach. Another way to achieve reliability, and scalability for that matter, would be to implement the Vortex language as part of an application server platform (e.g., based on EJB).
Automated learning. Following the lead of companies such as Manna [15], it will be important to incorporate automated learning into any personalization technology. Because the Vortex language provides rule constructs that are richer than those of many business rules systems, it will be more difficult to develop learning technology for Vortex. On the other hand, the structure of the rule sets and the availability of meaningful reports should provide important handles on the problem.