Mondrian FAQs

How do I use Mondrian in my application?
Why doesn't Mondrian use a standard API?
How does Mondrian's dialect of MDX differ from MSOLAP's?
How can Mondrian be extended?
Can Mondrian handle large datasets?
How do I enable tracing?
How do I enable logging?
What is the syntax of a Mondrian connect string?
Where is Mondrian going in the future?
Where can I find out more? (Further reading)
Mondrian is wonderful! How can I possibly thank you?

Modeling

Measures not stored in the fact table
How can I define my fact table based on an arbitrary SQL statement?
Why can't Mondrian find my tables?

Build/install

I get compilation errors. Why is this?

Performance

When I change the data in the RDBMS, the result doesn't change even if I refresh the browser. Why is this?
Tuning the Aggregate function

1. How do I use Mondrian in my application?

There are several ways. If you have a fixed set of queries which you'd like to display as HTML tables, use the tag library. webapp/taglib.jsp is an example of this.

The JPivot project (http://jpivot.sourceforge.net) is a JSP-based pivot table, and will allow you to dynamically explore a dataset over the web. It replaces the prototype pivot table webapp/morph.jsp.

You could also build a pivot table in a client technology such as Swing.

2. Why doesn't Mondrian use a standard API?

Because there isn't one. MDX is a component of Microsoft's OLE DB for OLAP standard which, as the name implies, only runs on Windows. Mondrian's API is fairly similar in flavor to ADO MD (ActiveX Data Objects for Multidimensional), an API which Microsoft built in order to make OLE DB for OLAP easier to use.

XML for Analysis is pretty much OLE DB for OLAP expressed in Web Services rather than COM, and therefore seems to offer a platform-neutral standard for OLAP, but take-up seems to be limited to vendors who supported OLE DB for OLAP already.

The other OLAP vendors failed to reach consensus several years ago with the OLAP Council API, then moved on to the JSR-069 ('JOLAP') specification. Mondrian included a partial implementation of the JOLAP API for several years, but this was removed in mondrian-2.3.

During 2006, Julian Hyde started work, in collaboration with some other projects and companies, on a pragmatic open API for Java-based OLAP called olap4j.

3. How does Mondrian's dialect of MDX differ from MSOLAP's?

Not very much; see the MDX language specification for details. The main differences are:

The StrToSet() and StrToTuple() functions take an extra parameter.
Parsing is case-sensitive.
Pseudo-functions Param() and ParamRef() allow you to create parameterized MDX statements.

4. How can Mondrian be extended?

todo: User-defined functions

todo: Cell readers

todo: Member readers

5. Can Mondrian handle large datasets?

Yes, if your RDBMS can. We delegate the aggregation to the RDBMS, and if your RDBMS happens to have materialized group by views created, your query will fly. And the next time you run the same or a similar query, that will really fly, because the results will be in the aggregation cache.

6. How do I enable tracing?

To enable tracing, set mondrian.trace.level to 1 in mondrian.properties. You will see text and execution time of each SQL statement, like this:

SqlMemberSource.getLevelMemberCount: executing sql [select count(*) as `c0` from (select distinct `store`.`store_country` as `c0` from `store` as `store`) as `foo`], 110 ms
SqlMemberSource.getMembers: executing sql [select distinct `store`.`store_sqft` as `c0` from `store` as `store` order by `store`.`store_sqft`], 50 ms

Notes:

If you are running Mondrian from the command line, or via Ant, mondrian.properties should be in the current directory.
If you are running in Tomcat, mondrian.properties should be in TOMCAT_HOME/bin. Changes will only take effect when you re-start Tomcat. The output goes to the console from which you started Tomcat.

7. How do I enable logging?

Mondrian uses the Apache Log4j logger. Building, testing, and running Mondrian all require a log4j.jar file, which is provided as part of the Mondrian distribution.

Also provided is a log4j.properties file. Such a file is needed when running Mondrian in standalone mode (such as when running the Mondrian JUnit tests or the CmdRunner utility). Generally, Mondrian is embedded in an application, such as a web server, which may have its own log4j.properties file or some other mechanism for setting log4j properties. In such cases, use that mechanism to control Mondrian's logging.

Mondrian follows Apache's guidance on what type of information is logged at what level:

FATAL: A very severe error event that will presumably lead the application to abort.
ERROR: An error event that might still allow the application to continue running.
WARN: A potentially harmful situation.
INFO: An informational message that highlights the progress of the application at a coarse-grained level.
DEBUG: A fine-grained informational event that is most useful to debug an application.

For general use, we recommend setting the Mondrian log level to WARN; arguably, it's good to know when things are going south.

8. What is the syntax of a Mondrian connect string?

The syntax of the connect string is described in the Javadoc for the method mondrian.olap.DriverManager.getConnection(String connectString, boolean fresh).
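To give a feel for the general shape, a connect string is, to our understanding, a semicolon-separated list of name=value properties. The sketch below splits such a string into a map. The class ConnectStringSketch is hypothetical and is not part of Mondrian, and the property names (Provider, Jdbc, Catalog), the JDBC URL, and the catalog path are assumptions for illustration only; check them against the Javadoc before relying on them. The sketch also ignores any quoting rules the real parser may support.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Mondrian: splits "name=value;name=value;" pairs.
class ConnectStringSketch {

    static Map<String, String> parse(String connectString) {
        Map<String, String> props = new LinkedHashMap<>();
        // Naive split: a value that itself contains ';' would be cut short here.
        for (String part : connectString.split(";")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled semicolon
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + trimmed);
            }
            // Only the first '=' separates name from value, so values may contain '='.
            props.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return props;
    }

    public static void main(String[] args) {
        // Property names and values below are illustrative assumptions.
        String connectString =
            "Provider=mondrian;Jdbc=jdbc:odbc:MondrianFoodMart;Catalog=/WEB-INF/FoodMart.xml;";
        parse(connectString).forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

In a real application you would pass the connect string itself to DriverManager.getConnection rather than parse it yourself; the parsing here only shows how the pieces fit together.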

9. Where is Mondrian going in the future?

Presentation layer (see JPivot for more details).
Complete implementation of MDX (not all of the functions are implemented yet).
Tuning

10. Where can I find out more?

MDX Solutions with Microsoft SQL Server Analysis Services by George Spofford is the best book I have found on MDX. Despite the title, the principles it describes can be applied to any RDBMS.

OLAP Solutions: Building Multidimensional Information Systems by Erik Thomsen is a great overview of multidimensional databases, but does not deal with MDX.

The reference work on data warehousing is The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition), by Ralph Kimball and Margy Ross. It covers the business process well, but the focus is more on star schemas and ROLAP than OLAP.

The Microsoft Analysis Services online documentation covers MDX very well, and includes a list of MDX functions.

11. Mondrian is wonderful! How can I possibly thank you?

We'd love to hear what you liked and didn't like about it. If you can think of ways that Mondrian can be improved, roll up your sleeves and help make it better. If you use Mondrian in your application, consider sharing your work so that everyone can use it.

12. Modeling

12.1 Measures not stored in the fact table

I am trying to build a cube with measures from 2 different tables. I have tried a virtual cube, but it does not seem to work - it only relates measures and dimensions from the same table. Is there a way to specify that a measure is not coming from the fact table? Say using SQL select?

Virtual cubes sound like the right approach. The way to do it is to first create a dummy cube on your lookup table, with dimensions for as many columns as are applicable. (A classic example of this kind of cube is an 'ExchangeRate' cube, whose only dimensions are time and currency.)

Then create a virtual cube of the dummy cube and the real cube (onto your fact table).

Note that you will need to use shared dimensions for the cubes to join implicitly.

12.2 How can I define my fact table based on an arbitrary SQL statement?

Use the <View> element instead of the <Table> element. You need to specify the 'alias' attribute, which Mondrian uses as a table alias.

The XML 'CDATA' construct is useful in case there are strange characters in your SQL, but isn't essential.

<View alias="DFACD_filtered">
  <SQL dialect="generic">
    <![CDATA[select * from DFACD where CSOC = '09']]>
  </SQL>
</View>

12.3 Why can't Mondrian find my tables?

Consider this scenario. I have created some tables in Oracle, like this:

CREATE TABLE sales ( prodid INTEGER, day INTEGER, amount NUMBER);

and referenced it in my schema.xml like this:

<Cube name="Sales">
  <Table name="sales"/>
  ...
  <Measure name="Sales" column="amount" aggregator="sum"/>
  <Measure name="Sales count" column="prodid" aggregator="count"/>
</Cube>

Now I start up Mondrian and get an error ORA-00942: Table or view "sales" does not exist while executing the SQL statement SELECT "prodid", count(*) FROM "sales" GROUP BY "prodid". The query looks valid, and the table exists, so why is Oracle giving an error?

The problem is that table and column names are case-sensitive. You told Mondrian to look for a table called "sales", not "SALES" or "Sales".

Oracle's table and column names are case-sensitive too, provided that you enclose them in double-quotes, like this:

CREATE TABLE "sales" ( "prodid" INTEGER, "day" INTEGER, "amount" NUMBER);

If you omit the double-quotes, Oracle automatically converts the identifiers to upper-case, so the first CREATE TABLE command actually created a table called "SALES". When the query gets run, Mondrian is looking for a table called "sales" (because that's what you called it in your schema.xml), yet Oracle only has a table called "SALES".

There are two possible solutions. The simplest is to change the objects to upper-case in your schema.xml file:

<Cube name="Sales">
  <Table name="SALES"/>
  ...
  <Measure name="Sales" column="AMOUNT" aggregator="sum"/>
  <Measure name="Sales count" column="PRODID" aggregator="count"/>
</Cube>

Alternatively, if you decide you would like your table and column names to be in lower or mixed case (or even, for that matter, to contain spaces), then you must double-quote object names when you issue CREATE TABLE statements to Oracle.
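The folding rule described above can be mimicked in a few lines of Java. This is only a sketch of the rule as stated in this answer (unquoted identifiers are stored in upper case, double-quoted identifiers keep their case); the class OracleIdentifierSketch and its methods are hypothetical, and real Oracle naming rules have further restrictions that are ignored here.

```java
import java.util.Locale;

// Hypothetical illustration of Oracle identifier folding; not part of Mondrian or Oracle.
class OracleIdentifierSketch {

    /** Returns the name the database would store for an identifier as written in DDL. */
    static String storedName(String ddlIdentifier) {
        boolean quoted = ddlIdentifier.length() >= 2
            && ddlIdentifier.startsWith("\"")
            && ddlIdentifier.endsWith("\"");
        if (quoted) {
            // Quoted identifiers keep their exact case; just strip the quotes.
            return ddlIdentifier.substring(1, ddlIdentifier.length() - 1);
        }
        // Unquoted identifiers are folded to upper case.
        return ddlIdentifier.toUpperCase(Locale.ROOT);
    }

    /** True if a name from schema.xml (always sent quoted, so compared exactly) finds the table. */
    static boolean schemaNameMatches(String ddlIdentifier, String schemaName) {
        return storedName(ddlIdentifier).equals(schemaName);
    }

    public static void main(String[] args) {
        System.out.println(schemaNameMatches("sales", "sales"));     // false: stored as SALES
        System.out.println(schemaNameMatches("sales", "SALES"));     // true
        System.out.println(schemaNameMatches("\"sales\"", "sales")); // true: case preserved
    }
}
```

The three calls in main correspond to the failing scenario and the two solutions above.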

13. Build/install

13.1 I get compilation errors. Why is this?

For example:

"SchemaTreeModel.java": Error #: 302 : cannot access class MondrianDef.Schema; java.io.IOException: class not found: class MondrianDef.Schema at line 29, column 14

You can't just compile the source code using your IDE; you must build using ant, as described in the build instructions. This is because several Java classes, such as mondrian.olap.MondrianDef (as in this case), mondrian.olap.MondrianResource and mondrian.olap.Parser are generated from other files. I recommend that you do ant clean before trying to build again.

Another example:

"NamedObject.java": Error #: 704 : cannot access directory javax\jmi\reflect at line 4, column 1

You don't have the correct JAR files (in this case, lib/jmi.jar) on your classpath. Again, follow the build instructions. This problem often happens when people try to build using an IDE. You must use ant for the very first build, but you may be able to set up your IDE to do incremental builds after that.

14. Performance

14.1 When I change the data in the RDBMS, the result doesn't change even if I refresh the browser. Why is this?

Mondrian uses a cache to improve performance. The first time you run a query, Mondrian will execute various SQL statements to load the data (you can see these statements by turning on tracing). The next time, it will use the information in the cache.

Cache control is primitive right now. If the data in the RDBMS is modified, Mondrian has no way to know, and does not refresh its cache. If you are using the JPivot web UI and refresh the browser, that will simply regenerate the web page, not flush the cache. The only way to refresh the cache is to call the following piece of code, which flushes the entire contents:

mondrian.rolap.CachePool.instance().flush();

See caching design for more information.
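The stale-result behavior described in this answer is typical of any read-through cache that is never told about changes in the underlying store. The toy class below illustrates the effect; StaleCacheSketch is hypothetical and is in no way Mondrian's cache implementation, and the in-memory map simply stands in for the RDBMS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy read-through cache, for illustration only; not Mondrian's CachePool.
class StaleCacheSketch {
    private final Map<String, Integer> cache = new HashMap<>();
    private final Function<String, Integer> loader;

    StaleCacheSketch(Function<String, Integer> loader) {
        this.loader = loader;
    }

    /** Loads the value on first access, then keeps answering from the cache. */
    int get(String key) {
        // Assumes the loader never returns null for the keys used here.
        return cache.computeIfAbsent(key, loader);
    }

    /** Discards everything, so the next get() goes back to the source. */
    void flush() {
        cache.clear();
    }

    public static void main(String[] args) {
        Map<String, Integer> database = new HashMap<>(); // stand-in for the RDBMS
        database.put("sales", 100);
        StaleCacheSketch cached = new StaleCacheSketch(database::get);

        System.out.println(cached.get("sales")); // loaded from the "database"
        database.put("sales", 250);              // data changes behind the cache's back
        System.out.println(cached.get("sales")); // still the old value
        cached.flush();
        System.out.println(cached.get("sales")); // reloaded after the flush
    }
}
```

Refreshing the browser corresponds to calling get() again: it re-reads the cache, not the database.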

14.2 Tuning the Aggregate function

I am using an MDX query with a calculated "aggregate" member. It aggregates the values between Node A and Node B. The dimension that it is aggregating on is a Time dimension. This Time dimension has a granularity of one minute. When executing this MDX query, the performance seems to be fairly bad.

Here is the query:

WITH MEMBER [Time].[AggregateValues] AS
    'Aggregate([Time].[2004].[October].[1].[12].[10] : [Time].[2004].[October].[20].[12].[10])'
SELECT [Measures].[Volume] ON ROWS,
    NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])

Is this normal behavior? Is there any way I can speed this up?

Answer:

The performance is bad because you are pulling 19 days * 1440 minutes per day = 27360 cells from the database into memory per cell that you actually display. Mondrian is a lot less efficient at crunching numbers than the database is, and uses a lot of memory.

The best way to improve performance is to push as much of the processing to the database as possible. If you were asking for a whole month, it would be easy:

WITH MEMBER [Time].[AggregateValues] AS
    'Aggregate({[Time].[2004].[October]})'
SELECT [Measures].[Volume] ON ROWS,
    NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])

But since you're working with time periods which are not aligned with the dimensional structure, you'll have to chop up the interval:

WITH MEMBER [Time].[AggregateValues] AS
    'Aggregate({
        [Time].[2004].[October].[1].[12].[10] : [Time].[2004].[October].[1].[23].[59],
        [Time].[2004].[October].[2] : [Time].[2004].[October].[19],
        [Time].[2004].[October].[20].[0].[00] : [Time].[2004].[October].[20].[12].[10]})'
SELECT [Measures].[Volume] ON ROWS,
    NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])

This retrieves far fewer cells from the database (18 day-level cells plus roughly 1,440 minute-level cells) and therefore does more of the heavy lifting using SQL's GROUP BY operator. To improve it further, introduce hourly aggregates.
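The cell counts in this answer can be checked with a short calculation. The sketch below assumes the Time hierarchy runs year, month, day, hour, minute (as the member names in the query suggest) and counts both endpoints of each range, so it reports one cell more than the rounded 19 * 1440 figure quoted earlier. The class AggregateCellCountSketch is hypothetical, and it assumes the start and end fall on different days.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for the arithmetic above; not part of Mondrian.
class AggregateCellCountSketch {

    /** Minute-level cells in the inclusive range [start, end]. */
    static long minuteCells(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.MINUTES.between(start, end) + 1;
    }

    /** Cells fetched when the whole days in the middle are read at day level. */
    static long choppedCells(LocalDateTime start, LocalDateTime end) {
        LocalDate firstWholeDay = start.toLocalDate().plusDays(1);
        LocalDate lastWholeDay = end.toLocalDate().minusDays(1);
        // Remainder of the first day, at minute level.
        long headMinutes = minuteCells(start, start.toLocalDate().atTime(23, 59));
        // Whole days in between, one cell each (zero if the days are adjacent).
        long wholeDays = ChronoUnit.DAYS.between(firstWholeDay, lastWholeDay) + 1;
        // Beginning of the last day, at minute level.
        long tailMinutes = minuteCells(end.toLocalDate().atStartOfDay(), end);
        return headMinutes + wholeDays + tailMinutes;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2004, 10, 1, 12, 10);
        LocalDateTime end = LocalDateTime.of(2004, 10, 20, 12, 10);
        System.out.println("minute-level cells: " + minuteCells(start, end));
        System.out.println("chopped-up cells:   " + choppedCells(start, end));
    }
}
```

For the interval in the question this gives 27,361 minute-level cells against 1,459 cells for the chopped-up query (710 minutes on the first day, 18 whole days, and 731 minutes on the last day).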

Q. I saw the Perforce files, but I couldn't find where to register and get a new user account, or the instructions that you mentioned above.

A. The project administrators (Julian) register you. I would suggest that you start with guest level access and let's see if you need update access later.

Q. Do you have some model for development environment (e.g. eclipse 3.0 + ant 1.6 + jboss x.x + .....)?

A. Using Eclipse for Mondrian development works fine. There is an Eclipse Perforce plug-in, too, but you can use the Perforce client outside of Eclipse. Some people use IntelliJ (which is free for open-source use).

As a test web-server, most people use Tomcat 5.0.

Q. Is all the updated documentation in the Perforce server? How could I get more materials, howtos, etc. to reduce my learning curve?

A. As with any open source project, the documentation is the web site (which is source-controlled in Perforce too), the forums and mailing lists, the test suite and the code.

Q. How can I join the Mondrian SourceForge project?

A. Sign up as a SourceForge user and subscribe to the Mondrian mailing lists and forums. Also, there are a lot of Mondrian related questions from the JPivot project - I suggest you subscribe to JPivot too.

Author: Julian Hyde; last modified August 2006.
Version: $Id: //open/mondrian-release/3.0/doc/faq.html#2 $
Copyright (C) 2002-2007 Julian Hyde