Luke - Lucene Index Toolbox

Lucene is an Open Source, mature and high-performance Java search engine. It is highly flexible, and scalable from hundreds to millions of documents.

Luke is a handy development and diagnostic tool that opens existing Lucene indexes and lets you display and modify their content in several ways.

Recent versions of Luke are also extensible through plugins and scripting.

I started this project because I needed a tool like this. I decided to distribute it under an Open Source license to express my gratitude to the Lucene team for creating such a high-quality product. Lucene is one of the landmark proofs that the Open Source paradigm can result in high-quality and free products.

Java WebStart version

Java Web Start version: launch Luke now. (NOTE: requires Java 1.5 or higher)

NOTE: use this link if you want to make sure you use the right version of Java VM and Java WebStart.

NOTE 2: some functions (like plugins, additional analyzers, and parts of the scripting framework) are known to be broken when started from Java WebStart.

Download - source and binary

Current version is 0.9.1, released on 22 Nov 2008.

It uses the official Lucene 2.4.0 release JARs.

NOTE: Luke now requires Java 1.5 or higher.

You can download the binary JARs here:
You can download the source code ZIP (2MB): luke-src-0.9.zip
You can download the source code TGZ (2MB): luke-src-0.9.tgz

Changes in v. 0.9.1 (released on 2008.11.22):

This is mostly a bug fix release of 0.9.

Changes in v. 0.9 (released on 2008.11.15):

This release adds many functionality enhancements and advanced features available in Lucene 2.4.

Changes in v. 0.8.1 (released on 2008.02.13):

This release adds some functionality enhancements related to TermVectors and Payloads.

Changes in v. 0.8 (released on 2008.02.04):

This release upgrades to the official Lucene 2.3.0 release JARs.

NOTE: this version of Luke requires Java 1.5 or higher.

The following changes have been made in this release:

Changes in v. 0.7.1 (released on 2007.06.20):

This minor release is mostly an upgrade to the official Lucene 2.2.0 release JARs.

The following changes have been made in this release:


Changes in v. 0.7 (released on 2007.02.20):

This release uses the official Lucene 2.1.0 release JARs.

The following changes have been made in this release:

The following people contributed patches and suggestions, and generally kept prodding and poking me to produce this release: Volodymyr Bychkoviak, Juan Manuel Caicedo, Mark Harwood, Otis Gospodnetic, Benson Margulies, Jean-Philippe Robichaud, and many, many others. Thank you for your support!


Changes in v. 0.6:

The most important addition is the scripting framework based on the Mozilla Rhino JavaScript engine. Additional plugins and functions were added, as follows:

I would like to thank the following people for their comments, suggestions, bug-fixes and patches (in no particular order): Daniel Naber, Erik Hatcher, Grant Ingersoll, Ryan Cox, Terry Steichen, Lubos Pochman, Michael Franken, Luke Shannon, Todd VanderVeen, and others. Thank you!

Changes in v. 0.5:

This release introduces many changes and new, unique features:

Please note that, as a result of the package name changes, the main class is now org.getopt.luke.Luke, and NOT luke.Luke as before.

I felt that all these changes merited a slight change in name, from "Lucene Index Browser" to "Lucene Index Toolbox", as this seems to better reflect the current functionality of the tool.


Changes in v. 0.45:


Changes in v. 0.4:

I'll update the screenshots in a few days ...


Changes in v. 0.3:


Changes in v. 0.2:

License

Luke is covered by the Apache Software License, which means that it's free for any use, including commercial use. It comes with full source code included (see section above). Notice, however, that the Thinlet library is covered by the GNU Lesser (formerly Library) General Public License, which puts different restrictions on that portion of the program.

If you feel inclined, I would appreciate a short email note if you find this program useful, or if you want to redistribute it in a software collection. Although it's not required by the license, it gives me some idea of how people use it, and what features are most useful to them...

Bug reports

Hopefully, there will be none! :-) Ok, let's be realistic... if you notice a bug, or if you come up with a useful feature request, or even better - with patches that implement new functionality - please contact the author (Andrzej Bialecki, ab at getopt dot org). Thank you in advance for your comments and contributions!

Screenshots

That's what tiggers love the most...

The following screenshot presents the overview screen, just after you open an index.




The screenshot below shows the document panel, where you can browse through documents sequentially, or select groups of documents by the terms they contain.
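To make "selecting documents by the terms they contain" concrete, here is a minimal sketch of the idea using a toy in-memory term index. It uses only the Java standard library and is not Luke's or Lucene's code; the class name TermLookupSketch, its methods, and the sample documents are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: a hypothetical in-memory term index, not Lucene's index format.
public class TermLookupSketch {

    /** Maps each lowercased, whitespace-separated term to the ids of the documents containing it. */
    static Map<String, List<Integer>> buildIndex(String[] docs) {
        Map<String, List<Integer>> index = new TreeMap<String, List<Integer>>();
        for (int docId = 0; docId < docs.length; docId++) {
            for (String term : docs[docId].toLowerCase(Locale.ENGLISH).split("\\s+")) {
                if (term.length() == 0) {
                    continue; // skip the empty token produced by leading whitespace
                }
                List<Integer> postings = index.get(term);
                if (postings == null) {
                    postings = new ArrayList<Integer>();
                    index.put(term, postings);
                }
                // Record each document at most once per term.
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    /** Returns the ids of documents containing the term, or an empty list if there are none. */
    static List<Integer> docsWithTerm(Map<String, List<Integer>> index, String term) {
        List<Integer> postings = index.get(term.toLowerCase(Locale.ENGLISH));
        return postings == null ? Collections.<Integer>emptyList() : postings;
    }

    public static void main(String[] args) {
        String[] docs = {          // made-up sample documents
            "Lucene in action",
            "Luke opens a Lucene index",
            "index toolbox"
        };
        Map<String, List<Integer>> index = buildIndex(docs);
        System.out.println(docsWithTerm(index, "lucene")); // prints [0, 1]
        System.out.println(docsWithTerm(index, "index"));  // prints [1, 2]
    }
}
```

Sequential browsing corresponds to stepping through document ids 0, 1, 2 in order; term-based selection jumps straight to the ids listed for a term.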



The next screenshot shows the Search panel, where you can enter search expressions in the standard Lucene QueryParser syntax. Notice, however, that you can select the analyzer used to parse the query: either one of the predefined ones, or your own class on the classpath. You can also select the default field (this field is used when there is no specific field qualifier in your search expression).
You can also see in the "Parsed query view" area how the choice of analyzer affects the final query. In this case, please note how the phrase "more and more" has changed.
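The effect of the analyzer on a phrase like "more and more" can be imitated with a small standard-library sketch. This is a simplified stand-in and not Lucene's analyzers or QueryParser: the class AnalyzerSketch, the tiny stop-word list, and the default field name "contents" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified stand-in for analyzer behaviour; it does not call Lucene.
public class AnalyzerSketch {

    // Hypothetical stop list for illustration; not Lucene's actual list.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the"));

    /** Lowercases the text and splits it on whitespace, keeping every token. */
    static List<String> simpleAnalyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ENGLISH).trim().split("\\s+")) {
            if (t.length() > 0) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Same as simpleAnalyze, but additionally drops stop words. */
    static List<String> stopAnalyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : simpleAnalyze(text)) {
            if (!STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Renders the tokens as a phrase query on the given field, e.g. field:"a b". */
    static String toPhraseQuery(String field, List<String> tokens) {
        StringBuilder sb = new StringBuilder(field).append(":\"");
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(tokens.get(i));
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        String phrase = "more and more";
        // "contents" is an assumed default field name.
        System.out.println(toPhraseQuery("contents", simpleAnalyze(phrase))); // contents:"more and more"
        System.out.println(toPhraseQuery("contents", stopAnalyze(phrase)));   // contents:"more more"
    }
}
```

The same input yields two different parsed queries depending on which analysis is applied, which is the kind of difference the "Parsed query view" makes visible.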



The screenshot below shows a dialog containing the explanation for a hit. The Explanation tree shows how various term matches and normalizations resulted in the final document score for the current query.
Please note how the fuzzy query expanded the term "book" into "books" (and, not visible here, "bookstore", "bookstores", etc...), adjusting the weight of this hit.
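Lucene's fuzzy queries measure term similarity with Levenshtein (edit) distance. The sketch below computes only that plain distance between "book" and a few candidate terms; it does not reproduce the similarity cutoff or the score weighting that Lucene applies, and the class name EditDistanceSketch is hypothetical.

```java
// Plain Levenshtein distance; Lucene's own cutoff and scoring are not modelled here.
public class EditDistanceSketch {

    /** Returns the minimum number of single-character inserts, deletes, or substitutions turning a into b. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String[] candidates = {"books", "bookstore", "bookstores"};
        for (String candidate : candidates) {
            System.out.println("book -> " + candidate + ": " + distance("book", candidate));
        }
        // prints:
        // book -> books: 1
        // book -> bookstore: 5
        // book -> bookstores: 6
    }
}
```

A smaller distance means a closer term, which is why a near match such as "books" is a natural expansion of "book".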



Last modified: Nov 14, 2008
http://www.getopt.org/luke/