The Zettair Search Engine -- Getting Started

Here's an example of how easy it is to use Zettair. The steps below install it, index a small collection of HTML documents, and search that collection:
  1. Download Zettair as a zip file.
  2. Change into the directory where you've saved Zettair and unzip it:
    [hugh@hugh hugh]$ cd ~
    [hugh@hugh hugh]$ unzip zettair-0.6.0.zip
    
  3. Download this zipped collection, html.zip (40,872 bytes), of HTML documents, which are part of the HTML 4.01 standard (http://www.w3.org/TR/html4/).
  4. Change into the directory where you've saved the collection and unzip it:
    [hugh@hugh hugh]$ cd ~
    [hugh@hugh hugh]$ unzip html.zip
    Archive:  html.zip
      inflating: collection/about.html
      inflating: collection/charset.html
      inflating: collection/conform.html
    ...
    
  5. Make and install the Zettair software:
    [hugh@hugh hugh]$ cd zettair-0.6.0
    [hugh@hugh zettair-0.6.0]$ ./configure --prefix=$HOME/local/zettair-0.6.0
    [hugh@hugh zettair-0.6.0]$ make
    [hugh@hugh zettair-0.6.0]$ make install
    
  6. Build an index on the files in the collection:
    [hugh@hugh zettair-0.6.0]$ mkdir ~/index
    [hugh@hugh zettair-0.6.0]$ cd ~/index
    [hugh@hugh index]$ find ~/collection/* | ~/local/zettair-0.6.0/bin/zet -i -t HTML
    zettair version 0.6.0
    created new index 'index'
    sources (type html): /home/hugh/collection/about.html /home/hugh/collection/charset.html /home/hugh/collection/conform.html /home/hugh/collection/cover.html /home/hugh/collection/references.html /home/hugh/collection/types.html 
    parsing /home/hugh/collection/about.html...
    parsing /home/hugh/collection/charset.html...
    parsing /home/hugh/collection/conform.html...
    parsing /home/hugh/collection/cover.html...
    parsing /home/hugh/collection/references.html...
    parsing /home/hugh/collection/types.html...
    merging...
    
    summary: 6 documents, 2049 distinct index terms, 0 10541 terms
    
    A Unix note: the command find ~/collection/* lists every file in the directory ~/collection, and that list is piped as input to the Zettair index construction process. The result is that Zettair indexes all files in the directory. The following command does the same thing by naming each file explicitly:
    /zet -i -c ../config/parser_settings.html -t HTML /home/hugh/collection/about.html /home/hugh/collection/charset.html /home/hugh/collection/conform.html /home/hugh/collection/cover.html /home/hugh/collection/references.html /home/hugh/collection/types.html
    
  7. Search the collection:
    [hugh@hugh index]$ ~/local/zettair-0.6.0/bin/zet
    > Tim Berners-Lee
    1. file:///home/hugh/collection/about.html (score 2.455709, docid 0)
    2. file:///home/hugh/collection/references.html (score 1.087303, docid 4)
    
    2 results of 2 shown (took 0.001164 seconds)
    > tags
    1. file:///home/hugh/collection/conform.html (score 0.952401, docid 2)
    2. file:///home/hugh/collection/references.html (score 0.664334, docid 4)
    
    2 results of 2 shown (took 0.000962 seconds)
    
  8. Enjoy!
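
A scripting note: the index construction in step 6 can be wrapped in a small shell function so that it is easy to repeat for other collections. The following is only a sketch under stated assumptions, not part of Zettair itself. The names ZET, list_html_files and build_index are made up for this example, the default path assumes the install prefix from step 5, and restricting the file list to regular files ending in .html is a choice made here (the find command in step 6 passes every entry in the directory). The only zet options used are the ones shown in step 6.

```shell
#!/bin/sh
# Sketch only: wraps the "find | zet -i -t HTML" pipeline from step 6.
# ZET is a hypothetical variable; its default assumes the prefix from step 5.
ZET="${ZET:-$HOME/local/zettair-0.6.0/bin/zet}"

# list_html_files DIR
# Print every regular file under DIR whose name ends in .html, one per
# line, in sorted order. Unlike "find DIR/*", directories are not listed.
list_html_files() {
    find "$1" -type f -name '*.html' | sort
}

# build_index COLLECTION_DIR INDEX_DIR
# Create INDEX_DIR if needed, then run zet inside it with the list of
# collection files on standard input, as in step 6.
build_index() {
    # Resolve the collection to an absolute path, because zet is run
    # from a different directory than the caller's.
    collection=$(cd "$1" && pwd) || return 1
    index_dir=$2
    mkdir -p "$index_dir" || return 1
    list_html_files "$collection" | (cd "$index_dir" && "$ZET" -i -t HTML)
}

# Example call, equivalent to step 6 for the collection unpacked in step 4:
#   build_index ~/collection ~/index
```

Keeping the path to zet in a variable means the function can be tried out with a different install prefix, or with a stand-in program, without editing it.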

    Page created by Hugh Williams, 30 July 2003.
    Updated by William Webber for Zettair 0.6.0, 21 July 2004.