org.apache.nutch.parse
Interface Parser

All Known Implementing Classes:
HtmlParser, JSParseFilter, MSWordParser, PdfParser, TextParser

public interface Parser

A parser for content generated by a Protocol implementation. This interface is implemented by extensions; Nutch's core contains no page-parsing code.


Field Summary
static String X_POINT_ID
          The name of the extension point.
 
Method Summary
 Parse getParse(Content c)
          Parses the given content and returns the resulting Parse.
 

Field Detail

X_POINT_ID

public static final String X_POINT_ID
The name of the extension point.

Method Detail

getParse

public Parse getParse(Content c)
Parses the given content and returns the resulting Parse.
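
The following is a minimal sketch of what implementing getParse might look like, not Nutch's own code. So that it compiles on its own, it declares simplified stand-ins for Content, Parse and Parser instead of importing the real Nutch classes; ParserSketch, PlainTextParser and the fields and accessors of the stand-in types are hypothetical names chosen for illustration. A real implementation would presumably be packaged as an extension and registered under the extension point named by X_POINT_ID, which is not shown here.

```java
import java.nio.charset.StandardCharsets;

// Self-contained sketch; every type below is an illustrative stand-in, not the Nutch API.
class ParserSketch {

    // Hypothetical stand-in for the Content produced by a Protocol implementation.
    static final class Content {
        private final String url;
        private final String contentType;
        private final byte[] bytes;

        Content(String url, String contentType, byte[] bytes) {
            this.url = url;
            this.contentType = contentType;
            this.bytes = bytes;
        }

        String getUrl() { return url; }
        String getContentType() { return contentType; }
        byte[] getBytes() { return bytes; }
    }

    // Hypothetical stand-in for Parse: exposes only the extracted text.
    interface Parse {
        String getText();
    }

    // Mirrors the single method documented on this page.
    interface Parser {
        Parse getParse(Content c);
    }

    // Toy parser: decodes the raw bytes as UTF-8 (an assumption) and trims whitespace.
    static final class PlainTextParser implements Parser {
        @Override
        public Parse getParse(Content c) {
            final String text = new String(c.getBytes(), StandardCharsets.UTF_8).trim();
            return () -> text;
        }
    }

    public static void main(String[] args) {
        Content content = new Content(
                "http://example.com/a.txt",
                "text/plain",
                "  hello nutch \n".getBytes(StandardCharsets.UTF_8));

        Parser parser = new PlainTextParser();
        Parse parse = parser.getParse(content);

        System.out.println(parse.getText()); // prints "hello nutch"
    }
}
```

The caller depends only on the Parser interface, which is the point of the design described above: the core hands over a Content and receives a Parse, while the format-specific work lives entirely in the implementing class.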



Copyright © 2006 The Apache Software Foundation