org.apache.nutch.parse
Interface Parser
- All Known Implementing Classes:
- HtmlParser, JSParseFilter, MSWordParser, PdfParser, TextParser
- public interface Parser
A parser for content generated by a Protocol implementation.
Extensions implement this interface; Nutch's core contains no
page-parsing code.
X_POINT_ID
public static final String X_POINT_ID
- The name of the extension point.
getParse
public Parse getParse(Content c)
- Creates the Parse for the given content c.
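
To illustrate the shape of an implementation, here is a minimal sketch of a plain-text parser. It does not use the real Nutch classes: `Content`, `Parse`, and `Parser` below are simplified, hypothetical stand-ins declared locally so the example compiles on its own, and `PlainTextParser`, `getBytes()`, `getText()`, and the UTF-8 assumption are all invented for illustration. Only the `getParse(Content c)` signature is taken from this page.

```java
import java.nio.charset.StandardCharsets;

public class ParserSketch {

    // Hypothetical stand-in for the Content produced by a Protocol:
    // here just a URL, a content type, and the raw bytes.
    static final class Content {
        private final String url;
        private final String contentType;
        private final byte[] bytes;

        Content(String url, String contentType, byte[] bytes) {
            this.url = url;
            this.contentType = contentType;
            this.bytes = bytes;
        }

        String getUrl() { return url; }
        String getContentType() { return contentType; }
        byte[] getBytes() { return bytes; }
    }

    // Hypothetical stand-in for Parse: here only the extracted text.
    static final class Parse {
        private final String text;

        Parse(String text) { this.text = text; }

        String getText() { return text; }
    }

    // Local interface mirroring the documented method signature.
    interface Parser {
        Parse getParse(Content c);
    }

    // Example implementation: decodes the bytes as UTF-8 (an assumption;
    // a real parser would consult the content's declared encoding)
    // and trims surrounding whitespace.
    static final class PlainTextParser implements Parser {
        @Override
        public Parse getParse(Content c) {
            String text = new String(c.getBytes(), StandardCharsets.UTF_8);
            return new Parse(text.trim());
        }
    }

    public static void main(String[] args) {
        Content content = new Content(
                "http://example.com/",
                "text/plain",
                "  hello nutch \n".getBytes(StandardCharsets.UTF_8));

        Parser parser = new PlainTextParser();
        System.out.println(parser.getParse(content).getText());
    }
}
```

In a real plugin the implementation would be registered against the extension point named by X_POINT_ID; that wiring is omitted here.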
Copyright © 2006 The Apache Software Foundation