Package org.apache.nutch.parse

Interface Summary
HtmlParseFilter Extension point for DOM-based HTML parsers.
Parse The result of parsing a page's raw content.
Parser A parser for content generated by a Protocol implementation.

Class Summary
HTMLMetaTags Holds information about HTML "meta" tags extracted from a page.
HtmlParseFilters Creates and caches plugins that implement HtmlParseFilter.
Outlink  
OutlinkExtractor Extracts Outlinks (URLs) from plain text using regular expressions.
ParseData Data extracted from a page's content.
ParseImpl The result of parsing a page's raw content.
ParserChecker Parser checker, useful for testing a parser.
ParserFactory Creates and caches Parser plugins.
ParseStatus  
ParseText  
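
The OutlinkExtractor entry above describes pulling URLs out of plain text with regular expressions. The following is a minimal sketch of that general technique using only java.util.regex. It is not the Nutch implementation: the class name PlainTextLinkSketch, the method extractUrls, and the simplified URL pattern are all assumptions made for illustration, and the real class may use a different pattern and return Outlink objects rather than strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of org.apache.nutch.parse.
final class PlainTextLinkSketch {

    // Assumed, deliberately simple pattern: an http or https scheme followed
    // by a run of characters that are not whitespace, quotes, or angle brackets.
    private static final Pattern URL_PATTERN =
        Pattern.compile("https?://[^\\s\"'<>]+", Pattern.CASE_INSENSITIVE);

    // Characters that usually belong to the surrounding sentence, not the URL.
    private static final String TRAILING_PUNCTUATION = ".,;:!?)";

    // Returns the URLs found in plainText, in order of appearance.
    static List<String> extractUrls(String plainText) {
        List<String> urls = new ArrayList<>();
        if (plainText == null) {
            return urls;
        }
        Matcher matcher = URL_PATTERN.matcher(plainText);
        while (matcher.find()) {
            String url = matcher.group();
            // Trim trailing sentence punctuation from the match.
            int end = url.length();
            while (end > 0 && TRAILING_PUNCTUATION.indexOf(url.charAt(end - 1)) >= 0) {
                end--;
            }
            url = url.substring(0, end);
            // Keep the match only if something follows the "://" separator.
            if (url.indexOf("://") + 3 < url.length()) {
                urls.add(url);
            }
        }
        return urls;
    }
}
```

Calling PlainTextLinkSketch.extractUrls on a sentence that mentions two links returns both as strings, with any trailing comma or period removed. A crawler would normally go on to normalize and filter such candidates before treating them as outlinks.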
 

Exception Summary
ParseException  
ParserNotFound



Copyright © 2006 The Apache Software Foundation