Package org.apache.nutch.net

A URL filter plugin.


Interface Summary
URLFilter Interface used to limit which URLs enter Nutch.
UrlNormalizer Interface used to convert URLs to normal form and, optionally, to apply regex substitutions.
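
The two roles summarized above can be sketched as follows. This is a minimal illustration, not the Nutch API: SimpleUrlFilter and SimpleUrlNormalizer are hypothetical stand-in interfaces defined locally, and the convention that a filter returns the URL to accept it and null to reject it is an assumption made for the example. The normalizer shown only lowercases the scheme and host and adds a missing path.

```java
import java.net.URI;
import java.util.Locale;

class FilterSketch {
    /** Hypothetical stand-in for a URL filter: returns the URL to keep it, or null to reject it. */
    interface SimpleUrlFilter {
        String filter(String url);
    }

    /** Hypothetical stand-in for a URL normalizer: returns the URL in a canonical form. */
    interface SimpleUrlNormalizer {
        String normalize(String url);
    }

    /** Accepts only http and https URLs (an illustrative rule, not a Nutch default). */
    static final SimpleUrlFilter HTTP_ONLY =
        url -> (url.startsWith("http://") || url.startsWith("https://")) ? url : null;

    /** Lowercases scheme and host and supplies "/" for an empty path. Assumes a URL with a host. */
    static final SimpleUrlNormalizer LOWERCASE_HOST = url -> {
        URI u = URI.create(url);
        String scheme = u.getScheme().toLowerCase(Locale.ROOT);
        String host = u.getHost().toLowerCase(Locale.ROOT);
        String port = u.getPort() == -1 ? "" : ":" + u.getPort();
        String path = (u.getRawPath() == null || u.getRawPath().isEmpty()) ? "/" : u.getRawPath();
        String query = u.getRawQuery() == null ? "" : "?" + u.getRawQuery();
        return scheme + "://" + host + port + path + query;
    };

    public static void main(String[] args) {
        // Normalize first, then filter, so the filter sees the canonical form.
        String normalized = LOWERCASE_HOST.normalize("HTTP://Example.COM/a?b=1");
        System.out.println(HTTP_ONLY.filter(normalized));                // http://example.com/a?b=1
        System.out.println(HTTP_ONLY.filter("ftp://example.org/file"));  // null
    }
}
```

Keeping the two concerns in separate interfaces lets a crawler normalize a URL once and then pass the result through any number of filters.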
 

Class Summary
BasicUrlNormalizer Converts URLs to a normal form.
PrefixURLFilter Filters URLs based on a file of URL prefixes.
RegexURLFilter Filters URLs based on a file of regular expressions.
RegexUrlNormalizer Applies user-defined regex substitutions to every URL encountered, which is useful for stripping session IDs from URLs.
URLFilterChecker Checks one given filter or all filters.
URLFilters Creates and caches plugins that implement URLFilter.
UrlNormalizerFactory Factory to create a UrlNormalizer from the "urlnormalizer.class" configuration property.
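
The prefix-based filtering and regex substitution described in the class summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of PrefixURLFilter or RegexUrlNormalizer: the prefixes are held in an in-memory list rather than read from a file, the class and method names are hypothetical, the null-means-rejected convention is assumed for the example, and the ";jsessionid=" pattern is just one plausible session-ID rule.

```java
import java.util.List;
import java.util.regex.Pattern;

class PrefixAndRegexSketch {
    /** Returns the URL if it starts with any allowed prefix, otherwise null (assumed convention). */
    static String filterByPrefix(List<String> prefixes, String url) {
        for (String prefix : prefixes) {
            if (url.startsWith(prefix)) {
                return url;
            }
        }
        return null;
    }

    /** Hypothetical rule: matches a ";jsessionid=..." path parameter up to the query or fragment. */
    static final Pattern SESSION_ID =
        Pattern.compile(";jsessionid=[^?#]*", Pattern.CASE_INSENSITIVE);

    /** Removes the session ID so that otherwise identical URLs compare equal. */
    static String stripSessionId(String url) {
        return SESSION_ID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        List<String> prefixes = List.of("http://example.org/docs/");
        System.out.println(filterByPrefix(prefixes, "http://example.org/docs/a.html"));
        System.out.println(filterByPrefix(prefixes, "http://example.org/blog/"));
        System.out.println(stripSessionId("http://example.org/a;jsessionid=ABC123?x=1"));
    }
}
```

Stripping session IDs before filtering and deduplication keeps a crawler from treating one page as many distinct URLs.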
 

Exception Summary
URLFilterException  
 

Package org.apache.nutch.net Description

A URL filter plugin.



Copyright © 2006 The Apache Software Foundation