Interface Summary | |
URLFilter | Interface used to limit which URLs enter Nutch. |
UrlNormalizer | Interface used to convert URLs to normal form and optionally do regex substitutions. |
Class Summary | |
BasicUrlNormalizer | Converts URLs to a normal form. |
PrefixURLFilter | Filters URLs based on a file of URL prefixes. |
RegexURLFilter | Filters URLs based on a file of regular expressions. |
RegexUrlNormalizer | Allows users to do regex substitutions on all/any URLs that are encountered, which is useful for stripping session IDs from URLs. |
URLFilterChecker | Checks one given filter or all filters. |
URLFilters | Creates and caches URLFilter implementing plugins. |
UrlNormalizerFactory | Factory to create a UrlNormalizer from "urlnormalizer.class" config property. |
Exception Summary | |
URLFilterException |
A URL filter plugin.
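To make the two ideas summarized above more concrete (regex-based URL filtering and regex substitution to strip session IDs), here is a minimal, self-contained sketch. It does not use the Nutch classes or their real signatures: the class `UrlNetSketch`, the `Rule` type, the helper methods, and the convention of returning `null` for a rejected URL are all assumptions made for illustration, as are the example patterns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the Nutch API.
class UrlNetSketch {

    // One filter rule: a regex plus whether a match accepts or rejects the URL.
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    // Convenience factory for building rules from a regex string.
    static Rule rule(boolean accept, String regex) {
        return new Rule(accept, Pattern.compile(regex));
    }

    // Assumed convention: the first matching rule decides; a URL that
    // matches no rule is rejected. Rejection is signalled by null.
    static String filter(String url, List<Rule> rules) {
        for (Rule r : rules) {
            if (r.pattern.matcher(url).find()) {
                return r.accept ? url : null;
            }
        }
        return null;
    }

    // Assumed session-ID shape: ";jsessionid=..." up to the query or fragment.
    private static final Pattern SESSION_ID =
        Pattern.compile(";jsessionid=[^?#]*", Pattern.CASE_INSENSITIVE);

    // Regex substitution that removes the session ID from a URL.
    static String stripSessionId(String url) {
        return SESSION_ID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
            rule(false, "\\.(gif|jpg|png)$"), // reject common image suffixes
            rule(true, "^https?://")          // accept remaining http(s) URLs
        );
        String raw = "http://example.com/page;jsessionid=ABC123?x=1";
        String normalized = stripSessionId(raw);
        System.out.println(normalized);
        System.out.println(filter(normalized, rules));
        System.out.println(filter("http://example.com/logo.gif", rules));
    }
}
```

Normalizing before filtering, as in `main`, means the filter rules see URLs without session IDs; whether that ordering matches a given crawler setup is a design choice rather than something this page specifies.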