| Interface Summary | |
| URLFilter | Interface used to limit which URLs enter Nutch. |
| UrlNormalizer | Interface used to convert URLs to normal form and, optionally, to apply regex substitutions. |
| Class Summary | |
| BasicUrlNormalizer | Converts URLs to a normal form. |
| PrefixURLFilter | Filters URLs based on a file of URL prefixes. |
| RegexURLFilter | Filters URLs based on a file of regular expressions. |
| RegexUrlNormalizer | Allows users to do regex substitutions on all/any URLs that are encountered, which is useful for stripping session IDs from URLs. |
| URLFilterChecker | Checks one given filter or all filters. |
| URLFilters | Creates and caches plugins that implement URLFilter. |
| UrlNormalizerFactory | Factory to create a UrlNormalizer from "urlnormalizer.class" config property. |
| Exception Summary | |
| URLFilterException | |
A URL filter plugin.
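
To illustrate the two ideas summarized above (filtering URLs by prefix and stripping session IDs by regex substitution), here is a minimal, self-contained sketch. It does not use the Nutch classes themselves: `SimpleUrlFilter`, `prefixFilter`, `stripSessionId`, and the `jsessionid` pattern are all hypothetical names chosen for illustration, and the convention of returning `null` for a rejected URL is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class UrlFilterSketch {

    /** Hypothetical stand-in for a URL filter: returns the URL if accepted, null if rejected. */
    interface SimpleUrlFilter {
        String filter(String url);
    }

    /** Builds a filter that accepts a URL only if it starts with one of the given prefixes. */
    static SimpleUrlFilter prefixFilter(List<String> prefixes) {
        return url -> {
            for (String prefix : prefixes) {
                if (url.startsWith(prefix)) {
                    return url;
                }
            }
            return null; // no prefix matched: reject
        };
    }

    // Illustrative pattern only: matches ";jsessionid=..." up to the query string or fragment.
    private static final Pattern SESSION_ID =
            Pattern.compile(";jsessionid=[^?#]*", Pattern.CASE_INSENSITIVE);

    /** Removes a session ID path parameter from the URL via regex substitution. */
    static String stripSessionId(String url) {
        return SESSION_ID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        SimpleUrlFilter filter = prefixFilter(Arrays.asList("http://example.org/"));
        String normalized = stripSessionId("http://example.org/a;jsessionid=ABC123?x=1");
        System.out.println(filter.filter(normalized));
        System.out.println(filter.filter("http://other.test/"));
    }
}
```

In this sketch, normalization runs before filtering so that the filter sees the cleaned URL; whether a real crawl pipeline orders the two steps this way depends on its configuration.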