| Packages that use Content | |
| org.apache.nutch.analysis.lang | Text document language identifier. |
| org.apache.nutch.parse | |
| org.apache.nutch.parse.html | An HTML document parsing plugin. |
| org.apache.nutch.parse.js | |
| org.apache.nutch.parse.msword | A Word document parsing plugin. |
| org.apache.nutch.parse.pdf | A PDF parsing plugin. |
| org.apache.nutch.parse.text | A plain text parsing plugin. |
| org.apache.nutch.protocol | |
| org.apache.nutch.protocol.file | Protocol plugin which supports retrieving local file resources. |
| org.apache.nutch.protocol.ftp | Protocol plugin which supports retrieving documents via the FTP protocol. |
| org.apache.nutch.protocol.http | Protocol plugin which supports retrieving documents via the HTTP protocol. |
| org.apache.nutch.protocol.httpclient | Protocol plugin which supports retrieving documents via the HTTP protocol. |
| org.apache.nutch.segment | |
| org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
| Uses of Content in org.apache.nutch.analysis.lang |
| Methods in org.apache.nutch.analysis.lang with parameters of type Content | |
Parse |
HTMLLanguageParser.filter(Content content,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scans the HTML document for possible indications of the content language. |
| Uses of Content in org.apache.nutch.parse |
| Methods in org.apache.nutch.parse with parameters of type Content | |
Parse |
HtmlParseFilter.filter(Content content,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page. |
Parse |
Parser.getParse(Content c)
Creates the parse for some content. |
static Parse |
HtmlParseFilters.filter(Content content,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Run all defined filters. |
| Uses of Content in org.apache.nutch.parse.html |
| Methods in org.apache.nutch.parse.html with parameters of type Content | |
Parse |
HtmlParser.getParse(Content content)
|
| Uses of Content in org.apache.nutch.parse.js |
| Methods in org.apache.nutch.parse.js with parameters of type Content | |
Parse |
JSParseFilter.filter(Content content,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Parse |
JSParseFilter.getParse(Content c)
|
| Uses of Content in org.apache.nutch.parse.msword |
| Methods in org.apache.nutch.parse.msword with parameters of type Content | |
Parse |
MSWordParser.getParse(Content content)
|
| Uses of Content in org.apache.nutch.parse.pdf |
| Methods in org.apache.nutch.parse.pdf with parameters of type Content | |
Parse |
PdfParser.getParse(Content content)
|
| Uses of Content in org.apache.nutch.parse.text |
| Methods in org.apache.nutch.parse.text with parameters of type Content | |
Parse |
TextParser.getParse(Content content)
|
| Uses of Content in org.apache.nutch.protocol |
| Methods in org.apache.nutch.protocol that return Content | |
static Content |
Content.read(DataInput in)
|
Content |
ProtocolOutput.getContent()
|
| Methods in org.apache.nutch.protocol with parameters of type Content | |
void |
ProtocolOutput.setContent(Content content)
|
| Constructors in org.apache.nutch.protocol with parameters of type Content | |
ProtocolOutput(Content content,
ProtocolStatus status)
|
|
ProtocolOutput(Content content)
|
|
| Uses of Content in org.apache.nutch.protocol.file |
| Methods in org.apache.nutch.protocol.file that return Content | |
Content |
FileResponse.toContent()
|
| Uses of Content in org.apache.nutch.protocol.ftp |
| Methods in org.apache.nutch.protocol.ftp that return Content | |
Content |
FtpResponse.toContent()
|
| Uses of Content in org.apache.nutch.protocol.http |
| Methods in org.apache.nutch.protocol.http that return Content | |
Content |
HttpResponse.toContent()
|
| Uses of Content in org.apache.nutch.protocol.httpclient |
| Methods in org.apache.nutch.protocol.httpclient that return Content | |
Content |
HttpResponse.toContent()
|
| Uses of Content in org.apache.nutch.segment |
| Methods in org.apache.nutch.segment with parameters of type Content | |
boolean |
SegmentReader.get(long n,
FetcherOutput fo,
Content co,
ParseText pt,
ParseData pd)
Get a specified entry from the segment. |
boolean |
SegmentReader.next(FetcherOutput fo,
Content co,
ParseText pt,
ParseData pd)
Read values from all open readers. |
void |
SegmentWriter.append(FetcherOutput fo,
Content co,
ParseText pt,
ParseData pd)
Append new values to the output segment. |
| Uses of Content in org.creativecommons.nutch |
| Methods in org.creativecommons.nutch with parameters of type Content | |
Parse |
CCParseFilter.filter(Content content,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page. |