java.lang.Object
  org.apache.nutch.analysis.lang.LanguageIdentifier
Identifies the language of a piece of content, based on statistical analysis.
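This page does not say which statistical method the class uses. As an illustration only, the sketch below shows one common approach, character trigram profiles compared by cosine similarity. The class `NGramSketch`, its methods, and the tiny sample texts are all hypothetical and are not part of Nutch; a real identifier would rely on profiles trained on far larger corpora.

```java
import java.util.HashMap;
import java.util.Map;

public class NGramSketch {

    // Counts overlapping character trigrams in the lower-cased text.
    static Map<String, Integer> profile(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String s = text.toLowerCase();
        for (int i = 0; i + 3 <= s.length(); i++) {
            counts.merge(s.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    // Cosine similarity between two trigram count profiles (0 when either is empty).
    static double similarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += (double) e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += (double) v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Returns the label of the sample text whose profile is closest to the content.
    // The samples map (label -> reference text) is an assumption of this sketch.
    static String identify(String content, Map<String, String> samples) {
        Map<String, Integer> contentProfile = profile(content);
        String best = null;
        double bestScore = -1;
        for (Map.Entry<String, String> e : samples.entrySet()) {
            double score = similarity(contentProfile, profile(e.getValue()));
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> samples = new HashMap<>();
        samples.put("en", "the quick brown fox jumps over the lazy dog and the cat");
        samples.put("fr", "le renard brun saute par dessus le chien paresseux et le chat");
        System.out.println(identify("the dog and the fox", samples));
        System.out.println(identify("le chien et le chat", samples));
    }
}
```

Cosine similarity keeps the comparison independent of text length, which matters when a short query is scored against longer reference texts.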
| Method Summary | |
| static LanguageIdentifier | getInstance() - Get a LanguageIdentifier instance. |
| String | identify(InputStream is) - Identify language from input stream. |
| String | identify(InputStream is, String charset) - Identify language from input stream. |
| String | identify(String content) - Identify the language of the given content. |
| String | identify(StringBuffer content) - Identify the language of the given content. |
| static void | main(String[] args) - Main method used for command line process. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
public static LanguageIdentifier getInstance()
Get a LanguageIdentifier instance.
public static void main(String[] args)
Main method used for command line process. Usage:
LanguageIdentifier [-identifyrows filename maxlines]
                   [-identifyfile charset filename]
                   [-identifyfileset charset files]
                   [-identifytext text]
                   [-identifyurl url]
args - the command-line arguments.
public String identify(String content)
Identify the language of the given content.
content - the content to analyze.
public String identify(StringBuffer content)
Identify the language of the given content.
content - the content to analyze.
public String identify(InputStream is)
                throws IOException
Identify language from input stream. To read the stream with a specific charset, use the identify(InputStream, String) method.
is - the input stream to analyze.
IOException - if an error occurs while reading the input stream.
public String identify(InputStream is,
                       String charset)
                throws IOException
Identify language from input stream.
is - the input stream to analyze.
charset - the charset to use to read the input stream.
IOException - if an error occurs while reading the input stream.
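The charset argument matters because the same bytes decode to different text under different encodings, and text decoded with the wrong charset is unlikely to be identified reliably. The sketch below uses only the JDK to show this effect. The class `CharsetReadSketch` and its helpers `readAll` and `decode` are hypothetical and are not part of Nutch; how LanguageIdentifier reads the stream internally is not described on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CharsetReadSketch {

    // Hypothetical helper: decodes the whole stream using the named charset.
    static String readAll(InputStream is, String charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(is, charset)) {
            char[] buf = new char[2048];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Convenience wrapper for in-memory bytes; rethrows I/O errors unchecked.
    static String decode(byte[] data, String charset) {
        try {
            return readAll(new ByteArrayInputStream(data), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "d\u00e9j\u00e0 vu";
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        // Decoding with the charset the bytes were written in recovers the text.
        System.out.println(decode(latin1, "ISO-8859-1").equals(original));
        // Decoding the same bytes as UTF-8 yields replacement characters instead.
        System.out.println(decode(latin1, "UTF-8").equals(original));
    }
}
```

When the encoding of the input is known, passing it explicitly through identify(InputStream, String) avoids this kind of corruption before analysis.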