org.apache.nutch.parse.msword
Class MSWordParser
java.lang.Object
org.apache.nutch.parse.msword.MSWordParser
- All Implemented Interfaces:
- Parser
- public class MSWordParser
- extends Object
- implements Parser
Parser for the MIME type application/msword.
It is based on the org.apache.poi.* packages. How well it performs remains to be evaluated.
- Author:
- John Xing.
Note on 20040614 by Xing:
Some code is collected here for convenience (see inline comments).
It may be moved to a more appropriate place once the new codebase
stabilizes, especially after the indexing code is written.
- Andy Hedges: code to extract all msword properties.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MSWordParser
public MSWordParser()
getParse
public Parse getParse(Content content)
- Description copied from interface: Parser
- Creates the parse for some content.
- Specified by:
getParse in interface Parser
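The getParse contract described above (a parser tied to one MIME type that creates a parse for some content) can be sketched as follows. This is a minimal, self-contained illustration and not the Nutch implementation: the nested types Content, Parse, and Parser are hypothetical stand-ins that only mirror the names on this page, WordLikeParser is an invented class, and the byte decoding is a placeholder for the text extraction that the real class delegates to org.apache.poi.

```java
import java.nio.charset.StandardCharsets;

public class ParserSketch {

    /** Hypothetical stand-in for fetched content: raw bytes plus a MIME type. */
    static final class Content {
        final byte[] bytes;
        final String contentType;

        Content(byte[] bytes, String contentType) {
            this.bytes = bytes;
            this.contentType = contentType;
        }
    }

    /** Hypothetical stand-in for a parse result: extracted text and a success flag. */
    static final class Parse {
        final String text;
        final boolean success;

        Parse(String text, boolean success) {
            this.text = text;
            this.success = success;
        }
    }

    /** Hypothetical stand-in for the Parser interface: one method, content in, parse out. */
    interface Parser {
        Parse getParse(Content content);
    }

    /** Invented parser that only accepts application/msword content. */
    static final class WordLikeParser implements Parser {
        static final String MIME_TYPE = "application/msword";

        @Override
        public Parse getParse(Content content) {
            // Refuse content of any other MIME type instead of guessing.
            if (!MIME_TYPE.equals(content.contentType)) {
                return new Parse("", false);
            }
            // Placeholder: a real parser would hand the bytes to a Word
            // document reader here. This sketch just decodes and trims them.
            String text = new String(content.bytes, StandardCharsets.ISO_8859_1).trim();
            return new Parse(text, true);
        }
    }

    public static void main(String[] args) {
        Parser parser = new WordLikeParser();
        byte[] raw = " hello ".getBytes(StandardCharsets.ISO_8859_1);
        Parse parse = parser.getParse(new Content(raw, "application/msword"));
        System.out.println(parse.success + " " + parse.text);
    }
}
```

Returning a failed Parse for an unexpected MIME type, rather than throwing, is a design choice of this sketch only; the page above does not say how MSWordParser reports failures.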
Copyright © 2006 The Apache Software Foundation