org.htmlparser.visitors
Class TextExtractingVisitor

java.lang.Object
  extended by org.htmlparser.visitors.NodeVisitor
      extended by org.htmlparser.visitors.TextExtractingVisitor

public class TextExtractingVisitor
extends NodeVisitor

Extracts text from a web page. Usage:

    Parser parser = new Parser(...);
    TextExtractingVisitor visitor = new TextExtractingVisitor();
    parser.visitAllNodesWith(visitor);
    String textInPage = visitor.getExtractedText();
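The usage above follows the visitor pattern: the parser walks the node tree and calls back into the visitor, which accumulates text as it goes. The following is a minimal, self-contained sketch of that idea. It does not use or reproduce the library's source: SketchNode, SketchVisitor, SketchText, SketchTag and SketchTextExtractor are hypothetical stand-ins invented for illustration, and the real TextExtractingVisitor may treat particular tags or character entities differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parsed node; not an org.htmlparser type.
interface SketchNode {
    void accept(SketchVisitor visitor);
}

// Hypothetical stand-in for NodeVisitor: every callback is a no-op by default.
abstract class SketchVisitor {
    void visitTag(String tagName) {}
    void visitStringNode(String text) {}
    void visitEndTag(String tagName) {}
}

// A leaf node holding literal text.
final class SketchText implements SketchNode {
    private final String text;

    SketchText(String text) {
        this.text = text;
    }

    @Override
    public void accept(SketchVisitor visitor) {
        visitor.visitStringNode(text);
    }
}

// A tag node that reports its start, then its children, then its end.
final class SketchTag implements SketchNode {
    private final String name;
    private final List<SketchNode> children;

    SketchTag(String name, SketchNode... children) {
        this.name = name;
        this.children = new ArrayList<>(Arrays.asList(children));
    }

    @Override
    public void accept(SketchVisitor visitor) {
        visitor.visitTag(name);
        for (SketchNode child : children) {
            child.accept(visitor);
        }
        visitor.visitEndTag(name);
    }
}

// Collects the text of every string node and ignores the tags themselves.
final class SketchTextExtractor extends SketchVisitor {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    void visitStringNode(String text) {
        buffer.append(text);
    }

    String getExtractedText() {
        return buffer.toString();
    }
}

public class TextExtractionSketch {
    public static void main(String[] args) {
        // Roughly the tree for: <p>Hello, <b>world</b>!</p>
        SketchNode page = new SketchTag("p",
                new SketchText("Hello, "),
                new SketchTag("b", new SketchText("world")),
                new SketchText("!"));

        SketchTextExtractor visitor = new SketchTextExtractor();
        page.accept(visitor);
        System.out.println(visitor.getExtractedText()); // prints "Hello, world!"
    }
}
```

Keeping the accumulated text inside the visitor is what lets a caller read the result only after the traversal has finished, which mirrors calling getExtractedText() after visitAllNodesWith(visitor) in the usage above.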


Constructor Summary
TextExtractingVisitor()
           
 
Method Summary
 java.lang.String getExtractedText()
           
 void visitEndTag(Tag tag)
           
 void visitStringNode(Text stringNode)
           
 void visitTag(Tag tag)
           
 
Methods inherited from class org.htmlparser.visitors.NodeVisitor
beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf, visitRemarkNode
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextExtractingVisitor

public TextExtractingVisitor()

Method Detail

getExtractedText

public java.lang.String getExtractedText()

visitStringNode

public void visitStringNode(Text stringNode)
Overrides:
visitStringNode in class NodeVisitor

visitTag

public void visitTag(Tag tag)
Overrides:
visitTag in class NodeVisitor

visitEndTag

public void visitEndTag(Tag tag)
Overrides:
visitEndTag in class NodeVisitor

© 2006 Derrick Oswald
April 1, 2006

HTML Parser is an open source library released under the Common Public License.