Package org.jsoup.parser
Contains the HTML parser, tag specifications, and HTML tokeniser.
Class Summary
CharacterReader: CharacterReader consumes tokens off a string.
HtmlTreeBuilder: HTML Tree Builder; creates a DOM from Tokens.
HtmlTreeBuilderState.Constants
ParseError: A Parse Error records an error in the input HTML that occurs in either the tokenisation or the tree building phase.
ParseErrorList: A container for ParseErrors.
Parser: Parses HTML into a Document.
ParseSettings: Controls parser settings, to optionally preserve tag and/or attribute name case.
Tag: HTML Tag capabilities.
Token: Parse tokens for the Tokeniser.
Token.CData
Token.Character
Token.Comment
Token.Doctype
Token.EndTag
Token.EOF
Token.StartTag
Token.Tag
Tokeniser: Reads the input stream into tokens.
TokenQueue: A character queue with parsing helpers.
TreeBuilder
XmlTreeBuilder: Use the XmlTreeBuilder when you want to parse XML without any of the HTML DOM rules being applied to the document.
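The difference between the HTML and XML tree builders, and the effect of ParseSettings, may be easier to see side by side. The following is a minimal sketch, assuming jsoup is on the classpath; the class name ParserChoiceSketch, its helper methods, and the sample input are hypothetical and exist only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.ParseSettings;
import org.jsoup.parser.Parser;

// Hypothetical helper class, for illustration only.
public class ParserChoiceSketch {

    // Default HTML parsing: HTML DOM rules apply (html/head/body are created)
    // and tag names are normalised to lower case.
    static Document parseAsHtml(String input) {
        return Jsoup.parse(input, "", Parser.htmlParser());
    }

    // HTML parsing, but with ParseSettings.preserveCase so that tag and
    // attribute names keep the case they had in the input.
    static Document parseAsHtmlPreservingCase(String input) {
        Parser parser = Parser.htmlParser().settings(ParseSettings.preserveCase);
        return Jsoup.parse(input, "", parser);
    }

    // XML parsing: backed by XmlTreeBuilder, so no HTML DOM rules are applied.
    static Document parseAsXml(String input) {
        return Jsoup.parse(input, "", Parser.xmlParser());
    }

    public static void main(String[] args) {
        String input = "<Item><Name>Widget</Name></Item>"; // sample input (assumption)

        Document html = parseAsHtml(input);
        Document cased = parseAsHtmlPreservingCase(input);
        Document xml = parseAsXml(input);

        System.out.println("HTML body elements: " + html.select("body").size());
        System.out.println("HTML first tag: " + html.body().child(0).tagName());
        System.out.println("HTML (preserve case) first tag: " + cased.body().child(0).tagName());
        System.out.println("XML body elements: " + xml.select("body").size());
        System.out.println("XML first tag: " + xml.child(0).tagName());
    }
}
```

With the HTML parser the input is wrapped in the usual document structure, whereas the XML parser leaves the element tree as written.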
Enum Summary
HtmlTreeBuilderState: The Tree Builder's current state.
Token.TokenType
TokeniserState: States and transition activations for the Tokeniser.
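ParseError and ParseErrorList, listed in the class summary, record problems found while the tokeniser and tree builder move through their states. The sketch below shows one way to collect them, assuming jsoup is on the classpath; the class name ErrorTrackingSketch, the collectErrors helper, and the sample markup are hypothetical. Error tracking is off by default, so the parser is asked to track up to a chosen maximum.

```java
import java.util.List;

import org.jsoup.parser.ParseError;
import org.jsoup.parser.Parser;

// Hypothetical helper class, for illustration only.
public class ErrorTrackingSketch {

    // Parse the input with the HTML parser and return the recorded parse errors.
    // maxErrors caps how many errors the parser keeps.
    static List<ParseError> collectErrors(String html, int maxErrors) {
        Parser parser = Parser.htmlParser().setTrackErrors(maxErrors);
        parser.parseInput(html, ""); // the resulting Document is not needed here
        return parser.getErrors();
    }

    public static void main(String[] args) {
        // Sample markup with a stray end tag (assumption, chosen to trigger an error).
        String html = "<div></span></div>";
        for (ParseError error : collectErrors(html, 10)) {
            System.out.println(error.getPosition() + ": " + error.getErrorMessage());
        }
    }
}
```

The exact messages and positions depend on the jsoup version, so the sketch prints them rather than assuming particular values.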