Class Node

  • All Implemented Interfaces:
    java.lang.Cloneable
    Direct Known Subclasses:
    Element, LeafNode

    public abstract class Node
    extends java.lang.Object
    implements java.lang.Cloneable
    The base, abstract Node model. Elements, Documents, Comments, etc. are all Node instances.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      private static class  Node.OuterHtmlVisitor  
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) static java.lang.String EmptyString  
      (package private) Node parentNode  
      (package private) int siblingIndex  
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected Node()
      Default constructor.
    • Method Summary

      All Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      java.lang.String absUrl​(java.lang.String attributeKey)
      Get an absolute URL from a URL attribute that may be relative (e.g. an <a href> or <img src>).
      protected void addChildren​(int index, Node... children)  
      protected void addChildren​(Node... children)  
      private void addSiblingHtml​(int index, java.lang.String html)  
      Node after​(java.lang.String html)
      Insert the specified HTML into the DOM after this node (i.e. as a following sibling).
      Node after​(Node node)
      Insert the specified node into the DOM after this node (i.e. as a following sibling).
      java.lang.String attr​(java.lang.String attributeKey)
      Get an attribute's value by its key.
      Node attr​(java.lang.String attributeKey, java.lang.String attributeValue)
      Set an attribute (key=value).
      abstract Attributes attributes()
      Get all of the element's attributes.
      abstract java.lang.String baseUri()
      Get the base URI of this node.
      Node before​(java.lang.String html)
      Insert the specified HTML into the DOM before this node (i.e. as a preceding sibling).
      Node before​(Node node)
      Insert the specified node into the DOM before this node (i.e. as a preceding sibling).
      Node childNode​(int index)
      Get a child node by its 0-based index.
      java.util.List<Node> childNodes()
      Get this node's children.
      protected Node[] childNodesAsArray()  
      java.util.List<Node> childNodesCopy()
      Returns a deep copy of this node's children.
      abstract int childNodeSize()
      Get the number of child nodes that this node holds.
      Node clearAttributes()
      Clear (remove) all of the attributes in this node.
      Node clone()
      Create a stand-alone, deep copy of this node, and all of its children.
      protected Node doClone​(Node parent)  
      protected abstract void doSetBaseUri​(java.lang.String baseUri)
      Set the baseUri for just this node (not its descendants), if this Node tracks base URIs.
      protected abstract java.util.List<Node> ensureChildNodes()  
      boolean equals​(java.lang.Object o)
      Check if this node is the same instance of another (object identity test).
      Node filter​(NodeFilter nodeFilter)
      Perform a depth-first filtering through this node and its descendants.
      private Element getDeepChild​(Element el)  
      boolean hasAttr​(java.lang.String attributeKey)
      Test if this element has an attribute.
      protected abstract boolean hasAttributes()
      Check if this Node has an actual Attributes object.
      boolean hasParent()  
      boolean hasSameValue​(java.lang.Object o)
      Check if this node has the same content as another node.
      <T extends java.lang.Appendable> T html(T appendable)
      Write this node and its children to the given Appendable.
      protected void indent​(java.lang.Appendable accum, int depth, Document.OutputSettings out)  
      Node nextSibling()
      Get this node's next sibling.
      (package private) void nodelistChanged()  
      abstract java.lang.String nodeName()
      Get the node name of this node.
      java.lang.String outerHtml()
      Get the outer HTML of this node.
      protected void outerHtml​(java.lang.Appendable accum)  
      (package private) abstract void outerHtmlHead​(java.lang.Appendable accum, int depth, Document.OutputSettings out)
      Get the outer HTML of this node.
      (package private) abstract void outerHtmlTail​(java.lang.Appendable accum, int depth, Document.OutputSettings out)  
      Document ownerDocument()
      Gets the Document associated with this Node.
      Node parent()
      Gets this node's parent node.
      Node parentNode()
      Gets this node's parent node.
      Node previousSibling()
      Get this node's previous sibling.
      private void reindexChildren​(int start)  
      void remove()
      Remove (delete) this node from the DOM tree.
      Node removeAttr​(java.lang.String attributeKey)
      Remove an attribute from this element.
      protected void removeChild​(Node out)  
      protected void reparentChild​(Node child)  
      protected void replaceChild​(Node out, Node in)  
      void replaceWith​(Node in)
      Replace this node in the DOM with the supplied node.
      Node root()
      Get this node's root node; that is, its topmost ancestor.
      void setBaseUri​(java.lang.String baseUri)
      Update the base URI of this node and all of its descendants.
      protected void setParentNode​(Node parentNode)  
      protected void setSiblingIndex​(int siblingIndex)  
      Node shallowClone()
      Create a stand-alone, shallow copy of this node.
      int siblingIndex()
      Get the list index of this node in its node sibling list.
      java.util.List<Node> siblingNodes()
      Retrieves this node's sibling nodes.
      java.lang.String toString()
      Gets this node's outer HTML.
      Node traverse​(NodeVisitor nodeVisitor)
      Perform a depth-first traversal through this node and its descendants.
      Node unwrap()
      Removes this node from the DOM, and moves its children up into the node's parent.
      Node wrap​(java.lang.String html)
      Wrap the supplied HTML around this node.
      • Methods inherited from class java.lang.Object

        finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • parentNode

        Node parentNode
      • siblingIndex

        int siblingIndex
    • Constructor Detail

      • Node

        protected Node()
        Default constructor. Does not set up the base URI, children, or attributes; use with caution.
    • Method Detail

      • nodeName

        public abstract java.lang.String nodeName()
        Get the node name of this node. Use for debugging purposes and not logic switching (for that, use instanceof).
        Returns:
        node name
      • hasAttributes

        protected abstract boolean hasAttributes()
        Check if this Node has an actual Attributes object.
      • hasParent

        public boolean hasParent()
      • attr

        public java.lang.String attr​(java.lang.String attributeKey)
        Get an attribute's value by its key. The key lookup is case insensitive.

        To get an absolute URL from an attribute that may be a relative URL, prefix the key with abs:, which is a shortcut to the absUrl(java.lang.String) method.

        E.g.:
        String url = a.attr("abs:href");
        Parameters:
        attributeKey - The attribute key.
        Returns:
        The attribute value, or an empty string if the attribute is not present (to avoid nulls).
        See Also:
        attributes(), hasAttr(String), absUrl(String)
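A minimal sketch of the plain and abs:-prefixed lookups, assuming jsoup is on the classpath. The class name AttrSketch, the sample markup, and the base URI https://example.com/ are illustrative and not part of the original documentation.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AttrSketch {
    // Parses a one-link fragment against an assumed base URI and returns the link element.
    static Element sampleLink() {
        Document doc = Jsoup.parse("<a href=\"/about\">About</a>", "https://example.com/");
        return doc.select("a").first();
    }

    public static void main(String[] args) {
        Element a = sampleLink();
        System.out.println(a.attr("href"));            // raw attribute value as written in the markup
        System.out.println(a.attr("abs:href"));        // value resolved against the base URI
        System.out.println(a.attr("title").isEmpty()); // a missing attribute yields "", not null
    }
}
```

Because a missing attribute comes back as an empty string, callers that need to distinguish "absent" from "present but empty" would use hasAttr(String) instead.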
      • attributes

        public abstract Attributes attributes()
        Get all of the element's attributes.
        Returns:
        the attributes (Attributes implements Iterable, in the same order as presented in the original HTML).
      • attr

        public Node attr​(java.lang.String attributeKey,
                         java.lang.String attributeValue)
        Set an attribute (key=value). If the attribute already exists, it is replaced. The attribute key comparison is case insensitive. The key will be set with case sensitivity as set in the parser settings.
        Parameters:
        attributeKey - The attribute key.
        attributeValue - The attribute value.
        Returns:
        this (for chaining)
      • hasAttr

        public boolean hasAttr​(java.lang.String attributeKey)
        Test if this element has an attribute. The key lookup is case insensitive.
        Parameters:
        attributeKey - The attribute key to check.
        Returns:
        true if the attribute exists, false if not.
      • removeAttr

        public Node removeAttr​(java.lang.String attributeKey)
        Remove an attribute from this element.
        Parameters:
        attributeKey - The attribute to remove.
        Returns:
        this (for chaining)
      • clearAttributes

        public Node clearAttributes()
        Clear (remove) all of the attributes in this node.
        Returns:
        this, for chaining
      • baseUri

        public abstract java.lang.String baseUri()
        Get the base URI of this node.
        Returns:
        base URI
      • doSetBaseUri

        protected abstract void doSetBaseUri​(java.lang.String baseUri)
        Set the baseUri for just this node (not its descendants), if this Node tracks base URIs.
        Parameters:
        baseUri - new URI
      • setBaseUri

        public void setBaseUri​(java.lang.String baseUri)
        Update the base URI of this node and all of its descendants.
        Parameters:
        baseUri - base URI to set
      • absUrl

        public java.lang.String absUrl​(java.lang.String attributeKey)
        Get an absolute URL from a URL attribute that may be relative (e.g. an <a href> or <img src>).

        E.g.: String absUrl = linkEl.absUrl("href");

        If the attribute value is already absolute (i.e. it starts with a protocol, such as http:// or https://) and it parses successfully as a URL, the attribute value is returned directly. Otherwise, it is treated as a URL relative to the element's baseUri() and made absolute using that.

        As an alternative, you can use the attr(java.lang.String) method with the abs: prefix, e.g.: String absUrl = linkEl.attr("abs:href");

        Parameters:
        attributeKey - The attribute key
        Returns:
        An absolute URL if one could be made, or an empty string (not null) if the attribute was missing or could not be made successfully into a URL.
        See Also:
        attr(java.lang.String), URL(java.net.URL, String)
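The resolution rule described for absUrl can be approximated with the JDK alone. The sketch below is a stand-in that uses java.net.URI rather than the java.net.URL constructor referenced above, and it is not jsoup's actual implementation; the class name AbsUrlSketch and the method resolve are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AbsUrlSketch {
    // Approximates the documented rule: absolute values pass through,
    // relative values are resolved against the base, and failures yield "".
    static String resolve(String baseUri, String attributeValue) {
        if (baseUri == null || attributeValue == null || attributeValue.isEmpty()) {
            return ""; // missing attribute or no base to work with
        }
        try {
            URI value = new URI(attributeValue);
            if (value.isAbsolute()) {
                return value.toString(); // already absolute: returned directly
            }
            URI base = new URI(baseUri);
            if (!base.isAbsolute()) {
                return ""; // nothing usable to resolve against
            }
            return base.resolve(value).toString();
        } catch (URISyntaxException e) {
            return ""; // could not be made into a URL
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("https://example.com/docs/", "page.html"));
        System.out.println(resolve("https://example.com/docs/", "https://other.example/x"));
        System.out.println(resolve("", "page.html").isEmpty());
    }
}
```

Returning an empty string on every failure path mirrors the documented contract that the result is never null.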
      • ensureChildNodes

        protected abstract java.util.List<Node> ensureChildNodes()
      • childNode

        public Node childNode​(int index)
        Get a child node by its 0-based index.
        Parameters:
        index - index of child node
        Returns:
        the child node at this index. Throws an IndexOutOfBoundsException if the index is out of bounds.
      • childNodes

        public java.util.List<Node> childNodes()
        Get this node's children. Presented as an unmodifiable list: new children cannot be added, but the child nodes themselves can be manipulated.
        Returns:
        list of children. If no children, returns an empty list.
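A short sketch of what "unmodifiable list, modifiable nodes" means in practice, assuming jsoup is on the classpath. The class name ChildNodesSketch, its helper methods, and the sample markup are illustrative only.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

public class ChildNodesSketch {
    static Element sampleDiv() {
        return Jsoup.parse("<div><p>One</p><p>Two</p></div>").select("div").first();
    }

    // Returns true if adding to the childNodes() view is rejected.
    static boolean addIsRejected() {
        List<Node> children = sampleDiv().childNodes();
        try {
            children.add(children.get(0)); // structural change through the view
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Mutates a child obtained from the view and reads the change back from the parent.
    static String mutateFirstChild() {
        Element div = sampleDiv();
        div.childNodes().get(0).attr("class", "first"); // the node itself is live
        return div.child(0).attr("class");
    }

    public static void main(String[] args) {
        System.out.println(addIsRejected());
        System.out.println(mutateFirstChild());
    }
}
```

To change the set of children, use the node-level operations documented on this page (for example before, after, remove, or replaceWith) rather than the returned list.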
      • childNodesCopy

        public java.util.List<Node> childNodesCopy()
        Returns a deep copy of this node's children. Changes made to these nodes will not be reflected in the original nodes.
        Returns:
        a deep copy of this node's children
      • childNodeSize

        public abstract int childNodeSize()
        Get the number of child nodes that this node holds.
        Returns:
        the number of child nodes that this node holds.
      • childNodesAsArray

        protected Node[] childNodesAsArray()
      • parent

        public Node parent()
        Gets this node's parent node.
        Returns:
        parent node; or null if no parent.
      • parentNode

        public final Node parentNode()
        Gets this node's parent node. Not overridable by extending classes, so useful if you really just need the Node type.
        Returns:
        parent node; or null if no parent.
      • root

        public Node root()
        Get this node's root node; that is, its topmost ancestor. If this node is the top ancestor, returns this.
        Returns:
        topmost ancestor.
      • ownerDocument

        public Document ownerDocument()
        Gets the Document associated with this Node.
        Returns:
        the Document associated with this Node, or null if there is no such Document.
      • remove

        public void remove()
        Remove (delete) this node from the DOM tree. If this node has children, they are also removed.
      • before

        public Node before​(java.lang.String html)
        Insert the specified HTML into the DOM before this node (i.e. as a preceding sibling).
        Parameters:
        html - HTML to add before this node
        Returns:
        this node, for chaining
        See Also:
        after(String)
      • before

        public Node before​(Node node)
        Insert the specified node into the DOM before this node (i.e. as a preceding sibling).
        Parameters:
        node - to add before this node
        Returns:
        this node, for chaining
        See Also:
        after(Node)
      • after

        public Node after​(java.lang.String html)
        Insert the specified HTML into the DOM after this node (i.e. as a following sibling).
        Parameters:
        html - HTML to add after this node
        Returns:
        this node, for chaining
        See Also:
        before(String)
      • after

        public Node after​(Node node)
        Insert the specified node into the DOM after this node (i.e. as a following sibling).
        Parameters:
        node - to add after this node
        Returns:
        this node, for chaining
        See Also:
        before(Node)
      • addSiblingHtml

        private void addSiblingHtml​(int index,
                                    java.lang.String html)
      • wrap

        public Node wrap​(java.lang.String html)
        Wrap the supplied HTML around this node.
        Parameters:
        html - HTML to wrap around this element, e.g. <div class="head"></div>. Can be arbitrarily deep.
        Returns:
        this node, for chaining.
      • unwrap

        public Node unwrap()
        Removes this node from the DOM, and moves its children up into the node's parent. This has the effect of dropping the node but keeping its children.

        For example, with the input html:

        <div>One <span>Two <b>Three</b></span></div>

        Calling element.unwrap() on the span element will result in the html:

        <div>One Two <b>Three</b></div>

        and the "Two " TextNode being returned.
        Returns:
        the first child of this node, after the node has been unwrapped. Null if the node had no children.
        See Also:
        remove(), wrap(String)
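The unwrap example can be reproduced with a few lines, assuming jsoup is on the classpath. The class name UnwrapSketch and the helper unwrapSpan are hypothetical, and the helper assumes the input contains at least one span. Pretty-printing is switched off so that the serialized output is not reindented.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

public class UnwrapSketch {
    // Unwraps the first <span> in the given HTML and returns the resulting body HTML.
    static String unwrapSpan(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings().prettyPrint(false); // keep output on one line
        Element span = doc.select("span").first();
        span.unwrap(); // span is dropped, its children move up into the div
        return doc.body().html();
    }

    public static void main(String[] args) {
        String html = "<div>One <span>Two <b>Three</b></span></div>";
        System.out.println(unwrapSpan(html));

        // The return value is the former first child of the unwrapped node.
        Document doc = Jsoup.parse(html);
        Node first = doc.select("span").first().unwrap();
        System.out.println(first.nodeName());
    }
}
```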
      • nodelistChanged

        void nodelistChanged()
      • replaceWith

        public void replaceWith​(Node in)
        Replace this node in the DOM with the supplied node.
        Parameters:
        in - the node that will replace the existing node.
      • setParentNode

        protected void setParentNode​(Node parentNode)
      • replaceChild

        protected void replaceChild​(Node out,
                                    Node in)
      • removeChild

        protected void removeChild​(Node out)
      • addChildren

        protected void addChildren​(Node... children)
      • addChildren

        protected void addChildren​(int index,
                                   Node... children)
      • reparentChild

        protected void reparentChild​(Node child)
      • reindexChildren

        private void reindexChildren​(int start)
      • siblingNodes

        public java.util.List<Node> siblingNodes()
        Retrieves this node's sibling nodes. Similar to node.parent().childNodes(), but does not include this node (a node is not a sibling of itself).
        Returns:
        node siblings. If the node has no parent, returns an empty list.
      • nextSibling

        public Node nextSibling()
        Get this node's next sibling.
        Returns:
        next sibling, or null if this is the last sibling
      • previousSibling

        public Node previousSibling()
        Get this node's previous sibling.
        Returns:
        the previous sibling, or null if this is the first sibling
      • siblingIndex

        public int siblingIndex()
        Get the list index of this node in its node sibling list. E.g. if this node is the first sibling, returns 0.
        Returns:
        position in node sibling list
        See Also:
        Element.elementSiblingIndex()
      • setSiblingIndex

        protected void setSiblingIndex​(int siblingIndex)
      • traverse

        public Node traverse​(NodeVisitor nodeVisitor)
        Perform a depth-first traversal through this node and its descendants.
        Parameters:
        nodeVisitor - the visitor callbacks to perform on each node
        Returns:
        this node, for chaining
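A minimal sketch of a depth-first traversal that records the order in which nodes are entered, assuming jsoup is on the classpath and that NodeVisitor exposes head(Node, int) and tail(Node, int) callbacks. The class name TraverseSketch and the helper visitOrder are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.NodeVisitor;

public class TraverseSketch {
    // Returns "depth:nodeName" for every node under <body>, in the order head() is called.
    static List<String> visitOrder(String html) {
        Element body = Jsoup.parse(html).body();
        final List<String> names = new ArrayList<>();
        body.traverse(new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                names.add(depth + ":" + node.nodeName()); // called when a node is first seen
            }

            @Override
            public void tail(Node node, int depth) {
                // called after all of the node's descendants have been visited; unused here
            }
        });
        return names;
    }

    public static void main(String[] args) {
        System.out.println(visitOrder("<div><p>Hi</p></div>"));
    }
}
```

The depth argument is relative to the node on which traverse was called, so the starting node is reported at depth 0.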
      • filter

        public Node filter​(NodeFilter nodeFilter)
        Perform a depth-first filtering through this node and its descendants.
        Parameters:
        nodeFilter - the filter callbacks to perform on each node
        Returns:
        this node, for chaining
      • outerHtml

        public java.lang.String outerHtml()
        Get the outer HTML of this node. For example, on a p element, may return <p>Para</p>.
        Returns:
        outer HTML
        See Also:
        Element.html(), Element.text()
      • outerHtml

        protected void outerHtml​(java.lang.Appendable accum)
      • outerHtmlHead

        abstract void outerHtmlHead​(java.lang.Appendable accum,
                                    int depth,
                                    Document.OutputSettings out)
                             throws java.io.IOException
        Get the outer HTML of this node.
        Parameters:
        accum - accumulator to place HTML into
        Throws:
        java.io.IOException - if appending to the given accumulator fails.
      • outerHtmlTail

        abstract void outerHtmlTail​(java.lang.Appendable accum,
                                    int depth,
                                    Document.OutputSettings out)
                             throws java.io.IOException
        Throws:
        java.io.IOException
      • html

        public <T extends java.lang.Appendable> T html​(T appendable)
        Write this node and its children to the given Appendable.
        Parameters:
        appendable - the Appendable to write to.
        Returns:
        the supplied Appendable, for chaining.
      • toString

        public java.lang.String toString()
        Gets this node's outer HTML.
        Overrides:
        toString in class java.lang.Object
        Returns:
        outer HTML.
        See Also:
        outerHtml()
      • indent

        protected void indent​(java.lang.Appendable accum,
                              int depth,
                              Document.OutputSettings out)
                       throws java.io.IOException
        Throws:
        java.io.IOException
      • equals

        public boolean equals​(java.lang.Object o)
        Check if this node is the same instance of another (object identity test).
        Overrides:
        equals in class java.lang.Object
        Parameters:
        o - other object to compare to
        Returns:
        true if the other object is this same node instance
        See Also:
        hasSameValue(Object), to compare nodes by their value
      • hasSameValue

        public boolean hasSameValue​(java.lang.Object o)
        Check if this node has the same content as another node. A node is considered the same if its name, attributes and content match the other node; in particular, its position in the tree does not influence its similarity.
        Parameters:
        o - other object to compare to
        Returns:
        true if the content of this node is the same as the other
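The difference between the identity test in equals and the content test in hasSameValue is easiest to see with a clone. This is a sketch assuming jsoup is on the classpath; the class name EqualitySketch, its helper, and the sample markup are illustrative only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class EqualitySketch {
    static Element samplePara() {
        return Jsoup.parse("<p class=\"x\">Para</p>").select("p").first();
    }

    public static void main(String[] args) {
        Element p = samplePara();
        Element copy = p.clone(); // stand-alone deep copy with no parent

        System.out.println(p.equals(p));          // same instance
        System.out.println(p.equals(copy));       // different instance, same content
        System.out.println(p.hasSameValue(copy)); // compares name, attributes and content
        System.out.println(copy.parent() == null);
    }
}
```

Under the identity semantics documented here, a clone never equals its original, so collections that rely on equals treat the two as distinct entries.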
      • clone

        public Node clone()
        Create a stand-alone, deep copy of this node, and all of its children. The cloned node will have no siblings or parent node. As a stand-alone object, any changes made to the clone or any of its children will not impact the original node.

        The cloned node may be adopted into another Document or node structure using Element.appendChild(Node).

        Overrides:
        clone in class java.lang.Object
        Returns:
        a stand-alone cloned node, including clones of any children
        See Also:
        shallowClone()
      • shallowClone

        public Node shallowClone()
        Create a stand-alone, shallow copy of this node. None of its children (if any) will be cloned, and it will have no parent or sibling nodes.
        Returns:
        a single independent copy of this node
        See Also:
        clone()
      • doClone

        protected Node doClone​(Node parent)