Package org.jsoup.nodes
Class Entities
java.lang.Object
    org.jsoup.nodes.Entities
public class Entities extends java.lang.Object
HTML entities, and escape routines. Source: W3C HTML named character references.
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) static class
Entities.CoreCharset
static class
Entities.EscapeMode
-
Field Summary
Fields
Modifier and Type    Field    Description
private static char[]
codeDelims
(package private) static int
codepointRadix
private static Document.OutputSettings
DefaultOutput
private static int
empty
private static java.lang.String
emptyName
private static java.util.HashMap<java.lang.String,java.lang.String>
multipoints
-
Constructor Summary
Constructors
Modifier    Constructor    Description
private
Entities()
-
Method Summary
Modifier and Type    Method    Description
private static void
appendEncoded(java.lang.Appendable accum, Entities.EscapeMode escapeMode, int codePoint)
private static boolean
canEncode(Entities.CoreCharset charset, char c, java.nio.charset.CharsetEncoder fallback)
static int
codepointsForName(java.lang.String name, int[] codepoints)
(package private) static void
escape(java.lang.Appendable accum, java.lang.String string, Document.OutputSettings out, boolean inAttribute, boolean normaliseWhite, boolean stripLeadingWhite)
static java.lang.String
escape(java.lang.String string)
HTML escape an input string, using the default settings (UTF-8, base entities).
static java.lang.String
escape(java.lang.String string, Document.OutputSettings out)
HTML escape an input string.
static java.lang.String
getByName(java.lang.String name)
Get the character(s) represented by the named entity.
static java.lang.Character
getCharacterByName(java.lang.String name)
Deprecated. Does not support characters outside the BMP or multiple character names.
static boolean
isBaseNamedEntity(java.lang.String name)
Check if the input is a known named entity in the base entity set.
static boolean
isNamedEntity(java.lang.String name)
Check if the input is a known named entity.
private static void
load(Entities.EscapeMode e, java.lang.String pointsData, int size)
static java.lang.String
unescape(java.lang.String string)
Un-escape an HTML escaped string.
(package private) static java.lang.String
unescape(java.lang.String string, boolean strict)
Unescape the input string.
-
-
-
Field Detail
-
empty
private static final int empty
- See Also:
- Constant Field Values
-
emptyName
private static final java.lang.String emptyName
- See Also:
- Constant Field Values
-
codepointRadix
static final int codepointRadix
- See Also:
- Constant Field Values
-
codeDelims
private static final char[] codeDelims
-
multipoints
private static final java.util.HashMap<java.lang.String,java.lang.String> multipoints
-
DefaultOutput
private static final Document.OutputSettings DefaultOutput
-
-
Method Detail
-
isNamedEntity
public static boolean isNamedEntity(java.lang.String name)
Check if the input is a known named entity.
- Parameters:
name - the possible entity name (e.g. "lt" or "amp")
- Returns:
- true if a known named entity
-
isBaseNamedEntity
public static boolean isBaseNamedEntity(java.lang.String name)
Check if the input is a known named entity in the base entity set.
- Parameters:
name - the possible entity name (e.g. "lt" or "amp")
- Returns:
- true if a known named entity in the base set
- See Also:
isNamedEntity(String)
-
getCharacterByName
public static java.lang.Character getCharacterByName(java.lang.String name)
Deprecated. Does not support characters outside the BMP or multiple character names.
Get the Character value of the named entity.
- Parameters:
name - named entity (e.g. "lt" or "amp")
- Returns:
- the Character value of the named entity (e.g. '<' or '&')
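The deprecation reason can be illustrated without jsoup. The sketch below is a standalone illustration, not library code: the class name BmpLimitSketch and the helper fitsInCharacter are hypothetical. It relies on two entries from the named character reference table: "Afr" (U+1D504, outside the BMP) and "NotEqualTilde" (U+2242 followed by U+0338). Neither value fits in a single java.lang.Character, which is why a String-returning lookup such as getByName(String) is the more general choice.

```java
final class BmpLimitSketch {

    /** Returns true if the value is exactly one BMP code point, i.e. fits in one Java char. */
    static boolean fitsInCharacter(String value) {
        return value.length() == 1 && !Character.isSurrogate(value.charAt(0));
    }

    public static void main(String[] args) {
        String lt = "<";                                      // &lt; is U+003C
        String afr = new String(Character.toChars(0x1D504));  // &Afr; is U+1D504, a surrogate pair in UTF-16
        String notEqualTilde = "\u2242\u0338";                // &NotEqualTilde; is two code points

        System.out.println(fitsInCharacter(lt));              // prints true
        System.out.println(fitsInCharacter(afr));             // prints false
        System.out.println(fitsInCharacter(notEqualTilde));   // prints false
    }
}
```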
-
getByName
public static java.lang.String getByName(java.lang.String name)
Get the character(s) represented by the named entity.
- Parameters:
name - entity (e.g. "lt" or "amp")
- Returns:
- the string value of the character(s) represented by this entity, or "" if not defined
-
codepointsForName
public static int codepointsForName(java.lang.String name, int[] codepoints)
-
escape
public static java.lang.String escape(java.lang.String string, Document.OutputSettings out)
HTML escape an input string. That is, < is returned as &lt;
- Parameters:
string - the un-escaped string to escape
out - the output settings to use
- Returns:
- the escaped string
-
escape
public static java.lang.String escape(java.lang.String string)
HTML escape an input string, using the default settings (UTF-8, base entities). That is, < is returned as &lt;
- Parameters:
string - the un-escaped string to escape
- Returns:
- the escaped string
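As a rough picture of what escaping means here, the following standalone sketch replaces the markup-significant characters with named references. It is an assumption-laden illustration, not jsoup's implementation: the class EscapeSketch and the method escapeBasic are hypothetical names, and the real method also consults the charset and escape mode in Document.OutputSettings, so its output may differ.

```java
final class EscapeSketch {

    /** Replaces &, <, > and " with named character references; all other chars pass through. */
    static String escapeBasic(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a &lt; b &amp;&amp; c &gt; d
        System.out.println(escapeBasic("a < b && c > d"));
    }
}
```

With jsoup on the classpath, the corresponding library call would be Entities.escape(String), as documented above.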
-
escape
static void escape(java.lang.Appendable accum, java.lang.String string, Document.OutputSettings out, boolean inAttribute, boolean normaliseWhite, boolean stripLeadingWhite) throws java.io.IOException
- Throws:
java.io.IOException
-
appendEncoded
private static void appendEncoded(java.lang.Appendable accum, Entities.EscapeMode escapeMode, int codePoint) throws java.io.IOException
- Throws:
java.io.IOException
-
unescape
public static java.lang.String unescape(java.lang.String string)
Un-escape an HTML escaped string. That is, &lt; is returned as <.
- Parameters:
string - the HTML string to un-escape
- Returns:
- the unescaped string
-
unescape
static java.lang.String unescape(java.lang.String string, boolean strict)
Unescape the input string.
- Parameters:
string - the string to un-HTML-escape
strict - if true, a reference must end with the trailing ';' char; otherwise the ';' is optional
- Returns:
- unescaped string
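The effect of the strict flag can be sketched in isolation. The code below is a simplified stand-in, not jsoup's parser: UnescapeSketch, its four-entry NAMED table, and the REFERENCE pattern are all hypothetical, and numeric references are ignored. It shows only the rule described above: in strict mode a reference without its trailing ';' is left as-is, while in lenient mode it is still decoded.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnescapeSketch {

    // Hypothetical lookup table; the real entity set is far larger.
    private static final Map<String, String> NAMED =
            Map.of("lt", "<", "gt", ">", "amp", "&", "quot", "\"");

    // Group 1 is the entity name, group 2 the optional trailing ';'.
    private static final Pattern REFERENCE = Pattern.compile("&([A-Za-z][A-Za-z0-9]*)(;?)");

    static String unescape(String input, boolean strict) {
        Matcher m = REFERENCE.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = NAMED.get(m.group(1));
            boolean terminated = !m.group(2).isEmpty();
            // Keep the original text when the name is unknown, or when strict
            // mode is on and the reference lacks its ';'.
            boolean keep = value == null || (strict && !terminated);
            m.appendReplacement(out, Matcher.quoteReplacement(keep ? m.group() : value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("1 &lt; 2", true));   // prints: 1 < 2
        System.out.println(unescape("1 &lt 2", true));    // prints: 1 &lt 2
        System.out.println(unescape("1 &lt 2", false));   // prints: 1 < 2
    }
}
```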
-
canEncode
private static boolean canEncode(Entities.CoreCharset charset, char c, java.nio.charset.CharsetEncoder fallback)
-
load
private static void load(Entities.EscapeMode e, java.lang.String pointsData, int size)
-
-