public final class Xml
extends java.lang.Object
provides static fields and methods which help to create regular expressions needed to parse XML.
Definitions as well as names are taken from XML 2nd Edition.
Modifier and Type | Field and Description |
---|---|
static java.lang.String | Attribute: a regular expression matching an XML attribute including name, equals sign and attribute value. |
static java.lang.String | AttValue: a regular expression matching an XML attribute value. |
static java.lang.String | CDSect: a regular expression matching a CDATA section. |
static java.lang.String | CharRef: a regular expression matching an XML CharRef. |
static java.lang.String | CombiningChar: a regular expression matching an XML CombiningChar. |
static java.lang.String | Comment: a regular expression matching an XML comment. |
static java.lang.String | CONTENT: use this to retrieve the element content from the map passed to splitElement(Map,StringBuilder,int). |
static Dfa | DFA_AttValue |
static Dfa | DFA_Eq |
static Dfa | DFA_Name |
static Dfa | DFA_S |
static java.lang.String | Digit: a regular expression matching an XML Digit. |
static java.lang.String | EncName: a regular expression matching an encoding name. |
static java.lang.String | EntityRef: a regular expression matching an XML EntityRef. |
static java.lang.String | Eq: defines the equals sign, surrounded by optional space, used to separate attribute names from their values. |
static java.lang.String | Extender: a regular expression matching an XML Extender. |
static java.lang.String | Letter: a regular expression matching an XML Letter. |
static java.lang.String | Name: defines the regular expression which matches a tag or attribute name. |
static java.lang.String | NameChar: a regular expression matching an XML NameChar. |
static java.lang.String | PI: a regular expression matching a processing instruction. |
static java.lang.String | Reference: a regular expression matching an XML Reference. |
static java.lang.String | S: a regular expression matching an XML S, which is white space. |
static java.lang.String | TAGNAME: use this to retrieve the name of the tag from the map passed to splitElement(Map,StringBuilder,int). |
static java.lang.String | XMLDecl: a regular expression matching the XML Declaration. |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | EmptyElemTag(): returns a regular expression matching any empty element. |
static java.lang.String | EmptyElemTag(java.lang.String nameRe): returns a string matching an empty element XML tag the name of which matches the given regular expression. |
static java.lang.String | ETag(): returns a regular expression which matches any end tag. |
static java.lang.String | ETag(java.lang.String nameRe): returns a string matching an XML end tag the name of which matches the given regular expression. |
static java.lang.String | getETagName(java.lang.StringBuilder s, int start): returns the name of an end tag. |
static java.lang.String | GoofedElement(java.lang.String nameRe): creates a regular expression to match a whole XML element including the start tag with its optional attributes, the content and the end tag. |
static void | splitElement(java.util.Map<java.lang.String,java.lang.String> dst, java.lang.StringBuilder s, int start): splits a start tag, a complete XML element or the XML declaration into its parts and fills them into the given Map. |
static java.util.Map<java.lang.String,java.lang.String> | splitElement(java.lang.StringBuilder s, int start): calls splitElement(Map,StringBuilder,int) with a freshly allocated HashMap and returns the filled map. |
static java.lang.String | STag(): returns a regular expression matching any start tag. |
static java.lang.String | STag(java.lang.String nameRe): returns a string matching an XML start tag the name of which matches the given regular expression. |
public static final Dfa DFA_Name
public static final Dfa DFA_S
public static final Dfa DFA_Eq
public static final Dfa DFA_AttValue
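public static final java.lang.String TAGNAME
use this to retrieve the name of the tag from the map passed to splitElement(Map,StringBuilder,int).
public static final java.lang.String CONTENT
use this to retrieve the element content from the map passed to splitElement(Map,StringBuilder,int).
public static final java.lang.String S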
public static final java.lang.String CombiningChar
public static final java.lang.String Extender
public static final java.lang.String Digit
public static final java.lang.String Letter
public static final java.lang.String NameChar
public static final java.lang.String CharRef
public static final java.lang.String Eq
public static final java.lang.String Name
public static final java.lang.String EntityRef
public static final java.lang.String Reference
public static final java.lang.String AttValue
public static final java.lang.String Attribute
public static final java.lang.String PI
a regular expression matching a processing instruction. In deviation from the XML standard, the processing instruction for PITarget xml, i.e. <?xml ...?>, will also be matched.
public static final java.lang.String CDSect
public static final java.lang.String Comment
a regular expression matching an XML comment.
FIXME: This regular expression currently allows a double dash to be part of the comment, which the XML standard forbids.
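The difference between a lenient and a strict comment pattern can be sketched with java.util.regex. This is only an illustration: the class name and both patterns are assumptions, they are not the value of the Comment field, and the library's own regular expression syntax may differ from java.util.regex.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Xml class.
class CommentRegexSketch {
    // Lenient: accepts anything between the delimiters, including "--".
    static final Pattern LENIENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    // Strict: follows the XML grammar, where every '-' in the comment body
    // must be followed by a character other than '-'.
    static final Pattern STRICT = Pattern.compile("<!--(?:[^-]|-[^-])*-->");

    static boolean isLenientComment(String s) {
        return LENIENT.matcher(s).matches();
    }

    static boolean isStrictComment(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String bad = "<!-- a -- b -->";
        System.out.println(isLenientComment(bad)); // true
        System.out.println(isStrictComment(bad));  // false
    }
}
```

The strict variant also rejects a comment ending in `--->`, because the body may not end with a dash.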
public static final java.lang.String EncName
a regular expression matching an encoding name.
public static final java.lang.String XMLDecl
a regular expression matching the XML Declaration. The match of an XMLDecl can be taken apart with splitElement(Map,StringBuilder,int).
public static java.lang.String STag(java.lang.String nameRe)
returns a string matching an XML start tag the name of which matches the given regular expression. To get a regular expression which matches any tag name, call STag().
public static java.lang.String STag()
public static java.lang.String EmptyElemTag(java.lang.String nameRe)
returns a string matching an empty element XML tag the name of which matches the given regular expression. To get a regular expression which matches any tag name, call EmptyElemTag().
Reminder: An empty element tag is a tag which ends in "/>", like "<br/>".
public static java.lang.String EmptyElemTag()
returns a regular expression matching any empty element.
public static java.lang.String GoofedElement(java.lang.String nameRe)
creates a regular expression to match a whole XML element including the start tag with its optional attributes, the content and the end tag. The method has its name because it does not match an XML element according to the strict rules of the standard. The following things can go wrong:
- If nameRe is indeed a regular expression and not just a string, you cannot be sure that the start tag and the end tag are indeed the same. For example, with nameRe="[AB]" the text <A>hallo</B> will be matched although it is not well-formed XML.
- If elements of the same name are nested, as in <A><A>..</A></A>, only the part up to the first closing tag, namely <A><A>..</A>, will be matched. Matching an element which contains elements with different names, however, is not a problem.
public static java.lang.String ETag(java.lang.String nameRe)
returns a string matching an XML end tag the name of which matches the given regular expression. To get a regular expression which matches any tag name, call ETag().
public static java.lang.String ETag()
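The two caveats listed for GoofedElement(String) can be reproduced with a small java.util.regex sketch that glues a start tag, the shortest possible content and an end tag together. The class, the helper names and the pattern are assumptions made for illustration. They are not how the library builds its expression, and the library's regular expression syntax may differ from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for GoofedElement(nameRe), for illustration only.
class GoofedElementSketch {
    // Start tag with optional attributes, shortest content, end tag.
    // The name expression is used twice and is NOT tied together by a
    // back reference, which is what makes the match "goofed".
    static Pattern goofed(String nameRe) {
        String name = "(?:" + nameRe + ")";
        return Pattern.compile(
            "<" + name + "(?:\\s[^>]*)?>.*?</" + name + "\\s*>",
            Pattern.DOTALL);
    }

    // Returns the first match in text, or null if there is none.
    static String firstMatch(String nameRe, String text) {
        Matcher m = goofed(nameRe).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Mismatched start and end tag are accepted.
        System.out.println(firstMatch("[AB]", "<A>hallo</B>"));
        // Nested elements of the same name are cut at the first end tag.
        System.out.println(firstMatch("A", "<A><A>..</A></A>"));
    }
}
```

If start and end tag must agree, passing a plain name instead of a regular expression avoids the first problem. The second one needs a real parser or a counting scan.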
public static java.util.Map<java.lang.String,java.lang.String> splitElement(java.lang.StringBuilder s, int start)
calls splitElement(Map,StringBuilder,int) with a freshly allocated HashMap and returns the filled map.
Hint: to prevent frequent reallocation of the map, allocate it yourself and call splitElement(Map,StringBuilder,int) directly.
public static void splitElement(java.util.Map<java.lang.String,java.lang.String> dst, java.lang.StringBuilder s, int start)
splits a start tag, a complete XML element or the XML declaration into its parts and fills them into the given Map. Attributes are stored in the map in the obvious way. The tag name is stored with the key TAGNAME. If the method is applied to a complete element, the element's content is stored with the key CONTENT. The input, starting at start in s, may be a start tag, an empty element, an element with content or the XML declaration. In all cases, trailing garbage like space does no harm and is ignored.
The given Map is not cleared by this method. The content of s is not changed.
Note: The outcome of the method is undefined if the input is not well formed according to the allowed input listed above. An exception is trailing space.
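The described behaviour can be approximated with java.util.regex as in the following sketch. Everything in it is an assumption made for illustration: the class name, the key values "#tagname" and "#content" (the real TAGNAME and CONTENT strings are not given here), the simplified ASCII-only name pattern, and the reading of "the obvious way" as attribute name mapped to unquoted value. It is not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical re-implementation sketch, not the Xml class itself.
class SplitElementSketch {
    // Assumed key values; the real constants may hold different strings.
    static final String TAGNAME = "#tagname";
    static final String CONTENT = "#content";

    // Simplified, ASCII-only approximation of an XML name.
    private static final String NAME_RE = "[A-Za-z_:][A-Za-z0-9_:.-]*";
    private static final Pattern NAME = Pattern.compile(NAME_RE);
    private static final Pattern ATTR = Pattern.compile(
        "\\s+(" + NAME_RE + ")\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    static void splitElement(Map<String, String> dst, StringBuilder s, int start) {
        int pos = start + 1;                                  // skip '<'
        if (pos < s.length() && s.charAt(pos) == '?') pos++;  // "<?xml"
        Matcher name = NAME.matcher(s).region(pos, s.length());
        if (!name.lookingAt()) {
            throw new IllegalArgumentException("no tag name at " + start);
        }
        String tag = name.group();
        dst.put(TAGNAME, tag);
        pos = name.end();

        // Collect attributes one by one, directly after the tag name.
        Matcher attr = ATTR.matcher(s);
        while (attr.region(pos, s.length()).lookingAt()) {
            String value = attr.group(2) != null ? attr.group(2) : attr.group(3);
            dst.put(attr.group(1), value);
            pos = attr.end();
        }

        // A plain '>' may be followed by content; "/>" and "?>" are not.
        while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) pos++;
        if (pos < s.length() && s.charAt(pos) == '>') {
            // Simplification: content ends at the first "</" + tag.
            int end = s.indexOf("</" + tag, pos + 1);
            if (end >= 0) dst.put(CONTENT, s.substring(pos + 1, end));
        }
    }

    // Convenience variant that allocates a fresh map.
    static Map<String, String> splitElement(StringBuilder s, int start) {
        Map<String, String> dst = new HashMap<>();
        splitElement(dst, s, start);
        return dst;
    }
}
```

Because the map is not cleared, a caller that reuses one map across many elements would call `clear()` on it before each call to avoid attributes leaking from one element to the next.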
public static java.lang.String getETagName(java.lang.StringBuilder s, int start)
returns the name of an end tag. Under the assumption that the given StringBuilder, starting at start, contains an XML end tag, possibly followed by characters not containing a '>' character, the name of the end tag is returned. It is assumed, and not tested, that the initial '<' of the end tag is located at start.
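A minimal sketch of this contract in plain Java follows. The class name and the way the closing '>' is located are assumptions made for illustration, not the library's implementation. Like the documented method, the sketch does not verify that '<' is really found at start.

```java
// Hypothetical illustration of the getETagName contract.
class ETagNameSketch {
    static String getETagName(StringBuilder s, int start) {
        int from = start + 2;              // skip "</" without testing it
        int close = s.indexOf(">", from);  // end of the tag
        if (close < 0) {
            throw new IllegalArgumentException("no '>' after position " + start);
        }
        // Optional white space before '>' is not part of the name.
        return s.substring(from, close).trim();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ab</foo  > tail");
        System.out.println(getETagName(sb, 2)); // foo
    }
}
```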