Binary Data

Michael Emmel mike at jmaca.com
Mon Feb 23 16:16:21 GMT 1998


Okay, I have read the spec more carefully now that someone mentioned NDATA,
and I understand how the unparsed entity works.
What I still do not understand, and what seems to be undefined (at least to
me), is how the parser is restarted once an application consumes an unparsed
entity.


                ExternalID ::= 'SYSTEM' S SystemLiteral
                             | 'PUBLIC' S PubidLiteral S SystemLiteral
                NDataDecl  ::= S 'NDATA' S Name  [ VC: Notation Declared ]

Here's the description of the VC:

Validity Constraint: Notation Declared
          The Name must match the declared name of a notation.

          The SystemLiteral is called the entity's system identifier. It is
          a URI, which may be used to retrieve the entity. Note that the hash
          mark (#) and fragment identifier frequently used with URIs are not,
          formally, part of the URI itself; an XML processor may signal an
          error if a fragment identifier is given as part of a system
          identifier. Unless otherwise provided by information outside the
          scope of this specification (e.g. a special XML element type
          defined by a particular DTD, or a processing instruction defined by
          a particular application specification), relative URIs are relative
          to the location of the resource within which the entity declaration
          occurs. A URI might thus be relative to the document entity, to the
          entity containing the external DTD subset, or to some other
          external parameter entity.

          An XML processor should handle a non-ASCII character in a URI by
          representing the character in UTF-8 as one or more bytes, and then
          escaping these bytes with the URI escaping mechanism (i.e., by
          converting each byte to %HH, where HH is the hexadecimal notation
          of the byte value).

          In addition to a system identifier, an external identifier may
          include a public identifier. An XML processor attempting to
          retrieve the entity's content may use the public identifier to try
          to generate an alternative URI. If the processor is unable to do
          so, it must use the URI specified in the system literal. Before a
          match is attempted, all strings of white space in the public
          identifier must be normalized to single space characters (#x20),
          and leading and trailing white space must be removed.

          Examples of external entity declarations:


And here are some examples:

           <!ENTITY open-hatch
                    SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
           <!ENTITY open-hatch
                    PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
                    "http://www.textuality.com/boilerplate/OpenHatch.xml">
           <!ENTITY hatch-pic
                    SYSTEM "../grafix/OpenHatch.gif"
                    NDATA gif >
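
To make the hatch-pic case concrete, here is a rough Python sketch using the standard xml.parsers.expat module. It suggests that the parser only reports the unparsed entity's system identifier and notation name to the application and never reads the GIF bytes itself. The wrapper document, the `doc` element, its `pic` attribute, the NOTATION declaration with "image/gif", and the function name are my own assumptions for illustration; only the hatch-pic declaration comes from the spec examples.

```python
import xml.parsers.expat

# Hypothetical wrapper document: only the hatch-pic ENTITY declaration is
# taken from the spec example; the rest is assumed for illustration.
DOC = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif>
  <!ATTLIST doc pic ENTITY #IMPLIED>
]>
<doc pic="hatch-pic"/>
"""


def collect_unparsed_entities(text):
    """Return {entity name: (system identifier, notation name)}."""
    found = {}

    def on_unparsed(name, base, system_id, public_id, notation):
        # The parser hands over the declaration only; fetching and decoding
        # the binary resource is left entirely to the application.
        found[name] = (system_id, notation)

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.Parse(text, True)
    return found


entities = collect_unparsed_entities(DOC)
print(entities)
```

In this sketch the parser never stops and restarts around the binary data, because the data lives in a separate resource that the parser does not open.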


This says to me that binary data must be handled in one of three ways: it can
be encoded to ASCII and included in the document; it can be delimited by
MIME-type boundaries around the XML tags, provided the binary data does not
itself contain the MIME boundaries; or it can be obtained from an
ASCII-normalized external URI link.
There is no way to tell an XML parser to skip x arbitrary bytes of embedded
unparsed entity data, which is consumed by the "application", and then restart
the parser at the next valid section.
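
As an illustration of the "encode to ASCII" option, here is a small Python sketch. It assumes base64 as the encoding (my choice, nothing in the spec text above names one), and the `pic` element, its `encoding` attribute, and the sample bytes are made up. It first shows that raw bytes dropped into the document stop the parser, then that the encoded form survives a round trip.

```python
import base64
import xml.etree.ElementTree as ET

# Made-up sample bytes: "GIF" followed by NUL, '<' and 0xFF, none of which
# can appear as-is in the character data of a well-formed document.
raw = bytes([0x47, 0x49, 0x46, 0x00, 0x3C, 0xFF])


def embeds_raw(data):
    """Return True if the bytes can be placed directly inside an element."""
    try:
        ET.fromstring(b"<pic>" + data + b"</pic>")
        return True
    except ET.ParseError:
        return False


# Encode to plain ASCII so the parser sees only ordinary character data.
encoded = base64.b64encode(raw).decode("ascii")
doc = "<pic encoding='base64'>" + encoded + "</pic>"

# The application, not the parser, turns the text back into bytes.
decoded = base64.b64decode(ET.fromstring(doc).text)

print(embeds_raw(raw), decoded == raw)
```

The cost of this approach is size: base64 output is roughly a third larger than the input.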

Am I wrong?

Mike





