Why XML data typing isn't *too* hard

Charles Reitzel creitzel at mediaone.net
Mon Nov 30 15:42:16 GMT 1998


NOTATIONs are perhaps the best way for XML authors to define appropriate
data types.  I think this is something Eliot K. has been talking about for
some time.  As David M. suggests, however, data types are often much more
constrained and application-specific than just the C/Java primitive types.
I think of NOTATIONs as a way for the XML author to say, "the application
will know what to do with this", including applying constraints.  I agree w/
Eliot that NOTATION parameters would be helpful in this regard, e.g.
Decimal(10,2).
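
To make the "application will know what to do with this" idea concrete, here
is a minimal Python sketch of how a receiving application might interpret a
parameterized type name such as Decimal(10,2).  The TypeName(precision,scale)
syntax, the SQL-style reading of the two parameters, and the function names
parse_type and conforms are all assumptions for illustration; nothing here is
defined by XML or by any NOTATION proposal.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical syntax: TypeName(precision,scale), e.g. "Decimal(10,2)".
_TYPE_RE = re.compile(r"^(\w+)\((\d+),(\d+)\)$")


def parse_type(spec):
    """Split 'Decimal(10,2)' into ('Decimal', 10, 2)."""
    m = _TYPE_RE.match(spec.strip())
    if m is None:
        raise ValueError(f"unrecognized type spec: {spec!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def conforms(text, spec):
    """True if text is a finite decimal with at most `scale` fractional
    digits and at most `precision - scale` integer digits."""
    _, precision, scale = parse_type(spec)
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False
    _, digits, exponent = value.as_tuple()
    frac_digits = max(0, -exponent)
    int_digits = max(0, len(digits) + exponent)
    return frac_digits <= scale and int_digits <= precision - scale
```

Under these assumptions, conforms("23.50", "Decimal(10,2)") holds, while
"13.125" is rejected because it carries three fractional digits.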

In analyzing a set of mainframe transaction "views" for conversion to XML I
wanted to use strong data typing but pure XML.  I came up w/ the following.
Any comments and criticisms would be appreciated.  Apologies for the length.

Some things to note: 

I do *not* include the protocol with the SYSTEM id's.  One client might use
http: or anonymous ftp: from a trusted system.  Others might use https: over
the public internet.
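
A SYSTEM id of the form //host/path is a relative reference that takes its
scheme from whatever it is resolved against, and an XML processor resolves a
relative SYSTEM id against the location of the entity that contains the
declaration.  The Python sketch below shows the effect using the standard
urljoin function; the two base URIs (intranet.example and public.example) and
the helper name resolve are made up for the example.

```python
from urllib.parse import urljoin

# A SYSTEM id as used in the DTDs in this post: no scheme, just //host/path.
SYSTEM_ID = "//xml.acme.com/common/acmeid.dtd"


def resolve(system_id, base):
    """Resolve a scheme-less SYSTEM id against the client's base URI."""
    return urljoin(base, system_id)


# Hypothetical base URIs for two different clients.
print(resolve(SYSTEM_ID, "http://intranet.example/views/fov0001a.xml"))
print(resolve(SYSTEM_ID, "https://public.example/views/fov0001a.xml"))
```

The first client ends up fetching the DTD over http:, the second over https:,
without either the document or the DTD naming a protocol.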

For this scheme to work properly, Namespaces should apply to notations and
entities as well as element names.  The usual rant applies about not being
able to specify the URI associated with DTD's or fragments thereof.

It is interesting that you can't define an enumeration for an element's
content, only for attribute values.

I have made an effort to break up the DTD into reusable files, much as I
would with C++ header files.  I think the analogy is appropriate to this
application.  I hope this example illustrates to you "document" folks how we
"database" people view things - or at least how this one does.

I miss parameterized macros a la the C preprocessor!

No linking in this app.  But if there were, ID/IDREF wouldn't work because
the <POSITION> has a compound key: acct# + cusip.  ID/IDREF is a pretty weak
form of linking.  Only useful if the ID's are generated, like Oracle
sequence #'s.  The difference is that ID's have document scope.  Sequence
#'s generally have table scope (probably found in many "documents").  Real
keys are more complicated.
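
Since ID/IDREF can't express a compound key, the lookup has to live in the
application.  A minimal Python sketch, assuming the key is the account number
plus the CUSIP as described above; the sample document is a trimmed copy of
the output view in this post, and the function name index_positions is
hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down output view; element names are the ones used in this post.
DOC = """
<FOV0001A_OUTPUT>
  <FOV0001A_POSITION>
    <BROKERAGE_ACCT_NBR>X01362986</BROKERAGE_ACCT_NBR>
    <BROKERAGE_CUSIP>ABCDEFGHIJ</BROKERAGE_CUSIP>
    <BROKERAGE_SYMBOL>BUD</BROKERAGE_SYMBOL>
  </FOV0001A_POSITION>
  <FOV0001A_POSITION>
    <BROKERAGE_ACCT_NBR>H85859483</BROKERAGE_ACCT_NBR>
    <BROKERAGE_CUSIP>JEUYRU88392</BROKERAGE_CUSIP>
    <BROKERAGE_SYMBOL>ATT</BROKERAGE_SYMBOL>
  </FOV0001A_POSITION>
</FOV0001A_OUTPUT>
"""


def index_positions(root):
    """Index positions by the compound key (account number, CUSIP)."""
    index = {}
    for pos in root.iter("FOV0001A_POSITION"):
        key = (pos.findtext("BROKERAGE_ACCT_NBR"),
               pos.findtext("BROKERAGE_CUSIP"))
        if key in index:
            # Unlike ID, nothing in the DTD enforces uniqueness for us.
            raise ValueError(f"duplicate position key: {key}")
        index[key] = pos
    return index


positions = index_positions(ET.fromstring(DOC))
print(positions[("H85859483", "JEUYRU88392")].findtext("BROKERAGE_SYMBOL"))
```

The uniqueness check is the part a validating parser would do for ID
attributes; with a compound key the application has to do it itself.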

Rather than names, the target system for these views uses the numeric tags
associated with each "field".  A #FIXED attribute does a good job of
associating numbers w/ the names.
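
The point of the #FIXED attribute is that the instance never has to carry the
number: a parser that reads the declarations supplies it.  A minimal Python
sketch using the standard expat binding, which by default reports attributes
defaulted from declarations it has read.  To keep the example self-contained
the ATTLISTs sit in an internal subset with the parameter entities written
out by hand; the function name tag_numbers is hypothetical.

```python
import xml.parsers.expat

# Two field declarations from this post, inlined into an internal subset.
DOC = """<!DOCTYPE FOV0001A_POSITION [
<!ATTLIST MARKET_VALUE_ID AcmeTag CDATA #FIXED "298">
<!ATTLIST AS_OF_DATE      AcmeTag CDATA #FIXED "702">
]>
<FOV0001A_POSITION>
  <MARKET_VALUE_ID>23.50</MARKET_VALUE_ID>
  <AS_OF_DATE>19980716</AS_OF_DATE>
</FOV0001A_POSITION>
"""


def tag_numbers(document):
    """Return {element name: AcmeTag number} as reported by the parser."""
    tags = {}

    def start(name, attrs):
        # attrs includes values defaulted from the ATTLIST declarations.
        if "AcmeTag" in attrs:
            tags[name] = int(attrs["AcmeTag"])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(document, True)
    return tags


print(tag_numbers(DOC))
```

A parser that does not read the external DTD would not see the numbers, so
this only works where the receiving side is known to process the declarations.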

===========================

<!-- acmeid.dtd Acme public identifiers -->
<!ENTITY % AcmePublicId     "Acme, Inc." >
<!ENTITY % AcmeIdPrefix     "-//%AcmePublicId;//Common" >

<!-- Non-parameter entity for use with namespace declarations -->
<!ENTITY Acme_Common_Namespace 
        "%AcmeIdPrefix;//Namespace version X.XXXX"
>

===========================

<!-- acmetype.dtd Define Acme data types -->
<!ENTITY % AcmeIds
        SYSTEM "//xml.acme.com/common/acmeid.dtd" >
%AcmeIds;

<!ENTITY % AcmeDataTypePfx      "%AcmeIdPrefix;//DataType: " >

<!NOTATION ShareQuantity 
        PUBLIC "%AcmeDataTypePfx; ShareQuantity (12345678.12345)"
>
<!NOTATION SharePrice
        PUBLIC "%AcmeDataTypePfx; SharePrice (12345678.123)"
>
<!NOTATION Integer
        PUBLIC "%AcmeDataTypePfx; Integer"
>
<!NOTATION Float
        PUBLIC "%AcmeDataTypePfx; Float"
>
<!NOTATION Date
        PUBLIC "%AcmeDataTypePfx; Simple Date (YYYYMMDD)"
>
<!NOTATION Time
        PUBLIC "%AcmeDataTypePfx; Simple Time (23:59:59)"
>
<!NOTATION AlphaNum
        PUBLIC "%AcmeDataTypePfx; Alpha-Numeric"
>
<!NOTATION Indicator
        PUBLIC "%AcmeDataTypePfx; Indicator (Y/N)"
>

<!NOTATION AcmeDataType
        PUBLIC "%AcmeDataTypePfx; Data Type"
>

<!ENTITY % AcmeDataTypeList
"ShareQuantity|SharePrice|Integer|Float|Date|Time|AlphaNum|Indicator"
>
<!ENTITY % DataTypeDecl   "DataType NOTATION ( %AcmeDataTypeList; ) #FIXED ">

<!ENTITY % AlphaNumAttr   "%DataTypeDecl; &#34;AlphaNum&#34;" >
<!ENTITY % IntegerAttr    "%DataTypeDecl; &#34;Integer&#34;" >
<!ENTITY % FloatAttr      "%DataTypeDecl; &#34;Float&#34;" >
<!ENTITY % DateAttr       "%DataTypeDecl; &#34;Date&#34;" >
<!ENTITY % TimeAttr       "%DataTypeDecl; &#34;Time&#34;" >
<!ENTITY % IndicatorAttr  "%DataTypeDecl; &#34;Indicator&#34;" >
<!ENTITY % ShareQtyAttr   "%DataTypeDecl; &#34;ShareQuantity&#34;" >
<!ENTITY % SharePriceAttr "%DataTypeDecl; &#34;SharePrice&#34;" >

===========================

<!-- acmeflds.dtd Acme "Field" Definitions -->

<!ENTITY % AcmeDataTypes
        SYSTEM "//xml.acme.com/common/acmetype.dtd" >
%AcmeDataTypes;

<!ENTITY % AcmeElemDefPfx       "%AcmeIdPrefix;//Element: " >
<!ENTITY % AcmeViewDefPfx       "%AcmeIdPrefix;//View: " >

<!ENTITY % FixedIntAttrDecl     " CDATA #FIXED " >
<!ENTITY % SizeAttr             "Size           %FixedIntAttrDecl;" >
<!ENTITY % AcmeTagAttr          "AcmeTag        %FixedIntAttrDecl;" >
<!ENTITY % MaxOccursAttr        "MaxOccurs      %FixedIntAttrDecl;" >


<!-- ATTRIBUTES COMMON TO MANY Acme VIEWS -->

<!ENTITY % ProdIdAttr
"PRODUCT_IDENTIFIER    ( A | B | C | D | E | F | G | I |
                         J | K | L | M | N | Q | R | S | 
                         T | U | V | W | X )  #REQUIRED"
>

<!ENTITY % VersionIdAttr
        "VERSION_IDENTIFIER_ID CDATA #IMPLIED"
>

<!ENTITY % ContKeyAttr
        "CONTINUATION_KEY CDATA #IMPLIED"
>


<!-- DEFINE THE FIELDS -->

<!ELEMENT INSTRUMENT_TYPE_ID (#PCDATA)>
<!ATTLIST INSTRUMENT_TYPE_ID
        %AlphaNumAttr;
        %SizeAttr;      "1"
        %AcmeTagAttr;   "1"
>
<!ELEMENT MARKET_VALUE_ID (#PCDATA)>
<!ATTLIST MARKET_VALUE_ID
        %FloatAttr;
        %SizeAttr;      "14"
        %AcmeTagAttr;   "298"
>
<!ELEMENT AS_OF_DATE (#PCDATA)>
<!ATTLIST AS_OF_DATE
        %DateAttr;
        %SizeAttr;      "8"
        %AcmeTagAttr;   "702"
>
<!ELEMENT BROKERAGE_ACCT_TYPE (#PCDATA)>
<!ATTLIST BROKERAGE_ACCT_TYPE
        %AlphaNumAttr;
        %SizeAttr;      "1"
        %AcmeTagAttr;   "703"
>
<!ELEMENT SETTLEMENT_DATE_SHARES (#PCDATA)>
<!ATTLIST SETTLEMENT_DATE_SHARES
        %AlphaNumAttr;
        %SizeAttr;      "15"
        %AcmeTagAttr;   "704"
>
<!ELEMENT TRADE_DATE_SHARES (#PCDATA)>
<!ATTLIST TRADE_DATE_SHARES
        %AlphaNumAttr;
        %SizeAttr;      "15"
        %AcmeTagAttr;   "705"
>
<!ELEMENT UNCOMMITTED_SHARES (#PCDATA)>
<!ATTLIST UNCOMMITTED_SHARES
        %AlphaNumAttr;
        %SizeAttr;      "15"
        %AcmeTagAttr;   "706"
>
<!ELEMENT BROKERAGE_CLOSING_PRICE (#PCDATA)>
<!ATTLIST BROKERAGE_CLOSING_PRICE
        %AlphaNumAttr;
        %SizeAttr;      "20"
        %AcmeTagAttr;   "708"
>
<!ELEMENT BROKERAGE_ACCT_NBR (#PCDATA)>
<!ATTLIST BROKERAGE_ACCT_NBR
        %AlphaNumAttr;
        %SizeAttr;      "9"
        %AcmeTagAttr;   "709"
>
<!ELEMENT BROKERAGE_CUSIP (#PCDATA)>
<!ATTLIST BROKERAGE_CUSIP
        %AlphaNumAttr;
        %SizeAttr;      "12"
        %AcmeTagAttr;   "740"
>
<!ELEMENT BROKERAGE_SYMBOL (#PCDATA)>
<!ATTLIST BROKERAGE_SYMBOL
        %AlphaNumAttr;
        %SizeAttr;      "12"
        %AcmeTagAttr;   "741"
>
<!ELEMENT SECURITY_DESCRIPTION (#PCDATA)>
<!ATTLIST SECURITY_DESCRIPTION
        %AlphaNumAttr;
        %SizeAttr;      "40"
        %AcmeTagAttr;   "745"
>
<!ELEMENT CORE_ACCOUNT_IND (#PCDATA)>
<!ATTLIST CORE_ACCOUNT_IND
        %AlphaNumAttr;
        %SizeAttr;      "1"
        %AcmeTagAttr;   "890"
>

===========================

<!-- Acme Document Type Definition (DTD) -->
<!-- Define Acme Brokerage Positions View FOV0001A -->

<!ENTITY % AcmeFields
        SYSTEM "//xml.acme.com/brokerage/acmeflds.dtd" >
%AcmeFields;

<!-- DEFINE THE VIEWS -->

<!ELEMENT FOV0001A_INPUT ( BROKERAGE_ACCT_NBR+ ) >
<!ATTLIST FOV0001A_INPUT
	xmlns			CDATA #FIXED "&Acme_Common_Namespace;"
        %ProdIdAttr;
        %VersionIdAttr;
        %ContKeyAttr;
        %MaxOccursAttr;         "6"
>
        
<!ELEMENT FOV0001A_POSITION 
      ( 
        INSTRUMENT_TYPE_ID |
	MARKET_VALUE_ID |
        AS_OF_DATE |
        BROKERAGE_ACCT_TYPE |
	SETTLEMENT_DATE_SHARES |
	TRADE_DATE_SHARES |
	UNCOMMITTED_SHARES |
	BROKERAGE_CLOSING_PRICE |
        BROKERAGE_ACCT_NBR |
	BROKERAGE_CUSIP |
	BROKERAGE_SYMBOL |
	SECURITY_DESCRIPTION |
	CORE_ACCOUNT_IND
      )*
>
<!ATTLIST FOV0001A_POSITION
	BROKERAGE_ERROR_MSG_CODE  CDATA ""
	CONTINUATION_KEY	  CDATA ""
>

<!-- It's OK to #FIX the URI.  #FIX'ing the prefix bugs me -->

<!ELEMENT FOV0001A_OUTPUT ( FOV0001A_POSITION+ ) >
<!ATTLIST FOV0001A_OUTPUT
	xmlns			CDATA #FIXED "&Acme_Common_Namespace;"
	PAGE_OCCURRENCE_CNT	CDATA "000001"
	SECURITY_DESCRIPTOR	CDATA ""
>

===========================

<?xml version="1.0"?>
<!-- INPUT VIEW -->
<!DOCTYPE FOV0001A_INPUT PUBLIC 
        "-//Acme, Inc.//TEXT view FOV0001A INPUT//EN"
        "//xml.acme.com/brokerage/fov0001a.dtd"
>
<FOV0001A_INPUT 
	xmlns="&Acme_Common_Namespace;"
	PRODUCT_IDENTIFIER="X" 
	VERSION_IDENTIFIER_ID="2.3"
	CONTINUATION_KEY="123DFJIE9SMD9E9WEFE"
>
    <BROKERAGE_ACCT_NBR>X01362986</BROKERAGE_ACCT_NBR>
    <BROKERAGE_ACCT_NBR>H85859483</BROKERAGE_ACCT_NBR>
</FOV0001A_INPUT>

===========================

<?xml version="1.0"?>
<!-- OUTPUT VIEW -->
<!DOCTYPE FOV0001A_OUTPUT PUBLIC 
        "-//Acme, Inc.//VIEW FOV0001A OUTPUT//EN"
        "//xml.acme.com/brokerage/fov0001a.dtd"
>
<FOV0001A_OUTPUT 
	xmlns="&Acme_Common_Namespace;"
	PAGE_OCCURRENCE_CNT="000002"
>
    <FOV0001A_POSITION>
	<INSTRUMENT_TYPE_ID>1</INSTRUMENT_TYPE_ID>
	<MARKET_VALUE_ID>23.50</MARKET_VALUE_ID>
	<AS_OF_DATE>19980716</AS_OF_DATE>
	<BROKERAGE_ACCT_TYPE>Z</BROKERAGE_ACCT_TYPE>
	<SETTLEMENT_DATE_SHARES>3442.12500</SETTLEMENT_DATE_SHARES>
	<TRADE_DATE_SHARES>3442.12500</TRADE_DATE_SHARES>
	<UNCOMMITTED_SHARES>0.00000</UNCOMMITTED_SHARES>
	<BROKERAGE_CLOSING_PRICE>13.125</BROKERAGE_CLOSING_PRICE>
	<BROKERAGE_ACCT_NBR>X01362986</BROKERAGE_ACCT_NBR>
	<BROKERAGE_CUSIP>ABCDEFGHIJ</BROKERAGE_CUSIP>
	<BROKERAGE_SYMBOL>BUD</BROKERAGE_SYMBOL>
	<SECURITY_DESCRIPTION>Anheuser-Busch, Inc.</SECURITY_DESCRIPTION>
	<CORE_ACCOUNT_IND>Y</CORE_ACCOUNT_IND>
    </FOV0001A_POSITION>
    <FOV0001A_POSITION>
	<INSTRUMENT_TYPE_ID>1</INSTRUMENT_TYPE_ID>
	<MARKET_VALUE_ID>6788.50</MARKET_VALUE_ID>
	<AS_OF_DATE>19980811</AS_OF_DATE>
	<BROKERAGE_ACCT_TYPE>Z</BROKERAGE_ACCT_TYPE>
	<SETTLEMENT_DATE_SHARES>8446.25000</SETTLEMENT_DATE_SHARES>
	<TRADE_DATE_SHARES>8446.25000</TRADE_DATE_SHARES>
	<UNCOMMITTED_SHARES>0.00000</UNCOMMITTED_SHARES>
	<BROKERAGE_CLOSING_PRICE>131.250</BROKERAGE_CLOSING_PRICE>
	<BROKERAGE_ACCT_NBR>H85859483</BROKERAGE_ACCT_NBR>
	<BROKERAGE_CUSIP>JEUYRU88392</BROKERAGE_CUSIP>
	<BROKERAGE_SYMBOL>ATT</BROKERAGE_SYMBOL>
	<SECURITY_DESCRIPTION>AT&amp;T, Inc.</SECURITY_DESCRIPTION>
	<CORE_ACCOUNT_IND>Y</CORE_ACCOUNT_IND>
    </FOV0001A_POSITION>
</FOV0001A_OUTPUT>

===========================

Regards,
Charles Reitzel
creitzel at mediaone.net

