XSL and the semantic web

Marcelo Cantos marcelo at mds.rmit.edu.au
Tue Jun 22 03:30:28 BST 1999

On Mon, Jun 21, 1999 at 04:47:35PM -0700, David Brownell wrote:
> Marc.McDonald at Design-Intelligence.com wrote:
> > 
> > When sensitive data needs to be hidden I would send it out
> > subsetted in the xml: <name>joe</name> <phone>555-12345</phone>

None of the following argument is at all relevant.  Both formats I
presented in my example made the phone number accessible.  Both could
be used by automatic agents to perform all the privacy invading
activities you warn against.  You are focussing far too much on the
specifics of the example (names and phone numbers) to express a
generic objection to the exposition of semantic content.

What if the data were thus:

    <title>The Structure of Scientific Revolutions</title>
    <author>T. S. Kuhn</author>

and the two presentation examples were:

    <H3>The Structure of Scientific Revolutions</H3>
    <P>Author: <EM>T. S. Kuhn</EM></P>

    <title>The Structure of Scientific Revolutions</title>
    <author>T. S. Kuhn</author>

Where is big brother in this?  Where is the unspeakable danger of
losing our freedoms and privacy?

> When classifying information, there's an interesting category of
> "sensitive but not classified" information, which is often more
> useful in aggregate than in individual cases.  Schedules for
> transport could disclose military operations-to-be when many are
> analysed together; one alone is innocuous.
>
> Phone numbers are a great example of such "sensitive" data,
> particularly when linked with caller ID services that most phone
> companies are pushing.  (Less successfully in California than in most
> states, I'm pleased to report!)
>
> Example:  Hmm, why is Karen calling from Joe's place again?  Could
> be those rumors are correct!  I'll tell ... <XYZ> !!
>
> The web makes it easy to do such aggregations.  Correlations against
> "Joe" will have lots of noise; against "Joe" and that phone number,
> a lot less.  How is Joe going to be able to defend himself?  Partly
> by not disclosing information in aggregatable form ... removing the
> labeling and content, pre-rendering it (HTML, FO, PDF, GIF, etc),
> and so on.  Partly by insisting that others not disclose such
> information either.
>
> That means controlling the information accessible through the
> "semantic web" ... if XSL is a tool that becomes effective at
> controlling information spread, more power to it!  (Both XSL-T and
> XSL-FO.)  And that's true of almost any information that's important
> enough to share -- it can be important enough to merit protection,
> too.
>
> > As to the privacy argument (too easy to get information about
> > other folk...):
> >
> > I agree, but having the information out there but hard to parse
> > doesn't really solve the problem. It just lets those with more
> > expertise, money, power define a first class which gets the
> > information and a second class that doesn't.
>
> Which is always the case.  The issue is how to keep the bar high
> enough to have some balance; security and privacy are never
> absolute, though lack of them can become absolute.

And HTML raises the bar about half an inch off the ground -- annoying
you when you trip over it, but easily stepped over.  The real problem
is you have to always remember it's there, and it might move (the
webmaster might decide to pretty it up, or bung an ad in front).  So
it provides only aggravation for honest users, and does nothing at all
to stop dishonest users.
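
To make the half-inch bar concrete, here is a minimal Python sketch of
both routes to the same data, using the book example from earlier in
this message.  It is an illustration only, not anything posted in the
thread: the function names, the sample strings and the <book> wrapper
element (added so the XML fragment has a single root) are all
assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample documents, modelled on the book example above.
HTML_PAGE = """<H3>The Structure of Scientific Revolutions</H3>
<P>Author: <EM>T. S. Kuhn</EM></P>"""

# The <book> root element is an assumption; the example shows only
# the two child elements.
XML_DATA = """<book>
  <title>The Structure of Scientific Revolutions</title>
  <author>T. S. Kuhn</author>
</book>"""


def scrape_html(page):
    """Recover (title, author) from presentational HTML by pattern matching.

    This only works while the page keeps exactly the layout assumed here.
    """
    title = re.search(r"<H3>(.*?)</H3>", page, re.S)
    author = re.search(r"Author:\s*<EM>(.*?)</EM>", page, re.S)
    return (title.group(1) if title else None,
            author.group(1) if author else None)


def read_xml(data):
    """Recover (title, author) from semantic markup by element name."""
    root = ET.fromstring(data)
    return root.findtext("title"), root.findtext("author")


if __name__ == "__main__":
    print(scrape_html(HTML_PAGE))
    print(read_xml(XML_DATA))

    # A cosmetic redesign (EM swapped for B) silently breaks the scraper.
    redesigned = HTML_PAGE.replace("<EM>", "<B>").replace("</EM>", "</B>")
    print(scrape_html(redesigned))
```

Two short patterns are enough to step over the HTML bar, so it stops
nobody who is determined.  The only party it trips up is the honest
user whose scraper stops finding the author the day the page is
restyled.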

> You assumed the context of an intranet, so the threat was less
> because access was restricted ... that's not particularly a good
> assumption, since most crime is "insider" crime, by folk who know
> the victim(s).  True not just in the corporate world, but elsewhere.

And don't forget my earlier comment.  This isn't just about exposing
personal information about humans.

The real danger, to me, of keeping such a potent capability in the
closet is that, in the absence of any real discussion of the
nitty-gritty details (which can only occur once the community has
started to push in the direction of semantic content exposition),
implementors will undertake courses of action that expose people to
all the perils you warn against.  They will not listen to the
community when it says, "semantic content considered harmful."  They
will do it anyway (it is, after all, an exceedingly useful concept),
and they will do it badly.



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
