Chemical formulas in GIF files
Rzepa, Henry
h.rzepa at ic.ac.uk
Sun Jul 22 20:43:48 BST 2001
Following up the thread by Jerzy, Eugene and Michael,
I would like to comment that both GIF and PNG have
some deficiencies when it comes to capturing chemical formulas.
Although the GIF and PNG formats do have "invisible" fields
which could be made to contain a connection table, there is very
little software which can easily either detect or read this information.
If either the raster component or the connection table were to be edited,
the two would no longer correspond. Additionally,
the raster component cannot be scaled, indexed or transformed in
any useful way.
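To make the point about "invisible" fields concrete, here is a minimal
Python sketch (standard library only) which stores a connection table in
a PNG tEXt chunk and reads it back. The keyword "ConnectionTable", the
text of the table and all the helper names are invented for illustration;
they are not an established convention. Note that nothing ties the chunk
to the pixels, so editing one leaves the other silently out of date.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def add_text_chunk(png: bytes, keyword: str, text: str) -> bytes:
    """Return a copy of the PNG with a tEXt chunk inserted before IEND."""
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    out = PNG_SIGNATURE
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"IEND":
            out += make_chunk(b"tEXt", payload)
        out += make_chunk(chunk_type, data)
    return out


def read_text_chunks(png: bytes) -> dict:
    """Collect all tEXt chunks as a {keyword: text} dictionary."""
    found = {}
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found


def tiny_png() -> bytes:
    """A 1x1 8-bit greyscale PNG, standing in for a structure diagram."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte 0, then one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


# Hypothetical connection table for the heavy atoms of ethanol.
ETHANOL_TABLE = "atoms: C C O; bonds: 1-2 single, 2-3 single"

if __name__ == "__main__":
    tagged = add_text_chunk(tiny_png(), "ConnectionTable", ETHANOL_TABLE)
    print(read_text_chunks(tagged))
```

An image editor which rewrites the pixels may keep, drop or ignore such
a chunk, which is exactly the correspondence problem described above.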
About 18 months ago, we suggested an alternative display format
called SVG (http://www.ch.ic.ac.uk/svg/ ), which
is based on XML and vector graphics. We expect that
more and more chemical programs will have an SVG (and CML) export filter.
The advantage of SVG is that, being an XML language, it can be transformed
to and from other XML languages such as eg CML
(http://www.xml-cml.org/ ).
Indeed, such transforms are often done on the fly.
For example, JUMBO3-JS (see http://www.xml-cml.org/jumbo3/jumbo3-JS/ )
is a collection of JavaScripts and XML/XSLT
which takes an XML file containing eg the CML namespace and
converts it to SVG for display. This example is rather nice in illustrating
that the primary purpose of any chemical file should be to carry well structured
"self-identifying" data, and that the decision on how to display it should be
managed elsewhere, ie in this case by an SVG transform. The resulting SVG
can be displayed at high quality using eg Adobe's SVG viewer, or directly
in an SVG aware browser (eg http://www.croczilla.com/svg, or Amaya,
http://www.w3.org/Amaya/ ). These browsers have not yet reached the stage
of being usable on a daily basis, and we have not quite reached the
stage where capturing chemical structures into CML for SVG-based
display can immediately displace the use of PNG images, but the day
cannot be too far off.
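The separation of data from display can be sketched in a few lines of
Python. This is not how JUMBO3-JS itself works (that uses JavaScript and
XSLT); it only illustrates the same idea under some assumptions: the
molecule is written in a simplified CML-like form using the attribute
names elementType, x2, y2 and atomRefs2, the namespace URI shown is
assumed, and the function name cml_to_svg and its scale and margin
parameters are invented for this example.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical input: heavy atoms of ethanol with 2D coordinates.
CML_DOC = f"""<molecule xmlns="{CML_NS}" id="ethanol">
  <atomArray>
    <atom id="a1" elementType="C" x2="0.0" y2="0.0"/>
    <atom id="a2" elementType="C" x2="1.0" y2="0.5"/>
    <atom id="a3" elementType="O" x2="2.0" y2="0.0"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a2 a3" order="1"/>
  </bondArray>
</molecule>"""


def cml_to_svg(cml_text: str, scale: float = 60.0, margin: float = 30.0) -> str:
    """Draw each bond as a line and each atom as a text label."""
    ns = {"cml": CML_NS}
    molecule = ET.fromstring(cml_text)
    raw = [(a.get("id"), a.get("elementType"), float(a.get("x2")), float(a.get("y2")))
           for a in molecule.findall("cml:atomArray/cml:atom", ns)]
    x_min = min(x for _, _, x, _ in raw)
    y_max = max(y for _, _, _, y in raw)

    # Map chemical coordinates to pixels; the SVG y axis points downwards,
    # so y is flipped.
    atoms = {atom_id: (element, margin + scale * (x - x_min), margin + scale * (y_max - y))
             for atom_id, element, x, y in raw}
    width = max(x for _, x, _ in atoms.values()) + margin
    height = max(y for _, _, y in atoms.values()) + margin

    ET.register_namespace("", SVG_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg",
                     {"width": f"{width:.1f}", "height": f"{height:.1f}"})
    for bond in molecule.findall("cml:bondArray/cml:bond", ns):
        start, end = bond.get("atomRefs2").split()
        _, x1, y1 = atoms[start]
        _, x2, y2 = atoms[end]
        ET.SubElement(svg, f"{{{SVG_NS}}}line",
                      {"x1": f"{x1:.1f}", "y1": f"{y1:.1f}",
                       "x2": f"{x2:.1f}", "y2": f"{y2:.1f}", "stroke": "black"})
    for element, x, y in atoms.values():
        label = ET.SubElement(svg, f"{{{SVG_NS}}}text",
                              {"x": f"{x:.1f}", "y": f"{y:.1f}", "text-anchor": "middle"})
        label.text = element
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(cml_to_svg(CML_DOC))
```

The chemistry stays in the CML; the choices about scale, margins and
labels live entirely in the transform and can be changed without
touching the data.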
Google claims that 1.3 billion web pages exist. These probably reference
10 billion GIF/JPG/PNG images. If even only 1% of these images relate
to chemistry, that is 100 million chemical images out there which carry
no easily retrievable chemical information such as connection tables,
or 2D/3D coordinates. That is a terrible loss to the community. Imagine
however the situation if eg 100 million SVG/CML files existed
(SVG can contain CML and vice versa, ie you can have BOTH
in a file). That would be three times the known number of molecules,
and a fantastic resource! Dream on, you may say, but we have to
plan for this day, which may come sooner than you think!
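As a rough illustration of having BOTH in one file, the Python sketch
below pulls a connection table out of an SVG document which carries a
CML fragment in its metadata element. The choice of the metadata element
as the carrier, the simplified CML-like attribute names, the assumed CML
namespace URI and the function name connection_table are all assumptions
made for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI

# Hypothetical combined file: the lines are the picture, the metadata
# element carries the chemistry.
COMBINED = f"""<svg xmlns="{SVG_NS}" width="180" height="90">
  <metadata>
    <molecule xmlns="{CML_NS}" id="ethanol">
      <atomArray>
        <atom id="a1" elementType="C"/>
        <atom id="a2" elementType="C"/>
        <atom id="a3" elementType="O"/>
      </atomArray>
      <bondArray>
        <bond atomRefs2="a1 a2" order="1"/>
        <bond atomRefs2="a2 a3" order="1"/>
      </bondArray>
    </molecule>
  </metadata>
  <line x1="30" y1="60" x2="90" y2="30" stroke="black"/>
  <line x1="90" y1="30" x2="150" y2="60" stroke="black"/>
</svg>"""


def connection_table(svg_text: str):
    """Return (atoms, bonds) from embedded CML, or None if there is none."""
    ns = {"svg": SVG_NS, "cml": CML_NS}
    root = ET.fromstring(svg_text)
    molecule = root.find("svg:metadata/cml:molecule", ns)
    if molecule is None:
        return None
    # atom id -> element symbol
    atoms = {a.get("id"): a.get("elementType")
             for a in molecule.findall("cml:atomArray/cml:atom", ns)}
    # one (start id, end id, order) tuple per bond
    bonds = [tuple(b.get("atomRefs2").split()) + (b.get("order"),)
             for b in molecule.findall("cml:bondArray/cml:bond", ns)]
    return atoms, bonds


if __name__ == "__main__":
    print(connection_table(COMBINED))
```

An ordinary XML parser is all that is needed to recover the chemistry,
which is what makes such files indexable in a way that GIF or PNG
images are not.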
--
Henry Rzepa. +44 (0)20 7594 5774 (Office) +44 (0870) 132-3747 (eFax)
Dept. Chemistry, Imperial College, London, SW7 2AY, UK.
http://www.ch.ic.ac.uk/rzepa/
chemweb: A list for Chemical Applications of the Internet.
To post to list: mailto:chemweb at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/chemweb/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe chemweb
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)