Chemical formulas in GIF files

Rzepa, Henry h.rzepa at ic.ac.uk
Sun Jul 22 20:43:48 BST 2001


Following up the thread by Jerzy, Eugene and Michael,
I would like to comment that both GIF and PNG have
some deficiencies when it comes to capturing chemical formulas.
Although the GIF and PNG formats do have "invisible" fields
which could be made to contain a connection table, there is very
little software which can easily detect or read this information.
If either the raster component or the connection table were edited,
the two would almost certainly no longer correspond. Additionally,
the raster component cannot be scaled, indexed or transformed in
any useful way.
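
To make the point about "invisible" fields concrete, here is a minimal
sketch in Python of hiding a connection table in a PNG tEXt chunk and
reading it back. The keyword "ConnectionTable" and the one-bond-per-line
text format are invented for this illustration only; no chemical program
is known to use them. The image is a throwaway 1x1 pixel.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def make_png_with_text(keyword, text):
    """Return a 1x1 grayscale PNG carrying one tEXt chunk (keyword, text)."""
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x00"  # filter-type byte followed by one black pixel
    text_data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"tEXt", text_data)
        + make_chunk(b"IDAT", zlib.compress(scanline))
        + make_chunk(b"IEND", b"")
    )


def read_text_chunks(png_bytes):
    """Walk the chunk list and collect every tEXt keyword/text pair."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    pos = 8
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found


# Hypothetical connection table: "atom atom bond-order" per line.
png = make_png_with_text("ConnectionTable", "C1 C2 1\nC2 O3 1")
print(read_text_chunks(png))
```

Nothing in the file ties the text chunk to the pixels: an image editor
can repaint the raster and leave the chunk untouched, or drop the chunk
altogether, which is exactly the synchronisation problem described above.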

About 18 months ago, we suggested an alternative display format
called SVG (http://www.ch.ic.ac.uk/svg/ ), which
is based on XML and vector graphics. We expect that
more and more chemical programs will have an SVG (and CML) export filter.

The advantage of SVG is that, being an XML language, it can be
transformed to and from other XML languages such as CML
(http://www.xml-cml.org/ ).
Indeed, such transforms are often done on the fly.


For example, JUMBO3-JS (see http://www.xml-cml.org/jumbo3/jumbo3-JS/ )
is a collection of JavaScripts and XML/XSLT
which takes an XML file containing e.g. the CML namespace and
converts it to SVG for display. This example is rather nice in illustrating
that the primary purpose of any chemical file should be to carry well structured
"self-identifying" data, and that the decision on how to display it should be
managed elsewhere, i.e. in this case by an SVG transform. The resulting SVG
can be displayed at high quality using e.g. Adobe's SVG viewer, or directly
in an SVG-aware browser (e.g. http://www.croczilla.com/svg, or Amaya,
http://www.w3.org/Amaya/ ). These browsers have not yet reached the stage
of being usable on a daily basis, and we have not quite reached the
stage where capturing chemical structures into CML for SVG-based
display can immediately displace the use of PNG images, but the day
cannot be too far off.
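
The data-first, display-later idea can be sketched in a few lines of
Python. This is not how JUMBO3-JS works (that uses JavaScript and XSLT);
it is only an illustration of a CML-to-SVG transform. The namespace URI,
the use of the x2/y2 and atomRefs2 attributes, and the function name
cml_to_svg are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI
SVG_NS = "http://www.w3.org/2000/svg"

# A small hand-written molecule with 2D coordinates (illustrative values).
CML_EXAMPLE = """<molecule xmlns="http://www.xml-cml.org/schema" id="ethanol">
  <atomArray>
    <atom id="a1" elementType="C" x2="0.0" y2="0.0"/>
    <atom id="a2" elementType="C" x2="1.0" y2="0.5"/>
    <atom id="a3" elementType="O" x2="2.0" y2="0.0"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a2 a3" order="1"/>
  </bondArray>
</molecule>"""


def cml_to_svg(cml_text, scale=40.0, margin=20.0):
    """Turn atoms into <text> labels and bonds into <line> elements."""
    molecule = ET.fromstring(cml_text)

    # Map atom id -> (element symbol, x, y) in SVG user units.
    # The y axis is not flipped here; SVG y grows downwards.
    atoms = {}
    for atom in molecule.iter("{%s}atom" % CML_NS):
        x = margin + scale * float(atom.get("x2"))
        y = margin + scale * float(atom.get("y2"))
        atoms[atom.get("id")] = (atom.get("elementType"), x, y)

    width = max(x for _, x, _ in atoms.values()) + margin
    height = max(y for _, _, y in atoms.values()) + margin

    ET.register_namespace("", SVG_NS)  # serialise SVG as the default namespace
    svg = ET.Element(
        "{%s}svg" % SVG_NS, {"width": "%.1f" % width, "height": "%.1f" % height}
    )

    for bond in molecule.iter("{%s}bond" % CML_NS):
        first, second = bond.get("atomRefs2").split()
        _, x1, y1 = atoms[first]
        _, x2, y2 = atoms[second]
        ET.SubElement(svg, "{%s}line" % SVG_NS, {
            "x1": "%.1f" % x1, "y1": "%.1f" % y1,
            "x2": "%.1f" % x2, "y2": "%.1f" % y2,
            "stroke": "black",
        })

    for symbol, x, y in atoms.values():
        label = ET.SubElement(
            svg, "{%s}text" % SVG_NS, {"x": "%.1f" % x, "y": "%.1f" % y}
        )
        label.text = symbol

    return ET.tostring(svg, encoding="unicode")


print(cml_to_svg(CML_EXAMPLE))
```

The CML stays the master copy; changing the scale, the fonts or the
whole drawing style only means changing the transform, never the data.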

Google claims that 1.3 billion web pages exist. These probably reference
10 billion GIF/JPG/PNG images. If even only 1% of these images relate
to chemistry, that is 100 million chemical images out there which carry
no easily retrievable chemical information such as connection tables
or 2D/3D coordinates. That is a terrible loss to the community. Imagine,
however, the situation if e.g. 100 million SVG/CML files existed
(SVG can contain CML and vice versa, i.e. you can have BOTH
in a file). That would be three times the known number of molecules,
and a fantastic resource! Dream on, you may say, but we have to
plan for this day, which may come sooner than you think!
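
As a rough illustration of having BOTH in one file, the Python sketch
below carries a CML molecule inside the SVG metadata element and pulls
the connection table back out. Placing the CML under metadata, the
namespace URI and the function name extract_connection_table are
assumptions chosen for this example, not an agreed convention.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI

# One file: the drawing (text element) plus the chemistry (CML in metadata).
COMBINED = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="60">
  <metadata>
    <molecule xmlns="http://www.xml-cml.org/schema" id="water">
      <atomArray>
        <atom id="a1" elementType="O"/>
        <atom id="a2" elementType="H"/>
        <atom id="a3" elementType="H"/>
      </atomArray>
      <bondArray>
        <bond atomRefs2="a1 a2" order="1"/>
        <bond atomRefs2="a1 a3" order="1"/>
      </bondArray>
    </molecule>
  </metadata>
  <text x="40" y="35">H2O</text>
</svg>"""


def extract_connection_table(svg_text):
    """Return (element, element, bond order) tuples for every CML bond found."""
    root = ET.fromstring(svg_text)
    table = []
    for molecule in root.iter("{%s}molecule" % CML_NS):
        # Map atom id -> element symbol within this molecule.
        symbols = {
            atom.get("id"): atom.get("elementType")
            for atom in molecule.iter("{%s}atom" % CML_NS)
        }
        for bond in molecule.iter("{%s}bond" % CML_NS):
            first, second = bond.get("atomRefs2").split()
            table.append((symbols[first], symbols[second], bond.get("order")))
    return table


print(extract_connection_table(COMBINED))
```

A crawler or search engine could run something like this over every
such file it meets, which is not possible with a GIF.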
-- 

Henry Rzepa. +44 (0)20 7594 5774 (Office) +44 (0870) 132-3747 (eFax)
Dept. Chemistry, Imperial College, London, SW7  2AY, UK. 
http://www.ch.ic.ac.uk/rzepa/


chemweb: A list for Chemical Applications of the Internet.
To post to list:  mailto:chemweb at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/chemweb/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe chemweb
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



