Python Buzz Forum - Naming known molecules

Andrew Dalke


Andrew Dalke is a consultant and software developer in computational chemistry and biology.

Naming known molecules

Posted: Oct 8, 2003 2:33 AM

This post originated from an RSS feed registered with Python Buzz by Andrew Dalke.
Original Post: Naming known molecules
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.

Surely there must be a cleaner way to name a molecule.

That runs over and over in your head. Why do you want a name? You're looking for additional information about a chemical graph, so what about using a graph search instead of a text search? Suppose all chemical compounds were stored in a computer as graphs. To search the database, sketch the compound, then do a graph isomorphism search. Graph isomorphism is slower than a text compare, so the search could be sped up with filters. E.g., search first for a matching molecular formula and only do the graph search on the records which pass the filter.
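Here is a minimal sketch of that filter-then-match idea in Python. Everything in it is an assumption made for illustration: the dictionary layout, the function names (`make_molecule`, `formula`, `is_isomorphic`, `search`), and the toy three-record database are all made up, hydrogens are left out, and bond orders are ignored. The isomorphism test is plain brute-force backtracking, which is only workable for tiny graphs; a real system would use something much smarter.

```python
from collections import Counter


def make_molecule(atoms, bonds):
    """Build a toy molecule: atoms is a list of element symbols,
    bonds is an iterable of (i, j) pairs of atom indices."""
    adjacency = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return {"atoms": list(atoms), "adj": adjacency}


def formula(mol):
    """Molecular formula as element counts (heavy atoms only in this sketch)."""
    return Counter(mol["atoms"])


def is_isomorphic(a, b):
    """Brute-force backtracking graph isomorphism on element-labelled graphs."""
    n = len(a["atoms"])
    if n != len(b["atoms"]):
        return False
    mapping = {}  # atom index in a -> atom index in b
    used = set()  # atom indices in b that are already taken

    def extend(i):
        if i == n:
            return True
        for j in range(n):
            if j in used:
                continue
            if a["atoms"][i] != b["atoms"][j]:
                continue
            if len(a["adj"][i]) != len(b["adj"][j]):
                continue
            # Each already-mapped atom must agree on bonded / not bonded.
            if all((k in a["adj"][i]) == (mapping[k] in b["adj"][j])
                   for k in mapping):
                mapping[i] = j
                used.add(j)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(j)
        return False

    return extend(0)


def search(database, query):
    """Return names of records whose graph matches the query."""
    query_formula = formula(query)
    hits = []
    for name, mol in database.items():
        if formula(mol) != query_formula:  # cheap filter first
            continue
        if is_isomorphic(mol, query):      # expensive check only on survivors
            hits.append(name)
    return hits


# Hypothetical database of hydrogen-suppressed graphs.
DATABASE = {
    "methanol": make_molecule(["C", "O"], [(0, 1)]),
    "ethanol": make_molecule(["C", "C", "O"], [(0, 1), (1, 2)]),
    "dimethyl ether": make_molecule(["C", "O", "C"], [(0, 1), (1, 2)]),
}

# The same ethanol skeleton, sketched starting from the oxygen.
query = make_molecule(["O", "C", "C"], [(0, 1), (1, 2)])
print(search(DATABASE, query))
```

Methanol is thrown out by the formula filter alone. Dimethyl ether has the same heavy-atom formula as ethanol, so it gets past the filter and is only rejected by the graph comparison, leaving `['ethanol']` as the result.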

Hey! That could work! It would be even better if all the chemistry papers were put into the database, so anyone could look up a paper given the graph of a compound of interest. Oooh! And if it included published reactions as well, then people could get pointers on how to synthesize a compound.

Much to your delight (or chagrin), you find that the Chemical Abstracts Service beat you to this.

Substance identification is a special strength of CAS. It is widely known as the CAS Registry, the largest substance identification system in existence. When a chemical substance, newly encountered in the literature, is processed by CAS, its molecular structure diagram, systematic chemical name, molecular formula, and other identifying information are added to the Registry and it is assigned a unique CAS Registry Number. Registry now contains records for more than 22 million organic and inorganic substances and more than 34 million sequences.

They digitize all this information, make it searchable, and license the technology for others to develop search software for your computer. Or if you want, you can get it on paper, microfilm, or CD-ROM. All for a price of somewhere between a few hundred and nearly 30,000 dollars/year depending on who you are and what you want. (Who says information wants to be ~~anthropomorphized~~ free? Actually, the cost in part reflects the service needed to keep things up to date with the literature and in part the high barrier to anyone else reproducing their database; the skills of inexpensive off-shore chemists notwithstanding.)

They are also a naming service. They assign a new, unique CAS number for every compound in the database. Ethanol is CAS# 64-17-5. You can design your compound database system to store the CAS# as the primary key. When you need more ethanol -- without the tasty impurities you'll get from your pub -- ring up your supplier and order it by CAS#. This helps make sure both parties are talking about the same thing.
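As a rough sketch of the primary-key idea, here is a tiny in-memory SQLite table keyed on the CAS#. The table name, column names, and the `lookup` helper are invented for this example; ethanol's number is written in the usual hyphenated form, 64-17-5.

```python
import sqlite3

# In-memory database; the schema is a made-up illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compound (cas TEXT PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO compound VALUES (?, ?)", ("64-17-5", "ethanol"))


def lookup(cas):
    """Return the stored name for a CAS#, or None if it is not on file."""
    row = conn.execute(
        "SELECT name FROM compound WHERE cas = ?", (cas,)
    ).fetchone()
    return row[0] if row else None


print(lookup("64-17-5"))

# The primary key refuses a second record under the same CAS#.
try:
    conn.execute("INSERT INTO compound VALUES (?, ?)", ("64-17-5", "grain alcohol"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because the CAS# is the key, two names for the same substance can't quietly end up as two separate records.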

Problem solved. You can isolate a compound, determine its structure, get the CAS# and/or its IUPAC name, and look it up in the literature. Or is it solved?....

