Python Buzz Forum - Exceptions to the graph model

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Sponsored Link •

Python Buzz Forum
Exceptions to the graph model

0 replies on 1 page.

Welcome Guest
Sign In

Back to Topic List

Reply to this Topic

Search Forum

Threaded View


Previous Topic		Next Topic

Flat View: This topic has 0 replies on 1 page

Andrew Dalke

Posts: 291
Nickname: dalke
Registered: Sep, 2003

Andrew Dalke is a consultant and software developer in computational chemistry and biology.

Exceptions to the graph model

Posted: Oct 7, 2003 8:33 AM

This post originated from an RSS feed registered with Python Buzz by Andrew Dalke.
Original Post: Exceptions to the graph model Feed Title: Andrew Dalke's writings Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.	Latest Python Buzz Posts Latest Python Buzz Posts by Andrew Dalke Latest Posts From Andrew Dalke's writings

Yesterday I said that many compounds could be represented as a graph. The computer scientist in you might ask what can't be represented as graphs. Salts (like good old NaCl) are one. These have ionic bonds caused by the interactions between strong charges, and not covalent ones caused by the overlap of orbitals. Salt dissolves so easily because a charge doesn't care what it interacts with so long as it's the opposite charge. Water has its own partial charges which love to join in on the fun, and the result is entropy heaven.

Crystals are infinite in extent. What's the graph of diamond or graphite?

A graph based on covalent bonds cannot handle metals, which have delocalized electrons and so no real covalent bonds. There's even problems when there's only one metal atom, like in the heme of hemoglobin or photosynthesis. An atom binds to the delocalized ring of the heme and not to any one atom. There are ways to fake it, but I recall that Cherwell's tookit supported "ring" as a new graph element, in addition to atoms and bonds.

And then there's hydrogen gas absorbed in platinum. Not only is Pt a metal but the hydrogens reversibly bind to the Pt atoms so move about.

Polymers cause more problems. These can have a huge number of different lengths and branchings ("10²⁰ and more" for polysaccharides!). While each one can be represented as a discrete graph, you'll need better than 64 bits to enumerate them all. Even if you do that, it isn't obvious what you can do with that data.

The nice thing about these exceptions is that they rarely make good drugs. Good drugs are small, have well-defined shapes, and only come in one of at most a couple chiral forms. The exceptions are usually protein-based drugs, like insulin, where non-graph models are more appropriate.

The usefulness of the molecular graph decreases with the size of the molecule. Molecules have a three-dimensional structure (called a conformation) as well as the two-dimensional structure (the graph). Small molecules usually have one or only a few common conformations and these can be determined reasonably well directly from the graph using Corina, Omicron, Concord, or similar programs. Even when the prediction isn't exactly the minimum energy state, thermal motion and other flucuations make it likely that the molecule will be close to a predicted conformation at some point.

The larger the molecule the less likely it will be predicted correctly, and the less likely it will be ergodic. (That's fancy for "explore all phase space." That's fancy for "take on all available conformations.") Huge molecules like proteins are extremely hard to predict ab initio and are not ergodic. That's the whole protein folding problem, and the Levinthal paradox is that folding does happen despite the combinitorial problems.

When a molecule gets that large, its conformation plays a much more important role. For example, the serine protease works because of the arrangement of the catalytic triad. You can change a lot of the rest of the molecule, making a very different graph, and still have a serine protease so long as the fold is correct.

A better model for this stage is the bioinformatics one, where proteins and DNA are represented by one-dimensional sequences (a high-level abstraction of the graph). This allows similarity searches, which are extremely hard to do for graphs, and supports concepts like phylogenies to describe the evolutionary relationship between sequences.

Read: Exceptions to the graph model

Previous Topic

Next Topic


	Web Artima.com