The Artima Developer Community
Weblogs Forum
Programmers Shouldn't Touch the Source

83 replies on 6 pages. Most recent reply: Jul 11, 2006 2:02 PM by Hossam Mashhady

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Programmers Shouldn't Touch the Source
Posted: Oct 21, 2005 7:18 AM
Summary
More and more programmers and researchers have been suggesting heresies along the lines of "programmers should only work with a view of source code, not the source itself".
Greg Wilson recently inspired me to think deeply about programmers working only with an abstraction of source code, through his article "Extensible Programming Systems for the 21st Century".

Traditionally, programmers work with the source code of a language directly in a text editor. Most languages have numerous automated tools to help with writing and viewing the source, but in the end the programmer still works directly on the source code.

An idea that I have heard more than once is that an IDE should only present a specific view of the source code, while the real source is hidden behind the scenes in a more general format, like XML. If a source file were really an XML file underneath, and rarely edited directly by a programmer, that could bring many advantages.

One of many advantages is that history tracking could be embedded in the XML source, without being in the face of programmers all the time. When I first look at source code, I like to see the history of revisions, who did what when or why. If this is embedded in the source code as comments, I often strip it so that the code is easier for me to work with, and I know I am not the only one. If the editor only presents me with a view of the source, turning history viewing on or off could be a simple matter of checking a property option.

Given the following program:

program answer {
  _main() {
    // the question of life, the universe and everything
    x = 15 + 3 * 9;
    // the answer to life, the universe and everything
    write(x);
  }
}
Here is an example of how an XML source-code might look:
<program name="answer">
  <function name="_main">
    <history>
      <modified>
        <author>Christopher Diggins</author>
        <date>10/21/2005</date>
        <license>BSD</license>
      </modified>
      <original>
        <author>unknown</author>
        <license>Public Domain</license>
        <url>http://www.somewhere.there</url>
      </original>
    </history>
    <statement>
      <raw>x = 15 + 3 * 9</raw>
      <ast>
        <push>x</push>
        <push>15</push>
        <push>3</push>
        <push>9</push>
        <call>_star</call>
        <call>_plus</call>
        <call>_eq</call>
      </ast>
      <comment>
        the question of life, the universe and everything
      </comment>
    </statement>
    <statement>
      <raw>write(x)</raw>
      <ast>
        <push>x</push>
        <call>write</call>
      </ast>
      <comment>
        the answer to life, the universe and everything
      </comment>
    </statement>
  </function>
</program>
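To make the "view" idea concrete, here is a minimal sketch in Python of an editor-side renderer that reads a trimmed-down copy of the XML above and produces the text a programmer would see. The function name `render` and the flags `show_history` and `show_comments` are hypothetical, chosen only for illustration; nothing like this is specified in the post.

```python
import xml.etree.ElementTree as ET

# A trimmed-down copy of the XML example above (the <ast> parts are omitted).
SOURCE = """\
<program name="answer">
  <function name="_main">
    <history>
      <modified>
        <author>Christopher Diggins</author>
        <date>10/21/2005</date>
      </modified>
    </history>
    <statement>
      <raw>x = 15 + 3 * 9</raw>
      <comment>the question of life, the universe and everything</comment>
    </statement>
    <statement>
      <raw>write(x)</raw>
      <comment>the answer to life, the universe and everything</comment>
    </statement>
  </function>
</program>
"""


def render(xml_text, show_history=False, show_comments=True):
    """Return a plain-text view of the XML-stored program (hypothetical helper)."""
    program = ET.fromstring(xml_text)
    lines = [f"program {program.get('name')} {{"]
    for function in program.findall("function"):
        if show_history:
            # Each child of <history> becomes one comment line in the view.
            for entry in function.findall("history/*"):
                author = entry.findtext("author", default="unknown")
                date = entry.findtext("date", default="n/a")
                lines.append(f"  // {entry.tag}: {author}, {date}")
        lines.append(f"  {function.get('name')}() {{")
        for statement in function.findall("statement"):
            comment = statement.findtext("comment")
            if show_comments and comment:
                lines.append(f"    // {comment.strip()}")
            lines.append(f"    {statement.findtext('raw').strip()};")
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


# Default view: code and comments, no history.
print(render(SOURCE))
# Alternative view: history switched on, comments switched off.
print(render(SOURCE, show_history=True, show_comments=False))
```

The stored file never changes between the two calls; only the options passed to the view do, which is the "property option" described earlier.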
I think that storing source as XML could give a rebirth to theoretically good ideas like literate programming, which tend to be ignored in practice.

There are a lot of possibilities with using XML source representation. What are your ideas on things you would like stored with the source code that you don't want to always have to look at?


Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 21, 2005 10:26 AM
This is an interesting concept.

Let's see, people generally need comments or other tools to record:

* build information/dependencies;
* todo lists;
* changes and rationale;
* design notes/Doxygen-like info;
* configuration info;
* RPC stuff (interfaces);
* tests;
* debug-specific info;
* lots of other things I'm not even aware of.

I think this approach could address each of these.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 21, 2005 11:20 AM
> This is an interesting concept.
>
> Let's see, people generally need comments or other tools
> to record:
>
> * build information/dependencies;
> * todo lists;
> * changes and rationale;
> * design notes/Doxygen-like info;
> * configuration info;
> * RPC stuff (interfaces);
> * tests;
> * debug-specific info;
> * lots of other things I'm not even aware of.
>
> I think this approach could address each of these.

I think so too.

I should also note that I am discussing this topic on the Extensible Programming Languages mailing list: http://pyre.third-bit.com/pipermail/extprog/

So far I have been pointed to several supposedly related projects:
MetaL - http://www.meta-language.net/
LEO - http://webpages.charter.net/edreamleo/front.html
XVCL - http://xvcl.comp.nus.edu.sg/overview_technical_a.php
X++ - http://xplusplus.sourceforge.net/index.htm

And another project, which I found on the web, that seems related:
o:xml - http://www.o-xml.org/

Michael Feathers

Posts: 448
Nickname: mfeathers
Registered: Jul, 2003

Re: Programmers Shouldn't Touch the Source Posted: Oct 21, 2005 12:29 PM
It's interesting. In one way of looking at those systems, the XML is the source code. In another way, it's just the serialization format.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 21, 2005 1:01 PM
Here is another example for interest's sake. It is the description of a very simple stack interface.


<namespace name="heron">
  <namespace name="collections">
    <import>primitives</import>
    <history>
      <revision>
        <version>0.1</version>
        <author>Christopher Diggins</author>
        <date>10/21/2005</date>
        <archived>code:cdiggins.com/heron/collections/zips/hpc0-1.zip</archived>
        <tests>code:cdiggins.com/heron/collections/tests/0-1/tests.xml</tests>
      </revision>
      <revision>
        <version>0.0</version>
        <author>Christopher Diggins</author>
        <date>10/21/2005</date>
        <comments>This was the original version</comments>
        <license>Public Domain</license>
        <archived>www.nowhere.com</archived>
      </revision>
    </history>
    <license>Public Domain</license>
    <interface language="heron" name="Stack">
      <parameter>
        <metatype>type</metatype>
        <name>T</name>
        <concept>any</concept>
      </parameter>
      <function>
        <name>IsEmpty</name>
        <type>bool</type>
      </function>
      <function>
        <name>Push</name>
        <parameters>
          <parameter>
            <type>T^</type>
            <name>x</name>
          </parameter>
        </parameters>
        <exception>IsFull</exception>
      </function>
      <function>
        <name>Pop</name>
        <result>
          <type>T^</type>
        </result>
        <pre>
          <raw>!IsEmpty()</raw>
          <ast>
            <call>IsEmpty</call>
            <call>_bang</call>
            <call>_cast[bool]</call>
          </ast>
        </pre>
      </function>
    </interface>
  </namespace>
</namespace>

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

thought of another one Posted: Oct 21, 2005 1:50 PM
Plan 9, the operating system, came with an interesting utility to find threading issues (http://www.cs.bell-labs.com/sys/doc/spin.html), but it requires you to create a model of the program. Hmm. Sounds like a job for metadata.

Todd Blanchard

Posts: 316
Nickname: tblanchard
Registered: May, 2003

Odd idea Posted: Oct 21, 2005 2:11 PM
First, XML is crrraaaap (see other forum) and about the clumsiest format I can think of for code.

However, an AST representing the language isn't so far-fetched. In fact, this is pretty much what you get in a Smalltalk environment. The text is a serialization format of the code. The code is what you execute, and it's just data.

Having an AST (think DOM if you like) allows interesting programmatic transformations.
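As a small illustration of such a transformation, the sketch below uses Python's standard `ast` module as a stand-in for the Smalltalk or DOM-style tree mentioned above (this is an assumption made for the example, not what any of the systems in this thread use). It folds constant arithmetic, so the earlier statement `x = 15 + 3 * 9` is rewritten as a tree and printed back as text. The names `ConstantFolder` and `fold_constants` are hypothetical, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import operator

# Map AST operator node types to the arithmetic they denote.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def _is_number(node):
    """True for literal int/float constants (bools are deliberately excluded)."""
    return isinstance(node, ast.Constant) and type(node.value) in (int, float)


class ConstantFolder(ast.NodeTransformer):
    """Replace 'constant op constant' subtrees with a single constant."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        fold = _FOLDABLE.get(type(node.op))
        if fold and _is_number(node.left) and _is_number(node.right):
            folded = ast.Constant(value=fold(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Parse text to a tree, transform the tree, and serialize it back to text."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold_constants("x = 15 + 3 * 9"))  # prints: x = 42
print(fold_constants("y = x + 2 * 3"))   # prints: y = x + 6
```

The transformation never touches the text directly; the text is only the input and output serialization, which is the point being made about Smalltalk.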

Sean Conner

Posts: 19
Nickname: spc476
Registered: Aug, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 21, 2005 2:16 PM
CornerStone (a database engine from Infocom in the mid 80s) had an interesting feature: user-defined names (identifiers) were separate from the internal IDs used by the system. Imagine changing the name of a function (or variable), and having it automatically change everywhere else in the source code where it is used (heck, IDEs could be doing that now, but since I still use an editor, I wouldn't know).
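A minimal sketch of that separation, in Python: statements hold references to internal IDs, and the display name lives in exactly one table, so a rename is a single update. The classes `SymbolRef` and `Program` are hypothetical and are not a description of how CornerStone actually worked.

```python
class SymbolRef:
    """A reference to a symbol by internal ID rather than by name."""

    def __init__(self, symbol_id):
        self.symbol_id = symbol_id


class Program:
    """Stores statements as mixed lists of literal text and SymbolRef objects."""

    def __init__(self):
        self.names = {}       # internal ID -> current display name
        self.statements = []  # each statement: list of str and SymbolRef

    def declare(self, name):
        symbol_id = len(self.names)
        self.names[symbol_id] = name
        return SymbolRef(symbol_id)

    def rename(self, ref, new_name):
        # One table update; every use picks up the new name when rendered.
        self.names[ref.symbol_id] = new_name

    def render(self):
        def show(part):
            return self.names[part.symbol_id] if isinstance(part, SymbolRef) else part

        return "\n".join("".join(show(p) for p in stmt) for stmt in self.statements)


program = Program()
x = program.declare("x")
program.statements.append([x, " = 15 + 3 * 9;"])
program.statements.append(["write(", x, ");"])

print(program.render())
program.rename(x, "answer")
print(program.render())
```

After the rename, both statements render with `answer` even though neither statement was edited.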

Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

Re: Odd idea Posted: Oct 21, 2005 6:59 PM
I am constantly re-amazed at the world's love affair with XML. Why use XML for every file format? In the case of storing a program, what's wrong with a context free grammar? YACC and LEX still work as far as I know, and the resulting files tend to be more readable and less bulky than XML in my opinion, and you can embed information in either.

Nevertheless, I think the notion of defining a programming language in terms of its AST instead of its syntax has a lot of merit, because it makes the "source code" a storage medium describing the AST, which developers can view and manipulate in interesting ways via tools.

My IDE, IntelliJ, does interesting and very useful transformations and analysis on the code, and though I don't know how it works on the inside, I somehow doubt it is working on text. A few years back I talked to Gosling about his Jackpot project, which does this sort of stuff. But his file format is defined as a context free grammar, not an XML schema, and that grammar is curiously identical to the grammar of a little language called Java. The hardest part, according to him, was the comments, because they aren't part of the grammar:

http://www.artima.com/intv/visualize2.html

The first article in this series gives an overview of the AST manipulation stuff:

http://www.artima.com/intv/jackpot.html

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Odd idea Posted: Oct 21, 2005 8:13 PM
> I am constantly re-amazed at the world's love affair with
> XML. Why use XML for every file format? In the case of
> storing a program, what's wrong with a context free
> grammar? YACC and LEX still work as far as I know, and the
> resulting files tend to be more readable and less bulky
> than XML in my opinion, and you can embed information in
> either.

I won't argue that there aren't disadvantages to XML. I am not in love with XML, but I think there are several advantages to using XML as a serialization format for source code, such as:

- there already exist numerous tools for parsing, editing, displaying, transforming, manipulating and translating.
- it is a mature format
- it has encoding information embedded internally
- it is quickly recognizable
- it is unambiguous
- it can be easily extended
- it is robust (i.e. works with partial information)
- it has a tree structure

For me however the issue is not so much how great XML is, but rather how bad a home-grown solution would be in comparison. Most non-trivial data representation formats that are hand-rolled are riddled with bugs, ambiguities, and often lead quickly down a road to incompatible versions from every vendor.

Todd Blanchard

Posts: 316
Nickname: tblanchard
Registered: May, 2003

Re: Odd idea Posted: Oct 21, 2005 10:24 PM
> I won't argue that there aren't disadvantages to XML. I
> am not in love with XML, but I think there are several
> advantages to using XML as a serialization format for
> source-code such as:
>
...
> - it is unambiguous

I'll disagree here - given a group of developers, a hierarchical data structure and a request to represent it as xml, you are likely to get multiple formats because of the attribute/entity ambiguity.
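To illustrate, assuming the ambiguity meant here is the choice between an XML attribute and a child element: the sketch below encodes the same `Push` function from the stack interface both ways, and shows that a consumer has to normalize the two shapes before it can treat them as equal. The helper `read_function` and both encodings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two plausible encodings of the same information.
AS_ATTRIBUTES = '<function name="Push" exception="IsFull"/>'
AS_ELEMENTS = (
    "<function>"
    "<name>Push</name>"
    "<exception>IsFull</exception>"
    "</function>"
)


def read_function(xml_text):
    """Normalize either encoding into a plain dict of fields."""
    element = ET.fromstring(xml_text)
    fields = dict(element.attrib)  # fields stored as attributes
    for child in element:          # fields stored as child elements
        fields[child.tag] = (child.text or "").strip()
    return fields


print(read_function(AS_ATTRIBUTES))
print(read_function(AS_ELEMENTS))
```

Both calls yield the same dictionary, but only because the reader was written to accept both conventions; two tools that each assume one convention would not interoperate.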

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Odd idea Posted: Oct 21, 2005 10:29 PM
> > - it is unambiguous
>
> I'll disagree here - given a group of developers, a
> hierarchical data structure and a request to represent it
> as xml, you are likely to get multiple formats because of
> the attribute/entity ambiguity.

I agree with you. I was referring to the difficulty inherent in creating a non-trivial data format with an unambiguous grammar, for instance a programming language grammar.

Alfredo Aldundi

Posts: 6
Nickname: cheesy
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 12:18 AM
> When I first look at source code, I like to see the history of revisions, who did
> what when or why. If this is embedded in the source code as comments, I
> often strip it so that the code is easier for me to work with, and I know I am
> not the only one.

What's wrong with CVS, Subversion, ...? They provide exactly this kind of information. It seems rather clumsy to include this information in source files (imagine a file with 1000 revisions ...). I never understood why people want to do that. Besides, this leads to duplication of information (in the revision control system and in the source files).

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 12:39 AM
This is an old idea. 30 years ago timeshared BASIC systems stored programs as bytecodes and rendered them as text for viewing and editing by a programmer. Numerous 4GLs did the same thing. Every article that begins with the author expressing wonder that Unix has survived for so long is immediately suspicious; I think the author is either inexperienced in the real world, unable to learn something difficult, or a crank.

Tacking XML onto any old idea makes it seem new to the wide-eyed inexperienced audience Mr. Wilson is writing for. Us old relics, hanging on to Unix and vi and command line programs, who Mr. Wilson arrogantly sneers at while he waits for us to die and retire, need to get out of the way of progress. We need to make way for the "next generation" of programmers who have new ideas that can't be held hostage by text files and command line tools and languages that can't be arbitrarily extended into millions of personalized Towers of Babel.

In the future Mr. Wilson describes, every programmer will have their own custom language, and their source code will be stored in incompatible and unreadable XML files. That will make text files and emacs and gcc flags look easy and fun.

"It has only taken HTML and XML a decade to become the most popular data format in history." I think that honor actually goes to the humble plain text format. Using XML as a "data format," contrary to the original intentions or design of XML, has created a lot of brittle, overwrought software in the last five years.

Ideas like this never really go anywhere even if they sound good on paper, so I'm not too worried. In ten years we'll still have Unix and C and vi, and XML will be forgotten except as a legacy format used to send invoices and shipping manifests around.

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: Odd idea Posted: Oct 22, 2005 12:49 AM
Let's compare plain old text to XML according to these criteria:

> - there already exist numerous tools for parsing, editing,
> displaying, transforming, manipulating and translating.

text: YES
XML: yes, but not as many as text

> - it is a mature format

text: YES
XML: not really, it's still evolving

> - it has encoding information embedded internally

text: yes, but not according to any single standard
XML: YES, frequently unnecessarily

> - it is quickly recognizable

text: yes, and it's human-readable too
XML: yes, but not human-readable

> - it is unambiguous

text: not in the sense I think you mean
XML: not really

> - it can be easily extended

text: infinitely
XML: yes, as long as the extensions are described unambiguously

> - it is robust (i.e. works with partial information)

text: yes
XML: not in my experience -- parsers crash and burn on bad XML
XML is inherently more fragile than text because there is a lot more to go wrong

> - it has a tree structure

text: yes, can represent anything but not according to a single standard
XML: yes

I'm not sure tree structure is an advantage, though. I don't agree that program source code is always structured hierarchically.
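The robustness point above is easy to check with Python's standard `xml.etree.ElementTree` parser, which rejects a document outright when a single closing tag is damaged. This is only a sketch with one parser; the helper `try_parse` and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return 'ok' if the text is well-formed XML, else a short rejection message."""
    try:
        ET.fromstring(xml_text)
        return "ok"
    except ET.ParseError as error:
        return f"rejected: {error}"


well_formed = "<archived>www.nowhere.com</archived>"
damaged = "<archived>www.nowhere.com</archive d>"  # one stray space in the close tag

print(try_parse(well_formed))
print(try_parse(damaged))
```

A plain-text tool would still show the damaged line to a human; the XML parser gives back nothing at all for the whole document.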


> For me however the issue is not so much how great XML is,
> but rather how bad a home-grown solution would be in
> comparison. Most non-trivial data representation formats
> that are hand-rolled are riddled with bugs, ambiguities,
> and often lead quickly down a road to incompatible
> versions from every vendor.

That pretty much describes the state of XML today. Everyone is rolling their own using a more bloated language.

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use