The Artima Developer Community
Programmers Shouldn't Touch the Source

83 replies on 6 pages. Most recent reply: Jul 11, 2006 2:02 PM by Hossam Mashhady

André Næss

Posts: 1
Nickname: andnaess
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 3:34 AM
If you think a little, this is what tools like IntelliJ, Eclipse and Visual Studio already do to some extent. I only know IntelliJ, so I can only speak about its features.

IntelliJ parses the source code and turns it into some internal format that lets it do lots of neat stuff. The automated refactoring, for example, depends heavily on this. IntelliJ also constantly reparses the code, so that it can detect errors when I make them. I never see compiler errors any more. The feature set of IntelliJ is quite amazing, but I can still store my source code as text. And there are good reasons why I would want to.

Plaintext is an amazingly portable format. For most programming languages, the format of the file is fixed, in the case of Java by Sun. Sun have defined what a valid Java file looks like, and as long as my file is a valid Java file, any tool that can work with Java code can work on my file. That means that I can take my plaintext file and move it to some different IDE if I want to. If the source was stored in some other format than plaintext, imagine what would happen when Borland decides to "enhance" the format a little so that they can offer feature X, to which the IntelliJ guys need to respond, so they "enhance" the format in their own (incompatible) way, and all hell breaks loose. We know this story; it's happened so many times.

Moreover, version control systems work well with plaintext, and allow me to easily track the history of a file through simple diff-operations on plaintext. For any other format, this would be much more complicated, and probably lead to even more vendor lock-in.
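As a rough illustration of the "simple diff-operations on plaintext" point, here is a minimal sketch using Python's standard `difflib` module. The file name `Example.java` and the two revisions are made up for the example; real version control systems use their own diff and storage formats.

```python
import difflib

def plaintext_diff(old: str, new: str, name: str = "Example.java") -> str:
    """Return a unified diff between two revisions of a plaintext source file."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines,
        fromfile=name + "@r1",  # hypothetical revision labels
        tofile=name + "@r2",
    )
    return "".join(diff)

# Two hypothetical revisions of the same file.
old_rev = "class A {\n    int x = 1;\n}\n"
new_rev = "class A {\n    int x = 2;\n}\n"

diff_text = plaintext_diff(old_rev, new_rev)
print(diff_text)
```

Because the input is just lines of text, the same function works for any language without knowing its grammar.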

By storing the source in plaintext, any tool is free to add whatever features it wants. IntelliJ does this by parsing the file into what I assume looks very much like an abstract syntax tree, probably with lots of annotations. They even expose an API to this internal format for plugin writers to use. Today's computers are powerful enough to do this parsing in real time, so performance is not a problem.
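To make the "parse plaintext into a syntax tree" idea concrete, here is a small sketch. It does not use IntelliJ's API; it uses Python's standard `ast` module as a stand-in, and the helper names `find_calls` and `first_syntax_error_line` are invented for this example.

```python
import ast
from typing import List, Optional, Tuple

def find_calls(source: str) -> List[Tuple[str, int]]:
    """Parse plaintext source and list (function name, line number) for each call."""
    tree = ast.parse(source)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):          # plain call: add(1, 2)
                calls.append((func.id, node.lineno))
            elif isinstance(func, ast.Attribute):   # method call: obj.method()
                calls.append((func.attr, node.lineno))
    return sorted(calls, key=lambda call: call[1])

def first_syntax_error_line(source: str) -> Optional[int]:
    """Reparse the text and report the line of the first syntax error, if any."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return exc.lineno
    return None

SAMPLE = "total = add(1, 2)\nprint(total)\n"
print(find_calls(SAMPLE))
print(first_syntax_error_line("def broken(:\n    pass\n"))
```

The file on disk stays plain text; the tree exists only inside the tool, which is the arrangement described above.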

Also, when I really need to use a feature in Emacs to get something done, I can just load the file up in Emacs. If the tool I'm using doesn't give me all the features, I can use other tools for those features that are missing. This works so easily because plaintext is so portable.

I think IntelliJ IDEA does exactly what the OP wants. It gives me a view of the source code. I still work with the source-code as plaintext, but that's just because for the most part that makes most sense. I am after all a coder. Inside IntelliJ the code exists in an abstract format that I don't need to care about. And when I find that Borland's JBuilder completely rocks, I can change to it with very little hassle.

Greg Wilson

Posts: 6
Nickname: gvwilson
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 4:39 AM
I'm not actually a big fan of XML either --- I think that various code-in-database ideas would be more elegant. What makes me advocate XML-based representations is the number of XML manipulation tools out there, which would allow programmers to knock together their own code processors in a hurry (in the way that 'sed' and 'awk' allowed us to knock together text processors). Just as an example, any sensible XML-based representation would allow you to identify code locations with XPath, find code elements (like calls to particular methods) with XQuery, and write code transformations with XSLT.
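As a sketch of what locating code elements by path could look like, the snippet below queries a made-up XML representation of a class. The element and attribute names (`class`, `method`, `call`, `target`, `line`) are assumptions for illustration, not an existing schema, and Python's `xml.etree.ElementTree` supports only a limited subset of XPath (no XQuery or XSLT).

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of a small class; not a real standard.
CODE_XML = """
<class name="Account">
  <method name="deposit">
    <call target="log" line="3"/>
    <call target="validate" line="4"/>
  </method>
  <method name="withdraw">
    <call target="log" line="9"/>
  </method>
</class>
""".strip()

root = ET.fromstring(CODE_XML)

# Find every call to "log", anywhere in the class.
log_call_lines = [int(c.get("line")) for c in root.findall(".//call[@target='log']")]

# Find the methods that contain a call to "validate".
validating_methods = [
    m.get("name")
    for m in root.findall("./method")
    if m.find("call[@target='validate']") is not None
]

print(log_call_lines)
print(validating_methods)
```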

The two other elements are just as important:

1) A WYSIWYG editor (no language that requires programmers to type in, or view, the tags is going to succeed), and

2) an extensible tool chain (so that I can write plugins for my debugger, just as I do for my web server).

A fuller version of the article is on-line at:

http://www.third-bit.com/~gvwilson/papers/queue2004-extprog.pdf

- Greg

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 6:32 AM
> What's wrong about CVS, subversion, ...? They provide you
> exactly with this kind of information.

Yes they do, but they require a separate parsing engine and multiple files, and they use non-standardized formats that are incompatible with each other. If the "true-source" was an XML file, then the same parsing engine could be used for an editor as for any CVS scheme. It's

> It seems to be
> rather clumsy to include this information in source files
> (imagine a file with 1000 revisions ...).

The data has to be stored somewhere, but you do realize that I propose that the revisions should be hidden from the user when editing source text. Revisions in my opinion are an important part of source code.

> I never
> understood why people want to do that. Besides, this leads
> to duplication of information (in the revision control
> system and in the source files...)...

I don't see why a revision control system would be required to duplicate the information if it was able to use the source code directly.

Alfredo Aldundi

Posts: 6
Nickname: cheesy
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 7:00 AM
> Yes they do, but they require a separate parsing engine, multiple files, and are
> non-standardized formats that are incompatible with each other. If the "true-
> source" was an XML file, then the same parsing engine could be used for an
> editor as for any CVS scheme.

Revision control systems are highly optimized for what they do. I don't see what's wrong about that. I agree with you that a tighter integration between revision control, source files, meta-data, ... is a good thing. But why create one huge XML format? There are already existing solutions for parts of what you want to achieve. What's maybe missing in some cases is tight integration. I'd rather go with a system of integrated highly specialized tools instead of one tool that packs everything into one big clumsy XML file.

> The data has to be stored somewhere, but you do realize that I propose that
> the revisions should be hidden from the user when editing source text.

I got that one ;-).

> I don't see why a revision control system would be required to duplicate the
> information if it was able to use the source code directly.

I'm not sure if I understand you correctly. With your format you would not need a revision control system because this information is stored in the source. All the revisions, branches, tags, ...? Correct?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 7:20 AM
> Revision control systems are highly optimized for what
> they do. I don't see what's wrong about that. I agree with
> you that a tighter integration between revision control,
> source files, meta-data, ... is a good thing. But why
> create one huge XML format?

For me the big problem with CVS is that there is a one-to-one mapping between source and CVS information. Why share one without the other? If they are one and the same, why not a single file with all of the crucial data provided?

In practice, CVS data is almost always lost to people outside of the inner circle of developers. I find that a shame. If I found a source file floating around the internet (or my computer) I would like to be able to access the revision data.

It would be silly of me to insist that people can't separate data. I would propose that the revision data be stored either as an inlined entity, or as a URI to a separate XML document. An interesting possibility would be easy integration with existing CVS systems, if we used a CVS-to-XML converter as a web service.

> There are already existing
> solutions for parts of what you want to achieve. What's
> maybe missing in some cases is tight integration. I'd
> rather go with a system of integrated highly specialized
> tools instead of one tool that packs everything into one
> big clumsy XML file.

The specialized tools can still be used, by writing relatively simple converters from/to XML.

> I'm not sure if I understand you correctly. With your
> format you would not need a revision control system
> because this information is stored in the source. All the
> revisions, branches, tags, ...? Correct?

Well either inlined, or an external XML document which is linked through a URI.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 11:50 AM
> In ten years
> we'll still have Unix and C and vi, and XML will be
> forgotten except as a legacy format used to send invoices
> and shipping manifests around.

i'm too lazy to go look it up. my impression is that XML regimes are Not Replacing EDI for these functions. nor should they.

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 12:50 PM
> For me the big problem with CVS is that there is a
> one-to-one mapping between source and CVS information, why
> share one without the other? If they are one and the same,
> why not a single file with all of the crucial data
> provided?

Only in trivial projects. Revision control systems operate on individual source files, but they also maintain project-level history. When a file is renamed, deleted, split into two, or merged into another file, the history is not lost. If the history was tied to a specific source file, something as common as splitting a source file into two would create the problem of where to attach the history.

I'm also very skeptical about representing source code diffs in XML.

> In practice, CVS data is almost always lost to people
> outside of the inner circle of developers. I find that a
> shame. If I found a source file floating around the
> internet (or my computer) I would like to be able to
> access the revision data.

The trend seems to be to make the source code repository available to interested developers, rather than offer free-floating source. There's plenty of that around, but it's pretty typical these days to update your working copy from CVS or Subversion repositories instead of downloading standalone source files. If each free-floating XML file carried its own revision history, I'd like to know how changes would be merged and reconciled across developers.

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 12:55 PM
> I'm not actually a big fan of XML either --- I think that
> various code-in-database ideas would be more elegant.
> ...
> 1) A WYSIWYG editor (no language that requires programmers
> to type in, or view, the tags is going to succeed), and
>
> 2) an extensible tool chain (so that I can write plugins
> for my debugger, just as I do for my web server).

This would significantly raise the bar for sharing code among developers. Right now it's good enough for both of us to have any text editor and access to compatible tools -- compilers, linkers, interpreters, etc. With an XML or (worse) a database-based representation we'd need to have compatible versions of much more sophisticated tools.

Even with all the XML tools available today, if I send you a source file as plain text, I can be 100% certain you can read it and manipulate it. If I sent an XML file or some database dump, we'd have to agree beforehand on a lot more than the ability to edit text.

Greg Jorgensen

Posts: 65
Nickname: gregjor
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 1:00 PM
> i'm too lazy to go look it up. my impression is that XML
> regimes are Not Replacing EDI for these functions. nor
> should they.

Quite a few industries have adopted their own XML-based standards for data interchange, and those are replacing proprietary EDI. All it takes is one big client or supplier to push a supply chain to a new standard. And this is precisely the business problem that XML is good at: defining a common text-based standard for data interchange across business systems. Microsoft has pushed this both in Office (success to be determined) and in their B2B efforts (frequently ignored in favor of neutral industry consortia).

The other success XML has enjoyed is among CIOs and technical managers imagining a world with no incompatible platforms, and the authors and publishers of XML books.

Peter Hickman

Posts: 41
Nickname: peterhi
Registered: Mar, 2003

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 1:39 PM
> I don't see why a revision control system would be
> required to duplicate the information if it was able to
> use the source code directly.

How about not seeing why a programming language would be required to duplicate the information that is available in the revision control system? Remember that revision control systems already exist; you are talking about something less than vapourware.

Also, revision control systems can cope with any language. Actually, they are designed to cope with data; it is just that they handle text-based data very well.

Peter Hickman

Posts: 41
Nickname: peterhi
Registered: Mar, 2003

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 2:07 PM
The more I think about this the more naive (I'm being polite here) it seems. A revision control system is about more than keeping track of what happens to a particular file. It is about keeping track of how that file is related to other files.

Change the source to a class and sure you have a revision in that file, but you may also have changes in all the files that use that class. A version control system, such as subversion, allows you to trace the state of the whole repository each time any element is changed. You can roll back a whole repository to how it looked last Tuesday week should some unforeseen problem arise.

The project I am currently working on has 1,202 source files in it. If each file contained its own revision history and how it relates to all the others at the time of the change, which would have to be incorporated into the history of the other 1,201 files in the project, it would be a nightmare to update. The source files would be enormous even before converting them into XML. Do that and they would simply be gigantic.

Just what are you trying to achieve here?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 4:37 PM
> The more I think about this the more naive (I'm being
> polite here) it seems. A revision control system is about
> more than keeping track of what happens to a particular
> file. It is about keeping track of how that file is
> related to other files.

I agree. The more I think about it too, the more it seems that RCS data is much too broad to try to capture in a source file.

> Just what are you trying to achieve here?

What I am after, at a minimum, is to include several vital pieces of information in a source file for every external release of that file:
- author
- license
- comments
- major/minor version
- archival location
- documentation location
- date modified
- regression tests location
- reviews

And to have these maintained inlined in the source document, or alternatively hyperlinked from the source document. I think this kind of historical documentation lends itself very well to an XML representation embedded in the source code.
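A minimal sketch of what inlined release metadata might look like, using Python's standard `xml.etree.ElementTree`. The `<release>` element, its child names, the C-style comment wrapper, and the sample values (including the example.org URL) are all assumptions made for illustration; no actual schema is defined in this thread.

```python
import xml.etree.ElementTree as ET
from typing import Dict

def build_release_header(info: Dict[str, str]) -> str:
    """Serialize release metadata as XML wrapped in a source comment."""
    root = ET.Element("release")
    for key, value in info.items():
        ET.SubElement(root, key).text = value
    xml_text = ET.tostring(root, encoding="unicode")
    # Assumes no value contains "*/", which would end the comment early.
    return "/* " + xml_text + " */"

def read_release_header(source: str) -> Dict[str, str]:
    """Recover the metadata from a source file that carries such a header."""
    start = source.index("<release>")
    end = source.index("</release>") + len("</release>")
    root = ET.fromstring(source[start:end])
    return {child.tag: child.text for child in root}

# Hypothetical metadata for one external release of a file.
info = {
    "author": "A. Developer",
    "license": "MIT",
    "version": "1.2",
    "archive": "https://example.org/archive/a-1.2.tar.gz",
    "modified": "2005-10-22",
}
source_file = build_release_header(info) + "\nclass A {}\n"
print(read_release_header(source_file))
```

Any tool with an XML parser can then read the header without understanding the programming language around it.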

What I have observed with open-source code is that history documentation is usually inadequate, non-standardized, difficult or impossible to parse and manipulate, or entirely absent.

I thought that maybe this could be taken a step further, and incorporated with more detailed RCS information.

I now see that there is an important delineation between internal project management issues, which are addressed by RCS/SCM software, and historical documentation, which is more universal and of crucial importance to consumers of open-source code.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 4:44 PM
> I think IntelliJ IDEA does exactly what the OP wants. It
> gives me a view of the source code. I still work with the
> source-code as plaintext, but that's just because for the
> most part that makes most sense. I am after all a coder.
> Inside IntelliJ the code exists in an abstract format that
> I don't need to care about.

Right, and I am trying to standardize this abstract format as an XML schema. If a language (such as Heron) standardizes this abstract format, open-source tools will be able to share common code-bases built on existing XML parsers (which are incidentally much easier to build than CFG parsers). Hopefully this will help leap-frog an immature technology into a very advanced technology far more quickly.
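One way to picture an abstract format serialized as XML is the sketch below. It is not Heron and not a proposed schema: it uses Python's own `ast` module as a stand-in for a language's abstract format, and the mapping (node type as tag, simple fields as attributes, child nodes as nested elements) is an arbitrary choice for illustration.

```python
import ast
import xml.etree.ElementTree as ET

def ast_to_xml(node: ast.AST) -> ET.Element:
    """Convert a syntax tree node into an XML element, recursively."""
    elem = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            # Single child node: wrap it in an element named after the field.
            ET.SubElement(elem, field).append(ast_to_xml(value))
        elif isinstance(value, list):
            # List of child nodes: one wrapper element holding all of them.
            wrapper = ET.SubElement(elem, field)
            for item in value:
                if isinstance(item, ast.AST):
                    wrapper.append(ast_to_xml(item))
        elif value is not None:
            # Simple value (identifier, constant): store as an attribute.
            elem.set(field, str(value))
    return elem

xml_root = ast_to_xml(ast.parse("x = f(1)\n"))
print(ET.tostring(xml_root, encoding="unicode"))

# Once the tree is XML, generic XML tooling can query it.
called = xml_root.find(".//Call/func/Name").get("id")
print(called)
```

Note that a parser for the plaintext language is still needed to produce the tree in the first place; the XML form only helps the tools downstream of it.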

Robert Parnell

Posts: 22
Nickname: robparnl
Registered: Jul, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 22, 2005 9:06 PM
>
> Right, and I am trying to standardize this abstract format
> as an XML schema. If a language (such as Heron)
> standardizes this abstract format, open-source tools will
> be able to share common code-bases built on existing XML
> parsers (which are incidentally much easier to build than
> CFG parsers). Hopefully this will help leap-frog an
> immature technology into a very advanced technology far
> more quickly.

I have to go with the CVS/RCS guys on this one. Finding a context-free grammar parser to embed your XML metadata by "one more different" standard just seems a waste of time to me, i.e., don't reinvent the wheel; just use what everyone else does.

Unless, you really are going to be an IDE for HeronJ, first?

RobP

PS. Unless, you really do know of a better Context-Sensitive language with XML to drive it all?

Alfredo Aldundi

Posts: 6
Nickname: cheesy
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 23, 2005 12:16 AM
> What I am after, at a minimum is to include several vital pieces of information in
> a source file for every external release of a source file:
> - author
> - license
> - comments
> - major/minor version
> - archival location
> - documentation location
> - date modified
> - regression tests location
> - reviews

Some of the things you mentioned above can easily be achieved with CVS/subversion (author, license, date modified, archival location, ...). I'm not sure what you mean by comments (historical information, i.e. commit messages?).

External release is the key phrase in your text. Why not just have an export command that exports a source (or a web application) file and enhances it with the information you want?

> And to have these maintained inlined in the source document, or alternatively
> hyperlinked from the source document. I think this kind of historical
> documentation lends itself very well to an XML representation embedded in
> the source code.

I don't understand why having this information inline in the source file is so desirable. I'd rather have it external to the source file. A simple tool could compose the 'report' information you want.
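A rough sketch of such an export step follows. The `Commit` record, the `export_release` function, and the `//` comment header layout are hypothetical; in practice the history would come from the revision control system's own log command rather than a hand-built list.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Commit:
    author: str
    date: str      # ISO 8601 date, so string comparison orders correctly
    message: str

def export_release(source: str, path: str, license_name: str,
                   history: List[Commit]) -> str:
    """Return the source text with a generated metadata header prepended."""
    latest = max(history, key=lambda c: c.date)
    authors = sorted({c.author for c in history})
    header = [
        "// File: " + path,
        "// License: " + license_name,
        "// Authors: " + ", ".join(authors),
        "// Last modified: " + latest.date + " by " + latest.author,
    ]
    return "\n".join(header) + "\n" + source

# Hypothetical history for one file; the stored source itself stays untouched.
history = [
    Commit("alice", "2005-10-01", "Initial version"),
    Commit("bob", "2005-10-20", "Fix rounding bug"),
]
exported = export_release("class A {}\n", "src/A.java", "MIT", history)
print(exported)
```

The metadata lives outside the working source and is composed only when a file is released.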
