The Artima Developer Community
Weblogs Forum
Programmers Shouldn't Touch the Source

83 replies on 6 pages. Most recent reply: Jul 11, 2006 2:02 PM by Hossam Mashhady

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 25, 2005 7:23 AM
> It's an old topic. The problem imo is that source code in
> general is represented as a file with text. Every tool to
> operate has to build its own model of the source code.

This is exactly my point. I was struggling to put it into words, thanks.

Martin Baker

Posts: 3
Nickname: mjb
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 25, 2005 8:33 AM
> It's an old topic. The problem imo is that source code in
> general is represented as a file with text. Every tool to
> operate has to build its own model of the source code.

And all these tools parse the file (including many code editors).
With XML-encoded source this would be both easier and less ambiguous.

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Programmers Shouldn't Touch the Source Posted: Oct 25, 2005 10:34 AM
"I mean, source code in files. How quaint. How 70's." Kent Beck

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Looking from another angle. Posted: Oct 25, 2005 1:37 PM
/* XML comes with a heavy payload and I do not want to code or maintain all these tags when writing code. I am trying to imagine an editor that does this for me, and the XML tags are kind of hidden. But then, what is the XML good for?
*/

I can understand your feeling about writing XML by hand. I am in the middle of some documentation that will end up in XML, but I'm writing it in Perl's POD format and I'll write the tools to convert it over when I'm done.

But using XML to categorize data is useful, even if the programmer doesn't ever see it. The tools see it, and that's what matters.

Let's try a different line of attack. Many IDEs today let you click on a function name and go to other files where that function is declared, defined, or used. Imagine an Eclipse environment that let you also select a function and then, say, Unit Tests. Or perhaps Doxygen info. Why? So that you can update them whenever you update the function. And, of course, when you ask the environment to compile the thing, you get a binary. Does it matter how the back-end file is stored?

Of course there are several ways to implement this. One would be to use XML files, and then look for the relevant tags when people ask for different views. You could even "export to text" if needed. The programmer wouldn't need to read the XML, or touch it. Which, incidentally, is the reason for the post's title.

John Brown

Posts: 2
Nickname: hammerfish
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 25, 2005 11:24 PM
Is this a joke?

This must be one of the dumbest ideas I have ever heard of. XML is way too verbose and cumbersome to use for programming. It does not make coding easier, it makes it harder and slower.

If you need to view previous revisions of certain files, use version control! If you need to add comments, use /* this */ or // this (or whatever the language you are using supports).

There is absolutely no sense to add another "abstraction layer" on top of any programming language. Programming is already complex enough. And while your example might work with simple things such as 1 + 2 = 3, it certainly will not work with any real life code.

Your short description says that you got frustrated by the shortcomings of modern programming languages. I can't even begin to imagine what kind of limitations, complexity and shortcomings your suggested approach would cause.

Finally I don't see any need for yet another programming language or way to write code. Do you seriously think that C, C++, .NET, Java/J2EE, PHP, ASP, VB, Perl, Ruby etc. won't get the job done? Do you seriously think that using XML and e.g. Heron language is going to make programmers more productive and produce systems that are easier to maintain, extend, develop, debug etc.?

Martin Baker

Posts: 3
Nickname: mjb
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 2:58 AM
Some people seem to be suggesting that because source code is stored in XML format that it would have to be displayed and entered in this form.

I would have thought the code viewers, editors, debug programs could present the source to each programmer in the form that best suits their needs: plain text, tree view, etc. Even things like indentation conventions could be customised for each programmer in a team. Of course that could all be done regardless of what form the source is stored in; XML just means all these programs don't have the overhead of syntax analysis.

It's like intermediate code (Java bytecode or MSIL) but closer to the source than the runtime.
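
As a rough illustration of the point about tools skipping syntax analysis, here is a minimal sketch in Python. The XML schema (`assign`, `binop`, `name`, `literal`) is invented for this example and is not a format proposed anywhere in the thread. One function renders the stored tree as text, another answers a tool-style question (which names are read?) without any language-specific parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of the statement: total = price * (1 + rate)
SOURCE = """
<assign target="total">
  <binop op="*">
    <name id="price"/>
    <binop op="+">
      <literal value="1"/>
      <name id="rate"/>
    </binop>
  </binop>
</assign>
"""

def render(node, nested=False):
    """Turn one element of the hypothetical schema back into plain text."""
    if node.tag == "name":
        return node.get("id")
    if node.tag == "literal":
        return node.get("value")
    if node.tag == "binop":
        left, right = node  # a binop has exactly two children in this schema
        text = f"{render(left, True)} {node.get('op')} {render(right, True)}"
        # Parenthesise only when the operation sits inside another operation.
        return f"({text})" if nested else text
    if node.tag == "assign":
        (value,) = node
        return f"{node.get('target')} = {render(value)}"
    raise ValueError(f"unknown element: {node.tag}")

def names_used(node):
    """Collect every identifier the statement reads, using only the tree."""
    return sorted({n.get("id") for n in node.iter("name")})

tree = ET.fromstring(SOURCE)
print(render(tree))
print(names_used(tree))
```

The generic XML parser does the structural work; only `render` knows anything about how the programmer wants to see the code.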

Terje Slettebø

Posts: 205
Nickname: tslettebo
Registered: Jun, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 4:27 AM
> This is an old idea. 30 years ago timeshared BASIC systems
> stored programs as bytecodes and rendered them as text for
> viewing and editing by a programmer.

Are you referring to storing the keywords as tokens ("bytecodes")? BASIC for the home computers (prior to the PCs) used to have this, as well. As I understand, the reason for it was to conserve space (as these had little memory, compared to today's computers).

However, at least for BASIC on the home computers, there was a one-to-one mapping between token and textual representation, and they didn't store the program in any kind of AST form, nor did the format allow selective viewing of the source, such as turning on/off comments. The goal was to save space in the program file, not to aid reading of the code, or enable program transformations. So if this is what you refer to, then you can't really say it's the same idea. That would be like saying this is the same idea as what Java does with compiling to bytecodes... (Yes, that's _also_ an old idea, dating back to Pascal's p-code, or earlier, but it's not the same as one-to-one mapping between text and internal representation).
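
To make the one-to-one mapping concrete, here is a small sketch of keyword tokenization. The token table and byte values are made up for illustration; real BASIC dialects each had their own tables. The stored form is shorter, but it carries no structure: detokenizing gives back exactly the same flat line.

```python
# Hypothetical token table: each keyword maps to exactly one byte >= 0x80.
TOKENS = {"PRINT": 0x80, "GOTO": 0x81, "REM": 0x82}
KEYWORDS = {byte: word for word, byte in TOKENS.items()}

def tokenize(line: str) -> bytes:
    """Replace each space-separated keyword with its one-byte token."""
    out = bytearray()
    for i, word in enumerate(line.split(" ")):
        if i:
            out.append(0x20)  # restore the space consumed by split
        if word in TOKENS:
            out.append(TOKENS[word])
        else:
            out.extend(word.encode("ascii"))
    return bytes(out)

def detokenize(data: bytes) -> str:
    """Expand token bytes back to keywords; other bytes are plain ASCII."""
    return "".join(KEYWORDS.get(b, chr(b)) for b in data)

line = '10 PRINT "HELLO" : GOTO 10'
stored = tokenize(line)
print(len(line), len(stored))
print(detokenize(stored) == line)
```

Nothing here knows that `GOTO 10` refers to line 10, or that a `REM` line is a comment that could be hidden, which is the difference from an AST-style representation.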

> Every article that begins with the author
> expressing wonder that Unix has survived for so long is
> immediately suspicious; I think the author is either
> inexperienced in the real world, unable to learn something
> difficult, or a crank.

I don't understand what you refer to here, because I don't find anything of this sort in Christopher's blog, or the linked-to article. Could you provide a quote?

> Tacking XML onto any old idea makes it seem new to the
> wide-eyed inexperienced audience Mr. Wilson is writing
> for. Us old relics, hanging on to Unix and vi and command
> line programs, who Mr. Wilson arrogantly sneers at while
> he waits for us to die and retire, need to get out of the
> way of progress. We need to make way for the "next
> generation" of programmers who have new ideas that can't
> be held hostage by text files and command line tools and
> languages that can't be arbitrarily extended into millions
> of personalized Towers of Babel.

You may scoff at this, but I also think this may be the "next big thing". A lot of indicators point in that direction, such as numerous research projects (several links were given in a posting by Christopher), conferences on "generative programming", etc.

I don't think Mr. Wilson thinks this will be "the end of command-line tools way of doing things" - using text as the common medium and piping it from program to program, as that still works fine for data structured as records or tables. However, it will take more than ridicule to counter his arguments of its shortcomings, such as the difficulty of representing data in a tree structure this way. Besides, as someone else pointed out, "text" is not a format in itself - it's more of a medium - so comparing "text" to "XML" (a way to structure text) is like comparing apples to oranges.

What you talk of as "personalized Towers of Babel", others may call a DSL. Similar to the Unix model of small programs doing one task well, you may also have DSLs working well for a specific task or domain, allowing you to write programs in a more natural way in that domain. That's already used to a great extent, today (HTML, SQL, regex, etc.).

Regards,

Terje

Terje Slettebø

Posts: 205
Nickname: tslettebo
Registered: Jun, 2004

Re: Odd idea Posted: Oct 26, 2005 4:29 AM
> > For me however the issue is not so much how great XML
> is,
> > but rather how bad a home-grown solution would be in
> > comparison. Most non-trivial data representation
> formats
> > that are hand-rolled are riddled with bugs,
> ambiguities,
> > and often lead quickly down a road to incompatible
> > versions from every vendor.
>
> That pretty much describes the state of XML today.
> Everyone is rolling their own using a more bloated
> language.

I think Christopher refers to the XML standard, compared to various other ways of marking up text (CSV is pretty "standard", but doesn't handle a hierarchical structure well), not "XML applications" (like XHTML), where all the different applications/"languages" that have been created with XML are more a sign of its success than anything else.

As for bloat, XML compresses pretty well, and while there exist a lot of tools for processing text as well, there is hardly any widely used hierarchical format with the properties of XML, such as the ability to validate (through a wide range of schemas), character set encoding, etc., as an alternative to XML.

For example, just now, my company is in the process of agreeing on a format for data transmission from another company, and XML is by no means the only possible alternative. However, as the data is structured (basically a database dump), XML is one likely candidate, all things considered, not least as the other company wants to be able to offer this service to other companies as well, without having to tailor it to each one.

Having said that, valid points have also been made in this thread for using plain text to store program code (version control systems already handle it, you may define a language any way you like, and so on).

I guess it just goes to show the old saying: Use the best tool for the job.

Regards,

Terje

Terje Slettebø

Posts: 205
Nickname: tslettebo
Registered: Jun, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 4:35 AM
> Well XML IS readable, but would you like to read a book
> formatted as XML. XML is about structuring data, but it is
> not necessarily the best format to express logical
> statements.
>
> XML comes with a heavy payload and I do not want to code
> or maintain all these tags when writing code. I am trying
> to imagine an editor that does this for me, and the XML
> tags are kind of hidden. But then, what is the XML good
> for?

One thing having an internal representation (whether it's encoded as XML or something else) can be good for, is to basically end the "style wars". :) Everyone can use their favourite brace placement, indentation, spacing, etc. preference, and the editor could use some sort of style sheet to format and display the source according to each person's preferences. [Addition: I see after having written it that Martin Baker makes the same point]
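
A minimal sketch of the "style sheet" idea, assuming a toy internal representation: the `Block` and `Style` classes below are hypothetical and much simpler than anything a real editor would need. The same stored tree is displayed with two different brace and indentation preferences.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    header: str                                # e.g. "if (x > 0)"
    body: list = field(default_factory=list)   # str statements or nested Blocks

@dataclass
class Style:
    brace_on_new_line: bool = False            # False: K&R-like, True: Allman-like
    indent: str = "    "

def render(node, style, depth=0):
    """Return the display lines for a node under one person's preferences."""
    pad = style.indent * depth
    if isinstance(node, str):
        return [pad + node]
    if style.brace_on_new_line:
        lines = [pad + node.header, pad + "{"]
    else:
        lines = [pad + node.header + " {"]
    for child in node.body:
        lines += render(child, style, depth + 1)
    lines.append(pad + "}")
    return lines

program = Block("void f()", [Block("if (x > 0)", ["g(x);"]), "return;"])

kr = "\n".join(render(program, Style()))
allman = "\n".join(render(program, Style(brace_on_new_line=True, indent="  ")))
print(kr)
print(allman)
```

Neither rendering is "the source"; both are views of `program`, so a formatting preference never shows up as a diff.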

> Code is developed and tested far faster in clear text.
> What about developers trying to get familiar with a new
> piece of code. Code that is compact and self-explaining
> is much faster comprehended than complex code with a lot
> of text not having anything to do with the subject at
> matter.

Yes, but the point is that the representation is an _internal_ format. The user works with the _presentation_ of that format, which may be anything. That was part of Christopher's point: to enable selective showing/hiding of information (such as Javadoc-type comments).
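
Selective showing and hiding could look roughly like the sketch below. It assumes the internal format tags each piece of a unit with a kind; the `Node` class and the kind names are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Node:
    kind: str   # "doc", "comment", or "code" (hypothetical categories)
    text: str

def present(nodes, hidden=frozenset()):
    """Build the text a user sees, leaving out the kinds they chose to hide."""
    return "\n".join(n.text for n in nodes if n.kind not in hidden)

unit = [
    Node("doc", "/** Returns the sum of a and b. */"),
    Node("code", "int add(int a, int b) {"),
    Node("comment", "    // no overflow check"),
    Node("code", "    return a + b;"),
    Node("code", "}"),
]

print(present(unit))                              # everything
print(present(unit, hidden={"doc", "comment"}))   # code only
```

The stored unit is unchanged by either call; only the presentation differs.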

> If it is desirable to have an XML-structured version of
> the code, computers transform text files to whatever you
> desire in (nano)seconds! Maybe that could be the next
> evolutionary step for code analyzers?

Why do these constant translations between textfile formats (including parsing the source), when you can parse it once, and potentially display and edit it in any way you like, without having to re-parse the source after every change?

Regards,

Terje

Terje Slettebø

Posts: 205
Nickname: tslettebo
Registered: Jun, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 4:41 AM
> There is absolutely no sense to add another "abstraction
> layer" on top of any programming language. Programming is
> already complex enough.

The idea, just as it is for higher-level programming languages and techniques in general, is to _simplify programming_ (by allowing you to embed more domain knowledge in the code and tools, and let the tools use it).

A simple example might help: In assembly programming, there's no type system at all on the level of instructions (you can read an integer from one location, and attempt to store it as a string at some other location, and the assembler won't stop you, but the program will likely crash). So, we added type systems so the compiler/interpreter knows about the legal/sensible operations that may be performed on the values, thus making programming simpler and safer. (We may use casts to circumvent this, if we "know something the type system doesn't know".)

Similarly, why shouldn't the language or environment know whether or not the text string "name" refers to the same entity as another text string "name", so that we may do automatic name-changing refactoring, without fearing we forget a place, or replace an inappropriate place where it means something else? When all the IDE sees is text, it has to potentially parse it, to understand it, to do this operation safely, or we have to resort to search-replace, and manually check each place (which is tedious and error-prone). Why can't the IDE have this information in a pre-parsed form (don't let this discussion be derailed by what that form should be, be it XML or any other format, likely a more efficient format, internally)?
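
The rename example can be sketched in a few lines, under the assumption that the pre-parsed form stores each occurrence as a reference to a symbol object rather than as characters. The `Symbol` and `Ref` classes are hypothetical. Two unrelated entities share the spelling "name"; renaming one of them touches only its own occurrences, while plain search-replace hits both.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str

@dataclass
class Ref:
    symbol: Symbol   # which entity this occurrence refers to

# Two different entities that happen to be spelled the same.
customer_name = Symbol("name")
file_name = Symbol("name")

# Pre-parsed program: literal text mixed with resolved references.
program = ["print(", Ref(customer_name), "); open(", Ref(file_name), ")"]

def show(tokens):
    """Render the program as text using each symbol's current name."""
    return "".join(t.symbol.name if isinstance(t, Ref) else t for t in tokens)

before = show(program)

# Text-based rename: every matching string changes, including the wrong one.
naive = before.replace("name", "full_name")

# Symbol-based rename: change the entity once; only its references follow.
customer_name.name = "full_name"
after = show(program)

print(before)
print(naive)
print(after)
```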

> Finally I don't see any need for yet another programming
> language or way to write code. Do you seriously think that
> C, C++, .NET, Java/J2EE, PHP, ASP, VB, Perl, Ruby etc.
> won't get the job done? Do you seriously think that using
> XML and e.g. Heron language is going to make programmers
> more productive and produce systems that are easier to
> maintain, extend, develop, debug etc.?

No; you're right. We're already at the apex of programming language design - nay, of the whole history of computer science. No new major improvements will be done - we'll still use C++ as it is today, 100 years from now. You can all go home, now; nothing to see here.

(sarcasm off)

:)

Regards,

Terje

John Brown

Posts: 2
Nickname: hammerfish
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 5:45 AM
> > There is absolutely no sense to add another
> "abstraction
> > layer" on top of any programming language. Programming
> is
> > already complex enough.
>
> The idea, just as it is for higher-level programming
> languages and techniques in general, is to _simplify
> programming_ (by allowing you to embed more domain
> knowledge in the code and tools, and let the tools use
> it).

Errr... Let me get this straight. You are saying that we can simplify things by _adding_ information and making structures more verbose and complex? Heck, I guess we have been reading a very different version of the English dictionary.

> Similarly, why shouldn't the language or environment know
> whether or not the text string "name" refers to the same
> entity as another text string "name", so that we may do
> automatic name-changing refactoring, without fearing we
> forget a place, or replace an inappropriate place where it
> means something else? When all the IDE sees is text, it
> has to potentially parse it, to understand it, to do this
> operation safely, or we have to resort to search-replace,
> and manually check each place (which is tedious and
> error-prone). Why can't the IDE have this information in a
> pre-parsed form (don't let this discussion be derailed by
> what that form should be, be it XML or any other format,
> likely a more efficient format, internally)?

I must admit that I didn't understand your example but I guess that tells something about the applicability, simplicity and usefulness of that idea.

> No; you're right. We're already at the apex of programming
> language design - nay, of the whole history of computer
> science. No new major improvements will be done - we'll
> still use C++ as it is today, 100 years from now. You can
> all go home, now; nothing to see here.
>
> (sarcasm off)

The keyword you pointed out is IMPROVEMENT. New innovations and techniques become successful if there is demand and clear need for such techniques. I'm pretty confident that this XML-mess is certainly not in that category. You are free to think otherwise :)

John O'Shea

Posts: 4
Nickname: aehso
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 5:48 AM
> Yes they do, but they require separate parsing engines,
> multiple files, and are non-standardized formats that are
> incompatible with each other. If the "true-source" was an
> XML file, then the same parsing engine could be used for
> an editor as for any CVS scheme. It's
>

I'm curious Christopher, where would you store revision history when the source file is deleted (say, as part of refactoring the project)? Or, what if the parent folder was deleted - where would you store that revision history?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 6:03 AM
> I'm curious Christopher, where would you store revision
> history when the source file is deleted (say, as part of
> refactoring the project)? Or, what if the parent folder
> was deleted - where would you store that revision history?

I never delete a file, I simply archive them. I should point out that I am no longer of the belief that all revision data should be serialized within the source-code format. There is simply too much data to track, and very advanced delta-compression algorithms would be needed. I think that simply tracking release version information would be sufficient.

I am trying to address the problem of allowing programmers to more easily judge source code maturity by providing access to histories of releases in an easily parsed format.

John O'Shea

Posts: 4
Nickname: aehso
Registered: Oct, 2005

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 7:36 AM
>
> I never delete a file, I simply archive them. I should
> point out that I am no longer of the belief that all
> revision data should be serialized within the source-code
> format. There is simply too much data to track, and very
> advanced delta-compression algorithms would be needed. I
> think that simply tracking release version information,
> would be sufficient.
>

I'm not sure what you mean by "I never delete a file I simply archive them" - either way you are removing a resource from the current revision of the source. I think what you are really looking for is greater integration between source snapshot distributions, tools and SCM repositories (where all the meta-data should be stored).


> I am trying to address the problem of allowing programmers
> to more easily judge source code maturity by providing
> access to histories of releases in an easily parsed format.

It would seem to me that tools that generate views of the source that only expose the agreed interface contract are the best way to meet this requirement - javadoc is a good example. Unfortunately we all know how hard it is to get developers to annotate their source richly enough :-) Seriously though, much of the metadata you listed in an earlier reply can be automatically inserted by the SCM system as the file is revised (using tag substitution). Some requires the authors to make a judgement call (e.g. I need to bump the major rev number of this component as this breaks backward compatibility).
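
For readers who have not seen tag substitution, the sketch below imitates the RCS/CVS-style keyword form (`$Revision$` becoming `$Revision: 1.4 $`). It is a toy stand-in written for this example, not how any particular SCM implements it, and the revision and author values are placeholders.

```python
import re

# Matches $Keyword$ as well as an already expanded $Keyword: old value $
KEYWORD = re.compile(r"\$(\w+)(?::[^$\n]*)?\$")

def expand_keywords(text, values):
    """Fill in known keywords; leave unknown ones exactly as written."""
    def fill(match):
        key = match.group(1)
        if key in values:
            return f"${key}: {values[key]} $"
        return match.group(0)
    return KEYWORD.sub(fill, text)

source = "// $Revision$\n// $Author: old $\n// $Unknown$\n"
values = {"Revision": "1.4", "Author": "cdiggins"}   # placeholder values
print(expand_keywords(source, values))
```

Fields like these can be filled in mechanically on every commit; the judgement calls mentioned above (such as a major version bump) cannot.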

I guess the bigger question is why do you think that having access to the revision history for single file helps one judge the code's maturity? Badly written code can be revised many times and still be bad code.
If you are maintaining the library then you'll already have the full set of facilities provided by modern SCM systems. If you're a user of the code, looking beyond the public interface definition of the library is always dangerous - perhaps peeking at the source while debugging into an open-source library is necessary to identify problems, but in most cases access to the revision history for that dependency isn't enough since earlier revisions invariably have dependencies on other versions of dependent libraries, invalidating your test environment.

The idea of using XML for this reminds me of the old saying that everything looks like a nail when you've got a hammer in your hand. I have to agree with Greg Jorgensen's post earlier: this is an old problem that is not going to be solved by any magic XML schema. Better tool integration will help more.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Programmers Shouldn't Touch the Source Posted: Oct 26, 2005 8:06 AM
> If you're a user of the code, looking beyond the
> public interface definition of the library is always
> dangerous - perhaps peeking at the source while debugging
> into an open-source library is necessary to identify
> problems, but in most cases access to the revision history
> for that dependency isn't enough since earlier revisions
> invariably have dependencies on other versions of
> dependent libraries, invalidating your test environment.

As a user of open-source code, I would like to see with every distributed source file:

- the authorship
- license
- links to reviews
- links to test suites
- life-cycle phase
- links to archived versions
- links to stable versions
- links to documentation
- version number
- known issues
- resolved issues
- library dependencies
- date modified
- date authored
- description of file
- description of changes since last release

All of this information is useful to me as a user of open-source code. As this information changes from version to version, I want to be able to browse the old version information as a history.

I see no good reason to obstruct access to this information to users of a source file by hiding it in an SCM. I can't understand what in the world is possibly dangerous by providing access to this information to users in a standardized format.
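
One way to picture the list above as a standardized, easily parsed record is sketched below. The field names follow the list, but the class, the grouping of the various links into one dictionary, the JSON serialization, and every value are assumptions made for this example; the thread itself does not settle on a concrete format.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class FileMetadata:
    description: str
    authorship: list
    license: str
    version: str
    life_cycle_phase: str
    date_authored: str
    date_modified: str
    changes_since_last_release: str = ""
    library_dependencies: list = field(default_factory=list)
    known_issues: list = field(default_factory=list)
    resolved_issues: list = field(default_factory=list)
    # reviews, test suites, documentation, archived and stable versions
    links: dict = field(default_factory=dict)

# Placeholder history: one record per release of a made-up file.
history = [
    FileMetadata("example container", ["A. Author"], "MIT", "0.1", "alpha",
                 "2005-01-01", "2005-01-01",
                 known_issues=["crashes on empty input"]),
    FileMetadata("example container", ["A. Author"], "MIT", "0.2", "beta",
                 "2005-01-01", "2005-06-01",
                 changes_since_last_release="fixed empty-input crash",
                 resolved_issues=["crashes on empty input"]),
]

# The whole release history travels with the file in an easily parsed form.
serialized = json.dumps([asdict(m) for m in history], indent=2)
latest = json.loads(serialized)[-1]
print(latest["version"], latest["life_cycle_phase"])
```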


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use