The Artima Developer Community
Weblogs Forum
The Trouble with Searching for Open-Source Code

16 replies on 2 pages. Most recent reply: Oct 28, 2005 3:14 PM by Max Lybbert

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

The Trouble with Searching for Open-Source Code
Posted: Oct 27, 2005 8:08 AM
Summary
I frequently encounter open-source code that reimplements code that exists elsewhere (and usually does so badly). When everyone is busy reinventing the wheel, no one has the time to build a cart.
Even though some developers are guilty of simply not doing research, part of the problem is that finding open-source code for a particular purpose is hard. Search engines are well suited for finding text, but not source code. This is because:
  • Source code documents are not often distributed directly on the web, but rather as part of compressed packages.
  • Documentation and source code are often separated. Robots have trouble creating hard links between documentation and the source code.
  • Comments in source code are treated with the same level of priority as function names and variables, so they aren't indexed with the proper level of priority.
So how does this get solved? Well I can see two ways:
  1. Search engines start applying specialized techniques for parsing and indexing source code.
  2. Open-source developers come up with a new, standardized, language-independent format for distributing source code (perhaps Open-Source-XML?).
I think either (or both) of these technologies could have a significant impact on moving software technology forward.
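
A minimal sketch of the first option, assuming Python source files and made-up field weights (FIELD_WEIGHTS and index_source are hypothetical names, not part of any existing search engine). It parses a file so that function and class names, docstrings, and comments are indexed as separate fields with different priorities instead of as undifferentiated text:

```python
import ast
import io
import tokenize
from collections import defaultdict

# Hypothetical weights: how much a hit in each field should count.
FIELD_WEIGHTS = {"name": 3.0, "docstring": 2.0, "comment": 1.0}


def index_source(source: str) -> dict[str, float]:
    """Return a term -> weight map for one Python source file."""
    scores: dict[str, float] = defaultdict(float)

    # Function/class names and their docstrings come from the syntax tree.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            for term in node.name.lower().split("_"):
                if term:
                    scores[term] += FIELD_WEIGHTS["name"]
            for word in (ast.get_docstring(node) or "").lower().split():
                term = word.strip(".,:;()")
                if term:
                    scores[term] += FIELD_WEIGHTS["docstring"]

    # Comments are not in the syntax tree, so read them from the token stream.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            for word in tok.string.lstrip("#").lower().split():
                term = word.strip(".,:;()")
                if term:
                    scores[term] += FIELD_WEIGHTS["comment"]

    return dict(scores)


SAMPLE = '''
def draw_graph(nodes):
    """Render a graph."""
    # layout is naive
    return nodes
'''

index = index_source(SAMPLE)
print(index["graph"], index["naive"])
```

Here "graph" scores higher than "naive" because it appears in a function name and a docstring rather than only in a comment. The weights themselves are arbitrary; the point is only that a parser can tell the fields apart where a plain-text crawler cannot.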


Dominik Wei-Fieg

Posts: 60
Nickname: dominikwei
Registered: Aug, 2005

Re: The Trouble with Searching for Open-Source Code Posted: Oct 27, 2005 9:04 AM
Hi,

although it is far from perfect, www.koders.com specializes in searching open-source code.

Chris Hulan

Posts: 1
Nickname: chulan
Registered: Oct, 2005

Re: The Trouble with Searching for Open-Source Code Posted: Oct 27, 2005 12:17 PM
O'Reilly is also addressing this issue with their Code Zoo (www.codezoo.com)

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: The Trouble with Searching for Open-Source Code Posted: Oct 27, 2005 12:46 PM
Thanks Chris and Dominik! I appreciate the links; it is interesting to see what other people are doing.

Michel Parisien

Posts: 25
Nickname: kriggs
Registered: Jul, 2005

Re: The Trouble with Searching for Open-Source Code Posted: Oct 27, 2005 8:34 PM
http://www.businessfinancemag.com/channels/budgetingReporting/article.html?articleID=4248

"Nobody wants to reinvent the wheel, but when the wheel isn't rolling the cart forward, something needs to change."

I didn't have anything to say. I just figured I'd share this other quote that continues Christopher Diggins' analogy.

Francisco Gortázar-Bellas

Posts: 1
Nickname: patxi
Registered: Oct, 2005

Re: The Trouble with Searching for Open-Source Code Posted: Oct 28, 2005 1:51 AM
This issue is also related to natural language processing. For instance, if I'm searching for an API aimed at creating diagrams, I expect the search engine to return code related to diagrams, but also code related to graphs. The search engine should perform a search for synonyms. Clustering the results?
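
A minimal sketch of the synonym idea, assuming a hand-written synonym table and a toy corpus (SYNONYMS, PROJECTS, and both function names are hypothetical; a real engine would more likely derive synonyms from a thesaurus or from usage statistics):

```python
# Hypothetical synonym table.
SYNONYMS = {
    "diagram": {"graph", "chart"},
    "graph": {"diagram", "chart"},
}

# Hypothetical corpus: project name -> one-line description.
PROJECTS = {
    "graphdraw": "library for drawing graph layouts",
    "chartkit": "chart rendering toolkit",
    "ziputil": "compression helpers",
}


def expand(query: str) -> set[str]:
    """Add known synonyms to the set of query terms."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def search(query: str) -> list[str]:
    """Return projects whose description shares a term with the expanded query."""
    terms = expand(query)
    return sorted(
        name
        for name, text in PROJECTS.items()
        if terms & set(text.lower().split())
    )


results = search("diagram")
print(results)
```

A query for "diagram" matches the graph and chart projects even though neither description contains the word "diagram".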

Harrison Ainsworth

Posts: 57
Nickname: hxa7241
Registered: Apr, 2005

abstractions over software Posted: Oct 28, 2005 6:57 AM
Hmmm. We need abstractions over software. A function has a well defined form: what goes in, and what comes out can be described in a simple, common way. Almost the same in many languages. That abstraction could help searching.

But classes? They seem less clear. Can they be characterized by invariant? constructors? It seems harder to cover what they *are* and *do*, abstractly. Is the way we *write* classes just not disciplined enough?
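
To illustrate the point about functions, here is a rough sketch in Python that reduces a function to "what goes in and what comes out" and searches by that shape. It assumes the candidate functions carry type annotations; shape, find_by_shape, and the three sample functions are hypothetical names made up for the example:

```python
import inspect
from typing import Callable


def shape(func: Callable) -> tuple:
    """Reduce a function to (parameter annotations..., return annotation)."""
    sig = inspect.signature(func)
    params = tuple(p.annotation for p in sig.parameters.values())
    return params + (sig.return_annotation,)


def word_count(text: str) -> int:
    return len(text.split())


def char_count(s: str) -> int:
    return len(s)


def join_words(words: list, sep: str) -> str:
    return sep.join(words)


# Hypothetical search corpus.
CANDIDATES = [word_count, char_count, join_words]


def find_by_shape(*types) -> list[str]:
    """Return the names of candidates whose shape equals the requested one."""
    return [f.__name__ for f in CANDIDATES if shape(f) == types]


matches = find_by_shape(str, int)
print(matches)
```

Searching for "str in, int out" finds word_count and char_count regardless of their names or parameter names. It also shows the limit of the abstraction: the two functions have the same shape but do different things.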

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: abstractions over software Posted: Oct 28, 2005 7:05 AM
> Hmmm. We need abstractions over software. A function has a
> well defined form: what goes in, and what comes out can be
> described in a simple, common way. Almost the same in many
> languages. That abstraction could help searching.
>
> But classes? They seem less clear. Can they be
> characterized by invariant? constructors? It seems harder
> to cover what they *are* and *do*, abstractly. Is the way
> we *write* classes just not disciplined enough?

These are important questions. I have some incomplete thoughts and observations on the subject:

- a class has two parts: an interface and a contract. (arguably a third would be documentation)
- you can express a class interface (including constructors) as an interface.
- a crucial part of describing an interface is the expression of the preconditions/postconditions for each function along with any invariants
- all of the special member functions of a class (i.e. constructors, destructors, operators, fields, properties, etc.) can be implemented/rewritten as member functions.

In summary I think it is possible to express classes in a more abstract form (or at least have our software do it for us).
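
A small sketch of the "interface plus contract" view, written in Python with an abstract base class and plain assertions. Stack and ListStack are hypothetical names, and assertions are only one of several ways to state a contract (they are skipped when Python runs with -O):

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Interface and contract for a stack (hypothetical example)."""

    # The interface: what an implementation must provide.
    @abstractmethod
    def _push(self, item) -> None: ...

    @abstractmethod
    def _pop(self): ...

    @abstractmethod
    def size(self) -> int: ...

    # The contract: invariant, preconditions, and postconditions.
    def invariant(self) -> bool:
        return self.size() >= 0

    def push(self, item) -> None:
        before = self.size()
        self._push(item)
        assert self.size() == before + 1, "postcondition: size grows by one"
        assert self.invariant()

    def pop(self):
        assert self.size() > 0, "precondition: stack must not be empty"
        before = self.size()
        item = self._pop()
        assert self.size() == before - 1, "postcondition: size shrinks by one"
        assert self.invariant()
        return item


class ListStack(Stack):
    """One possible implementation; the contract lives in the base class."""

    def __init__(self):
        self._items = []

    def _push(self, item) -> None:
        self._items.append(item)

    def _pop(self):
        return self._items.pop()

    def size(self) -> int:
        return len(self._items)
```

The abstract class says nothing about how items are stored, so it can serve as the abstract description of any conforming implementation.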

Harrison Ainsworth

Posts: 57
Nickname: hxa7241
Registered: Apr, 2005

class classification Posted: Oct 28, 2005 9:23 AM
> contracts, pre/post-conditions, methods ...

Yes, these are ok, but are too *small*. They don't express the class as a whole. Like the fable of feeling the parts, but not realizing it is an elephant (?).

Classifications of kinds of classes would be valuable: like containers, iterators, or streams, each providing a pattern of methods. They only exist vaguely/colloquially though; something firmer/coded is needed.
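
As one example of such a classification in coded form, Python's standard collections.abc module defines categories like Container, Iterable, Iterator, and Sized, each recognized by a pattern of methods. The sketch below uses them to classify classes that never declared a category; CATEGORIES, classify, Countdown, and Bag are hypothetical names for illustration:

```python
from collections.abc import Container, Iterable, Iterator, Sized

# Category name -> standard-library abstract class that defines its method pattern.
CATEGORIES = {
    "container": Container,  # __contains__
    "iterable": Iterable,    # __iter__
    "iterator": Iterator,    # __iter__ and __next__
    "sized": Sized,          # __len__
}


def classify(cls: type) -> set[str]:
    """Return every category whose method pattern the class provides."""
    return {name for name, abc_ in CATEGORIES.items() if issubclass(cls, abc_)}


class Countdown:
    """Provides the iterator methods without deriving from anything."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class Bag:
    """Provides membership testing and a length, but no iteration."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


print(classify(Countdown), classify(Bag))
```

A search index could store these category sets alongside each class, so that a query such as "iterator" finds Countdown even though the word appears nowhere in its source.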

Harrison Ainsworth

Posts: 57
Nickname: hxa7241
Registered: Apr, 2005

class classifications 2 Posted: Oct 28, 2005 9:33 AM
Class classifications can (of course) be just a set of base/super classes that every new class *must* derive from. The tech is there, but the culture is lacking. Who can be diligent and bold enough to set *that* standard?

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Another Quote Posted: Oct 28, 2005 10:28 AM
From the conclusion of the most recent State of the Onion (http://www.perl.com/lpt/a/2005/09/22/onion.html):

"So I'm just wondering if we're getting ourselves into a similar situation with open source software. More software is not always better software. Google notwithstanding, I think it's actually getting harder and harder over time to find that nugget you're looking for. This process of re-inventing the wheel makes better wheels, but we're running the risk of getting buried under a lot of half-built wheels."

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Re: abstractions over software Posted: Oct 28, 2005 10:31 AM
I had imagined working with feature models to classify the software (http://www.boost.org/more/feature_model_diagrams.htm). Of course, there has to be some way of making the model without locking a bunch of monkeys in a room at keyboards.

And, to avoid a long argument over "that's bad syntax" a la XML and source code -- the feature model doesn't have to be presented to the end user. It's perfectly acceptable to give the user a friendly way to express himself, parse that and turn it into a feature model, and then search the database.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: class classifications 2 Posted: Oct 28, 2005 11:48 AM
> Class classifications can (of course) be just a set of
> base/super classes that every new class *must* derive
> from. The tech is there, but the culture is lacking. Who
> can be diligent and bold enough to set *that* standard?

Derivation is too restrictive. Classes can be functionally equivalent with very different implementations. The solution, I believe, lies with Models/Concepts (for instance: http://www.sgi.com/tech/stl/Vector.html ).
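
A rough Python analogue of the concept idea, using typing.Protocol rather than the C++ STL machinery the link describes. RandomAccess, ArrayVector, SquaresVector, and last are hypothetical names. The two classes share no base class and have very different implementations, yet both model the same concept:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class RandomAccess(Protocol):
    """Concept: anything with a length and integer indexing."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int): ...


class ArrayVector:
    """Stores its elements in a list."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


class SquaresVector:
    """Stores nothing; computes each element on demand."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


def last(seq: RandomAccess):
    """Works with any model of the concept, with no common ancestor required."""
    return seq[len(seq) - 1]


print(last(ArrayVector([1, 2, 3])), last(SquaresVector(4)))
```

The runtime isinstance check on a protocol only confirms that the methods exist. It does not verify semantics or complexity guarantees, which the STL concept documentation states in prose.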

Harrison Ainsworth

Posts: 57
Nickname: hxa7241
Registered: Apr, 2005

feature models ? Posted: Oct 28, 2005 11:49 AM
> Generative Programming - feature models: http://www.boost.org/more/feature_model_diagrams.htm

I couldn't quite understand from that page. It appeared to be a structured language for writing comments, not code. If it can be bound tightly to the code, even within a limited domain, then we have something.

You want to search precisely for components, and also just drop them straight into your code, because all the interfaces match.

Harrison Ainsworth

Posts: 57
Nickname: hxa7241
Registered: Apr, 2005

representation Posted: Oct 28, 2005 12:28 PM
> Derivation is too restrictive. ... Models/Concepts

That is a question: what is the optimal complexity of representation? Simplicity is a challenge, but always an attractive target.

I can't believe it's not fairly easily do-able. You could probably take a simple class 'base' and prove all structures of computation are possible with it alone. Elaborate that a little, and you have a framework-language. It seems related to refactoring...


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use