The Artima Developer Community
Java Community News
Relo Project's Vineet Sinha on Understanding Large Code Bases

5 replies on 1 page. Most recent reply: Mar 9, 2007 8:04 PM by Vineet Sinha

Frank Sommers

Posts: 2642
Nickname: fsommers
Registered: Jan, 2002

Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 2, 2007 11:44 AM
Summary
Developers spend half their work hours understanding existing code. That's because current tools provide only simplistic support for exploring large code bases, according to Vineet Sinha, an MIT Ph.D. student and creator of Relo, an open-source Eclipse code explorer. Artima spoke with Sinha about the challenges of exploring large code bases.

Relo is an open-source code explorer plug-in for Eclipse. It is also a result of Vineet Sinha's work on helping developers understand large information spaces, such as sizable code repositories.

On the occasion of a new release of the Relo code explorer this week, Artima spoke with Sinha about the problems developers face when charting their paths through large code bases, and how Relo helps developers quickly comprehend large amounts of code.

The fundamental problem Relo helps with is understanding large projects. By large, I mean any code that’s more than a single method long. In fact, we’ve been doing evaluations of this tool on code bases that are more than 100,000 lines long.

Surveys indicate that developers spend on average half of their time understanding code. That shows to some extent that there is a problem with tools that try to help us understand code faster...

There’s been a lot of work on helping people understand code at the method level, or understand complex algorithms. But a harder problem most developers face today is comprehending how different methods interact with each other and, more generally, following the multitude of things connected in a project. Most tools currently have a hard time scaling in that aspect, and completely break down with code bases of over 30,000 lines of code.

It's important to point out that although developers spend half their time understanding code, understanding is typically secondary to what developers need to do. Fixing a bug or adding a feature is more important.

As a result, a typical developer will not try to understand the entire code base, because that would take a long time. He will look at those parts of a code base he cares about for a task assigned to him, such as a bug fix. A manager will point him to a class to look at, and then the developer needs to understand things starting from there.

The traditional exploration path many developers take is to follow methods and method references using an IDE’s built-in facilities. Studies we did showed that when people start exploring things that way, they can remember only a limited amount of information. When you go beyond two or three hops from the starting point, you start forgetting things. Some people at that point take notes on paper to aid their memory...

Relo provides what I would call reverse engineering-based exploration: As you browse code inside your IDE [Editor's note: Currently only Eclipse is supported], Relo is paying attention in the background to whatever piece of code you look at, and keeps track of your path.

At any time, you can open a Relo session based on your history, and Relo will create a diagram based on that history. That supplements your short-term memory about the code.
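
The idea of turning a browsing history into a diagram can be illustrated with a small sketch. This is not Relo's implementation: the `NavigationHistory` class, its methods, and the element names are all hypothetical, and the output is plain Graphviz DOT text rather than an Eclipse diagram.

```python
from dataclasses import dataclass, field


@dataclass
class NavigationHistory:
    """Hypothetical model of the code elements a developer visited, in order."""

    visited: list[str] = field(default_factory=list)

    def visit(self, element: str) -> None:
        # Ignore an immediate repeat so that re-focusing the same editor
        # does not create a self-loop in the diagram.
        if not self.visited or self.visited[-1] != element:
            self.visited.append(element)

    def hops(self) -> list[tuple[str, str]]:
        """Return each distinct navigation step as (from, to), in first-seen order."""
        seen: set[tuple[str, str]] = set()
        steps: list[tuple[str, str]] = []
        for src, dst in zip(self.visited, self.visited[1:]):
            if (src, dst) not in seen:
                seen.add((src, dst))
                steps.append((src, dst))
        return steps

    def to_dot(self) -> str:
        """Render the history as a Graphviz DOT digraph."""
        lines = ["digraph history {"]
        # Declare every visited element once, so a single visit still shows up.
        for element in dict.fromkeys(self.visited):
            lines.append(f'  "{element}";')
        for src, dst in self.hops():
            lines.append(f'  "{src}" -> "{dst}";')
        lines.append("}")
        return "\n".join(lines)


# Example element names are made up for illustration.
history = NavigationHistory()
for element in [
    "OrderService.submit",
    "OrderValidator.check",
    "OrderService.submit",
    "OrderRepository.save",
]:
    history.visit(element)

print(history.to_dot())
```

The printed text can be rendered by any tool that reads DOT. The point of the sketch is only that the diagram is derived from the path actually taken, not from the whole code base.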

In addition, instead of showing all the details, the diagram only shows aspects of the code relevant to your exploring. That is one way Relo differs from most UML tools. UML tools are great at helping you understand and create new designs, but not as good at helping you understand code that already exists. They don't help you focus on parts of an existing code base relevant to the tasks you need to accomplish.

To better understand code, you need to look at the interaction of multiple relationships. Eclipse’s package explorer, for instance, is great, but it shows items based on the containment relationship only. Eclipse can also generate views based on method calls. But each of those tools, or views, focuses on just one particular relationship. As soon as you want to look at multiple relationships, you start to go further out on that exploration trail, forgetting what you were looking at. That's where Relo's strength comes in, because it can show you multiple relationships focused on the parts of the code you need to know about.
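
To make the contrast with single-relationship views concrete, here is a minimal sketch of a view that mixes containment, call, and inheritance edges but keeps only what lies within a few hops of a starting element. The `EDGES` data, the relationship names, and the `focused_view` function are assumptions made for illustration; they do not describe how Relo or Eclipse store or query code.

```python
from collections import deque

# Hypothetical facts about a code base: (source, relationship, target).
EDGES = [
    ("billing", "contains", "Invoice"),
    ("billing", "contains", "InvoicePrinter"),
    ("Invoice", "inherits", "Document"),
    ("Invoice", "contains", "Invoice.total"),
    ("InvoicePrinter", "contains", "InvoicePrinter.print"),
    ("InvoicePrinter.print", "calls", "Invoice.total"),
    ("Invoice.total", "calls", "TaxTable.rate"),
    ("TaxTable", "contains", "TaxTable.rate"),
    ("reports", "contains", "TaxTable"),
]


def focused_view(edges, start, max_hops):
    """Return the edges, of any relationship type, whose two endpoints both
    lie within max_hops of start. Edges count as undirected for distance."""
    neighbours = {}
    for src, _, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)

    # Breadth-first search that stops expanding at the hop limit.
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    return [e for e in edges if e[0] in distance and e[2] in distance]


view = focused_view(EDGES, "Invoice.total", max_hops=1)
for src, relationship, dst in view:
    print(f"{src} --{relationship}--> {dst}")
```

With a limit of one hop from `Invoice.total`, the view contains both call edges and a containment edge while leaving out the packages and unrelated classes. Raising `max_hops` widens the view at the cost of more to keep in mind.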

What methods have you found effective in helping you quickly understand large code bases?


John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 2, 2007 3:58 PM
OpenGrok and Doxygen.

Amanjit Gill

Posts: 3
Nickname: agill
Registered: Dec, 2003

Re: Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 4, 2007 12:17 PM
GNU Global (for C/C++, Java)

Benjamin Collins

Posts: 1
Nickname: aggieben
Registered: Mar, 2007

Re: Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 6, 2007 1:08 PM
Emacs + CScope. CScope doesn't draw pretty pictures, but it does track your path for you.

Andy Dent

Posts: 165
Nickname: andydent
Registered: Nov, 2005

Re: Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 8, 2007 5:06 PM
Doxygen, and the DOT language, used either directly (with the excellent GraphViz GUI on Mac) or through Python code that generates my trace diagrams for me.

Vineet Sinha

Posts: 154
Nickname: vineets
Registered: Mar, 2007

Re: Relo Project's Vineet Sinha on Understanding Large Code Bases Posted: Mar 9, 2007 8:04 PM
Thanks for pointers to the tools that you guys use. Do try Relo out and let me know how it compares (what you like, and what could be better).

Thanks!

Vineet

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use