The initial productivity gain of working with a dynamic language can decline as a project's codebase grows, and as refactoring becomes increasingly a chore.
Many arguments in favor of dynamic languages center around improved developer productivity. Productivity, however, is a loosely-defined term. For example, productivity can vary over the lifetime of a project: being productive today no more guarantees tomorrow's productivity than today's seeming inefficiency assures failure and waste in the future.
I've heard enough about productivity brought about by dynamic languages to want to test this premise first-hand. So when a client approached me about a tiny application that would be used internally by about twenty people, that sounded like the perfect opportunity for a dynamic language project: It would allow me to learn Rails, something that was high on my wish list, and to enjoy the productivity benefits of Ruby at the same time.
Coming from a Java background, I was very pleasantly surprised not just, or even primarily, with Rails, but especially with Ruby: There seemed to be a Ruby library for all the tasks I needed to accomplish, from connecting to an ODBC data source, to authenticating with Microsoft Exchange and fetching mail from an IMAP server, to generating simple PDF reports. And I could do all that in just a few lines of clean, neatly organized code. Ruby's conciseness and elegance, combined with Rails, allowed me to develop and put the application in production in only a few weeks, working part-time on the project.
Once deployed, the application worked very well. So well, in fact, that a few months into production, the client became interested in expanding the application's features. They agreed to a method whereby we would plan for and deliver each desired feature without a big, up-front design, since they were uncertain about the specifics of the features.
I'm a big believer in such an agile approach, but my experience in developing software that way was at the time limited to Java. With that background, I was confident that frequent refactoring, coupled with good unit and functional tests, would make this approach practicable and fruitful. My confidence in the agile way of doing things was bolstered by several Rails features, such as migrations (small Ruby tasks that effect changes to the database schema), as well as Rails' extensive and well thought-out testing framework.
These tools, along with the relatively small amount of Ruby code required to implement the desired features, made life easy during the first few iterations. With each new feature, however, the amount of refactoring required to keep the codebase clean increased. Although Rails' adherence to the DRY principle helped reduce duplication, each refactoring had to change numerous code artifacts. Some of the changes, admittedly, were needed to keep the code in line with Rails' convention-over-configuration philosophy.
Renaming an entity class, for instance, requires renaming the corresponding database tables, including other columns and indexes referencing the model's table, renaming the model class, changing the corresponding controller, helper classes, and every reference to the entity class in the views, as well as renaming the directory containing the view classes and templates. And, of course, you have to similarly rename the associated tests and all references to the model, controller, and views in the tests and fixtures.
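To make the scale of such a rename concrete, here is a minimal sketch in plain Ruby that lists the artifacts implied by a model name. The helper names (`underscore`, `rails_artifacts`, `rename_plan`) are hypothetical, the paths assume the `test/unit` and `test/functional` layout Rails used at the time, and pluralization is naively done by appending "s", which Rails' real inflector does not do.

```ruby
# Hypothetical helpers, not part of Rails: they only spell out the naming
# conventions described above so the size of a rename becomes visible.

# Converts "LineItem" to "line_item".
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Lists the conventionally named artifacts for a model class.
# Assumption: naive pluralization (append "s"); irregular nouns are ignored.
def rails_artifacts(model)
  singular = underscore(model)
  plural   = "#{singular}s"
  {
    model_class:      model,
    model_file:       "app/models/#{singular}.rb",
    table:            plural,
    foreign_key:      "#{singular}_id",
    controller_class: "#{model}sController",
    controller_file:  "app/controllers/#{plural}_controller.rb",
    helper_file:      "app/helpers/#{plural}_helper.rb",
    view_dir:         "app/views/#{plural}/",
    fixture_file:     "test/fixtures/#{plural}.yml",
    unit_test:        "test/unit/#{singular}_test.rb",
    functional_test:  "test/functional/#{plural}_controller_test.rb"
  }
end

# Pairs every old artifact name with its new name.
def rename_plan(old_model, new_model)
  before = rails_artifacts(old_model)
  after  = rails_artifacts(new_model)
  before.keys.map { |kind| [kind, before[kind], after[kind]] }
end
```

Calling `rename_plan("Customer", "Client")` yields one entry per artifact. The list still leaves out every reference to those names inside views, tests, and fixtures, which is where most of the manual work goes.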
Most of these refactorings, such as changing database table names, are not mandated by Ruby or even Rails. And Rails provides easy overrides for every one of its conventions. Still, I didn't want to start down the slippery slope of masking changes in design with configuration overrides, and decided to just take the time to perform every refactoring the right way.
A few days ago, after having spent many hours renaming things, using search-and-replace, and fixing tests, I was itching for a better way. Even the excellent RAD Rails IDE was of little help, since it is not at present able to rename even Ruby classes without breaking references to those classes. Some of the features that made development fast in the early stages of the project were now slowing me down.
Some would argue at this point that the primary cause for this productivity decline at this stage of the project is that IDEs just can't refactor dynamic code as well as they can statically typed code. That, however, is only part of the reason, I think.
I wanted to ensure that my refactorings were correctly and consistently performed across the entire application stack mainly because I felt that in the absence of a compile-time type-checking system coding conventions became important crutches to lean on for comfort. For the same reason, I'm also more reliant on tests in this dynamic-language project, and tend to write more tests than I would create for, say, a Java application. And, of course, refactoring tests is a large part of the effort at this point in the project.
If I were to generalize my admittedly limited experience working on a growing codebase in a dynamic language, I would argue that dynamic languages do not necessarily lead to harder-to-refactor code. Rather, it's the infrastructure needed to develop larger applications in dynamic code that makes refactoring more of a chore, especially as the codebase grows. Coding conventions and tests are, in a way, the "type system" of a dynamic project: they provide the guarantees, a sort of contract, that a developer can depend on.
I wonder to what extent that infrastructure will get in the way of productivity as this project grows. While that ongoing overhead may be significant, one could look at it the way one would consider an investment portfolio: Over a longer period, such as a year, it is not the marginal gain or loss in the price of any one equity that counts, nor the loss or gain within a shorter time segment, but the total return on the portfolio over the longer period. A huge gain in the first few months may be neutralized by subsequent declines, and vice versa. In a similar way, it may be that even if my productivity declines over time, it will still exceed what I would have had if I had chosen a statically typed language for this project.
What is your experience with sustained productivity when working with dynamic languages on larger projects? How do you refactor larger code-bases written in a dynamic language?
I've recently been in your situation, refactoring Rails code.
I'm not sure I could generalize the problem to be a dynamic language problem. It sounds like the problem you had (as I did) was with Rails and its set of conventions.
For instance, the use of fixtures in both model (unit) and controller (functional) tests. This creates a somewhat hidden dependency that can make what seems to be an innocuous model change cause problems with the functional tests. At least that has been my experience.
Also, I have done the "Rails Refactoring Challenge" as well against a fair number of tools. None of them do a good job yet. I am sure that they will at some point, at least with "conventional" Rails code. However, it is very easy to subvert Ruby code, so it will be interesting to see just how well the tools will work.
It's a reasonable size and the way I keep it all working is by always reworking it as I can remove unnecessary lines so I don't have to deal with them in any way. If I had to "spell out" every line of code, it certainly would be much more painful to even look at the code.
Here's how I modularize it:
bases gui_apps libraries models reports services tools web web_apps web_libraries
Also, once I created a Web frontend for my VCS of choice Bazaar, my productivity increased a little more. I created my own line counter as well which has helped me to keep an eye on code. And I even have a custom Web editor which I can use to browse/search/replace/edit the files from the browser itself. One of the things which will help my productivity a little more is a maintenance tool for translations which will be Web based as well (even though I have a small piece of it in GUI).
I have two kinds of search and replace GUI dialogs which are integrated in my text editor. Both use regex, but one searches/replaces only in the file at hand (with previewing of the changes). The other one can search/replace anywhere I choose, multiple files at a time.
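A preview-style search and replace of this kind can be sketched in a few lines of Ruby. This is only an illustration, not the actual dialog described above: `preview_rename` is a hypothetical helper, and it works on in-memory strings rather than files so that nothing is modified.

```ruby
# Hypothetical helper: previews a whole-word rename without changing anything.
# `sources` maps a file name to that file's text.
# Returns [file, line_number, old_line, new_line] for every line that matches.
def preview_rename(sources, old_name, new_name)
  # \b keeps "init" from matching inside "initialize" or "init2".
  pattern = /\b#{Regexp.escape(old_name)}\b/
  changes = []
  sources.each do |file, text|
    text.each_line.with_index(1) do |line, number|
      next unless line.match?(pattern)
      old_line = line.chomp
      changes << [file, number, old_line, old_line.gsub(pattern, new_name)]
    end
  end
  changes
end
```

A textual match like this cannot tell a method call from a comment, a string, or an unrelated method with the same name, so the preview still has to be read by a person before the changes are applied.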
Testing is always included in my development cycle. I change and test in any possible way, be it with a unit test or on a live Web site. :-) I'm still adapting myself to unit tests though and eventually I want full reports of it in a Web application as well.
I generally call it "refactoring" if I can see all the code on one computer screen and I know it won't have any side-effects elsewhere. :-) If I'm renaming something in many modules at a time, I might search/replace automatically or do it in a mixed way, doing it automatically when feasible. It's all about context as well, so standards help to contextualize things and restrict ambiguity/side-effects.
All in all, to the best of my wishful thinking, the best code generation happens at runtime for me. At design time, I hardly want to see any extra code, let alone generated one. Like you say, I still pay a price, but it could be worth it in the end.
> I wanted to ensure that my refactorings were correctly and consistently performed across the entire application stack mainly because I felt that in the absence of a compile-time type-checking system coding conventions became important crutches to lean on for comfort. For the same reason, I'm also more reliant on tests in this dynamic-language project, and tend to write more tests than I would create for, say, a Java application. And, of course, refactoring tests is a large part of the effort at this point in the project.
Compile-time type-checking can help with some refactoring, but you're going to need to write a lot of the same tests for either a static or a dynamic language to get the same test coverage.
For example, if you've got a parameter called 'name' and you refactor this from being a username to a full name, you are still going to have a string as the input to your method calls ...
Or consider starting with 'full_name' and later deciding that you want to split this out into 'first_name' and 'last_name'. You could still add a method to your model class that returns 'full_name' as a concatenation of first and last name. This kind of encapsulation is what you should be using in a dynamic language to avoid having to go through these lengthy refactoring processes that you are doing.
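As a minimal sketch of that idea, here is a plain Ruby class standing in for a Rails model. The `Person` class is hypothetical, and in a real model `first_name` and `last_name` would come from the table's columns rather than from `attr_accessor`.

```ruby
# Plain Ruby stand-in for a model class (hypothetical example).
class Person
  attr_accessor :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name  = last_name
  end

  # Keeps existing callers of full_name working after the attribute was split.
  # compact drops a missing part so no stray space is produced.
  def full_name
    [first_name, last_name].compact.join(" ")
  end
end
```

Views, reports, and tests that only read `full_name` need no change after the split; only code that writes the name has to be updated.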
Rails does lump together functionality in a fairly coarse-grained fashion, so it can make it a bit more onerous to refactor as the project grows in size. For example, it tends to produce 'fat Models' where you have a lot of separate concerns mixed into a single class, and the Views require a level of discipline since they allow arbitrary Ruby code to be embedded in them (although Rails does have HAML which is quite sexy).
Smalltalk programmers would probably object to the title of this post, as they have some very good refactoring tools.
Also, as a Zope 3 and Grok developer (http://grok.zope.org) I would also have to object, as using Interfaces in the Zope 3 component architecture can also avoid some of these problems when using Python.
> Also, I have done the "Rails Refactoring Challenge" as well against a fair number of tools. None of them do a good job yet. I am sure that they will at some point, at least with "conventional" Rails code. However, it is very easy to subvert Ruby code, so it will be interesting to see just how well the tools will work.
I agree that this is more of a framework-related issue, and that a really good IDE could, in fact, deal with the majority of the refactorings I'm doing by hand. Also, as others pointed out, some of the refactorings would still have to be done by hand even if I used, say, Java.
I'd expect that a proper approach to refactoring of "dynamic code" also requires a shift in perspective which considers the runtime activity ( behaviour | data ) as primary and the code structure as secondary or supportive, at least as far as the connectivity graph is concerned that cannot be derived from analyzing source code expressions. The connectivity graph together with type information is created at runtime and its reification serves as hypothetical "static code" that approximates "dynamic code". Maybe one can even go a little further and support live refactoring on a running program. There are lots of unexplored ideas.
As was pointed out, some of the difficulties you encountered are indeed more due to Rails than to the nature of dynamically typed languages, but there are still many cases where automatic refactoring is simply impossible to achieve in dynamic languages.
def f1(o)
  o.init
end

def f2(o)
  o.init
end

class C
  def init
    ...
  end
end
If I want to rename C.init to init2, how can I know which other inits need to be renamed as well?
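The ambiguity can be made concrete with a small runnable sketch. The second class `D`, the stand-in `RenamedC`, and the helper `call_site_survives?` are hypothetical additions for illustration. The method bodies are placeholders as well.

```ruby
class C
  def init
    :c_ready
  end
end

# Unrelated class that happens to define a method with the same name.
class D
  def init
    :d_ready
  end
end

# Nothing in the source says whether o is a C, a D, or something else.
def f1(o)
  o.init
end

# Stand-in for C after its method has been renamed to init2.
class RenamedC
  def init2
    :c_ready
  end
end

# Reports whether the untouched call site still works for a given receiver.
def call_site_survives?(receiver)
  f1(receiver)
  true
rescue NoMethodError
  false
end
```

Both `call_site_survives?(C.new)` and `call_site_survives?(D.new)` return true, so the call in `f1` is legitimate for either class. With `RenamedC.new` the same call fails, and it fails only when that code path actually runs, which is why a tool has to guess or ask.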
This is sometimes referred to as the "continuous tax" that dynamically typed languages force you to pay as the size and scope of your code expands.
-- Cedric Author "Next Generation Testing in Java"
Cedric Beust wrote:
-snip-
> many cases where automatic refactoring is simply impossible to achieve in dynamic languages.
As I say every time you repeat this: yes, there are cases where automatic refactoring is not possible. Now tell us how many cases, and tell us what the consequences are.
Have you tried the Ruby refactoring support in IntelliJ IDEA 7.0?
-snip-
> If I want to rename C.init to init2, how can I know which other inits need to be renamed as well?
In Smalltalk that's pretty easy because there aren't any other init methods :-)
For those less common polymorphic methods we'd go through the list of call-sites and figure it out.
It can only be a matter of moments before someone turns that into the new agile virtue of continuous code inspection - automatic refactoring is an evil because it stops programmers reading the code! :-)
> This is sometimes referred to as the "continuous tax" that dynamic typing languages you to pay as the size and scope of your code expands.
This is sometimes referred to as "name calling" - maybe Google can find an example of juvenile wit that refers to Java as the "continuous tax" :-)
> > Everyone has to pay some taxes. Poor programmers.
That's true. Anyway, with a more statically typed language you have a better chance of catching a missed refactoring as soon as you try to compile the code, while in a dynamically typed language you risk letting the problem slip into a release.
But I'm a statically-typed sort of guy so, maybe I'm biased ;-)
The main point (the "tax" that is paid) is not that refactoring is possible/impossible or easy/difficult in dynamic languages. It's how long the refactoring will take, and when you'll know if the change was good or not.
An example on two code bases: 1) One in Java, 320k lines of code, 4k classes, 2k jsp pages. 2) One in Perl, 45k lines of code, about 30 modules.
In Java, renaming a method took 45 seconds to perform and 3 minutes to check that everything was OK (a full recompile plus tests). In Perl, it took 10 minutes of looking at "Find in files" output, analyzing whether each match had to be renamed or not.
In Java, creating a base class from an existing class took 40 seconds, plus 5 minutes adjusting things and 3 minutes to check that everything was OK.
In Perl, it took almost half an hour chasing down the places where the classes were used, changing the references from the subclass to the base class.
I'm sure we could go on through all the refactorings in the book and find that, in those cases where they can be automated in static languages and not in dynamic languages, there is a "loss of productivity" of more than 100%.
Of course, compared with those static languages where the refactoring support is very poor (poor guys), the conciseness of some dynamic languages will give a big productivity gain. But for those static languages where the refactoring support excels, the productivity gained from the conciseness of the dynamic languages will be more than countered by the loss of productivity when refactoring.
So the problem is that for dynamic languages where the number of refactorings that can be automated or are supported by tools is small, the cost of changing a medium-to-big code base is "big".
Yes, definitely there's no silver bullet with regards to refactoring in all dynamic languages. As a matter of fact, if the dynamic languages had depended on refactoring tools to get where they have gotten with regards to progress, they wouldn't have survived their first year as independent languages, let alone grown to power lots of the software we depend on daily. For instance, I use the Bazaar VCS, which is written in Python, and I don't think those developers have relied on refactoring as you get in Java to get to the point of being very good, close to a blast 1.0 release.