Many Rubyists have, at some point, had trouble untangling dependencies between source files.
Breaking a Ruby program into separate files is good practice for all but the smallest projects, but it can be difficult to get the files required in the right order. If there are circular dependencies in the definition order of modules and classes, you might think you are forced into tricks like defining an empty module or class and then filling it in later.
Rails programmers have grown accustomed to the automatic class loading provided by ActiveSupport. It's a well-kept secret that Ruby has support for automatic class loading built right into its core class library, in the form of the Kernel#autoload and Module#autoload methods.
Program Structure
Here's an admittedly artificial Ruby project. The files are presented within the directory hierarchy:
- my_module.rb
-
puts "my_module.rb loading"
module MyModule
  # Code for MyModule here
end
- init.rb
-
require File.join(File.dirname(__FILE__), *%w[my_module.rb])
require File.join(File.dirname(__FILE__), *%w[my_module class1.rb])
require File.join(File.dirname(__FILE__), *%w[my_module class2.rb])
puts "about to reference MyModule::Class2 in init.rb"
MyModule::Class2.new
- my_module/class1.rb
-
puts "my_module/class1.rb loading"
class MyModule::Class1
  # Code for Class1
end
- my_module/class2.rb
-
puts "my_module/class2.rb loading"
class MyModule::Class2
  def initialize
    puts "creating an instance of Class2"
  end
end
This is a pretty standard layout. The module is in its own file, and the classes and modules within it are each in their own files, in a directory named after the module that sits alongside the module file. The directory nesting follows the module namespace nesting.
The files are required by the initialization file, using a Ruby idiom for getting a file path relative to the current file. This idiom is used so frequently that the TextMate Ruby bundle has a snippet for generating it, called pathfh (for PATH From Here).
When I run this I get:
my_module.rb loading
my_module/class1.rb loading
my_module/class2.rb loading
about to reference MyModule::Class2 in init.rb
creating an instance of Class2
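The path-from-here idiom used in init.rb can be tried on its own. The sketch below builds the same kind of path and, assuming Ruby 2.0 or later (where __dir__ is available), shows an equivalent spelling. The file names are the ones from the project above, and nothing is actually required here.

```ruby
# The idiom from init.rb: a path built relative to this source file.
classic = File.join(File.dirname(__FILE__), *%w[my_module class1.rb])

# An equivalent spelling, assuming Ruby 2.0+ for __dir__.
expanded = File.expand_path("my_module/class1.rb", __dir__)

puts classic
puts expanded

# On Ruby 1.9.2 and later, require_relative does the path arithmetic itself:
#   require_relative "my_module/class1"
```

Both variables name my_module/class1.rb next to the current file; the second form is always an absolute path.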
A circular dependency
Let's introduce a circular dependency in the load order between Class1 and Class2:
- my_module/class1.rb
-
puts "my_module/class1.rb loading"
class MyModule::Class1
  puts "about to instantiate Class 2"
  @class2_instance = ::MyModule::Class2.new
  puts "class2 instance has been created"
end
- my_module/class2.rb
-
puts "my_module/class2.rb loading"
class MyModule::Class2
  puts "about to instantiate Class 1"
  @class1_instance = ::MyModule::Class1.new
  puts "class1 instance has been created"
  def initialize
    puts "creating an instance of Class2"
  end
end
Admittedly this is pretty artificial, but circular dependencies can happen. When we try to run this we get:
my_module.rb loading
my_module/class1.rb loading
about to instantiate Class 2
NameError: uninitialized constant MyModule::Class2
at top level in class1.rb at line 5
method require in init.rb at line 2
at top level in init.rb at line 2
The problem here is that Class1 needs to reference Class2 before it's defined. If we switch the order of the requires, it still doesn't help, because Class2 also needs to reference Class1.
Now a hacky way to "fix" this is to add empty definitions of MyModule::Class1 and MyModule::Class2 and then let them get filled in, using Ruby's ability to re-open class and module definitions.
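As a rough sketch of that workaround, collapsed into a single file so it can be run directly (in the real project the empty shells would go in my_module.rb and the bodies would stay in their own files):

```ruby
# Empty "forward declarations": both constants exist before either body runs.
module MyModule
  class Class1; end
  class Class2; end
end

# Re-open Class1 and fill it in; Class2 already exists as an empty shell.
class MyModule::Class1
  @class2_instance = ::MyModule::Class2.new
end

# Re-open Class2 and fill it in; Class1 is fully defined by now.
class MyModule::Class2
  @class1_instance = ::MyModule::Class1.new

  def initialize
    puts "creating an instance of Class2"
  end
end

MyModule::Class2.new
```

It works, but every new class means another placeholder to keep in sync by hand.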
But this feels unsatisfying to me.
Rails Class Loading
As I mentioned in the introduction, Rails developers have gotten used to letting ActiveSupport find their source code for them. ActiveSupport uses the Module#const_missing hook to search for and load the definitions of classes and modules on demand.
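To make the idea concrete, here is a minimal sketch of loading on demand through const_missing. It is not ActiveSupport's implementation; the Lazy module, the Widget class, the SOURCE_DIR constant and the lowercase file-naming rule are all assumptions made up for this illustration.

```ruby
require "tmpdir"

# Hypothetical source directory holding one file, widget.rb.
SOURCE_DIR = Dir.mktmpdir

File.write(File.join(SOURCE_DIR, "widget.rb"), <<~RUBY)
  module Lazy
    class Widget
    end
  end
RUBY

module Lazy
  # Called by Ruby when a constant cannot be found in this namespace.
  def self.const_missing(name)
    path = File.join(SOURCE_DIR, "#{name.to_s.downcase}.rb")
    return super unless File.exist?(path)   # no such file: usual NameError

    require path
    # Guard against a file that did not define the expected constant.
    const_defined?(name, false) ? const_get(name, false) : super
  end
end

widget = Lazy::Widget.new   # widget.rb is required at this point
puts widget.class
```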
But outside of Rails, a lot of Rubyists don't like to include ActiveSupport, since it brings in a whole raft of stuff. Fortunately, Ruby itself provides a way to handle this circular dependency problem, which also turns out to be a nice way to load code in general.
Using Autoload
I've been doing Ruby programming for several years now, and I only recently discovered autoload, thanks to Chad Fowler, who talked about it at RubyRX. This feature has been in Ruby for some time, but it's not well known. Here's the documentation:
-------------------------------------------------------- Kernel#autoload
autoload(module, filename) => nil
------------------------------------------------------------------------
Registers filename to be loaded (using Kernel::require) the first
time that module (which may be a String or a symbol) is accessed.
autoload(:MyModule, "/usr/local/lib/modules/my_module.rb")
-------------------------------------------------------- Module#autoload
mod.autoload(name, filename) => nil
------------------------------------------------------------------------
Registers filename to be loaded (using Kernel::require) the first
time that name (which may be a String or a symbol) is accessed in
the namespace of mod.
module A
end
A.autoload(:B, "b")
A::B.doit # autoloads "b"
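Here is a small, self-contained sketch of that behaviour. The Demo module, the Greeter class and the temporary file are made up for illustration; Module#autoload? is used to check whether a load is still pending.

```ruby
require "tmpdir"

# Write a throwaway source file that defines Demo::Greeter.
path = File.join(Dir.mktmpdir, "greeter.rb")
File.write(path, <<~RUBY)
  module Demo
    class Greeter
      def hello
        "hello"
      end
    end
  end
RUBY

module Demo
end

# Register the file; nothing is loaded yet.
Demo.autoload(:Greeter, path)

before = Demo.autoload?(:Greeter)    # the registered path, load still pending
greeting = Demo::Greeter.new.hello   # first reference triggers the require
after = Demo.autoload?(:Greeter)     # nil once the constant has been loaded

puts greeting
```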
And here it is in use:
- my_module.rb
-
puts "my_module.rb loading"
module MyModule
  autoload :Class1, File.join(File.dirname(__FILE__), *%w[my_module class1.rb])
  autoload :Class2, File.join(File.dirname(__FILE__), *%w[my_module class2.rb])
  # Code for MyModule here
end
- init.rb
-
require File.join(File.dirname(__FILE__), *%w[my_module.rb])
puts "about to reference MyModule::Class2 in init.rb"
MyModule::Class2.new
I've moved the requires out of init.rb, except for the one for my_module.rb itself. Instead we have two autoload calls inside the module, which tell Ruby where to find the source code for each class when it is needed.
When I run this I get:
my_module.rb loading
about to reference MyModule::Class2 in init.rb
my_module/class2.rb loading
about to instantiate Class 1
my_module/class1.rb loading
about to instantiate Class 2
class2 instance has been created
class1 instance has been created
creating an instance of Class2
Notice that when Class2 is referenced, its file gets required right then and there to resolve the constant. When that file in turn refers to Class1, my_module/class1.rb is loaded right in the middle of processing my_module/class2.rb, after the constant Class2 has already been defined.
Pretty slick, I'd say.
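One detail in that output is worth a closer look: "creating an instance of Class2" is printed only once, even though Class1 instantiates Class2 as well. At that moment the class exists but its initialize method has not been defined yet. The sketch below reproduces the effect in isolation; Half and LOG are hypothetical names.

```ruby
LOG = []

class Half
  # The constant Half exists here, but the custom initialize below has not
  # been defined yet, so this call uses the inherited default initialize.
  new
  LOG << "instantiated before initialize was defined"

  def initialize
    LOG << "custom initialize ran"
  end
end

Half.new   # the class body has finished, so the custom initialize runs
puts LOG
```

So autoload resolves the circular reference, but code that runs while a class body is only partly evaluated still sees only what has been defined so far.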