How To Score Your Rails App's Complexity Before Refactoring
Every Rails developer has to refactor their code at some point. Ruby's dynamic nature makes it difficult for generic tools to understand what the code is doing, though. Once you add Rails macros and metaprogramming to the mix, you really need specialized tools when refactoring Ruby. In my last post I wrote about how to use a generic tool called wc to find the complex code that needs to be refactored.
wc has many problems when checking Ruby code though:
- it counts blank and empty lines
- it counts lines with only comments
- it counts only the number of lines, so each of these lines counts as one even though the second is much more complex:
    post = Post.new
    elsif (parsing_descr == 1 || parsing_descr == 2) && line =~ /^:\d+\s+\d+\s+[0-9a-f.]+\s+[0-9a-f.]+\s+(\w)\d+\s+(\S+)\t(.+)$/
Given wc's limitations, we need a tool that is built specifically for evaluating the complexity of Ruby code.
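To make those limitations concrete, here is a minimal sketch that compares a wc -l style count with a count that skips blank and comment-only lines. The sample source and the helper names (sample_source, raw_line_count, significant_lines) are made up for this illustration. Note that even the cleaned-up count is still only a line count, so it says nothing about how complex each remaining line is.

```ruby
# Hypothetical sample source, used only for this illustration.
def sample_source
  <<~RUBY
    # Builds a post

    post = Post.new

    # Saves it
    post.save
  RUBY
end

# Roughly what `wc -l` reports: every line, including blanks and comments.
def raw_line_count(source)
  source.lines.count
end

# Skip blank lines and lines that contain only a comment.
def significant_lines(source)
  source.lines.reject do |line|
    stripped = line.strip
    stripped.empty? || stripped.start_with?("#")
  end
end

puts raw_line_count(sample_source)          # all lines, blanks and comments included
puts significant_lines(sample_source).size  # only the lines that contain code
```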
Flog is a tool by Ryan Davis and Seattle.rb that scores your Ruby code based on common patterns. It works based on counting your code's ABCs:
- A - Assignments. When more objects are assigned, the complexity goes up, e.g. foo = 1 + 1.
- B - Branches. When code branches, there are multiple paths that it might follow. This also increases its complexity.
- C - Calls. When code calls other code, the complexity increases because the caller and callee are now connected. A call is either a method call or another action like
All code has assignments, branches, and calls. Flog's job is to check that they aren't used excessively or abused.
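As a rough illustration of the ABCs, here is a small made-up method with each line labeled by the category it mainly contributes to. The method name and its logic are hypothetical, and the labels are my reading of the categories rather than flog's exact breakdown.

```ruby
# Hypothetical method, annotated with the ABC category each line adds to.
def shipping_cost(weight, express)
  cost = weight * 2        # A: assignment (the * is also a call)
  if express               # B: branch, two possible paths from here
    cost = cost + 10       # A: assignment (the + is also a call)
  end
  cost.round               # C: call
end

puts shipping_cost(3, true)
puts shipping_cost(3, false)
```

Even a method this small has all three ingredients, which is why the score only becomes a concern when they pile up.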
Each of the ABCs has different scores associated with it. For example, a standard method call might only score 0.2, but an eval call would score 5 even though they are both calls. This way, the more complex and difficult-to-understand code is scored higher.
Now that you understand how scoring works, let's take a look at how flog uses the scoring.
When flog is given Ruby code, flog parses it and builds up a structure of the code internally using RubyParser. Then it goes through every class and method until everything is scored. What you end up with is a breakdown of how each class and method scored. So if you had two branches (1 point each) in your method, flog would give that method a score of 2. At the class level, flog would give your class a score that is a total of all of the methods in the class.
Using the scoring, you can get an idea of how complex or poorly written a piece of code is compared to other code. For example, in a class with two methods, one with a score of 40 and one with a score of 140, it's easy to see that the second method is more complex (or is overcomplicating things). Ideally you would devote more of your time to refactoring the second method, since it's more complex.
A complete listing of the scores that flog assigns can be found here in the source code, but here is a summary of the common ones:
- Class definition - 1
- Method definition - 0
- Module definition - 0
- Subclass definition - 0.5
- Method call - 0.2
- Assignment - 1
- Branching (while) - 1
- Literal number - 0.25
- Symbol to Proc (e.g. posts.collect(&:author)) - 10
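To show how weights like these add up, here is a toy scorer. It is a simplified sketch and not flog's actual code: it takes a hand-written count of constructs per method, multiplies each by the weight from the summary above, and totals the methods to get a class score. The helper names and the example methods ("publish", "archive") are hypothetical, and flog itself parses your source and may combine the numbers differently.

```ruby
# Weights copied from the summary above; this is a toy model, not flog's code.
def toy_weights
  { assignment: 1.0, branch: 1.0, call: 0.2, literal_number: 0.25 }
end

# Add up the weight of every construct counted in one method.
def toy_method_score(counts)
  counts.sum { |kind, count| toy_weights.fetch(kind) * count }
end

# In this toy model a class score is the total of its method scores.
def toy_class_score(methods)
  methods.values.sum { |counts| toy_method_score(counts) }
end

# Hand-counted constructs for two hypothetical methods.
methods = {
  "publish" => { branch: 2 },
  "archive" => { assignment: 1, call: 5, literal_number: 4 }
}

methods.each { |name, counts| puts "#{name}: #{toy_method_score(counts)}" }
puts "class total: #{toy_class_score(methods)}"
```

The "publish" method mirrors the earlier example of two branches at one point each.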
A good rule of thumb for Flog scores is that you will want to eventually refactor or at least think about refactoring any method when it has a score of 40 or more. I have seen a single method go as high as 1,100 before (don't ask).
Refactoring based on flog's score
Once you have found which methods and classes score high, you should be ready to refactor them. Since flog scores are totals of all of the assignments, branches, and calls, a refactoring that splits a section of code in two is usually a good first one to perform. For a high-scoring class, you might want to use:
- extract class
- extract superclass
- extract subclass
- move method
- move field
- remove middle man
- extract hierarchy
When refactoring a high-scoring method, you might use:
- extract method
- move method - especially if the method is being too intimate with another class
- move field
- encapsulate field
- decompose conditional
- consolidate conditional expression
- replace nested conditional with guard clauses
- replace temp with query
- replace method with method object
- substitute algorithm
My personal favorite refactoring is extract method. I use it all the time to split up larger methods into smaller ones. It doesn't affect the total flog score for the project, but it makes the code a lot easier to understand when it's in two small methods. Do this enough and eventually you start to see some duplication across classes, and then you can make some really great refactorings.
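Here is a small before-and-after sketch of extract method. The methods and the order data are hypothetical, and orders are plain hashes only to keep the example self-contained.

```ruby
# Before: one method both totals the order and formats the receipt.
def receipt_before(order)
  total = 0
  order[:items].each do |item|
    total += item[:price] * item[:quantity]
  end
  "#{order[:customer]} owes $#{format('%.2f', total)}"
end

# After: the total calculation is extracted into its own method...
def order_total(order)
  total = 0
  order[:items].each do |item|
    total += item[:price] * item[:quantity]
  end
  total
end

# ...and the receipt method only has to format the result.
def receipt_after(order)
  "#{order[:customer]} owes $#{format('%.2f', order_total(order))}"
end

order = {
  customer: "Ada",
  items: [{ price: 2.5, quantity: 2 }, { price: 1.0, quantity: 3 }]
}

puts receipt_before(order)
puts receipt_after(order)
```

The behavior is identical, but each method now has fewer assignments, branches, and calls of its own, and order_total can be reused elsewhere.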
Getting started with flog
To get started with flog right away in your Rails app, I'd recommend that you install the metric_fu gem and run its reports. metric_fu includes flog and several other tools I'm going to cover over the next few posts. Be warned, metric_fu can take a few minutes to run its reports, especially on larger projects, since it runs the entire test suite.
If you aren't using Rails or don't want to wait for the entire metric_fu report suite to run, you can get started using flog with:
- gem install flog
- flog path/to/app
This will find every Ruby file you have and will run flog on each file. Since flog produces a lot of output, I'd recommend redirecting it to a file so you can read through it all yourself (flog path/to/app > flog_output.txt). Running flog like this is quick, so feel free to experiment with a few of its command line options. The options I like to use when refactoring are:
- -a (--all) will display all of the flog results. Typically flog only shows the worst 60% of the code.
- -d (--details) shows the scoring details so you can see why the code is scoring so high.
- -g (--group) groups the results by classes. If you are still exploring your application to find out what areas need to be refactored, the group option can help you explore faster (especially with the -a option).
Another option I use outside of refactoring is the --score option. The score will show a project score of all of the files and also the average score per method. This is great data to save if you want to watch the project's score over the long term (e.g. is the score getting better or worse?). If you use metric_fu, this value is graphed each time you run the metrics task so hopefully you can start to see the flog score get lower and lower over time.
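If you save flog's output to a file, a few lines of Ruby can pull out the numbers worth tracking. This is only a sketch: it assumes each line of output looks like "score: label", and the sample text below is made up to match that assumption, so check it against what your version of flog actually prints. The helper names are hypothetical too.

```ruby
# Made-up sample in the assumed "score: label" layout (not real flog output).
def sample_flog_output
  <<~TEXT
     187.3: flog total
      62.4: flog/method average

     140.0: PostsController#create
      40.0: Post#publish
       7.3: Post#title
  TEXT
end

# Turn every "score: label" line into a hash; skip lines that don't match.
def parse_flog_output(text)
  text.each_line.map do |line|
    match = line.match(/^\s*(\d+(?:\.\d+)?):\s+(\S.*?)\s*$/)
    next unless match
    { score: match[1].to_f, label: match[2] }
  end.compact
end

# Methods at or above the threshold, leaving out the summary lines.
def methods_over(entries, threshold = 40)
  entries.reject { |entry| entry[:label].start_with?("flog") }
         .select { |entry| entry[:score] >= threshold }
end

entries = parse_flog_output(sample_flog_output)
total = entries.find { |entry| entry[:label] == "flog total" }

puts "project total: #{total[:score]}"
methods_over(entries).each { |entry| puts "refactor candidate: #{entry[:label]} (#{entry[:score]})" }
```

In practice you would read the saved file (for example with File.read("flog_output.txt")) instead of the sample text, and record the total somewhere so you can compare it from run to run.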
Refactoring the most complex code in your application is a good strategy to keep it healthy and easy to change. But that alone isn't enough, especially when there is a lot of code duplication. Luckily Ruby has a tool that can be used to detect code duplication: flay. I'll be digging into how flay can be used in your project to find where your duplication is hiding.
This is a guest post by Eric Davis. Eric is a member of the Redmine team and has written an ebook Refactoring Redmine to show Rails developers what refactoring real Rails code looks like. He writes about the different refactorings done to Redmine every day at http://theadmin.org.