Note: This article is part of a series exploring how Laser, my Ruby static analysis tool, can help improve the quality of Ruby code. I'm presenting on this at RubyConf 2011, and you should come!
Ruby has a plethora of built-in functions and idioms, and overriding them without messing up is nontrivial. For example, you can easily override to_s to return something other than a String, or ! to return an integer. Some methods, like catch, method, and include can be overridden, but are difficult (or impossible) to implement in Ruby; you should most likely call super in overrides of such methods. Worse, some methods, when overridden, cannot be made to act as expected when the overridden method is called, even using super.
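As a minimal sketch of the first two pitfalls (the Temperature class is hypothetical, invented only for illustration), Ruby accepts both overrides without complaint:

```ruby
# Hypothetical class: both overrides are legal Ruby but break expectations.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  # Returns an Integer instead of a String.
  def to_s
    @degrees
  end

  # Returns an Integer instead of true or false.
  def !
    @degrees
  end
end

t = Temperature.new(72)

# Unary ! dispatches to Temperature#!, so this is 72 rather than false.
negated = !t

# Interpolation calls to_s, sees a non-String result, and falls back to
# the generic "#<Temperature:0x...>" form instead of raising.
interpolated = "#{t}"

puts negated.inspect
puts interpolated
```

Nothing here raises or warns, which is what makes this kind of mistake easy to ship.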
One of the more frustrating aspects of keeping a clean Ruby codebase is getting rid of unused methods. Heavy refactoring, failed designs, or requirements changes lead to leftover cruft, no matter how vigilant we are. We'd like to know what methods aren't being used so we can just remove them.
Completely arbitrarily, I've decided to finally publish Laser in gem form. It's got tons of bugs and has a lot of work to go, but please feel free to go ahead and give it a shot!
Before you get started, it'd be best if bugs were reported to http://redmine.carboni.ca/ rather than to GitHub. But if it's too much of a hassle (because I haven't tried too hard on the administration side), go ahead and use the GitHub issue tracker. I'm already juggling both!
Ruby has a simple construct that is very easy to get wrong: multiple assignment. A full explanation of the multiple assignment construct can be found at the Read Ruby 1.9 project. For those familiar with multiple assignment, the problem is that it is completely unchecked. There are two types of errors with multiple assignment: static errors and dynamic errors.
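To make "completely unchecked" concrete, here is a small sketch of mismatched assignments that Ruby runs without any error or warning (the variable names are arbitrary):

```ruby
# Extra right-hand values are silently discarded: the 3 is lost.
a, b = 1, 2, 3

# Missing right-hand values silently become nil.
c, d, e = 1, 2

# A splat soaks up whatever is left over, which may be nothing at all.
first, *rest = [1]

puts [a, b].inspect        # prints [1, 2]
puts [c, d, e].inspect     # prints [1, 2, nil]
puts [first, rest].inspect # prints [1, []]
```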
Inspired by the discussion in ruby-core:36559 (redmine link: Feature #4801), personal experience using the label syntax, and my dabbling with converting between label syntaxes, I felt like quoted symbols have a place with label syntax. In essence, the following two code blocks should be equivalent, in my opinion:
x = 'hello'
h = {:foo => 1, :'use-19' => true, :"#{x}-world" => 3.14}
x = 'hello'
h = {foo: 1, 'use-19': true, "#{x}-world": 3.14}
It's just the extension of the quoted symbol forms to the label syntax. We already have labels, and they're here to stay. Without adding quoted forms, you end up with hashes like this:
After seeing this post pooh-poohing the new syntax for Symbols as Hash literal keys, I searched around to see if there were any automatic converters for this change. I was surprised to find there are not!
This is the sort of change Laser will be able to do in its sleep, and I've got a local branch presently which performs this conversion. However, Laser currently has a long startup time. Until I switch it to using autoload (which is nontrivial given how Laser bootstraps) I figured I'd whip this one up as a separate gem. Plus, it gave me another chance to use the object_regex gem we developed.
I've officially posted the license for Laser: AGPLv3.0 with commercial exceptions.
I don't have a commercial license prepared, but I assume that anybody willing to purchase one probably wants it so they can make money with Laser, so if that happens, I'll pay a lawyer to come up with a license. However, I can assure you that license will not include any guarantee of support, upgrades, or even future communication from me. It's just so you can get out of the AGPL.
A major portion of the novel work done on my undergraduate thesis over the last 6-8 months went into developing Laser enough that it could infer block usage in a method: does a method call require a block, ignore a block, etc. This blog post shows how Laser successfully analyzes dozens of potential method constructions with interesting block use characteristics!
As explained in my preliminary post on the subject, there are two major sets forming the set of all methods in Ruby:
I've submitted my thesis on static analysis in Ruby today. In essence, it was an academic assessment of all my work on Laser. The department awarded it "high honors," which is just super.
In the coming weeks, I'll be:
If you've ever pondered the Ruby standard exceptions, you probably realize they can be pretty readily implemented as pure Ruby. While YARV implements the basic exceptions in C, they make sure to use good proper Ruby and Object-Oriented design along the way. Enter NameError::message.
When you call a method that doesn't exist, you get either a NameError (if you use the bareword foobar call syntax) or a NoMethodError, which inherits from NameError. However, if you look at the RDoc for NameError.new, you'll see it takes two arguments: a message and the missing name. A NoMethodError also has the arguments to the failed method call to track. A method call, however, has a little extra information: a message, a missing name, the arguments, and the receiver. If we compare a normal NoMethodError.new() message and the message of a NoMethodError created by the interpreter, you'll notice the difference in output:
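One rough way to run that comparison yourself is sketched below. The method name foobar and the hand-written message are placeholders, and the interpreter's exact wording varies between Ruby versions, so no expected text is shown for it:

```ruby
# A bareword call (no receiver, no arguments) raises plain NameError.
bareword_error = begin
  foobar
rescue NameError => e
  e
end

# An explicit method call raises NoMethodError, a subclass of NameError.
call_error = begin
  Object.new.foobar(1, 2)
rescue NoMethodError => e
  e
end

# Built by hand: a message, the missing name, and the call's arguments.
manual = NoMethodError.new("undefined method 'foobar'", :foobar, [1, 2])

puts bareword_error.class     # prints NameError
puts call_error.name.inspect  # prints :foobar
puts call_error.args.inspect  # prints [1, 2]
puts manual.message           # the message exactly as we supplied it
puts call_error.message       # the interpreter's message, which also describes the receiver
```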
Ripper is the Ruby parser library packaged with Ruby 1.9. While quite complete, it still has bugs (two of which I've patched myself while working on Laser, with the fixes targeted for 1.9.3: mlhs-splats and words/qwords), and it has higher-level quirks that can make use frustrating. ripper-plus is a gem intended to demonstrate what I believe the correct output from Ripper should be, with the goal that these changes become the standard Ripper behavior. I do not want to invent or maintain a new AST standard. I do, however, believe these changes warrant separate implementation and discussion before a potential integration with Ripper, and creating a small library to demonstrate the proposed output seemed the simplest approach. Plus, Laser needs all these improvements, so I had to do it anyway.
NB: Ripper is a SAX-style parser; one can construct an AST from the parser events however one wishes. Ripper also has a convenience method Ripper.sexp, which generates an Array-based AST directly from the SAX events. I personally use Ripper.sexp in my work, so the examples will be of Ripper.sexp output. All of the discussion below reflects deficiencies in the underlying SAX parser: they are unavoidable whether one uses Ripper.sexp or not.
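For readers who haven't used Ripper, here is a small sketch of both styles using only the stock library. The AssignCounter class is a hypothetical example, not part of Ripper, ripper-plus, or Laser:

```ruby
require 'ripper'
require 'pp'

source = "x = 1"

# Convenience form: an Array-based AST built from the parser events.
ast = Ripper.sexp(source)
pp ast

# SAX-style form: subclass Ripper and override the event handlers you need.
# AssignCounter is a made-up handler that only counts :assign events.
class AssignCounter < Ripper
  attr_reader :assignments

  def initialize(source)
    super(source)
    @assignments = 0
  end

  def on_assign(*_args)
    @assignments += 1
  end
end

counter = AssignCounter.new(source)
counter.parse
puts counter.assignments
```

The root of the Ripper.sexp result is a :program node, with the statements nested beneath it.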
Blocks play a critical role in idiomatic Ruby code. They're a crucial part of any successful Ruby API, and the standard library shows it: from Files and Enumerable's handy methods to even the bytecode implementation of the def and class constructs, blocks are everywhere. Like other formal arguments, whether a method takes a block is a fundamental part of a method's API.
Unlike formal arguments, our documentation tools don't infer block arguments automatically. Unlike formal arguments, when a block is required but not provided, the method will partially execute before raising an exception. Even worse, when a block is unnecessary but is provided, Ruby doesn't even notice.
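Both failure modes are easy to reproduce. In this sketch the method names save_all and ignore_block are hypothetical, chosen only to illustrate the behavior:

```ruby
# Hypothetical method that needs a block but never checks for one.
def save_all(records, log)
  log << :opened_connection      # this side effect happens first
  records.each { |r| yield r }   # raises LocalJumpError when no block was given
  log << :closed_connection      # never reached without a block
end

log = []
error = nil
begin
  save_all([1, 2], log)          # no block supplied
rescue LocalJumpError => e
  error = e
end

puts log.inspect                 # prints [:opened_connection]
puts error.class                 # prints LocalJumpError

# A method that never yields accepts a block without complaint.
def ignore_block(x)
  x * 2
end

result = ignore_block(21) { raise "this block is never called" }
puts result                      # prints 42
```

The first call leaves the log half-written, which is exactly the partial execution described above.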
Edit (3/24/2011 4:38pm): Replaced the farcical (@count ||= 0) += 1 with the 2-line version of @count ||= 0; @count += 1.
One is always learning new reasons why Ruby is slow. We know a lot of the easy reasons, like dynamic dispatch overhead, lack of type information, and the overall everything-is-dynamicness. More interesting are the subtle features that we often miss or overlook that are just doomed to be slower, in order to offer more flexibility. Ruby's designers have often chosen the more flexible option.
In Ruby, everything is an expression. It's a big part of the design and use of the language:
Rubyists aren't afraid to occasionally use the result of an if or case expression. A simple example:
x = if foo.nil? then BarClass.new else foo end
y = case foo when nil then BarClass.new else foo end
One of the alluring properties of Ruby to me was that it strove for purity in many areas, including the OO model (with annoying impurities such as no singletons for Fixnums notwithstanding), and this idea that everything was an expression. A class creation was an expression! A def creating a method was even an expression, even though it resembles a statement in both appearance (having a body with no do) and semantics (it is a closed scope: no closure over variables in enclosing scope). Such simplicity is tempting.
A while ago I came across an interesting bit while handling variable bindings with Ruby for loops, namely that you can actually use a Constant as a for loop variable.
Now I'm working on developing a control flow graph for the annotated AST, and discovered another feature of for loop variables: you can use field syntax (also called accessor syntax):
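A minimal sketch of the accessor form follows. The holder object is a hypothetical stand-in; any object with a value= writer would behave the same way:

```ruby
# Hypothetical holder object with a `value` reader and `value=` writer.
holder = Struct.new(:value).new(nil)
seen = []

# Each iteration performs holder.value = element instead of binding a local.
for holder.value in [10, 20, 30]
  seen << holder.value
end

puts seen.inspect   # prints [10, 20, 30]
puts holder.value   # prints 30
```

The loop "variable" is really an attribute assignment, so any side effects in the writer method run once per iteration.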
This is an interesting one most people don't know or consider in their day-to-day Ruby coding: how Exception Handling works. More specifically, how naming which exceptions to handle works.
When we need to catch an exception:
If you're a Rubyist, you likely love modules. You use them for namespaces, you use them to mix in behavior across classes, and you might even use them as placeholders for global methods. This post is about how mixing in modules works in Ruby 1.9. I mean how they really work: how it's implemented. I'm a forward-thinking guy so I haven't bothered to research 1.8.x, but I imagine it's going to be much the same.
(Per usual, this post comes from my work on Laser. Since my work there requires statically inferring class hierarchies, I needed to know precisely how inclusion was implemented.)
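Before digging into the implementation, it helps to recall what inclusion looks like from the outside. This sketch (with made-up module and class names) only shows the observable effect on the ancestor chain and on super, not the internals:

```ruby
module Greeter
  def greet
    "hello from Greeter"
  end
end

module Loud
  # super continues up the ancestor chain to Greeter#greet.
  def greet
    super.upcase
  end
end

class Person
  include Greeter
  include Loud   # included later, so it sits closer to Person
end

puts Person.ancestors.take(3).inspect  # prints [Person, Loud, Greeter]
puts Person.new.greet                  # prints HELLO FROM GREETER
```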
Short one today. While implementing the section of Laser that models the bindings in a given bit of code, I needed to handle how the for expression creates new variable bindings in Ruby (without creating a new scope):
for a, (b, c) in [[1, [:a, :b]]]
  puts "#{a} #{b} #{c}"
end
puts "#{a} #{b} #{c}"
Results in:
1 a b
1 a b
I present a small Ruby class which provides full Ruby Regexp matching on sequences of (potentially) heterogeneous objects, conditioned on those objects implementing a single, no-argument method returning a String. The code, which I've since been improving, would in my opinion be a solid replacement for the existing, related Ripper code in a future Ruby release.
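The general idea can be sketched in a few lines. Everything below is a hypothetical illustration (the SequenceMatcher class, the tag method, and the hand-supplied alphabet are all assumptions), not the actual class from the post:

```ruby
# Hypothetical sketch of the idea; not the real implementation or its API.
class SequenceMatcher
  # alphabet maps each tag name to one distinct character, e.g. 'ident' => 'a'.
  def initialize(pattern, alphabet)
    @alphabet = alphabet
    # Swap each tag name in the pattern for its character, then drop spaces.
    source = pattern.gsub(/[a-z_]+/) { |tag| alphabet.fetch(tag) }
    @regex = Regexp.new(source.delete(' '))
  end

  # Returns the matched slice of objects, or nil when nothing matches.
  def match(objects)
    # One character per object, so string offsets equal object indices.
    encoded = objects.map { |obj| @alphabet.fetch(obj.tag) }.join
    m = @regex.match(encoded)
    m && objects[m.begin(0)...m.end(0)]
  end
end

Token = Struct.new(:tag, :text)
tokens = [
  Token.new('nl', "\n"),
  Token.new('ident', 'a'),
  Token.new('comma', ','),
  Token.new('ident', 'b'),
  Token.new('nl', "\n")
]

alphabet = { 'ident' => 'a', 'comma' => 'b', 'nl' => 'c' }
matcher = SequenceMatcher.new('ident (comma ident)*', alphabet)
matched = matcher.match(tokens)
puts matched.map(&:text).join   # prints a,b
```

Because each object is encoded as exactly one character, the full Regexp engine (repetition, grouping, alternation) applies to the object sequence for free.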
So I'm hammering away at Laser, and I come across a situation: I need to parse out comments using Ripper's output.
Today, I've pushed significant work on Wool. It's still very rough around the edges, but I've got it doing the following new, cool, and somewhat-advanced analyses:
If you'd like to warn against using a semicolon to separate statements on a single line, you can turn this warning on. However, due to personal preference, if you do this:
After a year of working hard on Amp, we've got a huge update on the way. There have been a few bumps along the road (seydar had to apply to colleges. Ouch!) but the roadmap is really settling in. Here's a list of what's gonna be out soon:
amp serve officially accepts both amp and hg as clients for Mercurial-specific serving. This is going to be hg-specific for 0.6, but it's come a long, long way. We're using Haml 3.0+ for this release to stay on the bleeding edge of Ruby development.
There's more I'm missing and my laptop is about to die. But there's a lot coming to be excited about.