#ruby

Sorbet - A static type checker for Ruby

Akshay Sasidharan

Sorbet is a static type checker that is joining the flock in duck-typed Ruby land. In this post, we explore why we might need static type checking and what the implications of adopting it are.

What is a Type system?

Types are how a programmer imparts meaning to a raw sequence of bits, such as a value, a constant, or an object within a program. They also help the language's compiler or interpreter allocate memory accordingly.
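
To make the "meaning for raw bits" point concrete, here is a minimal sketch in plain Ruby that reads the same four bytes back as two different types. The variable names are only for illustration.

```ruby
# Pack the float 1.0 as a 32-bit little-endian IEEE 754 single.
bytes = [1.0].pack('e')

# Read the same four bytes back under two different interpretations.
as_float   = bytes.unpack1('e') # single-precision float
as_integer = bytes.unpack1('V') # 32-bit unsigned integer, little-endian

p bytes.bytesize
p as_float
p as_integer
```

The bits never change; only the type we ask Ruby to apply to them does.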

Type safety means enforcing constraints on how the various types in a programming language may be used. This helps catch operations between incompatible types. These checks are done either at compile-time (static type checking) or at run-time (dynamic type checking).

To enforce type safety, a language needs a type system, which is defined as part of its compiler or interpreter. How strictly the compiler or interpreter enforces type safety determines whether a language is strongly or loosely typed.

Where does Ruby fit in all this?

Ruby is a dynamically and strongly typed language. Its dynamic nature is what enables the duck typing and metaprogramming capabilities we love Ruby for.
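
A short plain-Ruby sketch of those three properties may help. The Robot and Duck classes are made up for the illustration.

```ruby
# Strong typing: Ruby refuses to silently coerce incompatible types.
begin
  1 + '2'
rescue TypeError => e
  strong_typing_error = e.class
end

# Dynamic typing: the same variable can hold values of different types over time.
value = 42
value = 'forty-two'

# Duck typing: anything that responds to `speak` is accepted.
class Robot
  def speak
    'beep'
  end
end

class Duck
  def speak
    'quack'
  end
end

sounds = [Robot.new, Duck.new].map(&:speak)
```

Note that Robot and Duck share no ancestor or interface declaration; responding to speak is all that matters.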

The strongly typed nature of Ruby helps enforce type safety, but type errors can only be found at run-time. For this reason, we keep relying on tests to catch them.

Certain pain points arise in Ruby with a huge codebase and a large collaborating team.

  1. Hidden bugs: Bugs such as a method not being found, invalid arguments being provided, or an uninitialized constant can only be found at run-time. With a static type checker, we can find them before the program even executes, and thus save time.
  2. Explicit documentation: When dealing with undocumented code, programmers are often left wondering: what arguments does this method take, and what could it return? What type does this variable hold? Even with documented code, the documentation can fall out of sync. A static checker enforces the types within the code itself, so this confusion is easily avoided.
  3. Code refactoring: To catch errors after a refactoring cycle, one has to depend on tests to ensure nothing has been broken, and anything that is broken only shows up at run-time. With a static checker, refactoring becomes easier: a programmer can change interfaces with confidence, because the type checker will flag any part of the program that is inconsistent with the updated interface.
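
The "hidden bugs" from the first point can be sketched in plain Ruby as follows. Greeter and the deliberate typos are hypothetical; the point is that the file loads without complaint and each error only appears when the faulty line actually runs.

```ruby
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# Each lambda hides one of the bugs listed above. Nothing fails until it is called.
latent_bugs = {
  missing_method: -> { Greeter.new.great('Ada') },        # typo in method name
  wrong_arity:    -> { Greeter.new.greet('Ada', 'Bob') }, # one argument too many
  missing_const:  -> { Greter.new }                       # typo in constant name
}

# Defining the lambdas raised nothing; the errors only appear on execution.
raised = latent_bugs.transform_values do |bug|
  begin
    bug.call
    nil
  rescue NoMethodError, ArgumentError, NameError => e
    e.class
  end
end

p raised
```

A test suite has to exercise each of these paths to notice them, whereas a static checker can report them from the source alone.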

Furthermore, explicit types could in principle be leveraged by the interpreter to produce better-optimized machine code, and by IDEs to implement auto-completion features.

To address these pain points and the productivity lost to not having a static type checker, one must either depend on an elaborate test suite, which still cannot guarantee 100% type safety, or consider a rewrite in a language that addresses them. A rewrite is impractical in most cases because it does not serve the actual business goals.

Enter Sorbet

Sorbet is a gradual type system that can be adopted incrementally to introduce static type checking to your code. You can start adding type checking to existing parts of the codebase alongside the development of other features.

Adopting Sorbet

Add these two gems, for the Sorbet command-line interface and runtime, to your Gemfile:

gem 'sorbet', :group => :development
gem 'sorbet-runtime'

Then install them:

> bundle install

Now initialize Sorbet for the project by running the command:

> srb init

This creates the following directory, which should be checked into version control.

sorbet/
 # Default options to be passed to sorbet on every run
├── config
└── rbi/
     # Community-written type definition files for your gems
    ├── sorbet-typed/
     # Autogenerated type definitions for your gems
    ├── gems/
     # Things defined when run, but hidden statically
    ├── hidden-definitions/
     # Constants which were still missing
    └── todo.rbi

The config file simply contains the options and arguments to be passed to the command srb tc (which statically type checks the code).

RBI files are “Ruby Interface” files. Sorbet uses RBI files to learn about constants, ancestors, and methods defined in ways it doesn’t understand natively. These files are autogenerated but can also be handwritten. You can learn more about RBI files from the official docs.
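
As an illustration of "defined in ways it doesn't understand natively", consider methods created through metaprogramming. The Settings class and the file name below are hypothetical, and the RBI stub is shown as a comment because RBI files are read by Sorbet and never executed.

```ruby
class Settings
  # These readers are created at run-time, so a tool that only reads the
  # source text sees no `def host` or `def port` here.
  %i[host port].each do |name|
    define_method(name) { instance_variable_get("@#{name}") }
  end

  def initialize(host, port)
    @host = host
    @port = port
  end
end

# A handwritten RBI entry describing those methods could look like this
# (hypothetical file name: sorbet/rbi/settings.rbi):
#
#   class Settings
#     def host; end
#     def port; end
#   end

settings = Settings.new('localhost', 5432)
p settings.host
p settings.port
```

The methods exist when the program runs, but it takes an RBI entry (generated or handwritten) for a static checker to know about them.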

sorbet-typed is a folder that contains RBI files for gems, pulled from a community-driven central repository.

And voila! You are all set to start type-checking your code.

Type checking your code

It all starts with a magic comment, # typed:, which the Sorbet team calls a sigil. It is added to each file that is to be type checked. There are various strictness levels, based on which srb decides what to report and what to silence.

We can start with # typed: true

At # typed: true, things that would normally be called “type errors” are reported. This includes calling a non-existent method, calling a method with mismatched argument counts, using variables inconsistently with their types, etc.
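
When adopting Sorbet gradually, it can be handy to see which strictness level each file declares. The sketch below is a hypothetical helper, not part of Sorbet: it extracts the sigil from source text, and treats a file without a sigil as false, which is the level Sorbet assumes by default.

```ruby
# Strictness levels from least to most strict, as listed in the Sorbet docs.
LEVELS = %w[ignore false true strict strong].freeze

# Hypothetical helper: returns the level named by the sigil in `source`,
# 'false' when there is no sigil, or nil for an unrecognised level.
def sigil_of(source)
  match = source.match(/^\s*#\s*typed:\s*(\w+)\s*$/)
  level = match ? match[1] : 'false'
  LEVELS.include?(level) ? level : nil
end

p sigil_of("# typed: true\nclass Farm; end")
p sigil_of("class Farm; end")
```

Tallying these per directory gives a rough picture of how far the migration has progressed.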

An example

Let us consider a silly example to showcase what sorbet is capable of:

# typed: true
 
class Farm
  def initialize(animals = [])
    @animals = animals
  end
 
  def all_speak
    make_animals_speak(@animals)
  end
 
  def animal_count
    @animals.length
  end
 
  def insert_animals(animals)
    animals.each { |animal| @animals << animal }
  end
 
  private
 
  def make_animals_speak(animals)
    if animals
      animals.each { |animal| p animal.speak }
    else
      p 'awkward silence.. 😪'
    end
  end
end
 
class Duck
  def initialize
    @speech = 'quack'
  end
 
  def speak
    @speech
  end
end
 
class Cow
  def initialize
    @speech = 'moo'
  end
 
  def speak
    @speech
  end
end
 
class Dog
  def initialize
    @speech = 'woof woof'
  end
 
  def speak
    @speech
  end
end
 
farm = Farm.new
farm.speak # undefined method 'speak'
 
ducks = Array.new(3) { Duck.new }
cows = Array.new(2) { Cow.new }
dogs = Array.new(1) { Doggo.new } # uninitialized constant Doggo
 
farm.insert_animals(ducks, cows, dogs) # wrong number of arguments (given 3, expected 1)
farm.speak
p "count: " + farm.animal_count # no implicit conversion of Integer into String (TypeError)

Only when we run this program do we find the error undefined method 'speak' for the Farm object. Once we correct that, the next error appears: uninitialized constant Doggo. Oops, how silly to miss that. We fix it, only to find the next error on the following run, viz. wrong number of arguments (given 3, expected 1).

'But hey, how tedious is it to find these errors? We have been debugging like this for a long time, and we keep tests to ensure the correctness of our program's intentions.' - one could argue.

And then on the command line, I'd type in:

> srb tc
 
sorbet-example.rb:56: Unable to resolve constant Doggo https://srb.help/5002
    56 |dogs = Array.new(1) { Doggo.new } # uninitialized constant Doggo
                              ^^^^^
    sorbet-example.rb:41: Did you mean: Dog?
    41 |class Dog
        ^^^^^^^^^
 
sorbet-example.rb:52: Method speak does not exist on Farm https://srb.help/7003
    52 |farm.speak # undefined method 'speak'
        ^^^^^^^^^^
 
sorbet-example.rb:58: Too many arguments provided for method Farm#insert_animals. Expected: 1, got: 3 https://srb.help/7004
    58 |farm.insert_animals(ducks, cows, dogs) # wrong number of arguments (given 3, expected 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    sorbet-example.rb:16: insert_animals defined here
    16 |  def insert_animals(animals)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
sorbet-example.rb:59: Method speak does not exist on Farm https://srb.help/7003
    59 |farm.speak
 
Errors: 4

All the errors are listed before we even run the program. We can truly benefit from adopting this in our codebases and keep our tests focused on program behaviour. But wait, these are simply errors about missing methods, argument counts, and unresolved constants. To get the real juice out of this gem, we need signatures.

Runtime checks with Signatures

Signatures are simply Ruby code added above a method as a contract. To use signatures, we need to add extend T::Sig to the respective class or module.

A signature is composed of optional parameter types and a required return type (or void, when the return value is not meant to be used).

sig {params(x: SomeType, y: SomeOtherType).returns(MyReturnType)}

These annotations (at the cost of increased verbosity) help us catch type errors and act as enforced documentation for the method. They can further be utilized for autocompletion and instant type-checked feedback in IDEs, and could be leveraged by the interpreter to produce better-optimized machine code.
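
To give a feel for what a run-time contract does, here is a plain-Ruby sketch that checks argument and return classes around a method call. It is not how sorbet-runtime is implemented, and Contract, contract and Calculator are hypothetical names; with Sorbet you would write a sig instead.

```ruby
# Hypothetical stand-in for a signature: wraps a method and checks classes
# on the way in and on the way out.
module Contract
  def contract(method_name, params:, returns:)
    original = instance_method(method_name)

    define_method(method_name) do |*args|
      params.each_with_index do |expected, i|
        unless args[i].is_a?(expected)
          raise TypeError, "argument #{i}: expected #{expected}, got #{args[i].class}"
        end
      end

      result = original.bind(self).call(*args)
      unless result.is_a?(returns)
        raise TypeError, "return value: expected #{returns}, got #{result.class}"
      end
      result
    end
  end
end

class Calculator
  extend Contract

  def double(x)
    x * 2
  end
  contract :double, params: [Integer], returns: Integer
end

p Calculator.new.double(21)
```

Without the contract, double('a') would quietly return 'aa'. With it, the call fails at the boundary with a TypeError, which is the kind of early feedback a signature is meant to provide.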

Let's consider the program from earlier and add signatures to the class Farm.

class Farm
  extend T::Sig
 
  sig { params(animals: Array).void }
  def initialize(animals = [])
    @animals = animals
  end
 
  sig { void }
  def all_speak
    make_animals_speak(@animals)
  end
 
  sig { returns(Integer) }
  def animal_count
    @animals.length
  end
 
  sig { params(animals: Array).void }
  def insert_animals(animals)
    animals.each { |animal| @animals << animal }
  end
 
  private
 
  sig { params(animals: Array).void }
  def make_animals_speak(animals)
    if animals
      animals.each { |animal| p animal.speak }
    else
      p 'awkward silence.. 😪'
    end
  end
end

Now that we have specified type information, let's see how the type check goes.

> srb tc
 
sorbet-example.rb:78: Expected String but found Integer for argument arg0 https://srb.help/7002
    78 |p "count: " + farm.animal_count
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    https://github.com/sorbet/sorbet/tree/80e1b24dadafc4ead575cb5e3166a691d2eb73e7/rbi/core/string.rbi#L24: Method String#+ has specified arg0 as String
    24 |        arg0: String,
                ^^^^
  Got Integer originating from:
    sorbet-example.rb:78:
    78 |p "count: " + farm.animal_count
                      ^^^^^^^^^^^^^^^^^
 
sorbet-example.rb:34: This code is unreachable https://srb.help/7006
    34 |      p 'awkward silence.. 😪'
                ^^^^^^^^^^^^^^^^^^^
Errors: 2

We just found a type error and an unreachable part of code. Nice! Let's fix that.

class Farm
...
  sig { params(animals: Array).void }
  def make_animals_speak(animals)
    if !animals.empty?
      animals.each { |animal| p animal.speak }
    else
      p 'awkward silence.. 😪'
    end
  end
...
end
 
...
...
p "count: " + farm.animal_count.to_s
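
Why was the else branch unreachable? The signature promises an Array, and every Array, even an empty one, is truthy in Ruby. A minimal plain-Ruby sketch of that rule (the describe helper is only for illustration):

```ruby
# Only nil and false are falsy in Ruby; an empty Array is still truthy.
def describe(animals)
  if animals
    'truthy'
  else
    'falsy'
  end
end

empty_is_truthy = describe([])  # the else branch is never taken for an Array
nil_is_falsy    = describe(nil) # only reachable when nil is allowed in
uses_empty      = [].empty?     # the check the fixed version relies on

p empty_is_truthy
p nil_is_falsy
p uses_empty
```

Once the parameter is declared as an Array, nil can no longer arrive, so the checker can tell the else branch is dead code.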

I'll take the liberty of pointing out the obvious, just in case you are not thinking about it: we have written zero tests so far.

In this way, we can incrementally add type checking at our preferred pace and granularity. Any part of the codebase that has no types given is considered to be of type T.untyped.

T.untyped has two special properties:

  1. Every value can be asserted to have type T.untyped.
  2. Every value of type T.untyped can be asserted to be any other type!

If you want to understand why it is so, check out the docs.

Initially, most of our code will be T.untyped; by incrementally adding statically typed code, we should steadily reduce the amount of T.untyped.

How will testing be affected?

Ruby being a dynamic language, tests are integral when building large programs. We will still rely on automated tests, but with added confidence in type safety. Since signatures are also checked at run-time, these automated tests implicitly become tests of the signature contracts. Moreover, we can make type checking a part of the CI/CD pipeline as well.
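
One way to wire the type check into a pipeline is a small Ruby script that runs srb tc and reports the outcome. This is only a sketch: typecheck is a hypothetical helper, not part of Sorbet, and the runner is injectable so the helper can be exercised without Sorbet installed.

```ruby
# Hypothetical CI helper. By default it shells out to `bundle exec srb tc`;
# Kernel#system returns true only when the command exits successfully.
def typecheck(command: %w[bundle exec srb tc], runner: ->(cmd) { system(*cmd) })
  runner.call(command) ? :passed : :failed
end

# In a CI script, fail the build when the type check fails:
#   exit(1) if typecheck == :failed
```

In practice most CI systems can simply run bundle exec srb tc as a build step and rely on its exit status; the helper only makes that step scriptable from Ruby.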

What now?

Sorbet is written in C++ and is pretty fast, as it is multithreaded and scales across CPU cores. IDE support is on the way, so that type-checked feedback is instantaneous. You can try out the editor support online. It will help resolve the pain points and increase productivity when we make it a part of our toolchain.

Given the popularity and adoption of static type checking with TypeScript or Flow in JavaScript, mypy in Python, and Hack in PHP, it is great that static type checkers are making their way into the Ruby community as well. Matz put forward three goals for Ruby 3 in his keynote at RubyKaigi 2019, viz. performance optimizations, concurrency support, and static type checking. So types are inevitably coming to Ruby (it is yet to be seen whether inline type annotations or separately kept RBI files will prevail).

Sorbet was initially developed as internal tooling at Stripe and was later open-sourced. It has been tested by about 30 companies on their codebases, including Shopify, Coinbase, Sourcegraph, and Kickstarter. So if you are slowly drowning in the kind of technical debt we mentioned earlier, or want to prevent it, do try out Sorbet and see if it floats your boat.