Ruby Learning by Reversing: Native Gems, Part 3

The first series of Learning by Reversing examines a Ruby native gem to understand how it works. Part 3 continues the exploration by looking at what is included in the gem and how it is set up so that the native extension is built and available once the gem is installed.

From the earlier parts of this series, we are now quite familiar with the gem and how it is loaded. Here is what we saw:

  • We require the Ruby gem code file
  • The Ruby gem code then requires the native shared library

In this part, we will see how all these items end up in the correct place when the gem is packaged and eventually installed.

What is inside the gem?

To make sure that everything is in the correct place, we need to provide the following:

  • The main Ruby gem code in lib – this is easy and is the normal place for the main file to be included
  • The dynamic library in lib/fast_polylines as fast_polylines.so – wait, this does not exist when we look at the source!

So, that’s the main thing we need to achieve now – we need to go from having some C source code to finally having the dynamic library available as lib/fast_polylines/fast_polylines.so. The way this works is:

  • We set things up in the source code
  • We package that as part of our gem
  • The native extension is built when the gem is installed (and everything should be in the correct place then)

Setting things up in the Source Code

In the previous part, we looked at the file tree and this is what we had:

$ tree /f
Folder PATH listing for volume D_DRIVE
Volume serial number is C2DE-A69A
D:.
│   .gitignore
│   .rspec
│   .rubocop.yml
│   .yardopts
│   CHANGELOG.md
│   fast-polylines.gemspec
│   Gemfile
│   LICENSE
│   Makefile
│   README.md
│
├───.github
│   └───workflows
│           ci.yml
│
├───ext
│   └───fast_polylines
│           .gitignore
│           extconf.rb
│           fast_polylines.c
│
├───lib
│   │   fast-polylines.rb
│   │   fast_polylines.rb
│   │
│   └───fast_polylines
│           version.rb
│
├───perf
│       benchmark.rb
│
└───spec
        fast-polylines_spec.rb
        fast_polylines_spec.rb
        spec_helper.rb

The Ruby code files are straightforward: they live under lib, exactly as in any other Ruby gem. For the native code, however, there are two parts to setting things up:

  1. Ensure that the C source code is available
  2. Have a Makefile that will be used when the gem is installed

C Source Code

The fast_polylines gem is really simple (which is why we chose to use it) and it has only one C source code file. It is common for native source files to be placed under the ext directory of the source tree. The file tree shows that there are two main files under ext/fast_polylines:

  • extconf.rb
  • fast_polylines.c

All the C source code is inside fast_polylines.c. We will explain the details of the C source code in a later part, but of course, you can take a look inside.

The other file is extconf.rb and this nicely leads us to the next bit since it helps to create the Makefile.

Makefile

The Makefile is created by running extconf.rb – if we open it up, we see this:

# frozen_string_literal: true

require "mkmf"

if ENV["DEBUG"]
  warn "DEBUG MODE."
  $CFLAGS << " " << %w(
    -Wall
    -ggdb
    -DDEBUG
    -pedantic
  ) * " "
end
create_makefile "fast_polylines/fast_polylines"

If you set DEBUG in the environment, additional flags are added for the compiler. Note that:

  1. These flags apply only if DEBUG is set during development, or if the gem is installed while DEBUG is set in the environment. In most cases, this will not affect you, but it is a good reference if you want to set such flags yourself.
  2. The flags are added by appending to the global variable $CFLAGS, which mkmf uses when it generates the Makefile.
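To see what the expression in the DEBUG branch actually produces, here is a minimal sketch that uses a local string in place of the global $CFLAGS. The starting value "-O2" is an assumption made purely for illustration; the real default comes from the configuration of your Ruby build.

```ruby
# Stand-in for mkmf's global $CFLAGS; "-O2" is a made-up starting value.
cflags = String.new("-O2")

debug_flags = %w(-Wall -ggdb -DDEBUG -pedantic)

# Array#* with a String argument behaves like Array#join.
joined = debug_flags * " "

# String#<< appends in place, which is why the global itself is changed
# in extconf.rb. Because * binds tighter than <<, the original one-liner
# joins the flags first and then appends them.
cflags << " " << joined

puts cflags # prints "-O2 -Wall -ggdb -DDEBUG -pedantic"
```

Since the string is modified in place, everything that reads $CFLAGS afterwards (including create_makefile) sees the extra flags.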

The method being called here is create_makefile (https://www.rubydoc.info/stdlib/mkmf/MakeMakefile:create_makefile). It writes out the actual Makefile when extconf.rb is run during gem installation. The gem installation then runs make, including make install, to build the native extension and put it in place.

Let’s look at this method call a bit: create_makefile "fast_polylines/fast_polylines" – we pass "fast_polylines/fast_polylines" as the argument for the target parameter. This is what the documentation for create_makefile says (some extracts stitched together):

The target name should correspond the name of the global function name defined within your C extension, minus the Init_. For example, if your C extension is defined as Init_foo, then your target would simply be “foo”.

If any “/” characters are present in the target name, only the last name is interpreted as the target name, and the rest are considered toplevel directory names, and the generated Makefile will be altered accordingly to follow that directory structure.

For example, if you pass “test/foo” as a target name, your extension will be installed under the “test” directory. This means that in order to load the file within a Ruby program later, that directory structure will have to be followed, e.g. require 'test/foo'.

So, here is what we are saying when we say fast_polylines/fast_polylines:

  • Our target is called fast_polylines (the one after the slash)
  • It should be installed to the path fast_polylines (the one before the slash)
  • We will need to require it as require 'fast_polylines/fast_polylines' (which is what we do in the main gem code)
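The splitting rule above can be sketched in a few lines of Ruby. Note that describe_target is a hypothetical helper written for this post, not something mkmf provides; it only mirrors the rule quoted from the documentation.

```ruby
# Hypothetical helper (not part of mkmf) that mirrors the documented rule:
# the last path segment is the target name, the rest is the directory prefix.
def describe_target(target)
  dir, name = File.split(target)
  {
    install_dir: dir == "." ? nil : dir, # nil when the target has no "/"
    init_function: "Init_#{name}",       # entry point expected in the C code
    require_path: target                 # what you pass to require
  }
end

info = describe_target("fast_polylines/fast_polylines")
puts info[:install_dir]   # prints "fast_polylines"
puts info[:init_function] # prints "Init_fast_polylines"
puts info[:require_path]  # prints "fast_polylines/fast_polylines"
```

By the same rule, a plain target such as "foo" has no directory prefix and would be expected to pair with an Init_foo function in the C source.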

Good – so far, we have established this:

  • We know what needs to go into ext/fast_polylines/extconf.rb
  • We know where the source code needs to be
  • We know how the Makefile will be created so that make install will install it on gem installation
  • We know how to require the native extension after it is built

Instructions for Installation

The missing piece now is in getting the gem installation to realise that we have a native extension that needs to be built. The place that provides this information to gem install is the gemspec. Let’s look at that.

# frozen_string_literal: true

$LOAD_PATH.push File.expand_path("lib", __dir__)
require "fast_polylines/version"

Gem::Specification.new do |spec|
  spec.name        = "fast-polylines"
  spec.version     = FastPolylines::VERSION
  spec.platform    = Gem::Platform::RUBY
  spec.authors     = ["Cyrille Courtière", "Ulysse Buonomo"]
  spec.email       = ["dev@klaxit.com"]
  spec.homepage    = "https://github.com/klaxit/fast-polylines"
  spec.summary     = "Fast & easy Google polylines"
  spec.license     = "MIT"
  spec.required_ruby_version = Gem::Requirement.new(">= 2.4.6")

  spec.files = Dir["{lib,ext}/**/*.{rb,c}"] + %w(README.md CHANGELOG.md .yardopts)
  spec.extensions = ["ext/fast_polylines/extconf.rb"]
  spec.test_files = Dir["spec/**/*"] + %w(.rspec)
  spec.require_paths = "lib"

  spec.add_development_dependency("benchmark-ips", "~> 2.7")
  spec.add_development_dependency("polylines", "~> 0.3")
  spec.add_development_dependency("rspec", "~> 3.5")
end

Look specifically at lines 17 and 18 of the gemspec: the spec.files and spec.extensions assignments.

Getting the Files

Line 17 (spec.files) lists the files to include in the gem when it is packaged. We ask it to include all the *.rb and *.c files found in the lib and ext directories and their sub-folders; we also ask it to include some other files such as the README.

This ensures that all the relevant files are included.
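If you want to convince yourself of what that glob pattern picks up, the following sketch recreates a subset of the file tree in a temporary directory (with empty files) and evaluates the same pattern there. The file list is copied from the tree shown earlier; the temporary directory is just scaffolding for the experiment.

```ruby
require "tmpdir"
require "fileutils"

# A subset of the file tree shown earlier, recreated as empty files.
paths = %w(
  lib/fast_polylines.rb
  lib/fast_polylines/version.rb
  ext/fast_polylines/extconf.rb
  ext/fast_polylines/fast_polylines.c
  ext/fast_polylines/.gitignore
  Makefile
  spec/spec_helper.rb
)

matched = Dir.mktmpdir do |root|
  paths.each do |path|
    full = File.join(root, path)
    FileUtils.mkdir_p(File.dirname(full))
    FileUtils.touch(full)
  end
  # Same pattern as in the gemspec, evaluated relative to the temporary root.
  Dir.chdir(root) { Dir["{lib,ext}/**/*.{rb,c}"].sort }
end

puts matched
```

The root Makefile, the .gitignore under ext and everything under spec are left out, because they are either outside lib and ext or do not end in .rb or .c.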

Instructing it to Build the Extension

Line 18 (spec.extensions) is where we specify the extensions that need to be built. The documentation for extensions says:

Extensions to build when installing the gem, specifically the paths to extconf.rb-style files used to compile extensions.

These files will be run when the gem is installed, causing the C (or whatever) code to be compiled on the user’s machine.

That’s the link we needed. We now have the following:

  • Line 17 (spec.files) ensures that the files are included
  • Line 18 (spec.extensions) ensures that ext/fast_polylines/extconf.rb will run when the gem is installed.

When the gem is being installed, the extensions entry is handled by the builder, which checks the kind of extension and invokes the matching builder (in our case, Gem::Ext::ExtConfBuilder). That builder generates the Makefile and runs it for this extension. If you’re curious (and you probably are), these are the builders specified in the builder code. The last one (Cargo.toml) is used for Rust extensions.
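The dispatch can be pictured roughly as below. This is a simplified sketch and not the RubyGems source: the file name patterns and builder class names are written from my reading of Gem::Ext::Builder and should be treated as assumptions that may differ between RubyGems versions. The method builder_name_for is a hypothetical name, and it returns class names as strings so that the snippet runs on its own.

```ruby
# Simplified, illustrative dispatch. The real logic lives inside RubyGems
# and may differ between versions; patterns and names here are assumptions.
def builder_name_for(extension)
  case extension
  when /extconf/                 then "Gem::Ext::ExtConfBuilder"
  when /configure/               then "Gem::Ext::ConfigureBuilder"
  when /rakefile/i, /mkrf_conf/i then "Gem::Ext::RakeBuilder"
  when /CMakeLists.txt/          then "Gem::Ext::CmakeBuilder"
  when /Cargo.toml/              then "Gem::Ext::CargoBuilder"
  else
    raise ArgumentError, "No builder for extension '#{extension}'"
  end
end

# The entry from our gemspec's extensions list.
puts builder_name_for("ext/fast_polylines/extconf.rb")
```

The point to take away is that the choice is driven purely by the file name listed in extensions, which is why pointing it at an extconf.rb is enough to get the Makefile-based build.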

Aren’t there two Makefiles?

One loose thread: you’ll notice that the source code also has a Makefile in the root directory, next to the gemspec. It is not included in the gem (you saw what we include above) and is used only for development. This Makefile lets you compile, test, install, benchmark, etc. while you develop. We will look at the development Makefile in a later post.

Summary

We have continued on our journey to stitch things together. We could say that this is how the jobs are split:

  • Development
    • Ensure all the code files (Ruby and C) exist
    • Set up extconf.rb to generate the Makefile
    • Ensure that the gemspec has the correct files and extensions entries
    • Package the gem
  • Gem Installation
    • Installation sets up the files
    • Builder will invoke the ExtConfBuilder that will
      • Create the Makefile
      • Execute make
      • Move the compiled object to the lib path (lib/fast_polylines in our case)
  • Gem Usage
    • You require the files from lib
    • Everything works!

I’ve tried to draw a picture about this which is a little messy but tries to capture this information.

Looking ahead

This brings us to the end of Part 3 where we looked at how the native extension source is packaged and delivered so that it can be built during installation.

I will add links and references later, possibly in the last post of the series. If you have any comments, please feel free to leave them below.
