Ruby packaging next

Posted on Nov 7, 2014

Taking ruby packaging to the next level

Table of Contents

TL;DR
Where we started
The basics
One step at a time
Rocks on the road
“Job done right?” “Well almost.”
What is left?

TL;DR

We are going back to versioned ruby packages with a new naming scheme.
One spec file to rule them all (one spec file builds a rubygem for all interpreters).
Macro-based BuildRequires: %{rubygem rails:4.1 >= 4.1.4}
gem2rpm.yml as the configuration file for gem2rpm. No more manual editing of spec files.
With the config, spec files can always be regenerated without losing anything.

Where we started

A long time ago we started supporting builds for multiple ruby versions (actually MRI versions) at the same time. The ruby base package was in good shape in that regard, but we had one open issue: building rubygems for multiple ruby versions. This issue was left hanging for a while, so we reverted to a single packaged ruby version for openSUSE.

While this might work for openSUSE, it can become really challenging for SLES with its long life cycle. Even in the openSUSE world we have Evergreen for 11.4 still alive. That was released in March 2011 and there is still no sign that Evergreen support for it is ending. We really want multiple ruby versions here.

With that in mind we wondered how much work would be left before we could fully support multiple ruby versions or even multiple ruby interpreters.

The steps we need to take

When we asked around, “If we support multiple versions again, what would you expect with regard to rubygems packaging?”, we got a few important points:

only 1 spec file for all. python currently uses 1 spec file for each version
avoid re-packaging in d:l:r:e [1]
system ruby should always be /usr/bin/ruby.

The basics

The first step was to restore the versioned ruby packaging and add at least one extra version. As a start we used a snapshot package for MRI 2.2.

All the common bits needed for all ruby interpreters moved into ruby-common. The unversioned ruby package became a wrapper to pull in the system ruby. As mentioned above, /usr/bin/ruby should always point to the system ruby. So in the new scheme it is not a symlink handled by update-alternatives, but a hardlink.

The same goes for %{_libdir}/libruby.so. It is actually a pity that we still need this file, as any good program linking ruby should just query RbConfig for the commands to link the libruby for the currently running interpreter. Sadly some tools ask for the CFLAGS but then hardcode -lruby. Ugh.

Our naming scheme is

mainpackage = "#{interpreter}#{major}.#{minor}"

jruby1.7
rubinius2.2
ruby2.1
ruby2.2

The interpreter name is more or less the same, except for rubinius, whose binary is called rbx. So we get:

interpretername = "#{binaryname}#{major}.#{minor}"

jruby1.7
rbx2.2
ruby2.1
ruby2.2

As we wanted to support gems for more than one version, we also need to express that in their package names now.

gempackage = "#{interpretername}-rubygem-#{gemname}#{gemsuffix}"

# gemsuffix is optional. we only need it for packages where we need more
# than one version

jruby1.7-rubygem-tzinfo
rbx2.2-rubygem-tzinfo
ruby2.1-rubygem-tzinfo
ruby2.2-rubygem-tzinfo
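
The three patterns above can be played through in a few lines of Ruby. This is only a sketch: the Interpreter struct and its method names are made up for illustration, and only the string patterns and example values come from the lists above.

```ruby
# Hypothetical helper; only the naming patterns are taken from the text.
Interpreter = Struct.new(:name, :binary, :major, :minor) do
  # mainpackage = "#{interpreter}#{major}.#{minor}", e.g. rubinius2.2
  def main_package
    "#{name}#{major}.#{minor}"
  end

  # interpretername = "#{binaryname}#{major}.#{minor}", e.g. rbx2.2
  def interpreter_name
    "#{binary}#{major}.#{minor}"
  end

  # gempackage = "#{interpretername}-rubygem-#{gemname}#{gemsuffix}"
  # gem_suffix is optional, just like in the scheme above.
  def gem_package(gem_name, gem_suffix = "")
    "#{interpreter_name}-rubygem-#{gem_name}#{gem_suffix}"
  end
end

INTERPRETERS = [
  Interpreter.new("jruby", "jruby", 1, 7),
  Interpreter.new("rubinius", "rbx", 2, 2),
  Interpreter.new("ruby", "ruby", 2, 1),
  Interpreter.new("ruby", "ruby", 2, 2),
]

# Print main package, interpreter name and gem package side by side.
INTERPRETERS.each do |interp|
  puts format("%-12s %-10s %s",
              interp.main_package,
              interp.interpreter_name,
              interp.gem_package("tzinfo"))
end
```

Only rubinius ends up with a main package name (rubinius2.2) that differs from its interpreter name (rbx2.2).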

The gem binary naming is a bit longer now as well. But most of you should not notice anything as update-alternatives covers this:

gembinary = "#{binaryname}.#{interpretername}-#{gemversion}"

$ ls -l /usr/bin/bundler*
lrwxrwxrwx 1 root root  25 25. Jul 19:04 /usr/bin/bundler -> /etc/alternatives/bundler
lrwxrwxrwx 1 root root  31 25. Jul 19:04 /usr/bin/bundler-1.6.5 -> /etc/alternatives/bundler-1.6.5
lrwxrwxrwx 1 root root  38  5. Sep 13:00 /usr/bin/bundler.rbx2.2 -> ../lib64/rubinius/gems/2.2/bin/bundler
lrwxrwxrwx 1 root root  33 25. Jul 19:04 /usr/bin/bundler.ruby2.1 -> /etc/alternatives/bundler.ruby2.1
-rwxr-xr-x 1 root root 504 25. Jul 18:53 /usr/bin/bundler.ruby2.1-1.6.5
lrwxrwxrwx 1 root root  33  8. Sep 19:19 /usr/bin/bundler.ruby2.2 -> /etc/alternatives/bundler.ruby2.2
-rwxr-xr-x 1 root root 504  5. Sep 15:57 /usr/bin/bundler.ruby2.2-1.6.5

Last but not least we have the ruby(abi). In the past it used to be just the “version” number you had in your ruby paths. While this worked nicely for our packaging needs with just one ruby interpreter, it is tricky for cases where a gem wants at least a certain ruby version. The gemspec only has spec.required_ruby_version, which is a version number. In MRI this number is compared with the ruby version number, but in the case of rubinius/jruby it is compared with the version of the ruby language standard that is implemented.

Provides:  jruby(abi) = 1.7
Provides:  rubinius(abi) = 2.1.0
Provides:  ruby(abi) = 2.1.0
Provides:  ruby(abi) = 2.2.0

One note: With rubinius this alone will not be enough as a requires; we still have to add something like %requires_eq rubiniuspackage to the sub package preamble. The reason is that rbx 2.2.10 and rbx 2.3.1 might both implement the ruby 2.1.0 API, but are not binary compatible with each other.
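
To make the difference between the interpreter version and the implemented language version a bit more tangible, here is a small sketch using the rubygems version classes. The requirement ">= 2.2" is a made-up value for spec.required_ruby_version, and satisfies? is just a helper for this example, not something the packaging macros provide.

```ruby
require "rubygems"

# True when the given version string satisfies a gemspec-style requirement.
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

required = ">= 2.2" # hypothetical spec.required_ruby_version

# MRI 2.2: the interpreter version is the number to compare with.
puts "MRI 2.2.0:                #{satisfies?(required, '2.2.0')}"

# rbx 2.2.10 implementing ruby 2.1.0 (the example from the note above):
# the interpreter release would pass the check, the language version does not.
puts "rbx release 2.2.10:       #{satisfies?(required, '2.2.10')}"
puts "rbx language level 2.1.0: #{satisfies?(required, '2.1.0')}"
```

This is why the number in the (abi) provides has to be the one the gem requirement is actually checked against, and why rubinius needs the extra exact requires on top.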

One step at a time

Many of our gems nowadays build without having a buildrequires on their runtime dependencies. We only generate the dependencies into the package meta. Some packages could still require e.g. rspec so they can run their testsuite at build time. At this point generating multiple spec files seemed like an easier solution: we would generate build deps for each ruby interpreter/version into the spec file. But we were asked not to, so we stick with a single spec file.

Fortunately rpm gives us macros, though they also need support in the buildservice. The good thing is that the person who can make that happen for rpm and for the OBS is one and the same. :D

BuildRequires:  %{ruby}
BuildRequires:  %{rubydevel}
BuildRequires:  %{rubygem somegem >= someversion}

Without going into too many details here, in home:darix:ruby this expands to:

BuildRequires:  ruby2.1 ruby2.2 rubinius2.2
BuildRequires:  ruby2.1-devel ruby2.2-devel rubinius2.2-devel
BuildRequires:  rubygem(ruby:2.1.0:somegem) >= someversion rubygem(ruby:2.2.0:somegem) >= someversion rubygem(rubinius:2.2:somegem) >= someversion

The interested reader can check [2]. It involves recursive macro calls and other fun things. You have been warned.
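
For those who would rather not read recursive rpm macros, the effect of the expansion can be imitated in plain Ruby. This is a sketch only: the real work is done by rpm macros and the project config, BUILD_TARGETS and the expand_* helpers are invented names, and the values in the table are copied from the home:darix:ruby expansion shown above. Gem names with a version suffix (like rails:4.1 in the TL;DR) are not handled here.

```ruby
# Hypothetical table: package name plus the "interpreter:abi" part that ends
# up inside rubygem(...). Values copied from the expansion shown above.
BUILD_TARGETS = [
  { package: "ruby2.1",     abi: "ruby:2.1.0" },
  { package: "ruby2.2",     abi: "ruby:2.2.0" },
  { package: "rubinius2.2", abi: "rubinius:2.2" },
].freeze

# Imitates %{ruby}: one package per interpreter.
def expand_ruby(targets = BUILD_TARGETS)
  targets.map { |t| t[:package] }.join(" ")
end

# Imitates %{rubydevel}: the matching -devel packages.
def expand_rubydevel(targets = BUILD_TARGETS)
  targets.map { |t| "#{t[:package]}-devel" }.join(" ")
end

# Imitates %{rubygem somegem >= someversion}: the gem name is the first word,
# an optional version constraint follows and is repeated for every target.
def expand_rubygem(args, targets = BUILD_TARGETS)
  gem_name, constraint = args.split(" ", 2)
  targets.map do |t|
    capability = "rubygem(#{t[:abi]}:#{gem_name})"
    constraint ? "#{capability} #{constraint}" : capability
  end.join(" ")
end

puts "BuildRequires:  #{expand_ruby}"
puts "BuildRequires:  #{expand_rubydevel}"
puts "BuildRequires:  #{expand_rubygem('somegem >= someversion')}"
```

Passing a shorter target list to the helpers imitates a build that only pulls in some of the interpreters.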

Most of our machinery for installing and cleaning up things was hidden in macros/shell scripts already so adding a loop in those places was trivial.

Suddenly we have multiple ruby interpreter/versions in the same build root and our gems build with each of them.

So far nothing that would break all that horribly when we push it onto d:l:r:e.

Rocks on the road

The problems started when we came to the %files sections in the spec files. Until now we had been generating them into the spec file. That means for every new ruby interpreter/major branch we would need to regenerate all spec files.

That is not really a viable solution going forward. So we replaced the static files section with a macro too.

%gem_packages

But this flexibility comes at a price. If building the gem fails for one ruby interpreter/version, it will break the build for all. But you can control which interpreters are even pulled into the build environment.

%define rb_build_versions %{rb_default_ruby}
BuildRequires: %{rubydevel}
BuildRequires: %{rubygem cheetah}

In home:darix:ruby this would expand to:

BuildRequires: ruby2.1-devel
BuildRequires: rubygem(ruby:2.1.0:cheetah)

The valid values for %rb_build_versions can be found in the macros files of each interpreter package. We would have loved to use the package names as the macro values, but those values are passed into macros again and macro names cannot contain dots. For easier reading you can look at the prjconf of home:darix:ruby. [2]

“Job done right?” “Well almost.”

There were all those little bits and pieces that had crept in by now. We had gems with %pre/%post scriptlets, because the gem was actually a somewhat better tarball for a service. Rubygems also lacked a way to express native buildrequires.

As it became clear that we will need to regenerate all the spec files in d:l:r:e, we aimed for a solution that allows us to regenerate the spec files at any time.

No more manually editing spec files

So we looked through a huge selection of spec files and checked what we actually modified. Once that list was compiled, we created a config file.

The config file is named gem2rpm.yml. Each field in the config file [3] has a matching hook in the spec file template [4] or files section template [5] (via %gem_packages).

gem2rpm was patched to support the config file. Right now you have to manually pass the config file, but there is a small shell wrapper [6] that checks if the config file exists and adds the option automatically.

With that in place you can regenerate all spec files without worrying about losing manual edits. There won't be any. And yes … in the development process this has been done multiple times after fixes to the spec file template.
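
What the wrapper does can be sketched in Ruby as well. This is not the actual shell script from [6]. The helper name gem2rpm_args is made up, and the option name "--config" is an assumption on my side, which is why it is a parameter and not hardcoded.

```ruby
# Build the gem2rpm command line and add the config option only when a
# gem2rpm.yml exists in the given directory.
# ASSUMPTION: "--config" stands in for whatever option the patched gem2rpm
# uses to take the config file; check the real wrapper before relying on it.
def gem2rpm_args(gem_file, dir: Dir.pwd, config_flag: "--config")
  args = ["gem2rpm"]
  config = File.join(dir, "gem2rpm.yml")
  args.push(config_flag, config) if File.exist?(config)
  args << gem_file
end

# Example: show the command that would be run for a gem in the current
# directory. Running it for real would be system(*gem2rpm_args(...)).
puts gem2rpm_args("tzinfo-0.3.37.gem").join(" ")
```

Keeping the decision in one place means nobody has to remember to pass the option by hand when regenerating a spec file.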

The final result?

#
# spec file for package rubygem-tzinfo-0
<snip>
# This file was generated with a gem2rpm.yml and not just plain gem2rpm.
# All sections marked as MANUAL, license headers, summaries and descriptions
# can be maintained in that file. Please consult this file before editing any
# of those fields
#

Name:           rubygem-tzinfo-0
Version:        0.3.37
Release:        0
%define mod_name tzinfo
%define mod_full_name %{mod_name}-%{version}
%define mod_version_suffix -0
BuildRoot:      %{_tmppath}/%{name}-%{version}-build
BuildRequires:  ruby-macros >= 5
BuildRequires:  %{ruby}
BuildRequires:  %{rubygem gem2rpm}
BuildRequires:  %{rubygem rdoc > 3.10}
Url:            http://tzinfo.rubyforge.org/
Source:         http://rubygems.org/gems/%{mod_full_name}.gem
Source1:        gem2rpm.yml
Summary:        Daylight-savings aware timezone library
License:        MIT
Group:          Development/Languages/Ruby

%description
TZInfo is a Ruby library that uses the standard tz (Olson) database to provide
daylight savings aware transformations between times in different time zones.

%prep

%build

%install
%gem_install \
  --doc-files="CHANGES LICENSE README" \
  -f

%gem_packages

%changelog

As you see this spec uses a gem2rpm.yml, which looks like this:

---
:version_suffix: '-0'
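
As a rough illustration of how such a config could feed into the package name, here is a sketch that loads the YAML and appends the suffix. It is not the actual gem2rpm template code, and package_name is a made-up helper. The keys in the file are Ruby symbols, so Symbol has to be permitted explicitly when loading it with safe_load.

```ruby
require "yaml"

config_text = <<~YAML
  ---
  :version_suffix: '-0'
YAML

# The leading colon makes the key a Symbol, which safe_load rejects unless
# the class is explicitly permitted.
config = YAML.safe_load(config_text, permitted_classes: [Symbol])

# Hypothetical helper: base name plus the optional suffix from the config.
def package_name(mod_name, config)
  "rubygem-#{mod_name}#{config.fetch(:version_suffix, '')}"
end

puts "Name: #{package_name('tzinfo', config)}"
```

With an empty config the suffix is simply left off, which matches the suffix being optional in the naming scheme.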

With that spec file the build generates for me:

rbx2.2-rubygem-tzinfo-0-0.3.37-9.4.x86_64.rpm
rbx2.2-rubygem-tzinfo-doc-0-0.3.37-9.4.x86_64.rpm
rbx2.2-rubygem-tzinfo-testsuite-0-0.3.37-9.4.x86_64.rpm
ruby2.1-rubygem-tzinfo-0-0.3.37-9.4.x86_64.rpm
ruby2.1-rubygem-tzinfo-doc-0-0.3.37-9.4.x86_64.rpm
ruby2.1-rubygem-tzinfo-testsuite-0-0.3.37-9.4.x86_64.rpm
ruby2.2-rubygem-tzinfo-0-0.3.37-9.4.x86_64.rpm
ruby2.2-rubygem-tzinfo-doc-0-0.3.37-9.4.x86_64.rpm
ruby2.2-rubygem-tzinfo-testsuite-0-0.3.37-9.4.x86_64.rpm

This has already been used extensively to build Discourse and GitLab for SLE 12.

What is left?

If you look just at the things that are packaged using the new way, we are done. But it would be nice to also support the versioning pattern that we have used up to 13.1. Maybe even support building against unversioned ruby versions. (Yes, in theory that is possible.)

Once we have taken those steps, we have to recreate all spec files in d:l:r:e. Yes, you read correctly … recreate all spec files. While we wrote all macros to work as a drop-in replacement, as soon as we want to build for more than one ruby version, we need to use %gem_packages. I really hope enough people step up to the effort so that the number of packages each person has to touch is kept in the low double digits. Most of the work will be extracting the manual bits into gem2rpm.yml files and then regenerating the spec file with the new template. A more detailed mail will be sent later.

Last but not least we can improve the templates (e.g. move duplicated code into gem2rpm).

Building non-gem-based ruby libraries is not covered by this new packaging at all. For one, many of the non-rubygem-based libraries are bindings built within a larger build process. We would need to manually do the loop over all ruby interpreters in those. I think in many cases we only need those libraries for our system ruby (e.g. yast). For the other cases we should work with upstream to switch the ruby bindings to rubygems.

Also not covered yet, but it would be really useful: All the ruby scripts shipped as part of the distro should use a shebang line pointing to the versioned binary. You might ask “why? /usr/bin/ruby is a hardlink!” While this is true … people can still replace it. Our scripts/programs should not break in that situation. E.g. yast is very critical in that regard. Again a gem based distribution makes it easier as rubygems does it for us in that case. But it shouldn’t be much work to add a small script to fix shebang lines and an rpmlint check that finds all shebang lines that are unversioned.
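
Such a script really could be small. The following is a sketch of the idea, not an existing tool: the function names are invented, and the versioned path /usr/bin/ruby.ruby2.1 is an assumption that simply follows the pattern of the gem binaries shown earlier.

```ruby
# Matches an unversioned ruby shebang ("#!/usr/bin/ruby" or
# "#!/usr/bin/env ruby"), optionally followed by interpreter arguments.
# Versioned ones like "#!/usr/bin/ruby.ruby2.1" do not match.
UNVERSIONED_RUBY = %r{\A#!\s*(?:/usr/bin/env\s+ruby|/usr/bin/ruby)(?=\s|\z)}

# The kind of test an rpmlint-style check could run on the first line.
def unversioned_shebang?(source)
  first_line = source.lines.first
  !first_line.nil? && first_line.chomp.match?(UNVERSIONED_RUBY)
end

# Returns the script text with the shebang pointing to versioned_binary.
# Anything that does not start with an unversioned ruby shebang is
# returned unchanged.
def fix_shebang(source, versioned_binary)
  first_line, rest = source.split("\n", 2)
  return source unless first_line && first_line.match?(UNVERSIONED_RUBY)

  fixed = first_line.sub(UNVERSIONED_RUBY, "#!#{versioned_binary}")
  rest ? "#{fixed}\n#{rest}" : fixed
end

script = "#!/usr/bin/ruby -w\nputs 'hello'\n"
# ASSUMPTION: /usr/bin/ruby.ruby2.1 is the versioned binary of the system ruby.
puts fix_shebang(script, "/usr/bin/ruby.ruby2.1")
```

Interpreter arguments such as -w are kept, and a script that already uses a versioned shebang is left alone, so running the fixer twice does no harm.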