Tuesday, May 12, 2020

Bazelizing Coda

I recently decided to spend most of my newly copious free time getting up to speed on blockchain, and in particular the Coda project.  Coda looks very promising; the zkSNARK stuff is about as close to "indistinguishable from magic" as it gets, and it's very recent, so I expect it will become more and more common in blockchain technologies. Plus the Coda folks are actively encouraging (and incentivizing) community participation in various ways, so I figure time spent working on it is time well spent.

My first order of business, aside from figuring out how to run a sandbox node (i.e. in a Docker container), was to build the thing on my Mac so that I could start fiddling with the code.  Unfortunately, I was unable to do that. They have instructions for building on a Mac, outside of Docker, but I immediately ran into problems. I started to debug the build process and quickly decided that I would rather throw caution to the wind and just bazelize the whole thing. Or at least spend a few days doing so, to get a feel for how much work a complete Bazelization would involve.

I have some previous experience doing this with a moderately complex C/C++ project, so I know that in most cases adding Bazel support is relatively easy. Furthermore, Bazel has good support for cross-platform builds (I said good, not simple), and I want to be able to build Linux and other binaries on my Mac.  A major selling point of the Coda protocol is that it is lightweight; if that is true, it should run on smallish systems - I'm thinking Raspberry Pi, Thingy 91, Khadas SBCs (Fuchsia!), etc. Support for cross-platform builds is essential.  And finally, the Coda build code is a bit on the ad-hoc side; it involves at least three build systems (Dune for OCaml, the main implementation language, plus Nix, plus the build systems of various dependencies), a bunch of shell scripts, Dockerfiles, etc.  Bazelization would reduce the complexity to a considerable degree, and also, in principle, improve reliability and quality.

Another incentive: bazelizing a codebase is a good way to learn its structure.

So off I went. As I expected, much of the work was pretty easy. I was able to Bazelize most of the C/C++ dependencies in about a day and a half - and that includes refreshing my memory, since I had not worked with Bazel for a couple of years. I did run into some problems, but I made enough progress over the course of a week to decide to finish and polish the thing.  This is the first in a series of articles describing what I did and how I dealt with Bazelization, in case it may be helpful either to Coda peoples or Bazel peoples.

There are two phases. Phase I is to bazelize the C/C++ dependencies, and Phase II is to bazelize the OCaml part.  Phase I should be relatively easy since I have some experience in that area; dunno about Phase II, since I don't have any experience with OCaml, and have only dabbled in writing the kind of Bazel rules needed to support it.

Phase I: C/C++ libraries

Phase I involves several steps:

  • Local native builds
    • get local builds working on my machine (Mac Catalina)
    • get local builds working on Linux and Windows.  The former should be trivial, given working Mac builds.  I think Bazel has pretty good Windows support these days, so that should only be a bit more work, maybe sorta.
  • Cross-platform builds
    • From host X targeting Y, with various HW architectures, etc.
      • Priorities: start with host Mac targeting Linux on x86_64 and ARM (Raspberry Pi, Android)
      • Support Android NDK and at least one musl-based toolchain
    • Support for a variety of toolchains
    • Easy extensibility - it should be easy to add another target toolchain.
    • Use the latest Bazel facilities (e.g. platforms and toolchains).
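
To make the "platforms and toolchains" item concrete, here is a minimal sketch of what target platform declarations can look like. The package name (platforms/) and the target names are hypothetical, chosen for illustration; the constraint labels come from Bazel's standard @platforms repository.

```starlark
# platforms/BUILD  (hypothetical package; names are illustrative only)

# Linux on x86_64.
platform(
    name = "linux_x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

# Linux on 64-bit ARM, e.g. a Raspberry Pi running a 64-bit OS.
platform(
    name = "linux_arm64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)
```

A build would then be requested with something like "bazel build --platforms=//platforms:linux_arm64 //:libsodium". That only works if a C/C++ toolchain compatible with those constraints has been registered, and depending on the Bazel version, platform-based C++ toolchain resolution may have to be switched on explicitly (--incompatible_enable_cc_toolchain_resolution).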

A. Local native builds

Here is the list of direct dependencies:
  • boost
  • libffi
  • libgmp
  • libpatch
  • libprocps
  • libsodium
  • jemalloc
  • libomp
  • openssl
  • libpq
  • libsnark
  • zlib

Boost was easy, since somebody already took the trouble to bazelize it and make the code available as a library of Bazel rules (rules_boost). All you have to do is declare the GitHub repo as a "git_repository" in your Bazel WORKSPACE file; then using a Boost module is as easy as declaring it as a dependency in your cc_library target like so:  deps = ["@boost//:algorithm"]. Isn't that awesome?
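
For reference, a sketch of the rules_boost setup. It assumes the nelhage/rules_boost repository and the boost_deps() macro it documents; the commit value is a placeholder to be pinned to a real hash, and the cc_library target and source file are hypothetical.

```starlark
# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

git_repository(
    name = "com_github_nelhage_rules_boost",
    remote = "https://github.com/nelhage/rules_boost",
    # Placeholder: pin this to a real commit hash from the rules_boost repo.
    commit = "<commit-hash>",
)

load("@com_github_nelhage_rules_boost//:boost/boost.bzl", "boost_deps")

# Declares the @boost repository along with the archives it needs.
boost_deps()

# BUILD
cc_library(
    name = "uses_boost",        # hypothetical target
    srcs = ["uses_boost.cpp"],  # hypothetical source file
    deps = ["@boost//:algorithm"],
)
```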

Most of the other libs were straightforward, but a few took a little work. There are basically two ways to add Bazel support to a third party library from the outside. You can write BUILD files containing the target recipes needed to build the code, or, if the library already contains a build system, you can have Bazel run it.  This used to be rather a pain, but at some point in the last few years Bazel added support for the very common configure-make and CMake build systems in the form of the rules_foreign_cc library.

A good example of the use of this library involves libsodium. Release versions of this library come with the standard "configure" shell script, and the build instructions are to run "./configure" and then "make".  The configure_make rule defined in rules_foreign_cc does this for you. So all you need to do to use libsodium is register it as an external repository in your WORKSPACE file.  First grab the rules_foreign_cc library:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_foreign_cc",
    strip_prefix="rules_foreign_cc-master",
    url = "https://github.com/bazelbuild/rules_foreign_cc/archive/master.zip",
    sha256 = "55b7c4678b4014be103f0e93eb271858a43493ac7a193ec059289fbdc20b9023",
)
load("@rules_foreign_cc//:workspace_definitions.bzl", "rules_foreign_cc_dependencies")
rules_foreign_cc_dependencies()

Then grab libsodium:


all_content = """filegroup(name = "all", srcs = glob(["**"]), visibility = ["//visibility:public"])"""
http_archive(
  name="libsodium",
  type="zip",
  url="https://github.com/jedisct1/libsodium/archive/1.0.18-RELEASE.zip",
  sha256="7728976ead51b0de60bede2421cd2a455c2bff3f1bc0320a1d61e240e693bce9",
  strip_prefix = "libsodium-1.0.18-RELEASE",
  build_file_content = all_content,
)


Then in your BUILD file load the rules library and use the configure_make rule it defines:

load("@rules_foreign_cc//tools/build_defs:configure.bzl", "configure_make")
configure_make(
    name = "libsodium",
    configure_env_vars = { "AR": "" }, ## macos needs this
    lib_source = "@libsodium//:all",
    out_lib_dir = "lib",
    shared_libraries = ["libsodium.dylib"], ## macos
    visibility = ["//visibility:public"],
)
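
The two "## macos" settings above tie the target to my Mac. One way to make the same target portable is Bazel's select(); the following is an untested sketch under that assumption. The config_setting name is hypothetical, and libsodium.so is my guess at the Linux library name.

```starlark
load("@rules_foreign_cc//tools/build_defs:configure.bzl", "configure_make")

# Hypothetical setting that matches when the target platform is macOS.
config_setting(
    name = "macos",
    constraint_values = ["@platforms//os:macos"],
)

configure_make(
    name = "libsodium",
    # Only override AR on macOS; elsewhere leave the environment alone.
    configure_env_vars = select({
        ":macos": {"AR": ""},
        "//conditions:default": {},
    }),
    lib_source = "@libsodium//:all",
    out_lib_dir = "lib",
    # The shared library name differs per OS (assumed: .so outside macOS).
    shared_libraries = select({
        ":macos": ["libsodium.dylib"],
        "//conditions:default": ["libsodium.so"],
    }),
    visibility = ["//visibility:public"],
)
```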

Now you can add it as a dependency wherever you need it, e.g. in test/libsodium/lib/BUILD:

cc_library(
    name = "test_libsodium",
    srcs = glob(["*.cpp"]),
    hdrs = glob(["*.h"]),
    deps = ["//:libsodium"], # meaning, the libsodium target in the BUILD file at the project root
    visibility = ["//visibility:public"]
)

Easy-peasy. OpenSSL and some other libs also work like this.  Unfortunately not all configure-make packages are this easy. Sometimes they ship with autogen.sh but not the configure file it generates, in which case Bazel's configure_make rule will do you no good on its own. Such is the case with jemalloc.  But Bazel provides a rule called genrule (general rule) that allows us to deal with this situation. Briefly, you use genrule to run autogen.sh, and then list its output as input to the configure_make rule. Works great - but you do have to inspect the code to figure out how to write the genrule.  Bazel insists that all the inputs and all the outputs of a genrule must be explicitly listed (this is an annoying but good thing, since it helps guarantee reproducible builds), and since different libs will have different files, you have to write the genrule by hand.  This was necessary for jemalloc and libffi.
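
To give a feel for the "explicit inputs and outputs" requirement, here is a generic sketch of such a genrule for a hypothetical library. It is not the rule I wrote for jemalloc or libffi. It assumes autoconf is installed on the build host, and it calls autoconf directly (the step that autogen.sh scripts typically use to produce configure) rather than running autogen.sh itself. Wiring the generated file into configure_make depends on the library and is not shown.

```starlark
# BUILD content for a hypothetical external library whose release
# archive contains configure.ac but no generated configure script.
genrule(
    name = "gen_configure",
    # Every file the command reads must be listed here; a real library
    # may also need its m4 macro files and other autotools inputs.
    srcs = ["configure.ac"],
    # Every file the command writes must be listed here.
    outs = ["configure"],
    # $@ expands to the single declared output; $(location ...) expands
    # to the path of the named input.
    cmd = "autoconf --output=$@ $(location configure.ac)",
)
```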

The rules_foreign_cc library also supports CMake in the form of a cmake_external rule. That works for libomp (openmp).
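
A sketch of what that looks like, by analogy with the libsodium target above. The @openmp repository name is an assumption (it would be declared with http_archive and the same all_content filegroup as libsodium), and the cache entry is just an illustration of how CMake variables are passed.

```starlark
load("@rules_foreign_cc//tools/build_defs:cmake.bzl", "cmake_external")

cmake_external(
    name = "libomp",
    # Assumed external repository exposing an "all" filegroup, as for libsodium.
    lib_source = "@openmp//:all",
    # Passed to CMake as -D<key>=<value> definitions.
    cache_entries = {
        "CMAKE_BUILD_TYPE": "Release",
    },
    shared_libraries = ["libomp.dylib"],  # macOS library name
    visibility = ["//visibility:public"],
)
```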

Debugging a build failure

libsnark is a little tricky.  It uses CMake, so we use cmake_external as above, but out of the box it fails.  Building with --verbose_failures --subcommands --sandbox_debug, the output is:

$ bazel build //:libsnark
...
rules_foreign_cc: Build script location: bazel-out/darwin-fastbuild/bin/libsnark/logs/CMake_script.sh
rules_foreign_cc: Build log location: bazel-out/darwin-fastbuild/bin/libsnark/logs/CMake.log

Target //:libsnark failed to build
(02:07:58) INFO: Elapsed time: 2.371s, Critical Path: 1.93s
(02:07:58) INFO: 0 processes.
(02:07:58) FAILED: Build did NOT complete successfully


So we look in bazel-out/darwin-fastbuild/bin/libsnark/logs/CMake.log and find:

-- Found PkgConfig: /usr/local/bin/pkg-config (found version "0.29.2")
-- Checking for module 'libcrypto'
--   No package 'libcrypto' found

Hmm. We saw a message about building OpenSSL while the build was running. Let's check the working area.  Bazel builds stuff in its own tmp dirs. You can find them listed in the logfile we just examined (CMake.log in this case; config.log in the configure_make case). Here's an example:

$ less bazel-out/darwin-fastbuild//bin/libsnark/logs/CMake.log
Bazel external C/C++ Rules #0.0.8. Building library 'libsnark'
Environment:______________
DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer
TMPDIR=/var/folders/wz/dx0cgvqx5qn802qmc3d4hcfr0000gp/T/
SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk
EXT_BUILD_ROOT=/private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__
XCODE_VERSION_OVERRIDE=11.4.1.11E503a
INSTALLDIR=/private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__/bazel-out/darwin-fastbuild/bin/libsnark
__CF_USER_TEXT_ENCODING=0x1F6:0x0:0x0
PATH=/private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__:/usr/gnu/bin:/usr/local/bin:/bin:/usr/bin:.
BUILD_TMPDIR=/var/folders/wz/dx0cgvqx5qn802qmc3d4hcfr0000gp/T/tmp.Y34P4w2t
PWD=/private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__
EXT_BUILD_DEPS=/var/folders/wz/dx0cgvqx5qn802qmc3d4hcfr0000gp/T/tmp.rF8INNUy
SHLVL=2
BUILD_LOG=bazel-out/darwin-fastbuild/bin/libsnark/logs/CMake.log
BUILD_SCRIPT=bazel-out/darwin-fastbuild/bin/libsnark/logs/CMake_script.sh
APPLE_SDK_PLATFORM=MacOSX
APPLE_SDK_VERSION_OVERRIDE=10.15
_=/usr/bin/env

Since this is an external lib, we want to look in EXT_BUILD_ROOT:

$ find /private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__ -name 'libcrypto*'
/private/var/tmp/_bazel_gar/3367fabd230435c540fea97e1a70bf66/sandbox/darwin-sandbox/10/execroot/__main__/bazel-out/darwin-fastbuild/bin/copy_openssl/openssl/lib/libcrypto.a

And there it is. Why couldn't CMake find it?  I dunno. Maybe because it was consulting /usr/local/bin/pkg-config. That wouldn't work, since Bazel builds libcrypto in its own little sandbox, where the system pkg-config has no reason to look. So maybe this is a weakness in the cmake_external rule, or maybe I haven't configured it properly.

In any case, I decided that rather than debug this, I would bazelize libsnark.  Mainly because I figure that would be a good way to get to know a little more about libsnark, which is used not only by Coda, but also by Zcash, and presumably by other blockchain projects.  How many SNARK implementations can there be, after all?

libsnark depends on various libs as well: xbyak, ate-pairing, libff, libfqfft, and some others. I've got most of them done.  It turns out doing this was a good idea, because it exposed problems that did not occur with Coda's deps.  For instance libgmp builds just fine, until you --enable-cxx. Then you have a problem.  The fix is simple, but it took me the better part of a day to find it, haha.

So the current status is that most of this stuff is bazelized, at least for me, on my Mac. You can get an idea of what it looks like at xbyak and ate-pairing.  What I'm now working on is support for the newer Bazel stuff like platforms and toolchains, which includes support for local native builds on Linux and Windows.  Once that is a little further along I'll push it to GitHub and write a followup article with links.