Thursday, July 7, 2016

Java cacerts on Intel Edisons and Gateways

The Intel Edison ships with loads of CA certificates in /etc/ssl/certs, and it also ships with Java 8 (OpenJDK), but it does not ship with a pre-configured cacerts file for Java. Ditto for the Wind River Linux installation that is preinstalled on the Dell 3290 gateway device.

So if you try to use Java to access something over HTTPS you'll get a security exception. For example, if you want to run Clojure using the excellent boot build tool, the first time you try "$ boot repl" on either WRLinux/Gateway or Yocto/Edison, you're going to get an exception along the lines of:

Exception in thread "main"
java.lang.RuntimeException: Unexpected error:
the trustAnchors parameter must be non-empty

The problem is that although both systems ship with a large set of CA certificates, as well as preinstalled OpenJDK 1.8, the latter is not configured to use the former out of the box.  Since I virtually never have to deal with this sort of thing it took me a couple of hours to figure out what the problem was and how to fix it.  All you have to do is create a Java “trust store” by running a Java utility called keytool once for each certificate you want to add to the trust store. (A trust store is analogous to a key store; the latter is where you keep your keys, the former is where you keep public certificates of trust; if you’re a glutton for punishment take a look at Configuring Java CAPS for SSL Support.)
Here’s what you need to know to make sense of that:
  • Java is installed at /usr/lib64/jvm (WRLinux on the Gateway), or /usr/lib/jvm (Yocto on Edison)
  • The standard place for the Java truststore is $JAVA_HOME/jre/lib/security/cacerts. Note that cacerts is a file, not a directory. You create it with keytool.
  • keytool is located in $JAVA_HOME/jre/bin
  • The certificates are in /etc/ssl/certs, which is a directory containing soft links to /usr/share/ca-certificates
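Putting those pieces together, importing a single certificate looks something like this. The alias, the certificate filename, and the conventional "changeit" store password are illustrative assumptions; keytool creates the cacerts file on the first import:

```
$ cd /usr/lib/jvm/java-8-openjdk/jre/lib/security    # lib64 on the Gateway
$ ../../bin/keytool -importcert -noprompt -trustcacerts \
      -alias example-ca -file /etc/ssl/certs/example-ca.pem \
      -keystore cacerts -storepass changeit
```

Repeating that once per certificate is exactly the drudgery the helper script described below automates.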

Now you just need something to save you the drudgery of installing everything in /etc/ssl/certs into the Java trust store, since keytool must be given an individual file rather than a directory of files as input. Fortunately, somebody already wrote that script for us. You can find it at Introduction to OpenJDK in the subsection Install or update the JRE Certificate Authority Certificates (cacerts) file. I just copied that to ~/bin/mkcacerts, chmodded it to make it executable, and then created a one-time helper:


# use this for Yocto/Edison:
# LIB=lib
# use this for WRLinux/Gateway:
LIB=lib64

# move any existing cacerts file out of the way first
if [ -f /usr/$LIB/jvm/java-8-openjdk/jre/lib/security/cacerts ]; then
  mv /usr/$LIB/jvm/java-8-openjdk/jre/lib/security/cacerts \
     /usr/$LIB/jvm/java-8-openjdk/jre/lib/security/cacerts.orig
fi

# if you have a ca-certificates.crt file, use this:
# -f "/etc/ssl/certs/ca-certificates.crt"
# otherwise use
# -d "/etc/ssl/certs/"
./mkcacerts                 \
        -d "/etc/ssl/certs/"           \
        -k "/usr/$LIB/jvm/java-8-openjdk/bin/keytool"      \
        -s "/usr/bin/openssl"          \
        -o "/usr/$LIB/jvm/java-8-openjdk/jre/lib/security/cacerts"

There is one tricky bit: notice the "-d" parameter in the script above. Use that if /etc/ssl/certs contains certificates but no "ca-certificates.crt" - this was the case in earlier versions of the Edison. But if you find /etc/ssl/certs/ca-certificates.crt, then replace that line as noted in the comment.

Now cd to ~/bin and run the helper script. You'll see a bunch of alarming-looking output as it churns through the certificates, adding them to the Java truststore (cacerts file). It takes a while, but once it's done you should be good to go with Java and HTTPS.

Tuesday, July 5, 2016

Why I Still Hate Hate Hate Build Systems

You might think that by 2016 software and systems developers would have figured out build systems. But no.  Build systems, along with package management and configuration, remain the bane of the developer's, not to mention the user's, existence.

I've worked with many such systems, from the venerable make/autotools system, to CMake, to SCons, to Ant, Maven, Gradle (Boo!) and who knows what else. I hate them all.  They are all misbegotten monstrosities that only their mothers could love.

Case in point: I want to use the tinyb Bluetooth LE library on the Intel Edison and on a Dell 3290 gateway device.  You'd think that would be easy enough - download, configure, make, install, use. The project uses CMake, which ought to be even better than Make.

But like all build systems, CMake works just great - until it doesn't.  In this case:

  • It started out by offering an inscrutable error message:  "A required package was not found". I'm not making this up, that is the message.  Which package?  Why so coy, CMake?  Cantcha just say???
  • So you look in the error log, and you see that a check for Pthreads failed. But you know your system has pthreads, so you flail around for a while, then notice it failed because the test command specified not "-lpthread", but "-lpthreads".  Whaa? A bug in CMake? An hour and a half later you finally realize that was a spurious report - the error message was not caused by the Pthreads test but by the check that occurred just before the error message. You proceed to feel both very stupid, because you should have thought of that immediately, and very, very pissed that the morons in charge of CMake could have saved you the trouble by simply saying what the problem is.  I dunno, maybe something like "A required package was not found: gio-unix-2.0>=2.40"?  Izzat so hard?
  • And the way you actually figure out that gio-unix-2.0 was the problem is by creating a test project and painstakingly going through the CMakeLists.txt file line by line, trial-and-error. You begin to fear you might have an aneurysm.  You are glad the CMake devs are not within throttling distance, lest you squeeze their throats very hard.
  • But, but, bu bu - my system does so have gio-unix-2.0!  It's right there! (Of course, you waste a good 20-30 minutes googling around trying to figure out what exactly gio-unix is).
  • More trial and error.  Maybe it's the version number.  You try a bunch, nothing makes a difference, the test always fails.
  • Finally, you inspect the pkg-config file for gio-unix-2.0. Everything looks hunky dory. Except for that @VERSION@ thingie there, it looks a little fishy, but you are not a pkg-config expert, you assume that it must be there for a reason.  But then you look at the related gio-2.0 file, and there in the version field is an actual number!  2.48.1, to be exact!
  • So finally it dawns on you: the person whose neck you need to squeeze, hard, mercilessly, is the person who installed the broken pkg-config file for gio-unix-2.0, the one that didn't get the version number, so it fails all checks.
  • But the CMake devs do not get away so easily!  It's still a flaw in their product - when they test for a library with a version >=2.40, and the version they get is obviously broken - not a version number at all - they should throw a "broken config file" exception, not a "package not found" error.  It's just false to say the package was not found! And it would also be false to say that the package found had too low a version number! It did not fail the version test, it was incapable of even taking the version test, and that is the exception.
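For the record, the failing check boils down to something like the following CMakeLists.txt fragment - a guess at the relevant lines for a minimal test project, not tinyb's actual file.  pkg_check_modules with REQUIRED is what emits the unhelpful "A required package was not found":

```cmake
# Hypothetical reduction of the failing check; the module spec
# (gio-unix-2.0>=2.40) is the one uncovered by the error hunt above.
find_package(PkgConfig REQUIRED)
pkg_check_modules(GIO_UNIX REQUIRED gio-unix-2.0>=2.40)
message(STATUS "found gio-unix-2.0 version: ${GIO_UNIX_VERSION}")
```

Dropping REQUIRED and printing ${GIO_UNIX_FOUND} and ${GIO_UNIX_VERSION} yourself is one way to see what the check actually got back.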

The real problem with build systems is that they do not treat the build engineering problem as a software development problem. To manage builds you need a real programming language, and you need to write your build scripts with all the care you would use in writing critical C code.  There is actually one project out there that takes this approach, the only one I know of: boot. Whereas previous build systems are of the "build a better mousetrap" variety, so they're all pretty much the same thing, boot represents a fairly radical re-conceptualization of what building software is all about.


Both the Intel Edison and the Dell 3290 (Intel IoT Gateway) with Wind River Linux, being embedded systems, run a Yocto-based OS.  The package manager for Yocto is opkg, not apt nor rpm nor any of the other PMs one typically encounters in full-featured Linux environments.  Unfortunately, good information on how to correctly use opkg is not easy to find.  Here's what worked for me:

Here "Edison" means an Intel Edison module updated to the latest image as of June 2016, and "Gateway" means a Dell/Wyse 3290 running Wind River Linux version 6.  Note that the Intel IoT Gateway platform also supports Ubuntu and Windows OSs.

opkg homepage:


Config file:

  • Edison:
  • Gateway:  /etc/opkg/opkg.conf
Here's what I used on the Gateway:

# opkg.conf

dest root /
dest ram /tmp
lists_dir ext /var/opkg-lists
option overlay_root /overlay
src/gz x86_base
src/gz x86_packages
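With the config file in place, basic usage follows the usual package-manager pattern. The package name here is just an example; use whatever opkg list shows you:

```
$ opkg update              # fetch the package lists from the src/gz feeds
$ opkg list | grep bluez   # search the available packages
$ opkg install bluez5      # example package name
$ opkg info bluez5
```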

Thursday, June 2, 2016

Zephyrizing the Curie

Disorganized Notes on Getting started with Zephyr on the Arduino 101

Ubuntu Linux:

* forget the JTAG stuff
* install the latest dfu-util (and libusb)

$ gpg --verify dfu-util-0.9.tar.gz.md5.asc 
$ tar -zxvf dfu-util-0.9.tar.gz 
$ cd dfu-util-0.9
$ ./configure
configure: error: *** Required libusb-1.0 >= 1.0.0 not installed ***
$ sudo apt-get install libusb-1.0-0-dev
$ ./configure
$ make
$ sudo make install
$ dfu-util -V
dfu-util 0.9

* install the Arduino IDE (>= 1.6.7)
* install the Curie board in the IDE (Tools->Board -> Boards Manager)

  (Note: Tools->Board will show a selected board so it doesn't look like a list of boards. Click on it anyway to see the list and the Boards Manager option)

The list of boards will then include "Arduino/Genuino 101".

As a side effect a hidden directory ~/.arduino15/ will be created.  For dfu-util to work a udev rule must be set up.  A shell script is included to do this, in


Execute that script (using sudo) and you'll be ready to flash.  To do that, you have to reset the board and then run your dfu-util command within 5 seconds.

"The MASTER RESET is intended to reset both the cores of Curie chip, while RESET will only reset the sketch (though the software doesn't support that yet, so they both reset the whole board).
When pressing MASTER RESET the board will enter DFU and you should see an entry in your device manager similar to this one"
  • minimum version of dfu-util required, and instructions for obtaining and installing
  • the required udev rule
  • required group memberships - my UID was a member of plugdev (from the old instructions) and dialout (required by Arduino IDE).  But when I removed myself from those groups I was still able to flash so I guess they're not required
  • "flash mode"
    •  if you press the master reset button, then wait > 5 sec, the flash will fail with message "No DFU capable USB device available"
    • if you press the master reset and flash within 5 sec, it will work
    • thereafter, you can flash at will, until you hit the master reset again
    • so I would say there is a "flash" mode, or something like that, which you enter if you flash within 5 secs of a master reset
    • in flash mode, if you press the plain reset, you do not have to flash within 5 seconds
    • exit flash mode by pressing master reset (and not flashing)
    • once in no-flash mode, the only way to flash is via the master reset button, the plain reset won't work that way
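So in practice the flash cycle looks something like this. The alt-setting name x86_app and the image filename are assumptions; dfu-util -l shows the actual alt settings for your board:

```
$ dfu-util -l                          # press MASTER RESET, then run this within 5 seconds
$ dfu-util -a x86_app -D zephyr.bin    # flash the x86 application image
```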

It should be possible to get started without installing the Arduino IDE, by using the script create_dfu_udev_rule:

    if [ "$(id -u)" != "0" ]; then
       echo "This script must be run as root"
       exit 1
    fi

    # the rules filename was lost from these notes; any *.rules name works
    NAME=99-arduino-101.rules

    echo >/etc/udev/rules.d/$NAME
    echo \# Arduino 101 in DFU Mode >>/etc/udev/rules.d/$NAME
    echo SUBSYSTEM==\"tty\", ENV{ID_REVISION}==\"8087\", ENV{ID_MODEL_ID}==\"0ab6\", MODE=\"0666\", ENV{ID_MM_DEVICE_IGNORE}=\"1\", ENV{ID_MM_CANDIDATE}=\"0\" >>/etc/udev/rules.d/$NAME
    echo SUBSYSTEM==\"usb\", ATTR{idVendor}==\"8087\", ATTR{idProduct}==\"0aba\", MODE=\"0666\", ENV{ID_MM_DEVICE_IGNORE}=\"1\" >>/etc/udev/rules.d/$NAME

    udevadm control --reload-rules
    udevadm trigger

    Intel firmware page:  (source available)

    Wednesday, June 1, 2016

    Iotivity on the Intel IoT Gateway

    Iotivity is "an open source software framework enabling seamless device-to-device connectivity to address the emerging needs of the Internet of Things."  It has the backing (as in paid programmers) of some of the major players, including Intel and Samsung, to name just two.

    You probably wouldn't want to build Iotivity on an IoT gateway machine; it is designed as embedded software, after all, so you'd cross-compile it on a more powerful machine and then install it.  Or rather, you'd build a platform image in the proper way, but that is beyond the scope of this article (I'll blog about it later).  But if you're working on Iotivity itself, and if you're pulling changes on the bleeding edge, you need to build it by hand. Here's how.

    The instructions for Linux on the Iotivity website are for Ubuntu LTS 12.04.  They assume that apt-get is installed, which is not the case with WindRiver Linux on a gateway.  WRLinux does have rpm, but I'm not yet familiar with that so let's do it by hand.

    WARNING: this is a little bit messy at the moment, as I'm composing it as my build proceeds, but it should be of some use.  I'll clean it up a bit later.

    Note that the build scripts are supposed to do some of this stuff automatically (e.g. download and compile some of the deps).  Which they do, on Ubuntu.  I had some trouble getting them to work on WRLinux (also OS X), plus the build scripts make some assumptions that only work on Ubuntu. So I decided to pull out most (all?) of the dependency stuff and handle it separately, manually.  Once you have that stuff installed the build seems to proceed smoothly.

    • tinycbor - see below
    • gtest-1.7.0
      • the build scripts use, but google has switched to github:
    • bzip2 - only needed for downloading other deps?
    • uuid
    • libicu
    • gio - BLE uses gdbus.  deps:
      • glib - low-level C routines etc. from the Gnome project
        • libffi - Foreign Function Interface
          • - download zipfile, ./configure, ...
        • libpcre - Perl Compatible Regular Expressions
    • python 2.7.*
    • scons
    1. Python 2.7.* - should be already installed.  Needed for scons
    2. Install bzip2
      1. $ wget
      2. verify md5 checksum:  $ md5sum ...
      3. $ tar -zxvf bzip2-1.0.6.tar.gz
      4. $ cd bzip2-1.0.6
      5. $ sudo make install
    3. Install libuuid
      1. $ wget
        1. Do NOT use OSSP uuid, it seems to be missing some fns
      2. verify integrity - on Sourceforge you can find the checksums in the file browser; see Verifying Downloaded Files.
      3. $ tar -zxvf libuuid-1.0.3.tar.gz
      4. $ cd libuuid-1.0.3
      5. $ ./configure
      6. $ make
      7. $ sudo make install
    4. Install libicu
      1. e.g. $ wget
      2. build it
    5. Install libpcre (
      1. $ wget
      2. $ wget
      3. $ gpg --verify pcre-8.38.tar.bz2.sig
      4. $ gpg --recv-keys FB0F43D8 (or whatever key is shown by the --verify op)
      5. $ tar -jxvf pcre-8.38.tar.bz2
      6. $ ./configure --enable-utf --enable-unicode-properties
      7. $ make && sudo make install - puts libs in /usr/local/lib
    6. Install libffi
    7. Install glib - this may be a pain, see Notes on glib below.
      1. blah....
      2. $ ./configure
        1. Unfortunately the system pcre lib in /usr/lib64 is apparently not compiled with unicode support so ./configure will not finish.  I linked /usr/lib64/ to /usr/local/lib.  I dunno if that's a good idea, but it got me thru ./configure
    8. Install boost
      1. $ wget
      2. verify sha256 hash:  $ sha256sum ...
      3. $ tar -jxvf boost_1_61_0.tar.bz2
      4. $ cd boost_1_61_0
      5. $ ./ --with-libraries=system,filesystem,date_time,thread,regex,log,iostreams,program_options --prefix=/usr/local
      6. $ ./b2
      7. go make a cup of coffee, the build takes a long time
      8. $ sudo ./b2 install
    9. Install scons
      1. $ wget
      2. see  Building and Installing SCons on Any System
    10. Build Iotivity
      1. Download:  $
        1. NOTE: download the zip file and check the sha256 sum.  As of the time of writing, the tar.gz file does NOT match its checksum!
        2. $ unzip
        3. $ cd iotivity-1.1.0
        4. $ scons
    The build takes quite a while.


    At the very beginning you will see a message like this:

    *********************************** Error: *************************************
    * Please download cbor using the following command:                               *
    *     $ git clone extlibs/tinycbor/tinycbor *

    But when you try to do this you will get an error:

    fatal: unable to access '': Problem with the SSL CA cert (path? access rights?)

    Gateways running WRLinux come with lots of certificates (see /etc/ssl/certs) but apparently not the one we need for git cloning.  The workaround is to use wget.  Go to the tinycbor git repo in your browser, click on the "Clone or download" button, then right click on the "Download zip" link to copy the link address.  On the gateway, cd to iotivity-1.1.0/extlibs/tinycbor, wget the zip file, and unzip it.  You'll get a directory called tinycbor-master.  Rename this to tinycbor, and then return to the iotivity-1.1.0 dir and rerun scons.
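    As a transcript, the workaround looks something like this (substitute the zip URL you copied from the browser; the zip filename may differ):

```
$ cd iotivity-1.1.0/extlibs/tinycbor
$ wget <zip URL copied from the "Download ZIP" link>
$ unzip master.zip
$ mv tinycbor-master tinycbor
$ cd ../.. && scons
```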

    Other notes:

    $ scons -c  seems to hang when it gets to "scons: Cleaning targets ..."  ???

    Notes on glib:

    The bluetooth le stuff for linux depends on the Bluez stack, which
    depends on gdbus, which depends on gio, then glib, etc.

    So we install glib-2.0, and then the trouble starts.  The pkg-config
    files go in /usr/local/lib/pkgconfig, but when scons runs pkg-config
    it only uses the default dir, namely /usr/lib64/pkgconfig.  I don't
    know how to get it to also look in /usr/local, you can't pass stuff
    via env vars.  So the quick-n-dirty is to copy the relevant *.pc
    files.  Ditto for the headers; copy /usr/local/include/gio-unix-2.0 to
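    When invoking pkg-config directly, the PKG_CONFIG_PATH environment variable does extend the search path without copying anything; the copying is only needed here because the scons-driven build didn't pick it up:

```
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig pkg-config --modversion gio-unix-2.0
```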

    Sunday, May 29, 2016

    IoT resources

    Linux tools

    Wednesday, May 11, 2016

    Embedded programming

    Excellent talk by John Light: Embedded Programming for IoT.  Light is involved in Iotivity development.

    He talks a little bit about coroutines but the talk focuses mostly on memory management in very constrained environments.  He has also written about this topic: blog:

    Heap Allocation on the Internet of Things

    IoT Memory Management: A Case Study

    On the Iotivity wiki:

    Memory Management Design

    Memory Management Design ii

    Memory Management Design iii

    Wednesday, April 20, 2016

    TikZ (PGF) Packages

    I haven't found an up-to-date list of packages that use/support the wonderful PGF/TikZ library, so I went through the LaTeX packages in the latest TeX Live and found these.  It's not exhaustive but it's better than nothing.  I excluded some libraries that use pgf/tikz under the covers (e.g. chemmacros).


    • aobs-tikz  -  overlayed pictures in beamer
    • bloques -  control blocks
    • blox - block diagrams
    • bondgraphs - 
    • braids - braid diagrams
    • celtic - celtic-style knots
    • circuitikz - electrical and electronic networks.  NB: tikz 3.0.1a has builtin support for some circuit libraries that was inspired by this library.
    • csvsimple - uses key-value syntax from pgfkeys
    • diadia - diabetes diary
    • endiagram - potential energy curve diagrams
    • forest - linguistic trees
    • hf-tikz  -  highlight formulas
    • hobby - a tikzlibrary implementing John Hobby's algorithm for cubic Bézier curves
    • makeshape - method of creating custom shapes in PGF
    • mframed  - Framed environments that can split at page boundaries
    • ocgx - Optional Content Group support (layers)
    • pgf-blur - blurred shadows
    • pgf-soroban
    • pgf-umlcd
    • pgf-umlsd
    • pgfgantt
    • pgfkeyx
    • pgfopts
    • pgfornament
    • pgfplots
    • prooftrees  -  based on forest
    • sa-tikz "a library to draw switching architectures"
    • spath3
    • tikz-3dplot
    • tikz-cd  -  commutative diagrams
    • tikz-dependency - dependency diagrams for linguistics
    • tikz-dimline  -  technical dimension lines (e.g. showing a widget is 4cm wide, etc)
    • tikz-feynman
    • tikz-inet  -  interaction nets
    • tikz-opm  -  Object Process Methodology diagrams
    • tikz-plattice  -  particle accelerator lattices
    • tikz-qtree -  2012.  NB: tikz 3.0.1a (and maybe earlier) has extensive built-in support for trees
    • tikz-timing - timing diagrams
    • tikzinclude - importing images
    • tikzmark  -  mark a point in a page for further use
    • tikzorbital - molecular diagrams
    • tikzpagenodes - extending current page nodes
    • tikzpfeile  -  arrows
    • tikzposter - generate scientific posters
    • tikzscale - absolute resizing
    • tikzsymbols - great collection of symbols that started with cookbook symbols and then grew
    • tkz-orm - Object-Role Model drawing.  NB: *not* part of the TKZ collection.
    • visualtikz - excellent visual documentation for tikz: extensive collection of simple examples showing syntax and result

    The Altermundus TKZ Collection

    This collection of PGF/TikZ libraries by the prolific Alain Matthes is so awesome it deserves its own section.  Some of the documentation is in French only.  (Note the spellings, tkz v. tikz)
    • alterqcm  -  two-column multiple choice questions  (Fr)
    • tikzrput - macro for placing stuff
    • tkz-base
    • tkz-berge - graphs (En)
    • tkz-euclide - an amazing library for doing good old fashioned geometry.   The doc is in French but if you know TikZ you can probably figure it out; if not, it's a very good reason to learn French.
    • tkz-fct - 2d function plotting
    • tkz-graph  - more graphs (Fr)
    • tkz-linknodes - links between lines of an environment (En)
    • tkz-kiviat
    • tkz-tab  -  very fancy table support (Fr)
    • tkz-tukey

    Monday, April 11, 2016

    Developer-friendly IoT devices and tools

    IoT Overview:
    • Arduino/Genuino 101 - the new Arduino uses the Intel Curie!  See also the Intel Arduino101 site.
    • ESP8266  This is a very impressive and cheap little IoT device - MCU, WIFI, etc. for a couple of bucks!  Breakout modules for anywhere from $6 to $10.  Do a web search and you'll find lots of ESP8266 info and products; a good place to start is  Support for Lua and Javascript.  CAVEAT: the ESP8266 has already been through numerous versions so make sure you get the one you want.
    • Intel Edison - a little more expensive but a lot more powerful - a CPU running Linux, a separate MCU, a DSP, Wifi, Bluetooth, lots of memory, etc.
    • Texas Instruments has tons of IoT stuff.  Look for the Launchpad for the device you're interested in - these are inexpensive Arduino-like dev boards that make it easy to tinker.  The website also has lots of background info, whitepapers, training vids, etc.  See in particular
    • OpenMote
    • Other HW vendors: the semiconductor market is very competitive, with many major players and heaven knows how many smaller ones. Many of them are beginning to offer Arduino-like development boards for products that previously would only have been of interest to professional EEs.  See for example:
      • NXP - Formerly Philips Semiconductor.  NXP recently acquired Freescale, which was formerly the semiconductor division of Motorola
      • Atmel - recently acquired by Microchip.  Up until the release of the Arduino101, which uses the Intel Curie MCU, Arduinos used Atmel MCUs
      • STMicroelectronics
    • Online IoT-friendly retailers: lots.  I've had good luck with these, but there are lots of others; do shop around.
    • Make - maker zine

    • Espruino
    • NodeMCU "open-source firmware and development kit that helps you to prototype your IOT product within a few Lua script lines"  (ESP8266)
    • AT&T Flow - web-based dev environment.  based on:
    • Node-RED - "A visual tool for wiring the Internet of Things" (developed and open-sourced by IBM)
    • Alljoyn
    • Iotivity
    Operating Systems:
    Standards bodies:

    Cloud services - testing, data storage, etc.

    • IoT-LAB  "IoT-LAB provides a very large scale infrastructure facility suitable for testing small wireless sensor devices and heterogeneous communicating objects."

    • awesome-iot - a list of IoT resources somebody put on github
    • IPSO Alliance page listing devices for building smart objects

    Saturday, March 19, 2016

    Coinduction Resources

    Here are some materials I've found useful in trying to understand coinduction and related notions. Most of them involve some pretty heavy mathematics, but their introductions usually provide an overview and some examples that will help you develop your intuitions even if the math is over your head.

    Some major figures, all of whom have made many publications available on their homepages:

    A good place to start is An introduction to (co)algebra and (co)induction (PDF) by Rutten and Jacobs. (in Advanced Topics in Bisimulation and Coinduction).  The first part gives several concrete examples that will help you understand coinduction.

    Coinductive Models of Finite Computing Agents by Wegner and Goldin.  Free version available on their homepages.

    Communicating and Mobile Systems: the Pi Calculus, Milner.  Not easy to read, but well worth the effort.

    An interesting paper showing how the concepts of bisimulation and bisimilarity emerged independently in computer science, philosophical logic, and set theory:  On the Origins of Bisimulation and Coinduction, Sangiorgi.  Free version available on author's homepage.

    Universal coalgebra: a theory of systems, Rutten.  A landmark paper, I gather.  You can find a free version on Rutten's homepage.

    Coalgebraically Thinking, David Corfield (blog post, n-Category Cafe, 2008)

    This seems to be a growth industry; a number of introductory texts have been published in the past few years:

    Friday, March 11, 2016

    The Problem with Inductive Definition - and why it needs coinductive codefinition

    There's a subtle problem with inductive definitions.  Consider the canonical example of Nat, usually defined using something like the following pseudo-code:

    Nat: Type
    Z : Nat
    S : Nat -> Nat

    The problem is that given only this definition we cannot answer a simple question:  is every n:Nat of the form S...Z?  In other words, do the canonical constructors of an inductively defined type exhaust the membership of that type?

    The mathematician Andrej Bauer wrote an interesting blog post on this topic (The elements of an inductive type), but he comes at it from a very different perspective than I do.  His focus is on models of Nat, and he shows that there are various "exotic" models, which he describes thusly:  "Let us call a model exotic if not every element of (the interpretation of) 𝚗𝚊𝚝 is obtained by finitely many uses of the successor constructor, starting from zero."  It's an interesting post (as are the comments), but I think it misses the key point, which is that the problem is inherent in the structure of inductive definition; models are beside the point.  In this article I take a stab at an alternative characterization of the problem and a possible solution.

    The reason we cannot answer the question of whether an inductively defined type contains only those elements built up using the constructors, given only the inductive definition, is that the inductive part of the definition (here, S) makes the assumption that its argument is of the appropriate type, but that's the only assumption it makes.  In particular it makes no assumption as to the structure of that argument. We can make this a bit more explicit using informal English:

    Z is a natural number
    Assuming that n is a natural number, infer that Sn is a natural number.

    So we've assumed that n:Nat, but we have not assumed that it has canonical form.  Maybe it was constructed solely using Z and S, maybe it wasn't - all we know (or rather assume) is that it is a Nat. For the purposes of inductive definition, it does not matter: S is a rule. If n is in fact of type Nat (no assumption here), then, no matter what its form or how we got it, Sn is of type Nat, end of story.  So for all we know, Nat could be chock-full of non-canonical members.

    That's not very satisfactory, since intuition tells us that of course there is nothing in Nat that is not constructed from Z and S.  But our inductive definition does not say this, and there is no way to prove it from the definition alone.

    Then how can we vindicate our intuitions, and show that every Nat is built up using nothing but S, starting from Z?  We want something like the following proposition:

    For all n:Nat, either n=Z or n=Sm for some m:Nat   ;; (where n=Sm means Sm is the canonical form of n)

    Note that the inductive definition of Nat does not justify the second part of this proposition.  It tells us that every n:Nat has a successor Sn:Nat, but it does not tell us that every n:Nat is the successor to some other Nat (that is, has a predecessor "some m" such that n=Sm).  Even worse, it leaves open the possibility that every n:Nat could have any number of predecessors.  It may seem intuitively obvious that every Nat has a unique successor and a unique predecessor, but again, the definition does not say this; in fact it does not even imply it if your intuition hasn't already been trained to expect it.  Nor can it be used to prove it.
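    (In a proof assistant the picture is subtler: the eliminator generated alongside the constructors does prove this proposition by case analysis, so the question becomes whether that eliminator counts as part of "the definition alone".  A Lean 4 sketch, with N as a stand-in name for Nat:)

```lean
inductive N where
  | Z : N
  | S : N → N

-- Case analysis on n, licensed by the eliminator that Lean
-- generates together with the constructors:
theorem canonical : ∀ n : N, n = N.Z ∨ ∃ m, n = N.S m
  | .Z   => Or.inl rfl
  | .S m => Or.inr ⟨m, rfl⟩
```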

    We cannot justify this inductively, since induction will always involve assumptions as above, and if we try to build canonicity into our assumptions we just move the problem.  For example it won't work to say "Assuming that n is either Z or a Nat of form Sm, then Sn is a Nat", since the same problem arises with respect to m.  And we cannot go further and say "Assuming that n is a canonical Nat..." because that is obviously circular - you only get canonical form after you've defined S.

    The only way I can see to do it is to use the dual notion of coinduction.  The idea is that we can introduce another operator P (Predecessor) and use it to show that every n:Nat can be expressed as (or reduced to, or equivalent to,  choose your terminology) a canonical form.  But not: defined as a canonical form!  Coinduction does not define; it does not say what things are (no metaphysics here); rather, it characterizes how they behave, which is to say it codefines them.  (See my previous post, Definition and Co-definition, for more on the distinction.)  The fact that every n:Nat has a predecessor (with PZ = Z) does not define Nat, it just says what you can do with a Nat once you have one.  By the same token it is not quite true to say that Z and S define Nat; they define those elements of Nat that have canonical form, but they do not "close" Nat, in that they do not and cannot exclude the possibility of (undefined) non-canonical Nat forms.

    So to get a complete and intuitively satisfying "definition" of Nat we need both an inductive definition of canonical forms and a coinductive codefinition of non-canonical forms.  Here's one way to get started that I don't think works:

    P :  Nat -> Nat

    Leaving aside the fact that this just looks like an alias for S (I'll address that problem in a minute), the form of this expression obscures the difference between inductive constructor and coinductive coconstructor.  Informally, an inductive constructor of a type X takes things "into" X; a coinductive coconstructor takes things "from" X.  We can therefore make this structure more conspicuous as follows:

    S : Nat -> X
    P : Nat <- X

    and setting X == Nat.  We could also have written P : X -> Nat, but that would make P look like a constructor for Nat rather than a coconstructor for X.  It's really the reversal of the arrow that exposes the duality, rather than the swapping of the type variables.

    So now we have P: Nat <- X, but that just looks like a funny way to write S.  All it says is that if you have n:Nat, then you have Pn:Nat.  Now the question is how to express the idea that P functions as a predecessor operation.  The only way to do that is in terms of S.

    PZ = Z
    SPn = Sn ;; n = Z
    SPn = n   ;; n != Z

    Note that we are not using S to define P.  That is, we do not want to define P by saying P(Sn) = n; we want instead to start with the same minimal assumption we used to specify S, namely, "Assuming n:Nat, infer Pn...".  In fact we're not really defining P at all!  Instead we're specifying the relation between S and the result of using P. There's really nothing in this "definition" that characterizes P as a predecessor operation, except for the fact that the successor operation S cancels it out.  In a sense the operation P performs is completely hidden, and could be anything; all that matters is that once you use P, you can use S to cancel it out.

    (Note that S : Nat -> X is also not a definition of S.  It does not define any mechanism, operation, algorithm, etc. that S performs.  Constructors like S, and coconstructors like P, are not functions.)

    This definition/codefinition answers our original question directly.  Every n:Nat other than Z is equal to S(Pn), whose outermost form is canonical; and Z is canonical to begin with.
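    As a sanity check, the codefining equations can be mimicked in a toy model (a hypothetical sketch using Python integers for Nat, not a real type theory): Z is 0, S increments, and P is a decrement clamped at zero.

```python
# Toy model of the S/P codefinition: Nat as non-negative integers.
Z = 0

def S(n):
    """Successor constructor."""
    return n + 1

def P(n):
    """Predecessor coconstructor; PZ = Z."""
    return max(n - 1, 0)

# The codefining equations hold in the model:
assert P(Z) == Z                  # PZ = Z
assert S(P(Z)) == S(Z)            # SPn = Sn  when n = Z
for n in (S(Z), S(S(Z)), S(S(S(Z)))):
    assert S(P(n)) == n           # SPn = n   when n != Z
```

    Note that, as in the post, the model says nothing about what P "really does" beyond the fact that S cancels it out.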

    I haven't said much about what coinduction really is, or why we should think of P as a coconstructor, etc.  I'll address that in another post.

    (Aside:  some writers define "canonical form" as a form containing only canonical constructors, e.g. SSSZ; others define it as one whose outermost form is a canonical constructor, e.g. Sn is in canonical form.)

    Monday, March 7, 2016

    Workspaces: Boot's Missing Abstraction

    Boot is a thing of beauty, but it does have a minor wart or two.  One of them is the fact that implementation details are exposed by the API.  For example, to add files to a Fileset you need to obtain a temporary directory using boot.core/tmp-dir!, add your file to that directory using standard Java filesystem APIs, and then add the temp dir contents to the fileset using something like boot.core/add-resource.  The latter does not really add anything to the Fileset; Filesets are immutable, so add-resource really constructs a new Fileset consisting of the old Fileset plus the contents of the temp dir. (NB: add-resource and other add-* functions can work with any dir, but the Boot Way is to use tmp-dir!)  The concept of a Fileset is an abstraction over storage mechanisms; the fact that Boot maps Filesets to the filesystem is an implementation detail.  A filesystem is really just a kind of database, so an alternative implementation could map Filesets to some other data storage mechanism.

    The problem is that API functions like tmp-dir! inevitably suggest that we're working with the filesystem, but this too is an implementation detail.  Boot's current implementation of tmp-dir! happens to create a hidden temporary directory, but an alternative could use some other mechanism, such as an in-memory database like Datascript.  Furthermore, the fact that one uses Java filesystem APIs to work with the temp dir is also an implementation detail.  An alternative could offer its own more abstract API, with functions like add-item or similar.  What's important is that in order to alter the Fileset the developer must obtain a workspace, then add, change, and delete items in the workspace, and finally merge the workspace with the incoming Fileset to create a new Fileset to be passed on to subsequent tasks in the pipeline.  The implementation mechanism is not relevant to the semantics, so ideally, the API would be expressed solely in terms of implementation-independent abstractions.

    When I say that the concept of a workspace is Boot's missing abstraction I do not mean that Boot lacks some critical bit of functionality that workspaces would provide.  I mean just that promoting the concept of a workspace to first-class status would make it easier for developers and users to understand and use Boot.

    For example, my recommendation is that people new to Boot think of boot.core/tmp-dir! as boot.core/workspace, and avoid thinking of workspaces as filesystem directories and their contents as files.  A workspace is just a space, never mind how the implementation handles the mapping of that space to a storage mechanism.  The concept is very similar to the notion of a namespace in Clojure.  The Java implementation of Clojure maps namespace symbols to file names, but that too is an implementation detail; when you work with namespaces and their contents in Clojure you never think about filesystem locations, you just think about names in namespaces.  Similarly for Boot workspaces: when you add something to a workspace and then merge it with a Fileset you don't care how Boot goes about making this happen behind the scenes, you just want the stuff you put in the workspace to end up in the Fileset.

    Unfortunately it's a little harder to ignore the filesystem-based implementation when you go to add something to a workspace.  A common pattern looks something like this:

    (let [tmp-dir (boot/tmp-dir!)
          out-file (io/file tmp-dir "hello.txt")]
      (spit out-file "hello world")
      (-> fileset (boot/add-asset tmp-dir) boot/commit!))

    Obviously it's hard to think of io/file as anything other than a filesystem operation.  But it's easy to imagine a few macros that would provide a more abstract interface.  So the above might look something like:

    (let [ws (-> (boot/workspace) (boot/add-item "hello.txt" "hello world"))]
      (-> fileset (boot/add-asset ws) boot/commit!))

    To really make the concept of a workspace first-class would require a lot more analysis; for example, you'd have to really think through the semantics and syntax of something like add-item here.  You'd have to decide what to do if the item already existed, and so forth.  But it also opens up some interesting possibilities.  Workspaces could be immutable, for example.  The concept of a workspace could also be construed as an extension of Clojure's concept of a namespace.  You could allow the user to specify a workspace name symbol, e.g. (boot/workspace foo), which would allow you to treat items in the workspace as analogous to Clojure vars.  You would then have to decide on a mapping from vars to location names for the backing storage mechanism, but that doesn't seem an insurmountable design problem.  The example above might become something like:

    (let [ws (boot/workspace foo)
          wss (add-item ws hello.txt "hello world")]
      (-> fileset (boot/add-asset wss) boot/commit!))

    Or you might be even more Clojure-like, and support workspace operations like find-ws, so instead of the preceding you could write something like

    (boot/workspace foo)
    (boot/workspace bar)
    (let [ws (boot/find-ws foo)]
      (add-item ws hello.txt "hello world")
      (-> fileset (boot/add-asset ws) boot/commit!))

    You could even think of Fileset as a species of Workspace, a kind of lambda workspace.

    Monday, February 15, 2016

    Definition and Co-definition


    Inductive definition has been around forever; so has co-inductive "definition", for that matter, but it is much less well-known.  Furthermore, there is an important sense in which co-inductive definitions are not definitions at all, at least not in the informal sense in which we normally think of definitions.  "Definition" is derived from the same Latin root as "finite", and the point of definition is just to give a finite, delimited, determinate, etc. characterization of the definiendum.  The notion of an inductive definition already runs afoul of this informal notion, since it involves at least potential infinity in the form of the inductive step.  But at least inductive definitions start out with their feet on the ground, with a base case.  Co-inductive definitions are not like this; they do not have base cases, and their definienda may themselves involve indeterminate notions like infinities and non-determinism.  So it is not obvious why we should even think of them as definitions.  They're more like co-definitions: something dual to definitions, symmetrically related to the concept of definition but nonetheless fundamentally different.

    That may explain why co-induction is not better known.  At first it's just weird to have something that looks a lot like an inductive definition, but that lacks a base case.  How can something like that serve as a genuine definition if it does not start at some definite beginning point?  And if we cannot "reach" every case by starting from that beginning and inductively stepping our way up, then aren't we compelled to say that whatever it is we're talking about must be undefined, maybe even undefinable?

    Here I think the beginning of wisdom is to reject essentialism and reductionism.  The impulse to reduce everything to foundations has been overwhelmingly dominant, at least in the West, for many centuries, and until relatively recently has been something of an article of faith since Descartes and the 17th century enlightenment.  Of course there were always dissidents - Vico comes to mind - but it was only with the emergence of the Pragmatist Enlightenment beginning in the late 19th century that the folly of radical essentialism gradually became clear.  Ok enough philosophy - the point is, to get comfortable with all this co-stuff you really do need to expand your imaginative horizon, and in particular you have to get comfortable with the idea that definitions from the ground up (reducible to foundational base cases plus an inductive step) cannot claim any privilege - co-definitions that work from the top down (without a net, so to speak; from infinitude to finitude, from non-determinism to determinism, etc.) are just as fundamental.  They may not define things in a reductionistic sense, but they do characterize them - or their behavior - in ways that make them just as useful as definitions.

    Inductive definition

    Inductive definitions work by specifying how to construct the definiendum, always starting from a base case.  For example, the type of finite lists List A is defined by giving the primitive constructors that may be used to "put together" finite lists: Nil and Cons.  One of these constructors, Nil, functions as the inductive "base case".

    The values of an inductively defined type are always finite, though the type itself may be infinite.  That means any value of such a type can always be totally constructed, at least in principle; every such value is finite, and that is just what inductive definition means.
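    To make the finiteness point concrete, here is a rough sketch in Python (nested pairs stand in for a real inductive list type): every value is reached by finitely many applications of Cons to the base case Nil, so any operation defined by induction bottoms out.

```python
# Inductive construction: every finite list is built up from the base case.
Nil = None  # the base case

def Cons(a, lst):
    """Constructor: put an element on the front of a finite list."""
    return (a, lst)

def length(lst):
    """Total by induction: the walk reaches Nil in finitely many steps."""
    n = 0
    while lst is not Nil:
        _, lst = lst
        n += 1
    return n

xs = Cons(1, Cons(2, Cons(3, Nil)))  # totally constructed: the list [1, 2, 3]
```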

    Pi types are inductively defined; the primitive constructor is lambda.  Base case?

    Co-inductive co-definition

    Consider what effects, that might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object. (The Pragmatic Maxim, C.S. Peirce)

    Co-inductive co-definitions do not say what a thing is; rather, they say what it does.  Rather than characterizing essence, they characterize accidents.  Rather than nature, they specify behavior.

    Co-inductive definitions work by specifying how to de-struct the definiendum, never starting from a base case.  For example, the type of infinite lists List* A is defined by giving the primitive eliminators that may be used to "take apart" infinite lists: Hd* and Tl*.  There is no base case; you don't construct an infinite list by starting at Nil and building up; if you tried that, you would never finish, and constructors must always terminate.

    Instead, you start with the very thing you would be trying to construct, if you were in the business of inductive definition: infinite data, non-deterministic computations, etc.  You treat those as just given, and you set about (co-)defining the way they behave under the operations of the type.  Once you've done that, there is nothing more to know about them.

    Co-inductive co-definitions are much like the police, according to that great mathematician Richard J. Daley, Mayor of Chicago:  "The police are not here to create disorder; they're here to preserve disorder."

    So List* A provides a clear contrast with inductive List A.  The inductive definition of List A starts with Nil, the base case; the co-inductive co-definition of List* A starts with the notion of an infinite list of A values.  In other words, the elements of List* A are given, not defined!  The business of the co-definition is not to define (the essence/nature of) List* A, but to co-define its behavior under the co-constructors (eliminators) of the type being co-defined.  In the case of List* A, those eliminators are Hd* (Head*) and Tl* (Tail*).  They look just like their counterparts (Hd, Tl) for List A; the critical difference is that they both operate on streams.  Hd* co-constructs an a:A, and Tl* co-constructs another List* A.

    The values of a co-inductively defined type may be infinite, though they need not be.  In either case, it is not possible to deterministically and completely construct such values.  For example, List* A values cannot be completely constructed; we can derive a new List* A value from an old one (together with an a:A) using Cons*, but we cannot construct an entire infinite list.
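    Laziness makes this tangible.  Here is a hypothetical Python sketch (thunks stand in for suspended tails; none of these names come from a real library): no stream is ever constructed in full, yet Hd* and Tl* can always take one apart, and Cons* derives a new stream from an existing one.

```python
# Coinductive streams: a stream is a head plus a thunk that produces the tail.
def cons_star(a, tail_thunk):
    """Cons*: derive a new stream from an element and a (suspended) stream."""
    return (a, tail_thunk)

def hd_star(s):
    """Hd*: co-construct an a:A from a stream."""
    return s[0]

def tl_star(s):
    """Tl*: co-construct another stream from a stream."""
    return s[1]()

def nats_from(n):
    # Treated as given, never built in full: the "infinite list" n, n+1, n+2, ...
    return cons_star(n, lambda: nats_from(n + 1))

def take(k, s):
    """Observe finitely many elements; the whole stream is never materialized."""
    out = []
    for _ in range(k):
        out.append(hd_star(s))
        s = tl_star(s)
    return out
```

    All the codefinition tells you is how a stream behaves under Hd* and Tl*; what the thunk "really is" stays hidden, just as with P above.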

    Sigma types are co-inductively defined; the primitive eliminators (co-constructors) are fst and snd.
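    A Church-style encoding illustrates the point (a toy Python sketch for illustration, not a faithful account of Sigma types): the pair is nothing over and above the behavior of its projections.

```python
# A pair characterized entirely by its eliminators, fst and snd.
def pair(a, b):
    """All a pair 'is' is the way it responds to projection."""
    return lambda select: select(a, b)

def fst(p):
    """First projection: co-construct the first component."""
    return p(lambda a, b: a)

def snd(p):
    """Second projection: co-construct the second component."""
    return p(lambda a, b: b)
```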