Gambit C-Interface

Gambit has a really awesome way to interface with code written in C. The manual is fairly complete in this matter, but sometimes pulling together all the necessary information is a little daunting. Here is an example of calling C from Scheme to find the SHA256 sun of a string. It goes a little beyond the most trivial examples to show how to make your own type conversion from C and Scheme.

C Source

This is the C source code, which I put into c_sha.c — it needs to have a different base filename from the Scheme code, because otherwise gsc will overwrite it when generating the C code from the Scheme file.

#define ___VERSION 404002

#include <sys/types.h>
#include <sha2.h>

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <string.h>

#include <gambit.h>

/*
 * Convert a Scheme list of numbers into an array of u_int8_t numbers.  I won't
 * actually be using this right now, so we're only leaving it here for
 * symmetry, which is reason enough.
 */

___SCMOBJ SCMOBJ_to_BYTEARRAY (___SCMOBJ src, u_int8_t **dst, int arg_num)
{
    int i;
    ___SCMOBJ lst = src;
    int len = 4; /* start with a small result array */
    u_int8_t *result = (u_int8_t *)calloc(len, sizeof(u_int8_t));

    if (result == NULL)
        return ___FIX(___HEAP_OVERFLOW_ERR);

    i = 0;
    result[i] = NULL; /* always keep array null terminated */

    while (___PAIRP(lst)) {
        ___SCMOBJ scm_int = ___CAR(lst);
        u_int8_t c_int;
        ___SCMOBJ ___err;

        /* Need to increase the size of our array. */
        if (i >= len-1) {
            u_int8_t *new_result;
            int j;

            len = len * 3 / 2;
            new_result = (u_int8_t *)calloc(len, sizeof(u_int8_t));
            if (new_result == NULL) {
                free(result);
                return ___FIX(___HEAP_OVERFLOW_ERR);
            }
            for (j=i; j>=0; j--)
                new_result[j] = result[j];
            free (result);
            result = new_result;
        }

        /* Convert current element in the input list into an integer. */
        ___err = ___EXT(___SCMOBJ_to_U8) (scm_int, &c_int, arg_num);

        /* When the conversion fails, signal an error. */
        if (___err != ___FIX(___NO_ERR)) {
            free(result);
            return ___err;
        }

        result[i++] = c_int;
        result[i] = NULL;
        lst = ___CDR(lst);
    }

    if (!___NULLP(lst)) {
        free(result);
        return ___FIX(___UNKNOWN_ERR);
    }

    /*
     * Note that the caller is responsible for calling free() when it is
     * done with the result.
     */

    *dst = result;
    return ___FIX(___NO_ERR);
}

/*
 * Convert an array of u_int8_t integers into a Scheme list.  This is needed
 * because otherwise Gambit doesn't know what to do when it gets a u_int8_t
 * array back from the sha256() function and we want to be able to access it as
 * an array of integers.
 */

___SCMOBJ BYTEARRAY_to_SCMOBJ (u_int8_t *src, ___SCMOBJ *dst, int arg_num)
{
    ___SCMOBJ ___err = ___FIX(___NO_ERR);
    ___SCMOBJ result = ___NUL; /* start with the empty list */
    int i = 0;

    while (src[i] != NULL)
        i++;

    /* Build the byte-array starting at the tail. */

    while (--i >= 0)
    {
        ___SCMOBJ scm_int;
        ___SCMOBJ new_result;

        /*
         * Invariant: result is either the empty list or a ___STILL pair
         * with reference count equal to 1.  This is important because
         * it is possible that ___U8_to_SCMOBJ and ___make_pair
         * will invoke the garbage collector and we don't want the
         * reference in result to become invalid (which would be the
         * case if result was a ___MOVABLE pair or if it had a zero
         * reference count).
         */

        ___err = ___EXT(___U8_to_SCMOBJ) (src[i], &scm_int, arg_num);

        if (___err != ___FIX(___NO_ERR))
        {
            ___EXT(___release_scmobj) (result); /* allow GC to reclaim result */
            return ___FIX(___UNKNOWN_ERR);
        }

        /*
         * Note that scm_int will be a ___STILL object with reference
         * count equal to 1, so there is no risk that it will be
         * reclaimed or moved if ___make_pair invokes the garbage
         * collector.
         */

        new_result = ___EXT(___make_pair) (scm_int, result, ___STILL);

        /*
         * We can zero the reference count of scm_str and result (if
         * not the empty list) because the pair now references these
         * objects and the pair is reachable (it can't be reclaimed
         * or moved by the garbage collector).
         */

        ___EXT(___release_scmobj) (scm_int);
        ___EXT(___release_scmobj) (result);

        result = new_result;

        if (___FIXNUMP(result))
            return result; /* allocation failed */
    }

    /*
     * Note that result is either the empty list or a ___STILL pair
     * with a reference count equal to 1.  There will be a call to
     * ___release_scmobj later on (in ___END_CFUN_BYTEARRAY_to_SCMOBJ
     * or ___END_SFUN_BYTEARRAY_to_SCMOBJ) that will allow the garbage
     * collector to reclaim the whole list of strings when the Scheme
     * world no longer references it.
     */

    *dst = result;
    return ___FIX(___NO_ERR);
}



/*
 * Actually do the hash.
 */

u_int8_t *sha256(char *buf)
{
    SHA2_CTX ctx;
    u_int8_t *results;
    int n;

    results = (u_int8_t *)calloc(SHA256_DIGEST_LENGTH, sizeof(u_int8_t));
    n = strlen(buf);

    SHA256Init(&ctx);
    SHA256Update(&ctx, (u_int8_t *)buf, n);
    SHA256Final(results, &ctx);

    return results;
}

Note that this code was written to run on OpenBSD. You may not have sha2.h so if you don't you'll either have to figure out how to get it, or just substitute your own trivial example in it's place. Maybe try the OpenSSL version instead.

The functions SCMOBJ_to_BYTEARRAY() and BYTEARRAY_to_SCMOBJ() are used to translate Gambit's internal representation of a list of bytes to and from a C array of bytes. A lot of types, especially different varieties of strings, have their own conversions defined already. But in this case we want to do something that isn't provided by default, so we need to tell Gambit how to convert the types so we can access them like a list from our Scheme code. Just a bit more work needs to be done to finish telling Gambit how this conversion is performed, that we have to do in the Scheme code itself.

Scheme Code

(c-declare #<<c-declare-end

extern ___SCMOBJ SCMOBJ_to_BYTEARRAY();
extern ___SCMOBJ BYTEARRAY_to_SCMOBJ();
extern u_int8_t *sha256();

#define ___BEGIN_CFUN_SCMOBJ_to_BYTEARRAY(src,dst,i) \
if ((___err = SCMOBJ_to_BYTEARRAY (src, &dst, i)) == ___FIX(___NO_ERR)) {
#define ___END_CFUN_SCMOBJ_to_BYTEARRAY(src,dst,i) \
free(dst); }

#define ___BEGIN_CFUN_BYTEARRAY_to_SCMOBJ(src,dst) \
if ((___err = BYTEARRAY_to_SCMOBJ (src, &dst, ___RETURN_POS)) == ___FIX(___NO_ERR)) {
#define ___END_CFUN_BYTEARRAY_to_SCMOBJ(src,dst) \
___EXT(___release_scmobj) (dst); }

#define ___BEGIN_SFUN_BYTEARRAY_to_SCMOBJ(src,dst,i) \
if ((___err = BYTEARRAY_to_SCMOBJ (src, &dst, i)) == ___FIX(___NO_ERR)) {
#define ___END_SFUN_BYTEARRAY_to_SCMOBJ(src,dst,i) \
___EXT(___release_scmobj) (dst); }

#define ___BEGIN_SFUN_SCMOBJ_to_BYTEARRAY(src,dst) \
{ ___err = SCMOBJ_to_BYTEARRAY (src, &dst, ___RETURN_POS);
#define ___END_SFUN_SCMOBJ_to_BYTEARRAY(src,dst) }

c-declare-end
)

(c-define-type byte-array "u_int8_t *" "BYTEARRAY_to_SCMOBJ" "SCMOBJ_to_BYTEARRAY" #t)
(define sha256 (c-lambda (char-string) byte-array "sha256"))

(display (sha256 "abc"))

Here we first tell Gambit about all the relevant declarations that will be coming from our C code. Then we define a series of macros that tell Gambit how to deal with converting our byte array type from Scheme to C and back again. These macros also include how that data will be freed on either side of the language boundary.

The other thing that we need to do is use c-define-type to actually make the association between our type that we've defined and the name of the type in C which is u_int8_t * — a pointer to a 8-bit integer. As arguments we pass the suffixes of the macro names that will do the conversion, and finally a flag indicating that cleanup functions should be called automatically (i.e. in the event of a continuation invocation skipping over frames on the C side). Note that the manual says this is optional, but in v4.4.2 I get an error when I omit it; the default is #t which is good for safety.

Compilation

First we need to get Gambit to generate the C code for the Scheme file. Then we'll compile our C functions and link them all together.

$ gsc -link g_sha.scm
$ gcc -I/opt/gambit/include -L/opt/gambit/lib c_sha.c g_sha.c g_sha_.c -lgambc -lm -lutil

For my setup on OpenBSD, I need -lutil for openpty() but Linux users may not. Also note that here I'm linking against the Gambit static library. The result of this step is a binary that you can run:

$ ./a.out
(186 120 22 191 143 1 207 234 65 65 64 222 93 174 34 35 176 3 97 163 150 23 122 156 180 16 255 97 242)