Mixing Scheme 48 and C

This chapter describes an interface for calling C functions from Scheme, calling Scheme functions from C, and allocating storage in the Scheme heap. Scheme 48 manages stub functions in C that negotiate between the calling conventions of Scheme and C and the memory allocation policies of both worlds. No stub generator is available yet, but writing stubs is a straightforward task.

7.1  Available facilities

The following facilities are available for interfacing between Scheme 48 and C:

7.1.1  Scheme structures

The structure external-calls has most of the Scheme functions described here. The others are in dynamic-externals, which has the functions for dynamic loading and name lookup from Section 7.4, and shared-bindings, which has the additional shared-binding functions described in section 7.2.3.

7.1.2  C naming conventions

The names of all of Scheme 48's visible C bindings begin with `s48_' (for procedures and variables) or `S48_' (for macros). Whenever a C name is derived from a Scheme identifier, we replace `-' with `_' and convert letters to lowercase for procedures and uppercase for macros. A final `?' is converted to `_p' (`_P' in C macro names). A final `!' is dropped. Thus the C macro for Scheme's pair? is S48_PAIR_P and the one for set-car! is S48_SET_CAR. Procedures and macros that do not check the types of their arguments have `unsafe' in their names.
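
The renaming rules above can be sketched as a small C helper. This is only an illustration of the convention: derive_c_name is a hypothetical function, not part of Scheme 48, and it assumes plain ASCII identifiers.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Scheme 48: writes the C name derived
   from a Scheme identifier into out.  Returns 0 on success, or -1 if
   out is too small to be safe. */
static int derive_c_name(const char *scheme_name, int is_macro,
                         char *out, size_t out_size)
{
  const char *prefix = is_macro ? "S48_" : "s48_";
  size_t len = strlen(scheme_name);
  size_t n, i;

  /* Conservative bound: 4 prefix characters, at most one extra
     character for a final `?', and the terminating NUL. */
  if (out_size < len + 6)
    return -1;

  memcpy(out, prefix, 4);
  n = 4;

  for (i = 0; i < len; i++) {
    unsigned char c = (unsigned char) scheme_name[i];
    int last = (i + 1 == len);

    if (last && c == '!') {
      /* a final `!' is dropped */
    } else if (last && c == '?') {
      out[n++] = '_';                       /* final `?' becomes _p or _P */
      out[n++] = is_macro ? 'P' : 'p';
    } else if (c == '-') {
      out[n++] = '_';                       /* `-' becomes `_' */
    } else {
      out[n++] = (char) (is_macro ? toupper(c) : tolower(c));
    }
  }
  out[n] = '\0';
  return 0;
}
```

With is_macro nonzero, the identifiers pair? and set-car! give S48_PAIR_P and S48_SET_CAR, matching the examples in the text.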

All of the C functions and macros described have prototypes or definitions in the file c/scheme48.h. The C type for Scheme values is defined there to be s48_value.

7.1.3  Garbage collection

Scheme 48 uses a copying garbage collector. The collector must be able to locate all references to objects allocated in the Scheme 48 heap in order to ensure that storage is not reclaimed prematurely and to update references to objects moved by the collector. The garbage collector may run whenever an object is allocated in the heap. C variables whose values are Scheme 48 objects and which are live across heap allocation calls need to be registered with the garbage collector. See section 7.8 for more information.

7.2  Shared bindings

Shared bindings are the means by which named values are shared between Scheme code and C code. There are two separate tables of shared bindings, one for values defined in Scheme and accessed from C and the other for values going the other way. Shared bindings actually bind names to cells, to allow a name to be looked up before it has been assigned. This is necessary because C initialization code may be run before or after the corresponding Scheme code, depending on whether the Scheme code is in the resumed image or is run in the current session.

7.2.1  Exporting Scheme values to C

Define-exported-binding makes value available to C code under name, which must be a string, creating a new shared binding if necessary. The C function s48_get_imported_binding returns the shared binding defined for name, again creating it if necessary. The C macro S48_SHARED_BINDING_REF dereferences a shared binding, returning its current value.

7.2.2  Exporting C values to Scheme

These are used to define shared bindings from C and to access them from Scheme. Again, if a name is looked up before it has been defined, a new binding is created for it.

The common case of exporting a C function to Scheme can be done using the macro S48_EXPORT_FUNCTION(name). This expands into

s48_define_exported_binding("name",
                            s48_enter_pointer(name))

which boxes the function into a Scheme byte vector and then exports it. Note that s48_enter_pointer allocates space in the Scheme heap and might trigger a garbage collection; see Section 7.8.

These macros simplify importing definitions from C to Scheme. They expand into

(define name (lookup-imported-binding c-name))

where c-name is as supplied for the second form. For the first form c-name is derived from name by replacing `-' with `_' and converting letters to lowercase. For example, (import-definition my-foo) expands into

(define my-foo (lookup-imported-binding "my_foo"))

7.2.3  Complete shared binding interface

There are a number of other Scheme functions related to shared bindings; these are in the structure shared-bindings.

Shared-binding? is the predicate for shared bindings. Shared-binding-name returns the name of a binding. Shared-binding-is-import? is true if the binding was defined from C. Shared-binding-set! changes the value of a binding. Define-imported-binding and lookup-exported-binding are Scheme versions of s48_define_exported_binding and s48_get_imported_binding. The two undefine- procedures remove bindings from the two tables. They do nothing if the name is not found in the table.

The following C macros correspond to the Scheme functions above.

7.3  Calling C functions from Scheme

There are three different ways to call C functions from Scheme, depending on how the C function was obtained.

Each of these applies its first argument, a C function, to the rest of the arguments. For call-imported-binding the function argument must be an imported binding.

For all of these, the C function is passed the argi values and the value returned is that returned by the C procedure. No automatic representation conversion occurs for either arguments or return values. Up to twelve arguments may be passed. There is no method supplied for returning multiple values to Scheme from C or vice versa, mainly because C does not have multiple return values.

Keyboard interrupts that occur during a call to a C function are ignored until the function returns to Scheme (this is clearly a problem; we are working on a solution).

These macros simplify importing functions from C. They define name to be a function with the given formals that applies the corresponding C binding to those formals. C-name, if supplied, should be a string. These expand into

(define temp (lookup-imported-binding c-name))
(define name
  (lambda (formal ...)
    (call-imported-binding temp formal ...)))

If c-name is not supplied, it is derived from name by converting all letters to lowercase and replacing `-' with `_'.

7.4  Dynamic loading

External code can be loaded into a running Scheme 48 -- at least on most variants of Unix and on Windows. The required Scheme functions are in the structure load-dynamic-externals.

To be suitable for dynamic loading, the externals code must reside in a shared object. The shared object must define two functions:

The s48_on_load function is run when the shared object is loaded. It typically contains invocations of S48_EXPORT_FUNCTION to make the functionality defined by the shared object known to Scheme 48. Scheme 48 calls s48_on_reload when it loads the shared object for the second time, or some new version thereof. (More on that later.) Most of the time, s48_on_reload will simply call s48_on_load.

For Linux, the following commands compile foo.c into a file foo.so that can be loaded dynamically.

% gcc -c -o foo.o foo.c
% ld -shared -o foo.so foo.o

The following procedures provide the basic functionality for loading shared objects containing dynamic externals:

Load-dynamic-externals loads the named shared object. The plete? argument determines whether Scheme 48 appends the OS-specific suffix (typically .so for Unix and .dll for Windows) to the name. The rrepeat? argument determines how load-dynamic-externals behaves if it is called again with the same argument: if it is true, the shared object is reloaded (and its s48_on_reload function is called); otherwise nothing happens. The rresume? argument determines whether an image subsequently dumped will try to load the shared object again automatically. (The shared objects will be loaded before any record resumers run.) Load-dynamic-externals returns a handle identifying the shared object just loaded.

Unload-dynamic-externals unloads the shared object associated with the handle passed as its argument. Note that this invalidates all external bindings associated with the shared object; referring to any of them will probably crash the program.

Reload-dynamic-externals will reload the shared object named by its argument and call its s48_on_reload function.

This procedure represents the most common expected usage for loading dynamic externals. It is best explained by its definition:

(define (import-dynamic-externals name)
  (load-dynamic-externals name #t #f #t))

7.5  Compatibility

Scheme 48's old external-call function is still available in the structure externals, which now also includes external-name and external-value. The old scheme48.h file has been renamed old-scheme48.h.

Also, the old code for loading external code dynamically is still available in the dynamic-externals structure, but will probably disappear in a future release.

7.6  Accessing Scheme data from C

The C header file scheme48.h provides access to Scheme 48 data structures. The type s48_value is used for Scheme values. When the type of a value is known, such as the integer returned by vector-length or the boolean returned by pair?, the corresponding C procedure returns a C value of the appropriate type, and not an s48_value. Predicates return 1 for true and 0 for false.

7.6.1  Constants

The following macros denote Scheme constants:

7.6.2  Converting values

The following macros and functions convert values between Scheme and C representations. The `extract' ones convert from Scheme to C and the `enter's go the other way.

S48_EXTRACT_BOOLEAN is false if its argument is #f and true otherwise. S48_ENTER_BOOLEAN is #f if its argument is zero and #t otherwise.

s48_extract_string and s48_extract_byte_vector return pointers to the actual storage used by the string or byte vector. These pointers are valid only until the next garbage collection; see Section 7.8.

The second argument to s48_enter_byte_vector is the length of the byte vector.

s48_enter_integer() needs to allocate storage when its argument is too large to fit in a Scheme 48 fixnum. In cases where the number is known to fit within a fixnum (currently 30 bits including the sign), the following procedures can be used. These have the disadvantage of only having a limited range, but the advantage of never causing a garbage collection. S48_FIXNUM_P is a macro that is true if its argument is a fixnum and false otherwise.

S48_TRUE_P is true if its argument is S48_TRUE and S48_FALSE_P is true if its argument is S48_FALSE.

An error is signalled if s48_extract_fixnum's argument is not a fixnum or if the argument to s48_enter_fixnum is less than S48_MIN_FIXNUM_VALUE or greater than S48_MAX_FIXNUM_VALUE (-2^29 and 2^29 - 1 in the current system).
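
As a rough illustration of the fixnum range, the following sketch checks whether a C long fits in a 30-bit signed fixnum, that is, between -2^29 and 2^29 - 1. The EXAMPLE_ macros and example_fits_in_fixnum are hypothetical names chosen for this example; real code would presumably use S48_MIN_FIXNUM_VALUE and S48_MAX_FIXNUM_VALUE from scheme48.h instead of hard-coding the width.

```c
/* Hypothetical constants for illustration only; they assume the 30-bit
   fixnum (including sign) described in the text. */
#define EXAMPLE_FIXNUM_BITS 30
#define EXAMPLE_MAX_FIXNUM ((1L << (EXAMPLE_FIXNUM_BITS - 1)) - 1)
#define EXAMPLE_MIN_FIXNUM (-(1L << (EXAMPLE_FIXNUM_BITS - 1)))

/* Returns 1 if n lies in the inclusive fixnum range, 0 otherwise. */
static int example_fits_in_fixnum(long n)
{
  return n >= EXAMPLE_MIN_FIXNUM && n <= EXAMPLE_MAX_FIXNUM;
}
```

A stub might use a check of this kind to decide between s48_enter_fixnum, which never allocates, and s48_enter_integer, which may trigger a garbage collection.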

7.6.3  C versions of Scheme procedures

The following macros and procedures are C versions of Scheme procedures. The names were derived by replacing `-' with `_', `?' with `_P', and dropping `!'.

7.7  Calling Scheme functions from C

External code that has been called from Scheme can call back to Scheme procedures using the following function.

This calls the Scheme procedure p on nargs arguments, which are passed as additional arguments to s48_call_scheme. There may be at most twelve arguments. The value returned by the Scheme procedure is returned by the C procedure. Invoking any Scheme procedure may potentially cause a garbage collection.

There are some complications that occur when mixing calls from C to Scheme with continuations and threads. C only supports downward continuations (via longjmp()). Scheme continuations that capture a portion of the C stack have to follow the same restriction. For example, suppose Scheme procedure s0 captures continuation a and then calls C procedure c0, which in turn calls Scheme procedure s1. Procedure s1 can safely call the continuation a, because that is a downward use. When a is called Scheme 48 will remove the portion of the C stack used by the call to c0. On the other hand, if s1 captures a continuation, that continuation cannot be used from s0, because by the time control returns to s0 the C stack used by c0 will no longer be valid. An attempt to invoke an upward continuation that is closed over a portion of the C stack will raise an exception.
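
The downward-only nature of longjmp() can be seen in plain C, independent of Scheme 48. The sketch below uses only the standard setjmp.h facilities; the function and variable names are hypothetical. The jump is legal because the frame that called setjmp() is still active when longjmp() runs, which is the same restriction that applies to Scheme continuations closed over part of the C stack.

```c
#include <setjmp.h>

static jmp_buf escape_point;    /* saved by run_with_escape() */
static int depth_reached = 0;   /* static, so its value survives the jump */

/* Recurse a few frames deep, then jump back down the stack. */
static void inner(int depth)
{
  depth_reached = depth;
  if (depth == 3)
    longjmp(escape_point, 1);   /* discards the frames of inner() */
  inner(depth + 1);
}

/* Returns 1 if control came back via longjmp(), 0 otherwise. */
static int run_with_escape(void)
{
  if (setjmp(escape_point) != 0)
    return 1;                   /* reached through longjmp() */
  inner(1);
  return 0;                     /* not reached in this example */
}
```

Calling longjmp(escape_point, ...) after run_with_escape() has returned would be the upward case: the saved frame no longer exists, and the behavior is undefined in C.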

In Scheme 48 threads are implemented using continuations, so the downward restriction applies to them as well. An attempt to return from Scheme to C at a time when the appropriate C frame is not on top of the C stack will cause the current thread to block until the frame is available. For example, suppose thread t0 calls a C procedure which calls back to Scheme, at which point control switches to thread t1, which also calls C and then back to Scheme. At this point both t0 and t1 have active calls to C on the C stack, with t1's C frame above t0's. If thread t0 attempts to return from Scheme to C it will block, as its frame is not accessible. Once t1 has returned to C and from there to Scheme, t0 will be able to resume. The return to Scheme is required because context switches can only occur while Scheme code is running. T0 will also be able to resume if t1 uses a continuation to throw past its call to C.

7.8  Interacting with the Scheme heap

Scheme 48 uses a copying, precise garbage collector. Any procedure that allocates objects within the Scheme 48 heap may trigger a garbage collection. Variables bound to values in the Scheme 48 heap need to be registered with the garbage collector so that the value will be retained and so that the variables will be updated if the garbage collector moves the object. The garbage collector has no facility for updating pointers to the interiors of objects, so such pointers, for example the ones returned by s48_extract_string, will likely become invalid when a garbage collection occurs.
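
Why registration matters can be shown with a toy model of a moving collector. This is not Scheme 48's collector or its API; every toy_ name below is hypothetical. The model registers the address of a variable, then "moves" an object and updates only the registered variables, leaving unregistered ones pointing at stale storage.

```c
#include <stddef.h>

#define TOY_MAX_ROOTS 8

/* Toy heap object with a single payload field. */
typedef struct { int payload; } toy_object;

/* Addresses of registered variables (l-values), not their values. */
static toy_object **toy_roots[TOY_MAX_ROOTS];
static size_t toy_root_count = 0;

/* Register a variable; returns 1 on success, 0 if the table is full. */
static int toy_protect(toy_object **var)
{
  if (toy_root_count == TOY_MAX_ROOTS)
    return 0;
  toy_roots[toy_root_count++] = var;
  return 1;
}

/* Forget all registered variables. */
static void toy_unprotect_all(void)
{
  toy_root_count = 0;
}

/* Simulate a copying collection of one object: copy it to its new
   location, invalidate the old copy, and update every registered
   variable that pointed at the old location. */
static void toy_move(toy_object *from, toy_object *to)
{
  size_t i;

  *to = *from;
  from->payload = -1;           /* the old copy is now garbage */
  for (i = 0; i < toy_root_count; i++)
    if (*toy_roots[i] == from)
      *toy_roots[i] = to;
}
```

After toy_move, a registered variable points at the new copy, while an unregistered one still points at the invalidated original. This is the failure that unprotected s48_value variables risk across an allocation.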

7.8.1  Registering objects with the GC

A set of macros is used to manage the registration of local variables with the garbage collector.

S48_DECLARE_GC_PROTECT(n), where 1 <= n <= 9, allocates storage for registering n variables. At most one use of S48_DECLARE_GC_PROTECT may occur in a block. S48_GC_PROTECT_n(v1, ..., vn) registers the n variables (l-values) with the garbage collector. It must be within scope of a S48_DECLARE_GC_PROTECT(n) and be before any code which can cause a GC. S48_GC_UNPROTECT removes the block's protected variables from the garbage collector's list. It must be called at the end of the block after any code which may cause a garbage collection. Omitting any of the three may cause serious and hard-to-debug problems. Notably, the garbage collector may relocate an object and invalidate s48_value variables which are not protected.

A gc-protection-mismatch exception is raised if, when a C procedure returns to Scheme, the calls to S48_GC_PROTECT() have not been matched by an equal number of calls to S48_GC_UNPROTECT().

Global variables may also be registered with the garbage collector.

S48_GC_PROTECT_GLOBAL permanently registers the variable value (an l-value of type s48_value) with the garbage collector. It returns a handle pointer for use as an argument to S48_GC_UNPROTECT_GLOBAL, which unregisters the variable again.

7.8.2  Keeping C data structures in the Scheme heap

C data structures can be kept in the Scheme heap by embedding them inside byte vectors. The following macros can be used to create and access embedded C objects.

S48_MAKE_VALUE makes a byte vector large enough to hold an object whose type is type. S48_EXTRACT_VALUE returns the contents of a byte vector cast to type, and S48_EXTRACT_VALUE_POINTER returns a pointer to the contents of the byte vector. The value returned by S48_EXTRACT_VALUE_POINTER is valid only until the next garbage collection.

S48_SET_VALUE stores value into the byte vector.
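
The idea of embedding a C object in a byte vector can be sketched in plain C, with an ordinary unsigned char array standing in for the Scheme byte vector. The struct and both helpers are hypothetical and do not use the S48_ macros; they only show the byte-level copy that such embedding amounts to.

```c
#include <string.h>

/* Hypothetical C structure to be embedded. */
struct example_point { int x; int y; };

/* Copy a C object into raw bytes; bytes must have room for
   sizeof(struct example_point). */
static void example_store_point(unsigned char *bytes,
                                const struct example_point *p)
{
  memcpy(bytes, p, sizeof *p);
}

/* Copy the object back out of the raw bytes.  Returning a copy, rather
   than a pointer into the buffer, avoids holding an interior pointer. */
static struct example_point example_load_point(const unsigned char *bytes)
{
  struct example_point p;

  memcpy(&p, bytes, sizeof p);
  return p;
}
```

Working on a copy mirrors the caution in the text: a pointer into the byte vector's contents is valid only until the next garbage collection, whereas a copied value stays usable.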

7.8.3  C code and heap images

Scheme 48 uses dumped heap images to restore a previous system state. The Scheme 48 heap is written into a file in a machine-independent and operating-system-independent format. The procedures described above may be used to create objects in the Scheme heap that contain information specific to the current machine, operating system, or process. A heap image containing such objects may not work correctly when resumed.

To address this problem, a record type may be given a `resumer' procedure. On startup, the resumer procedure for a type is applied to each record of that type in the image being restarted. This procedure can update the record in a manner appropriate to the machine, operating system, or process used to resume the image.

Define-record-resumer defines procedure, which should accept one argument, to be the resumer for record-type. The order in which resumer procedures are called is not specified.

The procedure argument to define-record-resumer may be #f, in which case records of the given type are not written out in heap images. When writing a heap image any reference to such a record is replaced by the value of the record's first field, and an exception is raised after the image is written.

7.9  Using Scheme records in C code

External modules can create records and access their slots positionally.

The argument to s48_make_record should be a shared binding whose value is a record type. In C the fields of Scheme records are only accessible via offsets, with the first field having offset zero, the second offset one, and so forth. If the order of the fields is changed in the Scheme definition of the record type the C code must be updated as well.

For example, given the following record-type definition

(define-record-type thing :thing
  (make-thing a b)
  thing?
  (a thing-a)
  (b thing-b))

the identifier :thing is bound to the record type and can be exported to C:

(define-exported-binding "thing-record-type" :thing)

Thing records can then be made in C:

static s48_value
  thing_record_type_binding = S48_FALSE;

void initialize_things(void)
{
  S48_GC_PROTECT_GLOBAL(thing_record_type_binding);
  thing_record_type_binding =
     s48_get_imported_binding("thing-record-type");
}

s48_value make_thing(s48_value a, s48_value b)
{
  s48_value thing;
  S48_DECLARE_GC_PROTECT(2);

  S48_GC_PROTECT_2(a, b);

  thing = s48_make_record(thing_record_type_binding);
  S48_RECORD_SET(thing, 0, a);
  S48_RECORD_SET(thing, 1, b);

  S48_GC_UNPROTECT();

  return thing;
}

Note that the variables a and b must be protected against the possibility of a garbage collection occurring during the call to s48_make_record().

7.10  Raising exceptions from external code

The following macros explicitly raise certain errors, immediately returning to Scheme 48. Raising an exception performs all necessary clean-up actions to properly return to Scheme 48, including adjusting the stack of protected variables.

s48_raise_scheme_exception is the base procedure for raising exceptions. type is the type of exception, and should be one of the S48_EXCEPTION_... constants defined in scheme48arch.h. nargs is the number of additional values to be included in the exception; these follow the nargs argument and should all have type s48_value. s48_raise_scheme_exception never returns.

The following procedures are available for raising particular types of exceptions. Like s48_raise_scheme_exception these never return.

An argument type error indicates that the given value is of the wrong type. An argument number error is raised when the number of arguments, nargs, should be, but isn't, between min and max, inclusive. Similarly, a range error indicates that value is not between min and max, inclusive.

The following macros raise argument type errors if their argument does not have the required type. S48_CHECK_BOOLEAN raises an error if its argument is neither #t nor #f.

7.11  Unsafe functions and macros

All of the C procedures and macros described above check that their arguments have the appropriate types and that indexes are in range. The following procedures and macros are identical to those described above, except that they do not perform type and range checks. They are provided for the purpose of writing more efficient code; their general use is not recommended.