This chapter describes the foreign-function interface for calling C functions from Scheme, calling Scheme functions from C, and allocating storage in the Scheme heap. Scheme 48 manages stub functions in C that negotiate between the calling conventions of Scheme and C and the memory allocation policies of both worlds. No stub generator is available yet, but writing stubs is a straightforward task.
The foreign-function interface is modeled after the Java Native Interface (JNI); more information can be found at Sun’s JNI Page.
Currently, Scheme 48 supports two foreign-function interfaces: the old GCPROTECT-style interface and the new JNI-style interface (described in this chapter) live side by side. The old interface is deprecated and will go away in a future release. Section 8.12 gives a recipe for converting external code from the old to the new interface.
The following facilities are available for interfacing between Scheme 48 and C:
Scheme code can call C functions.
The external interface provides full introspection for all Scheme objects. External code may inspect, modify, and allocate Scheme objects arbitrarily.
External code may raise exceptions back to Scheme 48 to signal errors.
External code may call back into Scheme. Scheme 48 correctly unrolls the process stack on non-local exits.
External modules may register bindings of names to values with a central registry accessible from Scheme. Conversely, Scheme code can register shared bindings for access by C code.
The structure external-calls has most of the Scheme functions described here. The others are in load-dynamic-externals, which has the functions for dynamic loading and name lookup from Section 8.4, and shared-bindings, which has the additional shared-binding functions described in section 8.2.3.
The names of all of Scheme 48’s visible C bindings begin with ‘s48_’ (for procedures, variables, and macros). Note that the new foreign-function interface does not distinguish between procedures and macros. Whenever a C name is derived from a Scheme identifier, we replace ‘-’ with ‘_’ and convert letters to lowercase. A final ‘?’ is converted to ‘_p’, and a final ‘!’ is dropped. As a naming convention, all functions and macros of the new foreign-function interface end in ‘_2’ (for now) to make them distinguishable from the old interface’s functions and macros. Thus the C macro for Scheme’s pair? is s48_pair_p_2 and the one for set-car! is s48_set_car_2. Procedures and macros that do not check the types of their arguments have ‘unsafe’ in their names.
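The naming convention above can be sketched as a small string transformation. This is only an illustration: the helper scheme_to_c_name is hypothetical and not part of scheme48.h, and it assumes that the ‘s48_’ prefix and the ‘_2’ suffix are simply wrapped around the converted identifier, as in the two examples given in the text.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of scheme48.h): derive the C name that
   the naming convention assigns to a Scheme identifier.
   Returns 0 on success, -1 if `out` is too small. */
int scheme_to_c_name(const char *scheme, char *out, size_t out_size)
{
    const char *prefix = "s48_";
    const char *suffix = "_2";
    size_t len = strlen(scheme);
    size_t pos = strlen(prefix);
    size_t i;

    /* Conservative bound: prefix + identifier + "_p" + suffix + NUL. */
    if (out_size < strlen(prefix) + len + 2 + strlen(suffix) + 1)
        return -1;

    memcpy(out, prefix, strlen(prefix));

    for (i = 0; i < len; i++) {
        char c = scheme[i];
        int is_last = (i + 1 == len);

        if (is_last && c == '?') {
            out[pos++] = '_';            /* a final '?' becomes "_p" */
            out[pos++] = 'p';
        } else if (is_last && c == '!') {
            /* a final '!' is dropped */
        } else if (c == '-') {
            out[pos++] = '_';            /* '-' becomes '_' */
        } else {
            out[pos++] = (char)tolower((unsigned char)c);
        }
    }

    memcpy(out + pos, suffix, strlen(suffix) + 1);  /* includes the NUL */
    return 0;
}
```

Under these assumptions, passing "pair?" yields "s48_pair_p_2" and passing "set-car!" yields "s48_set_car_2", matching the names quoted above.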
All of the C functions and macros described have prototypes or definitions in the file c/scheme48.h.
Scheme 48 uses a precise, copying garbage collector. The garbage collector may run whenever an object is allocated in the heap. The collector must be able to locate all references to objects allocated in the Scheme 48 heap in order to ensure that storage is not reclaimed prematurely and to update references to objects moved by the collector. This interface takes care of communicating to the garbage collector what objects it uses in most situations. It relieves the programmer from having to think about garbage collector interactions in the common case.
This interface does not give external code direct access to Scheme objects. It introduces one level of indirection: external code never accepts or returns Scheme values (whose C type is s48_value) directly. Instead, external code accepts or returns reference objects of type s48_ref_t that refer to Scheme values. This indirection is needed only at the interface to external code; interior pointers in Scheme objects are unaffected.
There are two types of reference objects: local references and global references.
Scheme objects that are passed to external functions are passed as local references. External functions return Scheme objects as local references. External code has to manually manage Scheme objects that outlive a function call as global references. Scheme objects outlive a function call if they are assigned to a global variable of the external code or stored in long-living external objects; see section 8.7.1.
A local reference is valid only within the dynamic context of the external call that creates it. Therefore, a local reference behaves exactly like a local variable in the external code: it is live as long as external code can access it. To achieve this, every external function in the interface that accepts or returns reference objects takes a call object of type s48_call_t as its first argument. A call object corresponds to a particular call from Scheme to C. It holds all the references that belong to that call, such as the call’s arguments and return value. External code may pass a local reference through multiple external functions. The foreign-function interface automatically frees all the local references a call object owns, along with the call object itself, when an external call returns to Scheme.
This means that in the common case of Scheme calling an external function that does some work on its arguments and returns without stashing any Scheme objects in global variables or global data structures, the external code does not need to do any bookkeeping, since all the reference objects the external code accumulates are local references. Once the call returns, the foreign-function interface frees all the local references.
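The ownership rule described above can be pictured with a toy model. The sketch below is not Scheme 48's implementation, and every name in it (toy_ref, toy_call, toy_make_local_ref, toy_end_call) is hypothetical. It only illustrates the idea that a call object records each local reference created on its behalf and releases all of them in one step when the call ends.

```c
#include <stdlib.h>

/* Toy stand-ins, NOT the real s48_ref_t and s48_call_t. */
typedef struct toy_ref {
    int value;              /* stands in for the Scheme value referred to */
    struct toy_ref *next;   /* next local reference owned by the same call */
} toy_ref;

typedef struct toy_call {
    toy_ref *locals;        /* all local references created during this call */
    size_t count;
} toy_call;

/* Create a local reference and register it with the call object. */
toy_ref *toy_make_local_ref(toy_call *call, int value)
{
    toy_ref *ref = malloc(sizeof *ref);
    if (ref == NULL)
        return NULL;
    ref->value = value;
    ref->next = call->locals;
    call->locals = ref;
    call->count++;
    return ref;
}

/* What happens conceptually when the external call returns to Scheme:
   every local reference the call owns is freed. Returns how many were freed. */
size_t toy_end_call(toy_call *call)
{
    size_t freed = 0;
    while (call->locals != NULL) {
        toy_ref *next = call->locals->next;
        free(call->locals);
        call->locals = next;
        freed++;
    }
    call->count = 0;
    return freed;
}
```

In this model the external code never frees a reference itself, which mirrors the common case described in the text.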
For example, the functions to construct and access pairs are declared like this:
s48_ref_t s48_cons_2(s48_call_t call, s48_ref_t car, s48_ref_t cdr);
s48_ref_t s48_car_2(s48_call_t call, s48_ref_t pair);
s48_ref_t s48_cdr_2(s48_call_t call, s48_ref_t pair);
This foreign-function interface takes a significant burden off the programmer, as it handles the most common cases automatically. If all the Scheme objects are live for the extent of the current external call, the programmer does not have to do anything at all, since the lifetime of the Scheme objects is then identical to the lifetime of the corresponding reference objects. In this case, the system automatically manages both for the programmer. Using this foreign-function interface does not make the code more complex; the code stays compact and readable. The programmer only has to get accustomed to passing the call argument around.
How to manage Scheme objects that outlive the current call is described in section 8.7.1.
Section 8.12 gives a recipe for converting external code from the old GCPROTECT-style interface to the new JNI-style interface.
Shared bindings are the means by which named values are shared between Scheme code and C code. There are two separate tables of shared bindings, one for values defined in Scheme and accessed from C and the other for values going the other way. Shared bindings actually bind names to cells, to allow a name to be looked up before it has been assigned. This is necessary because C initialization code may be run before or after the corresponding Scheme code, depending on whether the Scheme code is in the resumed image or is run in the current session.
s48_ref_t s48_get_imported_binding_2(char *name)
s48_ref_t s48_get_imported_binding_local_2(s48_call_t call, char *name)
s48_ref_t s48_shared_binding_ref_2(s48_call_t call, s48_ref_t shared_binding)
Define-exported-binding makes value available to C code under name, which must be a string, creating a new shared binding if necessary. The C function s48_get_imported_binding_2 returns a global reference to the shared binding defined for name, again creating it if necessary; s48_get_imported_binding_local_2 returns a local reference to the shared binding (see section 8.1.3 for details on reference objects). The C macro s48_shared_binding_ref_2 dereferences a shared binding, returning its current value.
Since shared bindings are defined during initialization, i.e. outside an external call, there is no call object. Therefore, exporting shared bindings from C does not follow the conventions of the new foreign-function interface.
void s48_define_exported_binding(char *name, s48_value v)
These are used to define shared bindings from C and to access them from Scheme. Again, if a name is looked up before it has been defined, a new binding is created for it.
The common case of exporting a C function to Scheme can be done using the macro S48_EXPORT_FUNCTION(name). This expands into
s48_define_exported_binding("name", s48_enter_pointer(name))
which boxes the function pointer into a Scheme byte vector and then exports it. Note that s48_enter_pointer allocates space in the Scheme heap and might trigger a garbage collection; see Section 8.7.
These macros simplify importing definitions from C to Scheme. They expand into
(define name (lookup-imported-binding c-name))
where c-name is as supplied for the second form. For the first form c-name is derived from name by replacing ‘-’ with ‘_’ and converting letters to lowercase. For example, (import-definition my-foo) expands into
(define my-foo (lookup-imported-binding "my_foo"))
There are a number of other Scheme functions related to shared bindings; these are in the structure shared-bindings.
Shared-binding? is the predicate for shared-bindings. Shared-binding-name returns the name of a binding. Shared-binding-is-import? is true if the binding was defined from C. Shared-binding-set! changes the value of a binding. Define-imported-binding and lookup-exported-binding are Scheme versions of s48_define_exported_binding and s48_lookup_imported_binding. The two undefine- procedures remove bindings from the two tables. They do nothing if the name is not found in the table.
The following C macros correspond to the Scheme functions above.
int s48_shared_binding_p(s48_call_t call, s48_ref_t x)
int s48_shared_binding_is_import_p(s48_call_t call, s48_ref_t s_b)
s48_ref_t s48_shared_binding_name(s48_call_t call, s48_ref_t s_b)
void s48_shared_binding_set(s48_call_t call, s48_ref_t s_b, s48_ref_t v)
There are different ways to call C functions from Scheme, depending on how the C function was obtained.
Each of these applies its first argument, a C function that accepts and/or returns objects of type s48_ref_t and has its first argument of type s48_call_t, to the rest of the arguments. For call-imported-binding-2 the function argument must be an imported binding.
For all of these, the interface passes the current call object and the argi values to the C function, and the value returned is the one returned by the C procedure. No automatic representation conversion occurs for either arguments or return values. Up to twelve arguments may be passed. There is no method supplied for returning multiple values to Scheme from C or vice versa, mainly because C does not have multiple return values.
Keyboard interrupts that occur during a call to a C function are ignored until the function returns to Scheme (this is clearly a problem; we are working on a solution).
(import-lambda-definition-2 name (formal ...) c-name) (syntax)
These macros simplify importing functions from C that follow the return value and argument conventions of the foreign-function interface and use s48_call_t and s48_ref_t as their argument and return types. They define name to be a function with the given formals that applies the corresponding C binding to those formals. C-name, if supplied, should be a string. These expand into
(define temp (lookup-imported-binding c-name))
(define name
  (lambda (formal ...)
    (call-imported-binding-2 temp formal ...)))
If c-name is not supplied, it is derived from name by converting all letters to lowercase and replacing ‘-’ with ‘_’.
External code can be loaded into a running Scheme 48, at least on most variants of Unix and on Windows. The required Scheme functions are in the structure load-dynamic-externals.
To be suitable for dynamic loading, the externals code must reside in a shared object. The shared object must define a function:
void s48_on_load(void)
The s48_on_load function is run when the shared object is loaded. It typically contains invocations of S48_EXPORT_FUNCTION to make the functionality defined by the shared object known to Scheme 48.
The shared object may also define either or both of the following functions:
void s48_on_unload(void)
void s48_on_reload(void)
Scheme 48 calls s48_on_unload just before it unloads the shared object. If s48_on_reload is present, Scheme 48 calls it when it loads the shared object for the second time, or some new version thereof. If it is not present, Scheme 48 calls s48_on_load instead. (More on that later.)
For Linux, the following commands compile foo.c into a file foo.so that can be loaded dynamically.
% gcc -c -o foo.o foo.c
% ld -shared -o foo.so foo.o
The following procedures provide the basic functionality for loading shared objects containing dynamic externals:
Load-dynamic-externals loads the named shared object. The plete? argument determines whether Scheme 48 appends the OS-specific suffix (typically .so for Unix, and .dll for Windows) to the name. The rrepeat? argument determines how load-dynamic-externals behaves if it is called again with the same argument: if it is true, the procedure reloads the shared object (calling its s48_on_unload on unloading if present, and, after reloading, s48_on_reload if present or s48_on_load if not); otherwise, it does nothing. The rresume? argument determines whether an image subsequently dumped will try to load the shared object again automatically. (The shared objects will be loaded before any record resumers run.) Load-dynamic-externals returns a handle identifying the shared object just loaded.
Unload-dynamic-externals unloads the shared object associated with the handle passed as its argument, previously calling its s48_on_unload function if present. Note that this invalidates all external bindings associated with the shared object; referring to any of them will probably crash the program.
Reload-dynamic-externals will reload the shared object named by its argument and call its s48_on_unload function before unloading, and, after reloading, s48_on_reload if present or s48_on_load if not.
This procedure represents the most common expected usage for loading dynamic externals. It is best explained by its definition:
(define (import-dynamic-externals name) (load-dynamic-externals name #t #f #t))
The C header file scheme48.h provides access to Scheme 48 data structures. The type s48_ref_t is used for reference objects that refer to Scheme values. When the type of a value is known, such as the integer returned by vector-length or the boolean returned by pair?, the corresponding C procedure returns a C value of the appropriate type, and not a s48_ref_t. Predicates return 1 for true and 0 for false.
The following macros denote Scheme constants:
s48_false_2(s48_call_t) is #f.
s48_true_2(s48_call_t) is #t.
s48_null_2(s48_call_t) is the empty list.
s48_unspecific_2(s48_call_t) is a value used for functions which have no meaningful return value (in Scheme 48 this value is returned by the nullary procedure unspecific in the structure util).
s48_eof_2(s48_call_t) is the end-of-file object (in Scheme 48 this value is returned by the nullary procedure eof-object in the structure i/o-internal).
The following macros and functions convert values between Scheme and C representations. The ‘extract’ ones convert from Scheme to C and the ‘enter’s go the other way.
int s48_extract_boolean_2(s48_call_t, s48_ref_t)
long s48_extract_char_2(s48_call_t, s48_ref_t)
char * s48_extract_byte_vector_2(s48_call_t, s48_ref_t)
long s48_extract_long_2(s48_call_t, s48_ref_t)
unsigned long s48_extract_unsigned_long_2(s48_call_t, s48_ref_t)
double s48_extract_double_2(s48_call_t, s48_ref_t)
s48_ref_t s48_enter_boolean_2(s48_call_t, int)
s48_ref_t s48_enter_char_2(s48_call_t, long)
s48_ref_t s48_enter_byte_vector_2(s48_call_t, char *, long) (may GC)
s48_ref_t s48_enter_long_2(s48_call_t, long) (may GC)
s48_ref_t s48_enter_long_as_fixnum_2(s48_call_t, long)
s48_ref_t s48_enter_double_2(s48_call_t, double) (may GC)
s48_extract_boolean_2 is false if its argument is #f and true otherwise. s48_enter_boolean_2 is #f if its argument is zero and #t otherwise.
The s48_extract_char_2 function extracts the scalar value from a Scheme character as a C long. Conversely, s48_enter_char_2 creates a Scheme character from a scalar value. (Note that ASCII values are also scalar values.)
The s48_extract_byte_vector_2 function needs to deal with the garbage collector to avoid invalidating the returned pointer. For more details see section 8.7.3.
The last argument to s48_enter_byte_vector_2 is the length of the byte vector.
s48_enter_long_2() needs to allocate storage when its argument is too large to fit in a Scheme 48 fixnum. In cases where the number is known to fit within a fixnum (currently 30 bits on a 32-bit architecture and 62 bits on a 64-bit architecture, including the sign), the following procedures can be used. These have the disadvantage of only having a limited range, but the advantage of never causing a garbage collection. s48_fixnum_p_2 is a macro that is true if its argument is a fixnum and false otherwise.
int s48_fixnum_p_2(s48_call_t, s48_ref_t)
s48_ref_t s48_enter_long_as_fixnum_2(s48_call_t, long)
long S48_MAX_FIXNUM_VALUE
long S48_MIN_FIXNUM_VALUE
An error is signaled if the argument to s48_enter_long_as_fixnum_2 is less than S48_MIN_FIXNUM_VALUE or greater than S48_MAX_FIXNUM_VALUE (−2^29 and 2^29−1 on a 32-bit architecture and −2^61 and 2^61−1 on a 64-bit architecture).
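The fixnum bounds follow directly from the stated widths: a fixnum of n bits including the sign covers −2^(n−1) through 2^(n−1)−1. The sketch below works this out for the 30-bit and 62-bit cases. The helper names are hypothetical and are not part of scheme48.h; real code should use S48_MIN_FIXNUM_VALUE and S48_MAX_FIXNUM_VALUE instead.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of scheme48.h. `bits` is the fixnum
   width including the sign bit: 30 on a 32-bit architecture, 62 on a
   64-bit architecture. */
int64_t fixnum_min(int bits)
{
    return -(INT64_C(1) << (bits - 1));
}

int64_t fixnum_max(int bits)
{
    return (INT64_C(1) << (bits - 1)) - 1;
}

/* Nonzero if `value` is representable as a fixnum of the given width,
   i.e. if entering it as a fixnum would not signal a range error. */
int fits_in_fixnum(int64_t value, int bits)
{
    return value >= fixnum_min(bits) && value <= fixnum_max(bits);
}
```

For a 30-bit fixnum this gives the range −536870912 through 536870911. A caller that cannot guarantee the range would check it first and fall back to s48_enter_long_2, which may allocate.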
int s48_true_p_2(s48_call_t, s48_ref_t)
int s48_false_p_2(s48_call_t, s48_ref_t)
s48_true_p_2 is true if its argument is #t, and s48_false_p_2 is true if its argument is #f.
s48_ref_t s48_enter_string_latin_1_2(s48_call_t, char*); (may GC)
s48_ref_t s48_enter_string_latin_1_n_2(s48_call_t, char*, long); (may GC)
long s48_string_latin_1_length_2(s48_call_t, s48_ref_t);
long s48_string_latin_1_length_n_2(s48_call_t, s48_ref_t, long, long);
void s48_copy_latin_1_to_string_2(s48_call_t, char*, s48_ref_t);
void s48_copy_latin_1_to_string_n_2(s48_call_t, char*, long, s48_ref_t);
void s48_copy_string_to_latin_1_2(s48_call_t, s48_ref_t, char*);
void s48_copy_string_to_latin_1_n_2(s48_call_t, s48_ref_t, long, long, char*);
s48_ref_t s48_enter_string_utf_8_2(s48_call_t, char*); (may GC)
s48_ref_t s48_enter_string_utf_8_n_2(s48_call_t, char*, long); (may GC)
long s48_string_utf_8_length_2(s48_call_t, s48_ref_t);
long s48_string_utf_8_length_n_2(s48_call_t, s48_ref_t, long, long);
long s48_copy_string_to_utf_8_2(s48_call_t, s48_ref_t, char*);
long s48_copy_string_to_utf_8_n_2(s48_call_t, s48_ref_t, long, long, char*);
s48_ref_t s48_enter_string_utf_16be_2(s48_call_t, char*); (may GC)
s48_ref_t s48_enter_string_utf_16be_n_2(s48_call_t, char*, long); (may GC)
long s48_string_utf_16be_length_2(s48_call_t, s48_ref_t);
long s48_string_utf_16be_length_n_2(s48_call_t, s48_ref_t, long, long);
long s48_copy_string_to_utf_16be_2(s48_call_t, s48_ref_t, char*);
long s48_copy_string_to_utf_16be_n_2(s48_call_t, s48_ref_t, long, long, char*);
s48_ref_t s48_enter_string_utf_16le_2(s48_call_t, char*); (may GC)
s48_ref_t s48_enter_string_utf_16le_n_2(s48_call_t, char*, long); (may GC)
long s48_string_utf_16le_length_2(s48_call_t, s48_ref_t);
long s48_string_utf_16le_length_n_2(s48_call_t, s48_ref_t, long, long);
long s48_copy_string_to_utf_16le_2(s48_call_t, s48_ref_t, char*);
long s48_copy_string_to_utf_16le_n_2(s48_call_t, s48_ref_t, long, long, char*);
The s48_enter_string_latin_1_2 function creates a Scheme string, initializing its contents from its NUL-terminated, Latin-1-encoded argument. The s48_enter_string_latin_1_n_2 function does the same, but allows specifying the length explicitly, so no NUL terminator is necessary.
The s48_string_latin_1_length_2 function computes the length that the Latin-1 encoding of its argument (a Scheme string) would occupy, not including NUL termination. The s48_string_latin_1_length_n_2 function does the same, but allows specifying a starting index and a count into the input string.
The s48_copy_latin_1_to_string_2 function copies Latin-1-encoded characters from its NUL-terminated second argument to the Scheme string that is its third argument. The s48_copy_latin_1_to_string_n_2 function does the same, but allows specifying the number of characters explicitly. The s48_copy_string_to_latin_1_2 function converts the characters of the Scheme string specified as the second argument into Latin-1 and writes them into the string specified as the third argument. (Note that it does not NUL-terminate the result.) The s48_copy_string_to_latin_1_n_2 function does the same, but allows specifying a starting index and a character count into the source string.
The s48_extract_latin_1_from_string_2 function returns a buffer that contains the Latin-1 encoded characters including NUL termination of the Scheme string specified. The buffer that is returned is a local buffer managed by the foreign-function interface and is automatically freed on the return of the current call.
The s48_enter_string_utf_8_2 function creates a Scheme string, initializing its contents from its NUL-terminated, UTF-8-encoded argument. The s48_enter_string_utf_8_n_2 function does the same, but allows specifying the length explicitly, so no NUL terminator is necessary.
The s48_string_utf_8_length_2 function computes the length that the UTF-8 encoding of its argument (a Scheme string) would occupy, not including NUL termination. The s48_string_utf_8_length_n_2 function does the same, but allows specifying a starting index and a count into the input string.
The s48_copy_string_to_utf_8_2 function converts the characters of the Scheme string specified as the second argument into UTF-8 and writes them into the string specified as the third argument. (Note that it does not NUL-terminate the result.) The s48_copy_string_to_utf_8_n_2 function does the same, but allows specifying a starting index and a character count into the source string. Both return the length of the written encodings in bytes.
The s48_extract_utf_8_from_string_2 function returns a buffer that contains the UTF-8 encoded characters including NUL termination of the Scheme string specified. The buffer that is returned is a local buffer managed by the foreign-function interface and is automatically freed on the return of the current call.
The functions with utf_16 in their names work analogously to their utf_8 counterparts, but implement the UTF-16 encodings. The lengths returned by the _length and copy_string_to functions are in terms of UTF-16 code units. The extract function returns a local buffer that contains UTF-16 code units including NUL termination.
The following macros and procedures are C versions of Scheme procedures. The names were derived by replacing ‘-’ with ‘_’, ‘?’ with ‘_p’, and dropping ‘!’.
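The encoded length of a string generally differs from its character count, which is why the length functions are needed to size a destination buffer before copying. The sketch below shows how such lengths arise, assuming a string is given as an array of Unicode scalar values. The function names are hypothetical and not part of the interface, and this is not how Scheme 48 computes the lengths internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one Unicode scalar value in UTF-8. */
size_t utf8_bytes(uint32_t scalar)
{
    if (scalar < 0x80)    return 1;   /* ASCII */
    if (scalar < 0x800)   return 2;
    if (scalar < 0x10000) return 3;   /* rest of the Basic Multilingual Plane */
    return 4;                         /* supplementary planes */
}

/* 16-bit code units needed to encode one scalar value in UTF-16. */
size_t utf16_units(uint32_t scalar)
{
    return scalar < 0x10000 ? 1 : 2;  /* 2 means a surrogate pair */
}

/* Total UTF-8 length in bytes of n scalar values, without NUL termination. */
size_t utf8_length(const uint32_t *scalars, size_t n)
{
    size_t total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += utf8_bytes(scalars[i]);
    return total;
}

/* Total UTF-16 length in code units of n scalar values, without NUL termination. */
size_t utf16_length(const uint32_t *scalars, size_t n)
{
    size_t total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += utf16_units(scalars[i]);
    return total;
}
```

For the four scalar values U+0041, U+00E9, U+20AC, and U+1F600, the UTF-8 length is 10 bytes and the UTF-16 length is 5 code units, although the string has only 4 characters.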
int s48_eq_p_2(s48_call_t, s48_ref_t, s48_ref_t)
int s48_char_p_2(s48_call_t, s48_ref_t)
int s48_null_p_2(s48_call_t, s48_ref_t)
int s48_pair_p_2(s48_call_t, s48_ref_t)
s48_ref_t s48_car_2(s48_call_t, s48_ref_t)
s48_ref_t s48_cdr_2(s48_call_t, s48_ref_t)
void s48_set_car_2(s48_call_t, s48_ref_t, s48_ref_t)
void s48_set_cdr_2(s48_call_t, s48_ref_t, s48_ref_t)
s48_ref_t s48_cons_2(s48_call_t, s48_ref_t, s48_ref_t) (may GC)
long s48_length_2(s48_call_t, s48_ref_t)
int s48_vector_p_2(s48_call_t, s48_ref_t)
long s48_vector_length_2(s48_call_t, s48_ref_t)
s48_ref_t s48_vector_ref_2(s48_call_t, s48_ref_t, long)
void s48_vector_set_2(s48_call_t, s48_ref_t, long, s48_ref_t)
s48_ref_t s48_make_vector_2(s48_call_t, long, s48_ref_t) (may GC)
int s48_string_p_2(s48_call_t, s48_ref_t)
long s48_string_length_2(s48_call_t, s48_ref_t)
long s48_string_ref_2(s48_call_t, s48_ref_t, long)
void s48_string_set_2(s48_call_t, s48_ref_t, long, long)
s48_ref_t s48_make_string_2(s48_call_t, long, char) (may GC)
int s48_symbol_p_2(s48_call_t, s48_ref_t)
s48_ref_t s48_symbol_to_string_2(s48_call_t, s48_ref_t)
int s48_byte_vector_p_2(s48_call_t, s48_ref_t)
long s48_byte_vector_length_2(s48_call_t, s48_ref_t)
char s48_byte_vector_ref_2(s48_call_t, s48_ref_t, long)
void s48_byte_vector_set_2(s48_call_t, s48_ref_t, long, int)
s48_ref_t s48_make_byte_vector_2(s48_call_t, long, int) (may GC)
External code that has been called from Scheme can call back to Scheme procedures using the following function.
s48_ref_t s48_call_scheme_2(s48_call_t, s48_ref_t p, long nargs, ...)
This calls the Scheme procedure p on nargs arguments, which are passed as additional arguments to s48_call_scheme_2. There may be at most twelve arguments. The value returned by the Scheme procedure is returned by the C procedure. Invoking any Scheme procedure may potentially cause a garbage collection.
There are some complications that occur when mixing calls from C to Scheme with continuations and threads. C only supports downward continuations (via longjmp()). Scheme continuations that capture a portion of the C stack have to follow the same restriction. For example, suppose Scheme procedure s0 captures continuation a and then calls C procedure c0, which in turn calls Scheme procedure s1. Procedure s1 can safely call the continuation a, because that is a downward use. When a is called Scheme 48 will remove the portion of the C stack used by the call to c0. On the other hand, if s1 captures a continuation, that continuation cannot be used from s0, because by the time control returns to s0 the C stack used by c0 will no longer be valid. An attempt to invoke an upward continuation that is closed over a portion of the C stack will raise an exception.
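The downward restriction comes from how longjmp() works in C: it can only return to a setjmp() point whose function is still active on the stack, and it discards every frame above that point. The following standalone sketch uses plain C only, with no Scheme 48 calls and with hypothetical function names. The setjmp() point plays the role of continuation a, and the frames of middle and inner play the role of the C stack used by c0.

```c
#include <setjmp.h>

static jmp_buf escape_point;   /* plays the role of continuation a */
static int escaped_value;      /* value carried across the non-local exit */
int middle_resumed = 0;        /* stays 0: middle never continues after inner */

void inner(int value)
{
    escaped_value = value;
    longjmp(escape_point, 1);  /* downward exit: unwinds inner and middle */
}

void middle(int value)
{
    inner(value);
    middle_resumed = 1;        /* never reached */
}

int run_with_escape(int value)
{
    if (setjmp(escape_point) == 0) {
        middle(value);         /* frames above this point will be discarded */
        return -1;             /* not reached, because inner always jumps */
    }
    /* Control arrives here via longjmp; the frames of middle and inner
       are gone and cannot be re-entered. */
    return escaped_value;
}
```

Once run_with_escape has returned, escape_point must not be used again, since its stack frame no longer exists. That is the C-level analogue of the upward continuation use that the text says raises an exception.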
In Scheme 48 threads are implemented using continuations, so the downward restriction applies to them as well. An attempt to return from Scheme to C at a time when the appropriate C frame is not on top of the C stack will cause the current thread to block until the frame is available. For example, suppose thread t0 calls a C procedure which calls back to Scheme, at which point control switches to thread t1, which also calls C and then back to Scheme. At this point both t0 and t1 have active calls to C on the C stack, with t1’s C frame above t0’s. If thread t0 attempts to return from Scheme to C it will block, as its frame is not accessible. Once t1 has returned to C and from there to Scheme, t0 will be able to resume. The return to Scheme is required because context switches can only occur while Scheme code is running. Thread t0 will also be able to resume if t1 uses a continuation to throw past its call to C.
Scheme 48 uses a copying, precise garbage collector. Any procedure that allocates objects within the Scheme 48 heap may trigger a garbage collection.
Local object references refer to values in the Scheme 48 heap and are automatically registered with the garbage collector by the interface for the duration of a function call from Scheme to C so that the values will be retained and the references will be updated if the garbage collector moves the object.
Global object references need to be created and freed explicitly for Scheme values that need to outlive a single function call, e.g. values that are stored in global variables or global data structures, or that are passed to libraries. See section 8.7.1 for details.
Additionally, the interface provides local buffers that are malloc’d regions of memory valid for the duration of a function call and are freed automatically upon return from the external code. This relieves the programmer from explicitly freeing locally allocated memory. See section 8.7.2 for details.
The interface treats byte vectors in a special way, since the garbage collector has no facility for updating pointers to the interiors of objects, so that such pointers, for example the ones returned by s48_unsafe_extract_byte_vector_2, will likely become invalid when a garbage collection occurs. The interface provides a facility to prevent a garbage collection from invalidating pointers to a byte vector’s memory region; see section 8.7.3 for details.
When external code needs a reference object to survive the current call, the external code needs to do explicit bookkeeping. Local references must not be stored in global variables of the external code or passed to other threads, since all local references are freed once the call returns, which leaves a dangling pointer in the global variable or in the other thread. Instead, promote the local reference to a global reference and store that in the global variable or pass it to the other thread, since global references survive the current call. Since the foreign-function interface never automatically frees global references, the programmer must free them at the right time.
s48_ref_t s48_make_global_ref(s48_value obj)
void s48_free_global_ref(s48_ref_t ref)
s48_ref_t s48_local_to_global_ref(s48_ref_t ref)
s48_make_global_ref permanently registers the Scheme value obj as a global reference with the garbage collector. Basic Scheme values are _s48_value_true, _s48_value_false, _s48_value_null, _s48_value_unspecific, _s48_value_undefined, and _s48_value_eof.
To free a global reference and to unregister it with the garbage collector, use s48_free_global_ref. The function s48_local_to_global_ref promotes a local reference object to a global reference object.
For example, to maintain a global list of values, declare a static global variable
s48_ref_t global_list = NULL;
and initialize it in the external code’s initialization function
global_list = s48_make_global_ref(_s48_value_null);
Note that you need to use a Scheme value (not a reference object) as the argument for s48_make_global_ref, since there is not yet a call object at the time external code gets initialized. To add new_value to the list, you can use the following snippet:
temp = global_list;
global_list = s48_local_to_global_ref(s48_cons_2(call, new_value, global_list));
s48_free_global_ref(temp);
You have to remember to always promote reference objects to global references when assigning to a global variable (and later, to free them manually). Note that it is more efficient to free the previous head of the list when adding a new element to the list.
The foreign-function interface helps the programmer allocate memory in external code: the programmer can request chunks of memory, called local buffers, that are automatically freed on return from the current call.
void *s48_make_local_buf (s48_call_t, size_t)
void s48_free_local_buf (s48_call_t, void *)
The function s48_make_local_buf returns a block of memory of the given size in bytes. This memory is freed by the foreign-function interface when the current call returns. To free the buffer manually, use s48_free_local_buf.
The interface treats byte vectors in a special way, since the garbage collector has no facility for updating pointers to the interiors of objects, so that such pointers, for example the ones returned by s48_unsafe_extract_byte_vector_2, will likely become invalid when a garbage collection occurs. The interface provides a facility to prevent a garbage collection from invalidating pointers to byte vector’s memory region. It does this by copying byte vectors that are used in external code from and to the Scheme heap.
These functions create byte vectors:
s48_ref_t s48_make_byte_vector_2(s48_call_t, long) (may GC)
s48_ref_t s48_make_unmovable_byte_vector_2(s48_call_t, long) (may GC)
s48_ref_t s48_enter_byte_vector_2(s48_call_t, const char *, long) (may GC)
s48_ref_t s48_enter_unmovable_byte_vector_2(s48_call_t, const char *, long) (may GC)
s48_make_byte_vector_2 creates a byte vector of the given size; s48_make_unmovable_byte_vector_2 creates a byte vector that is not moved by the garbage collector (only the Bibop garbage collector supports this). The functions s48_enter_byte_vector_2 and s48_enter_unmovable_byte_vector_2 create and initialize byte vectors.
The following functions copy byte vectors from and to the Scheme heap:
void s48_extract_byte_vector_region_2(s48_call_t, s48_ref_t, long, long, char*)
void s48_enter_byte_vector_region_2(s48_call_t, s48_ref_t, long, long, char*)
void s48_copy_from_byte_vector_2(s48_call_t, s48_ref_t, char *)
void s48_copy_to_byte_vector_2(s48_call_t, s48_ref_t, char *)
s48_extract_byte_vector_region_2 copies a given section of the given byte vector to its last argument; s48_enter_byte_vector_region_2 copies the contents of its last argument into the given byte vector, starting at the given index. s48_copy_from_byte_vector_2 copies the whole byte vector to its last argument; s48_copy_to_byte_vector_2 copies the contents of its last argument to the byte vector.
char *s48_extract_byte_vector_unmanaged_2(s48_call_t, s48_ref_t)
void s48_release_byte_vector_2(s48_call_t, s48_ref_t, char*)
s48_extract_byte_vector_unmanaged_2 returns a local buffer that is valid during the current external call and copies the contents of the given byte vector to the returned buffer. The returned buffer may be a copy of the Scheme byte vector, so changes made to it will not necessarily be reflected in Scheme until s48_release_byte_vector_2 is called.
The following functions for accessing byte vectors come with the most support from the foreign-function interface. Byte vectors that are accessed via these functions are automatically managed by the interface and are copied back to Scheme on return from the current call:
char *s48_extract_byte_vector_2(s48_call_t, s48_ref_t)
char *s48_extract_byte_vector_readonly_2(s48_call_t, s48_ref_t)
s48_extract_byte_vector_2 extracts a byte vector from Scheme by making a copy of the byte vector’s contents and returning a pointer to that copy. Changes to the copy are automatically copied back to the Scheme heap when the function returns, when external code raises an exception, or when external code calls a Scheme function. s48_extract_byte_vector_readonly_2 should be used for byte vectors that are not modified by external code, since these byte vectors are not copied back to Scheme.
Each reference object consumes a certain amount of memory itself, in addition to the memory taken by the Scheme object it refers to. Even though local references are eventually freed on return of an external call, there are some situations where it is desirable to free local references explicitly, since the return of the call may come too late or never happen, which could keep unneeded objects live:
External code may create a large number of local references in a single external call. An example is the traversal of a list: Each call from external code to the functions that correspond to car and cdr returns a fresh local reference. To avoid the consumption of storage for local references proportional to the length of the list, the traversal must free the no-longer-needed references as it goes.
For example, this is a straightforward definition of an external function that calculates the length of a list:
s48_ref_t s48_length_2(s48_call_t call, s48_ref_t list)
{
  long i = 0;
  while (!(s48_null_p_2(call, list))) {
    list = s48_cdr_2(call, list);
    ++i;
  }
  return s48_unsafe_enter_long_as_fixnum_2(call, i);
}
In this implementation, each iteration step creates a new local reference object via s48_cdr_2 that is actually only needed for the next iteration step. As a result, this function creates a new local reference for every element of the list. The local references are live during the entire function call.
To avoid consuming storage proportional to the length of the list for all those local reference objects, the improved version cleans up the unneeded local reference on every iteration step:
s48_ref_t s48_length_2(s48_call_t call, s48_ref_t list)
{
  s48_ref_t l = s48_copy_local_ref(call, list);
  long i = 0;
  while (!(s48_null_p_2(call, l))) {
    s48_ref_t temp = l;
    l = s48_cdr_2(call, l);
    s48_free_local_ref(call, temp);
    ++i;
  }
  return s48_unsafe_enter_long_as_fixnum_2(call, i);
}
Note that without the call to s48_copy_local_ref the reference to the head of the list would be freed along with all the temporary references. This would render the whole list unusable after the return from s48_length_2.
The external call does not return at all. If the external function enters an infinite event dispatch loop, for example, it is crucial that the programmer manually releases the local references created inside the loop, so that they do not accumulate indefinitely and lead to a memory leak.
External code may hold a local reference to a large Scheme object. After the external code is done working on this object, it performs some additional computation before returning to the caller. The local reference to the large object prevents the object from being garbage collected until the external function returns, even if the object is no longer in use for the remainder of the computation. It is more space-efficient if the programmer frees the local reference when the external function does not need it any longer and will not return for quite some time.
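Returning to the list-traversal case, the difference between the two versions of the length function can be made measurable with a toy model. The sketch below is an illustration only: toy_call, toy_ref, toy_cdr, toy_copy_local_ref, and toy_free_local_ref are hypothetical stand-ins that merely count live references, and say nothing about how Scheme 48 actually represents them.

```c
#include <assert.h>
#include <stddef.h>

/* Toy list cell and toy reference bookkeeping (hypothetical, for counting only). */
typedef struct node { struct node *next; } node;
typedef struct { long live; long peak; } toy_call;   /* live and peak reference counts */
typedef struct { const node *obj; } toy_ref;

static toy_ref toy_new_ref(toy_call *call, const node *obj) {
    toy_ref r = { obj };
    if (++call->live > call->peak) call->peak = call->live;
    return r;
}

toy_ref toy_copy_local_ref(toy_call *call, toy_ref r) {
    return toy_new_ref(call, r.obj);
}

/* Like the cdr accessor, every call hands back a fresh reference. */
toy_ref toy_cdr(toy_call *call, toy_ref r) {
    assert(r.obj != NULL);
    return toy_new_ref(call, r.obj->next);
}

void toy_free_local_ref(toy_call *call, toy_ref r) {
    (void)r;
    assert(call->live > 0);
    call->live--;
}

int toy_null_p(toy_ref r) { return r.obj == NULL; }

/* Straightforward traversal: one new reference per element stays live. */
long length_naive(toy_call *call, toy_ref list) {
    long i = 0;
    while (!toy_null_p(list)) {
        list = toy_cdr(call, list);
        ++i;
    }
    return i;
}

/* Frees each intermediate reference as soon as its successor exists. */
long length_frugal(toy_call *call, toy_ref list) {
    toy_ref l = toy_copy_local_ref(call, list);
    long i = 0;
    while (!toy_null_p(l)) {
        toy_ref temp = l;
        l = toy_cdr(call, l);
        toy_free_local_ref(call, temp);
        ++i;
    }
    return i;
}
```

In this model the naive traversal reaches a peak that grows with the length of the list, while the frugal one never holds more than the caller's reference plus two of its own.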
There are common situations where local references are created solely to be passed to another function and afterwards never used again. In this case, the called function can free the local references of the arguments.
To improve memory usage while making subcalls from external calls, the foreign-function interface provides functionality to create a new (sub-)call object and clean up the local references that are created during that subcall:
s48_call_t s48_make_subcall(s48_call_t call)
void s48_free_subcall(s48_call_t subcall)
s48_ref_t s48_finish_subcall(s48_call_t call, s48_call_t subcall, s48_ref_t ref)
s48_make_subcall returns a new call object that represents a subcall of the current call and can be passed as the call argument to any subcalls of the current call. Upon return of a subcall, s48_free_subcall frees the subcall and all the local references associated with it. The function s48_finish_subcall also frees the subcall and all the local references associated with it, but copies its third argument to the current call, so that it survives the subcall.
C data structures can be kept in the Scheme heap by embedding them inside byte vectors. The following macros can be used to create and access embedded C objects.
s48_ref_t s48_make_value_2(s48_call_t, type) (may GC)
s48_ref_t s48_make_sized_value_2(s48_call_t, size) (may GC)
type s48_extract_value_2(s48_call_t, s48_ref_t, type)
long s48_value_size_2(s48_call_t, s48_ref_t)
type * s48_extract_value_pointer_2(s48_call_t, s48_ref_t, type)
void s48_set_value_2(s48_call_t, s48_ref_t, type, value)
s48_make_value_2 makes a byte vector large enough to hold an object whose type is type. s48_make_sized_value_2 makes a byte vector large enough to hold an object of size bytes. s48_extract_value_2 returns the contents of a byte vector cast to type, s48_value_size_2 returns its size, and s48_extract_value_pointer_2 returns a pointer to the contents of the byte vector. The value returned by s48_extract_value_pointer_2 is valid only until the next garbage collection. s48_set_value_2 stores value into the byte vector.
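The idea of keeping a C object inside a byte vector can be sketched in plain C. This is a conceptual model, not the definition of the macros: toy_byte_vector, toy_make_sized_value, toy_set_value, toy_extract_value, and struct timer_state are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy byte vector: a length plus raw storage (hypothetical stand-in). */
typedef struct {
    size_t length;
    unsigned char *bytes;
} toy_byte_vector;

/* Plays the role of s48_make_sized_value_2: room for `size` bytes. */
toy_byte_vector toy_make_sized_value(size_t size) {
    toy_byte_vector bv = { size, calloc(1, size) };
    assert(bv.bytes != NULL);
    return bv;
}

/* Store a C object by copying its bytes into the vector. */
void toy_set_value(toy_byte_vector bv, const void *value, size_t size) {
    assert(size == bv.length);
    memcpy(bv.bytes, value, size);
}

/* Read the object back by copying its bytes out of the vector. */
void toy_extract_value(toy_byte_vector bv, void *out, size_t size) {
    assert(size == bv.length);
    memcpy(out, bv.bytes, size);
}

/* An example C structure to embed (hypothetical). */
struct timer_state {
    long interval_ms;
    int armed;
};
```

Copying the value out, rather than holding on to a pointer into the vector, sidesteps the restriction that a pointer to the contents is valid only until the next garbage collection.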
Pointers to C data structures can be stored in the Scheme heap:
s48_ref_t s48_enter_pointer_2(s48_call_t, void *) (may GC)
void * s48_extract_pointer_2(s48_call_t, s48_ref_t) (may GC)
The function s48_enter_pointer_2 makes a byte vector large enough to hold the pointer value and stores the pointer value in the byte vector. The function s48_extract_pointer_2 extracts the pointer value from the Scheme heap.
Scheme 48 uses dumped heap images to restore a previous system state. The Scheme 48 heap is written into a file in a machine-independent and operating-system-independent format. The procedures described above may be used to create objects in the Scheme heap that contain information specific to the current machine, operating system, or process. A heap image containing such objects may not work correctly when resumed.
To address this problem, a record type may be given a ‘resumer’ procedure. On startup, the resumer procedure for a type is applied to each record of that type in the image being restarted. This procedure can update the record in a manner appropriate to the machine, operating system, or process used to resume the image.
(define-record-resumer record-type procedure)
Define-record-resumer defines procedure, which should accept one argument, to be the resumer for record-type. The order in which resumer procedures are called is not specified.
The procedure argument to define-record-resumer may be #f, in which case records of the given type are not written out in heap images. When writing a heap image any reference to such a record is replaced by the value of the record’s first field, and an exception is raised after the image is written.
External modules can create records and access their slots positionally.
s48_ref_t s48_make_record_2(s48_call_t, s48_ref_t) (may GC)
int s48_record_p_2(s48_call_t, s48_ref_t)
s48_ref_t s48_record_type_2(s48_call_t, s48_ref_t)
s48_ref_t s48_record_ref_2(s48_call_t, s48_ref_t, long)
void s48_record_set_2(s48_call_t, s48_ref_t, long, s48_ref_t)
The argument to s48_make_record_2 should be a shared binding whose value is a record type. In C the fields of Scheme records are only accessible via offsets, with the first field having offset zero, the second offset one, and so forth. If the order of the fields is changed in the Scheme definition of the record type the C code must be updated as well.
For example, given the following record-type definition
(define-record-type thing :thing
  (make-thing a b)
  thing?
  (a thing-a)
  (b thing-b))
the identifier :thing is bound to the record type and can be exported to C:
(define-exported-binding "thing-record-type" :thing)
Thing records can then be made in C:
static s48_ref_t thing_record_type_binding = NULL;

void initialize_things(void)
{
  thing_record_type_binding =
    s48_get_imported_binding_2("thing-record-type");
}

s48_ref_t make_thing(s48_call_t call, s48_ref_t a, s48_ref_t b)
{
  s48_ref_t thing;
  thing = s48_make_record_2(call, thing_record_type_binding);
  s48_record_set_2(call, thing, 0, a);
  s48_record_set_2(call, thing, 1, b);
  return thing;
}
Note that the interface takes care of protecting all local references against the possibility of a garbage collection occurring during the call to s48_make_record_2(); also note that the record type binding is a global reference that is live until explicitly freed.
The following macros explicitly raise certain errors, immediately returning to Scheme 48. Raising an exception performs all necessary clean-up actions to properly return to Scheme 48, including adjusting the stack of protected variables.
The following procedures are available for raising particular types of exceptions. These never return.
s48_assertion_violation_2(s48_call_t, const char* who, const char* message, long count, ...)
s48_error_2(s48_call_t, const char* who, const char* message, long count, ...)
s48_os_error_2(s48_call_t, const char* who, const char* message, long count, ...)
s48_out_of_memory_error_2(s48_call_t)
An assertion violation signaled via s48_assertion_violation_2 typically means that an invalid argument (or invalid number of arguments) has been passed. An error signaled via s48_error_2 means that an environmental error (like an I/O error) has occurred. In both cases, who indicates the location of the error, typically the name of the function it occurred in. It may be NULL, in which case the system guesses a name. The message argument is an error message encoded in UTF-8. Additional arguments may be passed that become part of the condition object that will be raised on the Scheme side: count indicates their number, and the arguments (which must be of type s48_ref_t) follow.
The s48_os_error_2 function is like s48_error_2, except that the error message is inferred from an OS error code (as in strerror). The s48_out_of_memory_error_2 function signals that the system has run out of memory.
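The count-plus-arguments convention of these functions is ordinary C variadic argument passing, which is why count must match the number of trailing arguments and why they must all have the same reference type. The sketch below shows the mechanism in isolation; toy_ref_t, toy_condition, and toy_collect are hypothetical names for this illustration, and the function returns a value instead of raising anything.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference type such as s48_ref_t. */
struct toy_ref_rec { int tag; };
typedef struct toy_ref_rec *toy_ref_t;

#define TOY_MAX_IRRITANTS 8

/* What a raised condition might carry along (illustrative only). */
typedef struct {
    const char *who;
    const char *message;
    long count;
    toy_ref_t irritants[TOY_MAX_IRRITANTS];
} toy_condition;

/* Reads exactly `count` trailing arguments, each as a toy_ref_t.
   Passing fewer arguments than `count`, or arguments of another type,
   is undefined behavior, just as with any variadic C function. */
toy_condition toy_collect(const char *who, const char *message,
                          long count, ...) {
    toy_condition c = { who, message, 0, { NULL } };
    va_list ap;
    assert(count >= 0 && count <= TOY_MAX_IRRITANTS);
    va_start(ap, count);
    for (long i = 0; i < count; i++)
        c.irritants[i] = va_arg(ap, toy_ref_t);
    va_end(ap);
    c.count = count;
    return c;
}
```

A call such as toy_collect("my_function", "bad argument", 2, a, b) therefore has to be kept in step by hand: the compiler cannot check that the count agrees with the arguments that follow it.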
The following macros raise assertion violations if their argument does not have the required type. s48_check_boolean_2 raises an error if its argument is neither #t nor #f.
void s48_check_boolean_2(s48_call_t, s48_ref_t)
void s48_check_symbol_2(s48_call_t, s48_ref_t)
void s48_check_pair_2(s48_call_t, s48_ref_t)
void s48_check_string_2(s48_call_t, s48_ref_t)
void s48_check_integer_2(s48_call_t, s48_ref_t)
void s48_check_channel_2(s48_call_t, s48_ref_t)
void s48_check_byte_vector_2(s48_call_t, s48_ref_t)
void s48_check_record_2(s48_call_t, s48_ref_t)
void s48_check_shared_binding_2(s48_call_t, s48_ref_t)
External code can push the occurrence of external events into the main Scheme 48 event loop and Scheme code can wait and act on external events.
On the Scheme side, the external events functionality consists of the following functions from the structure primitives:
And the following functions from the structure external-events:
The function new-external-event-uid returns a fresh event identifier on every call. When called with a shared binding instead of #f, new-external-event-uid returns a named event identifier for permanent use. The function unregister-external-event-uid! unregisters the given event identifier.
External events use condition variables to synchronize the occurrence of events; see section 7.5 for more information on condition variables. The function register-condvar-for-external-event! registers a condition variable with an event identifier. For convenience, the function new-external-event combines new-external-event-uid and register-condvar-for-external-event! and returns a fresh event identifier and the corresponding condition variable.
The function wait-for-external-event blocks the caller (on the condition variable) until the Scheme main event loop receives an event notification (by s48_note_external_event) for the event identifier that is registered with the given condition variable (with register-condvar-for-external-event!). There is no guarantee that the caller of wait-for-external-event is unblocked on every event notification. Therefore, the caller has to be prepared to handle multiple external events that have occurred, and external code has to be prepared to store multiple external events.
The following prototype is the interface on the external side:
void s48_note_external_event(long)
External code has to collect external events and can use s48_note_external_event to signal the occurrence of an external event to the main event loop. The argument to s48_note_external_event is an event identifier that was previously registered on the Scheme side. Thus, external code has to obtain the event identifier from the Scheme side, either by passing the event identifier as an argument to the external function that calls s48_note_external_event or by exporting the Scheme value to C (see section 8.2.1).
Since the main event loop does not guarantee that every call to s48_note_external_event causes the event that just occurred to be handled immediately, external code has to make sure that it can collect multiple external events (i.e. keep them in an appropriate data structure). It is safe for external code to call s48_note_external_event on every collected external event, though, even if older events have not been handled yet.
External code has to be able to collect multiple events that have occurred. Therefore, external code has to create the needed data structures to store the information that is associated with each event. Usually, external code collects the events in a thread. A separate thread does not have a call argument, though, so it cannot create Scheme data structures. It must use C data structures to collect the events; for example, it can create a linked list of events.
Since the events are later handled on the Scheme side, the information associated with the event needs to be visible on the Scheme side, too. Therefore, external code exports a function to Scheme that returns all current events as Scheme objects (the function that returns the events knows about the current call and thus can create Scheme objects). Scheme and external code might need to share Scheme record types that represent the event information. Typically, the function that returns the events converts the C event list into a Scheme event list by preserving the original order in which the events arrived. Note that the external list data structure that holds all events needs to be mutex locked on each access to preserve thread-safe manipulation of the data structure (the Scheme thread that processes events and the external thread that collects events may access the data structures at the same time).
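The mutex-protected event list described above could look roughly like the following sketch. It is one possible design under stated assumptions: the names event, event_queue, pending, event_queue_push, event_queue_take_all, and event_list_free are hypothetical, the payload field stands for whatever data an event carries, and the conversion of the detached list into Scheme objects is left out because it needs the call argument.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One collected event (hypothetical layout). */
typedef struct event {
    struct event *next;
    long uid;      /* event identifier obtained from the Scheme side */
    int payload;   /* placeholder for the data associated with the event */
} event;

/* FIFO list guarded by a mutex, shared by the collector thread and
   the Scheme thread. */
typedef struct {
    pthread_mutex_t lock;
    event *head;   /* oldest event */
    event *tail;   /* newest event */
} event_queue;

event_queue pending = { PTHREAD_MUTEX_INITIALIZER, NULL, NULL };

/* Collector side: append at the tail so arrival order is preserved.
   Returns 0 on success, -1 if no memory is available. */
int event_queue_push(event_queue *q, long uid, int payload) {
    event *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->next = NULL;
    e->uid = uid;
    e->payload = payload;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) q->tail->next = e; else q->head = e;
    q->tail = e;
    pthread_mutex_unlock(&q->lock);
    /* A real collector would now call s48_note_external_event(uid). */
    return 0;
}

/* Scheme-thread side: detach the whole list in one short critical
   section; the caller then walks it without holding the lock. */
event *event_queue_take_all(event_queue *q) {
    pthread_mutex_lock(&q->lock);
    event *all = q->head;
    q->head = NULL;
    q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return all;
}

/* Release a detached list once its events have been converted. */
void event_list_free(event *e) {
    while (e != NULL) {
        event *next = e->next;
        free(e);
        e = next;
    }
}
```

Detaching the list as a whole keeps the lock held only briefly, so the collector thread is not blocked while the exported function builds Scheme objects from the events.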
If the mere occurrence of an event does not suffice for the program, the Scheme side has to pull the information that is associated with an event from the C side. Then, the Scheme side can handle the event data. For example, a typical event loop on the Scheme side that waits on external events of a permanent event type that a long-running external thread produces may look like this:
(define *external-event-uid*
  (new-external-event-uid (lookup-imported-binding "my-event")))

(spawn-external-thread *external-event-uid*)

(let loop ()
  (let ((condvar (make-condvar)))
    (register-condvar-for-external-event! *external-event-uid* condvar)
    (wait-for-external-event condvar)
    (process-external-events! (get-external-events))
    (loop)))
In the above example, the variable *external-event-uid* is defined as a permanent event identifier. On every pass through the loop, a fresh condition variable is registered with the event identifier, then wait-for-external-event blocks on the condition variable until external code signals the occurrence of a matching event. Note that process-external-events! and get-external-events need to be defined by the user. The user-written function get-external-events returns all the events that the external code has collected since the last time get-external-events was called; the user-written function process-external-events! handles the events on the Scheme side.
When the Scheme side only waits for one single event, there is no need for an event loop and a permanent event identifier. Then, new-external-event is more convenient to use:
(call-with-values
  (lambda () (new-external-event))
  (lambda (uid condvar)
    (spawn-external-thread uid)
    (wait-for-external-event condvar)
    (unregister-external-event-uid! uid)
    ...))
Here, new-external-event returns a fresh event identifier and a fresh condition variable. The event identifier is passed to spawn-external-thread and the condition variable is used to wait for the occurrence of the external event.
External code uses s48_note_external_event to push the fact that an external event occurred into the main event loop; the Scheme code then needs to pull the actual event data from external code (in this example with get-external-events). The user-written function spawn-external-thread runs the external code that informs the Scheme side about the occurrence of external events. The event identifier is passed as an argument. The external-event-related parts of the implementation of spawn-external-thread in external code could look like this:
s48_ref_t spawn_external_thread(s48_call_t call, s48_ref_t sch_event_uid) {
  ...
  s48_note_external_event(s48_extract_long_2(call, sch_event_uid));
  ...
}
The event identifier is extracted from its Scheme representation and used to inform the Scheme side about an occurrence of this specific event type.
All of the C procedures and macros described above check that their arguments have the appropriate types and that indexes are in range. The following procedures and macros are identical to those described above, except that they do not perform type and range checks. They are provided for the purpose of writing more efficient code; their general use is not recommended.
long s48_unsafe_extract_char_2(s48_call_t, s48_ref_t)
s48_ref_t s48_unsafe_enter_char_2(s48_call_t, long)
long s48_unsafe_extract_integer_2(s48_call_t, s48_ref_t)
double s48_unsafe_extract_double_2(s48_call_t, s48_ref_t)
long s48_unsafe_extract_fixnum_2(s48_call_t, s48_ref_t)
s48_ref_t s48_unsafe_enter_fixnum_2(s48_call_t, long)
s48_ref_t s48_unsafe_car_2(s48_call_t, s48_ref_t)
s48_ref_t s48_unsafe_cdr_2(s48_call_t, s48_ref_t)
void s48_unsafe_set_car_2(s48_call_t, s48_ref_t, s48_ref_t)
void s48_unsafe_set_cdr_2(s48_call_t, s48_ref_t, s48_ref_t)
long s48_unsafe_vector_length_2(s48_call_t, s48_ref_t)
s48_ref_t s48_unsafe_vector_ref_2(s48_call_t, s48_ref_t, long)
void s48_unsafe_vector_set_2(s48_call_t, s48_ref_t, long, s48_ref_t)
long s48_unsafe_string_length_2(s48_call_t, s48_ref_t)
char s48_unsafe_string_ref_2(s48_call_t, s48_ref_t, long)
void s48_unsafe_string_set_2(s48_call_t, s48_ref_t, long, char)
s48_ref_t s48_unsafe_symbol_to_string_2(s48_call_t, s48_ref_t)
char * s48_unsafe_extract_byte_vector_2(s48_call_t, s48_ref_t)
long s48_unsafe_byte_vector_length_2(s48_call_t, s48_ref_t)
char s48_unsafe_byte_vector_ref_2(s48_call_t, s48_ref_t, long)
void s48_unsafe_byte_vector_set_2(s48_call_t, s48_ref_t, long, int)
In addition to not performing type checks, s48_unsafe_extract_byte_vector_2 returns a pointer that will likely become invalid when a garbage collection occurs. See section 8.7.3 on how the interface deals with byte vectors in a proper way.
s48_ref_t s48_unsafe_shared_binding_ref_2(s48_call_t, s48_ref_t s_b)
int s48_unsafe_shared_binding_p_2(s48_call_t, x)
int s48_unsafe_shared_binding_is_import_p_2(s48_call_t, s48_ref_t s_b)
s48_ref_t s48_unsafe_shared_binding_name_2(s48_call_t, s48_ref_t s_b)
void s48_unsafe_shared_binding_set_2(s48_call_t, s48_ref_t s_b, s48_ref_t value)
s48_ref_t s48_unsafe_record_type_2(s48_call_t, s48_ref_t)
s48_ref_t s48_unsafe_record_ref_2(s48_call_t, s48_ref_t, long)
void s48_unsafe_record_set_2(s48_call_t, s48_ref_t, long, s48_ref_t)
type s48_unsafe_extract_value_2(s48_call_t, s48_ref_t, type)
type * s48_unsafe_extract_value_pointer_2(s48_call_t, s48_ref_t, type)
void s48_unsafe_set_value_2(s48_call_t, s48_ref_t, type, value)
It is straightforward to convert external code from the old foreign-function interface to the new foreign-function interface:
Converting functions:
Add s48_call_t call as the first argument to every function prototype that returns or accepts an s48_value.
Replace every s48_value type in the function prototype and the body with s48_ref_t.
Add call as the first argument to every function call that returns or accepts a Scheme object.
Remove all the GCPROTECT-related code (i.e. GCPROTECT and UNPROTECT).
Converting global (static) variables:
Replace the s48_value type of the global variable with s48_ref_t, and initialize these variables with NULL.
Set a real Scheme object in the initialization function of your code with one of these alternatives:
Use s48_make_global_ref to convert an s48_value to a global reference. For details and an example see section 8.7.1.
Use s48_local_to_global_ref to convert a local reference object to a global one.
If your global variable is supposed to hold a shared binding (e.g. a record type binding), you can use s48_get_imported_binding_2, which returns a global reference.
Replace S48_GC_PROTECT_GLOBAL with s48_local_to_global_ref to convert a local reference object to a global one.
Use s48_free_global_ref to clean up global references when appropriate.
If you add
#define NO_OLD_FFI 1
just above
#include <scheme48.h>
in your source code file, it will hide all the macros and prototype definitions of the old foreign-function interface. That way you can make sure that you are only using the new interface, and the C compiler will remind you if you don’t.