Guile-based Emacs

I (Ken Raeburn) have started this project to convert GNU Emacs to use Guile as its programming language. Support for Emacs Lisp will continue to exist, of course, but it may be through translation and/or interpretation; the Lisp engine itself may no longer be the core of the program. If you want to help, please send me email at raeburn@raeburn.org (not at my MIT address, please).

This should not be confused with Keisuke Nishida's work on interfacing Guile and Emacs, or the work going on in the Guile project for interpreting Emacs Lisp.

Status

The short status summary, for people not interested in investing time, tracking down bugs, helping with development, hacking on both Guile and Emacs: Forget it, scram, this stuff doesn't work, and it's a long way from working. It could take years.

The longer version, for those who want to help:

In 2009 I had some time to sink into it, and got things updated since my last effort a few years earlier.

Some stuff does work, actually. In a terminal or X11 window, it runs shell mode, browses directories, does font-lock coloring of C source files, etc. See README.GUILE for some more details.

I haven't tried lots of complex stuff that's likely to stress-test it, and probably won't for a little while. Getting make bootstrap working is also kind of an important goal.

But there's a lot of work remaining. I've actually only done a small fraction of it so far.

The Plan, and progress so far

The last big update to this section was in 1999. The major change since then has been that I've switched to tracking the Emacs sources in CVS, reimplemented my changes in that code base, and brought it up to more or less the same point.

So far, I've replaced the basic object representation with that of Guile, switched the allocator and garbage collector to be Guile's, and I'm using Guile objects for numbers and cons cells. All the objects that aren't using Guile's representations are using new "smob" types; the vector and misc types are still subtyped as in normal Emacs. The smobs are allocated via Guile code, though the data structures hanging off of them use the old Emacs allocators.

I'll be trying to switch to Guile strings next, but since Guile strings don't have multi-language support or text properties, that'll take some work. (Text properties will probably be done initially using a single object property, with the interval structures hidden within C code and another smob.) After strings come symbols; symbol bindings are represented differently, so that could also require some real work. I think symbols may need to be done before exception handling, but I haven't really looked.

Separately from those, vectors need to be converted, and the various "vectorlike" and "misc" objects will probably be broken out into completely independent types; those shouldn't be tough.

Switching over to the Guile evaluator will be very interesting. Someone has written partial code to have the Guile evaluator handle some Emacs Lisp syntactic constructs like `let', and translate to Scheme equivalents; I think it still needs finishing. The Emacs byte-code interpreter will of course have to continue to work with existing byte code libraries; whether it'll be by performing the equivalent Scheme operations on the fly or by translating a block of byte-code into the Scheme equivalent (and possibly optimizing), I don't know.

For now, both evaluators exist, and there's no way to talk between them. Actually, I added a Lisp function where you can pass a string to the Guile evaluator and get a string back, converting between the Guile and Lisp string types in the process, but it hasn't got good error trapping, and the Emacs error handlers aren't known to the Scheme evaluator, so any error will kill the program. It's not really the right approach anyways; I should be able to pass lists with symbols back and forth, with good error trapping. Well, more like calling eval-as-lisp from scheme and eval-as-scheme from lisp, with unified exception handling. Someday. For now, this simple version could be interesting for hack value, and maybe a "Guile interaction mode" if someone feels like writing the rest of the support. Other than that, though, the Guile evaluator is never used while Emacs is running.

Another potentially big headache: the input handling. The existing Emacs code blocks and unblocks input, and tests whether input is blocked and whether input is pending, and sets or clears the input-pending flag. Input events, if they arrive outside of a critical code section, can cause Lisp code to be executed from the SIGIO signal handler. Obviously, Guile critical sections need to disable this. Right now, I'm thinking that a Guile signal handler (installed with scm_sigaction) might do the trick, but I may have to resort to using Guile's async objects, which may not exist after the POSIX thread support is done....

And that brings up another point: What if multiple threads are in use? How should a ^G interrupt, causing a trap into the debugger, affect a multi-threaded Lisp process?

Perhaps the input handling should be in a separate thread, and trigger an async to signal the quit in the other thread? An other thread, out of (possibly) many? (If so, one chosen at random, or one specially designated?) All other threads?

The low-level thread handling is clearly part of the Scheme package, though we could put some Emacs-specific stuff on top. The input handling is Emacs-specific, so the interrupt handling for threads should not interfere with the low-level thread code if it can be avoided.

For now, I think it's easiest to assume one thread only, and the current async support. (sigh)
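The single-threaded block/unblock scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (block_input, unblock_input, input_arrived, and so on) is hypothetical and not taken from the Emacs or Guile sources, and a plain function call stands in for the SIGIO handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real Emacs macros and flags differ in detail. */
static int input_blocked_depth;   /* > 0 while inside a critical section */
static bool input_pending;        /* set when an event arrived while blocked */
static int events_handled;        /* stand-in for "run Lisp code for the event" */

void handle_input_event(void) { events_handled++; }

/* Called where a signal handler would run: process now, or defer. */
void input_arrived(void)
{
    if (input_blocked_depth > 0)
        input_pending = true;     /* several events collapse into one flag here */
    else
        handle_input_event();
}

void block_input(void) { input_blocked_depth++; }

void unblock_input(void)
{
    assert(input_blocked_depth > 0);
    /* Only the outermost unblock drains the deferred input. */
    if (--input_blocked_depth == 0 && input_pending) {
        input_pending = false;
        handle_input_event();
    }
}

/* Simulates input arriving inside a critical section.  Returns 1 if the
   event was held back until the section ended, 0 otherwise. */
int demo_deferred_input(void)
{
    int before = events_handled;
    block_input();
    input_arrived();
    int during = events_handled;
    unblock_input();
    return during == before && events_handled == before + 1;
}
```

A Guile critical section would need to bump the same kind of counter, which is the part that gets murky once more than one thread is involved.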

So, how will any of this get into a released version of Emacs? Probably piecemeal; some bugfixes and random cleanups can probably go into the main GNU Emacs release, but the major changes (object representation etc) probably have to wait until they're a lot more stable.

How will any of this affect XEmacs? Beats me. It's Someone Else's Problem.

In the (waaaay) long term, I'm also thinking about trying to take certain pieces of Emacs, like buffers or window/frame objects, and make them separable components useable from Guile code. Lots of things, like buffer-local variables and change-hook functions, would probably become Emacs-specific. (Other Guile applications could use object properties, and may not even have the notion of a single "current" buffer.) Or they'd be implemented in Scheme, on top of the most primitive version of the object implemented in C. Or, maybe not; Emacs performance still has to be good. And the various objects may be too closely tied to each other and to Emacs for this to be practical. But first things first.

Sources

Still with me? :-)

I'm using a Subversion repository at the moment, for a "mirror" of the Emacs CVS repository, the Guile sources (a bit outdated), and my merged tree jamming the two together. Sorry, I haven't set up access for other people yet.

I've put a snapshot at guile-emacs.tar.bz2.

Mailing lists

There are two mailing lists hosted at sources.redhat.com: guile-emacs and guile-emacs-cvs. They've been very low traffic in the last few years. I think I'm still on them, though.

Hacking tips and notes

Lisp nil and Scheme ()/#f: See Jim Blandy's notes on the subject.

Translating Lisp to Scheme:

Mikael Djurfeldt has written up some code to use syntax transformations and a few hooks in the Guile C code to do some translating. It looks like a good start for handling things like reading the function slot of a symbol in a funcall, though it doesn't use Jim's proposed scheme for nil/()/#f. To get Mikael's code, check out the "mdj_elisp_branch" branch of the Guile CVS repository. Executing
```
(define-module (foo) :use-syntax (lang elisp))
```
is one way to invoke his translation code. Note that the full set of Lisp primitives (e.g., defvar) is not written yet, so you can't run production elisp code. (This is outdated ... check the mailing lists for more recent work.)

There's been some discussion in comp.lang.lisp and comp.lang.scheme about this too. I should dig up the interesting bits and mention them here.

Emacs byte-code? Interpret or translate?

When changing how objects refer to one another internally and how they're structured, it's easy to miss something in the garbage collection support. If you do, it's hard to track down, without running the garbage collector every few lines and checking just when certain objects get reclaimed, which is really painful. Be especially careful in hacking on any of this code.

Another thing to note is that some bits of Emacs code expect GC to run only at certain times, mostly during evaluations and funcalls and such. With Guile, on the other hand, garbage collection can happen any time storage is being allocated, and perhaps someday, concurrently with *any* executing code that doesn't take steps to prevent it. So there are times when an object being created may be examined by the garbage collector even though it's not "complete" yet; thus precautions must be taken to ensure that such objects are in consistent states when other associated objects are being constructed. Filling in fields with nil or zero (the SCM representation of zero, that is), is one way, probably best. Disabling garbage collection at such times is easier, but IMHO less clean.
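To make the "fill in fields first" advice concrete, here is a toy sketch. It assumes a made-up heap of pairs addressed by index, a collector that scans on every allocation (the worst case), and a poison value standing in for uninitialized memory; none of these names come from Emacs or Guile.

```c
#include <assert.h>

typedef int ref;                 /* index into heap[], or NIL_REF */
#define NIL_REF    (-1)          /* stand-in for Lisp nil */
#define POISON_REF 0x7fff        /* simulates uninitialized memory */
#define HEAP_SIZE  16

typedef struct { ref car, cdr; } pair;

static pair heap[HEAP_SIZE];
static int heap_used;
static int gc_inconsistencies;   /* half-built objects the collector saw */

int is_valid_ref(ref r) { return r == NIL_REF || (r >= 0 && r < heap_used); }

/* The "collector": examines every field of every allocated object. */
void collector_scan(void)
{
    for (int i = 0; i < heap_used; i++)
        if (!is_valid_ref(heap[i].car) || !is_valid_ref(heap[i].cdr))
            gc_inconsistencies++;
}

ref alloc_pair(void)
{
    collector_scan();            /* GC may run on any allocation */
    assert(heap_used < HEAP_SIZE);
    ref r = heap_used++;
    heap[r].car = heap[r].cdr = POISON_REF;
    return r;
}

/* Unsafe: the second allocation happens while outer is still garbage. */
ref make_nested_unsafe(void)
{
    ref outer = alloc_pair();
    ref inner = alloc_pair();    /* the scan sees outer half-built */
    heap[inner].car = heap[inner].cdr = NIL_REF;
    heap[outer].car = inner;
    heap[outer].cdr = NIL_REF;
    return outer;
}

/* Safe: every field holds nil before anything else is allocated. */
ref make_nested_safe(void)
{
    ref outer = alloc_pair();
    heap[outer].car = heap[outer].cdr = NIL_REF;
    ref inner = alloc_pair();
    heap[inner].car = heap[inner].cdr = NIL_REF;
    heap[outer].car = inner;
    return outer;
}
```

In this toy the collector only counts the inconsistency; a real collector would follow the garbage field and most likely crash much later, somewhere unrelated.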

For fairly generic changes, not specific to Guile, such as using macros instead of explicit structure element references, or random bugfixes, try to make the changes in the "non-guile" branch first and merge to the trunk. If they're fixes for real bugs that would show up in the old Lisp implementation, make sure they get back to RMS and the official Emacs source tree. If they show up there in a different form, bring that form back into the non-guile branch and then the trunk.

ENABLE_CHECKING is a win. It causes every XWINDOW call to barf on any object that's not a window, etc. It slows things down, but catches real problems. While it may work with the Lisp representation to extract the "pointer part" and make it a window pointer or other object pointer (seems kinda dicey to me but I don't know a platform where it'll fail), extracting the window structure pointer in Guile means dereferencing the pointer that is the SCM object. Since that could crash if the SCM object is some IMP type of object and therefore probably not a valid pointer, all such code needs to check the object type before extracting the pointers. This may result in a slowdown in non-Guile Emacs, but frequently the code can be rewritten to still be efficient. For example, some code compares two Lisp objects by applying XWINDOW to both and comparing the pointers for equality; at least with the plain numeric representations of Lisp objects, the comparison can be done on the objects directly.

Try to keep ENABLE_CHECKING working for the non-guile branch. I've added a patch so configuring with --enable-checking will turn it on.
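A rough sketch of the checked extraction, under assumptions: the object word here is either an immediate (low bit set, an encoding picked just for this example) or a pointer to a cell that starts with a type tag. The names windowp, xwindow and same_object are illustrative, not the real Emacs or Guile macros.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t lisp_object;   /* hypothetical stand-in for SCM / Lisp_Object */

enum cell_type { CELL_WINDOW, CELL_BUFFER };
struct cell_header { enum cell_type type; };
struct window { struct cell_header header; int width; };
struct buffer { struct cell_header header; int size; };

/* Assumed encoding: immediates have the low bit set, heap pointers do not. */
int is_immediate(lisp_object o) { return (o & 1) != 0; }
lisp_object make_small_int(intptr_t n) { return ((uintptr_t)n << 1) | 1; }
lisp_object wrap(void *p) { return (lisp_object)p; }

/* Type test: never dereferences an immediate. */
int windowp(lisp_object o)
{
    return !is_immediate(o)
           && ((struct cell_header *)o)->type == CELL_WINDOW;
}

/* Checked extraction, in the spirit of ENABLE_CHECKING: abort on anything
   that is not a window instead of handing back a bogus pointer. */
struct window *xwindow(lisp_object o)
{
    assert(windowp(o));
    return (struct window *)o;
}

/* Identity comparison needs no extraction (and so no check) at all. */
int same_object(lisp_object a, lisp_object b) { return a == b; }
```

The last function is the cheap rewrite mentioned above: comparing the object words directly avoids two checked extractions.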

XFASTINT causes problems. GNU Emacs code often takes advantage of the fact that the basic representation is an integer type and that the Lisp and integer representations of a given number are the same. The first is also true in Guile but the second is not. XFASTINT is used to extract small numbers from Lisp objects efficiently. XFASTINT is also used as a means of reading or assigning to an object efficiently, without handling type and value separately. IMHO these two uses should be separated in GNU Emacs, and XFASTINT used for only one, or removed altogether.
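The difference can be sketched like this. The tagged encoding below (value shifted left two bits, low bits set to binary 10) is an assumption modeled loosely on Guile-style fixnums, and all the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t lisp_word;      /* both representations are integer types */

/* Old-Emacs-like: a small non-negative integer is its own representation. */
lisp_word identity_make_int(intptr_t n) { return n; }
intptr_t identity_xfastint(lisp_word o) { return o; }

/* Guile-like tagged fixnum (assumed encoding for this sketch). */
lisp_word tagged_make_int(intptr_t n)
{
    return (lisp_word)(((uintptr_t)n << 2) | 2u);
}

/* Use 1: extract a small non-negative number from an object. */
intptr_t tagged_xfastint(lisp_word o)
{
    assert(o >= 0 && (o & 3) == 2);
    return o >> 2;
}

/* Use 2: store a small non-negative number into an object slot. */
void tagged_set_fastint(lisp_word *slot, intptr_t n)
{
    assert(n >= 0);
    *slot = tagged_make_int(n);
}

/* The kind of code that leans on "representation == value". */
lisp_word buggy_increment(lisp_word o) { return o + 1; }
```

With the identity representation buggy_increment happens to work; with the tagged one it silently corrupts the tag bits, which is why the two uses need separate, explicit macros.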

NO_UNION_TYPE is annoying, and helpful. With a union representation of Lisp objects, the compiler complains if integral-typed and Lisp objects are mixed without proper conversions. Of course, it hurts performance too because of the way calling conventions are usually written, so no currently supported platform actually uses a union type. (I assume it's safe to say that, since Emacs 20.3 wouldn't build if you switched to the union representation.) Making it work before switching to the Guile object representation cleaned up some problems I probably wouldn't have spotted otherwise. While technically bugs, they probably didn't really affect the behavior of (non-Guile) Emacs at all.
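A small sketch of why the union form catches mistakes. The type and function names are invented for the example, and the one-member union is a simplification; the real union has bit-fields for type and value.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t raw_lisp_object;                      /* NO_UNION_TYPE style */
typedef union { intptr_t bits; } checked_lisp_object;  /* union style */

checked_lisp_object make_number(intptr_t n)
{
    checked_lisp_object o;
    o.bits = n;
    return o;
}

intptr_t xint(checked_lisp_object o) { return o.bits; }

/* Compiles happily even though it mixes an object with a plain integer. */
intptr_t add_raw(raw_lisp_object a, intptr_t b) { return a + b; }

/* With the union, both arguments must really be Lisp objects. */
checked_lisp_object add_checked(checked_lisp_object a, checked_lisp_object b)
{
    return make_number(xint(a) + xint(b));
}

/* add_checked(make_number(1), 2) would be rejected by the compiler:
   an int cannot be converted to a union without an explicit make_number. */

int demo_union_sum(void)
{
    checked_lisp_object sum = add_checked(make_number(2), make_number(3));
    assert(xint(sum) == 5);
    return (int)xint(sum);
}
```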

If XSETWINDOW or the like is used (assigning a Lisp_Object given only a pointer to an internal data structure already created and given a Lisp_Object handle) I've been adding the SCM object to the internal data structure, initializing it when the object is created, and simply pulling it out in the XSETFOO macro. This hack is also useful when one internal data structure points to a second internal data structure, and the second is also referred to by Lisp objects directly; if the former reference is the only reference left to the second structure, the GC pass still has to mark the Lisp object version of the second object, so that it won't be reclaimed. Eventually, perhaps Lisp objects can be used instead of the structure pointers, and this extra field can be eliminated.
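The back-pointer hack looks roughly like this. It is a sketch with hypothetical types: handle stands in for the SCM object, and the mark flag and mark_window function stand in for whatever the real GC mark pass does.

```c
#include <assert.h>
#include <stddef.h>

struct window;

/* Stand-in for the SCM object that Lisp code holds. */
typedef struct handle { struct window *data; int marked; } handle;

struct window {
    handle *self;                /* the extra field: our own Lisp object */
    struct window *next;         /* internal structure-to-structure pointer */
    int width;
};

/* Given only the structure pointer, just pull the stored object back out. */
#define XSETWINDOW(var, w) ((var) = (w)->self)

/* The object is recorded in the structure when the window is created. */
handle *make_window(handle *cell, struct window *w, int width)
{
    assert(cell != NULL && w != NULL);
    w->self = cell;
    w->next = NULL;
    w->width = width;
    cell->data = w;
    cell->marked = 0;
    return cell;
}

/* Mark pass: follows the internal pointers, but marks the Lisp object
   version of each structure so that it won't be reclaimed. */
void mark_window(handle *h)
{
    while (h != NULL && !h->marked) {
        h->marked = 1;
        struct window *next = h->data->next;
        h = next ? next->self : NULL;
    }
}
```

Here the second window is reachable only through the first one's internal next pointer, yet its handle still gets marked.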

Some info on internationalization/multilingualization is on the Guile ideas page. I'm not familiar with the issues involved or what work may be going on.

Try not to add any global state; remove some, if you can. Guile supports multiple threads, though there are problems with the implementation. I'd love to see Emacs Lisp support become multi-threaded in some clean way, though I don't even want to think about the UI issues right now.

Miscellany

I am not trying to enhance the existing Lisp system to handle new types the way Guile does. While that might be of interest to some people (like Perlmacs dude John Tobey), I think replacing the object representation and allocation schemes is the right way to start my project. Otherwise, no objects can be shared between Lisp and Scheme, so how else can I start it? If it does happen, though, as long as the structure is similar to the way Guile does it, making the transition to Guile shouldn't require much in the way of changes.

I'm tending to leave some of the performance issues (storage space requirements, GC speed/frequency, experimentation with other GC algorithms, startup speed, "unexec" or fast-load capabilities) as problems to be tackled by the Guile team. I'm interested in them as well, and may work on some of them, but I see them as no longer being issues specific to Emacs, and thus at least somewhat outside the scope of this project for now. I want generic Guile solutions first, then Emacs-specific ones only if they're still needed. (Okay, there will certainly be lots of places in the Emacs code where things can be tuned up after I do this conversion. My point is, don't hack Emacs code because of performance issues with Guile when Guile itself can be fixed, even if it takes a little more work.)

How can we have Lisp "let" and function-argument bindings apply only to a single thread (of many that might be running, possibly preemptively and without thread-switch hooks), without completely ruining performance?

There are parts of Emacs I'm not very familiar with. There's also a lot about Guile and Scheme in general I'm not all that familiar with. If you think I'm taking a poor approach technically to some part of this project, or want to tackle some interesting aspect of the project that you know a lot about, do let me know.

Ken Raeburn / raeburn@raeburn.org

Last updated 2009-06-11 [in progress].