The Amber Operating System

by

Charles Frankston

Submitted in partial fulfillment of the requirements for the degree of

Bachelor of Science

at the

Massachusetts Institute of Technology

May 1984


Signature of Author

Department of Electrical Engineering and Computer Science
May 18, 1984

Certified by

Thesis Supervisor (F. J. Corbató)

Accepted by

Chairman, Departmental Committee

Amber is an operating system similar to Multics, but based on a unique capability oriented model. Amber is intended to advance the state of the art in multifunction operating systems. The design of the kernel of the Amber operating system is described. The goals of the design and the decisions made in arriving at the final form are explained in detail.

The Amber Design team, the original six: Ted Anderson, Jeff Broughton, Hon Wah Chin, Lee Parks and Daniel Weinreb. We set out to accomplish a dream.

The later additions, Earl Killian and Jay Pattin, who joined in spite of the obvious problems.

Dedicated to the idealism of youth.

Thesis Cover Page

Table of Contents

Chapter One: Introduction

Chapter Two: Background

Goals of Amber

Chapter Three: Structure

Chapter Four: Features

Why Multics?

File System

Protection Model

Access Control Lists

Capabilities

Memory Management

Segmentation

Paging

File Mapping

Dynamic Linking

Chapter Five: Kernel Model

Domains and Segments

Kernel Capabilities

Access Control

Access Control Lists

Tasks

Message Channels

Dummy Objects

Extended Type Object

Retrospective

To Revoke, or not to Revoke?

Null Access

Why not message passing?

Chapter Six: Implementation

Concepts

Languages and Hosts

Other Hardware Bases

Status

Introduction

Very few large multifunction operating systems have been created from scratch in recent years. The expense, the belief that such systems are too complex, the emergence of "portable" operating systems, and the trend toward single user computers are all factors against building such systems. However, there are several reasons why it is premature to declare that the era of large multifunction operating systems has passed.

The Amber operating system team started with the belief that enough had that a system could be built that would be significantly better than existing systems in many areas. This document chronicles that design effort. The background chapter outlines the circumstances of the system's creation, and the goals we believed the system should meet. The structure chapter describes the kernel concept and explains why it was adopted for Amber. The features chapter describes in detail what features we felt were important to have in the system and why. The chapter on the kernel model describes the abstract model implemented by the kernel. Lastly, the implementation chapter is a quick overview of the implementation techniques used in Amber.

Background

Amber is a new operating system designed for use on the S-1 computer system [S-1A] [S-1B], being built at Lawrence Livermore National Laboratory in Livermore, California. An S-1 processor is a high performance (roughly half the throughput of a Cray-1 for floating point operations) central processor featuring a large virtual address space that is both paged and segmented in a manner similar to Multics [MulticsVM]. The full S-1 design allows for up to 16 processors to be connected to a common main memory via a crossbar switch to form a multiprocessor configuration.

Goals of Amber

The design effort for Amber had a number of goals, as well as requirements dictated by the nature of the S-1 computer system and the project that was creating it The goal of multiprocessor support has already been mentioned. Additional goals for the system were that it support various modes of usage:

The scale and capabilities of the S-1 processor suggested that these various modes of usage occur concurrently as far as possible. It was clear from the outset that a system as large as the S-1 would often be shared by more than one community of users. This dictated an additional set of goals for the system:

Isolation
That the operating system keep the users from interfering with each other.
Access Control
That access to information among users be carefully controlled.
Security
That the mechanisms for Isolation & Access Control be secure.

The isolation goal means that no behavior of a user's program, intentional or otherwise, should cause another user's program to behave in an unexpected fashion. This includes undue degradation of system performance. The security goal should go without saying, except that all too many systems that attempt to provide Isolation and Access Control provide them so long as a user does not take advantage of bugs or holes in the system that subvert the protection mechanisms. Sometimes these holes are not just small implementation errors, but basic flaws in the system architecture. There is also another set of goals decided on by the design team itself:

Sharing
That sharing of programs and data among users should be easy and natural
Usability
that the system set new standards for usability (human interface) and maintainability.

Note that achieving sharing while at the same time providing access control makes both more difficult. Sharing could easily be provided by allowing all users to read or write all data in the system and access control could be provided by forbidding any user from accessing data other than his own. Allowing each user to indicate in some convenient way precisely who is allowed access to what data in the system is a much more substantial problem. We call the combination of these two goals controlled sharing. The strong commitment to support of controlled sharing in Amber is somewhat in contrast with recent trends away from elaborate protection structures in operating systems in favor of dedicated use of computer systems by a single group or even a single individual. The design team believed this simpler view to be rather shortsighted. The issue of controlled sharing of information is an important one [Rotenberg1973]. Dedicating computers to specific communities only moves the control issue from the single operating system architecture into the architecture of the network structure that sooner or later will connect those computers together.

That the system be usable seems almost too obvious to state, yet is far from being attained in most existing systems. Usability breaks down into two areas: human engineering and maintainability. Human engineering is simply that the system be easy or natural for people to use. Maintainability is really human engineering in another guise. A system may be easy to use, but end up being quite a lot of work for those who are charged with keeping the system functioning properly from day to day. A good example of this is the original Unix [Unix] file system, which is much praised for its tree structured appearance to users, but the maintenance of which generally requires someone at each site capable of patching the file system because its internal structure becomes inconsistent too easily.

Finally, as should be clear from our list of goals, we intended to create a practical system. We did not set out to do operating system research, although we may have done some. Our foremost goal was to create a system that would stand up well to day to day usage and help people who are not computer scientists get work done.

Structure

The term operating system is often used rather loosely. Sometimes it refers to the whole collection of programs that is provided with a computer system. I.e. whatever the manufacturer of the system might provide. A stricter usage is that portion of the system that directly controls the running of the system. E.g. the task scheduler, the low level input output routines and the like. On most computers performing functions such as issuing input output or task switching can only be done when the hardware is in a special mode. This is referred to as supervisor state on many machines. We will just call it a privileged mode. We will refer to the portion of the system that runs in privileged mode as the kernel. Some systems have kernels that do not run entirely in privileged mode, but the Amber kernel does.

We have a list of goals that Amber was expected to satisfy, but there are many conceivable ways of organizing a system to satisfy these goals. Historically, there have been two reasons why operating systems were created for computers. One is merely to provide a library of frequently used subroutines that would always be available, usually input and output routines. The second, more compelling reason is to provide a mechanism for allowing multiple users to use a single computer system without interfering with each other.

In some ways these goals are contradictory. If the operating system is a mechanism for providing a library of subroutine functions then, presuming other constraints such as memory limitations don't preclude it, it would seem advantageous to include as many functions as possible in the kernel of the operating system.

However, if the primary purpose of the operating system is to multiplex one computer system among many users then it would be better to keep the kernel as small and simple as possible. This desire stems from problems in ensuring that one's protection model has been properly implemented. In a large complicated system that has all sorts of mechanisms not related to protection built into it, it has often proven very difficult to ascertain that the protection functions work correctly. The protection model is a term describing the set of mechanisms that keeps the users of the computer system from interfering with each other, e.g. from reading or writing each other's data when they are not supposed to, yet allows them to read and share information when they are supposed to.

These two goals can be difficult to reconcile, and the best way to do so is by realizing that one of them is misplaced. It should not be necessary to put a function inside the operating system kernel in order to make the function generally available. That this sometimes is necessary is generally due to inadequate subroutine linkage facilities present in many systems. In a system with poor subroutine linkage facilities, making everything a kernel call is very convenient. A kernel call can usually be done by a hardware trap instruction and is thus easy to get at. Code inside the kernel is shared by all users, which subroutines may not be, etc. However, the section on dynamic linking will show that subroutines need not be inside to kernel to be easy to use, and goes into more depth on the problems of putting too many functions in the kernel.

A good criterion for what to put in the kernel is whatever cannot be done without increasing privilege. An example of this would be accessing some shared peripheral such a disk drive or a terminal multiplexor. A user program cannot be allowed access to these devices because that would enable the program to read other user's data or output to another user's terminal. Thus some sort of privileged system call must be provided that makes sure the user reads only his own data and outputs only to his own terminal. On the other hand a set of subroutines that implements a uniform protocol for I/O operations need not be in the kernel, since there should be no change of privilege involved. If a system follows our criteria for choosing what functions to put in the kernel then it is called kernel based.

Features

Kernel based systems typically have the property that there is much more code outside the kernel than inside it. The structure of the kernel though will have a strong influence on how the system as a whole will appear to users. Indeed if we merely sat down and designed an operating system kernel, not only would our job be incomplete (because one would also need many functions outside the kernel before the system is usable), but we would probably have not done a very good job. What is needed is some model, some conception of how and for what purposes the system as a whole is to be used. One can then do a top down design to implement this model. This would include both kernel and non-kernel functions, the decision of which functions go in the kernel being made according to our aforementioned criteria.

Unfortunately, in the case of an operating system, this standard system design scenario is not so simple. An operating system provides a set of interfaces for use by as yet unwritten and even unanticipated applications. Therefore the choice of what functions one tries to put in the system as a whole is really largely a matter of experience and judgment. The rest of this section will be devoted to reconstructing some of the reasoning process and motivation behind the way Amber was structured.

Why Multics?

In many ways, the Multics system [Corbató] can be considered the prototype for Amber. Of all prior systems studied, Multics came closest to meeting the goals set for Amber.

Multics offers a combination of features not found in any other system. In many places where other systems have similar features, the Multics implementation tends to be more complete and better thought out. This is not universally true, and there are certainly many good ideas not in Multics. This chapter attempts to identify what these features are in order to set more specific goals for Amber.

File System

The most important function an operating system offers in support of interactive use such as program development is its file system. The file systems of existing systems differ widely in their functionality, richness of naming structure and other criteria. Multics offers:

Two features not found in Multics, but available in the Tenex (later TOPS-20) file system are:

Finally, a feature that has been discussed for Multics at times, but exists in only one other system we are aware of (The LMFS file system for the Symbolics 3600, which was designed after Amber.):

File name completion is actually more of a function of the user interface to than the organization of the file system. However, it can be very expensive to use command completion if the directory is not organized with command completion in mind. Multics stores file names in a hash table, which makes command completion expensive. Amber uses a balanced binary tree.

Until recently symbolic file system links seemed to be an MIT secret, being found only in Multics and two homegrown MIT systems: ITS for the PDP-10 [ITS1.5] and MagicSix for Interdata/Perkin-Elmer 32 bit machines [Parks1979]. However, symbolic links have been added to release 4.2 of Berkeley Unix. This feature frees the user from the constraints of the strict hierarchy that a tree structure imposes. This flexibility is essential for good name space management.

Earlier versions of Unix offer only a form of non-symbolic file system links sometimes called hard links. The problem with a hard link is that it points to a specific instance of a file (actually to an internal file number) rather than a file name. Therefore a hard link cannot be put in place prior to the creation of the file. If the destination file is removed and later restored the hard link will be broken. An example of the problems with this approach is the simple case of writing a file out from a text editor. If a system crash or other interruption occurs while writing a buffer out from an editor, then one can be left with only a portion of the file on disk. To guard against this many systems write out the file under a temporary name and then rename the file to its permanent name, deleting the old file. Preferably, the rename and delete steps are atomic. If this were done on Unix then any links to the original file would be lost since the edited result has a different file number.

User defined property lists gives users and their programs a place to keep some notes about a file. One expected use of this facility is to keep the name of a program used to access a file. One could then store text files with many different internal organizations but provide a uniform interface by putting the name of the appropriate access routine in the property list. This should end the debate that has been going on for so many years about whether text files should be stored as streams of characters or lists of line oriented records.

Protection Model

Access Control Lists

When a computer is used by more than one person, especially by more than one group of people (a group here is some community of users such as a department in a corporation, or different corporations) means should be provided to keep information private. We have pointed out already however an extreme case of this, in which users can only access their own data, or more likely their own data plus public files. But this is not acceptable. What is desired is a specific way of indicating who may and who may not read or write specific files or groups of files. This protection model is often called discretionary access control.

Of course, maintaining an exhaustive list on each file of each user that may have access gets quite cumbersome. A mechanism of grouping files and their access entries is needed. The directory structure already provides a means of grouping files. It is easy to provide access setting commands that apply to all the files in a directory. There are arguments whether it is better to do that or implicitly inherit access from directories, but we won't go into that here.

There should also be a mechanism for grouping users to grant the access to. Multics [Saltzer1974] provides such a mechanism in the form of projects. Identifiers in access control lists consist of both person and projectid components, i.e. JRUser.Fooproject. Either component may be replaced by a wild card character, thus *.Aproject would grant access to anyone on the Aproject. A user can be registered on one more or projects. Different Multics sites have tended to use the projects in different ways. At some sites a project might correspond to the purpose the user is using the system for at the moment, whereas at other sites the users project might reflect which of the groups (user communities) he belongs to.

The problem with the Multics project is that it is intimately intertwined with the resource accounting mechanisms. One can only be logged into one project at a time. Also, users cannot create their own arbitrary groupings of users as they can with files in the file system, because creating a project requires special system privileges. The result this has led to on Multics is very long access control lists specifying many Username.* terms and/or the need to log out of one project and into another in order to exercise one's access rights.

The solution we have adopted in Amber is similar to what Primos [Primos] has. This is essentially to create one more level of symbolic name space. Users can create entities in the file system which hold lists of users for access control purposes. The names of these lists can then appear in access control lists. This provides a good way of grouping access names and has an added feature: since access is given to a group identifier, a user can be added, or removed from the group and the effect of the change will automatically be reflected in the access available to that user. It is not necessary to locate every file that the user may have been granted access to in order to apply the change.

The one advantage of the Multics project scheme is that it can be used to provide fire walls of sorts. That is, sometimes access is deliberately granted to a specific project, or a specific user on a specific project that is different from the project that user commonly logs in on. This is done in the case where the access is to some important privilege that the user does not want to be able to exercise by accident or, worse yet, by subversion. Our solution to this in Amber is based on a generalization of the Tenex "wheel" feature. A "wheel" in Tenex can essentially override all protection mechanisms. However, to avoid accidentally exercising such a powerful privilege while doing ordinary work on the machine the wheel capability must be explicitly enabled. Wheels will usually enable their wheel capabilities, perform the specific task that they needed the special privilege for and then disable it. The state of the rest of their session is undisturbed making this mechanism much more convenient than logging out and logging in again.

In the Amber scheme a user must explicitly enable his group accesses one at a time. It is expected that a user would set up a default list to be enabled when he logs in. If he had any access rights that were considered dangerous he could then explicitly enable and disable them the way the Tenex wheel might. When we get to explaining the details of the Amber capability model we will also see that more elaborate mechanisms can be set up whereby some of a user's most dangerous access might be held by a trusted server that would demand a special password or other additional verification before releasing them. This could be used to guard against a "Trojan Horse" attack. In a Trojan Horse attack, someone with a privileged account is convinced to run a a program provided by a user in the belief the program is harmless, when it in fact is not. On most systems a program run in a privileged user's account can exercise all his privileges, often without his knowing it.

Capabilities

In the access control list model of protection each object has a list that grants selected entities (users) access privileges. A capability system, on the other hand, associates with each entity a list of objects that the entity has access to. An overall discussion of both approaches can be found in [Saltzer&Schroeder], and an insightful examination of the duality of these two concept can be found in [Lampson1971]. There are certain things a capability scheme can do that an access control scheme can't do, or can only do awkwardly. Many academic systems [Hydra1981], [CambridgeCAP], have been built using capability models, but most modern commercial systems, (Multics, Primos, Data General AOS, and a rumored new release of VAX/VMS), seem to have opted for the more obvious access control list scheme.

A capability scheme is useful when access is to be determined procedurally rather than statically. Capability based systems can therefore handle much more complicated rules for access than those that must be built into an access control list system. Capabilities allow one to pass access rights to an object over to server processes but still retain revocation rights. For example, most systems that have to deal with line printer servers generally have to make the server a highly privileged task, or in the case of Multics leave explicit access entries all over the file system for the server. These privileged servers have often been a source of security holes. With a capability system one can pass the server a capability that's very specific to the job at hand and quickly revoke it when the job is done.

A full capability system like Hydra can also implement mutually suspicious subsystems [Schroeder1972], the need for which I will not discuss here.

The Amber model is a combination of the access control and capability models that attempts to provide the advantages of both. This will be explained fully in the section on the kernel model.

Memory Management

Segmentation

Many existing operating systems running on computers that support large address spaces provide remarkably poor facilities for managing the address space. Examples are VAX/VMS and Berkeley Unix 4.2. These systems require a region of disk to be configured to act as a "swap area". Thus, though the use of a paged virtual memory has freed programs from the need to preallocate main memory, installations must still preallocate disk space to a swap area. Most installations cannot afford to preallocate enough disk space to allow more than a couple of users to have a large address space.

Programs written for these systems still tend to be very much aware of their address space consumption and cannot use sparse allocation to advantage. Sparse allocation is an elegant solution to an age old problem. Suppose we have a program that has more than one large table the size of which it is difficult to estimate in advance. An example might be an assembler, which would like to maintain the symbols of the program being assembled in one table, the object code produced thus far in another table, and its own stack which grows and shrinks during statement and expression analysis. Prior to virtual memory, one might make an initial guess at the size of these tables and position them in fixed places in memory. If the guesses were wrong, and the tables outgrew their initial allocations and threatened to collide with each other or the end of memory, simple programs simply did not work, while more complex ones had all sorts of code to try and rearrange the tables or swap them to auxiliary disk files. Handling these contingencies not only made the program complex and hard to maintain, but increased running times drastically.

A segmented virtual memory of the form that Multics introduced allows one to put these tables far enough apart that it can practically be assumed they will never collide. A segmented memory address is composed of two parts -- a segment number and an offset within the segment. Each segment has a particular size. Multics and the S-1 have special hardware to detect address references that fall outside the bounds of a segment. This can be a very useful debugging feature. A Multics segment can be from 0 to about 256 * 1024 words long. A S-1 segment can be as large as the entire address space, although Amber only supports segments half that size.

Paging

Segmentation aids the programmer in managing memory by allowing different objects or tables to each be assigned to its own segment. However, the job the system must do to manage segments is not easy. Segments can grow and shrink rapidly as the object they are holding grows or shrinks (e.g., an execution stack). Some early systems that supported segmentation spent an undue amount of time shuffling segments back and forth in memory. The Block Transfer instruction was included in the DEC PDP-10 architecture for doing this job.

The solution to the job of efficiently managing segments is to divide them into fixed size pages. A segment that could be 256K words long on Multics can be formed of from 0 to 256 pages of 1024 words each. The positions of the pages of a segment are described by an array of pointers (called a page table) and thus need not occupy contiguous physical memory addresses. This eliminates the need for memory shuffling in order to deal with the dynamic behavior of segment size. If there is a demand for more pages than can fit in primary memory, the kernel will automatically move the extra pages to disk and bring them back into main memory on demand. This is called demand paging.

Many computers have the hardware needed to implement demand paging (Digital VAX, IBM 370), and operating systems that implement paged, but not segmented memory management (VAX/VMS, Berkeley VAX Unix, VM/370, MVS for the IBM 370). These operating systems all suffer to some extent from the swap space problem we alluded to above. Architectures that have the necessary hardware for demand paging and a large address space (VAX and IBM 370 with Extended Addressing option) can still use the techniques of segmentation by allocating contiguous ranges of pages to represent a segment. They will not however, have the feature of trapping references that overflow a segments boundaries that the Multics and S-1 hardware provide. This feature is primarily a debugging aid, but it can be an important one.

File Mapping

Multics uses segments in a way that frees it from the preallocation problems of a system swap space. Files in the file system can be assigned to segments in a user's address space, by a process called "mapping the file in" (simply a system call that takes a file name and returns a segment number). The system then takes care of making the file appear in main memory for reference or modification and makes sure such modifications are properly reflected on the disk copy. All the memory in a user's address space exists in some file in the file system. Correspondingly, every file in the file system can be mapped into a segment in some user's address space (presuming the user has access to the file, hasn't already used up all the segment numbers in his address space etc.).

In systems like Berkeley Unix or VAX/VMS the disk can have two or more copies of any program that is being run. One is the file system copy of the program being run (e.g. the file /bin/programname), and another is the copy in the swap area. Multics needs only one copy of the an object program, since it can page it directly from the file system. This is even more of an advantage when using virtual memory to update large writable databases from the storage system.

Dynamic Linking

The subroutine is an extremely useful technique for reducing the amount of code one might have to write to get a particular job done on a computer. Given this fact, it is remarkable what some systems make users go through to call a subroutine. The traditional method of linking subroutines, sometimes called "linkage editing" or "loading", copies the object code of each subroutine into a file with a copy of the main program. This has several disadvantages:

Dynamic linking is implemented by having the compiler output object files that are directly executable. All external references are set up so a trap is taken the first time they are used. The trap handler has some means of finding the symbolic name of the reference being referred to. The trap handler uses this information to find the subroutine or variable being referenced, places the item in the address space if it wasn't already there and then replaces the trap with a more direct reference to the item. On Multics the trap handler (called simply the linker) uses naming conventions in the file system and a user setable list of directories to search in resolving the link. The user has a great deal of control over this procedure via manipulation of these search rules and his file system name space. The system also seems to work far better for novice users: since the standard libraries are already set up in his search rules, most novices are barely even aware that the linking is going on. In contrast, some systems, for inexplicable reasons, even require the user to remember (and type in!) the names of the libraries that must be searched to find the run time routines of the language he is using.

There are a number of techniques besides dynamic linking that other systems have adopted to solve some of the problems associated with linkage editing.

Load and Go
This technique loads all the subroutines into core each time the main program is loaded, but does not place a copy on disk. This saves disk space, but the expense prior to each run of the program can be quite substantial. This technique is typically used in debugging environments.
Put everything in the system
TOPS-20 [TOPS20] is an example of this. By placing lots of useful functions in the kernel) you get several of dynamic linking's advantages. Everyone shares the same code, no disk space is wasted, you don't have to resolve the reference until runtime. The disadvantages are that the average user can't change or replace one of the system routines, and the kernel gets large and complicated, possibly compromising protection. Also, it is hard to develop these routines since the kernel environment usually does not have as good facilities and debugging probably requires the entire computer. Even trivial bugs left in any of these functions can bring the whole system down.
Fork a process
This is how Unix solves the "editor calls the mail program" and vice versa problem. For that example the solution is probably appropriate, but for smaller examples the expense of forking a process for each little subroutine call becomes odious. Furthermore, in Unix a new process is forked and loaded each time the call is made. It doesn't get much cheaper after the first time as it does using dynamic linking.

Kernel Model

Amber is based on an abstract model that defines all the objects in the kernel and the operations one can perform on them. In general, the fewer the objects and the operations, the easier it is to understand the abstract model. The Amber kernel model strives toward this goal.

In the rest of this discussion refer to a user program. A user program is simply any program not running with kernel privilege. We also refer to a user process -- this refers to the entire non-kernel state of a collection of programs in a single address space.

Domains and Segments

The Amber Kernel provides a set of subroutine calls, called Kernel Calls, to manipulate objects that the kernel manages. The kernel objects are those that must be protected by the system in order to insure their integrity. For example, if any user program could directly access the page tables then it could reference the memory of other users; similarly if users could directly read arbitrary disk records, they could reference each other's files. Thus, there is a set of abstract objects that the kernel manages that allow users to access resources such as disk files, main memory, and tasks in a controlled fashion.

All objects are cataloged in the file system, even objects that normally have a transient existence, such as tasks. The objects that catalog other objects are called domains or directories. The two terms are used interchangeably in Amber. Domains store the names of objects, and any capabilities to the objects. Since domains are themselves objects, they can be stored in other domains. Thus, the domains can form a tree. This tree of domains is directly comparable to the file system tree of Multics or Unix.

Any object or capability in a domain can have several co-equal names. Any object that has a name can also have a property list. This is just an association of named attributes to values associated with a file. This is intended to allow user programs to associate small notes with files. There are no fixed limits to the number of names or the number or size of properties that a file may have; however, there may be restrictions on the size of directories in any particular implementation of Amber. (On the S-1, directories are stored in segments inside the kernel, and thus take up address space in the kernel. For this reason kernel segments holding directories are not always kept mapped in, but are demand managed. There is also a system parameter controlling their maximum size.) The kernel also maintains several attributes about objects, such as the date and time created, modified or referenced, the size of an object where relevant (e.g., for segments), and other such information that one might expect a file system to keep.

The most common object in the system is a segment. Segments correspond to the files of most other systems. On the S-1, segments have a maximum size of 2@+[29] bytes. There are kernel calls to map a segment into an address space. This is the only means provided to access the information in a segment. It is expected that outside the kernel a more general file will provide device independent access to files of arbitrary size by building them out of segments.

Kernel Capabilities

All objects are referred to in kernel calls by means of a capability to the object. Capabilities may be obtained from the kernel in a variety of ways. For example, a user program which wishes to access a segment named "foo" would first use the lookup_ kernel call to get a capability to foo out of a domain. It would then use the map_segment_ kernel call to map foo into its address space. It may then freely modify foo -- the kernel will ensure the modified file migrates back to disk (ignoring system crashes, disk failure or other unfortunate events for now).

One of the arguments to the lookup_ kernel call used to get the capability to foo is a capability to the domain that foo resides in. We somehow already had to have that capability. Suppose we started with the full pathname for foo. That is, a list of names of directories tracing down from the root which provides the path that arrives at the domain holding foo, e.g. "root", "udd", "JoeSmith", "foo". Note that Multics might represent this pathname as ">udd>JoeSmith>foo", and Unix as "/udd/JoeSmith/foo". However, unlike both those systems, the Amber kernel does not manipulate pathnames and therefore imposes no syntax on their interpretation. Thus to find "root", "udd", "JoeSmith", "foo", the user program would do a series of lookup calls starting with "root" and looking up "udd". Once it had the capability to "udd" it could then call lookup passing in its capability to "udd" and asking to get a capability for "JoeSmith" and so on down to "foo". Where did the capability for "root" come from? The process has to be born with it or it can't lookup anything in the file system. It is quite feasible to create processes with an initial capability for a domain other than the root, if one wishes to restrict them to some subtree of the file system. (Note that if one wishes to insure this, the subtree must contain no capabilities that point "up" in the tree.)

An object may have several capabilities that refer to it. Capabilities can also refer to other capabilities. Furthermore, a capability that refers to another capability need not be in the same domain. Note that although we speak of a filesystem tree, the capability structure is really more like a vast directed graph. There are some restrictions however. A list of capabilities cannot be circular. The kernel will simply not allow a circular chain of references to be created. The reader should be careful to distinguish a capability that refers to a capability in another domain from a domain entry that might be a domain object. The latter is what forms the file system tree.

An indirect capability allows its owner the minimum of the access rights of all the capabilities it must indirect through. Any changes in access rights in a capability along the chain takes effect immediately. E.g., if a user is using a capability that indirects through five other capabilities, then he can have no more access than the logical and of the mode bits of all six capabilities (his plus the five it depends on). If one of the five capabilities is deleted, the user immediately loses all use of his capability.

One need not worry about the mode bits of a capability in the indirect chain changing since there is no kernel call to change mode bits in place. This must be accomplished by deleting the capability and creating a new one with different mode bits. However, even if the new capability ends up giving the same user the same access, the kernel will not automatically reinstantiate the new indirect capability. The user process fault handler is expected to handle the fault and reestablish the indirect chain by the same means it created it.

Note therefore that indirect capabilities are revocable, in the sense that any access granted through one can revoked by the owners of the capabilities the indirect capability depends on. Nonrevocable capabilities point directly at the object. Therefore, any user who wants to retain revocation rights should never give out a capability that points directly to the object.

Access Control

We have thus far described how capabilities are used to refer to kernel objects and each other. We now describe the way in which access to these capabilities is controlled. Capabilities have modes. A mode is a set of bits whose exact interpretation varies for different types of objects, but which govern the type of access that capability allows to the object. For example, segment type objects have mode bits of read, write, and execute.

Access Control Lists

We said that Amber had Multics type access control lists, but thus far we have discussed only capabilities. How do access control lists fit in this scheme? First, we must introduce the principal. The principal corresponds to the Multics userid. That is, it represents a person or an entity that one might wish to grant or not grant access to. In Amber, a principal is simply a domain. Any domain can be used as a principal in an access control list, but it is expected that some naming conventions between user names and their principal domains would be followed.

Associated with each object then is a list of principals and modes. This list is called the object's access control list (ACL). In order to gain access to an object through its access control list a task must present a capability to a domain that it has use access to that matches some principal in the ACL. Group access is handled by giving all members of a group use access to a domain that represents the group. The kernel call that obtains access to an object through its ACL takes a list of domains to match against the principals on the ACL. Thus for any particular call the a user can choose exactly which privileges to exercise.

As in the case of an indirect capability, any changes in an access control list take effect immediately. Unlike the indirect case however, the user does not lose his active access to the object unless the specific ACL entry that granted him access was deleted. Of course, it is possible that a user could lose access to any object because the term for the specific principal he used to gain the access is deleted, even though he may have use rights to other principals that are still valid. In this case again, if desired, the user process fault handler is expected to try to reestablish access to the object by calling the kernel with the current list of active principals.

Tasks

We have introduced the concept of capabilities that refer to other capabilities, or indirect capabilities, but we have not yet discussed what they are useful for. To understand that we must know more about the task structure of Amber.

A point of execution in an address space is called a task. Every task is associated with a domain called its domain of execution. The domain of execution contains all the capabilities of a running task and its address space. A task must have a capability cataloged in its domain of execution before it can be used in any kernel call. Since domains of execution will typically be created on demand when a task is started, they generally will start out initialized with only a few capabilities, such as one for the root. The task would then make kernel calls to obtain indirect capabilities to any objects it needed to access, including segments.

We have just introduced yet another association that domains maintain, which is an association between segments in the domain and their position in an address space. Mapping a segment into an address space is the same as assigning it a position in a domain of execution. The segment need not (and usually will not) live in the domain of execution, but in order to make the map call, a capability to the segment must exist in that domain. The only case in which the the segment might actually exist in the domain of execution rather than being an indirect capability would be for temporary segments such as execution stacks.

There may be more than one task associated with an address space. All the tasks executing in a given address space have identical access privileges. In fact, they share the same capabilities, namely, all those in the domain.

Tasks are themselves objects, which can have capabilities pointing to them. The capabilities for a task have modes of status, writestate, and control, granting rights to examine or modify the state of a task and start and stop its execution.

Message Channels

Most capability systems have a means whereby tasks can pass capabilities among themselves. Amber has an object called the message channel which can be used for this purpose. The Amber message channel can also pass data between tasks. Message channels are intended for use in communicating with server tasks. A server task would typically place a message channel object in some advertised place in the file system. A user program requiring the server could then use the lookup_ kernel call to get a capability to the message channel.

Dummy Objects

There is an object that is available for use as a placeholder. The dummy object lets users take advantage of all the information that the kernel manages for objects without incurring any of the overhead one of the other objects would. In particular it is expected that dummy objects will be used to implement soft links outside the kernel, by storing the target of the link on the property list of the object.

Extended Type Object

An extended type object is an object whose access is constrained to go through a procedural type manager. Only the manager of an extended type object can get a capability directly to the object. The manager is a domain that contains a prototype task. There is no special privilege needed to create an extended type object. Any object may be sealed with the capability to the domain that is the type manager for that object. Thenceforth, only the type manager may unseal (i.e. obtain any capability to) the sealed object. Other users wanting access to the information or services that an extended type manager provided would invoke the type manager, which involves starting the task from its template. The state of a task so invoked cannot be altered or examined by the invoking task. Similarly, the invoked task cannot open any message channels or lookup any objects other than the ones that were sealed. The extended type manager either runs to completion and returns a result, or can be terminated at predefined points (to allow it to insure that the database it may be managing is in a consistent state). The task running an extended type manager is very much like a Hydra encapsulated procedure [Hydra1981].

This facility is the best example of a feature the Amber capability model provides that virtually no stretching of the Multics access control model could provide. The extended type object is an implementation of mutually suspicious subsystems as described in [Schroeder1972]. The implementation is expensive however, requiring a process to be invoked for each use. This is considered acceptable because use of extended type managers is expected to be infrequent in Amber. It is expected that the normal kernel provided access modes on objects will be sufficient for most purposes.

If heavy use of a protected object is expected, one should probably instantiate a permanent task to manage the object and have users communicate with it via message channels. It is not necessary to use the extended type mechanism to do this.

Retrospective

Any large design effort has some decisions that were harder to make than others. Amber is no exception. This section reviews some of the hardest ones.

To Revoke, or not to Revoke?

There was significant debate as to whether revocable capabilities were worth the effort. There were two arguments against implementing revocation. Both assumed that the implementation of revocation would make the kernel noticeably more complicated. One argument asserted that revocation does not really regain lost information. That is, if someone once had access to a file they weren't supposed to, they could already have made a copy of its contents, or in the case of write access, destroyed its contents. In this case revocation doesn't really matter and gives a false sense of security.

The counter-argument to this is simply that just because someone once had access to an object doesn't mean that they necessarily made a copy. Human interaction are complex -- relationships between people, organizations, nations, etc., change with time. It is unreasonable to claim that it is too hard to make a computer system capable of tracking those relationships in realtime.

The other argument asserted that revocation of access is a rare event. Furthermore, if something so bad had happened as to instigate some removal of access then the owner would not mind having to take severe measures to revoke access. For example, in the case of a file the owner might copy the information to a new file, delete the old object, and then rename the new file to take the logical place of the old one if necessary. Since the new file is not the same object, no one who had a capability to the old one could use the new one. The access for the new object could then be set to the old access minus whoever was being revoked. There is a potent real world analogy to this argument. If a group of people have been issued keys to a door, and one is suspected of misusing his access, it is often not possible to reclaim the key. In that case a locksmith must be called to change the lock on the door.

The counter-argument in this case is that rare or not, the operation need not be that difficult. In essence, there was disagreement on just how hard it would be to implement revocation.

Null Access

A related argument has to do with denial of access as opposed to revocation of access. Amber has no explicit way of denying access to a particular principal. I.e. it has no counterpart to the Multics null access term. In Amber a user is denied access to an object only if he does not have use access to any principal on the ACL of the object. This could be inconvenient in the case where one wants to allow everyone in the system except perhaps one user to access an object. Presumably the system maintains a principal which is available to all users. In order to exclude a single user, it is necessary to create one's own principal with use access entries for everyone on the system except for the user you wish to deny access to. Furthermore, this new principal won't permit everyone in the system access to the object unless they actually add it to their list of principals to try when checking out objects. Other users may not want to bother adding a new principal to their list merely because two users are having a petty dispute.

The justification for not having a specific null access feature is that such a feature is virtually meaningless on a system with a large user community. If you attempt to deny all but one user read access to a file, any other user that is willing to cooperate with the excluded user can read the file for him and pass along the data. In fact, Amber has the facilities to make such a scheme easy -- user A can allow user B to masquerade as user A simply by giving user B use access to the principal that represents A. This is referred to as "granting a power of attorney". Of course, user A had better trust B thoroughly, since he is putting his entire existence on the computer system in B's hands.

Why not message passing?

One of the most fundamental decisions in Amber was to include the file system in the kernel. File systems are complicated. Better file systems like Amber provide lots of features, such as a rich name space. Good file systems have lots of error detection and recovery code for dealing with the "real world" where disks fail. If our goal is to have a lean and simple kernel in order to make it easier to understand and verify, then it would certainly be good if we could leave this stuff out of the kernel. But we can't just give users calls to access the disk hardware themselves because disks are usually shared among several users. Something must manage their accesses to make sure they don't interfere with each other. Systems that don't have the file system in the kernel do this by giving one task the right to access the disk drives. This task then services disk requests from the rest of the tasks and can regulate their access.

Systems structured in this fashion where objects are managed by server tasks, and users communicate with the server tasks by passing messages back and forth, are called message passing systems. They have become quite popular in academic circles of late [demos]. One reason for this is the increased modularity that the message passing approach enforces. The "arms length" relationship that tasks have with each other encourages careful consideration of their dependencies on each other.

There are several reasons why the Amber team did not take this approach. Perhaps, the foremost one was unfamiliarity. No one on the team had worked on a message passing system before, whereas we were all familiar with Multics. There were more substantive arguments as well. Although a message passing system does encourage modularity, that does not mean that a system based on procedure calls is necessarily less modular. A primary reason for message passing systems is that they make the kernel smaller, and therefore much easier to verify. However, it is not clear what the advantage of this is, since to verify the security of the system as a whole one must verify all tasks that manage critical resources such as the disks. Still, the message passing adherents can claim that a bug in the line printer driver is less likely to provide a means of subverting disk management. Finally, and perhaps decisively, there is an efficiency argument. Message passing systems require at least one task switch per operation. A task switch is typically more expensive than a procedure call, even a procedure call that results in a change in privilege. Careful design and hardware assist can mitigate this. However, the Amber team decided that the subroutine approach was, overall, more cautious and chose that path.

Implementation

Concepts

A full description of the details of the kernel implementation is well beyond the scope of this document. However, it is appropriate to acknowledge where the concepts that guided the implementation came from.

Multics was the source of many design studies, most of which were done as graduate student theses. Several in the mid-70s concentrated on better ways to implement the Multics supervisor [MulticsKernel]. The Amber kernel structure was strongly influenced by these documents. In particular, Janson's thesis studied circular dependencies in operating systems and ways to avoid them [Janson1976]. The Amber kernel has no circular dependencies between major modules. Reed's thesis on multilevel virtual processes influenced the Amber traffic controller and task hiearchy [Reed1976].

One of the most interesting constructs in the kernel is the access cache. Without the access cache, the Amber capability scheme simply would not have been implementable. It is therefore quite remarkable that the cache was not invented until well after the capability design had been committed to.

There are two problems that the access cache solves. The first is simple. Essentially all kernel calls refer to the objects they deal with by capabilities. Every kernel call must then look up these capabilities in the domain of execution and trace down the object that they actually refer to. A cache speeds up this process enormously. The access cache holds the most recently used capabilities and a shortcut pointer to the objects they reference.

We have mentioned how Amber will not allow a circular capability reference to be created. We have also described how revocation takes effect instantly in Amber. The former is facilitated by the access cache, and the latter would be nearly impossible without it. The idea is very simple. Before any capability can be placed in the access cache, any capability it depends on must first be placed in the access cache. Every capability that has dependents has a pointer to a list chained through all its dependents. If a capability must be removed from the access cache for any reason, its dependents are removed first. The next kernel call that tries to use an object so removed will simply take a cache fault and rebuild the chains if it is still possible to do so. If it is not possible, then access has changed and the kernel call will return an appropriate error code. Thus the access cache will always have in its address space a handy representation of the dependency structure of any capability in active use. Note that if this were not the case, there would be no way, short of a file system search, to find out which dependent capabilities might have been revoked by the deletion of a particular capability.

Languages and Hosts

The implementation of Amber was begun on Multics, writing in PL/1. PL/1 has a poor type definition facility, so it was extended using a preprocessor. Using this technology, large portions of the kernel were written and tested using special test programs. Specifically, the task scheduler, directory management, and segment and page management were all implemented and tested to some extent. Of course, there was much that could not be tested except on a real S-1, such as the portion of page management that actually sets page table entries and expects memory to become accessible as a side effect. Wherever possible however, a dummy routine was substituted that would try to use native Multics facilities to duplicate the desired action. For example, the internal kernel routine that does the final work of updating an S-1 segment table to map a segment in was replaced with one that instead mapped a Multics segment into the Multics address space. This allowed more testing to proceed than one might otherwise believe possible.

Over half a year into the implementation effort it was decided to shift languages. There were two reasons for this. First, there was growing dissatisfaction, not only with PL/1's poor type definition facilities, even though somewhat mitigated by the preprocessor, but with PL/1's concept of a module. We wanted to be able to write a module that would implement particular system abstraction. The module might have some public entry points and many private subroutines. There is no way to have several different procedures all share the same set of internal (private) subroutines in PL/1 except by use of the entry statement. The entry statement had many undesirable properties, however, as well as not being implemented in the PL/1 subset G compiler front end that the project planned to purchase. The second and compelling reason for abandoning PL/1 was not technical. It was simply proving too hard to get the Department of Energy to approve the purchase of such an expensive piece of software.

In most cases such bureaucratic bungling would be cause for a project to fail. In the case of the Amber project, it proved a blessing. Jeff Broughton, then Amber project leader, implemented two extended Pascal compilers. The extended language was named Pastel (an off-color Pascal). A complete compiler targeted to the S-1 architecture was completed in several months. However, the first version of the compiler, which was implemented in slightly more than a weekend, produced PL/1 as its output! This was done to allow the continued use of Multics, which had no Pascal compiler, as the development site for Amber. At that point the project had a couple of man-years of code written in PL/1. The conversion of all PL/1 code to Pastel was accomplished in about 3 man months. (1998 historical note: at one point in the project Richard Stallman visited, and had the Pastel compiler explained to him. He left with a copy of the source, and used it to produce the Gnu C compiler. Most of the techniques that gave the Gnu C compiler its reputation for good code generation came from the Amber Pastel compiler.)

Eventually, development was moved to a local VAX running Berkeley Unix. This necessitated a new code generator targeted toward the VAX. Some of the Multics specific simulation routines were rewritten so testing could continue, but by and large, aside from the code generator the movement of Amber development to the Vax was not a major effort. On the Vax there was opportunity to compare the Pastel compiler to existing Vax compilers. In general the Pastel compiler produces smaller and faster code than the Unix C implementation language, by about 30% in both measures.

Other Hardware Bases

In section paging we point out that other computer architectures with large paged address spaces are capable of supporting an Amber-like operating system. In particular the Pastel compiler already has code generators for the VAX and the National Semiconductor 16032 families, although the latter code generator is untested. An Amber port to these or other architectures would involve reimplementing the portions of the kernel which manage address spaces. There are some fundamental differences in the way a paged machine handles its page tables. On a segmented machine like Multics, the S-1, or the Prime 500 series, access to a segment is determined from bits in a user's segment descriptor word. On a paged system each page table word has access bits. (The access we are referring to here is read, execute, or write to an area of main memory. The S-1 actually has bits in both the segment and the page table words and and's them together to determine effective access.) This unfortunately means that two users who have the same segment mapped in with different access rights cannot share the same page table.

This exacerbates an already difficult situation on the VAX where the very small page size results in a very low ratio of words of main memory described per word of page table. In spite of the user page tables themselves being pageable on the VAX, enough page table entries to describe the entire user address space of 228 words requires 128K bytes of non-pageable physical memory to hold the first level page tables. Needless to say, using segments under such a scheme would incur a certain overhead in memory consumption. There are some measures that can be taken to mitigate the problem, and in any event, memory is relatively cheap nowadays.

Status

The design of Amber was begun in 1979 by a team of six. Hon Wah Chin was project leader. Team members were Ted Anderson, Jeff Broughton, Charles Frankston, Lee Parks, and Daniel Weinreb. All team members were familiar with Multics. Lee Parks and Ted Anderson had participated in implementation of a small scale Multics like system as undergraduates [Parks1979]. Daniel Weinreb had worked on the MIT Lisp Machine project as a undergraduate [LispMachine].

Most of the first year of the effort was spent in design and discussion. The essence of the current capability scheme was devised by Jeff Broughton toward the end of the design period. Prior to this time, the design was much closer to Multics in its concepts of segments, directories, access control and the like. In fact some code which was already written had to be modified to accommodate the new scheme, but the changes were not drastic.

By that time it was apparent that the S-1 Mark IIA computer system, which Amber was designed for, was going to be ready later than expected. Had the hardware been closer to completion, it is likely that a less ambitious and more expeditious system would have been implemented. However, the continued non-availability of the target computer system was very detrimental to completion of coding efforts. It was increasingly difficult, both technically and psychologically, to continue building a kernel with no real feedback on how successful the elements of the structure thus far implemented were.

At the end of the first year, Daniel Weinreb and Lee Parks left the project. Hon Wah Chin assumed other duties with the S-1 project and Jeff Broughton became team leader. One and a half years into the project Earl Killian joined in the midst of the switch to Pastel as the implementation language. Three years into the project Charles Frankston took an extended leave to continue his education. Jay Pattin joined the Amber team over four years after the inception of the project.

At times other duties, such as maintenance of support computers and network structure, have taken manpower from the implementation of Amber. In the fourth year of the Amber effort, Jeff Broughton became technical director for the entire S-1 effort, and had little time to spend on Amber implementation. Ted Anderson was drafted into debugging Mark IIA hardware. Over the five year course of the project, direct manpower working on Amber has been between one and three persons, with an average close to 2. Roughly 10 total man years have been expended implementing the Amber kernel, which is believed to be 90% complete. The final 10% still awaits the completion of the S-1 Mark IIA hardware.

Some work has been done on the facilities outside the kernel. There exists a window oriented command dispatcher running in simulation on both Multics and Berkeley Unix. A description of the concepts and ideas involved in this system is beyond the scope of this document.

1998 Historical Note: Eventually one Mark IIA processor ever became operational -- and Amber ran on it -- largely due to the efforts of Jay Pattin. Overall however, the Mark IIA processor design was a failure -- featuring one of the most complex hardware designs ever built, just as RISC architectures were becoming popular. Over time most of the S-1 team went on to other endeavors. Earl Killian, for one, learned the lessons of the complex Mark IIA project well, and went on to become chief architect for MIPS corporation, helping to design several generations of RISC microprocessors.