The operating system is the most important and among the first of the software subsystems developed for a new computer. It is commonly the most complex of the programs ever developed for a given computer system; it defines the functions available to all other programs. This is why operating systems have so greatly influenced software technology.
The influence of operating systems has been felt in two ways. First, operating systems set the environment for applying all other software technology, whether the application is to user software or to system software. Second, operating systems research has in fact motivated much of the development of software technology, especially large software systems. Many early software problems were first formalized and treated as operating systems problems. Their solutions were then generalized and applied to other forms of software.
This paper surveys some of the influence of operating systems research on software technology. After a short historical discussion, it turns to the significant results of operating systems research. These results include user interfaces, resource allocation, and process management. The final part of the paper reviews current research (and trends) in operating systems. It is of course impossible to cover adequately all aspects of operating systems research in a short paper. We hope that the comments by the discussants and the bibliography of important papers will help to fill the gaps.
Early operating systems were developed as control programs (or control systems) for their computers. These systems had two broad objectives: to make the machine simpler to use, and to increase the system's productivity.
These two objectives required that the operating system completely control all system resources. Minor modifications to early computer hardware easily provided the operating system with some control. Centralized control over the physical resources permits implementing abstract resources which are simpler to use; and centralized schedulers permit the operating system to effectively increase the system's productivity.
Techniques to improve the efficiency and productivity of computer systems were of primary concern in early operating systems research. The disparity in processing speeds of the central processor unit (CPU) and its input/output (I/O) devices suggested sharing or multiplexing of resources as a means of using the resources more efficiently. Multiprogramming, which implements the sharing of the central processor and main memory among computer users, quickly became a dominant theme, producing extensive research on resource management, scheduling, modeling, measurement, and simulation. This research sought to evaluate and improve existing policies and mechanisms.
Resource sharing must not affect the correct execution of the user programs. A user's view of the system must be independent of the number of users of the system and, hence, of the actual state of the physical hardware. Each user is thus presented with an idealized virtual machine -- an abstraction of the underlying computer which does not depend on the presence of other users in the system [Buzen 1973; Popek 1974]. Many early operating systems implemented abstract devices as a tool to insulate users from the details of hardware and as a means of providing greater reliability. The abstract device was often more convenient to use than the actual device. An example is a file system which abstracts from the physical properties of disks and tapes, implementing user-named logical files.
Multiprogrammed operating systems implement multiple, concurrently operating virtual machines. The term "process" refers to a program executing on a virtual machine. Operating systems research has produced theoretical and practical results for managing processes. The study of cooperating communicating parallel processes was the first step in a progression of increasingly sophisticated design structures for operating systems.
We see in retrospect that abstraction and virtualization have been central themes in operating systems research. The early concern for efficient operation led to multiprogramming which, in turn, produced both a need for further abstractions and virtualization and spawned a large research literature on scheduling, modeling, and other aspects of performance.
Operating systems research has had an enormous impact on many areas of computer science. Five are most notable.
Programming languages which implement concepts such as monitors (Concurrent Pascal [Brinch Hansen 1976]) and abstract data types (Alphard [Wulf 1976], Clu [Liskov 1977], Gypsy [Ambler 1977]) are now appearing [Jones 1977]. It is not accidental that the designers of these language systems all have strong research roots in operating systems. The use of these concepts can have profound effects on the programming process. This has been demonstrated in the operating system environment. These concepts allow formal approaches to the specification and proof of correctness [Hoare 1974; Robinson 1975] and implementation of reliable systems [Linden 1976]. Brinch Hansen reports the implementation of a complete operating system in 1300 lines of Concurrent Pascal.
Operating systems research has steadily improved the efficiency and effectiveness of computer systems. This improvement has come both from insight and understanding of computing processes and also from improved techniques of resource allocation. The most significant visible result has been the development of successful time-sharing operating systems, bringing interactive computing to the individual at a reasonable cost. Supporting this highly visible impact is a large literature on resource scheduling and process control.
Another area of significant visibility has been structuring methods for large software projects. This is currently a highly active area of research. It is discussed fully in the section on Current Research.
The ultimate justification for computers is to enhance human capabilities by speeding up computations and performing routine clerical activities. Interactive computing, an essential feature of "time-sharing" computer systems [Wilkes 1975], allocates computer capacity as needed to humans. This major result of operating systems research has influenced software technology enormously.
There are two kinds of time-sharing systems. A general-purpose time-sharing system offers a spectrum of services which help users to develop, test, store, execute, and share programs. A special-purpose time-sharing system offers some particular type or class of service. An example is a "single language system", in which the user may write and execute programs only in a single language (such as Basic or APL). Another example is a transaction system, in which the user has access to a specific set of functions defined on a data base (such as an airline reservation system, a bank teller system, or a medical information system).
The most famous early time-sharing system was CTSS at MIT [Corbato 1962]. Other general-purpose systems were developed shortly thereafter at Bolt Beranek and Newman for the DEC PDP-1, and at the University of Manchester for the TITAN project [McKeag 1976]. Another well known system of the same period was built at the University of California at Berkeley for an XDS 940 [Lampson 1966]. These early systems and their principles have been treated by Wilkes. Several special-purpose time-sharing systems were also developed in this period.
Operating systems development by major vendors for noninteractive, or "batch", processing was well under way when experimental time-sharing systems were being designed. Consequently, time-sharing capabilities were not originally included in these major systems. So successful -- and salable -- have been time-sharing systems, that now every major vendor operating system is either interactive or supports a time-sharing subsystem. Interactive computing is even more popular with minicomputers; the UNIX operating system [Ritchie 1974] for the PDP-11 minicomputer is an example. Some large processors, notably the IBM 360/67 and the GE 645 were designed especially to support interactive computing.
The current technical and commercial impact of interactive computing is vast. The essential function of a computer -- efficiently collecting and distributing information for people -- is served naturally by these systems. It is difficult to conceive of systems such as an airline reservation system operating in a batch mode. Applications such as military command and control would be impossible without rapid interaction. Interactive computing has contributed strongly to program design and development through on-line file systems and such programs as text editors and incremental compilers. Interactive program development and use have been found to improve human productivity in many installations.
The theme of multiprogramming appearing in most operating systems research has been inspired by the desire to utilize expensive resources efficiently by sharing them. The separate (virtual) machines of the users are in fact simulated on a common, shared, physical hardware system. The resource allocation policies used to implement virtual machines have comprised CPU (central processing unit) scheduling, primary memory management, and to a lesser extent, auxiliary memory management. Each of these areas is discussed in the following sections.
Effective CPU scheduling was a requirement for the transition from the batch processing systems of 1960-1965 to the multiprogramming and time-sharing systems of the late 1960s and early 1970s. There are two fundamental scheduling objectives: maximizing processor efficiency and throughput, and minimizing the response time for short transactions in interactive systems. These objectives are not always compatible.
The CPU scheduling problem introduced some new factors not previously considered in the general scheduling literature: the service time required by an individual request is not known a priori; the service time distributions are highly hyperexponential; preemption is required for fair sharing and responsiveness. Thus, operating systems research has motivated new work in scheduling theory.
It was discovered early that scheduling first the job whose CPU service request is shortest minimizes overall response time (especially for short jobs). Preempting jobs with long CPU service requests allows "I/O-bound" jobs to proceed to the I/O devices of the system, thereby increasing throughput significantly [Stevens 1968; Ryder 1969; Marshall 1969]. Predictive and quantum-based scheduling prevent long jobs from monopolizing the CPU.
In a study of predictive schedulers, Sherman, Baskett, and Browne demonstrated that straightforward prediction, when coupled with preemption of long jobs, gives near-optimal processor utilization. They also showed that a simple round robin scheduler with a properly chosen quantum size may produce CPU utilization almost as high as the most sophisticated predictors (see also [Lassater 1973; Lering 1975; Cho 1975]). Most modern commercial operating systems use some form of predictive CPU scheduling with preemption.
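The effect of quantum-based preemption on short jobs can be seen in a small simulation. The sketch below (in Python, with an invented job mix; not drawn from any system discussed here) compares mean completion times under first-come-first-served and round robin scheduling when all jobs arrive together:

```python
from collections import deque

def fcfs(jobs):
    """First-come-first-served: completion time of each job (all arrive at t=0)."""
    t, done = 0, []
    for burst in jobs:
        t += burst
        done.append(t)
    return done

def round_robin(jobs, quantum):
    """Round robin with a fixed quantum; all jobs arrive at t=0."""
    queue = deque((i, burst) for i, burst in enumerate(jobs))
    t, done = 0, [0] * len(jobs)
    while queue:
        i, remaining = queue.popleft()
        run = min(quantum, remaining)      # run for one quantum or to completion
        t += run
        if remaining > run:
            queue.append((i, remaining - run))  # preempted: back of the queue
        else:
            done[i] = t
    return done

jobs = [24, 3, 3]                              # one long job, two short ones
print(sum(fcfs(jobs)) / len(jobs))             # FCFS mean completion time: 27.0
print(sum(round_robin(jobs, 4)) / len(jobs))   # RR mean: short jobs finish early
```

With one long job and two short ones, round robin lets the short jobs finish at times 7 and 10 instead of 27 and 30, cutting the mean completion time roughly in half.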
Interactive systems face the challenge of providing reasonable response and high processor utilization without unduly delaying long jobs. The class of multilevel feedback schedulers (usually with discrete quanta and preemption) was proposed to meet this challenge [Corbato 1962]. Many operating systems, including Multics and IBM's MVS for the 370, use a multilevel feedback scheduler. Research on CPU scheduling has resulted in identifiable changes in operating systems over the years.
Multilevel feedback schedulers have been criticized because they are difficult to analyze. The complexity of analysis [Coffman 1973] does not diminish the importance of these schedulers. Though difficult, the analyses confirm the simple intuition on which these schedulers are based: they approximate minimal response time for short jobs, without prior knowledge of service time requirements.
Analyses of scheduling problems in early computer systems were based on classical queueing theory: a single server with a queue, Poisson arrivals, and arbitrary service times (the M/G/1 queueing system). Analyses for round-robin, multilevel, and priority schedulers are extensions of this classical model [Coffman 1973; Kleinrock 1975]. Many significant extensions to scheduling theory for discrete quantum-based and preemptive schedulers were developed.
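The classical M/G/1 result underlying these analyses is the Pollaczek-Khinchine formula, which gives the mean waiting time from the arrival rate and the first two moments of the service distribution. The sketch below uses invented numerical values to show why hyperexponential (high-variance) service times hurt response even at the same utilization:

```python
def mg1_mean_wait(lam, mean_s, second_moment_s):
    """Pollaczek-Khinchine mean waiting time for an M/G/1 queue:
    W = lambda * E[S^2] / (2 * (1 - rho)), where rho = lambda * E[S]."""
    rho = lam * mean_s
    assert rho < 1, "queue is unstable at this load"
    return lam * second_moment_s / (2 * (1 - rho))

# Exponential service with mean 1: E[S^2] = 2
print(mg1_mean_wait(0.5, 1.0, 2.0))    # 1.0
# Hyperexponential service, same mean but E[S^2] = 10 (high variance)
print(mg1_mean_wait(0.5, 1.0, 10.0))   # 5.0
```

At 50% utilization, raising the second moment of the service time from 2 to 10 while holding the mean fixed multiplies the mean wait by five, which is why the highly hyperexponential distributions noted above made preemptive policies attractive.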
Even as the analysis techniques for single-resource queueing systems were being perfected, they were becoming obsolete: computer systems were evolving into networks of resources. Scherr's study of CTSS, based on a simple network (the machine repairman), showed that the model structure is far more important in leading to accurate results than the details of the CPU service distribution or the scheduler. Moore reached a similar conclusion in his analysis of MTS. Buzen introduced the "central server" queueing model, which has become an important analytic tool [Buzen 1973; Baskett 1975; Buzen 1975; Gelenbe 1975; Muntz 1975; Gelenbe 1976]. A recent development is the reformulation of the models under operational assumptions, so that the mathematics no longer depend on Poisson arrivals or exponential service distributions; the latter assumptions reduce the credibility of classical queueing analyses [Buzen 1976; Denning 1977]. The application of queueing network models to computer systems is covered at greater length in the section "Performance Measurement: The Connection to Reality", elsewhere in this volume.
Because memory management techniques rely heavily on memory addressing hardware, memory management research and computer architecture have interacted strongly. The requirement in early multiprogramming systems that each job be loaded entirely in a contiguous region of main memory was eventually relaxed by paging and segmentation schemes. Research results have quantified the overhead inherent in these schemes and have shown how to partition memory to minimize waste. They have discovered efficient procedures for compacting programs in memory and for swapping programs in and out of main memory. Significant gains in efficiency have resulted in the cases where this research has been applied.
The most important class of results in memory management has dealt with the concept of virtual memory, implemented either by page or segment based addressing (or both) [Denning 1970]. Paging has been popular because it is simple to implement; however, segmentation is of increasing interest because it allows the addressing mechanism to be used for protecting small, logical program units.
Paged virtual memory originated in the late 1950s with the ATLAS project at Manchester University [Kilburn 1962]; segmented virtual memory originated in the Burroughs B5000 series computers. A combination of segmentation and paging is used in Multics [Organick 1972]. Segmentation has been viewed both as a natural extension of overlaying techniques and as a powerful means of controlling access and sharing. Papers by Dennis  and by Arden, et al.  became standard references. These ideas influenced such commercial products as the IBM 360/67 with CP/67 (now VM/370), MTS (Michigan Terminal System), and the GE 645 (now Honeywell 6180) hardware for the Multics operating system.
Early virtual memory systems performed poorly; many still do. The reason is "thrashing", that is, serious degradation of CPU efficiency caused by excessive paging. Thrashing occurs when too many programs are forced into a fixed main memory. There has been extensive experimental work aimed at understanding which memory management policies are best. A famous early study by Belady focussed on fixed partition policies and included results for an (unimplementable) optimal paging algorithm, while more recent studies have included variable partition policies [Denning 1975; Hoare 1972]. Mattson, et al. developed a "stack model" for memory policies enabling efficient data collection and analysis for common paging policies (see also [Coffman 1973]).
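The stack method can be illustrated for the LRU policy: one pass over a reference trace yields the stack distance of every reference, from which hit counts for all memory sizes follow at once. This is a minimal sketch with an invented trace, not a reconstruction of the published implementation:

```python
def lru_stack_distances(trace):
    """One pass over a page-reference trace; for each reference, record its
    depth in the LRU stack (the 'stack distance' of Mattson et al.)."""
    stack, distances = [], []
    for page in trace:
        if page in stack:
            d = stack.index(page)           # depth: 0 = most recently used
            stack.pop(d)
            distances.append(d + 1)         # a memory of >= d+1 pages hits
        else:
            distances.append(float('inf'))  # cold miss at every memory size
        stack.insert(0, page)               # page becomes most recently used
    return distances

def hits_for_size(distances, m):
    """Number of hits LRU would score with a memory of m page frames."""
    return sum(1 for d in distances if d <= m)

trace = ['a', 'b', 'c', 'a', 'b', 'd', 'a']
dist = lru_stack_distances(trace)
print([hits_for_size(dist, m) for m in (1, 2, 3, 4)])   # [0, 0, 3, 3]
```

A single data-collection pass thus evaluates the policy for every memory size, which is what made the stack model so useful for paging studies.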
The working set model of program behavior [Denning 1968] defined a method for determining the active part of the address space of an executing program, thereby laying the foundation for controlling thrashing through a combination of CPU scheduling and memory management [Denning 1976]. Only recently, when queueing models allowed analyzing the effects of memory policies and load controllers, have we discovered effective scheduling methods for avoiding thrashing. Bard installed such a method in CP-67. Rodriguez-Rosell and Dupuy showed that by preventing the level of multiprogramming from getting too high, a working set dispatcher could effectively control thrashing. Denning, et al. demonstrated that optimal load controls are not difficult to define and implement, and that although load control has a greater effect than the memory policy, the working set dispatcher gives a robust control which is nearly optimal [Denning 1975; Denning 1978].
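The working set itself is simple to state: W(t, T) is the set of distinct pages referenced in the last T references up to and including time t. A minimal sketch, with an invented reference trace:

```python
def working_set(trace, t, window):
    """Denning's working set W(t, T): the distinct pages referenced in the
    last `window` references up to (and including) reference number t."""
    start = max(0, t - window + 1)
    return set(trace[start:t + 1])

trace = ['a', 'b', 'a', 'c', 'a', 'a', 'd']
print(working_set(trace, 4, 3))   # {'a', 'c'}
```

A working set dispatcher keeps a program's working set resident and admits a new program to the multiprogramming mix only if all resident working sets still fit in memory, which is the load control described above.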
Another significant outgrowth of memory management research has been the modeling and quantification of "locality" in program behavior [Spirn 1977]. Demand paging works only because programs localize their references in address spaces for significant intervals of time [Denning 1968]. Only recently have reliable procedures been developed to measure [Madison 1976] and analyze locality [Courtois 1976; Courtois 1977; Denning 1978]. Procedures for assigning small program segments among large pages can produce significant improvements in virtual memory performance by maximizing locality of reference to pages [Ferrari 1975].
The impact of virtual memory is obvious. The dominant medium and large scale commercial systems use segmented or paged virtual memories. Many minicomputers and microcomputers use virtual memory architectures, though frequently only to solve the relocation problem for small segments placed in a large physical memory. Nearly all paging systems employ a variable partition memory allocation policy. The most successful of these policies derive from the working set model.
Auxiliary memory devices, whose access times are significantly slower than main memory access times, tend to be congestion points in a system. The policies of managing these devices are therefore concerned with maximizing throughput by eliminating avoidable idle time.
The paging drum (and fixed head disk), whose use in paging systems is widening, has been studied extensively. Requests to use the drum are sorted into queues, one for each drum sector; the drum services the request at the head of a given queue as the corresponding sector rotates beneath the read-write heads. This drum will behave according to a shortest-access-time-first (SATF) policy. Since an SATF policy minimizes total accumulated latency time, it maximizes possible throughput for a fixed load on the drum [Coffman 1969; Denning 1972; Coffman 1973; Fuller 1974].
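The sector-queueing discipline can be sketched as follows: requests are sorted into per-sector FIFO queues, and the request at the head of each queue is served as its sector rotates under the heads. The request names, sector count, and starting position below are invented for illustration:

```python
from collections import deque

def drum_service_order(requests, sectors):
    """Sector queueing for a paging drum: one FIFO queue per sector; as each
    sector passes the heads, the head of its queue is served. The resulting
    order realizes shortest-access-time-first (SATF) behavior."""
    queues = [deque() for _ in range(sectors)]
    for req_id, sector in requests:
        queues[sector].append(req_id)
    order, position = [], 0          # heads start over sector 0
    while any(queues):
        if queues[position]:
            order.append(queues[position].popleft())
        position = (position + 1) % sectors   # drum rotates one sector
    return order

# requests (id, sector) on a 4-sector drum
reqs = [('r1', 2), ('r2', 0), ('r3', 2), ('r4', 1)]
print(drum_service_order(reqs, 4))   # ['r2', 'r4', 'r1', 'r3']
```

Each revolution serves one request per nonempty sector queue, so no rotational latency is wasted while any queue holds work.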
Analyses of moving-arm disks have been less conclusive. The problem is that an SATF policy -- which in practice is approximated by a shortest-seek-time-first policy -- tends under heavy loads to discriminate against requests for the inner and outer cylinders of a disk. Although the throughput is maximum, response time may suffer because certain requests may wait excessively. A comprehensive study by Teorey and Pinkerton showed that no single policy is uniformly best over the entire range of load conditions. Several model studies have analyzed the effect of overlapping seek and transmission operations for multiple disks connected to a single controller; but the results depend strongly on the workload.
The performance of policies for reducing mechanical motion at a disk or drum is usually determined by considering the device off-line -- i.e., under specific loads. It is still open whether motion-reducing scheduling produces observable effects on overall performance when the device is on-line. Certain load controllers, for example, will set the level of multiprogramming to keep the average paging-drum queue length at 1 request [Denning 1976] -- under such a light load, the SATF policy may have no discernible effect. The standard practice of grouping the records of the same file on the same disk cylinder results in a majority of disk requests without seeks [Lynch 1972]; even under heavy disk loads, there is some question about the overall importance of arm-movement optimization.
Third generation computer systems introduced a new aspect of program design: concurrent or parallel programming. CPU scheduling and memory management mechanisms allow for creating and managing multiple virtual machines; but they do not address the logical control operations by which programmers may coordinate concurrently executing processes. A considerable body of theory has evolved in this area.
The term "process" has been used since the early 1960's to denote an abstraction of the activity of a processor. It gives the concept "program in execution" meaning irrespective of whether a processor happens to be executing the instructions of the given program at a given time. The development of this concept was a major advance; it provided a framework for concurrent programming.
To be reliable, systems supporting concurrent programs must satisfy four properties: determinacy, freedom from deadlock, indivisible operations, and synchronization.
Three types of approaches to the deadlock problem have been studied: (a) prevention, which constrains the ways processes may request resources so that a circular wait can never arise; (b) avoidance, which grants a request only if the resulting state is safe; and (c) detection and recovery. Type (c) approaches may be expensive in execution and are unsuitable where fast response is desired. Type (b) procedures tend to have high overhead when the system is near saturation [Sherman 1974]. All the procedures for approaches (b) and (c) rely on knowing how many units of each resource type are available in the system. When there is no bound, as with semaphore counts, the procedures cannot be used. For this reason, approaches of type (a), or approaches based on preemption, are the most practical.
The most common approach of type (a) is called ordered resource usage. The system resources are partitioned into classes Ci, which are arranged in a hierarchy (such as a linear chain or a tree). A process which already holds resources from Ci may request resources from Cj only if Cj is below Ci in the hierarchy. This procedure prevents circular wait and operates irrespective of whether there are bounds on resource quantities. Since the partitioning restricts the ability to share, resource utilization may be lower than with type (b) or (c) methods.
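Ordered resource usage can be sketched as a simple run-time check, assuming numeric ranks for the classes (the class names and ranks below are invented): a request is granted only if the requested class lies strictly below every class the process already holds.

```python
class OrderedResources:
    """Deadlock prevention by ordered resource usage: resource classes are
    ranked, and a process holding resources of some rank may request only
    strictly lower ranks, so no circular wait can form."""
    def __init__(self, rank):
        self.rank = rank    # resource class -> position in the hierarchy
        self.held = {}      # process -> set of classes currently held

    def request(self, process, cls):
        held = self.held.setdefault(process, set())
        if held and self.rank[cls] >= min(self.rank[c] for c in held):
            raise RuntimeError(f"{process} may not request {cls}: violates ordering")
        held.add(cls)
        return True

mgr = OrderedResources({'files': 2, 'buffers': 1, 'printer': 0})
mgr.request('p1', 'files')
mgr.request('p1', 'buffers')    # allowed: 'buffers' is below 'files'
try:
    mgr.request('p1', 'files')  # above the lowest class held -> refused
except RuntimeError as e:
    print(e)
```

Because every process acquires classes in strictly descending rank order, no cycle of processes waiting on one another can arise, regardless of how many units each class contains.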
The major results of the research on process management have been accepted into practice because they are straightforward, they are effective in keeping concurrency simple, and they can be rigorously proved correct. The Venus operating system [Liskov 1972], for example, implements synchronization primitives in microcode. Habermann showed that with each semaphore we can associate three auxiliary variables: s(t), the number of sends; w(t), the number of attempted waits; and r(t), the number of received signals, as of time t. The relation r(t) = min[s(t), w(t)] is invariant and can be used to establish the correctness of solutions to many synchronization problems. Brinch Hansen has refined this method, and Howard has recently shown how to generalize it with history variables for use with arbitrary monitors.
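Habermann's invariant can be checked mechanically by replaying a trace of semaphore operations. The sketch below (invented trace, initial semaphore count zero) tracks s(t), w(t), and r(t) and asserts r(t) = min[s(t), w(t)] after every event:

```python
def replay(events):
    """Replay a trace of 'send'/'wait' events on a semaphore with initial
    count 0, tracking Habermann's auxiliary variables s(t), w(t), r(t)."""
    count = blocked = 0
    s = w = r = 0
    for ev in events:
        if ev == 'send':
            s += 1
            if blocked:
                blocked -= 1
                r += 1              # a blocked waiter receives this signal
            else:
                count += 1          # signal is banked in the count
        else:                       # 'wait'
            w += 1
            if count:
                count -= 1
                r += 1              # the wait is satisfied immediately
            else:
                blocked += 1        # the process blocks
        assert r == min(s, w)       # Habermann's invariant holds throughout
    return s, w, r

print(replay(['wait', 'send', 'send', 'wait', 'wait']))   # (2, 3, 2)
```

The invariant holds at every step of every interleaving, which is what makes it a usable assertion in correctness proofs.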
Current operating systems research is building on the significant concepts of the past. Now that CPU scheduling and memory management are well understood, most investigators assume that the operating system comprises multiple concurrent processes executing in virtual memory. Attention is shifting back to the user interface, concentrating on such problems as information security and system structure.
Protection and sharing of information are essential but conflicting goals. It is extremely difficult to guarantee that, under all conditions, a process may acquire all and only that information to which it is entitled. All implemented protection and sharing mechanisms have been found to contain some form of a "loophole", though some systems are certainly more reliable and secure than others.
An important source of security violations is the system's access control mechanism or access policy specification. One attractive solution is based on a security kernel -- a minimal set of operating system programs which, if correct, will guarantee the correctness of all access-controlling operations throughout the system. Several projects to construct such a kernel are underway [Walter 1974; Robinson 1975; Millen 1976].
The concept of a "capability machine" has received considerable attention as a basis for a highly reliable and secure computer system. Dennis and Van Horn proposed a basic design, in which processes had lists of capabilities specifying accessible objects (processes, segments, procedures, capability lists) and allowable access modes to them; there was a small set of commands for accessing objects and manipulating capability lists. The Hydra operating system is an experimental capability machine emphasizing the implementation of "abstract data types" [Wulf 1974; Cohen 1975]. The Plessey 250 [England 1972] has demonstrated the feasibility of a capability architecture, and the concept may appear in future computer architectures.
It is often useful to verify, during compilation, a program's authorization to access objects [Conway 1972; Brinch Hansen 1973]. This is because assurances are needed that a program is correct and reliable before it is put into operation. However, compiler checking cannot prevent hardware malfunctions or data entry errors; thus, run-time checking is also needed for a highly secure and reliable system. The required run-time checks are simple extensions of virtual memory address translations [Denning 1976; Linden 1976].
Formal treatments of protection problems began with the access matrix concept [Lampson 1971; Denning 1971; Graham 1972]. All existing implementations of access protection can be deduced from this concept. By showing that the states of an access matrix can simulate the configurations of a Turing machine, Harrison, et al. proved that the safety question -- can a given user ever acquire a given access right to a given object? -- is computationally intractable. Even if we can prove that a system's access control is safe, there are other ways to violate security. "Leakage channels", which pass encoded messages along legitimate access paths, are virtually impossible to seal [Lampson 1973]. Though these results do not prove that the safety question is intractable in all practical systems, they do help explain why these questions are persistently difficult to answer.
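A minimal sketch of the access matrix concept, with invented subjects and objects: entries are sets of rights, and every access is mediated by a lookup in the matrix.

```python
class AccessMatrix:
    """Lampson's access matrix: rows are subjects, columns are objects,
    and each entry is the set of rights the subject holds on the object."""
    def __init__(self):
        self.rights = {}    # (subject, object) -> set of rights

    def grant(self, subject, obj, right):
        self.rights.setdefault((subject, obj), set()).add(right)

    def check(self, subject, obj, right):
        """Mediate an access: allowed only if the right is in the entry."""
        return right in self.rights.get((subject, obj), set())

m = AccessMatrix()
m.grant('alice', 'payroll.dat', 'read')
print(m.check('alice', 'payroll.dat', 'read'))    # True
print(m.check('alice', 'payroll.dat', 'write'))   # False
```

The intractability result cited above concerns not this check, which is trivial, but the safety question: whether any sequence of grant and transfer commands can ever place a given right in a given entry.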
To handle information security problems which are not conveniently represented within the access matrix model, various research projects have been established. These projects have identified information security problems at three successively more difficult levels:
The impact of security research on major commercial systems has been slight. Only two major vendor systems offer appreciable protection capabilities: IBM's VM/CMS/370 and Honeywell's Multics. VM/CMS achieves protection at the expense of sharing by establishing disjoint address spaces for each copy of its individual user operating systems. Multics attaches access rights to each object in its file system and also employs special "rings of protection" hardware [Saltzer 1975]. The HYDRA [Wulf 1974; Cohen 1975; Wulf 1975] operating system for the C.mmp multiprocessor is an experimental capability machine.
To be useful, the abstractions implemented by an operating system must correspond to the semantic concepts people use to solve problems. This requires narrowing the gap between languages and operating systems. There has been much recent investigation of semantic concepts and the corresponding structures in operating systems [Jones 1977]. Besides enhancing problem solving ability, this research is helping to simplify systems, documentation, and maintenance.
Errors are committed often in designing operating systems. Accordingly, a considerable effort has been directed toward program structures that are amenable to being proved correct. Concurrency makes operating system structures more difficult to prove correct than the sequential processes embodied in most programming languages. A major thrust of the research has, therefore, emphasized correctness of computation with concurrent operations.
Despite the prominence of the top-down design approach in the structured programming literature [Zurcher 1969; Dijkstra 1971; Parnas 1972], operating system designers seem to have had much success with techniques strongly resembling the bottom-up approach [Dijkstra 1968; Liskov 1972; Robinson 1975] -- especially level-structured systems and monitors. Habermann, Flon, and Cooprider proposed a method of integrating top-down design with bottom-up implementation.
In his study of concurrent processes, Dijkstra showed that the major timing and sequencing problems could be solved with the primitive P and V operations on semaphores. Habermann exhibited an invariant relation among sent and received signal counts for P and V operations, demonstrating that formal proof techniques can be applied to concurrent programs. Howard showed that invariant relations among history variables can be used to prove the correctness of very general concurrent structures. Owicki and Gries have studied axiomatic systems for concurrent programs.
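The mutual exclusion problem solved by P and V can be sketched with a binary semaphore; here Python's threading.Semaphore stands in, with acquire and release corresponding to P and V (the counter workload is invented for illustration):

```python
import threading

counter = 0
mutex = threading.Semaphore(1)   # binary semaphore: acquire = P, release = V

def increment(n):
    global counter
    for _ in range(n):
        mutex.acquire()          # P: enter the critical section
        counter += 1             # read-modify-write, now indivisible
        mutex.release()          # V: leave the critical section

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 40000: no updates are lost under any interleaving
```

Without the P/V bracketing, two processes could read the same old value of the counter and one update would be lost; the semaphore makes the read-modify-write sequence indivisible.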
A collection of parallel programs can be regarded as implementing a network of servers that accomplishes some given task. Rather than proving termination, as must be done for ordinary sequential programs, one must prove that all work submitted is eventually completed. Dijkstra suggested that a hierarchical structure of processes, with downward calls and upward returns, could easily be proved to complete all work by showing that each process returns to a "homing position". Brinch Hansen has refined this argument for processes that communicate by message buffers. The "correctness" of arbitrary networks of processes can be established so long as each task in the network progresses toward completion at each step through the network [Lauesen 1975].
The concept of monitors has attracted considerable attention as a method of organizing system functions to avoid synchronization problems. A monitor is a set of procedures managing a given set of hardware or software resources. It includes a locking mechanism that guarantees indivisibility of monitor operations by allowing at most one process to execute inside the monitor at a time. A language notation for monitors based on SIMULA classes was proposed by Brinch Hansen. Hoare has clarified its properties and shown axiomatic approaches for proving a monitor's correctness; Howard extended this by showing how to use history variables as a basis for the invariant assertions needed in the proofs. A small operating system of very limited capability, SOLO, was written in 1300 statements of Concurrent Pascal, a language implementing processes and monitors [Brinch Hansen 1976; Brinch Hansen 1977].
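The monitor discipline can be sketched with a lock and condition variables, here for a bounded buffer (an invented example in Python rather than Concurrent Pascal): the lock admits one process at a time, and deposit and remove wait on conditions when the buffer is full or empty.

```python
import threading

class BoundedBufferMonitor:
    """A monitor in the Hoare / Brinch Hansen style: one lock guards all
    procedures, so at most one process executes inside the monitor at a
    time; condition variables suspend callers that cannot yet proceed."""
    def __init__(self, capacity):
        self.buf, self.capacity = [], capacity
        self.lock = threading.Lock()
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)

    def deposit(self, item):
        with self.lock:                        # enter the monitor
            while len(self.buf) >= self.capacity:
                self.not_full.wait()           # buffer full: wait outside
            self.buf.append(item)
            self.not_empty.notify()            # wake a waiting consumer

    def remove(self):
        with self.lock:                        # enter the monitor
            while not self.buf:
                self.not_empty.wait()          # buffer empty: wait outside
            item = self.buf.pop(0)
            self.not_full.notify()             # wake a waiting producer
            return item

buf = BoundedBufferMonitor(2)
out = []
consumer = threading.Thread(target=lambda: out.extend(buf.remove() for _ in range(3)))
consumer.start()
for x in (1, 2, 3):
    buf.deposit(x)
consumer.join()
print(out)   # [1, 2, 3]
```

All shared state is touched only while the monitor lock is held, so the invariants on the buffer need to be verified only at the monitor's procedure boundaries, which is the source of the proof simplifications noted above.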
Despite the simplifications which monitors may allow in a system and its proof methods, there is a danger: the requirement of mutual exclusion of all operations within a given monitor may introduce excessive queueing at the monitor entries. Early versions of Multics, for example, could not support three processors very well because contention on the ready-list lock blocked the third processor most of the time. Present versions of Multics employ a large number of separate locks, each for a small portion of the ready list; but even this scheme loses 7% of all available time among 4 processors in waiting for locks to be released. Another possible problem with the centralization inherent in monitor-based scheduling may be incompatibility with distributed architectures. The resolution of these questions awaits performance studies of multiprocessor systems based on monitors.
As important as the correct operation of an operating system is its continuous and reliable operation. An operating system must provide reliable service despite hardware malfunctions, program flaws, and data entry errors. The need for fault tolerance may place severe constraints on the data structures and algorithms used to provide services to the users. Current approaches to fault tolerance are based on extensive checkpointing to allow recovery in case of a detected error, or on recording operations so that the system may recover from an error by reversing the effect of all actions back to a point prior to the error. These techniques are expensive and used only in special cases. Randell has been studying the possibility of more efficient checkpointing based on "recovery blocks" declared in the programming language. There is a strong case that capability machines have a high degree of fault tolerance, but the users of these machines must be willing to accept the severe constraint of small protection domains [Denning 1976; Linden 1976].
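The recovery-block scheme can be paraphrased in a few lines: checkpoint the state, run the primary alternate, and if its acceptance test fails (or it raises an error), restore the checkpoint and try the next alternate. This is a minimal sketch of the idea, not Randell's notation; the function names are invented:

```python
import copy

def recovery_block(state, alternates, acceptance_test):
    """Try each alternate in turn against a checkpointed copy of the
    state; commit the first result that passes the acceptance test."""
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)    # establish a recovery point
        try:
            result = alternate(checkpoint)
            if acceptance_test(result):
                return result                # acceptance test passed
        except Exception:
            pass                             # discard damaged state, retry
    raise RuntimeError("all alternates failed the acceptance test")

# Example: a flawed fast routine backed up by a slow reliable one.
def fast_but_wrong(xs):
    return -1                                # simulated program flaw

def slow_but_safe(xs):
    return sum(xs)

total = recovery_block([1, 2, 3],
                       [fast_but_wrong, slow_but_safe],
                       acceptance_test=lambda r: r >= 0)
print(total)  # -> 6
```

The efficiency question the text raises is visible even here: the `deepcopy` checkpoint is the expensive step, which is why language-level recovery blocks try to checkpoint only what an alternate can actually modify.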
This research is influencing, and is influenced by, emerging architectural concepts such as distributed computing and multiprocessor/multimemory computer systems. The HYDRA operating system was designed for the C.mmp architecture (16 PDP-11 processors and 16 memory units) [Wulf 1974; Wulf 1975]. The CM* project [Swan 1977], among others, is developing an operating system for architectures involving large numbers of processors. The RSEXEC operating system developed by BBN for the DEC-10 computers on the ARPAnet [Thomas 1973] illustrates an operating system for a computer communication network.
There are currently at least three models for the organization of distributed systems.
Operating systems is a maturing field. Research has produced a body of knowledge for building useful, powerful operating systems for conventional machine architectures. New operating systems are being worked on every day. New operating systems are brought on-line at military installations monthly. Microcomputer and minicomputer manufacturers are developing operating systems for their new products. Mainframe manufacturers are producing new operating systems while enhancing existing systems.
This high level of activity reflects implementation efforts rather than new research. This activity emphasizes gradual change to existing systems; it avoids the revolutionary. Minicomputer vendors are modifying their systems to compete better with major products. They intend to develop operating systems well grounded in the existing research results. However, radical new concept bases are not in evidence.
There are relatively few operating systems design or implementation projects at present in the research community. The major reason for this is the lack of facilities to support research-oriented implementation projects in universities and research laboratories. There is presently little motivation for universities to undertake large research projects on conventional architectures. However, future research on reliability could quickly renew the motivation for experimental implementations.
Another reason for the few operating systems research projects is the changing set of research topics. Classical operating systems research has concentrated on mechanisms to solve problems such as memory management and resource scheduling. Such mechanisms have been developed and installed in conventional architectures. The important current problems for conventional systems deal with policy: how best to use the existing mechanisms. An illustration of this is the problem of functionality, which asks: what kinds of abstract machines should be created for users? We have yet to see serious efforts to characterize the functions which will allow the most efficient execution of a given workload.
Despite such constraints, the research community will be active. Among its concerns will be the long-neglected user interface. Standard operating system facilities will be accessible by language notations rather than as arcane supervisor calls, and more semantic concepts will be supported efficiently by the operating system. Severe structural constraints and greater type-checking will produce more understandable programs and permit better documentation and maintenance.
Research will continue on the problems of constructing reliable, correct, fault tolerant systems, including methods for constructing programs which are convincingly correct before being put into operation. These methods will stress integration of language facilities and the operating system. Users will need to accept constraints on their freedom to produce unreliable software, but will find that reliable systems are also secure systems.
Research in secure computing will also continue. It will focus on auditing and detecting suspicious patterns of use; on practical methods of shielding small data bases from compromise under statistical queries; and on encryption of data in detachable storage or in transmission on communication networks.
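One of the practical shields mentioned above, protecting small data bases from compromise under statistical queries, can be illustrated by the query-set-size restriction: refuse any statistical query whose query set is too small, or too close to the whole data base, since such answers can be combined to reveal individual records. The threshold and data below are illustrative:

```python
K = 3   # illustrative minimum query-set size

def safe_count(records, predicate, k=K):
    """Answer a statistical COUNT query only if the query set is
    neither too small nor too close to the whole data base."""
    n = sum(1 for r in records if predicate(r))
    if n < k or n > len(records) - k:
        raise PermissionError("query set too small or too large")
    return n

salaries = [("ann", 40), ("bob", 55), ("cho", 62), ("dee", 48),
            ("eli", 71), ("fay", 59), ("gus", 44), ("hal", 66)]

print(safe_count(salaries, lambda r: r[1] > 50))   # 5 matches -> allowed
```

A query that singles out one individual (e.g., matching only "ann") is refused, though research of the period showed that size restriction alone does not defeat sequences of overlapping queries, which is why this remains an active problem.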
New architectures will be respectable subjects of operating systems research. The decreasing cost of hardware allows serious consideration of approaches which previously were discarded as being too expensive. Cheaper hardware will enable dedicating processors and memory units to specific functions, thereby mitigating control problems created by sharing limited hardware resources. Sharing will continue to be important, but the focus will be on sharing information rather than hardware.
Microprocessors will open new applications involving computer controlled systems and distributed information. In such dedicated systems, operating system software will tend to merge with programming language and application software ("vertical integration"), introducing new implementation and structuring problems.
The possibility of dedicating hardware resources to specific functions will mean that functional modules of the system will interact only via the communication links among them. Predicting throughput and response time will be much more reliable than it is today, and queueing models will play a stronger role in designing systems. We will thus see significant progress toward "guaranteed performance systems".
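The role foreseen for queueing models can be made concrete with the simplest case: a dedicated functional module approximated as an M/M/1 server, for which mean response time has the closed form R = 1/(mu - lam), with arrival rate lam and service rate mu (lam < mu). The numbers below are purely illustrative:

```python
def mm1_response_time(lam, mu):
    """Mean response time of an M/M/1 server: R = 1 / (mu - lam)."""
    if lam >= mu:
        raise ValueError("unstable queue: arrival rate >= service rate")
    return 1.0 / (mu - lam)

# A module serving 8 requests/s with 6 requests/s arriving:
print(mm1_response_time(6.0, 8.0))   # -> 0.5 seconds
```

Because modules interacting only over communication links have well-defined arrival and service streams, such formulas give the kind of throughput and response-time predictions that "guaranteed performance systems" would require.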
Decentralization -- distributed processing -- will be a major theme of future operating systems. The greater need for communication among computers will inspire new approaches to sharing information and synchronizing processes. Decentralization will also direct more attention toward data flow concepts -- languages and hardware. Multiprocess systems with many processors and networks of computer systems will create new problems for the research community to solve.