This report demonstrates the feasibility of sharing a Nova 3/D minicomputer at the level of the basic machine interface (hardware/firmware environment). Sharing is made possible by the creation of a privileged software nucleus called a virtual machine monitor. Virtualization of the Nova is shown at three levels: theory, design, and implementation.
Chapter 1 defines the concept of a virtual machine monitor, demonstrates the theoretical feasibility of virtualization, discusses the history and utility of virtualization, and specifies the Nova 3/D virtual machine. Chapter 2 discusses initial design considerations, describes the top level decomposition of the monitor, discusses the notion of virtual processor states and their transitions, and describes some monitor components referenced in later decomposition. Chapter 3 describes the decomposition of a single top level component, the trap process, which handles all real machine traps. Chapter 4 describes the decomposition of all other top level components.
Chapter 5 concerns implementation. It describes the systems programming language used, describes needed systems variables and tables, and explains the function of all required programs.
Three appendices are provided. The first describes how to run a systems program on the Nova using the facilities provided by RDOS (real time disk operating system). The second demonstrates how to run the UT virtual machine monitor. The third presents the monitor command language syntax.
This report is a revision of a Master's Thesis of the same
This chapter explores the possibility of virtualizing a Nova 3/D (with Memory Management and Protection Unit). First, the notion of a virtual machine monitor is defined and a demonstration of the theoretical feasibility of virtualizing a Nova is presented. Next, a discussion of the history and utility of virtualization is given. Finally, a specification for the Nova 3/D virtual machine is stated.
A virtual machine monitor (VMM) is a privileged software nucleus which creates copies of the basic machine interface on which it runs. These copies are called virtual machines. A basic machine interface is defined by Buzen and Gagliardi as "the set of all software visible objects and instructions that are directly supported by the hardware and firmware of a particular system." Figure 1 depicts this virtual machine architecture.
In order to distinguish VMMs from system software with similar functionality, several authors have cited three defining requirements. These are:
A virtual machine monitor may also be defined as a collection of processes which create virtual processors and virtual devices. This definition serves to make two points. First, a VMM need not be a single sequential computation; rather, it is a collection of independent computations which could execute concurrently. Second, the notion of virtual machine may be decomposed into the notions of virtual processor and virtual device.
In order to determine the precise function of a virtual machine monitor, it is helpful to consider the relation between real and virtual processors (devices). A real processor (of the type this report concerns) is composed of two kinds of resources: actions and store. The store serves to hold machine state and consists of main memory and special registers. Actions cause state to state transitions and include facilities such as instruction interpretation, interrupts, and traps. The relation between real and virtual processors is indicated with respect to these two resources.
The store of a virtual processor is logically equivalent to the store of a real processor in that it defines an equivalent state space. However, there are physical differences. Special registers of the virtual processor are not necessarily supported by real special registers; rather, they may be maintained as virtual special registers in the main memory of a real processor. Likewise, the main memory of a virtual processor may be maintained as a virtual memory on a backing store such as a disk or in real main memory. In order for the actions of the real processor to cause state to state transitions on the virtual store, it is necessary to copy portions of the virtual store into their logical equivalents in the real store. When this occurs, those portions of the virtual store that have been copied in are said to be realized or mapped.
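The notion of realizing part of a virtual store can be illustrated with a minimal sketch. It is not taken from the monitor itself: the `Store` class and the `realize` and `unrealize` functions are hypothetical names, and the store is reduced to accumulators, carry bit, and program counter for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """A simplified store: four accumulators, a carry bit and a program counter."""
    acs: list = field(default_factory=lambda: [0, 0, 0, 0])
    carry: int = 0
    pc: int = 0


def realize(virtual: Store, real: Store) -> None:
    """Copy the virtual special registers into their logical equivalents in the real store."""
    real.acs = list(virtual.acs)
    real.carry = virtual.carry
    real.pc = virtual.pc


def unrealize(real: Store, virtual: Store) -> None:
    """Copy the real registers back, so the virtual store reflects the real actions."""
    virtual.acs = list(real.acs)
    virtual.carry = real.carry
    virtual.pc = real.pc


vp = Store(acs=[1, 2, 3, 4], carry=1, pc=0o400)  # virtual store kept in main memory
cpu = Store()                                    # stands in for the real registers
realize(vp, cpu)
cpu.acs[0] += 1   # a real processor action acts on the realized store
cpu.pc += 1
unrealize(cpu, vp)
```

While the virtual store is realized, real actions change it directly; the copy back is needed only when the real processor is taken away.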
Actions of a virtual processor consist of those actions of the real processor which cause state transitions on the realized virtual store as well as actions, supported by the VMM, which cause transitions in the virtual store.
Real devices are also composed of actions and store. A device store is a set of registers visible to software running on the real processor which controls the device and holds device state. In this context, memory associated with a device is not considered part of the store. Device actions cause state transitions on the device store corresponding to the operation of a real device. A virtual device store may be directly supported by the real store or may be maintained in main memory of the real processor. In both cases, access to the virtual device store by software running on a virtual processor constitutes a virtual processor action supported by the VMM. Virtual device actions are also supported by the VMM and correspond to the operation of a virtual device, which is mapped to a real device or simulated.
From the previous discussion, it is apparent that a virtual machine monitor must function in two fundamental ways. First, it must allocate the resources of the real machine. This means creating and maintaining a virtual store for each virtual processor, creating virtual devices, mapping virtual devices to real devices, and assigning the real processor(s) to virtual processors. Second, the VMM must support actions of the virtual processors and devices which cause transitions directly in the virtual store. This includes interpretation of some instructions, simulation of traps and interrupts, and simulation of the actions of virtual devices.
Intuitively, a real machine is virtualizable if all instructions which may potentially disrupt the system can be arrested before execution and if the state of the real processor before the arrest can be recovered. In this way, a virtual processor which is assigned the real processor may be prevented from interfering with other virtual processors or the VMM. To demonstrate feasibility, it is first necessary to indicate which instructions are disruptive. A formal model of Popek and Goldberg is presented which identifies potentially disruptive instructions and defines a formal requirement for virtualization which the instruction set must meet. Next, the architecture of the Nova 3/D is reviewed in the context of the formal model. The Nova 3/D instruction set is then inspected; it is shown that the Nova 3/D nearly satisfies the formal requirement and that problems may be overcome with only a small loss of equivalence. Finally, it is shown that the Nova 3/D provides for the recovery of real processor state after a potentially disruptive instruction is arrested.
The model of Popek and Goldberg concerns machines having address translation hardware and more than one mode of operation. Address translation is a prerequisite for virtualization since a virtual processor requires a logical address space equivalent to that of a real processor. Logical addresses are those generated by a running program. Modes provide a mechanism for partitioning the instruction set. The requirement for virtualization suggested by the model concerns the manner in which the instruction set is partitioned.
According to Popek and Goldberg, machine state is specified by the following four components.
In the Popek and Goldberg model, special registers associated with devices and the CPU were excluded for simplicity. They will be included here for completeness. To do so does not diminish the model's validity. Hence, a fifth component of state is:
In summary, the state of a computer is specified by the five
Before proceeding to the requirement for virtualization, several definitions are required. An instruction is said to trap when
Status information is normally defined as the address of the instruction causing the trap, the address which actually caused the trap, and the condition which caused the trap. An instruction is said to be privileged if it executes in supervisor mode and traps in user mode. An instruction is said to be sensitive if it is either control sensitive or behavior sensitive. Control sensitive instructions are those which alter M, R, or I. Instructions are behavior sensitive if their execution is dependent upon their location in physical memory or upon processor mode. An instruction which is not sensitive is said to be innocuous.
Assume a computer exists whose state is specified by the above five components and which has both innocuous and sensitive instructions, some of which are privileged. Is this machine virtualizable? Popek and Goldberg have shown that it is, provided the set of sensitive instructions is a subset of the set of privileged instructions. Informally, this means that user mode programs cannot issue halt commands, change mode except through traps, alter relocation registers, control devices, or alter any component of I. An attempt to do so should always result in a trap.
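The subset requirement can be stated as a small set computation. The following is only an illustrative sketch; the function names and the instruction names in the example sets are hypothetical and do not describe any particular instruction set.

```python
def is_virtualizable(sensitive, privileged):
    """True when every sensitive instruction is also privileged."""
    return set(sensitive) <= set(privileged)


def offending(sensitive, privileged):
    """Sensitive instructions that would execute in user mode without trapping."""
    return sorted(set(sensitive) - set(privileged))


# Hypothetical instruction names, for illustration only.
privileged = {"io", "halt", "load_map"}
sensitive_ok = {"io", "halt", "load_map"}
sensitive_bad = sensitive_ok | {"mode_switch_jump"}
```

An instruction set passes the test only when `offending` returns an empty list; any name it returns identifies an instruction that must be handled by some other means.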
In the context of the formal model, the architecture of the Nova 3/D may be reviewed. For a complete description of the hardware, refer to the Data General manuals.
Special registers of the central processor are the on/off switch, the interrupt enable flag (CPU busy flag), the power fail flag (CPU done flag), data switches, and the interrupt request line, which holds the addresses of all devices requesting interrupts. Special registers of the MMPU other than the relocation registers are the status register, which contains mode, protection, and map selection flags; the violation data register, which indicates the condition(s) causing the last trap; the violation address register, which indicates the logical address of the instruction causing the trap; the page check register, which is used to determine the contents of a map register; and the MMPU done flag, which indicates a protection violation during data channel mapping.
Protection flags of the status word are used to enable traps when user mode programs execute I/O instructions, use auto-increment/decrement locations, write or access protected pages, or execute more than fifteen consecutive defer cycles. Write and access protection is specified on a page by page basis by setting bits in the relocation registers. Auto location protection permits instructions, which indirectly address logical addresses 20 to 37 (octal), to be trapped.
Having reviewed the Nova 3/D architecture, it is appropriate to examine the instruction set, identify sensitive instructions, and verify that they are privileged. Should all sensitive instructions be privileged, then the formal requirement will be met and one aspect of feasibility will be demonstrated. The Nova 3/D with MMPU has seven classes of instructions: memory reference, arithmetic and logical, stack, trap, device I/O, CPU control, and MMPU control. The latter two classes are forms of the device I/O instruction; however, it is useful to consider them separately. In the following discussion, the assembly language mnemonic for each instruction is given in parentheses.
Because only eight bits are available for a memory address, addressing is done either directly to the first 256 words of memory or relative to the program counter or an index register. Relative addressing uses a signed displacement ranging from minus 128 to plus 127 words. Indirect addressing is also provided.
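The address arithmetic described above can be sketched as follows. This is an illustration, not the hardware definition: the function names are hypothetical, the 15-bit wraparound is an assumption based on a 32,768 word address space, and indirect addressing is left out.

```python
ADDRESS_MASK = 0o77777  # assumption: 15-bit logical addresses (32,768 words)


def sign_extend8(disp8):
    """Interpret the low eight bits as a signed displacement in -128..127."""
    disp8 &= 0xFF
    return disp8 - 256 if disp8 & 0x80 else disp8


def effective_address(disp8, base=None):
    """Direct addressing when base is None, otherwise base-relative addressing.

    base is the current value of the program counter or of an index register.
    """
    if base is None:
        return disp8 & 0xFF  # one of the first 256 words of memory
    return (base + sign_extend8(disp8)) & ADDRESS_MASK
```

For example, `effective_address(0xFF, base=0o1000)` yields octal 777, one word below the base, while `effective_address(0xFF)` addresses word 255 directly.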
Since memory reference instructions do not alter M, R, or I, they are not control sensitive. If they do not indirectly address the auto increment/decrement registers, they are also not behavior sensitive and, hence, are innocuous. Indirectly addressing the auto locations constitutes a behavior sensitive instruction because the outcome depends upon mode. In supervisor mode, the results are as expected. However, in user mode, proper incrementing or decrementing will not occur if logical addresses 20 to 37 (octal) are not mapped to physical addresses 20 to 37. In this case, the sensitive instructions are also privileged because the MMPU can trap attempts to access these locations indirectly while in user mode.
In a single instruction, the source and destination accumulators are specified and the initial value of the carry bit is given. Also, options are available for shifting one place to the right or left, swapping bytes (high or low order eight bits), not loading the results, and skipping the next instruction conditionally or unconditionally.
The Nova 3/D also provides two additional instructions. These are:
The no operation command may also be used with I/O instructions addressing the CPU or MMPU. In the data in and data out instructions, the A, B, and C variations refer to data buffers. The control commands start (S), clear (C), and pulse (P) may be used with any I/O instruction. Depending on the device addressed, the interpretation of an I/O command is different. Because I/O commands can alter I, they are control sensitive. However, they are also privileged since the MMPU can trap I/O instructions while in user mode.
The second type of I/O instructions are those which alter the program counter based on the state of the busy or done flags of I/O devices, the CPU, and the MMPU. For the moment, instructions addressing the CPU and the MMPU are considered with those addressing devices for convenience. Strictly speaking, these instructions are not sensitive as they only alter P and cannot change I. Nonetheless, they are privileged since the MMPU can trap I/O instructions. These instructions are:
If the control command S is issued with any instruction addressing the CPU, the CPU busy flag is set to one and interrupts are enabled. Similarly, if a C command is issued, the busy flag is cleared and interrupts are disabled. The done flag (power fail indicator) is also cleared.
If a C control command is issued with an instruction addressing MAP, the violation data register and the MMPU busy and done flags are cleared. If a P is issued, mapping is enabled for the next data fetch. If a C is issued with an instruction addressing MAP1, all internal MMPU logic is initialized.
One instruction not covered explicitly by the MMPU commands is mode change from supervisor to user. To change modes, the program map enable flag (bit 0) of the MMPU status register is set to one and the mapping inhibit flag (bit 2) is set to zero. Mode change occurs on the next defer (indirect address) cycle, which should be a jump indirect through a user program counter. The jump instruction is sensitive as it alters M; however, it is not privileged.
The preceding review of all instructions has shown that the virtualization requirement of Popek and Goldberg is almost satisfied. Only the jump to user mode instruction creates a virtualization problem. If this problem can be overcome, the first aspect of feasibility is demonstrated.
There are several solutions to the virtualization problem created by the mode switch instruction. First, the VMM could simply fail to support the virtual MMPU facility. This is extremely easy to implement but would result in an unacceptable loss in equivalence. Second, every instruction following the status word change could be interpreted by the VMM. No alterations in real processor hardware or virtual processor software would be required; however, the VMM could become large and complex. Third, the real processor could be altered to trap on defer cycles. No changes in virtual processor software would be required; however, the impact on other operating systems using the same real processor would be unpredictable. Finally, virtual processor software could be required to issue a special explicit trap instruction immediately before the jump to user mode. No hardware changes would be required; however, a small loss of equivalence to the real processor would exist.
Of these four solutions, the last is chosen due to its relative ease of implementation and minor loss of equivalence. Existing programs may be altered with small complication since defer cycles following status word changes may be easily detected. Trap number 127 is used to indicate mode switch and should not be used by virtual processor software for any other reason. This number is selected because it is the largest and perhaps the least likely to be used by virtual processor software.
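One way a monitor might act on this convention is sketched below. Only the trap number 127 comes from the text; the class, the function, the `mode_switch_pending` flag, and the choice to reflect all other trap numbers to the virtual machine as virtual traps are assumptions made for illustration.

```python
MODE_SWITCH_TRAP = 127  # trap number reserved for announcing a mode switch


class VirtualProcessor:
    """Hypothetical record of the state this sketch needs."""

    def __init__(self):
        self.mode_switch_pending = False
        self.reflected_traps = []


def handle_explicit_trap(vp, trap_number):
    """Classify an explicit trap issued by virtual processor software."""
    if trap_number == MODE_SWITCH_TRAP:
        # The jump to user mode is expected next; the monitor must handle it.
        vp.mode_switch_pending = True
        return "mode-switch"
    # Any other trap number belongs to the virtual machine's own software.
    vp.reflected_traps.append(trap_number)
    return "reflected"
```

The sketch makes the cost of the convention visible: trap number 127 is no longer available to virtual processor software, which is the small loss of equivalence accepted above.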
In order to give a complete demonstration of virtualization feasibility, it is necessary to show that the state of the processor before the trap may be recovered after the trap. Relevant processor state is defined by the program counter, the carry bit, the accumulators, the stack pointer, and the frame pointer. Traps correspond to violations due to I/O protection, auto location protection, access (validity) protection, write protection, and defer protection. In all cases, the program counter prior to the trap is saved in the MMPU violation address register.
Whenever a trap occurs, the instruction causing the trap does not execute, with one exception. An instruction which does not execute cannot change the relevant processor state other than the program counter; hence, the state before the trap is recoverable.
The exception concerns the SAV and RET stack instructions. In all stack operations, the stack and frame pointers are not updated until the operation is completed; thus, there is no danger of losing these pointers if a write or validity protection violation occurs during the operation. However, accumulators and main memory may be altered corresponding to the progress of the operation before the trap. This is of no consequence since only the program counter is required to restart the instruction. Upon restart, the altered components will have their values recopied.
According to Buzen and Gagliardi, the use of I/O processors and multiprogramming in the early 1960s created "very serious potential problems for system integrity." The creation of dual state architectures was a step towards solving these problems. Software could then be run in two modes: one which allowed access to all machine facilities and one which allowed only nondisruptive instructions to be executed. Only a privileged software nucleus which created an "extended" machine was allowed to run in the privileged mode. This solution created further problems. Programs designed to run on an extended machine could be transported only to facilities having an identical extended machine. More than one copy of privileged software could not be run concurrently, precluding development and modification of privileged software without a dedicated machine. Also, hardware test and diagnostic software could not be run concurrently with privileged software.
A method was needed for sharing a computer at the lowest level, the basic machine interface. If a single computer presents several basic machine interfaces which are completely isolated from one another, the above problems are solved. Transported programs may run on the appropriate extended machine concurrently with programs running on different extended machines. Similarly, programs designed to run on the basic machine interface may be run concurrently.
IBM began development of virtual machine systems in 1964 with the creation of CP-40 (Control Program 40), which ran on a System/360 model 40 modified to support virtual storage. The virtual machine created by CP-40 did not support address translation. Later, CP-67 emerged, running on a System/360 model 67. This system did support virtual machines having virtual address translation. A single user operating system, CMS (Cambridge Monitor System), was developed concurrently to extend the created basic machine interfaces.
According to Meyer and Seawright, IBM's objectives in this effort were to research timesharing methods, to examine hardware requirements for timesharing, to develop an in-house timesharing system, and to develop performance analysis techniques. A later IBM development was the VM/370 system which created virtual 370s capable of supporting all the System/360 and System/370 operating systems. IBM also developed the virtual-machine-like system M44/44X, which ran on a modified 7044, and the System 360/30, a single virtual machine system.
Other virtual machine systems are the Michigan Terminal System (MTS), which supports virtual 360s and runs on the System/360 model 67; the PDP 10 system, which runs on a modified PDP 10 at MIT; the HITAC 8400 system, which runs on a HITAC 8400 (RCA Spectra 70/45); and the UCLA VM system, which runs on a modified PDP 11/45.
If virtual machine methods are used to implement timesharing systems, sharing of data between users is prevented. This is because the virtual machine monitor is aware of all users but has no knowledge of any file structures. Conversely, user operating systems know about file structures but are unaware of each other. Several researchers have developed techniques to permit data sharing in VM/370.
All virtual machine systems mentioned above, as well as the one described in this paper, use traps and simulation to support virtual machines. Goldberg has pointed out that these methods are "clumsy and awkward." He proposes that hardware virtualizers be used to support virtual machines. These are hardware/firmware devices which provide a mapping function between virtual resources and real resources. The hardware virtualizer must store maps, activate virtual machines, compose maps, and pass control to the VMM after a map fault. A complete description of this concept is given in .
Several authors have enumerated the uses of virtual machine systems. These ideas are summarized below. Virtual machine systems provide the capacity
Before the design of a virtual machine monitor can begin, it is necessary to give a specification for the created virtual machines. This specification describes objects visible to software running on a virtual machine as well as the operation of virtual actions.
The virtual store consists of the same components as the real store. Virtual main memory is restricted to thirty-two pages or 32,768 words. Store components include accumulators, stack pointer, frame pointer, carry bit, CPU busy and done flags, interrupt request line, and data switches. Store associated with devices and the MMPU is considered separately.
All instructions which may be executed on the real machine may be executed on the virtual machine, with one exception. A jump to user mode instruction will not execute properly unless it is immediately preceded by an explicit trap with trap number set to 127.
The virtual store of the MMPU consists of the same components as the real MMPU. These are the two program maps, the two data channel maps, a status register, a violation address register, a violation data register, a mode switch (busy flag), and a data channel error flag (done flag). Virtual memory management, that is, address translation directed by virtual maps, is supported. All five protection features (write, validity, auto location, I/O, and defer) are also supported. Virtual violations occur on virtual machines just as real violations would occur on real machines. The consequences of virtual violations are also equivalent.
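Address translation directed by a map can be illustrated with a short sketch. The page size of 1024 words follows from thirty-two pages covering 32,768 words; the function names and the representation of a map as a dictionary from logical page to physical page are assumptions, and protection bits are ignored.

```python
PAGE_SIZE = 1024   # words per page: 32,768 words / 32 pages
PAGE_COUNT = 32    # pages in a virtual main memory


def split_address(logical):
    """Split a logical word address into (page number, offset within page)."""
    if not 0 <= logical < PAGE_SIZE * PAGE_COUNT:
        raise ValueError("logical address outside the 32-page address space")
    return divmod(logical, PAGE_SIZE)


def translate(logical, page_map):
    """Translate a logical address using a map from logical to physical pages."""
    page, offset = split_address(logical)
    return page_map[page] * PAGE_SIZE + offset
```

With a map that places logical page 1 in physical page 7, `translate(1025, {1: 7})` gives word 1 of physical page 7. A monitor supporting virtual maps would have to compose such a map with its own allocation of real pages, which this sketch does not attempt.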
The following virtual devices are available. For each virtual device, the real device which supports it is indicated.
The teletype device configuration of a real Nova permits only two teletypes to be connected directly to device lines. Additional teletypes (CRTs) must be attached to a multiplexer (QTY) which uses a single device line. This organization is reflected in the configuration of virtual teletypes. A virtual multiplexer is not specified due to a lack of CRTs in the UT Nova configuration. Similarly, other peripherals (such as tape drives), which exist but are not present in the UT Nova configuration, are not specified.
Virtual interrupts occur in virtual machines exactly as real interrupts occur on real machines, assuming real processor and device states are equivalent. The consequences of virtual interrupts are also equivalent.
A virtual stack fault occurs when virtual interrupts are enabled and the virtual stack pointer is set to a multiple of 256 during a PSHA, POPA, SAV, or RET stack instruction. This is equivalent to a real machine stack fault. The consequences of a stack fault are also equivalent.
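The stack fault condition stated above reduces to a simple predicate. The sketch below is illustrative; the function and parameter names are hypothetical, and `new_sp` stands for the value the virtual stack pointer takes during the instruction.

```python
STACK_INSTRUCTIONS = {"PSHA", "POPA", "SAV", "RET"}


def stack_fault(interrupts_enabled, new_sp, mnemonic):
    """True when the stated conditions for a virtual stack fault all hold."""
    return (
        interrupts_enabled
        and mnemonic in STACK_INSTRUCTIONS
        and new_sp % 256 == 0
    )
```

For instance, a PSHA that leaves the stack pointer at octal 1000 (decimal 512) with interrupts enabled satisfies the predicate, while the same instruction with interrupts disabled does not.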
The functions of virtual front panel switches are provided through the command language used at the VMM operator's console. These functions are equivalent to those of the Nova 3/D with one exception. The memory reference functions are restricted to the first 1024 words of memory to prevent the possibility of a page fault. Front panel commands are fully described in Chapter 4.
The design of a virtual machine monitor is the creation of its structure and form. The goal of our design is to produce a VMM structure which is easy to understand and which suggests a straightforward implementation. Our main concern is with issues which are directly related to virtualization.
The method used for creating VMM structure is top down decomposition. The monitor is decomposed in an iterative fashion until an implementation is suggested.
There are two basic issues at the top level of virtual machine monitor design. The first issue concerns the initial decomposition of the monitor into computations, which we call processes. The meaning of the term "process" in this context is stated below. The mechanism of process scheduling is also considered as well as the problem of indivisible operation on shared data objects.
The second issue concerns virtual processors. In order to understand how the decomposition is to proceed, virtual processor states are identified and transitions between states are described. In this chapter, these two issues are discussed. The decomposition of VMM processes is described in Chapters 3 and 4.
The notion of concurrent processes used in this report is taken from Brinch Hansen. Essentially, a process is "a sequence of operations carried out one at a time." Processes are concurrent if their executions overlap (or interleave) in time. In this report, processes are our unit of decomposition, serving to organize our system into logical components. Processes are the entities which are visibly scheduled for execution, either directly or by interrupts or traps.
The following VMM processes may be identified. There is one process associated with each real device, one associated with real processor traps, one associated with stack faults, an initialization process, and a dispatcher process.
The process associated with the multiplexer (QTY) supports virtual actions for virtual teletype devices (TTI and TTO). The real time clock (RTC) process supports virtual actions for virtual clocks, provides an alarm clock function for signaling the end of virtual processor time slices, and provides a system clock. The disk (DKP) process supports virtual actions for the virtual 6030 diskette and virtual 4234 moving head disk. It also assists the support of page fault handling and command line interpretation (virtual processor bootstrap loading). The line printer (LPT) and card reader (CDR) processes support virtual actions for these virtual devices. The teletype (TTI and TTO) processes support virtual actions for the virtual second teletype device (TTI1 and TTO1). Finally, the processes associated with the second console (TTI1 and TTO1) provide console communication, system generation, and command line interpretation. Each of the device processes has the additional function of saving and restoring real machine state upon entry and exit and supporting the virtual interrupt and stack fault facilities.
The process associated with traps supports the majority of virtual processor actions which operate directly on the virtual store. This process handles page faults, interprets sensitive instructions, supports the virtual auto increment/decrement facility, provides virtual memory management and protection, simulates virtual traps, saves and restores real machine state upon entry and exit, and simulates virtual interrupts.
The initialization process exists out of necessity and should not be considered a true process of the VMM. It is active only when the monitor is started and cannot run in parallel with any other process.
The process associated with stack faults supports the virtual stack fault facility and handles faults which occur during VMM process execution. It saves and restores real machine state upon entry and exit and simulates virtual stack faults.
The dispatcher process serves two functions. When no virtual processors are ready to take over the real processor, it serves as an idling process and merely waits for an interrupt. If at least one virtual processor is ready, it composes an address translation map, realizes a portion of the virtual store, and assigns the real processor to the virtual one, putting itself to sleep. The dispatcher process may also initiate simulation of virtual interrupts and stack faults.
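The two functions of the dispatcher can be sketched as a single pass over a queue of ready virtual processors. This is a simplified illustration: the function name, the log of steps, and the first-come, first-served choice of virtual processor are assumptions, since the scheduling policy is not stated here.

```python
from collections import deque


def dispatch(ready, log):
    """One dispatcher pass: return the chosen virtual processor, or None when idling."""
    if not ready:
        log.append("idle: wait for interrupt")
        return None
    vp = ready.popleft()  # assumption: first-come, first-served selection
    log.append(f"compose map for {vp}")
    log.append(f"realize store of {vp}")
    log.append(f"assign real processor to {vp}")
    return vp
```

Called with an empty queue the function only records that it is idling; called with `deque(["vp0", "vp1"])` it selects `vp0` and records the three steps in the order given in the text.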
Virtual machine monitor processes are scheduled in three ways. First, all device processes and the stack fault process are scheduled by interrupts. This means that, potentially, any combination of these processes may execute logically in parallel. In practice, some combinations are explicitly prohibited by interrupt masking. This prevents undesirable results such as data loss caused by performance degradation due to processor sharing.
Second, the trap process is scheduled by real machine traps, which occur while the processor is assigned to a virtual processor. Because interrupts may occur during execution of the trap process, it can potentially run in parallel with device processes or the stack process.
Finally, the dispatcher process is scheduled explicitly by either the initialization process, a device process, or the trap process. The scheduling process puts itself to sleep upon awakening the dispatcher process. In the case of a device or trap process, the dispatcher is awakened when a virtual processor, which was assigned the real processor when the trap or device process was awakened, is preempted. Again, because interrupts may occur while the dispatcher process is executing, it can potentially run in parallel with device processes or the stack fault process. It cannot run concurrently with the trap process.
With minor exception, all VMM processes execute a sequential program and wake up another process. One exception is the dispatcher process which could idle forever in the absence of interrupts. Also, the dispatcher process does not awaken a VMM process; rather, it assigns the real processor to a virtual processor. Nor does the virtual processor awaken a VMM process; it loses the real processor. For this discussion, consider a virtual processor to be a null VMM process which only serves to awaken another process or be awakened when all VMM processes go to sleep.
The picture of processor sharing one obtains from this design is quite simple. It is similar to nested subroutine calls of sequential programs in that the last process to be awakened is the first process to be put to sleep.
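The last-awakened, first-to-sleep discipline can be modeled with a stack. The class below is a toy model written for illustration, not part of the monitor; the process names used in the example are taken from the text, and the virtual processor is treated as the null process at the bottom of the stack.

```python
class ProcessStack:
    """Toy model of nested process activation on a single real processor."""

    def __init__(self, base="virtual processor"):
        self.active = [base]

    def wake(self, name):
        """An interrupt or trap awakens a process, which takes the real processor."""
        self.active.append(name)

    def sleep(self):
        """The running process goes to sleep; the one it preempted resumes."""
        return self.active.pop()

    @property
    def running(self):
        return self.active[-1]
```

If a trap awakens the trap process and a disk interrupt then awakens the DKP process, the DKP process must go to sleep before the trap process resumes, just as an inner subroutine returns before its caller.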
Virtual machine monitor processes perform indivisible operations on shared variables; hence, critical sections must exist. Critical sections must execute in a mutually exclusive fashion in order to prevent unpredictable results. A rather straightforward but inelegant solution to the mutual exclusion problem may be found by observing that at least one of the communicating processes is always a device process, which is scheduled by interrupts. To ensure mutual exclusion of critical sections, it is sufficient to disable interrupts on entry and reenable them on exit, as the Nova 3/D is a one processor system.
If more than one processor were available, it would be desirable to build the VMM upon a software nucleus which would hide interrupts and provide process scheduling as well as synchronization primitives.
From the vantage point of a VMM, a virtual processor assumes four states corresponding to the running condition of a real processor. These are called ready, running, interpreting and blocked. There is also one state, called terminated, which corresponds to the real processor halt condition. These states have the following meaning:
In summary, interpretation of virtual instructions and simulation of virtual interrupts, stack faults, and traps occur in the running and interpreting states. The ready state exists due to processor sharing; the blocked state is required for the same reason and because of the length of time required for virtual memory realization. Virtual store state transitions caused by virtual devices are independent of virtual processor state; however, the consequences of these transitions are significant only when a virtual processor is running or interpreting.
Transitions between states of a virtual processor are supported by VMM processes. Of the twenty possible transitions between five states, only fourteen exist. Figure 3 shows these transitions. A description of each transition and its causes is given below.
The disk subsystem is a collection of programs shared by three processes which supports I/O to the 4234 moving head disk and the 6030 diskette. These programs are executed by the TTO1 process to support virtual processor program loading, by the trap process to support virtual disk I/O, again by the trap process to support page fault handling, and by the DKP (disk) process to support the previous three functions. When the TTO1 or trap process requires disk I/O, it first acquires a free disk record, generates appropriate information, loads the record, and passes it to the disk subsystem. The subsystem performs the I/O, recovers from certain real disk errors, and signals completion. A disk record contains the following information.
The disk subsystem is composed of three parts, a disk record manager, a disk record queuer, and a disk driver. The record manager has the task of allocating and recovering disk records. The disk queuer maintains a queue of disk records and passes records to the disk driver according to a service discipline. Finally, the disk driver must actually perform the disk I/O based on the information contained in the disk record. This involves loading the data channel map, initiating a seek, initiating a read or write, and signaling completion. Should errors occur at any stage, the driver must attempt recovery; if the error is unrecoverable, the driver communicates this condition to the operator.
The disk subsystem is concerned only with the details of disk transfers. In the context of VMM design, this is not particularly relevant. What is more significant is the manner in which disk records are generated. In closing, recall that the disk subsystem is a collection of programs, not processes; separate activations of the same program could be executed concurrently by several processes.
Whenever the virtual processor is assigned the real processor by a VMM process, a virtual interrupt or stack fault may be pending. Simulation is required when virtual interrupts are enabled, virtual mapping is not inhibited, and a virtual interrupt or stack fault is pending. Virtual interrupts are pending if at least one virtual done flag is high and interrupts for the corresponding device are not disabled. A virtual stack fault is pending if a real stack fault occurred while the virtual processor was running. Virtual interrupts are enabled when the virtual CPU busy flag is one and the current virtual program counter is more than one greater than the address at which the virtual CPU busy flag was set.
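The conditions above can be collected into a single predicate. The following Python sketch is illustrative only; the class and field names (`VirtualDevice`, `VirtualProcessor`, `enable_address`, and so on) are assumptions introduced here, and program counter wraparound is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualDevice:
    done: bool = False
    interrupt_disabled: bool = False


@dataclass
class VirtualProcessor:
    cpu_busy: bool = False           # virtual interrupt enable flag
    enable_address: int = 0          # address at which cpu_busy was last set
    pc: int = 0                      # current virtual program counter
    mapping_inhibited: bool = False
    stack_fault_pending: bool = False
    devices: dict = field(default_factory=dict)


def interrupts_enabled(vp):
    # Enabled only once the PC is more than one past the enabling instruction.
    return vp.cpu_busy and vp.pc > vp.enable_address + 1


def interrupt_pending(vp):
    # Pending if some device has done high and is not masked out.
    return any(d.done and not d.interrupt_disabled for d in vp.devices.values())


def simulation_required(vp):
    return (interrupts_enabled(vp)
            and not vp.mapping_inhibited
            and (interrupt_pending(vp) or vp.stack_fault_pending))
```

The one-instruction delay in `interrupts_enabled` reflects the rule that the instruction following an interrupt enable executes before any interrupt is taken.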
Interrupt simulation begins by fetching the contents of virtual physical location one. This is the address of the virtual interrupt handler or the beginning of an indirect chain to it. The chain is followed until the end; should a link exist in an unrealized portion of virtual main memory, a program is entered which realizes the required virtual store. The contents of the virtual program counter are stored in virtual location zero. The virtual program counter is loaded with the address of the virtual interrupt handler. The CPU busy flag (interrupt enable) and MAP done flag (mode switch) are cleared.
Simulation of stack faults proceeds in the same manner except that the address of the stack fault handler (or a link to it) is fetched from virtual physical location three. Virtual stack fault simulation has precedence over virtual interrupt simulation, reflecting this condition in the real processor.
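The entry sequence for interrupt and stack fault simulation might be sketched as follows. Several things here are assumptions rather than part of the design: virtual main memory is modeled as a Python dictionary that is fully realized (so page fault handling is omitted), the high-order bit of a word is taken to mark a further level of indirection, and the names `resolve_chain` and `simulate_entry` are hypothetical.

```python
INDIRECT_BIT = 0o100000   # assumption: bit 0 (high-order) marks another link
ADDRESS_MASK = 0o077777   # low-order 15 bits hold the address


def resolve_chain(memory, location):
    """Follow the indirect chain starting with the contents of `location`."""
    word = memory[location]
    while word & INDIRECT_BIT:
        word = memory[word & ADDRESS_MASK]
    return word & ADDRESS_MASK


def simulate_entry(vp, memory, stack_fault=False):
    """Enter the virtual interrupt handler (or the stack fault handler)."""
    vector = 3 if stack_fault else 1      # location three or location one
    handler = resolve_chain(memory, vector)
    memory[0] = vp["pc"]                  # save the virtual program counter
    vp["pc"] = handler
    vp["cpu_busy"] = False                # interrupt enable cleared
    vp["map_done"] = False                # mode switch cleared
```

A circular chain would loop forever in this sketch; a fuller version would bound the number of links followed and enter the page fault handler for unrealized links.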
The trap process is awakened by a real processor violation trap or explicit trap instruction. In every case, the trap process first determines if the trap corresponds to a virtual trap. If so, a virtual trap is simulated; otherwise, actions associated with instruction interpretation or virtual memory are performed.
In order for a virtual processor to execute innocuous memory reference instructions, the addressed portion of virtual main memory must be realized. The virtual machine monitor must support some form of memory management, more specifically, a demand paged virtual memory system. The selection of paging, with page size of 1024 words, is dictated by the Nova hardware; demand paging is appropriate since the VMM cannot anticipate the addressing behavior of a virtual processor.
The memory management and protection hardware of the Nova 3 was designed with the specific goal of supporting virtual memory. Address translation for running programs is supported by two thirty-two register program maps, which map logical page addresses (range 0..31) to physical page addresses (range 0..127). A logical page may also be mapped into a validity violation. Any attempt to access such a logical page results in a real processor violation trap; a validity violation may be interpreted as a page fault or as a signal to set an access bit maintained in software. Logical pages may also be individually write protected. A write violation may be interpreted as a signal to set a dirty bit maintained in software. When a trap occurs, the real processor saves the logical address of the instruction causing the trap and the logical page address of the violation. After a page fault, these registers indicate where to restart the virtual processor and which page to realize.
The notion of address translation supported by the VMM is twofold. First, while a virtual processor is running in supervisor mode, virtual processor physical addresses are mapped to real processor physical addresses. Second, while the virtual processor is running in virtual user mode, the VMM supports virtual memory management and virtual processor logical addresses are mapped to real processor physical addresses. This is actually a double mapping; the mapping from virtual processor logical addresses to virtual processor physical addresses is determined by virtual program maps. The mapping from virtual processor physical addresses to real processor physical addresses is determined by the VMM. Table 1 summarizes this organization. Composition of a real processor map of either type is performed by the dispatcher process immediately before a virtual processor is assigned the real processor.
EXECUTION STATE                             REAL PROCESSOR MODE   MAPPING
VMM Program                                 Supervisor            None
Virtual Processor Supervisor Mode Program   User                  Virtual Processor Physical to Real Processor Physical
Virtual Processor User Mode Program         User                  Virtual Processor Logical to Virtual Processor Physical to Real Processor Physical
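Composition of the two kinds of real processor map can be sketched in Python. The data layout is assumed for illustration: `frame_of` is a hypothetical dictionary from realized virtual processor physical pages to real frames, `virtual_map` is a 32-entry list standing for a virtual program map, and `INVALID` stands for a map entry that produces a validity violation.

```python
INVALID = None        # hypothetical marker for a validity-violation entry
LOGICAL_PAGES = 32    # logical page addresses range over 0..31


def compose_supervisor_map(frame_of):
    """Virtual supervisor mode: virtual physical page -> real frame."""
    return [frame_of.get(page, INVALID) for page in range(LOGICAL_PAGES)]


def compose_user_map(virtual_map, frame_of):
    """Virtual user mode: logical -> virtual physical -> real frame."""
    real_map = []
    for logical in range(LOGICAL_PAGES):
        vpage = virtual_map[logical]
        if vpage is INVALID:
            real_map.append(INVALID)                   # virtual validity violation
        else:
            real_map.append(frame_of.get(vpage, INVALID))  # unrealized -> page fault
    return real_map
```

In both cases an `INVALID` entry leads to a real violation trap; it is then up to the trap process to decide whether the trap is a page fault or a virtual violation.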
When a validity violation occurs, the trap process is awakened. The trap process must first determine if the violation is to be interpreted as a page fault or if it corresponds to a virtual validity violation, that is, a violation which occurred while a virtual processor was running in user mode with virtual validity protection enabled on that logical page. Should the violation correspond to a page fault, a page fault handling program is executed. This program may also be entered during a sensitive instruction interpretation or during simulation of a virtual trap, stack fault, or interrupt if an addressed portion of virtual main memory is not realized.
The goal of the page fault handler is to redefine the mapping between virtual processor physical pages and real processor physical pages (frames) such that restarting the virtual processor will produce a minimal number of future page faults. This first involves selection of a frame which is currently unmapped (free) or which is mapped but can be replaced with minimal unfavorable consequences. In the latter case, a replacement algorithm is used to select a frame whose contents may need to be saved in the virtual store (swapping out). Swapping out is performed by giving the disk subsystem a record with the following contents:
Page fault handling is completed by swapping in the needed virtual processor physical page and updating status information. Swapping in is done in the same fashion as swapping out. The record contents are:
Dirty bits are maintained to prevent unnecessary swapping out of frames which have not been altered since they were swapped in. When a frame is swapped in, its dirty bit is normally set to false and real processor write protection is enabled. Any attempt to write on the protected frame causes a trap, awakening the trap process. The trap process must determine if the write violation corresponds to a virtual violation. If not, the dirty bit for that frame is set to true and write protection is removed. The virtual processor is then restarted and the write instruction is allowed to execute normally.
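The dirty bit protocol can be summarized in a few lines of Python. This is a sketch only; `Frame`, `swap_in`, `on_write_violation`, and `needs_swap_out` are hypothetical names, and the returned strings merely label the action the trap process would take next.

```python
class Frame:
    """Software status kept for one real frame (illustrative fields only)."""

    def __init__(self):
        self.dirty = False
        self.write_protected = False


def swap_in(frame):
    # A freshly loaded frame is clean and write protected.
    frame.dirty = False
    frame.write_protected = True


def on_write_violation(frame, is_virtual_violation):
    """Trap process action for a real write violation on `frame`."""
    if is_virtual_violation:
        return "simulate virtual write violation"
    frame.dirty = True               # first write since swap in
    frame.write_protected = False    # later writes proceed without trapping
    return "restart virtual processor"


def needs_swap_out(frame):
    # Only altered frames must be saved in the virtual store.
    return frame.dirty
```

Only the first write to a frame after it is swapped in costs a trap; subsequent writes run at full speed.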
Pages of virtual processors should be realized only in mutually exclusive partitions of the frames not used by the VMM. This prevents problems associated with frame poaching. Partition size is initially defined during system generation. In order to change partition size, the system must be regenerated.
The replacement algorithm is perhaps the most important feature of a page fault handler. There are several possibilities, such as first in first out, least recently used, and random selection. It is not clear which of these would yield the best performance. In the face of uncertainty, a modified random selection algorithm is chosen. Performance measurements would be needed to determine if this algorithm is the best.
The modified random selection algorithm holds that a frame should be selected at random from an unlocked subset of the virtual processor's frame partition. Frames are locked when it is certain that their replacement would cause a page fault. The following frames are always locked:
Suppose an indirect chain spanned several pages and each link reference produced a page fault; the locked list would grow as each page was swapped in and links would not be swapped out. The list would be cleared when another instruction produced a fault. Should the number of frames spanned by the indirect chain be greater than the partition size, all frames will become locked and the virtual processor would be permanently blocked. This condition is detected in the following way. If the number of locked frames equals the frame partition size, the address chain for data fetch of the current instruction is followed. If it spans the frame partition, the permanently blocked condition is present. A message indicating this fact is sent to the VMM operator's console.
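A minimal sketch of the selection step follows. The names `select_frame`, `partition`, and `locked` are assumptions; the locked set is taken as an input because its construction depends on the locking rules. A return value of `None` corresponds to the case in which the number of locked frames equals the partition size, at which point the caller would test for the permanently blocked condition.

```python
import random


def select_frame(partition, locked, rng=random):
    """Pick a replacement frame at random from the unlocked frames.

    partition: frames assigned to this virtual processor
    locked:    frames whose replacement would certainly cause a page fault
    """
    candidates = [frame for frame in partition if frame not in locked]
    if not candidates:
        return None      # every frame is locked; check for permanent block
    return rng.choice(candidates)
```

Passing `rng` explicitly makes it possible to substitute a seeded generator when repeatable behavior is wanted for measurement.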
It is possible to cause the locked list to grow without proper cause. For example, the instruction sequence
A primary function of the VMM is to provide virtual processor actions which correspond to instruction execution. These actions work on the entire virtual store. In all cases, instructions which require VMM support are I/O instructions. They may be addressed to the CPU, to the MMPU, or to an I/O device. While a virtual processor is assigned the real processor, any attempt to execute an I/O instruction results in a violation trap, awakening the trap process. The trap process must first determine if the violation corresponds to a virtual I/O protection violation. If not, the instruction is interpreted. This is done by fetching the instruction from the virtual store, decoding it, and executing an appropriate program.
Instructions which address the CPU (device 77) function to control the virtual interrupt facility, read the virtual switches, and halt the processor. Components of the virtual store referenced by these instructions are:
DIA (read switches): Contents of the virtual console switches are loaded into the designated accumulator. Virtual console switches are set by virtual console commands.
DIB (interrupt acknowledge): The done and interrupt disable flags of each virtual device of the virtual processor are searched until an interrupt pending condition (done=1, disabled=0) is found. The order of search is DKP, CDR, LPT, RTC, TTI, TTI1, TTO, TTO1. The device address of the first device meeting the condition is loaded into the designated virtual accumulator. If no device meets the condition, zero is loaded.
DOB (mask out): The interrupt disable flags for virtual devices of the virtual processor are set according to the bits in the designated virtual accumulator. A flag is set if its mask out bit is one, cleared if it is zero. The correspondence between bits and devices is: 7 - DKP; 10 - CDR; 12 - LPT; 13 - RTC; 14 - TTI and TTI1; 15 - TTO and TTO1.
DIC (clear I/O devices): The busy, done, and interrupt disable flags for each virtual device of the virtual processor are set to zero; the virtual MMPU busy and done flags are also cleared.
DOC (halt): The virtual processor is preempted.
NIO: No operation is performed.
SKP (skip): Interpret control command as BZ, BN, DZ, and DN.
S (enable interrupts): Set the virtual CPU busy flag to one and save the address of the instruction.
C (disable interrupts): Set the virtual CPU busy flag to zero.
BN (skip if interrupts enabled): The virtual program counter is incremented if the virtual CPU busy flag is one.
BZ (skip if interrupts disabled): The virtual program counter is incremented if the virtual CPU busy flag is zero.
DN (skip if power failure): No operation since there is no virtual power failure.
DZ (skip if power available): Increment the virtual program counter.
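The interrupt acknowledge and mask out actions lend themselves to a short Python sketch. Two points are assumptions not stated in the text: the octal device codes in `DEVICE_CODE` are believed to be the conventional assignments but should be checked against the hardware reference, and accumulator bits are numbered from 0 at the high-order end to 15 at the low-order end. The function names are hypothetical.

```python
SEARCH_ORDER = ["DKP", "CDR", "LPT", "RTC", "TTI", "TTI1", "TTO", "TTO1"]

# Assumed device codes (octal); verify against the hardware reference.
DEVICE_CODE = {"DKP": 0o33, "CDR": 0o16, "LPT": 0o17, "RTC": 0o14,
               "TTI": 0o10, "TTI1": 0o50, "TTO": 0o11, "TTO1": 0o51}

# Mask out bit for each device, as listed under DOB.
MASK_BIT = {"DKP": 7, "CDR": 10, "LPT": 12, "RTC": 13,
            "TTI": 14, "TTI1": 14, "TTO": 15, "TTO1": 15}


def interrupt_acknowledge(devices):
    """DIB: code of the first device with done=1 and disabled=0, else zero."""
    for name in SEARCH_ORDER:
        dev = devices[name]
        if dev["done"] and not dev["disabled"]:
            return DEVICE_CODE[name]
    return 0


def mask_out(devices, accumulator):
    """DOB: set or clear each interrupt disable flag from the accumulator."""
    for name, bit in MASK_BIT.items():
        # Assumption: bit 0 is the high-order bit of the 16-bit word.
        devices[name]["disabled"] = bool(accumulator & (1 << (15 - bit)))
```

Because the search order is fixed, it acts as a priority ordering among virtual devices with simultaneous requests.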
If the transfer field of an instruction addressing an I/O device or the MMPU is SKP, action is determined by the control field. These actions are:
BZ (skip on busy zero): Increment the virtual program counter if the virtual busy flag is zero.
BN (skip on busy non zero): Increment the virtual program counter if the virtual busy flag is one.
DZ (skip on done zero): Increment the virtual program counter if the virtual done flag is zero.
DN (skip on done non zero): Increment the virtual program counter if the virtual done flag is one.
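The four skip tests reduce to a small decision table. The sketch below is illustrative; `skip_taken` and `interpret_skip` are hypothetical names, and the virtual processor is modeled as a dictionary holding only a program counter.

```python
def skip_taken(control, busy, done):
    """Return True if the skip condition named by `control` holds."""
    conditions = {
        "BZ": not busy,   # skip on busy zero
        "BN": busy,       # skip on busy non zero
        "DZ": not done,   # skip on done zero
        "DN": done,       # skip on done non zero
    }
    return conditions[control]


def interpret_skip(vp, control, busy, done):
    """Increment the virtual program counter when the condition holds."""
    if skip_taken(control, busy, done):
        vp["pc"] += 1
```

The CPU device fits the same table if its done flag is treated as permanently zero: DN then never skips and DZ always skips, matching the absence of virtual power failure.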
If the transfer field of an instruction addressing an I/O device or the MMPU is NIO, then no action due to the transfer field occurs. However, action specified by the control field is performed.
The MMPU instructions address two devices, MAP (device 2) and MAP1 (device 3). They function to control the virtual MMPU facility by reading and writing virtual MMPU registers. Also included is the explicit TRAP instruction required for virtual mode change. Components of the virtual store affected by instructions addressing MAP are:
DOB (load map): A virtual map register is loaded according to the contents of the designated virtual accumulator.
DIA (read MMPU status): The current virtual MMPU status word is loaded into the designated virtual accumulator.
DOA (write MMPU status): The contents of the current MMPU status word are copied into the previous MMPU status word. The contents of the designated virtual accumulator are loaded into the current virtual MMPU status word.
DIB (read violation data): The virtual violation data register is loaded into the designated virtual accumulator.
P (map single cycle): The entire map single cycle sequence must be completely interpreted. If the next instruction is realized then it is fetched from the realized virtual store. If not, the page fault handler is entered. The effective address for the data fetch of the instruction is computed; this is a virtual processor logical address. A corresponding virtual processor physical address is fetched from the virtual program map indicated by the single cycle map select bit in the virtual MMPU status word. Should a virtual validity violation be present, a virtual single cycle validity violation is simulated.
If the virtual processor physical address is not realized, the page fault handler is entered. The virtual processor physical address is used in software interpretation of the command. If the command causes writing in the virtual store and if single cycle write protection is enabled, a single cycle write violation is simulated.
If the page fault handler is entered, the map single cycle instruction is restarted. Should subsequent page faults occur, needed frames will not be swapped out since they are on the locked list.
C (clear violation): The virtual violation data register, virtual MAP busy flag, and virtual MAP done flag are cleared.
Components of the virtual store referenced by instructions addressing MAP1 are:
DOA (initiate page check): The contents of the designated virtual accumulator are loaded into the virtual page check register.
DIA (page check): The contents of the virtual map register selected by the virtual page check register are loaded into the designated virtual accumulator.
DIB (read violation address): The contents of the virtual violation address register are loaded into the designated virtual accumulator.
C (clear map): All virtual registers associated with the MMPU are cleared.
Explicit Trap to User Mode. A real processor enters user mode by appropriately setting bits in the MMPU status word and executing a defer cycle. As previously discussed, a virtual processor may enter virtual user mode by executing an explicit trap. The instruction following the trap should specify a defer cycle. A trap executed while a virtual processor is running awakens the trap process.
The trap process first determines if the trap corresponds to a jump to virtual user mode. If not, a virtual explicit trap is simulated. The previous virtual MMPU status word is compared to the current virtual MMPU status word. Action is not taken unless differences are found in the enable bit, the inhibit bit, or the map select bit. Interpretation continues only if the enable bit is one and the inhibit bit is zero. In this case, the virtual violation data register is inspected for error conditions. If any exist, a virtual violation trap should be simulated. Otherwise, the next instruction is fetched if it is realized. If not, the page fault handler is entered. This instruction should be a jump indirect to a virtual processor physical address. If the address is realized, its contents are fetched; if not, the page fault handler is entered. These contents are either a virtual processor logical starting address or a pointer to one. In the latter case, the indirect chain must be followed until the starting address is found.
In mapping virtual logical addresses to real physical addresses, a virtual validity violation is simulated if a virtual map so indicates and the page fault handler is entered if a virtual processor physical page is not realized. Once the starting address is determined, it is loaded into the virtual program counter. The virtual busy flag (mode switch) for MAP (device 2) is set to one and the virtual processor is preempted. When it is later promoted to running, virtual processor logical addresses will be mapped to real processor physical addresses and a virtual user mode will exist.
The VMM supports mapping between virtual teletypes and the CRTs connected to the multiplexer. Hence, all virtual teletype I/O will result in multiplexer I/O by the VMM. Components of the virtual store affected by instructions addressing TTO or TTI are:
DIA (read character buffer): The contents of the virtual input buffer are loaded in the designated virtual accumulator.
S: Set TTI busy to one and TTI done to zero.
C: Clear virtual TTI done and busy.
The following actions correspond to transfer and control commands addressing
DOA (write character buffer): The contents of the designated virtual accumulator are loaded into the virtual output buffer.
S: Set virtual TTO busy to one and virtual TTO done to zero. Issue a multiplexer instruction to send the character in the virtual output buffer to the appropriate CRT.
C: Clear virtual TTO busy and virtual TTO done.
All virtual processors have access to a virtual real time clock. The concept of virtual real time at first seems contradictory. It is, however, a required facility if virtual processors are to support system software. Time is of interest to a virtual processor only when it is in the running or interpreting state. When it is blocked or ready, time is at a standstill; this reflects the fact that real processors never experience time losses due to processor sharing. Transitions through these states should appear instantaneous in virtual time.
Real time clocks are simulated by counters which decrement only when a virtual processor is running or interpreting. Virtual clock counters are decremented in response to clock pulses from the real machine real time clock.
Components of the virtual store affected by instructions addressing RTC are:
DOA (set clock frequency): Load the output buffer with the contents of the designated virtual accumulator. The value in the output buffer determines virtual clock frequency, that is, the initial value of the virtual clock counter.
S: Set RTC busy to one and done to zero, enabling virtual RTC interrupts.
C: Clear RTC busy and done.
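A virtual clock counter of the kind described can be sketched as follows. The behavior at expiry is an assumption: the text does not say what happens when the counter reaches zero, so the sketch sets done and clears busy, by analogy with other devices. Reloading the counter on start is likewise assumed, and all names are hypothetical.

```python
class VirtualClock:
    """Illustrative virtual real time clock for one virtual processor."""

    def __init__(self, initial_count):
        self.initial_count = initial_count   # derived from the output buffer
        self.counter = initial_count
        self.busy = False
        self.done = False

    def start(self):                         # S command
        self.busy, self.done = True, False
        self.counter = self.initial_count    # assumption: reload on start

    def clear(self):                         # C command
        self.busy = self.done = False

    def real_tick(self, state):
        """Called on each pulse of the real machine real time clock."""
        if not self.busy or state not in ("running", "interpreting"):
            return                           # virtual time stands still
        self.counter -= 1
        if self.counter == 0:                # assumed expiry behavior
            self.busy, self.done = False, True
```

Pulses that arrive while the virtual processor is ready or blocked are simply ignored, which is what makes transitions through those states appear instantaneous in virtual time.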
Instructions addressing DKP correspond to virtual disk I/O commands for the virtual 4234 moving head disk (drive 0) and the virtual 6030 diskettes (drives 1 and 2). Virtual diskette I/O is possible only if the virtual processor is mapped to a real diskette drive. I/O on a virtual 4234 is allowed only if the virtual processor has been allocated real 4234 cylinders. The VMM supports virtual disk I/O by causing transitions in virtual DKP registers and by creating disk records corresponding to the specified commands. When a disk record is completed, the VMM realizes any unrealized portions of the virtual store needed for the virtual disk transfer and passes the record to the disk queuer.
Virtual disks differ from real disks in several ways. Virtual seeks cause only updating of a virtual register, not a real seek. On the other hand, a virtual read or write results in a real seek and a real transfer operation. It may also initiate handling of page faults associated with unrealized virtual store required in the virtual transfer. Several errors which are possible during real disk I/O are not present in virtual disks. These are the invalid status condition, unsafe condition, address error, checkword error, and data late error. The VMM recovers from these errors before completing virtual disk I/O.
Virtual disk errors which are possible are the seek error, which occurs if the specified track or cylinder is out of range, and the end of cylinder error, which happens if the end of cylinder or track is reached before the I/O completes.
Software mapping of disk addresses is performed only with respect to cylinders on the 4234 moving head disk. A virtual 4234 disk has n cylinders numbered 0..n-1 where n ≤ 408-k. There are 408 cylinders on a real 4234 disk and k cylinders are used by the VMM. When a virtual processor is allocated cylinders, a base cylinder address is defined which is the physical address of virtual cylinder zero. Hence, the physical address corresponding to a virtual cylinder address may be computed by
physical cylinder address = base cylinder address PLUS virtual cylinder address
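Since the base cylinder address is the physical address of virtual cylinder zero, the mapping amounts to an offset with a range check. The Python sketch below is illustrative; `physical_cylinder` and its parameters are hypothetical names, and an out-of-range address is reported with an exception where the VMM would set the virtual seek error flag.

```python
REAL_CYLINDERS = 408   # cylinders on a real 4234 disk


def physical_cylinder(virtual_cylinder, base, size):
    """Map a virtual cylinder (0..size-1) to a physical cylinder.

    base: physical address of virtual cylinder zero
    size: number of cylinders allocated to the virtual processor
    """
    if not 0 <= virtual_cylinder < size:
        raise ValueError("virtual seek error: cylinder out of range")
    physical = base + virtual_cylinder
    assert physical < REAL_CYLINDERS, "allocation exceeds the real disk"
    return physical
```

For example, with a base of 100 and an allocation of 50 cylinders, virtual cylinder 49 maps to physical cylinder 149.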
Components of the virtual store referenced by instructions addressing device DKP are:
DOC (specify disk address and sector count): the virtual DASCR is loaded with the contents of the designated virtual accumulator; if drive 0 is selected, the diskette bit of the virtual DKP status register is cleared; otherwise it is set to one.
DIC (read disk address and sector count): the designated virtual accumulator is loaded with the contents of the virtual DASCR.
DOA (specify command and track): if the command is a seek, the virtual command and track register is loaded with the contents of the designated virtual accumulator; otherwise, only the command field is loaded.
DIA (read status): the designated virtual accumulator is loaded with the virtual DKP status register.
DOB (load memory address): the virtual memory address register is loaded with the contents of the designated virtual accumulator.
DIB (read memory address): the designated virtual accumulator is loaded with the virtual memory address register.
S: Sets virtual DKP busy to one and clears DKP done and the end of cylinder error flag in the virtual DKP status word. If the command in the virtual command and track register is recalibrate, a recalibration flag is set and the action corresponding to a seek is performed (except clearing the recalibrate flag). If the command is a seek, the recalibrate flag is cleared and the seek done flag for the selected drive is set in the virtual DKP status word. If drive 0 is selected, the virtual seek error flag is set if the cylinder address is out of range. Should drive 1 or 2 be selected, the error flag is set if the track address is greater than 63. The DKP busy flag is cleared and the done flag is set to one. If virtual interrupts are enabled for DKP, a virtual interrupt is pending.
If the command is a read or write, all but the data channel mapping portion of a disk record is immediately generated. The record contents are:
- Identification: disk I/O / current virtual processor / last record.
- Drive: drive field of virtual DASCR.
- Disk Address.
- Cylinder: If the recalibrate flag is set, then the physical track or cylinder corresponding to logical track (cylinder) zero; otherwise, if drive is zero, then cylinder field of command and cylinder register plus the virtual processor base cylinder address; otherwise, the track field of the virtual command and track register.
- Surface: if drive is zero, then the surface field of the virtual DASCR.
- Starting Sector: if drive is zero, then the starting sector field of the virtual DASCR MOD 12, otherwise, the starting sector field MOD 8.
- Sector Count: if drive is zero, then 16 MINUS the sector count field of the virtual DASCR; otherwise ( 16 MINUS the sector count field ) MOD 9.
- Memory Address: contents of virtual memory address register.
- Read/Write: contents of command field of virtual command and cylinder register.
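The sector arithmetic in the record can be checked with a short sketch. The function name and argument names are hypothetical; the arithmetic follows the Starting Sector and Sector Count entries above as written.

```python
def sector_fields(drive, start_field, count_field):
    """Return (starting sector, sector count) for a disk record.

    drive:       0 for the 4234 disk, 1 or 2 for a diskette
    start_field: starting sector field of the virtual DASCR
    count_field: sector count field of the virtual DASCR
    """
    if drive == 0:
        return start_field % 12, 16 - count_field
    return start_field % 8, (16 - count_field) % 9
```

With a starting sector field of 13 and a sector count field of 12, drive 0 yields sector 1 and a count of 4 sectors.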
The VMM now determines which portions of the virtual store need to be realized prior to the virtual disk transfer, realizes these pages, and defines an appropriate data channel mapping for the virtual disk I/O.
The address in the virtual memory address register corresponds to a virtual processor logical address only if virtual data channel mapping is enabled. This is indicated by bit one in the virtual MMPU status word. Should virtual data channel mapping be disabled, the address is physical. Whether logical or physical, the starting and stopping page addresses for the virtual disk transfer may be computed by
starting page address = virtual memory address register DIV 1024
stopping page address = [virtual memory address register PLUS (disk record sector count TIMES 256)] DIV 1024.
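The two formulas translate directly into integer division. In the Python sketch below, `page_range` and `pages_needed` are hypothetical names; the constants restate the 1024-word page and the 256-word sector implied by the formulas.

```python
PAGE_WORDS = 1024     # words per page
SECTOR_WORDS = 256    # words per sector, as implied by the formula


def page_range(memory_address, sector_count):
    """Starting and stopping page addresses for a virtual disk transfer."""
    start = memory_address // PAGE_WORDS
    stop = (memory_address + sector_count * SECTOR_WORDS) // PAGE_WORDS
    return start, stop


def pages_needed(memory_address, sector_count):
    """All page addresses from the starting page through the stopping page."""
    start, stop = page_range(memory_address, sector_count)
    return list(range(start, stop + 1))
```

Note that, taken literally, the stopping page formula names the following page when a transfer ends exactly on a page boundary: a four sector transfer starting at address 1024 gives pages 1 through 2, although the last word written lies in page 1.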
Assuming these are virtual processor physical pages, the page descriptor for each page indicates if it is realized. If so, a disk record mapping field may be loaded corresponding to the page descriptor. If not, the page fault handler is entered. Upon completion of the page fault handling routine, the updated page descriptor is used to load a disk record mapping field.
If the page fault handler is to be entered, the frames of all realized pages needed in the virtual disk transfer should be entered on the locked list to prevent their replacement. The previous fault instruction should be set to the current instruction to prevent clearing of the locked list on first entry. Whenever a page is swapped in, its frame will also be placed on the locked list to prevent its replacement.
The VMM can determine if the entire contents of a page are to be overwritten by a virtual disk read instruction. In this case, if the page is not realized, the page fault handler does not need to swap in the page after it has obtained a free frame. To do so would be needless since the entire contents of the page will be lost due to the virtual disk transfer.
If the starting memory address corresponds to a virtual processor logical address, then the virtual processor physical address of each page is determined from the virtual data channel map A. If the virtual map indicates a validity violation for a page, this information is loaded into the disk record mapping field. Execution of the disk record by the disk driver will result in a data channel error. This condition is detected by the disk process and a virtual data channel error is simulated. Similarly, if the virtual data channel map indicates write protection and the virtual write protect enable flag is set in the virtual MMPU status word, this information should be loaded into the disk record mapping field.
If the data channel map contains a virtual processor physical page address, the page descriptor is used to determine if it is realized. If so, the disk record mapping field is loaded with the double mapping. Otherwise, the page fault handler is entered. On completion, the updated page descriptor is used to load the double mapping information.
When the disk record is finally completed, the ready bit of the virtual DKP status word is cleared and the record is sent to the disk queuer.
C: The virtual DKP busy flag, done flag, all virtual error flags, and all virtual seek done flags are cleared. Virtual disk transfers in progress continue.
P: Same action as S except that the virtual busy flag is not altered and all virtual error flags are cleared. A read or write operation initiated with a P while the virtual busy flag is zero will not request a virtual interrupt upon I/O completion. This command is normally issued to initiate a seek.
A virtual processor may perform I/O on the virtual card reader only if it is mapped to the real card reader. Virtual card readers are equivalent to real card readers in all respects except for data loss potential. In a real card reader, once a card enters the read station, characters are read every 400 microseconds until the end of card is reached. Characters must be processed at this rate to prevent data loss. Because a program executing on a virtual processor cannot be expected to respond to character input within 400 microseconds, the VMM must buffer the entire card. Hence, virtual card readers do not lose data.
When a virtual processor executes an instruction to pick a card and read the first character, the VMM reads the entire card and stores the contents in a buffer. Subsequent character reads by the virtual processor only require the VMM to fetch the next character from the buffer.
Components of the virtual store referenced by instructions addressing CDR are:
The following actions correspond to transfer and control commands.
DIA (read column): if the VMM is currently buffering a card or if the buffer has been read, then the designated virtual accumulator is loaded with the last component of the character buffer. Otherwise, the buffer pointer is incremented and the contents of the selected component are loaded into the designated virtual accumulator.
DIB (read status): If the VMM is currently buffering a card then the ready bit of the virtual CDR status word is cleared and the status word is loaded into the designated virtual accumulator. Otherwise, the real CDR status word is loaded into both the virtual CDR status word and the designated virtual accumulator.
S: The virtual CDR busy flag is set to one and the done flag is cleared. The buffering flag is set, the buffer pointer is reinitialized, and an attempt is made to read the first character from the real card reader. If the read succeeds, it is loaded into the character buffer after the buffer pointer is incremented. Should the read fail, the buffering indicator is cleared.
C: The virtual CDR busy and done flags are cleared.
P: If the end of buffer has been reached, the virtual CDR busy flag is cleared. In any case, virtual done is set to one. If interrupts are enabled for this device, an interrupt request is pending.
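The buffering scheme described for the virtual card reader can be illustrated with a short Python sketch. It is not taken from the monitor: the class name VirtualCardReader, the method names, and the 80 column card length are assumptions made for illustration, and the P command and interrupt requests are left out.

```python
CARD_COLUMNS = 80  # assumption: a standard 80 column card


class VirtualCardReader:
    """Hypothetical model of the per-card buffer kept by the VMM."""

    def __init__(self):
        self.buffer = [0] * CARD_COLUMNS
        self.pointer = -1        # index of the last column stored or fetched
        self.buffering = False   # True while the VMM is still reading the card
        self.busy = False
        self.done = False

    def start(self, first_column):
        """S command. first_column is None if the real read failed."""
        self.busy, self.done = True, False
        self.buffering = True
        self.pointer = -1
        if first_column is None:
            self.buffering = False
        else:
            self.store(first_column)

    def store(self, column):
        """Store one column read from the real reader; finish at end of card."""
        self.pointer += 1
        self.buffer[self.pointer] = column
        if self.pointer == CARD_COLUMNS - 1:
            self.buffering = False
            self.pointer = -1
            self.busy, self.done = False, True

    def dia(self):
        """DIA (read column) as seen by the virtual processor."""
        if self.buffering or self.pointer == CARD_COLUMNS - 1:
            return self.buffer[CARD_COLUMNS - 1]
        self.pointer += 1
        return self.buffer[self.pointer]
```

Because the whole card is held in the buffer before virtual done is set, the virtual processor may issue its DIA instructions at any pace without losing columns.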
A virtual processor may perform I/O on the virtual line printer only if it is mapped to the real line printer. Components of the virtual store referenced by instructions addressing LPT are:
DOA (load character buffer): the real device character buffer is loaded with the contents of the designated virtual accumulator.
DIA (read status): the designated virtual accumulator is loaded with the contents of the real device status register.
S: The virtual LPT busy flag is set to one and the virtual done flag is cleared. A real start command is issued to the real line printer.
C: The virtual LPT busy and done flags are cleared. A real clear command is issued to the real device.
A virtual processor may perform I/O on the virtual second teletype only if it is mapped to the real teletype (devices TTO and TTI). Components of the virtual store referenced by instructions addressing TTI1 are:
The following actions correspond to transfer and control commands.
DIA (read character buffer): The designated virtual accumulator is loaded with the real TTI character buffer.
S: The virtual TTI1 busy flag is set to one and the virtual done flag is cleared. A real start instruction is issued to the real TTI device.
C: The virtual TTI1 busy and done flags are cleared. A real clear instruction is issued to the real TTI device.
Components of the virtual store referenced by instructions addressing TTO1 are:
The following actions correspond to transfer and control commands.
DOA (load character buffer): The real TTO device character buffer is loaded with the contents of the designated virtual accumulator.
S: The virtual TTO1 busy flag is set to one and the virtual done flag is cleared. A real start command is issued to the real TTO device.
C: The virtual TTO1 busy and done flags are cleared. A real clear command is issued to the real TTO device.
If an auto increment or auto decrement instruction is attempted while a virtual processor is assigned the real processor, the trap process will be awakened. It must first determine if a virtual auto location violation has occurred; such is the case if the virtual MMPU status word indicates that virtual auto protection is enabled. If a virtual violation is not present, the instruction must be interpreted by the VMM. The auto increment/decrement facility is supported if the virtual processor is in supervisor mode (case 1) or if it is in user mode with virtual logical page zero mapped to virtual physical page zero (case 2). If virtual logical page zero is mapped elsewhere (case 3), interpretation is required but the auto facility is not supported.
Interpretation for case one proceeds as follows. Determine the effective data address of the trapping instruction (range 20 to 37 octal) and fetch the contents of this location from the virtual store. If the data address was in the range 20 to 27 (octal), increment the contents just fetched; otherwise decrement them. Store the updated contents back into the realized virtual store. If these contents specify an indirect address, fetch the word they point to. This process ends when the last address in the chain is fetched. Should any link or the final word pointed to not be realized, the page fault handler is entered. The final address fetched is used in the software interpretation of the instruction.
In case two, addresses generated are virtual processor logical addresses. Interpretation is the same as case one except that virtual program maps are used to obtain virtual processor physical addresses. If a virtual logical page is found to be validity protected, a virtual validity violation is simulated. Similarly, if instruction interpretation requires writing on a virtual logical page that is virtual write protected, a virtual write violation is simulated.
Case three is the same as two except that the initial auto increment or decrement is not performed. In all cases, the only instructions which are candidates for interpretation are the memory reference instructions (LDA, STA, JMP, JSR, ISZ, and DSZ).
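The address computation for case one can be sketched in Python as follows. This is an illustration only: the function name resolve_auto_address, the dictionary used as the virtual store, and the convention that the most significant bit of a 16 bit word marks an indirect address are assumptions, and page faults on unrealized links are not modeled. Passing apply_auto=False gives the behaviour described for case three, where the initial increment or decrement is skipped.

```python
INDIRECT_BIT = 0o100000   # assumption: bit 0 (most significant) flags indirection
ADDRESS_MASK = 0o077777
WORD_MASK = 0o177777


def resolve_auto_address(store, location, apply_auto=True):
    """Return the final address reached through an auto location.

    store      -- dict mapping addresses to 16 bit words (stand-in for the virtual store)
    location   -- effective data address of the trapping instruction (20 to 37 octal)
    apply_auto -- False skips the increment or decrement
    """
    if not 0o20 <= location <= 0o37:
        raise ValueError("not an auto increment/decrement location")
    word = store[location]
    if apply_auto:
        delta = 1 if location <= 0o27 else -1   # 20-27 increment, 30-37 decrement
        word = (word + delta) & WORD_MASK
        store[location] = word                  # write the updated contents back
    while word & INDIRECT_BIT:                  # follow the indirect chain to its end
        word = store[word & ADDRESS_MASK]
    return word & ADDRESS_MASK
```

A circular indirect chain would loop forever in this sketch; a real interpreter would need some bound or a way to be interrupted.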
The virtual protection features of a virtual processor are enabled only when the virtual processor is in virtual user mode and protection enable flags in the virtual MMPU status word are set to one. Any protection violation which occurs while the virtual processor is assigned the real processor awakes the trap process. If the trap process determines that virtual protection is enabled, then a virtual protection violation is supported. Virtual violation simulation begins by fetching the address of the virtual violation handler from virtual physical location 47. If this is an indirect address, the chain is followed until the end. Should links not be realized, the page fault handler is entered. The address of the virtual violation handler is loaded into the virtual program counter, the virtual logical address of the violating instruction is placed in the virtual violation address register, the inhibit flag (bit 2) of the virtual MMPU status word is set to one, and the virtual MMPU and CPU busy flags are cleared. The virtual violation data register is loaded with the virtual logical page address of the violation and appropriate violation flags are set.
The trap process determines virtual violations in the following manner:
In the real processor, the following violation combinations are
Explicit traps are simulated in the same fashion except that the address of the trapping instruction is placed in virtual physical location 46 and the virtual violation data register is not altered.
On entry to the trap process, the state of the processor at the time of the trap is saved. State includes accumulators, carry bit, stack and frame pointers, and program counter. During the trap process, components of the virtual store may be altered.
On exit, either the real processor is reassigned to the virtual processor or the dispatcher process is awakened. If the preempt condition is true initially, the dispatcher process is awakened. Otherwise, if a virtual interrupt or stack fault is pending, it is simulated. Should simulation result in a virtual mode switch or page fault, the virtual processor is preempted and the dispatcher process is awakened. Otherwise, the virtual processor regains the real processor.
The dispatcher process is awakened whenever a virtual processor is preempted. Should no virtual processor be in a ready state, the dispatcher holds the processor and awaits an interrupt. Interrupts schedule device processes which may promote virtual processors to the ready state. In this case, return to the dispatcher results in selection of a ready virtual processor and its promotion to running.
Whenever more than one virtual processor is ready to run, the VMM must determine which virtual processor is to be selected and how long it is to run. These considerations characterize the scheduling policy. It is desirable to provide a mechanism through which a wide range of policies may be implemented according to specified parameters. To this end, the dispatcher process provides a round robin variable quantum service discipline. Quanta for the virtual processors are specified by the operator and may be changed at any time. Should a quantum be set to zero, the virtual processor will not be preempted until a page fault or explicit fault occurs. This scheduling mechanism allows policies ranging from equal quantum round robin to priority round robin to first come, first served.
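A round robin discipline with a separate quantum per virtual processor can be sketched in a few lines of Python. The class and method names below are hypothetical and the ready list is modeled as a simple queue; the sketch only shows how the quanta parameterize the policy, not how the monitor is coded.

```python
from collections import deque


class RoundRobinDispatcher:
    """Hypothetical round robin selector with one quantum per virtual processor."""

    def __init__(self, quanta):
        # quanta maps a virtual processor number to its quantum in milliseconds;
        # a quantum of zero means the alarm clock never preempts that processor.
        self.quanta = dict(quanta)
        self.ready = deque(self.quanta)

    def set_quantum(self, vm, quantum):
        """Operator changes a quantum; takes effect at the next selection."""
        self.quanta[vm] = quantum

    def select(self):
        """Return (vm, quantum) for the next ready processor, or None if none is ready."""
        if not self.ready:
            return None
        vm = self.ready.popleft()
        return vm, self.quanta[vm]

    def make_ready(self, vm):
        """Called when a processor is preempted or unblocked."""
        self.ready.append(vm)
```

Giving every processor the same quantum yields plain round robin, unequal quanta favour some processors over others, and a quantum of zero lets a processor run until it faults.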
Once a virtual processor is selected, promotion to running proceeds in several steps. First, virtual processor state is checked for the existence of a pending virtual interrupt or stack fault. If one exists, it is simulated. Should a page fault result, the virtual processor is put in the blocked state and another virtual processor is selected.
The next step is composition of a real address translation map. If the virtual processor is in virtual supervisor mode, a real address map is created based on the frame table. On the other hand, if virtual user mode is present, virtual memory management must be supported. The real address map is based on the virtual program map selected in the virtual MMPU status word and on the page descriptors.
Promotion to running is completed by initializing the alarm clock with the specified quantum, realizing the virtual accumulators, carry bit, stack pointer, and frame pointer, and assigning the real processor to the virtual processor.
The stack fault process is awakened by a real processor stack fault. Should another VMM process be executing, the real processor stack pointer is inspected for possible overflow into virtual processor address space. If so, the condition is communicated to the operator. Should a virtual processor be assigned the real processor when the stack fault occurs, a virtual stack fault pending condition is set by the VMM. On exit from the stack fault process, the dispatcher process is awakened only if the end of the current time slice has been reached. Normally, the stack fault is simulated. Should a virtual mode switch or page fault occur during simulation, the dispatcher process is awakened. If not, the virtual processor regains the real processor.
The multiplexer process is awakened by a real machine interrupt from the multiplexer device. CRTs connected to the multiplexer are potentially mapped to virtual processors, which consider them to be TTI and TTO devices. The multiplexer process first determines if the interrupt corresponds to successful transmission or reception of a character and which CRT line is involved. In the case of transmission, the virtual busy flag of the appropriate virtual TTO is cleared and the virtual TTO done flag is set to one. If the virtual interrupt disable flag for the virtual TTO device is zero, a virtual interrupt is pending.
In the case of a reception, the character received is transferred to the appropriate virtual TTI character buffer. Virtual TTI busy is cleared and virtual TTI done is set to one. A virtual interrupt is pending if the virtual interrupt disable flag of the virtual TTI device is zero.
In both cases, the VMM clears the real machine interrupt condition.
The real time clock process is awakened by a real time clock interrupt at the rate of one thousand times per second. It serves three functions. First, a thirty-two bit system clock is maintained which increments every time the process is awakened. This clock overflows approximately one and a half months after the monitor is started. Second, it simulates virtual real time clocks by decrementing the virtual clock counter of the running virtual processor. Should the counter become zero, it is reset according to the contents of the virtual RTC output buffer. If the virtual RTC busy flag is equal to one, it is cleared and virtual RTC done is set to one. A virtual interrupt for this device is pending if its virtual interrupt disable flag is zero.
Finally, an alarm clock counter is maintained which times a virtual processor's time slice. This clock is initialized by the dispatcher process and decrements as long as it is nonzero. When it becomes zero, the running virtual processor is preempted.
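The overflow figure and the alarm clock behaviour can be checked with a small Python sketch. The names below are illustrative assumptions; only the tick rate, the thirty-two bit width, and the "decrement while nonzero" rule come from the description above.

```python
TICKS_PER_SECOND = 1000
SECONDS_PER_DAY = 86400

# Days until a 32 bit counter incremented 1000 times per second wraps around:
# 2**32 / 1000 / 86400 is roughly 49.7 days, about one and a half months.
OVERFLOW_DAYS = 2**32 / TICKS_PER_SECOND / SECONDS_PER_DAY


class Clocks:
    """Hypothetical holder for the system clock and the alarm clock counter."""

    def __init__(self):
        self.system_clock = 0
        self.alarm = 0

    def start_slice(self, quantum):
        """Dispatcher initializes the alarm clock; zero disables preemption."""
        self.alarm = quantum

    def tick(self):
        """One real time clock interrupt. Returns True when the slice expires."""
        self.system_clock = (self.system_clock + 1) & 0xFFFFFFFF
        if self.alarm > 0:
            self.alarm -= 1
            return self.alarm == 0
        return False
```

Note that an alarm value of zero is never decremented, which matches the rule that a zero quantum leaves the virtual processor running until it faults.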
The disk process is awakened by an interrupt from the real DKP device, signifying the completion of either a seek or read/write disk operation. The disk driver is called to continue the disk operation according to the current disk record, or to signal completion of a disk transfer. Transfer completion requires further action. The disk queuer is called to schedule another disk record if one is waiting and the used record is returned to the record manager.
Further action is required if the last record of a page fault, bootstrap load, or virtual disk I/O sequence is encountered. When a page fault is serviced, the state of the corresponding virtual processor is assigned ready. At the end of the bootstrap load sequence, a message is sent to the operator verifying this condition. Finally, at the end of virtual disk I/O, the contents of the real disk address and sector count register, status register, and memory address register are copied into their virtual counterparts. If the virtual DKP busy flag is equal to one, it is cleared and virtual DKP done is set to one. A virtual interrupt for the virtual DKP device is pending if its interrupt disable flag is equal to zero.
The card reader process is awakened by an interrupt from the real card reader. Interrupts from this device occur only when the VMM is reading a card for a virtual device. The character just read is stored in the selected component of the character buffer after the buffer pointer is incremented. If the end of card is reached, the buffering indicator is cleared, the buffer pointer is reset, virtual CDR busy is cleared, and virtual CDR done is set to one. A virtual interrupt is pending for the virtual CDR device if its interrupt disable flag is set to zero. If the end of card is not reached, an instruction is issued to the real card reader to set up an interrupt when the next character is read.
The line printer process is awakened by an interrupt from the real line printer. Virtual LPT busy is cleared and virtual LPT done is set to one. A virtual interrupt is pending for this device if the virtual LPT disable flag is set to zero.
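Most device processes end with the same three flag updates on the virtual device. A minimal Python sketch of that shared pattern is given here; the VirtualDevice record and the function name complete_io are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualDevice:
    """Hypothetical flag set kept in the virtual store for one device."""
    busy: bool = False
    done: bool = False
    interrupt_disable: bool = False
    interrupt_pending: bool = False


def complete_io(device):
    """Record completion of an I/O operation on a virtual device."""
    device.busy = False
    device.done = True
    # A virtual interrupt becomes pending only if the device is not disabled.
    if not device.interrupt_disable:
        device.interrupt_pending = True
```

The real time clock and disk processes apply these updates only when the virtual busy flag is one; a caller modeling them would test device.busy before calling complete_io.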
These processes are awakened by interrupts from the real TTO and TTI devices. These real devices are potentially mapped to the virtual second teletype (TTO1 and TTI1) of a virtual processor. For each device, virtual busy is cleared and virtual done is set to one. A virtual interrupt is pending if the corresponding virtual device (TTO1 or TTI1) interrupt disable flag is zero.
These processes are awakened by interrupts from the real TTO1 and TTI1 devices. These devices correspond to the real second teletype, which serves as the VMM operator's console. The three basic functions are operator console communication, system generation, and command line interpretation.
Communication with the VMM operator's console is performed through an input and an output buffer. Typing a character on the console keyboard causes an interrupt, which awakens the TTI1 process. The character is echoed and stored in the input buffer. A carriage return specifies the end of line; the input buffer is processed as either a system generation input or a command line.
Output is performed by loading a message in the output buffer and printing the first character. At this point, the keyboard is locked to prevent character echoing in the output line. When a character is printed, an interrupt awakes the TTO1 process, which prints the next character in the buffer. When the last character is printed, the keyboard is unlocked. The TTO1 process is not awakened after the printing of an echo character.
When the virtual machine monitor is started, several parameters need to be specified. These are the number of virtual machines desired, a frame partition size for each virtual machine, and a default quantum. As a response to each query is given, the TTI1 process executes a program which loads system tables. A response is not accepted unless it is considered to be legal. Responses are free format.
Once the system is generated, further inputs are considered to be command lines, of which there are three types. First, there are commands which cause simulation of virtual CPU console functions. Next, there are commands which control allocation of real devices. Finally, there are commands which serve other functions. A command line is interpreted only if it is syntactically and semantically legal. All commands are free format. Complete syntax diagrams are given in Appendix C.
A real processor normally has a front panel which permits various functions to be performed. In order to provide these functions for virtual processors, the commands must be entered at the VMM operator's console. Keywords are used to distinguish commands. Given below is a list of available commands. For each command, its keyword, syntax, and function are given.
The operator must have some means of mapping real devices to virtual devices. Specifically, the line printer, card reader, multiplexer CRTs, second teletype, 6030 diskettes, and 4234 disk cylinders must be allocated. Three commands are used to control device allocation. These are:
Several other commands serve various functions. These are:
Upon entry to a device or stack fault process, the state of the real processor at the time of the interrupt or stack fault is saved. The program counter is recovered from real processor physical location zero. A counter is also incremented which indicates how many levels of interrupts are present.
Upon exit, several possibilities exist. If the stack fault or device process put another VMM process to sleep, it is reawakened. Otherwise, the virtual processor may need to be reassigned the real processor. First, the virtual processor is checked for the preempt condition. If it exists, the dispatcher process is awakened. If it is not preempted, the virtual store is checked for the presence of a pending stack fault or interrupt. Should it be necessary, the stack fault or interrupt is simulated. If simulation results in preemption due to virtual mode switch or a page fault, the dispatcher process is awakened. If the dispatcher is not awakened, the virtual processor regains the real processor.
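The exit decision can be written as a small Python function. This is a sketch of the order of tests only; the function name, the string results, and the simulate callback (which is assumed to return True when simulation causes a virtual mode switch or page fault) are hypothetical.

```python
def choose_exit(interrupted_vmm_process, preempted, fault_or_interrupt_pending, simulate):
    """Decide who receives the real processor when a device or stack fault process exits.

    interrupted_vmm_process    -- True if another VMM process was put to sleep
    preempted                  -- preempt condition of the virtual processor
    fault_or_interrupt_pending -- a virtual stack fault or interrupt is pending
    simulate                   -- callable performing the simulation; returns True
                                  if it ended in a mode switch or page fault
    """
    if interrupted_vmm_process:
        return "resume VMM process"
    if preempted:
        return "dispatcher"
    if fault_or_interrupt_pending and simulate():
        return "dispatcher"
    return "virtual processor"
```

For example, choose_exit(False, False, True, lambda: False) yields "virtual processor": the pending interrupt is simulated and the virtual processor regains the real processor.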
The structure of the virtual machine monitor has been defined in the preceding chapters. If the design goal has been achieved, implementation should proceed in a straightforward fashion.
There are four implementation issues to be considered. First, what programming language is to be used? Second, how are processes to be scheduled? Third, what system tables and variables are needed? Finally, what programs are required to implement the functions of VMM processes?
The virtual machine monitor is implemented in a version of PASCAL developed for the Nova 3/D at the University of Texas at Austin. Several papers describe the compiler and procedures for its use. The compiler translates PASCAL source code into the Nova macro assembly language, producing only innocuous instructions. Nova PASCAL is suitable for systems programming due to the following features:
Several features available in standard PASCAL are omitted in Nova PASCAL for reasons described in . According to , omitted features are
In some cases, it is necessary to load the contents of a PASCAL defined variable into an accumulator before an assembly language instruction using that accumulator is issued. Similarly, it is often necessary to load a PASCAL variable with the result of an assembly language instruction. So that it does not interfere with compiler generated code, this type of assembly code is encapsulated in procedure and function bodies. In this way, the relation between generated and non generated assembly code may be well understood.
All VMM processes are scheduled by the Nova hardware. The address of the entry program for device processes is put in location 000001 (octal). The address of the entry program for the stack fault process is put in location 000003. Finally, the address of the entry program for the trap process is put in location 000047. Whenever an interrupt, stack fault, or trap occurs, the hardware initiates execution of the appropriate entry program.
The following system variables are needed by the programs to be described.
The following system tables are needed. These data structures are arrays.
Programs are resources of processes; several processes may share the same program or may even execute the same program concurrently provided that it is reentrant. All shared programs of the VMM are reentrant. Because the Nova VMM is a subject of research, programs implementing it are likely to be modified. If the VMM is well designed, these changes will concern the means of implementing a function rather than the definition of the function itself. In this discussion, only a description of the function of VMM programs will be given. Program listings should be consulted for an exact implementation specification. As the majority of the code is written in PASCAL and adheres to standard conventions, it should not be difficult to follow.
The following code is written in assembly language.
Device process entry routine: Saves current processor state and calls main program of the device process. Depending on which device process is awakened, PASCAL defined variables may be loaded with different components of real machine state.
Stack fault process entry routine: Saves current processor state and calls main program for stack fault process. This program is the same as the device entry routine except that it has a different entry point.
Device and stack fault process exit routine: Returns processor to another stack fault or device process program, the dispatcher, or a virtual processor. Return to dispatcher requires that the state of the virtual processor be saved. Return to the virtual processor requires calling of a program to test for and simulate virtual stack faults and interrupts.
Trap process entry routine: Saves virtual processor state and loads selected MMPU registers into PASCAL defined variables. Calls the trap process main program.
Trap process exit routine: Returns processor to the dispatcher or a virtual processor. Return to virtual processor requires calling of a program to test for and simulate virtual stack faults and interrupts.
Stack initialization routine: Initializes the real processor stack and frame pointers as well as the interrupt stack. Executed by the initialization process.
The following programs are executed by the initialization process.
Procedure init1: Initializes the lexical table and the reserved word list.
Procedure enterrw: Enters words in the reserved word list.
Procedure init2: Initializes all system variables and tables.
The following programs are executed by the dispatcher process.
PASCAL main program (following DISP label): Selects a ready virtual processor and promotes it to running.
Procedure supermap: Composes a real address translation map in program map A using information in the frame table. Prerequisite to starting a virtual processor in virtual supervisor mode.
Procedure usermap: Composes a real address translation map in program map B using information in a virtual program map and the page descriptors. Prerequisite to starting a virtual processor in virtual user mode.
The following programs are executed by the TTI1 process.
Procedure ptti1: Main program of the TTI1 process. Handles console communication input and calls programs for command line interpretation and system generation.
Procedure legaltoken: Semantic routine which tests legality of input.
Procedure cli: Interprets command lines; calls other programs to interpret most commands.
Function showfunction: Interprets the SHOW command.
Function ownfunction: Interprets the OWN command.
Function bootfunction: Interprets the BOOT command.
Function clifunction: Interprets the virtual console commands and the RESET and ALLOCATE commands.
Procedure sysgen: Initiates the printing of system generation queries and the interpretation of inputs. This program may also be executed by the TTO1 process.
The following programs perform operations on the input and
output buffers. They may be executed by the TTI1 or Trap processes.
Procedure putin: Place a character in the input buffer.
Procedure writec: Place a character in the output buffer.
Procedure writeln: Call writec to place a carriage return and line feed in the output buffer.
Procedure prompt: Call writeln; call writec to place a star in the output buffer.
Procedure writes: Call writec to place a string in the output buffer.
Procedure writen: Call writec to place a numeral in the output buffer.
Procedure sendobuff: Initiate printing of the output buffer.
Procedure nextch: Fetch a character from the input buffer.
The following program may be executed by the TTO1 or Trap processes.
Procedure putch: Waits for TTO1 device to become idle and prints a character.
The following programs are used to scan the input buffer and
fetch tokens. They are executed by the TTI1 process.
Procedure getoken: Fetch the next token from the input buffer.
Procedure getnumb: Fetch a numeral from the input buffer and convert it to a number.
Procedure getid: Fetch an identifier from the input buffer and look it up in the reserved word list.
The following programs comprise the disk subsystem and may be
executed by the TTI1 process, the trap process, and the DKP process.
Procedure diskdriver: Performs disk I/O based on a disk record.
Procedure enterqueue: Places a disk record in the disk record queue.
Procedure scheduledisk: Schedules a disk I/O if the disk is idle.
Procedure getrecord: Obtains a free disk record.
Procedure releaserecord: Frees a disk record.
Function Availrecord: Indicates if any disk records are free.
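How these programs might cooperate can be sketched in Python. The sketch assumes a fixed pool of records and first in, first out queueing, neither of which is stated above; the method names follow the procedure names, while the class DiskSubsystem and the method transfer_complete are hypothetical.

```python
from collections import deque


class DiskSubsystem:
    """Hypothetical model of the disk record pool and the disk record queue."""

    def __init__(self, record_count):
        self.free = deque(range(record_count))  # records are plain integers here
        self.queue = deque()
        self.active = None                      # record currently driving the disk

    def availrecord(self):
        return bool(self.free)

    def getrecord(self):
        return self.free.popleft()

    def releaserecord(self, record):
        self.free.append(record)

    def enterqueue(self, record):
        self.queue.append(record)

    def scheduledisk(self):
        """Start the next queued record if the disk is idle; return the active record."""
        if self.active is None and self.queue:
            self.active = self.queue.popleft()
        return self.active

    def transfer_complete(self):
        """Disk process: free the used record and schedule the next one, if any."""
        self.releaserecord(self.active)
        self.active = None
        return self.scheduledisk()
```

A caller would test availrecord before getrecord, fill in the record, pass it to enterqueue, and then call scheduledisk so that an idle disk starts at once.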
The following programs are used to handle page faults. They
may be executed by any process.
Procedure faulthandler: Generates the disk records for swapping pages in and out. Updates page descriptors. Implements the replacement algorithm.
Procedure clearlocks: Clears the locked list of a virtual machine.
Procedure enterlock: Enters a frame on the locked list of a virtual machine.
Procedure setlock: Prevents clearing of locked list on execution of faulthandler.
The following programs are main programs for device processes.
Procedure ptto1: Handles operator console output communication.
Procedure prtc: Simulates the actions of the virtual real time clocks and handles the end of a time slice. Due to performance considerations, the incrementing of the system clock and the decrementing of the alarm clock are handled by the rtc process entry routine.
Procedure pqty: Simulates the actions of virtual TTO and TTI devices.
Procedure pdkp: Initiates scheduling of the disk, simulates the actions of virtual disks, completes the handling of page faults and bootstrap loading, and continues incomplete disk I/O.
Procedure plpt: Simulates the actions of the virtual line printer.
Procedure pcdr: Simulates the actions of the virtual card reader.
Procedure ptto: Simulates the actions of the virtual TTO1 device.
Procedure ptti: Simulates the actions of the virtual TTI1 device.
The following programs are executed by the trap process.
Procedure ptrap: Main program of the trap process. Initiates the handling of page faults detected by validity violations, interprets sensitive instructions, simulates the auto location facility, and simulates virtual traps. Other programs are called to implement some of these functions after ptrap has determined what to do.
Procedure simultrap: Simulates a virtual trap due to a virtual protection violation or a virtual explicit trap.
Procedure simulauto: Simulates the auto increment/decrement facility.
Procedure simulmri: Simulates the execution of memory reference instructions.
The following programs may be executed by any process.
Procedure savst: The state of a virtual processor is saved in its virtual store.
Procedure isfts: The name of this procedure stands for Interrupt Stack Fault Test Simulate. Tests for a pending interrupt or stack fault and simulates if required.
Systems programs, which require a basic machine interface, may be booted in and run by means of RDOS (real time disk operating system) facilities. The steps for running a systems program are:
MASTER DEVICE RELEASED FILENAME? <filename>.SV
PARTITION IN USE TYPE C TO CONTINUE
It is possible that RDOS will refuse to boot in the save file
and will complain that an overlay file (<filename>.OL) does not
exist. This is known to occur if the save file specifies a
value for location five or if no values are specified in the
first 256 locations. Programs of this nature should be avoided
due to this RDOS error.
The following dialogue exemplifies the use of the UT virtual machine
monitor. Upper case letters designate output from the computer. Lower
case letters designate input by a human operator. Comments which
explain the example dialogue are enclosed in slashes.
The keyword of each command is identified by the first four letters
of the command.
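The four letter rule can be illustrated with a short Python sketch. The function names are hypothetical, and folding input to lower case is an assumption made so that typed commands compare equal to keywords written in either case.

```python
KEYWORD_LENGTH = 4


def keyword(command_word):
    """Return the part of a command word that identifies the command."""
    return command_word[:KEYWORD_LENGTH].lower()


def same_command(first, second):
    """True if two spellings name the same command under the four letter rule."""
    return keyword(first) == keyword(second)
```

Under this rule "allo" and "allocate" name the same command, while "rege" and "regd" remain distinct because they differ within the first four letters.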
/ RDOS dialogue to start system at real machine first teletype (TTY) /
boot dp0
MASTER DEVICE RELEASED FILENAME ? vmm
PARTITION IN USE - TYPE C TO CONTINUE CONTINUE
/ set data switches to 000002 , hit RESET, hit RUN; the following dialogue occurs on the real machine second teletype (CRT) /
U.T. AUSTIN NOVA 3/D VIRTUAL MACHINE MONITOR VERSION 1.0
SYSTEM GENERATION - ENTER DECIMAL VALUES
NUMBER OF VIRTUAL MACHINES ? 4
/ four virtual machines are created /
QUANTUM (MS) ? 50
FRAME ALLOCATION
40 FRAMES LEFT
ALLOCATION FOR VM 0 ? 15
25 FRAMES LEFT
ALLOCATION FOR VM 1 ? 10
15 FRAMES LEFT
ALLOCATION FOR VM 2 ? 8
7 FRAMES LEFT
ALLOCATION FOR VM 3 ? 7
/ system generation is completed; the monitor now prompts the operator /
*boot 0 15
/ the virtual store of vm 0 is booted into the system; a message signals completion /
BOOT COMPLETE
*boot 1 10 BOOT COMPLETE
/ device allocation /
*allo 0 qty 0
*allo 1 qty 1
*allo 5 qty 2 / a reference to a non existing vm is an error / ERROR
*allo 0 dkp0 100 / allocate 100 4234 cylinders to vm 0 /
*own qty 0 0
*own qty 1 1
*own qty 2 NOT ALLOC
*own dkp0 0 100
*own dkp0 1 0
*allo 1 lpt
*own lpt 1
*release 1 lpt
*own lpt NOT ALLOC
/ the following commands will start vm 0 at location 64 /
*load 0 000064
*read 0 000064
*octal
*read 0 000100
*reset
*status 0 TERMINATED
*start 0
*status 0 RUNNING
*stop 0
*status 0 TERMINATED
/ the registers of vm 0 are examined /
*rege 0 ac0 000000
*rege 0 ac1 000001
*rege 0 sp 000200
/ registers may be altered /
*load 0 000007
*regd 0 ac0
*rege 0 ac0 000007
/ the VMM is terminated by hitting the STOP switch on the front panel /