Host Configuration: Workstation Administration
Walter C. Wong
Computing Services
January 7, 1993
1.1 Initial Software Requirements Specification
The Initial Software Requirements Specification (ISRS) is the first step in the requirements analysis process. The ISRS lists the requirements of a particular Computer System Item (CSI), including functional, performance, and security requirements. The ISRS also provides usage scenarios from a user, an operational and an administrative perspective.
The ISRS discusses implementation detail only if it contributes to feasibility or cost.
This document will be identified as Host Configuration: Workstation Administration Initial Software Requirements Specification. Future references to the ISRS refer to this document unless otherwise noted. The CSI referred to in this document, unless otherwise stated, is Host Configuration: Workstation Administration.
The Host Configuration project addresses issues of workstation management. In defining the requirements for Host Configuration, three separate aspects are discussed:
- Performance Management - Performance Management provides the tools necessary to benchmark the operating system and commonly used applications. This is necessary for capacity planning. For example, do we need more machines now or can we wait a year? Performance Management is also useful for testing purposes. If a change is made to a kernel configuration variable, can we test this change in a controlled environment rather than a production environment to see whether there is a significant improvement?
- Docking - Docking addresses mobile computing as well as availability. Docking integrates services normally available only in a network-based, distributed environment with portable computers. Since many of the issues overlap, docking will also address how to minimize the disruption of network or server failure.
- Workstation Administration - Workstation Administration recognizes the increasing popularity and increasing number of workstations. This includes UNIX, Macintosh or PC(1) systems. As a result, the task of maintaining the configuration of each individual workstation becomes an increasingly difficult and time-consuming chore. Not only do system administrators have to deal with more and more machines, there also tend to be many machines from different vendors, with each vendor having different management tools and practices. Finally, workstation owners are demanding custom configurations, often tailored to their specific needs. Workstation Administration concentrates on how to install, distribute, customize and maintain the software and files that are to reside on the local disk of the workstation.
Workstation administration emphasizes the distributed management of system software, that is, the operating system software and the software that the workstation needs to provide any specific service. This document does not explicitly concentrate on the management of third-party and locally developed software. Tools such as depot [COLYE92A] can be used to address this issue. [COLYE92B] describes the way depot is used in an AFS environment to manage the software environment.
This project does not discuss the issue of backing up the local disk of individual workstations. The reader of this ISRS should assume that there is no important data on the workstation or an external backup mechanism is available. Backup issues are addressed by the Andrew II Backup and Archiving Project.
Another aspect that is not directly addressed is account management. Password file distribution and configuration will be treated as any other file in the CSI. The actual process of building a central password file is outside the scope of this document.
This document will only discuss Workstation Administration. At this date, a formal discussion of docking and performance management is somewhat premature. These two aspects do not form a CSI by themselves. Rather, they are concepts and guidelines that are to be introduced and integrated with a CSI after an initial functional requirement analysis.
Please feel free to contact the author with any comments or criticisms. The author can be reached at the following address:
Walter C. Wong
UCC 179
Computing Services
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3890
Internet: wcw+@cmu.edu
Phone: (412) 268-8514
FAX: (412) 268-4987
September 24, 1992 - First public release
December 7, 1992 - First reviewed revision
The author would like to thank the following people for their helpful comments on the document: Bill Arms, Wallace Colyer, Tracy Futhey, Alan Hastings, and John Lerchey. Many thanks also go to Julie Jenson for proofreading and designing the format of this document.
2.1 Reference Documents
The following documents form a part of this document:
[COOPE92] Cooper, Michael A. "Overhauling rdist for the '90s." LISA VI Proceedings. October 1992. pp. 175-188.
[OSF91A] Distributed Management Environment: Rationale. Open Software Foundation. October, 1991.
[OSF91B] Distributed Management Environment. Slides from the DME presentation. 1991.
[HOWEL92] Howell, Paul; Killey, Paul, and Kuno, Harumi. Client Syncing of Local File Space. Computer Aided Engineering Network, University of Michigan. 1992. DRAFT.
[ROSEN92] Rosenstein, Mark, and Peisach, Ezra. "Mkserv - Workstation Customization and Privatization," LISA VI Proceedings. 1992. pp. 89-95.
[SMITH89] Smith, Susanne, and Quarterman, John S. "White Paper on System Administration for IEEE 1003.7," ;login:. Volume 14, no. 4. July/August 1989. pp. 17-23.
[SHAFE85] Shafer, Steven, and Thompson, Mary. The SUP Software Upgrade Protocol. School of Computer Science, Carnegie Mellon University. 1985.
[WONG92] Wong, Walter C. The Andrew Workstation. Computing Services, Carnegie Mellon University. 1992.
The following provide some background or additional material that is relevant to this document:
[COLYE92A] Colyer, Wallace; Held, Mark; Markley, David, and Wong, Walter. "Software Management in the Andrew System." AFS User's Group Proceedings. July, 1992.
[COLYE92B] Colyer, Wallace, and Wong, Walter. "Depot: A Tool for Software Management," LISA VI Proceedings. 1992. pp. 151-160.
[FURLA91] Furlani, John. "Modules: Providing a Flexible User Interface." LISA V Proceedings. 1991. pp. 141-152.
[OSF92] Distributed Computing Environment: An Overview. Open Software Foundation. January 1992.
[OSF91C] File Systems in a Distributed Computing Environment: A White Paper. Open Software Foundation. July 1991.
[HAYES92] Hayes, Frank. "The System Administration Squeeze," UnixWorld. October, 1992. pp. 67-70.
[KOHL92] Kohl, John T.; Neuman, B. Clifford, and Ts'o, Theodore Y. The Evolution of the Kerberos Authentication Service. August 1992. Available via anonymous ftp from aeneas.mit.edu in /pub/kerberos/doc.
[SATYA85] Satyanarayanan, M.; Howard, J. H.; Nichols, D. A.; Sidebotham, R. N., and Spector, A. Z. "The ITC Distributed File System: Principles and Design." Proceedings of the 10th ACM Symposium on Operating Systems Principles. 1985.
[ZAYAS88] Zayas, Edward, and Everhart, Craig. Design and Specification of the Cellular Andrew Environment. ITC Technical Report CMU-ITC-070. August 2, 1988.
The following documents were used to create the template for the ISRS:
Software Requirements Specification, Data Item Description 08. Office of Safety, Reliability, Maintainability and Quality Assurance. NASA. Version 3.0, October 15, 1986.
IEEE Guide to Software Requirements Specifications. IEEE Std 830-1984. Institute of Electrical and Electronics Engineers. February 10, 1984.
3.1 The Problem
The problem of distributed workstation administration is not just a problem of numbers. While large numbers of workstations do make administration difficult, many other issues complicate the process. For example, many machines are often owned by people who simply want them to work but do not want to spend the resources in order to keep them working. In this case, workstation maintenance may be thrust upon graduate students or their commercial counterparts who may lack the technical expertise to manage the machines. As a result, the bulk of the work may be dropped upon some unlucky individual or organization with the technical expertise, or the work may simply be dropped. The situation, then, deteriorates slowly until immediate assistance is required. The insidious aspect of system administration is that the system can appear to be functioning properly, but, unless the proper tasks are done, one small event (disk crash, break-in, etc.) can lead to catastrophic failure.
To further complicate matters, individual vendors have begun to offer their own system administration tools. Examples of these tools include SCAMP and FullSail from DEC, netinfo from NeXT, and SMIT from IBM. Currently, these systems do not interoperate with each other or provide a common interface to their functions. Many of these tools come with unreasonable limits, such as a cap on the maximum number of machines or a requirement that all system management operations go through the tool. Thus, most of these tools cannot interoperate with the existing management tools that large sites already use.
The Andrew system already has a working workstation administration system, which will be described in more detail in section 4.1. It is conceivable that the current system could continue its forced evolution and continue to function for many more years. However, there are some significant shortcomings with the system that make it worthwhile to start fresh and examine what was done right and what should be fixed. Furthermore, there is a growing concern that the future direction of system administration tools and facilities will conflict with the Andrew system. This document and this project should help draw attention to our environment and, hopefully, head off any detrimental standards or practices.
The ultimate goal at the completion of the CSI is to have a system that allows us to manage the system software and files of a potentially infinite number of workstations with a relatively small technical staff. The technical staff would provide a default workstation configuration that acts as a foundation on which further customization can occur. Each client machine would be configured to optimally use the resources available and customized to meet its owner's needs. At the same time, the system staff would be able to upgrade and update software transparently to the user.
While it would be ideal if each user could reconfigure their environment dynamically, it does not seem to be feasible given the technology widely available. For example, it would be desirable for two users, using the same machine at the same time, to have environments tailored to exactly what they wanted. Although John Furlani's Modules system [FURLA91] provides a mechanism to accomplish some of this, it would be rather awkward to integrate it with our software management system. At this time, reconfiguring workstations with different software packages appears to be the most practical approach.
Figure 1 The Layered Approach

It is believed that the current Andrew model can achieve this goal, to a certain extent. This model can be described in two ways. The first approach is to describe the model as layered, as shown in Figure 1. The core functionality of the system is provided by a small, technical, central staff. They also provide general and commonly used services. The next layer is the departmental administrators. Departments and organizations may have special configurations and configurations that span their entire computing community. This layer allows them to make custom changes without having to duplicate the work already done by the central staff. The final layer is the individual workstation. This allows for machine specific changes to be incorporated into the management system.
The Andrew model can also be thought of as a pipeline through which software flows, as illustrated in Figure 2. The central administration provides the main software feed. This feed is taken by the departmental administrators and modified or combined with any additional departmental services and software and then passed on to the local workstation. The local workstation then has the opportunity to make its own customization and to incorporate software and services that are specific to it.
Figure 2 The Administrative Pipeline

The model does not constrain the feeds and layers to only central, departmental, and local. Any number of feeds or layers can be introduced after the central layer. For example, in our environment, an additional college tier could be added between the central administration and the departmental administration. The central layer is presumed to always be there, as there is probably some organization that will be providing the base systems. A local layer may or may not exist, depending on whether any workstation-specific changes are required.
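The layered and pipeline views can both be read as an ordered merge of feeds, where each later layer may add to, override, or withdraw what an earlier layer supplied. The following is a minimal sketch of that idea, not a description of the existing Andrew tools. The feed representation, the function name apply_layers, and all paths are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical representation: a feed maps a local path to the source that
# should populate it. A value of None means "withdraw this path".
Feed = Dict[str, Optional[str]]


def apply_layers(layers: List[Feed]) -> Dict[str, str]:
    """Merge feeds in order; later layers override or withdraw earlier ones."""
    result: Dict[str, str] = {}
    for layer in layers:
        for path, source in layer.items():
            if source is None:
                result.pop(path, None)   # a later layer removes the file
            else:
                result[path] = source    # a later layer adds or overrides
    return result


# Illustrative feeds only; none of these paths come from a real configuration.
central = {
    "/etc/rc": "central/etc/rc",
    "/bin/login": "central/bin/login",
    "/etc/motd": "central/etc/motd",
}
department = {
    "/etc/motd": "dept/etc/motd",          # departmental override
    "/etc/printcap": "dept/etc/printcap",  # departmental addition
}
local = {
    "/etc/fstab": "local/etc/fstab",       # machine-specific addition
    "/etc/printcap": None,                 # this machine has no printer
}

final_config = apply_layers([central, department, local])
```

Because the merge takes a list, an extra tier such as a college layer would simply be one more element between the central and departmental feeds.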
The Andrew model provides a default management environment where software is automatically installed and updated. This environment also has the flexibility to allow departmental or local administrators to customize that environment. Those with technical resources can use those resources to fine-tune the environment rather than build a custom environment from scratch to match their needs.
The current Andrew model has been built around a strong central administration, and individual workstations either use the system or do not use it. However, the model doesn't preclude the ability to lessen the dependence on the central administration. The model allows the possibility for a departmental or local administrator to merge the output of two central administrative feeds. In general, it is doubtful that this will be necessary.
The broadest goal of the CSI is to provide a system that performs and allows for:
- Installation
- Distribution
- Modification
- Purification
- Deletion
of software and files on the local disks of workstations in a manner that is:
- Simple
- Configurable
- Customizable
- Flexible
- Scalable
- Maintainable
These topics will be discussed in more detail in the Functional Requirements, in section 6.1.
Because of the variety of computer systems and implementation philosophies, it is very possible that a single solution is not available or even ideal. If this is determined to be the case, then a CSI specific to the implementation flavor will be chosen or developed.
The following sections provide a quick overview of several known systems for workstation administration. It is strongly recommended that the reader review the references and not rely solely on the summaries. The author of this document has first-hand experience with only a few of the systems discussed in this section.
[WONG92]
The Andrew System includes approximately 600 workstations at Carnegie Mellon University. These workstations encompass individual workstations for faculty and staff, publicly available machines in computer clusters, and service workstations that perform a variety of tasks such as providing UNIX access or running the mail and backup systems. This management system uses a set of tools, package and mpp, for maintaining the workstations. The system is commonly referred to as just package.
Package, by itself, is very simple. Package ensures that files and directories on the local workstation match those on AFS [SATYA85], a distributed file system present on all Andrew workstations. If they do not match, package replaces the local file with the one from AFS. If the file does not belong then package will remove that file.
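The behavior described above amounts to a one-way reconciliation of the local disk against a master copy. The sketch below illustrates that idea only; it is not package's implementation. File trees are modeled as in-memory dictionaries rather than real AFS or local paths, and the function name reconcile and the action labels are assumptions.

```python
from typing import Dict, List, Tuple


def reconcile(master: Dict[str, bytes],
              local: Dict[str, bytes]) -> List[Tuple[str, str]]:
    """Return the actions needed to make `local` match `master`."""
    actions: List[Tuple[str, str]] = []
    for path in sorted(master):
        if path not in local:
            actions.append(("install", path))   # missing on the local disk
        elif local[path] != master[path]:
            actions.append(("replace", path))   # differs from the master copy
    for path in sorted(local):
        if path not in master:
            actions.append(("remove", path))    # does not belong
    return actions


# Illustrative data only.
master_tree = {"/etc/rc": b"v2", "/bin/login": b"x", "/etc/motd": b"hello"}
local_tree = {"/etc/rc": b"v1", "/bin/login": b"x", "/usr/bin/stray": b"?"}

plan = reconcile(master_tree, local_tree)
```

A real tool would also compare modes, owners, and link targets; the sketch compares contents only.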
Without mpp, it would be quite difficult for customization to occur. Mpp is a macro preprocessor that creates the configuration files that package uses. Depending on what variables are defined, mpp will generate different package files from the same configuration files.
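The role of the preprocessor can be illustrated with a small sketch in which each configuration entry is guarded by an optional variable, so the same source yields different output depending on what is defined. This does not reproduce mpp's actual syntax or the package file format. The entry strings, the variable names, and the function generate are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical entry: (variable that must be defined, or None for "always",
# line to emit into the generated file).
Entry = Tuple[Optional[str], str]


def generate(entries: List[Entry], defined: Set[str]) -> List[str]:
    """Emit the lines whose guard variable is defined (or that have no guard)."""
    return [line for guard, line in entries
            if guard is None or guard in defined]


# Illustrative shared configuration; the "install ..." strings are placeholders.
shared_config: List[Entry] = [
    (None, "install /etc/rc"),
    ("color_display", "install /usr/bin/X11/Xcolor"),
    ("mono_display", "install /usr/bin/X11/Xmono"),
    ("mail_server", "install /etc/rc.local.mail"),
]

desktop_lines = generate(shared_config, {"color_display"})
server_lines = generate(shared_config, {"mono_display", "mail_server"})
```

Adding a predefined service to a workstation then comes down to defining one more variable for that machine.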
The benefits of the system include:
- Scalability - After spending the time to set up a system type (e.g. Ultrix/RISC 4.2A), there is very little incremental cost of adding additional machines of that type.
- Auto-configuration - The system is able to use existing operating system tools to determine the setup of the workstation and automatically choose the right files. For example, the DEC tool sizer can tell package which graphics card is present and thus which X server should be used.
- Limited Distributed Administration - The structure is set up so that departments or workstation owners can change their configuration without having to go through a central administrative facility.
- Configuration Libraries - Common services can be added by just adding a single line to the workstation configuration file.
- Purification - package can remove files that do not belong or replace files that do not match the copy on the filesystem.
- Disk Control - Andrew workstations were originally intended to have small local disks, with most of the files on a distributed filesystem. As a result of falling disk prices, it is now feasible to put more data on the local disk and have fewer (or no) files on the distributed filesystem.
The main drawbacks of package are:
- High Start-up Cost - The process of creating the base configuration files for each operating system is a task that is non-trivial and cannot be completely automated.
- Limited Distribution Mechanisms - The only way to update clients is via AFS, a distributed filesystem. As a result of this, there is no easy way to limit distribution to specific workstations unless package obtains authentication to AFS.
- Better Service Dependency System - Special care is required in modifying the configuration file such that services that require other services actually get those services.
- Complexity - While it is simple to add predefined services, adding new services and negotiating the maze of configuration files can be quite difficult.
- No File Modification - The system can only copy, link, or remove files. There is no way to actually modify a configuration file. This is somewhat problematic for files like crontab and inetd.conf.
- Since package is file based, the implementation of the customization mechanisms is also based on copying and removing files. For example, services that are to be started at boot time create an /etc/rc.local.SERVICE_NAME. The base /etc/rc then runs all rc.local.* files it finds in /etc.
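The rc.local.SERVICE_NAME convention means that enabling or disabling a service at boot is purely a matter of a file being present or absent. A rough sketch of the selection step follows. It works on a plain list of file names rather than a real /etc. The function name boot_scripts and the sorted run order are assumptions, since the text does not say in what order the base /etc/rc runs the files.

```python
import fnmatch
from typing import List


def boot_scripts(etc_listing: List[str]) -> List[str]:
    """Pick out the per-service boot files from a listing of /etc."""
    # Sorted order is an assumption that mirrors how a shell expands a glob.
    return sorted(name for name in etc_listing
                  if fnmatch.fnmatchcase(name, "rc.local.*"))


# Illustrative listing; the service names are hypothetical.
listing = ["rc", "rc.local", "rc.local.mail", "passwd", "rc.local.afs"]
to_run = boot_scripts(listing)

# Removing the file is all it takes to disable the service at the next boot.
listing.remove("rc.local.mail")
to_run_after_removal = boot_scripts(listing)
```

Note that a plain rc.local does not match the pattern, because the pattern requires a service suffix after the final dot.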
The Andrew System also tries to maintain a clear separation between the operating system and local and third party software. Local modifications to the operating system, such as to programs like /bin/login, are copied from /usr/local to /bin by package. All third party and local software are kept in /usr/local which is managed by a tool called depot [COLYE92A]. The software management process using depot and other tools is described by [COLYE92B].
[SHAFE85]
SUP is used by the School of Computer Science and the Department of Electrical and Computer Engineering at Carnegie Mellon to maintain over 1,000 workstations. As with many of these packages, the purpose of SUP is to ensure that multiple machines will automatically have the same software base.
SUP separates software into different collections. Each collection is a logical "unit" of software. For example, all the files and directories needed for gnu-emacs would form the gnu-emacs collection. To distribute the collections, SUP takes a client/server approach. For each collection, there is a SUP server that exports it.
Some benefits of SUP include:
- The distribution of collections is not tied into a filesystem.
- "Private" collections can be distributed through the use of passwords.
- The data being distributed can be encrypted.
- Notification facilities are available; it can mail the log file generated for a specific collection.
The limitations of SUP are:
- It is necessary to know where the SUP server is for a collection.
- Collections cannot be customized; to obtain a collection from a SUP server, one has to take the collection exactly as it is.
- It is unclear what happens when conflicts occur. For example, what would happen if two collections both have the file /usr/misc/foo? Would one collection's /usr/misc/foo always overwrite the other collection's file?
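The conflict question raised in the last item can at least be detected before any files are transferred. The sketch below is an illustration only and says nothing about how SUP itself behaves. Collections are modeled as sets of paths, and the function name find_conflicts and the second collection name are hypothetical.

```python
from typing import Dict, List, Set


def find_conflicts(collections: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each path claimed by more than one collection to its claimants."""
    owners: Dict[str, List[str]] = {}
    for name in sorted(collections):
        for path in collections[name]:
            owners.setdefault(path, []).append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}


# Illustrative collections; only gnu-emacs and /usr/misc/foo come from the text.
collections = {
    "gnu-emacs": {"/usr/misc/emacs", "/usr/misc/foo"},
    "misc-tools": {"/usr/misc/foo", "/usr/misc/bar"},
}

conflicts = find_conflicts(collections)
```

A management system could refuse to proceed when this result is non-empty, or apply an explicit precedence rule instead of leaving the outcome to transfer order.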
[ROSEN92]
mkserv maintains over 1,000 workstations in the Athena environment at MIT. Athena recognizes that managing many machines that are identical is not that hard a task. The difficult task is trying to manage multiple machines and allow customization to occur.
mkserv uses a configuration file to determine what services a particular machine should have. After mkserv determines what services are on the machine, all of the files required for the service to function are copied to the local disk.
The benefits of mkserv include:
- It checks whether it can run to completion before it starts. At present the check is somewhat simplistic: it only verifies that the fileservers are up.
- It can easily define dependencies and ensure that the necessary files are included.
- It has an External Command Feature - by virtue of calling shell scripts, mkserv can run other commands as part of its normal operation.
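Dependency handling of the kind listed above can be sketched as a depth-first expansion of the requested services, so that everything a service requires is included and placed ahead of it. This is not mkserv's algorithm or configuration format. The service names and the function resolve are assumptions made for illustration.

```python
from typing import Dict, List, Set


def resolve(requested: List[str], requires: Dict[str, List[str]]) -> List[str]:
    """Return requested services plus their dependencies, dependencies first."""
    ordered: List[str] = []
    in_progress: Set[str] = set()

    def visit(service: str) -> None:
        if service in ordered:
            return
        if service in in_progress:
            raise ValueError(f"circular dependency involving {service}")
        in_progress.add(service)
        for dependency in requires.get(service, []):
            visit(dependency)
        in_progress.discard(service)
        ordered.append(service)

    for service in requested:
        visit(service)
    return ordered


# Hypothetical service definitions.
requires = {
    "mailhub": ["kerberos", "syslog"],
    "kerberos": ["timesync"],
}

install_order = resolve(["mailhub"], requires)
```

The same expansion would address the service dependency drawback noted for package, where the administrator must add required services by hand.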
Some limitations of mkserv are:
- All services are copied to the local disk. In principle this is good, but workstation owners with smaller disks may wish to sacrifice availability to reclaim the disk space.
- mkserv uses the filesystem as the method of distribution.
- Maintenance of configuration files may become a problem. There are many configuration files for each service.
[HOWEL92]
Synctree is used at the University of Michigan's College of Engineering, specifically the Computer Aided Engineering Network, to manage over 1,000 workstations. The management paradigm does not differ much from the other packages; there is a master copy of software on a distributed filesystem and synctree makes sure that what is on the local disk matches the master copy. It is believed that this synctree is different from the one used by mkserv.
The master copy on AFS for synctree is separated into different sections known as templates. There is a base template, called GENERIC, that loads the basic operating system for any given system type, such as SunOS 4.1.1 for Sparcstations. Since all workstations are not alike, the next level creates a "class" distinction for each workstation. Each workstation can belong to several classes with one class having precedence over the others. The classes give a group identity and thus certain group characteristics to the workstations. For example, a workstation could have the class of being a STAFF workstation or a LAB workstation. In addition, there is the BETA class that uses software being tested. Finally, there is the workstation specific template directory. That directory contains files that are specific to the workstation, such as /etc/fstab.
The benefits of synctree include:
- Few Configuration Files - Instead of using configuration files to specify what files or directories belong on the local disk, synctree uses the directory structure to create the list of what should be on the workstation.
- External Command Feature - If a specified file changes, a binary or shell script can be automatically run.
- Copy Flexibility - The maintainer can decide to copy files from or link files to the template area
Some drawbacks of synctree appear to be:
- Distribution is based on a filesystem.
- Service definitions are not ideal. The local workstation configuration file includes a complete list of services, both what is desired on the machine and what isn't. How are new services propagated to the file? Can additional services be added by more templates?
- Template configuration options are all file based. No group distinction is available. For example, if one wanted all of an application copied, one would have to specify each file in that application.
[OSF91A], [OSF91B]
The Open Software Foundation's Distributed Management Environment strives to unify network and system administration. Although the DME is still under development, if it were adopted by all vendors it would provide a consistent management environment across multiple platforms.
The DME has the potential to create a standard mechanism for initial software installation. This is the case when one has just purchased a workstation and there is nothing on the disk. Right now, each vendor has their own installation procedure and many management tools, such as package, require that the vendor's installation tools are used until the point that the local management can take over. Often this transition is not very smooth and requires modification to the vendor's installation system. Other times, the management system is restricted by the initial load system. For instance, while SunOS offers an initial network installation procedure, it makes it difficult to do more than one installation at a time.
Some architectural problems of the DME deserve some attention. First, there is the sheer bulk of the system. It is quite possible that in order to use the DME, it is necessary for one to have all the supporting infrastructure, such as the DCE. This may significantly limit the number of machines that can use it, as well as greatly increase the time and resources required to install and use the DME. Secondly, the OSF process makes the DME more vulnerable to the problems that occur when anything is designed by committee: a product that can do everything, but can do nothing well.
Two aspects of the DME apply to this document: the "object-oriented" management environment and the software distribution and installation component. The only references that are currently available are more marketing than technical. As a result, the two following sections are mostly summaries of the two references listed above.
The management model for tracking and maintaining the workstations in the management environment takes an object oriented approach. The OSF defines an object in this context as "the consolidation of data and operations into one entity - a managed object - which represents the resource or service to be managed." Thus, all management operations are done through communications with the management objects.
The implementation of this system takes a three-tiered approach. The bottom tier is the individual node or workstation. This level allows for any individual customization and configuration that may occur. It is also designed to make the DME useful for smaller sites. The second tier is the "cell" level. As with DFS [OSF92], this is a management abstraction to allow different organizations to have independent management domains for their group of workstations. This will allow management operations to be made available to entire groups of machines. The final tier is the "enterprise" level. This tier does not access the individual workstations but rather sends management operations to the cell.
The information included below is an excerpt of the selection rationale for the software distribution and installation component:
- [the selected tool] provides the flexibility needed to support different software distribution policies. Software products can be administered from any location. This allows software product installation to be pushed from a software depot to target systems, or pulled to a local system from a depot system.
- Implementation... provides a distributed service. As a result, administrative tasks are not bound to dedicated systems, and services can be allocated to several systems - for example, multiple software depot systems can reside in a network.
- Information about installed software products resides only on the target systems and can be queried on demand. This provides consistency of management information, which should be co-located with the resource to which it refers.
- The managing system can act primarily as the supervisor and initiator of software management tasks. It is the responsibility of the target systems to determine the parts of the software products that need to be transferred, depending on information about installed software.
- The approach for installing operating system software is well designed. The [system] supports the notion of critical file sets for kernel building, which are loaded first, and supports the rebooting of systems.
Workstation administration software for the Macintosh and the PC lags behind the tools available for UNIX systems. Macintosh systems mostly follow the "push" mechanism and are oriented more toward network monitoring of Macintosh systems than toward a full-blown software distribution mechanism. However, there appears to be a version of rdist for the Macintosh, known as revRDIST. Furthermore, more and more file synchronization tools are being written as a result of the Macintosh portable computers. This may lead to better workstation administration tools.
The only PC system found at this point is a product that has yet to be released. This program does software distribution based on `cloning.' That is, it allows for a single configuration to be identically copied to multiple machines.
Investigations into the Macintosh/PC world are continuing as more systems appear and this document will be updated to reflect these changes.
setld from DEC is not a workstation administration tool in the true sense. It is more of a software distribution and installation mechanism. It is included in order to point out some potential pitfalls.
Software using setld is compartmentalized into different software subsets. Each subset consists of the following items:
- a compressed tar file - this contains the actual data
- a control file - this lists information such as dependencies, the subset name, etc.
- an inventory file - this provides a verbose list of all the files and the directories where they belong, the modes and all other "important" information
- a script file - this allows for commands to be run before and after the installation process.
The system is capable of installing, removing, verifying and listing the software on the system. Three drawbacks exist in the system. First, it is quite complex and the documentation for generating the subsets is not very clear. Second, the script files are overused because it is much easier to do the work in the script than elsewhere. For example, dependency checks are often done by the script rather than by the provided mechanism. Finally, the philosophy behind setld is geared towards singly managed systems. That is, it is expected that one would run setld on each client. As a result, it is very difficult to use setld to install software for a group of machines unless they are all identical, since the scripts run by setld may configure the software packages for the workstation that setld is running on.
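The second drawback suggests that dependency information is better kept as data that the tool can inspect than as logic buried in a script. The following sketch models a subset with the control and inventory information described above and performs the dependency check declaratively. It does not reflect setld's actual file formats. The class Subset, its field names, and the subset names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Subset:
    """Hypothetical model of one software subset."""
    name: str                                              # from the control file
    depends_on: List[str] = field(default_factory=list)   # from the control file
    inventory: List[str] = field(default_factory=list)    # files it installs


def missing_dependencies(subset: Subset, installed: Set[str]) -> List[str]:
    """List the declared dependencies that are not yet installed."""
    return [name for name in subset.depends_on if name not in installed]


# Illustrative subsets only.
editor = Subset(
    name="EDITOR",
    depends_on=["BASE", "TERMLIB"],
    inventory=["/usr/bin/editor", "/usr/lib/editor/help"],
)

blocked_by = missing_dependencies(editor, installed={"BASE"})
ready = missing_dependencies(editor, installed={"BASE", "TERMLIB"})
```

Because the check needs only the control data, a server could evaluate it for many clients without running any per-machine script.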
[COOPE92]
rdist is a rather old, but still widely used, UNIX software distribution tool. The version reviewed here, however, is a new version, version 6.0, from the University of Southern California.
rdist is a file based distribution system. Files to be distributed are "pushed" from a central server to all the clients. The files that are distributed are described in a single configuration file. The configuration file specification is not as bad as that of package, as wild cards are allowed and mode, owner, and group information is taken from the file rather than having to be specified explicitly.
The benefits of rdist include:
- well known - since rdist was distributed on the BSD 4.3 tape, it is quite heavily used.
- notification system - can send mail to notify people of events. Different people can be notified, depending on what failed.
- transport via rsh - This is a well known and readily available transport mechanism.
- good logging capabilities
The problems with rdist are:
- No local customization abilities: all the customization must be done on the server end. It is possible that one would have different rdist files for each client but that runs into the problem of maintaining a large number of configuration files.
- Transport is via rsh - All the bugs and problems with rsh are also there in rdist.
The two different distribution approaches, "push" versus "pull," illustrate different distribution mentalities. The "push" method is an authoritarian, "dictatorial" approach in which the configuration of the client workstations must "follow and obey" the directions of the server doing the "pushing." On the other hand, "pull" is a much more laissez-faire approach where clients request software from the servers.
Each model has its benefits and drawbacks. For example, the "push" model is best for machines whose configuration must remain consistent and not change unless someone explicitly changes it. This is quite useful for machines providing services used by the community.
The "pull" model, however, is much more scalable as the servers have much less work to do. This model assumes that if something goes wrong, someone will contact you to fix it rather than you having to go out and fix it immediately. Also, in the case that a distributed filesystem is present, the distribution mechanism tends to lean towards pulling from the distributed filesystem.
Items to be reviewed in the next revision include Tivoli's WIZdom, POSIX 1007.2, HP's Software Distribution Utilities, and Radar.
The following sections describe typical use scenarios for the CSI. This section is not meant to be a comprehensive list of all the functionality available. Instead, this section should provide the reader with a general understanding of how the CSI will be used.
The CSI should be transparent to the user of the workstation. "User" in the context of the CSI refers to the workstation administrator. The workstation administrator may be the owner of a single workstation or the person in charge of multiple workstations. The workstation administrator may also be a group of people responsible for any number of machines.
To the user, the CSI will provide a functioning system where the following software can be automatically updated:(2)
- operating system
- operating system layered products
- locally developed applications
- third party software
If the default environment is not sufficient for the user, then the workstation configuration can be customized. These customizations could consist of any or all of the following items:
- Setting the machine up as an anonymous FTP server
- Running "beta" or experimental software
- Having the machine run a license server
- Installing software that is not centrally provided
- Having different configuration files, such as /etc/rc, /etc/inetd.conf, etc.
- Using a different kernel or even a different operating system than that which is centrally provided
The mechanisms for using the CSI should be similar, if not identical, between the departmental and local workstation administration, with the exception that the departmental administrators may provide additional "library routines" for their local users that are not provided by the central administration.
Ideally, direct editing of configuration files should not be required. Instead, there should be a user interface. See section 6.1.13 on page 35 for more details on the user interface. Novice users may prefer to use a user interface while experienced users may want the flexibility of editing (or even writing programs to edit) the configuration files directly.
Once a configuration is created, the user should not have to change the configuration in response to central changes. For example, central administration should be able to upgrade the operating system without forcing users to change their configuration files. However, if a user has installed an application locally that is incompatible with the newer version, then a configuration change may be required. In essence, the user has administrative control over the aspects that he wants control over and delegates responsibility for the aspects that he does not want to deal with to other administrators.
No direct operations support should be required for the CSI.
This scenario deals with the central administration. The central administration is responsible for providing the default services and environment from which all customization is based.
The process that the central administration would have to follow is enumerated as follows:
1. Install the operating system on a workstation.
2. Install the CSI on the workstation. The first two steps serve to "bootstrap" the CSI. If the CSI is already present, the first two steps are not required.
3. Load the operating system into the CSI for distribution. This involves loading the entire operating system as shipped by the vendor. In general, if an operating system can be shared between multiple models of the same workstation family (e.g. DEC 2100, 3100, 5000/1xx, 5000/2xx) with only minor changes, the same base should be used. See section 6.1.12 on page 35 for more details.
After this step is completed, the operating system should be distributable by the CSI.
4. Install the OS layered products into the CSI.
5. Install third party and locally developed software into the CSI. This includes installing all the appropriate software licenses. See section 6.7 on page 40 for more details on licensing.
6. Install site customization and basic services into the CSI. These include site specific configuration items as well as site independent services that require modification to base configuration files. Examples include pointing the CSI to the appropriate password file and creating the proper directory structure and inetd.conf to allow anonymous FTP.
Ideally, no further action would be required by the central administration to maintain the workstations. Day to day maintenance would be done automatically or on each reboot. For example, new password files would be distributed by the CSI without any central intervention.
Unfortunately, software rarely remains static. While the operating system may not change very often, new application software and new versions appear, and old unused software and older versions should be removed. Removal is, by far, the easiest part. Given that the CSI compartmentalizes the software, removal is just removing the pointer to the collection and then waiting for that removal to propagate to all the client workstations. Installation can also be as simple as installing into the CSI and waiting for the version to propagate to the clients. Compartmentalization also aids in the "backing out" process. As long as previous versions are kept, broken software can be easily removed from the production environment and previously working versions can be restored.
It is important to point out at this time that the CSI server need not be a real machine. In the current AFS environment, the directories in which files are stored are not directly accessible by the server machines, unlike NFS. Installation of software into AFS involves using a client to store the files onto the AFS server. If the CSI uses a distributed filesystem, such as AFS or DFS, as a method of distribution, then the distributed filesystem should be considered the CSI server.
This section provides a list of requirements that are needed for the CSI. They are classified as either mandatory or highly desirable.
- Mandatory requirements must be included in the CSI.
- Highly desirable requirements should be included in the CSI.
It is highly desirable that the CSI:
- be capable of installing a workstation from "scratch." For example, if the workstation just had the disks low level formatted, the CSI should be able to return it to its previously working configuration.
If this requirement is implemented by the CSI, then it is mandatory that the CSI:
- allow for network based installations.
- allow for different configurations.
- allow for certain machines to only receive a specific configuration. For example, machines in a public computer facility should always be installed with a specific configuration.
- minimize the role of the central administration in this process, after the CSI server is put into production.
If this requirement is implemented, then it would be highly desirable for the CSI:
- to allow a single machine to serve heterogeneous workstations.
This requirement is not mandatory as there is no standardization in this field yet. Each vendor and each operating system all have different methods of dealing with this issue.
It is mandatory that the CSI:
- operate automatically on a regular basis with little human intervention. The only circumstance that may warrant human intervention is in the case of failure conditions where the CSI cannot reliably choose the correct path automatically.
This assumes that a working system is already in place. It is not a requirement that the CSI can automatically determine what configuration the user wants.
The installation of software in the CSI takes two forms: installation for a single workstation, and installation for multiple workstations in the CSI. In the installation of software for a single machine, one has the option to use the vendor install scripts and tailor that application for the specific machine. However, installing for a group of machines requires the installer to load all components of the collection, make any global configurations and then put in the "hooks" necessary for the software to run on all of the workstations receiving the software. The multiple machine installation assumes that the group of machines want the software configured the same way (or there really aren't any configuration differences that matter).
As discussed in section 4.9 on page 26, many vendors avoid the multiple workstation installation problem by simply distributing the "raw" form of the collection to each workstation. This is not acceptable as it violates the simplicity requirement in section 6.1.11 on page 35.
It is mandatory that the CSI:
- allow installation of software for a single workstation.
- allow installation of software for multiple machines that want to use the same software configuration.
- distribute the software that is installed from the CSI servers to the client workstations.
- not assume that source code is available.
- not assume that there will be a standard distribution format. The CSI can assume that software is installed into a directory hierarchy in a local or a distributed filesystem. This may change if standards begin to emerge and are widely adopted.
- not require the installer to spend an excessive amount of time modifying the software package to conform with the CSI.
- allow the client workstation to refuse to install or update any software collection.
- allow the client workstation to install a different version of the software collection from another source.
- understand and deal with conflicts. A conflict is when two different software collections install different files with the same name to the same location. Conflicts are discussed in more detail in the depot paper [COLYE92A].
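The conflict definition above can be illustrated with a minimal sketch. It assumes that each collection is described by a mapping from target path to a content checksum. The function name `find_conflicts`, the data layout and the sample collection names are hypothetical and are not taken from the CSI or from depot.

```python
from collections import defaultdict


def find_conflicts(collections):
    """Return {target_path: [collection names]} for every path that two or
    more collections want to install with different contents.

    `collections` maps a collection name to {target_path: checksum}.
    This layout is an assumption made for illustration only.
    """
    owners = defaultdict(dict)  # target_path -> {collection name: checksum}
    for name, files in collections.items():
        for path, checksum in files.items():
            owners[path][name] = checksum

    conflicts = {}
    for path, by_collection in owners.items():
        # The same file delivered by two collections is not a conflict;
        # only differing contents at the same location are.
        if len(set(by_collection.values())) > 1:
            conflicts[path] = sorted(by_collection)
    return conflicts


# Hypothetical collections that both install /usr/local/bin/etags.
example = {
    "gnu-emacs": {"/usr/local/bin/etags": "a1", "/usr/local/bin/emacs": "b2"},
    "ctags-kit": {"/usr/local/bin/etags": "c3"},
}
conflicts = find_conflicts(example)
```

How a detected conflict is resolved (refuse, prefer one collection, or ask the administrator) is a policy decision that this sketch deliberately leaves open.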
It is highly desirable that the CSI:
- allow the installer to use the vendor installation process.
It is mandatory that the CSI:
- be able to remove software from the server and client workstations.
- completely remove every file of the software collection.
The system manager must not need to go to each workstation in the CSI to remove software from them.
It is mandatory that the CSI:
- have the ability to ensure that files on the local disk of CSI client workstations remain the same as the copies on the server.
- have the ability to remove files that "do not belong" on the local disk of the CSI client workstation.
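One possible way to meet both requirements is to compare the managed part of the local disk against a manifest obtained from the server. The sketch below is illustrative only: the manifest format (relative path mapped to a SHA-256 digest) and the names `file_digest` and `plan_sync` are assumptions, and the function only plans the work without copying or deleting anything.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def plan_sync(server_manifest, local_root):
    """Compare the tree under local_root with the server manifest.

    server_manifest maps a relative path to its expected digest
    (a hypothetical format). Returns (to_fetch, to_remove):
    files that are missing or differ, and files that do not belong.
    """
    local = {}
    for dirpath, _dirs, files in os.walk(local_root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, local_root).replace(os.sep, "/")
            local[rel] = file_digest(full)

    to_fetch = sorted(p for p, d in server_manifest.items() if local.get(p) != d)
    to_remove = sorted(p for p in local if p not in server_manifest)
    return to_fetch, to_remove
```

Separating the planning step from the actions it implies also fits the analysis phase described under the timing requirements, since the plan can be produced and inspected before anything on the workstation changes.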
It is mandatory that the CSI:
- be able to describe the exact current configuration.
- know all the files that belong to each software collection.
- be able to track the change in configuration over time, similar to providing an audit trail.
- be able to prevent any change to the configuration of the workstation or allow changes to happen automatically.
- notify the workstation maintainer of configuration changes.
- trivially reconstruct or duplicate any given workstation configuration.
- share configuration items among any given number of workstations. A change to any shared configuration item must only have to occur in one place, and not on all the workstations.
- allow easy access to common configuration options and services. For example, only one command would be necessary to configure a workstation to allow NFS exporting or to have a "larger" kernel.
It is highly desirable that the CSI:
- allow services that have dependencies on other services to automatically invoke those services. For example, since an NFS server requires the Sun RPC package, a request to make a workstation an NFS server should automatically configure the workstation to install the Sun RPC package.
- allow individual workstation configuration options to have precedence over departmental and central options. Departmental options should override the central options. The central options should override the default vendor options.
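The two highly desirable items above can be sketched briefly. The first function layers the four option sources so that workstation options win over departmental, central and vendor options, in that order. The second expands a service request to include everything it requires. All option names, service names and function names here are hypothetical.

```python
from collections import ChainMap


def effective_config(vendor, central, departmental, workstation):
    """Merge the four option layers. ChainMap looks keys up from left to
    right, so the leftmost mapping that defines a key wins."""
    return dict(ChainMap(workstation, departmental, central, vendor))


def expand_services(requested, requires):
    """Return the requested services plus everything they transitively
    require. `requires` maps a service to the services it depends on."""
    result, stack = set(), list(requested)
    while stack:
        service = stack.pop()
        if service not in result:
            result.add(service)
            stack.extend(requires.get(service, ()))
    return result


# Hypothetical option layers.
effective = effective_config(
    vendor={"kernel": "generic", "nfs_export": "no", "ftp_server": "none"},
    central={"password_source": "central"},
    departmental={"kernel": "large"},
    workstation={"ftp_server": "anonymous"},
)

# Hypothetical dependency table: an NFS server needs the Sun RPC package.
services = expand_services({"nfs-server"}, {"nfs-server": ["sun-rpc"]})
```

A plain lookup chain is enough for whole-value overrides. Merging structured values, such as lists of exported filesystems, would need a more deliberate rule.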
It is mandatory that the CSI:
- be scalable. This can be achieved by:
- Distributing the workload - Allow those who can take additional parts of the workload to do so. The CSI must allow the distribution of the administrative tasks to whoever is qualified and authorized to do so. This includes being able to grant administrative control to local workstation administrators without compromising the security of the CSI.
- Minimizing incremental cost - The largest cost of maintaining workstation administration should be during the initial installation. Adding additional workstations of the same type should require a minimal amount of central administration.
It is mandatory that the CSI:
- allow distribution of software to all client workstations.
- allow software to be entered into the "software feed" (see Figure 2 on page 17) at any point in the feed.
- not be integrally linked with the distribution mechanism for the CSI. For example, the CSI must allow distribution from a distributed filesystem or from software servers.
- support both push and pull models. Alan Hastings pointed out that it is possible to "emulate" the "push" model using "pull." One does this by notifying the client that it should really update now and then the client goes to the server for the updates.
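The emulation of "push" by means of "pull" can be sketched with two small classes. This is an in-process illustration only: in a real system the notification would travel over the network, and the class and method names are hypothetical.

```python
class Server:
    """Holds the current version of a collection and a list of clients."""

    def __init__(self, version):
        self.version = version
        self.clients = []

    def publish(self, version):
        # Store the new version, then merely hint to each client that
        # it should update now. No software is sent by the server.
        self.version = version
        for client in self.clients:
            client.notify()


class Client:
    """Always obtains software by pulling from the server."""

    def __init__(self, server):
        self.server = server
        self.installed = None
        server.clients.append(self)

    def pull(self):
        # The normal "pull" path, for example run on a schedule or at reboot.
        self.installed = self.server.version

    def notify(self):
        # The emulated "push": the notification triggers an ordinary pull.
        self.pull()
```

Because the client remains the party that fetches the software, a client that is allowed to refuse updates could simply ignore the notification.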
It is mandatory that the CSI:
- use an IP based protocol for distribution of software and communication between the CSI clients and the CSI servers. The use of DCE's RPC is desirable at this stage but that decision should be finalized at the design phase.
This requirement is subject to change as Carnegie Mellon's network configuration changes.
It is mandatory that the CSI:
- provide clear documentation on the conventions and use of the system.
- store the documentation "on-line" with the source code or provide a file stored with the source code that includes the location of the documentation.
It is mandatory that the CSI:
- transparently provide software and services. It must not be necessary to use the vendor installation process on every client workstation.
- separate functionality into multiple modules that can act independently or be independently replaced.
It is highly desirable that the CSI:
- avoid many complex configuration files. For example, instead of having additional configuration files that specify a collection, the collection can be described by the existing directory structure. In this example, a program that is located in the /etc directory on the server would be exported to the /etc directory on the client. In the event that one does not want the directory structure of the source collection to mirror the target, then the configuration file should still be kept to a minimum, such as only listing the exceptions.
- allow users to only have to modify a single file in a well known location to change the workstation configuration. However, this is not advocating a single configuration file. It may be desirable to have configuration files be able to include other configuration files and, in this manner, have a way of providing "building blocks" for the CSI.
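The "building blocks" idea can be sketched as a loader that expands `include` lines, so that the user edits one top-level file while shared pieces live elsewhere. The file syntax (`include NAME` and `key = value`), the file names and the function name are all hypothetical. The files are passed in as a dictionary of text so that the example is self-contained.

```python
def load_config(name, sources, _active=None):
    """Flatten a configuration, expanding `include` lines in order.

    `sources` maps a file name to its text. Later settings override
    earlier ones, so a file can include a building block and then
    adjust it. The syntax is an assumption made for illustration.
    """
    active = _active if _active is not None else set()
    if name in active:
        raise ValueError("include cycle at " + name)
    active.add(name)

    settings = {}
    for raw in sources[name].splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        if line.startswith("include "):
            included = line[len("include "):].strip()
            settings.update(load_config(included, sources, active))
        else:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()

    active.discard(name)
    return settings


# Hypothetical files: one file edited by the user, two shared blocks.
sources = {
    "workstation.conf": "include base.conf\ninclude ftp.conf\nkernel = large\n",
    "base.conf": "kernel = generic\nnfs_export = no\n",
    "ftp.conf": "ftp_server = anonymous  # provided by central administration\n",
}
config = load_config("workstation.conf", sources)
```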
The importance of simplicity cannot be overstated. By having multiple modular tools, one avoids ending up with an unmanageable and difficult to understand "kitchen sink" system.
It is mandatory that the CSI:
- be able to determine different architectures
- be able to use the tools provided by the operating system to determine differences in workstation models that use the same base operating system
In general, if the operating system Fizzlebrot 4.3 supports workstation models FizzBang 6000 AXP and FizzBoom IIe then the CSI must only need one copy of Fizzlebrot 4.3 even if the FizzBoom has devices with different major/minor numbers and if the X server is different between the models. What must occur is that the CSI determines which workstation model it is running on and then makes sure the appropriate file gets distributed to the workstation.
It is highly desirable that the CSI:
- have a user interface. This means that it is highly desirable to have an interface to the various configuration files and binaries of the CSI.
If a user interface is present, then it is mandatory that the user interface:
- have a consistent user interface design across all platforms.(3)
- allow complete access to all the functionality of the CSI through the user interface. For example, if you can do it "by hand," there must be a way to perform the same action via the interface.
- provide a terminal as well as a graphical user interface.
- be simple enough for novice users to feel comfortable.
This section assumes that a distributed filesystem is available on client CSI workstations. This section also assumes that a file exists either on the local disk of a workstation or the file can be a pointer (e.g. a symbolic link) to a file on the distributed filesystem.
It is mandatory that the CSI:
- allow the workstation administrator to choose whether he wants a file to exist on the local disk of his workstation or on the distributed filesystem and accessed via the pointer. The benefit of having the files local is that access times are much faster and any external failures will not keep the local workstation user from reading the file. However, this may consume local disk space that the administrator may wish to use for other purposes.
- provide a flexible method of specifying the actual location of the files. It must be possible to specify individual files or entire software collections. It must also be possible to specify a group but exclude certain members of that group.
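A minimal sketch of such a specification follows. It assumes an ordered list of rules, each pairing a shell-style pattern with a placement of either "local" (copy to local disk) or "link" (pointer into the distributed filesystem). The last matching rule wins, which is what allows a group to be named and then certain members to be excluded. The rule format, the placement names and the paths are hypothetical.

```python
from fnmatch import fnmatchcase


def placement(path, rules, default="link"):
    """Return where `path` should live: "local" or "link".

    `rules` is an ordered list of (pattern, placement) pairs. Every rule
    is checked and the last one that matches decides, so later rules can
    carve exceptions out of earlier, broader ones.
    """
    choice = default
    for pattern, where in rules:
        # In fnmatch patterns "*" also matches "/", so one pattern can
        # cover an entire directory tree.
        if fnmatchcase(path, pattern):
            choice = where
    return choice


# Hypothetical rules: keep a whole collection local, except its lisp files.
rules = [
    ("/usr/local/emacs/*", "local"),
    ("/usr/local/emacs/lisp/*", "link"),
]
```

With these rules, `/usr/local/emacs/bin/emacs` is kept on local disk, `/usr/local/emacs/lisp/simple.el` is reached through a pointer, and any path not mentioned falls back to the default.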
In large organizations, there may be many independent computing facilities that wish to remain administratively independent of each other. There may also exist computing facilities that wish to have administrative control in some areas but relinquish control in others.
It is mandatory that the CSI:
- be compartmentalized enough such that administrative access can be granted to specific components. This also implies that two CSIs, that is, two "central" administrative bodies, can exist in an organization. Yet there needs to be a way for things to be standardized enough such that sharing can occur. This concept is similar to that of AFS cells [ZAYAS88] where a common filesystem image is presented, and data sharing can occur, but each "cell" has its own independent administrative control.
- allow collections to be incorporated from different sources.
It is mandatory that the CSI:
- allow access to all the devices on the workstation.
- allow access to all the filesystems on the workstation.
There are many factors complicating the timing requirements. For example, if a distributed filesystem is used, then the performance of the CSI will depend on all the factors that complicate the performance of a distributed filesystem, such as the performance of the network and the performance of the fileservers. Also, the size and number of software collections will also have a significant impact on performance.
In order to provide some concrete goals, some specific numbers will be provided. In the event that design and/or implementation proves these numbers to be unfeasible, then a more realistic number will be chosen, based on the data provided.
It is mandatory that the CSI:
- not take longer than twice the standard installation time in the installation of the software. This is the installation for a group of workstations and not a single workstation.
- not take longer than two (2) minutes to complete the analysis phase. The analysis phase is the stage where the CSI determines what, if any actions need to be performed in order to satisfy the functional requirements. For example, this phase would determine if the software on the workstation was out of date and a new copy was required. The two (2) minute number is an arbitrary number chosen to indicate that the analysis phase must not take an "excessive" amount of time.
- not place an undue burden on the network. It is unreasonable to expect that installing a 100 megabyte application over the network will not generate any network traffic. However, the overhead for this transfer should not be excessive. An example of excessive overhead would be the CSI generating 2 Kbytes of "chattering" for every 1 Kbyte transferred.
Completion time is very important for the client CSI workstations as the CSI may have a significant impact on the usability of the client workstation during execution. Thus, the sooner the CSI completes, the sooner the user may continue using the workstation. Second, the less time the CSI executes, the smaller the chances are that something else will go wrong, such as network or server failures.
The environment is assumed to be one with a large number of heterogeneous workstations. Environments with a smaller number of workstations should be able to use the CSI without undue overhead, though it is not required. Large is defined as any number over 50. Small is any number under 5.
It is assumed that the workstations will be connected via a relatively high speed local area network, such as Ethernet. There must be provisions, however, to offer basic functionality of the CSI to machines connected via low bandwidth connections, such as SLIP. This issue will be addressed in detail in the Docking component.
The Andrew II project is assuming an OSF DCE/DFS [OSF92] based environment. We expect to have over 4,000 workstations by the year 2000.
It is very unlikely that the CSI will become the standard campus environment. Many organizations may adopt the CSI and remain administratively separate, as discussed in section 6.1.15 on page 36, but it is quite possible that many of the large organizations will not see the need nor be able to justify the expense of switching to a new system when their existing systems already suit their needs.
It is mandatory that the CSI:
- support over 1000 client workstations.
- support over 4000 client workstations by 1996.
- operate in an OSF DCE environment.
- use the Kerberos authentication system [KOHL92] as part of DCE.
- support UNIX based workstations with, minimally, TCP/IP support
- support PC systems.
- support Macintosh systems.
It is mandatory that the CSI:
- keep track of managed files and directories on the local disk of the workstation.
- have the database in a form that is extractable by the workstation administrator.
This information must be detailed enough to determine the location of the original copy of the file and any other "important" fields. (An important field is vaguely defined at this stage to be a field relevant to any workstation administrator.)
The subsections below specify the quality factor requirements of the CSI, to the extent possible, in quantitative terms.
It is mandatory that the CSI, in the event of failure:
- attempt to leave the workstation in a usable state. Any operations that can be made atomic must be. For example, in the UNIX environment, where rename(2) is an atomic operation, when copying a file, the file should be copied to `.NEW' and then renamed to the proper file.
- provide a mechanism for notifying the workstation administrators that the CSI failed.
- provide a mechanism for logging the failure.
- if the failure has left a software package partially installed, either remove the package, or leave it in a state where it does not interfere with a previous working version, or remove it and put any previously working version back in place.
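The copy-then-rename technique mentioned in the first item can be sketched as follows. The function name `install_atomically` is hypothetical. The sketch relies on `os.replace`, which on POSIX systems is carried out by rename(2) and therefore replaces the target atomically when the staging file and the target are on the same filesystem.

```python
import os
import shutil


def install_atomically(source, target):
    """Copy `source` into place without ever exposing a partial file.

    The data is first copied to `target + ".NEW"` and only then renamed
    over the target. If anything fails, the staging file is removed and
    the existing target is left untouched.
    """
    staging = target + ".NEW"
    try:
        shutil.copy2(source, staging)  # copies contents and metadata
        os.replace(staging, target)    # rename(2) on POSIX: atomic swap
    except OSError:
        if os.path.exists(staging):
            os.remove(staging)
        raise
```

A reader of the target sees either the complete old file or the complete new file. If the copy fails, the previously working version remains in place, which is one way to satisfy the last item in the list above for a single file.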
It is mandatory that the CSI:
- be highly portable especially since it must run on every supported workstation.
It is highly desirable that the code for the CSI:
- be modular enough that any component can be removed and replaced with a different component. For example, if a linked-list routine is causing a bottleneck, then it should require minimal code modification to replace it with a hash table.
It is highly desirable that the CSI:
- when running on a workstation, not consume an excessive amount of resources.
Consumption of resources (CPU time, memory usage, disk space usage) is considered excessive when no other action can be performed on a multi-tasking system while the CSI is running.
It is mandatory that the CSI:
- be usable by a novice user.
- not impair the usability for the expert user in order to accommodate the novice user. It is acceptable to compromise the functionality available to the novice user such that it is less than the functionality available to the expert user.
A novice user would most likely be the user that would only use services predefined by central or local administrators. An expert user would be the one who actually defines these services. For example, a novice user would simply say that he wants his workstation to be an anonymous FTP server. An expert user, on the other hand, would be able to modify this service such that he could use a different FTP server.
Note that the term user is used in the same context as described in section 5.1 on page 27.
It is mandatory that the CSI:
- be written in a way to minimize the effort required to fix errors.
- be understandable to any technically competent individual.
- provide a clear documentation trail which includes design documents and testing documents.
Components in the CSI are not required to be reusable outside of the CSI. The CSI, however, is envisioned to be a set of inter-operating programs and not a large monolithic entity.
It is mandatory that the CSI:
- contain components that are reusable by other entities within the CSI. For example, the CSI database should be accessed by all programs through a common library.
It is mandatory that the CSI:
- use the Kerberos authentication mechanism to authenticate clients and servers. Both the client workstation and the server need to be able to prove to the other that they are indeed who they say they are. The client needs to verify the server is really the server in order to ensure that the software being obtained is indeed the software provided by the administrators for distribution. The server may need to verify the client is, indeed, the client it claims to be in the case of licensed or sensitive software.
- include the authentication and security requirement as an optional component. There may be the case where there are clients which are not able to use the authentication mechanism but should participate in the CSI. It is permissible to allow restricted access to and from clients that cannot or do not authenticate to the CSI.
- be able to deal with workstations located in insecure and potentially hostile locations, such as public computing labs, where individuals may try to damage or infiltrate the system. The CSI must not provide additional "loopholes" for these individuals, such as letting them reconfigure or reinstall software on the workstations.
- not permit "super-user" status on any client machine in the CSI to automatically confer super-user status to all the machines participating in the CSI. It is unavoidable that super-user status on certain servers may confer super-user status on a client. For example, if a client receives the `login' binary from a server, then that binary can be replaced on the server end.
- have a simple method of verification of software integrity. This can be done in the form of a checksum.
It is highly desirable that the CSI:
- be able to encrypt data. Because of the expensive nature of encryption, it must be possible to limit the data being encrypted.
The issue of software licensing is not, specifically, tied into the area of workstation administration. Ideally, the licensing of software should be transparent to the distribution mechanism. However, this is not always the case. Licensing can affect the way software is installed, tracked, and removed.
There are three main forms of licensing:
- Magic number license - This is the traditional licensing method. A special file or item on the disk contains the information that determines whether or not the machine is authorized to run the software.
- Network/floating license - This is a newer approach where a server determines if the machine is licensed or not. The floating license concept works under the assumption that not everyone is going to be using the software at the same time on all your machines. The floating license grants any machine in a specific list a license, up to the number of allowable licenses. Thus, fewer licenses need to be purchased.
- We Trust You license - There are no restrictions in the software and it is up to the local site to ensure that licensing restrictions are not violated.
Software distribution has little to worry about with the first two forms: any distribution mechanism can handle magic number licensing and floating licenses. The more difficult case is the last one. In this case, the distribution mechanism has to know whether or not the machine is authorized to receive and install the software. When the license expires and is not renewed, the CSI would have to remove the software. Finally, there would also need to be a mechanism to track usage in order to determine whether or not it is worth extending the licenses.
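The floating license concept described above can be sketched as a small seat counter. This is an illustration of the idea only and does not model any vendor's license server. The class name, method names and host names are hypothetical.

```python
class FloatingLicense:
    """Grant licenses to hosts on an approved list, up to a seat limit."""

    def __init__(self, allowed_hosts, seats):
        self.allowed = set(allowed_hosts)
        self.seats = seats
        self.in_use = set()

    def checkout(self, host):
        """Return True if `host` holds a license after this call."""
        if host in self.in_use:
            return True  # the host already holds a seat
        if host not in self.allowed or len(self.in_use) >= self.seats:
            return False
        self.in_use.add(host)
        return True

    def release(self, host):
        """Return the seat held by `host`, if any."""
        self.in_use.discard(host)


# Three approved machines share two purchased licenses.
pool = FloatingLicense(["ws1", "ws2", "ws3"], seats=2)
```

Counting how often `checkout` is refused or how many seats are typically in use would be one source of the usage data mentioned above for deciding whether licenses are worth extending.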
The following are the acronyms and terms used in this document.
AFS Andrew File System - A large scale distributed filesystem from the Transarc Corporation.
Collection A software package. A collection is a set of one or more files belonging to the same software package.
CSI Computer System Item - The Computer System Item is the resulting software and/or hardware which meets the requirements as described in the appropriate ISRS. CSI, in this document, refers to the system for workstation administration.
DCE Distributed Computing Environment - The basis for distributed computing from the OSF.
DFS Distributed File System - Depending on context, this may refer to distributed file systems in general or refer, specifically, to the OSF Distributed File System.
ISRS Initial Software Requirements Specification - See section 1.1 on page 9 for more information.
OS Operating System - The programs and files that provide the software which the computer needs in order to function.
OSF The Open Software Foundation.
Footnotes
- (1) We will henceforth refer to any Intel 80x86 based systems as PC systems.
- (2) Update includes installation of new software and removal of old software, in addition to installation of new versions of currently available software.
- (3) This does not mean that the graphical user interface is required to look identical across all platforms. For example, a Macintosh interface to the CSI would look like a typical Macintosh application; the Windows interface would look like a Windows application, and the X11/Motif interface would resemble a Motif program. Regardless of superficial differences, the underlying design and layout can be the same or similar enough between all platforms for users to easily migrate from one architecture to another.