Computer Aided Design (CAD): The same 1963 IFIPS conference at which Sketchpad was presented also contained a number of CAD systems, including Doug Ross's Computer-Aided Design Project at MIT in the Electronic Systems Lab and Coons' work at MIT with Sketchpad. Timothy Johnson's pioneering work on the interactive 3D CAD system Sketchpad 3 was his 1963 MIT MS thesis (funded by the Air Force). The first CAD/CAM system in industry was probably General Motors' DAC-1 (about 1963).
Video Games: The first graphical video game was probably Spacewar! by Steve "Slug" Russell of MIT in 1962 for the PDP-1, which also included the first computer joysticks. The early computer Adventure game was created by Will Crowther at BBN, and Don Woods developed it into a more sophisticated Adventure game at Stanford in 1976. Conway's game of LIFE was implemented on computers at MIT and Stanford in 1970. The first popular commercial game was Pong (about 1976).
UIMSs and Toolkits: The first User Interface Management System (UIMS) was William Newman's Reaction Handler created at Imperial College, London (1966-67 with SRC funding). Most of the early work took place at universities (University of Toronto with Canadian government funding; George Washington University with NASA, NSF, DOE, and NBS funding; Brigham Young University with industrial funding). The term UIMS was coined by David Kasik at Boeing (1982). Early window managers such as Smalltalk (1974) and InterLisp, both from Xerox PARC, came with a few widgets, such as popup menus and scrollbars. The Xerox Star (1981) was the first commercial system to have a large collection of widgets and to use dialog boxes. The Apple Macintosh (1984) was the first to actively promote its toolkit for use by other developers to enforce a consistent interface. An early C++ toolkit was InterViews, developed at Stanford (1988, industrial funding). Much of current research is now being performed at universities, including Garnet and Amulet at CMU (ARPA funded), MasterMind at Georgia Tech (ARPA funded), and Artkit at Georgia Tech (funding from NSF and Intel).
There are, of course, many other examples of HCI research that should be included in a complete history, including work that led to drawing programs, paint programs, animation systems, text editing, spreadsheets, multimedia, 3D, virtual reality, interface builders, event-driven architectures, usability engineering, and a very long list of other significant developments. Although our brief history here has had to be selective, what we hope is clear is that there are many years of productive HCI research behind our current interfaces and that it has been research results that have led to the successful interfaces of today.
For the future, HCI researchers are developing interfaces that will greatly facilitate interaction and make computers useful to a wider population. These technologies include: handwriting and gesture recognition, speech and natural language understanding, multiscale zoomable interfaces, "intelligent agents" to help users understand systems and find information, end-user programming systems so people can create and tailor their own applications, and much, much more. New methods and tools promise to make the process of developing user interfaces significantly easier but the challenges are many as we expand the modalities that interface designers employ and as computing systems become an increasingly central part of virtually every aspect of our lives.
As HCI has matured as a discipline, a set of generally agreed-upon principles has emerged that is taught in courses on HCI at the undergraduate and graduate level. These principles should be taught to every CS undergraduate, since virtually all programmers will be involved in designing and implementing user interfaces during their careers. They are described in other publications and include task analysis, user-centered design, and evaluation methods.
Technological Trends
Again, the number and variety of trends identified in this discussion outstrip the space available here for reporting. One can see large general trends that are moving the field from concerns about connectivity, as the networked world becomes a reality, to compatibility, as applications increasingly need to run across different platforms and code begins to move over networks as easily as data, to issues of coordination, as we come to understand the need to support multiperson and organizational activities. I will limit the discussion here to a few instances of these general trends.
· Computational Devices and Ubiquitous Computing: One of the most notable trends in computing is the increase in the variety of computational devices with which users interact. In addition to workstations and desktop personal computers, users are faced with (to mention only a few) laptops, PDAs, and LiveBoards. In the near future, Internet telephony will be universally available, and the much-heralded Internet appliance may allow interactions through the user's television and local cable connection. In the more distant future, wearable devices may become more widely available. All these technologies have been considered under the heading of "Ubiquitous Computing" because they involve using computers everywhere, not just on desks.
The introduction of such devices presents a number of challenges to the discipline of HCI. First, there is the tension between the design of interfaces appropriate to the device in question and the need to offer a uniform interface for an application across a range of devices. The computational devices differ greatly, most notably in the sizes and resolutions of displays, but also in the available input devices, the stance of the user (is the user standing, sitting at a desk, or on a couch?), the physical support of the device (is the device sitting on a desk, mounted on a wall, or held by the user, and is the device immediately in front of the user or across the room?), and the social context of the device's use (is the device meant to be used in a private office, a meeting room, a busy street, or a living room?). On the other hand, applications offered across a number of devices need to offer uniform interfaces, both so that users can quickly learn to use a familiar application on new devices, and so that a given application can retain its identity and recognizability, regardless of the device on which it is operating.
Development of systems meeting the described requirements will involve user testing and research into design of displays and input devices, as well as into design of effective interfaces, but some systems have already begun to address these problems. Some browsers for the World-Wide Web attempt to offer interfaces that are appropriate to the devices on which they run and yet offer some uniformity. At times this can be difficult. For example, the frames feature of HTML causes a browser to attempt to divide up a user's display without any knowledge of the characteristics of that display. Although building applications that adapt their interfaces to the characteristics of the device on which they are running is one potential direction of research in this area, perhaps a more promising one is to separate the interface from the application and give the responsibility of maintaining the interface to the device itself. A standard set of protocols would allow the application to negotiate the setup of an interface, and later to interact with that interface and, indirectly, with the user. Such multimodal architectures could address the problems of generating an appropriate interface, as well as providing better support for users with specific disabilities. The architectures could also be distributed, and the building blocks of forthcoming distributed applications could become accessible from assorted computational devices.
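To make the idea concrete, here is a minimal sketch (not a real protocol) of how an application might describe its interface abstractly and let each device decide how to render it; all of the names (AbstractDialog, DevicePresenter, runOn, and so on) are hypothetical.

```typescript
// Hypothetical sketch: the application describes its interface abstractly and
// lets the device decide how to present it; only semantic events flow back.

interface AbstractField {
  id: string;
  label: string;
  kind: "text" | "choice" | "action";
  choices?: string[];
}

interface AbstractDialog {
  title: string;
  fields: AbstractField[];
}

// Implemented by each device (workstation, PDA, wall display, phone, ...).
interface DevicePresenter {
  capabilities(): { display: "none" | "tiny" | "desktop" | "wall"; speech: boolean };
  present(dialog: AbstractDialog, onEvent: (fieldId: string, value: string) => void): void;
}

// The application negotiates a presentation and then reacts only to semantic events.
function runOn(device: DevicePresenter): void {
  const dialog: AbstractDialog = {
    title: "Address Update",
    fields: [
      { id: "street", label: "Street", kind: "text" },
      { id: "submit", label: "Save", kind: "action" },
    ],
  };
  const caps = device.capabilities();
  // A speech-only device might read labels aloud; a wall display might enlarge them.
  console.log(`presenting on a ${caps.display} display, speech=${caps.speech}`);
  device.present(dialog, (fieldId, value) => {
    if (fieldId === "submit") console.log("saving address...");
    else console.log(`field ${fieldId} changed to ${value}`);
  });
}
```

In such a scheme the device owns the widgets, so a user with a specific disability could substitute an alternative presenter without any change to the application.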
· Speed, Size, and Bandwidth: The rate of increase of processor speed and storage (the transistor density of semiconductor chips doubles roughly every 18 months, according to Moore's law) suggests a bright future for interactive technologies. An important constraint on utilizing the full power afforded by these technological advances, however, may be network bandwidth. Given the overwhelming trends towards global networked computing, and even the network as computer, the implications of limited bandwidth deserve careful scrutiny. The bottleneck is the "last mile" connecting the Internet to individual homes and small offices. Individuals who do not get access through large employers may be stuck at roughly the present modem rate (28,800 bits per second) at least until the turn of the century. The rate needed for delivery of television-quality video, one of the promises of the National Information Infrastructure, is 4-6 megabits per second, well over a hundred times that amount. What are the implications for strategic HCI research of potentially massive local processing power together with limited bandwidth?
Increases in processor speed and memory suggest that if the information can be collected and cached from the network and/or local sources, local interactive techniques based on signal processing and work context could be utilized to the fullest. With advances in speech and video processing, interfaces that actively watch, listen, catalog, and assist become possible. With increased CPU speed we might design interactive techniques based on work context rather than isolated event handling. Fast event dispatch becomes less important than helpful action. Tools might pursue multiple redundant paths, leaving the user to choose and approve rather than manually specify. We can afford to "waste" time and space on indexing information and tasks that may never be used, solely for the purpose of optimizing user effort. With increased storage capacity it becomes possible to store every piece of interactive information that a user, or even a virtual community, ever sees. The processes of sifting, sorting, finding, and arranging increase in importance relative to the editing and browsing that characterize today's interfaces. When it is physically possible to store every paper, e-mail, voice-mail, and phone conversation in a user's working life, the question arises of how to provide effective access.
· Speech, Handwriting, Natural Language, and Other Modalities: The use of speech will increase the need to allow user-centered presentation of information. Where the form and mode of the output generated by computer-based systems is currently defined by the system designer, a new trend may be to increasingly allow the user to determine the way in which the computer will interact and to support multiple modalities at the same time. For instance, the user may determine that in a given situation, textual natural language output is preferred to speech, or that pictures may be more appropriate than words. These distinctions will be made dynamically, based on the abilities of the user or the limitations of the presentation environment. As the computing environment used to present data becomes distinct from the environment used to create or store information, interface systems will need to support information adaptation as a fundamental property of information delivery.
· 3D and Virtual Reality: Another trend is the migration from a two-dimensional presentation space (or a 2 1/2-dimensional space, in the case of overlapping windows) to three-dimensional spaces. The beginning of this, in terms of a conventional presentation environment, is the definition of the Virtual Reality Modeling Language (VRML). Other evidence includes the use of integrated 3D input and output control in virtual reality systems. The notions of selecting and interacting with information will need to be revised, and techniques for navigation through information spaces will need to be radically altered from the present page-based models. Three-dimensional technologies offer significant opportunities for human-computer interfaces. Application areas that may benefit from three-dimensional interfaces include training and simulation, as well as interactive exploration of complex data environments.
A central aspect of three-dimensional interfaces is "near-real-time" interactivity, the ability for the system to respond quickly enough that the effect of direct manipulation is achieved. Near-real-time interactivity implies strong performance demands that touch on all aspects of an application, from data management through computation to graphical rendering. Designing interfaces and applications to meet these demands in an application-independent manner presents a major challenge to the HCI community. Maintaining the required performance in the context of an unpredictable user-configured environment implies a "time-critical" capability, where the system automatically gracefully degrades quality in order to maintain performance. The design of general algorithms for time-critical applications is a new area and a significant challenge.
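As an illustration only, the following sketch shows one way a time-critical loop might trade rendering quality for responsiveness; the frame budget, the adjustment factors, and renderScene are hypothetical placeholders rather than any particular system's algorithm.

```typescript
// Hypothetical sketch of a "time-critical" render loop: if a frame takes too
// long, the level of detail is reduced so interactivity is preserved.

const TARGET_FRAME_MS = 33;   // roughly 30 frames per second
let detailLevel = 1.0;        // 1.0 = full quality, lower = coarser rendering

function renderScene(detail: number): void {
  // Placeholder for real rendering work; cost grows with the detail level.
  const work = Math.floor(10000 * detail);
  let x = 0;
  for (let i = 0; i < work; i++) x += Math.sqrt(i);
}

function frame(): void {
  const start = Date.now();
  renderScene(detailLevel);
  const elapsed = Date.now() - start;

  // Gracefully degrade quality when the budget is missed; recover when there is slack.
  if (elapsed > TARGET_FRAME_MS) detailLevel = Math.max(0.1, detailLevel * 0.8);
  else if (elapsed < TARGET_FRAME_MS / 2) detailLevel = Math.min(1.0, detailLevel * 1.1);

  setTimeout(frame, Math.max(0, TARGET_FRAME_MS - elapsed));
}

frame();
```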
CHAPTER TWO
CURRENT DEVELOPMENT
Current development in HCI focuses on advanced user interface design, human perception and cognitive science, artificial intelligence, and virtual reality.
Human Perception and Cognitive Science
Why do we always need to type into the computer in order for it to do something for us? A very active subfield of HCI these days is human perception and cognitive science. The goal is to enable computers to recognize human actions in the same way humans perceive things. Active subfields include natural language and speech recognition and gesture recognition. Natural language interfaces enable the user to communicate with the computer in their natural language. Some applications of such interfaces are database queries, information retrieval from texts, and so-called expert systems. Current advances in recognition of spoken language improve the usability of many types of natural language systems. Communication with computers using spoken language will have a lasting impact upon the work environment, opening up completely new areas of application for information technology. In recent years a substantial amount of research has been invested in applying the computer science tool of computational complexity theory to natural language and linguistic theory, and scientists have found that Word Grammar recognition is computationally intractable (NP-hard, in fact). Thus, we still have a long way to go before we can conquer this important field of study.
Reasoning, Intelligent Filtering, and Artificial Intelligence
To realize the full potential of HCI, the computer has to share in the reasoning involved in interpreting and intelligently filtering the input provided by the human to the computer or, conversely, the information presented to the human. Currently, many scientists and researchers are involved in developing the scientific principles underlying these reasoning mechanisms. The approaches used vary widely, but all of them build on fundamental directions such as case-based reasoning, learning, computer-aided instruction, natural language processing, and expert systems. Among these, computer-aided instruction (CAI) also has its origins in the 1960s. These systems were designed to tutor users, thus augmenting, or perhaps substituting for, human teachers. Expert systems are software tools that attempt to model some aspect of human reasoning within a domain of knowledge. Initially, expert systems relied on human experts for their knowledge (an early success in this field was MYCIN [11], developed in the early 1970s under Edward Shortliffe). Now, scientists are focusing on building expert systems that do not rely on human experts.
Virtual Reality
From the days when we used wires and punch cards to input data to the computer and received output via blinking lights, to today's easy-to-use, easy-to-manipulate GUIs, the advancement in user interfaces has been astonishing. However, many novice computer users still find computers hard to access, and even for the experienced user, current computer interfaces remain restricting in the sense that one cannot communicate with computers in all the ways one would want. A complete theory of communication must be able to account for all the ways that people communicate, not just natural language. Therefore, virtual reality becomes the ultimate goal of computer interface design. Virtual reality has its origins in the 1950s, when the first video-based flight simulator systems were developed for the military. These days it receives more and more attention not only from scientists but also from the general population (the popularity of the movie "The Matrix" is one demonstration).
Up-and-Coming Areas
Gesture Recognition: The first pen-based input device, the RAND tablet, was funded by ARPA. Sketchpad used light-pen gestures (1963). Teitelman in 1964 developed the first trainable gesture recognizer. A very early demonstration of gesture recognition was Tom Ellis' GRAIL system on the RAND tablet (1964, ARPA funded). It was quite common in light-pen-based systems to include some gesture recognition, for example in the AMBIT/G system (1968 -- ARPA funded). A gesture-based text editor using proof-reading symbols was developed at CMU by Michael Coleman in 1969. Bill Buxton at the University of Toronto has been studying gesture-based interactions since 1980. Gesture recognition has been used in commercial CAD systems since the 1970s, and came to universal notice with the Apple Newton in 1992.
Multi-Media: The FRESS project at Brown used multiple windows and integrated text and graphics (1968, funding from industry). The Interactive Graphical Documents project at Brown was the first hypermedia (as opposed to hypertext) system, and used raster graphics and text, but not video (1979-1983, funded by ONR and NSF). The Diamond project at BBN (starting in 1982, DARPA funded) explored combining multimedia information (text, spreadsheets, graphics, speech). The Movie Manual at the Architecture Machine Group (MIT) was one of the first to demonstrate mixed video and computer graphics in 1983 (DARPA funded).
3-D: The first 3-D system was probably Timothy Johnson's 3-D CAD system mentioned above (1963, funded by the Air Force). The "Lincoln Wand" by Larry Roberts was an ultrasonic 3D location sensing system, developed at Lincoln Labs (1966, ARPA funded). That system also had the first interactive 3-D hidden line elimination. An early use was for molecular modeling. The late 60's and early 70's saw the flowering of 3D raster graphics research at the University of Utah with Dave Evans, Ivan Sutherland, Romney, Gouraud, Phong, and Watkins, much of it government funded. Also, the military-industrial flight simulation work of the 60's - 70's led the way to making 3-D real-time with commercial systems from GE, Evans & Sutherland, Singer/Link (funded by NASA, Navy, etc.). Another important center of current research in 3-D is Fred Brooks' lab at UNC.
Virtual Reality and "Augmented Reality": The original work on VR was performed by Ivan Sutherland when he was at Harvard (1965-1968, funding by Air Force, CIA, and Bell Labs). Very important early work was by Tom Furness when he was at Wright-Patterson AFB. Myron Krueger's early work at the University of Connecticut was influential. Fred Brooks' and Henry Fuch's groups at UNC did a lot of early research, including the study of force feedback (1971, funding from US Atomic Energy Commission and NSF). Much of the early research on head-mounted displays and on the Data Glove was supported by NASA.
Computer Supported Cooperative Work: Doug Engelbart's 1968 demonstration of NLS included the remote participation of multiple people at various sites (funding from ARPA, NASA, and Rome ADC). Licklider and Taylor predicted on-line interactive communities in a 1968 article and speculated about the problem of access being limited to the privileged. Electronic mail, still the most widespread multi-user software, was enabled by the ARPAnet, which became operational in 1969, and by the Ethernet from Xerox PARC in 1973. An early computer conferencing system was Turoff's EIES system at the New Jersey Institute of Technology (1975).
Natural language and speech: The fundamental research for speech and natural language understanding and generation has been performed at CMU, MIT, SRI, BBN, IBM, AT&T Bell Labs, and Bellcore, much of it government funded. Surveys of the early work are available elsewhere.
New Frontiers
Now let us take a look at some of the newest developments in HCI.
Intelligent Room
The Intelligent Room is a project of the MIT Artificial Intelligence Laboratory. The goal of the project, as Michael H. Coen of the lab puts it, is "creating spaces in which computation is seamlessly used to enhance ordinary, everyday activities." The researchers want to incorporate computers into the real world by embedding them in regular environments, such as homes and offices, and allowing people to interact with them the way they do with other people. The user interfaces of these systems are not menus, mice, and keyboards but gesture, speech, affect, context, and movement. Their applications are not word processors and spreadsheets, but smart homes and personal assistants. "Instead of making computer-interface for people, it is of more fundamental value to make people-interfaces for computers."
They have built two Intelligent Rooms in the laboratory. The rooms have cameras for eyes and microphones for ears, making accessible the real-world phenomena occurring within them. A multitude of computer vision and speech understanding systems then help interpret human-level phenomena, such as what people are saying and where they are standing. By embedding user interfaces this way, the fact that people tend to point at what they are speaking about is no longer meaningless from a computational viewpoint, and systems can be built that make use of that information. Coupled with these natural interfaces is the expectation that the systems are not only highly interactive (they talk back when spoken to) but, more importantly, useful during ordinary activities. They enable tasks historically outside the normal range of human-computer interaction by connecting computers to phenomena (such as someone sneezing or walking into a room) that have traditionally been outside the purview of contemporary user interfaces. Thus, in the future, you can imagine that elderly people's homes would call an ambulance if they saw anyone fall down. Similarly, you can imagine kitchen cabinets that automatically lock when young children approach them.
Brain-Machine Interfaces
Scientists are not satisfied with communicating with computers using natural language or gestures and movements. Instead, they ask: why can't computers just do what people have in mind? Out of questions like this come brain-machine interfaces. Miguel Nicolelis, a Duke University neurobiologist, is one of the leading researchers in this competitive and highly significant field. Only about a half-dozen teams around the world are pursuing the same goals: gaining a better understanding of how the mind works and then using that knowledge to build implant systems that would make brain control of computers and other machines possible. Nicolelis terms such systems "hybrid brain-machine interfaces" (HBMIs). Recently, working with the Laboratory for Human and Machine Haptics at MIT, he was able to send signals from individual neurons in the brain of Belle, a nocturnal owl monkey, to a robot, which used the data to mimic the monkey's arm movements in real time. Scientists predict that brain-machine interfaces will allow human brains to control artificial devices designed to restore lost sensory and motor functions. Paralysis sufferers, for example, might gain control over a motorized wheelchair or a prosthetic arm, or perhaps even regain control over their own limbs. They believe the brain will prove capable of readily assimilating human-made devices in much the same way that a musician grows to feel that his or her instrument is a part of his or her own body. Ongoing experiments in other labs are showing that the idea is credible. At Emory University, neurologist Phillip Kennedy has helped severely paralyzed people communicate via a brain implant that allows them to move a cursor on a computer screen. However, scientists still know relatively little about how the electrical and chemical signals emitted by the brain's millions of neurons let us perceive color and smell, or give rise to the precise movements of professional dancers. Numerous stumbling blocks remain to be overcome before human brains can interface reliably and comfortably with artificial devices, or before mind-controlled prosthetic limbs become practical. Among the key challenges is developing electrode devices and surgical methods that will allow safe, long-term recording of neuronal activity.
Conclusion - a look at the future
In conclusion, Human-Computer Interaction holds great promise. Exploiting this tremendous potential can bring profound benefits in all areas of human concern. Just imagine that one day we will be able to tell computers to do what we want them to do, use gestures and hand signals to command them, or directly invoke them through our thoughts. One day we will be able to call up an artificial intelligence from the computer, or better yet, a hologram (YES! I am a diehard Star Trek fan) to perform the tasks that we cannot accomplish, to aid in emergency situations, or simply to have someone to talk to who will listen. How bright a future that will be, all thanks to the research that is going to be done in the Human-Computer Interaction field.
CHAPTER THREE
CONCEPT AND DESIGN IN HCI
Design and Evaluation Methods
Design and evaluation methods have evolved rapidly as the focus of human-computer interaction has expanded. Contributing to this are the versatility of software and the downward price and upward performance spiral, which continually extend the applications of software. The challenges overshadow those faced by designers using previous media and assessment methods. Design and evaluation for a monochrome, ASCII, stand-alone PC was challenging, and still does not routinely use more than ad hoc methods and intuition. New methods are needed to address the complexities of multimedia design, of supporting networked group activities, and of responding to routine demands for ever-faster turnaround times.
More rapid evaluation methods will remain a focus, manifest in recent work on cognitive walkthrough, heuristic evaluation, and other modifications of earlier cognitive modeling and usability engineering approaches. Methods to deal with the greater complexity of assessing use in group settings are moving from research into the mainstream. Ethnographic observation, participatory design, and scenario-based design are being streamlined. Contextual inquiry and design is an example of a method intended to quickly obtain a rich understanding of an activity and transfer that understanding to all design team members.
As well as developing and refining the procedures of design and evaluation methods, we need to understand the conditions under which they work. Are some better for individual tasks, some excellent for supporting groupware? Are some useful very early in the conceptual phase of design, others best when a specific interface design has already been detailed, and some restricted to when a prototype is in existence? In addition, for proven and promising techniques to become widespread, they need to be incorporated into the education of UI designers. Undergraduate curricula should require such courses for a subset of their students; continuing education courses need to be developed to address the needs of practicing designers.
Tools
All the forms of computer-human interaction discussed here will need to be supported by appropriate tools. The interfaces of the future will use multiple modalities for input and output (speech and other sounds, gestures, handwriting, animation, and video), multiple screen sizes (from tiny to huge), and have an "intelligent" component ("wizards" or "agents" to adapt the interface to the different wishes and needs of the various users). The tools used to construct these interfaces will have to be substantially different from those of today. Whereas most of today's tools well support widgets such as menus and dialog boxes, these will be a tiny fraction of the interfaces of the future. Instead, the tools will need to access and control in some standard way the main application data structures and internals, so the speech system and agents can know what the user is talking about and doing. If the user says "delete the red truck," the speech system needs access to the objects to see which one is to be deleted. Otherwise, each application will have to deal with its own speech interpretation, which is undesirable. Furthermore, an agent might notice that this is the third red truck that was deleted, and propose to delete the rest. If confirmed, the agent will need to be able to find the rest of the trucks that meet the criteria. Increasingly, future user interfaces will be built around standardized data structures or "knowledge bases" to make these facilities available without requiring each application to rebuild them.
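A rough sketch of the idea, with entirely hypothetical names, might look like the following: the application registers its objects in a shared registry, and a speech front end resolves a parsed command such as "delete the red truck" against that registry rather than against the application's private data structures.

```typescript
// Hypothetical sketch: the application registers its objects in a standard
// registry so a speech front end can resolve "delete the red truck" itself.

interface DomainObject {
  id: number;
  type: string;                       // e.g. "truck"
  attributes: Record<string, string>; // e.g. { color: "red" }
}

class ObjectRegistry {
  private objects: DomainObject[] = [];
  register(obj: DomainObject): void { this.objects.push(obj); }
  remove(id: number): void { this.objects = this.objects.filter(o => o.id !== id); }
  find(type: string, attrs: Record<string, string>): DomainObject[] {
    return this.objects.filter(
      o => o.type === type && Object.entries(attrs).every(([k, v]) => o.attributes[k] === v)
    );
  }
}

// A speech agent resolves a parsed command against the shared registry.
function deleteMatching(registry: ObjectRegistry, type: string, attrs: Record<string, string>): void {
  const matches = registry.find(type, attrs);
  if (matches.length === 1) registry.remove(matches[0].id);
  else console.log(`Which of the ${matches.length} matching ${type}s do you mean?`);
}

const registry = new ObjectRegistry();
registry.register({ id: 1, type: "truck", attributes: { color: "red" } });
registry.register({ id: 2, type: "truck", attributes: { color: "blue" } });
deleteMatching(registry, "truck", { color: "red" });  // removes the red truck
```

The same registry also gives an agent something to reason over: noticing that several red trucks have been deleted is just another query against the shared object store.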
These procedures should be supported by the system-building tools themselves. This would make the evaluation of ideas extremely easy for designers, allowing ubiquitous evaluation to become a routine aspect of system design.
Concepts of User Interface Design
Learnability vs. Usability
Many people consider the primary criterion for a good user interface to be the degree to which it is easy to learn. This is indeed a laudable quality of any user interface, but it is not necessarily the most important.
The goal of the user interface should be foremost in the design process. Consider the example of a visitor information system located on a kiosk. In this case it makes perfect sense that the primary goal for the interface designers should be ease of operation for the first-time user. The more the interface walks the user through the system step by step, the more successful the interface would be.
In contrast, consider a data entry system used daily by an office of heads-down operators. Here the primary goal should be that the operators can input as much information as possible as efficiently as possible. Once the users have learned how to use the interface, anything intended to make first-time use easier will only get in the way.
User interface design is not a "one size fits all" process. Every system has its own considerations and accompanying design goals. The Requirements Phase is designed to elicit from the design team the kind of information that should make these goals clear.
Metaphors and Idioms
The True Role of Metaphors in the GUI
When the GUI first entered the market, it was heralded most of all for its use of metaphors. Careful consideration of what really made the GUI successful, however, would appear to indicate that the use of metaphors was actually a little further down in the list. Metaphors were really nothing new. The term computer "file" was chosen as a metaphor for a collection of separate but related items held in a single container. This term dates back to the very early days of computers.
The single most significant aspect of the GUI was the way in which it presented all possible options to the users rather than requiring them to memorize commands and enter them without error. This has nothing to do with metaphor and everything to do with focusing the user interface on the needs of the user rather than mandating that the user conform to the needs of the computer. The visual aspect of the GUI was also a tremendous advancement. People often confuse this visual presentation with pure metaphor, but closer inspection reveals that this is not necessarily the case. The "desktop" metaphor was the first thing to hit users of the GUI. Since it was a global metaphor and the small pictures of folders, documents, and diskettes played directly into it, people bought the entire interface as one big metaphor. But there are significant aspects of the GUI that have nothing to do with metaphor.
Metaphors vs Idioms
If someone says that a person "wants to have his cake and eat it too," we can intuit the meaning of the expression through its metaphoric content. The cake is a metaphor for that which we desire, and the expectation of both possessing it and consuming it is metaphoric for the assumption that acquisition of our desires comes at no cost. But if someone says that his pet turtle "croaked," it is not possible to intuit the meaning through the metaphoric content of the expression. The expression "croaked" is an idiom. We know instantly that the turtle didn't make a funny noise but rather that it died. The meaning of the idiom must be learned, but it is learned quickly and, once learned, retained indefinitely.
Most visual elements of the GUI are better thought of as idioms. A scroll bar, for example, is not a metaphor for anything in the physical world. It is an entirely new construct, yet it performs an obvious function, its operation is easily mastered, and users easily remember how it works. It is the visual aspect of the scroll bar that allows it to be learned so quickly. Users operate it with visual clues rather than remembering the keys for line up, line down, page up, page down, etc.
Metaphors Can Hinder As Well As Help
The use of metaphor can be helpful when it fits well into a situation, but it is not a panacea and is not guaranteed to add value. The use of icons as metaphors for functions is a good example. It is a gamble whether someone will understand the connection between an icon and the function it represents. Anyone who has played Pictionary knows that the meaning of a picture is not always clear.
Consider the Microsoft Word 5.0 toolbar. Some icons are readily identifiable, some are not. The meaning of the identifiable icons will likely be gleaned from the icon itself, but even that is not guaranteed. The unidentifiable icons, however, can be utterly perplexing, and rather than helping they can create confusion and frustration. And with so many pictographs crammed into such a small space, the whole thing reads like a row of enigmatic, ancient Egyptian hieroglyphs.
The Netscape toolbar, by contrast, is much more graceful and useful. The buttons are a bit larger, which makes them generally more readable. Their added size also allows the inclusion of text labels indicating the command to which each icon corresponds. Once the meaning of each icon has been learned, the icon can serve as a visual mnemonic, but until then the text label clearly and unambiguously relays the function the button will initiate.
The Netscape toolbar admittedly consumes more valuable window real estate than the Microsoft Word toolbar does. There are keystroke shortcuts for every button, however, and users who have mastered them can easily hide the toolbar from view. Users who prefer to use the toolbar are probably willing to sacrifice that small bit of real estate in order to have a toolbar that is presentable and easy to use.
The "Global Metaphor" Quagmire
One major pitfall into which metaphors can lead us is the "Global Metaphor," which is a metaphor that is intended to encompass an entire application. The "desktop" concept is an example of a global metaphor.
The global metaphor becomes a quagmire when reality begins to diverge from the metaphor. Consider the desktop metaphor carefully: it deviates from reality immediately. The trash can is a wonderful metaphor for the deletion function, but trash cans are generally not situated on top of a desk.
The use of the trash can to eject a disk is a perfect example of contorting the metaphor to accommodate the divergence from reality. The expectation is that "trashing" a disk will delete its contents, yet the interface designers needed a way to eject a disk and the trash can came closer than anything else. Once learned it becomes an idiom that works fine, but it is initially counter-intuitive to the point that it is shocking.
The vertical aspect of the desktop also subverts the metaphor. It's closer to a refrigerator on which one can randomly place differently shaped magnets, or the old-fashioned displays on which TV weathermen placed various symbols. The fact that the desktop metaphor has to be explained to first-time users is an indication that it might not be terribly intuitive.
The global metaphor is an example of the "bigger is better" mentality. Metaphors are perceived as being useful, so some people assume that the more all-encompassing a metaphor is, the more useful it will be. As in all other situations, the usefulness of a global metaphor is dictated by the overall goals of the interface. If the goal of the interface is to present a non-threatening face on a system that will be used primarily by non-technical first-time users, a global metaphor might be useful. But if the goal of the interface is to input large quantities of data quickly and effectively, a global metaphor might be an enormous hindrance.
Don't Throw The Baby Out With The Bath Water
While metaphors aren't always as useful as other solutions, it is important to note that in the right situation they can be a vital part of a quality user interface. The folder is a particularly useful and successful metaphor. Its purpose is immediately apparent, and by placing one folder inside another the user creates a naturally intuitive hierarchy. The counterpart in the character user interface is the directory/subdirectory construct. This has no clear correspondence to anything in the physical world, and many non-technical people have difficulty grasping the concept.
The bottom line is that if a metaphor works naturally, by all means use it. But at the first hint that the metaphor is not clearly understood or has to be contorted in order to accommodate reality, consider carefully whether it is really helping.
Intuitiveness
It is generally perceived that the most fundamental quality of any good user interface should be that it is intuitive. The problem is that "intuitive" means different things to different people. To some an intuitive user interface is one that users can figure out for themselves. There are some instances where this is helpful, but generally the didactic elements geared for the first-time user will hamper the effectiveness of intermediate or advanced users.
A much better definition of an intuitive user interface is one that is easy to learn. This does not mean that no instruction is required, but that it is minimal and that users can "pick it up" quickly and easily. First-time users might not intuit how to operate a scroll bar, but once it is explained they generally find it to be an intuitive idiom.
Icons, when clearly unambiguous, can help to make a user interface intuitive. But the user interface designer should never overlook the usefulness of good old-fashioned text labels. Icons depicting portrait or landscape orientation, for example, are clearly unambiguous and perhaps more intuitive than the labels themselves, but without the label of "orientation," they could make no sense at all.
Labels should be concise, cogent, and unambiguous. A good practice is to make labels conform to the terminology of the business that the application supports. This is a good way to pack a lot of meaning into a very few words.
Designing intuitive user interfaces is far more an art than a science. It draws more upon skills of psychology and cognitive reasoning than computer engineering or even graphic design. The process of Usability Testing, however, can assess the intuitiveness of a user interface in an objective manner. Designing an intuitive user interface is like playing a good game of tennis. Instructors can tell you how to do it, but it can only be achieved through hard work and practice with a lot of wins and losses on the way.
Consistency
Consistency between applications is always good, but within an application it is essential. The standard GUI design elements go a long way to bring a level of consistency to every panel, but "look and feel" issues must be considered as well. The use of labels and icons must always be consistent. The same label or icon should always mean the same thing, and conversely the same thing should always be represented by the same label or icon.
In addition to consistency of labeling, objects should also be placed in a consistent manner. Consider the example of the Employee Essentials Address Update panels (available through Bear Access).
There is a different panel for every address that can be updated, each with its own set of fields to be displayed and modified. Note that each panel is clearly labeled, with the label appearing in the same location on every panel. A button bank appears in the same place along the left side of every panel. Some buttons must change to accommodate the needs of any given panel, but positionality was used consistently. The closer buttons are to the top the less likely they are to change, and the closer to the bottom the more likely.
Note especially the matrix of buttons at the top left corner of every panel. These buttons are the same in every panel of the entire Employee Essentials application. They are known as "permanent objects." Early navigators used stars and constellations as unchanging reference points around which they could plot their courses. Similarly, modern aviation navigators use stationary radar beacons. They know that wherever the plane is, they can count on the radar beacon always being in the same place.
User interface designers should always provide permanent objects as unchanging reference points around which the users can navigate. If they ever get lost or disoriented, they should be able to quickly find the permanent objects and from there get to where they need to be. On the Macintosh, the apple menu and applications menu are examples of permanent objects. No matter what application the user is in, those objects will appear on the screen.
Almost all Macintosh applications provide "File" and "Edit" as the first two pull-down menus. The "File" menu generally has "New," "Open," "Close," "Save," and "Save As" as the first selections in the menu, and "Quit" as the last selection. The "Edit" menu generally has "Cut," "Copy," and "Paste" as the first selections. The ubiquity of these conventions has caused them to become permanent objects. The users can count on finding them in virtually all circumstances, and from there do what they need to do.
Bear Access itself is becoming a permanent object at Cornell. If a user is at an unfamiliar workstation, all he or she needs to do is locate Bear Access, and from there an extensive suite of applications will be available.
Simplicity
The complexity of computers and the information systems they support often causes us to overlook Occam's Razor, the principle that the most graceful solution to any problem is the one which is the most simple.
A good gauge of simplicity is often the number of panels that must be displayed and the number of mouse clicks or keystrokes that are required to accomplish a particular task. All of these should be minimized. The fewer things users have to see and do in order to get their work done, the happier and more effective they will be.
A good example of this is the way in which the user sets the document type in Microsoft Word version 5.0 as compared to version 4.0. In version 4.0, the user clicks a button on the save dialog that presents another panel in which there is a selection of radio buttons indicating all the valid file types. In version 5.0, there is simply a popup list on the save dialog. This requires fewer panels to be displayed and fewer mouse clicks to be made, and yet accomplishes exactly the same task.
A pitfall that should be avoided is "featuritis," providing an over-abundance of features that do not add value to the user interface. New tools that are available to developers allow all kinds of things to be done that weren't possible before, but it is important not to add features just because it's possible to do so. The indiscriminate inclusion of features can confuse the users and lead to "window pollution." Features should not be included on a user interface unless there is a compelling need for them and they add significant value to the application.
Prevention
A fundamental tenet of graphic user interfaces is that it is preferable to prevent users from attempting an inappropriate task in the first place rather than letting them try it and then presenting a message saying that it couldn't be done. This is accomplished by disabling, or "graying out," certain elements under certain conditions.
Consider the average save dialog. A document cannot be saved if it has not been given a name. Note how the Save button is disabled when the name field is blank, but is enabled when a name has been entered.
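A minimal browser-side sketch of this behavior might look like the following; the element ids ("doc-name", "save") are assumptions made for illustration.

```typescript
// Hypothetical sketch of the prevention principle in a browser form:
// the Save button is disabled whenever the name field is empty.

const nameField = document.getElementById("doc-name") as HTMLInputElement;
const saveButton = document.getElementById("save") as HTMLButtonElement;

function updateSaveButton(): void {
  // Gray out Save until there is a name to save under.
  saveButton.disabled = nameField.value.trim().length === 0;
}

nameField.addEventListener("input", updateSaveButton);
updateSaveButton(); // set the initial state when the dialog opens
```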
Forgiveness
One of the advantages of graphic user interfaces is that with all the options plainly laid out for users, they are free to explore and discover things for themselves. But this requires that there always be a way out if they find themselves somewhere they realize they shouldn't be, and that special care is taken to make it particularly difficult to "shoot themselves in the foot." A good tip to keep users from inadvertently causing damage is to avoid the use of the Okay button in critical situations. It is much better to have button labels that clearly indicate the action that will be taken.
Consider the example when the user closes a document that contains changes that have not been saved. It can be very misleading to have a message that says "Continue without saving?" and a default button labeled "Okay." It is much better to have a dialog that says "Document has been changed" and a default button labeled "Save", with a "Don't save" button to allow the user not to save changes if that is, in fact, the desired action.
Likewise, it can be helpful in potentially dangerous situations to have the Cancel button be the default button so that it must be a deliberate action on the part of the user to execute the function. An example is a confirmation dialog when a record is being deleted.
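One way to express these two conventions in code, purely as a sketch with hypothetical names, is to describe each dialog by its message, its explicitly labeled buttons, and a deliberately chosen default.

```typescript
// Hypothetical sketch: confirmation dialogs described by explicit, action-named
// buttons and a deliberately chosen default, rather than a generic "Okay".

interface DialogSpec {
  message: string;
  buttons: { label: string; action: () => void }[];
  defaultButton: string;   // the button triggered by Return/Enter
}

function closeDocumentDialog(save: () => void, discard: () => void, cancel: () => void): DialogSpec {
  return {
    message: "Document has been changed.",
    buttons: [
      { label: "Save", action: save },
      { label: "Don't Save", action: discard },
      { label: "Cancel", action: cancel },
    ],
    defaultButton: "Save",   // the least destructive action is the default
  };
}

// For a deletion, the safe choice becomes the default instead.
function deleteRecordDialog(del: () => void, cancel: () => void): DialogSpec {
  return {
    message: "Delete this record?",
    buttons: [
      { label: "Delete", action: del },
      { label: "Cancel", action: cancel },
    ],
    defaultButton: "Cancel",
  };
}
```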
Aesthetics
Finally, it is important that a user interface be aesthetically pleasing. It is possible for a user interface to be intuitive, easy to use, and efficient and still not be terribly nice to look at. While aesthetics do not directly impact the effectiveness of a user interface, users will be happier and therefore more productive if they are presented with an attractive user interface.
CHAPTER FOUR
Principles for User-Interface Design
This section represents a compilation of fundamental principles for designing user interfaces, which have been drawn from various books on interface design, as well as my own experience. Most of these principles can be applied to either command-line or graphical environments. I welcome suggestions for changes and additions -- I would like this to be viewed as an "open-source" evolving section.
The principle of user profiling
-- Know who your user is.
Before we can answer the question "How do we make our user-interfaces better", we must first answer the question: Better for whom? A design that is better for a technically skilled user might not be better for a non-technical businessman or an artist.
One way around this problem is to create user models. [TOG91] has an excellent chapter on brainstorming towards creating "profiles" of possible users. The result of this process is a detailed description of one or more "average" users, with specific details such as:
What are the user's goals?
What are the user's skills and experience?
What are the user's needs?
Armed with this information, we can then proceed to answer the question: How do we leverage the user's strengths and create an interface that helps them achieve their goals?
In the case of a large general-purpose piece of software such as an operating system, there may be many different kinds of potential users. In this case it may be more useful to come up with a list of user dichotomies, such as "skilled vs. unskilled", "young vs. old", etc., or some other means of specifying a continuum or collection of user types.
Another way of answering this question is to talk to some real users. Direct contact between end-users and developers has often radically transformed the development process.
The principle of metaphor
-- Borrow behaviors from systems familiar to your users.
Frequently a complex software system can be understood more easily if the user interface is depicted in a way that resembles some commonplace system. The ubiquitous "Desktop metaphor" is an overused and trite example. Another is the tape deck metaphor seen on many audio and video player programs. In addition to the standard transport controls (play, rewind, etc.), the tape deck metaphor can be extended in ways that are quite natural, with functions such as time-counters and cueing buttons. This concept of "extendibility" is what distinguishes a powerful metaphor from a weak one.
There are several factors to consider when using a metaphor:
Once a metaphor is chosen, it should be spread widely throughout the interface, rather than used once at a specific point. Even better is to use the same metaphor across several applications (the tape transport controls described above are a good example). Don't bother thinking up a metaphor which is only going to apply to a single button.
There's no reason why an application cannot incorporate several different metaphors, as long as they don't clash. Music sequencers, for example, often incorporate both "tape transport" and "sheet music" metaphors.
Metaphor isn't always necessary. In many cases the natural function of the software itself is easier to comprehend than any real-world analog of it. Don't strain a metaphor in adapting it to the program's real function. Nor should you strain the meaning of a particular program feature in order to adapt it to a metaphor.
Incorporating a metaphor is not without certain risks. In particular, whenever physical objects are represented in a computer system, we inherit not only the beneficial functions of those objects but also the detrimental aspects.
Be aware that some metaphors don't cross cultural boundaries well. For example, Americans would instantly recognize the common U.S. Mailbox (with a rounded top, a flat bottom, and a little red flag on the side), but there are no mailboxes of this style in Europe.
The principle of feature exposure
-- Let the user see clearly what functions are available
Software developers tend to have little difficulty keeping large, complex mental models in their heads. But not everyone prefers to "live in their heads" -- some prefer to concentrate on analyzing the sensory details of the environment, rather than spending large amounts of time refining and perfecting abstract models. Both types of personality (labeled "Intuitive" and "Sensable" in the Myers-Briggs personality classification) can be equally intelligent, but they focus on different aspects of life. According to some psychological studies, "Sensables" outnumber "Intuitives" in the general population by about three to one.
Intuitives prefer user interfaces that utilize the power of abstract models -- command lines, scripts, plug-ins, macros, etc. Sensables prefer user interfaces that utilize their perceptual abilities -- in other words, they like interfaces where the features are "up front" and "in their face". Toolbars and dialog boxes are an example of interfaces that are pleasing to this personality type.
This doesn't mean that you have to make everything a GUI. What it does mean, for both GUI and command line programs, is that the features of the program need to be easily exposed so that a quick visual scan can determine what the program actually does. In some cases, such as a toolbar, the program features are exposed by default. In other cases, such as a printer configuration dialog, the exposures of the underlying printer state (i.e. the buttons and controls which depict the conceptual printing model) are contained in a dialog box which is brought up by a user action (a feature which is itself exposed in a menu).
Of course, there may be cases where you don't wish to expose a feature right away, because you don't want to overwhelm the beginning user with too much detail. In this case, it is best to structure the application like the layers of an onion, where peeling away each layer of skin reveals a layer beneath. There are various levels of "hiding": Here's a partial list of them in order from most exposed to least exposed:
Toolbar (completely exposed)
Menu item (exposed by trivial user gesture)
Submenu item (exposed by somewhat more involved user gesture)
Dialog box (exposed by explicit user command)
Secondary dialog box (invoked by button in first dialog box)
"Advanced user mode" controls -- exposed when user selects "advanced" option
Scripted functions
The above notwithstanding, in no case should the primary interface of the application be a reflection of the true complexity of the underlying implementation. Instead, both the interface and the implementation should strive to match a simplified conceptual model (in other words, the design) of what the application does. For example, when an error occurs, the explanation of the error should be phrased in a way that relates to the current user-centered activity, and not in terms of the low-level fault that caused the error.
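For instance, a hypothetical sketch of this translation, with an invented DiskFullError and saveDocument, might read:

```typescript
// Hypothetical sketch: low-level faults are translated into messages phrased
// in terms of the user's current activity, not the implementation.

class DiskFullError extends Error {}

function saveDocument(name: string, write: (name: string) => void): string {
  try {
    write(name);
    return `Saved "${name}".`;
  } catch (e) {
    if (e instanceof DiskFullError) {
      // User-centered: what were they doing, and what can they do next?
      return `Your document "${name}" could not be saved because the disk is full. ` +
             `Free some space or choose another disk, then save again.`;
    }
    return `Your document "${name}" could not be saved.`;  // never "errno 28"
  }
}
```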
The principle of coherence
-- The behavior of the program should be internally and externally consistent
There's been some argument over whether interfaces should strive to be "intuitive", or whether an intuitive interface is even possible. However, it is certainly arguable that an interface should be coherent -- in other words logical, consistent, and easily followed. ("Coherent" literally means "stick together", and that's exactly what the parts of an interface design should do.)
Internal consistency means that the program's behaviors make "sense" with respect to other parts of the program. For example, if one attribute of an object (e.g. color) is modifiable using a pop-up menu, then it is to be expected that other attributes of the object would also be editable in a similar fashion. One should strive towards the principle of "least surprise".
External consistency means that the program is consistent with the environment in which it runs. This includes consistency with both the operating system and the typical suite of applications that run within that operating system. One of the most widely recognized forms of external coherence is compliance with user-interface standards. There are many others, however, such as the use of standardized scripting languages, plug-in architectures or configuration methods.
The principle of state visualization
-- Changes in behavior should be reflected in the appearance of the program
Each change in the behavior of the program should be accompanied by a corresponding change in the appearance of the interface. One of the big criticisms of "modes" in interfaces is that many of the classic "bad example" programs have modes that are visually indistinguishable from one another.
Similarly, when a program changes its appearance, it should be in response to a behavior change; a program that changes its appearance for no apparent reason will quickly teach the user not to depend on appearances for clues as to the program's state.
One of the most important kinds of state is the current selection, in other words the object or set of objects that will be affected by the next command. It is important that this internal state be visualized in a way that is consistent, clear, and unambiguous. For example, one common mistake seen in a number of multi-document applications is to forget to "dim" the selection when the window goes out of focus. The result of this is that a user, looking at several windows at once, each with a similar-looking selection, may be confused as to exactly which selection will be affected when they hit the "delete" key. This is especially true if the user has been focusing on the selection highlight, and not on the window frame, and consequently has failed to notice which window is the active one. (Selection rules are one of those areas that are covered poorly by most UI style guidelines, which tend to concentrate on "widgets", although the Mac and Amiga guidelines each have a chapter on this topic.)
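A small sketch, with hypothetical window and manager types rather than any real toolkit, shows the underlying rule: only the focused window may display a fully highlighted selection.

```typescript
// Hypothetical sketch: each window dims its selection highlight when it loses
// focus, so only the active window shows a fully saturated selection.

interface DocumentWindow {
  id: string;
  setSelectionStyle(style: "active" | "dimmed"): void;
}

class WindowManager {
  private windows: DocumentWindow[] = [];
  private active: DocumentWindow | null = null;

  add(w: DocumentWindow): void { this.windows.push(w); }

  focus(w: DocumentWindow): void {
    // Visualize the state change: exactly one window shows an active selection.
    for (const other of this.windows) other.setSelectionStyle("dimmed");
    w.setSelectionStyle("active");
    this.active = w;
  }

  deleteSelection(): void {
    if (this.active) console.log(`deleting the selection in window ${this.active.id}`);
  }
}
```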
The principle of shortcuts
-- Provide both concrete and abstract ways of getting a task done
Once a user has become experienced with an application, she will start to build a mental model of that application. She will be able to predict with high accuracy what the results of any particular user gesture will be in any given context. At this point, the program's attempts to make things "easy" by breaking up complex actions into simple steps may seem cumbersome. Additionally, as this mental model grows, there will be less and less need to look at the "in your face" exposure of the application's feature set. Instead, pre-memorized "shortcuts" should be available to allow rapid access to more powerful functions.
There are various levels of shortcuts, each one more abstract than its predecessor. For example, in the emacs editor commands can be invoked directly by name, by menu bar, by a modified keystroke combination, or by a single keystroke. Each of these is more "accelerated" than its predecessor.
There can also be alternate methods of invoking commands that are designed to increase power rather than to accelerate speed. A "recordable macro" facility is one of these, as is a regular-expression search and replace. The important thing about these more powerful (and more abstract) methods is that they should not be the most exposed methods of accomplishing the task. This is why emacs has the non-regexp version of search assigned to the easy-to-remember "C-s" key.
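The sketch below, using invented names rather than emacs or any real toolkit, shows one way a single command can carry several bindings of increasing abstraction: a menu path for new users, a keystroke for experienced ones, and a name for fully abstract invocation.

```typescript
// Hypothetical sketch: one command, several increasingly abstract ways to invoke it.

interface Command {
  name: string;              // invoked by name, e.g. from a command prompt
  menuPath?: string[];       // exposed in a menu for new users
  keystroke?: string;        // accelerator for experienced users
  run: () => void;
}

class CommandRegistry {
  private commands = new Map<string, Command>();
  register(cmd: Command): void { this.commands.set(cmd.name, cmd); }

  invokeByName(name: string): void { this.commands.get(name)?.run(); }
  invokeByKeystroke(keys: string): void {
    for (const cmd of this.commands.values()) {
      if (cmd.keystroke === keys) { cmd.run(); return; }
    }
  }
}

const registry = new CommandRegistry();
registry.register({
  name: "search-forward",
  menuPath: ["Edit", "Find..."],
  keystroke: "C-s",
  run: () => console.log("searching..."),
});

registry.invokeByKeystroke("C-s");        // the accelerated path
registry.invokeByName("search-forward");  // the fully abstract path
```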
The principle of focus
-- Some aspects of the UI attract attention more than others do
The human eye is a highly non-linear device. For example, it possesses edge-detection hardware, which is why we see Mach bands whenever two closely matched areas of color come into contact. It also has motion-detection hardware. As a consequence, our eyes are drawn to animated areas of the display more readily than static areas. Changes to these areas will be noticed readily.
The mouse cursor is probably the most intensely observed object on the screen -- it's not only a moving object, but mouse users quickly acquire the habit of tracking it with their eyes in order to navigate. This is why global state changes are often signaled by changes to the appearance of the cursor, such as the well-known "hourglass cursor". It's nearly impossible to miss.
The text cursor is another example of a highly eye-attractive object. Changing its appearance can signal a number of different and useful state changes.
The principle of grammar
-- A user interface is a kind of language -- know what the rules are
Many of the operations within a user interface require both a subject (an object to be operated upon), and a verb (an operation to perform on the object). This naturally suggests that actions in the user interface form a kind of grammar. The grammatical metaphor can be extended quite a bit, and there are elements of some programs that can be clearly identified as adverbs, adjectives and such.
The two most common grammars are known as "Action->Object" and "Object->Action". In Action->Object, the operation (or tool) is selected first. When a subsequent object is chosen, the tool immediately operates upon the object. The selection of the tool persists from one operation to the next, so that many objects can be operated on one by one without having to re-select the tool. Action->Object is also known as "modality", because the tool selection is a "mode" which changes the operation of the program. An example of this style is a paint program -- a tool such as a paintbrush or eraser is selected, which can then make many brush strokes before a new tool is selected.
In the Object->Action case, the object is selected first and persists from one operation to the next. Individual actions are then chosen which operate on the currently selected object or objects. This is the method seen in most word processors -- first a range of text is selected, and then a text style such as bold, italic or a font change can be selected. Object->Action has been called "non-modal" because all behaviors that can be applied to the object are always available. One powerful type of Object->Action is called "direct manipulation", where the object itself is a kind of tool -- an example is dragging the object to a new position or resizing it.
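As a toy illustration (the editor classes here are invented, not drawn from any particular system), the two grammars differ in what persists between operations:

```typescript
// Toy sketch contrasting the two grammars: in Action->Object the tool (mode)
// persists between operations; in Object->Action the selection persists.

type Shape = { id: number; color: string };

// Action->Object: pick a tool once, then apply it to many objects.
class PaintEditor {
  private tool: (s: Shape) => void = () => {};
  selectTool(tool: (s: Shape) => void): void { this.tool = tool; }  // enter a mode
  click(shape: Shape): void { this.tool(shape); }                   // the tool acts at once
}

// Object->Action: pick objects once, then apply many actions to them.
class SelectionEditor {
  private selection: Shape[] = [];
  select(shapes: Shape[]): void { this.selection = shapes; }
  apply(action: (s: Shape) => void): void { this.selection.forEach(action); }
}

const erase = (s: Shape) => console.log(`erased shape ${s.id}`);
const makeRed = (s: Shape) => { s.color = "red"; };

const paint = new PaintEditor();
paint.selectTool(erase);                  // the eraser mode persists...
paint.click({ id: 1, color: "blue" });    // ...across any number of clicks
paint.click({ id: 2, color: "blue" });

const editor = new SelectionEditor();
editor.select([{ id: 3, color: "black" }, { id: 4, color: "black" }]);
editor.apply(makeRed);                    // the selection persists across actions
```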
Modality has been much criticized in user-interface literature because early programs were highly modal and had hideous interfaces. However, while non-modality is the clear winner in many situations, there are a large number of situations in life that are clearly modal. For example, in carpentry, it’s generally more efficient to hammer in a whole bunch of nails at once than to hammer in one nail, put down the hammer, pick up the measuring tape, mark the position of the next nail, pick up the drill, etc.
The principle of help
-- Understand the different kinds of help a user needs
An essay in [LAUR91] states that there are five basic types of help, corresponding to the five basic questions that users ask: