tHog

WIMPy GUIs and broken metaphors

2015-01-26

Graphical user interfaces, in particular the WIMP model of windows, icons, menus and pointers, have arguably made a huge contribution to the popularization of computers since the 1980s or thereabouts. Unfortunately, they also dumb down a lot of what is wonderful about computers to begin with, at worst turning them into glorified typewriters and teletypes.

For example, the notion of text as black characters on separate white sheets is a huge burden on communication, and it is a burden that need not exist in the computing world. Yet people continue to emulate that painstaking tradition with word processors.

The desktop metaphor

The basic framework of most WIMP interfaces is so common that we rarely stop to think about it, yet it is the source of a huge number of limitations in computing. Basically, the entire screen represents a desk where you can see different files and applications lying around. This results in a few different, though related, problems:

  1. The space is very limited for thinking big. Of course, people are now used to long texts etc. where only a pageful or so is visible, but that is still a rather limited view on things. A computer could easily handle thousands of texts at once, but it is hard to make use of such capabilities when your worldview is limited by the interface.
  2. The space gets crowded and messy with multiple documents/applications open. A lot of jobs involve working with multiple programs at once, yet it might be nice to focus on one for a while. Common WIMPs have solutions like fullscreening the one application, or minimizing others, but they too have their limits. For example, minimized windows are still visible to some extent.
  3. The metaphor is broken in many small ways. For example, to move a window in Windows or OS X, you have to grab it by the top, which is not necessary when moving stuff around in the real world. Interestingly, Unix/X11 has solved this problem quite nicely: many window managers let you hold down a modifier key such as Alt and drag a window from anywhere inside it.

Other broken metaphors

In general, it seems that the more closely a GUI attempts to duplicate a real-world interface, the more likely the result is a horrible mess that is completely unusable on a computer.

For example, music production software with mock rotary knobs is just asking for trouble. Knobs work in the real world because you can grab and turn them; how exactly do you do that with just a picture on a computer screen? (Scrollwheels are a partial solution, but they represent a whole other class of problems.) Moreover, the setting level is harder to see properly. Linear sliders make much more sense in a GUI, even if they also represent real-world hardware.

The main problem

IMHO, the root issue is that typical WIMPs keep everything visible at once. This reflects a rather infantile view of users; by kindergarten, most people have learned that an object can exist even when you don't see it.

Solutions

Screen size and human vision are real limitations, but they can be turned into benefits by helping users focus on what is necessary at any given time.

Virtual workspaces

These have been around in Unix/X11 for ages, and it seems recently they have become somewhat popular in the Windows and Apple worlds too. Obviously, they can help you manage more windows in a way that only shows a particular group at a time. On a deeper level, they can remind you that a computer can do much more than what's currently visible.

Command line interfaces

CLI vs. GUI is sometimes seen as a holy war between different schools of users, thus making it look like a matter of taste. However, the hard fact is that you often need to use a CLI to get anything substantial done efficiently with computers.

For example, to resize 1000 photos, you can either do it manually one at a time, or use something like

for file in *.jpeg; do
    # Shrink only images larger than 1024x1024; the original file is kept
    convert "$file" -resize "1024x1024>" "${file%.jpeg}.small.jpeg"
done

This might look like programming, and so what if it is? You're telling the computer what to do. However, there are a lot of applications where graphical interfaces make a lot of sense. There's no need to make it a holy war; it's a question of different tools for different jobs.
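The renaming step in the loop above is easy to try out separately before any image is touched. The following is only a sketch: small_name and plan_resize are hypothetical helper names, not part of ImageMagick or of the original loop. It uses the suffix-removal expansion ${1%.jpeg}, which strips only a trailing ".jpeg", unlike a global pattern substitution that would also hit ".jpeg" in the middle of a name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration): derive the output
# name by stripping a trailing ".jpeg" and appending ".small.jpeg".
small_name() {
    printf '%s\n' "${1%.jpeg}.small.jpeg"
}

# Dry run: list what would be written, without reading or creating any files.
plan_resize() {
    local file
    for file in "$@"; do
        printf '%s -> %s\n' "$file" "$(small_name "$file")"
    done
}
```

Running plan_resize *.jpeg in a photo directory prints one "input -> output" line per file, which makes it easy to check the names before committing to the real conversion.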

I wonder if these problems are due to not understanding the abstractions. For example, the idea of doing image manipulation on the command line can seem counterintuitive. But it's not as if the graphical programs are actually seeing colours and doing the manipulations on the screen; the screen is merely an interface to the underlying data.
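To make the abstraction concrete, here is a minimal sketch that inverts a greyscale image without displaying anything. It assumes the plain-text PGM format ("P2" from Netpbm) with a simplified header: magic number, dimensions and maximum value each on their own line, and no comment lines. The function name invert_pgm is made up for this illustration.

```shell
#!/usr/bin/env bash
# Invert a plain-text PGM ("P2") greyscale image read from standard input.
# Assumption: a three-line header (magic, "width height", maxval), no comments.
invert_pgm() {
    awk '
        NR == 1 { print; next }             # magic number, e.g. P2
        NR == 2 { print; next }             # width and height
        NR == 3 { max = $1; print; next }   # maximum grey value
        {
            # Every remaining field is a pixel: replace it with max - value
            for (i = 1; i <= NF; i++) $i = max - $i
            print
        }
    '
}
```

Used as invert_pgm < in.pgm > out.pgm, the "image manipulation" is plainly arithmetic on a list of numbers; a graphical editor does the same kind of thing and then draws the result.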

Adaptive and customizable interfaces

WIMP interfaces are generally easy to learn; they are great for one-off use, but like training wheels on a bicycle, the beginner friendliness soon becomes a hindrance to productive work. Their great benefit is that it is easy to discover new features by poking around in them, but where can that learning experience lead eventually?

Conversely, CLIs are hard to jump into straight away. Fortunately, they can coexist with GUIs, and many Linux users have a mix of terminals open alongside more typical GUI applications. Linux makes it easier to learn the basics with a discoverable GUI, and then graduate to the CLI, depending on the work and personal preferences.

Linux/Unix (BSD, Plan 9, etc.) distributions are certainly not the only way to get a proper CLI; OS X is Unix at heart with Xterm and Bash readily available, and I keep hearing Windows has also developed some kind of a CLI beyond the DOS prompt. However, they hardly have the level of integration and commitment found in Linux/Unix distros.


Risto A. Paju