Cluehunting: A Proposal Regarding The Intelligent Use of Available Data In The User Interface

Where Cluehunting Comes From

(Editor’s Note: I have received an impressive and incredible amount
of email regarding cluehunting, and I thank everybody who mailed me.
Much of the text here needs to be rewritten to accommodate the lucid
and honestly surprising quantity and quality of research people put
into advancing this proposal. Some material regarding the true
history of cluehunting does need to be modified. Bear with me.)
Cluehunting is an advanced Expansion Agent, defined as a system that
allows the computer to search possible “expansions” throughout given
contexts given a “clue” by the user. Clues are defined as segments of
data (type irrelevant) that the computer would be able to utilize to
predict the final contents of the user’s intention. Expansions are
the presumed intentions of the user. Finally, contexts are the
“search space” that is being scanned–the file system context, the
launcher context, or even a thesaurus/spell check context are all
valid options.
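
The three terms can be pinned down with a minimal Python sketch. Every
name in it (ClueContext, WordListContext, the launcher entries) is
hypothetical and chosen only for illustration; the proposal itself does
not prescribe an API.

```python
from dataclasses import dataclass
from typing import List, Protocol


class ClueContext(Protocol):
    """A search space that turns a clue into candidate expansions."""

    name: str

    def expand(self, clue: str) -> List[str]: ...


@dataclass
class WordListContext:
    """Hypothetical context backed by a fixed list of entries."""

    name: str
    words: List[str]

    def expand(self, clue: str) -> List[str]:
        # Assumption: by default the clue is treated as a prefix.
        return [w for w in self.words if w.startswith(clue)]


# The clue is "g", the context is the launcher, and the expansions are
# whatever the context returns for that clue.
launcher = WordListContext("launcher", ["gimp", "gnumeric", "netscape", "xterm"])
print(launcher.expand("g"))
```

A file system, history, or thesaurus context would expose the same
expand method over a different search space.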

It would be completely unfair to describe cluehunting as a totally
original concept–it stands, if you will, on the shoulders of giants.
Tab-completion is the oldie, and as far as I know originated with the
Unix shell tcsh, though it’s also a hidden option in the NT command
shell. This technology is quite file-system specific: Enter as much
as you know about a path, starting with the root, and tab complete
will expand what you type to fit. For example–enter /usr/home/eff
and hit tab, and you will be given the first entry in /usr/home/ that
begins with “eff”. Some limited regular expressions are allowed–for
example, if I’m in the directory /usr/home/effugas and type unzip
*.zip, I will be able to tab through each zip file in my home
directory. Very slick.

Tab Completion is nice, but it has its flaws. First of all, tab has
become the de facto standard for “advance to next field” in GUIs, and
there’s no way I want to get rid of one of the best keyboard
timesavers in existence. Secondly, it searches files and only files.
There are other search contexts that should be hit. Finally,
tab-complete provides no way to expand into anything but a single
entry. What if the user didn’t want just one of the group, but wanted
to expand into all entries that fit the given form? In other words,
what if all of *.zip were inserted instead of just one zip? That would
be logical in a number of situations.

Tab Complete’s newborn sibling, Autocomplete, was a web browser
innovation that began at the much maligned UI shop known as Microsoft
and was later adopted by Netscape for its Communicator browser. (To
be as fair as possible, the emacs editor includes substantial
autocomplete facilities. I am referring here to the fact that this
was the first implementation of autocomplete for ordinary users, and
as far as I know was the first implementation among the thousands of
Windows apps over the last few years.) As Microsoft integrated
Internet Explorer and Windows Explorer, both the Run Dialog and the
Web Open Dialog possess Autocomplete functionality. (Actually,
Microsoft Word will also Autocomplete anything you type that is
related to a few known categories, i.e. date, author name, etc., but
I’ll deal with this later.) So what does this bring to the table?
Well, we see the beginnings of clue contexts showing up here, since at
first glance it appears that the run menu will autocomplete files and
the web browser will autocomplete web sites. But these are both
searches of the same clue context–the history context, in which
things that have been typed before are called back to be expanded back
into reality. And how does Autocomplete expand entries? In the
middle of typing, inverted text will appear containing the contents of
what the computer is guessing the user is trying to get at. This text
will only appear to the next valid level–http:// will expand to
http://www.best.com, but it will not expand to
http://www.best.com/~effugas nor
http://www.best.com/~effugas/Personal/SILC/silc.html. There’s no way
to really scroll through possible entries in this history-based
autocomplete–the first thing that matches will be matched to its
first level, and that’s all you get. (Ed Note: Holding shift and arrow
down lets you scroll through possible autocompletes on Netscape.)
Worse, sometimes a delay in typing is required to simply trigger an
autocomplete. Still, this functionality is total joy, even with all
of its warts.
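
The "next valid level" behavior described above can be approximated in a
few lines of Python. This is a sketch of the observed behavior, not the
algorithm any browser actually used; the function name next_level and
the rule that levels end at a "/" are assumptions.

```python
from typing import List, Optional


def next_level(typed: str, history: List[str]) -> Optional[str]:
    """Expand `typed` to the next "/" boundary of the first matching history entry."""
    for entry in history:
        if entry.startswith(typed) and len(entry) > len(typed):
            start = len(typed)
            # Skip the "//" that follows the scheme so "http:" does not
            # stop at "http:/".
            scheme_end = entry.find("://")
            if scheme_end != -1:
                start = max(start, scheme_end + 3)
            cut = entry.find("/", start)
            if cut == len(typed):
                # The typed text already ends on a boundary; go one level deeper.
                cut = entry.find("/", cut + 1)
            return entry if cut == -1 else entry[:cut]
    return None


history = ["http://www.best.com/~effugas/Personal/SILC/silc.html"]
print(next_level("http://", history))
print(next_level("http://www.best.com/", history))
```

Only the first matching entry is ever consulted, which is exactly the
limitation complained about above: there is nothing here to scroll
through.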


What’s New In Cluehunting

Cluehunting specifies the following advancements beyond present-day
expansion technology:

  1. Universal Expansions
  2. Inputstream Aware Expansion Styles
  3. Application-Dependent Clue Contexts
  4. Pluggable Context Indexes
  5. Clue Context Overrides
  6. Regular Expressions
  7. Batch Expansions
  8. Cluelists
  9. General Accessibility

Definitions help, of course:

  1. Universal Expansions: Expansion should be available in all
    interface components.
    The primary limitation of present expansion
    methods is that they can’t really be available everywhere.
    Cluehunting is designed to allow every interface construct to read
    the intentions of the user. It is the purpose of the remaining
    eight points to make sure that this works, and works well.

  2. Inputstream Aware Expansion Styles: Segmented streams of input
    data ought to implement commanded expansion, while unified
    inputstreams may take advantage of automatic expansion.
    A little
    background is going to be necessary to understand this. First,
    you can’t outclass something you can’t recognize the class of.
    That being said, let’s talk about Microsoft’s UI department. Take
    Microsoft Word 95/97. Red and green spelling and grammar warning
    underlines are excellent interface components. They’re
    unobtrusive enough to ignore in the heat of thought, yet available
    enough to make it difficult to miss misspelled or inappropriate
    words. I miss them any time I type in anything else. They
    enhance the feedback loop of the inputstream. The inputstream is
    defined as the flow of commands from the user to the computer as
    well as any information fed back along the same channels as the
    input–for example, a clock in the lower right hand corner is not
    part of the inputstream, but the characters that pop up in
    response to the corresponding keys being pressed on the
    keyboard are. What does not work in Word, however, is
    Autocomplete. When I type Dan, I’m not always talking about
    myself, and when I type August, I’m not always talking about the
    present date. I don’t want to have to interrupt my stream of
    thought to correct Word–my concepts are segmented into words from
    sentences, paragraphs, and full documents. This contrasts sharply
    with the very appropriate and useful usage of autocomplete for web
    sites, which have addresses that are single-phrase and thus
    unified. Therefore, while Word and any other segmented
    inputstream receiver ought to require a key to be pressed before
    the phrase is expanded (though a graphical hint like a different
    cursor would help), Netscape should attempt to expand
    automatically. NOTE: Research is required to make sure this
    inconsistency does not overly confuse users. It is very possible
    that automatically triggering an expansion in unified instances
    but delaying expansions in segmented cases is utterly confusing to
    users. In this case, I’d lean towards a completely delayed
    expansion interface.

  3. Application-Dependent Clue Contexts: Applications should search
    multiple clue contexts appropriate to the active application
    context.
    Strange words coming from someone who worships
    consistency in user interfaces, but I really think this is
    necessary. Applications generate context, and all clues should
    not expand from some single chosen source. For example: Suppose
    I enter the word “liffe” into a word processor. The ideal word
    processor would notify the user immediately and non-intrusively
    that the word was misspelled. Obviously, the appropriate clue
    context for a misspelled word is to search through alternative
    correct spellings. Multiple presses of the Continue Cluehunt
    keybinding would search through multiple alternative spellings,
    until the user chose to press either the Cancel Cluehunt
    keybinding (probably Escape) to revert to the misspelled form or to
    press the Cluehunt Successful keybinding (probably Enter). The
    user could, of course, reselect the correctly spelled word, and
    this time search through the default context for a correctly
    spelled word: the thesaurus. So, life would be replaced with
    various synonyms–or, the thesaurus dialog could come up to
    provide a multidimensional search between life-as-vocation,
    life-as-socialness, or life-as-complete-lack-thereof. All that
    cluehunting specifies is a precondition and a
    postcondition–dialogs do not violate this. It would be
    preferable if these weren’t modal dialogs, however–it is rarely
    appropriate for the user to be locked out of his or her document.

  4. Pluggable Context Indexes: Clue contexts, either attached to an
    application or independent, should register themselves with a
    central index.
    This index of clue contexts would be categorized
    either by type or by owner application, would have MRU (most
    recently used) lists, and would be reconfigurable by the user.

  5. Clue Context Overrides: The user should be able to specify a
    specific clue context to expand from, in either a proactive manner
    or a reactive manner.
    Despite the fact that applications often
    have contexts that make sense, there are times when the user has
    another context in mind. For example, the user should be able to
    access the Thesaurus context while saving a file, or the
    filesystem context while documenting an application, or the web
    history context while creating a web page of links. This would be
    implemented with a Set Clue Context keybinding which would modify
    the present word’s clue context–a reactive override. If the user
    had not yet typed a word, the next word would be the recipient of
    the entered context–this would be a proactive override. Contexts
    would be registered upon install as per the plug-in clue context
    interface, and manipulable via a replaceable dialog. Most
    probably, some degree of categorization would be appropriate, as
    well as expansion on the clue context type itself. (In other
    words, a box would be given, and you’d type in Th and Thesaurus
    might come up). Of course, common clue contexts should be
    automatically recognized. A user typing in a path in any
    application, for example, should usually first trigger the file
    system history context, and then the literal file system search
    context. Similar results should await a user typing http://.
    However, there is an advantage to being able to select a context.
    By selecting the Execute Command context, the user could load any
    app directly from within any other app and have the stdout reply
    be pasted at the cursor. Much like ircii’s /exec command, this
    would allow the contents of, say, an ls to be directly pasted at
    the cursor. Quite nice.

  6. Regular Expressions: Regular Expressions should be available for
    usage in clue expansions.
    Many users are familiar with using * to
    signify a wildcard. While the default expansion would, in
    general, presume a * at the end of the provided clue and expand
    from there, there is no reason this is necessary. A user
    searching for dictionary words that end with “sort” should be able
    to expand *sort into resort, consort, and plain old sort. The
    only problem–how to differentiate between a clue containing a
    regex for search purposes (execute context for ls -l *.gz) versus a
    clue that wants its regex expanded before search (command history
    context for ls -l *.gz). It’s quite probable that most contexts
    will only fit one or the other, but I’m unsure. Email me if you
    think that a specific “begin regex” keybinding would be necessary.

  7. Batch Expansions: All entries that fit the provided clue should
    be available for simultaneous expansion.
    Through an “expand all”
    keybinding, the contents of all clues that fit the given context
    should be pasted at the cursor. This facilitates things such as
    “gunzip *.gz” being expanded into a list of all files to be
    gunzipped, allowing the user to make sure the shell was expanding
    the list correctly, among other uses.

  8. Cluelists: All entries that fit the provided clue should be
    listable in a multiselectable sortable dialog.
    In some ways, a
    basic version of this is part of Microsoft Word 97: Right click
    on a misspelled word and note the four or five alternate correct
    spellings right there in front of you. Most GUI web browsers also
    allow you to search the typed-in history by clicking on the down
    arrow at the far right of the entry bar. Cluelists extend this
    behavior by allowing the user a listmode or detailsmode (more
    windowspeak, so shoot me) interface to select between multiple
    options for expansion. Suppose the user wants to gunzip a couple
    of his or her .gz files. Simply typing gunzip *.gz inside of a
    cluehunt-enabled xterm and pressing the “cluelist” keybinding
    would generate a window containing a list of all files ending in
    “.gz”. Then, the user would control-click or shift-click the
    specific gzipped files desired to be expanded, press OK, and hit
    enter to cause those files to be gunzipped.

  9. General Accessibility: All capabilities of cluehunting must be
    accessible by mouse as well as by keyboard. It is critical that
    Cluehunting be part of a self-documenting interface, defined as an
    interface that bolsters the user’s understanding and mapping of
    available options. One major way to make an interface
    self-documenting is to provide multiple paths to the same
    destination that reference each other. Right-clicking on a batch
    of text should either bring up a single menu item containing
    “cluehunt” or a list of all the cluehunting options directly in
    the root right-click–research will be necessary to see which is
    preferable. Now, of course, each entry in the right-click menu
    would contain the keyboard shortcut right-justified, and the
    corresponding shortcut would be listed in the keybox (dev-note:
    Will be explained in upcoming proposal). Pretty slick.
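
Points 4, 6, and 7 above (the central index with MRU lists, wildcard
clues, and batch expansion) fit together in one short Python sketch.
The class ContextIndex and its methods are hypothetical. Note also that
the "*" in the examples is a shell-style wildcard rather than a full
regular expression, so the sketch uses Python's fnmatch module; a real
implementation might choose differently.

```python
import fnmatch
from typing import Callable, Dict, List

# Assumption: a context is modelled as a function returning every
# candidate it knows about.
Context = Callable[[], List[str]]


class ContextIndex:
    """Hypothetical central index of clue contexts with an MRU order."""

    def __init__(self) -> None:
        self._contexts: Dict[str, Context] = {}
        self._mru: List[str] = []

    def register(self, name: str, context: Context) -> None:
        self._contexts[name] = context

    def expand(self, name: str, clue: str) -> List[str]:
        # Record usage so the most recently used context is listed first.
        if name in self._mru:
            self._mru.remove(name)
        self._mru.insert(0, name)
        # Point 6: a clue without a wildcard gets an implicit trailing "*".
        pattern = clue if any(c in clue for c in "*?[") else clue + "*"
        return [c for c in self._contexts[name]() if fnmatch.fnmatchcase(c, pattern)]

    def expand_all(self, name: str, clue: str) -> str:
        # Point 7: batch expansion pastes every match, space separated.
        return " ".join(self.expand(name, clue))

    def recently_used(self) -> List[str]:
        return list(self._mru)


index = ContextIndex()
index.register("dictionary", lambda: ["consort", "resort", "sort", "sorted"])
index.register("files", lambda: ["a.gz", "b.gz", "notes.txt"])

print(index.expand("dictionary", "*sort"))
print(index.expand_all("files", "*.gz"))
print(index.recently_used())
```

A cluelist (point 8) would present the list returned by expand in a
dialog instead of pasting it.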


Default Cluehunting Keybindings

(Editor’s Note: I have some association with the GNOME project, which
hopefully will end up creating a world class User Interface for Linux
and other Unix systems. Nothing official, anymore.)
Well, I’ll be blunt: We’re still working on a default keyspace for
GNOME compliant apps. However, the following are a preliminary set of
keybindings for cluehunting:

          + Cluehunt Forwards:  Alt-Shift-Right Arrow
          + Cluehunt Backwards: Alt-Shift-Left Arrow
          + Accept Cluehunt:  Anything that moves the cursor.  Enter has
            its functionality modified to not clear the contents of the
            expansion.
          + Reject Cluehunt:  Esc
          + Expand All:  Alt-Shift-Enter
          + Scroll Through Cluelist:  Alt-Shift-Up and Down.
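
As a rough illustration of how these bindings might drive a single
cluehunt, here is a small Python state sketch. The class name
CluehuntSession, the wrap-around at either end of the candidate list,
and the space-separated Expand All output are all assumptions, not part
of the proposal.

```python
from typing import List


class CluehuntSession:
    """Hypothetical state for one cluehunt over a fixed candidate list."""

    def __init__(self, clue: str, candidates: List[str]) -> None:
        self.clue = clue
        self.candidates = candidates
        self.position = -1  # -1 means the original clue is still shown

    @property
    def text(self) -> str:
        return self.clue if self.position < 0 else self.candidates[self.position]

    def forwards(self) -> str:  # Cluehunt Forwards: Alt-Shift-Right Arrow
        if self.candidates:
            self.position = (self.position + 1) % len(self.candidates)
        return self.text

    def backwards(self) -> str:  # Cluehunt Backwards: Alt-Shift-Left Arrow
        if self.candidates:
            if self.position < 0:
                self.position = len(self.candidates) - 1
            else:
                self.position = (self.position - 1) % len(self.candidates)
        return self.text

    def reject(self) -> str:  # Reject Cluehunt: Esc restores the typed clue
        self.position = -1
        return self.text

    def expand_all(self) -> str:  # Expand All: Alt-Shift-Enter
        return " ".join(self.candidates)


session = CluehuntSession("liffe", ["life", "lift", "live"])
print(session.forwards())
print(session.reject())
```

Accepting a cluehunt needs no method of its own here: whatever text is
current when the cursor moves simply stays in place.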

The Future Of Cluehunting

Cluehunting is a developed proposal, but it’s still in development.
Research will be needed to check for areas of confusion and
functionality. Still to be determined:

  1. How to notify the user that the existing text is
    expandable via a cluehunt? Different cursors, different
    text colors, a note in the title bar…?

  2. How to implement cluehunting? One possible way is to
    simply have a directory structure that corresponds to
    individual clue contexts and contains standard
    stdin/stdout apps that take in the appropriate segment
    and spit out a return value. Implementation isn’t that
    much of an issue, though–possibility is more relevant
    than methodology.
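
The directory-of-programs idea in point 2 could look roughly like the
following Python sketch. It assumes one program per clue context,
named after the context, that reads the clue on stdin and prints one
expansion per line on stdout. The function run_context and its runner
parameter (an optional interpreter prefix for scripts that are not
directly executable) are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def run_context(context_dir: Path, name: str, clue: str,
                runner: Sequence[str] = ()) -> List[str]:
    """Feed `clue` to the program at context_dir/name and collect its expansions."""
    program = context_dir / name
    result = subprocess.run(
        [*runner, str(program)],
        input=clue + "\n",    # the clue arrives on the program's stdin
        capture_output=True,
        text=True,
        check=True,           # a failing context program raises CalledProcessError
    )
    # One expansion per non-empty line of stdout.
    return [line for line in result.stdout.splitlines() if line]
```

Under this layout, installing a new clue context amounts to dropping
another program into the directory.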
