Archive for the ‘Imagery’ Category

MD5 Imagery

December 8, 2004

Just because we don’t have access to Wang’s attack on MD5 doesn’t mean we
can’t seek out new and amusing ways to reverse engineer it…some interesting
pictures, as Wang’s payloads propagate through an MD5 hash in a bit-visualized
environment:

The last line is the bit-representation of the final MD5 hash. This is
trivially inspired by Greg Rose et al’s musing on MD5 (very fine paper). Some others which have crossed my
email:

Categories: Imagery, Security

C’est Graphique 2002

November 13, 2002

Considering everything I’ve been up to for the last couple of
months, you’d think I’d be satisfied. But alas, there was indeed
one event I had to skip — SIGGRAPH 2002, in (I believe) San Antonio.

Hmmm? A network/security geek, mourning a missed SIGGRAPH?

Not so surprising. I started out in Graphics, before meandering
through Web Design, User Interfaces, Emergency Windows Repair,
Unix Admin, Security, Low Level Networking…heh, and whatever comes
next. But after attending SIGGRAPH 2001, and seeing the Ferrofluid
Masterpiece, Protrude, Flow live, I remembered exactly what attracted me
to graphics in general and SIGGRAPH in particular.

Nothing like your brain calling bullshit on your eyes to wake
you up in the morning.

Anyway, there was some genuinely incredible stuff at SIGGRAPH this
year that, surprisingly enough, I never saw much mention of after
the show. (As it turns out, my absolute favorite piece of work —
the one I myself have become an avid user of — doesn’t even show up
on Google!) This is shocking, to
the point that I’m actually going to bother to report on something
like four months after the fact just because, well, it’s just that
impressive.

The definitive, though incomplete, archive of papers can be found here;
I’ve decided to write about a few of the things that surprised/impressed me.
Note, I’m heavily biased towards those papers that I could actually download the
associated videos of, so very cool sounding things (like raytracing with
pixel shaders) couldn’t really be checked out. Oh well.

Click here for the review (it used to be attached directly, but it got a bit too long
for the front page).

Categories: Imagery

Behold, The Volumetric Canvas!

November 13, 2002

Ever since the late 3Dfx revolutionized consumer PC hardware — and by revolutionized, I mean
“was completely without peer for over two years” — it’s been clear that specialized ASICs
(Application Specific Integrated Circuits) can, in certain instances, utterly wipe the floor
with General Purpose processors — even with Moore’s
Death March fully in place. I doubt even a Pentium 4 can match 3Dfx’s first product when it comes
to the bilinear filtering of even a moderate number of polygons!

The story continues, though. 3Dfx was supplanted, and eventually purchased outright,
by nVidia…and here’s where things get interesting:
Those circuits, ever so specialized, once only barely programmable via register combiners,
have grown in power and flexibility. They’re becoming…if not general
purpose, no longer fixed function. NV20 — embedded in the GeForce 3 and the X-Box —
retains the capacity to execute small but powerful pixel and vertex programs against anything
streaming out the pipe. The specialized have gone general — what’s old is new again.

And interesting things are coming because of it.

Check this out: At SIGGRAPH 2002, Christof Rezk-Salama released
OpenQVIS, his implementation of the techniques in his doctoral thesis,
Volume Rendering Techniques for General Purpose Graphics Hardware.
What’s this? Check out the following renderings:

Three things are important to realize about those images: First, the hardware used
to render them was built to render polygons, not MRI data. Second, if you’ve got an X-Box in
your living room, you already own the requisite silicon. Finally, those images render in realtime, somewhere
between 10 and 30 FPS. Relative to software performance, that’s the same kind of boost to volumetric rendering
as we saw hardware provide to the polygon thrash — not bad, considering the once fixed-function hardware was never
intended to provide this service!

Now, Rezk-Salama isn’t the first to be doing such work.
It was, after all, Klaus Engel’s Pre-Integrated Volume Renderer
that introduced me to realtime volumetric rendering, not to mention
OpenQVIS itself. Klaus’s work is excellent, but it’s OpenQVIS that has me really excited. It’s complete,
mature, cross platform through the Qt toolkit, and Open Source. It’s trivial to generate data for,
and it’s fast. So, we’ve got a way to directly render arbitrary 3D matrices. What will you do with
it? Keep me posted 🙂

Local Mirror

OpenQVIS (Source Code, GPL License)
OpenQVIS (Win32 Binary, Probably Requires DX8 Pixel Shaders)
Bonsai Tree
CT Scan of a Head(Large)
CT Scan of a Head(Small)
Engine
Inner Ear
MRI of a Head
Teddy Bear!
Temporal Bone
“Volume Rendering Techniques for General Purpose Graphics Hardware”, Christof Rezk-Salama

Categories: Imagery

UI Theory: Speech Vs. Vision

December 16, 2000

LAW:
We talk faster than we write,
and we read faster than we listen.

COROLLARY:
We shout out known commands faster than we can click them,
but we visually absorb available commands faster than we can hear them.

CAVEAT:
We can write with more detail than we can speak.
We can listen with more comprehension than we can read.

EDUCATIONAL EXAMPLE:
Go tell some random person to do something on a computer.
Don’t do it yourself.
Don’t let them know in advance what you want them to do.
Notice how fast you realize they’re screwing up.
Notice how slow it is for you to correct them.
Notice how slow it is for them to absorb your corrections.
Notice how fast you sit down and do it yourself.

CONCLUSION:
The ideal interface will likely have us speaking at computers,
with them shooting responses back as fast as the eye can absorb.

Categories: Imagery

Analogous Key Arrays: A Proposal Regarding Relating Keyboard Hotkeys With Screen Position

January 20, 1999

Analogous Key Arrays, or AKA´s, are command arrays with keyboard
hotkeys that correspond to the physical location of the onscreen
commands. For example, given a 3×3 array(the standard size for an
AKA) of commands on screen, a 3×3 array on the keyboard should be
chosen. Consider the following illustration, created from the DoxSTAR
modification to the popular game Starcraft by Blizzard Entertainment.



3×3 Command Box from Starcraft with corresponding hotkeys overlaid
for this illustration.

It should be noted quickly that AKA´s do not document the effects of a
given command, nor do they require any graphical modification to
command icons. (The graphical modifications here are for illustration
purposes only.) Position alone specifies a command array as an AKA.
This, of course, means the user is dependent upon some other means of
determining that an AKA is being used–tooltips are the ideal means
here, and were chosen for this function in DoxSTAR.

AKA´s are quite powerful, but have a number of caveats to their
usage. They have, I think, popped up in small numbers throughout the
history of GUI´s. This document is an attempt to codify their
properties. I shall attempt to explain their advantages and
disadvantages with candor; this page shall be updated if additional
observations come with time. Peer review is a good thing.
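
For the curious, here is a minimal Python sketch of the positional mapping described above. It is an illustration, not DoxSTAR code: the ASD and ZXC rows are an assumption about how the QWE block continues, and the command names in the sample panel are made up.

```python
# Keyboard block for a 3x3 AKA. The QWE row is named in this article; the
# ASD and ZXC rows are an assumption about how the block continues.
QWE_BLOCK = ["QWE", "ASD", "ZXC"]


def build_aka(commands, key_block=QWE_BLOCK):
    """Map each on-screen grid cell to the key at the same row and column.

    `commands` is a list of rows; use None for an empty cell.
    """
    if len(commands) != len(key_block):
        raise ValueError("command grid and key block need the same row count")
    hotkeys = {}
    for row_cmds, row_keys in zip(commands, key_block):
        if len(row_cmds) != len(row_keys):
            raise ValueError("command grid and key block need the same width")
        for command, key in zip(row_cmds, row_keys):
            if command is not None:
                hotkeys[key] = command
    return hotkeys


# Hypothetical command panel; None marks an empty box in the array.
unit_panel = [
    ["Move",   "Stop", "Attack"],
    ["Patrol", "Hold", None],
    [None,     None,   "Cancel"],
]
aka = build_aka(unit_panel)
print(aka["Q"], aka["C"])
```

Note that nothing in the mapping depends on what a command is called. Swap in a different panel and the keys stay put, which is the whole point.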

Theories supporting the AKA

Much of the science that supports Pie Menus(see also this Mactech
article) applies to Analogous Key Arrays. Pie Menus, unlike linear
menus, place commands in a circle around a central point, thus making
all commands equidistant from each other. There´s some decent research
showing this is a significantly quicker method of making commands
accessible. The link applies in that since most multidimensional
command arrays form pie-menu like structures anyway, a positional
keyboard equivalent of those structures should gain the positive
effects noted in much of the research for Pie Menus.

In fact, AKA´s should hold a substantial speed advantage over Pie
Menus due to the intrinsically faster nature of key entry versus
coordinate based mouse clicks. Provided a modicum of familiarity with
the keyboard´s layout, it is faster to move one´s fingers to the
appropriate key and press it than to move the mouse to a specific
onscreen location and click. This does not, however, mean it is
faster to execute a command via keyboard than via mouse–that´s just
not true. Mouse access is much easier in general, because “Open File”
is significantly more descriptive than “Ctrl+O”. In other words,
mouse-driven actions use the screen to self-document themselves,
whereas keyboard driven actions usually require shortcuts to be
memorized.

AKA´s are therefore designed to address this state of affairs. The
first thing they do is maximize the intrinsic advantage of the
keyboard. With all keys in the AKA being in close proximity, travel
time to execute a given command is significantly reduced–Fitts´ Law
at work(side note: This link extends interestingly on Fitts´ Law, as
well as explains it). The tight proximity also aids non-touch
typists–AKA´s take the form of either positionally relative
arrays(key is two left from other command) or, especially in the 3×3
case, positionally cardinal arrays(key is in the upper right hand
corner). Even the touch typist benefits from the tight array, since
the hand position doesn´t have to shift to contact the heavily used
keys in the AKA. The intrinsic speediness of the keyboard is extended
to its maximum capacity.
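
To put rough numbers on the Fitts´ Law argument, here is a small Python sketch using the common Shannon formulation, MT = a + b * log2(D/W + 1). The constants a and b and the distances and widths are placeholders chosen for illustration, not measurements, and the same constants are used for finger and mouse only to show the shape of the curve.

```python
import math


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds (Shannon form of Fitts' Law).

    a and b are placeholder constants, not measured values.
    """
    return a + b * math.log2(distance / width + 1)


# Hypothetical targets: a neighboring key 19 units away and 15 units wide,
# versus an on-screen button 300 units away and 32 units wide.
key_time = fitts_time(distance=19, width=15)
click_time = fitts_time(distance=300, width=32)
print(f"adjacent key: {key_time:.2f}s, distant button: {click_time:.2f}s")
```

The closer and larger the target, the smaller the logarithmic term, which is why a tight block of keys under one hand scores so well.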

The second action of the Analogous Key Array is to make keyboard
hotkeys easily memorable. Loose Prime Character hotkey naming(the
system in which the hotkey usually corresponds to the first letter of
the command) makes the assumption that all commands need to be
accessible at all times, and thus each command needs a separate
hotkey. In many situations(not all!!!) this is not the case. Often,
an application is in a specific mode with a limited number of
commands. Consider the case of Starcraft. All commands for all units
reduce to 3×3 arrays–and these arrays show their available options at
any time the user might wish to execute commands. Only the slightest
glance at the onscreen array is necessary to deduce the appropriate
hotkey for any given command. This is a significant improvement over
the guesswork or tooltip-checking that the LPC method enforces.
Additionally, there is no extra labor required to memorize commands
for a new unit–once any unit is learned, every unit is learned. In
DoxSTAR-modified Starcraft, the mouse is allowed to roam freely around
the battlefield doing what it does best–pixel level selection–while
the keyboard is designated to trigger commands.

In conclusion, the Analogous Key Array applies instinctual properties
such as relative position and cardinal direction to keyboard hotkey
assignment by relating onscreen location with keyboard location.
Fitts´ Law and ergonomic efficiencies mean this system is
significantly faster than competing assignment methodologies, and the
consistent and efficient method of observing screen position to derive
command hotkey significantly reduces short term memory load.

Caveats and observations regarding the usage of the AKA

This is an attempt at a semi-scientific paper, and while the concept
of creating a new(?) command methodology is appealing to my ego, the
eventual destruction by the peer-reviewing public would rip said ego
to micron-thin shreds. Therefore, the following are caveats and
observations regarding the usage of Analogous Key Arrays.

  1. AKA´s are biased towards 3×3 arrays with 8 or 9 options.

    3×3 arrays are superior, because they tie directly in with the
    cardinal directions(up, down, down-right) we understand
    instinctually. With the center point optional, this means eight
    or nine options are available. This leads to a bit of weirdness
    for the user if only a few of these boxes are filled. This
    shouldn´t be too much of a problem though– DoxSTAR-enabled
    Starcraft has plenty of incomplete arrays.
    With regards to what alternate array dimensions can be used, here
    are a few observations:

  2. It is better to be wider than to be taller.

    Consider a building–it´s easier to access additional rooms on the
    same floor than to take the stairs to additional floors. This
    applies to keys–it´s easier to hit keys that are on the same row
    rather than to constantly change rows. Just look at your
    hands–how many rows of fingers do you have?

  3. Work your way left to right, then down.

    The farther down a key is, the more likely a user is to curl his
    or her finger. Finger curling is more work than finger shifting.
    It is arguable that when localizing your product for areas in
    which the language is read in a different order than
    Left-Right-Down, you should change the order. This is because of
    the visual system´s seek patterns.

  4. Avoid AKA´s with a width of 4.

    Single width AKA´s are suboptimal but may be necessary for real
    estate purposes. Double width AKA´s fall into the left/right
    classification. Triple width AKA´s form left/middle/right.
    Quintuple width AKA´s form Likert style Strongly
    Left/Left/Middle/Right/Strongly Right scales. But Quadruple width
    AKAs lack a true middle, making them harder to visually convert
    from on-screen image to a keyboard equivalent.

  5. If exceeding a width of five, you may need to mark your icons, or
    split the AKA.

    Suppose your interface consists of a 10×1 AKA that pulls down
    from each number (1 -> 0) a 3×3 QWE Analogous Key Array. The 3×3
    can remain unmarked(leave it to the tooltips to document their
    behavior), but it´s going to be quite difficult for a user to
    immediately pick which of, say, ten entries a random one near the
    middle is. Splitting the 10×1 into two 5×1´s is an alternative,
    for instance.

  6. Page, don´t scroll.

    Scrolls went out of fashion a few thousand years ago for a reason.
    AKA´s that scroll lose consistency–at one moment, Up(Q) might be
    a given command, at the next, Middle(S) could be. Scrolling does
    prevent Page Orphans (a situation that occurs when just one or two
    objects overflow onto the next page), but orphans are far preferable
    to hotkeys that shift under the user.


  7. AKA´s encourage modality in interface design

    Since multiple usage contexts share the same hotkeys, these contexts
    must be caused not to collide somehow. It is difficult to predict
    what commands users may wish to execute in the same context, and
    requiring constant context switches gets tiring. A couple notes on
    this:

  8. Most multicontext apps require modality anyway

    Even a swiss army knife requires the desired tool to be pulled
    out.
    Many of the situations in which the AKA will be used will already
    have established modality anyway. More research is definitely
    required to learn just how to switch context appropriately.

  9. AKA´s improve the quality of modal interfaces

    In Starcraft, build units have a command that opens up another
    command array of units to build. As soon as the build units
    command is triggered, a new, automatically hotkey-documented
    command array replaces it. No new learning is necessary.

  10. AKA´s use up much more screen real estate than LPC´s

    One advantage of the Loose Prime Character system is that the user
    only needs to remember the name of the command being attempted. In
    order for the user to remember the position, the user must actually be
    able to see the position. This means that AKA´s don´t fit well with
    linear menus(File, Edit, etc.), and also require onscreen real estate
    to establish the relative/directional positions. A few observations:

  11. The Rule of Sides severely affects AKA´s.

    Kaminsky´s Rule of Side Pollution basically says that anything
    placed along a side automatically pollutes the rest of that side
    for objects larger in the dimension being polluted. (I shall
    withdraw this rule if somebody else has already stated it in
    similar language.) Objects at a corner pollute one of the two
    sides they contact. It may be unclear what this means. Suppose
    in a 1024×768 screen, a 100×100 clock exists in the lower right
    hand corner. Now suppose we have our “main window”, a web
    browser, that we´re trying to make as large as possible. We have
    three options:

    1. Make the Netscape window 1024×668
    2. Make the Netscape window 924×768
    3. Make the Netscape window 1024×768, but accept that the lower right
      100×100 corner of the window will remain occluded.

    This is what I refer to as Side Pollution, and is a significant
    problem in many GUI´s. The reason this bites AKA´s so severely is
    that, because AKA´s operate by associating position, not name,
    there is an additional demand for screen space since the position of a
    command must be established. It´s still worth it, since most
    keyboard shortcuts will never, ever be used, but it´s still not
    that simple. For one thing, the 3×3 array that AKA´s work best
    with pollutes more space than a pulldown menu or a 10×1 taskbar.
    More research is needed, but most applications should find that
    two 5×1 taskbars split across the main number area provide the
    best use of screen real estate. The reason the 10×1 or the two
    5×1 taskbars work whereas various 3×3 arrays don´t is because a
    corollary of the Rule of Sides shows that Side Pollution isn´t
    cumulative; two 100 pixel polluters side-by-side along the bottom
    of the screen will still only pollute 100 pixels on the bottom.
    This is of course the justification behind the Win95 Taskbar–as
    long as all of the small boxes are organized along the bottom,
    their collective value outweighs the amount of space they take up.
    Of course, games have much more leeway since they have much more
    real estate to play around with.

  12. AKA´s can open up other, non identical AKA´s

    This has been alluded to in earlier sections, but just to make it
    explicit–just as clicking on a certain item can create a
    “pulldown/popup” window, so can pressing the associated key bring
    up a pulldown/popup AKA, replete with totally different keyboard
    shortcuts. Just make sure your documentation method(tooltips in
    DoxSTAR) explains what´s going on.
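
The arithmetic behind the Rule of Side Pollution from the list above can be sketched in a few lines of Python. The function names are made up for this illustration; the numbers are the ones from the clock example.

```python
def corner_options(screen, box):
    """Window sizes available when `box` sits in a corner of `screen`.

    Both arguments are (width, height) tuples in pixels.
    """
    screen_w, screen_h = screen
    box_w, box_h = box
    return {
        "give up the bottom strip": (screen_w, screen_h - box_h),
        "give up the right strip": (screen_w - box_w, screen_h),
        "overlap the corner": (screen_w, screen_h),
    }


def bottom_pollution(boxes):
    """Pixels lost along the bottom edge: the tallest box, not the sum."""
    return max((height for _, height in boxes), default=0)


options = corner_options(screen=(1024, 768), box=(100, 100))
for choice, size in options.items():
    print(choice, size)

# Two 100 pixel polluters side by side still only cost 100 pixels.
print(bottom_pollution([(100, 100), (100, 100)]))
```

The second function is the corollary in code form: lining polluters up along one side costs only as much as the worst of them.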

Keyspace is a serious issue, especially for applications

Back when I was part of the GNOME project, I made a call for a
definition of the GNOME keyspace. The keyspace is, essentially, the
breakdown of which keys do what functions in which contexts. As the
mouse has to deal with screen real estate, the keyboard has to cope
with a limited keyspace. Issues arise quite easily–while Starcraft
can be modified quite easily to have C trigger Cancel, users of a word
processing application are going to be quite peeved to find that C no
longer types the letter. In the application keyspace, normal keys
just type out their counterparts on screen–it´s only when Control or
Control-Alt is added that keystrokes become commands. Worse, there
are certain keystrokes in the app
keyspace that seriously conflict with the QWE AKA–Control-C in
Windows must always equal Copy. Windows has also appropriated the
entire Alt keyspace to menu shortcuts–arguably good for reducing
confusion, but painful when a new class of hotkeys shows up. A couple
observations on how to deal with this:

  1. Applications should use the Control-# and the Control #Pad
    Keyspace

    There are some issues with this–namely, if additional contexts
    are called via the two 5×1 arrays that Control 1-5 and Control 6-0
    provide, it´s going to be harder for the user to trigger the #789
    set than to trigger the QWE set. Also, using the #Pad keyspace
    could lead to some interesting non-uniform(but self-documenting!)
    shortcut sets. It´s much easier, by the way, for most
    applications to justify using the number pad since users are more
    used to shifting between the mouse and the keyboard while typing.
    While it is true that the whole idea of using keyboard shortcuts
    is to avoid having to move one´s hand to the mouse, it should be
    noted that the amount of mental processing, let alone time
    necessary to execute a command via a mouse far outweighs the
    amount of time necessary to hit a “far away” key and return. Apps
    where speed is critical or in which this many keyboard commands
    could not exist can rely on the Control-# keyspace anyway.

  2. Mouse-Intensive Programs should never use the #Pad as their
    primary AKA. Rather, they should use the QWE Analogous Key Array.

    Most applications will not fall into this category. Games are a
    different story–pretty much any game where a mouse is used
    moderately heavily should not be programmed with a #pad AKA.
    Here´s why–if the program is designed to have the user leave his
    right hand on the mouse, his left hand is the one at the controls
    of the AKA. The #Pad, however, is at the right side of the
    keyboard. To use this setup, either the user moves his or her
    keyboard a foot to the left, allowing both hands to be placed in
    the center right in front of the monitor, or the user shifts his
    left hand all the way to the right side of the keyboard, which
    isn´t exactly the most ergonomic position. It is much better to
    have the right hand stay on the right side and the left hand stay
    on the left–plus, Control, Alt, and at least half of the 1-0
    number row are within easy reach of the left hand in the QWE
    layout.

  3. The Operating/Windowing System should use Control-Alt for
    launching other software

    Windows allows shortcuts to have arbitrary keyboard hotkeys.
    (Windows is most probably the most keyboard enabled GUI in
    existence–this coming from an ardent Linux fan.) Unfortunately,
    they don´t always work perfectly or universally. This is a
    shame–for applications that are constantly run, keyboard
    shortcuts are a wonderful way to boot. Perhaps the items in the
    Windows Quick Launch bar should automatically be bound to
    Control-Alt-#? The only issue with the Control-Alt keyspace is
    that Photoshop uses it to reasonably good effect, but then only
    with mouse commands.

  4. Customization can be good.

    For applications or games with large or expanding command options,
    an AKA that can be easily customized by the user should be quite
    useful to users who are otherwise overburdened.
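
One way to make the keyspace concrete is a small registry that records which chord does what in which context and refuses bindings that collide. The Python sketch below is a hypothetical design, not anything GNOME or Windows actually ships; the class name, context names, and commands are all invented for the example.

```python
class Keyspace:
    """Tracks which key chord triggers which command in which context."""

    def __init__(self, reserved=None):
        # Chords that mean the same thing everywhere, e.g. Control-C = Copy.
        self.reserved = dict(reserved or {})
        self.bindings = {}  # (context, chord) -> command

    def bind(self, context, chord, command):
        if chord in self.reserved:
            raise ValueError(f"{chord} is reserved for {self.reserved[chord]}")
        if (context, chord) in self.bindings:
            raise ValueError(f"{chord} is already bound in {context}")
        self.bindings[(context, chord)] = command

    def lookup(self, context, chord):
        if chord in self.reserved:
            return self.reserved[chord]
        return self.bindings.get((context, chord))


keyspace = Keyspace(reserved={"Control-C": "Copy"})
# The same chord may be reused across contexts; that is how AKA´s share keys.
keyspace.bind("editor", "Control-1", "Bold")
keyspace.bind("browser", "Control-1", "Back")

try:
    keyspace.bind("editor", "Control-C", "Cancel")
    rejected = False
except ValueError as error:
    rejected = True
    print(error)
```

Keeping the reserved chords in one place is what lets the per-context tables stay small and positional.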

Multiple AKA´s pose issues

If there is only one AKA, and it´s on the right of the screen but the
left of the keyboard, it´s somewhat bothersome but not that big of a
deal. If there are two AKA´s in simultaneous use, and they´re
sideswapped in the same manner, the hotkeys are unusable–the user
will repetitively choose the wrong hotkey. Due to the increased
horizontal range of the keyboard, horizontal conflicts are more
common, but the point holds in the vertical dimension as well–if
hotkeys are misaligned vertically, the user may have difficulties.
However, it´s probably not too troublesome if one AKA is at the top
left of the screen and the other is at the lower right, yet both are
on the same horizontal plane.

Watch out for custom keyboards!

AKA´s operate on position–what if the user´s key position isn´t the
same as the developer´s? Uh oh is right. A good portion of the
population has Microsoft Natural Keyboards or a knockoff thereof.
Using a custom AKA that spans the split in the middle of these boards
is not only foolhardy(since even normal boards lack a hard, vertical
line for the hand to align itself against) but also completely
unusable for these users. Furthermore, if you expect any large number
of users to use your layout, ship with a Dvorak configuration–there
aren´t many Dvorak users out there, but they´re not worth
ignoring–they´re arguably using a better system, and they´re quite
capable of arguing it publicly 🙂 Finally, if at all possible(and
this holds true even if you aren´t writing a program that uses AKAs),
let your keyboard configuration be configurable. Please.

Don´t ignore left handed people.

Left handers have been obscenely ignored by joystick manufacturers for
the past few years–it´s probably not a nice thing to embed this
subtle dis in your code. Most left handers will work fine with the
right-handed layout, but a number of them won´t, so you may want to
ship either with a method to reconfigure the AKA hotkeys or just with
a swapper to move QWE to #789.
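
Both of the last two points come down to storing commands by position and letting the key block be swapped. A minimal Python sketch, assuming the 3×3 block continues as QWE/ASD/ZXC (the Dvorak row lists the keys that sit in those same physical spots on a US Dvorak board, and the numpad block is the #789 layout); the layout names and the sample panel are invented for illustration.

```python
# Key blocks listed by physical position, top row first.
LAYOUTS = {
    "qwerty": ["QWE", "ASD", "ZXC"],
    "dvorak": ["',.", "AOE", ";QJ"],
    "numpad": ["789", "456", "123"],
}


def hotkeys_for(commands, layout):
    """Bind a positional command grid to the keys of the chosen layout."""
    block = LAYOUTS[layout]
    return {
        block[r][c]: command
        for r, row in enumerate(commands)
        for c, command in enumerate(row)
        if command is not None
    }


# Hypothetical panel; only positions are stored, never key names.
panel = [
    ["Move", "Stop", "Attack"],
    [None,   "Hold", None],
    [None,   None,   "Cancel"],
]
right_handed = hotkeys_for(panel, "qwerty")
left_handed = hotkeys_for(panel, "numpad")
dvorak_user = hotkeys_for(panel, "dvorak")
print(right_handed["C"], left_handed["3"], dvorak_user["J"])
```

Because the panel never mentions a key, adding another layout is one more entry in the table.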

Miscellaneous Notes

I think I basically covered most of the issues you will have to
address when choosing if and how to implement Analogous Key Arrays.
If you feel there is anything I missed, please email me.

Categories: Imagery

Cluehunting: A Proposal Regarding The Intelligent Use of Available Data In The User Interface

August 1, 1998

Where Cluehunting Comes From

(Editor´s Note: I have received an impressive and incredible amount
of email regarding cluehunting, and I thank everybody who mailed me.
Much of the text here needs to be rewritten to accommodate the lucid
and honestly surprising quantity and quality of research people put
into advancing this proposal. Some stuff regarding the true history
of cluehunting does need to be modified. Bear with me.)

Cluehunting is an advanced Expansion Agent, defined as a system that
allows the computer to search possible “expansions” throughout given
contexts given a “clue” by the user. Clues are defined as segments of
data(type irrelevant) that the computer would be able to utilize to
predict the final contents of the user’s intention. Expansions are
the presumed intentions of the user. Finally, contexts are the
“search space” that is being scanned–the file system context, the
launcher context, or even a thesaurus/spell check context are all
valid options.
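
The three terms fit together roughly as in the following Python sketch. This is only one possible reading of the definitions above: the class, the prefix-matching rule, and the sample entries are assumptions made for illustration.

```python
class Context:
    """A named search space that turns a clue into candidate expansions."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = list(entries)

    def expand(self, clue):
        # Simplest possible matching rule: the clue is a prefix.
        return [entry for entry in self.entries if entry.startswith(clue)]


def cluehunt(clue, contexts):
    """Collect (context name, expansion) pairs from every context, in order."""
    return [(c.name, e) for c in contexts for e in c.expand(clue)]


# Hypothetical contexts with made-up contents.
contexts = [
    Context("filesystem", ["/usr/home/effugas", "/usr/bin"]),
    Context("launcher", ["netscape", "notepad"]),
]
print(cluehunt("ne", contexts))
print(cluehunt("/usr", contexts))
```

A real context could match on anything (spelling distance, history, synonyms); only the shape of clue in, expansions out, matters here.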

It would be completely unfair to describe cluehunting as a totally
original concept–it stands, if you will, on the shoulders of giants.
Tab-completion is the oldie, and as far as I know originated with the
Unix shell tcsh, though it’s also a hidden option in the NT command
shell. This technology is quite file-system specific: Enter as much
as you know about a path, starting with the root, and tab complete
will expand what you type to fit. For example–enter /usr/home/eff
and hit tab, and you will be given the first entry in /usr/home/ that
begins with “eff”. Some limited regular expressions are allowed–for
example, if I’m in the directory /usr/home/effugas and type unzip
*.zip, I will be able to tab through each zip file in my home
directory. Very slick.
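
In Python, the core of that behavior might look like the sketch below. It is not how tcsh does it; the directory listing is made up, and a plain list stands in for a real filesystem lookup.

```python
def completions(partial, entries):
    """Full paths whose last component starts with what was typed so far.

    `entries` stands in for the listing of the directory being completed.
    """
    head, sep, tail = partial.rpartition("/")
    return [f"{head}{sep}{name}" for name in sorted(entries)
            if name.startswith(tail)]


class TabCycler:
    """Each press of Tab yields the next match, wrapping around at the end."""

    def __init__(self, partial, entries):
        self.matches = completions(partial, entries)
        self.index = -1

    def press_tab(self):
        if not self.matches:
            return None
        self.index = (self.index + 1) % len(self.matches)
        return self.matches[self.index]


# Hypothetical contents of /usr/home.
home_entries = ["effugas", "effie", "guest"]
cycler = TabCycler("/usr/home/eff", home_entries)
first = cycler.press_tab()
second = cycler.press_tab()
third = cycler.press_tab()
print(first, second, third)
```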

Tab Completion is nice, but it has its flaws. First of all, tab has
become the de facto standard for “advance to next field” in GUIs, and
there’s no way I want to get rid of one of the best keyboard
timesavers in existence. Secondly, it searches files and only files.
There are other search contexts that should be hit. Finally,
tab-complete provides no way to expand into anything but a single
entry–what if the user didn’t want just one of the group, but
wanted to expand into all entries that fit the given form? In other
words, what if, instead of just one zip, all of *.zip were inserted?
That would be logical in a number of situations.
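
The difference between the two behaviors is small in code. Here is a Python sketch using the standard fnmatch module; the function names and file names are invented for the example.

```python
import fnmatch


def expand_one(pattern, names, presses=1):
    """Single-entry expansion: the match reached after `presses` Tab presses."""
    matches = sorted(fnmatch.filter(names, pattern))
    if not matches:
        return None
    return matches[(presses - 1) % len(matches)]


def expand_all(pattern, names):
    """Batch expansion: every match, inserted at once."""
    return " ".join(sorted(fnmatch.filter(names, pattern)))


# Hypothetical directory contents.
names = ["b.zip", "notes.txt", "a.zip"]
print("unzip " + expand_one("*.zip", names))
print("unzip " + expand_all("*.zip", names))
```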

Tab Complete’s newborn sibling, Autocomplete, was a web browser
innovation that began at the much maligned UI shop known as Microsoft
and was later adopted by Netscape for its Communicator browser. (To
be as fair as possible, the emacs editor includes substantial
autocomplete facilities. I am referring here to the fact that this
was the first implementation of autocomplete for ordinary users, and
as far as I know was the first implementation among the thousands of
Windows apps over the last few years.) As Microsoft integrated
Internet Explorer and Windows Explorer, both the Run Dialog and the
Web Open Dialog possess Autocomplete functionality. (Actually,
Microsoft Word will also Autocomplete anything you type that is
related to a few known categories, i.e. date, author name, etc., but
I’ll deal with this later.) So what does this bring to the table?
Well, we see the beginnings of clue contexts showing up here, since at
first glance it appears that the run menu will autocomplete files and
the web browser will autocomplete web sites. But these are both
searches of the same clue context–the history context, in which
things that have been typed before are called back to be expanded back
into reality. And how does Autocomplete expand entries? In the
middle of typing, inverted text will appear containing the contents of
what the computer is guessing the user is trying to get at. This text
will only appear to the next valid level–http:// will expand to
http://www.best.com, but it will not expand to
http://www.best.com/~effugas nor
http://www.best.com/~effugas/Personal/SILC/silc.html. There’s no way
to really scroll through possible entries in this history-based
autocomplete–the first thing that matches will be matched to its
first level, and that’s all you get. (Ed Note: Holding shift and arrow
down lets you scroll through possible autocompletes on Netscape.)
Worse, sometimes a delay in typing is required to simply trigger an
autocomplete. Still, this functionality is total joy, even with all
of its warts.
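
The "next valid level" behavior can be approximated in a few lines of Python. This is a guess at the rule from the outside, not the actual browser logic: it takes the first history entry that matches and cuts it at the next slash after the scheme.

```python
def autocomplete(typed, history):
    """Expand `typed` to the next path level of the first matching entry."""
    for entry in history:
        if not entry.startswith(typed) or entry == typed:
            continue
        # Never cut inside the "://" that follows the scheme.
        scheme_end = entry.find("://")
        start = len(typed)
        if scheme_end != -1:
            start = max(start, scheme_end + 3)
        cut = entry.find("/", start)
        if cut == len(typed):
            # Already sitting on a level boundary, so go one level deeper.
            cut = entry.find("/", cut + 1)
        return entry if cut == -1 else entry[:cut]
    return None


history = ["http://www.best.com/~effugas/Personal/SILC/silc.html"]
print(autocomplete("http://", history))
print(autocomplete("http://www.best.com/", history))
```

Scrolling through alternatives would just mean continuing the loop past the first match instead of returning.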


What´s New In Cluehunting

Cluehunting specifies the following advancements beyond present-day
expansion technology:

  1. Universal Expansions
  2. Inputstream Aware Expansion Styles
  3. Application-Dependent Clue Contexts
  4. Clue Context Overrides
  5. Pluggable Context Servers
  6. Regular Expressions
  7. Batch Expansions
  8. Cluelists
  9. General Accessibility

Definitions help, of course:

  1. Universal Expansions: Expansion should be available in all
    interface components.

    The primary limitation of present expansion methods is that they
    can’t really be available everywhere. Cluehunting
    is designed to allow every interface construct to read the
    intentions of the user. It is the purpose of the next nine points
    to make sure that this works, and works well.

  2. Inputstream Aware Expansion Styles: Segmented streams of input
    data ought to implement commanded expansion, while unified
    inputstreams may take advantage of automatic expansion.

    A little background is going to be necessary to understand this.
    First, you can’t outclass something you can’t recognize the class of.
    That being said, let’s talk about Microsoft’s UI department. Take
    Microsoft Word 95/97. Red and green spelling and grammar warning
    underlines are excellent interface components. They’re
    unobtrusive enough to ignore in the heat of thought, yet available
    enough to make it difficult to miss misspelled or inappropriate
    words. I miss them any time I type in anything else. They
    enhance the feedback loop of the inputstream. The inputstream is
    defined as the flow of commands from the user to the computer as
    well as any information fed back along the same channels as the
    input–for example, a clock in the lower right hand corner is not
    part of the inputstream, but the characters that pop up in
    response to the corresponding keys being pressed on the
    keyboard are. What does not work in Word, however, is
    Autocomplete. When I type Dan, I’m not always talking about
    myself, and when I type August, I’m not always talking about the
    present date. I don’t want to have to interrupt my stream of
    thought to correct Word–my concepts are segmented into words,
    sentences, paragraphs, and full documents. This contrasts sharply
    with the very appropriate and useful usage of autocomplete for web
    sites, which have addresses that are single-phrase and thus
    unified. Therefore, while Word and any other segmented
    inputstream receiver ought to require a key to be pressed before
    the phrase is expanded (though a graphical hint like a different
    cursor would help), Netscape should attempt to expand
    automatically. NOTE: Research is required to make sure this
    inconsistency does not overly confuse users. It is very possible
    that automatically triggering an expansion in unified instances
    but delaying expansions in segmented cases is utterly confusing to
    users. In this case, I’d lean towards a completely delayed
    expansion interface.

  3. Application Dependent Clue Contexts: Applications should search
    multiple clue contexts appropriate to the active application
    context.
    Strange words coming from someone who worships
    consistency in user interfaces, but I really think this is
    necessary. Applications generate context, and all clues should
    not expand from some single chosen source. For example: Suppose
    I enter the word “liffe” into a word processor. The ideal word
    processor would notify the user immediately and non-intrusively
    that the word was misspelled. Obviously, the appropriate clue
    context for a misspelled word is to search through alternative
    correct spellings. Multiple presses of the Continue Cluehunt
    keybinding would search through multiple alternative spellings,
    until the user chose to press either the Cancel Cluehunt
    keybinding (probably Escape) to revert to the misspelled form or to
    press the Cluehunt Successful keybinding (probably Enter). The
    user could, of course, reselect the correctly spelled word, and
    this time search through the default context for a correctly
    spelled word: the thesaurus. So, life would be replaced with
    various synonyms–or, the thesaurus dialog could come up to
    provide a multidimensional search between life-as-vocation,
    life-as-socialness, or life-as-complete-lack-thereof. All that
    cluehunting specifies is a precondition and a
    postcondition–dialogs do not violate this. It would be
    preferable if these weren’t modal dialogs, however–it is rarely
    appropriate for the user to be locked out of his or her document.

  4. Pluggable Context Indexes: Clue contexts, either attached to an
    application or independent, should register themselves with a
    central index.
    This index of clue contexts would be categorized
    either by type or by owner application, would have MRU (most
    recently used) lists, and would be reconfigurable by the user.

  5. Clue Context Overrides: The user should be able to specify a
    specific clue context to expand from, in either a proactive manner
    or a reactive manner.
    Despite the fact that applications often
    have contexts that make sense, there are times when the user has
    another context in mind. For example, the user should be able to
    access the Thesaurus context while saving a file, or the
    filesystem context while documenting an application, or the web
    history context while creating a web page of links. This would be
    implemented with a Set Clue Context keybinding which would modify
    the present word’s clue context–a reactive override. If the user
    had not yet typed a word, the next word would be the recipient of
    the entered context–this would be a proactive override. Contexts
    would be registered upon install as per the plug-in clue context
    interface, and manipulable via replaceable dialogs. Most
    probably, some degree of categorization would be appropriate, as
    well as expansion on the clue context type itself. (In other
    words, a box would be given, and you’d type in Th and Thesaurus
    might come up). Of course, common clue contexts should be
    automatically recognized. A user typing in a path in any
    application, for example, should usually first trigger the file
    system history context, and then the literal file system search
    context. Similar results should await a user typing http://.
    However, there is an advantage to being able to select a context.
    By selecting the Execute Command context, the user could load any
    app directly from within any other app and have the stdout reply
    be pasted at the cursor. Much like ircii’s /exec command, this
    would allow the contents of, say, an ls to be directly pasted at
    the cursor. Quite nice.

  6. Regular Expressions: Regular Expressions should be available for
    usage in clue expansions.
    Many users are familiar with using * to
    signify a wildcard. While the default expansion would, in
    general, presume a * at the end of the provided clue and expand
    from there, there is no reason this is necessary. A user
    searching for dictionary words that end with “sort” should be able
    to expand *sort into resort, consort, and plain old sort. The
    only problem–how to differentiate between a clue containing a
    regex for search purposes (execute context for ls -l *.gz) versus a
    clue that wants its regex expanded before search (command history
    context for ls -l *.gz). It’s quite probable that most contexts
    will only fit one or the other, but I’m unsure. Email me if you
    think that a specific “begin regex” keybinding would be necessary.

  7. Batch Expansions: All entries that fit the provided clue should
    be available for simultaneous expansion.
    Through an “expand all”
    keybinding, the contents of all clues that fit the given context
    should be pasted at the cursor. This facilitates things such as
    “gunzip *.gz” being expanded into a list of all files to be
    gunzipped, allowing the user to make sure the shell was expanding
    the list correctly, among other uses.

  8. Cluelists: All entries that fit the provided clue should be
    listable in a multiselectable sortable dialog.
    In some ways, a
    basic version of this is part of Microsoft Word 97: Right click
    on a misspelled word and note the four or five alternate correct
    spellings right there in front of you. Most GUI web browsers also
    allow you to search the typed-in history by clicking on the down
    arrow at the far right of the entry bar. Cluelists extend this
    behavior by allowing the user a listmode or detailsmode (more
    windowspeak, so shoot me) interface to select between multiple
    options for expansion. Suppose the user wants to gunzip a couple
    of his or her .gz files. Simply typing gunzip *.gz inside of a
    cluehunt-enabled xterm and pressing the “cluelist” keybinding
    would generate a window containing a list of all files ending in
    “.gz”. Then, the user would control-click or shift-click the
    specific gzipped files desired to be expanded, press OK, and hit
    enter to cause those files to be gunzipped.

  9. General Accessibility: All capabilities of cluehunting must be
    accessible by mouse as well as by keyboard. It is critical that
    Cluehunting be part of a self-documenting interface, defined as an
    interface that bolsters the user’s understanding and mapping of
    available options. One major way to make an interface
    self-documenting is to provide multiple paths to the same
    destination that reference each other. Right-clicking on a batch
    of text should either bring up a single menu item containing
    “cluehunt” or a list of all the cluehunting options directly in
    the root right-click–research will be necessary to see which is
    preferable. Now, of course, each entry in the right-click menu
    would contain the keyboard shortcut right-justified, and the
    corresponding shortcut would be listed in the keybox (dev-note:
    Will be explained in upcoming proposal). Pretty slick.
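
To make points 3 through 7 a little more concrete, here is a minimal
sketch in Python. Everything named here (ClueContext, ContextIndex,
Cluehunt, and their methods) is hypothetical and invented for
illustration; none of it comes from an existing implementation. Note
that it uses Python's fnmatch module, which does shell-style wildcards
(the * usage described in point 6) rather than true regular
expressions.

```python
import fnmatch
from collections import OrderedDict


class ClueContext:
    """A named source of candidate expansions (hypothetical interface)."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = list(entries)

    def candidates(self, clue):
        # Point 6: presume a trailing * unless the clue has its own wildcard.
        has_wildcard = any(ch in clue for ch in "*?[")
        pattern = clue if has_wildcard else clue + "*"
        return [e for e in self.entries if fnmatch.fnmatchcase(e, pattern)]


class ContextIndex:
    """Central index that contexts register with (point 4), MRU first."""

    def __init__(self):
        self._contexts = OrderedDict()

    def register(self, context):
        self._contexts[context.name] = context

    def select(self, name):
        # Point 5: an explicit override; also bumps the context to the
        # front of the most-recently-used list.
        self._contexts.move_to_end(name, last=False)
        return self._contexts[name]

    def mru(self):
        return list(self._contexts)


class Cluehunt:
    """One hunt: step through candidates, then accept or reject."""

    def __init__(self, context, clue):
        self.clue = clue
        self.matches = context.candidates(clue)
        self.position = None  # None means the original clue is showing

    def current(self):
        return self.clue if self.position is None else self.matches[self.position]

    def forward(self):
        if self.matches:
            nxt = 0 if self.position is None else self.position + 1
            self.position = nxt % len(self.matches)
        return self.current()

    def backward(self):
        if self.matches:
            prev = len(self.matches) - 1 if self.position is None else self.position - 1
            self.position = prev % len(self.matches)
        return self.current()

    def reject(self):
        self.position = None  # revert to what the user typed
        return self.clue

    def expand_all(self):
        # Point 7: every match at once, ready to paste at the cursor.
        return " ".join(self.matches)


index = ContextIndex()
index.register(ClueContext("dictionary", ["consort", "resort", "sort", "sorted"]))
index.register(ClueContext("files", ["a.gz", "b.gz", "notes.txt"]))

hunt = Cluehunt(index.select("dictionary"), "*sort")
print(hunt.forward())   # consort
print(hunt.forward())   # resort
print(hunt.reject())    # *sort
print(Cluehunt(index.select("files"), "*.gz").expand_all())  # a.gz b.gz
```

The one design choice worth noting: a hunt never destroys the original
clue, so rejecting always puts back exactly what the user typed.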


Default Cluehunting Keybindings

(Editor’s Note: I have some association with the GNOME project, which
hopefully will end up creating a world class User Interface for Linux
and other Unix systems. Nothing official, anymore.)
Well, I’ll be blunt: We’re still working on a default keyspace for
GNOME-compliant apps. However, the following is a preliminary set of
keybindings for cluehunting:

          + Cluehunt Forwards:  Alt-Shift-Right Arrow
          + Cluehunt Backwards: Alt-Shift-Left Arrow
          + Accept Cluehunt:  Anything that moves the cursor.  Enter has
            its functionality modified to not clear the contents of the
            expansion.
          + Reject Cluehunt:  Esc
          + Expand All:  Alt-Shift-Enter
          + Scroll Through Cluelist:  Alt-Shift-Up and Down.
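
As a rough illustration, the table above could be expressed as a plain
lookup in Python. The chord strings, the action names, and the
CURSOR_KEYS set are all assumptions made up for this sketch; in
particular, "anything that moves the cursor" is approximated by a
short, hand-picked list of keys.

```python
# Preliminary bindings from the list above, keyed by chord name.
KEYBINDINGS = {
    "Alt-Shift-Right": "forward",
    "Alt-Shift-Left": "backward",
    "Esc": "reject",
    "Alt-Shift-Enter": "expand_all",
    "Alt-Shift-Up": "cluelist_up",
    "Alt-Shift-Down": "cluelist_down",
}

# Assumption: a stand-in for "anything that moves the cursor".
CURSOR_KEYS = {"Left", "Right", "Up", "Down", "Home", "End", "Enter"}


def action_for(chord):
    """Map a key chord to a cluehunt action name, or None for plain typing."""
    if chord in KEYBINDINGS:
        return KEYBINDINGS[chord]
    if chord in CURSOR_KEYS:
        return "accept"
    return None


print(action_for("Alt-Shift-Right"))  # forward
print(action_for("Enter"))            # accept
print(action_for("a"))                # None
```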

The Future Of Cluehunting

Cluehunting is a developed proposal, but it’s still in development.
Research will be needed to check for areas of confusion and
functionality. Still to be determined:

  1. How to notify the user that the existing text is
    expandable via a cluehunt? Different cursors, different
    text colors, a note in the title bar…?

  2. How to implement cluehunting? One possible way is to
    simply have a directory structure that corresponds to
    individual clue contexts and contains standard
    stdin/stdout apps that take in the appropriate segment
    and spit out a return value. Implementation isn’t that
    much of an issue, though–possibility is more relevant
    than methodology.
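
For what it's worth, the directory-of-apps idea in point 2 is small
enough to sketch. The following Python is one possible reading, not a
settled design: the function name run_context, the layout (one
subdirectory per clue context, each holding executables), and the
convention of one candidate per line of stdout are all assumptions.

```python
import os
import subprocess


def run_context(root, context, clue, timeout=5):
    """Feed `clue` to every executable in root/context; collect stdout lines."""
    directory = os.path.join(root, context)
    results = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue  # skip READMEs, data files, and subdirectories
        proc = subprocess.run(
            [path],
            input=clue + "\n",      # the segment goes in on stdin
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if proc.returncode == 0:
            # Assumption: each non-empty stdout line is one candidate.
            results.extend(line for line in proc.stdout.splitlines() if line)
    return results
```

Under this layout, adding a new clue context is just a matter of
dropping an executable into a new directory, which fits the pluggable
index described earlier.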

Categories: Imagery