Introduction

Visual exploration and analysis of data is increasingly important for advancement in virtually every area of human endeavor. Whether recorded directly by people or indirectly using machines, data captures our observations and interpretations of the world.

When people interact with data, it is almost always in a visual form like graphics or text. The goal of this project is to vastly expand the usefulness of interactive visualizations by providing a general way to create and edit data inside the visualizations themselves.

The key new idea of the project is that visualization users can perform sequences of gestures with common input devices to express their observations and interpretations directly in visual form. The visualizations not only show data, but also serve as meaningful graphical spaces in which to edit that data.

By extending the data processing workflows and display techniques that are currently used in popular visualization tools and software libraries, we can flexibly and expressively translate the details of interactions into precise data changes with simultaneous visual feedback.

The innovative contributions of the project will include a general method to support interactive data editing in visualizations, a diverse collection of data editing gestures, a set of patterns to guide the process of designing visualization tools with data editing features, a declarative programming language for quickly building those tools, and a variety of built tools that show off real applications of data editing in visualizations.

The project focuses on developing, evaluating, and distributing tools for scholarly research in the digital humanities. It tightly integrates education to bring together students and researchers from computer science, information science, and the humanities, and provide them with concrete opportunities to engage in authentic interdisciplinary collaboration.

Scholarly research and education in the humanities involves open-ended exploration, analysis, and interpretation of complex data sets in diverse areas of study. This makes it an exemplary first target to demonstrate how gesture-based visual editing can be broadly applied to data analysis in virtually every segment of society. The broader impacts of the project will spring from the availability of a new, foundational, general-purpose methodology to support data entry, organization, annotation, and correction.

Project products will include publications, tutorials, videos, the visualization gesture system as open source software, a compendium of data editing gestures, and a gallery of demonstration visualization tools for public download. Ongoing information about the project and links to the resulting resources will appear on this page as they develop.

InfoVis Interaction Survey

We conducted a comprehensive survey of visualization interactions in the 570 papers published at the IEEE Conference on Information Visualization from 1995 to 2014. Our approach involves accumulating full bibliographic metadata on the papers, scanning each paper for candidate interactions, performing close reading to record and interpret authors' explicit and implicit descriptions of interactions in their technique/tool/system, then recording details about each interaction into a database. This process has so far resulted in 486 identified interactions in the visualizations presented in 175 different papers.

Based on this survey, we created InfoVis paper trading cards as a way to ask InfoVis 2015 attendees to help us classify selection interactions on our poster, Examining the Many Faces of Selection:

Surprisingly, we found that very few visualizations in the InfoVis literature include interactions that can readily be thought of as gesture-based data editing interactions. Most interactions are common navigations or variations of basic brushing selection; the latter can be thought of as editing a user-specific boolean-valued attribute for each data item. Many of the remaining interactions are form/widget-based data value editors. This result has led us to rethink the role of data editing in information visualization. On the one hand, there are few existing examples from which to determine a taxonomy of editing interactions and from there to define a language for declaring gesture-based interactions in visualization designs. On the other hand, there is enormous opportunity for us to design these interactions with few prior constraints or preconceptions, allowing great flexibility and creativity. Our new approach is to map combinations of geometric characteristics of device operations into constrained data value domains, allowing designers to design the data value outputs of gestures as a function of geometric descriptors.

Designing Gestures for Data Editing

The project is helping to significantly expand the theory of visualization design. The human-computer interaction loop of data visualization can be usefully reinterpreted as drawing/painting/sculpting that is indirectly backed by data rather than directly backed by configurations of physical media. Interactions in visualization can follow abstract rules of computation that are less constrained than the concrete rules of physics. In particular, visualization interaction need not be limited to direct manipulation of space (navigation) or pointing at objects in it (selection); interaction design can be extended beyond both in a straightforward way.

Gestures are movements in a spatial context. They possess geometry and other characteristics. They can affect the geometry and other characteristics of the spatial context, the objects in it, or both. From this perspective, navigation is a special case of interaction that transforms context geometry. Selection is a special case of interaction that indicates object subsets. Of course, numerous other interactions exist in visualization tools, but all (or the vast majority) happen in a halo of user interface beyond the visualization model. Many of these interactions can be captured under the gesture-based data editing model.

Interactive visual data editing can be modeled on top of existing user input mechanisms based on time sampling of event streams from physical input devices. The process of mapping events into data edits can be decomposed into a small sequence of relatively simple declarations. We have specified a semi-formal model that successfully breaks down event processing for data editing into eight steps: match conditions, accumulate events, aggregate events, match event patterns, extract geometries, aggregate geometries, map features into feedback, and map features into data/parameters.
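As a concrete illustration, the eight steps can be sketched as a minimal event-processing pipeline. The sketch below is entirely hypothetical: the event structure, the function names, and the choice of a simple drag gesture (whose length becomes the edited value) are our own illustrative assumptions, not the project's actual model or implementation.

```python
# Hypothetical sketch of the eight-step event-processing model.
# All names and structures here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Event:
    kind: str   # e.g. "down", "move", "up"
    x: float
    y: float

def process(events):
    # 1. Match conditions: consider only pointer events (all kinds here).
    candidates = [e for e in events if e.kind in ("down", "move", "up")]
    # 2. Accumulate events: collect the stream between "down" and "up".
    stroke, recording = [], False
    for e in candidates:
        if e.kind == "down":
            recording = True
        if recording:
            stroke.append(e)
        if e.kind == "up":
            recording = False
    # 3/4. Aggregate events and match the event pattern: require a
    #      complete down...up drag, then reduce it to its endpoints.
    if not stroke or stroke[0].kind != "down" or stroke[-1].kind != "up":
        return None
    start, end = stroke[0], stroke[-1]
    # 5/6. Extract and aggregate geometries: here, a single line segment.
    segment = ((start.x, start.y), (end.x, end.y))
    # 7. Map features into feedback: report the segment for drawing.
    # 8. Map features into data/parameters: segment length becomes the value.
    (x0, y0), (x1, y1) = segment
    value = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    return {"feedback": segment, "value": value}
```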

We have applied the model to specify several example visualization gestures including both well-known cases (e.g., point selection) and previously unknown ones (e.g., half-plane selection):

Application of the model appears to be highly understandable, expressive, economical, and reusable. We anticipate that this approach to data editing will carry over easily and effectively into visualization design practice. Beyond our own efforts to apply it to the digital humanities in this project, we are increasingly hopeful that the model could open up the interaction design space to study and broad application throughout the visualization community.

Geometric Data Editing

The objects and relations in a geometrie can be used to compose geometries for several purposes: to visually encode the spatial characteristics of data items (*glyphs*), to similarly encode the spatial characteristics of gesture feedback (*guides*), and to define functions that relate item and feedback encodings. The geometric inputs to these functions can come from the data encoding, the gesture feedback, or both. Gestures evaluate those functions to produce data values, which are assigned to the attributes of targeted items in the visualized data set. Editing can happen continuously, in discrete stages, and/or at the end.
For example, data items might be visually encoded as circles with fill color indicating an importance value. Clicking any point in the view sets the importance value to be the distance from that point to the corresponding data item's center. Similarly, a horizontal line dragged vertically might set each importance value to be the minimum distance from the line to the corresponding center. These two gestures involve the same visual encoding of items (glyphs), but use different feedback encodings (guides), and apply different functions to map the relationship between glyph and guide geometries into new data values shown as fill color. Editing is discrete (clicking) in the first gesture and continuous (dragging) in the second, although the latter could be designed otherwise; in both cases fill color adjusts dynamically after each click or drag movement. The two gestures differ substantially in both appearance and behavior, despite serving essentially the same editing purpose. Such variety in designing gestures for editing data of a particular type can support either general-purpose use or application-specific needs.
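The two hypothetical importance gestures reduce to simple distance functions. A minimal sketch, assuming 2-D points as (x, y) tuples; the function names are our own:

```python
import math

def click_importance(click, center):
    """Discrete gesture: importance = distance from the click point
    to the glyph's center point."""
    return math.dist(click, center)

def line_importance(line_y, center):
    """Continuous gesture: importance = minimum distance from a
    horizontal guide line (at height line_y) to the glyph's center."""
    return abs(center[1] - line_y)
```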

Functions can be applied to the geometric objects in a geometrie to produce useful data values of a variety of different data types. Some functions produce decimal values in the form of scalar measurements on geometric objects, such as the length of a line segment or the area of a circle. Other functions apply some form of discretization to geometric objects to produce integer values, such as the quadrant of a point in a circle. Still others test for a geometric condition to produce boolean values, such as whether a line segment intersects a circle or a point is inside a polygon.
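These three kinds of functions (decimal measurement, integer discretization, boolean condition) might be sketched as follows; the names, signatures, and quadrant numbering are illustrative assumptions:

```python
import math

def segment_length(p, q):
    """Decimal value: scalar measurement (length of a line segment)."""
    return math.dist(p, q)

def circle_area(r):
    """Decimal value: scalar measurement (area of a circle)."""
    return math.pi * r * r

def quadrant(point, center):
    """Integer value: discretization (quadrant of a point, numbered 1-4
    counterclockwise from the upper right)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    if dx >= 0 and dy >= 0:
        return 1
    if dx < 0 and dy >= 0:
        return 2
    if dx < 0:
        return 3
    return 4

def point_in_circle(point, center, r):
    """Boolean value: geometric condition test."""
    return math.dist(point, center) <= r
```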

Using the geometrie, these functions can be effectively reduced to simple characteristics of composed geometric objects. Consider the two previous example gestures. In the first example, editing happens by taking the length of the line segment that connects the click and item center points, rather than calculating the distance between those two points. Similarly, in the second example, editing uses the length of the normal line segment that connects the item center point to the dragged line.

In most visualized data sets, data values are almost always booleans, integers, decimals, or strings. Geometric gestures appear flexible enough to support general editing of the first three types. On the other hand, interactive entry of strings falls outside of our geometric editing approach. Entering strings can still happen by conventional means, such as in a popup textbox. Gestures can still be used to select which data items take the entered string as a new attribute value. Moreover, if the set of allowed strings is small and fixed (categorically typed) rather than arbitrary (nominally typed), gestures for editing integers can be combined with an integer-to-string index. Each data item has a string attribute, the visual encoding displays that string as a label, the gesture produces an integer, and the edit consists of mapping that integer into a new string from the set. For example, if data items are visually encoded as circles labelled with the season, one could change the season to spring, summer, fall, or winter using a quadrant-picking gesture.
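The season example can be sketched directly: the gesture produces an integer, and an integer-to-string index maps it into the fixed categorical set. The quadrant numbering and season ordering below are illustrative assumptions:

```python
SEASONS = ["spring", "summer", "fall", "winter"]  # fixed categorical set

def edit_season(point, center):
    """Quadrant-picking gesture: the click's quadrant (an integer, 1-4)
    indexes into the allowed set of string labels."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    if dx >= 0 and dy >= 0:
        q = 1
    elif dx < 0 and dy >= 0:
        q = 2
    elif dx < 0:
        q = 3
    else:
        q = 4
    return SEASONS[q - 1]
```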

Geometric functions can also be used to design gestures for editing values of *constrained* data types. Real-world observations and measurements often fall into a limited set or range of sensible values. Data attributes and the gestures that edit them should respect these limits. The two example gestures above both use distance to allow entry of only non-negative decimal values, which makes sense for importance as a data attribute. The constraint in this case is a lower bound (of zero) on allowed decimal values.
Our goal is to support general-purpose visual data editing in a wide variety of real-world applications. Doing that calls for two efforts. First, we need to better understand the ways that a geometric approach to editing limits the reachability of data values, both theoretically (mathematically) and practically (on a real screen). Second, we need to better understand the ways that real-world observations and measurements can and should be constrained relative to computationally "raw" data types like integers and decimals.

The full scope of both efforts is beyond that of this project.

**Because we are targeting data editing gestures for digital humanities applications, within the scope of this project we are focusing on geometries that support addition and adjustment of small, bounded integers and decimals.** Gestures for these constrained data types will allow editing of common kinds of information in digital humanities data sets, particularly counts, ratings, categories, geographic locations, and calendar date fields.
For example, imagine a visualization of ancient cities, displayed on a map as circles with area proportional to estimated population. The map provides a gesture to edit a conjectured fraction of non-native language speakers in each city's population. The gesture presents a circular guide. The edited data value is proportional to the area of overlap between the city circle and the guide circle. This gesture design has several benefits. It bounds edited values both above (to total city population) and below (to 0). It uses area as a perceptually accurate way to show fraction, including continuously during the gesture. It also allows more precise entry of very low fractions owing to the way circles overlap.

**Glyph = Circle, Guide = Circle**

This gesture involves two circles. The glyph circle is fixed in space. The guide circle can be moved interactively. The area of their overlap is used to define a data value. This geometric combination is radially symmetric; the overlap area is a function of the distance but not direction between centers. Consequently, it is enough to consider movement of the guide circle horizontally relative to the glyph circle.

The data values that can be reached with this gesture range from 0 (no overlap, with the guide circle entirely to the left or right of the glyph circle) to the minimum area of the two circles (when they fully overlap or one fully contains the other). However, editing within this range is not uniformly sensitive to movement of the guide circle. As a function of horizontal movement distance, the area of overlap changes less when the circles overlap slightly than when they overlap a lot. Consequently, movements can adjust data values more precisely toward the low end of the reachable range. One can also quickly and easily set the data value to exactly zero by moving the entire guide circle anywhere to the left or right of the glyph circle. On the other hand, setting the data value to exactly the maximum requires slow, careful movement to achieve perfect overlap. This combination of advantages and disadvantages likely serves the above city example well, since editing in that application would usually involve small fractions.
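The overlap area driving this gesture is the standard circle-circle intersection ("lens") formula; a sketch, with parameter names of our own choosing:

```python
import math

def lens_area(r_glyph, r_guide, d):
    """Area of overlap between two circles whose centers are distance d
    apart.  Only |d| matters (the combination is radially symmetric)."""
    d = abs(d)
    if d >= r_glyph + r_guide:
        return 0.0                                  # no overlap: value is 0
    if d <= abs(r_glyph - r_guide):
        return math.pi * min(r_glyph, r_guide) ** 2  # full containment: max
    # Standard lens formula: two circular-segment terms minus the
    # triangle correction.
    a = r_glyph**2 * math.acos((d*d + r_glyph**2 - r_guide**2) / (2*d*r_glyph))
    b = r_guide**2 * math.acos((d*d + r_guide**2 - r_glyph**2) / (2*d*r_guide))
    c = 0.5 * math.sqrt((-d + r_glyph + r_guide) * (d + r_glyph - r_guide)
                        * (d - r_glyph + r_guide) * (d + r_glyph + r_guide))
    return a + b - c
```

Note that the area changes fastest at full overlap and slowest near tangency, which is the source of the extra precision toward the low end of the range described above.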

Exploring Geometries for Editing

The design space for gesture-based data editing is potentially enormous. Geometries and constrained data types provide conceptual scaffolding to structure that space. To build and populate it, we must first understand the characteristics of constrained data types, how geometric combinations manifest those characteristics (or not), and whether those combinations are likely to be useful and usable for actual visualization applications.

The following sequence of 13 examples explores ways of combining glyph and guide geometries to edit small bounded decimal values. Glyph geometry progresses from the simple and familiar to the complex and exotic. Guide geometry is always a vertical line that can be dragged left and right. The edited data value is always an area defined by the two geometries. In other words, we treat glyph geometry as the exploratory variable with guide geometry and data value mapping as controls. This controlled approach helps us probe the reachability of data values and the sensitivity/precision of interactive movements as a function of particular geometry pairings.

**Glyph = Circle, Guide = Vertical Line**

In this example, the glyph is a circle. The guide is a vertical line which can be moved horizontally. When the line intersects the circle at more than one point, the resulting chord defines two regions inside the circle. The edited data value is the area of the region to the left of the chord.

Reachable data values range from 0 to the area of the whole circle. (Alternatively, the area could be normalized for a maximum of 1.) Like the circle-circle example above, a 0 data value is easily and quickly reached by moving the line anywhere to the left or right of the circle. Unlike the circle-circle example, the data value can be more precisely adjusted near the maximum (when the line is just left of the circle's rightmost point), as well as near zero (just right of the leftmost point). Near the center of the circle, moving the line changes the data value more rapidly but nearly linearly.
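The area left of the chord is the classic circular-segment formula. A sketch, assuming a unit-radius circle centered at the origin (the radius parameter is our own generalization):

```python
import math

def segment_area(x, r=1.0):
    """Area of the part of a circle (center at the origin, radius r)
    to the left of a vertical guide line at horizontal position x."""
    if x <= -r:
        return 0.0                      # line left of the circle
    if x >= r:
        return math.pi * r * r          # line right of the circle
    # Integral of the chord length 2*sqrt(r^2 - t^2) from -r to x.
    return x * math.sqrt(r*r - x*x) + r*r * (math.asin(x / r) + math.pi / 2)
```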

For this combination of geometries, area serves as a constrained data type with several characteristics: (1) it has a lower bound of 0; (2) it has an upper bound equal to the total circle area; (3) inside these bounds the reachable values are distributed quadratically, with highest density at the extrema; (4) outside each bound, the reachable value is a constant equal to the bound; and (5) the distribution is symmetric about the midpoint value.

**Glyph = Cone, Guide = Vertical Line**

In this example, the glyph is a horizontal cone. The guide is a vertical line which can be moved horizontally. The line cuts the cone vertically, enclosing a region left of the line and right of the cone's apex. The edited data value is the area of this region.

Many visualizations imply infinite geometry through the grid and axis decoration of view coordinate systems, but rarely (if ever) use it explicitly in data representation or interaction. This example introduces the interesting prospect of using infinite geometries for visual encoding and editing of data. For instance, an infinite cone could be used to visually encode the potential extent of a propagating wavefront.

Reachable data values have a lower bound of 0 and no upper bound. Because the cone extends infinitely to the right, in theory the line can be dragged far to the right to enclose an arbitrarily large region. In practice dragging is limited by the finite space and time available for interacting with the guide inside a view. The lower bound of 0 can be reached by moving the line anywhere to the left of the cone. However, this might not be quick or easy if the line has been previously dragged far to the right. (This observation suggests that there would be value in studying the dynamics of autoscrolling for geometries beyond rectangles.)

The distribution of data values is the right half of a parabola. When the line is near the apex of the cone, changes are small and precision is high. As the line moves farther right, the area of intersection accelerates, so one can quickly reach much larger data values, but with rapidly decreasing precision. While at first glance this may seem like a very costly tradeoff, counts and other kinds of numbers recorded as observations in the digital humanities, when they range across multiple orders of magnitude, often require only one or two significant digits of precision. In these cases, the cone gesture can maintain sufficient precision while allowing rapid adjustment over a wide range of values. (An interesting variation might calculate the data value as an inverse of the area, allowing the entry of very small fractions with a similar pattern of precision across scales.)
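A sketch of this gesture's value function, assuming (as an illustrative parameterization) the apex at x = 0 and height 2·slope·x at position x, so area accumulates as slope·x²:

```python
def cone_area(x, slope=1.0):
    """Area enclosed between the cone's apex (at x = 0) and a vertical
    guide line at position x.  The cone's height at t is 2*slope*t, so
    the accumulated area grows quadratically (half of a parabola)."""
    if x <= 0:
        return 0.0           # line left of the apex: value is 0
    return slope * x * x     # no upper bound: the cone is infinite
```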

**Glyph = Bowtie, Guide = Vertical Line**

In this example, the glyph consists of a mirror-image pair of finite horizontal cones in the shape of a bowtie. The guide is a vertical line which can be moved horizontally. The line cuts one of the cones vertically, enclosing a region left of the line and right of the cones' mutual apex. The edited data value is the area of this region.

The area increases rapidly as the vertical line moves horizontally from the leftmost edge of the bowtie. However, as the vertical line nears the apex, the rate of change decreases and precision increases linearly. As the line moves right past the apex, the rate of change increases and precision decreases linearly, up to the rightmost edge of the bowtie. Reachable data values have a lower bound of 0 and an upper bound of 1 (the normalized area of the entire bowtie), and these values can be easily and quickly set by moving the line left or right of the bowtie.

The distribution of data values consists of two half parabolas. The graph of data values is equivalent to a vertical parabola with the left half turned downward. The curve is continuous but changes very slowly at the apex, allowing for smooth setting of data values near 0.5. This combination of geometries thus provides a distribution suitable for entering observations that occur predominantly toward the middle of a bounded range and are generally symmetric in that range.
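A sketch of the normalized bowtie distribution, assuming (illustratively) a mutual apex at x = 0 and half-width w; the two half-parabolas meet at the midpoint value 0.5:

```python
def bowtie_value(x, w=1.0):
    """Normalized area of a bowtie glyph (mutual apex at x = 0,
    half-width w) to the left of a vertical guide line at x."""
    if x <= -w:
        return 0.0                       # line left of the bowtie
    if x >= w:
        return 1.0                       # line right of the bowtie
    if x <= 0:
        return 0.5 * (1 - (x / w) ** 2)  # decelerating half-parabola
    return 0.5 * (1 + (x / w) ** 2)      # accelerating half-parabola
```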

Alternatively, area could be calculated for the region from the apex to the line, with sign corresponding to direction left or right. The resulting data values would fall in the range of -1 to +1. The axial symmetry of the bowtie relative to line position would thus allow visual entry of both a magnitude and a parity simultaneously. The bowtie's apex provides a clear, precise location where parity changes. The same axial symmetry exists in the circle example above, but the location of parity change is harder to identify at the top and bottom of the circle. Setting parity is quick and easy across the entire magnitude range in the bowtie example, but only so toward the ends of the range in the circle examples.

**Glyph = Aster, Guide = Vertical Line**

In this example, the glyph is an "aster" created by dividing a circle into four arcs, then inverting each arc around the chord connecting its endpoints. The result is a kind of "inverted" circle. The guide is a vertical line which can be moved horizontally. The line cuts arcs vertically, enclosing a region left of the line and right of the point shared by the left two arcs (and which in the original circle would have been its leftmost point). The edited data value is the area of this region.

The distribution of data values is a sigmoid-like function with a minimum of 0 and a maximum of 1 (the normalized area of the entire aster). The curve is continuous but changes very slowly at the extremes and very quickly in the middle. This is the same pattern as in the circle examples, but exaggerated by the "inversion". Moreover, the "inversion" creates an obvious inflection point near the center of the aster, allowing easier entry of values near the middle of the range, although still with low precision.

Entry of values at the bounds is easy and quick by moving the line outside the glyph, just like with the other finite glyph geometries. Entry of values near the bounds is highly precise. Whereas the bowtie example allows *more* precise entry near **one** data value in the range, the aster example allows *highly* precise entry near **two**. This suggests refining the approach to design geometries with distributions that offer precise entry near **multiple** values, with the ability to tune where and how precise those value regimes are.

**Glyph = Compound (Diamonds), Guide = Vertical Line**

A geometrie provides a set of operators to calculate *unitary* geometric objects from others. *Compound* geometric objects can also be defined by simply collecting them into a group. Compound geometries often have the same characteristics as unitary geometries. For instance, a group of 2-D shapes has bounds and area determined by the union of the regions of the shapes in the group.

In this example, the glyph is a pair of diamonds. The compound geometry consists of two squares, each rotated 45 degrees and positioned horizontally to touch at one mutual vertex. This geometry can also be seen as a set of four finite cones in a sequence alternating in direction right-left-right-left. It can also be interpreted as an extended bowtie that converges back to a point on each end. The guide is a vertical line which can be moved horizontally. The line cuts the diamonds/cones vertically, enclosing a region left of the line and right of the leftmost vertex of the leftmost diamond. The edited data value is the area of this region.

The distribution of data values is a multi-level sigmoid-like curve. This curve combines features of the distributions in the bowtie and aster examples. The compound geometry is finite like its components, so data values remain bounded. Entry of the bound values similarly remains easy and quick left and right of the overall shape, although in practice likely less so due to the need to move the line farther to overcome a larger overall width. Within the bounds, the distribution allows more precise entry near **three** values, specifically the two bounds and their average/midpoint. In between are **two** regimes of faster but less precise adjustment.

**Glyph = Compound (Hexagons), Guide = Vertical Line**

The previous examples reveal how changes in height in a glyph's geometry determine local characteristics in the accumulation of area, i.e. the distribution of reachable data values. When height changes quadratically, the distribution curve changes cubically, roughly as in the circle and aster examples. When height increases/decreases linearly, meaning the slope is constant but not zero, the distribution curve increases/decreases quadratically, as in the cone, bowtie, and diamond examples. When height is constant, meaning the slope is zero, the distribution curve increases linearly. Consequently, glyph designs can incorporate flat segments into geometries to create linear ranges within a data value distribution.

In this example, the glyph is a pair of hexagons. To create this geometry, each diamond in the previous example is stretched to add a flat section in its center. The width of the flat section is 1/2 of the original diamond, hence 1/3 of the created hexagon. As in the previous examples, the guide is a horizontally movable vertical line which cuts the glyph and encloses the area within the glyph to the left of the line. The center segment of each hexagon creates two linearly increasing spans in the data value distribution, one centered on 1/4 and one on 3/4.
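A numeric sketch of the hexagon-pair distribution, assuming (illustratively) unit-width thirds and maximum height 2; the flat middle thirds produce equal area increments for equal line movements, i.e. the linear spans:

```python
def hexpair_height(x):
    """Height profile of two hexagons touching at x = 0.  Each hexagon
    has a flat middle third (constant height) between two sloped thirds."""
    x = abs(x)
    if x <= 1:
        return 2 * x          # inner slope: quadratic area span
    if x <= 2:
        return 2.0            # flat third: linear area span
    if x <= 3:
        return 2 * (3 - x)    # outer slope: quadratic area span
    return 0.0

def value_at(x, steps=6000):
    """Accumulated (unnormalized) area left of the guide line at x,
    by midpoint-rule integration of the height profile."""
    if x <= -3:
        return 0.0
    x0, total = -3.0, 0.0
    dx = (x - x0) / steps
    for i in range(steps):
        total += hexpair_height(x0 + (i + 0.5) * dx) * dx
    return total
```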

In practice the overall visualization design and context of use can strongly affect the speed versus precision tradeoff in any of these examples. The size and aspect ratio of glyphs, as displayed on the screen, is particularly important. For the two hexagons in this example, the absolute widths of the quadratic spans are the same as in the two-diamonds example. The amount of horizontal space near the bound values and midpoint value is also unchanged. Thus, the precision of entering those values is preserved. However, the relative widths are smaller in the hexagon case. If the two glyphs were visually encoded and rendered to have the same overall width on screen, the apparent widths of the non-flat segments would be reduced, lowering the precision of data adjustment within them.

**Glyph = Compound (Extruded Bowtie), Guide = Vertical Line**

In this example, the glyph is an infinitely extruded bowtie. To create this geometry, an infinitely wide flat section is added to each end of the bowtie. This can also be thought of as extruding the left and right edges of the bowtie outward indefinitely. (The glyph can also be seen as a variation of the hexagon example in which the middle third of each hexagon is infinitely extruded and the outermost third of each hexagon is effectively eliminated.) Like the previous examples, the guide is a horizontally movable vertical line which cuts the glyph in two. Unlike the previous examples, but like the variant mentioned in the finite bowtie example, the area is defined by the region between the line and the center point of the bowtie. The area and direction determine the magnitude and sign of the edited data value.

*(Note: The figures above depict the finite case. The infinite case looks the same except that the center is shifted to the origin, fill extends only outward from the center, and the distribution curve extends infinitely in both directions.)*

By stretching flat segments indefinitely, the distribution of data values changes linearly everywhere except within a small subrange of values near zero. Precision is constant except within that subrange. This dichotomy suggests gesture designs to accommodate cases in which data types consist of meaningfully piecewise subtypes, such as types that involve different units at different scales. For instance, local changes in population due to migration might be recorded exactly at values below 1000, but in thousands above that. For historical populations, most but not all edits might involve very small numbers and only a few in the thousands. An extruded bowtie could be used to provide equally sufficient precision down to 1000 and then increasing precision closer to 0 (whether positive or negative).

**Glyph = Compound (Stretched Diamonds), Guide = Vertical Line**

In this example, the glyph is an infinitely stretched pair of diamonds. To create this geometry, the outermost vertex of each diamond is pulled out to infinity, stretching its outermost edges outward. The result is a compound geometry comprising a bowtie and an envelope bounded by +1/x and -1/x for |x| > 1. The guide and the area-to-data-value mapping are the same as in the extruded bowtie example.

The graph of reachable data values is still continuous, and has decreasing precision outward from zero while the line is inside the bowtie. Outside the bowtie, speed of adjustment slowly decreases and precision of editing slowly increases; accumulation along the envelope effectively integrates 1/x, so values grow only logarithmically with movement.
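A sketch of the signed value function, assuming (illustratively) bowtie height 2|x| for |x| ≤ 1 joined continuously to the ±1/x envelope beyond, so the core accumulates as x² and the envelope adds the integral of 2/x:

```python
import math

def stretched_diamond_value(x):
    """Signed data value for the stretched-diamond glyph: area between
    the center (x = 0) and the guide line, with sign from direction.
    Inside the bowtie (|x| <= 1, height 2|x|) the value grows
    quadratically; along the 1/x envelope (|x| > 1) it grows
    logarithmically."""
    s = 1.0 if x >= 0 else -1.0
    x = abs(x)
    if x <= 1:
        return s * x * x                  # quadratic core
    return s * (1 + 2 * math.log(x))      # integral of the 2/x envelope
```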

Using only the positive half of the glyph geometry could be useful for entering durations on calendar timescales. For instance, it is customary to express age in months for infants and years for children and adults, with a transition typically at 24 months / 2 years. It is also possible for adults to live beyond 100 years, and there is no definite limit. A stretched diamond glyph would limit entry to positive values, show when month entry transitions to year entry, and provide gradually increasing precision of entry toward higher ages.

**Glyph = Cusps, Guide = Vertical Line**

In this example, the glyph is defined by the envelope between -1/log10(|x|) and +1/log10(|x|). The guide and the area-to-data-value mapping are the same as in the extruded bowtie and stretched diamond examples. The two functions intersect at two points, x = -1.0 and x = 1.0. This produces a segmented look similar to the two previous examples; however, the segmenting results from deriving a 2-D region from two 1-D curves, rather than summing existing 2-D regions into a group. The two infinite segments on the ends (the U-shaped regions extending horizontally outward) are nevertheless visually simple. Despite subtle mathematical differences, they serve the same general purpose, indefinitely accumulating area, as the corresponding outer segments in the previous two examples. Their visual character and pattern of accumulation roughly parallel those of the finite circle, diamond, and aster examples.

In contrast, the middle segment is horizontally finite but vertically infinite. The logarithmic function has a singularity at x=0 which creates (as a rendering artifact) the visual appearance of a cusp at the center of the glyph. Interestingly, the total area of the segment is finite (4 / ln(10)). Consequently, the middle segment can be used to accumulate area either outward (as we consider here) or from the glyph ends (if the outer segments were truncated to finite length, as the figures above suggest).

Near x=0, the cusp causes sharp increases in accumulation and extremely low precision. This behavior is the opposite of that in the previous two examples.

*(The overall distribution curve appears strikingly similar to a diagonal (x = -y) inversion of the stretched diamond case. This similarity makes some intuitive sense given the derivative-integration duality of 1/x and log(x), but closer inspection reveals differences in the middle segment, which result from the use of a third function for the bowtie in the stretched diamonds case.)*

Values near x=0 are much harder to choose, making that point something of an interactive "repulsor" rather than an "attractor" for editing. This characteristic might be useful for editing constrained data types that have values one can get very close to, but shouldn't pick exactly. This could be a value in a range that is open on one or both ends, or a value that must be positive but can otherwise be arbitrarily small.

**Glyph = Compound (Sequence of Diamonds), Guide = Vertical Line**

In this example, the glyph is a horizontal sequence of multiple diamonds, in this case four of them. The gesture works exactly the same way as in the two diamonds example, but results in a data value distribution with multiple transitions between four subranges that allow faster adjustment and five subranges that allow more precise value setting.

In the more general case of N diamonds, there are N faster subranges and N+1 more precise subranges: 2 for the infinitely wide subranges outside of the glyph, plus N-1 for the subranges near the shared vertex points of successive diamonds. A specific number of diamonds can be used to allow editing that focuses on an equally distributed set of likely values. For instance, one could use a nine-diamond sequence to edit numbers that can fall anywhere in the range [1.0, 10.0] but that tend to be close to exact integers, which may be a desirable constrained data type for ratings.
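The nine-diamond ratings case can be sketched by accumulating glyph area to the left of the guide line. The layout below (unit-area diamonds starting at x0 = 1, an illustrative choice of ours, not a project specification) maps accumulated area directly onto the [1.0, 10.0] range:

```python
def diamond_area_left(t, a, w, h):
    """Area of a diamond spanning [a, a + w] with peak height h that
    lies to the left of a vertical guide line at x = t."""
    s = min(max((t - a) / w, 0.0), 1.0)   # fraction of the diamond crossed
    if s <= 0.5:
        return h * w * s * s              # growing triangle, first half
    return h * w / 2.0 - h * w * (1.0 - s) ** 2  # symmetric second half

def rating_from_guide(t, n=9, x0=1.0, w=1.0):
    """Map guide position t to a rating in [1.0, 10.0] using a row of
    n unit-area diamonds starting at x0."""
    h = 2.0 / w   # area of each diamond = w * h / 2 = 1
    area = sum(diamond_area_left(t, x0 + i * w, w, h) for i in range(n))
    return 1.0 + area

print(rating_from_guide(0.0))   # 1.0 (guide left of the glyph)
print(rating_from_guide(1.5))   # 1.5 (center of the first diamond)
print(rating_from_guide(99.0))  # 10.0 (guide right of the glyph)
```

A convenient side effect of this layout is that the guide position equals the rating at every shared vertex, so integer ratings fall at integer guide positions.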

The above glyphs with sequences of 2, 4, and 9 diamonds are all examples of designing with identical copies of the same finite shape. Sequences can also involve heterogeneous shapes. For instance, designs can mix circles, asters, and diamonds (as well as other shapes with different height profiles) to customize how speed and precision vary in particular subranges. Using asters instead of diamonds, for example, would allow for easier entry of data values near more precise "focal" values. Using circles would have the opposite effect.

**Glyph = Compound (Widening Cones), Guide = Vertical Line**

**Glyph = Compound (Narrowing Cones), Guide = Vertical Line**

Sequences can also mix in infinite shapes, as in the stretched diamonds example, to customize distribution profiles with infinite extent above or below. The individual glyphs in a sequence can also have differing characteristics. For instance, one can vary the width and/or height of several diamonds in a sequence, thereby shifting where and how tightly focal data values fall within the bounds of the distribution.

Multiple cones can be similarly stacked to fine-tune a distribution. The figures above show glyphs for a set of four widening cones and a set of four narrowing cones, respectively. The corresponding distribution graphs have the same overall shape but differ strongly in the details. Complex, ad hoc gestures like these are unlikely to find practical use, but they demonstrate a capability to customize highly specialized glyphs for unique applications.

**Glyph = Compound (Sculpted Mix), Guide = Vertical Line**

In the most general case, glyph geometries can be sculpted using a collection of mixed geometries. In this example, for instance, horizontal and diagonal line segments are interleaved to customize the rates of accumulation in successive spans in the data value distribution. The flat segments result in a constant rate of increment both near zero and toward the extremes of the reachable range.

Looking forward, sequences might also include one or more flat segments of **zero** height. These segments serve as *spacers* to provide zones for entering exact data values. If a glyph is finite, the infinite spans beyond the upper and/or lower bounds can be thought of as infinite spacers. Regardless of where they occur, spacers allow easier exact entry by providing a horizontal space of non-zero width in which height is zero, and hence no accumulation of area occurs while dragging the guide line. Consequently, one can quickly move the line anywhere along the spacer to choose the corresponding data value.

Spacers would provide an effective way to support editing of constrained data types that mix discrete and continuous values. In the ratings example above (sequence of diamonds), for instance, the goal might be to facilitate the common case of entering exact integer values, yet allow entry of decimal values in special cases. Adding spacers to the 9-diamond example would accomplish that. The width of the spacers relative to the diamonds could be set to determine how strongly favored (easily entered) integers are compared to decimals. The height of diamonds could be changed to increase or reduce available precision between integers. Other shapes could be used to change the distribution profile between successive integers. For instance, the diamonds could be replaced with squares to make decimal entry a linear function of dragging distance between the spacers.
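The spacer idea can be sketched by treating the glyph as a left-to-right list of segments, where diamonds accumulate area and zero-height spacers accumulate none (segment widths and heights below are illustrative, not from the project):

```python
def segment_area_left(t, a, w, h, kind):
    """Area to the left of the guide at x = t contributed by one glyph
    segment spanning [a, a + w]: a diamond of peak height h, or a
    zero-height 'spacer' that never accumulates area."""
    if kind == "spacer":
        return 0.0
    s = min(max((t - a) / w, 0.0), 1.0)
    if s <= 0.5:
        return h * w * s * s
    return h * w / 2.0 - h * w * (1.0 - s) ** 2

def value_from_guide(t, segments, x0=0.0, v0=1.0):
    """Accumulate area across a mixed glyph; segments is a list of
    (width, height, kind) tuples laid out left to right from x0."""
    value, a = v0, x0
    for w, h, kind in segments:
        value += segment_area_left(t, a, w, h, kind)
        a += w
    return value

# Two unit-area diamonds separated by spacers: the value is pinned to an
# exact integer anywhere inside a spacer.
glyph = [(0.5, 0.0, "spacer"), (1.0, 2.0, "diamond"),
         (0.5, 0.0, "spacer"), (1.0, 2.0, "diamond"),
         (0.5, 0.0, "spacer")]
print(value_from_guide(0.1, glyph))  # 1.0, anywhere in the first spacer
print(value_from_guide(1.9, glyph))  # 2.0, anywhere in the second spacer
```

Widening the spacers relative to the diamonds makes exact integers easier to hit; replacing the diamonds with squares (constant height) would make decimal entry linear in dragging distance, as noted above.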

Gesture design offers numerous geometric relationships to choose from. Surveying the design space is helping us develop principles and guidelines for tailoring combinations of editing geometries to specific data types, and for doing so in ways that are compatible with visual encoding choices.

The examples above progress from the simple to the exotic. This approach serves our purpose of surveying the gesture design space. Moving forward, we will vary and relax the assumptions made in exploring the example sequence above. Potentially important variations to analyze include those in which the guide geometry has more than one interactive degree of freedom, such as to support rotation as well as translation of a guide line, free 2-D movement of a guide circle, or dynamic resizing of a guide circle. We will also explore how data can be edited as a function of geometric relationships other than area of intersection, specifically distance and direction.

It seems likely that focusing on relatively simple geometries will be sufficient to inform the bulk of practical gesture design. Although most data sets have data attributes with constrained types, those types are rarely exotic, and probably almost never have distributions as exotic as those in the more complex examples above. Gestures with simpler geometries also appear likely to be much more learnable, memorable, and performable than complex ones, particularly when used as part of a data editing toolbox in a multiple view visualization design.

One of the practical challenges of gesture design will be reconciling the geometries used to visually encode data with those used in gestures to determine changes to that data. A geometry that works well for visual encoding may work poorly for gesture design, and vice versa. Common shapes like circles, diamonds, and asters are often appropriate and commonly used in visual encoding of data, but that very simplicity offers limited geometric structure for use in calculating interactive changes to data during editing. Conversely, complex geometries can facilitate gesture design for highly specialized data types, but may often be poorly suited for visually encoding data attributes in a readable, scannable way.

Visual encodings in which shape varies as a function of data pose a particular challenge. In such cases one cannot rely on a uniform gesture geometry for data edits. One solution would be to require that gestures vary in parallel with the corresponding visual encodings; there can be multiple visual encodings and gestures in a view, but they must be geometrically compatible from data item to data item. Another solution would be to design gestures that occur within the view but are not geometrically related to individual data items. In the examples above, the glyph geometries would be associated with (and drawn as part of) the entire view rather than individual data items. We revisit this issue in an analysis of practical gesture designs in our interactive Gesture Browser, next.

Gesture Browser (v1)

The Gesture Browser is a cross-platform Java desktop application. On Mac and Linux/Unix systems, run it by double-clicking "gesture-browser" in the "bin" directory. On Windows systems, double-click "gesture-browser.bat". Your system must have a recent version of Java installed (1.8 or newer) for the application to work.

**Download the application** here.

Upon launching, the application will show a single window. An interactive view of sample data is on the left. A gallery of available gestures is on the right. Selecting a gesture in the gallery activates that gesture for use in the view. A summary of how the gesture works is shown below the view.

**Using Gestures**

Gestures allow editing of visualized data in different ways. The idea is to perform a short sequence of simple, familiar interactions, in a particular order, to trigger the desired changes to data values. A key difference from existing data editing user interfaces is that one performs all the interaction directly on the visualized data itself.

The gestures included in the gallery involve two interaction steps. The first step is to select items in the view for editing. The second step is to actually perform the editing by interacting with those items to adjust their data values.

The view shows data items as circles of various sizes. To select items, move and click the mouse inside the view. Clicking will clear any existing selection and select all items that contain the click point. Selected and unselected items appear as circles with blue and red edges, respectively.

To edit items, hold down the key and move the mouse around the view. As one does this, the view displays a visual guide. The appearance of the guide depends on the currently chosen gesture. How the data value of each selected item changes depends on the geometric relationship between the item and the guide. Release the key to stop editing the selected items.

**Trying Out Some Examples**

During editing, the "Circle Chord" example displays a green vertical line as the guide. The horizontal position of the guide sets each selected item's data value to be the area within its circle to the left of the guide's vertical line. Each item's data value is shown as its circle's fill color. This color is calculated by mapping the data value into a black-to-green gradient. Editing happens continuously while the key is down.

The "Circle Chord v2" example uses the same guide and maps areas into data values in the same way as "Circle Chord", but displays each item's data value as circle radius instead of fill color. During editing, each selected item is shown as both the original filled circle and a provisional unfilled circle. Actual modification of data values happens at the end of editing when the key is released.

In the "Circle Circle" example, the guide is a circle. For each selected item, the edited data value is the area of overlap between the item's circle and the guide circle. The data value is shown as fill color like in "Circle Chord". Editing happens continuously while the key is down.

**A Closer Look**

The above gestures use slightly different combinations of line and circle geometries. The utility and usability consequences of these differences are striking, even for such a small, simple, seemingly similar set. Each combination holds promise for specialized editing, but there appear to be many tradeoffs when considering particular editing applications and circumstances of use.

In the "Circle Chord" case, the relationship between the guide line and item circles is such that moving from left to right accumulates circular area sigmoidally rather than linearly. This gives one more fine-grained control over data values near the minimum and maximum than might be achieved with a typical linearly-mapped range slider. (To recreate such a slider for the gallery, one could simplify the "Circle Chord" gesture by using the relative horizontal position of the guide in the circle rather than the resulting area.) The gesture also provides sensible limits on edited values; one can choose the exact minimum and maximum values quickly and easily by moving the guide anywhere entirely left or right, respectively, of the selected circle.
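The sigmoidal accumulation follows from the standard circular-segment (chord) area. A minimal sketch (function and variable names are our own):

```python
import math

def area_left_of_chord(t, cx, r):
    """Area of a circle (center x = cx, radius r) lying to the left of
    a vertical guide line at x = t; grows sigmoidally, not linearly."""
    u = (t - cx) / r
    if u <= -1.0:
        return 0.0                     # guide entirely left of the circle
    if u >= 1.0:
        return math.pi * r * r         # guide entirely right of the circle
    return r * r * (u * math.sqrt(1.0 - u * u) + math.asin(u) + math.pi / 2.0)

r = 1.0
print(area_left_of_chord(0.0, 0.0, r) / (math.pi * r * r))  # 0.5 at center
# Near the circle's edge, equal guide movement sweeps far less area than
# near the center -- the source of fine-grained control at the extremes.
print(area_left_of_chord(-0.8, 0.0, r) / (math.pi * r * r))
```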

In the "Circle Chord v2" case, the resulting data value can range from 0 (when the guide line is horizontally left of the circle) to the existing data value (when the line is horizontally centered on the circle) to twice the current data value (when the line is horizontally right of the circle). This allows one to "visually multiply" a data value in a way that makes sense physically and perceptually: one "moves" linearly (horizontally) to specify the multiplication factor and "looks" quadratically (chorded circle area) to evaluate the multiplication result.
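A sketch of this multiplication mapping, under the reading above that the factor runs from 0x through 1x to 2x as the chorded area grows (names are hypothetical):

```python
import math

def chord_fraction(u):
    """Fraction of a unit circle's area left of a vertical line at
    signed offset u (measured in radii) from the circle center."""
    u = max(-1.0, min(1.0, u))
    return (u * math.sqrt(1.0 - u * u) + math.asin(u) + math.pi / 2.0) / math.pi

def multiply_by_gesture(value, u):
    """'Circle Chord v2'-style edit: the factor runs from 0x (line left
    of the circle) through 1x (centered) to 2x (line right of it)."""
    return value * 2.0 * chord_fraction(u)

print(multiply_by_gesture(10.0, -1.0))  # 0.0
print(multiply_by_gesture(10.0, 0.0))   # 10.0
print(multiply_by_gesture(10.0, 1.0))   # 20.0
```

Because the factor is tied to area rather than raw position, equal horizontal movements near the circle's edges change the factor less than movements near the center.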

Limiting the maximum applied multiplication factor, in this case 2x, allows one to quickly apply a desirable, precise multiplication factor to an existing data value. Similarly, discretizing the allowed movement of guide geometry would allow editing involving ranges of integers and choices within sets. For the gallery, one such gesture could step the guide by even fractions (x1/2, x1/3, etc.) and multiples (x2, x3, etc.) based on non-linear distance left and right of selected circle centers.

Large multiplications might be performed by chaining small edit movements rather than requiring one large edit movement. Chained editing can make it possible to keep guides visible and usable even when screen space or other design considerations compel use of small views. One can start to imagine a direct manipulation "gesture calculator" for data values. A suite of gestures could support arithmetic operators and beyond, including generally useful convenience functions (such as a 10x multiplier) or application-specific functions.

In the "Circle Circle" case, the way that circles overlap provides skewed sensitivity during editing, with very fine-grained editing of data values near zero and increasingly coarse-grained editing of data values as they increase to a maximum (at maximum overlap). A key idea here is that combinations of geometries can be designed to achieve a desired distribution of editing sensitivity. This is useful in applications in which the likelihood of recorded observations is not uniform. As a complication in this case, the edited value is normalized to a range relative to the size of the selected circles, which means that sensitivity increases with item size. Consequently, one is likely to be much faster and more accurate when editing large data values. This may or may not be beneficial depending on the application.
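The overlap computation is the standard two-circle lens area. The sketch below (equal radii assumed for simplicity; names are our own) shows the skewed sensitivity: equal steps of the guide circle sweep little area near zero overlap and progressively more toward maximum overlap:

```python
import math

def lens_area(d, r, R):
    """Area of intersection of two circles of radii r and R whose
    centers are a distance d apart (standard lens-area formula)."""
    if d >= r + R:
        return 0.0                       # disjoint circles
    if d <= abs(R - r):
        m = min(r, R)
        return math.pi * m * m           # one circle inside the other
    a = r * r * math.acos((d * d + r * r - R * R) / (2.0 * d * r))
    b = R * R * math.acos((d * d + R * R - r * r) / (2.0 * d * R))
    c = 0.5 * math.sqrt((-d + r + R) * (d + r - R) *
                        (d - r + R) * (d + r + R))
    return a + b - c

# Equal inward steps of the guide circle: small area changes near
# tangency, larger ones approaching full overlap.
r = 1.0
for d in (1.9, 1.5, 1.0, 0.5, 0.0):
    print(d, lens_area(d, r, r) / (math.pi * r * r))
```

Normalizing by the item circle's area (as in the printed values) is what makes sensitivity scale with item size in this gesture.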

The three gestures differ in apparent usability when more than one item is selected. In each example, the sample data shown in the view provide some opportunities to select multiple items with a single click. When multiple items are selected, the guide and its geometry can have a different relationship to each item's geometry. For instance, the vertical line guide in the "Circle Chord" case might be in the middle of one selected item circle and entirely to the right of another. Although each item correctly shows its own corresponding data edit as the gesture proceeds, a problem arises with user attention. The user's ability to execute and evaluate data edits appears to require focusing on an individual item's change, at least when items visually respond to gesturing very differently. For instance, intentionally editing multiple items in the "Circle Chord v2" case seems much harder than in the "Circle Circle" case. We suspect that, for many editing gestures, the individuality of geometric relationships between multiple visually encoded data items and a single, unified visual guide in the same coordinate system is a fundamental barrier to editing items in groups.

One solution could be to allow gesturing in a view, but ignore its location relative to data. This solution is unsatisfying because it turns the gesture into a coincidental overlay that could have been performed anywhere else. Another solution would be to design guides that include a visual component for each selected item. Gesture movement within the view would still be disjoint from the visualized coordinate system, but each guide component would have a sensible geometry and location relative to its corresponding data item. For more than a few selected items, overplotting problems appear likely both within the guide itself and between it and the data. Visual density effects are likely important to consider during design even for singleton guides, especially for already dense and/or overplotted data representations.

Looking forward, one can imagine gesture design that allows not only the positioning of guides, but also changes to the internal geometry of the guides in response to interaction. For example, the radius of the guide circle is fixed in the "Circle Circle" gesture. A simple variation of the design would allow one to change the guide circle radius dynamically using the scroll wheel. This additional factor adds enormous complexity to the analysis of reachable data values. Allowing user adjustment of guide circle radius would affect scale, bounds, discreteness, and non-linearity during editing, particularly for very large or small ratios of guide circle size to item circle size. We predict that in many cases, dynamic changes to guide geometry will be too difficult for users to interpret and hence use effectively. For now we are focusing on simple guides with static internal geometries to grow our understanding of gesture dynamics.

The first version of the gallery focuses on gestures that follow a two-step select-then-adjust sequence. It includes several of the simpler geometric combinations that we've explored so far. The more exotic geometries that we've considered are unlikely candidates for use as visualization glyphs in practice.

Despite including only a few simple geometry combinations in our analysis so far, the gesture design space has turned out to be vastly more complex, interesting, and promising to explore than we anticipated. The visual representation of data items and guides, particularly their geometries, strongly affects which data values can be reached and how. Geometric arithmetic on the visual encodings of items and guides can be used to flexibly integrate *scale, bounds, discreteness, and non-linearity* into customized gestures with an eye toward both general-purpose and application-specific data editing requirements.

Student Opportunities

This five-year project involves collaboration with digital humanities scholars and educators from OU History of Science, OU Library & Information Science, and the Stanford Humanities Center. Work on the project includes data collection, software implementation, visualization development, and technical support for the project-related research activities of our collaborators, as well as assistance with visualization design, usability evaluation, educational outreach, and technical support for their project-related education activities.

*I currently have funding for two GRAs (preferably CS PhDs) to work on this project. Please contact me if you are interested!*

Students considering a PhD at OU are encouraged to inquire about these positions. Inquiries from graduate students with relevant experience outside of CS are also welcome.

I often advise graduate and undergraduate students in the areas of information visualization, visual analytics, databases, human-computer interaction, and the digital humanities. There are many possibilities for Master's theses, independent study, and undergraduate honors research within the scope of this project.

OU students in ANY field interested in participating in this project in a less formal way, such as beta testing of the software as it develops, are welcome to contact me. Although the project itself targets humanities learning and research, the software and other products of the project will be general-purpose.

Press

2014.08.27 - The Oklahoma Daily (OU student newspaper; article):

“Data manipulation may soon be easy”


2014.08.12 - The Norman Transcript (press release):

“OU professor awarded National Science Foundation CAREER grant”


2014.08.12 - OU Public Affairs (press release):

“OU Professor Awarded National Science Foundation CAREER Grant to Create Visualization Tools for the Digital Humanities”


Supported Publications and Presentations

*Note: The links to PDFs below do not currently work due to temporary technical problems. Please see the corresponding entries here to access the PDFs.*