Publications

FXPAL publishes in top scientific conferences and journals.

2003
Publication Details
  • Proc. IEEE Intl. Conf. on Image Processing
  • Sep 14, 2003

Abstract

This paper presents a video acquisition system that can learn automatic video capture from human camera operations. Unlike a predefined camera control system, this system can easily adapt to changes in its environment with users' help. By collecting users' camera-control operations under various environments, the control system can learn video capture from humans and use these learned skills to operate its cameras when remote viewers don't, won't, or can't operate the system. Moreover, this system allows remote viewers to control their own virtual cameras instead of watching the same video produced by a human operator or a fully automatic system. The online learning algorithm and the camera management algorithm are demonstrated using field data.
Publication Details
  • Bioinformatics
  • Sep 10, 2003

Abstract

Motivation: EST data reflects variation in gene expression, but previous methods for finding coexpressed genes in EST data are subject to bias and vastly overstate the statistical significance of putatively coexpressed genes. Results: We introduce a new method (LNP) that reports reasonable $p$-values and also detects more biological relationships in human dbEST than do previous methods. In simulations with human dbEST library sizes, previous methods report $p$-values as low as $10^{-30}$ on 1/1,000 uncorrelated pairs, while LNP reports significance correctly. We validate the analysis on real human genes by comparing coexpressed pairs to GO annotations and find that LNP is more sensitive than three previous methods. We also find a small but statistically significant level of coexpression between interacting proteins relative to randomized controls. The LNP method is based on a log-normal prior on the distribution of expression levels. Availability: Source code in Java or R is available at http://ests.sourceforge.net/
Publication Details
  • SPIE Information Technologies and Communications
  • Sep 9, 2003

Abstract

Hypervideo is a form of interactive video that allows users to follow links to other video. A simple form of hypervideo, called "detail-on-demand video," provides at most one link from one segment of video to another, supporting a single-button interaction. Detail-on-demand video is well suited for interactive video summaries, because the user can request a more detailed summary while watching the video. Users interact with the video through a special hypervideo player that displays keyframes with labels indicating when a link is available. While detail-on-demand summaries can be manually authored, doing so is time-consuming. To address this issue, we developed an algorithm to automatically generate multi-level hypervideo summaries. The highest level of the summary consists of the most important clip from each take or scene in the video. At each subsequent level, more clips from each take or scene are added in order of their importance. We give one example in which a hypervideo summary is created for a linear training video. We also show how the algorithm can be modified to produce a hypervideo summary for home video.

Multimedia Fliers: Informal Information Sharing With Digital Community Bulletin Boards

Publication Details
  • Communities and Technologies, Amsterdam, The Netherlands, September 2003
  • Sep 5, 2003

Abstract

Community poster boards serve an important community building function. Posted fliers advertise services, events and people's interests, and invite community members to communicate, participate, interact and transact. In this paper we describe the design, development and deployment of the Plasma Poster Network, a network of large screen, digital community poster boards, the Plasma Posters. An initial deployment of Plasma Posters is within our own organization, a software research community made up of technologists and designers. We present our motivation and two fieldwork studies of online and offline information sharing before describing the Plasma Posters and the underlying information storage and distribution infrastructure. Finally, we summarize findings from qualitative and quantitative evaluations of Plasma Poster usage and conclude by elaborating on socio-technical challenges that have been faced in the design and deployment of the Plasma Poster Network.

The Plasma Poster Network: Posting Multimedia Content in Public Places

Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 599-606
  • Sep 1, 2003

Abstract

Much effort has been expended in creating online information resources to foster social networks, create synergies between collocated and remote colleagues, and enhance social capital within organizations. Following the observation that physical bulletin boards serve an important community building and maintenance function, in this paper we describe a network of large screen, digital bulletin boards, the Plasma Poster Network. The function of this system is to bridge the gap between online community interactions and shared physical spaces. We describe our motivation, a fieldwork study of information sharing practices within our organization, and an internal deployment of Plasma Posters.

Weaving Between Online and Offline Community Participation

Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 729-732
  • Sep 1, 2003

Abstract

Much effort has been expended in creating online spaces for people to meet, network, share and organize. However, there is relatively little work, in comparison, that has addressed creating awareness of online community activities for those gathered together physically. We describe our efforts to advertise the online community spaces of CHIplace and CSCWplace using large screen, interactive bulletin boards that show online community information mixed with content generated at the conference itself. Our intention was to raise awareness of the online virtual community within the offline, face-to-face event. We describe the two deployments, at CHI 2002 and at CSCW 2002, and provide utilization data regarding people's participation within the physical and virtual locales.
Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 33-40
  • Sep 1, 2003

Abstract

To simplify the process of editing interactive video, we developed the concept of "detail-on-demand" video as a subset of general hypervideo where a single button press reveals additional information about the current video sequence. Detail-on-demand video keeps the authoring and viewing interfaces relatively simple while supporting a wide range of interactive video applications. Our editor, Hyper-Hitchcock, builds on prior work on automatic analysis to find the best quality video clips. It introduces video composites as an abstraction for grouping and manipulating sets of video clips. Navigational links can be created between any two video clips or composites. Such links offer a variety of return behaviors for when the linked video is completed that can be tailored to different materials. Initial impressions from a pilot study indicate that Hyper-Hitchcock is easy to learn although the behavior of links is not immediately intuitive for all users.
Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 196-203
  • Sep 1, 2003

Abstract

With digital still cameras, users can easily collect thousands of photos. Our goal is to make organizing and browsing photos simple and quick, while retaining scalability to large collections. To that end, we created a photo management application concentrating on areas that improve the overall experience without neglecting the mundane components of such an application. Our application automatically divides photos into meaningful events such as birthdays or trips. Several user interaction mechanisms enhance the user experience when organizing photos. Our application combines a light table for showing thumbnails of the entire photo collection with a tree view that supports navigating, sorting, and filtering photos by categories such as dates, events, people, and locations. A calendar view visualizes photos over time and allows for the quick assignment of dates to scanned photos. We fine-tuned our application by using it with large personal photo collections provided by several users.
Publication Details
  • Proceedings of INTERACT '03, pp. 583-590.
  • Sep 1, 2003

Abstract

In a meeting room environment with multiple public wall displays and personal notebook computers, it is possible to design a highly interactive experience for manipulating and annotating slides. For the public displays, we present the ModSlideShow system with a discrete modular model for linking the displays into groups, along with a gestural interface for manipulating the flow of slides within a display group. For the applications on personal devices, an augmented reality widget with panoramic video supports interaction among the various displays. This widget is integrated into our NoteLook 3.0 application for annotating, capturing and beaming slides on pen-based notebook computers.
Publication Details
  • Proceedings of Hypertext '03, pp. 124-125
  • Aug 26, 2003

Abstract

Existing hypertext systems have emphasized either the navigational or spatial expression of relationships between objects. We are exploring the combination of these modes of expression in Hyper-Hitchcock, a hypervideo editor. Hyper-Hitchcock supports a form of hypervideo called "detail-on-demand video" due to its applicability to situations where viewers need to take a link to view more details on the content currently being presented. Authors of detail-on-demand video select, group, and spatially arrange video clips into linear sequences in a two-dimensional workspace. Hyper-Hitchcock uses a simple spatial parser to determine the temporal order of selected video clips. Authors add navigational links between the elements in those sequences. This combination of navigational and spatial hypertext modes of expression separates the clip sequence from the navigational structure of the hypervideo. Such a combination can be useful in cases where multiple forms of inter-object relationships must be expressed on the same content.

Identifying Useful Passages in Documents based on Annotation Patterns.

Publication Details
  • 7th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2003) Trondheim, Norway, August 17-22, 2003
  • Aug 17, 2003

Abstract

Many readers annotate passages that are important to their work. If we understand the relationship between the types of marks on a passage and the passage's ultimate utility in a task, then we can design e-book software to facilitate access to the most important annotated parts of the documents. To investigate this hypothesis and to guide software design, we have analyzed annotations collected during an earlier study of law students reading printed case law and writing Moot Court briefs. This study has allowed us to characterize the relationship between the students' annotations and the citations they use in their final written briefs. We think of annotations that relate directly to the written brief as high-value annotations; these annotations have particular, detectable characteristics. Based on this study we have designed a mark parser that analyzes freeform digital ink to identify such high-value annotations.

Discourse Structure and Sentential Information Structure: An Initial Proposal

Publication Details
  • Journal of Logic, Language and Information, Kluwer Academic Publishers, Dordrecht, The Netherlands
  • Aug 15, 2003

Abstract

In this article we argue that discourse structure constrains the set of possible constituents in a discourse that can provide the relevant context for structuring information in a target sentence, while information structure critically constrains discourse structure ambiguity. For the speaker, the discourse structure provides a set of possible contexts for continuation while information structure assignment is independent of discourse structure. For the hearer, the information structure of a sentence together with discourse structure instructs dynamic semantics how rhematic information should be used to update the meaning representation of the discourse (Polanyi and van den Berg, 1996).
Publication Details
  • IEEE International Conference on Multimedia and Expo, v. I, pp. 221-224
  • Jul 7, 2003

Abstract

A novel method is presented for inaudibly hiding information in an audio signal by subtly applying time-scale modification to segments of the signal. The sequence, duration, and degree of the time-scale modifications are the parameters which encode information in the altered signal. By comparing the altered signal with a reference copy, compressed and expanded regions can be identified and the hidden data recovered. This approach is novel and has several advantages over other methods: it is theoretically noiseless, it introduces no spectral distortion, and it is robust to all known methods of reproduction, compression, and transmission.
Publication Details
  • IEEE International Conference on Multimedia and Expo, v. II, pp. 77-80
  • Jul 7, 2003

Abstract

We created an improved layout algorithm for automatically generating visual video summaries reminiscent of comic book pages. The summaries are comprised of images from the video that are sized according to their importance. The algorithm performs a global optimization with respect to a layout cost function that encompasses features such as the number of resized images and the amount of whitespace in the presentation. The algorithm creates summaries that: always fit exactly into the requested area, are varied by containing few rows with images of the same size, and have little whitespace at the end of the last row. The layout algorithm is fast enough to allow the interactive resizing of the summaries and the subsequent generation of a new layout.
Publication Details
  • IEEE International Conference on Multimedia and Expo, v. II, pp. 753-756
  • Jul 7, 2003

Abstract

We created an alternative approach to existing video summaries that gives viewers control over the summaries by selecting hyperlinks to other video with additional information. We structure such summaries as "detail-on-demand" video, a subset of general hypervideo in which at most one link to another video sequence is available at any given time. Our editor for such video, Hyper-Hitchcock, provides a workspace in which an author can select and arrange video clips, generate composites from clips and from other composites, and place links between composites. To simplify dealing with a large number of clips, Hyper-Hitchcock generates iconic representations for composites that can be used to manipulate the composite as a whole. In addition to providing an authoring environment, Hyper-Hitchcock can automatically generate multi-level hypervideo summaries for immediate use or as the starting point for author modification.
Publication Details
  • 2003 International Conference on Multimedia and Expo
  • Jul 6, 2003

Abstract

This paper presents an information-driven audiovisual signal acquisition approach. This approach has several advantages: users are encouraged to assist in signal acquisition; available sensors are managed based on both signal characteristics and users' suggestions. The problem formulation is consistent with many well-known empirical approaches widely used in previous systems and may provide analytical explanations to these approaches. We demonstrate the use of this approach to pan/tilt/zoom (PTZ) camera management with field data.
Publication Details
  • HCI International 2003
  • Jun 22, 2003

Abstract

A basic objective of ubiquitous computing research is ubiquitous information: the ability to utilize any content or service, using devices that are always at hand, over networks that don't tie us down. Although much progress has been made, the ideal remains elusive. This paper reflects on the interrelations among three dimensions of ubiquitous information: content, devices, and networks. We use our understanding of these dimensions to motivate our own attempt to create a ubiquitous information system by combining unlimited World Wide Web content with mobile phones and mobile phone networks. We briefly describe a middleware proxy system we developed to increase the usefulness of very small devices as Internet terminals. We conclude with a post-mortem analysis highlighting lessons learned for others interested in information systems for very small devices.
Publication Details
  • HCI International 2003
  • Jun 22, 2003

Abstract

Everywhere we go, we are surrounded by shared devices: TVs, stereos, and appliances in the home; copiers, fax machines, and projectors in the office; phones and vending machines in public. Because these devices don't know who we are, they provide the same user interface and functionality to everyone. This paper describes a system for personalizing workplace document devices (projectors, public displays, and multi-function copiers) that has been in use for over two years in our organization. We compare user interfaces that are embedded (i.e., integrated or co-located with the shared device) versus portable (i.e., accessible via portable devices such as mobile phones or PDAs). We summarize lessons learned for others designing interfaces for shared ubiquitous devices.
Publication Details
  • Business Process Management Journal, Volume 9, Number 3, 2003, pages 337-353
  • Jun 9, 2003

Abstract

Purveyors of knowledge management software have a disconcerting tendency to promote the myth that all problems may be solved by more powerful tools for the exchange of information in the workplace. This fallacy is based on the faulty assumption that knowledge management is about the management of knowledge (as if knowledge were a commodity that could be managed), as opposed to the management of people whose work depends critically on what they know. The origins of knowledge management are far more firmly rooted in the psychological legacy of organizational communication than they are in the technological legacy of information management systems. However, even organizational communication is an inadequate foundation, since various schools of thought in social theory, particularly the structuration theory of Anthony Giddens, inform us that interaction (in the workplace or in any other social setting) is not strictly limited to communication. Knowledge management thus requires moving beyond simplistic models of information exchange to more challenging problems of leveraging social interaction to the advantage of the enterprise. This paper focuses on the claim of structuration theory that the dimension of communication should be supplemented with additional dimensions of power and sanction. This perspective is then examined in light of a case study of crisis management practices, and the case study provides a basis for addressing implications for technological support.

Agent Supported Cooperative Work.

Publication Details
  • Mass, USA: Kluwer Academic Publishers, 2003
  • Jun 1, 2003

Abstract

This is a volume edited by Ye and Churchill. The chapters detail the design of agent-based technologies in service of collaborative and cooperative work practices.

AttrActive Windows: Dynamic Windows for Digital Bulletin Boards

Publication Details
  • CHI 2003
  • Apr 7, 2003

Abstract

In this paper we describe AttrActive Windows, a novel interface for presenting live, interactive, multimedia content on a network of public, digital bulletin boards. Implementing a paper flyer metaphor, AttrActive Windows are paper-like in appearance and are attached to a virtual corkboard by virtual pushpins. Windows can therefore appear in different orientations, creating an attractive, informal look. AttrActive Windows can also have autonomous behaviors that are consistent with the corkboard metaphor, like fluttering in the wind. We describe the AttrActive Windows prototype, and offer the results of an initial evaluative user study.
Publication Details
  • CHI 2003
  • Apr 7, 2003

Abstract

Shared freeform input is a technique for facilitating note taking across devices during a meeting. Laptop users enter text with a keyboard, whereas PDA and Tablet PC users input freeform ink with their stylus. Users can quickly reuse text and freeform ink already entered by others. We show how a new technique, freeform pasting, allowed us to deal with a variety of design issues such as quick and informal ink sharing, screen real estate, privacy and mixing ink-based and textual material.
Publication Details
  • Proc. SPIE Storage and Retrieval for Multimedia Databases, Vol. 5021, pp. 167-75
  • Jan 20, 2003

Abstract

We present a framework for analyzing the structure of digital media streams. Though our methods work for video, text, and audio, we concentrate on detecting the structure of digital music files. In the first step, spectral data is used to construct a similarity matrix calculated from inter-frame spectral similarity. The digital audio can be robustly segmented by correlating a kernel along the diagonal of the similarity matrix. Once segmented, spectral statistics of each segment are computed. In the second step, segments are clustered based on the self-similarity of their statistics. This reveals the structure of the digital music in a set of segment boundaries and labels. Finally, the music can be summarized by selecting clusters with repeated segments throughout the piece. The summaries can be customized for various applications based on the structure of the original music.
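
As a rough illustration of the segmentation step described in this abstract, the sketch below correlates a kernel along the diagonal of a similarity matrix to obtain a boundary score per frame. The abstract does not specify the kernel; the checkerboard kernel used here is an assumption, and the function names `checkerboard` and `novelty` are hypothetical, not taken from the paper.

```python
def checkerboard(half):
    """Build a (2*half x 2*half) kernel: +1 in the two same-side
    quadrants, -1 in the two cross quadrants (an assumed kernel shape)."""
    size = 2 * half
    return [[1 if (i < half) == (j < half) else -1 for j in range(size)]
            for i in range(size)]


def novelty(S, half):
    """Correlate the kernel along the main diagonal of the square
    similarity matrix S. scores[c] is high when frames before index c
    are similar to each other, frames from c onward are similar to each
    other, and the two groups differ. Positions too close to either end
    for a full kernel are left at 0."""
    n = len(S)
    K = checkerboard(half)
    scores = [0] * n
    for c in range(half, n - half + 1):
        total = 0
        for i in range(2 * half):
            for j in range(2 * half):
                total += K[i][j] * S[c - half + i][c - half + j]
        scores[c] = total
    return scores


# Toy similarity matrix: frames 0-2 form one segment, frames 3-5 another.
groups = [0, 0, 0, 1, 1, 1]
S = [[1 if a == b else 0 for b in groups] for a in groups]
scores = novelty(S, half=2)
boundary = scores.index(max(scores))  # frame index where the new segment starts
```

In this toy case the score peaks at the first frame of the second segment. A real system would derive `S` from spectral features and pick several peaks, which this sketch does not attempt.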

AttrActive Windows: Active Windows for Pervasive Computing Applications

Publication Details
  • ACM Intelligent User Interface (IUI) 2003, Miami Beach, FL, pp 326
  • Jan 12, 2003

Abstract

We introduce the AttrActive Windows user interface, a novel approach for presenting interactive content on large screen, interactive, digital, bulletin boards. Moving away from the desktop metaphor, AttrActive Windows are dynamic, non-uniform windows that can appear in different orientations and have autonomous behaviours to attract passers-by and invite interactions.
2002
Publication Details
  • IEEE Multimedia Signal Processing Workshop
  • Dec 11, 2002

Abstract

We present a novel approach to automatically extracting summary excerpts from audio and video. Our approach is to maximize the average similarity between the excerpt and the source. We first calculate a similarity matrix by comparing each pair of time samples using a quantitative similarity measure. To determine the segment with highest average similarity, we maximize the summation of the self-similarity matrix over the support of the segment. To select multiple excerpts while avoiding redundancy, we compute the non-negative matrix factorization (NMF) of the similarity matrix into its essential structural components. We then build a summary comprised of excerpts from the main components, selecting the excerpts for maximum average similarity within each component. Variations integrating segmentation and other information are also discussed, and experimental results are presented.
Publication Details
  • ACM Multimedia 2002
  • Dec 1, 2002

Abstract

We present methods for automatic and semi-automatic creation of music videos, given an arbitrary audio soundtrack and source video. Significant audio changes are automatically detected; similarly, the source video is automatically segmented and analyzed for suitability based on camera motion and exposure. Video with excessive camera motion or poor contrast is penalized with a high unsuitability score, and is more likely to be discarded in the final edit. High quality video clips are then automatically selected and aligned in time with significant audio changes. Video clips are adjusted to match the audio segments by selecting the most suitable region of the desired length. Besides a fully automated solution, our system can also start with clips manually selected and ordered using a graphical interface. The video is then created by truncating the selected clips (preserving the high quality portions) to produce a video digest that is synchronized with the soundtrack music, thus enhancing the impact of both.
Publication Details
  • ACM Multimedia 2002
  • Dec 1, 2002

Abstract

FlySPEC is a video camera system designed for real-time remote operation. A hybrid design combines the high resolution possible using an optomechanical video camera with the wide field of view always available from a panoramic camera. The control system integrates requests from multiple users with the result that each controls a virtual camera. The control system seamlessly integrates manual and fully automatic control. It supports a range of options from untended automatic to full manual control, and the system can learn control strategies from user requests. Additionally, the panoramic view is always available for an intuitive interface, and objects are never out of view regardless of the zoom factor. We present the system architecture, an information-theoretic approach to combining panoramic and zoomed images to optimally satisfy user requests, and experimental results that show the FlySPEC system significantly assists users in remote inspection tasks.
Publication Details
  • ACM 2002 Conference on Computer Supported Cooperative Work
  • Nov 16, 2002

Abstract

Technology can play an important role in enabling people to interact with each other. The Web is one such technology with the affordances for sharing information and for connecting people to people. In this paper, we describe the design of two social interaction Web sites for two different social groups. We review several related efforts to provide principles for creating social interaction environments and describe the specific principles that guided our design. To examine the effectiveness of the two sites, we analyze the usage data. Finally, we discuss approaches for encouraging participation and lessons learned.

Moving Markup: Repositioning Freeform Annotations

Publication Details
  • Proceedings of ACM UIST 2002
  • Oct 27, 2002

Abstract

Freeform digital ink annotation allows readers to interact with documents in an intuitive and familiar manner. Such marks are easy to manage on static documents, and provide a familiar annotation experience. In this paper, we describe an implementation of a freeform annotation system that accommodates dynamic document layout. The algorithm preserves the correct position of annotations when documents are viewed with different fonts or font sizes, with different aspect ratios, or on different devices. We explore a range of heuristics and algorithms required to handle common types of annotation, and conclude with a discussion of possible extensions to handle special kinds of annotations and changes to documents.
Publication Details
  • IEEE InfoVis '02 Interactive Poster and Demo
  • Oct 27, 2002

Abstract

This work presents constructs called interactive space-time maps along with an application called the SpaceTime Browser for visualizing and retrieving documents. A 3D visualization with 2D planar maps and a time line is employed. Users can select regions on the maps and choose precise time intervals by sliding the maps along the telescopic time line. Regions are highlighted to indicate the presence of documents with matching space-time attributes, and documents are retrieved and displayed in an adjoining workspace. We provide two examples: (1) organizing travel photos, (2) managing documents created by room location-aware devices in a building.

Context-Aware Communication

Publication Details
  • IEEE Wireless Communications Magazine, Vol. 9, No. 5.
  • Oct 15, 2002

Abstract

This paper describes how the changing information about an individual's location, environment, and social situation can be used to initiate and facilitate people's interactions with one another, individually and in groups. Context-aware communication is contrasted with other forms of context-aware computing and we characterize applications in terms of design decisions along two dimensions: the extent of autonomy in context sensing and the extent of autonomy in communication action. A number of context-aware communication applications from the research literature are presented in five application categories. Finally, a number of issues related to the design of context-aware communication applications are presented.

Web Interaction Using Very Small Internet Devices

Publication Details
  • IEEE Computer Magazine, Cover Feature, Vol. 35, No. 10.
  • Oct 15, 2002

Abstract

Squeezing desktop Web content into smart phones and text pagers is more practical with separate interfaces for navigation and content manipulation. m-Links, a middleware proxy system, supports this dual-mode browsing, offering phonetop users an extendable set of actions.
Publication Details
  • 2002 International Symposium on Music Information Retrieval
  • Oct 13, 2002

Abstract

We present methods for automatically producing summary excerpts or thumbnails of music. To find the most representative excerpt, we maximize the average segment similarity to the entire work. After window-based audio parameterization, a quantitative similarity measure is calculated between every pair of windows, and the results are embedded in a 2-D similarity matrix. Summing the similarity matrix over the support of a segment results in a measure of how similar that segment is to the whole. This measure is maximized to find the segment that best represents the entire work. We discuss variations on the method, and present experimental results for orchestral music, popular songs, and jazz. These results demonstrate that the method finds significantly representative excerpts, using very few assumptions about the source audio.
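
The selection rule in this abstract can be sketched in a few lines: build a pairwise similarity matrix, then pick the contiguous segment whose rows have the highest average similarity to the whole work. This is a minimal sketch, not the authors' implementation. Cosine similarity, the toy feature vectors, and the names `cosine`, `similarity_matrix`, and `best_excerpt` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(frames):
    """Pairwise similarity between every pair of analysis windows."""
    return [[cosine(u, v) for v in frames] for u in frames]


def best_excerpt(frames, length):
    """Return (start, score) for the run of `length` consecutive windows
    whose average similarity to all windows is largest. Ties go to the
    earliest start."""
    n = len(frames)
    if not 0 < length <= n:
        raise ValueError("length must be between 1 and the number of frames")
    S = similarity_matrix(frames)
    row_means = [sum(row) / n for row in S]  # similarity of each window to the whole
    best_start, best_score = 0, float("-inf")
    for start in range(n - length + 1):
        score = sum(row_means[start:start + length]) / length
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy input: the middle three windows share a feature direction.
frames = [[0, 1], [1, 0], [1, 0], [1, 0], [0, 1]]
start, score = best_excerpt(frames, length=3)
```

Here the middle run of three windows is selected because it is the most similar, on average, to the whole sequence. Real audio would first need window-based parameterization, which is outside the scope of this sketch.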

Audio Retrieval by Rhythmic Similarity

Publication Details
  • 2002 International Symposium on Music Information Retrieval
  • Oct 13, 2002

Abstract

We present a method for characterizing both the rhythm and tempo of music. We also present ways to quantitatively measure the rhythmic similarity between two or more works of music. This allows rhythmically similar works to be retrieved from a large collection. A related application is to sequence music by rhythmic similarity, thus providing an automatic "disc jockey" function for musical libraries. Besides specific analysis and retrieval methods, we present small-scale experiments that demonstrate ranking and retrieving musical audio by rhythmic similarity.
Publication Details
  • The 4th International Conference on Ubiquitous Computing (UbiComp 2002).
  • Sep 29, 2002

Abstract

As ubiquitous computing becomes widespread, we are increasingly coming into contact with "shared" computer-enhanced devices, such as cars, televisions, and photocopiers. Our interest is in identifying general issues in personalizing such shared everyday devices. Our approach is to compare alternative personalization methods by deploying and using alternative personalization interfaces (portable and embedded) for three shared devices in our workplace (a presentation PC, a plasma display for brainstorming, and a multi-function copier). This paper presents the comparative prototyping methodology we employed, the experimental system we deployed, observations and feedback from use, and resulting issues in designing personalized shared ubiquitous devices.
Publication Details
  • Workshop on User centered Evaluations for Ubiquitous Computing Systems: Best Known Methods, The 4th International Conference on Ubiquitous Computing (UbiComp 2002).
  • Sep 29, 2002

Abstract

Evaluating ubiquitous systems is hard, and has attracted the attention of others in the research community. These investigators, like others in CSCW, argue there is a basic mismatch between traditional evaluation techniques and the needs posed by ubiquitous systems. Namely, these systems are embedded in a variety of complex real world environments that cannot be easily modeled (as required by theoretical analyses), simulated, measured, or controlled (as required by laboratory experiments). As a result, many investigators have abandoned traditional comparative evaluation techniques and opted instead for techniques adapted from the social sciences, such as anthropology. We wanted to perform a comparative evaluation similar to a laboratory experiment, but in such a way that we could observe the effects of our design decisions in relatively unconstrained, real world use. This led us to the process described in this paper.

Low-Resolution Supplementary Tactile Cues for Navigational Assistance

Publication Details
  • In Proceedings of Mobile HCI 2002 (Pisa, Italy, 2002), Springer-Verlag, Lecture Notes in Computer Science #2411, pp. 369-372.
  • Sep 18, 2002

Abstract

The TactGuide is a mobile navigation device 'displaying' personalized direction cues by means of a tactile and 'tactful' representation. The TactGuide is operated by tactile inspection which is subtle enough to allow the users to engage/disengage in device interaction while preserving their visual, auditory and kinesthetic senses for inspection of the environment. The TactGuide design thereby accommodates the users' need to economize their attentional resources between device and environment while navigating through physical space. Preliminary experiments indicate that users readily map the tactile cues to spatial directions and that TactGuide can be operated as a supplement to, and without compromising, the use of our existing wayfinding abilities.
Publication Details
  • Journal of Mathematical Physics, September 2002 special issue on Quantum Information Theory, Vol. 43 (9), pp. 4376-4381.
  • Sep 7, 2002

Abstract


To implement any quantum operation (a.k.a. "superoperator" or "CP map") on a d-dimensional quantum system, it is enough to apply a suitable overall unitary transformation to the system and a d^2-dimensional environment which is initialized in a fixed pure state. It has been suggested that a d-dimensional environment might be enough if we could initialize the environment in a mixed state of our choosing. In this note we show with elementary means that certain explicit quantum operations cannot be realized in this way. Our counterexamples map some pure states to pure states, giving strong and easily manageable conditions on the overall unitary transformation. Everything works in the more general setting of quantum operations from d-dimensional to d'-dimensional spaces, so we place our counterexamples within this more general framework.

Publication Details
  • Proceedings IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, August 2002
  • Aug 26, 2002

Abstract

We present a method for rapidly and robustly extracting audio excerpts without the overhead of speech recognition or speaker segmentation. An immediate application is to automatically augment keyframe-based video summaries with informative audio excerpts associated with the video segments represented by the keyframes. Short audio clips combined with keyframes comprise an extremely lightweight and Web-browsable interface for auditioning video or similar media, without using bandwidth-intensive streaming video or audio.
Publication Details
  • IEEE International Conference on Multimedia and Expo 2002
  • Aug 26, 2002

Abstract

This paper presents a camera system called FlySPEC. In contrast to a traditional camera system that provides the same video stream to every user, FlySPEC can simultaneously serve different video-viewing requests. This flexibility allows users to conveniently participate in a seminar or meeting at their own pace. Meanwhile, the FlySPEC system provides a seamless blend of manual control and automation. With this control mix, users can easily make tradeoffs between video capture effort and video quality. The FlySPEC camera is constructed by installing a set of Pan/Tilt/Zoom (PTZ) cameras near a high-resolution panoramic camera. While the panoramic camera provides the basic functionality of serving different viewing requests, the PTZ camera is managed by our algorithm to improve the overall video quality that may affect users watching details. The video resolution improvements from using different camera management strategies are compared in the experimental section.

Detecting Path Intersections in Panoramic Video

Publication Details
  • IEEE International Conference on Multimedia and Expo 2002
  • Aug 26, 2002

Abstract

Given panoramic video taken along a self-intersecting path, we present a method for detecting the intersection points. This allows "virtual tours" to be synthesized by splicing the panoramic video at the intersection points. Spatial intersections are detected by finding the best-matching panoramic images from a number of nearby candidates. Each panoramic image is segmented into horizontal strips. Each strip is averaged in the vertical direction. The Fourier coefficients of the resulting 1-D data capture the rotation-invariant horizontal texture of each panoramic image. The distance between two panoramic images is calculated as the sum of the distances between their strip texture pairs at the same row positions. The intersection is chosen as the two candidate panoramic images that have the minimum distance.
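The matching procedure in this abstract might look something like the sketch below. It is an illustration under stated assumptions rather than the published method: images are plain lists of grayscale rows, DFT magnitudes are used as the rotation-invariant signature (a horizontal rotation of a panorama is a circular shift, which changes only the phase), Euclidean distance compares strips, and all function names and parameter defaults are hypothetical.

```python
import cmath
import math

def strip_signatures(image, num_strips, num_coeffs):
    # image: list of rows of gray values. Returns one list of DFT magnitudes per strip.
    height, width = len(image), len(image[0])
    rows_per_strip = height // num_strips  # assumes height >= num_strips
    signatures = []
    for s in range(num_strips):
        rows = image[s * rows_per_strip:(s + 1) * rows_per_strip]
        # Average the strip vertically to get a single 1-D signal across the panorama.
        signal = [sum(row[x] for row in rows) / len(rows) for x in range(width)]
        magnitudes = []
        for k in range(num_coeffs):
            coeff = sum(signal[x] * cmath.exp(-2j * math.pi * k * x / width)
                        for x in range(width))
            magnitudes.append(abs(coeff))  # the magnitude ignores circular shifts
        signatures.append(magnitudes)
    return signatures

def panorama_distance(sig_a, sig_b):
    # Sum of distances between strips at the same row positions.
    return sum(math.dist(a, b) for a, b in zip(sig_a, sig_b))

def find_intersection(frames_a, frames_b, num_strips=2, num_coeffs=4):
    # Return (i, j, distance) for the closest pair among the candidate frames.
    sigs_a = [strip_signatures(f, num_strips, num_coeffs) for f in frames_a]
    sigs_b = [strip_signatures(f, num_strips, num_coeffs) for f in frames_b]
    dist, i, j = min((panorama_distance(sa, sb), i, j)
                     for i, sa in enumerate(sigs_a)
                     for j, sb in enumerate(sigs_b))
    return i, j, dist
```

Here `frames_a` and `frames_b` stand for the nearby candidate frames from the two path segments that are suspected to cross.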
Publication Details
  • SPIE ITCOM 2002
  • Jul 31, 2002

Abstract

We present a framework, motivated by rate-distortion theory and the human visual system, for optimally representing the real world given limited video resolution. To provide users with high fidelity views, we built a hybrid video camera system that combines a fixed wide-field panoramic camera with a controllable pan/tilt/zoom (PTZ) camera. In our framework, a video frame is viewed as a limited-frequency representation of some "true" image function. Our system combines outputs from both cameras to construct the highest fidelity views possible, and controls the PTZ camera to maximize information gain available from higher spatial frequencies. In operation, each remote viewer is presented with a small panoramic view of the entire scene, and a larger close-up view of a selected region. Users may select a region by marking the panoramic view. The system operates the PTZ camera to best satisfy requests from multiple users. When no regions are selected, the system automatically operates the PTZ camera to minimize predicted video distortion. High-resolution images are cached and sent if a previously recorded region has not changed and the PTZ camera is pointed elsewhere. We present experiments demonstrating that the panoramic image can effectively predict where to gain the most information, and also that the system provides better images to multiple users than conventional camera systems.

Communication and Understanding for Decision Support

Publication Details
  • Proceedings of the IFIP International Conference on Decision Making and Decision Support in the Internet Age
  • Jul 4, 2002

Abstract

As the technology for communication changes, the role of communication in the conduct of business changes with it. Communication is no longer just a technical matter of separating signal from noise and managing bandwidth but also a social matter in which negotiating differences in understanding among and between communicators is a primary business priority. Addressing this priority requires an understanding of how individuals interact in the course of their decision making activities. Using the work of Anthony Giddens as a point of departure, this paper views interaction in communication as consisting of three dimensions - meaning, authority, and trust. These three dimensions are used to identify new opportunities for advances in decision making technology which help deal with potential breakdowns in social interaction.

The Elusive Ubiquitous Information System and m-Links

Publication Details
  • Fuji Xerox Technical Report, No. 14, 2002
  • Jun 25, 2002

Abstract

A basic objective of Weiser's Ubiquitous Computing vision is ubiquitous information access: being able to utilize any content or service (e.g., all the rich media content and services on the WWW), using devices that are always "at hand" (embedded in environments or portable), over a network with universal coverage and adequate bandwidth. Although much progress has been made, the ideal remains elusive. This paper examines the inter-relations among three dimensions of ubiquitous information systems: (1) ubiquitous content; (2) ubiquitous devices; and (3) ubiquitous networking. We use the space defined by these dimensions to reflect on the tradeoffs designers make and to chart some past and current information systems. Given this background, we present m-Links (mobile links), a new system that takes aim at the elusive ideal of ubiquitous information. Our approach builds on wireless web phone technologies because of their trend towards ubiquitous devices and networking (the second and third dimensions). Yet such very small devices sacrifice usability as rich media Internet terminals (the first dimension). To offset this limitation, we propose a new information access model for very small devices that supports a much wider range of content and services than previously possible. We have built this system with an emphasis on open systems extensibility and describe its design and implementation.

Going Back in Hypertext

Publication Details
  • Proceedings of ACM Hypertext 2002
  • Jun 11, 2002

Abstract

Hypertext interfaces typically involve navigation, the act (and interaction) of moving from one piece of information to another. Navigation can be exploratory, or it may involve backtracking to some previously-visited node. While backtracking interfaces are common, they may not reflect differences in readers' purposes and mental models. This paper draws on some empirical evidence regarding navigation between and within documents to suggest improvements on traditional hypertext navigation, and proposes a time-based view of backtracking.
Publication Details
  • Journal of Library Administration, 35:1-2, 99-123, Haworth
  • Jun 7, 2002

Abstract

In the emerging world of electronic publishing, how we create, distribute, and read books will be in large part determined by an underlying framework of content standards that establishes the range of technological opportunities and constraints for publishing and reading systems. But efforts to develop content standards based on sound engineering models must skillfully negotiate competing and sometimes apparently irreconcilable objectives if they are to produce results relevant to the rapidly changing course of technology. The Open eBook Forum's Publication Structure, an XML-based specification for electronic books, is an example of the sort of timely and innovative problem solving required for successful real-world standards development. As a result of this effort, the electronic book industry will not only happen sooner and on a larger scale than it would have otherwise, but the electronic books it produces will be more functional, more interoperable, and more accessible to all readers. Public interest participants have a critical role in this process.
Publication Details
  • CHI 2002
  • Apr 22, 2002

Abstract

Shared text input is a technique we implemented in a note-taking system to facilitate text entry on small devices. Instead of writing out words on the tedious text entry interfaces found on handheld computers, users can quickly reuse words and phrases already entered by others. Sharing notes during a meeting also increases awareness among note takers. We found that filtering the text to share was appropriate to deal with a variety of design issues such as screen real estate, scalability, privacy, reciprocity, and predictability of text location.
Publication Details
  • CHI 2002
  • Apr 22, 2002

Abstract

In this paper, we describe an evaluation of the Palette, a presentation tool that was reported at CHI '99. The Palette allows presenters to quickly access digital presentations using physical cards that have unique barcodes printed on them. The Palette has been in use in our lab for over three years, and has been released as a product in Japan. Our evaluation consists of an analysis of usage logs, an expert walkthrough review, and observations and interviews with users, non-users and the system administrator. The findings reveal benefits and drawbacks of the technology, and offer design ideas for further work on tangible tools of this kind.
Publication Details
  • International Journal of Human-Computer Studies, 56, pp. 75-107
  • Feb 1, 2002

Abstract

We describe our experiences with the design, implementation, deployment, and evaluation of a Portholes tool which provides group and collaboration awareness through the Web. The research objective was to explore how such a system would improve communication and facilitate a shared understanding among distributed development groups. During the deployment of our Portholes system, we conducted a naturalistic study by soliciting user feedback and evolving the system in response. Many of the initial reactions of potential users indicated that our system projected the wrong image, so we designed a new version that provided explicit cues about being in public and about who is looking back, to suggest a social rather than an information interface. We implemented the new design as a Java applet and evaluated design choices with a preference study. Our experiences with different Portholes versions and user reactions to them provide insights for designing awareness tools beyond Portholes systems. Our approach is for the studies to guide and to provide feedback for the design and technical development of our system.
2001
Publication Details
  • In Workshop on Identifying Objects Across Variations in Lighting: Psychophysics & Computation, Proc. IEEE Intl. Conf. on Computer Vision & Pattern Recognition 2001.
  • Dec 12, 2001

Abstract

In this paper, we document an extension to traditional pattern-theoretic object templates to jointly accommodate variations in object pose and in the radiant appearance of the object surface. We first review classical object templates accommodating pose variation. We then develop an efficient subspace representation for the object radiance indexed on the surface of the three dimensional object template. We integrate the low-dimensional representation for the object radiance, or signature, into the pattern-theoretic template, and present the results of orientation estimation experiments. The experiments demonstrate both estimation performance fluctuations under varying illumination conditions and performance degradations associated with unknown scene illumination. We also present a Bayesian approach for estimation accommodating illumination variability.

Work/place: mobile technologies and arenas of activity

Publication Details
  • ACM SIGGROUP Bulletin, Volume 22, Issue 3, pp. 3-9, ACM Press, New York, NY, USA
  • Dec 8, 2001

Abstract

The increasing number of wireless, portable devices has led inevitably to lyrical rhetorics of business cost-cutting and increased efficiency as workers can be productive while on the move and offices become streamlined areas of efficient activity. In this short paper, we raise a number of issues that have been appearing in common discourses about the (most) modern office, and the impact of wireless technologies thereupon. We also present an overview of a workshop held at ECSCW in Bonn in September of 2001 on this topic, summarizing the comments and discussions that took place at the workshop.

Framing Mobile Collaborations and Mobile Technologies.

Publication Details
  • In B. Brown, N. Green, R. Harper (Eds.) Wireless World: Social and Interactional Aspects of Wireless Technology, London, UK: Springer-Verlag.
  • Dec 1, 2001

Abstract

Recent years have seen a marked increase in the production and promotion of portable, wireless communication devices: mobile phones with internet access, wireless PDAs such as the Palm VII and smart pagers such as RIM's 850 and 950. Some claim the presence of such devices in the hands, bags and pockets of so many people heralds a new world of work in which people can be reached and information accessed "anywhere, anytime". Whether or not access to information in itself can promote new working practices, individuals whose lives revolve around movement between work sites have been singled out as an obvious market for such portable wireless communication devices. Using these devices, such "mobile workers" can be in touch with colleagues, collaborators and clients "24/7", and still sustain non-work social relationships due, apparently, to their constant connectedness whilst mobile. In this chapter we have two goals. The first is to address the design of mobile technologies. The second is to illustrate our design approach, wherein we consider local practices of technology use, but also the broader cultural context in which technologies are designed, produced, bought, sold, used and redesigned. Our ultimate design aim is to build upon existing practices, but also to consider possibilities for the development of innovative technologies that enable new, complementary, practices.
Publication Details
  • In Proceedings of the International Conference on Image Processing, Thessaloniki, Greece. October 7-10, 2001.
  • Oct 7, 2001

Abstract

In this paper, we present a novel framework for analyzing video using self-similarity. Video scenes are located by analyzing inter-frame similarity matrices. The approach is flexible to the choice of similarity measure and is robust and data-independent because the data is used to model itself. We present the approach and its application to scene boundary detection. This is shown to dramatically outperform a conventional scene-boundary detector that uses a histogram-based measure of frame difference.
Publication Details
  • Proceedings of ACM Multimedia 2001, Ottawa, Canada, Oct. 5, 2001.
  • Oct 5, 2001

Abstract

Given rapid improvements in storage devices, network infrastructure and streaming-media technologies, a large number of corporations and universities are recording lectures and making them available online for anytime, anywhere access. However, producing high-quality lecture videos is still labor intensive and expensive. Fortunately, recent technology advances are making it feasible to build automated camera management systems to capture lectures. In this paper we report our design of such a system, including system configuration, audio-visual tracking techniques, software architecture, and user study. Motivated by different roles in a professional video production team, we have developed a multi-cinematographer single-director camera management system. The system performs lecturer tracking, audience tracking, and video editing all fully automatically, and offers quality close to that of human-operated systems.
Publication Details
  • Proc. ACM Multimedia 2001, Ottawa, CA, Oct. 2001.
  • Sep 30, 2001

Abstract

We describe a system called FlyAbout which uses spatially indexed panoramic video for virtual reality applications. Panoramic video is captured by moving a 360° camera along continuous paths. Users can interactively replay the video with the ability to view any interesting object or choose a particular direction. Spatially indexed video gives the ability to travel along paths or roads with a map-like interface. At junctions, or intersection points, users can choose which path to follow as well as which direction to look, allowing interaction not available with conventional video. Combining the spatial index with a spatial database of maps or objects allows users to navigate to specific locations or interactively inspect particular objects.
Publication Details
  • Proc. International Conference on Computer Music (ICMC), Habana, Cuba, September 2001.
  • Sep 12, 2001

Abstract

This paper presents a novel approach to visualizing the time structure of musical waveforms. The acoustic similarity between any two instants of an audio recording is displayed in a static 2D representation, which makes structural and rhythmic characteristics visible. Unlike practically all prior work, this method characterizes self-similarity rather than specific audio attributes such as pitch or spectral features. Examples are presented for classical and popular music.
Publication Details
  • IEEE Computer, 34(9), pp. 61-67
  • Sep 1, 2001

Abstract


To meet the diverse needs of business, education, and personal video users, the authors developed three visual interfaces that help identify potentially useful or relevant video segments. In such interfaces, keyframes (still images automatically extracted from video footage) can distinguish videos, summarize them, and provide access points. Well-chosen keyframes enhance a listing's visual appeal and help users select videos. Keyframe selection can vary depending on the application's requirements: A visual summary of a video-captured meeting may require only a few highlight keyframes, a video editing system might need a keyframe for every clip, while a browsing interface requires an even distribution of keyframes over the video's full length. The authors conducted user studies for each of their three interfaces, gathering input for subsequent interface improvements. The studies revealed that finding a similarity measure for collecting video clips into groups that more closely match human perception poses a challenge. Another challenge is to further improve the video-segmentation algorithm used for selecting keyframes. A new version will provide users with more information and control without sacrificing the interface's ease of use.

Recording the Region of Interest from FlyCam Panoramic Video

Publication Details
  • Proc. International Conference on Image Processing, Thessaloniki, Greece, September 2001.
  • Sep 1, 2001

Abstract

A novel method for region of interest tracking and recording video is presented. The proposed method is based on the FlyCam system, which produces high resolution and wide-angle video sequences by stitching the video frames from multiple stationary cameras. The method integrates tracking and recording processes, and targets applications such as classroom lectures and video conferencing. First, the region of interest (which typically covers the speaker) is tracked using a Kalman filter. Then, the Kalman filter estimation results are used for virtual camera control and to record the video. The system has no physical camera motion and the virtual camera parameters are readily available for video indexing. The proposed system has been implemented for real time recording of lectures and presentations.
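The two-stage idea in this abstract (track the region of interest with a Kalman filter, then drive a virtual camera from the estimate) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the one-dimensional constant-velocity model, the noise values, and the names `RoiTracker1D` and `crop_window` do not come from the paper.

```python
class RoiTracker1D:
    """Constant-velocity Kalman filter for the horizontal center of a region of interest."""

    def __init__(self, position, process_noise=1e-2, measurement_noise=4.0):
        self.pos, self.vel = float(position), 0.0
        self.p = [[10.0, 0.0], [0.0, 10.0]]  # state covariance (assumed initial value)
        self.q, self.r = process_noise, measurement_noise

    def step(self, measured_position, dt=1.0):
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = self.pos + dt * self.vel
        vel = self.vel
        (p00, p01), (p10, p11) = self.p
        q00 = p00 + dt * (p10 + p01) + dt * dt * p11 + self.q
        q01 = p01 + dt * p11
        q10 = p10 + dt * p11
        q11 = p11 + self.q
        # Update with a position-only measurement, H = [1, 0].
        innovation = measured_position - pos
        s = q00 + self.r
        k0, k1 = q00 / s, q10 / s
        self.pos = pos + k0 * innovation
        self.vel = vel + k1 * innovation
        self.p = [[(1 - k0) * q00, (1 - k0) * q01],
                  [q10 - k1 * q00, q11 - k1 * q01]]
        return self.pos

def crop_window(center, crop_width, frame_width):
    # Virtual camera: a crop of the stitched frame, clamped to the frame edges.
    left = int(round(center - crop_width / 2))
    left = max(0, min(left, frame_width - crop_width))
    return left, left + crop_width

# Usage: feed per-frame speaker detections, then crop around the smoothed estimate.
tracker = RoiTracker1D(position=100)
for t in range(1, 51):
    estimate = tracker.step(100 + 2 * t)  # speaker drifting right at 2 px per frame
window = crop_window(estimate, crop_width=100, frame_width=640)
```

Because the virtual camera is only a crop of the stitched panorama, the smoothed estimate directly sets the recorded view and no physical camera has to move.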

The Beat Spectrum: A New Approach to Rhythm Analysis

Publication Details
  • In Proceedings of the International Conference on Multimedia and Expo 2001 (ICME), Tokyo, Japan. August 22-25, 2001.
  • Aug 25, 2001

Abstract

We introduce the beat spectrum, a new method of automatically characterizing the rhythm and tempo of music and audio. The beat spectrum is a measure of acoustic self-similarity as a function of time lag. Highly structured or repetitive music will have strong beat spectrum peaks at the repetition times. This reveals both tempo and the relative strength of particular beats, and therefore can distinguish between different kinds of rhythms at the same tempo. We also introduce the beat spectrogram which graphically illustrates rhythm variation over time. Unlike previous approaches to tempo analysis, the beat spectrum does not depend on particular attributes such as energy or frequency, and thus will work for any music or audio in any genre. We present tempo estimation results for a variety of musical genres, which are accurate to within 1%. This approach has a variety of applications, including music retrieval by similarity and automatically generating music videos.
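As a rough illustration of "self-similarity as a function of time lag", the sketch below averages a similarity matrix along each diagonal and reads the tempo off the strongest non-zero lag. The cosine similarity measure, the per-lag normalization, the toy feature vectors, and the helper names are assumptions made here, not details taken from the paper.

```python
import math

def cosine_similarity(u, v):
    # Assumed similarity measure between two per-window feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def beat_spectrum(features, max_lag):
    # Average similarity between windows separated by each lag (requires max_lag < len(features)).
    n = len(features)
    sim = [[cosine_similarity(u, v) for v in features] for u in features]
    return [sum(sim[i][i + lag] for i in range(n - lag)) / (n - lag)
            for lag in range(max_lag + 1)]

def dominant_lag(spectrum):
    # Strongest peak, ignoring lag 0 (every window is identical to itself).
    return max(range(1, len(spectrum)), key=lambda lag: spectrum[lag])

def lag_to_bpm(lag, hop_seconds):
    # Convert a repetition lag, in analysis windows, to beats per minute.
    return 60.0 / (lag * hop_seconds)

# Toy signal: a four-window pattern repeated eight times.
pattern = [[1, 0], [0, 1], [0, 1], [0, 1]]
spectrum = beat_spectrum(pattern * 8, max_lag=6)
lag = dominant_lag(spectrum)
```

For the toy pattern the spectrum peaks at a lag of four windows; with a hypothetical hop of 0.125 s per window, `lag_to_bpm` maps that to 120 beats per minute.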
Publication Details
  • In Proceedings of Conference on Modeling and Design of Wireless Networks (ITCOM2001), Denver, Colorado, August 23-24, 2001.
  • Aug 23, 2001

Abstract

This paper reports our design and implementation of an automatic lecture-room camera-management system. The motivation for building this system is to facilitate online lecture access and reduce the expense of producing high quality lecture videos. The goal of this project is a camera-management system that can perform as a human video-production team. To achieve this goal, our system collects audio/video signals available in the lecture room and uses the multimodal information to direct our video cameras to interesting events. Compared to previous work, which has tended to be technology centric, we started with interviews with professional video producers and used their knowledge and expertise to create video production rules. We then targeted technology components that allowed us to implement a substantial portion of these rules, including the design of a virtual video director, a speaker cinematographer, and an audience cinematographer. The complete system is installed in parallel with a human-operated video production system in a middle-sized corporate lecture room, and used for broadcasting lectures through the web. The system's performance was compared to that of a human operator via a user study. Results suggest that our system's quality is close to that of a human-controlled system.

The impact of text browsing on text retrieval performance

Publication Details
  • Information Processing and Management 37 (3) pp. 507-520
  • Aug 21, 2001

Abstract

The results from a series of three experiments that used Text Retrieval Conference (TREC) data and TREC search topics are compared. These experiments each involved three novel user interfaces (one per experiment). User interfaces that made it easier for users to view text were found to improve recall in all three experiments. A distinction was found between a cluster of subjects (a majority of whom were search experts) who tended to read fewer documents more carefully (readers, or exclusives) and subjects who skimmed through more documents without reading them as carefully (skimmers, or inclusives). Skimmers were found to have significantly better recall overall. A major outcome from our experiments at TREC and with the TREC data, is that hypertext interfaces to information retrieval (IR) tasks tend to increase recall. Our interpretation of this pattern of results across the three experiments is that increased interaction with the text (more pages viewed) generally improves recall. Findings from one of the experiments indicated that viewing a greater diversity of text on a single screen (i.e., not just more text per se, but more articles available at once) may also improve recall. In an experiment where a traditional (type-in) query interface was contrasted with a condition where queries were marked up on the text, the improvement in recall due to viewing more text was more pronounced with search novices. Our results demonstrate that markup and hypertext interfaces to text retrieval systems can benefit recall and can also benefit novices. The challenge now will be to find modified versions of hypertext interfaces that can improve precision, as well as recall and that can work with users who prefer to use different types of search strategy or have different types of training and experience.

m-Links: An Infrastructure for Very Small Internet Devices

Publication Details
  • The 7th Annual International Conference on Mobile Computing and Networking (MOBICOM 2001), Rome, Italy, July 16-21 2001, ACM Press, 2001, pp. 122-131.
  • Jul 16, 2001

Abstract

In this paper we describe the Mobile Link (m-Links) infrastructure for utilizing existing World Wide Web content and services on wireless phones and other very small Internet terminals. Very small devices, typically with 3-20 lines of text, provide portability and other functionality while sacrificing usability as Internet terminals. In order to provide access on such limited hardware we propose a small device web navigation model that is more appropriate than the desktop computer's web browsing model. We introduce a middleware proxy, the Navigation Engine, to facilitate the navigation model by concisely displaying the Web's link (i.e., URL) structure. Because not all Web information is appropriately "linked," the Navigation Engine incorporates data-detectors to extract bits of useful information such as phone numbers and addresses. In order to maximize program-data composability, multiple network-based services (similar to browser plug-ins) are keyed to a link's attributes such as its MIME type. We have built this system with an emphasis on user extensibility and we describe the design and implementation as well as a basic set of middleware services that we have found to be particularly important.
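Two of the ideas in this abstract, data-detectors and services keyed to a link's MIME type, can be sketched in a few lines. This is purely illustrative: the phone-number pattern (North American style), the service names, and the registry layout are hypothetical and are not taken from the m-Links implementation.

```python
import re

# Hypothetical data-detector: finds North American style phone numbers in page text.
PHONE_PATTERN = re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}")

def detect_phone_numbers(text):
    # Return phone-number strings found in unlinked page text.
    return PHONE_PATTERN.findall(text)

# Hypothetical registry: services offered for a link, keyed by the link's MIME type.
SERVICES_BY_MIME_TYPE = {
    "text/html": ["browse links", "email page"],
    "application/pdf": ["print", "email document"],
    "audio/mpeg": ["play on nearby speaker"],
}

def services_for_link(mime_type):
    # Look up the services that apply to a link; unknown types get none.
    return SERVICES_BY_MIME_TYPE.get(mime_type, [])

numbers = detect_phone_numbers("Call (650) 555-0100 or 650-555-0199.")
actions = services_for_link("application/pdf")
```

Keying services to link attributes rather than to page content means a new service only needs a registry entry, which fits the extensibility goal the abstract mentions.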
Publication Details
  • Proceedings of the INNS-IEEE International Joint Conference on Neural Networks, vol. 3, pp. 2176 - 2181, Washington DC., July 14-19, 2001.
  • Jul 14, 2001

Abstract

The goal of this project is to teach a computer-robot system to understand human speech through natural human-computer interaction. To achieve this goal, we develop an interactive and incremental learning algorithm based on entropy-guided learning vector quantisation (LVQ) and memory association. Supported by this algorithm, the robot has the potential to learn unlimited sounds progressively. Experimental results of a multilingual short-speech learning task are given after the presentation of the learning system. Further investigation of this learning system will include human-computer interactions that involve more modalities, and applications that use the proposed idea to train home appliances.
Publication Details
  • The Eighth IFIP TC.13 Conference On Human-Computer Interaction (INTERACT 2001). Tokyo, Japan, July 9-13, 2001.
  • Jul 9, 2001

Abstract

The two most commonly used techniques for evaluating the fit between application design and use - namely, usability testing and beta testing with user feedback - suffer from a number of limitations that restrict evaluation scale (in the case of usability tests) and data quality (in the case of beta tests). They also fail to provide developers with an adequate basis for: (1) assessing the impact of suspected problems on users at large, and (2) deciding where to focus development and evaluation resources to maximize the benefit for users at large. This paper describes an agent-based approach for collecting usage data and user feedback over the Internet that addresses these limitations to provide developers with a complementary source of usage- and usability-related information. Contributions include: a theory to motivate and guide data collection, an architecture capable of supporting very large scale data collection, and real-world experience suggesting the proposed approach is complementary to existing practice.
Publication Details
  • In Proceedings of Human-Computer Interaction (INTERACT '01), IOS Press, Tokyo, Japan, pp. 464-471
  • Jul 9, 2001

Abstract

Hitchcock is a system to simplify the process of editing video. Its key features are the use of automatic analysis to find the best quality video clips, an algorithm to cluster those clips into meaningful piles, and an intuitive user interface for combining the desired clips into a final video. We conducted a user study to determine how the automatic clip creation and pile navigation support users in the editing process. The study showed that users liked the ease-of-use afforded by automation, but occasionally had problems navigating and overriding the automated editing decisions. These findings demonstrate the need for a proper balance between automation and user control. Thus, we built a new version of Hitchcock that retains the automatic editing features, but provides additional controls for navigation and for allowing users to modify the system decisions.

Designing e-Books for Legal Research.

Publication Details
  • In Proceedings of JCDL 2001 (Roanoke, VA, June 23-27). ACM Press. pp. 41-48.
  • Jun 23, 2001

Abstract

In this paper we report the findings from a field study of legal research in a first-tier law school and on the resulting redesign of XLibris, a next-generation e-book. We first characterize a work setting in which we expected an e-book to be a useful interface for reading and otherwise using a mix of physical and digital library materials, and explore what kinds of reading-related functionality would bring value to this setting. We do this by describing important aspects of legal research in a heterogeneous information environment, including mobility, reading, annotation, link following and writing practices, and their general implications for design. We then discuss how our work with a user community and an evolving e-book prototype allowed us to examine tandem issues of usability and utility, and to redesign an existing e-book user interface to suit the needs of law students. The study caused us to move away from the notion of a stand-alone reading device and toward the concept of a document laptop, a platform that would provide wireless access to information resources, as well as support a fuller spectrum of reading-related activities.
Publication Details
  • Proceedings of ACM CHI2001, vol. 3, pp. 442 - 449, Seattle, Washington, USA, March 31 - April 5, 2001.
  • Apr 5, 2001

Abstract

Given rapid improvements in network infrastructure and streaming-media technologies, a large number of corporations and universities are recording lectures and making them available online for anytime, anywhere access. However, producing high-quality lecture videos is still labor intensive and expensive. Fortunately, recent technology advances are making it feasible to build automated camera management systems to capture lectures. In this paper we report on our design, implementation and study of such a system. Compared to previous work, which has tended to be technology centric, we started with interviews with professional video producers and used their knowledge and expertise to create video production rules. We then targeted technology components that allowed us to implement a substantial portion of these rules, including the design of a virtual video director. The system's performance was compared to that of a human operator via a user study. Results suggest that our system's quality is close to that of a human-controlled system. In fact, most remote audience members could not tell if the video was produced by a computer or a person.

Quiet Calls: Talking Silently on Mobile Phones

Publication Details
  • In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 174-181, ACM Press, March 31-April 5, 2001, Seattle, WA.
  • Mar 30, 2001
Publication Details
  • In Proceedings of the Thirty-fourth Annual Hawaii International Conference on System Sciences (HICSS), Big Island, Hawaii. January 7-12, 2001.
  • Feb 7, 2001

Abstract

This paper describes a new system for panoramic two-way video communication. Digitally combining images from an array of inexpensive off-the-shelf video cameras results in a wide-field panoramic camera. This system can aid distance learning in several ways, both by presenting a better view of the instructor and teaching materials to the students and by enabling better audience feedback to the instructor. Because the camera is fixed with respect to the background, simple motion analysis can be used to track objects and people of interest. Electronically selecting a region of the panoramic image results in a rapidly steerable "virtual camera." We present system details and a prototype distance-learning scenario using multiple panoramic cameras.
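
The "virtual camera" idea can be illustrated with a minimal sketch, assuming the panorama is already stitched into a single image stored as a list of pixel rows. The function name `virtual_camera` and its parameters are hypothetical and are not taken from the paper; the sketch shows only horizontal steering by cropping.

```python
def virtual_camera(panorama, center_x, view_width):
    """Return a horizontal crop of the panorama centered near center_x.

    panorama: list of rows, each row a list of pixel values (assumed layout).
    The crop window is clamped so it never runs past the image edges.
    """
    width = len(panorama[0])
    half = view_width // 2
    # Clamp the left edge into the valid range [0, width - view_width].
    left = max(0, min(center_x - half, width - view_width))
    return [row[left:left + view_width] for row in panorama]


# A tiny 2 x 10 "panorama" whose pixel values are just column indices.
panorama = [list(range(10)), list(range(10, 20))]
view = virtual_camera(panorama, center_x=5, view_width=4)
```

Because steering only changes which pixels are selected, no mechanical motion is involved, which is what makes such a camera rapidly steerable.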
Publication Details
  • WebNet 2001 World Conference on the WWW and Internet, Orlando, FL
  • Jan 17, 2001

Abstract

As more information is made available online, users collect information in personal information spaces like bookmarks and emails. While most users feel that organizing these collections is crucial to improve access, studies have shown that this activity is time consuming and cognitively demanding. Automatic classification has been used, but because it relies on the full text of the documents, it does not generate personalized classifications. Our approach is to give users the ability to annotate their documents as they first access them. This annotation tool is unobtrusive and welcomed by most users, who generally miss this facility when dealing with digital documents. Our experiments show that these annotations can be used to generate personalized classifications of annotated Web pages.

Description and Narrative in Hypervideo

Publication Details
  • Proceedings of the Thirty-Fourth Annual Hawaii International Conference on System Sciences
  • Jan 3, 2001

Abstract

While hypertext was originally conceived for the management of scientific and technical information, it has been embraced with great enthusiasm by several members of the literary community for the promises it offers towards new approaches to narrative. Experiments with hypertext-based interactive narrative were originally based solely on verbal text but have more recently extended to include digital video artifacts. The most accomplished of these experiments, HyperCafe, provided new insights into the nature of narrative and how it may be presented; but it also offered an opportunity to reconsider other text types. This paper is an investigation of the application of an approach similar to HyperCafe to a descriptive text. We discuss how the approach serves the needs of description and illustrate the discussion with a concrete example. We then conclude by considering the extent to which our experiences with description may be applied to our continuing interest in narrative.
2000
Publication Details
  • ACM Computing Surveys, Vol. 32 No. 4, December 2000.
  • Dec 1, 2000

Abstract

Modern window-based user interface systems generate user interface events as natural products of their normal operation. Because such events can be automatically captured and because they indicate user behavior with respect to an application's user interface, they have long been regarded as a potentially fruitful source of information regarding application usage and usability. However, because user interface events are typically voluminous and rich in detail, automated support is generally required to extract information at a level of abstraction that is useful to investigators interested in analyzing application usage or evaluating usability. This survey examines computer-aided techniques used by HCI practitioners and researchers to extract usability-related information from user interface events. A framework is presented to help HCI practitioners and researchers categorize and compare the approaches that have been, or might fruitfully be, applied to this problem. Because many of the techniques in the research literature have not been evaluated in practice, this survey provides a conceptual evaluation to help identify some of the relative merits and drawbacks of the various classes of approaches. Ideas for future research in this area are also presented. This survey addresses the following questions: How might user interface events be used in evaluating usability? How are user interface events related to other forms of usability data? What are the key challenges faced by investigators wishing to exploit this data? What approaches have been brought to bear on this problem and how do they compare to one another? What are some of the important open research questions in this area?
Publication Details
  • Multimedia Modeling: Modeling Multimedia Information and Systems, Nagano, Japan
  • Nov 12, 2000

Abstract

While hypermedia is usually presented as a way to offer content in a nonlinear manner, hypermedia structure tends to reinforce the assumption that reading is basically a linear process. Link structures provide a means by which the reader may choose different paths to traverse; but each of these paths is fundamentally linear, revealed through either a block of text or a well-defined chain of links. While there are experiences that get beyond such linear constraints, such as driving a car, it is very hard to capture this kind of non-linearity, characterized by multiple sources of stimuli competing for attention, in a hypermedia document. This paper presents a multi-channel document infrastructure that provides a means by which all such sources of attention are presented on a single "page" (i.e., a display with which the reader interacts) and move between background and foreground in response to the activities of the reader. The infrastructure thus controls the presentation of content with respect to four dimensions: visual, audio, interaction support, and rhythm.
Publication Details
  • In Proceedings of UIST '00, ACM Press, pp. 81-89, 2000.
  • Nov 4, 2000

Abstract

Hitchcock is a system that allows users to easily create custom videos from raw video shot with a standard video camera. In contrast to other video editing systems, Hitchcock uses automatic analysis to determine the suitability of portions of the raw video. Unsuitable video typically has fast or erratic camera motion. Hitchcock first analyzes video to identify the type and amount of camera motion: fast pan, slow zoom, etc. Based on this analysis, a numerical "unsuitability" score is computed for each frame of the video. Combined with standard editing rules, this score is used to identify clips for inclusion in the final video and to select their start and end points. To create a custom video, the user drags keyframes corresponding to the desired clips into a storyboard. Users can lengthen or shorten the clip without specifying the start and end frames explicitly. Clip lengths are balanced automatically using a spring-based algorithm.
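
As a rough illustration of how a per-frame "unsuitability" score could drive clip selection, the sketch below keeps runs of consecutive frames whose score stays under a threshold. The threshold, the minimum clip length, and the function name `find_clips` are assumptions made for this example; the abstract does not specify Hitchcock's actual editing rules, and the spring-based length balancing is not modeled here.

```python
def find_clips(scores, threshold=0.5, min_len=3):
    """Return (start, end) frame ranges, end exclusive, of suitable clips.

    A frame counts as suitable when its unsuitability score is below
    `threshold` (an assumed cutoff). Runs shorter than `min_len` frames
    are discarded as too short to use.
    """
    clips = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i  # a new run of suitable frames begins
        else:
            if start is not None and i - start >= min_len:
                clips.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(scores) - start >= min_len:
        clips.append((start, len(scores)))
    return clips


scores = [0.9, 0.1, 0.2, 0.1, 0.8, 0.2, 0.3, 0.9, 0.1, 0.1, 0.1, 0.1]
clips = find_clips(scores)
```

In this toy input, the two-frame run at frames 5 and 6 is dropped because it is shorter than the assumed minimum length.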
Publication Details
  • In Proceedings of the International Symposium on Music Information Retrieval, in press.
  • Oct 23, 2000

Abstract

We introduce an audio retrieval-by-example system for orchestral music. Unlike many other approaches, this system is based on analysis of the audio waveform and does not rely on symbolic or MIDI representations. The system, ARTHUR, retrieves audio on the basis of long-term structure, specifically the variation between soft and loud passages. The long-term structure is determined from the envelope of audio energy versus time in one or more frequency bands. Similarity between energy profiles is calculated using dynamic programming. Given an example audio document, other documents in a collection can be ranked by similarity of their energy profiles. Experiments on a modest corpus demonstrate excellent results in retrieving different performances of the same orchestral work, given an example performance or short excerpt as a query.
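
The abstract states only that similarity between energy profiles is computed with dynamic programming. One common technique of that kind is dynamic time warping, sketched below under the assumption that each document has been reduced to a list of energy values over time. The function names and the absolute-difference local cost are illustrative choices, not details confirmed by the paper.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two energy profiles (lower is closer)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def rank_by_similarity(query, collection):
    """Return document names ordered from most to least similar to the query."""
    return sorted(collection, key=lambda name: dtw_distance(query, collection[name]))


# Hypothetical profiles: a slower performance of the same work, and another work.
collection = {
    "other_work": [5, 5, 0, 0],
    "same_work_slower": [0, 1, 1, 5, 5, 1],
}
ranking = rank_by_similarity([0, 1, 5, 1], collection)
```

Warping lets a slower or faster performance align with the query at no extra cost, which is why different performances of the same work can rank highly.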