Publications

FXPAL publishes in top scientific conferences and journals.

2005
Publication Details
  • Demo and presentation in UbiComp 2005 workshop in Tokyo, Japan.
  • Sep 11, 2005

Abstract

A Post-Bit is a prototype of a small ePaper device for handling multimedia content, combining interaction control and display into one package. Post-Bits are modeled after paper Post-Its™; the functions of each Post-Bit combine the affordances of physical tiny sticky memos and digital handling of information. Post-Bits enable us to arrange multimedia contents in our embodied physical spaces. Tangible properties of paper such as flipping, flexing, scattering and rubbing are mapped to controlling aspects of the content. In this paper, we introduce the integrated design and functionality of the Post-Bit system, including four main components: the ePaper sticky memo/player, with integrated sensors and connectors; a small container/binder that a few Post-Bits can fit into, for ordering and multiple connections; the data and power port that allows communication with the host computer; and finally the software and GUI interface that reside on the host PC and manage multimedia transfer.
Publication Details
  • Sixteenth ACM Conference on Hypertext and Hypermedia
  • Sep 6, 2005

Abstract

Hyper-Hitchcock is a hypervideo editor enabling the direct manipulation authoring of a particular form of hypervideo called "detail-on-demand video." This form of hypervideo allows a single link out of the currently playing video to provide more details on the content currently being presented. The editor includes a workspace to select, group, and arrange video clips into several linear sequences. Navigational links placed between the video elements are assigned labels and return behaviors appropriate to the goals of the hypervideo and the role of the destination video. Hyper-Hitchcock was used by students in a Computers and New Media class to author hypervideos on a variety of topics. The produced hypervideos provide examples of hypervideo structures and the link properties and behaviors needed to support them. Feedback from students identified additional link behaviors and features required to support new hypervideo genres. This feedback is valuable for the redesign of Hyper-Hitchcock and the design of hypervideo editors in general.

DoKumobility: Web services for the mobile worker

Publication Details
  • IEEE International Conference on Next Generation Web Services Practices (NWeSP'05), Seoul, Korea
  • Aug 22, 2005

Abstract

Mobile users often require access to their documents while away from the office. While pre-loading documents in a repository can make those documents available remotely, people need to know in advance which documents they might need. Furthermore, it may be difficult to view, print, or share the document through a portable device such as a cell phone. We implemented DoKumobility, a network of web services for mobile users for managing, printing, and sharing documents. In this paper, we describe the infrastructure and illustrate its use with several applications.
Publication Details
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Aug 8, 2005

Abstract

Organizing digital photograph collections according to events such as holiday gatherings or vacations is a common practice among photographers. To support photographers in this task, we present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present several variants of an automatic unsupervised algorithm to partition a collection of digital photographs based either on temporal similarity alone, or on temporal and content-based similarity. First, inter-photo similarity is quantified at multiple temporal scales to identify likely event clusters. Second, the final clusters are determined according to one of three clustering goodness criteria. The clustering criteria trade off computational complexity and performance. We also describe a supervised clustering method based on learning vector quantization. Finally, we review the results of an experimental evaluation of the proposed algorithms and existing approaches on two test collections.
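As a rough illustration of temporal similarity at more than one scale, the sketch below splits a timestamp-sorted collection wherever consecutive photos fall below a similarity threshold. It is not the algorithm from the paper: the exponential similarity form, the `threshold` value, and the function names are assumptions made for this example.

```python
import math

def temporal_similarity(t_a, t_b, scale):
    """Assumed similarity form: decays exponentially with the time gap."""
    return math.exp(-abs(t_a - t_b) / scale)

def split_by_adjacent_similarity(timestamps, scale, threshold=0.5):
    """Group timestamps (e.g. minutes) into clusters.

    A new cluster starts whenever two consecutive photos are less
    similar than `threshold` at the given temporal `scale`.
    """
    order = sorted(timestamps)
    if not order:
        return []
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if temporal_similarity(prev, cur, scale) < threshold:
            clusters.append([cur])      # large gap at this scale: new event
        else:
            clusters[-1].append(cur)    # small gap: same event
    return clusters

# Hypothetical capture times in minutes.
photos = [0, 5, 10, 300, 305, 1500]
fine = split_by_adjacent_similarity(photos, scale=60)
coarse = split_by_adjacent_similarity(photos, scale=1000)
```

A small scale separates the three bursts of photos, while a large scale merges the first two bursts into one event; comparing the partitions across scales is one simple way to see which boundaries are stable.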

Parallel Changes: Detecting Semantic Interferences

Publication Details
  • The 29th Annual International Computer Software and Applications Conference (COMPSAC 2005), Edinburgh, Scotland
  • Jul 26, 2005

Abstract

Parallel changes are a basic fact of modern software development. Where previously we looked at prima facie interference, here we investigate a less direct form that we call semantic interference. We reduce the forms of semantic interference that we are interested in to overlapping def-use pairs. Using program slicing and data flow analysis, we present algorithms for detecting semantic interference for both concurrent changes (allowed in optimistic version management systems) and sequential parallel changes (supported in pessimistic version management systems), and for changes that are both immediate and distant in time. We provide these algorithms for changes that are additions, showing that interference caused by deletions can be detected by considering the two sets of changes in reverse-time order.
Publication Details
  • International Conference on Image and Video Retrieval 2005
  • Jul 21, 2005

Abstract

Large video collections present a unique set of challenges to the search system designer. Text transcripts do not always provide an accurate index to the visual content, and the performance of visually based semantic extraction techniques is often inadequate for search tasks. The searcher must be relied upon to provide detailed judgment of the relevance of specific video segments. We describe a video search system that facilitates this user task by efficiently presenting search results in semantically meaningful units to simplify exploration of query results and query reformulation. We employ a story segmentation system and supporting user interface elements to effectively present query results at the story level. The system was tested in the 2004 TRECVID interactive search evaluations with very positive results.
Publication Details
  • ICME 2005
  • Jul 20, 2005

Abstract

A common problem with teleconferences is awkward turn-taking, particularly 'collisions,' whereby multiple parties inadvertently speak over each other due to communication delays. We propose a model for teleconference discussions including the effects of delays, and describe tools that can improve the quality of those interactions. We describe an interface to gently provide latency awareness, and to give advance notice of 'incoming speech' to help participants avoid collisions. This is possible when codec latencies are significant, or when a low bandwidth side channel or out-of-band signaling is available with lower latency than the primary video channel. We report on results of simulations, and of experiments carried out with transpacific meetings, that demonstrate these tools can improve the quality of teleconference discussions.
Publication Details
  • 2005 IEEE International Conference on Multimedia & Expo
  • Jul 6, 2005

Abstract

A convenient representation of a video segment is a single keyframe. Keyframes are widely used in applications such as non-linear browsing and video editing. With existing methods of keyframe selection, similar video segments result in very similar keyframes, with the drawback that actual differences between the segments may be obscured. We present methods for keyframe selection based on two criteria: capturing the similarity to the represented segment, and preserving the differences from other segment keyframes, so that different segments will have visually distinct representations. We present two discriminative keyframe selection methods, and an example of experimental results.
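The two criteria can be pictured with a small sketch that scores each candidate frame by its average similarity to its own segment minus its average similarity to frames from other segments. This is only one plausible reading of the abstract, not the paper's method: the feature vectors, the `similarity` function, and comparing against all frames of other segments (rather than their keyframes) are assumptions for illustration.

```python
def similarity(a, b):
    """Hypothetical frame similarity in (0, 1], based on squared distance."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(values):
    """Average of a list, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

def discriminative_keyframe(segment, others):
    """Return the index of the frame in `segment` with the best score.

    score = mean similarity to frames in the same segment
          - mean similarity to frames from other segments
    """
    def score(frame):
        within = mean([similarity(frame, f) for f in segment])
        outside = mean([similarity(frame, f) for f in others])
        return within - outside
    return max(range(len(segment)), key=lambda i: score(segment[i]))

# Hypothetical one-dimensional frame features.
segment = [[0.0], [1.0], [2.0]]
other_frames = [[0.0], [0.5]]
plain_choice = discriminative_keyframe(segment, [])
discriminative_choice = discriminative_keyframe(segment, other_frames)
```

With no other segments, the most central frame wins; once the competing frames are taken into account, the choice moves to the frame that looks least like them.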

An Online Video Composition System

Publication Details
  • IEEE International Conference on Multimedia & Expo July 6-8, 2005, Amsterdam, The Netherlands
  • Jul 6, 2005

Abstract

This paper presents an information-driven online video composition system. The composition work handled by the system includes dynamically setting multiple pan/tilt/zoom (PTZ) cameras to proper poses and selecting the best close-up view for passive viewers. The main idea of the composition system is to maximize captured video information with limited cameras. Unlike video composition based on heuristic rules, our video composition is formulated as a process of minimizing distortions between ideal signals (i.e., signals with infinite spatial-temporal resolution) and displayed signals. The formulation is consistent with many well-known empirical approaches widely used in previous systems and may provide analytical explanations to those approaches. Moreover, it provides a novel approach for studying video composition tasks systematically. The composition system allows each user to select a personal close-up view. It manages PTZ cameras and a video switcher based on both signal characteristics and users' view selections. Additionally, it can automate the video composition process based on past users' view selections when immediate selections are not available. We demonstrate the performance of this system with real meetings.
Publication Details
  • CHI 2005 Extended Abstracts, ACM Press, pp. 1395-1398
  • Apr 1, 2005

Abstract

We present a search interface for large video collections with time-aligned text transcripts. The system is designed for users such as intelligence analysts who need to quickly find video clips relevant to a topic expressed in text and images. A key component of the system is a powerful and flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The interface displays search results in ranked sets of story keyframe collages, and lets users explore the shots in a story. By adapting the keyframe collages based on query relevance and indicating which portions of the video have already been explored, we enable users to quickly find relevant sections. We tested our system as part of the NIST TRECVID interactive search evaluation, and found that our user interface enabled users to find more relevant results within the allotted time than those of many systems employing more sophisticated analysis techniques.

Improving Proactive Information Systems

Publication Details
  • International Conference on Intelligent User Interfaces (IUI 2005)
  • Jan 9, 2005

Abstract

Proactive contextual information systems help people locate information by automatically suggesting potentially relevant resources based on their current tasks or interests. Such systems are becoming increasingly popular, but designing user interfaces that effectively communicate recommended information is a challenge: the interface must be unobtrusive, yet communicate enough information at the right time to provide value to the user. In this paper we describe our experience with the FXPAL Bar, a proactive information system designed to provide contextual access to corporate and personal resources. In particular, we present three features designed to communicate proactive recommendations more effectively: translucent recommendation windows increase the user's awareness of particularly highly-ranked recommendations, query term highlighting communicates the relationship between a recommended document and the user's current context, and a novel recommendation digest function allows users to return to the most relevant previously recommended resources. We present empirical evidence supporting our design decisions and relate lessons learned for other designers of contextual recommendation systems.
2004

Contextual Lexical Valence Shifters

Publication Details
  • Yan Qu, James Shanahan, and Janyce Wiebe, Cochairs. 2004. Exploring Attitude and Affect in Text: Theories and Applications. Technical Report SS-04-07, AAAI Press, ISBN 1-57735-219-x
  • Dec 6, 2004
Publication Details
  • Springer Lecture Notes in Computer Science - Advances in Multimedia Information Processing, Proc. PCM 2004 5th Pacific Rim Conference on Multimedia, Tokyo, Japan
  • Dec 1, 2004

Abstract

For some years, our group at FX Palo Alto Laboratory has been developing technologies to support meeting recording, collaboration, and videoconferencing. This paper presents several systems that use video as an active interface, allowing remote devices and information to be accessed "through the screen." For example, SPEC enables collaborative and automatic camera control through an active video window. The NoteLook system allows a user to grab an image from a computer display, annotate it with digital ink, then drag it to that or a different display. The ePIC system facilitates natural control of multi-display and multi-device presentation spaces, while the iLight system allows remote users to "draw" with light on a local object. All our systems serve as platforms for researching more sophisticated algorithms to support additional functionality and ease of use.
Publication Details
  • ACM Multimedia 2004
  • Oct 28, 2004

Abstract

In this paper, we compare several recent approaches to video segmentation using pairwise similarity. We first review and contrast the approaches within the common framework of similarity analysis and kernel correlation. We then combine these approaches with non-parametric supervised classification for shot boundary detection. Finally, we discuss comparative experimental results using the 2002 TRECVID shot boundary detection test collection.

Who cares? Reflecting who is reading what on distributed community bulletin boards

Publication Details
  • UIST 2004, the Seventeenth Annual ACM Symposium on User Interface Software and Technology, October 24-27, 2004
  • Oct 24, 2004

Abstract

In this paper, we describe the YeTi information sharing system that has been designed to foster community building through informal digital content sharing. The YeTi system is a general information parsing, hosting and distribution infrastructure, with interfaces designed for individual and public content reading. We focus on the YeTi public display interface, and in particular on tools we have designed to provide lightweight awareness of others' interactions with and interest in posted content. Our tools augment content with metadata that reflect people's reading of content: captured video clips of who's reading and interacting with content, tools to allow people to leave explicit freehand annotations about content, and a visualization of the content access history to show when content is interacted with. Results from an initial evaluation are presented and discussed.
Publication Details
  • UIST 2004 Companion, pp. 37-38
  • Oct 24, 2004

Abstract

As the size of the typical personal digital photo collection reaches well into the thousands of photos, advanced tools to manage these large collections are more and more necessary. In this demonstration, we present a semi-automatic approach that opportunistically takes advantage of the current state-of-the-art technology in face detection and recognition and combines it with user interface techniques to facilitate the task of labeling people in photos. We show how we use an accurate face detector to automatically extract faces from photos. Instead of having a less accurate face recognizer classify faces, we use it to sort faces by their similarity to a face model. We demonstrate our photo application that uses the extracted faces as UI proxies for actions on the underlying photos along with the sorting strategy to identify candidate faces for quick and easy face labeling.
Publication Details
  • UIST 2004 Companion, pp. 13-14
  • Oct 24, 2004

Abstract

We developed a novel technique for creating visually pleasing collages from photo regions. The technique is called "stained glass" because the resulting collage with irregular shapes is reminiscent of a stained glass window. The collages reuse photos in novel ways to present photos with faces that can be printed, included in Web pages, or shared via email. The poster describes the requirements for creating stained glass visualizations from photos of faces, our approach for creating face stained glass, and techniques used to improve the aesthetics and flexibility of the stained glass generation. Early user feedback on face stained glass has been very positive.

Remote Interactive Graffiti

Publication Details
  • Proc. ACM Multimedia 2004
  • Oct 12, 2004

Abstract

We present an installation that allows distributed internet participants to "draw" on a public scene using light. The iLight system is a camera/projector system designed for remote collaboration. Using a familiar digital drawing interface, remote users "draw" on a live video image of a real-life object or scene. Graphics drawn by the user are then projected onto the scene, where they are visible in the camera image. Because camera distortions are corrected and the video is aligned with the image canvas, drawn graphics appear exactly where desired. Thus the remote users may harmlessly mark a physical object to serve their own artistic and/or expressive needs. We also describe how local participants may interact with remote users through the projected images. Besides the intrinsic "neat factor" of action at a distance, this installation serves as an experiment in how multiple users from different locales and cultures can create a social space that interacts with a physical one, as well as raising issues of free expression in a non-destructive context.
Publication Details
  • Proceedings of the International Workshop on Multimedia Information Retrieval, ACM Press, pp. 99-106
  • Oct 10, 2004

Abstract

With digital still cameras, users can easily collect thousands of photos. We have created a photo management application with the goal of making photo organization and browsing simple and quick, even for very large collections. A particular concern is the management of photos depicting people. We present a semi-automatic approach designed to facilitate the task of labeling photos with people that opportunistically takes advantage of the strengths of current state-of-the-art technology in face detection and recognition. In particular, an accurate face detector is used to automatically extract faces from photos while the less accurate face recognizer is used not to classify the detected faces, but to sort faces by their similarity to a chosen model. This sorting is used to present candidate faces within a user interface designed for quick and easy face labeling. We present results of a simulation of the usage model that demonstrate the improved ease that is achieved by our method.
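The idea of using a recognizer to sort rather than classify can be sketched as follows. This is a minimal illustration, not the application's implementation: the feature vectors, the use of cosine similarity, and the names `cosine_similarity` and `rank_faces` are all assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_faces(model, faces):
    """Return face ids ordered from most to least similar to `model`.

    `faces` maps a face id to a hypothetical feature vector produced
    by some face recognizer; no classification decision is made here.
    """
    return sorted(
        faces,
        key=lambda face_id: cosine_similarity(model, faces[face_id]),
        reverse=True,
    )

# Hypothetical two-dimensional features for three detected faces.
detected = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_faces([1.0, 0.0], detected)
```

Presenting the ranked list lets the user confirm the likely matches near the top instead of trusting a hard yes/no decision from a recognizer that may be wrong.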