Publications

FXPAL publishes in top scientific conferences and journals.

2003
Publication Details
  • Proc. SPIE Storage and Retrieval for Multimedia Databases, Vol. 5021, pp. 167-75
  • Jan 20, 2003

Abstract

We present a framework for analyzing the structure of digital media streams. Though our methods work for video, text, and audio, we concentrate on detecting the structure of digital music files. In the first step, spectral data is used to construct a similarity matrix calculated from inter-frame spectral similarity. The digital audio can be robustly segmented by correlating a kernel along the diagonal of the similarity matrix. Once segmented, spectral statistics of each segment are computed. In the second step, segments are clustered based on the self-similarity of their statistics. This reveals the structure of the digital music in a set of segment boundaries and labels. Finally, the music can be summarized by selecting clusters with repeated segments throughout the piece. The summaries can be customized for various applications based on the structure of the original music.
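
The first step of this abstract (similarity matrix, then a kernel correlated along its diagonal) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or kernel the authors use, so cosine similarity and a checkerboard kernel are assumptions here, and all function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(frames):
    """Pairwise similarity between every pair of frames."""
    return [[cosine_similarity(u, v) for v in frames] for u in frames]


def checkerboard_kernel(half_width):
    """+1 in the two within-segment quadrants, -1 in the two cross quadrants."""
    size = 2 * half_width
    return [
        [1.0 if (i < half_width) == (j < half_width) else -1.0 for j in range(size)]
        for i in range(size)
    ]


def novelty_curve(sim, half_width):
    """Correlate the kernel along the main diagonal of the similarity matrix.

    scores[t] is large when frames t-half_width..t-1 and t..t+half_width-1
    are each self-similar but dissimilar to one another.
    """
    n = len(sim)
    kernel = checkerboard_kernel(half_width)
    scores = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for i in range(2 * half_width):
            for j in range(2 * half_width):
                total += kernel[i][j] * sim[t - half_width + i][t - half_width + j]
        scores[t] = total
    return scores


# Toy input: four frames of one "spectrum" followed by four of another.
frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
scores = novelty_curve(similarity_matrix(frames), half_width=2)
boundary = scores.index(max(scores))
```

On this toy input the score peaks at frame 4, where the content changes. A real system would pick peaks above a threshold rather than a single maximum.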

AttrActive Windows: Active Windows for Pervasive Computing Applications

Publication Details
  • ACM Intelligent User Interface (IUI) 2003, Miami Beach, FL, pp 326
  • Jan 12, 2003

Abstract

We introduce the AttrActive Windows user interface, a novel approach for presenting interactive content on large screen, interactive, digital, bulletin boards. Moving away from the desktop metaphor, AttrActive Windows are dynamic, non-uniform windows that can appear in different orientations and have autonomous behaviours to attract passers-by and invite interactions.
2002
Publication Details
  • IEEE Multimedia Signal Processing Workshop
  • Dec 11, 2002

Abstract

We present a novel approach to automatically extracting summary excerpts from audio and video. Our approach is to maximize the average similarity between the excerpt and the source. We first calculate a similarity matrix by comparing each pair of time samples using a quantitative similarity measure. To determine the segment with highest average similarity, we maximize the summation of the self-similarity matrix over the support of the segment. To select multiple excerpts while avoiding redundancy, we compute the non-negative matrix factorization (NMF) of the similarity matrix into its essential structural components. We then build a summary composed of excerpts from the main components, selecting the excerpts for maximum average similarity within each component. Variations integrating segmentation and other information are also discussed, and experimental results are presented.
Publication Details
  • ACM Multimedia 2002
  • Dec 1, 2002

Abstract

We present methods for automatic and semi-automatic creation of music videos, given an arbitrary audio soundtrack and source video. Significant audio changes are automatically detected; similarly, the source video is automatically segmented and analyzed for suitability based on camera motion and exposure. Video with excessive camera motion or poor contrast is penalized with a high unsuitability score, and is more likely to be discarded in the final edit. High quality video clips are then automatically selected and aligned in time with significant audio changes. Video clips are adjusted to match the audio segments by selecting the most suitable region of the desired length. Besides a fully automated solution, our system can also start with clips manually selected and ordered using a graphical interface. The video is then created by truncating the selected clips (preserving the high quality portions) to produce a video digest that is synchronized with the soundtrack music, thus enhancing the impact of both.
Publication Details
  • ACM Multimedia 2002
  • Dec 1, 2002

Abstract

FlySPEC is a video camera system designed for real-time remote operation. A hybrid design combines the high resolution possible using an optomechanical video camera with the wide field of view always available from a panoramic camera. The control system integrates requests from multiple users with the result that each controls a virtual camera. The control system seamlessly integrates manual and fully automatic control. It supports a range of options from untended automatic to full manual control, and the system can learn control strategies from user requests. Additionally, the panoramic view is always available for an intuitive interface, and objects are never out of view regardless of the zoom factor. We present the system architecture, an information-theoretic approach to combining panoramic and zoomed images to optimally satisfy user requests, and experimental results that show the FlySPEC system significantly assists users in remote inspection tasks.
Publication Details
  • ACM 2002 Conference on Computer Supported Cooperative Work
  • Nov 16, 2002

Abstract

Technology can play an important role in enabling people to interact with each other. The Web is one such technology with the affordances for sharing information and for connecting people to people. In this paper, we describe the design of two social interaction Web sites for two different social groups. We review several related efforts to provide principles for creating social interaction environments and describe the specific principles that guided our design. To examine the effectiveness of the two sites, we analyze the usage data. Finally, we discuss approaches for encouraging participation and lessons learned.

Moving Markup: Repositioning Freeform Annotations

Publication Details
  • Proceedings of ACM UIST 2002
  • Oct 27, 2002

Abstract

Freeform digital ink annotation allows readers to interact with documents in an intuitive and familiar manner. Such marks are easy to manage on static documents, and provide a familiar annotation experience. In this paper, we describe an implementation of a freeform annotation system that accommodates dynamic document layout. The algorithm preserves the correct position of annotations when documents are viewed with different fonts or font sizes, with different aspect ratios, or on different devices. We explore a range of heuristics and algorithms required to handle common types of annotation, and conclude with a discussion of possible extensions to handle special kinds of annotations and changes to documents.
Publication Details
  • IEEE InfoVis '02 Interactive Poster and Demo
  • Oct 27, 2002

Abstract

This work presents constructs called interactive space-time maps along with an application called the SpaceTime Browser for visualizing and retrieving documents. A 3D visualization with 2D planar maps and a time line is employed. Users can select regions on the maps and choose precise time intervals by sliding the maps along the telescopic time line. Regions are highlighted to indicate the presence of documents with matching space-time attributes, and documents are retrieved and displayed in an adjoining workspace. We provide two examples: (1) organizing travel photos, and (2) managing documents created by room location-aware devices in a building.

Context-Aware Communication

Publication Details
  • IEEE Wireless Communications Magazine, Vol. 9, No. 5.
  • Oct 15, 2002

Abstract

This paper describes how the changing information about an individual's location, environment, and social situation can be used to initiate and facilitate people's interactions with one another, individually and in groups. Context-aware communication is contrasted with other forms of context-aware computing and we characterize applications in terms of design decisions along two dimensions: the extent of autonomy in context sensing and the extent of autonomy in communication action. A number of context-aware communication applications from the research literature are presented in five application categories. Finally, a number of issues related to the design of context-aware communication applications are presented.

Web Interaction Using Very Small Internet Devices

Publication Details
  • IEEE Computer Magazine, Cover Feature, Vol. 35, No. 10.
  • Oct 15, 2002

Abstract

Squeezing desktop Web content into smart phones and text pagers is more practical with separate interfaces for navigation and content manipulation. m-Links, a middleware proxy system, supports this dual-mode browsing, offering phonetop users an extendable set of actions.
Publication Details
  • 2002 International Symposium on Music Information Retrieval
  • Oct 13, 2002

Abstract

We present methods for automatically producing summary excerpts or thumbnails of music. To find the most representative excerpt, we maximize the average segment similarity to the entire work. After window-based audio parameterization, a quantitative similarity measure is calculated between every pair of windows, and the results are embedded in a 2-D similarity matrix. Summing the similarity matrix over the support of a segment results in a measure of how similar that segment is to the whole. This measure is maximized to find the segment that best represents the entire work. We discuss variations on the method, and present experimental results for orchestral music, popular songs, and jazz. These results demonstrate that the method finds significantly representative excerpts, using very few assumptions about the source audio.
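
The selection rule described in this abstract (sum the similarity matrix over the support of a candidate segment, then maximize) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the similarity matrix is taken as given, only fixed-length segments are considered, and the function name `best_excerpt` is hypothetical.

```python
def best_excerpt(sim, length):
    """Return (start, score) of the fixed-length segment most similar to the whole.

    sim is an n-by-n similarity matrix (list of lists). The score of a segment
    is the sum of sim over the segment's rows and all columns, normalized by
    the number of entries summed, i.e. the average similarity between the
    segment and the entire work.
    """
    n = len(sim)
    # Similarity of each window to the whole work, computed once.
    row_sums = [sum(row) for row in sim]
    best_start, best_score = 0, float("-inf")
    for start in range(n - length + 1):
        score = sum(row_sums[start:start + length]) / (n * length)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy input: windows labeled by section, with similarity 1 for matching labels.
labels = list("BAAACAAA")
sim = [[1.0 if a == b else 0.0 for b in labels] for a in labels]
start, score = best_excerpt(sim, length=2)
```

On this toy input the first two-window excerpt made entirely of the dominant "A" material starts at index 1 and scores 0.75. Ties are resolved in favor of the earliest segment, which is a choice made for the sketch only.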

Audio Retrieval by Rhythmic Similarity

Publication Details
  • 2002 International Symposium on Music Information Retrieval
  • Oct 13, 2002

Abstract

We present a method for characterizing both the rhythm and tempo of music. We also present ways to quantitatively measure the rhythmic similarity between two or more works of music. This allows rhythmically similar works to be retrieved from a large collection. A related application is to sequence music by rhythmic similarity, thus providing an automatic "disc jockey" function for musical libraries. Besides specific analysis and retrieval methods, we present small-scale experiments that demonstrate ranking and retrieving musical audio by rhythmic similarity.
Publication Details
  • The 4th International Conference on Ubiquitous Computing (UbiComp 2002).
  • Sep 29, 2002

Abstract

As ubiquitous computing becomes widespread, we are increasingly coming into contact with "shared" computer-enhanced devices, such as cars, televisions, and photocopiers. Our interest is in identifying general issues in personalizing such shared everyday devices. Our approach is to compare alternative personalization methods by deploying and using alternative personalization interfaces (portable and embedded) for three shared devices in our workplace (a presentation PC, a plasma display for brainstorming, and a multi-function copier). This paper presents the comparative prototyping methodology we employed, the experimental system we deployed, observations and feedback from use, and resulting issues in designing personalized shared ubiquitous devices.
Publication Details
  • Workshop on User centered Evaluations for Ubiquitous Computing Systems: Best Known Methods, The 4th International Conference on Ubiquitous Computing (UbiComp 2002).
  • Sep 29, 2002

Abstract

Evaluating ubiquitous systems is hard, and has attracted the attention of others in the research community. These investigators, like others in CSCW, argue there is a basic mismatch between traditional evaluation techniques and the needs posed by ubiquitous systems. Namely, these systems are embedded in a variety of complex real world environments that cannot be easily modeled (as required by theoretical analyses), simulated, measured, or controlled (as required by laboratory experiments). As a result, many investigators have abandoned traditional comparative evaluation techniques and opted instead for techniques adapted from the social sciences, such as anthropology. We wanted to perform a comparative evaluation similar to a laboratory experiment, but in such a way that we could observe the effects of our design decisions in relatively unconstrained, real world use. This led us to the process described in this paper.

Low-Resolution Supplementary Tactile Cues for Navigational Assistance

Publication Details
  • In proceedings of Mobile HCI 2002 (Pisa, Italy, 2002), Springer-Verlag, Lecture Notes in Computer Science #2411, pp. 369-372.
  • Sep 18, 2002

Abstract

The TactGuide is a mobile navigation device 'displaying' personalized direction cues by means of a tactile and 'tactful' representation. The TactGuide is operated by tactile inspection which is subtle enough to allow the users to engage/disengage in device interaction while preserving their visual, auditory and kinesthetic senses for inspection of the environment. The TactGuide design thereby accommodates the users' need to economize their attentional resources between device and environment while navigating through physical space. Preliminary experiments indicate that users readily map the tactile cues to spatial directions and that TactGuide can be operated as a supplement to, and without compromising, the use of our existing wayfinding abilities.
Publication Details
  • Journal of Mathematical Physics, September 2002 special issue on Quantum Information Theory, Vol. 43 (9), pp. 4376 - 7381.
  • Sep 7, 2002

Abstract

To implement any quantum operation (a.k.a. "superoperator" or "CP map") on a d-dimensional quantum system, it is enough to apply a suitable overall unitary transformation to the system and a d^2-dimensional environment which is initialized in a fixed pure state. It has been suggested that a d-dimensional environment might be enough if we could initialize the environment in a mixed state of our choosing. In this note we show with elementary means that certain explicit quantum operations cannot be realized in this way. Our counterexamples map some pure states to pure states, giving strong and easily manageable conditions on the overall unitary transformation. Everything works in the more general setting of quantum operations from d-dimensional to d'-dimensional spaces, so we place our counterexamples within this more general framework.
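
For orientation, the two realizations contrasted in this abstract can be written out in standard notation. The symbols below ($\mathcal{E}$, $U$, $\sigma$, and the fixed state $|0\rangle$) are chosen here for illustration and do not come from the abstract itself.

```latex
% Standard realization: environment E of dimension d^2 in a fixed pure state
\mathcal{E}(\rho) = \operatorname{Tr}_E\!\left[ U \left( \rho \otimes |0\rangle\langle 0| \right) U^\dagger \right],
\qquad \dim E = d^2 .

% Suggested alternative: environment of dimension d in a chosen mixed state \sigma
\mathcal{E}(\rho) = \operatorname{Tr}_E\!\left[ U \left( \rho \otimes \sigma \right) U^\dagger \right],
\qquad \dim E = d .
```

The first form suffices for every quantum operation on a d-dimensional system; the note's claim is that the second form does not.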

Publication Details
  • Proceedings IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, August 2002
  • Aug 26, 2002

Abstract

We present a method for rapidly and robustly extracting audio excerpts without the overhead of speech recognition or speaker segmentation. An immediate application is to automatically augment keyframe-based video summaries with informative audio excerpts associated with the video segments represented by the keyframes. Short audio clips combined with keyframes comprise an extremely lightweight and Web-browsable interface for auditioning video or similar media, without using bandwidth-intensive streaming video or audio.
Publication Details
  • IEEE International Conference on Multimedia and Expo 2002
  • Aug 26, 2002

Abstract

This paper presents a camera system called FlySPEC. In contrast to a traditional camera system that provides the same video stream to every user, FlySPEC can simultaneously serve different video-viewing requests. This flexibility allows users to conveniently participate in a seminar or meeting at their own pace. Meanwhile, the FlySPEC system provides a seamless blend of manual control and automation. With this control mix, users can easily make tradeoffs between video capture effort and video quality. The FlySPEC camera is constructed by installing a set of Pan/Tilt/Zoom (PTZ) cameras near a high-resolution panoramic camera. While the panoramic camera provides the basic functionality of serving different viewing requests, the PTZ camera is managed by our algorithm to improve the overall video quality that may affect users watching details. The video resolution improvements from using different camera management strategies are compared in the experimental section.

Detecting Path Intersections in Panoramic Video

Publication Details
  • IEEE International Conference on Multimedia and Expo 2002
  • Aug 26, 2002

Abstract

Given panoramic video taken along a self-intersecting path, we present a method for detecting the intersection points. This allows "virtual tours" to be synthesized by splicing the panoramic video at the intersection points. Spatial intersections are detected by finding the best-matching panoramic images from a number of nearby candidates. Each panoramic image is segmented into horizontal strips. Each strip is averaged in the vertical direction. The Fourier coefficients of the resulting 1-D data capture the rotation-invariant horizontal texture of each panoramic image. The distance between two panoramic images is calculated as the sum of the distances between their strip texture pairs at the same row positions. The intersection is chosen as the two candidate panoramic images that have the minimum distance.
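
The strip-based distance described in this abstract can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: rotation invariance is obtained by keeping only Fourier magnitudes (a camera rotation is a circular horizontal shift of the panorama, which changes only the phase), strips are compared with Euclidean distance, and the function names and parameters are hypothetical.

```python
import cmath
import math


def strip_signature(image, num_strips, num_coeffs):
    """Per-strip Fourier magnitudes of a panoramic image (list of pixel rows).

    Each horizontal strip is averaged vertically into a 1-D profile, and the
    magnitudes of its first num_coeffs DFT coefficients are kept.
    """
    width = len(image[0])
    strip_height = len(image) // num_strips
    signature = []
    for s in range(num_strips):
        strip = image[s * strip_height:(s + 1) * strip_height]
        profile = [sum(row[x] for row in strip) / strip_height for x in range(width)]
        magnitudes = [
            abs(sum(profile[x] * cmath.exp(-2j * math.pi * k * x / width)
                    for x in range(width)))
            for k in range(num_coeffs)
        ]
        signature.append(magnitudes)
    return signature


def panorama_distance(sig_a, sig_b):
    """Sum of distances between strips at the same row positions."""
    return sum(math.dist(a, b) for a, b in zip(sig_a, sig_b))


def closest_pair(signatures_a, signatures_b):
    """Indices (i, j) of the candidate pair with minimum distance."""
    pairs = ((i, j) for i in range(len(signatures_a)) for j in range(len(signatures_b)))
    return min(pairs, key=lambda p: panorama_distance(signatures_a[p[0]], signatures_b[p[1]]))
```

With these definitions, a panorama and a horizontally rotated copy of it have a distance of approximately zero, so two frames captured at the same location but with different camera headings can still match.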
Publication Details
  • SPIE ITCOM 2002
  • Jul 31, 2002

Abstract

We present a framework, motivated by rate-distortion theory and the human visual system, for optimally representing the real world given limited video resolution. To provide users with high fidelity views, we built a hybrid video camera system that combines a fixed wide-field panoramic camera with a controllable pan/tilt/zoom (PTZ) camera. In our framework, a video frame is viewed as a limited-frequency representation of some "true" image function. Our system combines outputs from both cameras to construct the highest fidelity views possible, and controls the PTZ camera to maximize information gain available from higher spatial frequencies. In operation, each remote viewer is presented with a small panoramic view of the entire scene, and a larger close-up view of a selected region. Users may select a region by marking the panoramic view. The system operates the PTZ camera to best satisfy requests from multiple users. When no regions are selected, the system automatically operates the PTZ camera to minimize predicted video distortion. High-resolution images are cached and sent if a previously recorded region has not changed and the PTZ camera is pointed elsewhere. We present experiments demonstrating that the panoramic image can effectively predict where to gain the most information, and also that the system provides better images to multiple users than conventional camera systems.

Communication and Understanding for Decision Support

Publication Details
  • Proceedings of the IFIP International Conference on Decision Making and Decision Support in the Internet Age
  • Jul 4, 2002

Abstract

As the technology for communication changes, the role of communication in the conduct of business changes with it. Communication is no longer just a technical matter of separating signal from noise and managing bandwidth but also a social matter in which negotiating differences in understanding among and between communicators is a primary business priority. Addressing this priority requires an understanding of how individuals interact in the course of their decision making activities. Using the work of Anthony Giddens as a point of departure, this paper views interaction in communication as consisting of three dimensions - meaning, authority, and trust. These three dimensions are used to identify new opportunities for advances in decision making technology which help deal with potential breakdowns in social interaction.