Publications

FXPAL publishes in top scientific conferences and journals.

2007

DOTS: Support for Effective Video Surveillance

Publication Details
  • Fuji Xerox Technical Report No. 17, pp. 83-100
  • Nov 1, 2007

Abstract

DOTS (Dynamic Object Tracking System) is an indoor, real-time, multi-camera surveillance system, deployed in a real office setting. DOTS combines video analysis and user interface components to enable security personnel to effectively monitor views of interest and to perform tasks such as tracking a person. The video analysis component performs feature-level foreground segmentation with reliable results even under complex conditions. It incorporates an efficient greedy-search approach for tracking multiple people through occlusion and combines results from individual cameras into multi-camera trajectories. The user interface draws the users' attention to important events that are indexed for easy reference. Different views within the user interface provide spatial information for easier navigation. DOTS, with over twenty video cameras installed in hallways and other public spaces in our office building, has been in constant use for a year. Our experiences led to many changes that improved performance in all system components.
Publication Details
  • UIST 2007 Poster & Demo
  • Oct 7, 2007

Abstract

We are exploring the use of collaborative games to generate meaningful textual tags for photos. We have designed PhotoPlay to take advantage of the social engagement typical of board games and provide a collocated ludic environment conducive to the creation of text tags. We evaluated PhotoPlay and found that it was fun and socially engaging for players. The milieu of the game also facilitated playing with personal photos, which resulted in more specific tags such as named entities than when playing with randomly selected online photos. Players also had a preference for playing with personal photos.
Publication Details
  • TRECVID Video Summarization Workshop at ACM Multimedia 2007
  • Sep 28, 2007

Abstract

This paper describes a system for selecting excerpts from unedited video and presenting the excerpts in a short summary video for efficiently understanding the video contents. Color and motion features are used to divide the video into segments where the color distribution and camera motion are similar. Segments with and without camera motion are clustered separately to identify redundant video. Audio features are used to identify clapboard appearances for exclusion. Representative segments from each cluster are selected for presentation. To increase the original material contained within the summary and reduce the time required to view the summary, selected segments are played back at a higher rate based on the amount of detected camera motion in the segment. Pitch-preserving audio processing is used to better capture the sense of the original audio. Metadata about each segment is overlaid on the summary to help the viewer understand the context of the summary segments in the original video.
Publication Details
  • ICDSC 2007, pp. 132-139
  • Sep 25, 2007

Abstract

Our analysis and visualization tools use 3D building geometry to support surveillance tasks. These tools are part of DOTS, our multicamera surveillance system, which has over 20 cameras spread throughout the public spaces of our building. The geometric input to DOTS is a floor plan and information such as cubicle wall heights. From this input we construct a 3D model and an enhanced 2D floor plan that are the bases for more specific visualization and analysis tools. Foreground objects of interest can be placed within these models and dynamically updated in real time across camera views. Alternatively, a virtual first-person view suggests what a tracked person can see as she moves about. Interactive visualization tools support complex camera-placement tasks. Extrinsic camera calibration is supported both by visualizations of parameter adjustment results and by methods for establishing correspondences between image features and the 3D model.

DOTS: Support for Effective Video Surveillance

Publication Details
  • ACM Multimedia 2007, pp. 423-432
  • Sep 24, 2007

Abstract

DOTS (Dynamic Object Tracking System) is an indoor, real-time, multi-camera surveillance system, deployed in a real office setting. DOTS combines video analysis and user interface components to enable security personnel to effectively monitor views of interest and to perform tasks such as tracking a person. The video analysis component performs feature-level foreground segmentation with reliable results even under complex conditions. It incorporates an efficient greedy-search approach for tracking multiple people through occlusion and combines results from individual cameras into multi-camera trajectories. The user interface draws the users' attention to important events that are indexed for easy reference. Different views within the user interface provide spatial information for easier navigation. DOTS, with over twenty video cameras installed in hallways and other public spaces in our office building, has been in constant use for a year. Our experiences led to many changes that improved performance in all system components.
Publication Details
  • IEEE Intl. Conf. on Semantic Computing
  • Sep 17, 2007

Abstract

We present methods for semantic annotation of multimedia data. The goal is to detect semantic attributes (also referred to as concepts) in clips of video via analysis of a single keyframe or set of frames. The proposed methods integrate high performance discriminative single concept detectors in a random field model for collective multiple concept detection. Furthermore, we describe a generic framework for semantic media classification capable of capturing arbitrary complex dependencies between the semantic concepts. Finally, we present initial experimental results comparing the proposed approach to existing methods.
Publication Details
  • Workshop at Ubicomp 2007
  • Sep 16, 2007

Abstract

The past two years at UbiComp, our workshops on design and usability in next generation conference rooms engendered lively conversations in the community of people working in smart environments. The community is clearly vital and growing. This year we would like to build on the energy from previous workshops while taking on a more interactive and exploratory format. The theme for this workshop is "embodied meeting support" and includes three tracks: mobile interaction, tangible interaction, and sensing in smart environments. We encourage participants to present work that focuses on one track or that attempts to bridge multiple tracks.

FXPAL MediaMagic Video Search System

Publication Details
  • ACM Conf. on Image and Video Retrieval 2007
  • Jul 29, 2007

Abstract

This paper describes FXPAL's interactive video search application, "MediaMagic". FXPAL has participated in the TRECVID interactive search task since 2004. In our search application we employ a rich set of redundant visual cues to help the searcher quickly sift through the video collection. A central element of the interface and underlying search engine is a segmentation of the video into stories, which allows the user to quickly navigate and evaluate the relevance of moderately-sized, semantically-related chunks.
Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

The recent emergence of multi-core processors enables a new trend in the usage of computers. Computer vision applications, which require heavy computation and lots of bandwidth, usually cannot run in real-time. Recent multi-core processors can potentially serve the needs of such workloads. In addition, more advanced algorithms can be developed utilizing the new computation paradigm. In this paper, we study the performance of an articulated body tracker on multi-core processors. The articulated body tracking workload encapsulates most of the important aspects of a computer vision workload. It takes multiple camera inputs of a scene with a single human object, extracts useful features, and performs statistical inference to find the body pose. We show the importance of properly parallelizing the workload in order to achieve great performance: speedups of 26 on 32 cores. We conclude that: (1) data-domain parallelization is better than function-domain parallelization for computer vision applications; (2) data-domain parallelism by image regions and particles is very effective; (3) reducing serial code in edge detection brings significant performance improvements; (4) domain knowledge about low/mid/high level of vision computation is helpful in parallelizing the workload.

Featured Wand for 3D Interaction

Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

Our featured wand, automatically tracked by video cameras, provides an inexpensive and natural way for users to interact with devices such as large displays. The wand supports six degrees of freedom for manipulation of 3D applications like Google Earth. Our system uses a 'line scan' to estimate the wand pose, which simplifies processing. Several applications are demonstrated.
Publication Details
  • ICME 2007, pp. 1015-1018
  • Jul 2, 2007

Abstract

We describe a new interaction technique that allows users to control nonlinear video playback by directly manipulating objects seen in the video. This interaction technique is similar to video "scrubbing" where the user adjusts the playback time by moving the mouse along a slider. Our approach is superior to variable-scale scrubbing in that the user can concentrate on interesting objects and does not have to guess how long the objects will stay in view. Our method relies on a video tracking system that tracks objects in fixed cameras, maps them into 3D space, and handles hand-offs between cameras. In addition to dragging objects visible in video windows, users may also drag iconic object representations on a floor plan. In that case, the best video views are selected for the dragged objects.
Publication Details
  • ICME 2007, pp. 675-678
  • Jul 2, 2007

Abstract

In this paper we describe the analysis component of an indoor, real-time, multi-camera surveillance system. The analysis includes: (1) a novel feature-level foreground segmentation method which achieves efficient and reliable segmentation results even under complex conditions, (2) an efficient greedy-search-based approach for tracking multiple people through occlusion, and (3) a method for multi-camera handoff that associates individual trajectories in adjacent cameras. The analysis is used for an 18-camera surveillance system that has been running continuously in an indoor business environment over the past several months. Our experiments demonstrate that the processing method for people detection and tracking across multiple cameras is fast and robust.

POEMS: A Paper Based Meeting Service Management Tool

Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

As more and more tools are developed for meeting support tasks, properly using these tools to get expected results becomes too complicated for many meeting participants. To address this problem, we propose POEMS (Paper Offered Environment Management Service) that allows meeting participants to control services in a meeting environment through a digital pen and an environment photo on digital paper. Unlike state-of-the-art device control interfaces that require interaction with text commands, buttons, or other artificial symbols, our photo-enabled service access is more intuitive. Compared with PC and PDA supported control, this new approach is more flexible and cheaper. With this system, a meeting participant can initiate a whiteboard on a selected public display by tapping the display image in the photo, or print out a display by drawing a line from the display image to a printer image in the photo. The user can also control video or other active applications on a display by drawing a link between a printed controller and the image of the display. This paper presents the system architecture, implementation tradeoffs, and various meeting control scenarios.
Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

As more and more tools are developed for meeting support tasks, properly using these tools to get expected results becomes very complicated for many meeting participants. To address this problem, we propose POEMS (Paper Offered Environment Management Service) that can facilitate the activation of various services with a pen-and-paper-based interface. With this tool, meeting participants can control meeting support devices on the same paper on which they take notes. Additionally, a meeting participant can share his/her paper drawings on a selected public display or initiate a collaborative discussion on a selected public display with a page of paper. Compared with traditional interfaces, such as tablet PC or PDA based interfaces, the interface of this tool has much higher resolution and is much cheaper and easier to deploy. The paper interface is also natural to use for ordinary people.
Publication Details
  • IEEE Pervasive Computing Magazine, Vol. 6, No. 3, Jul-Sep 2007.
  • Jul 1, 2007

Abstract

AnySpot is a web service-based platform for seamlessly connecting people to their personal and shared documents wherever they go. We describe the principles behind AnySpot's design and report our experience deploying it in a large, multi-national organization.
Publication Details
  • Pervasive 2007 Invited Demo
  • May 13, 2007

Abstract

We present an investigation of interaction models for slideshow applications in a multi-display environment. Three models are examined: Direct Manipulation, Billiard Ball, and Flow. These concepts can be demonstrated by the ModSlideShow prototype, which is designed as a configurable modular display system where each display unit communicates with its neighbors and fundamental operations that act locally can be composed to support the higher level interaction models. We also describe the gesture input scheme, animation feedback, and other enhancements.
Publication Details
  • CHI 2007, pp. 1167-1176
  • Apr 28, 2007

Abstract

A common video surveillance task is to keep track of people moving around the space being monitored. It is often difficult to track activity between cameras because locations such as hallways in office buildings can look quite similar and do not indicate the spatial proximity of the cameras. We describe a spatial video player that orients nearby video feeds with the field of view of the main playing video to aid in tracking between cameras. This is compared with the traditional bank of cameras with and without interactive maps for identifying and selecting cameras. We additionally explore the value of static and rotating maps for tracking activity between cameras. The study results show that both the spatial video player and the map improve user performance when compared to the camera-bank interface. Also, subjects change cameras more often with the spatial player than either the camera bank or the map, when available.
Publication Details
  • CHI 2007
  • Apr 28, 2007

Abstract

We present the iterative design of Momento, a tool that provides integrated support for situated evaluation of ubiquitous computing applications. We derived requirements for Momento from a user-centered design process that included interviews, observations and field studies of early versions of the tool. Motivated by our findings, Momento supports remote testing of ubicomp applications, helps with participant adoption and retention by minimizing the need for new hardware, and supports mid-to-long term studies to address infrequently occurring data. Also, Momento can gather log data, experience sampling, diary, and other qualitative data.

Video Segmentation via Temporal Pattern Classification

Publication Details
  • IEEE Transactions on Multimedia
  • Apr 1, 2007

Abstract

We present a general approach to temporal media segmentation using supervised classification. Given standard low-level features representing each time sample, we build intermediate features via pairwise similarity. The intermediate features comprehensively characterize local temporal structure, and are input to an efficient supervised classifier to identify shot boundaries. We integrate discriminative feature selection based on mutual information to enhance performance and reduce processing requirements. Experimental results using large-scale test sets provided by the TRECVID evaluations for abrupt and gradual shot boundary detection are presented, demonstrating excellent performance.
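
The intermediate-feature step described in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity and a symmetric window of `lag` samples around each time index; the function names `cosine` and `pairwise_features` are hypothetical and are not taken from the paper. The supervised classifier and the mutual-information feature selection are not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity between two low-level feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def pairwise_features(frames, t, lag):
    """Similarities between every pair of samples within `lag` of time t.

    The returned list is the intermediate feature vector for time t
    (an illustrative choice of window and similarity measure).
    """
    lo, hi = max(0, t - lag), min(len(frames), t + lag + 1)
    window = frames[lo:hi]
    return [cosine(window[i], window[j])
            for i in range(len(window))
            for j in range(i + 1, len(window))]

# Two samples of one "shot" followed by two samples of another.
frames = [[1, 0], [1, 0], [0, 1], [0, 1]]
features_at_1 = pairwise_features(frames, t=1, lag=1)
```

In this toy input, low similarities between samples on either side of index 1 and 2 are the kind of local pattern a classifier could learn to label as a boundary.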

Abstract

3D renderings can often look cold and impersonal or even cartoonish. They can also appear too crisply detailed. This can cause viewers to concentrate on specific details when they should be focusing on a more general idea or concept. With the techniques covered in this tutorial you will be able to turn your 3D renderings into "hand drawn" looking illustrations.

Context-Aware Telecommunication Services

Publication Details
  • UNESCO Encyclopedia of Life Support Systems
  • Apr 1, 2007

Abstract

This chapter describes how the changing information about an individual's location, environment, and social situation can be used to initiate and facilitate people's interactions with one another, individually and in groups. Context-aware communication is contrasted with other forms of context-aware computing and we characterize applications in terms of design decisions along two dimensions: the extent of autonomy in context sensing and the extent of autonomy in communication action. A number of context-aware communication applications from the research literature are presented in five application categories. Finally, a number of issues related to the design of context-aware communication applications are presented.
Publication Details
  • Proceedings of the AAAI Spring Symposium 2007 on Quantum Interaction, organized by Keith van Rijsbergen, Peter Bruza, Bill Lawless, and Don Sofge
  • Mar 26, 2007

Abstract

This survey, aimed at information processing researchers, highlights intriguing but lesser known results, corrects misconceptions, and suggests research areas. Themes include: certainty in quantum algorithms; the "fewer worlds" theory of quantum mechanics; quantum learning; probability theory versus quantum mechanics.
Publication Details
  • Book chapter in: A Document (Re)turn. Contributions from a Research Field in Transition (Taschenbuch), Roswitha Skare, Niels Windfeld Lund, Andreas Vårheim (eds.), Peter Lang Publishing, Incorporated, 2007.
  • Feb 19, 2007

Abstract

When people are checking in to flights, making reports to their company manager, composing music, delivering papers for exams in schools, or examining patients in hospitals, they all deal with documents and processes of documentation. In earlier times, documentation took place primarily in libraries and archives. While the latter are still important document institutions, documents today play a far more essential role in social life in many different domains and cultures. In this book, which celebrates the ten year anniversary of documentation studies in Tromsø, experts from many different disciplines, professional domains as well as cultures around the world present their way of dealing with documents, demonstrating many potential directions for the emerging broad field of documentation studies.

Adaptive News Access

Publication Details
  • Book chapter in "The Adaptive Web: Methods and Strategies of Web Personalization" (Springer, LNCS #4321)
  • Feb 1, 2007

Abstract

This chapter describes how the adaptive web technologies discussed in this book have been applied to news access. First, we provide an overview of different types of adaptivity in the context of news access and identify corresponding algorithms. For each adaptivity type, we briefly discuss representative systems that use the described techniques. Next, we discuss an in-depth case study of a personalized news system. As part of this study, we outline a user modeling approach specifically designed for news personalization, and present results from an evaluation that attempts to quantify the effect of adaptive news access from a user perspective. We conclude by discussing recent trends and novel systems in the adaptive news space.

Content-based Recommendation Systems

Publication Details
  • Book chapter in "The Adaptive Web: Methods and Strategies of Web Personalization" (Springer, LNCS #4321)
  • Feb 1, 2007

Abstract

This chapter discusses content-based recommendation systems, i.e., systems that recommend an item to a user based upon a description of the item and a profile of the user's interests. Content-based recommendation systems may be used in a variety of domains, ranging from recommending web pages, news articles, restaurants, and television programs to items for sale. Although the details of various systems differ, content-based recommendation systems share in common a means for describing the items that may be recommended, a means for creating a profile of the user that describes the types of items the user likes, and a means of comparing items to the user profile to determine what to recommend. The user profile is often created and updated automatically in response to feedback on the desirability of items that have been presented to the user.
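
The three shared components named in this abstract (item descriptions, a user profile, and a comparison between them) can be sketched as follows. This is a minimal illustration, assuming items are described by weighted keywords, the profile is a rating-weighted sum of the keywords of rated items, and comparison uses cosine similarity. The names `build_profile` and `recommend` and the sample data are hypothetical and do not come from the chapter.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse keyword-weight mappings."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_profile(items, feedback):
    """Accumulate keyword weights of rated items, scaled by the rating."""
    profile = Counter()
    for name, rating in feedback.items():
        for term, weight in items[name].items():
            profile[term] += rating * weight
    return profile

def recommend(items, profile, seen):
    """Rank unseen items by similarity to the user profile, best first."""
    candidates = [n for n in items if n not in seen]
    return sorted(candidates, key=lambda n: cosine(items[n], profile),
                  reverse=True)

# Hypothetical item descriptions and a single positive rating.
items = {
    "a": {"jazz": 1.0},
    "b": {"jazz": 1.0, "live": 1.0},
    "c": {"sports": 1.0},
}
profile = build_profile(items, {"a": 1.0})
ranking = recommend(items, profile, seen={"a"})
```

Calling `build_profile` again after each new rating corresponds to the automatic profile update in response to feedback that the abstract mentions.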
Publication Details
  • PSD Magazine 2/2007 - Photoshop Art & Special Effects
  • Feb 1, 2007

Abstract

With the techniques covered in this tutorial you will be able to produce two classic visual effects. First, I'll show you how to make animated titles by importing Photoshop files into After Effects. Next we'll add new scenic elements to some video footage, again using Photoshop. This technique will allow you to add or remove elements like trees or buildings from a shot. These techniques, especially the one we will use to alter the scene, are common to most visual effects. Watch the classic old 1933 version of King Kong. Willis O'Brien, the stop motion genius who animated Kong, pioneered the art of extending, or completely fabricating, scenery. Layering several elements painted on glass in front of his puppets and rear projected footage allowed O'Brien and RKO's visual effects artist Linwood Dunn to create King Kong's fantastic jungle scenes. It is said that these set-ups could be many feet deep.
2006
Publication Details
  • Henry Hexmoor, Marcin Paprzycki, Niranjan Suri (eds) Scalable Computing: Practice and Experience Volume 7, No. 4, December 2006
  • Dec 23, 2006

Abstract

Current search engines crawl the Web, download content, and digest this content locally. For multimedia content, this involves considerable volumes of data. Furthermore, this process covers only publicly available content because content providers are concerned that they otherwise lose control over the distribution of their intellectual property. We present the prototype of our secure and distributed search engine, which dynamically pushes content-based feature extraction to image providers. Thereby, the volume of data that is transported over the network is significantly reduced, and the concerns mentioned above are alleviated. The distribution of feature extraction and matching algorithms is done by mobile software agents. Subsequent search requests performed upon the resulting feature indices by means of remote feature comparison can either be realized through mobile software agents, or by the use of implicitly created Web services which wrap the remote comparison functionality, and thereby improve the interoperability of the search engine. We give a description of the search engine's architecture and implementation, depict our concepts to integrate agent and Web service technology, and present quantitative evaluation results. Furthermore, we discuss related security mechanisms for content protection and server security.

Security Risks in Java-based Mobile Code Systems

Publication Details
  • Henry Hexmoor, Marcin Paprzycki, Niranjan Suri (eds) Scalable Computing: Practice and Experience Volume 7, No. 4, December 2006
  • Dec 23, 2006

Abstract

Java is the predominant language for mobile agent systems, both for implementing mobile agent execution environments and for writing mobile agent applications. This is due to inherent support for code mobility by means of dynamic class loading and separable class name spaces, as well as a number of security properties, such as language safety and access control by means of stack introspection. However, serious questions must be raised whether Java is actually up to the task of providing a secure execution environment for mobile agents. At the time of writing, it has neither resource control nor proper application separation. In this article we take an in-depth look at Java as a foundation for secure mobile agent systems.
Publication Details
  • MobCops 2006 Workshop in conjunction with IEEE/ACM CollaborateCom 2006, Atlanta, Georgia, USA.
  • Nov 17, 2006

Abstract

Load balancing has been an increasingly important issue for handling computationally intensive tasks in a distributed system such as in Grid and cluster computing. In such systems, multiple server instances are installed for handling requests from client applications, and each request (or task) typically needs to stay in a queue before an available server is assigned to process it. In this paper, we propose a high-performance queueing method for implementing a shared queue for collaborative clusters of servers. Each cluster of servers maintains a local queue and queues of different clusters are networked to form a unified (or shared) queue that may dispatch tasks to all available servers. We propose a new randomized algorithm for forwarding requests in an overcrowded local queue to a networked queue based on load information of the local and neighboring clusters. The algorithm achieves both load balancing and locality awareness.
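
The abstract does not give the forwarding rule itself, so the sketch below is only one plausible reading of "randomized forwarding based on load information", not the paper's algorithm. It assumes a task stays local unless the local queue length exceeds a `threshold`, and that an overcrowded cluster picks a less-loaded neighbor at random with probability proportional to the load difference. The function `choose_queue` and its parameters are hypothetical.

```python
import random

def choose_queue(local_load, neighbor_loads, threshold, rng=random):
    """Return None to keep the task local, or the name of a neighbor cluster.

    local_load:     number of tasks waiting in the local queue
    neighbor_loads: mapping of neighbor name -> its reported queue length
    threshold:      local queue length above which forwarding is considered
    rng:            source of randomness (the random module or a Random instance)
    """
    if local_load <= threshold:
        return None  # not overcrowded: prefer locality
    # Weight each lighter neighbor by how much less loaded it is than we are.
    weights = {name: local_load - load
               for name, load in neighbor_loads.items()
               if load < local_load}
    if not weights:
        return None  # every neighbor is at least as busy
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# An overcrowded cluster with one light and one heavy neighbor.
target = choose_queue(10, {"x": 2, "y": 20}, threshold=5,
                      rng=random.Random(0))
```

Keeping tasks local below the threshold is a crude stand-in for locality awareness; the randomized choice spreads forwarded tasks so that several overcrowded clusters do not all pick the same neighbor.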

Term Context Models for Information Retrieval

Publication Details
  • CIKM (Conference on Information and Knowledge Management) 2006, Arlington, VA
  • Nov 7, 2006

Abstract

At their heart, most if not all information retrieval models utilize some form of term frequency. The notion is that the more often a query term occurs in a document, the more likely it is that document meets an information need. We examine an alternative. We propose a model which assesses the presence of a term in a document not by looking at the actual occurrence of that term, but by a set of nonindependent supporting terms, i.e. context. This yields a weighting for terms in documents which is different from and complementary to tf-based methods, and is beneficial for retrieval.
Publication Details
  • In Proceedings of the fourth ACM International Workshop on Video Surveillance & Sensor Networks VSSN '06, Santa Barbara, CA, pp. 19-26
  • Oct 27, 2006

Abstract

Video surveillance systems have become common across a wide range of environments. While these installations have included more video streams, they also have been placed in contexts with limited personnel for monitoring the video feeds. In such settings, limited human attention, combined with the quantity of video, makes it difficult for security personnel to identify activities of interest and determine interrelationships between activities in different video streams. We have developed applications to support security personnel both in analyzing previously recorded video and in monitoring live video streams. For recorded video, we created storyboard visualizations that emphasize the most important activity as heuristically determined by the system. We also developed an interactive multi-channel video player application that connects camera views to map locations, alerts users to unusual and suspicious video, and visualizes unusual events along a timeline for later replay. We use different analysis techniques to determine unusual events and to highlight them in video images. These tools aid security personnel by directing their attention to the most important activity within recorded video or among several live video streams.
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

Video surveillance requires keeping the human in the loop. Software can aid security personnel in monitoring and using video. We have developed a set of interface components designed to locate and follow important activity within security video. By recognizing and visualizing localized activity, presenting overviews of activity over time, and temporally and geographically contextualizing video playback, we aim to support security personnel in making use of the growing quantity of security video.
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

With the growing quantity of security video, it becomes vital that video surveillance software be able to support security personnel in monitoring and tracking activities. We have developed a multi-stream video player that plays recorded and live videos while drawing the users' attention to activity in the video. We will demonstrate the features of the video player and in particular, how it focuses on keeping the human in the loop and drawing their attention to activities in the video.
Publication Details
  • Proceedings of IEEE Multimedia Signal Processing 2006
  • Oct 3, 2006

Abstract

This paper presents a method for facilitating document redirection in a physical environment via a mobile camera. With this method, a user is able to move documents among electronic devices, post a paper document to a selected public display, or make a printout of a white board with simple point-and-capture operations. More specifically, the user can move a document from its source to a destination by capturing a source image and a destination image in consecutive order. The system uses SIFT (Scale Invariant Feature Transform) features of captured images to identify the devices a user is pointing to, and issues corresponding commands associated with identified devices. Unlike RF/IR-based remote controls, this method uses object visual features as an all-time 'transmitter' for many tasks, and therefore is easy to deploy. We present experiments on identifying three public displays and a document scanner in a conference room for evaluation.

The USE Project: Designing Smart Spaces for Real People

Publication Details
  • UbiComp 2006 Workshop position paper
  • Sep 20, 2006

Abstract

We describe our work-in-progress: a "wizard-free" conference room designed for ease of use, yet retaining next-generation functionality. Called USE (Usable Smart Environments), our system uses multi-display systems, immersive conferencing, and secure authentication. It is based on cross-cultural ethnographic studies of the way people use conference rooms. The USE project has developed a flexible, extensible architecture specifically designed to enhance ease of use in smart environment technologies. The architecture allows customization and personalization of smart environments for particular people and groups, types of work, and specific physical spaces. The system consists of a database of devices with attributes, rooms and meetings that implements a prototype-instance inheritance mechanism through which contextual information (e.g. IP addresses, application settings, phone numbers for teleconferencing systems, etc.) can be associated

Usable ubiquitous computing in next generation conference rooms: design, architecture and evaluation

Publication Details
  • International workshop at UbiComp 2006.
  • Sep 17, 2006

Abstract

In the UbiComp 2005 workshop "Ubiquitous computing in next generation conference rooms" we learned that usability is one of the primary challenges in these spaces. Nearly all "smart" rooms, though they often have interesting and effective functionality, are very difficult to simply walk in and use. Most such rooms have resident experts who keep the room's systems functioning, and who often must be available on an everyday basis to enable the meeting technologies. The systems in these rooms are designed for and assume the presence of these human "wizards"; they are seldom designed with usability in mind. In addition, people don't know what to expect in these rooms; as yet there is no technology standard for next-generation conference rooms. The challenge here is to strike an effective balance between usability and new kinds of functionality (such as multiple displays, new interfaces, rich media systems, new uploading/access/security systems, and robust mobile integration, to name just a few of the functions we saw in last year's workshop). So, this year, we propose a workshop to focus more specifically on how the design of next-generation conference rooms can support usability: the tasks facing the real people who use these rooms daily. Usability in ubiquitous computing has been the topic of several papers and workshops. Focusing on usability in next-generation conference rooms lets us bring to bear some of the insights from this prior work in a delineated application space. In addition, the workshop will be informed by the most recent usability research in ubiquitous computing, rich media, context-aware mobile systems, multiple display environments, and interactive physical environments. We are also vitally concerned with how usability in smart environments tracks (or doesn't) across cultures. Conference room research has been and remains a focal point for some of the most interesting and applied work in ubiquitous computing. It is also an area where there are many real-world applications and daily opportunities for user feedback: in short, a rich area for exploring usable ubiquitous computing. We see a rich opportunity to draw together researchers not only from conference room research but also from areas such as interactive furniture/smart environments, rich media, social computing, remote conferencing, and mobile devices for a lively exchange of ideas on usability in applied ubicomp systems for conference rooms.
Publication Details
  • International Conference on Pattern Recognition
  • Aug 20, 2006

Abstract

This paper describes a framework for detecting unusual events in surveillance videos. Most surveillance systems consist of multiple video streams, but traditional event detection systems treat individual video streams independently or combine them at the feature extraction level through geometric reconstruction. Our framework combines multiple video streams at the inference level, with a coupled hidden Markov model (CHMM). We use two-stage training to bootstrap a set of usual events, and train a CHMM over the set. By thresholding the likelihood of a test segment being generated by the model, we build an unusual event detector. We evaluate the performance of our detector through qualitative and quantitative experiments on two sets of real-world videos.
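
The thresholding step in this abstract can be illustrated with a minimal sketch. It is not the paper's detector: a plain discrete hidden Markov model stands in for the CHMM, and the model parameters, the symbol alphabet, the per-frame normalization, and the threshold value are all assumptions chosen for illustration.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs) under a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        norm = sum(alpha)
        if norm == 0.0:
            return float("-inf")  # segment is impossible under the model
        total += math.log(norm)
        alpha = [a / norm for a in alpha]  # rescale to avoid underflow
    return total

def is_unusual(obs, model, threshold):
    """Flag a segment whose per-frame log-likelihood falls below the threshold."""
    return log_likelihood(obs, *model) / len(obs) < threshold

# Hypothetical two-state model of "usual" activity over symbols {0, 1, 2}.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.8, 0.19, 0.01], [0.19, 0.8, 0.01]]
MODEL = (START, TRANS, EMIT)

usual_segment = [0, 0, 0, 1, 1, 1]  # symbols the model explains well
odd_segment = [2, 2, 2, 2, 2, 2]    # symbol 2 is rare in both states
```

With an assumed threshold of -2.0 on the per-frame log-likelihood, `is_unusual(odd_segment, MODEL, -2.0)` flags the second segment while the first passes as usual.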
Publication Details
  • Interactive Video: Algorithms and Technologies, Hammoud, Riad (Ed.), 2006, XVI, 250 p., 109 illus., Hardcover.
  • Jun 7, 2006

Abstract

This chapter describes tools for browsing and searching through video to enable users to quickly locate video passages of interest. Digital video databases containing large numbers of video programs ranging from several minutes to several hours in length are becoming increasingly common. In many cases, it is not sufficient to search for relevant videos, but rather to identify relevant clips, typically less than one minute in length, within the videos. We offer two approaches for finding information in videos. The first approach provides an automatically generated interactive multi-level summary in the form of a hypervideo. When viewing a sequence of short video clips, the user can obtain more detail on the clip being watched. For situations where browsing is impractical, we present a video search system with a flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The system employs automatic story segmentation, and displays the results of text and image-based queries in ranked sets of story summaries. Both approaches help users to quickly drill down to potentially relevant video clips and to determine the relevance by visually inspecting the material.

Visualization in Audio-Based Music Information Retrieval

Publication Details
  • Computer Music Journal Vol. 30, Issue 2, pp. 42-62, 2006.
  • Jun 6, 2006

Abstract

Music Information Retrieval (MIR) is an emerging research area that explores how digitally stored music can be effectively organized, searched, retrieved, and browsed. The explosive growth of online music distribution and portable music players, together with the falling cost of recording, indicates that in the near future most of the recorded music in human history will be available digitally. MIR is steadily growing as a research area, as evidenced by the International Conference on Music Information Retrieval (ISMIR) series, soon in its sixth year, and by the increasing number of MIR-related publications in the Computer Music Journal as well as other journals and conferences.
Publication Details
  • Complexity, Vol 11, No 5.
  • Jun 3, 2006

Abstract

Technology, the collection of devices and methods available to human society, evolves by constructing new devices and methods from ones that previously exist, and in turn offering these as possible components (building blocks) for the construction of further new devices and elements. The collective of technology in this way forms a network of elements where novel elements are created from existing ones and where more complicated elements evolve from simpler ones. We model this evolution within a simple artificial system on the computer. The elements in our system are logic circuits. New elements are formed by combination from simpler existing elements (circuits), and if a novel combination satisfies one of a set of needs it is retained as a building block for further combination. We study the properties of the resulting buildout. We find that our artificial system can create complicated technologies (circuits), but only by first creating simpler ones as building blocks. Our results mirror Lenski et al.'s finding that complex features can be created in biological evolution only if simpler functions are first favored and act as stepping stones. We also find evidence that the resulting collection of technologies exists at self-organized criticality.
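
The combine-and-retain loop described in this abstract can be sketched in a few lines. This is a toy under stated assumptions, not the paper's model: circuits are reduced to two-input truth tables, NAND is the assumed combining primitive, every pair of building blocks is tried each round instead of sampling combinations at random, and the `NEEDS` table is a hypothetical list of target functions.

```python
from itertools import product

INPUTS = list(product([0, 1], repeat=2))  # every (a, b) input pair

def table(fn):
    """Truth table of a two-input function as a tuple of four bits."""
    return tuple(fn(a, b) for a, b in INPUTS)

def nand(x, y):
    """Combine two truth tables element-wise with NAND."""
    return tuple(1 - (p & q) for p, q in zip(x, y))

# Hypothetical needs: target functions that are retained once built.
NEEDS = {
    "NOT a": table(lambda a, b: 1 - a),
    "NOT b": table(lambda a, b: 1 - b),
    "NAND": table(lambda a, b: 1 - (a & b)),
    "AND": table(lambda a, b: a & b),
    "OR": table(lambda a, b: a | b),
    "XNOR": table(lambda a, b: int(a == b)),
}

def evolve(needs, rounds):
    """Return {need name: round in which it was first built}."""
    blocks = {table(lambda a, b: a), table(lambda a, b: b)}  # raw inputs only
    found = {}
    for r in range(1, rounds + 1):
        candidates = {nand(x, y) for x in blocks for y in blocks}
        for name, target in needs.items():
            if name not in found and target in candidates:
                found[name] = r
                blocks.add(target)  # retained as a building block
    return found
```

In this toy, `evolve(NEEDS, 5)` builds XNOR in round 3 out of OR and NAND, which were themselves retained earlier. Calling `evolve({"XNOR": NEEDS["XNOR"]}, 5)` returns an empty result, because no simpler function is ever retained to serve as a stepping stone.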
Publication Details
  • Proceedings of AVI '06 (Short Paper), ACM Press, pp. 258-261.
  • May 23, 2006

Abstract

During grouping tasks for data exploration and sense-making, the criteria are normally not well-defined. When users bring together data objects thought to be similar in some way, implicit brushing continually detects groups on the freeform workspace, analyzes the groups' text content or metadata, and draws attention to related data by displaying visual hints and animation. This provides helpful tips for further grouping, group meaning refinement, and structure discovery. The sense-making process is further enhanced by retrieving relevant information from a database or network during the brushing. Closely related to implicit brushing, target snapping provides a useful means to move a data object to one of its related groups on a large display. Natural dynamics and smooth animations also help to prevent distractions and allow users to concentrate on the grouping and thinking tasks. Two different prototype applications, note grouping for brainstorming and photo browsing, demonstrate the general applicability of the technique.
Publication Details
  • The 15th International World Wide Web Conference (WWW2006)
  • May 23, 2006

Abstract

In a landmark article, over a half century ago, Vannevar Bush envisioned a "Memory Extender" device he dubbed the "memex". Bush's ideas anticipated and inspired numerous breakthroughs, including hypertext, the Internet, the World Wide Web, and Wikipedia. However, despite these triumphs, the memex has still not lived up to its potential in corporate settings. One reason is that corporate users often don't have sufficient time or incentives to contribute to a corporate memory or to explore others' contributions. At FXPAL, we are investigating ways to automatically create and retrieve useful corporate memories without any added burden on anyone. In this paper we discuss how ProjectorBox, a smart appliance for automatic presentation capture, and PAL Bar, a system for proactively retrieving contextually relevant corporate memories, have enabled us to integrate content from a variety of sources to create a cohesive multimedia corporate memory for our organization.

Tunnel Vector: A New Routing Algorithm with Scalability

Publication Details
  • The 9th IEEE Global Internet Symposium in conjunction with the 25th IEEE INFOCOM Conference, Barcelona, Catalunya, Spain, April 28 - 29, 2006
  • Apr 28, 2006

Abstract

Routing algorithms such as Distance Vector and Link State have a routing table size of O(n), where n is the number of destination identifiers, and thus provide only limited scalability in large networks where n is high. Because distributed hash table (DHT) techniques scale extraordinarily well with n, our work aims at adapting a DHT approach to the design of a network-layer routing algorithm so that the average routing table size can be significantly reduced to O(log n) without losing much routing efficiency. Nonetheless, this scheme requires a major breakthrough to address some fundamental challenges. Specifically, unlike a DHT, a network-layer routing algorithm must (1) exchange its control messages without an underlying network, (2) handle link insertion/deletion and link-cost updates, and (3) provide routing efficiency. Thus, we are motivated to propose a new network-layer routing algorithm, Tunnel Vector (TV), using DHT-like multilevel routing without an underlying network. TV exchanges its control messages only via physical links and is self-configurable in response to linkage updates. In TV, the routing path of a packet is near optimal while the routing table size is O(log n) per node, with high probability. Thus, TV is suitable for routing in a very large network.
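
To give a feel for the gap between O(n) and O(log n) routing tables, the short calculation below compares the two growth rates. The constant factors (taken as 1) and the base-2 logarithm are assumptions for illustration only; the abstract states asymptotic bounds, not exact table sizes.

```python
import math

def table_sizes(n):
    """Illustrative entry counts for n destinations, with constants assumed to be 1."""
    linear = n                            # one entry per destination, O(n)
    logarithmic = math.ceil(math.log2(n)) # O(log n), base 2 assumed
    return linear, logarithmic

for n in (1_024, 1_000_000):
    linear, logarithmic = table_sizes(n)
    print(f"n={n}: O(n) ~ {linear} entries, O(log n) ~ {logarithmic} entries")
```

Under these assumptions, a network with a million destinations needs on the order of a million entries per node in the linear case and about twenty in the logarithmic case.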
Publication Details
  • Proceedings of ACM DIS (Designing Interactive Systems) 2006, Penn State, Penn.
  • Apr 5, 2006

Abstract

What does a student need to know to be a designer? Beyond a list of separate skills, what mindset does a student need to develop for designerly action now and into the future? In the excitement of the cognitive revolution, Simon proposed a way of thinking about design that promised to make it more manageable and cognitive: to think of design as a planning problem. Yet, as Suchman argued long ago, planning accounts may be applied to problems that are not at base accomplished by planning, to the detriment of design vision. This paper reports on a pedagogy that takes Suchman's criticism to heart and avoids dressing up design methods as more systematic and predictive than they in fact are. The idea is to teach design through exposure to not just one, but rather, many methods, that is, sets of rules or behaviors that produce artifacts for further reflection and development. By introducing a large number of design methods, decoupled from theories, models or frameworks, we teach (a) important cross-methodological regularities in competence as a designer, (b) that the practice of design can itself be designed and (c) that method choice affects design outcomes. This provides a rich and productive notion of design particularly necessary for the world of pervasive and ubiquitous computing.
Publication Details
  • EACL (11th Conference of the European Chapter of the Association for Computational Linguistics)
  • Apr 3, 2006

Abstract

Probabilistic Latent Semantic Analysis (PLSA) models have been shown to capture polysemy and synonymy better than Latent Semantic Analysis (LSA). However, the parameters of a PLSA model are trained using the Expectation Maximization (EM) algorithm, and as a result, the trained model depends on the initialization values, so performance can be highly variable. In this paper we present a method for using LSA to initialize a PLSA model. We also investigate the performance of our method for the tasks of text segmentation and retrieval on personal-size corpora, and present results demonstrating the efficacy of our proposed approach.

FXPAL at TRECVID 2005

Publication Details
  • Proceedings of TRECVID 2005
  • Mar 14, 2006

Abstract

In 2005 FXPAL submitted results for 3 tasks at TRECVID: shot boundary detection, high-level feature extraction, and interactive search.
Publication Details
  • International Journal of Web Services Practices
  • Jan 17, 2006

Abstract

Mobile users often require access to their documents while away from the office. While pre-loading documents in a repository can make those documents available remotely, people need to know in advance which documents they might need. Furthermore, it may be difficult to view, print, or share a document through a portable device such as a cell phone. We describe DoKumobility, a network of web services for mobile users for managing, printing, and sharing documents. In this paper, we describe the infrastructure and illustrate its use with several applications. We conclude with a discussion of lessons learned and future work.
2005

On-Demand Overlay Networking of Collaborative Applications

Publication Details
  • IEEE CollaborateCom 2005 - The First IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing
  • Dec 19, 2005

Abstract

We propose a new overlay network, called Generic Identifier Network (GIN), for collaborative nodes to share objects with transactions across affiliated organizations by merging the organizational local namespaces upon mutual agreement. Using local namespaces instead of a global namespace can avoid excessive dissemination of organizational information, reduce maintenance costs, and improve robustness against external security attacks. GIN can forward a query with an O(1) latency stretch with high probability and achieve high performance. In the absence of a complete distance map, its heuristic algorithms for self-configuration are scalable and efficient. Routing tables are maintained using soft-state mechanisms, both for fault tolerance and for adapting to updates in network distances. Thus, GIN has significant new advantages for building an efficient and scalable Distributed Hash Table for modern collaborative applications across organizations.
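
The soft-state idea mentioned in this abstract, where routing entries expire unless they are periodically refreshed, can be sketched generically. This is not GIN's implementation: the `SoftStateTable` class, its method names, the time-to-live value, and the node names are all hypothetical.

```python
class SoftStateTable:
    """Routing table whose entries lapse unless refreshed within `ttl` time units."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._entries = {}  # destination -> (next_hop, expiry time)

    def refresh(self, dest, next_hop, now):
        """Install or renew an entry; a failed neighbor simply stops refreshing."""
        self._entries[dest] = (next_hop, now + self.ttl)

    def lookup(self, dest, now):
        """Return the next hop, or None if the entry is missing or has expired."""
        entry = self._entries.get(dest)
        if entry is None or entry[1] <= now:
            self._entries.pop(dest, None)  # drop stale state lazily
            return None
        return entry[0]

routes = SoftStateTable(ttl=30)
routes.refresh("nodeB", next_hop="nodeA", now=0)
```

Here `routes.lookup("nodeB", now=10)` still returns `"nodeA"`, while a lookup at `now=31` returns `None` because no refresh arrived in time. Fault tolerance follows from the expiry: state about a failed node disappears without any explicit teardown message.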