Publications

FXPAL publishes in top scientific conferences and journals.

2010
Publication Details
  • JCDL 2010
  • Jun 21, 2010

Abstract

Close
Photo libraries are growing in quantity and size, requiring better support for locating desired photographs. MediaGLOW is an interactive visual workspace designed to address this concern. It uses attributes such as visual appearance, GPS locations, user-assigned tags, and dates to filter and group photos. An automatic layout algorithm positions photos with similar attributes near each other to support users in serendipitously finding multiple relevant photos. In addition, the system can explicitly select photos similar to specified photos. We conducted a user evaluation to determine the benefit provided by similarity layout and the relative advantages offered by the different layout similarity criteria and attribute filters. Study participants had to locate photos matching probe statements. In some tasks, participants were restricted to a single layout similarity criterion and filter option. Participants used multiple attributes to filter photos. Layout by similarity without additional filters turned out to be one of the most used strategies and was especially beneficial for geographical similarity. Lastly, the relative appropriateness of the single similarity criterion to the probe significantly affected retrieval performance.

Geometric reconstruction from point-normal data

Publication Details
  • SIAM MI'09 monograph. Related talks: SIAM GPM'09, SIAM MI'09, and BAMA (Bay Area Mathematical Adventures)
  • May 1, 2010

Abstract

Close
Creating virtual models of real spaces and objects is cumber- some and time consuming. This paper focuses on the prob- lem of geometric reconstruction from sparse data obtained from certain image-based modeling approaches. A number of elegant and simple-to-state problems arise concerning when the geometry can be reconstructed. We describe results and counterexamples, and list open problems.

Making sense of Twitter Search

Publication Details
  • In Proc. CHI2010 Workshop on Microblogging: What and How Can We Learn From It? April 11, 2010
  • Apr 11, 2010

Abstract

Close
Twitter provides a search interface to its data, along the lines of traditional search engines. But the single ranked list is a poor way to represent the richly-structured Twitter data. A more structured approach that recognizes original messages, re-tweets, people, and documents as interesting constructs is more appropriate for this kind of data. In this paper, we describe a prototype for exploring search results delivered by Twitter. The design is based on our own experience with using Twitter search, and as well as on the results of an small online questionnaire.
Publication Details
  • In Proc. CHI 2010
  • Apr 10, 2010

Abstract

Close
The use of whiteboards is pervasive across a wide range of work domains. But some of the qualities that make them successful—an intuitive interface, physical working space, and easy erasure—inherently make them poor tools for archival and reuse. If whiteboard content could be made available in times and spaces beyond those supported by the whiteboard alone, how might it be appropriated? We explore this question via ReBoard, a system that automatically captures whiteboard images and makes them accessible through a novel set of user-centered access tools. Through the lens of a seven week workplace field study, we found that by enabling new workflows, ReBoard increased the value of whiteboard content for collaboration.

Exploring the Workplace Communication Ecology

Publication Details
  • In Proc. CHI 2010
  • Apr 10, 2010

Abstract

Close
The modern workplace is inherently collaborative, and this collaboration relies on effective communication among coworkers. Many communication tools – email, blogs, wikis, Twitter, etc. – have become increasingly available and accepted in workplace communications. In this paper, we report on a study of communications technologies used over a one year period in a small US corporation. We found that participants used a large number of communication tools for different purposes, and that the introduction of new tools did not impact significantly the use of previously-adopted technologies. Further, we identified distinct classes of users based on patterns of tool use. This work has implications for the design of technology in the evolving ecology of communication tools.
Publication Details
  • In Proc. of CHI 2010
  • Apr 10, 2010

Abstract

Close
PACER is a gesture-based interactive paper system that supports fine-grained paper document content manipulation through the touch screen of a cameraphone. Using the phone's camera, PACER links a paper document to its digital version based on visual features. It adopts camera-based phone motion detection for embodied gestures (e.g. marquees, underlines and lassos), with which users can flexibly select and interact with document details (e.g. individual words, symbols and pixels). The touch input is incorporated to facilitate target selection at fine granularity,and to address some limitations of the embodied interaction, such as hand jitter and low input sampling rate. This hybrid interaction is coupled with other techniques such as semi-real time document tracking and loose physical-digital document registration, offering a gesture-based command system. We demonstrate the use of PACER in various scenarios including work-related reading, maps and music score playing. A preliminary user study on the design has produced encouraging user feedback, and suggested future research for better understanding of embodied vs. touch interaction and one vs. two handed interaction.
Publication Details
  • Symposium on Eye Tracking Research and Applications 2010
  • Mar 22, 2010

Abstract

Close
In certain applications such as radiology and imagery analysis, it is important to minimize errors. In this paper we evaluate a structured inspection method that uses eye tracking information as a feedback mechanism to the image inspector. Our two-phase method starts with a free viewing phase during which gaze data is collected. During the next phase, we either segment the image, mask previously seen areas of the image, or combine the two techniques, and repeat the search. We compare the different methods proposed for the second search phase by evaluating the inspection method using true positive and false negative rates, and subjective workload. Results show that gaze-blocked configurations reduced the subjective workload, and that gaze-blocking without segmentation showed the largest increase in true positive identifications and the largest decrease in false negative identifications of previously unseen objects.
Publication Details
  • IEEE Virtual Reality 2010 conference
  • Mar 19, 2010

Abstract

Close
This project investigates practical uses of virtual, mobile, and mixed reality systems in industrial settings, in particular control and collaboration applications for factories. In collaboration with TCHO, a chocolate maker start-up in San Francisco, we have built virtual mirror-world representations of a real-world chocolate factory and are importing its data and modeling its processes. The system integrates mobile devices such as cell phones and tablet computers. The resulting "virtual factory" is a cross-reality environment designed for simulation, visualization, and collaboration, using a set of interlinked, real-time 3D and 2D layers of information about the factory and its processes.
Publication Details
  • IEEE Pervasive Computing. 9(2). 46-55.
  • Mar 15, 2010

Abstract

Close
Paper is static but it is also light, flexible, robust, and has high resolution for reading documents in various scenarios. Digital devices will likely never match the flexibility of paper, but come with all of the benefits of computation and networking. Tags provide a simple means of bridging the gap between the two media to get the most out of both. In this paper, we explore the tradeoffs between two different types of tagging technologies – marker-based and content-based – through the lens of four systems we have developed and evaluated at our lab. From our experiences, we extrapolate issues for designers to consider when developing systems that transition between paper and digital content in a variety of different scenarios.

Abstract

Close
Browsing and searching for documents in large, online enterprise document repositories are common activities. While internet search produces satisfying results for most user queries, enterprise search has not been as successful because of differences in document types and user requirements. To support users in finding the information they need in their online enterprise repository, we created DocuBrowse, a faceted document browsing and search system. Search results are presented within the user-created document hierarchy, showing only directories and documents matching selected facets and containing text query terms. In addition to file properties such as date and file size, automatically detected document types, or genres, serve as one of the search facets. Highlighting draws the user’s attention to the most promising directories and documents while thumbnail images and automatically identified keyphrases help select appropriate documents. DocuBrowse utilizes document similarities, browsing histories, and recommender system techniques to suggest additional promising documents for the current facet and content filters.
Publication Details
  • IUI 2010 Best Paper Award
  • Feb 7, 2010

Abstract

Close
Embedded Media Markers, or simply EMMs, are nearly transparent iconic marks printed on paper documents that signify the existence of media associated with that part of the document. EMMs also guide users' camera operations for media retrieval. Users take a picture of an EMMsignified document patch using a cell phone, and the media associated with the EMM-signified document location is displayed on the phone. Unlike bar codes, EMMs are nearly transparent and thus do not interfere with the document contents. Retrieval of media associated with an EMM is based on image local features of the captured EMMsignified document patch. This paper describes a technique for semi-automatically placing an EMM at a location in a document, in such a way that it encompasses sufficient identification features with minimal disturbance to the original document.

Seamless Document Handling

Publication Details
  • Fuji Xerox Technical Report, No.19, 2010, pp. 57-65.
  • Jan 12, 2010

Abstract

Close
The current trend toward high-performance mobile networks and increasingly sophisticated mobile devices has fostered the growth of mobile workers. In mobile environments, an urgent need exists for handling documents using a mobile phone, especially for browsing documents and viewing Rich Contents created on computers. This paper describes Seamless Document Handling, which is a technology for viewing electronic documents and Rich Contents on the small screen of a mobile phone. To enhance operability and readability, we devised a method of scrolling documents efficiently by applying document image processing technology, and designed a novel user interface with a pan-and-zoom technique. We conducted on-site observations to test usability of the prototype, and gained insights difficult to acquire in a lab that led to improved functions in the prototype.
Publication Details
  • Fuji Xerox Technical Report No. 19, pp. 88-100
  • Jan 1, 2010

Abstract

Close
Browsing and searching for documents in large, online enterprise document repositories is an increasingly common problem. While users are familiar and usually satisfied with Internet search results for information, enterprise search has not been as successful because of differences in data types and user requirements. To support users in finding the information they need from electronic and scanned documents in their online enterprise repository, we created an automatic detector for genres such as papers, slides, tables, and photos. Several of those genres correspond roughly to file name extensions but are identified automatically using features of the document. This genre identifier plays an important role in our faceted document browsing and search system. The system presents documents in a hierarchy as typically found in enterprise document collections. Documents and directories are filtered to show only documents matching selected facets and containing optional query terms and to highlight promising directories. Thumbnail images and automatically identified keyphrases help select desired documents.
2009

Quantum Computing

Publication Details
  • Entry in Wiley's The Handbook of Technology Management
  • Dec 31, 2009

Abstract

Close
Changing the model underlying information and computation from a classical mechanical to a quantum mechanical one yields faster algorithms, novel cryptographic mechanisms, and alternative methods of communication. Quantum algorithms can perform a select set of tasks vastly more efficiently than any classical algorithm, but for many tasks it has been proven that quantum algorithms provide no advantage. The breadth of quantum computing applications is still being explored. Major application areas include security and the many fields that would benefit from efficient quantum simulation. The quantum information processing viewpoint provides insight into classical algorithmic issues as well as a deeper understanding of entanglement and other non-classical aspects of quantum physics.
Publication Details
  • ACM Multimedia 2009 Workshop on Large-Scale Multimedia Retrieval and Mining
  • Oct 23, 2009

Abstract

Close
We describe an efficient and scalable system for automatic image categorization. Our approach seeks to marry scalable “model-free” neighborhood-based annotation with accurate boosting-based per-tag modeling. For accelerated neighborhood-based classification, we use a set of spatial data structures as weak classifiers for an arbitrary number of categories. We employ standard edge and color features and an approximation scheme that scales to large training sets. The weak classifier outputs are combined in a tag-dependent fashion via boosting to improve accuracy. The method performs competitively with standard SVM-based per-tag classification with substantially reduced computational requirements. We present multi-label image annotation experiments using data sets of more than two million photos.

Marking up a World: Physical Markup for Virtual Content Creation (Video)

Publication Details
  • ACM Multimedia
  • Oct 21, 2009

Abstract

Close
The Pantheia system enables users to create virtual models by `marking up' the real world with pre-printed markers. The markers have prede fined meanings that guide the system as it creates models. Pantheia takes as input user captured images or video of the marked up space. This video illustrates the workings of the system and shows it being used to create three models, one of a cabinet, one of a lab, and one of a conference room. As part of the Pantheia system, we also developed a 3D viewer that spatially integrates a model with images of the model.
Publication Details
  • ACM Multimedia 2009
  • Oct 19, 2009

Abstract

Close
Existing cameraphone-based interactive paper systems fall short of the flexibility of GUIs, partly due to their deficient fine-grained interactions, limited interaction styles and inadequate targeted document types. We present PACER, a platform for applications to interact with document details (e.g. individual words, East Asian characters, math symbols, music notes, and user-specified arbitrary image regions) of generic paper documents through a camera phone. With a see-through phone interface, a user can discover symbol recurrences in a document by pointing the phone's crosshair to a symbol within a printout. The user can also continuously move the phone over a printout for gestures to copy and email an arbitrary region, or play music notes on the printout.
Publication Details
  • IJCSI International Journal of Computer Science Issues. Vol. 1.
  • Oct 15, 2009

Abstract

Close
Reading documents on mobile devices is challenging. Not only are screens small and difficult to read, but also navigating an environment using limited visual attention can be difficult and potentially dangerous. Reading content aloud using text-to-speech (TTS) processing can mitigate these problems, but only for content that does not include rich visual information. In this paper, we introduce a new technique, SeeReader, that combines TTS with automatic content recognition and document presentation control that allows users to listen to documents while also being notified of important visual content. Together, these services allow users to read rich documents on mobile devices while maintaining awareness of their visual environment.
Publication Details
  • Book chapter in "Designing User Friendly Augmented Work Environments" Series: Computer Supported Cooperative Work Lahlou, Saadi (Ed.) 2009, Approx. 340 p. 117 illus., Hardcove
  • Sep 30, 2009

Abstract

Close
The Usable Smart Environment project (USE) aims at designing easy-to-use, highly functional next-generation conference rooms. Our first design prototype focuses on creating a "no wizards" room for an American executive; that is, a room the executive could walk into and use by himself, without help from a technologist. A key idea in the USE framework is that customization is one of the best ways to create a smooth user experience. Since the system needs to fit both with the personal leadership style of the executive and the corporation's meeting culture, we began the design process by exploring the work flow in and around meetings attended by the executive. Based on our work flow analysis and the scenarios we developed from it, USE developed a flexible, extensible architecture specifically designed to enhance ease of use in smart environment technologies. The architecture allows customization and personalization of smart environments for particular people and groups, types of work, and specific physical spaces. The first USE room was designed for FXPAL's executive "Ian" and installed in Niji, a small executive conference room at FXPAL. The room Niji currently contains two large interactive whiteboards for projection of presentation material, for annotations using a digital whiteboard, or for teleconferencing; a Tandberg teleconferencing system; an RFID authentication plus biometric identification system; printing via network; a PDA-based simple controller, and a tabletop touch-screen console. The console is used for the USE room control interface, which controls and switches between all of the equipment mentioned above.
Publication Details
  • ACM Mindtrek 2009
  • Sep 30, 2009

Abstract

Close

Most mobile navigation systems focus on answering the question,“I know where I want to go, now can you show me exactly how to get there?” While this approach works well for many tasks, it is not as useful for unconstrained situations in which user goals and spatial landscapes are more fluid, such as festivals or conferences. In this paper we describe the design and iteration of the Kartta system, which we developed to answer a slightly different question: “What are the most interesting areas here and how do I find them?”

Publication Details
  • Mobile HCI 2009 (poster)
  • Sep 15, 2009

Abstract

Close
Most mobile navigation systems focus on answering the question, "I know where I want to go, now can you show me exactly how to get there?" While this approach works well for many tasks, it is not as useful for unconstrained situations in which user goals and spatial landscapes are more fluid, such as festivals or conferences. In this paper we describe the design and iteration of the Kartta system, which we developed to answer a slightly different question: "What are the most interesting areas here and how do I find them?"
Publication Details
  • Book chapter in "Understanding the New Generation Office: Collective Intelligence of 100 Specialists" (book project in Japan, by New Era Office Research Center, Tokyo)
  • Aug 18, 2009

Abstract

Close

A personal interface for information mash-up: exploring worlds both physical and virtual

Book chapter in "Understanding the New Generation Office: Collective Intelligence of 100 Specialists" (book project in Japan, by New Era Office Research Center, Tokyo) , August 18, 2009

This is a Big Idea piece for a collective intelligence book project by the New Era Office Research Center, Tokyo. It is written at the invitation of FX colleague Koushi Kawamoto. The project asks the same questions of 100 specialists: Answer these four questions about an idea for a next-generation workplace: 1. Want: what do I want to be able to do? 2. Should: what should a system to support this "want" be able to do? 3. Create: imagine what an instance of this idea might be. 4. Can: how could this instance be realized in reality?

WANT: In my ideal work environment, the data I need on everything and everyone should be available at my fingertips, all the time, in many configurations that I can mix-and-match to suit the needs of any task. This includes things like: • documents of all types • people's status, tasks, and availability • audio, video, mobile, and virtual world communication channels • links to the physical world as appropriate, for example sensors delivering factory data, or the state of the machines I use daily in the workplace (printers, my PC, conference room systems), or awareness data about my colleagues. CAN: How can we approach this problem? Let's consider the creation of a personal interface or instrument for information mashup, capable of interacting with complex data structures, for tuning smart environments, and for exploring worlds both physical and virtual, in business, social and personal realms. Like any interactive system this idea has two parts: human-facing and system-facing. These can be called Interstitia I (extending human interactivity) and Interstitia II (enabling smart environments).
Publication Details
  • Presentation at SIGGRAPH 2009, New Orleans, LA. ACM.
  • Aug 3, 2009

Abstract

Close
FXPAL, a research lab in Silicon Valley, and TCHO, a chocolate manufacturer in San Francisco, have been collaborating on exploring emerging technologies for industry. The two companies seek ways to bring people closer to the products they consume, clarifying end-to-end production processes with technologies like sensor networks for fine-grained monitoring and control, mobile process control, and real/virtual mashups using virtual and augmented realities. This work lies within and extends the area of research called mixed- or cross-reality

Mirror World Chocolate Factory

Publication Details
  • IEEE Pervasive Computing July-August 2009 (Journal, Works in Progress section)
  • Jul 18, 2009

Abstract

Close
FXPAL, a research lab in Silicon Valley, and TCHO, a chocolate manufacturer in San Francisco, have been collaborating on exploring emerging technologies for industry. The two companies seek ways to bring people closer to the products they consume, clarifying end-to-end production processes with technologies like sensor networks for fine-grained monitoring and control, mobile process control, and real/virtual mashups using virtual and augmented realities.

Interactive Models from Images of a Static Scene

Publication Details
  • Computer Graphics and Virtual Reality (CGVR '09)
  • Jul 13, 2009

Abstract

Close
FXPAL's Pantheia system enables users to create virtual models by 'marking up' a physical space with pre-printed visual markers. The meanings associated with the markers come from a markup language that enables the system to create models from a relatively sparse set of markers. This paper describes extensions to our markup language and system that support the creation of interactive virtual objects. Users place markers to define components such as doors and drawers with which an end user of the model can interact. Other interactive elements, such as controls for color changes or lighting choices, are also supported. Pantheia produced a model of a room with hinged doors, a cabinet with drawers, doors, and color options, and a railroad track.
Publication Details
  • 2009 IEEE International Conference on Multimedia and Expo (ICME)
  • Jun 30, 2009

Abstract

Close

This paper presents a tool and a novel Fast Invariant Transform (FIT) algorithm for language independent e-documents access. The tool enables a person to access an e-document through an informal camera capture of a document hardcopy. It can save people from remembering/exploring numerous directories and file names, or even going through many pages/paragraphs in one document. It can also facilitate people’s manipulation of a document or people’s interactions through documents. Additionally, the algorithm is useful for binding multimedia data to language independent paper documents. Our document recognition algorithm is inspired by the widely known SIFT descriptor [4] but can be computed much more efficiently for both descriptor construction and search. It also uses much less storage space than the SIFT approach. By testing our algorithm with randomly scaled and rotated document pages, we can achieve a 99.73% page recognition rate on the 2188-page ICME06 proceedings and 99.9% page recognition rate on a 504-page Japanese math book.

Image-based Lighting Adjustment Method for Browsing Object Images

Publication Details
  • 2009 IEEE International Conference on Multimedia and Expo (ICME)
  • Jun 30, 2009

Abstract

Close
In this paper, we describe an automatic lighting adjustment method for browsing object images. From a set of images of an object, taken under different lighting conditions, we generate two types of illuminated images: a textural image which eliminates unwanted specular reflections of the object, and a highlight image in which specularities of the object are highly preserved. Our user interface allows viewers to digitally zoom into any region of the image, and the lighting adjusted images are automatically generated for the selected region and displayed. Switching between the textural and the highlight images helps viewers to understand characteristics of the object surface.

WebNC: efficient sharing of web applications

Publication Details
  • Hypertext 2009
  • Jun 29, 2009

Abstract

Close
WebNC is a system for efficiently sharing, retrieving and viewing web applications. Unlike existing screencasting and screensharing tools, WebNC is optimized to work with web pages where a lot of scrolling happens. WebNC uses a tile-based encoding to capture, transmit and deliver web applications, and relies only on dynamic HTML and JavaScript. The resulting webcasts require very little bandwidth and are viewable on any modern web browser including Firefox and Internet Explorer as well as browsers on the iPhone and Android platforms.
Publication Details
  • Journal article in Artificial Intelligence for Engineering Design, Analysis and Manufacturing (2009), 23, 263-274. Printed in the USA. 2009 Cambridge University Press.
  • Jun 17, 2009

Abstract

Close
Modern design embraces digital augmentation, especially in the interplay of digital media content and the physical dispersion and handling of information. Based on the observation that small paper memos with sticky backs (such as Post-Its ™) are a powerful and frequently used design tool, we have created Post-Bits, a new interface device with a physical embodiment that can be handled as naturally as paper sticky notes by designers, yet add digital information affordances as well. A Post-Bit is a design prototype of a small electronic paper device for handling multimedia content, with interaction control and display in one thin flexible sheet. Tangible properties of paper such as flipping, flexing, scattering, and rubbing are mapped to controlling aspects of the multimedia content such as scrubbing, sorting, or up- or downloading dynamic media (images, video, text). In this paper we discuss both the design process involved in building a prototype of a tangible interface using new technologies, and how the use of Post-Bits as a tangible design tool can impact two common design tasks: design ideation or brainstorming, and storyboarding for interactive systems or devices.
Publication Details
  • Immerscom 2009
  • May 27, 2009

Abstract

Close
We describe Pantheia, a system that constructs virtual models of real spaces from collections of images, through the use of visual markers that guide and constrain model construction. To create a model users simply `mark up' the real world scene by placing pre-printed markers that describe scene elements or impose semantic constraints. Users then collect still images or video of the scene. From this input, Pantheia automatically and quickly produces a model. The Pantheia system was used to produce models of two rooms that demonstrate the e ectiveness of the approach.
Publication Details
  • Pervasive 2009
  • May 11, 2009

Abstract

Close
Recorded presentations are difficult to watch on a mobile phone because of the small screen, and even more challenging when the user is traveling or commuting. This demo shows an application designed for viewing presentations in a mobile situation, and describes the design process that involved on-site observation and informal user testing at our lab. The system generates a user-controllable movie by capturing a slide presentation, extracting active regions of interest using cues from the presenter, and creating pan-and-zoom effects to direct the active regions within a small screen. During playback, the user can simply watch the movie in automatic mode using a minimal amount of effort to operate the application. When more flexible control is needed, the user can switch into manual mode to temporarily focus on specific regions of interest.
Publication Details
  • ACM Transactions on Multimedia Computing, Communications and Applications, Vol. 5, Issue 2
  • May 1, 2009

Abstract

Close
Hyper-Hitchcock consists of three components for creating and viewing a form of interactive video called detail-on-demand video: a hypervideo editor, a hypervideo player, and algorithms for automatically generating hypervideo summaries. Detail-on-demand video is a form of hypervideo that supports one hyperlink at a time for navigating between video sequences. The Hyper-Hitchcock editor enables authoring of detail-on-demand video without programming and uses video processing to aid in the authoring process. The Hyper-Hitchcock player uses labels and keyframes to support navigation through and back hyperlinks. Hyper-Hitchcock includes techniques for automatically generating hypervideo summaries of one or more videos that take the form of multiple linear summaries of different lengths with links from the shorter to the longer summaries. User studies on authoring and viewing provided insight into the various roles of links in hypervideo and found that player interface design greatly affects people's understanding of hypervideo structure and the video they access.

WebNC: efficient sharing of web applications

Publication Details
  • WWW 2009
  • Apr 22, 2009

Abstract

Close
WebNC is a browser plugin that leverages the Document Object Model for efficiently sharing web browser windows or recording web browsing sessions to be replayed later. Unlike existing screen-sharing or screencasting tools, WebNC is optimized to work with web pages where a lot of scrolling happens. Rendered pages are captured as image tiles, and transmitted to a central server through http post. Viewers can watch the webcasts in realtime or asynchronously using a standard web browser: WebNC only relies on html and javascript to reproduce the captured web content. Along with the visual content of web pages, WebNC also captures their layout and textual content for later retrieval. The resulting webcasts require very little bandwidth, are viewable on any modern web browser including the iPhone and Android phones, and are searchable by keyword.
Publication Details
  • CHI2009
  • Apr 4, 2009

Abstract

Close
Zooming user interfaces are increasingly popular on mobile devices with touch screens. Swiping and pinching finger gestures anywhere on the screen manipulate the displayed portion of a page, and taps open objects within the page. This makes navigation easy but limits other manipulations of objects that would be supported naturally by the same gestures, notably cut and paste, multiple selection, and drag and drop. A popular device that suffers from this limitation is Apple's iPhone. In this paper, we present Bezel Swipe, an interaction technique that supports multiple selection, cut, copy, paste and other operations without interfering with zooming, panning, tapping and other pre-defined gestures. Participants of our user study found Bezel Swipe to be a viable alternative to direct touch selection.

DICE: Designing Conference Rooms for Usability

Publication Details
  • In Proceedings of CHI 2009
  • Apr 4, 2009

Abstract

Close
One of the core challenges now facing smart rooms is supporting realistic, everyday activities. While much research has been done to push forward the frontiers of novel interaction techniques, we argue that technology geared toward widespread adoption requires a design approach that emphasizes straightforward configuration and control, as well as flexibility. We examined the work practices of users of a large, multi-purpose conference room, and designed DICE, a system to help them use the room's capabilities. We describe the design process, and report findings about the system's usability and about people's use of a multi-purpose conference room.

Gaze-aided human-computer and human-human dialogue

Publication Details
  • Book chapter in Handbook of Research on Socio-Technical Design and Social Networking Systems, eds. Whitworth B., and de Moor, A. Information Science Reference, pp. 529-543.
  • Mar 2, 2009

Abstract

Close
Eye-gaze plays an important role in face-to-face communication. This chapter presents research on exploiting the rich information contained in human eye-gaze for two types of applications. The first is to enhance computer mediated human-human communication by overlaying eye-gaze movement onto the shared visual spatial discussion material such as a map. The second is to manage multimodal human-computer dialogue by tracking the user's eye-gaze pattern as an indicator of user's interest. We briefly review related literature and summarize results from two research projects on human-human and human-computer communication.
Publication Details
  • Proceedings of TRECVID 2008 Workshop
  • Mar 1, 2009

Abstract

Close
In 2008 FXPAL submitted results for two tasks: rushes summarization and interactive search. The rushes summarization task has been described at the ACM Multimedia workshop [1]. Interested readers are referred to that publication for details. We describe our interactive search experiments in this notebook paper.
Publication Details
  • IUI '09
  • Feb 8, 2009

Abstract

Close
We designed an interactive visual workspace, MediaGLOW, that supports users in organizing personal and shared photo collections. The system interactively places photos with a spring layout algorithm using similarity measures based on visual, temporal, and geographic features. These similarity measures are also used for the retrieval of additional photos. Unlike traditional spring-based algorithms, our approach provides users with several means to adapt the layout to their tasks. Users can group photos in stacks that in turn attract neighborhoods of similar photos. Neighborhoods partition the workspace by severing connections outside the neighborhood. By placing photos into the same stack, users can express a desired organization that the system can use to learn a neighborhood-specific combination of distances.
2008

Interactive Multimedia Search: Systems for Exploration and Collaboration

Publication Details
  • Fuji Xerox Technical Report
  • Dec 15, 2008

Abstract

Close
We have developed an interactive video search system that allows the searcher to rapidly assess query results and easily pivot off those results to form new queries. The system is intended to maximize the use of the discriminative power of the human searcher. The typical video search scenario we consider has a single searcher with the ability to search with text and content-based queries. In this paper, we evaluate a new collaborative modification of our search system. Using our system, two or more users with a common information need search together, simultaneously. The collaborative system provides tools, user interfaces and, most importantly, algorithmically-mediated retrieval to focus, enhance and augment the team's search and communication activities. In our evaluations, algorithmic mediation improved the collaborative performance of both retrieval (allowing a team of searchers to find relevant information more efficiently and effectively), and exploration (allowing the searchers to find relevant information that cannot be found while working individually). We present analysis and conclusions from comparative evaluations of the search system.

Rethinking the Podium

Publication Details
  • Chapter in "Interactive Artifacts and Furniture Supporting Collaborative Work and Learning", ed. P. Dillenbourg, J. Huang, and M. Cherubini. Published Nov. 28, 2008, Springer. Computer Supported Collaborative learning Series Vol 10.
  • Nov 28, 2008

Abstract

Close
As the use of rich media in mobile devices and smart environments becomes more sophisticated, so must the design of the everyday objects used as controllers and interfaces. Many new interfaces simply tack electronic systems onto existing forms. However, an original physical design for a smart artefact, that integrates new systems as part of the form of the device, can enhance the end-use experience. The Convertible Podium is an experiment in the design of a smart artefact with complex integrated systems for the use of rich media in meeting rooms. It combines the highly designed look and feel of a modern lectern with systems that allow it to serve as a central control station for rich media manipulation. The interface emphasizes tangibility and ease of use in controlling multiple screens, multiple media sources (including mobile devices) and multiple distribution channels, and managing both data and personal representation in remote telepresence.

Cerchiamo: a collaborative exploratory search tool

Publication Details
  • CSCW 2008 (Demo), San Diego, CA, ACM Press.
  • Nov 10, 2008

Abstract

Close
We describe Cerchiamo, a collaborative exploratory search system that allows teams of searchers to explore document collections synchronously. Working with Cerchiamo, team members use independent interfaces to run queries, browse results, and make relevance judgments. The system mediates the team members' search activity by passing and reordering search results and suggested query terms based on the teams' actions. The combination of synchronous influence with independent interaction allows team members to be more effective and efficient in performing search tasks.
Publication Details
  • Workshop held in conjunction with CSCW2008
  • Nov 8, 2008

Abstract

Close
It is increasingly common to find Multiple Display Environments (MDEs) in a variety of settings, including the workplace, the classroom, and perhaps soon, the home. While some technical challenges exist even in single-user MDEs, collaborative use of MDEs offers a rich set of opportunities for research and development. In this workshop, we will bring together experts in designing, developing, building and evaluating MDEs to improve our collective understanding of design guidelines, relevant real-world activities, evaluation methods and metrics, and opportunities for remote as well as collocated collaboration. We intend to create not only a broader understanding of this growing field, but also to foster a community of researchers interested in bringing these environments from the laboratory to the real world. In this workshop, we intended to explore the following research themes:
  • Elicitation and process of distilling design guidelines for MDE systems and interfaces.
  • Investigation and classification of activities suited for MDEs.
  • Exploration and assessment of how existing groupware theories apply to collaboration in MDEs.
  • Evaluation techniques and metrics for assessing effectiveness of prototype MDE systems and interfaces.
  • Exploration of MDE use beyond strictly collocated collaboration.

Remix rooms: Redefining the smart conference room

Publication Details
  • CSCW 2008 (Workshop)
  • Nov 8, 2008

Abstract

Close
In this workshop we will explore how the experience of smart conference rooms can be broadened to include different contexts and media such as context-aware mobile systems, personal and professional videoconferencing, virtual worlds, and social software. How should the technologies behind conference room systems reflect the rapidly changing expectations around personal devices and social online spaces like Facebook, Twitter, and Second Life? What kinds of systems are needed to support meetings in technologically complex environments? How can a mashup of conference room spaces and technologies account for differing social and cultural practices around meetings? What requirements are imposed by security and privacy issues in public and semi-public spaces?

Reading in the Office

Publication Details
  • BooksOnline'08, October 30, 2008
  • Oct 30, 2008

Abstract

Close
Reading online poses a number of technological challenges. Advances in technology such as touch screens, light-weight high-power computers, and bi-stable displays have periodically renewed interest in online reading over the last twenty years, only to see that interest decline to a small early-adopter community. The recent release of the Kindle by Amazon is another attempt to create an online reading device. Has publicity surrounding Kindle and other such devices has reached critical mass to allow them to penetrate the consumer market successfully, or will we see a decline in interest over the next couple of years echoing the lifecycle of Softbook™ and Rocket eBook™ devices that preceded them? I argue that the true value of online reading lies in supporting activities beyond reading per se: activities such as annotation, reading and comparing multiple documents, transitions between reading, writing and retrieval, etc. Whether the current hardware will be successful in the long term may depend on its abilities to address the reading needs of knowledge workers, not just leisure readers.
Publication Details
  • ACM Multimedia 2008
  • Oct 27, 2008

Abstract

Close
Audio monitoring has many applications but also raises pri- vacy concerns. In an attempt to help alleviate these con- cerns, we have developed a method for reducing the intelli- gibility of speech while preserving intonation and the ability to recognize most environmental sounds. The method is based on identifying vocalic regions and replacing the vocal tract transfer function of these regions with the transfer function from prerecorded vowels, where the identity of the replacement vowel is independent of the identity of the spoken syllable. The audio signal is then re-synthesized using the original pitch and energy, but with the modi ed vocal tract transfer function. We performed an intelligibility study which showed that environmental sounds remained recognizable but speech intelligibility can be dramatically reduced to a 7% word recognition rate.
Publication Details
  • Proceedings of ACM Multimedia '08, pp. 817-820 (Short Paper).
  • Oct 27, 2008

Abstract

Close
We present an automatic zooming technique that leverages content analysis for viewing a document page on a small display such as a mobile phone or PDA. The page can come from a scanned document (bitmap image) or an electronic document (text and graphics data plus metadata). The page with text and graphics is segmented into regions. For each region, a scale-distortion function is constructed based on image analysis of the signal distortion that occurs at different scales. During interactive viewing of the document, as the user navigates by moving the viewport around the page, the zoom factor is automatically adjusted by optimizing the scale-distortion functions of the regions visible in the viewport.

mTable: Browsing Photos and Videos on a Tabletop System

Publication Details
  • ACM Multimedia 2008 (Video)
  • Oct 27, 2008

Abstract

Close
In this video demo, we present mTable, a multimedia tabletop system for browsing photo and video collections. We have developed a set of applications for visualizing and exploring photos, a board game for labeling photos, and a 3D cityscape metaphor for browsing videos. The system is suitable for use in a living room or office lounge, and can support multiple displays by visualizing the collections on the tabletop and showing full-size images and videos on another flat panel display in the room.
Publication Details
  • ACM Multimedia 2008
  • Oct 27, 2008

Abstract

Close
PicNTell is a new technique for generating compelling screencasts where users can quickly record desktop activities and generate videos that are embeddable on popular video sharing distributions such as YouTube®. While standard video editing and screen capture tools are useful for some editing tasks, they have two main drawbacks: (1) they require users to import and organize media in a separate interface, and (2) they do not support natural (or camcorder-like) screen recording, and instead usually require the user to define a specific region or window to record. In this paper we review current screen recording use, and present the PicNTell system, pilot studies, and a new six degree-of-freedom tracker we are developing in response to our findings.
Publication Details
  • ACM Multimedia 2008
  • Oct 27, 2008

Abstract

Close
This demo introduces a tool for accessing an e-document by capturing one or more images of a real object or document hardcopy. This tool is useful when a file name or location of the file is unknown or unclear. It can save field workers and office workers from remembering/exploring numerous directories and file names. Frequently, it can convert tedious keyboard typing in a search box to a simple camera click. Additionally, when a remote collaborator cannot clearly see an object or a document hardcopy through remote collaboration cameras, this tool can be used to automatically retrieve and send the original e-document to a remote screen or printer.

Ranked Feature Fusion Models for Ad Hoc Retrieval

Publication Details
  • CIKM (Conference on Information and Knowledge Management) 2008, October, Napa, CA
  • Oct 27, 2008

Abstract

Close
We introduce the Ranked Feature Fusion framework for information retrieval system design. Typical information retrieval formalisms such as the vector space model, the best-match model and the language model first combine features (such as term frequency and document length) into a unified representation, and then use the representation to rank documents. We take the opposite approach: Documents are first ranked by the relevance of a single feature value and are assigned scores based on their relative ordering within the collection. A separate ranked list is created for every feature value and these lists are then fused to produce a final document scoring. This new ``rank then combine'' approach is extensively evaluated and is shown to be as effective as traditional ``combine then rank'' approaches. The model is easy to understand and contains fewer parameters than other approaches. Finally, the model is easy to extend (integration of new features is trivial) and modify. This advantage includes but is not limited to relevance feedback and distribution flattening.
Publication Details
  • ACM Multimedia
  • Oct 27, 2008

Abstract

Close
Retail establishments want to know about traffic flow and patterns of activity in order to better arrange and staff their business. A large number of fixed video cameras are commonly installed at these locations. While they can be used to observe activity in the retail environment, assigning personnel to this is too time consuming to be valuable for retail analysis. We have developed video processing and visualization techniques that generate presentations appropriate for examining traffic flow and changes in activity at different times of the day. Taking the results of video tracking software as input, our system aggregates activity in different regions of the area being analyzed, determines the average speed of moving objects in the region, and segments time based on significant changes in the quantity and/or location of activity. Visualizations present the results as heat maps to show activity and object counts and average velocities overlaid on the map of the space.

Virtual Physics Circus (video)

Publication Details
  • ACM Multimedia 2008
  • Oct 27, 2008

Abstract

Close
This video shows the Virtual Physics Circus, a kind of playground for experimenting with simple physical models. The system makes it easy to create worlds with common physical objects such as swings, vehicles, ramps, and walls, and interactively play with those worlds. The system can be used as a creative art medium as well as to gain understanding and intuition about physical systems. The system can be controlled by a number of UI devices such as mouse, keyboard, joystick, and tags which are tracked in 6 degrees of freedom.
Publication Details
  • ACM Multimedia 2008 Workshop: TrecVid Summarization 2008 (TVS'08)
  • Oct 26, 2008

Abstract

Close
In this paper we describe methods for video summarization in the context of the TRECVID 2008 BBC Rushes Summarization task. Color, motion, and audio features are used to segment, filter, and cluster the video. We experiment with varying the segment similarity measure to improve the joint clustering of segments with and without camera motion. Compared to our previous effort for TRECVID 2007 we have reduced the complexity of the summarization process as well as the visual complexity of the summaries themselves. We find our objective (inclusion) performance to be competitive with systems exhibiting similar subjective performance.
Publication Details
  • Demonstration at UIST 2008
  • Oct 20, 2008

Abstract

Close
The iPhone takes a fresh approach at defining the user interface for mobile devices, which invites further innovation for new generations of touch enabled mobile devices. At the same time, some of its interaction designs provide challenges. For example, swiping gestures can be used anywhere on the screen of an iPhone for navigation, no scroll bars are used. This makes navigation remarkably seamless and easy, at the expense of selection tasks that would also be supported naturally by the same gestures. In this demo, we show techniques that enable both activities simultaneously with minimal interference. We also demonstrate other user interface designs that are driven by the features and and a desire to overcome the limits of small displays for iPhone-type devices. This includes diagonal scrolling as a means to maximize line width and font size for mobile reading, and a graphical authentication method.

UbiMEET: Design and Evaluation of Smart Environments in the Workplace

Publication Details
  • Ubicomp 2008 (Workshop)
  • Sep 21, 2008

Abstract

Close
This workshop is the fourth in a series of UbiComp workshops on smart environment technologies and applications for the workplace. It offers a unique window into the state of the art through the participation of a range of researchers, designers and builders who exchange both basic research and real-world case experiences; and invites participants to share ideas about them. This year we focus on understanding appropriate design processes and creating valid evaluation metrics for smart environments (a recurrent request from previous workshop participants). What design processes allow integration of new ubicomp-style systems with existing technologies in a room that is in daily use? What evaluation methods and metrics give us an accurate picture, and how can that information best be applied in an iterative design process?

General Certificateless Encryption and Timed-Release Encryption

Publication Details
  • SCN 2008
  • Sep 10, 2008

Abstract

Close
While recent timed-release encryption (TRE) schemes are implicitly supported by a certificateless encryption (CLE) mechanism, the security models of CLE and TRE differ and there is no generic trans- formation from a CLE to a TRE. This paper gives a generalized model for CLE that fulfills the requirements of TRE. This model is secure against adversaries with adaptive trapdoor extraction capabilities for arbitrary identifiers, decryption capabilities for arbitrary public keys, and partial decryption capabilities. It also supports hierarchical identifiers. We pro- pose a concrete scheme under our generalized model and prove it secure without random oracles, yielding the first strongly-secure SMCLE and the first TRE in the standard model. In addition, our technique of partial decryption is different from the previous approach.
Publication Details
  • Social Mobile Media Workshop
  • Aug 1, 2008

Abstract

Close
Mobile media applications need to balance user and group goals, attentional constraints, and limited screen real estate. In this paper, we describe the development and testing of two application sketches designed to explore these tradeoffs. The first is retrospective and time- based and the second is prospective and space-based. We found that attentional demands dominate and mobile media applications should therefore be lightweight and hands-free as much as possible.
Publication Details
  • IADIS e-Learning 2008
  • Jul 22, 2008

Abstract

Close
While researchers have been exploring automatic presentation capture since the 1990's, real world adoption has been limited. Our research focuses on simplifying presentation capture and retrieval to reduce adoption barriers. ProjectorBox is our attempt to create a smart appliance that seamlessly captures, indexes, and archives presentation media, with streamlined user interfaces for searching, skimming, and sharing content. In this paper we describe the design of ProjectorBox and compare its use across corporate and educational settings. While our evaluation confirms the usability and utility of our approach across settings, it also highlights differences in usage and user needs, suggesting enhancements for both markets. We describe new features we have implemented to address corporate needs for enhanced privacy and security, and new user interfaces for content discovery.

Algorithmic Mediation for Collaborative Exploratory Search.

Publication Details
  • SIGIR 2008. (Singapore, Singapore, July 20 - 24, 2008). ACM, New York, NY, 315-322. Best Paper Award.
  • Jul 22, 2008

Abstract

Close
We describe a new approach to information retrieval: algorithmic mediation for intentional, synchronous collabo- rative exploratory search. Using our system, two or more users with a common information need search together, simultaneously. The collaborative system provides tools, user interfaces and, most importantly, algorithmically-mediated retrieval to focus, enhance and augment the team's search and communication activities. Collaborative search outperformed post hoc merging of similarly instrumented single user runs. Algorithmic mediation improved both collaborative search (allowing a team of searchers to nd relevant in- formation more efficiently and effectively), and exploratory search (allowing the searchers to find relevant information that cannot be found while working individually).
Publication Details
  • ACM Conf. on Image and Video Retrieval (CIVR) 2008
  • Jul 7, 2008

Abstract

Close
We have developed an interactive video search system that allows the searcher to rapidly assess query results and easily pivot on those results to form new queries. The system is intended to maximize the use of the discriminative power of the human searcher. This is accomplished by providing a hierarchical segmentation, streamlined interface, and redundant visual cues throughout. The typical search scenario includes a single searcher with the ability to search with text and content-based queries. In this paper, we evaluate new variations on our basic search system. In particular we test the system using only visual content-based search capabilities, and using paired searchers in a realtime collaboration. We present analysis and conclusions from these experiments.

FXPAL Collaborative Exploratory Video Search System

Publication Details
  • CIVR 2008 VideOlympics (Demo)
  • Jul 7, 2008

Abstract

Close
This paper describes FXPAL's collaborative, exploratory interactive video search application. We introduce a new approach to information retrieval: algorithmic mediation in support of intentional, synchronous collaborative exploratory search. Using our system, two or more users with a common information need search together, simultaneously. The collaborative system provides tools, user interfaces and, most importantly, algorithmically-mediated retrieval to focus, enhance and augment the team's search and communication activities.

Collaborative Information Seeking in Electronic Environments

Publication Details
  • Information Seeking Support Systems Workshop. An Invitational Workshop Sponsored by the National Science Foundation. Available online at http://www.ils.unc.edu/ISSS/
  • Jun 26, 2008

Abstract

Close
Collaboration in information seeking, while common in practice, is just being recognized as an important research area. Several studies have documented various collaboration strategies that people have adopted (and adapted), and some initial systems have been built. This field is in its infancy, however. We need to understand which real-world tasks are best suited for collaborative work. We need to extend models of information seeking to accommodate explicit and implicit collaboration. We need to invent a suite of algorithms to mediate search activities. We need to devise evaluation metrics that take into account multiple people's contributions to search.
Publication Details
  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2008
  • Jun 24, 2008

Abstract

Close
Current approaches to pose estimation and tracking can be classified into two categories: generative and discriminative. While generative approaches can accurately determine human pose from image observations, they are computationally intractable due to search in the high dimensional human pose space. On the other hand, discriminative approaches do not generalize well, but are computationally efficient. We present a hybrid model that combines the strengths of the two in an integrated learning and inference framework. We extend the Gaussian process latent variable model (GPLVM) to include an embedding from observation space (the space of image features) to the latent space. GPLVM is a generative model, but the inclusion of this mapping provides a discriminative component, making the model observation driven. Observation Driven GPLVM (OD-GPLVM) not only provides a faster inference approach, but also more accurate estimates (compared to GPLVM) in cases where dynamics are not sufficient for the initialization of search in the latent space. We also extend OD-GPLVM to learn and estimate poses from parameterized actions/gestures. Parameterized gestures are actions which exhibit large systematic variation in joint angle space for different instances due to difference in contextual variables. For example, the joint angles in a forehand tennis shot are function of the height of the ball (Figure 2). We learn these systematic variations as a function of the contextual variables. We then present an approach to use information from scene/object to provide context for human pose estimation for such parameterized actions.

Vital Sign Estimation from Passive Thermal Video

Publication Details
  • IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • Jun 24, 2008

Abstract

Close
Conventional wired detection of vital signs limits the use of these important physiological parameters by many applications, such as airport health screening, elder care, and workplace preventive care. In this paper, we explore contact-free heart rate and respiratory rate detection through measuring infrared light modulation emitted near superficial blood vessels or a nasal area respectively. To deal with complications caused by subjects' movements, facial expressions, and partial occlusions of the skin, we propose a novel algorithm based on contour segmentation and tracking, clustering of informative pixels, and dominant frequency component estimation. The proposed method achieves robust subject regions-of-interest alignment and motion compensation in infrared video with low SNR. It relaxes some strong assumptions used in previous work and substantially improves on previously reported performance. Preliminary experiments on heart rate estimation for 20 subjects and respiratory rate estimation for 8 subjects exhibit promising results.

1st International Workshop on Collaborative Information Retrieval

Publication Details
  • JCDL 2008
  • Jun 20, 2008

Abstract

Close
Explicit support for collaboration is becoming increasingly important for certain kinds of collection-building activities in digital libraries. In the last few years, several research groups have also pursued various issues related to collaboration during search [4][5][6]. We can represent collaboration in search on two dimensions - synchrony and intent. Asynchronous collaboration means that people are not working on the same problem simultaneously; implicit collaboration occurs when the system uses information from others' use of the system to inform new searches, but does not guarantee consistency of search goals. In this workshop, we are concerned with the top-left quadrant of Figure 1 that represents small groups of people working toward a common goal at the same time. These synchronous, explicit collaborations could occur amongst remotely situated users, each with their own computers, or amongst a co-located group sharing devices; these spatial configurations add yet another dimension to be considered when designing collaborative search systems.
Publication Details
  • 1st International Workshop on Collaborative Information Retrieval. JCDL 2008.
  • Jun 20, 2008

Abstract

Close
People can help other people find information in networked information seeking environments. Recently, many such systems and algorithms have proliferated in industry and in academia. Unfortunately, it is difficult to compare the systems in meaningful ways because they often define collaboration in different ways. In this paper, we propose a model of possible kinds of collaboration, and illustrate it with examples from literature. The model contains four dimensions: intent, concurrency, depth and location. This model can be used to classify existing systems and to suggest possible opportunities for design in this space.

Simple and Effective Defense Against Evil Twin Access Points

Publication Details
  • Proceedings ACM WiSec, pp. 220-235, 2008
  • Mar 31, 2008

Abstract

Close
Wireless networking is becoming widespread in many public places such as cafes. Unsuspecting users may become victims of attacks based on ``evil twin'' access points. These rogue access points are operated by criminals in an attempt to launch man-in-the-middle attacks. We present a simple protection mechanism against binding to an evil twin. The mechanism leverages short authentication string protocols for the exchange of cryptographic keys. The short string verification is performed by encoding the short strings as a sequence of colors, rendered sequentially by the user's device and by the designated access point of the cafe. The access point must have a light capable of showing two colors and must be mounted prominently in a position where users can have confidence in its authenticity. We conducted a usability study with patrons in several cafes and participants found our protection mechanism very usable.

FXPAL Interactive Search Experiments for TRECVID 2007

Publication Details
  • TRECVid 2007
  • Mar 1, 2008

Abstract

Close
In 2007 FXPAL submitted results for two tasks: rushes summarization and interactive search. The rushes summarization task has been described at the ACM Multimedia workshop. Interested readers are referred to that publication for details. We describe our interactive search experiments in this notebook paper.

Exiting the Cleanroom: On Ecological Validity and Ubiquitous Computing

Publication Details
  • Human-Computer Interaction Journal
  • Feb 15, 2008

Abstract

Close
Over the past decade and a half, corporations and academies have invested considerable time and money in the realization of ubiquitous computing. Yet design approaches that yield ecologically valid understandings of ubiquitous computing systems, which can help designers make design decisions based on how systems perform in the context of actual experience, remain rare. The central question underlying this paper is: what barriers stand in the way of real-world, ecologically valid design for ubicomp? Using a literature survey and interviews with 28 developers, we illustrate how issues of sensing and scale cause ubicomp systems to resist iteration, prototype creation, and ecologically valid evaluation. In particular, we found that developers have difficulty creating prototypes that are both robust enough for realistic use and able to handle ambiguity and error, and that they struggle to gather useful data from evaluations either because critical events occur infrequently, because the level of use necessary to evaluate the system is difficult to maintain, or because the evaluation itself interferes with use of the system. We outline pitfalls for developers to avoid as well as practical solutions, and we draw on our results to outline research challenges for the future. Crucially, we do not argue for particular processes, sets of metrics, or intended outcomes but rather focus on prototyping tools and evaluation methods that support realistic use in realistic settings that can be selected according to the needs and goals of a particular developer or researcher.
2007
Publication Details
  • The 3rd International Conference on Collaborative Computing: Networking, Applications and Worksharing
  • Nov 12, 2007

Abstract

Close
This paper summarizes our environment-image/videosupported collaboration technologies developed in the past several years. These technologies use environment images and videos as active interfaces and use visual cues in these images and videos to orient device controls, annotations and other information access. By using visual cues in various interfaces, we expect to make the control interface more intuitive than buttonbased control interfaces and command-based interfaces. These technologies can be used to facilitate high-quality audio/video capture with limited cameras and microphones. They can also facilitate multi-screen presentation authoring and playback, teleinteraction, environment manipulation with cell phones, and environment manipulation with digital pens.

Collaborative Exploratory Search

Publication Details
  • HCIR 2007, Boston, Massachusetts (HCIR = Human Computer Interaction and Information Retrieval)
  • Nov 2, 2007

Abstract

Close
We propose to mitigate the deficiencies of correlated search with collaborative search, that is, search in which a small group of people shares a common information need and actively (and synchronously) collaborates to achieve it. Furthermore, we propose a system architecture that mediates search activity of multiple people by combining their inputs and by specializing results delivered to them to take advantage of their skills and knowledge.

DOTS: Support for Effective Video Surveillance

Publication Details
  • Fuji Xerox Technical Report No. 17, pp. 83-100
  • Nov 1, 2007

Abstract

Close
DOTS (Dynamic Object Tracking System) is an indoor, real-time, multi-camera surveillance system, deployed in a real office setting. DOTS combines video analysis and user interface components to enable security personnel to effectively monitor views of interest and to perform tasks such as tracking a person. The video analysis component performs feature-level foreground segmentation with reliable results even under complex conditions. It incorporates an efficient greedy-search approach for tracking multiple people through occlusion and combines results from individual cameras into multi-camera trajectories. The user interface draws the users' attention to important events that are indexed for easy reference. Different views within the user interface provide spatial information for easier navigation. DOTS, with over twenty video cameras installed in hallways and other public spaces in our office building, has been in constant use for a year. Our experiences led to many changes that improved performance in all system components.
Publication Details
  • UIST 2007 Poster & Demo
  • Oct 7, 2007

Abstract

Close
We are exploring the use of collaborative games to generate meaningful textual tags for photos. We have designed Pho-toPlay to take advantage of the social engagement typical of board games and provide a collocated ludic environment conducive to the creation of text tags. We evaluated Photo-Play and found that it was fun and socially engaging for players. The milieu of the game also facilitated playing with personal photos, which resulted in more specific tags such as named entities than when playing with randomly selected online photos. Players also had a preference for playing with personal photos.
Publication Details
  • TRECVID Video Summarization Workshop at ACM Multimedia 2007
  • Sep 28, 2007

Abstract

Close
This paper describes a system for selecting excerpts from unedited video and presenting the excerpts in a short sum- mary video for eciently understanding the video contents. Color and motion features are used to divide the video into segments where the color distribution and camera motion are similar. Segments with and without camera motion are clustered separately to identify redundant video. Audio fea- tures are used to identify clapboard appearances for exclu- sion. Representative segments from each cluster are selected for presentation. To increase the original material contained within the summary and reduce the time required to view the summary, selected segments are played back at a higher rate based on the amount of detected camera motion in the segment. Pitch-preserving audio processing is used to bet- ter capture the sense of the original audio. Metadata about each segment is overlayed on the summary to help the viewer understand the context of the summary segments in the orig- inal video.
Publication Details
  • ICDSC 2007, pp. 132-139
  • Sep 25, 2007

Abstract

Close
Our analysis and visualization tools use 3D building geometry to support surveillance tasks. These tools are part of DOTS, our multicamera surveillance system; a system with over 20 cameras spread throughout the public spaces of our building. The geometric input to DOTS is a floor plan and information such as cubicle wall heights. From this input we construct a 3D model and an enhanced 2D floor plan that are the bases for more specific visualization and analysis tools. Foreground objects of interest can be placed within these models and dynamically updated in real time across camera views. Alternatively, a virtual first-person view suggests what a tracked person can see as she moves about. Interactive visualization tools support complex camera-placement tasks. Extrinsic camera calibration is supported both by visualizations of parameter adjustment results and by methods for establishing correspondences between image features and the 3D model.

DOTS: Support for Effective Video Surveillance

Publication Details
  • ACM Multimedia 2007, pp. 423-432
  • Sep 24, 2007

Abstract

Close
DOTS (Dynamic Object Tracking System) is an indoor, real-time, multi-camera surveillance system, deployed in a real office setting. DOTS combines video analysis and user interface components to enable security personnel to effectively monitor views of interest and to perform tasks such as tracking a person. The video analysis component performs feature-level foreground segmentation with reliable results even under complex conditions. It incorporates an efficient greedy-search approach for tracking multiple people through occlusion and combines results from individual cameras into multi-camera trajectories. The user interface draws the users' attention to important events that are indexed for easy reference. Different views within the user interface provide spatial information for easier navigation. DOTS, with over twenty video cameras installed in hallways and other public spaces in our office building, has been in constant use for a year. Our experiences led to many changes that improved performance in all system components.
Publication Details
  • IEEE Intl. Conf. on Semantic Computing
  • Sep 17, 2007

Abstract

Close
We present methods for semantic annotation of multimedia data. The goal is to detect semantic attributes (also referred to as concepts) in clips of video via analysis of a single keyframe or set of frames. The proposed methods integrate high performance discriminative single concept detectors in a random field model for collective multiple concept detection. Furthermore, we describe a generic framework for semantic media classification capable of capturing arbitrary complex dependencies between the semantic concepts. Finally, we present initial experimental results comparing the proposed approach to existing methods.
Publication Details
  • Workshop at Ubicomp 2007
  • Sep 16, 2007

Abstract

Close
The past two years at UbiComp, our workshops on design and usability in next generation conference rooms engendered lively conversations in the community of people working in smart environments. The community is clearly vital and growing. This year we would like to build on the energy from previous workshops while taking on a more interactive and exploratory format. The theme for this workshop is "embodied meeting support" and includes three tracks: mobile interaction, tangible interaction, and sensing in smart environments. We encourage participants to present work that focuses on one track or that attempts to bridge multiple tracks.
Publication Details
  • ACM Conf. on Image and Video Retrieval 2007
  • Jul 29, 2007

Abstract

Close
This paper describes FXPAL's interactive video search application, "MediaMagic". FXPAL has participated in the TRECVID interactive search task since 2004. In our search application we employ a rich set of redundant visual cues to help the searcher quickly sift through the video collection. A central element of the interface and underlying search engine is a segmentation of the video into stories, which allows the user to quickly navigate and evaluate the relevance of moderately-sized, semantically-related chunks.
Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

Close
The recent emergence of multi-core processors enables a new trend in the usage of computers. Computer vision applications, which require heavy computation and lots of bandwidth, usually cannot run in real-time. Recent multi-core processors can potentially serve the needs of such workloads. In addition, more advanced algorithms can be developed utilizing the new computation paradigm. In this paper, we study the performance of an articulated body tracker on multi-core processors. The articulated body tracking workload encapsulates most of the important aspects of a computer vision workload. It takes multiple camera inputs of a scene with a single human object, extracts useful features, and performs statistical inference to find the body pose. We show the importance of properly parallelizing the workload in order to achieve great performance: speedups of 26 on 32 cores. We conclude that: (1) data-domain parallelization is better than function-domain parallelization for computer vision applications; (2) data-domain parallelism by image regions and particles is very effective; (3) reducing serial code in edge detection brings significant performance improvements; (4) domain knowledge about low/mid/high level of vision computation is helpful in parallelizing the workload.

Featured Wand for 3D Interaction

Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

Close
Our featured wand, automatically tracked by video cameras, provides an inexpensive and natural way for users to interact with devices such as large displays. The wand supports six degrees of freedom for manipulation of 3D applications like Google Earth. Our system uses a 'line scan' to estimate the wand pose tracking which simplifies processing. Several applications are demonstrated.
Publication Details
  • ICME 2007, pp. 1015-1018
  • Jul 2, 2007

Abstract

Close
We describe a new interaction technique that allows users to control nonlinear video playback by directly manipulating objects seen in the video. This interaction technique is simi-lar to video "scrubbing" where the user adjusts the playback time by moving the mouse along a slider. Our approach is superior to variable-scale scrubbing in that the user can con-centrate on interesting objects and does not have to guess how long the objects will stay in view. Our method relies on a video tracking system that tracks objects in fixed cameras, maps them into 3D space, and handles hand-offs between cameras. In addition to dragging objects visible in video windows, users may also drag iconic object representations on a floor plan. In that case, the best video views are se-lected for the dragged objects.
Publication Details
  • ICME 2007, pp. 675-678
  • Jul 2, 2007

Abstract

Close
In this paper we describe the analysis component of an indoor, real-time, multi-camera surveillance system. The analysis includes: (1) a novel feature-level foreground segmentation method which achieves efficient and reliable segmentation results even under complex conditions, (2) an efficient greedy search based approach for tracking multiple people through occlusion, and (3) a method for multi-camera handoff that associates individual trajectories in adjacent cameras. The analysis is used for an 18 camera surveillance system that has been running continuously in an indoor business over the past several months. Our experiments demonstrate that the processing method for people detection and tracking across multiple cameras is fast and robust.

POEMS: A Paper Based Meeting Service Management Tool

Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

Close
As more and more tools are developed for meeting support tasks, properly using these tools to get expected results becomes too complicated for many meeting participants. To address this problem, we propose POEMS (Paper Offered Environment Management Service) that allows meeting participants to control services in a meeting environment through a digital pen and an environment photo on digital paper. Unlike state-of-the-art device control interfaces that require interaction with text commands, buttons, or other artificial symbols, our photo enabled service access is more intuitive. Compared with PC and PDA supported control, this new approach is more flexible and cheap. With this system, a meeting participant can initiate a whiteboard on a selected public display by tapping the display image in the photo, or print out a display by drawing a line from the display image to a printer image in the photo. The user can also control video or other active applications on a display by drawing a link between a printed controller and the image of the display. This paper presents the system architecture, implementation tradeoffs, and various meeting control scenarios.
Publication Details
  • ICME 2007
  • Jul 2, 2007

Abstract

Close
As more and more tools are developed for meeting support tasks, properly using these tools to get expected results becomes very complicated for many meeting participants. To address this problem, we propose POEMS (Paper Offered Environment Management Service) that can facilitate the activation of various services with a pen and paper based interface. With this tool, meeting participants can control meeting support devices on the same paper that they take notes. Additionally, a meeting participant can also share his/her paper drawings on a selected public display or initiate a collaborative discussion on a selected public display with a page of paper. Compared with traditional interfaces, such as tablet PC or PDA based interfaces, the interface of this tool has much higher resolution and is much cheaper and easier to deploy. The paper interface is also natural to use for ordinary people.
Publication Details
  • IEEE Pervasive Computing Magazine, Vol. 6, No. 3, Jul-Sep 2007.
  • Jul 1, 2007

Abstract

Close
AnySpot is a web service-based platform for seamlessly connecting people to their personal and shared documents wherever they go. We describe the principles behind AnySpot's design and report our experience deploying it in a large, multi-national organization.
Publication Details
  • Pervasive 2007 Invited Demo
  • May 13, 2007

Abstract

Close
We present an investigation of interaction models for slideshow applications in a multi-display environment. Three models are examined: Direct Manipulation, Billiard Ball, and Flow. These concepts can be demonstrated by the ModSlideShow prototype, which is designed as a configurable modular display system where each display unit communicates with its neighbors and fundamental operations that act locally can be composed to support the higher level interaction models. We also describe the gesture input scheme, animation feedback, and other enhancements.
Publication Details
  • CHI 2007, pp. 1167-1176
  • Apr 28, 2007

Abstract

Close
A common video surveillance task is to keep track of people moving around the space being monitored. It is often difficult to track activity between cameras because locations such as hallways in office buildings can look quite similar and do not indicate the spatial proximity of the cameras. We describe a spatial video player that orients nearby video feeds with the field of view of the main playing video to aid in tracking between cameras. This is compared with the traditional bank of cameras with and without interactive maps for identifying and selecting cameras. We additionally explore the value of static and rotating maps for tracking activity between cameras. The study results show that both the spatial video player and the map improve user performance when compared to the camera-bank interface. Also, subjects change cameras more often with the spatial player than either the camera bank or the map, when available.
Publication Details
  • CHI 2007
  • Apr 28, 2007

Abstract

Close
We present the iterative design of Momento, a tool that provides integrated support for situated evaluation of ubiquitous computing applications. We derived requirements for Momento from a user-centered design process that included interviews, observations and field studies of early versions of the tool. Motivated by our findings, Momento supports remote testing of ubicomp applications, helps with participant adoption and retention by minimizing the need for new hardware, and supports mid-to-long term studies to address infrequently occurring data. Also, Momento can gather log data, experience sampling, diary, and other qualitative data.
Publication Details
  • IEEE Transactions on Multimedia
  • Apr 1, 2007

Abstract

Close
We present a general approach to temporal media segmentation using supervised classification. Given standard low-level features representing each time sample, we build intermediate features via pairwise similarity. The intermediate features comprehensively characterize local temporal structure, and are input to an efficient supervised classifier to identify shot boundaries. We integrate discriminative feature selection based on mutual information to enhance performance and reduce processing requirements. Experimental results using large-scale test sets provided by the TRECVID evaluations for abrupt and gradual shot boundary detection are presented, demonstrating excellent performance.

Abstract

Close
3D renderings can often look cold and impersonal or even cartoonish. They can also appear too crisply detailed . This can cause viewers to concentrate on specific details when they should be focusing on a more general idea or concept. With the techniques covered in this tutorial you will be able to turn your 3D renderings into "hand drawn" looking illustrations.

Context-Aware Telecommunication Services

Publication Details
  • UNESCO Encyclopedia of Life Support Systems
  • Apr 1, 2007

Abstract

Close
This chapter describes how the changing information about an individual's location, environment, and social situation can be used to initiate and facilitate people's interactions with one another, individually and in groups. Context-aware communication is contrasted with other forms of context-aware computing and we characterize applications in terms of design decisions along two dimensions: the extent of autonomy in context sensing and the extent of autonomy in communication action. A number of context-aware communication applications from the research literature are presented in five application categories. Finally, a number of issues related to the design of context-aware communication applications are presented.
Publication Details
  • Proceedings of the AAAI Spring Symposium 2007 on quantum interaction organized by Keith von Rijsbergen, Peter Bruza, Bill Lawless, and Don Sofge
  • Mar 26, 2007

Abstract

Close
This survey, aimed at information processing researchers, highlights intriguing but lesser known results, corrects misconceptions, and suggests research areas. Themes include: certainty in quantum algorithms; the "fewer worlds" theory of quantum mechanics; quantum learning; probability theory versus quantum mechanics.
Publication Details
  • Book chapter in: A Document (Re)turn. Contributions from a Research Field in Transition (Taschenbuch), Roswitha Skare, Niels Windfeld Lund, Andreas Vårheim (eds.), Peter Lang Publishing, Incorporated, 2007.
  • Feb 19, 2007

Abstract

Close
When people are checking in to flights, making reports to their company manager, composing music, delivering papers for exams in schools, or examining patients in hospitals, they all deal with documents and processes of documentation. In earlier times, documentation took place primarily in libraries and archives. While the latter are still important document institutions, documents today play a far more essential role in social life in many different domains and cultures. In this book, which celebrates the ten year anniversary of documentation studies in Tromsø, experts from many different disciplines, professional domains as well as cultures around the world present their way of dealing with documents, demonstrating many potential directions for the emerging broad field of documentation studies.

Adaptive News Access

Publication Details
  • Book chapter in "The Adaptive Web: Methods and Strategies of Web Personalization" (Springer, LNCS #4321)
  • Feb 1, 2007

Abstract

Close
This chapter describes how the adaptive web technologies discussed in this book have been applied to news access. First, we provide an overview of different types of adaptivity in the context of news access and identify corre-sponding algorithms. For each adaptivity type, we briefly discuss representative systems that use the described techniques. Next, we discuss an in-depth case study of a personalized news system. As part of this study, we outline a user modeling approach specifically designed for news personalization, and present results from an evaluation that attempts to quantify the effect of adaptive news access from a user perspective. We conclude by discussing recent trends and novel systems in the adaptive news space.

Content-based Recommendation Systems

Publication Details
  • Book chapter in "The Adaptive Web: Methods and Strategies of Web Personalization" (Springer, LNCS #4321)
  • Feb 1, 2007

Abstract

Close
This chapter discusses content-based recommendation systems, i.e., systems that recommend an item to a user based upon a description of the item and a profile of the user's interests. Content-based recommendation systems may be used in a variety of domains ranging from recommending web pages, news articles, restau-rants, television programs, and items for sale. Although the details of various systems differ, content-based recommendation systems share in common a means for describing the items that may be recommended, a means for creating a profile of the user that describes the types of items the user likes, and a means of comparing items to the user profile to determine what to recommend. The user profile is often created and updated automatically in response to feedback on the desirability of items that have been presented to the user.
Publication Details
  • PSD Magazine 2/2007 - Photoshop Art & Special Effects
  • Feb 1, 2007

Abstract

Close
With the techniques covered in this tutorial you will be able to produce two classic visual effects. First, I'll show you how to make animated titles by importing Photoshop files into Aftereffects. Next we'll add new scenic elements to some video footage, again using Photoshop. This technique will allow you to add or remove elements like tree or buildings from a shot. These techniques, especially the one we will use to alter the scene, are common to most visual effects. Watch the classic old 1933 version of King Kong. Willis O'Brien, the stop motion genius that animated Kong, pioneered the art of extending, or completely fabricating, scenery. Layering several elements painted on glass in front his puppets and rear projected footage allowed O'brien and RKO's visual effects artist Linwood Dunn to create King Kong's fantastic jungle scenes. It is said that these set-ups could be many feet deep.
2006
Publication Details
  • Henry Hexmoor, Marcin Paprzycki, Niranjan Suri (eds) Scalable Computing: Practice and Experience Volume 7, No. 4, December 2006
  • Dec 23, 2006

Abstract

Close
Current search engines crawl the Web, download content, and digest this content locally. For multimedia content, this involves considerable volumes of data. Furthermore, this process covers only publicly available content because content providers are concerned that they otherwise loose control over the distribution of their intellectual property. We present the prototype of our secure and distributed search engine, which dynamically pushes content based feature extraction to image providers. Thereby, the volume of data that is transported over the network is significantly reduced, and the concerns mentioned above are alleviated. The distribution of feature extraction and matching algorithms is done by mobile software agents. Subsequent search requests performed upon the resulting feature indices by means of remote feature comparison can either be realized through mobile software agents, or by the use of implicitly created Web services which wrap the remote comparison functionality, and thereby improve the interoperability of the search engine. We give a description of the search engine's architecture and implementation, depict our concepts to integrate agent and Web service technology, and present quantitative evaluation results. Furthermore, we discuss related security mechanisms for content protection and server security.