Andreas Girgensohn, Ph.D.

Distinguished Scientist


Andreas is currently involved in several projects pertaining to the design, development, deployment, and evaluation of applications for video editing and indexing and for organizing digital photos. Examples of recent projects are Video Manga, Video Keyframes, and Embedded Media Markers. Andreas' research combines his interests in creating tools to support developers and end users, designing user interfaces, and improving communication and collaboration among people using multiple media and the Web. Previously, Andreas worked at NYNEX on task-oriented user interface design and development, on support for software developers, and on tools for improving communication and collaboration. Andreas has published extensively at HCI, CSCW, and Multimedia conferences. He earned his Ph.D. in Computer Science at the University of Colorado at Boulder.


Publications

2007
Publication Details
  • CHI 2007, pp. 1167-1176
  • Apr 28, 2007

Abstract

A common video surveillance task is to keep track of people moving around the space being monitored. It is often difficult to track activity between cameras because locations such as hallways in office buildings can look quite similar and do not indicate the spatial proximity of the cameras. We describe a spatial video player that orients nearby video feeds with the field of view of the main playing video to aid in tracking between cameras. This is compared with the traditional bank of cameras with and without interactive maps for identifying and selecting cameras. We additionally explore the value of static and rotating maps for tracking activity between cameras. The study results show that both the spatial video player and the map improve user performance when compared to the camera-bank interface. Also, subjects change cameras more often with the spatial player than with either the camera bank or the map, when available.
2006
Publication Details
  • In Proceedings of the Fourth ACM International Workshop on Video Surveillance & Sensor Networks (VSSN '06), Santa Barbara, CA, pp. 19-26
  • Oct 27, 2006

Abstract

Video surveillance systems have become common across a wide number of environments. While these installations have included more video streams, they also have been placed in contexts with limited personnel for monitoring the video feeds. In such settings, limited human attention, combined with the quantity of video, makes it difficult for security personnel to identify activities of interest and determine interrelationships between activities in different video streams. We have developed applications to support security personnel both in analyzing previously recorded video and in monitoring live video streams. For recorded video, we created storyboard visualizations that emphasize the most important activity as heuristically determined by the system. We also developed an interactive multi-channel video player application that connects camera views to map locations, alerts users to unusual and suspicious video, and visualizes unusual events along a timeline for later replay. We use different analysis techniques to determine unusual events and to highlight them in video images. These tools aid security personnel by directing their attention to the most important activity within recorded video or among several live video streams.
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

Video surveillance requires keeping the human in the loop. Software can aid security personnel in monitoring and using video. We have developed a set of interface components designed to locate and follow important activity within security video. By recognizing and visualizing localized activity, presenting overviews of activity over time, and temporally and geographically contextualizing video playback, we aim to support security personnel in making use of the growing quantity of security video.
Publication Details
  • UIST 2006 Companion
  • Oct 16, 2006

Abstract

With the growing quantity of security video, it becomes vital that video surveillance software be able to support security personnel in monitoring and tracking activities. We have developed a multi-stream video player that plays recorded and live videos while drawing the users' attention to activity in the video. We will demonstrate the features of the video player and in particular, how it focuses on keeping the human in the loop and drawing their attention to activities in the video.
Publication Details
  • Interactive Video: Algorithms and Technologies, Riad Hammoud (Ed.), 2006, XVI, 250 p., 109 illus., Hardcover
  • Jun 7, 2006

Abstract

This chapter describes tools for browsing and searching through video to enable users to quickly locate video passages of interest. Digital video databases containing large numbers of video programs ranging from several minutes to several hours in length are becoming increasingly common. In many cases, it is not sufficient to find relevant videos; users must instead identify relevant clips, typically less than one minute in length, within the videos. We offer two approaches for finding information in videos. The first approach provides an automatically generated interactive multi-level summary in the form of a hypervideo. When viewing a sequence of short video clips, the user can obtain more detail on the clip being watched. For situations where browsing is impractical, we present a video search system with a flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The system employs automatic story segmentation, and displays the results of text and image-based queries in ranked sets of story summaries. Both approaches help users to quickly drill down to potentially relevant video clips and to determine their relevance by visually inspecting the material.
2005
Publication Details
  • ACM Multimedia 2005, Technical Demonstrations.
  • Nov 5, 2005

Abstract

The MediaMetro application provides an interactive 3D visualization of multimedia document collections using a city metaphor. The directories are mapped to city layouts using algorithms similar to treemaps. Each multimedia document is represented by a building, and visual summaries of the different constituent media types are rendered onto the sides of the building. From videos, Manga storyboards with keyframe images are created and shown on the façade; from slides and text, thumbnail images are produced and subsampled for display on the building sides. The images resemble windows on a building and can be selected for media playback. To ease navigation between high-altitude overviews and low-altitude detail views, a novel swooping technique was developed that combines altitude and tilt changes with zeroing in on a target.
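The treemap-style mapping of directories to city blocks can be illustrated with a minimal slice-and-dice sketch (a hypothetical simplification, not the MediaMetro implementation): each directory receives a rectangular block whose area is proportional to the number of documents it contains.

```python
# Minimal slice-and-dice treemap sketch (hypothetical, not the MediaMetro code):
# each (name, weight) item gets a rectangle inside (x, y, w, h) whose area is
# proportional to its weight, by slicing the parent rectangle into strips.

def slice_and_dice(items, x, y, w, h, vertical=True):
    """Assign each (name, weight) item a rectangle (x, y, w, h)."""
    total = sum(weight for _, weight in items)
    rects = {}
    offset = 0.0
    for name, weight in items:
        frac = weight / total
        if vertical:   # split the strip along the x axis
            rects[name] = (x + offset * w, y, frac * w, h)
        else:          # split along the y axis
            rects[name] = (x, y + offset * h, w, frac * h)
        offset += frac
    return rects

# Example: three directories with 6, 3, and 1 documents share a 100x100 block.
blocks = slice_and_dice([("videos", 6), ("slides", 3), ("text", 1)], 0, 0, 100, 100)
```

A real city layout would alternate the split direction per nesting level and add streets between blocks; this sketch only shows the area-proportional partitioning idea.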
Publication Details
  • INTERACT 2005, LNCS 3585, pp. 781-794
  • Sep 12, 2005

Abstract

A video database can contain a large number of videos ranging from several minutes to several hours in length. Typically, it is not sufficient to search just for relevant videos, because the task still remains to find the relevant clip, typically less than one minute in length, within the video. This makes it important to direct the users' attention to the most promising material and to indicate what material they have already investigated. Based on this premise, we created a video search system with a powerful and flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The system employs automatic story segmentation, combines text and visual search, and displays search results in ranked sets of story keyframe collages. By adapting the keyframe collages based on query relevance and indicating which portions of the video have already been explored, we enable users to quickly find relevant sections. We tested our system as part of the NIST TRECVID interactive search evaluation, and found that our user interface enabled users to find more relevant results within the allotted time than other systems employing more sophisticated analysis techniques but less helpful user interfaces.
Publication Details
  • Sixteenth ACM Conference on Hypertext and Hypermedia
  • Sep 6, 2005

Abstract

Hyper-Hitchcock is a hypervideo editor enabling the direct manipulation authoring of a particular form of hypervideo called "detail-on-demand video." This form of hypervideo allows a single link out of the currently playing video to provide more details on the content currently being presented. The editor includes a workspace to select, group, and arrange video clips into several linear sequences. Navigational links placed between the video elements are assigned labels and return behaviors appropriate to the goals of the hypervideo and the role of the destination video. Hyper-Hitchcock was used by students in a Computers and New Media class to author hypervideos on a variety of topics. The produced hypervideos provide examples of hypervideo structures and the link properties and behaviors needed to support them. Feedback from students identified additional link behaviors and features required to support new hypervideo genres. This feedback is valuable for the redesign of Hyper-Hitchcock and the design of hypervideo editors in general.
Publication Details
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Aug 8, 2005

Abstract

Organizing digital photograph collections according to events such as holiday gatherings or vacations is a common practice among photographers. To support photographers in this task, we present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present several variants of an automatic unsupervised algorithm to partition a collection of digital photographs based either on temporal similarity alone, or on temporal and content-based similarity. First, inter-photo similarity is quantified at multiple temporal scales to identify likely event clusters. Second, the final clusters are determined according to one of three clustering goodness criteria. The clustering criteria trade off computational complexity and performance. We also describe a supervised clustering method based on learning vector quantization. Finally, we review the results of an experimental evaluation of the proposed algorithms and existing approaches on two test collections.
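The core intuition of temporal event clustering can be illustrated with a toy sketch (far simpler than the paper's multi-scale similarity method, and using hypothetical parameter names): split a time-sorted photo stream wherever the gap between consecutive timestamps is large compared with the local shooting rate.

```python
# Toy temporal event clustering (illustrative only, not the paper's algorithm):
# start a new event cluster whenever the gap between consecutive photos is
# much larger than the median gap in a local window.

def cluster_by_time(timestamps, ratio=5.0, window=5):
    """Group sorted timestamps (in seconds) into event clusters."""
    clusters = [[timestamps[0]]]
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    for i, t in enumerate(timestamps[1:]):
        # median gap over a local window approximates the baseline shooting rate
        lo = max(0, i - window)
        local = sorted(gaps[lo:i + 1])
        median = local[len(local) // 2]
        if gaps[i] > ratio * max(median, 1):
            clusters.append([t])      # unusually large gap: new event
        else:
            clusters[-1].append(t)    # ordinary gap: same event
    return clusters

# Three photos seconds apart, then two photos hours later -> two events.
events = cluster_by_time([0, 10, 20, 10000, 10010])
```

The published algorithms instead quantify inter-photo similarity at multiple temporal scales and pick cluster boundaries by a goodness criterion; this sketch only shows why a gap-relative-to-context test is a sensible starting point.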
Publication Details
  • International Conference on Image and Video Retrieval 2005
  • Jul 21, 2005

Abstract

Large video collections present a unique set of challenges to the search system designer. Text transcripts do not always provide an accurate index to the visual content, and the performance of visually based semantic extraction techniques is often inadequate for search tasks. The searcher must be relied upon to provide detailed judgment of the relevance of specific video segments. We describe a video search system that facilitates this user task by efficiently presenting search results in semantically meaningful units to simplify exploration of query results and query reformulation. We employ a story segmentation system and supporting user interface elements to effectively present query results at the story level. The system was tested in the 2004 TRECVID interactive search evaluations with very positive results.
Publication Details
  • CHI 2005 Extended Abstracts, ACM Press, pp. 1395-1398
  • Apr 1, 2005

Abstract

We present a search interface for large video collections with time-aligned text transcripts. The system is designed for users such as intelligence analysts who need to quickly find video clips relevant to a topic expressed in text and images. A key component of the system is a powerful and flexible user interface that incorporates dynamic visualizations of the underlying multimedia objects. The interface displays search results in ranked sets of story keyframe collages, and lets users explore the shots in a story. By adapting the keyframe collages based on query relevance and indicating which portions of the video have already been explored, we enable users to quickly find relevant sections. We tested our system as part of the NIST TRECVID interactive search evaluation, and found that our user interface enabled users to find more relevant results within the allotted time than those of many systems employing more sophisticated analysis techniques.
2004
Publication Details
  • Proceedings of 2004 IEEE International Conference on Multimedia and Expo (ICME 2004)
  • Jun 27, 2004

Abstract

This paper presents a method for creating highly condensed video summaries called Stained-Glass visualizations. These are especially suitable for small displays on mobile devices. A morphological grouping technique is described for finding 3D regions of high activity or motion from a video embedded in x-y-t space. These regions determine areas in the keyframes, which can be subsumed in a more general geometric framework of germs and supports: germs are the areas of interest, and supports give the context. Algorithms for packing and laying out the germs are provided. Gaps between the germs are filled using a Voronoi-based method. Irregular shapes emerge, and the result looks like stained glass.
Publication Details
  • Journal of Human Interface Society, 6(2), pp. 51-58
  • Jun 1, 2004

Abstract

FX Palo Alto Laboratory provides multimedia and information technology research for the Fuji Xerox corporation based in Tokyo, Japan. FXPAL's mission is to help Fuji Xerox with a digital information technology infrastructure to support services in Fuji Xerox's Open Office Frontier. Our research spans interactive media, immersive conferencing, social computing, mobile and adaptive computing, natural language inquiry, and emerging technologies such as quantum computing and bioinformatics. Our research methods combine determining user needs, inventing new technologies, building prototype systems, informing professional communities, and transferring technology to Fuji Xerox. The physical distance between our laboratory and our parent company makes it natural for us to research problems with collaborations across time zones and cultures. To address these problems, to test our ideas, and to prepare for technology transfers, we actively create prototype systems for interactive media, immersive conferencing, and social and mobile computing. We also foster collaboration with our Japanese colleagues through a combination of face-to-face visits and both synchronous and asynchronous remote communication.
Publication Details
  • Proceedings of the Working Conference on Advanced Visual Interfaces, AVI 2004, pp. 290-297
  • May 25, 2004

Abstract

We introduced detail-on-demand video as a simple type of hypervideo that allows users to watch short video segments and to follow hyperlinks to see additional detail. Such video lets users quickly access desired information without having to view the entire contents linearly. A challenge for presenting this type of video is to provide users with the appropriate affordances to understand the hypervideo structure and to navigate it effectively. Another challenge is to give authors tools that allow them to create good detail-on-demand video. Guided by user feedback, we iterated designs for a detail-on-demand video player. We also conducted two user studies to gain insight into people's understanding of hypervideo and to improve the user interface. We found that the interface design was tightly coupled to understanding hypervideo structure and that different designs greatly affected what parts of the video people accessed. The studies also suggested new guidelines for hypervideo authoring.
Publication Details
  • Communications of the ACM, February 2004, Vol. 47, No. 2, pp. 38-44
  • Feb 1, 2004

Abstract

Blurring the boundary between the digital and physical in social activity spaces helps blend - and motivate - online and face-to-face community participation. This paper discusses two experimental installations of large screen displays at conferences - CHI 2002 and CSCW 2002. The displays offered attendees at the conference venue a window onto online community information.
2003
Publication Details
  • Proc. ACM Multimedia 2003. pp. 364-373
  • Nov 1, 2003

Abstract

We present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We present results for the algorithm based solely on temporal similarity, and jointly on temporal and content-based similarity. We also describe a supervised algorithm based on learning vector quantization. Finally, we include experimental results for the proposed algorithms and several competing approaches on two test collections.
Publication Details
  • Proc. ACM Multimedia 2003. pp. 92-93
  • Nov 1, 2003

Abstract

To simplify the process of editing interactive video, we developed the concept of "detail-on-demand" video as a subset of general hypervideo. Detail-on-demand video keeps the authoring and viewing interfaces relatively simple while supporting a wide range of interactive video applications. Our editor, Hyper-Hitchcock, provides a direct manipulation environment in which authors can combine video clips and place hyperlinks between them. To summarize a video, Hyper-Hitchcock can also automatically generate a hypervideo composed of multiple video summary levels and navigational links between these summaries and the original video. Viewers may interactively select the amount of detail they see, access more detailed summaries, and navigate to the source video through the summary.
Publication Details
  • Proc. ACM Multimedia 2003. pp. 392-401
  • Nov 1, 2003

Abstract

In this paper, we describe how a detail-on-demand representation for interactive video is used in video summarization. Our approach automatically generates a hypervideo composed of multiple video summary levels and navigational links between these summaries and the original video. Viewers may interactively select the amount of detail they see, access more detailed summaries, and navigate to the source video through the summary. We created a representation for interactive video that supports a wide range of interactive video applications, as well as Hyper-Hitchcock, an editor and player for this type of interactive video. Hyper-Hitchcock employs methods to determine (1) the number and length of levels in the hypervideo summary, (2) the video clips for each level in the hypervideo, (3) the grouping of clips into composites, and (4) the links between elements in the summary. These decisions are based on an inferred quality of video segments and temporal relations between those segments.

Detail-on-Demand Hypervideo

Publication Details
  • Proc. ACM Multimedia 2003. pp. 600-601
  • Nov 1, 2003

Abstract

We demonstrate the use of detail-on-demand hypervideo in interactive training and video summarization. Detail-on-demand video allows viewers to watch short video segments and to follow hyperlinks to see additional detail. The player for detail-on-demand video displays keyframes indicating what links are available at each point in the video. The Hyper-Hitchcock authoring tool helps users create hypervideo by automatically dividing video into clips that can be combined in a direct manipulation interface. Clips can be grouped into composites, and hyperlinks can be placed between clips and composites. A summarization algorithm creates multi-level hypervideo summaries from linear video by automatically selecting clips and placing links between them.
Publication Details
  • Proc. IEEE Intl. Conf. on Image Processing
  • Sep 14, 2003

Abstract

We present similarity-based methods to cluster digital photos by time and image content. This approach is general, unsupervised, and makes minimal assumptions regarding the structure or statistics of the photo collection. We describe versions of the algorithm using temporal similarity with and without content-based similarity, and compare the algorithms with existing techniques, measured against ground-truth clusters created by humans.
Publication Details
  • SPIE Information Technologies and Communications
  • Sep 9, 2003

Abstract

Hypervideo is a form of interactive video that allows users to follow links to other video. A simple form of hypervideo, called "detail-on-demand video," provides at most one link from one segment of video to another, supporting a single-button interaction. Detail-on-demand video is well suited for interactive video summaries, because the user can request a more detailed summary while watching the video. Users interact with the video through a special hypervideo player that displays keyframes with labels indicating when a link is available. While detail-on-demand summaries can be authored manually, doing so is time-consuming. To address this issue, we developed an algorithm to automatically generate multi-level hypervideo summaries. The highest level of the summary consists of the most important clip from each take or scene in the video. At each subsequent level, more clips from each take or scene are added in order of their importance. We give one example in which a hypervideo summary is created for a linear training video. We also show how the algorithm can be modified to produce a hypervideo summary for home video.
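The level-construction rule described above - one most-important clip per scene at the top level, with each deeper level adding the next clips in importance order - can be sketched schematically (function and variable names are illustrative, and clip importance scores are assumed given):

```python
# Schematic sketch of multi-level hypervideo summary construction
# (illustrative names; importance scores are assumed precomputed).
# Level 1 keeps the single most important clip of each scene; each
# deeper level adds the next-most-important clips of that scene.

def build_summary_levels(scenes, num_levels=3):
    """scenes: list of scenes, each a list of (clip_id, importance) tuples."""
    levels = []
    for level in range(1, num_levels + 1):
        level_clips = []
        for scene in scenes:
            ranked = sorted(scene, key=lambda c: c[1], reverse=True)
            level_clips.append([cid for cid, _ in ranked[:level]])
        levels.append(level_clips)
    return levels

# Two scenes with precomputed importance scores.
scenes = [[("a1", 0.9), ("a2", 0.5), ("a3", 0.2)],
          [("b1", 0.7), ("b2", 0.6)]]
summary = build_summary_levels(scenes)
# level 1 holds one clip per scene; deeper levels progressively add clips
```

In the full system, each clip at one level would also carry a navigational link down to the corresponding scene at the next, more detailed level; this sketch covers only the clip-selection step.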

The Plasma Poster Network: Posting Multimedia Content in Public Places

Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 599-606
  • Sep 1, 2003

Abstract

Much effort has been expended in creating online information resources to foster social networks, create synergies between collocated and remote colleagues, and enhance social capital within organizations. Following the observation that physical bulletin boards serve an important community building and maintenance function, in this paper we describe a network of large screen, digital bulletin boards, the Plasma Poster Network. The function of this system is to bridge the gap between online community interactions and shared physical spaces. We describe our motivation, a fieldwork study of information sharing practices within our organization, and an internal deployment of Plasma Posters.

Weaving Between Online and Offline Community Participation

Publication Details
  • Human-Computer Interaction INTERACT '03, IOS Press, pp. 729-732
  • Sep 1, 2003

Abstract

Much effort has been expended in creating online spaces for people to meet, network, share and organize. However, there is relatively little work, in comparison, that has addressed creating awareness of online community activities for those gathered together physically. We describe our efforts to advertise the online community spaces of CHIplace and CSCWplace using large screen, interactive bulletin boards that show online community information mixed with content generated at the conference itself. Our intention was to raise awareness of the online virtual community within the offline, face-to-face event. We describe the two deployments, at CHI 2002 and at CSCW 2002, and provide utilization data regarding people's participation within the physical and virtual locales.