Publications

FXPAL publishes in top scientific conferences and journals.

1999

Autonomous Synthetic Computer Characters as Personal Representatives.

Publication Details
  • In Human Cognition and Social Agent Technology, Kerstin Dautenhahn (Guest-editor), Advances in Consciousness Research Series. John Benjamins Publishing Company.
  • Feb 1, 1999
Publication Details
  • In The Computer Journal, 42 (6), pp. 534-546, 1999.
  • Feb 1, 1999

Abstract

The Digestor system automatically converts web-based documents designed for desktop viewing into formats appropriate for handheld devices with small display screens, such as Palm-PCs, PDAs, and cellular phones. Digestor employs a heuristic planning algorithm and a set of structural page transformations to produce the "best" looking document for a given display size. Digestor can also be instructed, via a scripting language, to render portions of documents, thereby avoiding navigation through many screens of information. Two versions of Digestor have been deployed, one that re-authors HTML into HTML for conventional browsers, and one that converts HTML into HDML for Unwired Planet's micro-browsers. Digestor provides a crucial technology for rapidly accessing, scanning and processing information from arbitrary web-based documents from any location reachable by wired or unwired communication.
Publication Details
  • In IEEE Multimedia Systems '99, IEEE Computer Society, vol. 1, pp. 756-761, 1999.
  • Feb 1, 1999

Abstract

In accessing large collections of digitized videos, it is often difficult to find both the appropriate video file and the portion of the video that is of interest. This paper describes a novel technique for determining keyframes that are different from each other and provide a good representation of the whole video. We use keyframes to distinguish videos from each other, to summarize videos, and to provide access points into them. The technique can determine any number of keyframes by clustering the frames in a video and by selecting a representative frame from each cluster. Temporal constraints are used to filter out some clusters and to determine the representative frame for a cluster. Desirable visual features can be emphasized in the set of keyframes. An application for browsing a collection of videos makes use of the keyframes to support skimming and to provide visual summaries.

As We May Read: The Reading Appliance Revolution.

Publication Details
  • Computer, Vol. 32, No. 1, January 1999, pp. 65-73.
  • Feb 1, 1999

Abstract

Reading appliances allow people to work on electronic documents much as they would on paper. They therefore provide an alternative to the standard "browse or search and then print" model of reading online. By integrating a wide variety of document activities, such as searching, organizing, and skimming, and by allowing fluid movement among them, reading appliances eliminate disruptive transitions between paper and digital media.

Collaborating over Portable Reading Appliances.

Publication Details
  • In Personal Technologies, Vol. 3, No. 1, 1999.
  • Feb 1, 1999

Abstract

Reading appliances or e-books hold substantial promise to help us collaborate. In this paper, we use a study of a group activity - a reading group that meets to discuss articles of mutual interest - to explore four scenarios for collaborating with e-books: (1) meetings and face-to-face discussions; (2) serendipitous sharing of annotations, as when we borrow a document from a colleague or buy a used book; (3) community-wide use of anonymous annotations to guide future readers; and (4) e-books as a basis for initiating interaction between people. In so doing, we describe some methods for implementing these facilities, and introduce design guidelines.

1998

Publication Details
  • UIST '98, ACM Press, 1998, pp. 195-202.
  • Oct 31, 1998

Abstract

In this paper, we describe a technique for dynamically grouping digital ink and audio to support user interaction in freeform note-taking systems. For ink, groups of strokes might correspond to words, lines, or paragraphs of handwritten text. For audio, groups might be a complete spoken phrase or a speaker turn in a conversation. Ink and audio grouping is important for editing operations such as deleting or moving chunks of ink and audio notes. The grouping technique is based on hierarchical agglomerative clustering. This clustering algorithm yields groups of ink or audio in a range of sizes, depending on the level in the hierarchy, and thus provides structure for simple interactive selection and rapid non-linear expansion of a selection. Ink and audio grouping is also important for marking portions of notes for subsequent browsing and retrieval. Integration of the ink and audio clusters provides a flexible way to browse the notes by selecting the ink cluster and playing the corresponding audio cluster.

A Framework for Sharing Handwritten Notes.

Publication Details
  • UIST '98, ACM Press, 1998, pp. 119-120.
  • Oct 31, 1998

Abstract

NotePals is an ink-based, collaborative note taking application that runs on personal digital assistants (PDAs). Meeting participants write notes in their own handwriting on a PDA. These notes are shared with other participants by synchronizing later with a shared note repository that can be viewed using a desktop-based web browser. NotePals is distinguished by its lightweight process, interface, and hardware. This demonstration illustrates the design of two different NotePals clients and our web-based note browser.
Publication Details
  • MULTIMEDIA '98, ACM Press, 1998, pp. 375-380.
  • Sep 14, 1998

Abstract

Many techniques can extract information from a multimedia stream, such as speaker identity or shot boundaries. We present a browser that uses this information to navigate through stored media. Because automatically-derived information is not wholly reliable, it is transformed into a time-dependent "confidence score." When presented graphically, confidence scores enable users to make informed decisions about regions of interest in the media, so that non-interesting areas may be skipped. Additionally, index points may be determined automatically for easy navigation, selection, editing, and annotation and will support analysis types other than the speaker identification and shot detection used here.

Digital Library Information Appliances.

Publication Details
  • In Proceedings of Digital Libraries 98 (Pittsburgh, PA June 23-26), ACM Press, 1998, pp. 217-226.
  • Jun 23, 1998

Abstract

Although digital libraries are intended to support education and knowledge work, current digital library interfaces are narrowly focused on retrieval. Furthermore, they are designed for desktop computers with keyboards, mice, and high-speed network connections. Desktop computers fail to support many key aspects of knowledge work, including active reading, free form ink annotation, fluid movement among document activities, and physical mobility. This paper proposes portable computers specialized for knowledge work, or digital library information appliances, as a new platform for accessing digital libraries. We present a number of ways that knowledge work can be augmented and transformed by the use of such appliances. These insights are based on our implementation of two research prototype systems: XLibris™, an "active reading machine," and TeleWeb, a mobile World Wide Web browser.

Linking By Inking: Trailblazing in a Paper-like Hypertext

Publication Details
  • In Proceedings of Hypertext '98 (Pittsburgh, PA), ACM Press, 1998, pp. 30-39.
  • Jun 20, 1998

Abstract

"Linking by inking" is a new interface for reader-directed link construction that bridges reading and browsing activities. We are developing linking by inking in XLibris™, a hypertext system based on the paper document metaphor. Readers use a pen computer to annotate page images with free-form ink, much as they would on paper, and the computer constructs hypertext links based on the ink marks. This paper proposes two kinds of reader-directed links: automatic and manual. Automatic links are created in response to readers' annotations. The system extracts the text near free-form ink marks, uses these terms to construct queries, executes queries against a collection of documents, and unobtrusively displays links to related documents in the margin or as "further reading lists." We also present a design for manual (ad hoc) linking: circling an ink symbol generates a multi-way link to other instances of the same symbol.
Publication Details
  • Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (Seattle, WA), Vol. 6, 1998, pp. 3741-3744.
  • May 12, 1998

Abstract

This paper describes a technique for segmenting video using hidden Markov models (HMM). Video is segmented into regions defined by shots, shot boundaries, and camera movement within shots. Features for segmentation include an image-based distance between adjacent video frames, an audio distance based on the acoustic difference in intervals just before and after the frames, and an estimate of motion between the two frames. Typical video segmentation algorithms classify shot boundaries by computing an image-based distance between adjacent frames and comparing this distance to fixed, manually determined thresholds. Motion and audio information is used separately. In contrast, our segmentation technique allows features to be combined within the HMM framework. Further, thresholds are not required since automatically trained HMMs take their place. This algorithm has been tested on a video data base, and has been shown to improve the accuracy of video segmentation over standard threshold-based systems.

Animated Autonomous Personal Representatives.

Publication Details
  • In Proceedings of the Second International Conference on Autonomous Agents (Minneapolis, MN), 1998, pp. 8-15.
  • May 9, 1998

Abstract

We describe the research goals and issues in constructing autonomous personal representatives, and the desirability of using synthetic characters as the user interface for such artifacts. An application of these autonomous representatives is then described in which characters can be attached to a document to express a user's point of view or give guided tours or presentations of the document's contents.

XLibris: The Active Reading Machine.

Publication Details
  • CHI 98 Summary, ACM Press, 1998, pp. 22-23.
  • Apr 18, 1998

Abstract

Active reading is the combination of reading with critical thinking and learning, and involves not just reading per se, but also underlining, highlighting and commenting. We have built the XLibris™ "Active Reading Machine" to explore the premise that computation can enhance the active reading process. XLibris™ uses a high-resolution pen tablet display along with a paper-like user interface to emulate the physical experience of reading a document on paper: the reader can hold a scanned image of a page in his lap and mark on it with digital ink. XLibris™ monitors free-form ink annotations made while reading, and uses these to organize and to search for information. Readers can review, sort and filter clippings of their annotated text in a "Reader's Notebook." Finally, XLibris™ searches for material related to the annotated text, and displays links unobtrusively in the margin. XLibris™ demonstrates that computers can help active readers organize and find information while retaining many of the advantages of reading on paper.

The Rise of Personal Web Pages at Work.

Publication Details
  • CHI 98 Summary, ACM Press, 1998, pp. 313-314.
  • Apr 18, 1998

Abstract

A series of 20 interviews in four organizations explores the ways in which employees take advantage of personal web pages to support their work and to reflect who they are. Both interviewee comments and web page examples suggest the importance of individual personalizations of information management and dissemination, presentation and perception of personality, and usage from the reader's perspective. These results can inform the development of future web technologies for use in organizations. Furthermore, this self representation on web pages is a way of making individual knowledge more available in the workplace.
Publication Details
  • CHI 98 Summary, ACM Press, 1998, pp. 141-142.
  • Apr 18, 1998

Abstract

The World Wide Web is often viewed as the latest and most user-friendly way of providing information over the Internet (i.e., as a server of documents). It is not customarily viewed as a platform for developing and deploying applications. In this tutorial, we introduce, demonstrate, and discuss how Web technologies like CGI scripts, JavaScript, and Java can be used in combination with Web browsers to design, create, distribute and execute collaborative applications. We discuss constraints with the Web approach as well as recent extensions that support application development.

Beyond Paper: Supporting Active Reading with Free Form Digital Ink Annotations.

Publication Details
  • In CHI 97 Extended Abstracts, ACM Press, 1997, pp. 22-23.
  • Apr 18, 1998

Abstract

Reading frequently involves not just looking at words on a page, but also underlining, highlighting and commenting, either on the text or in a separate notebook. This combination of reading with critical thinking and learning is called active reading [2]. To explore the premise that computation can enhance active reading we have built the XLibris™ "active reading machine." XLibris™ uses a commercial high-resolution pen tablet display along with a paper-like user interface to support the key affordances of paper for active reading: the reader can hold a scanned image of a page in his lap and mark on it with digital ink. To go beyond paper, XLibris™ monitors the free-form ink annotations made while reading, and uses these to organize and to search for information. Readers can review, sort and filter clippings of their annotated text in a "Reader's Notebook." XLibris™ also searches for material related to the annotated text, and displays links to similar documents unobtrusively in the margin. XLibris™ demonstrates that computers can help active readers organize and find information while retaining many of the advantages of reading on paper.
Publication Details
  • CHI 98 Summary, ACM Press, 1998, pp. 283-284.
  • Apr 18, 1998

Abstract

Peripheral awareness is a powerful human resource that has only recently been addressed in media space design. The challenge is to figure out what would be important to convey remotely and to strike a balance between too much and too little. Symbolic representation of remote activity is a powerful way to go, but as it turns out also easy to do wrong. This paper presents some early findings on problems and promises of using symbolic representation: it reports from informal studies of people using the AROMA prototype in regular office and home settings, and it conveys some lessons on designing appropriate and effective symbolic representations.

Meetings in a Virtual Space: Creating a Digital Document.

Publication Details
  • In Proceedings of the Thirty-first Annual Hawaii International Conference on System Sciences (Wailea, Hawaii, January 1998).
  • Feb 6, 1998

Abstract

Improvements in computer network infrastructures and information utilities have led to an increase in the number of social and work interactions carried out 'virtually' by geographically separated group members [1, 5, 6, 7]. In this paper we describe the design and evaluation of a prototype system that supports synchronous and asynchronous collaboration between researchers separated by space and time. The system provides non-collocated team members with a digital, virtual space for information sharing and discussion. For synchronous interactions, our design prioritizes provision of shared context, real-time discourse, and real-time problem solving and negotiation between the team members. In the case of asynchronous interactions, we have prioritized the capture of team decision making and negotiation processes and the representation of these processes in a context-rich, hypertextual document of team problem solving and negotiation.
Publication Details
  • In Proceedings of the Thirty-first Annual Hawaii International Conference on System Sciences (Wailea, Hawaii, January 1998), Volume II, pp. 259-267.
  • Feb 6, 1998

Abstract

In this paper we describe a method for indexing and retrieval of multimedia data based on annotation and segmentation. Our goal is the retrieval of segments of audio and video suitable for inclusion in multimedia documents. Annotation refers to the association of text data with particular time locations of the media. Segmentation is the partitioning of continuous media into homogenous regions. Retrieval is performed over segments of the media using the annotations associated with the segments. We present two scenarios that describe how these techniques might be applied. In the first, we describe how excerpts from a video-taped usage study of a new device are located for inclusion in a report on the utility of the device. In the second, we show how sound bites from a recorded meeting are obtained for use in authoring a summary of the meeting.

AESOP: An Outline-Oriented Authoring System.

Publication Details
  • In Proceedings of the Thirty-first Annual Hawaii International Conference on System Sciences (Wailea, Hawaii, January 1998), Volume II, pp. 207-215.
  • Feb 6, 1998

Abstract

Because a hypermedia document is more complex than conventional text, it requires preparation with respect to two key aspects. First, the author begins to develop a "vision" of the document, usually based on some outline-level description of his objectives. At the same time, as this outline is being developed, the author begins to extract useful segments from his resource materials and prepares his first version of the logic of a system of hyperlinks among those segments. In this paper we present a system named "Authoring Environment for the deSktOP" (AESOP) with two different types of "outlining" tools to handle these aspects. Planning the "vision" consists in defining a "logical" tree structure of the document. The plan for the link structure is based on a primitive unit called the view area, and AESOP provides a construct named Bento-Box for creating and manipulating view areas. Authors specify spatial and temporal layout within a single Bento-Box and define hyperlinks among the Bento-Boxes.

1997

Publication Details
  • In Proceedings: VISual'97; Second International Conference on Visual Information Systems (San Diego, CA), 1997, pp. 53-60.
  • Dec 15, 1997

Abstract

As the concept of what constitutes a "content-based" search grows more mature, it becomes valuable for us to establish a clear sense of just what is meant by "content." Recent multimedia document retrieval systems have dealt with this problem by indexing across multiple indexes; but it is important to identify how such multiple indexes are dealing with multiple dimensions of a description space, rather than simply providing the user with more descriptors. In this paper we consider a description space for multimedia documents based on three "dimensions" of a document, namely context, form, and content. We analyze the nature of this space with respect to three challenging examples of multimedia search tasks, and we address the nature of the index structures that would facilitate how these tasks may be achieved. These examples then lead us to some general conclusions on the nature of multimedia indexing and the intuitions we have inherited from the tradition of books and libraries.
Publication Details
  • In GROUP'97, Proceedings of the International ACM SIGGROUP Conference on Supporting Group Work, ACM Press, 1997, pp. 385-394.
  • Nov 16, 1997

Abstract

The prevalence of audio and video options on computers, coupled with the promise of bandwidth, has many prognosticators predicting a revolution in human communications. But what if the revolution materializes and no users show up? We were confronted with this question when we began deploying and studying the use of a video-based, background awareness application within our organization. Repeatedly, new users raised strong concerns about self-presentation, surveillance, privacy, video snapshots, and lack of audience cues. We describe how we addressed these concerns by evolving the application. As a consequence, we are also redesigning the user interface to the application.
Publication Details
  • Computer Networks and ISDN Systems, 29(8-13), pp. 1531-1542.
  • Sep 30, 1997

Abstract

The phenomenal interest in and growth of the World Wide Web as an application server has pushed the Web model to its limits. Specifically, the Web offers limited interactivity and versatility as a platform for networked applications. One major challenge for the HCI community is to determine how to improve the human-computer interface for Web-based applications. This paper focuses on a significant Web deficiency - supporting truly interactive and dynamic form-based input. We propose a well-worked form interaction abstraction that alleviates this Web deficiency. We describe how the abstraction is seamlessly integrated into the Web framework by leveraging the virtues of the Web and fitting within the interaction and usage model of the Web.

Formal experiments in casual attire: Case studies in information exploration

Publication Details
  • New Review of Hypermedia and Multimedia, Vol. 3 (1997), Taylor Graham, pp. 123-158.
  • Jun 1, 1997

Abstract

This paper addresses the issue of how research methodology can be developed for the specific needs of research into information exploration behavior, based on a four year program of research on individual strategies in information exploration. We propose a meta-experimental framework where research is carried out through a dynamic interaction between what and why questions, and between confirmatory and exploratory analyses. This approach preserves many of the advantages of formal experimentation, while permitting a more holistic examination of phenomena that is characteristic of ethnography. The application of the meta-theoretical framework is illustrated in three case studies that examined new information exploration functionalities and interfaces and their relationship to expertise and exploration strategy.

The Newspaper as an Information Exploration Metaphor

Publication Details
  • Journal of Information Processing & Management, 33(5), pp. 663-683.
  • Jun 1, 1997

Abstract

The newspaper represents a mature information presentation medium that is well-suited to the presentation of relatively short, loosely related pieces of text. This work examines the implementation of the newspaper metaphor in an information exploration interface. Based on an analysis of differences between electronic books and electronic newspapers, we submit that the newspaper metaphor is an appropriate interface paradigm for large-scale full-text databases. Similarities between newspapers and hypertext databases lead us to suggest that this metaphor is appropriate for large automatically-generated hypertexts, independent of the nature of their content. We describe VOIR, a software prototype that we have used as an electronic newspaper workbench. The program constructs newspaper pages interactively, and allows users to specify their information-seeking intent in a variety of ways, including graphical Boolean queries, hypertext links, and typed-in queries. Finally, we discuss some implications that this work has for hypertext and information retrieval in general.

Signs, Links, and the Semiotics of Hypertext

Publication Details
  • Proceedings: 1st International Workshop on Computational Semiotics (SEMIOTICS'97), 1997.
  • May 26, 1997

Abstract

This paper examines the semiotic nature of the hypertext document from two points of view, both of which are based on Roland Barthes' ELEMENTS OF SEMIOLOGY. From the more conventional point of view, the hypertext document is discussed with respect to the four areas analyzed by Barthes: the distinction between language and speech, the relationship between the signifier and the signified, the syntagmatic and associative relationships among signs, and the hierarchical embedding of signs through connotation and metalanguage. The second examination is a hypertext document designed in such a way that the reader may EXPERIENCE Barthes' elements of semiology, rather than serve as a passive receiver of his exposition. The result is an interactive environment in its own right, entitled "Signs and Links," that, through the experience of interaction, engages the reader in the definition of the conceptual space of Barthes' text.

Digestor: Device-Independent Access to the World Wide Web.

Publication Details
  • In Proceedings for the Sixth International World Wide Web Conference, 1997, pp. 655-663.
  • Apr 7, 1997

Abstract

Digestor is a software system which automatically re-authors arbitrary documents from the world-wide web to display appropriately on small screen devices such as PDAs and cellular phones, providing device-independent access to the web. Digestor is implemented as an HTTP proxy which dynamically re-authors requested web pages using a heuristic planning algorithm and a set of structural page transformations to achieve the best looking document for a given display size.
Publication Details
  • In Proceedings of Hypertext 97, ACM Press, 1997, pp. 67-74.
  • Apr 5, 1997

Abstract

Traditionally hypertexts have been limited in size by the manual effort required to create hypertext links. In addition, large hyper-linked collections may overwhelm users with the range of possible links from any node, only a fraction of which may be appropriate for a given user at any time. This work explores automatic methods of link construction based on feedback from users collected during browsing. A full-text search engine mediates the linking process. Query terms that distinguish well among documents in the database become candidate anchors; links are mediated by passage-based relevance feedback queries. The newspaper metaphor is used to organize the retrieval results. VOIR, a software prototype that implements these algorithms has been used to browse a 74,500 node (250MB) database of newspaper articles. An experiment has been conducted to test the relative effectiveness of dynamic links and user-specified queries. Experimental results suggest that link-mediated queries are more effective than user-specified queries in retrieving relevant information. The paper concludes with a discussion of possible extensions to the linking algorithms.
Publication Details
  • In CHI 97 Conference Proceedings, ACM Press, 1997, pp. 186-193.
  • Mar 21, 1997

Abstract

Dynomite is a portable electronic notebook for the capture and retrieval of handwritten and audio notes. The goal of Dynomite is to merge the organization, search, and data acquisition capabilities of a computer with the benefits of a paper-based notebook. Dynomite provides novel solutions in four key problem areas. First, Dynomite uses a casual, low cognitive overhead interface. Second, for content indexing of notes, Dynomite uses ink properties and keywords. Third, to assist organization, Dynomite's properties and keywords define views, presenting a subset of the notebook content that dynamically changes as users add new information. Finally, to augment handwritten notes with audio on devices with limited storage, Dynomite continuously records audio, but only permanently stores those parts highlighted by the user.

Sensing Activity in Video Images.

Publication Details
  • In CHI 97 Extended Abstracts, ACM Press, 1997, pp. 319-320.
  • Mar 21, 1997

Abstract

Video-based awareness tools increase familiarity among remote group members and provide pre-communication information. Low-cost iconic indicators provide less but more succinct information than video images while preserving privacy. Observations of and feedback from users of our video awareness tool suggest that an activity sensing feature along with a variety of privacy options combines advantages of both the video images and iconic indicator approaches. We introduced the activity sensing feature in response to user requests. It derives activity information from video images and provides options to control privacy and improves the usability of video-based awareness tools.
Publication Details
  • In CHI 97 Extended Abstracts, ACM Press, 1997, pp. 22-23.
  • Mar 21, 1997

Abstract

Dynomite is a portable electronic notebook that merges the benefits of paper note-taking with the organizational capabilities of computers. Dynomite incorporates four complementary features which combine to form an easy to use system for the capture and retrieval of handwritten and audio notes. First, Dynomite has a paper-like user interface. Second, Dynomite uses ink properties and keywords for content indexing of notes. Third, Dynomite's properties and keywords allow retrieval of specific ink and notes. The user is shown a view, or a subset of the notebook content, that dynamically changes as new information is added. Finally, Dynomite continuously records audio, but only permanently stores highlighted portions so that it is possible to augment handwritten notes with audio on devices with limited storage.
Publication Details
  • In CHI 97 Conference Proceedings, ACM Press, 1997, pp. 407-414.
  • Mar 21, 1997

Abstract

Hypertext interfaces are considered appropriate for information exploration tasks. The prohibitively expensive link creation effort, however, prevents traditional hypertext interfaces from being used with large coherent collections of text. Such collections typically require query-based interfaces. This paper examines a hybrid approach: the system described here creates anchors dynamically based on users' queries, and uses anchor selection as a query expansion mechanism. An experiment was conducted to compare browsing behavior in query- and link-based interfaces. Results suggest that query-mediated links are as effective as explicit queries, and that strategies adopted by users affect performance. This work has implications for the design of information exploration interfaces; the dynamic link algorithms described here are being incorporated into a Web server.
Publication Details
  • In CHI 97 Conference Proceedings, ACM Press, 1997, pp. 550-551.
  • Mar 21, 1997

Abstract

Palplates are a collection of touch-screen terminals placed around the office enabling human-computer interactions at the point of need. Supporting a community of mobile authenticated workers with a small number of stationary devices is an alternative to providing each person with a portable wireless computer. In contrast to the PC's desktop metaphor, Palplates use a place metaphor that reflects the actual rooms, corridors, and buildings that are part of the office place. Users interact graphically with applications supported by a geographic database. The user interface is generated dynamically based on the user's identity, the point-of-access, and the changing collection of physical office equipment, electronic documents and applications present at any given location.

Metadata for Mixed Media Access.

Publication Details
  • In Managing Multimedia Data: Using Metadata to Integrate and Apply Digital Data. A. Sheth and W. Klas (eds.), McGraw Hill, 1997.
  • Feb 1, 1997

Abstract

In this chapter, we discuss mixed-media access, an information access paradigm for multimedia data in which the media type of a query may differ from that of the data. This allows a single query to be used to retrieve information from data consisting of multiple types of media. In addition, multiple queries formulated in different media types can be used to more accurately specify the data to be retrieved. The types of media considered in this paper are speech, images of text, and full-length text. Some examples of metadata for mixed-media access are locations of keywords in speech and images, identification of speakers, locations of emphasized regions in speech, and locations of topic boundaries in text. Algorithms for automatically generating this metadata are described, including word spotting, speaker segmentation, emphatic speech detection, and subtopic boundary location. We illustrate the use of mixed-media access with an example of information access from multimedia data surrounding a formal presentation.

Text Types in Hypermedia

Publication Details
  • In Proceedings of the Thirtieth Annual Hawaii International Conference on System Sciences (Wailea, Hawaii, January 1997), Volume VI, pp. 68-77.
  • Jan 7, 1997

Abstract

The discipline of narratology has long recognized the need to classify documents as instances of different text types. We have discovered that classification is as applicable to hypermedia as it is to any other document presentation. Following the work of S. Chatman (1978; 1990), we consider three such text types: description, argument and narrative. The goal of a description document is to describe some object or concept; this is usually achieved by describing component parts and then describing how those parts combine to constitute the entirety. An argument document, on the other hand, is concerned with establishing some assertion or point of view, and it is based on supporting evidence, as well as possible refutations and justifications for defeating those refutations. Finally, a narrative document recounts some sequence of events in time, addressing relationships such as causality and contingency among those events. We analyze these types through case studies that give an example of each as a hypermedia document. We then argue that this classification provides an organizational framework that facilitates the construction of outlines that serve the writer in preparing the actual content of a document. Such outlines can also benefit the reader's understanding of the content that the writer intended to convey; if the writer does not make those outlines available explicitly to the reader, the reader can use knowledge of the document type to construct his own version of those outlines. Finally, we review some early work in content-based indexing and search of multimedia documents.
1996
Publication Details
  • Proceedings Knowledge Representation for Interactive Multimedia Systems: Research and Experience (Budapest, Hungary, August 1996), ECAI, pp. 57-65.
  • Aug 1, 1996

Abstract

An approach to semantics based on traditional paradigms of knowledge representation (e.g., developing reductionist models of video document genres), while not entirely off the mark, may be significantly misdirected. Understanding the semantics of video and multimedia must begin with understanding how video (and film) are "read" and "written." The purpose of this paper is to set an agenda for coming to an understanding of reading and writing multimedia and to address the representationalist implications of achieving that end. Further, we illustrate this research agenda, showing how we have applied concepts and methods from film theory (to the reading process) and interaction analysis (to the work of multimedia production), and what our preliminary findings might mean for computational support and knowledge representation in multimedia.
Publication Details
  • IEEE Computer Society Multimedia Newsletter, 4, 1 (August 1996), pp. 45-48.
  • Aug 1, 1996

Abstract

The fundamental objective of Extended Media research is the empowering of documents through technology. We see our goal as that of inventing documents which communicate more effectively. Furthermore, we need to make it easier for the writer to record what must be communicated and for the reader to access it. Thus, we anticipate that one cannot invent new documents without also inventing new processes for both writing and reading. Our primary research thrust is thus in the authoring of hypermedia documents, supplemented by a secondary thrust concerned with the problem of managing archives and libraries where more than text is involved.
Publication Details
  • Proceedings of the 16th International Conference on Computational Linguistics (Copenhagen, Denmark, August 1996).
  • Aug 1, 1996

Abstract

We have developed a technique that categorizes document images based on their content. Unlike conventional methods that use optical character recognition (OCR), we convert document images into word shape tokens, a shape-based representation of words. Because we have only to recognize simple graphical features from the image, this process is much faster than OCR. Although the mapping between word shape tokens and words is one-to-many, they are a rich source of information for content characterization. Using a vector space classifier with a scanned document image database, we show that the word shape token-based approach is quite adequate for content-oriented categorization in terms of accuracy compared with conventional OCR-based approaches.
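
The idea of word shape tokens feeding a vector space classifier can be illustrated with a minimal sketch. The abstract does not specify the shape coding, so the character classes below (ascender, descender, x-height, and the dotted letters i and j) are an assumption. The function names `shape_token`, `token_vector`, `cosine`, and `classify` are hypothetical, and the sketch works on plain text rather than on scanned images.

```python
from collections import Counter
import math

# Assumed shape classes; the coding used in the paper is not given in the abstract.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")

def shape_token(word):
    """Map each character to a coarse shape class and join the classes."""
    classes = []
    for ch in word:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            classes.append("A")   # rises above the x-height
        elif ch in DESCENDERS:
            classes.append("g")   # drops below the baseline
        elif ch in "ij":
            classes.append(ch)    # dotted characters keep their own class
        else:
            classes.append("x")   # plain x-height character
    return "".join(classes)

def token_vector(text):
    """Count word shape tokens; this is the document's term vector."""
    return Counter(shape_token(w) for w in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(text, centroids):
    """Return the label whose centroid vector is most similar to the text."""
    vec = token_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))
```

In this sketch "cat" and "oat" collapse to the same token, which shows why one token can stand for many words, yet the token counts can still separate categories.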
Publication Details
  • Proceedings Interface Conference (Sydney, Australia, July 1996).
  • Jul 1, 1996

Abstract

Online digital audio is a rapidly growing resource, which can be accessed in rich new ways not previously possible. For example, it is possible to listen to just those portions of a long discussion which involve a given subset of people, or to instantly skip ahead to the next speaker. Providing this capability to users, however, requires generation of necessary indices, as well as an interface which utilizes these indices to aid navigation. We describe algorithms which generate indices from automatic acoustic segmentation. These algorithms use hidden Markov models to segment audio into segments corresponding to different speakers or acoustic classes (e.g. music). Unsupervised model initialization using agglomerative clustering is described, and shown to work as well in most cases as supervised initialization. We also describe a user interface which displays the segmentation in the form of a timeline, with tracks for the different acoustic classes. The interface can be used for direct navigation through the audio.
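
As a rough illustration of segmentation with a hidden Markov model, the sketch below decodes the most likely state sequence with the Viterbi algorithm and merges runs of equal states into timeline segments. It is not the paper's system: the discrete "lo"/"hi" observations, the two states, and all probabilities in the usage note are assumptions chosen for illustration, and model training and agglomerative initialization are omitted. The names `viterbi` and `to_segments` are hypothetical.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a discrete-observation HMM."""
    # best[s]: log-probability of the best path that ends in state s
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # one dict of back-pointers per time step after the first
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def to_segments(path):
    """Merge runs of equal states into (label, start, end) with end exclusive."""
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((path[start], start, i))
            start = i
    return segments
```

With two assumed states, a stay probability of 0.9, and each state emitting its typical symbol with probability 0.8, the observation sequence lo lo hi lo lo hi hi hi hi decodes into two segments: the isolated "hi" is absorbed into the first segment because switching states twice costs more than one unlikely emission. Segments of this kind are what a timeline interface would draw as tracks.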

Integrating Information via Matchmaking

Publication Details
  • Journal of Intelligent Information Systems, 1996.
  • Jun 1, 1996

Abstract

Trends such as the massive increase in information available via electronic networks, the use of on-line product data by distributed concurrent engineering teams, and dynamic supply chain integration for electronic commerce are placing severe burdens on traditional methods of information sharing and retrieval. Sources of information are far too numerous and dynamic to be found via traditional information retrieval methods, and potential consumers are seeing increased need for automatic notification services. Matchmaking is an approach based on emerging information integration technologies whereby potential producers and consumers of information send messages describing their information capabilities and needs. These descriptions, represented in rich, machine-interpretable description languages, are unified by the matchmaker to identify potential matches. Based on the matches, a variety of information brokering services are performed. We introduce matchmaking, and argue that it permits large numbers of dynamic consumers and providers, operating on rapidly-changing data, to share information more effectively than via traditional methods. Two matchmakers are described, the SHADE matchmaker, which operates over logical and structured text languages, and the COINS matchmaker, which operates over free text. These matchmakers have been used for a variety of applications, most significantly, in the domains of engineering and electronic commerce. We describe our experiences with the SHADE and COINS matchmakers, and we outline the major observed benefits and problems of matchmaking.
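
The matchmaking pattern described above, in which providers advertise capabilities, consumers state needs, and the matchmaker pairs them, might be sketched as follows. This is not SHADE or COINS: descriptions are flat attribute/value dictionaries, strings starting with "?" act as variables, and matching is one-way pattern matching, a simplification of full unification. The class `Matchmaker` and the function `match` are hypothetical names.

```python
def match(pattern, description):
    """Match a request pattern against an advertisement.

    Returns a dict of variable bindings on success, or None on failure.
    Strings starting with '?' in the pattern are variables (an assumed convention).
    """
    bindings = {}
    for key, want in pattern.items():
        if key not in description:
            return None
        have = description[key]
        if isinstance(want, str) and want.startswith("?"):
            # A variable must bind to the same value everywhere it appears.
            if bindings.setdefault(want, have) != have:
                return None
        elif want != have:
            return None
    return bindings

class Matchmaker:
    def __init__(self):
        self.advertisements = []  # (provider, description)
        self.subscriptions = []   # (consumer, pattern), kept for notification

    def advertise(self, provider, description):
        """Store an advertisement; return (consumer, provider) pairs to notify."""
        self.advertisements.append((provider, description))
        return [(consumer, provider)
                for consumer, pattern in self.subscriptions
                if match(pattern, description) is not None]

    def request(self, consumer, pattern, persistent=False):
        """Return (provider, bindings) for every stored advertisement that matches."""
        if persistent:
            self.subscriptions.append((consumer, pattern))
        results = []
        for provider, description in self.advertisements:
            bindings = match(pattern, description)
            if bindings is not None:
                results.append((provider, bindings))
        return results
```

A persistent request is the hook for the automatic notification service mentioned in the abstract: a consumer who subscribes before any matching provider exists is returned as a party to notify when a matching advertisement arrives later.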
Publication Details
  • International Journal of Cooperative Information Systems 6(2), 1996.
  • Feb 1, 1996

Abstract

As agents see more use in dynamic, distributed information networks, information sharing facilitators, such as the SHADE matchmaker, and underlying knowledge-based agent communication protocols, such as the Knowledge Query and Manipulation Language (KQML), will see increased use. We have created several communities of agents collaborating via KQML and matchmaking within the domains of collaborative engineering and satellite image retrieval. Based on these experiences, matchmaking has proven to be very beneficial for multi-agent systems, but we have also identified a number of issues and extensions that are vital not only to KQML-based matchmaking but also to inter-agent protocols in general. These include representational approaches to advertising complex databases, approaches to error recovery and response timing, maintaining consistency among information providers, scalability, security, persistent requests in information brokering, and the dilemma between explicit vs. implicit brokering.
1995

Multimedia Document System for Temporal and Spatial Structuring

Publication Details
  • In Hypermedia Design, Montpellier 1995. S. Fraïssé, F. Garzotto, T. Isakowitz, J. Nanard, and M. Nanard (eds.), Springer Verlag, 1996, pp. 39-58.
  • Oct 31, 1995

Abstract

Structuring temporal relationships among multimedia information elements is one of the most important facilities for editing and creating multimedia documents. We have developed a multimedia system named MediaPreview which provides facilities for structuring and creating multimedia documents. In this paper, we present a document model which is adopted in MediaPreview. This model has been designed to realize spatial and temporal structuring for the documents. The main feature of this model is the concept of a "Multimedia Paragraph" (MMP) which is introduced to reduce the complexity of the temporal structuring, such as the asynchronous interactive operation among documents. The concept of "MMP" provides an explicit and basic unit which is used to create a document in a "top-down" manner. This paper also presents the system architecture and implementation of MediaPreview in a distributed environment including database system facilities. This system realizes "static" and "dynamic" integration schemes for multimedia information elements. Our system includes a parallel database engine which manipulates multimedia information elements as streams. This database engine is effectively used for creating multimedia documents.