We are exploring the design space of information retrieval interfaces and algorithms to support exploratory, long-running, recall-oriented information needs. This kind of information seeking is typically performed by academic researchers, patent agents, medical and pharmaceutical researchers, intelligence analysts, and others. It is characterized by many queries run by one or more people, and it is an ongoing activity that spans many hours, days, or even longer periods of time. While typical web-based information retrieval systems focus on high-precision, known-item searches, we are interested in exploring more complex and demanding information retrieval tasks.
Flowchart of interactive exploratory search
We can approximate a searcher's actions with the flow chart shown on the right. Information seeking is an iterative process that consists of many steps; effective interfaces will support transitions among these steps with minimal extra effort from the user. The more seamless the interactions, the more attention searchers can devote to the search task rather than to the search tool.
To support these kinds of interactions, we are building interfaces and algorithms that allow searchers to explore collections in a variety of ways. We've started with Querium, a session-based search framework that keeps track of queries, documents, and other activities that occur in a search session to help people reflect on what they have done and to allow them to pivot among documents, queries, and terms to discover new information. Querium can be configured to use a range of collections, including DocuBrowse and the TREC newspaper corpus. Querium uses Reverted Indexing to find documents similar to documents that a user has identified as useful.
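The idea of finding documents similar to ones a user has marked as useful can be illustrated with a toy sketch. It assumes the general reverted-index idea: a fixed set of "basis queries" is run against the collection, and the results are re-indexed by document, so that a document can be used to look up the basis queries that retrieve it best. The function names, the choice of basis queries, and the scores below are all hypothetical and chosen for illustration; this is not Querium's actual implementation.

```python
from collections import defaultdict

def build_reverted_index(basis_results):
    """Re-index basis-query results by document.

    basis_results maps each basis query to {doc_id: retrieval score}.
    The result maps each doc_id to {basis query: score}.
    """
    reverted = defaultdict(dict)
    for basis_query, scored_docs in basis_results.items():
        for doc_id, score in scored_docs.items():
            reverted[doc_id][basis_query] = score
    return dict(reverted)

def best_basis_queries(reverted, useful_docs, top_n=2):
    """Return the basis queries that retrieve the useful documents best."""
    totals = defaultdict(float)
    for doc_id in useful_docs:
        for basis_query, score in reverted.get(doc_id, {}).items():
            totals[basis_query] += score
    return sorted(totals, key=lambda q: (-totals[q], q))[:top_n]

def similar_documents(basis_results, reverted, useful_docs, top_n=2):
    """Re-run the best basis queries and rank the documents not yet marked useful."""
    scores = defaultdict(float)
    for basis_query in best_basis_queries(reverted, useful_docs, top_n):
        for doc_id, score in basis_results[basis_query].items():
            if doc_id not in useful_docs:
                scores[doc_id] += score
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results of three single-term basis queries (made-up scores).
basis_results = {
    "patent": {"d1": 0.9, "d2": 0.4, "d3": 0.7},
    "search": {"d1": 0.5, "d4": 0.8},
    "recall": {"d2": 0.6, "d3": 0.2},
}
reverted = build_reverted_index(basis_results)
suggestions = similar_documents(basis_results, reverted, useful_docs={"d1"})
```

In this sketch the useful document `d1` selects the basis queries `patent` and `search`, and the documents those queries retrieve become candidates for the searcher to review.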
Querium allows users to perform the following actions:
- Search based on keywords
- Search based on groups of one or more documents
- Fuse results from multiple queries into a single list
- Sort and filter results based on document metadata and on (retrieval) process metadata
- Integrate inputs from multiple searchers working on a shared information need to implement collaborative search
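As an illustration of fusing results from multiple queries into a single list, the sketch below uses reciprocal rank fusion, a common rank-fusion technique. The fusion method Querium actually uses is not stated here, so this is a stand-in under that assumption; the function name, the constant `k=60`, and the document IDs are illustrative only.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several queries rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from two queries in the same session.
query_a = ["d1", "d2", "d3"]
query_b = ["d2", "d4", "d1"]
fused = reciprocal_rank_fusion([query_a, query_b])
```

The same function could merge result lists contributed by different searchers working on a shared information need, since it depends only on the ranks within each list.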
More details about the approach can be found in this slide deck, which was presented at the IIiX 2010 conference.
Technical Contact: Gene Golovchinsky.