DiG

Mining Amazon reviews

While there are many commercial systems to help people browse and compare products, these interfaces are typically product-centric. To help users identify products that match their needs more efficiently, we instead focus on building a task-centric interface and system.

Users first answer questions about the situations in which they expect to use the product. Based on their answers, the interface provides a ranked list of products that match their needs and presents product features related to their tasks, along with customer reviews and product specifications.

We mined Amazon product reviews for several product categories, including cameras, MFDs, and vacuums. We developed semi-automatic methods to extract the high-level information used by the system from the reviews and product specifications. These methods:

  • Identify and group product features, e.g., battery life, photo quality, screen, lens
  • Estimate and summarize the sentiment of opinions about those features
  • Identify product uses
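As a rough illustration of the first two steps, the sketch below groups feature mentions and averages the sentiment of the sentences that mention them. It is not the method used in DiG: the term-to-feature map, the opinion lexicon, and the function name `summarize_feature_sentiment` are all hypothetical, and a real system would need far richer language processing.

```python
import re
from collections import defaultdict

# Hypothetical grouping of surface terms into canonical product features.
FEATURE_TERMS = {
    "battery": "battery life",
    "photo": "photo quality",
    "photos": "photo quality",
    "picture": "photo quality",
    "screen": "screen",
    "lcd": "screen",
}

# Hypothetical opinion lexicon: word -> polarity.
OPINION_WORDS = {"great": 1, "good": 1, "sharp": 1,
                 "poor": -1, "dim": -1, "short": -1}


def summarize_feature_sentiment(reviews):
    """Return the mean sentence polarity for each feature group."""
    scores = defaultdict(list)
    for review in reviews:
        # Naive sentence split on terminal punctuation.
        for sentence in re.split(r"[.!?]+", review.lower()):
            words = re.findall(r"[a-z]+", sentence)
            polarity = sum(OPINION_WORDS.get(w, 0) for w in words)
            if polarity == 0:
                continue  # no net opinion expressed in this sentence
            # Credit the sentence polarity to every feature it mentions.
            features = {FEATURE_TERMS[w] for w in words if w in FEATURE_TERMS}
            for feature in features:
                scores[feature].append(polarity)
    return {f: sum(v) / len(v) for f, v in scores.items()}


reviews = [
    "The battery is great. The LCD is dim.",
    "Photos are sharp. The battery life is short.",
]
summary = summarize_feature_sentiment(reviews)
```

Here "battery" and "LCD" are folded into the "battery life" and "screen" groups, and the two conflicting battery opinions average out to a neutral score.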

Product ranking is based on identifying products with the most positive sentiment about features that are important for the planned product uses.
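One simple way to picture this ranking is as a weighted sum: each planned use assigns importance weights to features, and a product's score is the importance-weighted sum of its per-feature sentiment. The sketch below assumes that formulation; the weights, the product names, and the function `rank_products` are hypothetical and are not taken from the DiG system itself.

```python
def rank_products(product_sentiment, use_weights, planned_uses):
    """Rank products by importance-weighted feature sentiment.

    product_sentiment: {product: {feature: sentiment in [-1, 1]}}
    use_weights:       {use: {feature: importance weight}}  (assumed given)
    planned_uses:      uses the user selected
    """
    # Accumulate feature importance across all planned uses.
    importance = {}
    for use in planned_uses:
        for feature, weight in use_weights.get(use, {}).items():
            importance[feature] = importance.get(feature, 0.0) + weight

    def score(sentiment):
        # Features with no recorded opinions contribute nothing.
        return sum(w * sentiment.get(f, 0.0) for f, w in importance.items())

    scored = [(name, score(s)) for name, s in product_sentiment.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


product_sentiment = {
    "camera_a": {"battery life": 0.8, "photo quality": 0.2},
    "camera_b": {"battery life": -0.5, "photo quality": 0.9},
}
use_weights = {
    "travel": {"battery life": 1.0},
    "portraits": {"photo quality": 1.0},
}
travel_ranking = rank_products(product_sentiment, use_weights, ["travel"])
portrait_ranking = rank_products(product_sentiment, use_weights, ["portraits"])
```

With these made-up numbers, `camera_a` ranks first for travel because of its battery sentiment, while `camera_b` ranks first for portraits.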

User studies verified our focus on high-level features for browsing products and low-level features and specifications for comparing products.

Related Publications

2011

DiG: A task-based approach to product search

Publication Details
  • IUI 2011
  • Feb 13, 2011

Abstract

While there are many commercial systems designed to help people browse and compare products, these interfaces are typically product centric. To help users more efficiently identify products that match their needs, we instead focus on building a task centric interface and system. With this approach, users initially answer questions about the types of situations in which they expect to use the product. The interface reveals the types of products that match their needs and exposes high-level product features related to the kinds of tasks in which they have expressed an interest. As users explore the interface, they can reveal how those high-level features are linked to actual product data, including customer reviews and product specifications. We developed semi-automatic methods to extract the high-level features used by the system from online product data. These methods identify and group product features, mine and summarize opinions about those features, and identify product uses. User studies verified our focus on high-level features for browsing and low-level features and specifications for comparison.