David Ayman Shamma, Ph.D.

Senior Research Scientist

Dr. David A. Shamma is a senior research scientist at FX Palo Alto Laboratory (FXPAL). Prior to FXPAL, he was a principal investigator at Centrum Wiskunde & Informatica (CWI), where he led a project on Artificial Intelligence (AI), wearables, and fashion. Before CWI, he was the founding director of the HCI Research Group at Yahoo Labs and Flickr. He investigates social computing systems (how people interact, engage, and share media experiences both online and in-the-world) through three avenues: AI, systems & prototypes, and qualitative research; his goal is to create and understand methods for media-mediated communication in small environments and at web scale. Ayman holds a B.S./M.S. from the Institute for Human and Machine Cognition at The University of West Florida and a Ph.D. in Computer Science from the Intelligent Information Laboratory at Northwestern University. He has taught courses at the Medill School of Journalism and in many Computer Science and Studio Art departments. Prior to his Ph.D., he was a visiting research scientist in the Center for Mars Exploration at NASA Ames Research Center. Ayman's research on technology and creative acts has attracted international attention from Wired, New York Magazine, and the Library of Congress, to name a few. Outside of the lab, Ayman's media art installations have been reviewed by The New York Times and Chicago Magazine and exhibited internationally, including at the Amsterdam Dance Event, Second City Chicago, the Berkeley Art Museum, SIGGRAPH, the Chicago Improv Festival, and Wired NextFest/NextMusic.

Specialties: Artificial Intelligence, HCI, Photos, Video, Synchronous Interaction, Microblogging, Sharing, Social Networks, Design, Socio-Digital Systems.

Publications

2019
Publication Details
  • ACM MM
  • Oct 20, 2019

Abstract

Multimedia research has now moved beyond laboratory experiments and is rapidly being deployed in real-life applications including advertisements, social interaction, search, security, automated driving, and healthcare. Hence, the developed algorithms now have a direct impact on the individuals using these services and on society as a whole. While there is huge potential to benefit society using such technologies, there is also an urgent need to identify the checks and balances that ensure the impact of such technologies is ethical and positive. This panel will bring together an array of experts who have experience collecting large-scale datasets, building multimedia algorithms, and deploying them in practical applications, as well as a lawyer whose eyes have been on the fundamental rights at stake. They will lead a discussion on the ethics and lawfulness of dataset creation, licensing, privacy of individuals represented in the datasets, algorithmic transparency, algorithmic bias, explainability, and the implications of application deployment. Through an interactive process engaging the audience, the panel hopes to increase the awareness of these concepts in the multimedia research community and to initiate a discussion on community guidelines, all toward setting the future direction of conducting multimedia research in a lawful and ethical manner.
Publication Details
  • International Conference on Weblogs and Social Media (ICWSM) 2019
  • Jun 12, 2019

Abstract

Millions of images are shared through social media every day. Yet, we know little about how the activities and preferences of users depend on the content of these images. In this paper, we seek to understand viewers' engagement with photos. We design a quantitative study to expand previous research on in-app visual effects (also known as filters) through the examination of visual content identified through computer vision. This study is based on an analysis of 4.9M Flickr images and is organized around three important engagement factors: likes, comments, and favorites. We find that filtered photos are not equally engaging across different categories of content. Photos of food and people attract more engagement when filters are used, while photos of natural scenes and photos taken at night are more engaging when left unfiltered. In addition to contributing to the research around social media engagement and photography practices, our findings offer several design implications for mobile photo sharing platforms.
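
As a rough illustration of the kind of analysis behind these findings, the minimal Python sketch below groups photos by content category and filter status and compares mean engagement. The input file and the column names (category, filtered, likes, comments, favorites) are hypothetical stand-ins for the paper's actual 4.9M-photo dataset and pipeline.

    import pandas as pd

    # Hypothetical export of the photo sample; columns are assumed for illustration:
    # category (str), filtered (bool), likes, comments, favorites (numeric counts).
    photos = pd.read_csv("flickr_photos.csv")

    # Mean engagement per (content category, filter status) cell.
    engagement = (
        photos.groupby(["category", "filtered"])[["likes", "comments", "favorites"]]
              .mean()
    )

    # Ratios above 1 indicate categories where filtered photos attract more engagement.
    ratio = engagement.xs(True, level="filtered") / engagement.xs(False, level="filtered")
    print(ratio.sort_values("likes", ascending=False))
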
Publication Details
  • ACM TVX 2019
  • Jun 5, 2019

Abstract

Advancements in 360° cameras have increased the prevalence of the livestreams they produce. In the case of video conferencing, 360° cameras provide almost unrestricted visibility into a conference room for a remote viewer without the need for an articulating camera. However, local participants are left wondering whether someone is connected and where remote participants might be looking. To address this, we fabricated a prototype device that shows the gaze and presence of remote 360° viewers using a ring of LEDs that match the remote viewports. We discuss the long-term use of one of the prototypes in a lecture hall and present future directions for visualizing gaze presence in 360° video streams.
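
The core mapping such a device needs is from a remote viewer's viewport direction to a position on the LED ring. The sketch below shows one way this could work; the LED count and the on/off frame format are assumptions for illustration, not details from the paper.

    NUM_LEDS = 24  # assumed ring size

    def led_for_yaw(yaw_degrees: float, num_leds: int = NUM_LEDS) -> int:
        """Map a viewport center yaw (0-360 degrees) to the nearest LED index."""
        step = 360.0 / num_leds
        return round((yaw_degrees % 360.0) / step) % num_leds

    def frame_for_viewers(viewer_yaws):
        """Build an on/off frame for the ring from all remote viewers' yaws."""
        frame = [0] * NUM_LEDS
        for yaw in viewer_yaws:
            frame[led_for_yaw(yaw)] = 1
        return frame

    # Two remote viewers looking at roughly opposite sides of the room.
    print(frame_for_viewers([10.0, 185.0]))
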
Publication Details
  • ACM TVX 2019
  • Jun 5, 2019

Abstract

Livestreaming and video calls have grown in popularity due to increased connectivity and advancements in mobile devices. Our interactions with these cameras are limited, as the cameras are either fixed or manually remote controlled. Here we present a Wizard-of-Oz elicitation study to inform the design of interactions with smart 360° cameras or robotic mobile desk cameras for use in video-conferencing and livestreaming situations. There was an overall preference for devices that minimize distraction, as well as for devices that demonstrate an understanding of video-meeting context. We find that participants dynamically grow the complexity of their interactions, which illustrates the need for deeper event semantics within the camera AI. Finally, we detail interaction techniques and design insights to inform the next generation of personal video cameras for streaming and collaboration.
2018

AI for Toggling the Linearity of Interactions in AR

Publication Details
  • IEEE AIVR 18
  • Dec 10, 2018

Abstract

Interaction in Augmented Reality or Mixed Reality environments is generally classified into two modalities: linear (relative to an object) or non-linear (relative to the camera). Switching between these modes can be arduous in cases where someone's interaction with the device is limited or restricted, as is often the case in medical or industrial applications where one's hands might be sterile or soiled. To solve this, we present Sound-to-Experience, in which the modality can be effectively toggled by a noise or sound detected using a modern Artificial Intelligence deep-network classifier.
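
To make the toggle mechanism concrete, here is a minimal sketch of the control loop. The paper uses a deep-network sound classifier; the loudness (RMS) test below is only a hypothetical stand-in for that classifier, and the mode names and trigger condition are assumptions.

    import numpy as np

    def detected_trigger(window: np.ndarray, threshold: float = 0.5) -> bool:
        """Stand-in for the deep-network classifier: a crude loudness (RMS) test."""
        return float(np.sqrt((window ** 2).mean())) > threshold

    def run_toggle_loop(audio_windows):
        mode = "linear"  # start relative-to-object
        for window in audio_windows:
            if detected_trigger(window):
                # Flip between object-relative and camera-relative interaction.
                mode = "non-linear" if mode == "linear" else "linear"
                print("Interaction mode:", mode)
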
Publication Details
  • ACM Intl. Conf. on Multimedia Retrieval (ICMR)
  • Jun 11, 2018

Abstract

Massive Open Online Course (MOOC) platforms have scaled online education to unprecedented enrollments, but remain limited by their rigid, predetermined curricula. Increasingly, professionals consume this content to augment or update specific skills rather than complete degree or certification programs. To better address the needs of this emergent user population, we describe a visual recommender system called MOOCex. The system recommends lecture videos across multiple courses and content platforms to provide a choice of perspectives on topics. The recommendation engine considers both video content and sequential inter-topic relationships mined from course syllabi. Furthermore, it allows for interactive visual exploration of the semantic space of recommendations within a learner's current context.
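
One simple way to read the engine's idea of combining content with sequence is as a blended ranking score. The sketch below is a hypothetical simplification: the embedding vectors, the syllabus-derived transition probability, and the weight alpha are all assumptions, not the paper's actual model.

    import numpy as np

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def score(current_vec, candidate_vec, topic_transition_prob, alpha=0.6):
        """Blend content similarity with syllabus-mined inter-topic sequence likelihood."""
        return alpha * cosine(current_vec, candidate_vec) + (1 - alpha) * topic_transition_prob

    # Rank two candidate videos against the learner's current video.
    current = np.array([0.2, 0.7, 0.1])
    candidates = {
        "video_a": (np.array([0.25, 0.65, 0.10]), 0.8),  # similar topic, likely next in sequence
        "video_b": (np.array([0.90, 0.05, 0.05]), 0.1),  # different topic, unlikely next
    }
    ranked = sorted(candidates, key=lambda v: score(current, *candidates[v]), reverse=True)
    print(ranked)
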
Publication Details
  • Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
  • Apr 21, 2018

Abstract

Massive Open Online Course (MOOC) platforms have scaled online education to unprecedented enrollments, but remain limited by their rigid, predetermined curricula. This paper presents MOOCex, a technique that offers a more flexible learning experience for MOOCs. MOOCex recommends lecture videos across different courses with multiple perspectives, considering both video content and sequential inter-topic relationships mined from course syllabi. MOOCex is also equipped with an interactive visualization that allows learners to explore the semantic space of recommendations within their current learning context. The results of comparisons to traditional methods, including content-based recommendation and ranked-list representation, indicate the effectiveness of MOOCex. Further, feedback from MOOC learners and instructors suggests that MOOCex enhances both MOOC-based learning and teaching.
Publication Details
  • CHI 2018
  • Apr 21, 2018

Abstract

This paper describes the development of a multi-sensory clubbing experience which was deployed during a two-day event within the context of the Amsterdam Dance Event in October 2016 in Amsterdam. We present how the entire experience was developed end-to-end and deployed at the event through the collaboration of several project partners from industries such as art and design, music, food, technology, and research. Central to the system are smart textiles, namely wristbands equipped with Bluetooth LE sensors, which were used to sense people attending the dance event. We describe the components of the system, the development process, the collaboration between the involved entities, and the event itself. To conclude the paper, we highlight insights gained from conducting a real-world research deployment across many collaborators and stakeholders.
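
On the sensing side, detecting nearby wristbands amounts to a Bluetooth LE scan. The sketch below uses the bleak Python library to show the idea; the "ADE-" device-name prefix used to recognize wristbands is a hypothetical assumption, not the deployment's actual protocol.

    import asyncio
    from bleak import BleakScanner

    async def scan_for_wristbands(duration: float = 5.0):
        """Scan for BLE advertisements and keep devices that look like wristbands."""
        devices = await BleakScanner.discover(timeout=duration)
        wristbands = [d for d in devices if d.name and d.name.startswith("ADE-")]
        for d in wristbands:
            print(d.address, d.name)  # an attendee is near this scanner
        return wristbands

    if __name__ == "__main__":
        asyncio.run(scan_for_wristbands())
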

Rethinking Summarization and Storytelling for Modern Social Multimedia

Publication Details
  • Multimedia Modeling
  • Feb 5, 2018

Abstract

Traditional summarization initiatives have focused on specific types of documents such as articles, reviews, videos, image feeds, or tweets, a practice which may result in pigeonholing the summarization task in the context of modern, content-rich multimedia collections. Consequently, much of the research to date has revolved around mostly toy problems in narrow domains, working on single-source media types. We argue that summarization and story generation systems need to refocus the problem space in order to meet the information needs in the age of user-generated content in different formats and languages. Here we create a framework for flexible multimedia storytelling. Narratives, stories, and summaries carry a set of challenges in big data and dynamic multi-source media that give rise to new research in spatial-temporal representation, viewpoint generation, and explanation.
2017
Publication Details
  • ACM MM Workshop
  • Oct 23, 2017

Abstract

Humans are complex, and their behaviors follow complex multimodal patterns; however, to solve many social computing problems, one often looks at complexity in large-scale yet single-point data sources or methodologies. While single-data/single-method techniques, fueled by large-scale data, have enjoyed some success, they are not without fault. Often, with one type of data and method, all the other aspects of human behavior are overlooked, discarded, or, worse, misrepresented. We identify this as two distinct problems: first, social computing problems that cannot be solved using a single data source and need intelligence from multiple modalities; and second, social behavior that cannot be fully understood using only one form of methodology. Throughout this talk, we discuss these problems and their implications, illustrate examples, and propose new directions for properly approaching social computing research in today's age.