Tangible Ideas | Jin Gao

AI-mediated brainstorming on a tabletop AR interface — speech to semantic graph to gesture-based co-ideation.

Introduction

We present an AI-mediated brainstorming system that treats artificial intelligence as an active conversation partner—not only a transcription or organization tool. Users speak ideas aloud, see them transformed into a shared visual map in real time, and interact through gesture-based input on a projection-based tabletop canvas.

Brainstorming is open-ended and rarely documented in ways that preserve spoken thought, emerging themes, and shifts in attention. People gesture, cluster, and spatially organize ideas; this project asks how AI can leverage embodied, co-located interaction to support human–AI co-ideation. The system integrates LLM-driven speech understanding with gesture on tabletop AR: it listens, extracts salient concepts, visualizes them as a live graph network, and lets users elaborate single ideas or synthesize relationships across multiple nodes—positioning AI as a partner that makes brainstorming more visual, collaborative, and tangible.

Background research on tactile mapping and data physicalization.

System Pipeline

Live speech is transcribed with OpenAI Whisper. A meeting agenda provides context so the system can judge each utterance’s relevance. For every sentence, an LLM runs three parallel operations:

Keyword extraction — a concise phrase capturing the central idea.
Importance scoring — relevance to the agenda, filtering less meaningful content from the map.
Semantic embedding — a high-dimensional vector representing meaning.
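
The three per-sentence operations listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `IdeaNode` record, the `process_sentence` function, the three callables standing in for model calls, and the 0.5 importance cutoff are all hypothetical names and values chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IdeaNode:
    """One candidate node for the map (hypothetical record layout)."""
    sentence: str
    keyword: str
    importance: float  # assumed to lie in [0, 1]
    embedding: List[float]


def process_sentence(
    sentence: str,
    agenda: str,
    extract_keyword: Callable[[str], str],
    score_importance: Callable[[str, str], float],
    embed: Callable[[str], List[float]],
    min_importance: float = 0.5,  # assumed cutoff, not from the source
) -> Optional[IdeaNode]:
    """Run the three operations concurrently and drop low-relevance sentences."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        keyword_future = pool.submit(extract_keyword, sentence)
        importance_future = pool.submit(score_importance, sentence, agenda)
        embedding_future = pool.submit(embed, sentence)
        keyword = keyword_future.result()
        importance = importance_future.result()
        embedding = embedding_future.result()

    # Sentences judged irrelevant to the agenda never reach the map.
    if importance < min_importance:
        return None
    return IdeaNode(sentence, keyword, importance, embedding)
```

Passing the model calls in as plain callables keeps the sketch independent of any particular LLM or embedding provider; in practice each callable would wrap a network request.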

Embeddings are projected to 2D with t-SNE, clustered, and linked into a graph visualization (nodes = condensed ideas, edges = semantic relationships), rendered with D3.js and React. Hand tracking supports tangible interaction: a long press on one node prompts the LLM to generate related ideas; selecting two nodes proposes a bridging concept that connects separate themes.
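
How semantic edges are derived is not specified here. One plausible approach, shown below purely as an assumption, links two ideas when the cosine similarity of their embeddings exceeds a threshold. The function names and the 0.7 threshold are illustrative, and the t-SNE projection and clustering steps are omitted.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(
    embeddings: Dict[str, List[float]],
    threshold: float = 0.7,  # assumed cutoff, not from the source
) -> List[Tuple[str, str, float]]:
    """Return (keyword_a, keyword_b, similarity) for every sufficiently similar pair."""
    keywords = list(embeddings)
    edges = []
    for i, first in enumerate(keywords):
        for second in keywords[i + 1:]:
            similarity = cosine_similarity(embeddings[first], embeddings[second])
            if similarity >= threshold:
                edges.append((first, second, similarity))
    return edges
```

The resulting edge list maps naturally onto the nodes-and-links input that a D3.js force layout expects, although the actual rendering code may be organized differently.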

System pipeline: data flow from speech through LLM processing to semantic graph visualization.

Interactive mind-map exploration through gesture.
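
The two gestures described in the pipeline, a long press on one node and a selection of two nodes, could be recognized along these lines. The sketch assumes the hand tracker delivers timestamped fingertip positions in canvas coordinates; `FingertipSample`, the 30-unit radius, the one-second hold time, and the action names are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FingertipSample:
    """One hand-tracking reading (assumed format)."""
    t: float  # timestamp in seconds
    x: float  # canvas coordinates
    y: float


def detect_long_press(
    samples: List[FingertipSample],
    node_xy: Tuple[float, float],
    radius: float = 30.0,       # assumed hit radius around the node
    hold_seconds: float = 1.0,  # assumed dwell time
) -> bool:
    """True if the fingertip stays within `radius` of the node for `hold_seconds`."""
    press_start: Optional[float] = None
    for sample in samples:
        distance = math.hypot(sample.x - node_xy[0], sample.y - node_xy[1])
        if distance <= radius:
            if press_start is None:
                press_start = sample.t
            if sample.t - press_start >= hold_seconds:
                return True
        else:
            # Leaving the node resets the dwell timer.
            press_start = None
    return False


def choose_action(selected_nodes: List[str]) -> Optional[str]:
    """Map the current selection to the LLM request to issue, if any."""
    if len(selected_nodes) == 1:
        return "elaborate"  # generate ideas related to the single node
    if len(selected_nodes) == 2:
        return "bridge"     # propose a concept connecting the two nodes
    return None
```

Resetting the timer whenever the fingertip leaves the hit radius is one simple way to reduce accidental triggers, at the cost of being sensitive to tracking jitter near the boundary.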

Preliminary Evaluation

In a preliminary self-study, the system transformed spoken discussion into a structure that was easier to follow than a transcript alone. Keyword reduction surfaced main themes while keeping the map readable; importance scoring helped the visualization stay focused during conversation.

The node layout revealed clusters of related ideas and gaps between topics—supporting documentation of what was said and discovery of directions to explore. Single-node elaboration deepened one idea; two-node selection often produced bridging concepts. Gesture interaction felt lightweight and conversational without traditional UI chrome, though robust targeting and feedback remain areas for improvement. Overall, AI contributed not only by organizing conversation but by generating new ideas through tangible interaction.

Physical Idea Droplets

Physical droplet prototype responding to hand gestures.

Robotic droplets: Convert digital ideas into physical form.
Physical volume: Explore the interaction between on-site and remote actors.
Movable ideation: Tangible droplets move along with the mind map.
Recognition and interaction: Physical droplets capture ideas and embed them into the mind map to form new ideas.

Droplet hardware — transparent shell, LED ring, and wheeled base

Droplet interaction flow with the projected mind map.

Idealized visualization of multiple droplets coordinating on a shared surface.

Future Work

We plan broader user studies, more robust gesture interaction, and richer AI participation—proactive intervention, multi-agent collaboration, and documenting how ideas evolve over time—as steps toward embodied human–AI collaboration for ideation and creative dialogue.