Watch-Later Visualization: 3D YouTube content explorer

The Problem with Watch Later Lists

YouTube’s Watch Later feature has a fundamental flaw: most videos added never get watched. Like browser bookmarks, these lists become digital hoarding grounds where content accumulates but rarely surfaces. The simple chronological list doesn’t help us understand why we saved something or how it relates to our broader interests.

This project visualizes Watch Later history as a three-dimensional map that reveals patterns and relationships between saved content.

Transforming Chaos into Clarity

Personal data is messy. Multilingual titles, clickbait thumbnails, trending keywords, and fleeting hype all obscure genuine signals. The preprocessing pipeline extracts meaning from noise:

Language normalization: Amazon Translate standardizes multilingual content, creating a consistent dataset for analysis
Signal extraction: TF-IDF filtering surfaces terms that are distinctive to a few videos while suppressing words that appear across most of the dataset
Dimensionality reduction: Truncated SVD compresses the sparse TF-IDF matrix into a compact latent space where semantic similarity becomes geometric proximity
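
To make the signal-extraction step concrete, here is a minimal TF-IDF sketch in TypeScript. It is an illustration under stated assumptions, not the project's actual pipeline: the VideoDoc shape and the tokenize and tfidf names are hypothetical, titles are assumed to be already translated to English, and the plain unsmoothed log(N / df) weighting is one of several common variants.

```typescript
// Hypothetical input shape; the project's real field names are not given in the text.
interface VideoDoc {
  id: string;
  title: string; // assumed to be already translated to English
}

// Lowercase, split on non-alphanumeric characters, drop one-letter tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 1);
}

// Returns one sparse term -> weight map per document.
function tfidf(docs: VideoDoc[]): Map<string, number>[] {
  const tokenized = docs.map((d) => tokenize(d.title));

  // Document frequency: in how many titles does each term occur?
  const df = new Map<string, number>();
  for (const tokens of tokenized) {
    new Set(tokens).forEach((term) => {
      df.set(term, (df.get(term) ?? 0) + 1);
    });
  }

  const n = docs.length;
  return tokenized.map((tokens) => {
    // Raw term counts within this title.
    const counts = new Map<string, number>();
    for (const t of tokens) counts.set(t, (counts.get(t) ?? 0) + 1);

    const weights = new Map<string, number>();
    counts.forEach((count, term) => {
      const tf = count / tokens.length;
      // Unsmoothed idf: exactly 0 for a term that appears in every title.
      const idf = Math.log(n / (df.get(term) ?? 1));
      weights.set(term, tf * idf);
    });
    return weights;
  });
}
```

With this weighting, a word shared by every title contributes nothing, while a word unique to one title dominates that title's vector.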

The result is a semantically structured dataset where each video has a position in conceptual space and a relevance score reflecting its importance.
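
The "position in conceptual space" can be sketched as a projection onto the top singular directions of the TF-IDF matrix. The following is a small dense illustration that uses power iteration with deflation. The function names are hypothetical, and a real pipeline would more likely rely on a sparse SVD routine from a numerical library; the text does not say which one the project uses.

```typescript
type Matrix = number[][];

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function normalize(v: number[]): number[] {
  const norm = Math.sqrt(dot(v, v));
  return norm > 0 ? v.map((x) => x / norm) : v;
}

// Remove from v its components along each unit-length basis vector.
function orthogonalize(v: number[], basis: number[][]): number[] {
  let out = v.slice();
  for (const b of basis) {
    const proj = dot(out, b);
    out = out.map((x, i) => x - proj * b[i]);
  }
  return out;
}

// Top-k right singular vectors of `a` (rows = videos, columns = terms),
// found by power iteration on A^T A, deflating against vectors already found.
function topRightSingularVectors(a: Matrix, k: number, iterations = 100): number[][] {
  const cols = a[0].length;
  const found: number[][] = [];
  for (let c = 0; c < k; c++) {
    // Deterministic, non-uniform start vector (an arbitrary choice for this sketch).
    let v: number[] = [];
    for (let j = 0; j < cols; j++) v.push(1 / (j + 1));
    v = normalize(orthogonalize(v, found));

    for (let it = 0; it < iterations; it++) {
      const current = v;
      const av = a.map((row) => dot(row, current)); // A v
      const next: number[] = [];
      for (let j = 0; j < cols; j++) {
        let sum = 0;
        for (let i = 0; i < a.length; i++) sum += a[i][j] * av[i]; // A^T (A v)
        next.push(sum);
      }
      v = normalize(orthogonalize(next, found));
    }
    found.push(v);
  }
  return found;
}

// Each video's latent coordinates: its TF-IDF row projected onto the k directions.
function project(a: Matrix, directions: number[][]): number[][] {
  return a.map((row) => directions.map((d) => dot(row, d)));
}
```

Calling project(matrix, topRightSingularVectors(matrix, 3)) would give each video three coordinates. If k exceeds the rank of the matrix, the extra directions collapse to zero vectors in this sketch.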

Preprocessing pipeline — Translate → TF‑IDF → SVD → Topics

Spatial Metaphor: Size and Distance

The visualization translates abstract relevance into intuitive spatial properties:

Size encodes importance: More relevant videos appear larger, naturally drawing attention
Distance from center: Core interests cluster near the origin; peripheral curiosities drift to the edges

This relevance-based spatial mapping transforms a flat chronological list into explorable terrain—dense hubs reveal obsessions, sparse outskirts show passing interests.
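
One way to express this mapping in code is sketched below. It assumes a relevance score normalized to [0, 1] and three latent coordinates per video, used only for direction. The placeVideo name, the Placement shape, and all default constants are illustrative assumptions rather than values taken from the project.

```typescript
interface Placement {
  position: [number, number, number];
  scale: number;
}

// All default constants are illustrative assumptions, not values from the project.
function placeVideo(
  latent: [number, number, number],
  relevance: number,
  maxRadius = 1000,
  minScale = 0.5,
  maxScale = 2,
): Placement {
  // Clamp relevance to [0, 1] so out-of-range scores cannot invert the layout.
  const r = Math.min(1, Math.max(0, relevance));

  // Use the latent coordinates only for direction; fall back to the x axis for a zero vector.
  const [x, y, z] = latent;
  const length = Math.sqrt(x * x + y * y + z * z);
  const dir = length > 0 ? [x / length, y / length, z / length] : [1, 0, 0];

  // High relevance -> small radius (near the origin) and large scale.
  const radius = (1 - r) * maxRadius;
  return {
    position: [dir[0] * radius, dir[1] * radius, dir[2] * radius],
    scale: minScale + r * (maxScale - minScale),
  };
}
```

In a Three.js scene, the result could be applied with object.position.set(...placement.position) and object.scale.setScalar(placement.scale). A linear mapping is the simplest choice; a square-root or logarithmic curve may spread crowded cores more evenly.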

Two Views, One System

The interface provides complementary exploration modes:

Keyword Sphere: High-level thematic overview where hovering reveals relationships between topics
Video Universe: Detailed neighborhood view where individual videos form constellations around shared themes

Keyword Sphere (left) and Video Universe (right) — overview vs neighborhood

Technical Implementation

The implementation prioritizes simplicity and performance:

Three.js CSS3DRenderer: Positions DOM elements (thumbnails, labels) in 3D space using CSS transforms, so text stays crisp and elements stay interactive without custom WebGL shaders
Svelte + TypeScript: Lightweight, reactive UI layer with strong typing
TrackballControls: Free camera rotation, zoom, and pan with damped motion
Hybrid computation: Accepts precomputed JSON (TF-IDF+SVD results) or performs heuristic calculations client-side
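
The hybrid approach in the last item could look roughly like the sketch below: use precomputed coordinates and relevance when the JSON provides them, and fall back to a client-side heuristic otherwise. The field names, the resolvePoints function, and the fallback itself (hash-based coordinates plus a shared-word relevance score) are placeholders chosen for illustration; the text does not describe the project's actual heuristic.

```typescript
type Vec3 = [number, number, number];

// Hypothetical JSON shape; coords and relevance exist only if the offline pipeline ran.
interface VideoInput {
  id: string;
  title: string;
  coords?: Vec3;
  relevance?: number;
}

interface VideoPoint {
  id: string;
  coords: Vec3;
  relevance: number;
  source: "precomputed" | "heuristic";
}

// Deterministic FNV-1a style string hash mapped to [0, 1).
function hash01(text: string): number {
  let h = 2166136261;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0) / 4294967296;
}

function words(title: string): string[] {
  return title.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2);
}

function resolvePoints(videos: VideoInput[]): VideoPoint[] {
  return videos.map((video): VideoPoint => {
    // Prefer the precomputed TF-IDF + SVD results when both fields are present.
    if (video.coords !== undefined && video.relevance !== undefined) {
      return { id: video.id, coords: video.coords, relevance: video.relevance, source: "precomputed" };
    }

    // Placeholder heuristic: relevance is the share of other titles with a word in common.
    const own = words(video.title);
    const others = videos.filter((v) => v.id !== video.id);
    const related = others.filter((v) => words(v.title).some((w) => own.indexOf(w) !== -1));
    const relevance = others.length > 0 ? related.length / others.length : 0;

    // Stable pseudo-random coordinates in [-1, 1) so the layout does not change between loads.
    const coords: Vec3 = [
      hash01(video.id + ":x") * 2 - 1,
      hash01(video.id + ":y") * 2 - 1,
      hash01(video.id + ":z") * 2 - 1,
    ];
    return { id: video.id, coords, relevance, source: "heuristic" };
  });
}
```

Tagging each point with its source keeps the two paths distinguishable downstream, which can help when comparing the heuristic layout against the precomputed one.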

Design Philosophy

Two principles guided the experience:

Legibility over spectacle: Quiet typography and subtle motion let the data communicate; the visualization serves the content, not the reverse
Performance as UX: Prefetching, minimal DOM reflows, and conservative animations maintain smooth exploration without distraction

Key Insights

Building this project reinforced several lessons:

Classical NLP remains powerful: TF-IDF and SVD, though decades old, provide clarity and interpretability that modern neural methods often sacrifice for marginal accuracy gains
Direct relevance visualization works: Encoding importance as size and centrality makes patterns immediately obvious—no legend required

Live Demo

The project is deployed at Youniverse, where you can explore the visualization with sample data. Exploring your own data takes more setup: the system needs your YouTube watch history, which has to be exported from YouTube manually.