Claude explored the TXT folder structure, discovered 12 monthly text files containing ~115,590 lines of poetry, and examined the format to understand how poems were separated (blank lines and whitespace).
Created a custom Python script to extract individual poems, generate word-frequency embeddings, and apply PCA dimensionality reduction to create 2D coordinates. The original plan was to use UMAP, but the script fell back to PCA when external ML libraries weren't available.
Ran the Python script to process all 8,005 poems, extracting the 150 most common words as features, computing vectors, and projecting into 2D space. Output saved as JSON (2.3MB).
Built a pure vanilla HTML/JavaScript/D3.js visualization with hover interactions, monochrome aesthetic, and integrated project context from the original ReRites site.
Created README and this "How It Was Made" page to document the process and provide context.
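As a rough illustration of the extraction step above, here is a minimal sketch of splitting one monthly text file into poems. It assumes that poems are separated by one or more blank (or whitespace-only) lines; the function name split_poems and the sample text are hypothetical, not taken from process_poems_optimized.py.

```python
import re

def split_poems(text):
    """Split raw text into poems separated by blank or whitespace-only lines."""
    # Assumption: any run of empty/whitespace-only lines marks a poem boundary.
    chunks = re.split(r"\n\s*\n", text.strip())
    return [chunk.strip() for chunk in chunks if chunk.strip()]

# Hypothetical sample standing in for one monthly file.
sample = "first poem line one\nline two\n\n   \nsecond poem\n\n\nthird poem\n"
poems = split_poems(sample)
print(len(poems), "poems found")
```

If poems in the corpus contain internal stanza breaks, a stricter separator (for example, two or more blank lines) would be needed instead.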
Why word-frequency embeddings instead of transformers?
The VM environment had limited connectivity, so Claude built a standalone solution using only Python standard libraries. Word-frequency vectors (TF-IDF style) capture shared vocabulary rather than deeper semantics, but that is enough to group poems by recurring words and themes without requiring large language models.
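One way such word-frequency vectors could be built with only the standard library is sketched below. This is an assumption-laden illustration: the tokenizer, the function names, and the use of plain relative term frequency (without the inverse-document-frequency weighting that "TF-IDF style" implies) are simplifications, not the project's actual code.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase alphabetic tokens are a good enough word split.
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(poems, size=150):
    """Return the `size` most common words across all poems."""
    counts = Counter()
    for poem in poems:
        counts.update(tokenize(poem))
    return [word for word, _ in counts.most_common(size)]

def vectorize(poem, vocabulary):
    """Relative frequency of each vocabulary word within one poem."""
    counts = Counter(tokenize(poem))
    total = sum(counts.values()) or 1  # avoid division by zero for empty poems
    return [counts[word] / total for word in vocabulary]

# Hypothetical toy corpus; the real run used 150 features over 8,005 poems.
poems = ["the sea the sky", "the stone", "sky and stone and sea"]
vocab = build_vocabulary(poems, size=3)
vectors = [vectorize(p, vocab) for p in poems]
```

Each poem becomes a fixed-length list of numbers, one per vocabulary word, which is the input the dimensionality-reduction step expects.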
Why PCA instead of UMAP/t-SNE?
PCA was implementable from scratch in pure Python, produces stable and reproducible results, and preserves the major variance in the data—perfect for showing thematic neighborhoods across 8,000 poems.
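For readers curious what "from scratch in pure Python" can look like, the following is a minimal sketch of PCA using power iteration with deflation. It is one possible approach under stated assumptions, not necessarily the method the project's script uses; all function names and the sample data are hypothetical.

```python
import random

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(matrix, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in matrix]

def top_eigenvector(matrix, iterations=200, seed=0):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    v = [rng.random() for _ in matrix]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(matrix, v)))
    return v, eigenvalue

def pca_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(2):
        v, lam = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the found component so the next pass finds the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return [[sum(a * b for a, b in zip(r, c)) for c in components] for r in centered]

# Hypothetical 3-feature sample; the real vectors had 150 features.
rows = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-2.0, 0.5, -1.0],
        [1.0, -1.0, 0.5], [-1.0, -0.5, -0.5]]
coords = pca_2d(rows)
```

Because the starting vector is seeded, repeated runs give the same 2D layout, which matches the stability argument made above.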
Why vanilla HTML instead of React?
A simple visualization like this does not need a heavy framework. Vanilla JavaScript with D3.js is leaner and faster, needs no build step, has a single library dependency to maintain, and will keep working without framework updates breaking it.
Why a .js file instead of loading JSON?
Browsers block requests for local JSON files when a page is opened directly from disk (file:// URLs fall foul of same-origin/CORS security policies), so loading the JSON would require running a local web server. Because the data is converted to a JavaScript file (poems_data.js) that the page includes as a script, the visualization works instantly: just double-click the HTML file. No server, no setup, no hassle.
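The JSON-to-JavaScript conversion can be as small as wrapping the serialized data in a variable assignment. The sketch below assumes a global named POEMS_DATA and a record shape with x, y, and text fields; both are hypothetical, since the actual contents of poems_data.js are not shown here.

```python
import json

def json_to_js(data, variable_name="POEMS_DATA"):
    """Wrap JSON-serializable data in a JavaScript constant declaration."""
    # JSON text is also a valid JavaScript expression, so no further escaping is needed.
    return "const " + variable_name + " = " + json.dumps(data, ensure_ascii=False) + ";\n"

# Hypothetical record shape: 2D coordinates plus the poem text.
records = [{"x": 0.12, "y": -0.4, "text": "a sample poem"}]
js_source = json_to_js(records)

# Writing the file the page would include via a script tag:
# with open("poems_data.js", "w", encoding="utf-8") as f:
#     f.write(js_source)
```

The HTML page can then read the global directly after including the script, with no network request involved.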
Clean aesthetic: Light background with black points creates a clear, readable field—poems as a constellation in semantic space.
Hover interaction: Reading appears on demand, respecting the contemplative pace of poetry. No clicks required.
Minimal UI: The visualization itself is the interface. No complex controls, filters, or menus—just pure exploration.
New iteration: This is explicitly framed as a 2026 re-exploration of the original 2017-2018 ReRites project. The page integrates the original project description, links back to David Jhave Johnston's site, and clarifies this as a new data visualization method for engaging with the corpus—a regeneration of the material through contemporary AI tools.
rerites_visualization.html - Main interactive visualization (9.5KB)
poems_data.js - Poem data as JavaScript (2.0MB)
poems_visualization_data.json - Same data in JSON format (2.3MB)
how_it_was_made.html - This page (12KB)
process_poems_optimized.py - Python processing script
README.md - Technical documentation
Important: The HTML file needs poems_data.js in the same folder to work. No web server required: just double-click the HTML file!
This entire visualization was created in a single 30-minute session using Claude's Cowork mode, which provides AI assistance with file access, code execution, and data processing capabilities.
Model Used: Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
Date: January 22, 2026