How This Page Was Made


The Original Prompt

User: You are currently within a folder called regen, and that folder contains a folder called TXT, which contains text files. The text files are all monthly records from a project which used AI in 2018 to write poetry with human editing; once a month it would publish a book, so all the books are there from 2017 to 2018. What I would like you to do is develop an interface that exposes a data analysis of all of that poetry. Could you run a UMAP or t-SNE analysis on it, and then from that data, which will give you some sort of two-dimensional neighbourhood embedding of these poems, could you create a monochrome data visualization, browser-based to keep it relatively simple? So please propose a plan for how to do that. It would need to include details from the following website, which is the preceding project upon which it's based, so we need to take this synopsis information, link to that origin website, and then say this is the further 2026 extrapolation using Claude Cowork to develop a data visualization and a regeneration process. So we have: exploring the poems. If we had a two-dimensional point field of where the poems are in this neighbourhood embedding space, then rolling over each of those points we could read each of these small poems by just rolling over a dot. So that's the beginning of the interface I'd like you to plan to build. Here's the origin site: https://glia.ca/rerites/

Creation Timeline

Step 1: Data Exploration (5 minutes)

Claude explored the TXT folder structure, discovered 12 monthly text files containing ~115,590 lines of poetry, and examined the format to understand how poems were separated (blank lines and whitespace).

Step 2: Python Processing Script (10 minutes)

Created a custom Python script to extract individual poems, generate word-frequency embeddings, and apply PCA dimensionality reduction to produce 2D coordinates. The original plan was to use UMAP, but Claude adapted to PCA when external ML libraries weren't available.
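The extraction step described above can be sketched in a few lines. This is illustrative, not the actual script: the function name and exact regex are assumptions, but the approach (splitting on runs of blank lines and whitespace) matches what the page describes.

```python
import re

def extract_poems(text):
    """Split raw monthly text into poems on runs of blank lines."""
    chunks = re.split(r"\n\s*\n", text)
    # Drop empty fragments and strip surrounding whitespace.
    return [c.strip() for c in chunks if c.strip()]
```

Running this over each of the 12 monthly files and concatenating the results would yield the full poem list.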

Step 3: Data Processing (3 minutes)

Ran the Python script over all 8,005 poems, using the 150 most common words as features, computing a frequency vector for each poem, and projecting the vectors into 2D space. Output saved as JSON (2.3 MB).

Step 4: HTML Visualization (8 minutes)

Built a single-file vanilla HTML/JavaScript visualization using D3.js, with hover interactions, a monochrome aesthetic, and integrated project context from the original ReRites site.

Step 5: Documentation (4 minutes)

Created README and this "How It Was Made" page to document the process and provide context.

Project Statistics
Total Time
~30 minutes
Poems Processed
8,005
Source Files
12 months
Code Lines
~200 (Python + HTML)

Technical Approach

Why word-frequency embeddings instead of transformers?
The VM environment had limited connectivity, so Claude built a standalone solution using only the Python standard library. Word-frequency vectors (TF-IDF style) capture lexical and thematic similarity well enough to cluster poems by shared vocabulary, without requiring large language models.
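A minimal sketch of such an embedding, assuming a vocabulary of the most common words and simple per-poem frequency normalization (function names are hypothetical, not the project's actual code):

```python
from collections import Counter

def build_vocab(poems, size=150):
    """Collect the `size` most common words across all poems."""
    counts = Counter(w for p in poems for w in p.lower().split())
    return [w for w, _ in counts.most_common(size)]

def embed(poem, vocab):
    """Map a poem to a vector of normalized word frequencies."""
    c = Counter(poem.lower().split())
    total = sum(c.values()) or 1
    return [c[w] / total for w in vocab]

poems = ["the rain the rain", "rain over stone"]
vocab = build_vocab(poems, size=3)
vec = embed(poems[0], vocab)  # one 3-dimensional frequency vector
```

A real pipeline would also strip punctuation and perhaps apply IDF weighting, but the shape of the representation is the same.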

Why PCA instead of UMAP/t-SNE?
PCA can be implemented from scratch in pure Python, produces stable and reproducible results, and preserves the directions of greatest variance in the data. It doesn't preserve local neighbourhoods as faithfully as UMAP or t-SNE, but it is sufficient for showing broad thematic neighborhoods across 8,000 poems.
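A from-scratch 2D PCA along these lines can be written with power iteration and deflation, using nothing beyond the standard library. This is a hedged sketch of the technique, not the script Claude actually produced:

```python
def pca_2d(data, iters=100):
    """Project rows of `data` onto their first two principal components."""
    n, d = len(data), len(data[0])
    # Center each column on its mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    X = [[row[j] - means[j] for j in range(d)] for row in data]

    def matvec(v):
        # Compute (X^T X) v without building the covariance matrix.
        Xv = [sum(X[i][j] * v[j] for j in range(d)) for i in range(n)]
        return [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(d)]

    def power_iter(deflate=None):
        v = [1.0 / d ** 0.5] * d  # simple deterministic start vector
        for _ in range(iters):
            w = matvec(v)
            if deflate is not None:
                # Remove the first component so we converge on the second.
                dot = sum(a * b for a, b in zip(w, deflate))
                w = [a - dot * b for a, b in zip(w, deflate)]
            norm = sum(x * x for x in w) ** 0.5 or 1.0
            v = [x / norm for x in w]
        return v

    pc1 = power_iter()
    pc2 = power_iter(deflate=pc1)
    return [[sum(x * a for x, a in zip(row, pc1)),
             sum(x * a for x, a in zip(row, pc2))]
            for row in X]
```

Feeding the 8,005 frequency vectors through a function like this yields the (x, y) coordinates the visualization plots.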

Why vanilla HTML instead of React?
For a single-page visualization, a framework adds build tooling and maintenance overhead without clear benefit. Vanilla JavaScript with D3.js is leaner, loads faster, and needs no build step; aside from D3 itself there are no dependencies to maintain, so the page will keep working indefinitely without framework updates breaking it.

Why a .js file instead of loading JSON?
Browsers restrict fetch and XMLHttpRequest on file:// URLs under the same-origin policy (typically surfaced as a CORS error), so loading a local JSON file would require running a local web server. By converting the data to a JavaScript file (poems_data.js) loaded with a plain script tag, the visualization works instantly: just double-click the HTML file. No server, no setup, no hassle.
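On the Python side, that conversion can be as simple as wrapping the JSON in a global assignment. The variable name `POEMS` is an assumption here; the page only names the file, poems_data.js:

```python
import json

def write_js_data(poems, path="poems_data.js"):
    """Serialize poem records as a JS file loadable via a <script> tag."""
    with open(path, "w", encoding="utf-8") as f:
        # A global assignment instead of bare JSON sidesteps the
        # same-origin restriction on file:// fetches.
        f.write("const POEMS = " + json.dumps(poems) + ";\n")

write_js_data([{"text": "a poem", "x": 0.1, "y": 0.2}])
```

The HTML page then reads the `POEMS` global directly instead of fetching JSON at runtime.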

Tools Used

Key Design Decisions

Clean aesthetic: Light background with black points creates a clear, readable field—poems as a constellation in semantic space.

Hover interaction: Reading appears on demand, respecting the contemplative pace of poetry. No clicks required.

Minimal UI: The visualization itself is the interface. No complex controls, filters, or menus—just pure exploration.

New iteration: This is explicitly framed as a 2026 re-exploration of the original 2017-2018 ReRites project. The page integrates the original project description, links back to David Jhave Johnston's site, and clarifies this as a new data visualization method for engaging with the corpus—a regeneration of the material through contemporary AI tools.

Files Generated

Important: The HTML file needs poems_data.js in the same folder to work. No web server required—just double-click the HTML file!

Try Claude Cowork Yourself

This entire visualization was created in a single 30-minute session using Claude's Cowork mode, which provides AI assistance with file access, code execution, and data processing capabilities.

Get Started:

Model Used: Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
Date: January 22, 2026