Enabling Codex to Analyze Two Decades of Hacker News Data

This article explores how to use Codex and Modolap to analyze a 10GB dataset of Hacker News history, revealing trends in programming languages, databases, and user behavior.
The entirety of Hacker News, stored as Parquet files, comes to approximately 10GB. I wanted to analyze the dataset and, in keeping with the current zeitgeist, to do so with Codex. With Modolap, Codex handles the analysis well. After adding the skill with npx, the first topic I looked at was mention history: whether mentions of Rust have overtaken those of Go, and how MySQL compares with Postgres. A single prompt to Codex plus some minimal back-and-forth yielded an adequate script using Modolap. The comparisons covered were Rust vs Golang, Codex vs Claude Code, and Postgres vs MySQL. A further hypothesis was that the average comment has become shorter over time. From an initial look, there does appear to be a gradual decline in length.
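To make the two measurements concrete, here is a minimal sketch in plain Python of counting yearly mentions and yearly mean comment length. It is not the script Codex produced and it does not use Modolap. The sample rows, the field names `time` and `text`, and the regular expressions are all assumptions chosen for illustration; the real data would be read from the Parquet files instead.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample rows standing in for the real dataset.
# The field names "time" (Unix seconds) and "text" are assumptions.
SAMPLE_COMMENTS = [
    {"time": 1262304000, "text": "We moved from MySQL to Postgres last year."},
    {"time": 1262304000, "text": "Go is great for servers."},
    {"time": 1577836800, "text": "Rust and Go both work here."},
    {"time": 1577836800, "text": "Rust."},
]

# Illustrative patterns. "Go" is matched case-sensitively because the
# lowercase word "go" is far too common in ordinary English.
TERM_PATTERNS = {
    "rust": re.compile(r"\brust\b", re.IGNORECASE),
    "go": re.compile(r"\b(?:Go|[Gg]olang)\b"),
    "mysql": re.compile(r"\bmysql\b", re.IGNORECASE),
    "postgres": re.compile(r"\bpostgres(?:ql)?\b", re.IGNORECASE),
}


def year_of(timestamp):
    """Return the UTC calendar year of a Unix timestamp."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).year


def mentions_per_year(comments, patterns):
    """Count, per year, how many comments mention each term at least once."""
    counts = defaultdict(lambda: defaultdict(int))
    for comment in comments:
        year = year_of(comment["time"])
        for name, pattern in patterns.items():
            if pattern.search(comment["text"]):
                counts[year][name] += 1
    return {year: dict(terms) for year, terms in counts.items()}


def mean_length_per_year(comments):
    """Return the mean comment length in characters for each year."""
    totals = defaultdict(lambda: [0, 0])  # year -> [total chars, comment count]
    for comment in comments:
        bucket = totals[year_of(comment["time"])]
        bucket[0] += len(comment["text"])
        bucket[1] += 1
    return {year: total / n for year, (total, n) in totals.items()}


if __name__ == "__main__":
    print(mentions_per_year(SAMPLE_COMMENTS, TERM_PATTERNS))
    print(mean_length_per_year(SAMPLE_COMMENTS))
```

Counting comments that mention a term, rather than total occurrences, keeps one long comment from dominating a year. On the full dataset it would also make sense to divide by the number of comments per year, since overall volume changes over time.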
Source: Hacker News