NOW LET US – AI RAG SaaS Studio TP.HCM

Pgit: I Imported the Linux Kernel into PostgreSQL


A software engineer imported all 1.4 million commits, covering 20 years of Linux kernel history, into a PostgreSQL database using pgit, making the entire repository queryable via SQL.

TL;DR: Imported the full Linux kernel history into pgit. 1,428,882 commits, 24.4 million file versions, 20 years of development, stored in PostgreSQL with delta compression. Actual data: 2.7 GB (git gc --aggressive gets 1.95 GB). The import took 2 hours on a dedicated server. Then I started asking questions. 7 f-bombs in 1.4 million commit messages (all from 2 people). 665 bug fixes pointing at a single commit. A filesystem that took 13 years to merge. Here's what the Linux kernel looks like as a SQL database.

The import

This post builds on pgit: What If Your Git History Was a SQL Database?. If you haven't read it, start there. Short version: pgit is a Git-like CLI where everything lives in PostgreSQL instead of the filesystem. It uses pg-xpatch for transparent delta compression and makes your entire commit history SQL-queryable. After the pgit post hit the HN front page and got picked up by TLDR, console.dev, and dailydev, I teased that I was importing the Linux kernel. Here's what happened.

The Linux kernel is one of the largest actively developed repositories in the world. 1.4 million commits spanning 20 years, 171,000 files, 38,000 contributors. From what I've found, only a handful of VCS besides git have ever managed a full import of the kernel's history. Fossil (SQLite-based, by the SQLite team) never did. Darcs and Monotone attempted it with severe performance problems. Mercurial can do it. Correct me if I'm wrong on any of this.

pgit handled it.

| Metric | Value |
|---|---|
| Commits | 1,428,882 |
| File versions (file refs) | 24,384,844 |
| Unique blobs | 3,089,589 |
| Unique paths | 171,525 |
| Path groups (delta chains) | 137,600 |
| Import time | 2h 0m 48s |
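To put the table in perspective, here is a minimal Python sketch that derives a few rates and ratios from it. The input numbers are copied from the table above; the derived metric names are my own labels, and reading "file versions" as per-commit file references that can share a blob is an assumption, not something the pgit documentation is quoted on here.

```python
# Figures copied from the import table above.
COMMITS = 1_428_882
FILE_VERSIONS = 24_384_844   # "file refs" in the table
UNIQUE_BLOBS = 3_089_589
PATH_GROUPS = 137_600        # one delta chain per path group
IMPORT_SECONDS = 2 * 3600 + 0 * 60 + 48  # 2h 0m 48s


def derived_metrics() -> dict[str, float]:
    """Return throughput and ratio figures implied by the table.

    The metric names are illustrative labels, not pgit terminology.
    """
    return {
        "commits_per_second": COMMITS / IMPORT_SECONDS,
        "file_versions_per_second": FILE_VERSIONS / IMPORT_SECONDS,
        "file_versions_per_commit": FILE_VERSIONS / COMMITS,
        # Assumption: several file refs can point at the same blob,
        # so this ratio approximates how often content is reused.
        "file_versions_per_unique_blob": FILE_VERSIONS / UNIQUE_BLOBS,
        "unique_blobs_per_path_group": UNIQUE_BLOBS / PATH_GROUPS,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the import averaged roughly 197 commits per second, and each commit touched about 17 file versions on average.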

The import ran on a Hetzner dedicated server in Finland: AMD EPYC 7401P (24 cores / 48 threads), 512 GB DDR4 ECC RAM, 2x1.92 TB SSD in RAID 0. With a 350 GB xpatch content cache, the entire decoded repository fits in memory.

Full server setup, git baseline, and pgit configuration

The server

Hetzner Dedicated "Server Auction" from their Finland datacenter (HEL1):

| Component | Spec |
|---|---|
| CPU | AMD EPYC 7401P (24 cores / 48 threads) |
| RAM | 16x32 GB DDR4 ECC reg. (512 GB total) |
| Storage | 2xMicron SSD SATA 1.92 TB Datacenter (RAID 0) |
| NIC | 1 Gbit Intel I350 |
| Cost | ~€272/month |

OS installation

Hetzner installimage with Ubuntu 24.04 LTS. Two changes from the default config: RAID 0 (SWRAIDLEVEL 0) for maximum throughput (no redundancy needed for ephemeral analysis work), and a simple partition layout:

PART /boot ext3 1024M
PART swap swap 4G
PART / ext4 all

This gives ~3.5 TB usable storage across the two 1.92 TB SSDs.
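One plausible reading of the ~3.5 TB figure is that it is the striped capacity expressed in binary units (TiB). The sketch below checks that arithmetic. It assumes the 1.92 TB drive size is decimal, that `1024M` and `4G` in the partition layout mean MiB and GiB, and it ignores filesystem overhead such as ext4 metadata and reserved blocks.

```python
DRIVE_BYTES = 1.92e12        # one 1.92 TB SSD, decimal terabytes (assumption)
DRIVES = 2
BOOT_BYTES = 1024 * 2**20    # PART /boot ext3 1024M, read as MiB (assumption)
SWAP_BYTES = 4 * 2**30       # PART swap swap 4G, read as GiB (assumption)
TIB = 2**40


def raid0_root_capacity_tib() -> float:
    """Approximate size of the root partition in TiB.

    RAID 0 stripes across both drives, so no capacity is spent on
    redundancy. Filesystem overhead is deliberately ignored.
    """
    raw_bytes = DRIVE_BYTES * DRIVES
    return (raw_bytes - BOOT_BYTES - SWAP_BYTES) / TIB


if __name__ == "__main__":
    print(f"root partition: about {raid0_root_capacity_tib():.2f} TiB")
```

Under these assumptions the root partition comes out just under 3.5 TiB, which is consistent with the figure quoted above.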

OS tuning

After booting into the installed image, the system was tuned for performance by setting the CPU governor to performance, disabling kernel mitigations, and optimizing sysctl parameters like swappiness and dirty ratios. Transparent Huge Pages were disabled to ensure stability for the database workload.

pgit configuration

PostgreSQL was configured with 64 GB of shared buffers and a 350 GB xpatch cache so that the file versions and delta chains could be processed in memory. Parallelism was tuned to match the 24-core EPYC processor.
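As a quick sanity check on that memory budget, the sketch below adds up the two allocations against the server's 512 GB of RAM. It assumes the xpatch cache is allocated separately from PostgreSQL's shared buffers, which this summary does not confirm, and it leaves out per-connection and maintenance memory because those settings are not given.

```python
TOTAL_RAM_GB = 512        # from the server spec
SHARED_BUFFERS_GB = 64    # PostgreSQL shared buffers
XPATCH_CACHE_GB = 350     # xpatch content cache


def memory_headroom_gb() -> int:
    """RAM left for the OS, page cache, and per-query memory.

    Assumption: the two allocations do not overlap.
    """
    return TOTAL_RAM_GB - SHARED_BUFFERS_GB - XPATCH_CACHE_GB


def committed_fraction() -> float:
    """Share of total RAM claimed by the two allocations."""
    return (SHARED_BUFFERS_GB + XPATCH_CACHE_GB) / TOTAL_RAM_GB


if __name__ == "__main__":
    print(f"headroom: {memory_headroom_gb()} GB")
    print(f"committed: {committed_fraction():.0%}")
```

If the assumption holds, about four fifths of the machine's RAM is committed up front, with the remainder left for the operating system and query execution.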

© 2026 Now Let Us. All rights reserved.

Source: Hacker News

