AGENTIC-SYSTEMS | March 10, 2026

AI can rewrite open source code—but can it rewrite the license, too?

An AI-powered rewrite of a popular open-source library has sparked a legal and ethical debate over whether the resulting code can be relicensed, challenging the traditional 'clean room' development process.

Computer engineers and programmers have long relied on reverse engineering as a way to copy the functionality of a computer program without copying that program’s copyright-protected code directly. Now, AI coding tools are raising new questions about how that “clean room” rewrite process plays out legally, ethically, and practically.

Those issues came to the forefront last week with the release of a new version of chardet, a popular open source Python library for automatically detecting character encoding. The repository was originally written by coder Mark Pilgrim in 2006 and released under an LGPL license that placed strict limits on how it could be reused and redistributed.

Dan Blanchard took over maintenance of the repository in 2012 but waded into some controversy with the release of version 7.0 of chardet last week. Blanchard described that overhaul as “a ground-up, MIT-licensed rewrite” of the entire library built with the help of Claude Code to be “much faster and more accurate” than what came before.

Speaking to The Register, Blanchard said that he has long wanted to get chardet added to the Python standard library but that he didn’t have the time to fix problems with “its license, its speed, and its accuracy” that were getting in the way of that goal. With the help of Claude Code, though, Blanchard said he was able to overhaul the library “in roughly five days” and get a 48x performance boost to boot.

Not everyone has been happy with that outcome, though. A poster using the name Mark Pilgrim surfaced on GitHub to argue that this new version amounts to an illegitimate relicensing of Pilgrim’s original code under a more permissive MIT license (which, among other things, allows for its use in closed-source projects). Pilgrim argues that, as a modification of his original LGPL-licensed code, this new version of chardet must also maintain the same LGPL license.

Source: Ars Technica AI
