Why are we still using Markdown?

Despite being the gold standard for technical writing, Markdown suffers from syntax ambiguities and security risks that continue to challenge developers.
Few things in life bring me this much joy and hate at the same time: chocolate that hurts when eaten, and Markdown. Seriously, why?? Half of the time we aren’t even using the full language!
HTML is the best Programming Language!
I know you’ve heard people say the only programming language they know is HTML. And I know we both rolled our eyes and started digging through our stacks of PL papers for the one proving that HTML is only a markup language, not a programming language.
I mean, yes, we’re in the right, but that guy probably has something we don’t have. A life.
[Note] When I talk about Markdown, I specifically mean CommonMark unless stated otherwise, because it is the unambiguous syntax specification. I love the project, and I really appreciate their efforts to make this language a bit more grounded. It’s not the specification that’s broken, it’s the language itself.
The Good
Markdown is a minimal language used for typesetting trivial documents. It needs to do one simple job: take a Markdown file and output an HTML file. Its syntax is as legible as it gets and is easy to write even with no tooling. As with C, you can see the output that will be generated just by reading the source: bold always ends up as <strong></strong>, and italic always ends up as <em></em>. The learning curve is practically nonexistent if you’re just a casual user. One look at the cheat sheet and you’re ready.
The Bad
We don’t know what we want. Do we want UI? Do we want a programming language? We don’t know. Feature creep exists only because the specification of what we want is unclear. You want a MINIMAL, easily legible markup language? You have Markdown. Simple as that, right?
Well…
First input:
# Hello
*I am an*
__Unambiguous__
> Grammar
Second input:
Hello
=====
_I am an_
**Unambiguous**
> Grammar
Output for both (taken from the dingus):
<h1>Hello</h1>
<p>
<em>I am an</em>
<strong>Unambiguous</strong>
</p>
<blockquote>
<p>Grammar</p>
</blockquote>
I hope two eyeballs are enough to see that Markdown is NOT what you asked for. These two inputs produce IDENTICAL output. And this is just the tip of the iceberg. It has so many poor decisions baked in that it will actively fight you the moment you think you know what you’re doing.
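The collision between the two inputs can be reproduced with a toy renderer. This is only a sketch: `render` is a hypothetical helper that handles exactly the constructs used in the sample (level-1 ATX and Setext headings, `*`/`_` emphasis, `**`/`__` strong, one-line blockquotes) and is nowhere near CommonMark compliant. The point is how many separate branches are needed just to land on the same HTML.

```python
import re


def render(md: str) -> str:
    """Toy renderer for a tiny Markdown subset (NOT CommonMark compliant)."""
    lines = md.strip("\n").split("\n")
    out, para = [], []

    def flush():
        # Close the paragraph that is currently being collected, if any.
        if para:
            out.append("<p>" + "\n".join(para) + "</p>")
            para.clear()

    def inline(text: str) -> str:
        # Strong first, so ** / __ are not consumed as single-delimiter emphasis.
        text = re.sub(r"(\*\*|__)(.+?)\1", r"<strong>\2</strong>", text)
        return re.sub(r"(\*|_)(.+?)\1", r"<em>\2</em>", text)

    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if line.startswith("# "):                         # ATX heading
            flush()
            out.append("<h1>" + inline(line[2:]) + "</h1>")
        elif line.strip() and re.fullmatch(r"=+", nxt):   # Setext heading
            flush()
            out.append("<h1>" + inline(line) + "</h1>")
            i += 1                                        # skip the underline
        elif line.startswith("> "):                       # one-line blockquote
            flush()
            out.append("<blockquote>\n<p>" + inline(line[2:]) + "</p>\n</blockquote>")
        elif not line.strip():                            # blank line ends a paragraph
            flush()
        else:
            para.append(inline(line))
        i += 1
    flush()
    return "\n".join(out)


first = "# Hello\n*I am an*\n__Unambiguous__\n> Grammar"
second = "Hello\n=====\n_I am an_\n**Unambiguous**\n> Grammar"
print(render(first) == render(second))  # True
```

Two heading branches and two delimiter alternatives per inline style, all for one output. A real parser carries this duplication for nearly every construct.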
Exhibit A: bold, italic, bold-italic, ???
In Markdown you can write bold in several different ways. **bold**, __bold__, and <b>bold</b> are some of the valid ways to write it. And those are just for CommonMark. If you’re using something that isn’t marketing itself as “CommonMark compliant”, you can very well run into even more valid spellings that produce the same output. Truly magnificent. And please don’t get me started on nested ones. This thing is actually so peak that it is affected by a whole class of parser vulnerabilities called ReDoS (Regular Expression Denial of Service). Like this 6.9 severity level CVE for markdown-it.
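To get a feel for why ReDoS hurts, here is a toy model of a backtracking regex engine. It is a sketch under loud assumptions: the pattern `(a+)+$` and the helper `backtracking_attempts` are illustrative only, and this is not the pattern or the code behind the markdown-it CVE. It counts how many group attempts a naive matcher makes on an input that can never match.

```python
def backtracking_attempts(n: int) -> int:
    """Count the attempts a naive backtracking matcher makes for the
    regex (a+)+$ on the input 'a' * n + '!' (which can never match)."""
    attempts = 0

    def match_groups(pos: int) -> bool:
        nonlocal attempts
        # The inner a+ is greedy: try the longest run first, then shorter ones.
        for length in range(n - pos, 0, -1):
            attempts += 1
            # After this group, '$' fails because '!' is still ahead,
            # so the only remaining option is to try another (a+) group.
            if match_groups(pos + length):
                return True
        return False

    match_groups(0)
    return attempts


for n in (5, 10, 15):
    print(n, backtracking_attempts(n))  # 31, 1023, 32767
```

The count is 2**n - 1, so every extra character doubles the work. A few dozen well-chosen characters in a document are enough to pin a CPU if the parser leans on a pattern with this shape.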
Exhibit B: asm Was a Good Idea, but This?
In old languages, inline assembly helped you write performance-critical code. Now let’s take this wonderful idea and bolt it directly onto the most bloated, single-threaded, sandboxed environment there is, in a format that was supposed to be a simple and easy way to write documents. And this is how inline HTML inside Markdown was born! The real issue is that to ship a Markdown parser you also need to ship a forgiving HTML parser. And if you’re using HTML inside Markdown, why not use HTML from the start?
Markdown in and of itself isn’t powerful enough to satisfy the developer who wants the site to look good. We want LaTeX, PlantUML, Mermaid, custom styling, footnotes... To nail a painting to the wall I need a hammer (Markdown). But if I want to paint it too, I will break the canvas the moment I try to paint with the hammer. Breaking the canvas also means a whole lot of CVEs, primarily around XSS vulnerabilities.
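The XSS angle can be sketched in a few lines. `render_paragraph` and its `allow_raw_html` flag are hypothetical names for illustration, not any real library's API. The only real call is `html.escape` from Python's standard library.

```python
import html


def render_paragraph(text: str, allow_raw_html: bool) -> str:
    """Wrap text in <p>, either passing raw HTML through or escaping it."""
    body = text if allow_raw_html else html.escape(text)
    return f"<p>{body}</p>"


# Untrusted "Markdown" that is really just inline HTML with a script hook.
payload = '<img src=x onerror="alert(1)">'

print(render_paragraph(payload, allow_raw_html=True))
# <p><img src=x onerror="alert(1)"></p>

print(render_paragraph(payload, allow_raw_html=False))
# <p>&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</p>
```

With passthrough enabled, the payload reaches the browser as live markup. With escaping, it is harmless text, but then inline HTML, the feature people actually wanted, is gone. Real renderers have to pick a point between those two, usually with a sanitizer in the middle, and that middle ground is where the bugs tend to live.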
Exhibit C: Obscure and Old Syntax
Markdown was made in the good old ’00s! It was inspired by preexisting conventions for marking up plaintext in email and Usenet posts. Because of this legacy, you have two different ways of writing headings: the ATX way (# Heading) and the Setext way (the heading text underlined with === or ---). You have two ways, and often more than two, to do almost everything.
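If you only want one heading style in a code base, a small normalizer can fold the underlined form into the `#` form. This is a rough sketch, assuming simple single-line headings. `setext_to_atx` is a hypothetical helper, and it ignores the many edge cases the CommonMark spec covers (multi-line heading text, `---` as a thematic break, list items, indentation).

```python
import re


def setext_to_atx(md: str) -> str:
    """Rewrite underlined (Setext) headings as '#'-prefixed (ATX) headings."""
    out = []
    for line in md.split("\n"):
        is_underline = re.fullmatch(r"(=+|-+)[ \t]*", line) is not None
        # Only treat the line as an underline if it follows plain text
        # that is not already an ATX heading.
        if is_underline and out and out[-1].strip() and not out[-1].startswith("#"):
            marker = "#" if line[0] == "=" else "##"   # === is h1, --- is h2
            out[-1] = f"{marker} {out[-1].strip()}"
        else:
            out.append(line)
    return "\n".join(out)


print(setext_to_atx("Hello\n=====\nSome text"))
# # Hello
# Some text
```

That a linter-style pass like this is even worth writing says a lot about a format that was supposed to be minimal.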
Source: Hacker News












