Sep 1, 2024

Hmmarkdown


I rolled my own Markdown library!

It’s called Hmmarkdown on GitHub. It’s published on both JSR and NPM too. I’m testing in production on my own website! That might be a mistake but it’s too late now.

⚠️ Work in progress! ⚠️

Hmmarkdown is bad code and full of bugs. Okay, well it does work, but with limitations. Some of those limitations are by design!

Backstory

I’ve long used Marked which is a perfectly cromulent choice. Marked has active development and plenty of plugins. It gets the job done.

Why change? I wanted:

Most Markdown parsers give up when they find HTML. I created Hmmarkdown to better handle scenarios where I want HTML in my Markdown.

For example Hmmarkdown transforms this:

<figure class="Box">
  This is a **boxed** paragraph.
</figure>

Into this:

<figure class="Box">
  <p>This is a <strong>boxed</strong> paragraph.</p>
</figure>

Which looks like this:

This is a boxed paragraph.

Marked couldn’t handle this without a bespoke and hacky extension.

Hmmarkdown is HTML-aware. When it finds HTML it creates a lightweight node tree to isolate text content and apply inline Markdown filters. This works regardless of node depth. Only inline Markdown is supported inside HTML for now. Block Markdown is trickier because I’ll need to track indentation and do some recursive magic. That’s the end goal for “v1.0”.
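To make the idea concrete, here is a minimal sketch of applying inline filters to text content while leaving tags alone. It is not Hmmarkdown's actual code or API: `tokenize`, `renderInline`, and `inlineFilters` are hypothetical names, a flat token list stands in for the node tree, only two inline filters are included, and text is not wrapped in `<p>` as in the example above. It also assumes well-formed HTML with no `>` inside attribute values.

```javascript
// Hypothetical sketch, not Hmmarkdown's real implementation.

// Inline filters run on text content only. Two are shown for brevity.
const inlineFilters = [
  // **bold** becomes <strong>bold</strong>
  (text) => text.replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>'),
  // `code` becomes <code>code</code>
  (text) => text.replace(/`(.+?)`/g, '<code>$1</code>'),
];

// Split a string into tag tokens and text tokens.
// Assumption: no `>` characters appear inside attribute values.
function tokenize(html) {
  return html
    .split(/(<[^>]+>)/)
    .filter((part) => part !== '')
    .map((part) => ({
      type: /^<[^>]+>$/.test(part) ? 'tag' : 'text',
      value: part,
    }));
}

// Pass tags through untouched and run every filter over text tokens.
function renderInline(html) {
  return tokenize(html)
    .map((token) =>
      token.type === 'tag'
        ? token.value
        : inlineFilters.reduce((text, filter) => filter(text), token.value)
    )
    .join('');
}
```

Because tags are never passed to the filters, asterisks inside an attribute such as `title="**x**"` are left alone, and nesting depth makes no difference to the text tokens.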

First I’m considering a major refactor. I’ve hit a bit of a technical wall and probably over-engineered it. I have a better approach I want to try.

Markdown Specification

As CommonMark says:

John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously.

Attempts to standardise Markdown are insane. Just look at the GitHub Flavored Markdown Spec which attempts to support every variation imaginable. No way I’m parsing that. My syntax support for top-level blocks is extremely strict. See the README documentation.

Hmmarkdown is so strict and unforgiving I’ve spent the last two weeks correcting errors on my blog. I have no plans to add support for anything beyond basic Markdown. Extended syntax for tables is wild, for example. It’s easier to just write HTML in my opinion. That’s why I created Hmmarkdown. I can break out the HTML and mix-n-match.

Is it Fast?

I benchmarked a few micro-optimisations like checking if a string begins with a specific character before matching:

if (line[0] !== '#') return false;
const match = line.match(/^(#{1,6})\s+/);
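For context, the two lines above could sit inside a complete function like the sketch below. The name `parseHeading` and the returned object shape are assumptions made for illustration, not the library's API.

```javascript
// Hypothetical helper, named here for illustration only.
function parseHeading(line) {
  // Cheap first-character check: most lines skip the regular expression.
  if (line[0] !== '#') return false;
  // One to six `#` characters followed by at least one whitespace character.
  const match = line.match(/^(#{1,6})\s+/);
  if (match === null) return false;
  return {
    level: match[1].length,
    text: line.slice(match[0].length).trim(),
  };
}
```

A line like `## Backstory` yields level 2 with the text `Backstory`, while `#hashtag` and ordinary paragraphs return `false`.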

I stopped optimising when I realised speed was a non-issue.

Every JavaScript library claims to be the “fastest ever”. The truth with Markdown is that most competent parsers are already as fast as JavaScript allows. The bottleneck becomes I/O and CPU & memory limits. Rendering the ~400 pages on my website takes under two seconds and 90% of that time is waiting for Shiki syntax highlighting. It’s milliseconds or less per file. If that’s too slow stop using JavaScript!
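A rough way to check a per-file figure like that is to time a batch of renders and divide by the number of documents. This is only a sketch: `averageRenderTime` is a hypothetical helper, and `render` stands in for whichever Markdown function is being measured.

```javascript
// Hypothetical timing helper. `render` is any function that takes a
// Markdown string; it is a stand-in, not a specific library's API.
function averageRenderTime(render, documents) {
  const start = performance.now();
  for (const source of documents) {
    render(source);
  }
  const totalMs = performance.now() - start;
  // Avoid dividing by zero when no documents are passed.
  const perFileMs = documents.length > 0 ? totalMs / documents.length : 0;
  return { totalMs, perFileMs };
}
```

This measures the parser alone. Syntax highlighting, file reads, and template rendering would need to be timed separately to see where the time really goes.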

Work in Progress

Hmmarkdown is in early-stage development and not the prettiest code. I will be extending the Markdown syntax support and there will be one or two refactors along the way. It will always be an opinionated library made for my use case: this blog. It’s open source and MIT licensed so anyone is welcome to try it. It should work in all JavaScript runtimes.
