How to wrap a tag with another tag in Earmark (Elixir)

Jul 04, 2024

Earmark is a cool library that converts Markdown to HTML. It is handy when you want to do any static site generation (SSG) in Elixir.

By default, Earmark takes your Markdown:

- Item 1
- Item 2
- Item 3

and gives back clean, sensible HTML:

<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ul>

The problem is that when you need to style some parts of the page, such clean HTML may not be enough. What I wanted to do was wrap this ul list in a div with a specific class name, like this:

<div class="list-spectacular">
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>
</div>

Unfortunately, the Earmark docs don’t have a clear example of how to do that, and the documentation on manipulating Earmark’s AST is hard to understand.

I tried my luck searching the Elixir forum, but the topics that covered AST manipulation quickly got very complicated. Some of them were more than 4 years old, and Earmark seems to have had updates since then that are not compatible with some of the code discussed.

After searching around, this is the fastest way I found to wrap HTML generated by Earmark with another tag:

defmodule EarmarkRender do
  defp wrap_tag(node) do
    {:replace, {"div", [{"class", "list-spectacular"}], [node], %{}}}
  end

  def render(markdown) do
    Earmark.as_html!(markdown, %Earmark.Options{registered_processors: {"ul", &wrap_tag/1}})
  end
end

In the paragraphs below I explain some interesting things, edge cases, and maybe a bug I discovered while searching for the solution. If you are only looking to get the job done, the code snippet above is what you need and you may skip the rest.

A couple of important things:

Earmark’s options struct has a registered_processors field that accepts a single processor or a list of processors:

%Earmark.Options{registered_processors: &wrap_tag/1}
# or
%Earmark.Options{registered_processors: [&wrap_tag/1, &other_processor/1]}

By default, this will wrap every HTML tag in our Earmark output with <div class="list-spectacular">, and that is not what we want.

Fortunately, Earmark’s registered_processors option has a filter feature that runs the processor only for a specified tag name. Instead of a bare processor function, you give registered_processors a tuple whose first element is the tag name as a string and whose second element is the processor function, like this:

%Earmark.Options{registered_processors: {"ul", &wrap_tag/1}}
# or
%Earmark.Options{registered_processors: [{"ul", &wrap_tag/1}, &other_processor/1]}

What is a processor? It is just a function that takes a node and returns a node:

def wrap_tag(node) do
  node
end

A node can only be a string or an Earmark node quadruple. We know what a string is, and a node quadruple looks like this:

{"", [], [], %{}}
  • the first element is the name of the tag,
  • the second element is “attributes”; it is a list of {"string", "string"} tuples
  • the third element is the child nodes of the current node; it can be a mixed list of quadruple nodes and string nodes, or just an empty list
  • the fourth element is “meta”; a map of key-value pairs

In real life, it may look like this:

{"p", [{"class", "my-style"}], ["text inside paragraph"], %{}}
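
To make the quadruple shape more concrete, here is a minimal sketch in plain Elixir (no Earmark calls) that pattern-matches on a node. The module name NodeSketch and the function add_class/2 are made up for this illustration and are not part of Earmark.

```elixir
defmodule NodeSketch do
  # Hypothetical helper: prepends a class attribute to a quadruple node.
  def add_class({tag, attrs, children, meta}, class) do
    {tag, [{"class", class} | attrs], children, meta}
  end

  # String nodes (plain text) have no attributes, so they pass through unchanged.
  def add_class(text, _class) when is_binary(text), do: text
end

styled = NodeSketch.add_class({"p", [], ["text inside paragraph"], %{}}, "my-style")
IO.inspect(styled)
# prints {"p", [{"class", "my-style"}], ["text inside paragraph"], %{}}
```

The two function clauses mirror the two kinds of node described above: a quadruple or a string.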

Confusion around :replace in the processor function

Initially I wrote the processor function to wrap the ul list with <div class="list-spectacular"> like this:

defp wrap_tag(node) do
  {"div", [{"class", "list-spectacular"}], [node], %{}}
end

Something unexpected happened: the HTML output looked like this (note the missing ul element):

<div class="list-spectacular">
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</div>

That is not what I wanted. I wanted to wrap the ul list, not replace the ul element itself. I wanted the output to be like this:

<div class="list-spectacular">
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>
</div>

After reading the documentation further, I found a third option: a processor function can return a tuple of the :replace atom and a node, {:replace, node}.

I adjusted my wrap_tag function to use that return type, like this:

defp wrap_tag(node) do
  {:replace, {"div", [{"class", "list-spectacular"}], [node], %{}}}
end

Now the HTML output was what I wanted, with the ul list wrapped in <div class="list-spectacular">:

<div class="list-spectacular">
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>
</div>

Is it a bug?

I’m not sure whether the behavior described above is a bug or intended. I could not find this point mentioned in the documentation, and the logic outlined in the documentation and in Earmark’s API makes me think it is a bug or unintended behavior.

According to the documentation, the purpose of the :replace atom is to stop the processor function from being applied recursively to the node’s children, grandchildren, and so on.

Implications of recursion in processor functions

What if we have a list nested inside another list:

- Item 1
  - List 2 Item 1
  - List 2 Item 2
  - List 2 Item 3
- Item 2
- Item 3

The HTML output would be this:

<div class="list-spectacular">
  <ul>
    <li>
      Item 1
      <ul>
        <li>List 2 Item 1</li>
        <li>List 2 Item 2</li>
        <li>List 2 Item 3</li>
      </ul>
    </li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>
</div>

You can see that it wrapped only the topmost ul element, but it does so only because of the :replace atom in my wrap_tag function:

defp wrap_tag(node) do
  {:replace, {"div", [{"class", "list-spectacular"}], [node], %{}}}
end
# if :replace atom is removed
defp wrap_tag(node) do
  {"div", [{"class", "list-spectacular"}], [node], %{}}
end

If we use the wrap_tag processor function without the :replace atom, the HTML would be this:

<div class="list-spectacular">
  <li>
    Item 1
    <div class="list-spectacular">
      <li>List 2 Item 1</li>
      <li>List 2 Item 2</li>
      <li>List 2 Item 3</li>
    </div>
  </li>
  <li>Item 2</li>
  <li>Item 3</li>
</div>

I already explained the missing ul tag issue in the paragraphs above. What is important to note here is that the :replace atom lets us control whether the <div class="list-spectacular"> wrapping is applied recursively to the node’s children, grandchildren, or however deep the tree goes.

This recursion could be very useful when we need to add classes or attributes to nodes and their subnodes. But because it is impossible to wrap nodes properly without :replace, recursive wrapping by the processor function isn’t something we need to worry about here.

If recursive tag wrapping is needed, it should still be achievable by manually walking and changing the AST returned from Earmark.Parser.as_ast(). A full treatment of that is outside the scope of this article.
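
Just to give a rough idea of what such a manual walk could look like, here is a minimal sketch in plain Elixir. It works on a hand-written AST rather than calling Earmark, it assumes the tree contains only quadruples and strings as described earlier (any other node shape would raise a FunctionClauseError), and the names WrapSketch and wrap_all/3 are hypothetical.

```elixir
defmodule WrapSketch do
  # A list of nodes: process each node in turn.
  def wrap_all(nodes, tag, class) when is_list(nodes) do
    Enum.map(nodes, &wrap_all(&1, tag, class))
  end

  # A quadruple: process the children first, then wrap the node itself
  # in a div if its tag name matches.
  def wrap_all({name, attrs, children, meta}, tag, class) do
    node = {name, attrs, wrap_all(children, tag, class), meta}
    if name == tag, do: {"div", [{"class", class}], [node], %{}}, else: node
  end

  # A string node: nothing to wrap.
  def wrap_all(text, _tag, _class) when is_binary(text), do: text
end

# Hand-written AST for a list with one nested list (an assumption about
# the parser's output shape, kept small for illustration).
ast = [
  {"ul", [], [
    {"li", [], ["Item 1", {"ul", [], [{"li", [], ["List 2 Item 1"], %{}}], %{}}], %{}}
  ], %{}}
]

wrapped = WrapSketch.wrap_all(ast, "ul", "list-spectacular")
IO.inspect(wrapped)
```

In this sketch both the outer and the nested ul end up inside their own div, and neither ul tag is lost. I have not verified it against real parser output, and the resulting AST would still need to be rendered back to HTML (I believe Earmark.Transform.transform does that, but check the docs for your Earmark version).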

Something to watch out for in the future is the :replace atom in the processor function: leaving it out has the unexpected effect of losing a tag when wrapping nodes in other nodes. If that is indeed a bug, it may be fixed in a future release.