Introducing Tablespoon
tbsp (tree-based source-processing language) is an awk-like language
that operates on tree-sitter syntax trees. To motivate the need for such
a program, we could begin by writing a markdown-to-html converter using
tbsp and tree-sitter-md.
We need some markdown to begin with:
# 1 heading
content of first paragraph
## 1.1 heading
content of nested paragraph
For future reference, this markdown is parsed like so by tree-sitter-md (visualization generated by tree-viz):
document
| section
| | atx_heading
| | | atx_h1_marker "#"
| | | heading_content inline "1 heading"
| | paragraph
| | | inline "content of first paragraph"
| | section
| | | atx_heading
| | | | atx_h2_marker "##"
| | | | heading_content inline "1.1 heading"
| | | paragraph
| | | | inline "content of nested paragraph"
On to the converter itself. Every tbsp program is written as a
collection of stanzas. Typically, we start with a stanza like so:
BEGIN {
    int depth = 0;
    print("<html>\n");
    print("<body>\n");
}
The stanza begins with a “pattern”, in this case BEGIN, and is followed
by a block of code. This particular block is executed right at the
beginning, before the parse tree is traversed. In this stanza, we set a
“depth” variable to keep track of the nesting of markdown headers, and
begin our html document by printing the <html> and <body> tags.
We can follow this stanza with an END stanza, which is executed after
the traversal:
END {
    print("</body>\n");
    print("</html>\n");
}
In this stanza, we close the tags we opened at the start of the document. We can move on to the interesting bits of the conversion now:
enter section {
    depth += 1;
}
leave section {
    depth -= 1;
}
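The above stanzas begin with enter and leave clauses, followed by the
name of a tree-sitter node kind: section. The section identifier is
visible in the tree visualization above. It encompasses a markdown
section, and one is created for every markdown header. The enter block
runs when the traversal first reaches a node of that kind, and the leave
block runs once all of that node's children have been visited. To
understand how tbsp executes the above stanzas, follow the annotations
on the tree: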
document ... depth = 0
| section <-------- enter section (1) ... depth = 1
| | atx_heading
| | | inline
| | paragraph
| | | inline
| | section <----- enter section (2) ... depth = 2
| | | atx_heading
| | | | inline
| | | paragraph
| | | | inline
| | | <----------- leave section (2) ... depth = 1
| | <-------------- leave section (1) ... depth = 0
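The same bookkeeping can be mimicked outside of tbsp. The following is a minimal Python sketch, not how tbsp is implemented: the tree is a hand-built stand-in for the parse tree above, and the names TREE, walk and trace_depth are made up for illustration.

```python
# A hand-built stand-in for the tree-sitter-md parse tree shown above.
# Each node is (kind, children); this is NOT the tree-sitter API.
TREE = ("document", [
    ("section", [
        ("atx_heading", [("inline", [])]),
        ("paragraph", [("inline", [])]),
        ("section", [
            ("atx_heading", [("inline", [])]),
            ("paragraph", [("inline", [])]),
        ]),
    ]),
])


def walk(node, on_enter, on_leave):
    """Depth-first traversal: call on_enter before the children, on_leave after."""
    kind, children = node
    on_enter(kind)
    for child in children:
        walk(child, on_enter, on_leave)
    on_leave(kind)


def trace_depth(tree):
    """Mimic the `enter section` / `leave section` stanzas, logging each change."""
    depth = 0
    log = []

    def on_enter(kind):
        nonlocal depth
        if kind == "section":
            depth += 1
            log.append(("enter", depth))

    def on_leave(kind):
        nonlocal depth
        if kind == "section":
            depth -= 1
            log.append(("leave", depth))

    walk(tree, on_enter, on_leave)
    return log


if __name__ == "__main__":
    for event, depth in trace_depth(TREE):
        print(f"{event} section ... depth = {depth}")
```

Running it logs the depth as 1, 2, 1, 0, in the same order as the arrows in the annotated tree.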
The following stanzas should be self-explanatory now:
enter atx_heading {
    print("<h");
    print(depth);
    print(">");
}
leave atx_heading {
    print("</h");
    print(depth);
    print(">\n");
}
enter inline {
    print(text(node));
}
But an explanation is included nonetheless:
document ... depth = 0
| section <-------- enter section (1) ... depth = 1
| | atx_heading <- enter atx_heading ... print "<h1>"
| | | inline <--- enter inline ... print ..
| | | <----------- leave atx_heading ... print "</h1>"
| | paragraph
| | | inline <--- enter inline ... print ..
| | section <----- enter section (2) ... depth = 2
| | | atx_heading <- enter atx_heading ... print "<h2>"
| | | | inline <- enter inline ... print ..
| | | | <-------- leave atx_heading ... print "</h2>"
| | | paragraph
| | | | inline <- enter inline ... print ..
| | | <----------- leave section (2) ... depth = 1
| | <-------------- leave section (1) ... depth = 0
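To see what these stanzas emit for our sample markdown, here is a rough Python simulation of all the stanzas so far. It is a sketch under assumptions: the tree is hand-built as (kind, text, children) tuples rather than parsed by tree-sitter, print is assumed to add no trailing newline, and the names TREE and convert are hypothetical.

```python
# Hand-built stand-in for the parse tree of the sample markdown.
# Each node is (kind, text, children); this is NOT the tree-sitter API.
TREE = ("document", "", [
    ("section", "", [
        ("atx_heading", "", [("inline", "1 heading", [])]),
        ("paragraph", "", [("inline", "content of first paragraph", [])]),
        ("section", "", [
            ("atx_heading", "", [("inline", "1.1 heading", [])]),
            ("paragraph", "", [("inline", "content of nested paragraph", [])]),
        ]),
    ]),
])


def convert(tree):
    """Replay the BEGIN, enter/leave and END stanzas over the tree."""
    out = []
    depth = 0

    # BEGIN stanza: runs once, before the traversal.
    out.append("<html>\n")
    out.append("<body>\n")

    def visit(node):
        nonlocal depth
        kind, text, children = node

        # "enter" stanzas run when the node is first reached.
        if kind == "section":
            depth += 1
        elif kind == "atx_heading":
            out.append(f"<h{depth}>")
        elif kind == "inline":
            out.append(text)

        for child in children:
            visit(child)

        # "leave" stanzas run after all children have been visited.
        if kind == "section":
            depth -= 1
        elif kind == "atx_heading":
            out.append(f"</h{depth}>\n")

    visit(tree)

    # END stanza: runs once, after the traversal.
    out.append("</body>\n")
    out.append("</html>\n")
    return "".join(out)


if __name__ == "__main__":
    print(convert(TREE), end="")
```

Note that the paragraph text comes out bare, without any surrounding tags, because none of the stanzas shown so far match paragraph nodes.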
The examples directory contains a complete markdown-to-html converter, along with a few other motivating examples.
Usage
The tbsp evaluator is written in Rust; use cargo to build and run:
cargo build --release
./target/release/tbsp --help
tbsp requires three inputs:
- a tbsp program, referred to as the “program file”
- a language
- an input file or some input text at stdin
You can run the interpreter like so (this program prints an overview of a rust file):
$ ./target/release/tbsp \
-f./examples/code-overview/overview.tbsp \
-l rust \
src/main.rs
module
└╴struct Cli
└╴trait Cli
└╴fn program
└╴fn language
└╴fn file
└╴fn try_consume_stdin
└╴fn main
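If you want to script such invocations, the three inputs map onto an argument list. Below is a small Python sketch of a hypothetical helper, tbsp_command, that only uses the -f and -l flags seen in the example above. Leaving out the input file so that tbsp reads from stdin is an assumption here, not something verified against tbsp --help.

```python
import shlex


def tbsp_command(program_file, language, input_file=None,
                 binary="./target/release/tbsp"):
    """Build the argv list for a tbsp run (hypothetical helper, not part of tbsp)."""
    # Mirror the flag spelling of the example invocation: -f<program> and -l <language>.
    argv = [binary, f"-f{program_file}", "-l", language]
    # Assumption: with no input file given, tbsp falls back to reading stdin.
    if input_file is not None:
        argv.append(input_file)
    return argv


if __name__ == "__main__":
    argv = tbsp_command("./examples/code-overview/overview.tbsp", "rust", "src/main.rs")
    # Print a shell-quoted command line for copy-pasting.
    print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run as-is once the binary has been built.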
I'm Akshay, programmer and pixel-artist. I write open-source stuff. I also design fonts: scientifica, curie.
Reach out at oppili@irc.rizon.net.