Introduction
I was always intrigued by the Esoteric programming languages, programming languages designed to test the boundaries of computer programming language design, to serve as a proof of concept or just as a software art! One of the more popular Esoteric programming languages is the Branfuck (BF in the rest of the article), a language which uses only 8 characters. Developers challenge themselves to create a BF interpreter. That’s exactly what I’ve been wanting to make for a long time. I’ve set some rules for myself for this codding challenge: I can Google only Golang specific questions (as my Go knowledge is limited), but I MUST NOT search or look at any other BF interpreter implementation.
Join me in this article as we explore how to achieve such a goal.
Choosing a language
I am mostly using Java for development, and naturally, that is the right language to choose for this task. Or maybe not! This is a great opportunity to try a new language.
I chose Go. I’ve passed all tests on Codecademy, which covers very basic stuff. This is perfect, as a BF interpreter is relatively easy to write.
Brainfuck language design
The language consists of 8 commands, listed below. A BF program is a sequence of these commands, possibly interspersed with other characters (which are ignored).
The BF language uses a simple machine model consisting of the program and instruction pointer, as well as an array of at least 30,000 byte cells initialized to zero; a movable data pointer (initialized to point to the leftmost byte of the array); and two streams of bytes for input and output (most often connected to a keyboard and a monitor respectively, and using the ASCII character encoding).
The eight language commands each consist of a single character:
Character | Meaning |
---|---|
> | increment the data pointer (to point to the next cell to the right) |
< | decrement the data pointer (to point to the next cell to the left) |
+ | increment (increase by one) the byte at the data pointer |
- | decrement (decrease by one) the byte at the data pointer |
. | output the byte at the data pointer |
, | accept one byte of input, storing its value in the byte at the data pointer |
[ | if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command |
] | if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command |
Coding
Let’s start with a simple hello world program.
First, I want to read contents of a file (containing a BF program) and store it in a variable.
The code reads contents of a file helloworld.bf
, removes all spaces, new lines, numbers and letters (as they are not valid BF command) and prints it to the standard output.
I’ll create a new type Program
, that will store the address array of bytes, as well as the size of an array and a current pointer position.
Next, I’ll create a function that takes a BF code as an argument, creates a new Program
and interprets the code.
Because of the fact that there are loops and nested loops in BF, I’ll create a recursive function that will be executing BF code for the existing Program instruction array. The function will iterate through every rune (character or char) in code
and execute a specific set of instructions for each BF command.
The first 6 commands are straight forward (+, -, >, <, ., ,).
The program.instructions
is a byte array and program.at
is a current array index pointing to one cell in instruction array.
Coding loops
This was the challenging part. I’ve decided to have a flag that I will raise when the loop start [
command occurs. When the flag is raised, the program will be ignoring all other loop start commands and will try to find it’s matching pair of closing bracket ]
. When it finds it’s closing bracket, the code within the loop will be extracted and then the recursive function executeWith
will be called for that code, until the current cell is 0.
First, the necessary flags and variables
If current command is [
, raise that flag and remember the index of an array where that loop started.
Next is the “ignore” logic. If the program encounters another [
command, add 1 to the skipClosingLoop
variable. This variable will be decremented every time a ]
command is encountered, until the final ]
is found. When the matching closing bracket is found, the code inside the loop will be extracted and executed. There is a check if loopStart == loopEnd
in case there is a []
in code.
And we’re done! Last thing I want to add is passing a file in program arguments.
Let’s run the hello world BF program.
Source code
Source code can be found on GitHub
Conclusion
The program lacks in error handling and error checking. But, this was really fun to implement and good time to practice a new programming language. I challenge you to try and implement the BF interpreter in a language you’re learning.
Comments