MinJ: A Programming Language for the JVM

2024

Java 21, ANTLR4, Gradle, ShadowJar, GitHub Actions, JVM

Overview

MinJ (Minimalistic Java) is a scripting language I designed and built to learn how interpreters work from the inside. It is not a toy REPL that only evaluates arithmetic: it has classes, methods, typed variables, loops, lists, and return with proper stack unwinding. The whole thing runs on the JVM: a 246-line ANTLR4 grammar generates the lexer and parser, a 730-line EvalVisitor walks the parse tree and executes it, and the result ships as a single fat-JAR you can run anywhere Java runs.

I built it in my third semester. The goal was to understand every layer of a language implementation — lexical analysis, parsing, scoping, type checking, method dispatch — by actually building one, not just reading about it.

How it works

Source code goes through three stages:

  1. MinJ.g4 defines the grammar. During the Gradle build, ANTLR4 generates MinJLexer and MinJParser from it. The parser produces a typed parse tree.
  2. EvalVisitor extends the ANTLR-generated MinJBaseVisitor<Object> and overrides one method per grammar rule. Together the visit* methods walk the tree and execute it, handling variable allocation, arithmetic, control flow, method dispatch, and object instantiation.
  3. Main.java reads a .mj file, feeds it through the lexer and parser, and calls visitor.visit(tree).

The grammar and execution logic are strictly separate. The .g4 file defines what MinJ programs look like; EvalVisitor defines what they do. Adding a new language construct means touching exactly two files: MinJ.g4 and EvalVisitor.java.
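
To make the separation concrete, here is a minimal sketch of a tree-walking visitor. It assumes hand-written node classes in place of the ANTLR-generated ones; Num, Add, Mul, NodeVisitor, and EvalSketch are hypothetical names for illustration and are not part of MinJ.

```java
// Hypothetical stand-ins for ANTLR-generated tree classes. In MinJ the
// real parse tree comes from MinJParser, not from hand-written nodes.
interface Node {
    <T> T accept(NodeVisitor<T> visitor);
}

interface NodeVisitor<T> {
    T visitNum(Num node);
    T visitAdd(Add node);
    T visitMul(Mul node);
}

// The node types only describe structure; they contain no evaluation logic.
record Num(int value) implements Node {
    public <T> T accept(NodeVisitor<T> visitor) { return visitor.visitNum(this); }
}

record Add(Node left, Node right) implements Node {
    public <T> T accept(NodeVisitor<T> visitor) { return visitor.visitAdd(this); }
}

record Mul(Node left, Node right) implements Node {
    public <T> T accept(NodeVisitor<T> visitor) { return visitor.visitMul(this); }
}

// All execution logic lives in the visitor, one method per node type.
class EvalSketch implements NodeVisitor<Object> {
    public Object visitNum(Num node) {
        return node.value();
    }

    public Object visitAdd(Add node) {
        return (Integer) node.left().accept(this) + (Integer) node.right().accept(this);
    }

    public Object visitMul(Mul node) {
        return (Integer) node.left().accept(this) * (Integer) node.right().accept(this);
    }
}

class TreeWalkDemo {
    public static void main(String[] args) {
        // Tree for 2 + 3 * 4, built by hand instead of by a parser.
        Node tree = new Add(new Num(2), new Mul(new Num(3), new Num(4)));
        System.out.println(tree.accept(new EvalSketch())); // prints 14
    }
}
```

Supporting a new node type in this sketch means adding one record and one visit method, which mirrors the two-file workflow described above.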

Under the hood, the runtime uses Cell objects to track each variable's value, declared type, mutability, and whether it's dynamically typed. return statements throw a ReturnSignal (a RuntimeException subclass) to unwind the call stack — simple, and it means returns work correctly from inside nested loops and conditionals. Object instances deep-copy their class's field cells on construction, so each instance has its own state.
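
The return mechanism can be sketched as follows. Only the name ReturnSignal and its RuntimeException superclass come from MinJ; the value field, the ReturnSketch class, and the disabled stack trace are assumptions made for illustration.

```java
// Control-flow exception that carries the returned value up to the call site.
class ReturnSignal extends RuntimeException {
    final Object value;

    ReturnSignal(Object value) {
        // Assumption: stack trace and suppression are switched off because the
        // exception is control flow, not an error. MinJ may not do this.
        super(null, null, false, false);
        this.value = value;
    }
}

class ReturnSketch {
    // Stands in for a method body. It searches the candidates 11..13, 21..23,
    // 31..33 and "returns" the first multiple of factor from the inner loop.
    static void body(int factor) {
        for (int row = 1; row <= 3; row++) {
            for (int col = 1; col <= 3; col++) {
                int candidate = row * 10 + col;
                if (candidate % factor == 0) {
                    throw new ReturnSignal(candidate); // MinJ: return candidate
                }
            }
        }
    }

    // Stands in for method dispatch: the call site is the only place
    // that catches the signal, so both loops are unwound in one step.
    static Object call(int factor) {
        try {
            body(factor);
            return null; // the body finished without executing a return
        } catch (ReturnSignal signal) {
            return signal.value;
        }
    }

    public static void main(String[] args) {
        System.out.println(call(7));  // prints 21
        System.out.println(call(50)); // prints null
    }
}
```

Because the signal is caught only where the call was made, no loop or conditional visitor needs to know that a return happened.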

The language

MinJ has var (mutable, dynamically typed), val (immutable, single assignment enforced at runtime), and explicit types (int, float, double, boolean, char, String). Typed declarations are checked at assignment — if you declare int x and try to assign a String, it fails.
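
A rough sketch of how a cell could enforce these rules at assignment time. The field names, constructor, and exception types are assumptions and not MinJ's actual Cell class; the sketch also assumes a val carries no declared type.

```java
// Hypothetical variable cell: tracks value, declared type, and mutability.
class Cell {
    private Object value;
    private final Class<?> declaredType; // null means dynamically typed
    private final boolean mutable;       // false for val
    private boolean assigned;

    Cell(Class<?> declaredType, boolean mutable) {
        this.declaredType = declaredType;
        this.mutable = mutable;
    }

    boolean isDynamic() {
        return declaredType == null;
    }

    Object get() {
        return value;
    }

    void set(Object newValue) {
        // A val accepts exactly one assignment.
        if (!mutable && assigned) {
            throw new IllegalStateException("cannot reassign a val");
        }
        // A typed cell rejects values of any other type.
        if (declaredType != null && !declaredType.isInstance(newValue)) {
            String actual = newValue == null ? "null" : newValue.getClass().getSimpleName();
            throw new IllegalArgumentException(
                "expected " + declaredType.getSimpleName() + " but got " + actual);
        }
        value = newValue;
        assigned = true;
    }
}

class CellDemo {
    public static void main(String[] args) {
        Cell x = new Cell(Integer.class, true); // MinJ: int x
        x.set(1);
        try {
            x.set("hello");                      // wrong type
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());  // prints expected Integer but got String
        }

        Cell limit = new Cell(null, false);      // MinJ: val limit
        limit.set(100);
        try {
            limit.set(200);                      // second assignment to a val
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());  // prints cannot reassign a val
        }
    }
}
```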

Control flow: if/elseif/else, while, for i = 1 to n step s, and foreach x in list. All blocks are terminated with end. Boolean operators accept both symbols and words (&& or and, || or or, ! or not).

Classes have fields, methods, and new-based instantiation with dot-call syntax. Global functions use func and support multiple return values with destructuring (var sum, product = addAndMultiply(3, 4)). Comments work with //, #, or /* */.

Examples

FizzBuzz

val limit = 100
for i = 1 to limit do:
  if i % 15 == 0 then:
    print "FizzBuzz"
  elseif i % 3 == 0 then:
    print "Fizz"
  elseif i % 5 == 0 then:
    print "Buzz"
  else:
    print i
  end
end

Classes and methods

class Counter:
  var count = 0
  method inc():
    count = count + 1
  end
  method get():
    return count
  end
end

var c = new Counter()
c.inc()
c.inc()
print c.get()   // 2
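
Each Counter instance owns its count because an instance copies its class's field cells on construction. A minimal Java sketch of that idea, with FieldCell, ClassTemplate, and instantiate as hypothetical names rather than MinJ's real runtime types:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified cell holding only a value.
class FieldCell {
    Object value;

    FieldCell(Object value) {
        this.value = value;
    }

    FieldCell copy() {
        return new FieldCell(value);
    }
}

// Hypothetical class definition: field names mapped to their initial cells.
class ClassTemplate {
    final Map<String, FieldCell> fields = new HashMap<>();

    // Every instance gets fresh cells, so a write never reaches
    // the template or any other instance.
    Map<String, FieldCell> instantiate() {
        Map<String, FieldCell> own = new HashMap<>();
        fields.forEach((name, cell) -> own.put(name, cell.copy()));
        return own;
    }
}

class InstanceDemo {
    public static void main(String[] args) {
        ClassTemplate counter = new ClassTemplate();
        counter.fields.put("count", new FieldCell(0));

        Map<String, FieldCell> a = counter.instantiate();
        Map<String, FieldCell> b = counter.instantiate();

        // Equivalent of calling inc() twice on a and never on b.
        a.get("count").value = (Integer) a.get("count").value + 1;
        a.get("count").value = (Integer) a.get("count").value + 1;

        System.out.println(a.get("count").value); // prints 2
        System.out.println(b.get("count").value); // prints 0
    }
}
```

The sketch copies the cell but shares the value object, which is safe for immutable values such as numbers and strings; mutable values like lists would need their own copy.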

Recursive factorial

func factorial(n):
  if n <= 1 then:
    return 1
  end
  return n * factorial(n - 1)
end

print factorial(5)   // 120

Tooling and CI/CD

The Gradle build uses the ANTLR plugin to regenerate the lexer and parser from MinJ.g4 on every build. ShadowJar packages everything (ANTLR runtime included) into a single fat-JAR — run any .mj file with java -jar minjc-0.3.0.jar yourfile.mj.

A GitHub Actions pipeline runs on every push to the release branch. It checks out the code, sets up JDK 21, builds the fat-JAR, runs tests, bundles the examples, and publishes a versioned release asset. Documentation auto-deploys to GitHub Pages on each push to master.

How to extend it

Adding a new language construct is mechanical. For example, the while loop:

  1. Grammar rule in MinJ.g4:
whileStmt : WHILE expr DO COLON block END ;
  2. Register it in the statement rule:
statement : varDecl | assign | printStmt | ifStmt | whileStmt ;
  3. Lexer tokens:
WHILE : 'while' ;
DO    : 'do' ;
END   : 'end' ;
  4. Visitor override in EvalVisitor.java:
@Override
public Object visitWhileStmt(MinJParser.WhileStmtContext ctx) {
    // Re-evaluate the condition before every iteration; the cast assumes it yields a Boolean.
    while ((Boolean) visit(ctx.expr())) {
        visitBlock(ctx.block());
    }
    return null; // a while statement produces no value
}
  5. Rebuild: ./gradlew clean shadowJar

Grammar rule, tokens, visitor method. That's it.