System F extraction to C++, Rust

System F Intermediate Language

HNC is an optimizing source-to-source translator in the spirit of PureScript.

Like PureScript, it is a call-by-value functional language with type inference, generating human-readable code in another high-level language.

Unlike PureScript, it is designed to be used indirectly as a compiler backend to generate non-abstract non-functional loops with entirely stack-allocated temporaries from a more functional representation.

System F Core.HNC input language is a non-recursive dialect of ML with call by value reduction, assignment, loops and controlled effects. It is weak enough to be implemented without runtime boxing or heap-allocated closures.

Readable Extraction. As such it is suitable to translation into languages without automatic heap memory management or even without generics such as plain C. The type system is a bare minimum to infer a fixed runtime representation of every variable, annotate types everywhere C++ wants them and take advantage of C++ standard template library. The C++ backend is designed to be easily portable to both static and dynamic mainstream languages such as Java, C, C#, JavaScript or Python.

fold f x0 l = x = x0 p = l while *p != NULL x := f x *p p := p->next x x = fold (\x y -> x + y + 1) 0 lint x = 0; list_int *p = l; while (*p != NULL) { x += y + 1; p = p->next; }

Refined Optimizer HNC key ingredients are an optimizer using a nameless scopeless untyped control flow graph and scope, name and type reconstruction phase. HNC tries as much as possible to preserve the documentary structure of human-supplied identifiers in its input to generate meaningful human-understandable output in the spirit of PureScript.

Minimal Codebase. The type inference algorithm should be as simple as possible and be ready for formal proof. HNC strive to escape from unneeded features relying on attribute grammars and generic programming.

Hackability. HNC should be considered as a framework or lightweight System F core useful in other projects as a library.


Unlike many legacy-burdened compilers, HNC optimizer doesn't use peephole-style hand-ordered phases. Instead, it uses a novel algorithm of interleaved dataflow analysis and rewriting to make optimizer smaller, more modular and less fragile (Lerner et al — 2002).

The algebraic approach lets HNC perform common optimizations such as dead code elimination, inlining, copy propagation, arity reduction in less than 500 lines of code specific to HNC. For the subtle algorithms for forward and backward flow traversal, interleaving, speculative rewrites and fixed point computation HNC uses HOOPL — a reusable component of the mature GHC compiler (Ramsey et al — 2010).

Compiler Passes

  1. INF — Infinity Language
  2. BASELIB — Morte Library
  3. WEAK — Weaking (pending)
  4. AST — Implicitly Typed AST
  5. TYPE — Typechecking
  6. FLOW — Control Flow Graph
  7. SCOPE — Name and Scope Reconstruction
  8. INFER — Type Inference
  9. TARGET — Target AST
  10. LLVM — Clang/Rust Compiler
  11. NATIVE — Native Executable