zygomys: embedded scripting toolkit for Go 16 March 2016 Jason E. Aten, Ph.D. Principal Engineer, Betable.com j.e.aten@gmail.com @jasonaten_ https://github.com/glycerine/zygomys [[https://github.com/glycerine/zygomys/wiki.]] The wiki has details, examples, and discussion. * scenario * Snoopy versus the Red Baron .image snoopy_ace.png * What is his cry? .image curse_you_red_baron.png - imagine Snoopy comes back with friends... * data model for a plane formation, Go structs .code snoopy1 - note slice of interface / embedded elements (we handle these) * Methods defined on the Snoopy struct .code snoopy2 - anything that can Fly satisfies the Flyer interface. Weather, informs the Fly event: .code weather * Flyer friends, can be listed in the Friends slice in Plane .code other.planes * with the data model in mind, lets interact with it using zygo... * Make it rain .code make.it.rain * Bring in some planes, in formation .code make.planes - in three lines we've instantiated and configured 3 Go structs in a tree - call a Go method on a Go struct... Snoopy goes Fly()-ing .code plane.interact * we've just been interacting/scripting Go data and methods... * yeah, reflection is pretty cool * zygomys - a scripting toolkit for Go * zygomys -- what's in a name? - zygo means union (yoke in Greek; the zygote was the first cell that was you). - mys means mouse - this is a little mouse of a language - bonus: a "pocket gopher" known as Zygogeomys trichopus. Our mascot, "Ziggy". - this is the union of lisp and Go. In a small cute package. - Let's use the shorter `"zygo"` for the language, when speaking aloud. The Michoacan pocket gopher is a small animal with short, dense, black, lustrous fur... It is docile when caught, making no attempt to bite as do other pocket gophers. -- [[https://en.wikipedia.org/wiki/Michoacan_pocket_gopher]] .image pocket_gopher2.jpeg * Getting started: How to embed the REPL in your code See [[https://github.com/glycerine/zygomys/blob/master/cmd/zygo/main.go]]. Just three steps. .code main.go /START OMIT/,/END OMIT/ * architecture / overview of design .image pocket_gopher3.jpeg - a) lexer produces tokens - b) parser produces lists and arrays of symbols - c) macros run at definition type - d) codegen produces s-expression byte-code - e) (work in progress) builders create and check types at `run` time. (Builders are a hybrid between a function and a macro.) - f) a simple virtual machine executes s-expression byte-code. User's functions run. * Let's see some code - hashmaps can define arbitrary records; with or without attached Go shadow structs .code sample.run1 we range through the hashmap, `hsh`, like this: .code sample.run3 * records - records are hash tables with a name. All hash tables preserve key-order. .code harry.record * plain hash maps, without a distinct record TypeName .code hash.demo * arrays (slices) .code arrays.run * aims - interactive, but also aim to eventually compile-down to Go - blending Go and lisp - I built it for myself - technically interesting about the zygo implementation: - using goroutines as coroutines to get pause-able parsing. avoids the O(n^2) trap. Call for more input from many points; inversion of select loop -- exit when you've got another line of input tokens. See repl/parser.go. - if you haven't discovered how to do conditional sends on a channel yet, examples inside `github.com/glycerine/zygomys/repl/parser.go`. * hard parts that are already done - script calls to existing Go functions using existing Go structs. Using reflect is somewhat laborious; but its done - Go-style for loops, nest-able, with break to label and continue to label. - eval - sandbox / restrict filesystem access - full closures with lexical scope - adjust lisp syntax to be Go compatible: % for quoting, 'a' for characters. - higher order functions. * classic lisp style - list processing .code hof.foldr - see the closure tests in [[https://github.com/glycerine/zygomys/blob/master/tests/closure.zy]] * there is also an infix interface - anything inside curly braces is infix parsed. Can mix in function calls. Math becomes more readable. Uses a Pratt parser. .code infix.session * inspect the transformation .code infix.txt * Top-down Operator Precedence (Pratt parsing) - Like quicksort, Pratt's is a short, sharp algorithm; not particularly easy to understand on the first pass. It builds a parse tree so that precedence is decreasing as you go up to the root. Higher precedence means stronger binding, which means being lower in the parse tree. - works like magic. Very easy to extend your language with new infix operations, once the core Expression routine is implemented. This is how zygo processes infix syntax. - links for learning: [[https://github.com/glycerine/zygomys/blob/master/repl/pratt.go]] Already written into zygo. [[https://github.com/glycerine/PrattParserInC/blob/master/Vaughan.Pratt.TDOP.pdf]] original [[http://javascript.crockford.com/tdop/tdop.html]] Douglas Crockford's article, javascript example in python: [[http://effbot.org/zone/simple-top-down-parsing.htm]] blog with discussion/Java: [[http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/]] * parse tree .image parseTree.jpg - hill-climbing: the precedence gets smaller as you go up. - + addition has precedence 50. - * multiplication has precedence 60, and so binds more tightly. * zygo use cases - as a query language - configuration language that can query itself. - Eventually... multi-core friendly scripting. (Channels now, but no select yet). - Eventually... leverage Go's multicore strength for exploratory configuration, data analysis and scripting. (I love R for productivity, Go for production). * the basic Go API: adding compiled functions to zygo .code first.go /START OMIT/,/END OMIT/ .code use.first * tour of the insides: major files of `github.com/glycerine/zygomys/repl/` - repl.go ..........ReplMain() lives here, toplevel loop - expression.go ..........the core Sexp s-expression definitions - environment.go ..........lexical scope, symbol table lookup - lexer.go ..........hand written lexer - parser.go ..........recursive parsing of symbols into lists and arrays - generator.go ..........code-gen: generate Sexp byte code - vm.go ..........virtual machine that executes the byte code - builders.go ..........declarations and type checking - gotypereg.go ..........the type registry for Go/zygo type correspondence * auxiliary/helper files; also in `github.com/glycerine/zygomys/repl/` - typeutils.go ..........runtime type inspection from zygo (type? var) - hashutils.go ..........records (a.k.a hash tables) - listutils.go ..........linked list (*SexpPair) utilities - strutils.go ..........string utilities - scope.go ..........stack of symbol tables - stack.go ..........used by scopes - rawutils.go ..........raw []byte handling - numerictower.go ..........number conversions - vprint.go ..........debug print stuff for development * custom types: extension example files - random.go .......... (wrap math/rand.Float64() call) - regexp.go .......... (wrap regexp.Regexp) - time.go .......... (wrap time.Time) - jsonmsgp.go .......... (conversions to/from json and msgpack) * the fundamental Sexp types .code exp.txt - SexpNull (actually a value; an instance of the SexpSentinel type) - SexpSymbol (variable and function names; symbol table entries) - SexpPair (linked lists) - SexpArray (slices) - SexpHash (hash table; keys and values are any Sexp, key ordering preserved) * debug tools, at the zygo repl - `.dump` .......... shows the data stack - `.debug` .......... traces the PC (program counter) through the byte code - `.undebug` .......... turns off traces - `.gls` .......... global list of all symbols - `.ls` .......... list of local symbols - `(macexpand)` .......... show what a macro expands to - `(infixExpand {})` .......... show the s-expression expansion of an infix expression - the wiki has a lot of documentation. [[https://github.com/glycerine/zygomys/wiki]] * json / msgpack support See the top of `github.com/glycerine/zygomys/repl/jsonmsgp.go` for a guide. .code msgp.txt * type system * type system. work in progress... - always manifestly typed: a variable points to a value that knows its own type. - add-on: optional static type system -- enforced at definition time -- is half implemented - to follow status, `github.com/glycerine/zygomys/tests/decl_fun.zy` - struct declarations done, function type declarations with (func) not yet done. - the static typing aims for compatibility with Go types (to enable compile-down) .code declare.struct * intro to the mechanism of zygo's static type system: Builders - A builder is a special kind of function; a hybrid between a function and a macro. I chose the term `builder` to reflect their ability to build structs (i.e. new types), packages, type aliases. - Like a macro, a builder receives the un-evaluated parse-tree of symbols from its caller. A builder can therefore be used to build new types and declare new functions/methods. - Like a function, a builder is called at run/evaluation time, not at definition time. (Macros are run at code-gen, which is definition time). - Since it receives an un-evaluated tree of symbols, a builder must manually eval arguments it wants to find bindings for. - Used in zygo to define structs. Next planned use: define interfaces, functions, methods, and type aliases. See [[https://github.com/glycerine/zygomys/blob/master/repl/builders.go]] for examples. * If you want to play with type systems - try out your parametric-polymorphism idea? - I don't know if its a good idea but... - the sigil-system is setup to let you explore this area - sigil prefixed symbols start with `$`, `#`, or `?` - makes it easy to define type-variables, for example. * sigil system symbols with a special prefix - `mysym` is a regular symbol - `$mysym` is a sigil symbol, with sigil '$'. It is distinct from `mysym`. - `#mysym` is a sigil symbol, with sigil '#'. It is distinct from `$mysym` and `mysym`. - `?mysym` is a sigil symbol, with sigil '?'. It is distinct from the above. * sigils part 2 - sigil prefixed-symbols evaluate to themselves by default. - useful for grammars, symbolic reasoning. * todo - dataframe - matrix / tensor types - complex number support * it's the future ... dream big - model checking syntax and checker (Ordered-Binary-Decision-Diagram based) - port TLA+ model checker to Go, with zygomys interface - unification engine for type system experiments * final thoughts - interpreters teach the value of a test-driven approach - tests are simply language fragments * credits The ancestor dialect of zygomys, [[https://github.com/zhemao/glisp]] Glisp, was designed and implemented by Howard Mao [[https://zhehaomao.com/]]. Thanks Howard!