𝔹  uilding
𝔻  istributed
𝔸  pplications in
𝕆  Caml
Making types your friends, not your enemies

François-René Rideau,  Alacris
fare@alacris.io

LambdaConf 2019, 2019-06-07
https://alacrisio.github.io/bdao/

Introduction

Introduction

This talk

A collection of general-purpose techniques
In OCaml, yet relevant to any modern typed FP
Oriented towards distributed applications
Used in the context of blockchain, but more general
... From a Lisper at heart

Introduction

Going from Lisp to OCaml

Experience report adopting language-enforced types
The Good: where the typesystem really helps
The Bad: where the typesystem gets in the way
The Ugly: how the typesystem has to improve

Introduction

Take Home Points

Typed FP enables robust programming at scale
Monotonic FP is especially fit for distributed applications
You pay a steep price for types
To improve Typed FP: better types & metaprogramming

Types at Work: Marshaling

Types at Work: Marshaling

Marshaling data to bytes and back

Translating data structures into byte sequences
a.k.a. Serializing, Encoding, Formatting, Unparsing
Must be matched by unmarshaling (a.k.a. …)
Necessary to communicate and/or persist data
Great sink of labor (old IBM study: 30% of code)
Great fount of bugs & vulnerabilities…

Types at Work: Marshaling

Level 1: Processing Byte Buffers

process_request: Bytes.t -> Bytes.t
Read: let b = inbuf.(n) in if b == 1 then …
Write: match val with C x -> outbuf.(n) = (f x) …
OCaml types don't help at all (but bounds checks do)
Might as well use C - it'd be faster
"Shotgun parser": parser peppered into processor
Very fast... like grapeshot fired at your foot
LangSec: DON'T DO IT

Types at Work: Marshaling

Level 2: Processing JSON (or XML, etc.)

process_request: json -> json
  
Dynamic type safety
Might as well use Lisp - macros would help a lot
Still "Shotgunning" structural integrity checks
Somewhat slow, only so safe, painful in OCaml
Missing the whole point of OCaml

Types at Work: Marshaling

Level 3: Marshaling messages by hand

process_request: request -> response
type 'a marshaler = Buffer.t -> 'a -> unit
type 'a unmarshaler = int -> Bytes.t -> 'a*int
Lots of bug-prone drudgery writing (un)parsers

Types at Work: Marshaling

Level 4: Marshaling combinators

type 'a marshaling =
   { marshal: 'a marshaler; unmarshal: 'a unmarshaler } val marshaling2 : ('x -> 'a*'b) -> ('a -> 'b -> 'x) ->
  'a marshaling -> 'b marshaling -> 'x marshaling
Much less bug-prone and drudgery, but still too much

Types at Work: Marshaling

Level 5: Derive marshaling from type

OCaml PPX: metaprogramming
Compared to Lisp macros, quite heavy to use and not composable

Types at Work: Marshaling

Level 6: Use GADT

process_request: 'a request -> 'a response
Problem: even tougher to use with PPX
Still a TBD item for us

Monotonic Programming

Monotonic Programming

Monotonicity

Purity: no state change
Monotonicity: one-way state may change
Knowledge can increase, never be invalidated
Least fixed-point algorithms. Git. CRDTs. Append-only logs.
Lazy is already monotonic, not pure
But no computation can live to tell the difference

Monotonic Programming

Content-Addressed Storage

Same graph-reduction model as all common functional languages
Pointer: not memory address, but content digest
Standard cryptographic assumptions
DAG-only, no cycles
let db_value_of_digest unmarshal_string digest =
  digest |> db_key_of_digest |>
  Db.get |> Option.get |> unmarshal_string

Monotonic Programming

Abstracting Content-Addressing

General wrapper interface
type +'a wrap
type t = trie wrap
and trie = Leaf of value
  | Branch of {left: t; right: t}
Identity vs lazy-loading from content-addressed store
Higher-Order modules to abstract over persistence of not

Monotonic Programming

Is it an effect?

Escape to impure behavior
Pure at one level, impure at another
Same for logging
Same for lazy evaluation
Same for allocation!
Same for everything!!!

Monotonic Programming

Lenses

Usual pure functional read/write accessors.
type ('a, 'b) t =
  { get : 'a -> 'b; set : 'b -> 'a -> 'a }
Do all modifications in pure style,
then monotonically update the state variables

Monotonic Programming

Zippers

kind of an ('a, 'a) lens as data, not function
Can be generalized to paths in more complex data structures.
Zipping through a binary tree:
type (+'a) path
type zipper = t * t path
val unzip : zipper -> t
val find_path : key -> t -> zipper val path_map: ('a -> 'b) -> 'a path -> 'b path

Monotonic Programming

Merklization

Merklize: make computations verifiable
Zippers-as-data trivialize merklization:
let merkle_proof key mt =
  match map_fst Wrap.get (find_path key mt) with
    | Leaf {value}, up ->
        Some { key ; trie = dv_digest mt
             ; leaf = merklize_leaf value
             ; steps = (path_map dv_digest up).steps }
    | _ -> None

Types: Benefits and Costs

Types: Benefits and Costs

Why use OCaml rather than Lisp?

Cryptocurrency applications: can't afford a single bug.
Static types can find bugs before deployment.
Types critical to communicate design constraints to coworkers.
Parametricity enables robust abstraction at many levels
Putting types first makes you ask important questions.

Types: Benefits and Costs

Downsides of OCaml versus Lisp?

The typesystem rejects the Lisp-easy solution.
The syntax is a hindrance at many levels.
Types-first makes exploration harder.
Type errors can boggle the mind.
Dynamic evaluation is slow and awkward.
Static design patterns are repetitive and error-prone.
No macros. PPX a poor yet expensive substitute.

Types: Benefits and Costs

Bugs eliminated by types

Type mismatch (int for string, etc.)
Improper function calls
Pattern exhaustiveness
Errors in untested intermediate computations
Incomplete refactoring
Abstraction violation (using parametrized modules)

Types: Benefits and Costs

Bugs not eliminated by types

Logic bugs within a module
Configuration bugs
Integer interval errors
Concurrency Issues
Insufficient error handling
          
Still, by restricting interactions, bugs are not just fewer, but simpler
— they fit a brainful.

Types: Benefits and Costs

Performance

Skipping runtime typechecks!
Types enable many optimizations
Dynamic evaluation when needed is actually slower
Macros are harder, limiting user-provided optimizations

Types: Benefits and Costs

Monads vs Lisp

MonadLisp
Errorraising conditions
Stateassignment
Futurethreads, futures
Readerdynamic parameters
Controlpartial continuations
NonDetnon-determinism
IOintercepting IO primitives
Parsingparser DSL

Types: Benefits and Costs

Monads vs Lisp costs & benefits

Monads make explicit what effects may happen where
Monads force you to think about responsibility
Monads introduce plumbing that implementations may not support
Monads build higher towers of structure
Monad syntax is low-level - CPS or ANF, not nested expressions

Future Improvements to Typesystems

Future Improvements to Typesystems

Objects done right

OCaml modules don't support late binding or fix-pointing
OCaml objects are limited in many ways
Nix: prototype inheritance in 6 short library functions
You can't type prototypes in OCaml (or Haskell, etc.)
You need appendable row types
Multimethods? Method combinations? Meta-object protocol?

Future Improvements to Typesystems

Schema Upgrade

A real program works on persistent data
Modern programming languages only create toys
Non-toys use databases. But they have horrible PLs!
Lisp supports schema upgrade. Can static types do it?

Future Improvements to Typesystems

Better Effect Typing and Syntax

Monads are semantically awkward
Monad notation is syntactically horrible
NB: Jane Street is funding effect typing for OCaml
One man's pure bliss is another man's effects

Future Improvements to Typesystems

Compile-time Reflection

Refinement logic --- relating layers of abstraction, correctly
Make explicit multiple levels of computations
Extending the type with domain-specific logic fragments:
linear (URAL…), modal (temporal, epistemic…), finitary, etc.
A composable alternative to PPX... macros?

Future Improvements to Typesystems

Reconciling Types and Macros

Anything information used by a macro is a type:
compile-time information deduced from the source.
Any type-level programming is a macro:
compile-time transformation of source code.
WHY CAN'T I HAVE BOTH, COMPOSABLY?
"Type Systems as Macros" by Chang, Knauth, Greenman
Hackett

Future Improvements to Typesystems

Runtime Reflection

Providing safety and performance to dynamic code, too
Virtualization: Separate fore-program from controlling back-program
Instrumentation: Natural transformations on implementations
Code Migration. Native GC…
GC integration with monotonic database cache
… See my previous talk on first-class implementatoins

Conclusion

Conclusion

Take Home Points (redux)

Typed FP enables robust programming at scale
Monotonic FP is especially fit for distributed applications
You pay a steep price for types
To improve Typed FP: better types & metaprogramming

Conclusion

The Meta-Story

Types can be your friends or your enemies
Life is happier when they are your friends
Life is happier when type friends are less dumb
OCaml type friends are decent, could be even better

Conclusion

Thanks

Our startup:   Alacris  https://alacris.io/
WE ARE EITHER HIRING OR BANKRUPT!   (No constructive proof yet.)
DO YOU HAVE EXTRA MILLIONS?   Seed capital, Research grants…
SHOW ME THE CODE!   https://github.com/AlacrisIO/legicash-facts