𝔹  uilding
𝔻  istributed
𝔸  pplications in
𝕆  Caml

Making types your friends, not your enemies

François-René Rideau,  Alacris
fare@alacris.io

LambdaConf 2019, 2019-06-07
https://alacrisio.github.io/bdao/

Introduction

Introduction

This talk

A collection of general-purpose techniques

In OCaml, yet relevant to any modern typed FP

Oriented towards distributed applications

Used in the context of blockchain, but more general

... From a Lisper at heart

Introduction

Going from Lisp to OCaml

Experience report adopting language-enforced types

The Good: where the typesystem really helps

The Bad: where the typesystem gets in the way

The Ugly: how the typesystem has to improve

Introduction

Take Home Points

Typed FP enables robust programming at scale

Monotonic FP is especially fit for distributed applications

You pay a steep price for types

To improve Typed FP: better types & metaprogramming

Types at Work: Marshaling

Types at Work: Marshaling

Marshaling data to bytes and back

Translating data structures into byte sequences

a.k.a. Serializing, Encoding, Formatting, Unparsing

Must be matched by unmarshaling (a.k.a. …)

Necessary to communicate and/or persist data

Great sink of labor (old IBM study: 30% of code)

Great fount of bugs & vulnerabilities…

Types at Work: Marshaling

Level 1: Processing Byte Buffers

process_request: Bytes.t -> Bytes.t

Read: let b = inbuf.(n) in if b == 1 then …

Write: match val with C x -> outbuf.(n) = (f x) …

OCaml types don't help at all (but bounds checks do)

Might as well use C - it'd be faster

"Shotgun parser": parser peppered into processor

Very fast... like grapeshot fired at your foot

LangSec: DON'T DO IT

Types at Work: Marshaling

Level 2: Processing JSON (or XML, etc.)

process_request: json -> json

Dynamic type safety

Might as well use Lisp - macros would help a lot

Still "Shotgunning" structural integrity checks

Somewhat slow, only so safe, painful in OCaml

Missing the whole point of OCaml

Types at Work: Marshaling

Level 3: Marshaling messages by hand

process_request: request -> response

type 'a marshaler = Buffer.t -> 'a -> unit

type 'a unmarshaler = int -> Bytes.t -> 'a*int

Lots of bug-prone drudgery writing (un)parsers

Types at Work: Marshaling

Level 4: Marshaling combinators

type 'a marshaling = 

   { marshal: 'a marshaler; unmarshal: 'a unmarshaler }
val marshaling2 : ('x -> 'a*'b) -> ('a -> 'b -> 'x) -> 

  'a marshaling -> 'b marshaling -> 'x marshaling

Much less bug-prone and drudgery, but still too much

Types at Work: Marshaling

Level 5: Derive marshaling from type

OCaml PPX: metaprogramming

Compared to Lisp macros, quite heavy to use and not composable

Types at Work: Marshaling

Level 6: Use GADT

process_request: 'a request -> 'a response

Problem: even tougher to use with PPX

Still a TBD item for us

Monotonic Programming

Monotonic Programming

Monotonicity

Purity: no state change

Monotonicity: one-way state may change

Knowledge can increase, never be invalidated

Least fixed-point algorithms. Git. CRDTs. Append-only logs.

Lazy is already monotonic, not pure

But no computation can live to tell the difference

Monotonic Programming

Content-Addressed Storage

Same graph-reduction model as all common functional languages

Pointer: not memory address, but content digest

Standard cryptographic assumptions

DAG-only, no cycles

let db_value_of_digest unmarshal_string digest = 

  digest |> db_key_of_digest |> 

  Db.get |> Option.get |> unmarshal_string

Monotonic Programming

Abstracting Content-Addressing

General wrapper interface

type +'a wrap 

type t = trie wrap 

and trie = Leaf of value 

  | Branch of {left: t; right: t}

Identity vs lazy-loading from content-addressed store

Higher-Order modules to abstract over persistence of not

Monotonic Programming

Is it an effect?

Escape to impure behavior

Pure at one level, impure at another

Same for logging

Same for lazy evaluation

Same for allocation!

Same for everything!!!

Monotonic Programming

Lenses

Usual pure functional read/write accessors.

type ('a, 'b) t = 

  { get : 'a -> 'b; set : 'b -> 'a -> 'a }

Do all modifications in pure style,
then monotonically update the state variables

Monotonic Programming

Zippers

kind of an ('a, 'a) lens as data, not function
Can be generalized to paths in more complex data structures.

Zipping through a binary tree:

type (+'a) path 

type zipper = t * t path 

val unzip : zipper -> t 

val find_path : key -> t -> zipper
val path_map: ('a -> 'b) -> 'a path -> 'b path

Monotonic Programming

Merklization

Merklize: make computations verifiable

Zippers-as-data trivialize merklization:

let merkle_proof key mt = 

  match map_fst Wrap.get (find_path key mt) with 

    | Leaf {value}, up -> 

        Some { key ; trie = dv_digest mt 

             ; leaf = merklize_leaf value 

             ; steps = (path_map dv_digest up).steps } 

    | _ -> None

Types: Benefits and Costs

Types: Benefits and Costs

Why use OCaml rather than Lisp?

Cryptocurrency applications: can't afford a single bug.

Static types can find bugs before deployment.

Types critical to communicate design constraints to coworkers.

Parametricity enables robust abstraction at many levels

Putting types first makes you ask important questions.

Types: Benefits and Costs

Downsides of OCaml versus Lisp?

The typesystem rejects the Lisp-easy solution.

The syntax is a hindrance at many levels.

Types-first makes exploration harder.

Type errors can boggle the mind.

Dynamic evaluation is slow and awkward.

Static design patterns are repetitive and error-prone.

No macros. PPX a poor yet expensive substitute.

Types: Benefits and Costs

Bugs eliminated by types

Type mismatch (int for string, etc.)

Improper function calls

Pattern exhaustiveness

Errors in untested intermediate computations

Incomplete refactoring

Abstraction violation (using parametrized modules)

Types: Benefits and Costs

Bugs not eliminated by types

Logic bugs within a module

Configuration bugs

Integer interval errors

Concurrency Issues

Insufficient error handling

Still, by restricting interactions, bugs are not just fewer, but simpler
— they fit a brainful.

Types: Benefits and Costs

Performance

Skipping runtime typechecks!

Types enable many optimizations

Dynamic evaluation when needed is actually slower

Macros are harder, limiting user-provided optimizations

Types: Benefits and Costs

Monads vs Lisp

Monad	Lisp
Error	raising conditions
State	assignment
Future	threads, futures
Reader	dynamic parameters
Control	partial continuations
NonDet	non-determinism
IO	intercepting IO primitives
Parsing	parser DSL

Types: Benefits and Costs

Monads vs Lisp costs & benefits

Monads make explicit what effects may happen where

Monads force you to think about responsibility

Monads introduce plumbing that implementations may not support

Monads build higher towers of structure

Monad syntax is low-level - CPS or ANF, not nested expressions

Future Improvements to Typesystems

Future Improvements to Typesystems

Objects done right

OCaml modules don't support late binding or fix-pointing

OCaml objects are limited in many ways

Nix: prototype inheritance in 6 short library functions

You can't type prototypes in OCaml (or Haskell, etc.)

You need appendable row types

Multimethods? Method combinations? Meta-object protocol?

Future Improvements to Typesystems

Schema Upgrade

A real program works on persistent data

Modern programming languages only create toys

Non-toys use databases. But they have horrible PLs!

Lisp supports schema upgrade. Can static types do it?

Future Improvements to Typesystems

Better Effect Typing and Syntax

Monads are semantically awkward

Monad notation is syntactically horrible

NB: Jane Street is funding effect typing for OCaml

One man's pure bliss is another man's effects

Future Improvements to Typesystems

Compile-time Reflection

Refinement logic --- relating layers of abstraction, correctly

Make explicit multiple levels of computations

Extending the type with domain-specific logic fragments:
linear (URAL…), modal (temporal, epistemic…), finitary, etc.

A composable alternative to PPX... macros?

Future Improvements to Typesystems

Reconciling Types and Macros

Anything information used by a macro is a type:
compile-time information deduced from the source.

Any type-level programming is a macro:
compile-time transformation of source code.

WHY CAN'T I HAVE BOTH, COMPOSABLY?

"Type Systems as Macros" by Chang, Knauth, Greenman
Hackett

Future Improvements to Typesystems

Runtime Reflection

Providing safety and performance to dynamic code, too

Virtualization: Separate fore-program from controlling back-program

Instrumentation: Natural transformations on implementations

Code Migration. Native GC…

GC integration with monotonic database cache

… See my previous talk on first-class implementatoins

Conclusion

Conclusion

Take Home Points (redux)

Typed FP enables robust programming at scale

Monotonic FP is especially fit for distributed applications

You pay a steep price for types

To improve Typed FP: better types & metaprogramming

Conclusion

The Meta-Story

Types can be your friends or your enemies

Life is happier when they are your friends

Life is happier when type friends are less dumb

OCaml type friends are decent, could be even better

Conclusion

Thanks

Our startup: Alacris https://alacris.io/

WE ARE EITHER HIRING OR BANKRUPT! (No constructive proof yet.)

DO YOU HAVE EXTRA MILLIONS? Seed capital, Research grants…

SHOW ME THE CODE! https://github.com/AlacrisIO/legicash-facts