Introduction
This talk
A collection of general-purpose techniques
In OCaml, yet relevant to any modern typed FP
Oriented towards distributed applications
Used in the context of blockchain, but more general
... From a Lisper at heart
Introduction
Going from Lisp to OCaml
Experience report adopting language-enforced types
The Good: where the typesystem really helps
The Bad: where the typesystem gets in the way
The Ugly: how the typesystem has to improve
Introduction
Take Home Points
Typed FP enables robust programming at scale
Monotonic FP is especially fit for distributed applications
You pay a steep price for types
To improve Typed FP: better types & metaprogramming
Types at Work: Marshaling
Types at Work: Marshaling
Marshaling data to bytes and back
Translating data structures into byte sequences
a.k.a. Serializing, Encoding, Formatting, Unparsing
Must be matched by unmarshaling (a.k.a. …)
Necessary to communicate and/or persist data
Great sink of labor (old IBM study: 30% of code)
Great fount of bugs & vulnerabilities…
Types at Work: Marshaling
Level 1: Processing Byte Buffers
process_request: Bytes.t -> Bytes.t
Read: let b = inbuf.(n) in if b == 1 then …
Write: match val with C x -> outbuf.(n) = (f x) …
OCaml types don't help at all (but bounds checks do)
Might as well use C - it'd be faster
"Shotgun parser": parser peppered into processor
Very fast... like grapeshot fired at your foot
LangSec: DON'T DO IT
Types at Work: Marshaling
Level 2: Processing JSON (or XML, etc.)
process_request: json -> json
Dynamic type safety
Might as well use Lisp - macros would help a lot
Still "Shotgunning" structural integrity checks
Somewhat slow, only so safe, painful in OCaml
Missing the whole point of OCaml
Types at Work: Marshaling
Level 3: Marshaling messages by hand
process_request: request -> response
type 'a marshaler = Buffer.t -> 'a -> unit
type 'a unmarshaler = int -> Bytes.t -> 'a*int
Lots of bug-prone drudgery writing (un)parsers
Types at Work: Marshaling
Level 4: Marshaling combinators
type 'a marshaling =
{ marshal: 'a marshaler; unmarshal: 'a unmarshaler }
val marshaling2 : ('x -> 'a*'b) -> ('a -> 'b -> 'x) ->
'a marshaling -> 'b marshaling -> 'x marshaling
Much less bug-prone and drudgery, but still too much
Types at Work: Marshaling
Level 5: Derive marshaling from type
OCaml PPX: metaprogramming
Compared to Lisp macros, quite heavy to use and not composable
Types at Work: Marshaling
Level 6: Use GADT
process_request: 'a request -> 'a response
Problem: even tougher to use with PPX
Still a TBD item for us
Monotonic Programming
Monotonicity
Purity: no state change
Monotonicity: one-way state may change
Knowledge can increase, never be invalidated
Least fixed-point algorithms. Git. CRDTs. Append-only logs.
Lazy is already monotonic, not pure
But no computation can live to tell the difference
Monotonic Programming
Content-Addressed Storage
Same graph-reduction model as all common functional languages
Pointer: not memory address, but content digest
Standard cryptographic assumptions
DAG-only, no cycles
let db_value_of_digest unmarshal_string digest =
digest |> db_key_of_digest |>
Db.get |> Option.get |> unmarshal_string
Monotonic Programming
Abstracting Content-Addressing
General wrapper interface
type +'a wrap
type t = trie wrap
and trie = Leaf of value
| Branch of {left: t; right: t}
Identity vs lazy-loading from content-addressed store
Higher-Order modules to abstract over persistence of not
Monotonic Programming
Is it an effect?
Escape to impure behavior
Pure at one level, impure at another
Same for logging
Same for lazy evaluation
Same for allocation!
Same for everything!!!
Monotonic Programming
Lenses
Usual pure functional read/write accessors.
type ('a, 'b) t =
{ get : 'a -> 'b; set : 'b -> 'a -> 'a }
Do all modifications in pure style,
then monotonically update the state variables
Monotonic Programming
Zippers
kind of an ('a, 'a) lens
as data, not function
Can be generalized to paths in more complex data structures.
Zipping through a binary tree:
type (+'a) path
type zipper = t * t path
val unzip : zipper -> t
val find_path : key -> t -> zipper
val path_map: ('a -> 'b) -> 'a path -> 'b path
Monotonic Programming
Merklization
Merklize: make computations verifiable
Zippers-as-data trivialize merklization:
let merkle_proof key mt =
match map_fst Wrap.get (find_path key mt) with
| Leaf {value}, up ->
Some { key ; trie = dv_digest mt
; leaf = merklize_leaf value
; steps = (path_map dv_digest up).steps }
| _ -> None
Types: Benefits and Costs
Types: Benefits and Costs
Why use OCaml rather than Lisp?
Cryptocurrency applications: can't afford a single bug.
Static types can find bugs before deployment.
Types critical to communicate design constraints to coworkers.
Parametricity enables robust abstraction at many levels
Putting types first makes you ask important questions.
Types: Benefits and Costs
Downsides of OCaml versus Lisp?
The typesystem rejects the Lisp-easy solution.
The syntax is a hindrance at many levels.
Types-first makes exploration harder.
Type errors can boggle the mind.
Dynamic evaluation is slow and awkward.
Static design patterns are repetitive and error-prone.
No macros. PPX a poor yet expensive substitute.
Types: Benefits and Costs
Bugs eliminated by types
Type mismatch (int
for string
, etc.)
Improper function calls
Pattern exhaustiveness
Errors in untested intermediate computations
Incomplete refactoring
Abstraction violation (using parametrized modules)
Types: Benefits and Costs
Bugs not eliminated by types
Logic bugs within a module
Configuration bugs
Integer interval errors
Concurrency Issues
Insufficient error handling
Still, by restricting interactions,
bugs are not just fewer, but simpler
— they fit a brainful.
Types: Benefits and Costs
Performance
Skipping runtime typechecks!
Types enable many optimizations
Dynamic evaluation when needed is actually slower
Macros are harder, limiting user-provided optimizations
Types: Benefits and Costs
Monads vs Lisp
Monad | Lisp |
---|
Error | raising conditions |
State | assignment |
Future | threads, futures |
Reader | dynamic parameters |
Control | partial continuations |
NonDet | non-determinism |
IO | intercepting IO primitives |
Parsing | parser DSL |
Types: Benefits and Costs
Monads vs Lisp costs & benefits
Monads make explicit what effects may happen where
Monads force you to think about responsibility
Monads introduce plumbing that implementations may not support
Monads build higher towers of structure
Monad syntax is low-level - CPS or ANF, not nested expressions
Future Improvements to Typesystems
Future Improvements to Typesystems
Objects done right
OCaml modules don't support late binding or fix-pointing
OCaml objects are limited in many ways
Nix: prototype inheritance in 6 short library functions
You can't type prototypes in OCaml (or Haskell, etc.)
You need appendable row types
Multimethods? Method combinations? Meta-object protocol?
Future Improvements to Typesystems
Schema Upgrade
A real program works on persistent data
Modern programming languages only create toys
Non-toys use databases. But they have horrible PLs!
Lisp supports schema upgrade. Can static types do it?
Future Improvements to Typesystems
Better Effect Typing and Syntax
Monads are semantically awkward
Monad notation is syntactically horrible
NB: Jane Street is funding effect typing for OCaml
One man's pure bliss is another man's effects
Future Improvements to Typesystems
Compile-time Reflection
Refinement logic --- relating layers of abstraction, correctly
Make explicit multiple levels of computations
Extending the type with domain-specific logic fragments:
linear (URAL…), modal (temporal, epistemic…), finitary, etc.
A composable alternative to PPX... macros?
Future Improvements to Typesystems
Reconciling Types and Macros
Anything information used by a macro is a type:
compile-time information deduced from the source.
Any type-level programming is a macro:
compile-time transformation of source code.
WHY CAN'T I HAVE BOTH, COMPOSABLY?
"Type Systems as Macros" by Chang, Knauth, Greenman
Hackett
Future Improvements to Typesystems
Runtime Reflection
Providing safety and performance to dynamic code, too
Virtualization: Separate fore-program from controlling back-program
Instrumentation: Natural transformations on implementations
Code Migration. Native GC…
GC integration with monotonic database cache
… See my previous talk on first-class implementatoins
Conclusion
Take Home Points (redux)
Typed FP enables robust programming at scale
Monotonic FP is especially fit for distributed applications
You pay a steep price for types
To improve Typed FP: better types & metaprogramming
Conclusion
The Meta-Story
Types can be your friends or your enemies
Life is happier when they are your friends
Life is happier when type friends are less dumb
OCaml type friends are decent, could be even better
Conclusion
Thanks
WE ARE EITHER HIRING OR BANKRUPT! (No constructive proof yet.)
DO YOU HAVE EXTRA MILLIONS? Seed capital, Research grants…