Year: 2011

Authors: P Alvaro N Conway, JM Hellerstein, WR Marczak

Summary

Distributed programming is difficult to do correctly due to the many details and different combinations of scenarios. The authors propose a new programming language that helps the programmers write high level code that behaves in the intended manners (levels of consistency). The key underlying drive is to ensure that programmers don’t assume sequentiality and only require order when needed.

Consistency and Logical Monotonicity

Motivation: “Order independence is a key attribute of declarative languages based on sets, which has led most notably to the success of paral- lel databases and web search infrastructure.”

Intuitive Definition: “Monotonic programs—e.g., programs expressible via selection, projection and join (even with recursion)—can be implemented by streaming algorithms that incrementally produce output elements as they receive input elements. The final order or contents of the input will never cause any earlier output to be “revoked” once it has been generated.”

Language Basics

Time: “Bloom statements are defined with respect to atomic “timesteps,” which can be implemented via successive rounds of evaluation. In each timestep, certain “ground facts” exist in collections due to persistence or the arrival of messages from outside agents (e.g., the network or system clock). The statements in a Bloom program specify the derivation of additional facts, which can be declared to exist either in the current timestep, at the very next timestep, or at some non-deterministic time in the future at a remote node.”

Collections: * Table: The contents of a table persist across consecutive timesteps (until that persistence is interrupted via a Bloom statement containing the <- operator, which deletes). * Scratch collections are useful for transient data like intermediate results and “macro” definitions that enable code reuse * The facts of the “real world”, including network messages and the passage of wall-clock time, are captured via channel and periodic collections; these are scratch collections whose contents “appear” at non-deterministic timesteps. * interface: specifies a connection point between Bloom modules.

Below is what makes up the main parts of the language:

bloom basics

And there are some common functionalities that are supported for collections. These don’t interact directly with time so they are orthogonal to the temporal semantics.

bloom basics

People tend to think that logic programming is difficult sometimes because the interactions between the rules is hard to understand. Bloom makes it easier by

  • keeping it purely declarative (as opposed to mixing with interactive)
  • exposing very tight interfaces and combining via modules (through the object oriented interface of Ruby)

Below is a sample program that declares an interface and implements a key value store.

bloom example

How Often Are Things Monotone?

Most basically set membership (without deletion) is monotone, but that’s very restrictive. To support deletes, we could introduce a tombstone set that captures the deletes to make it monotone despite supporting deletes. The expression MIN(x) < 100 is monotonic despite containing an aggregation.

This is probably the biggest concern for Bloomg-lang (after performance issues, followed by programmability), and there was a followup paper, Bloom^L that seek to use the lattice framework to bring more functionalities into the nice monotone world.