Year: 2013 Authors: Diaconu et al

Summary

Hekaton targets in memory highly concurrent OLTP workloads and is developed by Microsoft researchers and part of SQL Server. The system is completely lock free and uses a optimistic multiversion concurrency control scheme. The paper presents lots of neat tricks for in memory DBs.

Ideas

  • No partitioning: partitioning works well when the workload is partionable, otherwise a transaction touches several partitions and deteriorates performance.
  • Principles:
    • indexes for main memory
    • eliminate latches and locks
    • compile requests to native code: (per evaluation this turns out to be high impact)

Compilation

  • Compiled schema: the reason this is needed is that the tables are treated as a blackbox to the engine so each table must have custom functions that perform tasks such as hash function computation, and the reason for this is to increase speed since the functions could be compiled to native code.
  • Compiled stored procedure
    • Instead of functions on the query plan label (scans, joins, aggregation), “we collapse an en- tire query plan into a single function using labels and gotos to implement and connect these interfaces”. This speeds things up because there isn’t “wasted instructions where one operator merely calls another without performing any useful work”

Transactions

  • To ensure that a transaction is serializable requires the following two properties (note that snapshot isolation, a popular method for implementing MVCC is not serializable).
    • Read stability: the version visible to T at the beginning of the transaction should be the same at the end
    • Phantom avoidance:
  • Commit dependency: this is needed to implement non blocking transactions where the system allows a transaction T1 that begins when a transaction T2 is being validated, but as a result:
    • (1) must make sure that T1 cannot commit until T2 (and other dependent transaction) are committed: basically implement a dependency counter and have T2 keep a reference to T1 and decrement the counter when committed.
    • (2) T1 cannot work with uncommitted data: set up read barrier.
  • Durability:
    • Logging: log stream and checkpoint stream (pretty standard)
      • Heckaton never writes dirty data, so it could log at commit time.
      • To make sure that checkpointing doesnt bloat, versions are merged when active content (not deleted versions in datafile) is below a certain threshold.
    • No logging for indices
    • All logging is logical

Garbage Collection

  • Principles: non-blocking, cooperative (transaction execution threads could opportunistically GC when it runs across data not accessed).