Year: 1986

Authors: Jim Gray, Franco Putzolu

tl;dr: it’s 5 hours in 2007 (get more main memory!)


Database architecture is tightly coupled to the characteristics of the underlying hardware, e.g. I/O cost and volatility. The 2007 revisit of this paper points out that there is now a lot more data, some virtualization, and most importantly flash memory, all of which should really change how people think about the memory hierarchy.

The Original

When is it cheaper to keep a record in main memory rather than access it on disc? For high-end systems of the 1980’s the answer is: pages referenced every five minutes should be memory resident.

Notice that this is not about speed, just price, because back then the world was different: “In some situations, response time dictates that the data be main-memory resident because disc accesses introduce too much delay. These situations are rare. More commonly keeping data main memory resident is purely an economic issue.”

I got really tripped up with how to calculate cost (without thinking about time). We basically have the following formulations:

  • Disc: 15 accesses per second (a/s), priced at 15K$ for a small 180Mb disc. So the price per access per second is about 1K$. The extra CPU and channel cost for supporting a disc is 1K$/a/s. So one disc access per second costs about 2K$.
  • Main: 5$ per kilobyte.

So if making a 1Kb record resident saves 1 a/s, it saves about 2K$ worth of disc accesses at a cost of 5$ (great!). A record accessed only once every T seconds saves 2K$/T, so the break-even point is one access every 2000/5 = 400 seconds, roughly five minutes. The more frequent the access (the shorter the interval), the more worthwhile it is to buy main memory.
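To double-check the arithmetic, here is a minimal sketch that rebuilds the 2K$ per a/s figure from the raw disc numbers above and then solves for the break-even interval. The constant and function names are my own, and the model assumes a 1Kb record and no queuing.

```python
# 1986 figures quoted in the notes above.
DISC_PRICE = 15_000.0            # $ for one disc (plus its share of controller)
DISC_ACCESSES_PER_SEC = 15.0     # random accesses the disc delivers per second
CPU_CHANNEL_OVERHEAD = 1_000.0   # $ per access/second for CPU and channel
MEMORY_PRICE_PER_KB = 5.0        # $ per kilobyte of main memory


def disc_cost_per_access_per_sec() -> float:
    """Dollars needed to sustain one disc access per second."""
    return DISC_PRICE / DISC_ACCESSES_PER_SEC + CPU_CHANNEL_OVERHEAD


def break_even_seconds(record_kb: float = 1.0) -> float:
    """Access interval at which memory and disc cost the same.

    A record touched once every T seconds uses 1/T of an access per second,
    so keeping it on disc costs disc_cost / T dollars. Keeping it in memory
    costs MEMORY_PRICE_PER_KB * record_kb dollars. Setting them equal gives T.
    """
    return disc_cost_per_access_per_sec() / (MEMORY_PRICE_PER_KB * record_kb)


if __name__ == "__main__":
    seconds = break_even_seconds()
    print(f"cost per a/s: {disc_cost_per_access_per_sec():.0f} $")
    print(f"break-even interval: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

With these inputs the interval comes out at 400 seconds, which is where the "five minutes" in the title comes from.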

I think the fundamental economics at play is that disc is effectively bought for its access rate rather than its capacity: if you access something frequently and cannot keep it in memory, it needs to be spread or duplicated across more discs so that access isn’t bottlenecked (assuming there is no queuing or contention). This is why speed doesn’t enter the calculation, only the price of memory versus the price of disc accesses.

If we increase the record size, one would expect the break-even time to be shorter (remember: the shorter the time, the less main memory one buys). A bigger record costs more memory to keep resident but still saves only one disc access. The math is obvious, but intuitively it felt strange that record size made a difference.
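A quick sweep makes the record-size effect concrete. This is only a sketch under the same simple model as the notes (2K$ per access per second, 5$ per Kb of memory); the record sizes are illustrative picks of mine, and the model ignores page-granularity effects.

```python
DISC_COST_PER_ACCESS_PER_SEC = 2000.0  # $ per a/s, from the notes above
MEMORY_PRICE_PER_KB = 5.0              # $ per Kb, from the notes above


def break_even_by_size(record_kb: float) -> float:
    """Break-even access interval in seconds for a record of the given size."""
    return DISC_COST_PER_ACCESS_PER_SEC / (MEMORY_PRICE_PER_KB * record_kb)


if __name__ == "__main__":
    # Hypothetical record sizes, in Kb, chosen just to show the trend.
    for size_kb in (0.1, 1.0, 4.0):
        seconds = break_even_by_size(size_kb)
        print(f"{size_kb:>4} Kb record: {seconds:>6.0f} s ({seconds / 60:.1f} min)")
```

The interval is inversely proportional to record size: a small record is cheap to keep in memory, so it pays off even if it is touched only about once an hour, while a 4Kb record has to be touched every couple of minutes.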

The Flash

What is flash memory? It’s an intermediate tier: its performance sits between RAM and disk, and it is persistent. It also lowers energy consumption (by reducing RAM usage).

A few key questions:

  • What’s the hardware interface?
  • Is it a special part of main memory or of persistent storage? This question is critical for where it fits into the different logging algorithms.
  • How to track frequent pages (caching mechanism)?
  • How much to get?
  • How to move pages among the layers of the hierarchy?
  • How to track page locations?

The New Rules

  • For RAM and flash disks of 32GB, the break-even interval is 15 min
  • Flash and disk: 2.25 hours
    • two hours is longer than any common checkpoint interval, which implies that dirty pages in flash are forced to disk not by page replacement but by checkpoints
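The intervals above follow from a break-even formula of the general form (pages per MB of the faster tier / accesses per second of the slower device) × (price of the slower device / price per MB of the faster tier). Here is a rough sketch of it. The hardware prices and access rates are NOT in these notes; they are assumed 2007-era figures as I recall them, so treat them as placeholders to swap for real numbers.

```python
def break_even_seconds(pages_per_mb: float, accesses_per_sec: float,
                       device_price: float, price_per_mb: float) -> float:
    """Break-even interval between a faster tier and a slower device.

    pages_per_mb and price_per_mb describe the faster tier (where the page
    would be cached); accesses_per_sec and device_price describe the slower
    device that would otherwise serve the access.
    """
    return (pages_per_mb / accesses_per_sec) * (device_price / price_per_mb)


PAGES_PER_MB = 256  # assuming 4 KB pages

# ASSUMED hardware figures (not from the notes above).
RAM_PRICE_PER_MB = 48.0 / 1024      # assumed $48 per GB of RAM
FLASH_PRICE = 999.0                 # assumed $ for one 32GB flash disk
FLASH_CAPACITY_MB = 32 * 1024
FLASH_ACCESSES_PER_SEC = 6200.0     # assumed random reads per second
DISK_PRICE = 80.0                   # assumed $ for one commodity disk
DISK_ACCESSES_PER_SEC = 83.0        # assumed random accesses per second

# RAM as the cache in front of flash.
ram_flash = break_even_seconds(PAGES_PER_MB, FLASH_ACCESSES_PER_SEC,
                               FLASH_PRICE, RAM_PRICE_PER_MB)
# Flash as the cache in front of disk.
flash_disk = break_even_seconds(PAGES_PER_MB, DISK_ACCESSES_PER_SEC,
                                DISK_PRICE, FLASH_PRICE / FLASH_CAPACITY_MB)

if __name__ == "__main__":
    print(f"RAM / flash:  {ram_flash / 60:.1f} min")
    print(f"flash / disk: {flash_disk / 3600:.2f} h")
```

Under these assumed inputs the two results land close to the 15 minutes and 2.25 hours quoted above. The same function also reproduces the 1986 rule if you plug in 1Kb pages, 15 a/s, and the old prices.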

For checkpointing, database systems benefit if flash memory is treated as part of the system’s persistent storage.

I omit many (potentially important) details…


  • Apparently Gray and Putzolu’s prediction of 5 hours was amazingly accurate
  • We are almost ready for a 30-year writeup (2017)! Time flies!
  • I find hardware changes incredibly boring despite their importance…