Year: 2005

Authors: V Prabhakaran, AC Arpaci-Dusseau, RH Arpaci-Dusseau


This paper presents two new methods for analyzing file system behavior and evaluating file system changes.

  • semantic block-level analysis: understand the internal behavior and policies of the file system
  • semantic trace playback: quantify how changing the file system will impact the performance of real workloads

Their analysis has uncovered design flaws, performance problems, and even correctness bugs in the file systems studied.

Semantic Block-Level Analysis

Two existing approaches:

  • apply synthetic or real workloads and measure the resulting file system performance
  • collect traces to understand how file systems are used

These approaches don’t explain why a workload behaves the way it does, though. Analyzing traffic at the block level makes it possible to see:

  • how caching and buffering affect performance
  • whether traffic is sequential or random
  • whether the file system generates bursty traffic

Combining this with information about the on-disk format (the semantic part of this approach) yields even more insight.
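To make the idea concrete, here is a minimal sketch of block-level analysis with a semantic step added. It is not the paper's SBA driver: the `BlockRequest` record, the helper names, and the assumption that the journal occupies one known contiguous block range are all hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass
class BlockRequest:
    time: float   # seconds since the start of the trace
    block: int    # first block number of the request
    count: int    # number of blocks in the request


def sequential_fraction(trace):
    """Fraction of requests that start exactly where the previous one ended."""
    if len(trace) < 2:
        return 0.0
    seq = sum(
        1
        for prev, cur in zip(trace, trace[1:])
        if cur.block == prev.block + prev.count
    )
    return seq / (len(trace) - 1)


def burst_sizes(trace, gap):
    """Group requests into bursts: a new burst starts after `gap` idle seconds."""
    bursts = []
    last_time = None
    for req in trace:
        if last_time is not None and req.time - last_time <= gap:
            bursts[-1] += 1
        else:
            bursts.append(1)
        last_time = req.time
    return bursts


def classify(block, journal_start, journal_len):
    """Semantic step: label a block using (assumed) knowledge of the on-disk layout."""
    if journal_start <= block < journal_start + journal_len:
        return "journal"
    return "fixed-location"


trace = [
    BlockRequest(0.00, 100, 8),
    BlockRequest(0.01, 108, 8),
    BlockRequest(0.02, 116, 8),
    BlockRequest(5.00, 9000, 1),
]
print(sequential_fraction(trace))
print(burst_sizes(trace, gap=1.0))
print([classify(r.block, journal_start=0, journal_len=1024) for r in trace])
```

Without `classify`, the trace only shows three sequential writes followed by one distant write; with layout knowledge, the same trace reads as journal traffic followed by a write to a fixed location.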

Semantic Trace Playback

What it is: it takes as input a trace (generated by the SBA driver, plus two more observations listed below), parses it, and issues I/O requests to the disk through the raw disk interface.

  • observe
    • any file-system level operations that create dirty buffers in memory
    • application-level calls to fsync
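A rough sketch of the replay loop, under loud assumptions: the trace format here (`W <block> <count>` for a write, `S` for an fsync) is invented for illustration, and an ordinary file stands in for the raw disk. The paper's tool also models the file system change being evaluated, which this sketch does not attempt.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, not taken from the paper


def replay(trace_lines, target_path):
    """Replay a toy trace against target_path.

    Returns (blocks_written, syncs). The record format is hypothetical:
      W <block> <count>   write <count> blocks starting at <block>
      S                   application-level fsync
    """
    blocks_written = 0
    syncs = 0
    with open(target_path, "r+b") as dev:
        for line in trace_lines:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if fields[0] == "W":
                block, count = int(fields[1]), int(fields[2])
                dev.seek(block * BLOCK_SIZE)
                dev.write(b"\0" * (count * BLOCK_SIZE))
                blocks_written += count
            elif fields[0] == "S":
                dev.flush()
                os.fsync(dev.fileno())  # force the buffered writes to stable storage
                syncs += 1
            else:
                raise ValueError(f"unknown trace record: {line!r}")
    return blocks_written, syncs


def demo():
    # A temporary file stands in for the raw disk device.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    try:
        return replay(["W 0 2", "W 10 1", "S"], path)
    finally:
        os.remove(path)


print(demo())
```

The two extra observations matter because dirty-buffer creation and fsync calls determine when writes are allowed or forced to reach the disk, and a block-level trace alone does not record that.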

Ext3 File System

What it is:

  • journaling file system
  • loosely based on FFS
  • has three journaling modes: writeback, ordered, and data (each one with more functionality and overhead than the last)
  • compound transactions
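The cost difference between ext3's three modes (writeback, ordered, and data) can be sketched with a simple block count. This is a simplified model of my own, not the paper's: it ignores descriptor and commit blocks, and the function names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    WRITEBACK = "writeback"  # metadata journaled; data written with no ordering guarantee
    ORDERED = "ordered"      # metadata journaled; data reaches its fixed location before commit
    DATA = "data"            # both data and metadata journaled


def journal_blocks(mode, data, metadata):
    """Blocks written to the journal for one update (descriptor/commit blocks ignored)."""
    if mode is Mode.DATA:
        return data + metadata
    return metadata


def total_blocks_written(mode, data, metadata):
    """Every block eventually reaches its fixed location; journaled blocks are written twice."""
    return data + metadata + journal_blocks(mode, data, metadata)


for mode in Mode:
    print(mode.value, total_blocks_written(mode, data=8, metadata=2))
```

In this model writeback and ordered mode write the same number of blocks and differ only in ordering, while data mode roughly doubles the traffic for data-heavy updates.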

Evaluation results highlight:

  • “The journaling mode that delivers the best performance depends strongly on the workload.”
  • tangled synchrony of the compound transactions is disastrous for asynchronous traffic
  • there are additional parallelism gains to be made in the ordered mode
  • when a timer flushes metadata to disk, the corresponding data must be flushed as well, which forces dirty data to disk

They then suggest some improvements to ext3.


ReiserFS

Design differences from ext3:

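  • on-disk structures: the two file systems use different structures to track their fixed-location data
  • journal format: the ReiserFS journal sits at the beginning of the file system
  • journal content: in ReiserFS, both the descriptor block and the commit block store the fixed locations of the journaled blocks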

I think the results show that ReiserFS suffers from some of the same issues as ext3, with some differences, such as checkpointing data much more aggressively than ext3.

The paper then goes on to show results and analysis for several other file systems.