Tahoe – The Least-Authority Filesystem

Authors

Venue

StorageSS’08, October 31, 2008, Fairfax, Virginia, USA

Publication Year

2008

Abstract

Tahoe is a storage grid designed to provide secure, long-term storage, such as for backup applications. It consists of userspace processes running on commodity PC hardware and communicating with one another over TCP/IP. Tahoe was designed following the Principle of Least Authority: each user or process that needs to accomplish a task should be able to perform that task without having or wielding more authority than is necessary.

Tahoe was developed by allmydata.com to serve as the storage backend for their backup service. It is now in operation and customers are relying on the Tahoe grid for the safety of their data. Allmydata.com has released the complete Tahoe implementation under open source software licences.

The data and metadata in the filesystem are distributed among servers using erasure coding and cryptography. The erasure coding parameters determine how many servers are used to store each file (denoted N) and how many of them are necessary for the file to be available (denoted K). The default settings for those parameters, which are also the settings used in the allmydata.com service, are K = 3, N = 10, so each file is shared across 10 different servers, and the correct function of any 3 of those servers is sufficient to access the file. The combination of cryptography and erasure coding minimizes the user’s vulnerability to these servers. It is an unavoidable fact of life that servers can fail, or can turn against their clients. This can happen if the server is compromised through a remote exploit, subverted by insider attack, or if the owners of the server are required by their government to change its behavior, as in the case of Hushmail in 2007. Tahoe’s cryptography ensures that even if the servers fail or turn against the client, they cannot violate confidentiality by reading the plaintext, nor can they violate integrity by forging file contents. In addition, the servers cannot violate freshness by causing file contents to roll back to an earlier state without a collusion of multiple servers.
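
As a rough illustration of the K-of-N parameters described above, the following Python sketch computes the storage expansion factor N/K and the probability that at least K of the N servers holding a file are reachable. It assumes that each server is available independently with the same probability p; that is a simplifying assumption made for this example, not a claim from the paper. The function names are hypothetical and are not part of the Tahoe code base.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Ratio of stored share data to original file size for K-of-N coding.

    Ignores per-share metadata and encryption overhead.
    """
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    return n / k


def file_availability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n servers are available.

    Assumption (not from the paper): every server is up independently
    with the same probability p, so the count of available servers is
    binomially distributed.
    """
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the binomial probabilities of exactly i servers being up, i = k..n.
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    K, N = 3, 10  # default parameters stated in the text
    print("expansion factor:", expansion_factor(K, N))
    for p in (0.5, 0.9, 0.99):
        print(f"p={p}: availability={file_availability(K, N, p)}")
```

With K = 3 and N = 10, the expansion factor is 10/3, so the stored shares occupy roughly 3.3 times the size of the file before any metadata overhead. Under the independence assumption, the file remains available even when individual servers are only moderately reliable, because any 3 of the 10 shares suffice.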