September 19, 2023

NFS > FUSE: Why We Built our own NFS Server in Rust

Yucheng Low

The Optimistic Case for FUSE

I love files. Everything understands files. Every program knows how to read files and write files. It’s a truly universal API. As such, I love the idea of FUSE. FUSE, or “Filesystem in Userspace”, is a set of Linux interfaces that allow user-mode programs to define a filesystem.

This allows filesystem drivers to be built very easily without needing a kernel module. FUSE is the basis for a large number of filesystem clients, including NTFS and even remote “filesystems” like SFTP or Amazon S3. It can also be used to make strange filesystems that are not actually filesystems, like WikipediaFS, which lets you edit Wikipedia articles using your own text editor.

Here at XetHub, we wanted to build an easy way to access any version of any dataset from your laptop using the tools you have. It is really nice being able to directly browse an image dataset without having to go through S3 commands.

What if you could interact with S3 this way?

The obvious solution to enabling this superpower was FUSE.

FUSE, however, is a frustrating API to build against:

  • there are 2 API classes to choose from — a low-level API and a high-level API

  • there are 2 incompatible API versions (libfuse2 and libfuse3)

  • and lots of other smaller API changes over time (See FUSE_USE_VERSION).

In addition, FUSE is unavailable natively on Mac and Windows and requires the user to install a third-party driver (macFUSE, WinFuse). Each of these drivers may have subtle API incompatibilities.

The Key Question

Because of these issues, I asked myself the following question:

Is it possible to build a userspace filesystem interface that is truly cross-platform?

To answer this question, I had to look back 20 years into computer science history to stumble into NFSv3.

NFS

NFSv3 is a network filesystem protocol, more than 20 years old, that is so simple and so ubiquitous that nearly every operating system has a built-in implementation of it.


The NFSv3 protocol has a beautiful and simple set of design principles:

  1. The server is completely stateless: This simplifies the implementation immensely.

  2. NFS servers are dumb and NFS clients are smart (explicitly stated in RFC 1813, Section 1.6, paragraph 4): This is great because we only need to implement the server, and the very smart clients have already been implemented and hardened for more than 20 years.

  3. Simple cache consistency rules: The server does not define cache policy, so the client can be as smart as it wants. Instead, the server returns file attributes with its replies, and the client uses them to detect when something has changed. This is simpler and more efficient than FUSE, where in practice the daemon has to explicitly implement a lot of caching itself. With NFS we can avoid all of that extra complexity.

  4. The NFS client knows it’s talking over a network: This means that the NFS client and protocol have built-in timeout, retry and failure semantics we can immediately take advantage of. The stateless protocol makes this very easy. With FUSE, the timeout and failure behavior has to be implemented robustly everywhere in the daemon. It is remarkably easy to hang the daemon, and every program reading from the filesystem, if you get stuck in an API call.

  5. Extremely good performance: On the *nixes at least, localhost networking is as fast as pipes. I do not know about Windows, but I would be surprised if it’s not very fast too.
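To make the statelessness point concrete, here is a minimal Rust sketch of a read path in which every request carries everything the server needs (file handle, offset, and count). The names `FileHandle`, `ReadRequest`, `ReadReply`, and `MemoryStore` are hypothetical and exist only for this illustration; this is not the nfsserve API, and real NFSv3 file handles are opaque byte strings rather than integers.

```rust
use std::collections::HashMap;

/// Opaque identifier the client sends with every request.
/// (Hypothetical simplification: real NFSv3 handles are opaque bytes.)
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FileHandle(pub u64);

/// A self-contained read request: nothing depends on an earlier "open" call.
pub struct ReadRequest {
    pub handle: FileHandle,
    pub offset: u64,
    pub count: u32,
}

/// The bytes that were read, plus whether the end of the file was reached.
#[derive(Debug, PartialEq, Eq)]
pub struct ReadReply {
    pub data: Vec<u8>,
    pub eof: bool,
}

/// Toy in-memory store standing in for a real backing filesystem.
pub struct MemoryStore {
    files: HashMap<FileHandle, Vec<u8>>,
}

impl MemoryStore {
    pub fn new() -> Self {
        Self { files: HashMap::new() }
    }

    pub fn insert(&mut self, handle: FileHandle, contents: &[u8]) {
        self.files.insert(handle, contents.to_vec());
    }

    /// Serve a read using only the request and the stored data.
    /// Taking `&self` makes it visible that no per-client state is updated,
    /// so a client can safely resend the same request after a timeout.
    pub fn read(&self, req: &ReadRequest) -> Option<ReadReply> {
        let contents = self.files.get(&req.handle)?;
        let len = contents.len() as u64;
        // Clamp the requested range to the file length.
        let start = req.offset.min(len) as usize;
        let end = req.offset.saturating_add(req.count as u64).min(len) as usize;
        Some(ReadReply {
            data: contents[start..end].to_vec(),
            eof: end == contents.len(),
        })
    }
}
```

Because `read` never mutates the store, sending the same `ReadRequest` twice yields the same `ReadReply`, which is the property that makes client-side retries cheap to reason about.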

The summary is that implementing a user-mode filesystem using localhost NFS instead of FUSE makes it easier to get performance and resiliency. We can take advantage of the existing caching support and 20+ years of robustness and hardening. We just need to implement the server protocol once.
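As a rough illustration of the caching work that lives in the client rather than in our server, the Rust sketch below shows one common approach: the client keeps a copy of the data together with the file attributes it last saw, and refetches only when the attributes change. The `Attrs`, `Server`, `CachingClient`, and `ToyServer` names are hypothetical, and real NFS clients use richer attributes and timeouts than this simplified version assumes.

```rust
/// Simplified subset of file attributes used to detect changes.
/// (Assumption for this sketch: size plus modification time is enough.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Attrs {
    pub size: u64,
    pub mtime_secs: u64,
}

/// Hypothetical single-file server interface: a cheap attribute query and
/// a more expensive full read.
pub trait Server {
    fn getattr(&self) -> Attrs;
    fn read_all(&self) -> Vec<u8>;
}

/// Client-side cache that revalidates by comparing attributes.
pub struct CachingClient {
    cached: Option<(Attrs, Vec<u8>)>,
    /// Number of full reads actually sent to the server.
    pub fetches: u32,
}

impl CachingClient {
    pub fn new() -> Self {
        Self { cached: None, fetches: 0 }
    }

    pub fn read<S: Server>(&mut self, server: &S) -> Vec<u8> {
        let current = server.getattr();
        // Serve from cache when the attributes match what we last saw.
        if let Some((attrs, data)) = &self.cached {
            if *attrs == current {
                return data.clone();
            }
        }
        // Otherwise refetch and remember the new attributes.
        let data = server.read_all();
        self.fetches += 1;
        self.cached = Some((current, data.clone()));
        data
    }
}

/// Toy server holding one file in memory.
pub struct ToyServer {
    pub contents: Vec<u8>,
    pub mtime_secs: u64,
}

impl Server for ToyServer {
    fn getattr(&self) -> Attrs {
        Attrs { size: self.contents.len() as u64, mtime_secs: self.mtime_secs }
    }

    fn read_all(&self) -> Vec<u8> {
        self.contents.clone()
    }
}
```

In this sketch the server side is only `getattr` and `read_all`; all of the cache bookkeeping sits in `CachingClient`, mirroring how an NFS server implementation can stay simple while the operating system's client does the caching.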

So last year, while isolating myself from a COVID-19 infection, I implemented an NFSv3 server in Rust as an experiment. It turned out fantastically.

How we use NFS at XetHub

XetHub has the world’s first natively cross-platform, user-mode filesystem implementation, allowing you to mount arbitrarily large datasets on your machine without needing any kernel driver.

This enables you to, in just a few seconds, locally mount ~660 GB of Llama 2 models or write DuckDB queries to analyze large parquet files and scan just the data you need.

All of this is currently supported on Linux, Mac and Windows Pro (unfortunately, it doesn’t work with Windows Home). Windows has some minor quirks in the experience, but it generally works.

Open Sourcing nfsserve

We are open sourcing nfsserve, our Rust implementation of the NFS server, on GitHub. If you’re a 🦀 Rust-acean, you can add it to your project through cargo with the dependency nfsserve = "0.10".

The documentation could definitely use more love and we’d happily accept PRs for any improvements! You can find some ways to contribute in the readme.

We plan to maintain this library moving forward because it’s an actual dependency of our xet mount implementation within PyXet and xet-core (which we’ve also open sourced).

Here are the initial capabilities we’ve implemented:

  • Reads are pretty performant

  • Writes work but still need a lot of optimizations

I hope others will find this useful and help out. There is a lot of low-hanging fruit for performance improvements and lots of refactoring to be done!
