August 8, 2024

XetHub joins Hugging Face to replace Git LFS and accelerate AI collaboration

Yucheng Low


I am excited to announce that XetHub is joining forces with Hugging Face!

Together we share a vision of democratizing AI to enable everyone to host, share, and build models and datasets. At Hugging Face, we will continue to pursue this vision, integrating our technologies into the Hugging Face Hub to create the future of AI collaboration. I am thrilled to be working with Julien and many others on the Hugging Face team to help shape this future.

Hugging Face’s current Git LFS storage backend makes it easy to publish datasets and models, but difficult for anyone to collaborate on repos: LFS gets slower with every change, because history is bloated with full versions of every changed file. We've seen this snail-like speed at scale in both our 2023 and 2024 benchmarks.

With over 12PB stored in Hugging Face Hub via Git LFS (1.3M models, 450k datasets, 680k spaces), over 6PB served per day, and almost 1B requests per day and growing, it's time for a better solution. We will replace Git LFS as Hugging Face's storage backend. XetHub's content-defined store uses Merkle trees to deduplicate at the block level, which makes it possible to push small changes to huge files without transferring or saving the whole file, resulting in massive time, storage, and bandwidth savings.
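To make the idea concrete, here is a toy sketch of block-level deduplication with content-defined chunk boundaries. It is not XetHub's actual implementation: the rolling hash, the chunk-size parameters, and the names `chunk`, `merkle_root`, and `push` are all assumptions made up for illustration. Because boundaries are chosen from the bytes themselves rather than from fixed offsets, a small insertion only changes the chunks around the edit, and everything after it lines up with chunks the store already has.

```python
import hashlib
import random

# Illustrative parameters (assumptions, not XetHub's real settings).
MASK = (1 << 11) - 1   # cut when the low 11 bits of the hash are zero (~2 KiB apart)
MIN_CHUNK = 256        # never cut a chunk shorter than this


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a toy rolling hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # 32-bit shift-and-add: older bytes shift out, so the boundary test
        # depends only on the most recent bytes, not on the file offset.
        h = ((h << 1) + byte) & 0xFFFFFFFF
        if i - start + 1 >= MIN_CHUNK and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def merkle_root(digests: list[bytes]) -> bytes:
    """Hash chunk digests pairwise until a single root digest remains."""
    level = digests or [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        level = [hashlib.sha256(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def push(data: bytes, store: dict[bytes, bytes]) -> tuple[bytes, int]:
    """Store only the chunks the store lacks; return (file root, bytes uploaded)."""
    digests, uploaded = [], 0
    for c in chunk(data):
        d = hashlib.sha256(c).digest()
        if d not in store:          # chunk already known: nothing to transfer
            store[d] = c
            uploaded += len(c)
        digests.append(d)
    return merkle_root(digests), uploaded


# A 200 kB "file", then the same file with ten bytes inserted in the middle.
rng = random.Random(0)
v1 = bytes(rng.getrandbits(8) for _ in range(200_000))
v2 = v1[:100_000] + b"small edit" + v1[100_000:]

store: dict[bytes, bytes] = {}
root1, sent1 = push(v1, store)
root2, sent2 = push(v2, store)
print(f"first push: {sent1} bytes, second push: {sent2} bytes")
```

In this sketch the first push uploads the whole file, while the second uploads only the chunks touched by the edit, even though every byte after the insertion has moved. With Git LFS, by contrast, the second version would be stored and transferred in full.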

And that's only the beginning: we are looking forward to bringing custom views, visual differences, and efficient access patterns to Hugging Face as well.

We can't wait to speed things up for the largest AI developer community in the world!
Read about more use cases in the official Hugging Face announcement and follow our team's work on Hugging Face.
