Migrating the Hub from Git LFS to Xet

TBB Desk

13 hours ago · 8 min read

Diagram illustrating the migration process from Git LFS to Xet storage, highlighting key steps and benefits.
A visual guide to understanding the transition from Git Large File Storage (LFS) to Xet, emphasizing improved performance and cost-effectiveness. (Illustrative AI-generated image).

Key Takeaways

The main points at a glance

  • Hugging Face is replacing Git LFS with Xet storage on its Hub to improve AI model distribution.
  • Xet offers faster downloads and uploads by serving only necessary data chunks and reusing existing ones.
  • The new system significantly reduces storage costs through intelligent deduplication, storing identical file chunks only once.
  • Xet’s content-addressable nature enhances caching and verification speed.
  • The migration is largely transparent to users, maintaining existing workflows and commands.
  • This shift aims to lower barriers to entry for AI researchers and developers by making model access faster and cheaper.

Hugging Face Switches from Git LFS to Xet Storage for AI Models

Downloading large AI models from Hugging Face’s Hub can be a slow and frustrating experience. Files may take a long time to transfer, processes can stall, and users might have to restart downloads. This is about to change as Hugging Face transitions its Hub storage from Git LFS to a new system called Xet. This technical shift offers significant benefits, including faster downloads, reduced storage usage, and lower costs for AI model distribution.

The Challenges of Using Git LFS for Large AI Files

Git is excellent for tracking code changes, but it struggles with the massive binary files characteristic of AI models. Git LFS (Large File Storage) was developed as an add-on to handle these large files by storing them separately with pointers in the main repository. However, this approach has limitations.
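To make the pointer idea concrete, here is a minimal Python sketch of what a Git LFS pointer file contains: a spec version line, the SHA-256 of the real file, and its size in bytes. The helper names `make_pointer` and `parse_pointer` are illustrative and not part of any Git LFS tooling.

```python
import hashlib

LFS_SPEC = "https://git-lfs.github.com/spec/v1"

def make_pointer(content: bytes) -> str:
    """Build the small text file Git tracks in place of the large file."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC}\noid sha256:{oid}\nsize {len(content)}\n"

def parse_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer into a dictionary."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].split(":", 1)[1],  # drop the "sha256:" prefix
        "size": int(fields["size"]),
    }

weights = b"\x00" * 1024  # stand-in for a large model file
pointer = make_pointer(weights)
info = parse_pointer(pointer)
print(pointer)
```

Because the pointer identifies the whole file with a single hash, changing even one byte of the file produces a new `oid` and a completely new stored object.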

Speed Issues with Git LFS Downloads

When cloning a repository that uses Git LFS, users download both the code and the large model files. This can be time-consuming, because Git LFS treats every file as a single unit: if any part of a multi-gigabyte file changes, the whole file has to be transferred again. It’s like having to pull out an entire huge drawer to get a single document.

High Costs and Wasted Storage

Git LFS itself is free software, but hosting LFS content incurs storage and data transfer costs. For a platform like Hugging Face, hosting numerous models and datasets, these costs escalate quickly. Furthermore, Git LFS deduplicates only whole, byte-identical files. If similar models are uploaded, each is stored in full as a separate file, leading to significant wasted space.

Frustrating User Experience

For researchers and developers who frequently download multiple models for testing, the slow speeds and potential connection drops with Git LFS create a frustrating user experience. The Hugging Face team recognized the need for a more scalable solution.

Introducing Xet Storage: A Smarter Approach

Xet is a storage system that Hugging Face gained through its 2024 acquisition of XetHub and has since adapted to the Hub’s needs. It utilizes content-addressable storage, meaning each piece of data is identified by a unique digital fingerprint based on its content.
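A toy version of content-addressable storage can be sketched in a few lines of Python. The `ContentStore` class below is hypothetical, and the choice of SHA-256 as the fingerprint is an assumption made for illustration; the article does not say which hash Xet uses.

```python
import hashlib

class ContentStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the bytes themselves, not from a file name.
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)

store = ContentStore()
key_a = store.put(b"layer-0 weights")
key_b = store.put(b"layer-0 weights")  # same bytes uploaded a second time
key_c = store.put(b"layer-1 weights")
print(key_a == key_b, len(store))
```

Identical bytes always map to the same key, so the second upload adds nothing to the store.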

How Xet’s Deduplication Works

Unlike Git LFS, Xet breaks files into smaller chunks. These chunks are stored based on their fingerprints. If a chunk already exists in the system, Xet only stores a reference to it, avoiding redundant copies. This deduplication process significantly reduces storage requirements. For example, if two books share the same first chapter, Xet stores that chapter only once.
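The two-books example can be sketched as follows. This is a simplified illustration, not Xet's implementation: it assumes fixed-size chunks of four bytes and SHA-256 fingerprints, and the names `store_file` and `load_file` are made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow

def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    return [data[i:i + size] for i in range(0, len(data), size)]

def store_file(data: bytes, chunk_store: dict) -> list:
    """Store unseen chunks and return the file's manifest (ordered chunk hashes)."""
    manifest = []
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # existing chunks are only referenced
        manifest.append(digest)
    return manifest

def load_file(manifest: list, chunk_store: dict) -> bytes:
    """Rebuild a file by concatenating its chunks in manifest order."""
    return b"".join(chunk_store[d] for d in manifest)

chunk_store = {}
book_one = b"CHAPTER1" + b"AAAABBBB"  # shared first chapter, then unique text
book_two = b"CHAPTER1" + b"CCCCDDDD"
manifest_one = store_file(book_one, chunk_store)
manifest_two = store_file(book_two, chunk_store)
print(len(manifest_one) + len(manifest_two), "chunk references,",
      len(chunk_store), "chunks stored")
```

Each file is kept as a list of references, so the shared first chapter occupies space only once while both books can still be rebuilt exactly.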

Content Addressability and Efficiency

Because Xet stores data by its content hash, it enables fast caching and verification. The system is designed to be efficient, akin to a smart library that stores each paragraph only once and tracks which books use it.
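Here is a rough sketch of why content hashes help with caching and verification. The `VerifyingCache` class is hypothetical and SHA-256 is an assumed fingerprint: a chunk fetched by hash can be checked by rehashing it, and a cached chunk never goes stale because the bytes behind a given hash cannot change.

```python
import hashlib

class VerifyingCache:
    """Fetches chunks by hash, verifies them, and keeps them locally (toy example)."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: digest -> bytes
        self._cache = {}
        self.remote_calls = 0

    def get(self, digest: str) -> bytes:
        if digest in self._cache:
            return self._cache[digest]  # safe to reuse: content under a hash is fixed
        self.remote_calls += 1
        data = self._fetch_remote(digest)
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"chunk {digest[:8]} failed verification")
        self._cache[digest] = data
        return data

payload = b"attention weights"
digest = hashlib.sha256(payload).hexdigest()
remote = {digest: payload}  # stand-in for remote storage

cache = VerifyingCache(remote.__getitem__)
first = cache.get(digest)
second = cache.get(digest)  # served from the local cache

# Simulate a corrupted transfer and confirm it is rejected.
remote[digest] = b"corrupted bytes"
fresh = VerifyingCache(remote.__getitem__)
try:
    fresh.get(digest)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```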

Intelligent Compression and Versioning

Xet also handles compression and versioning at the chunk level. It can store multiple versions of a model efficiently by storing only the chunks that changed between them, optimizing space usage for evolving AI models.
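The article does not describe how Xet decides where one chunk ends and the next begins. One well-known technique for making versions share chunks is content-defined chunking, where boundaries depend on the bytes themselves rather than on fixed offsets. The Python sketch below uses that technique as an assumption; the window size, the divisor and the CRC32 fingerprint are arbitrary choices, and a production system would use a true rolling hash with minimum and maximum chunk sizes.

```python
import random
import zlib

def fixed_chunks(data: bytes, size: int = 64) -> list:
    return [data[i:i + size] for i in range(0, len(data), size)]

def cdc_chunks(data: bytes, window: int = 16, divisor: int = 64) -> list:
    """Cut a chunk wherever the fingerprint of the last `window` bytes hits a target."""
    chunks, start = [], 0
    for i in range(len(data)):
        fingerprint = zlib.crc32(data[max(0, i - window + 1): i + 1])
        if fingerprint % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks

def shared(old: list, new: list) -> int:
    """Count chunks of the old version that the new version can reuse."""
    new_set = set(new)
    return sum(1 for chunk in old if chunk in new_set)

rng = random.Random(0)
v1 = bytes(rng.getrandbits(8) for _ in range(20000))  # stand-in for model weights
v2 = v1[:10] + b"NEW" + v1[10:]                       # small insertion near the start

shared_fixed = shared(fixed_chunks(v1), fixed_chunks(v2))
shared_cdc = shared(cdc_chunks(v1), cdc_chunks(v2))
print("reusable chunks with fixed-size chunking:", shared_fixed)
print("reusable chunks with content-defined chunking:", shared_cdc)
```

With fixed offsets, the three inserted bytes shift every later chunk, so almost nothing can be reused. With content-defined boundaries, the chunking resynchronizes shortly after the edit and only the chunks near the change need to be stored again.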

Benefits of Xet Storage: Speed and Cost Savings

Xet’s design translates directly into tangible benefits for users and Hugging Face.

Faster Downloads and Uploads

Xet can serve only the specific chunks of data a user needs, and it reuses existing chunks from previous downloads. This dramatically speeds up download times, with users reporting 2 to 5 times faster speeds in some cases. Uploads are also quicker, as only new chunks need to be transferred.
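As a rough illustration of serving only the missing chunks, the Python sketch below compares a file's chunk list with what is already in a local cache and requests only the difference. The chunk contents, the `plan_download` helper and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib

def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

def plan_download(manifest: list, local_cache: dict) -> list:
    """Return the chunk hashes that must be fetched, in order and without repeats."""
    missing, seen = [], set()
    for d in manifest:
        if d not in local_cache and d not in seen:
            missing.append(d)
            seen.add(d)
    return missing

base_chunks = [b"embed", b"block-1", b"block-2", b"head"]
finetuned_chunks = [b"embed", b"block-1", b"block-2", b"head-finetuned"]

local_cache = {digest(c): c for c in base_chunks}  # base model already downloaded
manifest = [digest(c) for c in finetuned_chunks]   # chunk list of the fine-tuned model

to_fetch = plan_download(manifest, local_cache)
bytes_total = sum(len(c) for c in finetuned_chunks)
bytes_needed = sum(len(c) for c in finetuned_chunks if digest(c) in to_fetch)
print(f"fetch {len(to_fetch)} of {len(manifest)} chunks, "
      f"{bytes_needed} of {bytes_total} bytes")
```

The same comparison works in the other direction for uploads: the client sends only the chunks the server does not already hold.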

Reduced Infrastructure Costs

Deduplication and efficient data transfer lead to significant cost savings on storage and bandwidth for Hugging Face. The system’s efficiency also means it can handle high loads with fewer servers, further reducing operational costs.

Efficient Model Versioning

Storing new versions of models is much more efficient with Xet. Only changed chunks are stored, making it cost-effective for researchers who manage multiple model iterations.

Improved Repository Cloning

Cloning repositories becomes much faster with Xet, as many chunks can be cached or streamed efficiently, reducing the time spent waiting for large files to download.

The Migration Process to Xet Storage

Hugging Face is gradually migrating its Hub storage from Git LFS to Xet. This process is largely happening in the background, aiming for a seamless transition for users.

User Experience During Migration

For most users, the migration is transparent. The user interface and APIs remain unchanged, and existing commands continue to work. The primary change is the backend storage system. Users do not need to learn new commands or alter their scripts.

Potential Considerations

Users with custom Git LFS hooks or scripts might need to update them. There could be a brief warm-up period where some operations are slightly slower as the Xet cache populates. Compatibility with certain Git LFS-specific commands may also differ, though Hugging Face is working to maintain high compatibility.

Automated and Monitored Transition

The migration is automated, with models moved in batches. Hugging Face monitors the process closely to prevent data loss and ensures a smooth transition. Users do not need to take any action for their existing models.
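The article gives no detail on how the batches are processed, so the following Python sketch is purely hypothetical. It shows one common way to guard against data loss in a migration of this kind: chunk each file, rebuild it from the new store, and mark it as migrated only if the rebuilt bytes hash to the same value as the original.

```python
import hashlib

def to_chunks(data: bytes, size: int = 8) -> list:
    return [data[i:i + size] for i in range(0, len(data), size)]

def migrate_batch(batch: dict, chunk_store: dict):
    """Chunk each file, then confirm the rebuilt bytes match before marking it done."""
    migrated, failed = {}, []
    for name, data in batch.items():
        expected = hashlib.sha256(data).hexdigest()
        manifest = []
        for chunk in to_chunks(data):
            d = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(d, chunk)
            manifest.append(d)
        rebuilt = b"".join(chunk_store[d] for d in manifest)
        if hashlib.sha256(rebuilt).hexdigest() == expected:
            migrated[name] = manifest
        else:
            failed.append(name)  # keep serving this file from the old backend
    return migrated, failed

def batches(files: dict, batch_size: int):
    """Yield the files in name order, `batch_size` at a time."""
    names = sorted(files)
    for i in range(0, len(names), batch_size):
        yield {n: files[n] for n in names[i:i + batch_size]}

legacy_files = {"a.bin": b"x" * 20, "b.bin": b"y" * 20, "c.bin": b"x" * 20}
chunk_store, all_migrated, all_failed = {}, {}, []
for batch in batches(legacy_files, batch_size=2):
    done, failed = migrate_batch(batch, chunk_store)
    all_migrated.update(done)
    all_failed.extend(failed)
print(len(all_migrated), "files migrated,", len(chunk_store), "chunks stored")
```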

Impact of Xet Storage on the AI Community

The shift to Xet storage has significant implications for the broader AI community.

Lowering Barriers to Entry

Faster downloads enable researchers to experiment with more models in less time, accelerating the pace of innovation. This allows them to focus more on analysis and less on waiting for files.

Potential Cost Reductions

Hugging Face’s infrastructure savings from Xet could lead to more accessible free tiers and lower prices for premium services, making AI resources more affordable.

Setting a New Industry Standard

As a leading AI platform, Hugging Face’s adoption of Xet could influence other platforms to move towards similar content-addressable storage solutions, fostering a more efficient AI ecosystem.

Addressing Potential Risks

While Xet offers many benefits, potential risks like vendor lock-in and system complexity are being addressed. Hugging Face emphasizes open standards and has invested in robust monitoring and testing to ensure data integrity and system reliability.

The Future of AI Data Storage

The evolution of AI models and datasets necessitates advancements in storage solutions. Xet represents a significant step forward, but innovation in this area is ongoing.

Content Addressable Storage for All AI Data

The principles of Xet, such as deduplication, could be applied to other large AI artifacts like datasets and embeddings, leading to substantial space savings.

Advancements in Streaming and Integration

Future storage systems may offer more advanced streaming capabilities, allowing users to access only the necessary parts of models or datasets. Tighter integration between storage systems and AI frameworks could also speed up training pipelines.

Continued Development and Open Source

Hugging Face plans to continue developing Xet, adding new features and improving performance. They are also open-sourcing parts of the system, encouraging community contributions and the development of related tools.

A More Efficient AI Ecosystem

The migration to Xet is a move towards a faster, cheaper, and more efficient AI ecosystem. As storage technologies evolve, they will better support the growing demands of modern artificial intelligence.

Frequently Asked Questions

Why is Hugging Face moving from Git LFS to Xet storage?

Hugging Face is moving from Git LFS to Xet storage to address the slow download speeds and high costs associated with handling large AI model files. Xet offers a more efficient and scalable solution for storing and distributing these large files.

What are the main benefits of Xet storage for users?

Xet storage provides significantly faster download and upload speeds for AI models. It also reduces storage space usage through deduplication and can lead to cost savings, potentially making AI resources more accessible.

How does Xet storage work differently from Git LFS?

Xet breaks files into smaller chunks and stores them based on their content fingerprint, reusing existing chunks to save space and time. Git LFS stores entire files separately, which is less efficient for large, often similar, AI models.

Will users need to learn new commands for Xet storage?

No, the migration to Xet is designed to be seamless for users. Existing commands and workflows for interacting with the Hugging Face Hub will continue to work as before, with the changes happening in the backend.

Are there any potential downsides to Xet storage?

Potential concerns include the complexity of content-addressable storage and the risk of vendor lock-in. However, Hugging Face is committed to open standards and has implemented robust systems to ensure data integrity and reliability.

How will Xet storage impact the cost of using Hugging Face?

By reducing Hugging Face's infrastructure costs for storage and bandwidth, the savings are expected to trickle down to users. This could result in more generous free tiers or lower prices for premium services.

References

  • Migrating the Hub from Git LFS to Xet – Original report (Hugging Face)
  • Is Hugging Face's Xet Storage the Future of AI Repositories? – Analytics India Magazine – This article provides industry perspective on the significance of the migration, though full text was inaccessible.
