Solve "Cannot find an appropriate cached snapshot folder" Error: A Comprehensive Guide to Hugging Face Downloads

As an AI developer, the last thing you want to see when you eagerly type from_pretrained is a bright red error message. The following error is one that many Hugging Face users have likely encountered:

err[msg]=Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.

Don't worry, this is usually not a major problem. However, it's often the beginning of a "chain reaction" of issues, especially for developers in mainland China. This article will guide you through understanding and resolving this error from its root cause and provide best practices for the network environment in China, allowing you and your team to say goodbye to model download troubles forever.

Step 1: Understanding and Solving the local_files_only Error

The error message is actually quite straightforward. We can break it down into two parts to understand it:

  1. "Cannot find an appropriate cached snapshot folder... on the local disk": The program cannot find the required model files in your computer's local cache directory.
  2. "and outgoing traffic has been disabled": At the same time, offline mode is switched on (typically through local_files_only=True), so the program is not allowed to connect to the Hugging Face Hub to download these files.

In simple terms, it's like saying, "I don't have what I need locally, but you won't let me go online to get it, so I can only throw an error."
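To see the first half of the message for yourself, you can look at the cache directly. The following is a minimal sketch using only the standard library. It assumes the usual Hub cache layout (models--<org>--<name>/snapshots/<commit_hash>/ under ~/.cache/huggingface/hub, or under the locations given by HF_HUB_CACHE or HF_HOME). The function names default_hub_cache and cached_snapshots are made up for this illustration and are not part of any library.

```python
import os
from pathlib import Path


def default_hub_cache() -> Path:
    """Guess the Hub cache directory (simplified: ignores XDG_CACHE_HOME)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"]).expanduser()
    hf_home = os.environ.get("HF_HOME", os.path.join("~", ".cache", "huggingface"))
    return Path(hf_home).expanduser() / "hub"


def cached_snapshots(repo_id: str, cache_dir: Path) -> list[str]:
    """List the snapshot folder names (commit hashes) cached for a model repo."""
    # Assumed layout: <cache_dir>/models--<org>--<name>/snapshots/<commit_hash>/
    repo_folder = cache_dir / ("models--" + repo_id.replace("/", "--"))
    snapshots_dir = repo_folder / "snapshots"
    if not snapshots_dir.is_dir():
        return []
    return sorted(p.name for p in snapshots_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    found = cached_snapshots("google-bert/bert-base-cased", default_hub_cache())
    print(found or "No cached snapshot found; an offline load would fail.")
```

An empty result means there is no snapshot folder for that model, which is exactly the situation the error describes.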

How to Solve It?

The error message itself provides the answer: Allow the program to connect to the internet to download.

You only need to set local_files_only=False explicitly in the call that loads the model or tokenizer.

Incorrect Code Example:

python
from transformers import AutoModel

# This setting will disable network downloads, and an error will occur if there is no local cache
model = AutoModel.from_pretrained("google-bert/bert-base-cased", local_files_only=True)

Corrected Code:

python
from transformers import AutoModel

# Set to False to allow the program to automatically download from the internet if no local cache is found
model = AutoModel.from_pretrained("google-bert/bert-base-cased", local_files_only=False)
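If you want to keep offline loading as the fast path and only go online on a cache miss, a small wrapper can try both in order. This is a sketch, not an official pattern: load_cache_first is a hypothetical helper, and it assumes that a cache miss surfaces as OSError or one of its subclasses such as FileNotFoundError, which may vary between library versions.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_cache_first(load: Callable[..., T], model_id: str) -> T:
    """Try the local cache first, then fall back to a load that may download."""
    try:
        return load(model_id, local_files_only=True)
    except OSError:
        # Assumption: a missing cache entry is reported as OSError (or a subclass).
        return load(model_id, local_files_only=False)


# Intended usage (requires transformers to be installed):
# from transformers import AutoModel
# model = load_cache_first(AutoModel.from_pretrained, "google-bert/bert-base-cased")
```

The wrapper takes the loader as an argument, so the same function works for a model class or a tokenizer class.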

Note: The default value of local_files_only is False. If you encounter this error, something has switched on offline mode: either your code passes local_files_only=True, or your environment sets an offline variable such as HF_HUB_OFFLINE=1 or TRANSFORMERS_OFFLINE=1.
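When the code looks fine but the error persists, the environment is the next suspect. The sketch below checks two offline switches, HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE. The helper name offline_flags is hypothetical, and the set of values treated as "true" is an assumption about how the libraries parse these variables, so treat the result as a hint rather than a verdict.

```python
import os

# Assumption: these values (case-insensitive) are read as "true" by the libraries.
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}
OFFLINE_VARS = ("HF_HUB_OFFLINE", "TRANSFORMERS_OFFLINE")


def offline_flags(env=os.environ) -> list[str]:
    """Return the names of offline-mode variables that are switched on."""
    return [name for name in OFFLINE_VARS if env.get(name, "").upper() in TRUE_VALUES]


if __name__ == "__main__":
    active = offline_flags()
    if active:
        print("Offline mode is forced by:", ", ".join(active))
    else:
        print("No offline environment variable is set.")
```

If the script reports an active variable, unset it in your shell (or in the job configuration that launches your program) before retrying.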

After solving this problem, you might breathe a sigh of relief and rerun the code. But soon, a new problem may arise – the program gets stuck at Downloading... and eventually ends with a Connection Timeout.

Step 2: Solving Network Challenges in China – Timeouts and Connection Failures

This happens because Hugging Face's servers are hosted outside mainland China, and connections to them from inside mainland China are often very slow or fail outright.

Fortunately, we have very mature and simple solutions. We strongly recommend using a domestic mirror source.

Best Solution: Use a Domestic Mirror (hf-mirror.com)

The community-driven mirror hf-mirror.com is a widely used Hugging Face mirror that is generally stable and reliable. You don't need to modify any Python code; just set an environment variable.

The huggingface_hub library will automatically recognize this environment variable named HF_ENDPOINT and direct all download requests to this domestic mirror address, achieving high-speed and stable downloads.
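To make the redirection concrete, here is a rough sketch of how a download URL changes with HF_ENDPOINT. It mirrors the <endpoint>/<repo_id>/resolve/<revision>/<filename> URL pattern used for Hub files. The function file_url is written for this article and is not the library's own implementation.

```python
import os
from urllib.parse import quote

DEFAULT_ENDPOINT = "https://huggingface.co"


def file_url(repo_id: str, filename: str, revision: str = "main", env=os.environ) -> str:
    """Build the download URL for one file, honoring HF_ENDPOINT if it is set."""
    endpoint = env.get("HF_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")
    return f"{endpoint}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"


if __name__ == "__main__":
    # Prints a huggingface.co URL by default, or a mirror URL when HF_ENDPOINT is set.
    print(file_url("google-bert/bert-base-cased", "config.json"))
```

Only the host part changes, which is why the same Python code works unmodified against the mirror.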

How to Set:

Before running your Python program, execute the following command in your terminal:

  • Linux / macOS Users:

    bash
    export HF_ENDPOINT=https://hf-mirror.com
  • Windows (CMD) Users:

    cmd
    set HF_ENDPOINT=https://hf-mirror.com
  • Windows (PowerShell) Users:

    powershell
    $env:HF_ENDPOINT = "https://hf-mirror.com"
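  • Set in Python Code (before importing transformers or huggingface_hub):

    python
    import os
    # Must run before transformers or huggingface_hub is imported,
    # because the endpoint is read once when the library is imported.
    os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'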

After setting it up, you can run your previous Python script as usual, and downloads should be much faster and more reliable.
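One pitfall with the in-code variant is import order: the endpoint is read when huggingface_hub is first imported, so setting the variable afterwards has no effect in that process. The sketch below guards against this. set_hf_endpoint is a hypothetical helper name, and the check is only a heuristic based on sys.modules.

```python
import os
import sys


def set_hf_endpoint(url: str = "https://hf-mirror.com") -> bool:
    """Set HF_ENDPOINT; return False if huggingface_hub was already imported."""
    already_imported = "huggingface_hub" in sys.modules
    os.environ["HF_ENDPOINT"] = url
    return not already_imported


if __name__ == "__main__":
    if not set_hf_endpoint():
        print("Warning: huggingface_hub was imported first; restart and set HF_ENDPOINT earlier.")
    # Import transformers only after the variable is set, for example:
    # from transformers import AutoModel
```

Calling the helper at the very top of your entry script, before any other imports, is the simplest way to stay on the safe side.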

Alternative Solution: Setting Up a Network Proxy

If you already have a stable and reliable proxy tool, you can also solve the problem by setting proxy environment variables. Assuming your proxy is at 127.0.0.1 with port 7890:

  • Linux / macOS:
    bash
    export HTTP_PROXY=http://127.0.0.1:7890
    export HTTPS_PROXY=http://127.0.0.1:7890
  • Windows (CMD):
    cmd
    set HTTP_PROXY=http://127.0.0.1:7890
    set HTTPS_PROXY=http://127.0.0.1:7890

This method works, but if your only goal is to fix Hugging Face downloads, the mirror (HF_ENDPOINT) is usually the simpler and more stable choice, because it affects only Hugging Face traffic and does not depend on a separate proxy tool.
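If you do go the proxy route, it helps to confirm that Python actually sees the variables you exported. The following sketch uses urllib.request.getproxies() from the standard library, which reads HTTP_PROXY and HTTPS_PROXY (in either case) from the environment. The helper name visible_proxies is made up here, and whether a given HTTP client honors these settings is an assumption you should verify for your library versions.

```python
import urllib.request


def visible_proxies() -> dict[str, str]:
    """Return the HTTP/HTTPS proxy settings that Python picks up from the environment."""
    proxies = urllib.request.getproxies()
    return {scheme: proxies[scheme] for scheme in ("http", "https") if scheme in proxies}


if __name__ == "__main__":
    found = visible_proxies()
    if found:
        for scheme, address in found.items():
            print(f"{scheme} traffic goes through {address}")
    else:
        print("No proxy is configured in this environment.")
```

Run it in the same terminal session where you exported the variables; a new terminal window will not inherit them.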

One-Stop Solution

The next time you encounter the Cannot find an appropriate cached snapshot folder error or any Hugging Face download-related network issues, follow these two steps to solve them once and for all:

  1. Check the Code: Ensure that you haven't mistakenly set local_files_only=True in the from_pretrained call, and that no offline variable such as HF_HUB_OFFLINE=1 is set in your environment. If you need to connect to the internet, remove the parameter or set it to local_files_only=False.

  2. Configure the Environment (Mandatory for Users in China): Set the environment variable HF_ENDPOINT in the terminal or system.

    bash
    export HF_ENDPOINT=https://hf-mirror.com