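A mirror site in China for downloading Hugging Face models

Today I would like to recommend a mirror site for Hugging Face: https://hf-mirror.com

Search for the model on the mirror site, then download the files from the "Files and versions" tab on the model page.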

huggingface-cli is the official command-line tool provided by Hugging Face, and it comes with full-featured download support.

1. Install dependencies

pip install -U huggingface_hub

2. Set the environment variable
Linux

export HF_ENDPOINT=https://hf-mirror.com

Windows PowerShell

$env:HF_ENDPOINT = "https://hf-mirror.com"

On Linux, it is a good idea to add the export line above to ~/.bashrc so that the setting persists across sessions.
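
If the setup is repeated, the same line can end up in ~/.bashrc several times. The following is a minimal sketch that appends the export only when it is not already present. The function name persist_hf_endpoint and its optional file argument are illustrative and not part of any Hugging Face tool.

```shell
# Append the HF_ENDPOINT export to a shell startup file, but only once.
# persist_hf_endpoint is a hypothetical helper name; the target file
# defaults to ~/.bashrc and can be overridden with the first argument.
# Usage: persist_hf_endpoint [startup_file]
persist_hf_endpoint() {
    rc_file="${1:-$HOME/.bashrc}"
    line='export HF_ENDPOINT=https://hf-mirror.com'
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

After calling persist_hf_endpoint, open a new shell or run source ~/.bashrc for the change to take effect.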

3.1 Download a model

huggingface-cli download --resume-download gpt2 --local-dir gpt2

3.2 Download a dataset

huggingface-cli download --repo-type dataset --resume-download wikitext --local-dir wikitext

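You can add the --local-dir-use-symlinks False option to disable file symlinks, so that the download directory contains the actual files rather than links. For a detailed explanation, see the tutorial mentioned above.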
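
One way to confirm what actually ended up in the download directory is to count the symbolic links inside it. This is a small sketch using the standard find command. The function name count_symlinks is illustrative, and gpt2 is simply the directory used in the commands above.

```shell
# Print the number of symbolic links under a directory (default: gpt2).
# A result of 0 means the directory holds only regular files and folders.
# count_symlinks is a hypothetical helper name.
# Usage: count_symlinks [directory]
count_symlinks() {
    # tr strips the padding that some wc implementations add
    find "${1:-gpt2}" -type l | wc -l | tr -d ' '
}
```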

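hfd is a download tool built specifically for Hugging Face by the mirror site. It is based on the mature tools git and aria2, and is designed for stable downloads that do not drop the connection.

1. Download hfd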

wget https://hf-mirror.com/hfd/hfd.sh
chmod a+x hfd.sh
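
Since hfd is described as being built on git and aria2, it may help to confirm that both are on PATH before running the script. The sketch below assumes the command names git and aria2c. The helper name missing_tools is illustrative and not part of hfd.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper; git and aria2c are the tools
# that the description of hfd mentions.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Prints nothing when both tools are installed.
missing_tools git aria2c
```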

2. Set the environment variable
Linux

export HF_ENDPOINT=https://hf-mirror.com

Windows PowerShell

$env:HF_ENDPOINT = "https://hf-mirror.com"

3.1 Download a model

./hfd.sh gpt2 --tool aria2c -x 4

3.2 Download a dataset

./hfd.sh wikitext --dataset --tool aria2c -x 4

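This approach is non-invasive and covers most cases. The Hugging Face toolchain reads the HF_ENDPOINT environment variable to determine the URL used for downloading files, so setting that variable is enough to redirect downloads to the mirror without changing any code.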

HF_ENDPOINT=https://hf-mirror.com python your_script.py
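
If the same prefix is needed for many commands, it can be wrapped in a small shell function. The sketch below is one possible way to do this. The name with_hf_mirror is illustrative, and the function keeps any HF_ENDPOINT value that the caller has already set.

```shell
# Run a command with HF_ENDPOINT pointing at the mirror, unless the
# caller has already set HF_ENDPOINT to something else.
# with_hf_mirror is a hypothetical helper name.
# Usage: with_hf_mirror python your_script.py
with_hf_mirror() {
    # env sets the variable for the child process only, so the
    # current shell session is left unchanged.
    env HF_ENDPOINT="${HF_ENDPOINT:-https://hf-mirror.com}" "$@"
}
```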

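However, some datasets ship with their own download scripts. In that case the environment variable is not enough, and you need to edit the URLs inside the script by hand.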
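
The manual edit can be sketched as a search and replace, assuming the script hard-codes URLs that start with https://huggingface.co. Scripts that fetch from other hosts need different changes. The helper name rewrite_hf_urls is illustrative, and it keeps a .bak copy of the original file.

```shell
# Replace hard-coded https://huggingface.co URLs in a script with the mirror.
# rewrite_hf_urls is a hypothetical helper; the original file is kept
# next to the edited one with a .bak suffix.
# Usage: rewrite_hf_urls path/to/script.py [mirror_url]
rewrite_hf_urls() {
    script="$1"
    mirror="${2:-https://hf-mirror.com}"
    cp "$script" "$script.bak" || return 1
    # '#' is used as the sed delimiter because the URLs contain '/'.
    sed "s#https://huggingface\.co#${mirror}#g" "$script.bak" > "$script"
}
```

It is worth comparing the edited script against its .bak copy before running it, since only the URL prefix is changed.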
