Building nanoGPT – 杰力皓博

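This repo holds the from-scratch reproduction of nanoGPT. The git commits were deliberately kept clean and step by step, so that one can easily walk through the git commit history and watch it being built slowly. Additionally, there is an accompanying video lecture on YouTube where you can see me introduce each commit and explain the pieces along the way.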

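We basically start from an empty file and work our way to a reproduction of the GPT-2 (124M) model. If you have more patience or money, the code can also reproduce the GPT-3 models. While the GPT-2 (124M) model probably trained for quite some time back in the day (2019, ~5 years ago), today reproducing it is a matter of ~1 hour and ~$10. If you don't have enough GPUs of your own, you'll need a cloud GPU box; for that I recommend Lambda.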

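Note that GPT-2 and GPT-3 are both simple language models, trained on internet documents, and all they do is "dream" internet documents. So this repo/video does not cover chat finetuning, and you can't talk to the model like you can talk to ChatGPT. The finetuning process (while quite simple conceptually: SFT is just about swapping out the dataset and continuing the training) comes after this part and will be covered at a later time. For now, this is the kind of stuff the 124M model says if you prompt it with "Hello, I'm a language model," after 10B tokens of training: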

Hello, I'm a language model, and my goal is to make English as easy and fun as possible for everyone, and to find out the different grammar rules
Hello, I'm a language model, so the next time I go, I'll just say, I like this stuff.
Hello, I'm a language model, and the question is, what should I do if I want to be a teacher?
Hello, I'm a language model, and I'm an English person. In languages, "speak" is really speaking. Because for most people, there's
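
Samples like these are plain continuations of the prompt: the model repeatedly predicts a distribution over the next token, draws one token from it, appends it, and repeats. The sketch below illustrates only that loop. The `TOY_BIGRAMS` table is a hypothetical stand-in for the network (a real GPT conditions on the whole context, not just the last token), and the whitespace "tokens" are an assumption made for readability; none of it is taken from the repo's code.

```python
import random

# Hypothetical toy "model": maps the last token to next-token probabilities.
# A real GPT computes this distribution with a neural network over the
# entire context; this table exists only to make the loop runnable.
TOY_BIGRAMS = {
    "Hello,": {"I'm": 1.0},
    "I'm": {"a": 0.7, "not": 0.3},
    "a": {"language": 0.6, "model,": 0.4},
    "language": {"model,": 1.0},
    "model,": {"and": 0.5, "so": 0.5},
    "and": {"I'm": 1.0},
    "so": {"I'm": 1.0},
    "not": {"a": 1.0},
}


def generate(prompt_tokens, max_new_tokens, rng):
    """Autoregressively extend the prompt, one sampled token at a time."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        dist = TOY_BIGRAMS.get(tokens[-1])
        if dist is None:  # no known continuation: stop early
            break
        choices, weights = zip(*dist.items())
        # Draw the next token in proportion to its probability.
        tokens.append(rng.choices(choices, weights=weights, k=1)[0])
    return tokens


prompt = ["Hello,", "I'm", "a", "language", "model,"]
sample = generate(prompt, max_new_tokens=8, rng=random.Random(0))
print(" ".join(sample))
```

Nothing in the loop knows about questions or answers; it only extends the text it was given, which is why a base model like this continues a document rather than holding a conversation.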

After 40B tokens of training:

Hello, I'm a language model, a model of computer science, and it's a way (in mathematics) to program computer programs to do things like write
Hello, I'm a language model, not a human. This means that I believe in my language model, as I have no experience with it yet.
Hello, I'm a language model, but I'm talking about data. You've got to create an array of data: you've got to create that.
Hello, I'm a language model, and all of this is about modeling and learning Python. I'm very good in syntax, however I struggle with Python due

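Lol. Anyway, once the video comes out, this will also be a place for the FAQ, and a place for fixes and errata, of which I am sure there will be a number :)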

For discussions and questions, please use the Discussions tab, and for faster communication, have a look at my Zero To Hero Discord.

Original link: karpathy/build-nanogpt: Video+code lecture on building nanoGPT from scratch (github.com)

Clone the repository

git clone https://github.com/karpathy/build-nanogpt

You first need to download fineweb-edu and preprocess it into the edu_fineweb10B directory:

python fineweb.py
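
To give a rough idea of what "preprocess into a directory" can mean for a token dataset, here is a minimal sketch that writes a stream of token ids into fixed-size shard files and reads one back. It is not the logic of `fineweb.py`: the function names, the file naming, the raw 16-bit binary layout, and the tiny shard size are all assumptions chosen for illustration, and the fake token stream stands in for real tokenized documents. The one fact it relies on is that GPT-2's vocabulary has 50,257 entries, so every token id fits in an unsigned 16-bit integer.

```python
import os
import tempfile
from array import array


def _flush(buf, out_dir, prefix, index):
    """Write one shard to disk and return its path (hypothetical naming)."""
    path = os.path.join(out_dir, f"{prefix}_{index:06d}.bin")
    with open(path, "wb") as f:
        buf.tofile(f)
    return path


def write_shards(token_ids, shard_size, out_dir, prefix="shard"):
    """Split a token stream into shards of at most shard_size tokens each."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    buf = array("H")  # "H" = unsigned 16-bit integers
    for tok in token_ids:
        buf.append(tok)
        if len(buf) == shard_size:
            paths.append(_flush(buf, out_dir, prefix, len(paths)))
            buf = array("H")
    if buf:  # final, partially filled shard
        paths.append(_flush(buf, out_dir, prefix, len(paths)))
    return paths


def read_shard(path):
    """Load one shard back into a list of token ids."""
    buf = array("H")
    with open(path, "rb") as f:
        buf.frombytes(f.read())
    return list(buf)


# Fake token stream standing in for tokenized documents.
out_dir = os.path.join(tempfile.mkdtemp(), "edu_fineweb10B")
fake_tokens = [i % 50257 for i in range(2500)]
shard_paths = write_shards(fake_tokens, shard_size=1000, out_dir=out_dir)
print(len(shard_paths), "shards written to", out_dir)
```

Sharding keeps each file small enough to load on its own, so a training loop can stream through a dataset far larger than memory one shard at a time.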
