使用本地 Ollama 的 GraphRAG

毕竟用 openai api 来使用 GraphRAG 很贵，所以下面是在 widnows 11 下用 ollama 环境来使用GraphRAG

GraphRag 的 python 环境是 3.10-3.12, ollama 的模型是 llama3.1:70b

以下是使用 GraphRAG 系统的简单端到端示例。它演示如何使用系统对某些文本进行索引，然后使用索引数据来回答有关文档的问题。

1. 创建 GraphRAG 环境

conda create -n graphrag python=3.10
conda activate graphrag

1 2	conda create -n graphrag python=3.10 conda activate graphrag

2. 安装 GraphRAG 和 ollama

2.1 安装 GraphRAG

pip install graphrag

1	pip install graphrag

2.1 安装 ollama

需要到 library (ollama.com) 现在系统所需的文件，这里测试的是windows 环境，所以需要下载 windows 版本的程序

https://ollama.com/download/OllamaSetup.exe

1	https://ollama.com/download/OllamaSetup.exe

2.2 下载下面需要使用的模型

ollama pull llama3.1:70b
ollama pull nomic-embed-text

1 2	ollama pull llama3.1:70b ollama pull nomic-embed-text

llama3.1 是会话模型，

nomic-embed-text 是嵌入模型

3. 运行索引器

现在我们需要设置一个数据项目和一些初始配置。让我们来设置一下。我们使用的是默认配置模式，您可以根据需要使用配置文件（我们推荐）或环境变量进行自定义。

3.1 创建目录

mkdir ragtest\input

1	mkdir ragtest\input

3.2 下载文件

现在，让我们从可信赖的来源获取查尔斯·狄更斯（Charles Dickens）的《圣诞颂歌》的副本

curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt > ragtest\input/book.txt

1	curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt > ragtest\input/book.txt

3.3 设置工作区变量

要初始化您的工作区，让我们首先运行 graphrag.index --init 命令。由于我们已经在上一步中配置了一个名为 .ragtest’ 的目录，因此我们可以运行以下命令：

python -m graphrag.index --init --root ragtest
Initializing project at ragtest
⠋ GraphRAG Indexer

python -m graphrag.index --init --root ragtest

Initializing project at ragtest

⠋ GraphRAG Indexer

这将在 ./ragtest 目录中创建两个文件：.env 和 settings.yaml

.env 包含运行 GraphRAG 管道所需的环境变量。如果检查该文件，您将看到定义的单个环境变量 GRAPHRAG_API_KEY=<API_KEY>。这是 OpenAI API 或 Azure OpenAI 终结点的 API 密钥。您可以将其替换为您自己的 API 密钥。这里我们使用ollama 所以不用理会

settings.yaml 包含管道的设置。您可以修改此文件以更改管道的设置。这里需要修改一些参数

这里需要修改 llm 和 e，涉及到参数有 model, api_base 修改如下：

llm:
  model: llama3.1:70b
  api_base: http://localhost:11434/v1

llm:

model: llama3.1:70b

api_base: http://localhost:11434/v1

mbeddings 节的参数，涉及到参数有 model, api_base 修改如下：

embeddings:
  llm:
    model: nomic-embed-text
    api_base: http://localhost:11434/v1

embeddings:

llm:

model: nomic-embed-text

api_base: http://localhost:11434/v1

其他参数修改如下：

entity_extraction:
  max_gleanings: 0

1 2	entity_extraction: max_gleanings: 0

claim_extraction:
  max_gleanings: 0

1 2	claim_extraction: max_gleanings: 0

snapshots:
  graphml: yes
  raw_entities: yes
  top_level_nodes: yes

snapshots:

graphml: yes

raw_entities: yes

top_level_nodes: yes

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

chunks:

size: 300

overlap: 100

group_by_columns: [id] # by default, we don't allow chunks to cross documents

上面如果使用 openai 的 api, 则 api_base:

  api_base: https://api.openai.com/v1

1	api_base: https://api.openai.com/v1

3.4 运行索引 pipeline

python -m graphrag.index --root ragtest

1	python -m graphrag.index --root ragtest

正常的输出应该如下：

python -m graphrag.index --root ragtest
🚀 Reading settings from ragtest\settings.yaml
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
🚀 create_base_text_units
                                  id  ... n_tokens
0   d6583840046247f428a9f02738842a7c  ...     1200
1   10730234d6ccc7cee08f3cfc58d8a9a1  ...     1200
2   980594a50d68db06e6ca257bdb9ae95e  ...     1200
3   080d8e696ff38c653ca90fa086415e74  ...     1200
4   0e2b719e4c97d0d8bfeb2a53f7638eb6  ...     1200
5   7064df4af064aeb556e5bab52e896414  ...     1200
6   759315fa84c14e81f84fc71c73746184  ...     1200
7   e8d4072836ac08145edc2fa8c15ea2c2  ...     1200
8   e3bef9514042131cf477476725497416  ...     1200
9   4ffd9df98742c035b3e15bb24c3edb12  ...     1200
10  8435b078474636a989a8c22f5493e1b6  ...     1200
11  3763b08136628f77304cb4eb1136ea48  ...     1200
12  206c2f9fd249659c7a897d323459cb6f  ...     1200
13  ce95e4fc6ee410973c040fc628dce155  ...     1200
14  260fb94666cbdfb08286ce8d8162130d  ...     1200
15  bf29edcb41403e5af43aa101072f4fdf  ...     1200
16  d453d198afec5b284ff36024780b088c  ...     1200
17  c79e67fc6f74a9afbe79c158000cc71b  ...     1200
18  77ae3762a0b062ca5350ea54a05450ae  ...     1200
19  b029f1164f623c14a0cfaa73c246f50d  ...     1200
20  29793cee69d4eefd5fea8a5f2f27b521  ...     1200
21  b4dec8fbe9f2a2c6a79d09c9484d15ae  ...     1200
22  5d70b47bf7167b7586f47fcc4355a746  ...     1200
23  1bdf253855a115bcf51faa63d7b07e82  ...     1200
24  999c9887098d1a25dc3b42a8da7ddc8c  ...     1200
25  bc5fde5d1e00a3ecc1e548c8d24f1c1f  ...     1200
26  4cf4deeb7f61acb7b7db4ce0e57fb1e6  ...     1200
27  61a042016835080f3d334560b13b0e35  ...     1200
28  98f3970b31dfa1d7391cdaa453d6ade7  ...     1200
29  ebc403dd3df39bacc3443ef4afb7edfd  ...     1200
30  1cb66ea16e5e4f2816f0e188d3acc792  ...     1200
31  bc606176c752984da6d202275ee8c7a6  ...     1200
32  cd8a47ace09b9cee1e8b27b0689f2822  ...     1200
33  f40e4b274b5e1a25afbff9ecb733e1f4  ...     1200
34  19f8fd68a8dbc1bba7058e13ce3a2e3d  ...     1200
35  0f9b4e5a7cfc0c3c89a8898a45383588  ...     1200
36  6c362d3f8d01c76d84443dcabf3f322a  ...     1200
37  04e5c071e4ee5496d5380662e1339f45  ...     1200
38  2b5ecb7fba1301d1f3d307e194a6c435  ...     1200
39  aa8d2310a206001404282ddb3fd645aa  ...     1200
40  0ddc17ea5e566006c000b4013f2181a5  ...     1200
41  cd4234ed6caba8f15d09a2e3ee604b2a  ...     1055

[42 rows x 5 columns]
🚀 create_base_extracted_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_base_entity_graph
   level                                    clustered_graph
0      0  <graphml xmlns="http://graphml.graphdrawing.or...
1      1  <graphml xmlns="http://graphml.graphdrawing.or...
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
🚀 create_final_entities
                                   id  ...                              description_embedding
0    b45241d70f0e43fca764df95b2b81f77  ...  [-0.03227982670068741, 0.026621580123901367, 0...
1    4119fd06010c494caa07f439b333f4c5  ...  [0.005379790905863047, 0.0033020235132426023, ...
2    d3835bf3dda84ead99deadbeac5d0d7d  ...  [0.06518136709928513, 0.009380006231367588, -0...
3    077d2820ae1845bcbb1803379a3d1eae  ...  [0.004046803805977106, 0.010871930047869682, -...
4    3671ea0dd4e84c1a9b02c5ab2c8f4bac  ...  [0.018699107691645622, -0.0062330360524356365,...
..                                ...  ...                                                ...
173  5a28b94bc63b44edb30c54748fd14f15  ...  [-0.036404531449079514, -0.01810646429657936, ...
174  f97011b2a99d44648e18d517e1eae15c  ...  [-0.019464654847979546, -0.005517757963389158,...
175  35489ca6a63b47d6a8913cf333818bc1  ...  [-0.03391822427511215, -0.006714249961078167, ...
176  5d3344f45e654d2c808481672f2f08dd  ...  [0.0029292278923094273, 0.03771210461854935, 0...
177  6fb57f83baec45c9b30490ee991f433f  ...  [0.0347650907933712, -0.00041183660505339503, ...

[178 rows x 8 columns]
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: FutureWarning: errors='ignore' is
deprecated and will raise in a future version. Use to_datetime without passing `errors` and catch exceptions explicitly instead
  datetime_column = pd.to_datetime(column, errors="ignore")
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: UserWarning: Could not infer format,
so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please
specify a format.
  datetime_column = pd.to_datetime(column, errors="ignore")
🚀 create_final_nodes
     level                          title            type  ...                 top_level_node_id  x  y
0        0            "PROJECT GUTENBERG"  "ORGANIZATION"  ...  b45241d70f0e43fca764df95b2b81f77  0  0
1        0            "A CHRISTMAS CAROL"         "EVENT"  ...  4119fd06010c494caa07f439b333f4c5  0  0
2        0              "CHARLES DICKENS"        "PERSON"  ...  d3835bf3dda84ead99deadbeac5d0d7d  0  0
3        0               "ARTHUR RACKHAM"        "PERSON"  ...  077d2820ae1845bcbb1803379a3d1eae  0  0
4        0             "EBENEZER SCROOGE"        "PERSON"  ...  3671ea0dd4e84c1a9b02c5ab2c8f4bac  0  0
..     ...                            ...             ...  ...                               ... .. ..
351      1  "LITERARY ARCHIVE FOUNDATION"  "ORGANIZATION"  ...  5a28b94bc63b44edb30c54748fd14f15  0  0
352      1              "MICHAEL S. HART"        "PERSON"  ...  f97011b2a99d44648e18d517e1eae15c  0  0
353      1               "SALT LAKE CITY"           "GEO"  ...  35489ca6a63b47d6a8913cf333818bc1  0  0
354      1                    "DONATIONS"         "EVENT"  ...  5d3344f45e654d2c808481672f2f08dd  0  0
355      1          "PUBLIC DOMAIN WORKS"         "EVENT"  ...  6fb57f83baec45c9b30490ee991f433f  0  0

[356 rows x 14 columns]
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
🚀 create_final_communities
    id         title  ...                                   relationship_ids                                      text_unit_ids
0    4   Community 4  ...  [ad1595a78935472999444c9330e7730e, 735d19aea07...  [260fb94666cbdfb08286ce8d8162130d,d65838400462...
1    2   Community 2  ...  [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...  [04e5c071e4ee5496d5380662e1339f45,1bdf253855a1...
2    0   Community 0  ...  [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...  [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...
3    1   Community 1  ...  [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870...  [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...
4   10  Community 10  ...  [31499ee6277a4d71b19cb5b6be554c69, d99eabad5df...                 [0e2b719e4c97d0d8bfeb2a53f7638eb6]
5    6   Community 6  ...  [d53f15cb7f7845de91cc44ad44ff9f6e, 0080f96708c...                 [0e2b719e4c97d0d8bfeb2a53f7638eb6]
6    9   Community 9  ...  [0ec262c2cfef4dd581f3655e5e496e31, 40e4ef7dbc9...                 [8435b078474636a989a8c22f5493e1b6]
7    3   Community 3  ...  [100c2fccd7f74d9281707082f062ba72, 4d183e70076...  [19f8fd68a8dbc1bba7058e13ce3a2e3d,8435b0784746...
8    8   Community 8  ...  [4e9ca18ccc1d4527a3bc035d07f5e162, 5564257e89f...
9    5   Community 5  ...  [2325dafe50d1435cbee8ebcaa69688df, 9ed7e3d187b...  [4cf4deeb7f61acb7b7db4ce0e57fb1e6,bc5fde5d1e00...
10   7   Community 7  ...  [469aeef98cd1421fa123277b93d7b83a, 2fb66f9a0de...  [1cb66ea16e5e4f2816f0e188d3acc792, 1cb66ea16e5...
11  20  Community 20  ...  [ad1595a78935472999444c9330e7730e, 735d19aea07...  [260fb94666cbdfb08286ce8d8162130d,d65838400462...
12  16  Community 16  ...  [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...  [04e5c071e4ee5496d5380662e1339f45,1bdf253855a1...
13  12  Community 12  ...  [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...  [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...
14  19  Community 19  ...  [c1a146d7fb16429ea6d0aa2a55ee597f, ede93506320...  [080d8e696ff38c653ca90fa086415e74,0e2b719e4c97...
15  14  Community 14  ...  [f422035f8b78417f98e4d116971cf9f3, c79d686eba0...  [04e5c071e4ee5496d5380662e1339f45,1cb66ea16e5e...
16  11  Community 11  ...  [bcfdc48e5f044e1d84c5d217c1992d4b, b232fb0f2ac...  [04e5c071e4ee5496d5380662e1339f45,10730234d6cc...
17  13  Community 13  ...  [b3aeb7ae009a4f52ae3ae4586e32fe11, 089b9b98417...  [0e2b719e4c97d0d8bfeb2a53f7638eb6,0f9b4e5a7cfc...
18  15  Community 15  ...  [23becf8c6fca4f47a53ec4883d4bf63f, d0ffa3bcd12...  [0f9b4e5a7cfc0c3c89a8898a45383588,1bdf253855a1...
19  22  Community 22  ...  [83c76fbd2a004d90a5b0a6736ffed61d, d9779c41e3c...  [98f3970b31dfa1d7391cdaa453d6ade7,b029f1164f62...
20  21  Community 21  ...  [bd43f3d439a54781bd4b721a9a269b92, adc0f95733e...
21  18  Community 18  ...  [225105a7be14447cb03186bd40756059, efce8a9d612...  [19f8fd68a8dbc1bba7058e13ce3a2e3d,1bdf253855a1...
22  17  Community 17  ...  [f2c06f3a0c704296bf3353b91ee8af47, 9d08f285a7b...  [b4dec8fbe9f2a2c6a79d09c9484d15ae,f40e4b274b5e...

[23 rows x 6 columns]
🚀 join_text_units_to_entity_ids
                       text_unit_ids                                         entity_ids                                id
0   0ddc17ea5e566006c000b4013f2181a5  [b45241d70f0e43fca764df95b2b81f77, f2ff8044718...  0ddc17ea5e566006c000b4013f2181a5
1   cd4234ed6caba8f15d09a2e3ee604b2a  [b45241d70f0e43fca764df95b2b81f77, f7e11b0e297...  cd4234ed6caba8f15d09a2e3ee604b2a
2   d6583840046247f428a9f02738842a7c  [b45241d70f0e43fca764df95b2b81f77, 4119fd06010...  d6583840046247f428a9f02738842a7c
3   260fb94666cbdfb08286ce8d8162130d  [3671ea0dd4e84c1a9b02c5ab2c8f4bac, e7ffaee9d31...  260fb94666cbdfb08286ce8d8162130d
4   04e5c071e4ee5496d5380662e1339f45  [19a7f254a5d64566ab5cc15472df02de, f7e11b0e297...  04e5c071e4ee5496d5380662e1339f45
5   1bdf253855a115bcf51faa63d7b07e82  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  1bdf253855a115bcf51faa63d7b07e82
6   29793cee69d4eefd5fea8a5f2f27b521  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  29793cee69d4eefd5fea8a5f2f27b521
7   2b5ecb7fba1301d1f3d307e194a6c435  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  2b5ecb7fba1301d1f3d307e194a6c435
8   4ffd9df98742c035b3e15bb24c3edb12  [19a7f254a5d64566ab5cc15472df02de, 254770028d7...  4ffd9df98742c035b3e15bb24c3edb12
9   5d70b47bf7167b7586f47fcc4355a746  [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31...  5d70b47bf7167b7586f47fcc4355a746
10  6c362d3f8d01c76d84443dcabf3f322a  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  6c362d3f8d01c76d84443dcabf3f322a
11  7064df4af064aeb556e5bab52e896414  [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31...  7064df4af064aeb556e5bab52e896414
12  759315fa84c14e81f84fc71c73746184  [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31...  759315fa84c14e81f84fc71c73746184
13  8435b078474636a989a8c22f5493e1b6  [19a7f254a5d64566ab5cc15472df02de, 254770028d7...  8435b078474636a989a8c22f5493e1b6
14  b4dec8fbe9f2a2c6a79d09c9484d15ae  [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31...  b4dec8fbe9f2a2c6a79d09c9484d15ae
15  bf29edcb41403e5af43aa101072f4fdf  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  bf29edcb41403e5af43aa101072f4fdf
16  c79e67fc6f74a9afbe79c158000cc71b  [19a7f254a5d64566ab5cc15472df02de, de988724cfd...  c79e67fc6f74a9afbe79c158000cc71b
17  e8d4072836ac08145edc2fa8c15ea2c2  [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31...  e8d4072836ac08145edc2fa8c15ea2c2
18  f40e4b274b5e1a25afbff9ecb733e1f4  [19a7f254a5d64566ab5cc15472df02de, f7e11b0e297...  f40e4b274b5e1a25afbff9ecb733e1f4
19  080d8e696ff38c653ca90fa086415e74  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  080d8e696ff38c653ca90fa086415e74
20  0f9b4e5a7cfc0c3c89a8898a45383588  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  0f9b4e5a7cfc0c3c89a8898a45383588
21  10730234d6ccc7cee08f3cfc58d8a9a1  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  10730234d6ccc7cee08f3cfc58d8a9a1
22  3763b08136628f77304cb4eb1136ea48  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  3763b08136628f77304cb4eb1136ea48
23  4cf4deeb7f61acb7b7db4ce0e57fb1e6  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  4cf4deeb7f61acb7b7db4ce0e57fb1e6
24  77ae3762a0b062ca5350ea54a05450ae  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  77ae3762a0b062ca5350ea54a05450ae
25  980594a50d68db06e6ca257bdb9ae95e  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  980594a50d68db06e6ca257bdb9ae95e
26  bc5fde5d1e00a3ecc1e548c8d24f1c1f  [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297...  bc5fde5d1e00a3ecc1e548c8d24f1c1f
27  1cb66ea16e5e4f2816f0e188d3acc792  [f7e11b0e297a44a896dc67928368f600, 1fd3fa8bb5a...  1cb66ea16e5e4f2816f0e188d3acc792
28  98f3970b31dfa1d7391cdaa453d6ade7  [1fd3fa8bb5a2408790042ab9573779ee, de988724cfd...  98f3970b31dfa1d7391cdaa453d6ade7
29  0e2b719e4c97d0d8bfeb2a53f7638eb6  [de988724cfdf45cebfba3b13c43ceede, 9646481f66c...  0e2b719e4c97d0d8bfeb2a53f7638eb6
30  206c2f9fd249659c7a897d323459cb6f  [de988724cfdf45cebfba3b13c43ceede, 254770028d7...  206c2f9fd249659c7a897d323459cb6f
31  61a042016835080f3d334560b13b0e35  [de988724cfdf45cebfba3b13c43ceede, c9632a35146...  61a042016835080f3d334560b13b0e35
32  999c9887098d1a25dc3b42a8da7ddc8c  [de988724cfdf45cebfba3b13c43ceede, 254770028d7...  999c9887098d1a25dc3b42a8da7ddc8c
33  b029f1164f623c14a0cfaa73c246f50d  [de988724cfdf45cebfba3b13c43ceede, 254770028d7...  b029f1164f623c14a0cfaa73c246f50d
34  ce95e4fc6ee410973c040fc628dce155  [de988724cfdf45cebfba3b13c43ceede, 04dbbb2283b...  ce95e4fc6ee410973c040fc628dce155
35  d453d198afec5b284ff36024780b088c  [de988724cfdf45cebfba3b13c43ceede, 254770028d7...  d453d198afec5b284ff36024780b088c
36  e3bef9514042131cf477476725497416  [de988724cfdf45cebfba3b13c43ceede, 254770028d7...  e3bef9514042131cf477476725497416
37  ebc403dd3df39bacc3443ef4afb7edfd  [de988724cfdf45cebfba3b13c43ceede, 9646481f66c...  ebc403dd3df39bacc3443ef4afb7edfd
38  19f8fd68a8dbc1bba7058e13ce3a2e3d  [254770028d7a4fa9877da4ba0ad5ad21, 273daeec8ca...  19f8fd68a8dbc1bba7058e13ce3a2e3d
39  bc606176c752984da6d202275ee8c7a6  [273daeec8cad41e6b3e450447db58ee7, 3e95dacfe57...  bc606176c752984da6d202275ee8c7a6
40  cd8a47ace09b9cee1e8b27b0689f2822  [273daeec8cad41e6b3e450447db58ee7, 4f3c97517f7...  cd8a47ace09b9cee1e8b27b0689f2822
41  aa8d2310a206001404282ddb3fd645aa  [f2ff8044718648e18acef16dd9a65436, 00d785e7d76...  aa8d2310a206001404282ddb3fd645aa
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is
deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:65: FutureWarning: errors='ignore' is
deprecated and will raise in a future version. Use to_numeric without passing `errors` and catch exceptions explicitly instead
  column_numeric = cast(pd.Series, pd.to_numeric(column, errors="ignore"))
🚀 create_final_relationships
                                              source                                           target  ...  target_degree rank
0                                "PROJECT GUTENBERG"                              "A CHRISTMAS CAROL"  ...              3   10
1                                "PROJECT GUTENBERG"  "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION"  ...              3   10
2                                "PROJECT GUTENBERG"                                        "DEFECTS"  ...              1    8
3                                "PROJECT GUTENBERG"                                "MICHAEL S. HART"  ...              1    8
4                                "PROJECT GUTENBERG"                                 "SALT LAKE CITY"  ...              1    8
..                                               ...                                              ...  ...            ...  ...
176                                            "BOB"                                          "PETER"  ...              1    4
177  "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION"                             "PROJECT GUTENBERG™"  ...              3    6
178  "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION"                               "ROYALTY PAYMENTS"  ...              1    4
179                             "PROJECT GUTENBERG™"                                  "UNITED STATES"  ...              1    4
180                             "PROJECT GUTENBERG™"                                  "COPYRIGHT LAW"  ...              1    4

[181 rows x 10 columns]
🚀 join_text_units_to_relationship_ids
                                  id                                   relationship_ids
0   d6583840046247f428a9f02738842a7c  [68762e6f0d1c41cd857c6b964a8e76c3, 101572f552b...
1   0ddc17ea5e566006c000b4013f2181a5  [70634e10a5e845aa8c6a32fe7e8eb2b2, 04085f7cf46...
2   cd4234ed6caba8f15d09a2e3ee604b2a  [70634e10a5e845aa8c6a32fe7e8eb2b2, d203efdbfb2...
3   260fb94666cbdfb08286ce8d8162130d  [80020a1da63042459e00266b2a605452, 9a8ce816ee9...
4   04e5c071e4ee5496d5380662e1339f45  [31a7e680c4d54101afe4c8d52d246913, b7e9c9ef572...
5   1bdf253855a115bcf51faa63d7b07e82  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
6   29793cee69d4eefd5fea8a5f2f27b521  [31a7e680c4d54101afe4c8d52d246913, 4465efb7f6e...
7   2b5ecb7fba1301d1f3d307e194a6c435  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
8   4ffd9df98742c035b3e15bb24c3edb12  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
9   6c362d3f8d01c76d84443dcabf3f322a  [31a7e680c4d54101afe4c8d52d246913, adffed660d1...
10  7064df4af064aeb556e5bab52e896414  [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...
11  759315fa84c14e81f84fc71c73746184  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
12  8435b078474636a989a8c22f5493e1b6  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
13  bf29edcb41403e5af43aa101072f4fdf  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
14  c79e67fc6f74a9afbe79c158000cc71b  [31a7e680c4d54101afe4c8d52d246913, c1a146d7fb1...
15  e8d4072836ac08145edc2fa8c15ea2c2  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
16  5d70b47bf7167b7586f47fcc4355a746  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
17  b4dec8fbe9f2a2c6a79d09c9484d15ae  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
18  f40e4b274b5e1a25afbff9ecb733e1f4  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
19  3763b08136628f77304cb4eb1136ea48  [072cdee531b74513984f49d99a8d64a0, 5ae335d9210...
20  77ae3762a0b062ca5350ea54a05450ae  [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...
21  98f3970b31dfa1d7391cdaa453d6ade7  [5bd156c87ec44e19ae6f8f62e6e50b9d, c8e706fbdc9...
22  61a042016835080f3d334560b13b0e35  [f422035f8b78417f98e4d116971cf9f3, c79d686eba0...
23  19f8fd68a8dbc1bba7058e13ce3a2e3d  [c79d686eba044c5586c706cdc096817d, da1684437ab...
24  bc5fde5d1e00a3ecc1e548c8d24f1c1f  [c79d686eba044c5586c706cdc096817d, 0f70db1e598...
25  10730234d6ccc7cee08f3cfc58d8a9a1  [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870...
26  ce95e4fc6ee410973c040fc628dce155  [3e1b063bbfa9423d84e50311296d2f3c, 1dbc51475cb...
27  080d8e696ff38c653ca90fa086415e74  [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...
28  0e2b719e4c97d0d8bfeb2a53f7638eb6  [1dbc51475cb04dafa4a8833a8378635e, fdc954b4547...
29  206c2f9fd249659c7a897d323459cb6f  [1dbc51475cb04dafa4a8833a8378635e, c2d48b75af6...
30  4cf4deeb7f61acb7b7db4ce0e57fb1e6  [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...
31  999c9887098d1a25dc3b42a8da7ddc8c  [1dbc51475cb04dafa4a8833a8378635e, f9005e5c01b...
32  980594a50d68db06e6ca257bdb9ae95e  [c12b9ebd8b4e42b7896822a32e3fa6eb, 27505f6ade4...
33  ebc403dd3df39bacc3443ef4afb7edfd  [5a6c1d15424149f69052cd8d91fbff75, f9005e5c01b...
34  0f9b4e5a7cfc0c3c89a8898a45383588  [da1684437ab04f23adac28ff70bd8429, 6768339b540...
35  e3bef9514042131cf477476725497416  [da1684437ab04f23adac28ff70bd8429, 4517768fc4e...
36  d453d198afec5b284ff36024780b088c  [dbe9063124d047dc8d6fcaeadcda038f, 89b2003e978...
37  b029f1164f623c14a0cfaa73c246f50d  [c8e706fbdc90420d952deed03c4f04b4, 83c76fbd2a0...
38  1cb66ea16e5e4f2816f0e188d3acc792  [40450f2c91944a81944621b94f190b49, 5b9fa6a9592...
39  cd8a47ace09b9cee1e8b27b0689f2822  [40450f2c91944a81944621b94f190b49, 5d97ff82691...
40  bc606176c752984da6d202275ee8c7a6  [b84d71ed9c3b45819eb3205fd28e13a0, b0b464bc92a...
41  aa8d2310a206001404282ddb3fd645aa  [24652fab20d84381b112b8491de2887e, 36be44627ec...
🚀 create_final_community_reports
   community  ...                                    id
0         11  ...  00f4f26e-f665-4513-a11b-c2af7a02a36e
1         12  ...  9cdd9926-6613-4dc6-8b76-1701b42a67e5
2         13  ...  6df08abe-4339-406b-8f8e-88508ea788d4
3         14  ...  e128aff8-1831-4546-bc12-dfe6fe5b7456
4         15  ...  a304c0d0-8d80-459b-a6d0-338c40913f0e
5         16  ...  570f44a7-aa91-4799-8ec0-f83b0249093c
6         17  ...  a70866bd-ef07-4ffd-9481-32e843273f93
7         18  ...  bea51767-f949-4533-8dec-7dac859ca953
8         19  ...  9569bb60-3df5-44f1-be6c-eb01b2499cba
9         20  ...  eeccba53-7dfa-4b75-93ba-af46ebfe57e7
10        21  ...  57c87ad1-6417-45c2-a340-be1a9dd581d3
11        22  ...  ace75235-f237-4da4-b239-40418d462680
12         0  ...  186203c5-a162-4cad-a8f5-7c92af213b31
13         1  ...  e8e63308-7a83-4abc-a123-87952fb88042
14        10  ...  726b6e9a-70ab-45c5-890c-c7845d62564c
15         2  ...  1db816f5-301a-433c-95bf-a208fb6a06e9
16         3  ...  f06bd7e9-029f-425a-aa19-0987930dcfbb
17         4  ...  32285e91-d28f-487a-b438-8f978c9e5a19
18         5  ...  c0679eb7-369f-4792-8849-b326bebe1381
19         6  ...  4588263c-01b2-4689-b103-d6effcfc6870
20         7  ...  7097902f-0f84-4e41-93bc-b319a59abef4
21         8  ...  5584e7b9-227d-4248-b766-aab34a3e5805
22         9  ...  44b1d2d9-bc0d-493e-b124-f2530d55bdf8

[23 rows x 10 columns]
🚀 create_final_text_units
                                  id  ...                                   relationship_ids
0   d6583840046247f428a9f02738842a7c  ...  [68762e6f0d1c41cd857c6b964a8e76c3, 101572f552b...
1   10730234d6ccc7cee08f3cfc58d8a9a1  ...  [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870...
2   980594a50d68db06e6ca257bdb9ae95e  ...  [c12b9ebd8b4e42b7896822a32e3fa6eb, 27505f6ade4...
3   080d8e696ff38c653ca90fa086415e74  ...  [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...
4   0e2b719e4c97d0d8bfeb2a53f7638eb6  ...  [1dbc51475cb04dafa4a8833a8378635e, fdc954b4547...
5   7064df4af064aeb556e5bab52e896414  ...  [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...
6   759315fa84c14e81f84fc71c73746184  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
7   e8d4072836ac08145edc2fa8c15ea2c2  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
8   e3bef9514042131cf477476725497416  ...  [da1684437ab04f23adac28ff70bd8429, 4517768fc4e...
9   4ffd9df98742c035b3e15bb24c3edb12  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
10  8435b078474636a989a8c22f5493e1b6  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
11  3763b08136628f77304cb4eb1136ea48  ...  [072cdee531b74513984f49d99a8d64a0, 5ae335d9210...
12  206c2f9fd249659c7a897d323459cb6f  ...  [1dbc51475cb04dafa4a8833a8378635e, c2d48b75af6...
13  ce95e4fc6ee410973c040fc628dce155  ...  [3e1b063bbfa9423d84e50311296d2f3c, 1dbc51475cb...
14  260fb94666cbdfb08286ce8d8162130d  ...  [80020a1da63042459e00266b2a605452, 9a8ce816ee9...
15  bf29edcb41403e5af43aa101072f4fdf  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
16  d453d198afec5b284ff36024780b088c  ...  [dbe9063124d047dc8d6fcaeadcda038f, 89b2003e978...
17  c79e67fc6f74a9afbe79c158000cc71b  ...  [31a7e680c4d54101afe4c8d52d246913, c1a146d7fb1...
18  77ae3762a0b062ca5350ea54a05450ae  ...  [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...
19  b029f1164f623c14a0cfaa73c246f50d  ...  [c8e706fbdc90420d952deed03c4f04b4, 83c76fbd2a0...
20  29793cee69d4eefd5fea8a5f2f27b521  ...  [31a7e680c4d54101afe4c8d52d246913, 4465efb7f6e...
21  b4dec8fbe9f2a2c6a79d09c9484d15ae  ...  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
22  5d70b47bf7167b7586f47fcc4355a746  ...  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
23  1bdf253855a115bcf51faa63d7b07e82  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
24  999c9887098d1a25dc3b42a8da7ddc8c  ...  [1dbc51475cb04dafa4a8833a8378635e, f9005e5c01b...
25  bc5fde5d1e00a3ecc1e548c8d24f1c1f  ...  [c79d686eba044c5586c706cdc096817d, 0f70db1e598...
26  4cf4deeb7f61acb7b7db4ce0e57fb1e6  ...  [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...
27  61a042016835080f3d334560b13b0e35  ...  [f422035f8b78417f98e4d116971cf9f3, c79d686eba0...
28  98f3970b31dfa1d7391cdaa453d6ade7  ...  [5bd156c87ec44e19ae6f8f62e6e50b9d, c8e706fbdc9...
29  ebc403dd3df39bacc3443ef4afb7edfd  ...  [5a6c1d15424149f69052cd8d91fbff75, f9005e5c01b...
30  1cb66ea16e5e4f2816f0e188d3acc792  ...  [40450f2c91944a81944621b94f190b49, 5b9fa6a9592...
31  bc606176c752984da6d202275ee8c7a6  ...  [b84d71ed9c3b45819eb3205fd28e13a0, b0b464bc92a...
32  cd8a47ace09b9cee1e8b27b0689f2822  ...  [40450f2c91944a81944621b94f190b49, 5d97ff82691...
33  f40e4b274b5e1a25afbff9ecb733e1f4  ...  [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...
34  19f8fd68a8dbc1bba7058e13ce3a2e3d  ...  [c79d686eba044c5586c706cdc096817d, da1684437ab...
35  0f9b4e5a7cfc0c3c89a8898a45383588  ...  [da1684437ab04f23adac28ff70bd8429, 6768339b540...
36  6c362d3f8d01c76d84443dcabf3f322a  ...  [31a7e680c4d54101afe4c8d52d246913, adffed660d1...
37  04e5c071e4ee5496d5380662e1339f45  ...  [31a7e680c4d54101afe4c8d52d246913, b7e9c9ef572...
38  2b5ecb7fba1301d1f3d307e194a6c435  ...  [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...
39  aa8d2310a206001404282ddb3fd645aa  ...  [24652fab20d84381b112b8491de2887e, 36be44627ec...
40  0ddc17ea5e566006c000b4013f2181a5  ...  [70634e10a5e845aa8c6a32fe7e8eb2b2, 04085f7cf46...
41  cd4234ed6caba8f15d09a2e3ee604b2a  ...  [70634e10a5e845aa8c6a32fe7e8eb2b2, d203efdbfb2...

[42 rows x 6 columns]
C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: FutureWarning: errors='ignore' is
deprecated and will raise in a future version. Use to_datetime without passing `errors` and catch exceptions explicitly instead
  datetime_column = pd.to_datetime(column, errors="ignore")
🚀 create_base_documents
                                 id  ...     title
0  c305886e4aa2f6efcf64b57762777055  ...  book.txt

[1 rows x 4 columns]
🚀 create_final_documents
                                 id  ...     title
0  c305886e4aa2f6efcf64b57762777055  ...  book.txt

[1 rows x 4 columns]
⠙ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
├── create_final_entities
├── create_final_nodes
├── create_final_communities
├── join_text_units_to_entity_ids
├── create_final_relationships
├── join_text_units_to_relationship_ids
├── create_final_community_reports
├── create_final_text_units
├── create_base_documents
└── create_final_documents
🚀 All workflows completed successfully.

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

python -m graphrag.index --root ragtest

🚀 Reading settings from ragtest\settings.yaml

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

🚀 create_base_text_units

id ... n_tokens

0 d6583840046247f428a9f02738842a7c ... 1200

1 10730234d6ccc7cee08f3cfc58d8a9a1 ... 1200

2 980594a50d68db06e6ca257bdb9ae95e ... 1200

3 080d8e696ff38c653ca90fa086415e74 ... 1200

4 0e2b719e4c97d0d8bfeb2a53f7638eb6 ... 1200

5 7064df4af064aeb556e5bab52e896414 ... 1200

6 759315fa84c14e81f84fc71c73746184 ... 1200

7 e8d4072836ac08145edc2fa8c15ea2c2 ... 1200

8 e3bef9514042131cf477476725497416 ... 1200

9 4ffd9df98742c035b3e15bb24c3edb12 ... 1200

10 8435b078474636a989a8c22f5493e1b6 ... 1200

11 3763b08136628f77304cb4eb1136ea48 ... 1200

12 206c2f9fd249659c7a897d323459cb6f ... 1200

13 ce95e4fc6ee410973c040fc628dce155 ... 1200

14 260fb94666cbdfb08286ce8d8162130d ... 1200

15 bf29edcb41403e5af43aa101072f4fdf ... 1200

16 d453d198afec5b284ff36024780b088c ... 1200

17 c79e67fc6f74a9afbe79c158000cc71b ... 1200

18 77ae3762a0b062ca5350ea54a05450ae ... 1200

19 b029f1164f623c14a0cfaa73c246f50d ... 1200

20 29793cee69d4eefd5fea8a5f2f27b521 ... 1200

21 b4dec8fbe9f2a2c6a79d09c9484d15ae ... 1200

22 5d70b47bf7167b7586f47fcc4355a746 ... 1200

23 1bdf253855a115bcf51faa63d7b07e82 ... 1200

24 999c9887098d1a25dc3b42a8da7ddc8c ... 1200

25 bc5fde5d1e00a3ecc1e548c8d24f1c1f ... 1200

26 4cf4deeb7f61acb7b7db4ce0e57fb1e6 ... 1200

27 61a042016835080f3d334560b13b0e35 ... 1200

28 98f3970b31dfa1d7391cdaa453d6ade7 ... 1200

29 ebc403dd3df39bacc3443ef4afb7edfd ... 1200

30 1cb66ea16e5e4f2816f0e188d3acc792 ... 1200

31 bc606176c752984da6d202275ee8c7a6 ... 1200

32 cd8a47ace09b9cee1e8b27b0689f2822 ... 1200

33 f40e4b274b5e1a25afbff9ecb733e1f4 ... 1200

34 19f8fd68a8dbc1bba7058e13ce3a2e3d ... 1200

35 0f9b4e5a7cfc0c3c89a8898a45383588 ... 1200

36 6c362d3f8d01c76d84443dcabf3f322a ... 1200

37 04e5c071e4ee5496d5380662e1339f45 ... 1200

38 2b5ecb7fba1301d1f3d307e194a6c435 ... 1200

39 aa8d2310a206001404282ddb3fd645aa ... 1200

40 0ddc17ea5e566006c000b4013f2181a5 ... 1200

41 cd4234ed6caba8f15d09a2e3ee604b2a ... 1055

[42 rows x 5 columns]

🚀 create_base_extracted_entities

entity_graph

0 <graphml xmlns="http://graphml.graphdrawing.or...

🚀 create_summarized_entities

entity_graph

0 <graphml xmlns="http://graphml.graphdrawing.or...

🚀 create_base_entity_graph

level clustered_graph

0 0 <graphml xmlns="http://graphml.graphdrawing.or...

1 1 <graphml xmlns="http://graphml.graphdrawing.or...

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

🚀 create_final_entities

id ... description_embedding

0 b45241d70f0e43fca764df95b2b81f77 ... [-0.03227982670068741, 0.026621580123901367, 0...

1 4119fd06010c494caa07f439b333f4c5 ... [0.005379790905863047, 0.0033020235132426023, ...

2 d3835bf3dda84ead99deadbeac5d0d7d ... [0.06518136709928513, 0.009380006231367588, -0...

3 077d2820ae1845bcbb1803379a3d1eae ... [0.004046803805977106, 0.010871930047869682, -...

4 3671ea0dd4e84c1a9b02c5ab2c8f4bac ... [0.018699107691645622, -0.0062330360524356365,...

.. ... ... ...

173 5a28b94bc63b44edb30c54748fd14f15 ... [-0.036404531449079514, -0.01810646429657936, ...

174 f97011b2a99d44648e18d517e1eae15c ... [-0.019464654847979546, -0.005517757963389158,...

175 35489ca6a63b47d6a8913cf333818bc1 ... [-0.03391822427511215, -0.006714249961078167, ...

176 5d3344f45e654d2c808481672f2f08dd ... [0.0029292278923094273, 0.03771210461854935, 0...

177 6fb57f83baec45c9b30490ee991f433f ... [0.0347650907933712, -0.00041183660505339503, ...

[178 rows x 8 columns]

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: FutureWarning: errors='ignore' is

deprecated and will raise in a future version. Use to_datetime without passing `errors` and catch exceptions explicitly instead

datetime_column = pd.to_datetime(column, errors="ignore")

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: UserWarning: Could not infer format,

so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please

specify a format.

datetime_column = pd.to_datetime(column, errors="ignore")

🚀 create_final_nodes

level title type ... top_level_node_id x y

0 0 "PROJECT GUTENBERG" "ORGANIZATION" ... b45241d70f0e43fca764df95b2b81f77 0 0

1 0 "A CHRISTMAS CAROL" "EVENT" ... 4119fd06010c494caa07f439b333f4c5 0 0

2 0 "CHARLES DICKENS" "PERSON" ... d3835bf3dda84ead99deadbeac5d0d7d 0 0

3 0 "ARTHUR RACKHAM" "PERSON" ... 077d2820ae1845bcbb1803379a3d1eae 0 0

4 0 "EBENEZER SCROOGE" "PERSON" ... 3671ea0dd4e84c1a9b02c5ab2c8f4bac 0 0

.. ... ... ... ... ... .. ..

351 1 "LITERARY ARCHIVE FOUNDATION" "ORGANIZATION" ... 5a28b94bc63b44edb30c54748fd14f15 0 0

352 1 "MICHAEL S. HART" "PERSON" ... f97011b2a99d44648e18d517e1eae15c 0 0

353 1 "SALT LAKE CITY" "GEO" ... 35489ca6a63b47d6a8913cf333818bc1 0 0

354 1 "DONATIONS" "EVENT" ... 5d3344f45e654d2c808481672f2f08dd 0 0

355 1 "PUBLIC DOMAIN WORKS" "EVENT" ... 6fb57f83baec45c9b30490ee991f433f 0 0

[356 rows x 14 columns]

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

🚀 create_final_communities

id title ... relationship_ids text_unit_ids

0 4 Community 4 ... [ad1595a78935472999444c9330e7730e, 735d19aea07... [260fb94666cbdfb08286ce8d8162130d,d65838400462...

1 2 Community 2 ... [31a7e680c4d54101afe4c8d52d246913, 351abba16e5... [04e5c071e4ee5496d5380662e1339f45,1bdf253855a1...

2 0 Community 0 ... [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0... [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...

3 1 Community 1 ... [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870... [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...

4 10 Community 10 ... [31499ee6277a4d71b19cb5b6be554c69, d99eabad5df... [0e2b719e4c97d0d8bfeb2a53f7638eb6]

5 6 Community 6 ... [d53f15cb7f7845de91cc44ad44ff9f6e, 0080f96708c... [0e2b719e4c97d0d8bfeb2a53f7638eb6]

6 9 Community 9 ... [0ec262c2cfef4dd581f3655e5e496e31, 40e4ef7dbc9... [8435b078474636a989a8c22f5493e1b6]

7 3 Community 3 ... [100c2fccd7f74d9281707082f062ba72, 4d183e70076... [19f8fd68a8dbc1bba7058e13ce3a2e3d,8435b0784746...

8 8 Community 8 ... [4e9ca18ccc1d4527a3bc035d07f5e162, 5564257e89f...

9 5 Community 5 ... [2325dafe50d1435cbee8ebcaa69688df, 9ed7e3d187b... [4cf4deeb7f61acb7b7db4ce0e57fb1e6,bc5fde5d1e00...

10 7 Community 7 ... [469aeef98cd1421fa123277b93d7b83a, 2fb66f9a0de... [1cb66ea16e5e4f2816f0e188d3acc792, 1cb66ea16e5...

11 20 Community 20 ... [ad1595a78935472999444c9330e7730e, 735d19aea07... [260fb94666cbdfb08286ce8d8162130d,d65838400462...

12 16 Community 16 ... [31a7e680c4d54101afe4c8d52d246913, 351abba16e5... [04e5c071e4ee5496d5380662e1339f45,1bdf253855a1...

13 12 Community 12 ... [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0... [04e5c071e4ee5496d5380662e1339f45,080d8e696ff3...

14 19 Community 19 ... [c1a146d7fb16429ea6d0aa2a55ee597f, ede93506320... [080d8e696ff38c653ca90fa086415e74,0e2b719e4c97...

15 14 Community 14 ... [f422035f8b78417f98e4d116971cf9f3, c79d686eba0... [04e5c071e4ee5496d5380662e1339f45,1cb66ea16e5e...

16 11 Community 11 ... [bcfdc48e5f044e1d84c5d217c1992d4b, b232fb0f2ac... [04e5c071e4ee5496d5380662e1339f45,10730234d6cc...

17 13 Community 13 ... [b3aeb7ae009a4f52ae3ae4586e32fe11, 089b9b98417... [0e2b719e4c97d0d8bfeb2a53f7638eb6,0f9b4e5a7cfc...

18 15 Community 15 ... [23becf8c6fca4f47a53ec4883d4bf63f, d0ffa3bcd12... [0f9b4e5a7cfc0c3c89a8898a45383588,1bdf253855a1...

19 22 Community 22 ... [83c76fbd2a004d90a5b0a6736ffed61d, d9779c41e3c... [98f3970b31dfa1d7391cdaa453d6ade7,b029f1164f62...

20 21 Community 21 ... [bd43f3d439a54781bd4b721a9a269b92, adc0f95733e...

21 18 Community 18 ... [225105a7be14447cb03186bd40756059, efce8a9d612... [19f8fd68a8dbc1bba7058e13ce3a2e3d,1bdf253855a1...

22 17 Community 17 ... [f2c06f3a0c704296bf3353b91ee8af47, 9d08f285a7b... [b4dec8fbe9f2a2c6a79d09c9484d15ae,f40e4b274b5e...

[23 rows x 6 columns]

🚀 join_text_units_to_entity_ids

text_unit_ids entity_ids id

0 0ddc17ea5e566006c000b4013f2181a5 [b45241d70f0e43fca764df95b2b81f77, f2ff8044718... 0ddc17ea5e566006c000b4013f2181a5

1 cd4234ed6caba8f15d09a2e3ee604b2a [b45241d70f0e43fca764df95b2b81f77, f7e11b0e297... cd4234ed6caba8f15d09a2e3ee604b2a

2 d6583840046247f428a9f02738842a7c [b45241d70f0e43fca764df95b2b81f77, 4119fd06010... d6583840046247f428a9f02738842a7c

3 260fb94666cbdfb08286ce8d8162130d [3671ea0dd4e84c1a9b02c5ab2c8f4bac, e7ffaee9d31... 260fb94666cbdfb08286ce8d8162130d

4 04e5c071e4ee5496d5380662e1339f45 [19a7f254a5d64566ab5cc15472df02de, f7e11b0e297... 04e5c071e4ee5496d5380662e1339f45

5 1bdf253855a115bcf51faa63d7b07e82 [19a7f254a5d64566ab5cc15472df02de, de988724cfd... 1bdf253855a115bcf51faa63d7b07e82

6 29793cee69d4eefd5fea8a5f2f27b521 [19a7f254a5d64566ab5cc15472df02de, de988724cfd... 29793cee69d4eefd5fea8a5f2f27b521

7 2b5ecb7fba1301d1f3d307e194a6c435 [19a7f254a5d64566ab5cc15472df02de, de988724cfd... 2b5ecb7fba1301d1f3d307e194a6c435

8 4ffd9df98742c035b3e15bb24c3edb12 [19a7f254a5d64566ab5cc15472df02de, 254770028d7... 4ffd9df98742c035b3e15bb24c3edb12

9 5d70b47bf7167b7586f47fcc4355a746 [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31... 5d70b47bf7167b7586f47fcc4355a746

10 6c362d3f8d01c76d84443dcabf3f322a [19a7f254a5d64566ab5cc15472df02de, de988724cfd... 6c362d3f8d01c76d84443dcabf3f322a

11 7064df4af064aeb556e5bab52e896414 [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31... 7064df4af064aeb556e5bab52e896414

12 759315fa84c14e81f84fc71c73746184 [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31... 759315fa84c14e81f84fc71c73746184

13 8435b078474636a989a8c22f5493e1b6 [19a7f254a5d64566ab5cc15472df02de, 254770028d7... 8435b078474636a989a8c22f5493e1b6

14 b4dec8fbe9f2a2c6a79d09c9484d15ae [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31... b4dec8fbe9f2a2c6a79d09c9484d15ae

15 bf29edcb41403e5af43aa101072f4fdf [19a7f254a5d64566ab5cc15472df02de, de988724cfd... bf29edcb41403e5af43aa101072f4fdf

16 c79e67fc6f74a9afbe79c158000cc71b [19a7f254a5d64566ab5cc15472df02de, de988724cfd... c79e67fc6f74a9afbe79c158000cc71b

17 e8d4072836ac08145edc2fa8c15ea2c2 [19a7f254a5d64566ab5cc15472df02de, e7ffaee9d31... e8d4072836ac08145edc2fa8c15ea2c2

18 f40e4b274b5e1a25afbff9ecb733e1f4 [19a7f254a5d64566ab5cc15472df02de, f7e11b0e297... f40e4b274b5e1a25afbff9ecb733e1f4

19 080d8e696ff38c653ca90fa086415e74 [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 080d8e696ff38c653ca90fa086415e74

20 0f9b4e5a7cfc0c3c89a8898a45383588 [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 0f9b4e5a7cfc0c3c89a8898a45383588

21 10730234d6ccc7cee08f3cfc58d8a9a1 [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 10730234d6ccc7cee08f3cfc58d8a9a1

22 3763b08136628f77304cb4eb1136ea48 [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 3763b08136628f77304cb4eb1136ea48

23 4cf4deeb7f61acb7b7db4ce0e57fb1e6 [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 4cf4deeb7f61acb7b7db4ce0e57fb1e6

24 77ae3762a0b062ca5350ea54a05450ae [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 77ae3762a0b062ca5350ea54a05450ae

25 980594a50d68db06e6ca257bdb9ae95e [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... 980594a50d68db06e6ca257bdb9ae95e

26 bc5fde5d1e00a3ecc1e548c8d24f1c1f [e7ffaee9d31d4d3c96e04f911d0a8f9e, f7e11b0e297... bc5fde5d1e00a3ecc1e548c8d24f1c1f

27 1cb66ea16e5e4f2816f0e188d3acc792 [f7e11b0e297a44a896dc67928368f600, 1fd3fa8bb5a... 1cb66ea16e5e4f2816f0e188d3acc792

28 98f3970b31dfa1d7391cdaa453d6ade7 [1fd3fa8bb5a2408790042ab9573779ee, de988724cfd... 98f3970b31dfa1d7391cdaa453d6ade7

29 0e2b719e4c97d0d8bfeb2a53f7638eb6 [de988724cfdf45cebfba3b13c43ceede, 9646481f66c... 0e2b719e4c97d0d8bfeb2a53f7638eb6

30 206c2f9fd249659c7a897d323459cb6f [de988724cfdf45cebfba3b13c43ceede, 254770028d7... 206c2f9fd249659c7a897d323459cb6f

31 61a042016835080f3d334560b13b0e35 [de988724cfdf45cebfba3b13c43ceede, c9632a35146... 61a042016835080f3d334560b13b0e35

32 999c9887098d1a25dc3b42a8da7ddc8c [de988724cfdf45cebfba3b13c43ceede, 254770028d7... 999c9887098d1a25dc3b42a8da7ddc8c

33 b029f1164f623c14a0cfaa73c246f50d [de988724cfdf45cebfba3b13c43ceede, 254770028d7... b029f1164f623c14a0cfaa73c246f50d

34 ce95e4fc6ee410973c040fc628dce155 [de988724cfdf45cebfba3b13c43ceede, 04dbbb2283b... ce95e4fc6ee410973c040fc628dce155

35 d453d198afec5b284ff36024780b088c [de988724cfdf45cebfba3b13c43ceede, 254770028d7... d453d198afec5b284ff36024780b088c

36 e3bef9514042131cf477476725497416 [de988724cfdf45cebfba3b13c43ceede, 254770028d7... e3bef9514042131cf477476725497416

37 ebc403dd3df39bacc3443ef4afb7edfd [de988724cfdf45cebfba3b13c43ceede, 9646481f66c... ebc403dd3df39bacc3443ef4afb7edfd

38 19f8fd68a8dbc1bba7058e13ce3a2e3d [254770028d7a4fa9877da4ba0ad5ad21, 273daeec8ca... 19f8fd68a8dbc1bba7058e13ce3a2e3d

39 bc606176c752984da6d202275ee8c7a6 [273daeec8cad41e6b3e450447db58ee7, 3e95dacfe57... bc606176c752984da6d202275ee8c7a6

40 cd8a47ace09b9cee1e8b27b0689f2822 [273daeec8cad41e6b3e450447db58ee7, 4f3c97517f7... cd8a47ace09b9cee1e8b27b0689f2822

41 aa8d2310a206001404282ddb3fd645aa [f2ff8044718648e18acef16dd9a65436, 00d785e7d76... aa8d2310a206001404282ddb3fd645aa

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\numpy\core\fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is

deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.

return bound(*args, **kwds)

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:65: FutureWarning: errors='ignore' is

deprecated and will raise in a future version. Use to_numeric without passing `errors` and catch exceptions explicitly instead

column_numeric = cast(pd.Series, pd.to_numeric(column, errors="ignore"))

🚀 create_final_relationships

source target ... target_degree rank

0 "PROJECT GUTENBERG" "A CHRISTMAS CAROL" ... 3 10

1 "PROJECT GUTENBERG" "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION" ... 3 10

2 "PROJECT GUTENBERG" "DEFECTS" ... 1 8

3 "PROJECT GUTENBERG" "MICHAEL S. HART" ... 1 8

4 "PROJECT GUTENBERG" "SALT LAKE CITY" ... 1 8

.. ... ... ... ... ...

176 "BOB" "PETER" ... 1 4

177 "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION" "PROJECT GUTENBERG™" ... 3 6

178 "PROJECT GUTENBERG LITERARY ARCHIVE FOUNDATION" "ROYALTY PAYMENTS" ... 1 4

179 "PROJECT GUTENBERG™" "UNITED STATES" ... 1 4

180 "PROJECT GUTENBERG™" "COPYRIGHT LAW" ... 1 4

[181 rows x 10 columns]

🚀 join_text_units_to_relationship_ids

id relationship_ids

0 d6583840046247f428a9f02738842a7c [68762e6f0d1c41cd857c6b964a8e76c3, 101572f552b...

1 0ddc17ea5e566006c000b4013f2181a5 [70634e10a5e845aa8c6a32fe7e8eb2b2, 04085f7cf46...

2 cd4234ed6caba8f15d09a2e3ee604b2a [70634e10a5e845aa8c6a32fe7e8eb2b2, d203efdbfb2...

3 260fb94666cbdfb08286ce8d8162130d [80020a1da63042459e00266b2a605452, 9a8ce816ee9...

4 04e5c071e4ee5496d5380662e1339f45 [31a7e680c4d54101afe4c8d52d246913, b7e9c9ef572...

5 1bdf253855a115bcf51faa63d7b07e82 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

6 29793cee69d4eefd5fea8a5f2f27b521 [31a7e680c4d54101afe4c8d52d246913, 4465efb7f6e...

7 2b5ecb7fba1301d1f3d307e194a6c435 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

8 4ffd9df98742c035b3e15bb24c3edb12 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

9 6c362d3f8d01c76d84443dcabf3f322a [31a7e680c4d54101afe4c8d52d246913, adffed660d1...

10 7064df4af064aeb556e5bab52e896414 [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...

11 759315fa84c14e81f84fc71c73746184 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

12 8435b078474636a989a8c22f5493e1b6 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

13 bf29edcb41403e5af43aa101072f4fdf [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

14 c79e67fc6f74a9afbe79c158000cc71b [31a7e680c4d54101afe4c8d52d246913, c1a146d7fb1...

15 e8d4072836ac08145edc2fa8c15ea2c2 [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

16 5d70b47bf7167b7586f47fcc4355a746 [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

17 b4dec8fbe9f2a2c6a79d09c9484d15ae [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

18 f40e4b274b5e1a25afbff9ecb733e1f4 [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

19 3763b08136628f77304cb4eb1136ea48 [072cdee531b74513984f49d99a8d64a0, 5ae335d9210...

20 77ae3762a0b062ca5350ea54a05450ae [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...

21 98f3970b31dfa1d7391cdaa453d6ade7 [5bd156c87ec44e19ae6f8f62e6e50b9d, c8e706fbdc9...

22 61a042016835080f3d334560b13b0e35 [f422035f8b78417f98e4d116971cf9f3, c79d686eba0...

23 19f8fd68a8dbc1bba7058e13ce3a2e3d [c79d686eba044c5586c706cdc096817d, da1684437ab...

24 bc5fde5d1e00a3ecc1e548c8d24f1c1f [c79d686eba044c5586c706cdc096817d, 0f70db1e598...

25 10730234d6ccc7cee08f3cfc58d8a9a1 [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870...

26 ce95e4fc6ee410973c040fc628dce155 [3e1b063bbfa9423d84e50311296d2f3c, 1dbc51475cb...

27 080d8e696ff38c653ca90fa086415e74 [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...

28 0e2b719e4c97d0d8bfeb2a53f7638eb6 [1dbc51475cb04dafa4a8833a8378635e, fdc954b4547...

29 206c2f9fd249659c7a897d323459cb6f [1dbc51475cb04dafa4a8833a8378635e, c2d48b75af6...

30 4cf4deeb7f61acb7b7db4ce0e57fb1e6 [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...

31 999c9887098d1a25dc3b42a8da7ddc8c [1dbc51475cb04dafa4a8833a8378635e, f9005e5c01b...

32 980594a50d68db06e6ca257bdb9ae95e [c12b9ebd8b4e42b7896822a32e3fa6eb, 27505f6ade4...

33 ebc403dd3df39bacc3443ef4afb7edfd [5a6c1d15424149f69052cd8d91fbff75, f9005e5c01b...

34 0f9b4e5a7cfc0c3c89a8898a45383588 [da1684437ab04f23adac28ff70bd8429, 6768339b540...

35 e3bef9514042131cf477476725497416 [da1684437ab04f23adac28ff70bd8429, 4517768fc4e...

36 d453d198afec5b284ff36024780b088c [dbe9063124d047dc8d6fcaeadcda038f, 89b2003e978...

37 b029f1164f623c14a0cfaa73c246f50d [c8e706fbdc90420d952deed03c4f04b4, 83c76fbd2a0...

38 1cb66ea16e5e4f2816f0e188d3acc792 [40450f2c91944a81944621b94f190b49, 5b9fa6a9592...

39 cd8a47ace09b9cee1e8b27b0689f2822 [40450f2c91944a81944621b94f190b49, 5d97ff82691...

40 bc606176c752984da6d202275ee8c7a6 [b84d71ed9c3b45819eb3205fd28e13a0, b0b464bc92a...

41 aa8d2310a206001404282ddb3fd645aa [24652fab20d84381b112b8491de2887e, 36be44627ec...

🚀 create_final_community_reports

community ... id

0 11 ... 00f4f26e-f665-4513-a11b-c2af7a02a36e

1 12 ... 9cdd9926-6613-4dc6-8b76-1701b42a67e5

2 13 ... 6df08abe-4339-406b-8f8e-88508ea788d4

3 14 ... e128aff8-1831-4546-bc12-dfe6fe5b7456

4 15 ... a304c0d0-8d80-459b-a6d0-338c40913f0e

5 16 ... 570f44a7-aa91-4799-8ec0-f83b0249093c

6 17 ... a70866bd-ef07-4ffd-9481-32e843273f93

7 18 ... bea51767-f949-4533-8dec-7dac859ca953

8 19 ... 9569bb60-3df5-44f1-be6c-eb01b2499cba

9 20 ... eeccba53-7dfa-4b75-93ba-af46ebfe57e7

10 21 ... 57c87ad1-6417-45c2-a340-be1a9dd581d3

11 22 ... ace75235-f237-4da4-b239-40418d462680

12 0 ... 186203c5-a162-4cad-a8f5-7c92af213b31

13 1 ... e8e63308-7a83-4abc-a123-87952fb88042

14 10 ... 726b6e9a-70ab-45c5-890c-c7845d62564c

15 2 ... 1db816f5-301a-433c-95bf-a208fb6a06e9

16 3 ... f06bd7e9-029f-425a-aa19-0987930dcfbb

17 4 ... 32285e91-d28f-487a-b438-8f978c9e5a19

18 5 ... c0679eb7-369f-4792-8849-b326bebe1381

19 6 ... 4588263c-01b2-4689-b103-d6effcfc6870

20 7 ... 7097902f-0f84-4e41-93bc-b319a59abef4

21 8 ... 5584e7b9-227d-4248-b766-aab34a3e5805

22 9 ... 44b1d2d9-bc0d-493e-b124-f2530d55bdf8

[23 rows x 10 columns]

🚀 create_final_text_units

id ... relationship_ids

0 d6583840046247f428a9f02738842a7c ... [68762e6f0d1c41cd857c6b964a8e76c3, 101572f552b...

1 10730234d6ccc7cee08f3cfc58d8a9a1 ... [b35c3d1a7daa4924b6bdb58bc69c354d, a97e2ecd870...

2 980594a50d68db06e6ca257bdb9ae95e ... [c12b9ebd8b4e42b7896822a32e3fa6eb, 27505f6ade4...

3 080d8e696ff38c653ca90fa086415e74 ... [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...

4 0e2b719e4c97d0d8bfeb2a53f7638eb6 ... [1dbc51475cb04dafa4a8833a8378635e, fdc954b4547...

5 7064df4af064aeb556e5bab52e896414 ... [31a7e680c4d54101afe4c8d52d246913, 351abba16e5...

6 759315fa84c14e81f84fc71c73746184 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

7 e8d4072836ac08145edc2fa8c15ea2c2 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

8 e3bef9514042131cf477476725497416 ... [da1684437ab04f23adac28ff70bd8429, 4517768fc4e...

9 4ffd9df98742c035b3e15bb24c3edb12 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

10 8435b078474636a989a8c22f5493e1b6 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

11 3763b08136628f77304cb4eb1136ea48 ... [072cdee531b74513984f49d99a8d64a0, 5ae335d9210...

12 206c2f9fd249659c7a897d323459cb6f ... [1dbc51475cb04dafa4a8833a8378635e, c2d48b75af6...

13 ce95e4fc6ee410973c040fc628dce155 ... [3e1b063bbfa9423d84e50311296d2f3c, 1dbc51475cb...

14 260fb94666cbdfb08286ce8d8162130d ... [80020a1da63042459e00266b2a605452, 9a8ce816ee9...

15 bf29edcb41403e5af43aa101072f4fdf ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

16 d453d198afec5b284ff36024780b088c ... [dbe9063124d047dc8d6fcaeadcda038f, 89b2003e978...

17 c79e67fc6f74a9afbe79c158000cc71b ... [31a7e680c4d54101afe4c8d52d246913, c1a146d7fb1...

18 77ae3762a0b062ca5350ea54a05450ae ... [5ac60a941a5b4934bdc43d2f87de601c, d405c3154d0...

19 b029f1164f623c14a0cfaa73c246f50d ... [c8e706fbdc90420d952deed03c4f04b4, 83c76fbd2a0...

20 29793cee69d4eefd5fea8a5f2f27b521 ... [31a7e680c4d54101afe4c8d52d246913, 4465efb7f6e...

21 b4dec8fbe9f2a2c6a79d09c9484d15ae ... [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

22 5d70b47bf7167b7586f47fcc4355a746 ... [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

23 1bdf253855a115bcf51faa63d7b07e82 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

24 999c9887098d1a25dc3b42a8da7ddc8c ... [1dbc51475cb04dafa4a8833a8378635e, f9005e5c01b...

25 bc5fde5d1e00a3ecc1e548c8d24f1c1f ... [c79d686eba044c5586c706cdc096817d, 0f70db1e598...

26 4cf4deeb7f61acb7b7db4ce0e57fb1e6 ... [1dbc51475cb04dafa4a8833a8378635e, c12b9ebd8b4...

27 61a042016835080f3d334560b13b0e35 ... [f422035f8b78417f98e4d116971cf9f3, c79d686eba0...

28 98f3970b31dfa1d7391cdaa453d6ade7 ... [5bd156c87ec44e19ae6f8f62e6e50b9d, c8e706fbdc9...

29 ebc403dd3df39bacc3443ef4afb7edfd ... [5a6c1d15424149f69052cd8d91fbff75, f9005e5c01b...

30 1cb66ea16e5e4f2816f0e188d3acc792 ... [40450f2c91944a81944621b94f190b49, 5b9fa6a9592...

31 bc606176c752984da6d202275ee8c7a6 ... [b84d71ed9c3b45819eb3205fd28e13a0, b0b464bc92a...

32 cd8a47ace09b9cee1e8b27b0689f2822 ... [40450f2c91944a81944621b94f190b49, 5d97ff82691...

33 f40e4b274b5e1a25afbff9ecb733e1f4 ... [004f40a5aeca48a1879db728eb12bcba, 4465efb7f6e...

34 19f8fd68a8dbc1bba7058e13ce3a2e3d ... [c79d686eba044c5586c706cdc096817d, da1684437ab...

35 0f9b4e5a7cfc0c3c89a8898a45383588 ... [da1684437ab04f23adac28ff70bd8429, 6768339b540...

36 6c362d3f8d01c76d84443dcabf3f322a ... [31a7e680c4d54101afe4c8d52d246913, adffed660d1...

37 04e5c071e4ee5496d5380662e1339f45 ... [31a7e680c4d54101afe4c8d52d246913, b7e9c9ef572...

38 2b5ecb7fba1301d1f3d307e194a6c435 ... [31a7e680c4d54101afe4c8d52d246913, 004f40a5aec...

39 aa8d2310a206001404282ddb3fd645aa ... [24652fab20d84381b112b8491de2887e, 36be44627ec...

40 0ddc17ea5e566006c000b4013f2181a5 ... [70634e10a5e845aa8c6a32fe7e8eb2b2, 04085f7cf46...

41 cd4234ed6caba8f15d09a2e3ee604b2a ... [70634e10a5e845aa8c6a32fe7e8eb2b2, d203efdbfb2...

[42 rows x 6 columns]

C:\Users\admin\anaconda3\envs\graphrag\lib\site-packages\datashaper\engine\verbs\convert.py:72: FutureWarning: errors='ignore' is

deprecated and will raise in a future version. Use to_datetime without passing `errors` and catch exceptions explicitly instead

datetime_column = pd.to_datetime(column, errors="ignore")

🚀 create_base_documents

id ... title

0 c305886e4aa2f6efcf64b57762777055 ... book.txt

[1 rows x 4 columns]

🚀 create_final_documents

id ... title

0 c305886e4aa2f6efcf64b57762777055 ... book.txt

[1 rows x 4 columns]

⠙ GraphRAG Indexer

├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00

├── create_base_text_units

├── create_base_extracted_entities

├── create_summarized_entities

├── create_base_entity_graph

├── create_final_entities

├── create_final_nodes

├── create_final_communities

├── join_text_units_to_entity_ids

├── create_final_relationships

├── join_text_units_to_relationship_ids

├── create_final_community_reports

├── create_final_text_units

├── create_base_documents

└── create_final_documents

🚀 All workflows completed successfully.

4. 使用查询引擎

现在，让我们使用这个数据集提出一些问题。

以下示例使用全局搜索提出高级问题：

python -m graphrag.query --root ragtest --method global "What are the top themes in this story?"

1	python -m graphrag.query --root ragtest --method global "What are the top themes in this story?"

输出为：

python -m graphrag.query --root ragtest --method global "What are the top themes in this story?"


INFO: Reading settings from ragtest\settings.yaml
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'llama3.1:70b', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Global Search Response: The story presents several prominent themes that intertwine to convey its moral and emotional messages. Below are the key themes identified:

### Transformation
The theme of transformation is central to the narrative, particularly through the character of Ebenezer Scrooge. His profound change from a miserly individual to a compassionate person is catalyzed by encounters with the spirits of Christmas. These encounters emphasize the importance of self-reflection and the potential for redemption, especially during the Christmas season [Data: Reports (14, 12, 13, 20, 22, +more); (11)].

### Generosity and Compassion
Generosity and compassion emerge as significant themes, illustrated by Scrooge's eventual decision to assist the Cratchit family and his changing attitude towards Christmas. The narrative underscores the impact of individual actions on the community and highlights the importance of caring for others, particularly during festive times [Data: Reports (16, 19, 12, 14, 15, +more)].

### Redemption
Redemption is a key theme, as Scrooge's journey illustrates the possibility of change and the importance of making amends for past mistakes. The story advocates for a collective responsibility to care for one another, reinforcing the idea that it is never too late to change one's ways [Data: Reports (14, 12, 13, 20, 22, +more)].

### Family and Human Connection
The importance of family and human connection is a recurring theme, evident in the relationships within the Cratchit family and Scrooge's interactions with his nephew Fred. The narrative emphasizes the value of love, support, and togetherness, particularly during Christmas [Data: Reports (16, 19, 12, 14, 15, +more)].

### Social Responsibility
Social responsibility and the struggles of the working class are highlighted through the character of Bob Cratchit and his family. This theme showcases the dire consequences of poverty and the need for empathy towards the less fortunate, which is particularly relevant in the context of Victorian society [Data: Reports (14, 16, 19, 12, 10, +more)].

### Conclusion
These themes collectively illustrate the moral fabric of the story, emphasizing the transformative power of compassion, the importance of family, and the necessity of social responsibility. The narrative serves as a reminder of the potential for change and the impact of individual actions on the broader community.

python -m graphrag.query --root ragtest --method global "What are the top themes in this story?"

INFO: Reading settings from ragtest\settings.yaml

creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'llama3.1:70b', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Global Search Response: The story presents several prominent themes that intertwine to convey its moral and emotional messages. Below are the key themes identified:

### Transformation

The theme of transformation is central to the narrative, particularly through the character of Ebenezer Scrooge. His profound change from a miserly individual to a compassionate person is catalyzed by encounters with the spirits of Christmas. These encounters emphasize the importance of self-reflection and the potential for redemption, especially during the Christmas season [Data: Reports (14, 12, 13, 20, 22, +more); (11)].

### Generosity and Compassion

Generosity and compassion emerge as significant themes, illustrated by Scrooge's eventual decision to assist the Cratchit family and his changing attitude towards Christmas. The narrative underscores the impact of individual actions on the community and highlights the importance of caring for others, particularly during festive times [Data: Reports (16, 19, 12, 14, 15, +more)].

### Redemption

Redemption is a key theme, as Scrooge's journey illustrates the possibility of change and the importance of making amends for past mistakes. The story advocates for a collective responsibility to care for one another, reinforcing the idea that it is never too late to change one's ways [Data: Reports (14, 12, 13, 20, 22, +more)].

### Family and Human Connection

The importance of family and human connection is a recurring theme, evident in the relationships within the Cratchit family and Scrooge's interactions with his nephew Fred. The narrative emphasizes the value of love, support, and togetherness, particularly during Christmas [Data: Reports (16, 19, 12, 14, 15, +more)].

### Social Responsibility

Social responsibility and the struggles of the working class are highlighted through the character of Bob Cratchit and his family. This theme showcases the dire consequences of poverty and the need for empathy towards the less fortunate, which is particularly relevant in the context of Victorian society [Data: Reports (14, 16, 19, 12, 10, +more)].

### Conclusion

These themes collectively illustrate the moral fabric of the story, emphasizing the transformative power of compassion, the importance of family, and the necessity of social responsibility. The narrative serves as a reminder of the potential for change and the impact of individual actions on the broader community.

以下示例使用本地搜索来询问有关特定字符的更具体的问题：

python -m graphrag.query --root ragtest --method local "Who is Scrooge, and what are his main relationships?"

1	python -m graphrag.query --root ragtest --method local "Who is Scrooge, and what are his main relationships?"

输出如下：

python -m graphrag.query --root ragtest --method local "Who is Scrooge, and what are his main relationships?"


INFO: Reading settings from ragtest\settings.yaml
[2024-07-29T12:23:43Z WARN  lance::dataset] No existing dataset at D:\GraphRAG\microsoft\ollama\lancedb\description_embedding.lance, it will be created
creating llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_chat", 'model': 'llama3.1:70b', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
creating embedding llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_embedding", 'model': 'nomic-embed-text', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'n': 1, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Local Search Response: ## Who is Scrooge?

Ebenezer Scrooge is the central character in Charles Dickens' classic novella "A Christmas Carol." Initially depicted as a miserly and cold-hearted businessman, Scrooge embodies greed and a lack of compassion, particularly towards the poor and his employees. He is known for his disdain for Christmas and the joy it brings to others, often isolating himself from family and community connections. Scrooge's character undergoes a profound transformation throughout the story, prompted by supernatural encounters with the Ghosts of Christmas Past, Present, and Yet to Come. These spirits guide him to reflect on his life choices, ultimately leading him to embrace generosity and compassion, particularly towards those he once neglected [Data: Entities (21, 4, 141); Relationships (46, 78, 90, 107, 58)].

## Main Relationships

### Bob Cratchit
One of the most significant relationships in Scrooge's life is with Bob Cratchit, his underpaid and overworked clerk. Initially, Scrooge treats Bob with indifference, reflecting his lack of empathy towards the struggles of the working class. However, as the story progresses, Scrooge's transformation leads him to recognize Bob's hardships, particularly concerning the health of Bob's son, Tiny Tim. This relationship highlights themes of social responsibility and the impact of individual actions on the lives of others. Ultimately, Scrooge raises Bob's salary and supports his family, marking a significant shift in his character [Data: Relationships (15, 89, 33)].

### Tiny Tim
Tiny Tim, Bob Cratchit's youngest son, serves as a poignant symbol of hope and compassion in the narrative. His frail health and optimistic spirit deeply affect Scrooge, prompting him to reconsider his attitudes towards kindness and generosity. Scrooge's concern for Tiny Tim's well-being becomes a crucial motivator for his transformation, emphasizing the interconnectedness of their fates. The relationship between Scrooge and Tiny Tim illustrates the potential for change and the importance of empathy in fostering human connections [Data: Relationships (89, 15)].

### The Ghosts of Christmas
Scrooge's encounters with the three spirits—Christmas Past, Present, and Yet to Come—are pivotal in his journey of self-discovery. Each ghost presents him with vivid experiences that compel him to confront his past mistakes, recognize the joy and struggles of those around him, and ultimately face the grim future that awaits him if he does not change. These supernatural relationships serve as catalysts for Scrooge's redemption, highlighting the importance of self-reflection and the potential for personal growth [Data: Relationships (27, 29, 31)].

### Jacob Marley
Jacob Marley, Scrooge's deceased business partner, plays a crucial role in initiating Scrooge's transformation. Marley appears as a ghost to warn Scrooge about the dire consequences of his miserly ways and the need for change. His spectral visit serves as a wake-up call for Scrooge, emphasizing the importance of compassion and the repercussions of a life focused solely on material wealth [Data: Relationships (22, 25)].

### Scrooge's Nephew
Scrooge's relationship with his nephew, Fred, further illustrates the contrast between Scrooge's initial misanthropy and the spirit of Christmas. Fred embodies joy and familial love, consistently inviting Scrooge to join in the Christmas festivities. Despite Scrooge's rejection of these invitations, Fred's unwavering kindness highlights the potential for connection and the importance of family during the holiday season [Data: Entities (28, 140)].

In summary, Scrooge's relationships with Bob Cratchit, Tiny Tim, the Ghosts of Christmas, Jacob Marley, and his nephew Fred are central to his character development and the overarching themes of "A Christmas Carol." These connections illustrate the transformative power of compassion, generosity, and human connection, ultimately leading to Scrooge's redemption.

python -m graphrag.query --root ragtest --method local "Who is Scrooge, and what are his main relationships?"

INFO: Reading settings from ragtest\settings.yaml

[2024-07-29T12:23:43Z WARN lance::dataset] No existing dataset at D:\GraphRAG\microsoft\ollama\lancedb\description_embedding.lance, it will be created

creating embedding llm client with {'api_key': 'REDACTED,len=56', 'type': "openai_embedding", 'model': 'nomic-embed-text', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'n': 1, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}

SUCCESS: Local Search Response: ## Who is Scrooge?

Ebenezer Scrooge is the central character in Charles Dickens' classic novella "A Christmas Carol." Initially depicted as a miserly and cold-hearted businessman, Scrooge embodies greed and a lack of compassion, particularly towards the poor and his employees. He is known for his disdain for Christmas and the joy it brings to others, often isolating himself from family and community connections. Scrooge's character undergoes a profound transformation throughout the story, prompted by supernatural encounters with the Ghosts of Christmas Past, Present, and Yet to Come. These spirits guide him to reflect on his life choices, ultimately leading him to embrace generosity and compassion, particularly towards those he once neglected [Data: Entities (21, 4, 141); Relationships (46, 78, 90, 107, 58)].

## Main Relationships

### Bob Cratchit

One of the most significant relationships in Scrooge's life is with Bob Cratchit, his underpaid and overworked clerk. Initially, Scrooge treats Bob with indifference, reflecting his lack of empathy towards the struggles of the working class. However, as the story progresses, Scrooge's transformation leads him to recognize Bob's hardships, particularly concerning the health of Bob's son, Tiny Tim. This relationship highlights themes of social responsibility and the impact of individual actions on the lives of others. Ultimately, Scrooge raises Bob's salary and supports his family, marking a significant shift in his character [Data: Relationships (15, 89, 33)].

### Tiny Tim

Tiny Tim, Bob Cratchit's youngest son, serves as a poignant symbol of hope and compassion in the narrative. His frail health and optimistic spirit deeply affect Scrooge, prompting him to reconsider his attitudes towards kindness and generosity. Scrooge's concern for Tiny Tim's well-being becomes a crucial motivator for his transformation, emphasizing the interconnectedness of their fates. The relationship between Scrooge and Tiny Tim illustrates the potential for change and the importance of empathy in fostering human connections [Data: Relationships (89, 15)].

### The Ghosts of Christmas

Scrooge's encounters with the three spirits—Christmas Past, Present, and Yet to Come—are pivotal in his journey of self-discovery. Each ghost presents him with vivid experiences that compel him to confront his past mistakes, recognize the joy and struggles of those around him, and ultimately face the grim future that awaits him if he does not change. These supernatural relationships serve as catalysts for Scrooge's redemption, highlighting the importance of self-reflection and the potential for personal growth [Data: Relationships (27, 29, 31)].

### Jacob Marley

Jacob Marley, Scrooge's deceased business partner, plays a crucial role in initiating Scrooge's transformation. Marley appears as a ghost to warn Scrooge about the dire consequences of his miserly ways and the need for change. His spectral visit serves as a wake-up call for Scrooge, emphasizing the importance of compassion and the repercussions of a life focused solely on material wealth [Data: Relationships (22, 25)].

### Scrooge's Nephew

Scrooge's relationship with his nephew, Fred, further illustrates the contrast between Scrooge's initial misanthropy and the spirit of Christmas. Fred embodies joy and familial love, consistently inviting Scrooge to join in the Christmas festivities. Despite Scrooge's rejection of these invitations, Fred's unwavering kindness highlights the potential for connection and the importance of family during the holiday season [Data: Entities (28, 140)].

In summary, Scrooge's relationships with Bob Cratchit, Tiny Tim, the Ghosts of Christmas, Jacob Marley, and his nephew Fred are central to his character development and the overarching themes of "A Christmas Carol." These connections illustrate the transformative power of compassion, generosity, and human connection, ultimately leading to Scrooge's redemption.

1. 创建 GraphRAG 环境

2. 安装 GraphRAG 和 ollama

2.1 安装 GraphRAG

2.1 安装 ollama

2.2 下载下面需要使用的模型

3. 运行索引器

3.1 创建目录

3.2 下载文件

3.3 设置工作区变量

3.4 运行索引 pipeline

4. 使用查询引擎

相关文章

发表评论 取消回复

发表评论取消回复