Llama Index Chunk
Chunk the parsed results using Llama Index node parsers and text splitters.
Available Chunk Methods
1. Token
2. Sentence
3. Window
4. Semantic
5. Simple
Example YAML
modules:
  - module_type: llama_index_chunk
    chunk_method: [ Token, Sentence ]
    chunk_size: [ 1024, 512 ]
    chunk_overlap: 24
    add_file_name: english
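To make chunk_size and chunk_overlap concrete, here is a minimal sketch of overlapping token chunking. It is not the Llama Index implementation: the function name chunk_tokens is hypothetical, and tokens are assumed to be whitespace-separated words rather than tokenizer output.

```python
from typing import List


def chunk_tokens(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into chunks of at most chunk_size tokens.

    Consecutive chunks share chunk_overlap tokens. Tokens are plain
    whitespace-separated words here (an assumption for illustration;
    a real splitter would use a tokenizer).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Ten tokens, windows of four tokens, one token shared between neighbors.
words = " ".join(f"w{i}" for i in range(10))
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With the values from the YAML above, a chunk_size of 1024 and a chunk_overlap of 24 would advance the window by 1000 tokens per chunk under this scheme.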
Using a Llama Index Chunk Method That Is Not in the Available Chunk Methods
For more information about the chunk methods that Llama Index provides, see the Llama Index documentation.
How to Use
To use a parser that is not among the available chunk methods, such as HTMLNodeParser, register it in chunk_modules with the following code.
from autorag.data import chunk_modules
from llama_index.core.node_parser import HTMLNodeParser

# Register the parser class under a lowercase key.
chunk_modules["html"] = HTMLNodeParser
Attention
Keys in chunk_modules must always be written in lowercase.
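One plausible reason for the lowercase rule is that the configured method name is lowercased before the lookup. The sketch below assumes that behavior. It uses a hypothetical registry dict and a stand-in parser class rather than the real chunk_modules, so treat it as an illustration, not a description of the library's internals.

```python
from typing import Dict


class HypotheticalParser:
    """Stand-in for a parser class such as HTMLNodeParser."""


# Hypothetical registry playing the role of chunk_modules.
registry: Dict[str, type] = {}
registry["html"] = HypotheticalParser      # lowercase key: can be found
registry["Markup"] = HypotheticalParser    # mixed-case key: never matched


def resolve(method_name: str) -> type:
    """Look up a parser class by name.

    Assumption: the name from the config is lowercased before the
    lookup, so only lowercase keys can ever match.
    """
    key = method_name.lower()
    if key not in registry:
        raise KeyError(f"no chunk method registered under '{key}'")
    return registry[key]


# Any casing in the config resolves to the lowercase key.
resolved = resolve("HTML")
print(resolved.__name__)

# The mixed-case key is unreachable under this lookup scheme.
try:
    resolve("Markup")
except KeyError as err:
    print(err)
```

Under this assumption, a key registered with any uppercase letter would be silently unreachable, which is why sticking to lowercase keys avoids the problem.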