autorag.nodes.passagecompressor package¶
Submodules¶
autorag.nodes.passagecompressor.base module¶
autorag.nodes.passagecompressor.longllmlingua module¶
- autorag.nodes.passagecompressor.longllmlingua.llmlingua_pure(query: str, contents: List[str], llm_lingua: PromptCompressor, instructions: str, target_token: int = 300, **kwargs) str [source]¶
Return the compressed text.
- Parameters:
query – The query for retrieved passages.
contents – The contents of retrieved passages.
llm_lingua – The PromptCompressor instance that will be used to compress.
instructions – The instructions for compression.
target_token – The target token count for compression. Default is 300.
kwargs – Additional keyword arguments.
- Returns:
The compressed text.
- autorag.nodes.passagecompressor.longllmlingua.longllmlingua(queries: List[str], contents: List[List[str]], scores, ids, model_name: str = 'NousResearch/Llama-2-7b-hf', instructions: str | None = None, target_token: int = 300, **kwargs) List[str] [source]¶
Compresses the retrieved texts using LongLLMLingua. For more information, visit https://github.com/microsoft/LLMLingua.
- Parameters:
queries – The queries for retrieved passages.
contents – The contents of retrieved passages.
scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.
ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.
model_name – The model name to use for compression. Default is “NousResearch/Llama-2-7b-hf”.
instructions – The instructions for compression. Default is None; when it is None, the default instructions are used.
target_token – The target token count for compression. Default is 300.
kwargs – Additional keyword arguments.
- Returns:
The list of compressed texts.
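The call shape can be illustrated with a stand-in compressor. Note this is a minimal sketch: the real longllmlingua requires the llmlingua package and a local model, and uses perplexity-based token pruning; the whitespace-truncation stub below is a hypothetical placeholder used only to show the input and output shapes.

```python
from typing import List

def truncate_compressor(query: str, contents: List[str], target_token: int = 300) -> str:
    # Hypothetical stand-in: join the passages and truncate to target_token
    # whitespace tokens. The real LongLLMLingua prunes tokens by perplexity.
    tokens = " ".join(contents).split()
    return " ".join(tokens[:target_token])

queries = ["What is AutoRAG?"]
contents = [["AutoRAG automates RAG pipeline tuning.", "It evaluates many modules."]]

# scores and ids are unused by the compressor, so empty lists would be fine.
compressed: List[str] = [
    truncate_compressor(q, c, target_token=300) for q, c in zip(queries, contents)
]
```

As in the real function, one compressed string is produced per query, so the output list has the same length as `queries`.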
autorag.nodes.passagecompressor.pass_compressor module¶
autorag.nodes.passagecompressor.refine module¶
- autorag.nodes.passagecompressor.refine.refine(queries: List[str], contents: List[List[str]], scores, ids, llm: LLM, prompt: str | None = None, chat_prompt: str | None = None, batch: int = 16) List[str] [source]¶
Refine a response to a query across text chunks. This function is a wrapper for llama_index.response_synthesizers.Refine. For more information, visit https://docs.llamaindex.ai/en/stable/examples/response_synthesizers/refine/.
- Parameters:
queries – The queries for retrieved passages.
contents – The contents of retrieved passages.
scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.
ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.
llm – The llm instance that will be used to summarize.
prompt – The prompt template for refine. If you want to use a chat prompt, pass chat_prompt instead. In prompt, you must specify where to put ‘context_msg’ and ‘query_str’. Default is None; when it is None, the LlamaIndex default prompt is used.
chat_prompt – The chat prompt template for refine. If you want to use a normal prompt, pass prompt instead. In chat_prompt, you must specify where to put ‘context_msg’ and ‘query_str’. Default is None; when it is None, the LlamaIndex default chat prompt is used.
batch – The batch size for the llm. Lower it if you encounter errors. Default is 16.
- Returns:
The list of compressed texts.
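A custom prompt must contain both placeholders. A minimal sketch of how such a template is filled, using plain str.format as a stand-in for LlamaIndex's PromptTemplate (the example question and context are illustrative only):

```python
# A refine prompt must contain both the 'context_msg' and 'query_str' slots.
refine_prompt = (
    "The original question is: {query_str}\n"
    "We have some new context: {context_msg}\n"
    "Refine the existing answer using this context."
)

filled = refine_prompt.format(
    query_str="What is passage compression?",
    context_msg="Passage compression shortens retrieved text before generation.",
)
```

If either placeholder is missing from the template, the refine step has no slot for the query or the passage text, so the compressed output will not reflect them.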
autorag.nodes.passagecompressor.run module¶
- autorag.nodes.passagecompressor.run.evaluate_passage_compressor_node(result_df: DataFrame, metric_inputs: List[MetricInput], metrics: List[str])[source]¶
- autorag.nodes.passagecompressor.run.run_passage_compressor_node(modules: List[Callable], module_params: List[Dict], previous_result: DataFrame, node_line_dir: str, strategies: Dict) DataFrame [source]¶
Run evaluation and select the best module among passage compressor modules.
- Parameters:
modules – Passage compressor modules to run.
module_params – Passage compressor module parameters.
previous_result – Previous result dataframe. It can be the result of retrieval or reranker modules, and it must contain the ‘query’, ‘retrieved_contents’, ‘retrieved_ids’, and ‘retrieve_scores’ columns.
node_line_dir – This node line’s directory.
strategies – Strategies for the passage compressor node. You can skip evaluation when you use only one module with one module parameter.
- Returns:
The best result dataframe with the previous result columns. This node replaces ‘retrieved_contents’ with the compressed passages, so each row’s retrieved contents list will have length one.
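The shape that previous_result must have can be sketched with plain dicts standing in for the pandas DataFrame (the example values are illustrative only):

```python
# Columns the passage compressor node expects from the previous node's result.
REQUIRED_COLUMNS = {"query", "retrieved_contents", "retrieved_ids", "retrieve_scores"}

previous_result = {
    "query": ["What is AutoRAG?"],
    "retrieved_contents": [["AutoRAG automates RAG pipeline tuning."]],
    "retrieved_ids": [["doc-0"]],
    "retrieve_scores": [[0.87]],
}

# A result missing any of these columns cannot be passed to this node.
missing = REQUIRED_COLUMNS - set(previous_result)
assert not missing, f"previous_result is missing columns: {missing}"
```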
autorag.nodes.passagecompressor.tree_summarize module¶
- autorag.nodes.passagecompressor.tree_summarize.tree_summarize(queries: List[str], contents: List[List[str]], scores, ids, llm: LLM, prompt: str | None = None, chat_prompt: str | None = None, batch: int = 16) List[str] [source]¶
Recursively merges retrieved texts and summarizes them in a bottom-up fashion. This function is a wrapper for llama_index.response_synthesizers.TreeSummarize. For more information, visit https://docs.llamaindex.ai/en/latest/examples/response_synthesizers/tree_summarize.html.
- Parameters:
queries – The queries for retrieved passages.
contents – The contents of retrieved passages.
scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.
ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.
llm – The llm instance that will be used to summarize.
prompt – The prompt template for summarization. If you want to use a chat prompt, pass chat_prompt instead. In prompt, you must specify where to put ‘context_str’ and ‘query_str’. Default is None; when it is None, the LlamaIndex default prompt is used.
chat_prompt – The chat prompt template for summarization. If you want to use a normal prompt, pass prompt instead. In chat_prompt, you must specify where to put ‘context_str’ and ‘query_str’. Default is None; when it is None, the LlamaIndex default chat prompt is used.
batch – The batch size for the llm. Lower it if you encounter errors. Default is 16.
- Returns:
The list of compressed texts.
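The bottom-up merge can be sketched with a toy summarizer. This is a minimal sketch only: the join-and-truncate function below is a hypothetical stand-in for the LLM call that the real TreeSummarize makes at every merge step.

```python
from typing import List

def toy_summarize(chunks: List[str]) -> str:
    # Hypothetical stand-in for an LLM summarization call.
    return " / ".join(chunks)[:200]

def tree_summarize_sketch(chunks: List[str], fanout: int = 2) -> str:
    # Merge `fanout` chunks at a time, level by level,
    # until a single summary remains (the bottom-up tree).
    while len(chunks) > 1:
        chunks = [
            toy_summarize(chunks[i:i + fanout])
            for i in range(0, len(chunks), fanout)
        ]
    return chunks[0]

summary = tree_summarize_sketch(["chunk a", "chunk b", "chunk c"])
```

Each while-loop iteration corresponds to one level of the summarization tree; with three chunks and a fanout of 2, the first level produces two intermediate summaries and the second level merges them into one.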