autorag.nodes.passagecompressor package

Submodules

autorag.nodes.passagecompressor.base module

autorag.nodes.passagecompressor.base.make_llm(llm_name: str, kwargs: Dict) → LLM[source]
autorag.nodes.passagecompressor.base.passage_compressor_node(func)[source]
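
A minimal usage sketch for make_llm. The LLM name "openai" and the forwarding of the kwargs dict to the LlamaIndex constructor are assumptions, not confirmed by this reference:

    from autorag.nodes.passagecompressor.base import make_llm

    # Assumption: "openai" is a registered name that AutoRAG resolves to a
    # LlamaIndex LLM class, and kwargs is forwarded to its constructor.
    llm = make_llm("openai", {"model": "gpt-4o-mini", "temperature": 0.0})
    print(llm.complete("Hello"))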

autorag.nodes.passagecompressor.longllmlingua module

autorag.nodes.passagecompressor.longllmlingua.llmlingua_pure(query: str, contents: List[str], llm_lingua: PromptCompressor, instructions: str, target_token: int = 300, **kwargs) → str[source]

Return the compressed text.

Parameters:
  • query – The query for retrieved passages.

  • contents – The contents of retrieved passages.

  • llm_lingua – The LLMLingua PromptCompressor instance that will be used to compress the passages.

  • instructions – The instructions for compression.

  • target_token – The target token for compression. Default is 300.

  • kwargs – Additional keyword arguments.

Returns:

The compressed text.
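
A hedged sketch of calling llmlingua_pure directly, assuming the llmlingua package is installed and that its PromptCompressor loads the model named below; the instruction text and passages are illustrative:

    from llmlingua import PromptCompressor

    from autorag.nodes.passagecompressor.longllmlingua import llmlingua_pure

    # Load the compression model (a GPU is recommended for Llama-2-7b).
    compressor = PromptCompressor(model_name="NousResearch/Llama-2-7b-hf")

    compressed = llmlingua_pure(
        query="What is AutoRAG?",
        contents=["Passage one ...", "Passage two ..."],
        llm_lingua=compressor,
        instructions="Compress the passages so the query can still be answered.",
        target_token=300,
    )
    print(compressed)  # a single compressed string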

autorag.nodes.passagecompressor.longllmlingua.longllmlingua(queries: List[str], contents: List[List[str]], scores, ids, model_name: str = 'NousResearch/Llama-2-7b-hf', instructions: str | None = None, target_token: int = 300, **kwargs) → List[str][source]

Compresses the retrieved texts using LongLLMLingua. For more information, visit https://github.com/microsoft/LLMLingua.

Parameters:
  • queries – The queries for retrieved passages.

  • contents – The contents of retrieved passages.

  • scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.

  • ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.

  • model_name – The model name to use for compression. Default is “NousResearch/Llama-2-7b-hf”.

  • instructions – The instructions for compression. Default is None; when None, default instructions are used.

  • target_token – The target token for compression. Default is 300.

  • kwargs – Additional keyword arguments.

Returns:

The list of compressed texts.
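
A hedged usage sketch for the wrapper, assuming it is callable with the documented signature; per the parameter notes above, scores and ids are unused, so empty lists are passed:

    from autorag.nodes.passagecompressor.longllmlingua import longllmlingua

    queries = ["What is AutoRAG?"]
    contents = [["Passage one ...", "Passage two ..."]]  # one list per query

    # scores and ids are not used by this function, so empty lists suffice.
    compressed = longllmlingua(queries, contents, scores=[], ids=[])
    print(compressed[0])  # one compressed string per query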

autorag.nodes.passagecompressor.pass_compressor module

autorag.nodes.passagecompressor.pass_compressor.pass_compressor(contents: List[List[str]])[source]

Do not perform any passage compression; the retrieved contents pass through unchanged.

autorag.nodes.passagecompressor.refine module

autorag.nodes.passagecompressor.refine.refine(queries: List[str], contents: List[List[str]], scores, ids, llm: LLM, prompt: str | None = None, chat_prompt: str | None = None, batch: int = 16) → List[str][source]

Refine a response to a query across text chunks. This function is a wrapper for llama_index.response_synthesizers.Refine. For more information, visit https://docs.llamaindex.ai/en/stable/examples/response_synthesizers/refine/.

Parameters:
  • queries – The queries for retrieved passages.

  • contents – The contents of retrieved passages.

  • scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.

  • ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.

  • llm – The LLM instance that will be used to refine the response.

  • prompt – The prompt template for refine. If you want to use a chat prompt, pass chat_prompt instead. In the prompt, you must specify where to put ‘context_msg’ and ‘query_str’. Default is None; when None, the LlamaIndex default prompt is used.

  • chat_prompt – The chat prompt template for refine. If you want to use a normal prompt, pass prompt instead. In the chat prompt, you must specify where to put ‘context_msg’ and ‘query_str’. Default is None; when None, the LlamaIndex default chat prompt is used.

  • batch – The batch size for the LLM. Set it lower if you encounter errors. Default is 16.

Returns:

The list of compressed texts.
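
A minimal sketch, assuming a LlamaIndex OpenAI LLM (the import path varies by llama_index version) and a custom prompt containing the required ‘context_msg’ and ‘query_str’ placeholders; the model name is illustrative:

    from llama_index.llms.openai import OpenAI  # llama_index >= 0.10 layout

    from autorag.nodes.passagecompressor.refine import refine

    llm = OpenAI(model="gpt-4o-mini")
    prompt = (
        "Query: {query_str}\n"
        "Context: {context_msg}\n"
        "Refine the existing answer using only the context above."
    )

    compressed = refine(
        queries=["What is AutoRAG?"],
        contents=[["Passage one ...", "Passage two ..."]],
        scores=[],  # unused
        ids=[],  # unused
        llm=llm,
        prompt=prompt,
        batch=4,  # lowered from the default of 16 to avoid rate limits
    )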

autorag.nodes.passagecompressor.run module

autorag.nodes.passagecompressor.run.evaluate_passage_compressor_node(result_df: DataFrame, metric_inputs: List[MetricInput], metrics: List[str])[source]
autorag.nodes.passagecompressor.run.run_passage_compressor_node(modules: List[Callable], module_params: List[Dict], previous_result: DataFrame, node_line_dir: str, strategies: Dict) → DataFrame[source]

Run evaluation and select the best module among passage compressor modules.

Parameters:
  • modules – Passage compressor modules to run.

  • module_params – Passage compressor module parameters.

  • previous_result – The previous result dataframe, which can come from retrieval or reranker modules; it must contain ‘query’, ‘retrieved_contents’, ‘retrieved_ids’, and ‘retrieve_scores’ columns.

  • node_line_dir – This node line’s directory.

  • strategies – Strategies for the passage compressor node. You can skip evaluation when you use only one module with a single module parameter (a sketch of the strategies shape follows this entry).

Returns:

The best result dataframe, including the previous result columns. This node replaces ‘retrieved_contents’ with the compressed passages, so each row’s ‘retrieved_contents’ list will have length one.
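
A hedged sketch of the strategies shape referenced above. The metric names are AutoRAG token-level metrics; the ‘speed_threshold’ key is an assumption carried over from other AutoRAG nodes:

    # Hypothetical strategies dict for run_passage_compressor_node.
    # "metrics" selects the evaluation metrics for the compressed passages;
    # "speed_threshold" (assumed) drops modules slower than the given seconds.
    strategies = {
        "metrics": ["retrieval_token_f1", "retrieval_token_recall"],
        "speed_threshold": 10,
    }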

autorag.nodes.passagecompressor.tree_summarize module

autorag.nodes.passagecompressor.tree_summarize.tree_summarize(queries: List[str], contents: List[List[str]], scores, ids, llm: LLM, prompt: str | None = None, chat_prompt: str | None = None, batch: int = 16) → List[str][source]

Recursively merges retrieved texts and summarizes them in a bottom-up fashion. This function is a wrapper for llama_index.response_synthesizers.TreeSummarize. For more information, visit https://docs.llamaindex.ai/en/latest/examples/response_synthesizers/tree_summarize.html.

Parameters:
  • queries – The queries for retrieved passages.

  • contents – The contents of retrieved passages.

  • scores – The scores of retrieved passages. Not used in this function, so you can pass an empty list.

  • ids – The ids of retrieved passages. Not used in this function, so you can pass an empty list.

  • llm – The llm instance that will be used to summarize.

  • prompt – The prompt template for summarization. If you want to use a chat prompt, pass chat_prompt instead. In the prompt, you must specify where to put ‘context_str’ and ‘query_str’. Default is None; when None, the LlamaIndex default prompt is used.

  • chat_prompt – The chat prompt template for summarization. If you want to use a normal prompt, pass prompt instead. In the chat prompt, you must specify where to put ‘context_str’ and ‘query_str’. Default is None; when None, the LlamaIndex default chat prompt is used.

  • batch – The batch size for the LLM. Set it lower if you encounter errors. Default is 16.

Returns:

The list of compressed texts.
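
A minimal sketch mirroring the refine example above, using the ‘context_str’ placeholder this node expects (import path and model name are assumptions):

    from llama_index.llms.openai import OpenAI

    from autorag.nodes.passagecompressor.tree_summarize import tree_summarize

    llm = OpenAI(model="gpt-4o-mini")
    prompt = (
        "Context: {context_str}\n"
        "Question: {query_str}\n"
        "Summarize the context so it still answers the question."
    )

    compressed = tree_summarize(
        queries=["What is AutoRAG?"],
        contents=[["Passage one ...", "Passage two ..."]],
        scores=[],  # unused
        ids=[],  # unused
        llm=llm,
        prompt=prompt,
    )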

Module contents