autorag.nodes.queryexpansion package
Submodules
autorag.nodes.queryexpansion.base module
autorag.nodes.queryexpansion.hyde module
- autorag.nodes.queryexpansion.hyde.hyde(queries: List[str], generator_func: Callable, generator_params: Dict, prompt: str = 'Please write a passage to answer the question') -> List[List[str]]
HyDE, inspired by “Precise Zero-shot Dense Retrieval without Relevance Labels” (https://arxiv.org/pdf/2212.10496.pdf). An LLM generates a hypothetical passage for each query, and that passage is then used as the query for retrieval.
- Parameters:
queries – List[str], queries to retrieve.
generator_func – Callable, generator function.
generator_params – Dict, generator parameters.
prompt – str, prompt to use when generating the hypothetical passage.
- Returns:
List[List[str]], list of HyDE results.
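The flow described above can be sketched as follows. This is a minimal illustration, not AutoRAG's implementation: `hyde_sketch`, `fake_generate`, the prompt template, and the assumption that the generator takes a list of prompts and returns a list of texts are all hypothetical.

```python
from typing import Callable, Dict, List

DEFAULT_PROMPT = "Please write a passage to answer the question"


def hyde_sketch(
    queries: List[str],
    generate: Callable[..., List[str]],  # hypothetical interface: prompts in, texts out
    generator_params: Dict,
    prompt: str = DEFAULT_PROMPT,
) -> List[List[str]]:
    # Build one generation prompt per query (this template is an assumption).
    full_prompts = [f"{prompt}\nQuestion: {q}\nPassage:" for q in queries]
    passages = generate(full_prompts, **generator_params)
    # Each query maps to a one-element list holding its hypothetical passage.
    return [[passage] for passage in passages]


def fake_generate(prompts: List[str], **params) -> List[str]:
    # Stand-in for an LLM call so the sketch runs offline.
    return [f"(hypothetical passage for prompt #{i})" for i, _ in enumerate(prompts)]


result = hyde_sketch(
    ["What is HyDE?", "Who wrote Hamlet?"],
    fake_generate,
    {"temperature": 0.2},
)
```

The returned passages, rather than the original questions, would then be passed to the retriever.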
autorag.nodes.queryexpansion.multi_query_expansion module
- autorag.nodes.queryexpansion.multi_query_expansion.get_multi_query_expansion(query: str, answer: str) -> List[str]
- autorag.nodes.queryexpansion.multi_query_expansion.multi_query_expansion(queries: List[str], generator_func: Callable, generator_params: Dict, prompt: str = 'You are an AI language model assistant.\n Your task is to generate 3 different versions of the given user\n question to retrieve relevant documents from a vector database.\n By generating multiple perspectives on the user question,\n your goal is to help the user overcome some of the limitations\n of distance-based similarity search. Provide these alternative\n questions separated by newlines. Original question: {question}') -> List[List[str]]
Expand a list of queries using a multi-query expansion approach. The LLM generates 3 different versions of each input query.
- Parameters:
queries – List[str], queries to expand.
generator_func – Callable, generator function.
generator_params – Dict, generator parameters.
prompt – str, prompt to use for multi-query expansion. The default prompt comes from the default query prompt of LangChain's MultiQueryRetriever.
- Returns:
List[List[str]], list of expanded queries.
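As a rough sketch of how the `{question}` placeholder and the newline-separated answer fit together, consider the following. The helpers `build_prompt` and `split_expansions` are hypothetical names, the prompt is a shortened stand-in, and keeping the original query as the first entry is an assumption of this sketch rather than documented behavior.

```python
from typing import List

# Shortened stand-in for the default prompt; only the {question} slot matters here.
PROMPT = "Generate 3 different versions of the question. Original question: {question}"


def build_prompt(query: str, prompt: str = PROMPT) -> str:
    # Fill the {question} placeholder with the user query.
    return prompt.format(question=query)


def split_expansions(query: str, answer: str) -> List[str]:
    # Assumption: keep the original query first, then one entry per non-empty line.
    variants = [line.strip() for line in answer.split("\n") if line.strip()]
    return [query] + variants


answer = "Who authored Hamlet?\nWhich playwright wrote Hamlet?\n\nHamlet was written by whom?"
expanded = split_expansions("Who wrote Hamlet?", answer)
```

Dropping empty lines guards against models that separate the alternatives with blank lines.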
autorag.nodes.queryexpansion.pass_query_expansion module
autorag.nodes.queryexpansion.query_decompose module
- autorag.nodes.queryexpansion.query_decompose.get_query_decompose(query: str, answer: str) -> List[str]
Decompose a query into smaller sub-questions.
- Parameters:
query – str, query to decompose.
answer – str, answer from the query_decompose function.
- Returns:
List[str], list of decomposed queries. Returns the input query if the query is not decomposable.
- autorag.nodes.queryexpansion.query_decompose.query_decompose(queries: List[str], generator_func: Callable, generator_params: Dict, prompt: str = 'Decompose a question in self-contained sub-questions. Use "The question needs no decomposition" when no decomposition is needed.\n\n Example 1:\n\n Question: Is Hamlet more common on IMDB than Comedy of Errors?\n Decompositions:\n 1: How many listings of Hamlet are there on IMDB?\n 2: How many listing of Comedy of Errors is there on IMDB?\n\n Example 2:\n\n Question: Are birds important to badminton?\n\n Decompositions:\n The question needs no decomposition\n\n Example 3:\n\n Question: Is it legal for a licensed child driving Mercedes-Benz to be employed in US?\n\n Decompositions:\n 1: What is the minimum driving age in the US?\n 2: What is the minimum age for someone to be employed in the US?\n\n Example 4:\n\n Question: Are all cucumbers the same texture?\n\n Decompositions:\n The question needs no decomposition\n\n Example 5:\n\n Question: Hydrogen\'s atomic number squared exceeds number of Spice Girls?\n\n Decompositions:\n 1: What is the atomic number of hydrogen?\n 2: How many Spice Girls are there?\n\n Example 6:\n\n Question: {question}\n\n Decompositions:\n ') -> List[List[str]]
Decompose each query into smaller sub-questions.
- Parameters:
queries – List[str], queries to decompose.
generator_func – Callable, generator function.
generator_params – Dict, generator parameters.
prompt – str, prompt to use for query decomposition. The default prompt comes from Visconde’s StrategyQA few-shot prompt.
- Returns:
List[List[str]], list of decomposed queries. Returns the input query if the query is not decomposable.
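A minimal sketch of turning an LLM answer in the few-shot format above into a list of sub-questions might look like this. `parse_decomposition` is a hypothetical helper; the regular expression and the fallback for unparseable answers are assumptions, not the library's actual parsing logic.

```python
import re
from typing import List

NO_DECOMPOSITION = "the question needs no decomposition"


def parse_decomposition(query: str, answer: str) -> List[str]:
    # The prompt asks the model to emit this sentence when no split is needed.
    if NO_DECOMPOSITION in answer.lower():
        return [query]
    sub_questions = []
    for line in answer.splitlines():
        # Match numbered lines such as "1: What is the atomic number of hydrogen?"
        match = re.match(r"^\s*\d+\s*[:.]\s*(.+?)\s*$", line)
        if match:
            sub_questions.append(match.group(1))
    # Assumption: fall back to the original query when nothing could be parsed.
    return sub_questions or [query]


decomposed = parse_decomposition(
    "Hydrogen's atomic number squared exceeds number of Spice Girls?",
    "1: What is the atomic number of hydrogen?\n2: How many Spice Girls are there?",
)
```

Returning the original query in both the "no decomposition" case and the unparseable case keeps the output shape uniform for the retrieval step.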
autorag.nodes.queryexpansion.run module
- autorag.nodes.queryexpansion.run.evaluate_one_query_expansion_node(retrieval_funcs: List[Callable], retrieval_params: List[Dict], metric_inputs: List[MetricInput], metrics: List[str], project_dir, previous_result: DataFrame, strategy_name: str) -> DataFrame
- autorag.nodes.queryexpansion.run.make_retrieval_callable_params(strategy_dict: Dict)
strategy_dict looks like this:
{
    "metrics": ["retrieval_f1", "retrieval_recall"],
    "top_k": 50,
    "retrieval_modules": [
        {"module_type": "bm25"},
        {"module_type": "vectordb", "embedding_model": ["openai", "huggingface"]}
    ]
}
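One plausible way to read such a dictionary is to expand every list-valued option into separate retrieval configurations. The sketch below illustrates that idea only: `expand_retrieval_modules` is a hypothetical helper, its output format is not the return value of `make_retrieval_callable_params`, and the fallback `top_k` of 10 is an assumption.

```python
from itertools import product
from typing import Dict, List


def expand_retrieval_modules(strategy_dict: Dict) -> List[Dict]:
    top_k = strategy_dict.get("top_k", 10)  # fallback value is an assumption
    # Fall back to bm25 when no retrieval modules are configured.
    modules = strategy_dict.get("retrieval_modules", [{"module_type": "bm25"}])
    combos = []
    for module in modules:
        keys = [k for k in module if k != "module_type"]
        # Wrap scalar values in a list so every option can be iterated.
        values = [module[k] if isinstance(module[k], list) else [module[k]] for k in keys]
        for choice in product(*values):
            combos.append(
                {"module_type": module["module_type"], "top_k": top_k, **dict(zip(keys, choice))}
            )
    return combos


strategy_dict = {
    "metrics": ["retrieval_f1", "retrieval_recall"],
    "top_k": 50,
    "retrieval_modules": [
        {"module_type": "bm25"},
        {"module_type": "vectordb", "embedding_model": ["openai", "huggingface"]},
    ],
}
combos = expand_retrieval_modules(strategy_dict)
```

For the example dictionary this yields three configurations: one for bm25 and one per embedding model for vectordb.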
- autorag.nodes.queryexpansion.run.run_query_expansion_node(modules: List[Callable], module_params: List[Dict], previous_result: DataFrame, node_line_dir: str, strategies: Dict) -> DataFrame
Run evaluation and select the best module among the query expansion node results. First, retrieval is run using expanded_queries, the output of each query expansion module. The retrieval modules are taken from retrieval_modules in strategies. If several retrieval_modules are given, all of them are run and the best result is kept. If no retrieval_modules are given, bm25 is used by default. In this way, the best retrieval result is selected for each query expansion module, and the best module is then selected among those results.
- Parameters:
modules – Query expansion modules to run.
module_params – Query expansion module parameters.
previous_result – Previous result DataFrame. For this node, it is the QA data.
node_line_dir – This node line’s directory.
strategies – Strategies for query expansion node.
- Returns:
The best result dataframe.
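The two-stage selection described for run_query_expansion_node can be sketched in plain Python. This is an illustration under assumptions: `pick_best`, the nested score dictionary, and the use of the mean of a single metric as the selection rule are all hypothetical, and the real function works on DataFrames and the configured strategies.

```python
from statistics import mean
from typing import Dict, List


def pick_best(scores: Dict[str, Dict[str, List[float]]]) -> str:
    # scores[expansion_module][retrieval_module] holds per-query metric values.
    # Stage 1: for each expansion module, keep its best retrieval result.
    best_per_module = {
        module: max(mean(values) for values in by_retrieval.values())
        for module, by_retrieval in scores.items()
    }
    # Stage 2: choose the expansion module with the highest of those scores.
    return max(best_per_module, key=best_per_module.get)


scores = {
    "hyde": {"bm25": [0.2, 0.4], "vectordb": [0.6, 0.8]},
    "query_decompose": {"bm25": [0.5, 0.5]},
}
best_module = pick_best(scores)
```

Here hyde wins because its best retrieval run (vectordb) averages higher than the only run for query_decompose.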