autorag.nodes.queryexpansion package

Submodules

autorag.nodes.queryexpansion.base module

class autorag.nodes.queryexpansion.base.BaseQueryExpansion(project_dir: str | Path, *args, **kwargs)[source]

Bases: BaseModule

cast_to_run(previous_result: DataFrame, *args, **kwargs)[source]

Cast function (a.k.a. decorator) that prepares the inputs for the pure function of this node. It is intended for the pure function only.

autorag.nodes.queryexpansion.base.check_expanded_query(query: str, expanded_query_list: List[str])[source]

autorag.nodes.queryexpansion.hyde module

class autorag.nodes.queryexpansion.hyde.HyDE(project_dir: str | Path, *args, **kwargs)[source]

Bases: BaseQueryExpansion

pure(previous_result: DataFrame, *args, **kwargs)[source]

autorag.nodes.queryexpansion.multi_query_expansion module

class autorag.nodes.queryexpansion.multi_query_expansion.MultiQueryExpansion(project_dir: str | Path, *args, **kwargs)[source]

Bases: BaseQueryExpansion

pure(previous_result: DataFrame, *args, **kwargs)[source]

autorag.nodes.queryexpansion.multi_query_expansion.get_multi_query_expansion(query: str, answer: str) → List[str][source]

autorag.nodes.queryexpansion.pass_query_expansion module

class autorag.nodes.queryexpansion.pass_query_expansion.PassQueryExpansion(project_dir: str | Path, *args, **kwargs)[source]

Bases: BaseQueryExpansion

pure(previous_result: DataFrame, *args, **kwargs)[source]

Does not perform query expansion and returns the queries unchanged. The result is a 2-d list, stored in a column named ‘queries’.
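As a rough illustration of the output shape described above, the following sketch wraps each query in a one-element list so that the ‘queries’ column holds a 2-d list. It uses a plain dictionary instead of a DataFrame, and pass_through_queries is a hypothetical name, not part of AutoRAG.

```python
from typing import Dict, List


def pass_through_queries(queries: List[str]) -> Dict[str, List[List[str]]]:
    """Hypothetical helper: return the queries unchanged, one list per row."""
    # Each row keeps exactly one query, so the result is a list of lists.
    return {"queries": [[query] for query in queries]}


result = pass_through_queries(["What is HyDE?", "How does bm25 rank passages?"])
print(result["queries"])
```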

autorag.nodes.queryexpansion.query_decompose module

class autorag.nodes.queryexpansion.query_decompose.QueryDecompose(project_dir: str | Path, *args, **kwargs)[source]

Bases: BaseQueryExpansion

pure(previous_result: DataFrame, *args, **kwargs)[source]

autorag.nodes.queryexpansion.query_decompose.get_query_decompose(query: str, answer: str) → List[str][source]

Decompose a query into smaller sub-questions.

Parameters:
  • query – The query to decompose.

  • answer – The answer from the query_decompose function.

Returns:

A list of decomposed queries. Returns the input query if the query is not decomposable.
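The fallback behavior described above can be sketched as follows. This is only an illustration under assumptions: the answer is taken to contain one sub-question per line, optionally numbered, and NO_DECOMPOSITION is a hypothetical marker for a query that cannot be decomposed. The format the library actually parses may differ, and split_decomposed_answer is not an AutoRAG function.

```python
import re
from typing import List

# Hypothetical sentinel; the real marker used by the library may differ.
NO_DECOMPOSITION = "no decomposition"


def split_decomposed_answer(query: str, answer: str) -> List[str]:
    """Assumed format: one sub-question per line, optionally numbered."""
    if NO_DECOMPOSITION in answer.lower():
        return [query]
    sub_questions = []
    for line in answer.splitlines():
        # Drop leading list markers such as "1.", "2)" or "-".
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            sub_questions.append(cleaned)
    # Fall back to the input query when nothing usable was found.
    return sub_questions or [query]
```

Returning a one-element list in the fallback case keeps the return type the same for decomposable and non-decomposable queries.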

autorag.nodes.queryexpansion.run module

autorag.nodes.queryexpansion.run.evaluate_one_query_expansion_node(retrieval_funcs: List, retrieval_params: List[Dict], metric_inputs: List[MetricInput], metrics: List[str], project_dir, previous_result: DataFrame, strategy_name: str) → DataFrame[source]

autorag.nodes.queryexpansion.run.make_retrieval_callable_params(strategy_dict: Dict)[source]

strategy_dict looks like this:

{
    "metrics": ["retrieval_f1", "retrieval_recall"],
    "top_k": 50,
    "retrieval_modules": [
        {"module_type": "bm25"},
        {"module_type": "vectordb", "embedding_model": ["openai", "huggingface"]}
    ]
}
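One plausible way to read such a dictionary is to expand every list-valued parameter into separate retrieval configurations. The sketch below shows that idea with the standard library only. It is an assumption about the intent, not the library's implementation: expand_retrieval_modules is a hypothetical helper, and both the bm25 fallback and the copying of top_k into each configuration are illustrative choices.

```python
from itertools import product
from typing import Any, Dict, List


def expand_retrieval_modules(strategy_dict: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: one flat parameter dict per retrieval configuration."""
    # Assumption: fall back to bm25 when no retrieval modules are listed.
    modules = strategy_dict.get("retrieval_modules") or [{"module_type": "bm25"}]
    top_k = strategy_dict.get("top_k")
    expanded = []
    for module in modules:
        keys = list(module)
        # Treat scalar values as one-element lists so product() handles both.
        value_lists = [v if isinstance(v, list) else [v] for v in module.values()]
        for combo in product(*value_lists):
            params = dict(zip(keys, combo))
            if top_k is not None:
                params["top_k"] = top_k
            expanded.append(params)
    return expanded


strategy_dict = {
    "metrics": ["retrieval_f1", "retrieval_recall"],
    "top_k": 50,
    "retrieval_modules": [
        {"module_type": "bm25"},
        {"module_type": "vectordb", "embedding_model": ["openai", "huggingface"]},
    ],
}
configs = expand_retrieval_modules(strategy_dict)
```

With the dictionary above, this yields three configurations: one for bm25 and one for each vectordb embedding model.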

autorag.nodes.queryexpansion.run.run_query_expansion_node(modules: List, module_params: List[Dict], previous_result: DataFrame, node_line_dir: str, strategies: Dict) → DataFrame[source]

Run evaluation and select the best module among the query expansion node results. First, retrieval is run using expanded_queries, the output of each query expansion module. Retrieval is run with every combination of the retrieval_modules given in strategies. If there are multiple retrieval_modules, all of them are run and the best result is kept. If no retrieval_modules are given, bm25 is used as the default. In this way, the best retrieval result is selected for each query expansion module, and then the best module is selected among those results.

Parameters:
  • modules – Query expansion modules to run.

  • module_params – Query expansion module parameters.

  • previous_result – The previous result dataframe. For this node, it is the QA data.

  • node_line_dir – This node line’s directory.

  • strategies – Strategies for query expansion node.

Returns:

The best result dataframe.
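The two-stage selection described for run_query_expansion_node can be sketched as follows. This is a simplified illustration, not the library's code: it assumes that scores are already available as plain lists of per-query metric values and that configurations are compared by their mean. The Scores layout and select_best_module are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical layout: module name -> retrieval configuration -> per-query metric values.
Scores = Dict[str, Dict[str, List[float]]]


def select_best_module(scores: Scores) -> Tuple[str, str, float]:
    """Return (module, retrieval config, mean score) of the overall winner."""
    best_per_module = {}
    for module, by_retrieval in scores.items():
        # Stage 1: keep the best retrieval configuration for this module.
        retrieval, values = max(by_retrieval.items(), key=lambda item: mean(item[1]))
        best_per_module[module] = (retrieval, mean(values))
    # Stage 2: pick the module whose best configuration scored highest.
    winner = max(best_per_module, key=lambda name: best_per_module[name][1])
    return (winner, *best_per_module[winner])


scores = {
    "hyde": {"bm25": [0.5, 0.5], "vectordb": [0.75, 0.75]},
    "query_decompose": {"bm25": [0.25, 0.5], "vectordb": [0.5, 0.5]},
}
best = select_best_module(scores)
```

Here each module is first reduced to its strongest retrieval configuration, so a module is not penalized for performing poorly with a retriever that would not be chosen anyway.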

Module contents