autorag.nodes.passagereranker package¶
Subpackages¶
Submodules¶
autorag.nodes.passagereranker.base module¶
- class autorag.nodes.passagereranker.base.BasePassageReranker(project_dir: str | Path, *args, **kwargs)[source]¶
Bases:
BaseModule
autorag.nodes.passagereranker.cohere module¶
- class autorag.nodes.passagereranker.cohere.CohereReranker(project_dir: str, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
- async autorag.nodes.passagereranker.cohere.cohere_rerank_pure(cohere_client: AsyncClient, model: str, query: str, documents: List[str], ids: List[str], top_k: int) Tuple[List[str], List[str], List[float]] [source]¶
Rerank a list of contents with Cohere rerank models.
- Parameters:
cohere_client – The Cohere AsyncClient to use for reranking
model – The model name for Cohere rerank
query – The query to use for reranking
documents – The list of contents to rerank
ids – The list of ids corresponding to the documents
top_k – The number of passages to return after reranking
- Returns:
Tuple of lists containing the reranked contents, ids, and scores
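The parameter and return descriptions above can be illustrated with a minimal sketch. It is not the AutoRAG implementation and does not call the Cohere API. `FakeRerankClient`, `FakeResult`, `FakeResponse`, and `rerank_pure_sketch` are hypothetical names, and the stub only assumes that a rerank response exposes, per result, the index of the original document and a relevance score.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeResult:
    index: int              # position of the document in the input list
    relevance_score: float  # higher means more relevant


@dataclass
class FakeResponse:
    results: List[FakeResult]


class FakeRerankClient:
    """Hypothetical stand-in for an async rerank client (no network calls)."""

    async def rerank(self, model: str, query: str, documents: List[str], top_n: int) -> FakeResponse:
        # Toy scoring: count the words shared by the query and each document.
        query_words = set(query.lower().split())
        scored = [
            FakeResult(i, float(len(query_words & set(doc.lower().split()))))
            for i, doc in enumerate(documents)
        ]
        scored.sort(key=lambda r: r.relevance_score, reverse=True)
        return FakeResponse(scored[:top_n])


async def rerank_pure_sketch(
    client, model: str, query: str, documents: List[str], ids: List[str], top_k: int
) -> Tuple[List[str], List[str], List[float]]:
    response = await client.rerank(model=model, query=query, documents=documents, top_n=top_k)
    # Map each result back to its content and id through the returned index.
    contents = [documents[r.index] for r in response.results]
    reranked_ids = [ids[r.index] for r in response.results]
    scores = [r.relevance_score for r in response.results]
    return contents, reranked_ids, scores


contents, reranked_ids, scores = asyncio.run(
    rerank_pure_sketch(
        FakeRerankClient(),
        "stub-model",
        "what is a reranker",
        ["a reranker reorders passages", "bananas are yellow", "what is a reranker in RAG"],
        ["id-0", "id-1", "id-2"],
        top_k=2,
    )
)
print(reranked_ids, scores)
```

The three returned lists stay aligned by position, so the n-th id and the n-th score always describe the n-th reranked content.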
autorag.nodes.passagereranker.colbert module¶
- class autorag.nodes.passagereranker.colbert.ColbertReranker(project_dir: str, model_name: str = 'colbert-ir/colbertv2.0', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
- autorag.nodes.passagereranker.colbert.get_colbert_embedding_batch(input_strings: List[str], model, tokenizer, batch_size: int) List[array] [source]¶
autorag.nodes.passagereranker.flag_embedding module¶
- class autorag.nodes.passagereranker.flag_embedding.FlagEmbeddingReranker(project_dir, model_name: str = 'BAAI/bge-reranker-large', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.flag_embedding_llm module¶
- class autorag.nodes.passagereranker.flag_embedding_llm.FlagEmbeddingLLMReranker(project_dir, model_name: str = 'BAAI/bge-reranker-v2-gemma', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.flashrank module¶
- class autorag.nodes.passagereranker.flashrank.FlashRankReranker(project_dir: str, model: str = 'ms-marco-TinyBERT-L-2-v2', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.jina module¶
- class autorag.nodes.passagereranker.jina.JinaReranker(project_dir: str, api_key: str | None = None, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.koreranker module¶
- class autorag.nodes.passagereranker.koreranker.KoReranker(project_dir: str, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.mixedbreadai module¶
- class autorag.nodes.passagereranker.mixedbreadai.MixedbreadAIReranker(project_dir: str, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
- async autorag.nodes.passagereranker.mixedbreadai.mixedbreadai_rerank_pure(client: AsyncMixedbreadAI, query: str, documents: List[str], ids: List[str], top_k: int, model: str = 'mixedbread-ai/mxbai-rerank-large-v1') Tuple[List[str], List[str], List[float]] [source]¶
Rerank a list of contents with mixedbread-ai rerank models.
- Parameters:
client – The mixedbread-ai client to use for reranking
query – The query to use for reranking
documents – The list of contents to rerank
ids – The list of ids corresponding to the documents
top_k – The number of passages to return after reranking
model – The model name for mixedbread-ai rerank. You can choose between “mixedbread-ai/mxbai-rerank-large-v1” and “mixedbread-ai/mxbai-rerank-base-v1”. Default is “mixedbread-ai/mxbai-rerank-large-v1”.
- Returns:
Tuple of lists containing the reranked contents, ids, and scores
autorag.nodes.passagereranker.monot5 module¶
- class autorag.nodes.passagereranker.monot5.MonoT5(project_dir: str, model_name: str = 'castorini/monot5-3b-msmarco-10k', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.openvino module¶
- class autorag.nodes.passagereranker.openvino.OpenVINOReranker(project_dir: str, model: str = 'BAAI/bge-reranker-large', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.pass_reranker module¶
- class autorag.nodes.passagereranker.pass_reranker.PassReranker(project_dir: str | Path, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.rankgpt module¶
- class autorag.nodes.passagereranker.rankgpt.AsyncRankGPTRerank(top_n: int = 5, llm: LLM | None = None, verbose: bool = False, rankgpt_rerank_prompt: BasePromptTemplate | None = None)[source]¶
Bases:
RankGPTRerank
- async async_postprocess_nodes(nodes: List[NodeWithScore], query_bundle: QueryBundle, ids: List[str] | None = None) Tuple[List[NodeWithScore], List[str]] [source]¶
- llm: LLM¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'callback_manager': FieldInfo(annotation=CallbackManager, required=False, default_factory=CallbackManager, exclude=True), 'llm': FieldInfo(annotation=LLM, required=False, default_factory=get_default_llm, description='LLM to use for rankGPT'), 'rankgpt_rerank_prompt': FieldInfo(annotation=BasePromptTemplate, required=True, description='rankGPT rerank prompt.', metadata=[SerializeAsAny()]), 'top_n': FieldInfo(annotation=int, required=False, default=5, description='Top N nodes to return from reranking.'), 'verbose': FieldInfo(annotation=bool, required=False, default=False, description='Whether to print intermediate steps.')}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- rankgpt_rerank_prompt: Annotated[BasePromptTemplate, SerializeAsAny()]¶
- top_n: int¶
- verbose: bool¶
- class autorag.nodes.passagereranker.rankgpt.RankGPT(project_dir: str, llm: str | LLM | None = None, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.run module¶
- autorag.nodes.passagereranker.run.run_passage_reranker_node(modules: List, module_params: List[Dict], previous_result: DataFrame, node_line_dir: str, strategies: Dict) DataFrame [source]¶
Run evaluation and select the best module among passage reranker node results.
- Parameters:
modules – Passage reranker modules to run.
module_params – Passage reranker module parameters.
previous_result – The previous result DataFrame, produced by a retrieval or reranker module. It must contain the ‘query’, ‘retrieved_contents’, ‘retrieved_ids’, and ‘retrieve_scores’ columns.
node_line_dir – This node line’s directory.
strategies – Strategies for the passage reranker node. This node uses the ‘retrieval_f1’, ‘retrieval_recall’, and ‘retrieval_precision’ metrics. Evaluation can be skipped when only one module with a single parameter combination is used.
- Returns:
The best result dataframe with previous result columns.
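As a rough illustration of "run evaluation and select the best module", the sketch below scores two hypothetical reranker outputs with simplified set-based versions of the three metrics named above and keeps the one with the highest mean score. The metric definitions, the averaging rule, and the names `retrieval_scores`, `reranker_a`, and `reranker_b` are assumptions made for this example, not AutoRAG's actual evaluation code.

```python
from statistics import mean
from typing import Dict, List


def retrieval_scores(retrieved: List[str], relevant: List[str]) -> Dict[str, float]:
    """Simplified set-based precision, recall, and F1 for one query."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "retrieval_f1": f1,
        "retrieval_recall": recall,
        "retrieval_precision": precision,
    }


# Hypothetical reranked ids produced by two modules for the same query.
candidates = {
    "reranker_a": ["d1", "d4"],
    "reranker_b": ["d1", "d2"],
}
relevant_ids = ["d1", "d2", "d3"]  # ground-truth ids for the query
metrics = ["retrieval_f1", "retrieval_recall", "retrieval_precision"]

summary = {name: retrieval_scores(ids, relevant_ids) for name, ids in candidates.items()}

# Assumed selection rule: highest mean over the configured metrics.
best = max(summary, key=lambda name: mean(summary[name][m] for m in metrics))
print(best, summary[best])
```

With a single candidate there is nothing to compare, which is why evaluation can be skipped in that case.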
autorag.nodes.passagereranker.sentence_transformer module¶
- class autorag.nodes.passagereranker.sentence_transformer.SentenceTransformerReranker(project_dir: str, model_name: str = 'cross-encoder/ms-marco-MiniLM-L-2-v2', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
- pure(previous_result: DataFrame, *args, **kwargs)[source]¶
Rerank a list of contents based on their relevance to a query using a Sentence Transformer model.
- Parameters:
previous_result – The previous result DataFrame
top_k – The number of passages to return after reranking
batch – The number of queries to be processed in a batch
- Returns:
pandas DataFrame containing the reranked contents, ids, and scores
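The roles of top_k and batch can be sketched without loading a model. In the example below, `stub_score_pairs` is a hypothetical word-overlap scorer standing in for a cross-encoder, and batching is applied to (query, passage) pairs for simplicity. Both choices are assumptions for illustration and may differ from how the module batches its inputs.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]


def batched(items: Sequence[Pair], size: int) -> Iterator[Sequence[Pair]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def stub_score_pairs(pairs: Sequence[Pair]) -> List[float]:
    # Hypothetical scorer: number of words shared by query and passage.
    return [
        float(len(set(q.lower().split()) & set(p.lower().split())))
        for q, p in pairs
    ]


def rerank_one_query(
    query: str,
    contents: List[str],
    ids: List[str],
    top_k: int,
    batch: int,
    score_pairs: Callable[[Sequence[Pair]], List[float]] = stub_score_pairs,
) -> Tuple[List[str], List[str], List[float]]:
    pairs = [(query, content) for content in contents]
    scores: List[float] = []
    for chunk in batched(pairs, batch):
        scores.extend(score_pairs(chunk))  # one scorer call per batch
    # Sort passage indices by score, best first, and keep the top_k.
    order = sorted(range(len(contents)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [contents[i] for i in order], [ids[i] for i in order], [scores[i] for i in order]


top_contents, top_ids, top_scores = rerank_one_query(
    "open source rag",
    ["rag pipelines", "open source rag tools", "cooking pasta"],
    ["a", "b", "c"],
    top_k=2,
    batch=2,
)
print(top_ids, top_scores)
```

A smaller batch lowers peak memory at the cost of more scorer calls; the ranking itself does not depend on the batch size.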
autorag.nodes.passagereranker.time_reranker module¶
- class autorag.nodes.passagereranker.time_reranker.TimeReranker(project_dir: str, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.upr module¶
- class autorag.nodes.passagereranker.upr.UPRScorer(suffix_prompt: str, prefix_prompt: str, use_bf16: bool = False)[source]¶
Bases:
object
- class autorag.nodes.passagereranker.upr.Upr(project_dir: str, use_bf16: bool = False, prefix_prompt: str = 'Passage: ', suffix_prompt: str = 'Please write a question based on this passage.', *args, **kwargs)[source]¶
Bases:
BasePassageReranker
autorag.nodes.passagereranker.voyageai module¶
- class autorag.nodes.passagereranker.voyageai.VoyageAIReranker(project_dir: str, *args, **kwargs)[source]¶
Bases:
BasePassageReranker
- async autorag.nodes.passagereranker.voyageai.voyageai_rerank_pure(voyage_client: AsyncClient, model: str, query: str, documents: List[str], ids: List[str], top_k: int, truncation: bool = True) Tuple[List[str], List[str], List[float]] [source]¶
Rerank a list of contents with VoyageAI rerank models.
- Parameters:
voyage_client – The VoyageAI AsyncClient to use for reranking
model – The model name for VoyageAI rerank
query – The query to use for reranking
documents – The list of contents to rerank
ids – The list of ids corresponding to the documents
top_k – The number of passages to return after reranking
truncation – Whether to truncate the input to satisfy the ‘context length limit’ on the query and the documents.
- Returns:
Tuple of lists containing the reranked contents, ids, and scores