FlashRank Reranker

FlashRank is an ultra-lite, super-fast Python library for adding re-ranking to your existing search and retrieval pipelines.

It is built on state-of-the-art (SoTA) cross-encoders, with gratitude to all the model owners.

Module Parameters

batch : The batch size. If you have limited CUDA memory, decrease it. (default: 64)
model : The model ID or path to use for reranking. (default: "ms-marco-TinyBERT-L-2-v2")
- You can get the list of available models from FlashRank
Note

“rank_zephyr_7b_v1_full” is an LLM-based reranker that uses llama-cpp. Because of issues with parallel inference, AutoRAG does not currently support “rank_zephyr_7b_v1_full”.

- module_type: flashrank_reranker
  batch: 32
  model: "ms-marco-MiniLM-L-12-v2"
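
To make the effect of the batch parameter concrete, here is a minimal Python sketch of how a batch size splits a list of work items into chunks. It is an illustration only, not AutoRAG's or FlashRank's actual implementation: the helper name iter_batches and the sample passages are hypothetical, and the assumption that each item is one passage to score is ours.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch: int = 64) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch` items (hypothetical helper)."""
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    for start in range(0, len(items), batch):
        # The last chunk may be smaller than `batch`.
        yield list(items[start:start + batch])


# Hypothetical input: 70 passages to rerank.
passages = [f"passage {i}" for i in range(70)]

# With batch=32 (as in the config above), the work is split into three chunks.
sizes = [len(chunk) for chunk in iter_batches(passages, batch=32)]
print(sizes)  # [32, 32, 6]
```

A smaller batch means fewer items are held in memory at once, at the cost of more iterations, which is why lowering it can help when memory is limited.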