autorag.data.beta.filter package

Submodules

autorag.data.beta.filter.dontknow module

class autorag.data.beta.filter.dontknow.Response(*, is_dont_know: bool)

Bases: BaseModel

is_dont_know: bool
model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'is_dont_know': FieldInfo(annotation=bool, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

async autorag.data.beta.filter.dontknow.dontknow_filter_llama_index(row: Dict, llm: BaseLLM, lang: str = 'en') → bool

Drops rows whose answer is a “don’t know” response, which removes unanswerable questions from the QA dataset. Use this filter with the `batch_filter` function of the QA class.

Parameters:
  • row – A row dict from the QA dataset.

  • llm – The LlamaIndex LLM instance. Setting a low max-token limit is recommended to save tokens.

  • lang – The language code. Supported values are en and ko.

Returns:

False if the row’s generation_gt amounts to a “don’t know” answer, otherwise True.
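
The return convention (False means the row is dropped) can be illustrated without calling any LLM. The following is a minimal sketch under stated assumptions: fake_dontknow_filter and filter_rows are hypothetical names that are not part of AutoRAG, filter_rows is only a stand-in for whatever the QA class does internally, and generation_gt is assumed to be a list of answer strings.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def fake_dontknow_filter(row: Dict, lang: str = "en") -> bool:
    """Hypothetical stand-in for an LLM-backed filter.

    Returns False when any ground-truth answer is a "don't know" reply,
    mirroring the documented return convention.
    """
    # Assumption: generation_gt is a list of answer strings.
    answers = row["generation_gt"]
    return not any("don't know" in answer.lower() for answer in answers)


async def filter_rows(
    rows: List[Dict],
    predicate: Callable[..., Awaitable[bool]],
    **kwargs,
) -> List[Dict]:
    """Keep only the rows for which the async predicate returns True."""
    # Evaluate all rows concurrently; gather preserves input order.
    keep_flags = await asyncio.gather(*(predicate(row, **kwargs) for row in rows))
    return [row for row, keep in zip(rows, keep_flags) if keep]


rows = [
    {"query": "What is the capital of France?", "generation_gt": ["Paris."]},
    {"query": "Who wrote this memo?", "generation_gt": ["I don't know."]},
]
kept = asyncio.run(filter_rows(rows, fake_dontknow_filter, lang="en"))
print([row["query"] for row in kept])
```

Only the first row survives, because the predicate returns False for the second. Extra keyword arguments such as lang are forwarded to the predicate unchanged, which is how a filter-specific argument like llm could be passed along in the same pattern.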

async autorag.data.beta.filter.dontknow.dontknow_filter_openai(row: Dict, client: AsyncOpenAI, model_name: str = 'gpt-4o-mini-2024-07-18', lang: str = 'en') → bool

Drops rows whose answer is a “don’t know” response, which removes unanswerable questions from the QA dataset. Use this filter with the `batch_filter` function of the QA class.

Parameters:
  • row – A row dict from the QA dataset.

  • client – The AsyncOpenAI client instance.

  • model_name – The model name. Must be gpt-4o-2024-08-06 or gpt-4o-mini-2024-07-18.

  • lang – The language code. Supported values are en and ko.

Returns:

False if the row’s generation_gt amounts to a “don’t know” answer, otherwise True.

autorag.data.beta.filter.dontknow.dontknow_filter_rule_based(row: Dict, lang: str = 'en') → bool
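
This function is listed without a description. Judging by its name and signature, it presumably makes the same keep-or-drop decision by phrase matching rather than by calling an LLM. A minimal sketch of that idea follows; the phrase lists, the function name looks_answerable, and the assumption that generation_gt is a list of answer strings are all illustrative and may not match AutoRAG's actual implementation.

```python
from typing import Dict, List

# Illustrative phrase lists; the phrases AutoRAG actually checks may differ.
DONT_KNOW_PHRASES: Dict[str, List[str]] = {
    "en": ["i don't know", "i do not know"],
    "ko": ["모르겠습니다", "몰라요"],
}


def looks_answerable(row: Dict, lang: str = "en") -> bool:
    """Return False if any ground-truth answer contains a "don't know" phrase."""
    if lang not in DONT_KNOW_PHRASES:
        raise ValueError(f"Unsupported language: {lang!r}")
    phrases = DONT_KNOW_PHRASES[lang]
    # Assumption: generation_gt is a list of answer strings.
    return not any(
        phrase in answer.lower()
        for answer in row["generation_gt"]
        for phrase in phrases
    )


print(looks_answerable({"generation_gt": ["I do not know the answer."]}))  # False
print(looks_answerable({"generation_gt": ["The answer is 42."]}))  # True
print(looks_answerable({"generation_gt": ["모르겠습니다."]}, lang="ko"))  # False
```

A rule-based check like this costs nothing per row and is deterministic, but it only catches the exact phrasings in its list, so a paraphrased refusal would pass through where an LLM-backed filter might catch it.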

autorag.data.beta.filter.prompt module

Module contents