Classifiers¶
- class aisploit.classifiers.MarkdownInjectionClassifier¶
Bases:
BaseTextClassifier
[List
[Any
]]A text classifier to detect Markdown injection in input text.
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[List[Any]] ¶
Score the input and return a Score object.
- Args:
input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
- Returns:
Score[T]: A Score object representing the score of the input.
- class aisploit.classifiers.PythonPackageHallucinationClassifier(python_version: str = '3.12')¶
Bases:
BaseTextClassifier
[List
[str
]]A text classifier that identifies hallucinated Python package names in code.
- python_version: str¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[List[str]] ¶
Scores the input based on the presence of hallucinated Python package names.
- Args:
input (str): The input text to analyze.
- Returns:
Score[List[str]]: A score object containing information about the analysis results.
- tags: List[str]¶
- class aisploit.classifiers.RegexClassifier(*, pattern: Pattern, flag_matches=True)¶
Bases:
BaseTextClassifier
[bool
]A text classifier based on regular expressions.
- class aisploit.classifiers.RepeatedTokenClassifier¶
Bases:
BaseTextClassifier
[str
]- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[str] ¶
Score the input and return a Score object.
- Args:
input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
- Returns:
Score[T]: A Score object representing the score of the input.
- class aisploit.classifiers.SelfSimilarityClassifier(*, embeddings: ~langchain_core.embeddings.embeddings.Embeddings = <factory>, threshold: float = 0.7, aggregation: ~typing.Literal['mean', 'min'] = 'mean')¶
Bases:
BaseTextClassifier
[Dict
[str
,Any
]]A text classifier based on self-similarity using cosine similarity scores.
- aggregation: Literal['mean', 'min']¶
- embeddings: Embeddings¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[Dict[str, Any]] ¶
Score the input text based on its self-similarity to reference texts.
- Args:
input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
- Raises:
ValueError: If references is None or if the number of references is not at least 1.
- Returns:
Score[Dict[Any]]: A Score object representing the self-similarity score of the input.
- tags: List[str]¶
- threshold: float¶
- class aisploit.classifiers.SubstringClassifier(*, substring: str, ignore_case=True, flag_matches=True)¶
Bases:
RegexClassifier
A text classifier based on substring matching.
- class aisploit.classifiers.TextTokenClassifier(token: str)¶
Bases:
BaseTextClassifier
[bool
]- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[bool] ¶
Score the input and return a Score object.
- Args:
input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
- Returns:
Score[T]: A Score object representing the score of the input.
- token: str¶
- class aisploit.classifiers.amazon.ComprehendPIIClassifier(session: ~boto3.session.Session = <factory>, region_name: str = 'us-east-1', *, language: str = 'en', threshold: float = 0.7, filter_func: ~typing.Callable[[str, dict], bool] | None = None)¶
Bases:
BaseComprehendClassifier
[List
[Any
]]A classifier that uses Amazon Comprehend to detect personally identifiable information (PII).
- filter_func: Callable[[str, dict], bool] | None¶
- language: str¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[List[Any]] ¶
Score the input for PII using Amazon Comprehend.
- Args:
input (str): The input text to be scored. _references: List of reference inputs (ignored).
- Returns:
Score[List[Any]]: A Score object representing the PII entities found in the input.
- tags: List[str]¶
- threshold: float¶
- class aisploit.classifiers.amazon.ComprehendToxicityClassifier(session: ~boto3.session.Session = <factory>, region_name: str = 'us-east-1', language: str = 'en', threshold: float = 0.7)¶
Bases:
BaseComprehendClassifier
[Dict
[str
,Any
]]A classifier that uses Amazon Comprehend to detect toxicity in text.
- language: str¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[Dict[str, Any]] ¶
Score the input for toxicity using Amazon Comprehend.
- Args:
input (str): The input text to be scored. _references: List of reference inputs (ignored).
- Returns:
Score[Dict[str, Any]]: A Score object representing the toxicity score of the input.
- tags: List[str]¶
- threshold: float¶
- class aisploit.classifiers.huggingface.BertScoreClassifier(threshold: float = 0.8, model_type: str = 'distilbert-base-uncased')¶
Bases:
BaseTextClassifier
[Dict
[str
,Any
]]A classifier that computes BERTScore for text inputs.
- bertscore: EvaluationModule¶
- model_type: str¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[Dict[str, Any]] ¶
Score the input using BERTScore computed by the evaluate module.
- Args:
input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
- Raises:
ValueError: If references is None or if the number of references is not equal to 1.
- Returns:
Score[Dict[str, Any]]: A Score object representing the BERTScore of the input.
- threshold: float¶
- class aisploit.classifiers.huggingface.BleuClassifier(threshold: float = 0.2)¶
Bases:
BaseTextClassifier
[Dict
[str
,Any
]]A classifier that computes BLEU score for text inputs.
- bleu: EvaluationModule¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[Dict[str, Any]] ¶
Score the input using BLEU score computed by the evaluate module.
- Args:
input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
- Raises:
ValueError: If the number of references is not equal to 1.
- Returns:
Score[Dict[str, Any]]: A Score object representing the BLEU score of the input.
- threshold: float¶
- class aisploit.classifiers.huggingface.PipelinePromptInjectionClassifier(*, model_name: str = 'laiyer/deberta-v3-base-prompt-injection', injection_label: str = 'INJECTION', threshold: float = 0.5)¶
Bases:
BaseTextClassifier
[float
]- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[float] ¶
Score the input and return a Score object.
- Args:
input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
- Returns:
Score[T]: A Score object representing the score of the input.
- class aisploit.classifiers.openai.ModerationClassifier(*, api_key: str | None = None)¶
Bases:
BaseTextClassifier
[Moderation
]A classifier that uses the OpenAI Moderations API for scoring.
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[Moderation] ¶
Score the input using the OpenAI Moderations API.
- Args:
input (str): The input text to be scored. _: List of references (ignored).
- Returns:
Score[Moderation]: A Score object representing the moderation score of the input.
- class aisploit.classifiers.presidio.PresidioAnalyserClassifier(*, language: str = 'en', entities: ~typing.List[str] | None = None, threshold: float = 0.7, additional_recognizers: ~typing.List[~presidio_analyzer.entity_recognizer.EntityRecognizer] = <factory>, filter_func: ~typing.Callable[[str, ~presidio_analyzer.recognizer_result.RecognizerResult], bool] | None = None)¶
Bases:
BaseTextClassifier
[List
[RecognizerResult
]]A text classifier using the Presidio Analyzer for detecting Personally Identifiable Information (PII).
- additional_recognizers: List[EntityRecognizer]¶
- entities: List[str] | None¶
- filter_func: Callable[[str, RecognizerResult], bool] | None¶
- language: str¶
- score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) Score[List[RecognizerResult]] ¶
Score the input text for Personally Identifiable Information (PII) entities.
- Args:
input (str): The input text to be scored. _references: List[str], optional): Ignored parameter. Defaults to None.
- Returns:
Score[List[RecognizerResult]]: A Score object representing the results of PII detection.
- tags: List[str]¶
- threshold: float¶