Classifiers¶

class aisploit.classifiers.MarkdownInjectionClassifier¶

Bases: BaseTextClassifier[List[Any]]

A text classifier to detect Markdown injection in input text.

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[List[Any]]¶

Score the input and return a Score object.

Args:: input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
Returns:: Score[T]: A Score object representing the score of the input.

class aisploit.classifiers.PythonPackageHallucinationClassifier(python_version: str = '3.12')¶

Bases: BaseTextClassifier[List[str]]

A text classifier that identifies hallucinated Python package names in code.

python_version: str¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[List[str]]¶

Scores the input based on the presence of hallucinated Python package names.

Args:: input (str): The input text to analyze.
Returns:: Score[List[str]]: A score object containing information about the analysis results.

tags: List[str]¶

class aisploit.classifiers.RegexClassifier(*, pattern: Pattern, flag_matches=True)¶

Bases: BaseTextClassifier[bool]

A text classifier based on regular expressions.

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[bool]¶

Score the input based on the regular expression pattern.

Args:: input (str): The input text to be scored.
Returns:: Score[bool]: A Score object representing the result of scoring.

class aisploit.classifiers.RepeatedTokenClassifier¶

Bases: BaseTextClassifier[str]

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[str]¶

Score the input and return a Score object.

Args:: input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
Returns:: Score[T]: A Score object representing the score of the input.

class aisploit.classifiers.SelfSimilarityClassifier(*, embeddings: ~langchain_core.embeddings.embeddings.Embeddings = <factory>, threshold: float = 0.7, aggregation: ~typing.Literal['mean', 'min'] = 'mean')¶

Bases: BaseTextClassifier[Dict[str, Any]]

A text classifier based on self-similarity using cosine similarity scores.

aggregation: Literal['mean', 'min']¶

embeddings: Embeddings¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[Dict[str, Any]]¶

Score the input text based on its self-similarity to reference texts.

Args:: input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
Raises:: ValueError: If references is None or if the number of references is not at least 1.
Returns:: Score[Dict[Any]]: A Score object representing the self-similarity score of the input.

tags: List[str]¶

threshold: float¶

class aisploit.classifiers.SubstringClassifier(*, substring: str, ignore_case=True, flag_matches=True)¶

Bases: RegexClassifier

A text classifier based on substring matching.

class aisploit.classifiers.TextTokenClassifier(token: str)¶

Bases: BaseTextClassifier[bool]

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[bool]¶

Score the input and return a Score object.

Args:: input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
Returns:: Score[T]: A Score object representing the score of the input.

token: str¶

class aisploit.classifiers.amazon.ComprehendPIIClassifier(session: ~boto3.session.Session = <factory>, region_name: str = 'us-east-1', *, language: str = 'en', threshold: float = 0.7, filter_func: ~typing.Callable[[str, dict], bool] | None = None)¶

Bases: BaseComprehendClassifier[List[Any]]

A classifier that uses Amazon Comprehend to detect personally identifiable information (PII).

filter_func: Callable[[str, dict], bool] | None¶

language: str¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[List[Any]]¶

Score the input for PII using Amazon Comprehend.

Args:: input (str): The input text to be scored. _references: List of reference inputs (ignored).
Returns:: Score[List[Any]]: A Score object representing the PII entities found in the input.

tags: List[str]¶

threshold: float¶

class aisploit.classifiers.amazon.ComprehendToxicityClassifier(session: ~boto3.session.Session = <factory>, region_name: str = 'us-east-1', language: str = 'en', threshold: float = 0.7)¶

Bases: BaseComprehendClassifier[Dict[str, Any]]

A classifier that uses Amazon Comprehend to detect toxicity in text.

language: str¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[Dict[str, Any]]¶

Score the input for toxicity using Amazon Comprehend.

Args:: input (str): The input text to be scored. _references: List of reference inputs (ignored).
Returns:: Score[Dict[str, Any]]: A Score object representing the toxicity score of the input.

tags: List[str]¶

threshold: float¶

class aisploit.classifiers.huggingface.BertScoreClassifier(threshold: float = 0.8, model_type: str = 'distilbert-base-uncased')¶

Bases: BaseTextClassifier[Dict[str, Any]]

A classifier that computes BERTScore for text inputs.

bertscore: EvaluationModule¶

model_type: str¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[Dict[str, Any]]¶

Score the input using BERTScore computed by the evaluate module.

Args:: input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
Raises:: ValueError: If references is None or if the number of references is not equal to 1.
Returns:: Score[Dict[str, Any]]: A Score object representing the BERTScore of the input.

threshold: float¶

class aisploit.classifiers.huggingface.BleuClassifier(threshold: float = 0.2)¶

Bases: BaseTextClassifier[Dict[str, Any]]

A classifier that computes BLEU score for text inputs.

bleu: EvaluationModule¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[Dict[str, Any]]¶

Score the input using BLEU score computed by the evaluate module.

Args:: input (str): The input text to be scored. references (List[str], optional): List of reference texts. Defaults to None.
Raises:: ValueError: If the number of references is not equal to 1.
Returns:: Score[Dict[str, Any]]: A Score object representing the BLEU score of the input.

threshold: float¶

class aisploit.classifiers.huggingface.PipelinePromptInjectionClassifier(*, model_name: str = 'laiyer/deberta-v3-base-prompt-injection', injection_label: str = 'INJECTION', threshold: float = 0.5)¶

Bases: BaseTextClassifier[float]

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[float]¶

Score the input and return a Score object.

Args:: input (Input): The input to be scored. references (List[Input], optional): List of reference inputs. Defaults to None. metadata (Dict[str, Any], optional): Additional metadata for scoring. Defaults to {}.
Returns:: Score[T]: A Score object representing the score of the input.

class aisploit.classifiers.openai.ModerationClassifier(*, api_key: str | None = None)¶

Bases: BaseTextClassifier[Moderation]

A classifier that uses the OpenAI Moderations API for scoring.

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[Moderation]¶

Score the input using the OpenAI Moderations API.

Args:: input (str): The input text to be scored. _: List of references (ignored).
Returns:: Score[Moderation]: A Score object representing the moderation score of the input.

class aisploit.classifiers.presidio.PresidioAnalyserClassifier(*, language: str = 'en', entities: ~typing.List[str] | None = None, threshold: float = 0.7, additional_recognizers: ~typing.List[~presidio_analyzer.entity_recognizer.EntityRecognizer] = <factory>, filter_func: ~typing.Callable[[str, ~presidio_analyzer.recognizer_result.RecognizerResult], bool] | None = None)¶

Bases: BaseTextClassifier[List[RecognizerResult]]

A text classifier using the Presidio Analyzer for detecting Personally Identifiable Information (PII).

additional_recognizers: List[EntityRecognizer]¶

entities: List[str] | None¶

filter_func: Callable[[str, RecognizerResult], bool] | None¶

language: str¶

score(input: str, references: List[str] | None = None, metadata: Dict[str, Any] | None = None) → Score[List[RecognizerResult]]¶

Score the input text for Personally Identifiable Information (PII) entities.

Args:: input (str): The input text to be scored. _references: List[str], optional): Ignored parameter. Defaults to None.
Returns:: Score[List[RecognizerResult]]: A Score object representing the results of PII detection.

tags: List[str]¶

threshold: float¶

AISploit

Navigation

Related Topics

Disclaimer

Classifiers¶