Quirx

Quirx - Simplified Architecture

This document provides a high-level overview of how Quirx works.

Architecture Overview

graph LR
    A[Input Prompt] --> B[Mutation Engine]
    B --> C[Multiple Variations]
    C --> D[LLM Provider]
    D --> E[Response Analysis]
    E --> F[Robustness Report]
    
    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style D fill:#fff3e0
    style E fill:#e8f5e8
    style F fill:#fce4ec

Process Flow

Input: You provide a prompt file and optional input text
Mutation: Quirx generates variations of your prompt using three mutation types:
- Lexical: Case changes, punctuation modifications
- Semantic: Synonym replacements
- Structural: Sentence reordering
Testing: Each variation is sent to your chosen LLM provider (OpenAI, Anthropic, or Mock)
Analysis: Responses are compared using multiple similarity metrics
Classification: Each response is classified as:
- Equivalent: Same meaning and behavior
- Minor Variation: Slight differences, same intent
- Behavioral Deviation: Significant changes that could cause issues
Reporting: Get detailed reports in Markdown, JSON, or HTML format

Key Benefits

Automated Testing: No manual prompt variations needed
Multiple Providers: Works with OpenAI, Anthropic, or mock providers
Comprehensive Analysis: Token, semantic, and structural similarity metrics
CI/CD Ready: Exit codes for automated testing pipelines
Multiple Formats: Choose from Markdown, JSON, or HTML reports

Quick Example

# Test a classification prompt with 10 mutations
quirx --prompt-file sentiment.txt --input "I love this!" --mutations 10

# Result: Robustness score of 0.85/1.0 with detailed breakdown

For the complete technical architecture, see the main README.

This site is open source. Improve this page.