AI & ML interests

Open-source Gen AI testing platform. Collaborative test generation and execution at scale.

Rhesis: Open-Source Gen AI Testing 🦫

Your team defines expectations; Rhesis generates and executes thousands of test scenarios, so that you know what you ship.

Rhesis is an open-source Gen AI testing platform: collaborative test management that turns domain expertise into comprehensive automated testing. The datasets published here help teams assess Gen AI application robustness, reliability, and compliance across real-world scenarios.

Using our datasets

Our datasets are designed to test various aspects of LLM application behavior, from reliability to safety and bias detection. To get started:

  1. Browse the available test sets here on Hugging Face.
  2. Select the dataset that aligns with your evaluation needs.
  3. Load and apply the test cases to assess your application’s behavior.
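
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Rhesis SDK: it uses the Hugging Face `datasets` library's `load_dataset` function, and the repository ID placeholder, the `train` split, the `prompt` column name, and the `echo_app` stand-in are all hypothetical. Check the dataset card of the test set you select for its actual splits and columns.

```python
from typing import Callable, Dict, Iterable, List


def load_test_cases(repo_id: str, split: str = "train") -> List[dict]:
    """Fetch one split of a Hub dataset as a list of row dictionaries."""
    # Imported lazily so the rest of this sketch runs without the package.
    from datasets import load_dataset  # pip install datasets

    return [dict(row) for row in load_dataset(repo_id, split=split)]


def apply_test_cases(
    rows: Iterable[dict],
    app: Callable[[str], str],
    prompt_column: str = "prompt",  # assumed column name, see the dataset card
) -> List[Dict[str, str]]:
    """Send each test prompt to the application and record its response."""
    results = []
    for row in rows:
        prompt = row[prompt_column]
        results.append({"prompt": prompt, "response": app(prompt)})
    return results


def echo_app(prompt: str) -> str:
    """Hypothetical stand-in for the LLM application under test."""
    return f"(stub reply to) {prompt}"


# In real use, replace the inline rows with something like:
#   rows = load_test_cases("<organization>/<dataset-name>")
rows = [{"prompt": "Should I put all my savings into one stock?"}]
for result in apply_test_cases(rows, echo_app):
    print(result["prompt"], "->", result["response"])
```

Keeping the loader separate from the runner means the same `apply_test_cases` call works on rows from any source, which makes it easy to try a handful of cases by hand before running a full test set.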

For more advanced testing and seamless integration, the Rhesis SDK provides tools to automate dataset handling, generate structured test cases, and streamline evaluation workflows.

Key features

  • Curated Test Sets – Pre-built datasets covering diverse evaluation criteria.
  • Dynamic Test Generation – Generate custom test sets tailored to specific use cases.
  • Scalability – Use datasets for one-off evaluations or integrate them into automated testing pipelines.
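
As one way to picture the pipeline integration mentioned above, the sketch below turns a batch of application responses into a pass rate and then into an exit code that a CI job could act on. Everything here is illustrative: the function names, the 0.9 threshold, and the keyword-based `contains_no_guarantee` check are assumptions made for the example, and a real evaluation would use a far more careful metric.

```python
from typing import Callable, Iterable


def pass_rate(responses: Iterable[str], check: Callable[[str], bool]) -> float:
    """Return the fraction of responses accepted by `check` (0.0 if empty)."""
    verdicts = [check(response) for response in responses]
    return sum(verdicts) / len(verdicts) if verdicts else 0.0


def gate(rate: float, threshold: float) -> int:
    """Map a pass rate to a process exit code: 0 = pass, 1 = fail."""
    return 0 if rate >= threshold else 1


def contains_no_guarantee(response: str) -> bool:
    """Hypothetical check: reject responses that promise guaranteed returns."""
    return "guaranteed" not in response.lower()


# Sample responses standing in for real application output.
responses = [
    "Diversifying across asset classes can reduce risk.",
    "This fund offers guaranteed returns of 20% a year.",
]
rate = pass_rate(responses, contains_no_guarantee)
exit_code = gate(rate, threshold=0.9)  # threshold chosen only for illustration
print(f"pass rate: {rate:.0%}, exit code: {exit_code}")
```

In a pipeline script, the value returned by `gate` would typically be passed to `sys.exit` so that a pass rate below the threshold fails the build.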

For questions or custom datasets, reach out at hello@rhesis.ai.

Example use cases:

  • AI Financial Advisor:
    Evaluate the reliability and accuracy of financial guidance provided by LLM applications, ensuring sound advice for users.

  • AI Claim Processing:
    Test for and eliminate biases in LLM-supported claim decisions, ensuring fair and compliant processing of insurance claims.

  • AI Sales Advisor:
    Validate the accuracy of product recommendations, enhancing customer satisfaction and driving more successful sales.

  • AI Support Chatbot:
    Ensure that your chatbot consistently delivers helpful, accurate, and empathetic responses across various scenarios.

Disclaimer

Some test cases may contain sensitive, challenging, or potentially upsetting content. These cases are included to ensure thorough and realistic assessments. Review test cases carefully and exercise discretion when using them.

Connect with us

For more details about our testing platform, datasets, and solutions, including the Rhesis AI SDK, visit Rhesis AI.
Join our Discord community to connect with other AI engineers, discuss best practices, and stay updated on new test sets.
