The Responsible AI Safety and Effectiveness (RAISE) Benchmarks are a comprehensive set of criteria and metrics developed to guide and evaluate the responsible development, deployment, and governance of AI systems. They provide clear, measurable standards that operationalize responsible AI practice, focusing on the safety, effectiveness, and ethical alignment of AI systems. By adopting these standards, organizations can verify that their AI systems meet ethical and safety requirements while maintaining public trust and accountability. There are three types of RAISE Benchmarks.
1. RAISE Corporate AI Policy Benchmarks
These benchmarks focus on evaluating an organization’s internal AI policies. They assess the strength of AI governance structures, ethical guidelines, and operational policies that support responsible AI development and use. Key areas include transparency, accountability, privacy protection, fairness, and adherence to ethical principles.
2. RAISE LLM (Large Language Model) Hallucinations Benchmarks
Specifically targeting large language models (LLMs), these benchmarks assess the frequency and severity of hallucinations: instances in which a model generates inaccurate, misleading, or nonsensical information. The benchmarks provide metrics for measuring the reliability of LLMs, the effectiveness of strategies to reduce hallucinations, and a model's ability to provide accurate, reliable information across different contexts.
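To make the idea of a hallucination metric concrete, the sketch below computes a simple hallucination rate over a set of model responses. This is an illustrative assumption only: the RAISE materials do not publish reference code, and the scoring rule here (a crude substring check against known-true reference statements) is a hypothetical stand-in for the more sophisticated grounding checks a real evaluation would use.

```python
# Illustrative sketch only -- not RAISE reference code. The scoring rule and
# the evaluation data below are hypothetical assumptions for demonstration.

def hallucination_rate(responses, grounded_facts):
    """Return the fraction of responses not supported by any reference fact.

    A response counts as hallucinated when none of the known-true
    reference statements appears within it (a deliberately crude
    proxy for factual grounding).
    """
    if not responses:
        return 0.0
    hallucinated = sum(
        1
        for response in responses
        if not any(fact.lower() in response.lower() for fact in grounded_facts)
    )
    return hallucinated / len(responses)


# Hypothetical evaluation data.
facts = ["Paris is the capital of France"]
responses = [
    "Paris is the capital of France.",  # grounded in a reference fact
    "Lyon is the capital of France.",   # unsupported -> counted as hallucinated
]

print(hallucination_rate(responses, facts))  # 0.5
```

In practice, benchmark suites replace the substring check with human or model-based fact verification, and report severity-weighted scores rather than a single rate; the structure of the computation, however, stays the same.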
3. RAISE Vendor Alignment Benchmarks
These benchmarks are designed to ensure that external vendors or partners align with an organization’s responsible AI principles. They evaluate the extent to which vendors' AI systems and practices meet established standards for ethical AI, with a focus on data handling, model transparency, and ethical deployment.
Alignment with NIST AI Risk Management Framework and ISO/IEC 42001 Standard
The RAISE Benchmarks are aligned with internationally recognized AI governance and risk management frameworks, including the NIST AI Risk Management Framework and the ISO/IEC 42001 Standard. This alignment ensures that organizations adopting the RAISE Benchmarks adhere to global standards for managing AI risks and uphold ethical principles in AI development, deployment, and monitoring. Following these frameworks helps organizations demonstrate their commitment to responsible AI practices and mitigate potential risks associated with AI technologies.
Conclusion
The RAISE Benchmarks represent a significant advancement in operationalizing responsible AI. By offering clear, actionable criteria across corporate policies, specific technologies like LLMs, and vendor relationships, these benchmarks help organizations navigate the complex ethical and safety challenges posed by AI. Through alignment with established risk management frameworks, the RAISE Benchmarks promote trust, accountability, and ethical practices in AI development and use.
References
- National Institute of Standards and Technology. AI Risk Management Framework 1.0. U.S. Department of Commerce, 2023. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf.
- International Organization for Standardization and International Electrotechnical Commission. ISO/IEC 42001:2023 Information Technology – Artificial Intelligence – Management System. Geneva: ISO, 2023.