# Responsible Generative AI Toolkit

Tools and guidance to design, build, and evaluate open AI models responsibly.

- [Responsible application design](#design): Define rules for model behavior, create a safe and accountable application, and maintain transparent communication with users.
- [Safety alignment](#align): Discover prompt-debugging techniques and guidance for fine-tuning and RLHF to align AI models with safety policies.
- [Model evaluation](#evaluate): Find guidance and data to conduct a robust model evaluation for safety, fairness, and factuality with the LLM Comparator.
- [Safeguards](#protect): Deploy safety classifiers, using off-the-shelf solutions, or build your own with step-by-step tutorials.

## Design a responsible approach

Proactively identify potential risks of your application and define a system-level approach to build safe and responsible applications for users.

### Define system-level policies

Determine what type of content your application should and should not generate.

- [Define policies](https://ai.google.dev/responsible/docs/design#define-policies)
- [See examples](https://ai.google.dev/responsible/docs/design#hypothetical-policies)

### Design for safety

Define your overall approach to implementing risk mitigation techniques, considering technical and business tradeoffs.

- [Learn more](https://ai.google.dev/responsible/docs/design#design-safety)

### Be transparent

Communicate your approach with artifacts like model cards.
- [See templates](https://ai.google.dev/responsible/docs/design#transparency-artifacts)

### Secure AI systems

Consider AI-specific security risks and remediation methods highlighted in the Secure AI Framework (SAIF).

- [Google's Secure AI Framework](https://safety.google/cybersecurity-advancements/saif/)
- [Documentation](https://ai.google.dev/responsible/docs/design#secure-ai)

## Align your model

Align your model with your specific safety policies using prompting and tuning techniques.

### Craft safer, more robust prompts

Use the power of LLMs to help craft safer prompt templates with the Model Alignment library.

- [Try now](https://colab.research.google.com/github/pair-code/model-alignment/blob/main/notebooks/Gemma_for_Model_Alignment.ipynb)
- [Model Alignment](/responsible/docs/alignment/model-alignment)

### Tune models for safety

Control model behavior by tuning your model to align with your safety and content policies.

- [Learn about tuning](/responsible/docs/alignment#tuning)
- [Learn about SFT](/responsible/docs/alignment#tuning-sft)
- [Learn about RLHF](/responsible/docs/alignment#tuning-rlhf)

### Investigate model prompts

Build safe and helpful prompts through iterative improvement with the Learning Interpretability Tool (LIT).

- [Try now](https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/lit_gemma.ipynb)
- [Learning Interpretability Tool](/responsible/docs/alignment/lit)

## Evaluate your model

Evaluate model risks on safety, fairness, and factual accuracy using our guidance and tooling.
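The tools below focus on qualitative comparison and benchmarks, but the core loop of any safety evaluation is simple: run a labeled prompt set through the model and measure how often the responses violate your policy. A minimal, hypothetical sketch — `generate` and `is_violation` are stand-ins for your model and your policy classifier, not part of any toolkit API:

```python
from typing import Callable, Iterable


def safety_violation_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    is_violation: Callable[[str], bool],
) -> float:
    """Run each prompt through the model and return the fraction of
    responses the policy classifier flags as violations."""
    responses = [generate(p) for p in prompts]
    if not responses:
        return 0.0
    return sum(is_violation(r) for r in responses) / len(responses)


# Toy stand-ins so the sketch runs end to end: an echo "model" and a
# keyword-based "classifier". Swap in a real model and classifier.
echo_model = lambda prompt: f"Echo: {prompt}"
keyword_flag = lambda response: "forbidden" in response.lower()

rate = safety_violation_rate(
    ["hello", "say something forbidden", "goodbye"],
    echo_model,
    keyword_flag,
)
print(rate)  # one of the three echoed responses contains the flagged keyword
```

In practice the classifier slot is where a safeguard such as ShieldGemma or a moderation API would plug in, and the prompt set would come from a benchmark or your own red-teaming transcripts.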
### LLM Comparator

Conduct side-by-side evaluations with LLM Comparator to qualitatively assess differences in responses between models, between different prompts for the same model, or even between different tunings of a model.

- [Try demo](https://pair-code.github.io/llm-comparator/)
- [Learn about LLM Comparator](https://ai.google.dev/responsible/docs/evaluation#llm-comparator)

### Model evaluation guidelines

Learn about red-teaming best practices and evaluate your model against academic benchmarks to assess harms around safety, fairness, and factuality.

- [Learn more](https://ai.google.dev/responsible/docs/evaluation)
- [See benchmarks](https://ai.google.dev/responsible/docs/evaluation#benchmarks)
- [See red-teaming best practices](https://ai.google.dev/responsible/docs/evaluation#red-teaming)

## Protect with safeguards

Filter your application's inputs and outputs, and protect users from undesirable outcomes.

### SynthID Text

A tool for watermarking and detecting text generated by your model.

- [SynthID text watermarking](/responsible/docs/safeguards/synthid)

### ShieldGemma

A series of content safety classifiers, built on Gemma 2, available in three sizes: 2B, 9B, and 27B.

- [ShieldGemma content safety classifiers](/responsible/docs/safeguards/shieldgemma)

### Agile classifiers

Create safety classifiers for your specific policies using parameter-efficient tuning (PET) with relatively little training data.

- [Create safety classifiers](/responsible/docs/safeguards/agile-classifiers)

### Checks AI Safety

Ensure AI safety compliance against your content policies with APIs and monitoring dashboards.
- [Checks AI Safety](https://checks.google.com/ai-safety/?utm_source=GenAITK&utm_medium=Link&utm_campaign=AI_Toolkit)

### Text moderation service

Detect a list of safety attributes, including potentially harmful categories and topics that may be considered sensitive, with this Google Cloud Natural Language API, which is free to use below a certain usage limit.

- [Cloud Natural Language API](https://cloud.google.com/natural-language/docs/moderating-text)
- [Cloud Natural Language pricing](https://cloud.google.com/natural-language/pricing)

### Perspective API

Identify "toxic" comments with this free Google Jigsaw API to mitigate online toxicity and ensure healthy dialogue.

- [Perspective API](https://perspectiveapi.com/)
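To illustrate the shape of a Perspective API call: you POST a JSON body to the `comments:analyze` endpoint (an API key is required), and the response nests a per-attribute probability under `attributeScores`. The sketch below only builds the request body and parses a response-shaped dict — no network call is made, and the sample score is invented for illustration:

```python
import json

# Documented endpoint; a real call appends ?key=YOUR_API_KEY.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_analyze_request(text: str, attributes=("TOXICITY",)) -> dict:
    """Build the JSON body for a comments:analyze request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {attr: {} for attr in attributes},
    }


def summary_score(response: dict, attribute: str = "TOXICITY") -> float:
    """Extract the summary probability for one attribute from a response."""
    return response["attributeScores"][attribute]["summaryScore"]["value"]


payload = build_analyze_request("You are a wonderful person.")
print(json.dumps(payload))

# A dict shaped like the API's response; the value here is made up.
sample_response = {
    "attributeScores": {"TOXICITY": {"summaryScore": {"value": 0.02}}}
}
print(summary_score(sample_response))
```

A score near 1.0 means the model thinks most readers would perceive the comment as toxic; what threshold to act on is an application-level policy decision, not something the API prescribes.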