Agile classifiers are an efficient and flexible method for creating custom content policy classifiers by tuning models, such as Gemma, to fit your needs. They also give you complete control over where and how they are deployed.
The codelab and tutorial use LoRA to fine-tune a Gemma model to act as a content policy classifier using the KerasNLP library. Using only 200 examples from the ETHOS dataset, this classifier achieves an F1 score of 0.80 and a ROC-AUC score of 0.78, which compares favorably to state-of-the-art leaderboard results. When trained on 800 examples, like the other classifiers on the leaderboard, the Gemma-based agile classifier achieves an F1 score of 83.74 and a ROC-AUC score of 88.17 (these two figures appear to be on a 0-100 scale, unlike the 0-1 figures above). You can adapt the tutorial instructions to further refine this classifier, or to create your own custom safety classifiers.
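To make the F1 and ROC-AUC scores quoted above more concrete, here is a minimal sketch of how both metrics can be computed for a binary classifier. It is not the tutorial's evaluation code. The helper names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of the data comes from ETHOS.

```python
def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive example
    gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = violates the content policy, 0 = does not.
labels = [1, 0, 1, 1, 0, 0]
# Hypothetical classifier scores (higher means more likely to violate).
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"F1 = {f1_score(labels, preds):.2f}")     # F1 = 0.80
print(f"ROC-AUC = {roc_auc(labels, scores):.2f}")  # ROC-AUC = 0.89
```

F1 depends on the chosen threshold, while ROC-AUC depends only on how the scores rank positives against negatives, which is why the two numbers can diverge for the same classifier.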
Gemma Agile Classifier Tutorials

Start Codelab: https://codelabs.developers.google.com/codelabs/responsible-ai/agile-classifiers
Start Google Colab: https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/agile_classifiers.ipynb

Last updated 2024-10-23 UTC.