Generate and edit images using Gemini (also known as "nano banana")


You can ask Gemini models to generate and edit images using text-only prompts as well as prompts that combine text and images. With Firebase AI Logic, you can make this request directly from your app.

With this capability, you can do things like:

  • Iteratively generate images through natural-language conversation, adjusting images while maintaining consistency and context.

  • Generate images with high-quality text rendering, including long strings of text.

  • Generate interleaved text and image output. For example, a blog post with text and images in a single conversation turn. Previously, this required stringing together multiple models.

  • Generate images using Gemini's world knowledge and reasoning capabilities.

You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.

Jump to code for text-to-image    Jump to code for interleaved text and images

Jump to code for image editing    Jump to code for iterative image editing


For other options for working with images, see the other guides:
Analyze images | Analyze images on-device | Generate structured output

GeminiImagen 型号之间进行选择

The Firebase AI Logic SDKs support image generation and editing using either Gemini models or Imagen models.

For most use cases, start with Gemini, and choose Imagen only for specialized tasks where image quality is critical. A brief setup comparison follows the lists below.

Choose Gemini when you want:

  • To use world knowledge and reasoning to generate contextually relevant images.
  • Seamless blending of text and images, or interleaved text-and-image output.
  • To embed accurate visuals within long text sequences.
  • To edit images conversationally while maintaining context.

Choose Imagen when you want:

  • To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
  • To infuse branding or style into images, or to generate logos and product designs.
  • To explicitly specify the aspect ratio or format of generated images.
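
If you go with Imagen, only the model setup and response handling differ from the Gemini samples on this page. The following is a minimal Kotlin sketch for comparison only: it assumes the Kotlin Imagen surface (imagenModel, generateImages, and ImagenInlineImage.asBitmap()), the Imagen model name shown is just a placeholder (check the model list for current names), and the Imagen API may require opting in to preview APIs depending on your SDK version.

Kotlin

// Gemini: a general-purpose model that can also return images
val geminiModel = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Imagen: a dedicated image-generation model
// (model name is a placeholder; check the model list for current Imagen model names)
val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI()).imagenModel(
    modelName = "imagen-4.0-generate-001"
)

// Imagen returns images only (no interleaved text), via `generateImages`
val imagenResponse = imagenModel.generateImages(
    "A photorealistic close-up of a ceramic mug on a wooden table, soft morning light"
)
val firstImage = imagenResponse.images.firstOrNull()?.asBitmap()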

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.

Models that support this capability

  • gemini-2.5-flash-image (also known as "nano banana")

Note that the SDKs also support image generation using Imagen models.

Generate and edit images

You can generate and edit images using a Gemini model.

Generate images (text-only input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate images by prompting with text.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
        .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
        .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Generate interleaved images and text

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate interleaved text and images. For example, you can generate images of what each step of a recipe might look like along with the step's instructions, without having to make separate requests to the model or use different models.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
        .addText("Generate an illustrated recipe for a paella.\n" +
                 "Create images to go alongside the text as you generate the recipe")
        .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text)
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [Content.text(
  'Generate an illustrated recipe for a paella\n '
  'Create images to go alongside the text as you generate the recipe'
)];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with the text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process the image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
  "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}

Edit images (text-and-image input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to edit images by prompting with text and one or more images.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptcontent = new Content.Builder()
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptcontent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Iterate and edit images using multi-turn chat

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)
// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
        .setRole("user")
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);
// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
        initialRequest,
        generatedImage -> {
            Content followUpPrompt = new Content.Builder()
                    .addText("But make it old-school line drawing style")
                    .build();
            return chat.sendMessage(followUpPrompt);
        },
        executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage(
  Content.multi([prompt, imagePart])
);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage(
  Content.text("But make it old-school line drawing style")
);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}



Supported capabilities, limitations, and best practices

Supported modalities and capabilities

These are the supported modalities and capabilities for image output from a Gemini model. Each capability includes an example prompt, and a corresponding code sample is provided above.

  • Text to image (text-only to image)

    • "Generate an image of the Eiffel Tower with fireworks in the background."
  • Text to image (text rendering within the image)

    • "Generate a cinematic photo of a large building with giant text projected onto the front of the building."
  • Text to image(s) and text (interleaved)

    • "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."

    • "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image."

  • Image(s) and text to image(s) and text (interleaved)

    • [image of a furnished room] + "What other color sofas would work in my space? Can you update the image?"
  • Image editing (text and image to image)

    • [image of scones] + "Edit this image to make it look like a cartoon"

    • [image of a cat] + [image of a pillow] + "Create a cross stitch of my cat on this pillow."

  • Multi-turn image editing (chat)

    • [image of a blue car] + "Turn this car into a convertible.", then "Now change the color to yellow."

Limitations and best practices

Here are limitations and best practices for generating image output with a Gemini model.

  • Image generation with Gemini models supports the following:

    • Generating PNG images with a maximum dimension of 1024 px.
    • Generating and editing images of people.
    • Using safety filters that provide a flexible and less restrictive user experience.
  • Image generation with Gemini models does not support the following:

    • Including audio or video inputs.
    • Generating images only.
      The model will always respond with both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
  • For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.

  • Image generation may not always trigger. Here are some known issues:

    • The model may output text only.
      Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image").

    • The model may stop generating partway through.
      Try again or try a different prompt.

    • The model may generate text as an image.
      Try asking for text output explicitly (for example, "generate narrative text along with illustrations").

  • When generating text for an image, Gemini works best if you first generate the text and then ask for an image that includes the text (see the sketch below).
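
One way to apply that last practice is to use two turns of a chat: first ask the model for the text, then ask it to render an image that contains that text. The following is a minimal Kotlin sketch reusing the chat setup from the samples above; the prompts and the tagline use case are purely illustrative.

Kotlin

// Assumes `model` is a `GenerativeModel` configured with TEXT and IMAGE response modalities (see above)
val chat = model.startChat()

// Step 1: generate the text first
val taglineResponse = chat.sendMessage("Write a short, punchy tagline for a neighborhood bakery.")
val tagline = taglineResponse.text

// Step 2: ask for an image that prominently renders that exact text
val posterResponse = chat.sendMessage(
    "Now generate a poster image for the bakery that prominently displays this tagline: $tagline"
)
val posterBitmap = posterResponse
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image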