Generate and edit images using Gemini (also known as "nano banana")


You can ask Gemini models to generate and edit images using text-only prompts as well as prompts that combine text and images. With Firebase AI Logic, you can make this request directly from your app.

With this capability, you can do things like:

  • Iteratively generate images through natural-language conversation, adjusting images while maintaining consistency and context.

  • Generate images with high-quality text rendering, including long strings of text.

  • Generate interleaved text and image output. For example, a blog post with text and images in a single conversation turn. Previously, this required stringing together multiple models.

  • Generate images using Gemini's world knowledge and reasoning capabilities.

You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.

Jump to code for text-to-image    Jump to code for interleaved text and images

Jump to code for image editing    Jump to code for iterative image editing


For other options for working with images, see the other guides:
Analyze images | Analyze images on-device | Generate structured output

GeminiImagen 型号之间进行选择

The Firebase AI Logic SDKs support image generation and editing using either Gemini models or Imagen models.

For most use cases, start with Gemini, and choose Imagen only for specialized tasks where image quality is critical. A brief setup comparison follows the lists below.

Choose Gemini when you want:

  • To use world knowledge and reasoning to generate contextually relevant images.
  • Seamless blending of text and images, or interleaved text-and-image output.
  • To embed accurate visuals within long text sequences.
  • To edit images conversationally while maintaining context.

Choose Imagen when you want:

  • To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
  • To infuse branding or style into images, or to generate logos and product designs.
  • To explicitly specify the aspect ratio or format of generated images.
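
If you go with Imagen, only the model setup and response handling differ from the Gemini samples on this page. The following is a minimal Kotlin sketch for comparison only: it assumes the Kotlin Imagen surface (imagenModel, generateImages, and ImagenInlineImage.asBitmap()), the Imagen model name shown is just a placeholder (check the model list for current names), and the Imagen API may require opting in to preview APIs depending on your SDK version.

Kotlin

// Gemini: a general-purpose model that can also return images
val geminiModel = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Imagen: a dedicated image-generation model
// (model name is a placeholder; check the model list for current Imagen model names)
val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI()).imagenModel(
    modelName = "imagen-4.0-generate-001"
)

// Imagen returns images only (no interleaved text), via `generateImages`
val imagenResponse = imagenModel.generateImages(
    "A photorealistic close-up of a ceramic mug on a wooden table, soft morning light"
)
val firstImage = imagenResponse.images.firstOrNull()?.asBitmap()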

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.

Models that support this capability

  • gemini-2.5-flash-image (also known as "nano banana")

Note that the SDKs also support image generation using Imagen models.

Generate and edit images

You can generate and edit images using a Gemini model.

Generate images (text-only input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate images by prompting with text.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
        .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
        .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Generate interleaved images and text

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate interleaved text and images. For example, you can generate images of what each step of a recipe might look like along with the step's instructions, without having to make separate requests to the model or use different models.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
        .addText("Generate an illustrated recipe for a paella.\n" +
                 "Create images to go alongside the text as you generate the recipe")
        .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text)
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [Content.text(
  'Generate an illustrated recipe for a paella\n '
  'Create images to go alongside the text as you generate the recipe'
)];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with the text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process the image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
  "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}

Edit images (text-and-image input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to edit images by prompting with text and one or more images.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptcontent = new Content.Builder()
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptcontent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}

Iterate and edit images using multi-turn chat

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.

Swift

import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig { responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)
// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
        .setRole("user")
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);
// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
        initialRequest,
        generatedImage -> {
            Content followUpPrompt = new Content.Builder()
                    .addText("But make it old-school line drawing style")
                    .build();
            return chat.sendMessage(followUpPrompt);
        },
        executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage(
  Content.multi([prompt, imagePart])
);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage(
  Content.text("But make it old-school line drawing style")
);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}



Supported capabilities, limitations, and best practices

Supported modalities and capabilities

These are the supported modalities and capabilities for image output from a Gemini model. Each capability includes an example prompt, and a corresponding code sample is provided above.

  • Text to image (text-only to image)

    • "Generate an image of the Eiffel Tower with fireworks in the background."
  • Text to image (text rendering within the image)

    • "Generate a cinematic photo of a large building with giant text projected onto the front of the building."
  • Text to image(s) and text (interleaved)

    • "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."

    • "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image."

  • Image(s) and text to image(s) and text (interleaved)

    • [image of a furnished room] + "What other color sofas would work in my space? Can you update the image?"
  • Image editing (text and image to image)

    • [image of scones] + "Edit this image to make it look like a cartoon"

    • [image of a cat] + [image of a pillow] + "Create a cross stitch of my cat on this pillow."

  • Multi-turn image editing (chat)

    • [image of a blue car] + "Turn this car into a convertible.", then "Now change the color to yellow."

Limitations and best practices

Here are limitations and best practices for generating image output with a Gemini model.

  • Image generation with Gemini models supports the following:

    • Generating PNG images with a maximum dimension of 1024 px.
    • Generating and editing images of people.
    • Using safety filters that provide a flexible and less restrictive user experience.
  • Image generation with Gemini models does not support the following:

    • Including audio or video inputs.
    • Generating images only.
      The model will always respond with both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
  • For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.

  • Image generation may not always trigger. Here are some known issues:

    • The model may output text only.
      Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image").

    • The model may stop generating partway through.
      Try again or try a different prompt.

    • The model may generate text as an image.
      Try asking for text output explicitly (for example, "generate narrative text along with illustrations").

  • When generating text for an image, Gemini works best if you first generate the text and then ask for an image that includes the text (see the sketch below).
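
One way to apply that last practice is to use two turns of a chat: first ask the model for the text, then ask it to render an image that contains that text. The following is a minimal Kotlin sketch reusing the chat setup from the samples above; the prompts and the tagline use case are purely illustrative.

Kotlin

// Assumes `model` is a `GenerativeModel` configured with TEXT and IMAGE response modalities (see above)
val chat = model.startChat()

// Step 1: generate the text first
val taglineResponse = chat.sendMessage("Write a short, punchy tagline for a neighborhood bakery.")
val tagline = taglineResponse.text

// Step 2: ask for an image that prominently renders that exact text
val posterResponse = chat.sendMessage(
    "Now generate a poster image for the bakery that prominently displays this tagline: $tagline"
)
val posterBitmap = posterResponse
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image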