

Generate and edit images using Gemini (aka "nano banana")

You can ask a Gemini model to generate and edit images using both text-only prompts and text-and-image prompts. When you use Firebase AI Logic, you can make this request directly from your app.

With this capability, you can do things like:

Iteratively generate images through natural-language conversation, adjusting images while maintaining consistency and context.
Generate images with high-quality text rendering, including long strings of text.
Generate interleaved text and image output, for example a blog post with text and images in a single turn. Previously, this required chaining multiple models together.
Generate images using Gemini's world knowledge and reasoning capabilities.

You can find the complete list of supported modalities and capabilities (along with example prompts) later on this page.


For additional options for working with images, see the other guides: analyze images, analyze images on-device, and generate structured output.

Choosing between Gemini and Imagen models

The Firebase AI Logic SDKs support image generation and editing using either a Gemini model or an Imagen model.

For most use cases, start with Gemini, and choose Imagen only for specialized tasks where image quality is critical.

Choose Gemini when you want:

To use world knowledge and reasoning to generate contextually relevant images.
To seamlessly blend text and images, or to interleave text and image output.
To embed accurate visuals within long text sequences.
To edit images conversationally while maintaining context.

Choose Imagen when you want:

To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
To infuse branding or style, or to generate logos and product designs.
To explicitly specify the aspect ratio or format of generated images.

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.

Models that support this capability

gemini-2.5-flash-image (aka "nano banana")

Important: Be aware of the following:

The Gemini 2.5 Flash Image model (aka "nano banana") requires the pay-as-you-go Blaze pricing plan, regardless of which Gemini API provider you use.
Image output is not supported by the standard Flash models, such as gemini-2.5-flash or gemini-2.0-flash.
The gemini-2.0-flash-preview-image-generation and gemini-2.5-flash-image-preview models will be retired on October 31, 2025. Migrate your workflows to gemini-2.5-flash-image before that date to avoid service disruption.
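The migration note above can be enforced in one place in an app. The following is a minimal sketch, not part of the Firebase AI Logic SDK: the helper name `resolveImageModelName` and the constants are hypothetical, and the only facts it encodes are the two preview model names and the replacement model named on this page.

```javascript
// Hypothetical helper (not part of the Firebase AI Logic SDK).
// Redirects the preview model names listed on this page to their replacement.
const IMAGE_MODEL = "gemini-2.5-flash-image";

const RETIRING_IMAGE_MODELS = new Set([
  "gemini-2.0-flash-preview-image-generation",
  "gemini-2.5-flash-image-preview",
]);

function resolveImageModelName(requestedName) {
  // Preview names scheduled for retirement map to the current model.
  if (RETIRING_IMAGE_MODELS.has(requestedName)) {
    return IMAGE_MODEL;
  }
  // Any other name passes through unchanged; this sketch does not try to
  // decide whether an arbitrary model supports image output.
  return requestedName;
}

console.log(resolveImageModelName("gemini-2.5-flash-image-preview"));
// prints "gemini-2.5-flash-image"
```

The resolved name could then be passed as the `model` value when creating the GenerativeModel instance, so stored or remotely configured model names keep working after the retirement date.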

Note that the SDKs also support image generation using Imagen models.

Generate and edit images

You can generate and edit images using a Gemini model.

Generate images (text-only input)

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate images by prompting with text.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
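Because the response modalities are required, it can help to build and check the configuration in one place. The following is a minimal sketch under stated assumptions: `imageOutputConfig` and `requestsImageOutput` are hypothetical helpers, not SDK functions, and the modalities are written as the plain strings "TEXT" and "IMAGE" used in the instruction above. The `temperature` value is only an illustrative extra option.

```javascript
// Hypothetical helpers (not part of the Firebase AI Logic SDK).
const REQUIRED_MODALITIES = ["TEXT", "IMAGE"];

// Builds a generation config that always requests text and image output.
function imageOutputConfig(extraOptions = {}) {
  // Caller options are spread first so they cannot override responseModalities.
  return { ...extraOptions, responseModalities: [...REQUIRED_MODALITIES] };
}

// Returns true only when the config asks for both text and image output.
function requestsImageOutput(generationConfig) {
  const modalities = generationConfig?.responseModalities ?? [];
  return REQUIRED_MODALITIES.every((modality) => modalities.includes(modality));
}

const config = imageOutputConfig({ temperature: 0.4 });
console.log(requestsImageOutput(config));
console.log(requestsImageOutput({ responseModalities: ["TEXT"] }));
```

A config produced this way could be passed as `generationConfig` when creating the model, and the check could run in a unit test to catch a configuration that silently drops the image modality.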

Swift

import FirebaseAILogic
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
        .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
        .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
// (`generateContent` returns a promise, so it must be awaited)
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
    responseModalities: [ResponseModalities.text, ResponseModalities.image],
  ),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [
  Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')
];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using System.Linq;
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
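The Web sample above only logs the image's `mimeType` and `data`. To display the image, one option is to turn that pair into a data URL. The following is a minimal sketch, assuming that `data` is a base64-encoded string (as it is in the Web image-editing sample on this page); `inlineDataToDataUrl` and `firstImageDataUrl` are hypothetical helper names, not SDK functions.

```javascript
// Hypothetical helpers (not part of the Firebase AI Logic SDK).
// Assumes inlineData has the shape { mimeType: string, data: base64 string }.
function inlineDataToDataUrl(inlineData) {
  if (!inlineData || !inlineData.mimeType || !inlineData.data) {
    throw new Error("Expected an object with mimeType and data");
  }
  return `data:${inlineData.mimeType};base64,${inlineData.data}`;
}

// Returns a data URL for the first inline data part, or null if there is none.
function firstImageDataUrl(inlineDataParts) {
  const first = inlineDataParts?.[0];
  return first ? inlineDataToDataUrl(first.inlineData) : null;
}

// Example with a stand-in part; a real call would pass
// result.response.inlineDataParts() instead.
const exampleParts = [{ inlineData: { mimeType: "image/png", data: "AAAA" } }];
console.log(firstImageDataUrl(exampleParts));
// prints "data:image/png;base64,AAAA"
```

In a browser, the returned string could be assigned to the `src` of an `img` element. Returning null for an empty result keeps the "no image generated" case explicit for the caller.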

Generate interleaved images and text

You can ask a Gemini model to generate images interleaved with its text responses. For example, you can generate an image of each step in a generated recipe alongside that step's instructions, without making separate requests to the model or to different models.

Swift

import FirebaseAILogic
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
        .addText("Generate an illustrated recipe for a paella.\n" +
                 "Create images to go alongside the text as you generate the recipe")
        .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text);
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
    responseModalities: [ResponseModalities.text, ResponseModalities.image],
  ),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [Content.text(
  'Generate an illustrated recipe for a paella\n ' +
  'Create images to go alongside the text as you generate the recipe'
)];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
// `parts` is null when the response has no candidates
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using System.Linq;
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
  "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}
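When rendering an interleaved response, it is often convenient to first convert the parts into an ordered list of display segments. The following is a minimal sketch, assuming parts shaped like those in the Web sample above (a `text` string or an `inlineData` object with `mimeType` and `data`); `toSegments` and the segment shape are hypothetical, not part of the SDK.

```javascript
// Hypothetical helper (not part of the Firebase AI Logic SDK).
// Converts response parts into ordered segments, preserving the interleaving.
function toSegments(parts) {
  const segments = [];
  for (const part of parts ?? []) {
    if (part.text) {
      const last = segments[segments.length - 1];
      if (last && last.kind === "text") {
        // Merge consecutive text parts into one paragraph-like segment.
        last.text += part.text;
      } else {
        segments.push({ kind: "text", text: part.text });
      }
    }
    if (part.inlineData) {
      segments.push({
        kind: "image",
        mimeType: part.inlineData.mimeType,
        data: part.inlineData.data,
      });
    }
  }
  return segments;
}

// Example with stand-in parts; a real call would pass
// result.response.candidates[0].content.parts instead.
const segments = toSegments([
  { text: "Step 1. " },
  { text: "Heat the oil." },
  { inlineData: { mimeType: "image/png", data: "AAAA" } },
  { text: "Step 2. Add the rice." },
]);
console.log(segments.map((segment) => segment.kind).join(", "));
// prints "text, image, text"
```

Keeping the segments in response order lets a UI place each image directly after the text it illustrates, which is the point of requesting interleaved output.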

Edit images (text-and-image input)

You can ask a Gemini model to edit images by prompting with text and one or more images.

Swift

import FirebaseAILogic
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptcontent = new Content.Builder()
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptcontent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
    responseModalities: [ResponseModalities.text, ResponseModalities.image],
  ),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using System.Linq;
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
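The Web sample above reads the image from a file input. When the image is already available as a base64 data URL (for example, from a canvas), the same request shape can be built without `FileReader`. The following is a minimal sketch under that assumption; `dataUrlToInlinePart` and `buildEditRequest` are hypothetical helpers, and the part shape `{ inlineData: { data, mimeType } }` is the one used by `fileToGenerativePart` in the sample.

```javascript
// Hypothetical helpers (not part of the Firebase AI Logic SDK).
// Parses "data:<mimeType>;base64,<data>" into the inline data part shape
// used by the Web sample on this page.
function dataUrlToInlinePart(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  const [, mimeType, data] = match;
  return { inlineData: { data, mimeType } };
}

// Builds the [text, image] array that the sample passes to generateContent.
function buildEditRequest(prompt, dataUrl) {
  return [prompt, dataUrlToInlinePart(dataUrl)];
}

const request = buildEditRequest(
  "Edit this image to make it look like a cartoon",
  "data:image/jpeg;base64,/9j/4AAQ", // stand-in value, not a real image
);
console.log(request[1].inlineData.mimeType);
// prints "image/jpeg"
```

The resulting array could be passed to `model.generateContent(...)` in place of `[prompt, imagePart]`. Rejecting anything that is not a base64 data URL keeps malformed input from reaching the model as an unusable image part.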

Iterate and edit images using multi-turn chat

Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.

Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.
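The turn-by-turn flow can be sketched as a small loop over one chat session. This is a minimal illustration, not SDK code: `runEditingSession` and `createStubChat` are hypothetical, and the loop assumes only that the chat object has an async `sendMessage()` whose result exposes `response.inlineDataParts()`, as the Web samples on this page do. The stub stands in for the object returned by `model.startChat()` so the sketch runs without a Firebase project.

```javascript
// Hypothetical helper (not part of the Firebase AI Logic SDK): sends each
// message in order on one chat session and records the first image returned
// for every turn (or null when a turn returns no image).
async function runEditingSession(chat, firstMessage, followUps = []) {
  const imagesPerTurn = [];
  for (const message of [firstMessage, ...followUps]) {
    // Later turns rely on the chat history, so the image is not sent again.
    const result = await chat.sendMessage(message);
    const parts = result.response.inlineDataParts?.() ?? [];
    imagesPerTurn.push(parts[0]?.inlineData ?? null);
  }
  return imagesPerTurn;
}

// Stand-in chat object for illustration only; in an app this would be the
// object returned by model.startChat().
function createStubChat() {
  let turn = 0;
  return {
    async sendMessage(_message) {
      turn += 1;
      const inlineData = { mimeType: "image/png", data: `turn-${turn}` };
      return { response: { inlineDataParts: () => [{ inlineData }] } };
    },
  };
}

runEditingSession(
  createStubChat(),
  [
    "Edit this image to make it look like a cartoon",
    { inlineData: { mimeType: "image/jpeg", data: "BASE64_IMAGE_DATA" } },
  ],
  ["But make it old-school line drawing style"],
).then((images) => console.log(images.map((image) => image.data)));
```

Sending the messages sequentially matters here: each follow-up refers to the image produced by the previous turn, so the turns cannot be issued in parallel.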

Swift

import FirebaseAILogic
import UIKit

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else { fatalError("Image file not found.") }

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

Kotlin

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)
// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

Java

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);

GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
        .setRole("user")
        .addImage(bitmap)
        .addText("Edit this image to make it look like a cartoon")
        .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);
// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
        initialRequest,
        generatedImage -> {
            Content followUpPrompt = new Content.Builder()
                    .addText("But make it old-school line drawing style")
                    .build();
            return chat.sendMessage(followUpPrompt);
        },
        executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

Dart

import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
    responseModalities: [ResponseModalities.text, ResponseModalities.image],
  ),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
// (`sendMessage` takes a single `Content`, so the parts are combined with `Content.multi`)
final response = await chat.sendMessage(Content.multi([prompt, imagePart]));

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage(
  Content.text("But make it old-school line drawing style"),
);

// Inspect the returned image (read it from the follow-up response, not the first one)
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

using System.Linq;  // Needed for First(), OfType(), and Where()
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
                         .OfType<ModelContent.InlineDataPart>()
                         .Where(part => part.MimeType == "image/png");
// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}

Supported features, limitations, and best practices

Supported modalities and capabilities

The following modalities and capabilities are supported for image output from a Gemini model. Each capability is shown with an example prompt and has a code sample above.

Text → Image(s) (text-only to image)
- Generate an image of the Eiffel Tower with fireworks in the background.
Text → Image(s) (text rendering within an image)
- Generate a cinematic image of a large building with this giant text projected on the front of the building.
Text → Image(s) and text (interleaved)
- Generate an illustrated recipe for paella. Create images alongside the text as you generate the recipe.
- Generate a story about a dog in a 3D cartoon animation style. Generate an image for each scene.
Image(s) and text → Image(s) and text (interleaved)
- [image of a furnished room] + What other sofa colors would work in my space? Can you update the image?
Image editing (text-and-image to image)
- [image of scones] + Edit this image to make it look like a cartoon.
- [image of a cat] + [image of a pillow] + Create a cross-stitch of my cat on this pillow.
Multi-turn image editing (chat)
- [image of a blue car] + Turn this car into a convertible. Then, change the color to yellow.

Limitations and best practices

The following limitations and best practices apply to image output from a Gemini model.

Image-generating Gemini models support the following:
- Generating PNG images with a maximum dimension of 1024 px.
- Generating and editing images of people.
- Using safety filters that provide a flexible and less restrictive user experience.
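
One rough way to check the size limit on a returned image is to read the width and height from the PNG header. The sketch below relies on the standard PNG layout (an 8-byte signature followed by the IHDR chunk) and on the Node.js `Buffer` class. The names `pngDimensions` and `fitsDocumentedLimit` are hypothetical and not part of the Firebase AI Logic SDK, and a browser would need a different way to decode the base64 string.

```javascript
// Hypothetical helpers, not part of the Firebase AI Logic SDK.
// Assumes a Node.js runtime (uses Buffer).
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Reads width and height from the IHDR chunk of a base64-encoded PNG.
function pngDimensions(base64Data) {
  const bytes = Buffer.from(base64Data, "base64");
  // 8-byte signature + 4-byte chunk length + 4-byte "IHDR" + 4-byte width + 4-byte height
  if (bytes.length < 24) {
    throw new Error("Data is too short to be a PNG");
  }
  const hasSignature = PNG_SIGNATURE.every((value, i) => bytes[i] === value);
  if (!hasSignature || bytes.toString("latin1", 12, 16) !== "IHDR") {
    throw new Error("Data is not a PNG");
  }
  return {
    width: bytes.readUInt32BE(16),
    height: bytes.readUInt32BE(20),
  };
}

// Returns true when neither side exceeds the documented 1024 px maximum.
function fitsDocumentedLimit(base64Data, maxSide = 1024) {
  const { width, height } = pngDimensions(base64Data);
  return Math.max(width, height) <= maxSide;
}
```

With the Web sample above, the base64 string would come from `inlineDataParts[0].inlineData.data`.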
Image-generating Gemini models do not support the following:
- Including audio or video inputs.
- Generating only images.
  The models always return both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
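
Because a response can mix text and image parts, app code usually has to separate them. The following is a minimal sketch, assuming the part shapes that the Web sample above works with: text parts shaped like `{ text }` and image parts shaped like `{ inlineData: { mimeType, data } }`. The helper name `splitParts` is hypothetical.

```javascript
// Hypothetical helper, not part of the Firebase AI Logic SDK.
// Separates the text and inline image parts of one candidate's content.
// Assumed part shapes:
//   text part:  { text: "..." }
//   image part: { inlineData: { mimeType: "image/png", data: "<base64>" } }
function splitParts(parts) {
  const texts = [];
  const images = [];
  for (const part of parts ?? []) {
    if (typeof part.text === "string") {
      texts.push(part.text);
    } else if (part.inlineData?.mimeType?.startsWith("image/")) {
      images.push(part.inlineData);
    }
  }
  // Text parts are joined in their original order; images keep their order too.
  return { text: texts.join(""), images };
}
```

In the Web sample, the array to pass in would be something like `result.response.candidates[0].content.parts`. Keeping this helper free of SDK calls makes it easy to unit test with hand-written part objects.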
For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.
Image generation may not always trigger. Known issues include the following:
- The model may output text only.
  Try asking for image output explicitly (for example, "generate an image", "provide images as you go along", "update the image").
- The model may stop generating partway through.
  Try again or try a different prompt.
- The model may generate text as an image.
  Try asking for text output explicitly (for example, "generate narrative text along with illustrations").
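
The first two workarounds above can be combined into a small retry loop. This is only a sketch under stated assumptions: `sendPrompt` is a hypothetical async function that you supply, which sends a prompt and resolves to an array of image parts (empty when the model returned text only). Neither it nor `generateImageWithRetry` is part of the SDK.

```javascript
// Hypothetical retry wrapper, not part of the Firebase AI Logic SDK.
// `sendPrompt(prompt)` is an assumed caller-supplied async function that
// resolves to an array of image parts, or an empty array for text-only output.
async function generateImageWithRetry(sendPrompt, prompt, maxAttempts = 3) {
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const images = await sendPrompt(currentPrompt);
    if (images.length > 0) {
      return { images, attempts: attempt };
    }
    // Text-only output: ask for an image explicitly on the next attempt.
    currentPrompt = `Generate an image. ${prompt}`;
  }
  throw new Error(`No image returned after ${maxAttempts} attempts`);
}
```

With the Web sample, `sendPrompt` could wrap `chat.sendMessage(...)` and return `result.response.inlineDataParts() ?? []`. A small attempt cap keeps a persistently text-only response from looping forever.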
When generating text for an image, Gemini works best if you first generate the text and then ask for an image that contains the text.
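
The text-first approach can be sketched as two chat turns. The example assumes a `chat` object that behaves like the one in the Web sample above, where `sendMessage` resolves to a result whose `response` exposes `text()` and `inlineDataParts()`. The function name `textThenImage` and both prompts are illustrative only.

```javascript
// Hypothetical two-step flow, not part of the Firebase AI Logic SDK.
// Assumes `chat.sendMessage(prompt)` resolves to { response } where
// `response.text()` returns a string and `response.inlineDataParts()`
// returns an array of image parts (or undefined when there are none).
async function textThenImage(chat, topic) {
  // Turn 1: generate only the text that should appear in the image.
  const textResult = await chat.sendMessage(
    `Write a short slogan for ${topic}. Reply with the slogan text only.`
  );
  const slogan = textResult.response.text().trim();

  // Turn 2: ask for an image that contains exactly that text.
  const imageResult = await chat.sendMessage(
    `Generate an image of a poster that shows this exact text: "${slogan}"`
  );
  return {
    slogan,
    images: imageResult.response.inlineDataParts() ?? [],
  };
}
```

Splitting the request this way also lets the app show or edit the text before any image is requested.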