OpenAI compatibility

Gemini models are accessible through the OpenAI libraries (Python and TypeScript/JavaScript) as well as the REST API. Only Google Cloud Auth is supported when using the OpenAI libraries on Vertex AI. If you are not already using the OpenAI libraries, we recommend calling the Gemini API directly.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain to me how AI works"}
    ]
)

print(response.choices[0].message)

What changed

  • api_key=credentials.token: To use Google Cloud authentication, obtain a Google Cloud auth token with the sample code shown here.

  • base_url: Tells the OpenAI library to send requests to Google Cloud instead of its default URL.

  • model="google/gemini-2.0-flash-001": Choose a compatible Gemini model from those hosted on Vertex.

Thinking

Gemini 2.5 models are trained to think through complex problems, which leads to significantly improved reasoning. The Gemini API exposes a "thinking budget" parameter that gives fine-grained control over how much the model thinks.

Unlike the Gemini API, the OpenAI API offers three levels of thinking control: "low", "medium", and "high", which map behind the scenes to thinking token budgets of 1,024, 8,192, and 24,576, respectively.

Not specifying a reasoning effort is equivalent to not specifying a thinking budget.

For more direct control over the thinking budget and other thinking-related configuration in the OpenAI-compatible API, use extra_body.google.thinking_config.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    reasoning_effort="low",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain to me how AI works"}
    ]
)

print(response.choices[0].message)
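As a minimal sketch of the extra_body route mentioned above, an explicit thinking budget can be passed instead of reasoning_effort. The field names under google.thinking_config follow the Gemini API documentation and are an assumption for the Vertex endpoint:

```python
# Hypothetical sketch: an explicit thinking budget via extra_body instead of
# reasoning_effort. Field names under thinking_config are assumptions.
thinking_config = {
    "google": {
        "thinking_config": {
            "thinking_budget": 2048,   # thinking-token budget for this request
            "include_thoughts": True,  # also return thought summaries
        }
    }
}

# Reusing the `client` from the sample above:
# response = client.chat.completions.create(
#     model="google/gemini-2.5-flash",
#     messages=[{"role": "user", "content": "Explain to me how AI works"}],
#     extra_body=thinking_config,
# )
```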

Streaming

The Gemini API supports streaming responses.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta)

Function calling

Function calling makes it easier to get structured data outputs from generative models, and it is supported in the Gemini API.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. Chicago, IL",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        }
    }
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]

response = client.chat.completions.create(
    model="google/gemini-2.0-flash",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

print(response)

Image understanding

Gemini models are natively multimodal and deliver best-in-class performance on many common vision tasks.

Python

import base64
from google.auth import default
import google.auth.transport.requests
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
# base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
    model="google/gemini-2.0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0])


Audio understanding

Analyze audio input:

Python

import base64
from google.auth import default
import google.auth.transport.requests
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
    base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="google/gemini-2.0-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Transcribe this audio",
                },
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": base64_audio,
                        "format": "wav"
                    }
                }
            ],
        }
    ],
)

print(response.choices[0].message.content)

Structured output

Gemini models can output JSON objects in any structure you define.

Python

from google.auth import default
import google.auth.transport.requests
from pydantic import BaseModel
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
    base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="google/gemini-2.0-flash",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
    ],
    response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

Current limitations

  • Access tokens are valid for 1 hour by default. After they expire, you must re-authenticate. See this code sample for details.

  • While feature support is expanding, support for the OpenAI libraries is still in preview. If you have questions or run into issues, post in the Google Cloud community.
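To cope with the one-hour token lifetime, requests can go through a small wrapper that refreshes the credentials on demand. This is a sketch assuming a google.auth-style credentials object (one exposing .valid, .token, and .refresh()):

```python
# Sketch for the one-hour token limitation: wrap a google.auth-style
# credentials object and refresh it only when the cached token has expired.
class RefreshingToken:
    def __init__(self, creds, request):
        self.creds = creds      # e.g. the `credentials` from the samples above
        self.request = request  # e.g. google.auth.transport.requests.Request()

    @property
    def token(self):
        # google.auth marks credentials invalid once the token expires.
        if not self.creds.valid:
            self.creds.refresh(self.request)
        return self.creds.token
```

Because the OpenAI client captures api_key at construction time, a long-running job would rebuild the client (or supply the fresh token per request) using the wrapper's token property.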

What's next