Create audio from text by using client libraries

This quickstart walks you through using client libraries to send a request to Text-to-Speech and create audio from text.

To learn more about the fundamental concepts in Text-to-Speech, see Text-to-Speech API basics. To see which synthetic voices are available for your language, see the supported voices and languages page.

Before you begin

Before you can send a request to the Text-to-Speech API, you must have completed the following actions. See the Before you begin page for details.

  • Enable Text-to-Speech on a Google Cloud project.
  • Make sure that billing is enabled for Text-to-Speech.
  • Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  • If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
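If the client libraries later report that no credentials were found, it can help to check where Application Default Credentials are usually looked up. The following is a minimal sketch, not an official tool: the helper name `adc_candidate_paths` is hypothetical, and it assumes the default gcloud configuration directory (it does not account for a custom `CLOUDSDK_CONFIG` setting).

```python
import os
import sys
from pathlib import Path


def adc_candidate_paths(environ=None, platform=None):
    """Return the files where Application Default Credentials are usually looked up.

    Hypothetical helper: it checks the GOOGLE_APPLICATION_CREDENTIALS variable
    first, then the file written by `gcloud auth application-default login`.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform

    candidates = []
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))

    if platform.startswith("win"):
        # Assumed default location on Windows.
        base = Path(environ.get("APPDATA", "")) / "gcloud"
    else:
        # Assumed default location on Linux and macOS.
        home = environ["HOME"] if "HOME" in environ else str(Path.home())
        base = Path(home) / ".config" / "gcloud"
    candidates.append(base / "application_default_credentials.json")
    return candidates


if __name__ == "__main__":
    for path in adc_candidate_paths():
        status = "found" if path.is_file() else "missing"
        print(f"{status}: {path}")
```

If every candidate is reported as missing on a local shell, running `gcloud auth application-default login` again is the step described above.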

Install the client library

Go

go get cloud.google.com/go/texttospeech/apiv1

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.70.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-texttospeech:2.78.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-texttospeech" % "2.78.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project by using the IDE plugin for your editor. The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

Node.js

Before installing the library, make sure you've prepared your environment for Node.js development.

npm install @google-cloud/text-to-speech

Python

Before installing the library, make sure you've prepared your environment for Python development.

pip install --upgrade google-cloud-texttospeech
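To confirm that the package landed in the active environment, you can query the installed distribution metadata. This is a small sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution="google-cloud-texttospeech"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("google-cloud-texttospeech is not installed in this environment")
    else:
        print(f"google-cloud-texttospeech {version}")
```

A `None` result usually means that `pip` installed into a different interpreter or virtual environment than the one running the script.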

Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for Ruby.

Create audio data

Now you can use Text-to-Speech to create an audio file of synthetic human speech. Use the following code to send a synthesize request to the Text-to-Speech API.
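For orientation, every sample on this page sends the same three pieces of information: the input text, the voice selection, and the audio configuration. The sketch below shows them as the JSON body accepted by the REST method `text:synthesize`, and decodes the base64 `audioContent` field of a response. It sends no HTTP request, and the function names are hypothetical and used only for illustration.

```python
import base64
import json


def build_synthesize_body(text, language_code="en-US"):
    """Return a JSON-serializable body for a text:synthesize request."""
    return {
        # The text to be synthesized.
        "input": {"text": text},
        # The language and SSML voice gender.
        "voice": {"languageCode": language_code, "ssmlGender": "NEUTRAL"},
        # The encoding of the returned audio.
        "audioConfig": {"audioEncoding": "MP3"},
    }


def decode_audio_content(response_body):
    """Decode the base64-encoded audioContent field of a synthesize response."""
    return base64.b64decode(response_body["audioContent"])


if __name__ == "__main__":
    print(json.dumps(build_synthesize_body("Hello, World!"), indent=2))
```

The client libraries build and decode these messages for you, which is why the samples work with binary audio content directly.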

Go

// Command quickstart generates an audio file with the content "Hello, World!".
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
)

func main() {
	// Instantiates a client.
	ctx := context.Background()

	client, err := texttospeech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Perform the text-to-speech request on the text input with the selected
	// voice parameters and audio file type.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// Set the text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Build the voice request, select the language code ("en-US") and the SSML
		// voice gender ("neutral").
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "en-US",
			SsmlGender:   texttospeechpb.SsmlVoiceGender_NEUTRAL,
		},
		// Select the type of audio file you want returned.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// The resp's AudioContent is binary.
	filename := "output.mp3"
	err = os.WriteFile(filename, resp.AudioContent, 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Java

// Imports the Google Cloud client library
import com.google.cloud.texttospeech.v1.AudioConfig;
import com.google.cloud.texttospeech.v1.AudioEncoding;
import com.google.cloud.texttospeech.v1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1.SynthesisInput;
import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
import com.google.protobuf.ByteString;
import java.io.FileOutputStream;
import java.io.OutputStream;

/**
 * Google Cloud TextToSpeech API sample application. Example usage: mvn package exec:java
 * -Dexec.mainClass='com.example.texttospeech.QuickstartSample'
 */
public class QuickstartSample {

  /** Demonstrates using the Text-to-Speech API. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
      // Set the text input to be synthesized
      SynthesisInput input = SynthesisInput.newBuilder().setText("Hello, World!").build();

      // Build the voice request, select the language code ("en-US") and the ssml voice gender
      // ("neutral")
      VoiceSelectionParams voice =
          VoiceSelectionParams.newBuilder()
              .setLanguageCode("en-US")
              .setSsmlGender(SsmlVoiceGender.NEUTRAL)
              .build();

      // Select the type of audio file you want returned
      AudioConfig audioConfig =
          AudioConfig.newBuilder().setAudioEncoding(AudioEncoding.MP3).build();

      // Perform the text-to-speech request on the text input with the selected voice parameters and
      // audio file type
      SynthesizeSpeechResponse response =
          textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

      // Get the audio contents from the response
      ByteString audioContents = response.getAudioContent();

      // Write the response to the output file.
      try (OutputStream out = new FileOutputStream("output.mp3")) {
        out.write(audioContents.toByteArray());
        System.out.println("Audio content written to file \"output.mp3\"");
      }
    }
  }
}

Node.js

Before running the example, make sure you've prepared your environment for Node.js development.

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const {writeFile} = require('node:fs/promises');

// Creates a client
const client = new textToSpeech.TextToSpeechClient();

async function quickStart() {
  // The text to synthesize
  const text = 'hello, world!';

  // Construct the request
  const request = {
    input: {text: text},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    // Select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);

  // Save the generated binary audio content to a local file
  await writeFile('output.mp3', response.audioContent, 'binary');
  console.log('Audio content written to file: output.mp3');
}

// Top-level await is not available in a CommonJS module (this file uses
// require), so handle the returned promise explicitly.
quickStart().catch(err => {
  console.error(err);
  process.exitCode = 1;
});

Python

Before running the example, make sure you've prepared your environment for Python development.

"""Synthesizes speech from the input string of text or ssml.
Make sure to be working in a virtual environment.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Congratulations! You've sent your first request to Text-to-Speech.
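Once the first request works, the Python sample can be wrapped in a small helper so that one client is reused for several requests. This is a sketch under stated assumptions rather than part of the official sample: `synthesize_to_file` and `main` are hypothetical names, and the helper only relies on the `synthesize_speech` call and the `audio_content` field shown in the Python sample.

```python
def synthesize_to_file(client, synthesis_input, voice, audio_config, path):
    """Send one synthesize request and write the returned audio to `path`.

    Returns the number of bytes written. `client` is any object that exposes
    synthesize_speech(input=..., voice=..., audio_config=...).
    """
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )
    audio = response.audio_content
    if not audio:
        raise ValueError("the response contained no audio content")
    # The audio content is binary, so the file is opened in binary mode.
    with open(path, "wb") as out:
        out.write(audio)
    return len(audio)


def main():
    # Imported lazily so the helper above can be exercised without the library.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )
    # Reuse the same client, voice, and audio configuration for each phrase.
    for index, text in enumerate(["Hello, World!", "Goodbye, World!"]):
        synthesis_input = texttospeech.SynthesisInput(text=text)
        size = synthesize_to_file(
            client, synthesis_input, voice, audio_config, f"output_{index}.mp3"
        )
        print(f"Wrote {size} bytes to output_{index}.mp3")
```

Calling `main()` sends real requests, so it needs the credentials and billing setup described in the Before you begin section.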

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

What's next