Audio transcription and generation
Transcribe spoken language from audio files and create spoken audio using AI-generated speech
Use cases
- Automatically trigger transcription when new audio files are uploaded.
- Save and summarize transcriptions using a Squid AI Agent.
- Create an audible persona for your AI agent.
Run transcription
To transcribe a file, use the Squid Client SDK.
The Squid AI Audio client requires admin access to your Squid resources. It should only be used in a secure environment where you can safely provide your Squid API key, such as the Squid backend.
Call Squid AI Audio's transcribe() method, passing the audio file data:
- TypeScript

    const fileName = 'myAudioFile.mp3';
    const audioBlobAndFilename = {
      audioBlob, // provide your audio as a Blob
      fileName,
    };
    const transcription = await squid.ai().audio().transcribe(audioBlobAndFilename, {
      modelName: 'whisper-1',
    });

- Python

    transcription = await squid.ai().audio().transcribe(
        audio_data,  # provide your audio as bytes
        'myAudioFile.mp3',
        'audio/mpeg',
        options={'modelName': 'whisper-1'},
    )
The options parameter of transcribe() is optional. Use it to customize your transcription, for example by selecting the model as shown above. The available options are listed in the reference documentation.
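The TypeScript example above expects the audio to already be a Blob. As a minimal sketch of one way to prepare that object from raw bytes, the helper below wraps an ArrayBuffer in a Blob and picks a MIME type from the file extension. The helper name toAudioBlobAndFilename, the AudioBlobAndFilename interface, and the extension table are hypothetical and are not part of the Squid SDK; the table is illustrative and is not a list of formats Squid supports.

```typescript
// Hypothetical helper, not part of the Squid SDK.
// Illustrative extension-to-MIME table; extend it for the formats you use.
const MIME_BY_EXTENSION: Record<string, string> = {
  mp3: 'audio/mpeg',
  wav: 'audio/wav',
  m4a: 'audio/mp4',
  ogg: 'audio/ogg',
  webm: 'audio/webm',
};

// Assumed shape, mirroring the object literal in the TypeScript example.
interface AudioBlobAndFilename {
  audioBlob: Blob;
  fileName: string;
}

function toAudioBlobAndFilename(
  data: ArrayBuffer,
  fileName: string,
): AudioBlobAndFilename {
  // Take the text after the last dot and compare case-insensitively.
  const extension = fileName.split('.').pop()?.toLowerCase() ?? '';
  const type = MIME_BY_EXTENSION[extension];
  if (!type) {
    throw new Error(`Unrecognized audio file extension: ${fileName}`);
  }
  return { audioBlob: new Blob([data], { type }), fileName };
}
```

The returned object has the same shape as audioBlobAndFilename in the TypeScript example, so it can be passed as the first argument to transcribe().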
Create audio files
To create AI-generated audio files, use Squid AI Audio's createSpeech() method. This method takes an input string and an options parameter, which is used to customize the audio file. The method returns a promise that resolves to the generated audio file.
- TypeScript

    const audioFile = await squid.ai().audio().createSpeech(
      'Say hello to all users like a pirate would say hello.',
      { modelName: 'tts-1' },
    );

- Python

    audio_file = await squid.ai().audio().create_speech(
        'Say hello to all users like a pirate would say hello.',
        {'modelName': 'tts-1'},
    )
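Once the promise resolves, you will often want the raw bytes of the audio, for example to write them to storage or send them in a response. The sketch below assumes the resolved value is Blob-like, meaning it exposes an arrayBuffer() method as the standard Blob and File types do; check the reference documentation for the actual return type. The helper name blobToBytes is hypothetical and is not part of the Squid SDK.

```typescript
// Hypothetical helper, not part of the Squid SDK.
// Assumes the generated audio is Blob-like (has an arrayBuffer() method).
async function blobToBytes(audio: Blob): Promise<Uint8Array> {
  // Read the whole Blob into memory and expose it as a byte array.
  return new Uint8Array(await audio.arrayBuffer());
}
```

Under that assumption, the bytes returned by blobToBytes(audioFile) could then be written with Node's fs.promises.writeFile or uploaded to a storage bucket.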