Text-To-Speech using Amazon Polly in PHP

Have you ever wanted a program that converts text to speech in PHP? The idea is simple: you upload your speech as a text file and, in return, you get an audio file of the converted speech.

Amazon Polly is an AWS cloud service that converts text into lifelike speech. Amazon attributes its speech quality to a machine-learning approach, with the stated goal of offering the most natural and human-like text-to-speech voices possible.

Benefits of using Amazon Polly include:

  • High quality
  • Low latency
  • Support for a large portfolio of languages and voices
  • Cost-effective
  • Cloud-based solution

Amazon Polly is useful in many kinds of applications, such as newsreaders, games, eLearning platforms, and tools for visually impaired people. You can read more about the service in its documentation.

In this article, I show you how to convert text to speech with Amazon Polly and PHP.

Amazon Polly Console

If you don’t want to build a PHP application, you can use the Polly console directly. Log in to your AWS account and head over to the Amazon Polly console. On this page, you can enter the speech as plain text or in SSML format. You can choose the region and voice id, listen to the speech, or download it in MP3 format.

Using the console is one option if you are the administrator and don’t want to share account credentials with anyone. But what if you want to build an application that does the same job of converting text to speech and lets you download the MP3 of the converted speech?

Text-To-Speech using Amazon Polly and PHP

To build the PHP application for Amazon Polly, you first need your AWS security credentials. You can obtain them by logging in to your AWS account and clicking on ‘My Security Credentials’.

Once you have your credentials, install the AWS SDK for PHP library using Composer. Run the command below to install it.

composer require aws/aws-sdk-php
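The client code later in this article takes the region and keys as hardcoded strings. If you would rather keep the keys out of your source files, one option is to read them from environment variables. Below is a minimal sketch of that idea. The function name buildPollyConfig() and the 'us-east-1' fallback region are my own assumptions for illustration; AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the usual AWS variable names.

```php
<?php
// Sketch: build the Polly client configuration from environment variables
// instead of hardcoding keys. buildPollyConfig() is an illustrative helper,
// not part of the AWS SDK.

function buildPollyConfig(string $defaultRegion = 'us-east-1'): array
{
    $config = [
        'version' => 'latest',
        // Fall back to the default region when AWS_REGION is unset or empty.
        'region'  => getenv('AWS_REGION') ?: $defaultRegion,
    ];

    $key    = getenv('AWS_ACCESS_KEY_ID');
    $secret = getenv('AWS_SECRET_ACCESS_KEY');

    // Set explicit credentials only when both variables are present.
    // Otherwise the 'credentials' key is left out, so the SDK can fall
    // back to its default credential lookup.
    if ($key !== false && $key !== '' && $secret !== false && $secret !== '') {
        $config['credentials'] = ['key' => $key, 'secret' => $secret];
    }

    return $config;
}
```

The returned array has the same shape as the $config array used further down, so it can be passed to the PollyClient constructor in its place.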

After installation, build a form that lets the user upload a text file and sends it to the server for processing.

<form method="post" enctype="multipart/form-data">
    <p><input type="file" name="file" /></p>
    <button type="submit" name="submit">Submit</button>
</form>
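On the server side, it is worth checking the upload before reading it, since the user may submit the form without a file or the upload may fail. Here is a rough sketch of such a check. The helper name checkUpload() and the 6000-byte default limit are assumptions for illustration only; check the Polly documentation for the actual input limits.

```php
<?php
// Sketch: validate one entry of $_FILES before reading it.
// checkUpload() and the $maxBytes default are illustrative assumptions.
// Returns an error message, or null when the upload looks usable.

function checkUpload(?array $file, int $maxBytes = 6000): ?string
{
    if ($file === null || !isset($file['error'], $file['size'])) {
        return 'No file was uploaded.';
    }
    if ($file['error'] === UPLOAD_ERR_NO_FILE) {
        return 'No file was uploaded.';
    }
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return 'Upload failed with error code ' . $file['error'] . '.';
    }
    if ($file['size'] === 0) {
        return 'The uploaded file is empty.';
    }
    if ($file['size'] > $maxBytes) {
        return 'The uploaded file is larger than ' . $maxBytes . ' bytes.';
    }

    return null;
}

// Usage with the form above, whose file input is named "file":
// $error = checkUpload($_FILES['file'] ?? null);
```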

As mentioned previously, the user can pass speech either as plain text or as Speech Synthesis Markup Language (SSML). I prefer SSML because it lets us control how the speech is generated from the provided text. Using SSML, we can include a pause within the text, change the speech rate, emphasize specific words or phrases, etc. Read more about this in the AWS documentation. Basically, you need to use the tags provided by SSML in your text.

As an example, I am building a plain text file with SSML tags. My text file is as follows.

<speak>
    <prosody rate='medium'>Hi, I am Sajid. I do blogging at Artisans Web.</prosody>
</speak>
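If your users upload plain text rather than hand-written SSML, you could wrap the text in SSML tags on the server. The sketch below shows one way to do it, using the prosody tag from my example plus a break tag for pauses between paragraphs. The helper textToSsml() is hypothetical, and the 500ms pause is just a value I picked. Note that characters such as & and < must be escaped, otherwise the markup is invalid.

```php
<?php
// Sketch: wrap plain text in SSML. textToSsml() is a hypothetical helper,
// not part of the AWS SDK. $rate and $pause are inserted as-is, so they
// should come from trusted code, not from user input.

function textToSsml(string $text, string $rate = 'medium', string $pause = '500ms'): string
{
    // Split on blank lines so a pause can be placed between paragraphs.
    $paragraphs = preg_split('/\R{2,}/', trim($text));

    $parts = [];
    foreach ($paragraphs as $paragraph) {
        // Escape &, <, > and quotes so the text cannot break the markup.
        $parts[] = htmlspecialchars($paragraph, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }

    $body = implode('<break time="' . $pause . '"/>', $parts);

    return '<speak><prosody rate="' . $rate . '">' . $body . '</prosody></speak>';
}

echo textToSsml("Tom & Jerry\n\nThe end");
// prints: <speak><prosody rate="medium">Tom &amp; Jerry<break time="500ms"/>The end</prosody></speak>
```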

Now, when the form is submitted, the uploaded text file is sent to the Polly service and, in return, the MP3 file downloads automatically. The code for it is as follows.

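<?php
require_once "vendor/autoload.php";

use Aws\Polly\PollyClient;

if ( isset($_POST['submit']) ) {

    try {
        // Make sure a file actually arrived before reading it.
        if ( !isset($_FILES['file']) || $_FILES['file']['error'] !== UPLOAD_ERR_OK ) {
            throw new Exception('Please upload a text file.');
        }

        $config = [
            'version' => 'latest',
            'region' => 'YOUR_AWS_REGION',
            'credentials' => [
                'key' => 'ACCESS_KEY_ID',
                'secret' => 'SECRET_ACCESS_KEY',
            ]
        ];

        $client = new PollyClient($config);

        $args = [
            'OutputFormat' => 'mp3',
            'Text' => file_get_contents($_FILES['file']['tmp_name']),
            'TextType' => 'ssml', // use 'text' for plain text without SSML tags
            'VoiceId' => 'Matthew', // pass preferred voice id here
        ];

        $result = $client->synthesizeSpeech($args);

        // AudioStream is a stream object; read the MP3 bytes into a string.
        $resultData = $result->get('AudioStream')->getContents();

        // Send the audio to the browser as a file download.
        header('Content-Type: audio/mpeg');
        header('Content-Length: ' . strlen($resultData));
        header('Content-Disposition: attachment; filename="text-to-speech.mp3"');
        header('X-Pad: avoid browser bug');
        header('Cache-Control: no-cache');
        echo $resultData;
        exit; // stop here so nothing else is appended to the MP3 output
    } catch(Exception $e) {
        echo $e->getMessage();
    }
}
?>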

Replace the placeholders with your actual values. In the above code, I have passed ‘Matthew’ as the value for ‘VoiceId’. ‘Matthew’ is the id of a US English (en-US) voice. Of course, you can choose any preferred voice id. You will find the list of available voices on the Voices in Amazon Polly page of the AWS documentation.
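You can also fetch the voice list from code, using the describeVoices operation of the Polly client. Below is a small sketch that collects the voice ids for one language. The helper listVoiceIds() is a name I made up, and for simplicity it reads only the first page of results, ignoring the NextToken that the service may return when there are more.

```php
<?php
// Sketch: list voice ids for one language via DescribeVoices.
// listVoiceIds() is a hypothetical helper; $client is expected to be an
// Aws\Polly\PollyClient instance. Pagination (NextToken) is not handled.

function listVoiceIds($client, string $languageCode = 'en-US'): array
{
    $result = $client->describeVoices(['LanguageCode' => $languageCode]);

    $ids = [];
    foreach ($result['Voices'] as $voice) {
        $ids[] = $voice['Id'];
    }
    sort($ids); // alphabetical order, handy for a dropdown

    return $ids;
}

// Usage, assuming $client was created as in the code above:
// print_r(listVoiceIds($client, 'en-US'));
```

Any id from the returned list can then be passed as the ‘VoiceId’ value.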

That’s all about converting text to speech using Amazon Polly in PHP. Go ahead and test it. On uploading your file, you will get the MP3 audio file of your converted speech. I would like to hear your thoughts or suggestions in the comment section below.
