Text-To-Speech using Amazon Polly in PHP

Amazon Polly is a cloud service that converts text into lifelike speech. It uses machine-learning-based speech synthesis to produce voices that sound natural and human-like.

Benefits of using Amazon Polly include:

  • High quality
  • Low latency
  • Support for a large portfolio of languages and voices
  • Cost-effective
  • Cloud-based solution

Amazon Polly is useful in many kinds of applications, such as newsreaders, games, eLearning platforms, and tools for visually impaired people. You can read more about the service in its official documentation.

In this article, I show you how to convert text to speech with Amazon Polly and PHP.

Amazon Polly Console

If you don’t want to build a PHP application, you can use the Polly console directly. Log in to your AWS account and head over to the Amazon Polly console. On this page, you can enter the text to be spoken in plain text or SSML format. You can choose the region and voice id, then listen to the speech or download it in MP3 format.

Using the console is a good option if you are the account administrator and don’t want to share your credentials with anyone. If, however, you want other people to convert text to speech and download the result as an MP3 without console access, you can build a small application that does the same job.

Text-To-Speech using Amazon Polly and PHP

To build the PHP application for Amazon Polly, you first need your AWS security credentials (an access key ID and a secret access key). You can obtain them by logging in to your AWS account and clicking ‘My Security Credentials’.

Once you have your credentials, install the AWS SDK for PHP using Composer. Run the command below to install the library.

composer require aws/aws-sdk-php
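
With the SDK installed, the client needs a region and credentials. The code later in this article hardcodes both for simplicity. As a rough sketch of an alternative, the helper below assumes the region is supplied in an `AWS_REGION` environment variable and leaves out the `credentials` entry, so that the SDK's default credential provider chain can pick up `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` from the environment. The function name `pollyConfigFromEnv` and the fallback region are illustrative assumptions, not part of the SDK.

```php
<?php
// Hypothetical helper: builds a client config without hardcoded keys.
function pollyConfigFromEnv(string $fallbackRegion = 'us-east-1'): array
{
    $region = getenv('AWS_REGION');

    return [
        'version' => 'latest',
        // Use the fallback when the variable is unset or empty.
        'region'  => ($region === false || $region === '') ? $fallbackRegion : $region,
        // No 'credentials' entry: the SDK falls back to its default provider chain.
    ];
}
```

The returned array can be passed to `new PollyClient(...)` in place of a hardcoded config, which keeps the keys out of the source file.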

After installation, build a form that lets the user upload a text file and send it to the server for processing.

<form method="post" enctype="multipart/form-data">
    <input type="file" name="file" />
    <button type="submit" name="submit">Submit</button>
</form>

As mentioned previously, the text can be passed either as plain text or in SSML format. I prefer SSML because it gives us control over how the speech is generated from the text. Using SSML, we can include a pause within the text, change the speech rate, emphasize specific words or phrases, etc. Read more about this in ‘Using SSML’. Basically, you wrap parts of your text in the tags that SSML provides.

For this tutorial, I am using a plain text file that contains SSML tags. My text file is as follows.

dummy.txt

<speak>
    <prosody rate='medium'>Hi, I am Sajid. I do blogging at Artisans Web.</prosody>
</speak>
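
The SSML features mentioned above (a pause, a changed speech rate, an emphasized phrase) can also be generated from PHP instead of being typed by hand. The sketch below is only an illustration: `ssmlEscape` and `buildSsml` are hypothetical helper names, and which tags are accepted can vary by voice and engine, so check the SSML documentation for the voice you choose. The escaping step matters because characters such as `&` or `<` in ordinary text would otherwise make the SSML invalid.

```php
<?php
// Hypothetical helper: escapes plain text so it is safe inside SSML.
function ssmlEscape(string $text): string
{
    return htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: builds an SSML document with a pause,
// an emphasized phrase, and a slower closing sentence.
function buildSsml(string $intro, string $important, string $outro): string
{
    return '<speak>'
        . ssmlEscape($intro)
        . '<break time="500ms"/>'
        . '<emphasis level="strong">' . ssmlEscape($important) . '</emphasis>'
        . '<prosody rate="slow">' . ssmlEscape($outro) . '</prosody>'
        . '</speak>';
}

echo buildSsml('Hi, I am Sajid.', 'Text & speech', 'I do blogging at Artisans Web.');
```

The resulting string can be used as the `Text` value in place of the uploaded file's contents.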

Next, when the form is submitted, the uploaded text file is sent to the AWS cloud service, and the MP3 file returned by the service is downloaded automatically. Write the code for it as follows.

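<?php
// Place this code above the form's HTML: headers must be sent before any output.
require_once "vendor/autoload.php";

use Aws\Polly\PollyClient;

if ( isset($_POST['submit']) ) {

    try {
        // Fail early if the upload is missing or did not complete.
        if ( !isset($_FILES['file']) || $_FILES['file']['error'] !== UPLOAD_ERR_OK ) {
            throw new Exception('Please upload a valid text file.');
        }

        // Hardcoded for the demo only; keep real keys out of source control.
        $config = [
            'version' => 'latest',
            'region' => 'YOUR_AWS_REGION',
            'credentials' => [
                'key' => 'ACCESS_KEY_ID',
                'secret' => 'SECRET_ACCESS_KEY',
            ],
        ];

        $client = new PollyClient($config);

        $args = [
            'OutputFormat' => 'mp3',
            'Text' => file_get_contents($_FILES['file']['tmp_name']),
            'TextType' => 'ssml',
            'VoiceId' => 'Matthew', // pass preferred voice id here
        ];

        $result = $client->synthesizeSpeech($args);

        // AudioStream is a stream object; read it fully into a string.
        $resultData = $result->get('AudioStream')->getContents();

        // Send the audio to the browser as a file download.
        header('Content-Type: audio/mpeg');
        header('Content-Length: ' . strlen($resultData));
        header('Content-Disposition: attachment; filename="text-to-speech.mp3"');
        header('X-Pad: avoid browser bug');
        header('Cache-Control: no-cache');
        echo $resultData;
        exit; // stop here so no HTML is appended to the MP3 data
    } catch(Exception $e) {
        echo $e->getMessage();
    }
}
?>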

Replace the placeholders with the actual values. In the above code, I have passed ‘Matthew’ as the value for ‘VoiceId’. ‘Matthew’ is a voice for the English (US) (en-US) language. Of course, you can choose any voice id you prefer. The full list of available voices is on the ‘Voices in Amazon Polly’ documentation page.
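
The voices can also be looked up from code. The sketch below uses the client's `describeVoices` operation and assumes the `Voices`, `Id`, `LanguageCode`, and `Gender` fields of its response. `listVoiceLabels` is a hypothetical helper name, and for brevity it ignores pagination (`NextToken`), so it may not return every voice.

```php
<?php
// Hypothetical helper: returns labels such as "Matthew (en-US, Male)".
// $client is expected to be an Aws\Polly\PollyClient instance.
function listVoiceLabels($client, string $languageCode): array
{
    $result = $client->describeVoices(['LanguageCode' => $languageCode]);

    $labels = [];
    foreach ($result['Voices'] as $voice) {
        $labels[] = sprintf('%s (%s, %s)', $voice['Id'], $voice['LanguageCode'], $voice['Gender']);
    }
    sort($labels); // alphabetical order for display

    return $labels;
}
```

With the `$client` object from the code above, `print_r(listVoiceLabels($client, 'en-US'));` would print the voice ids available for US English.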

Go ahead and test it. After you upload your text file, the browser downloads an MP3 audio file of your text.
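
If the uploaded file is not valid SSML, the request fails and the error message from the service is shown instead of a download. As a minimal sketch, the upload could be checked before it is sent. `isWellFormedSsml` is a hypothetical helper that assumes PHP's SimpleXML extension is available. It only confirms that the text is well-formed XML with a `<speak>` root; it does not check whether Polly supports each tag.

```php
<?php
// Hypothetical helper: returns true when $text is well-formed XML
// whose root element is <speak>.
function isWellFormedSsml(string $text): bool
{
    if (trim($text) === '') {
        return false;
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($text);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml !== false && $xml->getName() === 'speak';
}
```

In the upload handler, this check could run on the file contents before calling `synthesizeSpeech`, with a friendly message shown when it returns false.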

That’s all there is to converting text to speech using Amazon Polly in PHP. I would like to hear your thoughts or suggestions in the comment section below.
