Speech-To-Text using Amazon Transcribe in PHP

Recently, I was working on a project where I was introduced to the Amazon Transcribe service. We wanted to add speech-to-text conversion to our application, and we found Amazon Transcribe to be the best fit for it. Amazon Transcribe uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately.

In this article, I'll show you how to convert speech to text using Amazon Transcribe in PHP. We will use the official AWS SDK for PHP.

Getting Started

To get started, you need an AWS account. Log in to your AWS account and grab your security credentials. We will need these credentials later in the tutorial.

[Screenshot: AWS credentials]
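
Rather than pasting the keys straight into a source file, one option is to keep them in environment variables and read them at runtime. The sketch below is only an illustration: the function name load_aws_settings() is hypothetical, and the variable names follow the common AWS naming convention, so adjust them to whatever your server actually defines.

```php
<?php
// Hypothetical helper: collects AWS settings from environment variables.
// The variable names are an assumption based on the usual AWS convention.
function load_aws_settings(): array
{
    $settings = [
        'region'            => getenv('AWS_REGION'),
        'access_key'        => getenv('AWS_ACCESS_KEY_ID'),
        'secret_access_key' => getenv('AWS_SECRET_ACCESS_KEY'),
    ];

    foreach ($settings as $name => $value) {
        // getenv() returns false when the variable is not set
        if ($value === false || $value === '') {
            throw new RuntimeException("Missing AWS setting: $name");
        }
    }

    return $settings;
}
```

The returned values can then stand in for the region, access key, and secret access key placeholders used later in this tutorial.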

After this, install the AWS SDK PHP library using the Composer command:

composer require aws/aws-sdk-php

To convert speech to text, you need your media files ready. The allowed media formats are mp3, mp4, wav, and flac. Amazon Transcribe also supports several languages; you can find the full list of supported languages and the basics of the request parameters in its documentation.
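
As a small illustration of the format restriction, the sketch below maps a file name to one of the four formats listed above. The function name media_format_from_filename() is hypothetical and the check looks only at the file extension, so treat it as a convenience rather than real validation.

```php
<?php
// Hypothetical helper: maps a file name to one of the formats listed above.
// Returns null when the extension is not in the list.
function media_format_from_filename(string $filename): ?string
{
    $allowed = ['mp3', 'mp4', 'wav', 'flac'];
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    return in_array($extension, $allowed, true) ? $extension : null;
}
```

If you want to state the format explicitly, the result could be passed as the optional MediaFormat parameter when starting a transcription job.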

While integrating Amazon Transcribe into the application, we have to build the flow as follows:

  • Upload the media file to an S3 bucket.
  • Instantiate an Amazon Transcribe client.
  • Start a transcription job by passing the S3 media URL and a unique job ID.
  • Amazon Transcribe may take a few minutes to finish the transcription, so wait for it.
  • Download the text file after AWS completes the transcription job.

Let’s see how to handle this flow with the actual PHP code.

Speech-To-Text using Amazon Transcribe in PHP

First, create an HTML form where users can browse for a media file and hit the submit button. Upon submission, we take the media file for further processing and finally send the transcribed text back to the browser as a ‘.txt’ file.

<form method="post" enctype="multipart/form-data">
    <p><input type="file" name="audio" accept="audio/*,video/*" /></p>
    <input type="submit" name="submit" value="Submit" />
</form>

On the PHP end, you have to send the media file to the AWS service for processing, so include the AWS environment as follows.

<?php
require 'vendor/autoload.php';
 
use Aws\S3\S3Client;
use Aws\TranscribeService\TranscribeServiceClient;

// submission code

After this, upload the media file to the S3 bucket and grab the S3 URL of the uploaded media.

if ( isset($_POST['submit']) ) {

    $arr_mime_types = ['audio/wav', 'audio/mpeg', 'video/mp4', 'audio/x-flac'];
    if ( !in_array($_FILES['audio']['type'], $arr_mime_types) ) {
        die('File type is not allowed');
    }

    $region = 'PASS_REGION';
    $access_key = 'ACCESS_KEY';
    $secret_access_key = 'SECRET_ACCESS_KEY';

    // Instantiate an Amazon S3 client.
    $s3 = new S3Client([
        'version' => 'latest',
        'region'  => $region,
        'credentials' => [
            'key'    => $access_key,
            'secret' => $secret_access_key
        ]
    ]);

    $bucketName = 'PASS_BUCKET_NAME';
    $key = basename($_FILES['audio']['name']);

    // upload file on S3 Bucket
    try {
        $result = $s3->putObject([
            'Bucket' => $bucketName,
            'Key'    => $key,
            'Body'   => fopen($_FILES['audio']['tmp_name'], 'r'),
            'ACL'    => 'public-read',
        ]);
        $audio_url = $result->get('ObjectURL');

        // Amazon Transcribe service start here
    }  catch (Exception $e) {
        echo $e->getMessage();
    }
}
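
One caveat about the check above: the 'type' value in $_FILES is supplied by the browser, so it can be wrong or spoofed. A possible hardening step, sketched below under the assumption that PHP's fileinfo extension is available, is to inspect the file contents on the server. The function name is_allowed_media() is hypothetical.

```php
<?php
// Hypothetical helper: inspects the file contents instead of trusting the browser.
function is_allowed_media(string $path, array $allowed_types): bool
{
    if (!is_file($path)) {
        return false;
    }

    // finfo reads the first bytes of the file and guesses its MIME type
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $detected = $finfo->file($path);

    return $detected !== false && in_array($detected, $allowed_types, true);
}
```

It could be called with $_FILES['audio']['tmp_name'] and the same list of MIME types. The strings reported depend on the system's file-type database (a WAV file may come back as audio/x-wav, for example), so the allowed list may need a few extra entries.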

Make sure to replace the placeholders with the actual values. Next, we need to pass the uploaded media URL to the Amazon Transcribe service. It also requires a unique job ID, which I will create using the uniqid() function.

// Create Amazon Transcribe Client
$awsTranscribeClient = new TranscribeServiceClient([
    'region' => $region,
    'version' => 'latest',
    'credentials' => [
        'key'    => $access_key,
        'secret' => $secret_access_key
    ]
]);

// Start a Transcription Job
$job_id = uniqid();
$transcriptionResult = $awsTranscribeClient->startTranscriptionJob([
        'LanguageCode' => 'en-US',
        'Media' => [
            'MediaFileUri' => $audio_url,
        ],
        'TranscriptionJobName' => $job_id,
]);

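// Poll every 5 seconds until the job reaches a terminal state
while (true) {
    $status = $awsTranscribeClient->getTranscriptionJob([
        'TranscriptionJobName' => $job_id
    ]);
    $job_status = $status->get('TranscriptionJob')['TranscriptionJobStatus'];

    if ($job_status == 'COMPLETED') {
        break;
    }
    if ($job_status == 'FAILED') {
        // Without this check a failed job would keep the loop running forever
        throw new Exception('Transcription job failed: ' . $status->get('TranscriptionJob')['FailureReason']);
    }

    sleep(5);
}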

// download the txt file

In the above code, we instantiate the Amazon Transcribe client and start the transcription job. It may take a few minutes to complete the transcription. I have handled this using a while loop and the sleep() function: the code checks every 5 seconds whether the job is completed and breaks out of the loop once it is.

You can see this transcription job on the AWS dashboard under Amazon Transcribe > Transcription jobs, as shown in the screenshot below.
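
If you would rather not poll indefinitely, the wait can be capped. The sketch below is one way to do that and is not part of the AWS SDK: wait_for_transcription() is a hypothetical helper, and 'TIMED_OUT' is this sketch's own marker rather than a status returned by AWS. It accepts any callable that returns the current job status string, which also makes it easy to try out without a real job.

```php
<?php
// Hypothetical helper: polls until the job reaches a terminal state or the
// attempt limit is hit. $fetch_status is any callable that returns the
// current TranscriptionJobStatus string.
function wait_for_transcription(callable $fetch_status, int $max_attempts = 60, int $delay_seconds = 5): string
{
    for ($attempt = 1; $attempt <= $max_attempts; $attempt++) {
        $job_status = $fetch_status();

        // COMPLETED and FAILED are terminal; any other status means "keep waiting"
        if ($job_status === 'COMPLETED' || $job_status === 'FAILED') {
            return $job_status;
        }

        // No need to sleep after the final attempt
        if ($attempt < $max_attempts) {
            sleep($delay_seconds);
        }
    }

    return 'TIMED_OUT'; // not an AWS status, only this sketch's own marker
}
```

With the client from the code above, $fetch_status could be a closure that calls getTranscriptionJob() for $job_id and returns the 'TranscriptionJobStatus' value. The defaults of 60 attempts and 5 seconds give a ceiling of roughly five minutes; both numbers are arbitrary and should be tuned to the length of your media.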

[Screenshot: Transcription job]

Finally, download the file using the code below.

$url = $status->get('TranscriptionJob')['Transcript']['TranscriptFileUri'];
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, false);
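$data = curl_exec($curl);
if ($data === false) {
    // Stop here: without a response there is no transcript to decode
    $error_msg = curl_error($curl);
    curl_close($curl);
    throw new Exception('Could not download the transcript: ' . $error_msg);
}
curl_close($curl);
$arr_data = json_decode($data);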

// download a file
$file = $job_id.".txt";
$txt = fopen($file, "w") or die("Unable to open file!");
fwrite($txt, $arr_data->results->transcripts[0]->transcript);
fclose($txt);

header('Content-Description: File Transfer');
header('Content-Disposition: attachment; filename='.basename($file));
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize($file));
header("Content-Type: text/plain");
readfile($file);
exit();

This code sends the generated text file to the browser so the user can download it.
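
The code above assumes that the downloaded JSON always contains results, then transcripts, then transcript. A slightly more defensive way to read it is sketched below. The function name extract_transcript() is hypothetical, and the structure it checks is simply the one the code above already relies on.

```php
<?php
// Hypothetical helper: pulls the transcript text out of the downloaded JSON
// and fails loudly when the expected structure is missing.
function extract_transcript(string $json): string
{
    $data = json_decode($json, true);

    if (!is_array($data) || !isset($data['results']['transcripts'][0]['transcript'])) {
        throw new UnexpectedValueException('Transcript not found in the response');
    }

    return $data['results']['transcripts'][0]['transcript'];
}
```

Called with the string returned by curl_exec(), it gives back the text that is written to the '.txt' file, or throws an exception that can be handled before any headers are sent.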

Final Sample Code

The code above is shown in chunks. If you want the complete code in one place, here it is.

<?php
set_time_limit(0);

require 'vendor/autoload.php';
 
use Aws\S3\S3Client;
use Aws\TranscribeService\TranscribeServiceClient;

if ( isset($_POST['submit']) ) {

    $arr_mime_types = ['audio/wav', 'audio/mpeg', 'video/mp4', 'audio/x-flac'];
    if ( !in_array($_FILES['audio']['type'], $arr_mime_types) ) {
        die('File type is not allowed');
    }

    $region = 'PASS_REGION';
    $access_key = 'ACCESS_KEY';
    $secret_access_key = 'SECRET_ACCESS_KEY';

    // Instantiate an Amazon S3 client.
    $s3 = new S3Client([
        'version' => 'latest',
        'region'  => $region,
        'credentials' => [
            'key'    => $access_key,
            'secret' => $secret_access_key
        ]
    ]);

    $bucketName = 'PASS_BUCKET_NAME';
    $key = basename($_FILES['audio']['name']);

    // upload file on S3 Bucket
    try {
        $result = $s3->putObject([
            'Bucket' => $bucketName,
            'Key'    => $key,
            'Body'   => fopen($_FILES['audio']['tmp_name'], 'r'),
            'ACL'    => 'public-read',
        ]);
        $audio_url = $result->get('ObjectURL');

        // Create Amazon Transcribe Client
        $awsTranscribeClient = new TranscribeServiceClient([
            'region' => $region,
            'version' => 'latest',
            'credentials' => [
                'key'    => $access_key,
                'secret' => $secret_access_key
            ]
        ]);

        // Start a Transcription Job
        $job_id = uniqid();
        $transcriptionResult = $awsTranscribeClient->startTranscriptionJob([
                'LanguageCode' => 'en-US',
                'Media' => [
                    'MediaFileUri' => $audio_url,
                ],
                'TranscriptionJobName' => $job_id,
        ]);

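        // Poll every 5 seconds until the job reaches a terminal state
        while (true) {
            $status = $awsTranscribeClient->getTranscriptionJob([
                'TranscriptionJobName' => $job_id
            ]);
            $job_status = $status->get('TranscriptionJob')['TranscriptionJobStatus'];

            if ($job_status == 'COMPLETED') {
                break;
            }
            if ($job_status == 'FAILED') {
                // Without this check a failed job would keep the loop running forever
                throw new Exception('Transcription job failed: ' . $status->get('TranscriptionJob')['FailureReason']);
            }

            sleep(5);
        }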

        $url = $status->get('TranscriptionJob')['Transcript']['TranscriptFileUri'];
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($curl, CURLOPT_HEADER, false);
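        $data = curl_exec($curl);
        if ($data === false) {
            // Stop here: without a response there is no transcript to decode
            $error_msg = curl_error($curl);
            curl_close($curl);
            throw new Exception('Could not download the transcript: ' . $error_msg);
        }
        curl_close($curl);
        $arr_data = json_decode($data);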

        // download a file
        $file = $job_id.".txt";
        $txt = fopen($file, "w") or die("Unable to open file!");
        fwrite($txt, $arr_data->results->transcripts[0]->transcript);
        fclose($txt);

        header('Content-Description: File Transfer');
        header('Content-Disposition: attachment; filename='.basename($file));
        header('Expires: 0');
        header('Cache-Control: must-revalidate');
        header('Pragma: public');
        header('Content-Length: ' . filesize($file));
        header("Content-Type: text/plain");
        readfile($file);
        exit();
    }  catch (Exception $e) {
        echo $e->getMessage();
    }
}
?>
<form method="post" enctype="multipart/form-data">
    <p><input type="file" name="audio" accept="audio/*,video/*" /></p>
    <input type="submit" name="submit" value="Submit" />
</form>

I hope you understand how to convert speech to text using Amazon Transcribe in PHP. Please share your thoughts and suggestions in the comment section below.

2 thoughts on “Speech-To-Text using Amazon Transcribe in PHP”

  1. Would be great to see an example with a Queue like SQS. Let the user wait in front of a PHP script running sleep(5) in an endless loop is not very friendly.
