Beginnings in Golang and AWS – Part IV- Events, Lambda and Transcribe

Introduction

The previous posts have taken us through the process of creating a Go executable for uploading a file to S3. We’ll now focus on the next stage of our project: creating a Transcribe job automatically when an mp4 file is dropped into an S3 bucket.

During these posts, we’ll be covering our code, S3 Events, Lambda, CloudWatch and Transcribe. These areas will include, amongst others, the ObjectCreated event, subscriptions, handlers, marshalling, creating a reusable package, logging, reference date format, string slices, and a bit of a deeper look into structs.

Goal

Let’s recap our target by the end of this group of blogs. We want to set up a configuration that responds to an mp4 file being placed into an S3 bucket and runs code that takes the event information, including the key, and uses it to create a job in Amazon Transcribe. Because our code will be running remotely, we also want to have some way to log information during execution, such as an action being undertaken or an error if one has occurred.

Our Code

As before, let’s start with our code, and then break it down.

Imports

We’re using several other packages in this code, some of which we’ve already used.

  • context
    • We will be using the context package, and particularly the Context type (an interface rather than a struct), as part of our Lambda function. This allows our Lambda function to obtain metadata from AWS Lambda. Although not strictly required, it’s interesting to cover the type of information available.
  • json
    • implements encoding and decoding of JSON.
  • fmt
    • input and output functions, such as Printf
  • log
    • we use log to provide formatted output which will be picked up by CloudWatch
  • strconv
    • is used in this project to convert a numeric value (the elapsed time in nanoseconds) into a string
  • time
    • for displaying and measuring date and time information
  • github.com/aws/aws-lambda-go/events
    • this package is split into separate Go files, representing the various AWS services which support events.
  • github.com/aws/aws-lambda-go/lambda
    • functions, primarily for dealing with lambda handlers
  • github.com/aws/aws-sdk-go/aws
    • the generic aws package
  • github.com/aws/aws-sdk-go/aws/session
    • used for creating session clients and storing configuration information about the session
  • github.com/aws/aws-sdk-go/service/transcribeservice
    • this package is used for our operations involving the Transcribe service.

GUID function

The purpose of this function is to generate a unique identifier that can be used as the name of our Transcribe job. I chose an arbitrary format for this.

The function introduces us for the first time to the time package and two of its functions, Parse and Since.
From an operational point of view, Parse decodes a string and converts it into a time.Time value. Since reports the period of time that has elapsed since a given date/time. These on their own are fairly straightforward to understand. Then we get on to the reference date/time format…

Reference Date/Time Format

One area where Go differs from any other language I’ve worked with to date is on how it deals with parsing and formatting dates and times. Instead of using classic identifiers (such as hh, mmm, ss), it uses an actual reference based format to indicate how it should be interpreted. Confused? I was!
If we look at format.go in the time package, we can see a set of constants that are used to define these reference points. The comments on the right hand side are the actual values associated with each one. Together they describe a single reference moment: Mon Jan 2 15:04:05 MST 2006.
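To get a feel for the reference values, here is a small sketch that formats one fixed moment with a handful of layout fragments. It is not the source of the time package; the sample date and the sample function are my own choices for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// sample returns an arbitrary fixed moment (9 March 2021, 17:08:07 UTC).
// The date itself is an assumption chosen so every field is distinct.
func sample() time.Time {
	return time.Date(2021, time.March, 9, 17, 8, 7, 0, time.UTC)
}

func main() {
	// Each fragment is a piece of the reference moment
	// Mon Jan 2 15:04:05 MST 2006.
	fragments := []struct{ layout, meaning string }{
		{"2006", "four digit year"},
		{"01", "zero padded month"},
		{"Jan", "short month name"},
		{"02", "zero padded day"},
		{"15", "hour (24-hour clock)"},
		{"04", "minute"},
		{"05", "second"},
	}

	t := sample()
	for _, f := range fragments {
		fmt.Printf("%-5s %-22s -> %s\n", f.layout, f.meaning, t.Format(f.layout))
	}

	// Fragments can be combined with any separators we like.
	fmt.Println(t.Format("02-01-2006"))
}
```

With this sample moment, the combined layout 02-01-2006 produces 09-03-2021.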

Let’s say we have a string 01-01-1970, aka 1 January 1970. We want Go to take this string and convert it to a Time value. The parser needs to know what represents what though.
Looking at the list above:

01 (our day) uses as its indicator 02
01 (our month) uses as its indicator 01
1970 (our year) uses as its indicator 2006

So our parsing string (including the dashes) for 01-01-1970 is 02-01-2006
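The mapping above can be tried out with a few lines of Go. This is a minimal sketch rather than the project’s actual code; the parseDay helper and the layout constant are names I’ve made up for the example.

```go
package main

import (
	"fmt"
	"time"
)

// layout describes a day-month-year string using Go's reference date.
const layout = "02-01-2006"

// parseDay converts a dd-mm-yyyy string into a time.Time.
// With no zone information in the layout, the result is in UTC.
func parseDay(s string) (time.Time, error) {
	return time.Parse(layout, s)
}

func main() {
	d, err := parseDay("01-01-1970")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(d.Year(), d.Month(), d.Day()) // 1970 January 1
}
```

If the string doesn’t match the layout (for example 1970-01-01), time.Parse returns an error instead of guessing.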

Back to the remainder of our GUID function code:

The time.Parse function takes as input the layout format and the string to be parsed. Now when we look at this code again, it starts to make sense:

Then we pass the ad variable to time.Since, which returns the time elapsed since that moment. The number of nanoseconds in that period is what ends up in strsince.

When converting the result to a string, we specify that the number should be represented in base 10 (aka decimal).
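Here is a rough sketch of those two steps together. The variable names ad and strsince come from the text; the helper functions, and the choice of 1 January 1970 as the reference date, are assumptions on my part.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// referenceDate parses the assumed starting point, 1 January 1970.
func referenceDate() (time.Time, error) {
	return time.Parse("02-01-2006", "01-01-1970")
}

// nanosSince returns the nanoseconds elapsed since start as a base 10 string.
func nanosSince(start time.Time) string {
	elapsed := time.Since(start) // a time.Duration: an int64 count of nanoseconds
	return strconv.FormatInt(elapsed.Nanoseconds(), 10)
}

func main() {
	ad, err := referenceDate()
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	strsince := nanosSince(ad)
	fmt.Println(strsince, "has", len(strsince), "digits")
}
```

One thing to keep in mind: a Duration tops out at roughly 292 years, so a reference date much earlier than 1970 would not fit.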

String Slices

Now we’re going to format the results of strsince into a “Windowsesque” GUID format. To do this we’re going to be using substrings with additional formatting characters.

Here’s what’s happening:

  • The value of strsince will be a 19 digit number. In my code I wanted to make it four blocks of five characters (i.e. 20 characters).
  • To get to 20 characters, a zero is added onto the beginning of the string.
  • We now get into how Go deals with creating a string slice (aka substring). Go is different from the typical formats you might have seen for creating a substring.
  • There is no direct substring function. Instead, we refer to the string with square brackets, like the array format.
    • BUT instead of a [startindex:lastindex] format (with 0 being the first item), Go uses [startindex:lastcharacternumber]. In other words, the start index is zero-based and inclusive, while the end index is exclusive.
  • For example:

Does not give us a substring of 5678

This produces the string 567

Index 5 is the number 5
Character 8 is 7
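The behaviour described above is easy to check. In this sketch I’m assuming the sample string is 0123456789, which matches the values quoted; the substring helper is just a wrapper for illustration.

```go
package main

import "fmt"

// substring returns s[start:end]. The start index is inclusive and
// the end index is exclusive. Indexes count bytes, which is fine for
// plain digits but matters for non-ASCII text.
func substring(s string, start, end int) string {
	return s[start:end]
}

func main() {
	digits := "0123456789" // assumed sample string

	fmt.Println(substring(digits, 5, 8)) // 567
	fmt.Println(substring(digits, 5, 9)) // 5678
}
```

A handy side effect of the exclusive end index is that the length of the result is always end minus start.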

When we use the concatenation above, it will result in our forthcoming Transcribe jobs having a name of the following type:-
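Putting the pieces together, the formatting step might look something like the sketch below. This is my reconstruction under assumptions, not necessarily the code used in the project: the formatID name, the dash separator and the length check are all hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// formatID pads a 19 digit string with a leading zero and splits the
// resulting 20 characters into four blocks of five.
// The dash separator is an assumption.
func formatID(strsince string) (string, error) {
	if len(strsince) != 19 {
		return "", fmt.Errorf("expected 19 digits, got %d", len(strsince))
	}
	padded := "0" + strsince
	return padded[0:5] + "-" + padded[5:10] + "-" + padded[10:15] + "-" + padded[15:20], nil
}

func main() {
	// Assumed reference date: 1 January 1970.
	ad, err := time.Parse("02-01-2006", "01-01-1970")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	strsince := strconv.FormatInt(time.Since(ad).Nanoseconds(), 10)

	id, err := formatID(strsince)
	if err != nil {
		fmt.Println("format failed:", err)
		return
	}
	fmt.Println(id)
}
```

For a fixed input of 1234567890123456789, this gives 01234-56789-01234-56789.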

Conclusion
In this post, we’ve covered the various packages that we’ll be using, the reference date/time format, string slices, and string formatting.

In the next blog, we’re going to kick into S3 events and Lambda.

thanks for reading! Feedback always welcome. 🙂

cheersy,

Tim
