Beginnings in Golang and AWS – Part VI – Events, Lambda and Transcribe (cont’d)

Introduction

In today’s post, we’re going to be looking at the code within the handler’s function. As part of this, we’ll be covering using structs and JSON together, logging to CloudWatch, marshaling, processing of S3 event data, and how to start a Transcribe job. It’s a bit of a longer post today as we’ll go through the entire code in the function.

The Story So Far…

At this point, our handler has been triggered by a file being placed in our S3 bucket, on which there is an event subscription for object creation (S3’s ObjectCreated events; more about this in the next blog). We’ve received the event information, which is placed in our variable s3Event, a struct of type S3Event. We have the information we need for further processing and can proceed immediately with it. However, it’s worthwhile spending a couple of minutes looking at how Go processes the information received to place it into the variable.

A Bit About JSON & Go

Go does not have a native parsing mechanism for JSON data that allows dynamic generation of a struct based on the content (think PowerShell’s ConvertFrom-Json cmdlet for example). Instead (unless you want to go into the murky world of reflection and creating maps), you are expected to have some degree of awareness of the schema of the data being received. Go still handles the conversion process, but it looks to you for information on how to map content. This is done in the struct definition by adding a tag to each field that names the JSON key it maps to, in the format json:"location".

In our situation, for example, the S3Event struct has a Records field (a slice of type S3EventRecord) that maps to the “Records” section of the event data (see below). When information is nested (i.e. a subsection of another), we just make sure our struct matches this. An example of this is the EventVersion string, contained in the S3EventRecord struct, which is mapped to “eventVersion” in the JSON data.

An extract of the struct configuration and JSON data is below. You can examine the complete definition of an S3Event from the aws-lambda-go sdk, here:

Here’s the first two levels of our struct…

…and an extract from the S3Data JSON

Notice how they match up. The parser uses this information to populate the struct properties.
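As a minimal sketch of that mapping, the example below uses two cut-down stand-in types (record and event, hypothetical names) rather than the full S3EventRecord and S3Event definitions from the aws-lambda-go package. The payload is illustrative, not real event data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record is a hypothetical, cut-down stand-in for S3EventRecord.
// Each tag names the JSON key that the field is populated from.
type record struct {
	EventVersion string `json:"eventVersion"`
	EventSource  string `json:"eventSource"`
	EventName    string `json:"eventName"`
}

// event is a hypothetical, cut-down stand-in for S3Event.
type event struct {
	Records []record `json:"Records"`
}

// parseEvent decodes raw JSON into an event using the tags above.
func parseEvent(raw []byte) (event, error) {
	var e event
	err := json.Unmarshal(raw, &e)
	return e, err
}

func main() {
	// Illustrative payload with placeholder values.
	raw := []byte(`{"Records":[{"eventVersion":"2.0","eventSource":"aws:s3","eventName":"ObjectCreated:Put"}]}`)

	e, err := parseEvent(raw)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(e.Records[0].EventVersion, e.Records[0].EventName)
}
```

The field names are free to follow Go conventions (EventVersion) because the tag, not the field name, decides which JSON key is read.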

One aspect that is quite nice about the mapping process is that, provided the top-level outline of the struct matches the data being received, the entire substructure does not need to be present. Fields that are missing from the JSON are simply left at their zero values, so it is up to your own code to decide which properties it treats as mandatory.

Naturally, it makes sense to have your struct defined to represent the “full” content schema of the JSON data, but content being received into this struct does not need to be as complete. This can help when JSON content may contain fewer fields, yet still come from the same event source.
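To see how incomplete content behaves, here is a small sketch using hypothetical stand-in types (record, entity, object) that loosely mirror the nesting of the S3 event. The payload deliberately omits the whole “s3” section.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// object, entity and record are hypothetical stand-ins that loosely
// mirror the nesting of the S3 event data.
type object struct {
	Key  string `json:"key"`
	Size int64  `json:"size"`
}

type entity struct {
	Object object `json:"object"`
}

type record struct {
	EventName string `json:"eventName"`
	S3        entity `json:"s3"`
}

// decodeRecord unmarshals a single record from a JSON string.
func decodeRecord(raw string) (record, error) {
	var r record
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

func main() {
	// The "s3" section is absent, so r.S3 keeps its zero value.
	r, err := decodeRecord(`{"eventName":"ObjectCreated:Put"}`)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}

	// Treating the key as mandatory is a decision made in our own code.
	if r.S3.Object.Key == "" {
		fmt.Println("no object key in payload for event", r.EventName)
	}
}
```

No error is returned for the missing section, which is why any “mandatory” check has to be written explicitly, as in main above.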

If you’d like to spend some more time looking at how a Go struct can be designed from a JSON source, I’d recommend taking a look at:

https://mholt.github.io/json-to-go/

Code Description

Let’s move onto the code itself now. Here’s our entire function as a reminder, before we burrow down into what it’s doing.

As mentioned previously, we’re going to use CloudWatch for logging, using the aptly named log package. The first logging we’ll do is that of the S3Event data.

Initially, we need to marshal the data. json.Marshal accepts a value of any type (its parameter is an empty interface; we pass our struct) and returns the JSON encoding of it as a byte slice, along with an error.

With this done, we use a string conversion, which turns our byte data into a string.

We then log this information to CloudWatch. A nice touch of CloudWatch is that it picks up that the string data is JSON and formats it nicely for us. You’ll see this firsthand in the final part of this series.
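The marshal, convert and log steps can be sketched as follows. The record and event types are hypothetical stand-ins for the real S3Event, and toJSON is a helper name invented for this example. Inside a Lambda function, whatever the log package writes ends up in CloudWatch Logs; run locally, it simply goes to standard error.

```go
package main

import (
	"encoding/json"
	"log"
)

// record and event are hypothetical stand-ins for the real S3 event types.
type record struct {
	EventName string `json:"eventName"`
}

type event struct {
	Records []record `json:"Records"`
}

// toJSON marshals v and converts the resulting bytes to a string.
func toJSON(v interface{}) (string, error) {
	b, err := json.Marshal(v) // b is a []byte holding the JSON encoding
	if err != nil {
		return "", err
	}
	return string(b), nil // string(...) is a conversion, not a function call
}

func main() {
	e := event{Records: []record{{EventName: "ObjectCreated:Put"}}}

	s, err := toJSON(e)
	if err != nil {
		log.Printf("could not marshal event: %v", err)
		return
	}
	log.Printf("S3 event: %s", s)
}
```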

We then need to iterate through each record entry in the event data. We use for with range to do this, assigning the record variable on each iteration from s3Event.Records. range also returns the index of each entry, which we do not need, hence the “_”.

Inside the loop, we set s3 to the value of the s3 branch, and from this we log the key referred to in the event. We will use this later as a parameter for our Transcribe job.
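A minimal sketch of that loop is below. The types are hypothetical stand-ins that copy only the parts of the event we touch, and objectKeys is a helper name invented for this example.

```go
package main

import "log"

// Hypothetical stand-ins for the parts of the S3 event used in the loop.
type object struct{ Key string }
type bucket struct{ Name string }

type entity struct {
	Bucket bucket
	Object object
}

type record struct{ S3 entity }
type event struct{ Records []record }

// objectKeys logs and collects the object key of every record in the event.
func objectKeys(e event) []string {
	var keys []string
	// range yields an index and a value; "_" discards the index.
	for _, rec := range e.Records {
		s3 := rec.S3
		log.Printf("object key: %s", s3.Object.Key)
		keys = append(keys, s3.Object.Key)
	}
	return keys
}

func main() {
	// Placeholder bucket and key values.
	e := event{Records: []record{
		{S3: entity{Bucket: bucket{Name: "example-bucket"}, Object: object{Key: "demo.mp4"}}},
	}}
	objectKeys(e)
}
```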

Any time we want to perform operations with another AWS service, a session needs to be created. We define the parameters that the session will use (a region of eu-west-1, and my own development profile), then pass the session to transcribeservice.New, a function from the github.com/aws/aws-sdk-go/service/transcribeservice package, which returns the Transcribe client we assign to our variable, transcriber.

Next, a check is made to ensure that this was successful. This is easily verified by ensuring that a non-nil value was returned to transcriber. If a nil value was returned, we exit the function. Either way, we log the result to CloudWatch.

Now we want to get our parameters set before starting a transcription job.

  • A random job name needs to be created, so the GUID function we created earlier is used to populate the variable jobname.
  • We set mediafileuri, using string expansion with the bucket name and key name that we got from the S3 event data.
  • Mediaformat is set to mp4
  • Lastly, we set a language code of en-US for the languagecode variable.
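The parameter setup above can be sketched as follows. The s3:// form of the URI is an assumption for this example, as are the bucket and key values; mediaURI is a helper name invented here. The job name is left out because it relies on the GUID function from the earlier post.

```go
package main

import "fmt"

// mediaURI builds the media file URI from a bucket name and object key.
// The s3://bucket/key form is an assumption made for this sketch.
func mediaURI(bucket, key string) string {
	return fmt.Sprintf("s3://%s/%s", bucket, key)
}

func main() {
	// Placeholder values; in the handler these come from the S3 event record.
	mediafileuri := mediaURI("example-bucket", "recordings/demo.mp4")
	mediaformat := "mp4"
	languagecode := "en-US"

	fmt.Println(mediafileuri, mediaformat, languagecode)
}
```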

 

We define StrucMedia, which is of type transcribeservice.Media. One thing to be mentioned here is that we pass in a pointer to mediafileuri, not a string. This is because the MediaFileUri field in the transcribeservice.Media struct is declared as *string, so it expects to receive a pointer.

As such, our definition is as below.
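To illustrate the pointer point on its own, the sketch below uses a local stand-in type (media, a hypothetical name) rather than the SDK’s transcribeservice.Media, and models only the one pointer-typed field. The URI is a placeholder value.

```go
package main

import "fmt"

// media is a hypothetical local stand-in for transcribeservice.Media;
// only the pointer-typed field discussed above is modelled.
type media struct {
	MediaFileUri *string
}

// newMedia returns a media value whose MediaFileUri points at a copy of uri.
func newMedia(uri string) media {
	return media{MediaFileUri: &uri}
}

func main() {
	mediafileuri := "s3://example-bucket/recordings/demo.mp4" // placeholder value

	// The field is a *string, so we pass the address of the variable.
	m := media{MediaFileUri: &mediafileuri}

	// Dereference the pointer to read the value back.
	fmt.Println(*m.MediaFileUri)
}
```

The AWS SDK for Go also provides aws.String, which takes a string and returns a pointer to it, as an alternative to taking the address yourself.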

 

Then, we invoke the StartTranscriptionJob method on transcriber. This takes as its parameter a pointer to a StartTranscriptionJobInput struct, whose properties we set within it. Lastly, a completion message is logged to CloudWatch.

Conclusion

In this post, we’ve covered the code within our Lambda function and in doing so have covered how structs and JSON interoperate, logging to CloudWatch, marshaling, processing of S3 event data and finally how to create a Transcribe job.

We’re nearly there. In the next blog, we’ll run through the entire process of getting our code in S3, creating the lambda function, creating the event subscription, and triggering our function.

thanks for reading! Feedback always welcome. 🙂

cheersy,

Tim


Beginnings in Golang and AWS – Part V – Events, Lambda and Transcribe (cont’d)

Introduction

In today’s post, we’ll cover the event handler that our Lambda function is going to use when it receives notification, from a subscribed S3 event, of an MP4 file being dropped in our S3 bucket. This will in turn cover the Context and Event objects. Lastly, we’ll look at the event specific to our function, S3Event.

Our Code

Because we’re only covering the handler, plus some background on handlers and events, the code within the function is removed for this post.

 

Lambda Function Handlers for Go

For building a Lambda function handler in Go, you have a degree of scope with regards to the input and output parameters you use, provided they conform to, per the latest documentation, the following rules.

  • The handler may take between 0 and 2 arguments. If there are two arguments, the first argument must implement context.Context.
  • The handler may return between 0 and 2 arguments. If there is a single return value, it must implement error. If there are two return values, the second value must implement error.

Although not strictly required for our function, Handler, we are using two parameters. The first, per the requirements above, will be the implementation of context.Context. The second is the actual event data.

Context Object

The service which calls your Lambda function carries metadata, which the developer may find useful to view, or use. This is where the Context object comes into play. When your function signature contains a parameter for this, the metadata is passed into it. A plethora of information can be available, some of it service-specific and some of it standard. An example of the latter is the AwsRequestID, a unique identifier that can be used as a reference later should AWS support be required. The complete documentation for the Context object is available here:

Event Data

This is the core information passed from the service to the function. Its format is completely based on said service. In order to manage this, the aws-lambda-go package features struct types for most event sources. In our case, this is the events.S3Event one.

If you wish to look at its construction in more detail, you can find it in the s3.go file, located within the events directory of the aws-lambda-go package.

We’ll be setting up an event subscription so that once an MP4 file is dropped into our S3 bucket, it invokes the Lambda function. What does the typical S3 event data our function would be passed look like? Look below.

Here’s the type of information we could expect to see once we have our Lambda function fully in place and an event subscription created to our S3 bucket. More on the latter later.

There is a lot of information there, but the key part of information passed that we’ll be using is contained within the object section.

Conclusion
In this post, we’ve covered the basics of a Lambda event handler for Go, the valid signatures that can be used with it and their purpose. We’ve also looked at the typical information that we can expect to be passed into our S3 event.

In the next blog, we’ll dig deeper into the function and the code within.

thanks for reading! Feedback always welcome. 🙂

cheersy,

Tim
