Beginnings in Golang and AWS – Part II – Uploading to S3

Introduction

With the preamble and prerequisites of Part I out of the way, we can now begin writing some code to allow us to upload the MP4 file to an S3 bucket.

In this post, we’ll cover the format of a Go package, how to add packages to an installation of Go, the import statement, and lastly how we go about parsing command line options.

It might not seem a great deal of code, but there are quite a lot of concepts covered here that are essential to understanding how Go works. We’ll then be primed for the final part of the series on S3, which will cover the rest of the code, compiling, running, and using this program.

Uploading a File to S3

I’ll show the complete code first, and then break it down into parts.

Code Breakdown

Let’s go through the code and get a feel for what’s happening here.

VERY important! Go is case sensitive. Capitalization matters both for how a name is interpreted and for how it behaves: as covered later, an initial capital letter determines whether a name can be used from outside its package.

The Package Declaration

Every Go program consists of one or more packages. For a program to run (as opposed to being a resource for another program), it requires a main package.

Define Packages to be Used

Multiple packages exist for Go, both as part of a default installation, and also from the community.

The import statement tells Go what packages (and consequently resources such as functions and types) are available to the program. We can either do separate import statements, or group them together like above.
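As a minimal sketch of the two forms of import (the packages chosen and the shout helper are assumptions for illustration, not the uploader’s own code), the grouped form looks like this, with the separate form shown in a comment. Note that Go refuses to compile a file that imports a package without using it, so both imports are referenced here.

```go
package main

// Grouped form: a single import statement with a parenthesised list.
import (
	"fmt"
	"strings"
)

// The separate form would instead be one statement per package:
//   import "fmt"
//   import "strings"

// shout is a hypothetical helper that exists only so that the strings
// package is actually used.
func shout(s string) string {
	return strings.ToUpper(s)
}

func main() {
	fmt.Println(shout("imports in use"))
}
```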

For packages not included in the default installation, Go looks in a default package directory (the src folder of the GOPATH workspace, typically ~/go/src).

For example, github.com/aws/aws-sdk-go/aws, referred to in the import statement above, is located at the following location under my home directory:

When you want to use a resource in a package, such as a function or type, you need to refer to it including the package name. So if we wanted to use the Printf function within the fmt package to write a message, an example of this would be:

fmt.Printf("No Hello World today")

Define the Main Function

The entry point for a file to be executable (as opposed to solely a resource) is the main() function. The code executed within the function, represented by the dots, is enclosed within { and } braces.
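Putting the package declaration and the entry point together, the smallest runnable shape of a Go program might look like the sketch below. The greeting helper is hypothetical and only stands in for the code “represented by the dots”.

```go
// A runnable program must belong to package main.
package main

import "fmt"

// greeting is a hypothetical helper standing in for the real work
// that the program would do.
func greeting() string {
	return "No Hello World today"
}

// main is the entry point; everything between the braces runs when
// the program is started.
func main() {
	fmt.Println(greeting())
}
```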

Configure Command Line Parameter Parsing

When we execute this program from the command line, we want to include parameters which define both the S3 bucket we want to upload to and the source file. The values need to be parsed and assigned to variables. To make it easier, we also want to provide some help text for people running the program.

Several things happen with the above code, so let’s go through them.

  • Both the bucket and filename variables are defined. Go normally requires a variable and its type to be declared before it can be used. However, it is possible to create a variable with its type and assign a value to it in one step by using := instead. Quite simply, this leaves it to the right-hand side of the operator to provide the type and value information. In this case, it is using the result of the String function in the flag package.
  • We use the flag package. The flag package has functions that allow us to parse the command line. We use flag.String to define a string flag with a specified name, default value, and usage string. The return value is a pointer, that is, a reference to the memory address which stores the value of the flag.
  • The Parse function is called. This carries out the processing of the command line parameters. This function sets the values at the memory locations referred to by bucket and filename.
  • It’s worthwhile mentioning that the output that will be generated if help is requested on our program, once compiled, is defined in this code as well. We’ll see in the last part on the S3 Uploader just exactly how this works.
  • You also might be wondering why the function name is capitalized. This is because in order for a resource in a package to be used by another, the initial letter must be a capital one. This marks it as “exported”, allowing its use elsewhere.
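The points above can be sketched as follows. This is not the uploader’s actual code: the flag names -b and -f, their help text, and the parseArgs helper are assumptions, and the sketch uses its own flag.FlagSet so that it can be exercised with an explicit argument list. The package-level flag.String and flag.Parse work the same way against the real command line.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseArgs is a hypothetical helper. It defines two string flags and
// parses the given arguments. The flag names and usage strings are
// assumptions made for this sketch.
func parseArgs(args []string) (bucket, filename string, err error) {
	fs := flag.NewFlagSet("s3upload", flag.ContinueOnError)

	// := infers the type from the right-hand side: String returns a
	// *string, so b and f are pointers to where the values will live.
	b := fs.String("b", "", "name of the S3 bucket to upload to")
	f := fs.String("f", "", "path of the file to upload")

	// Parse fills in the values that b and f point to.
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}

	// Dereference the pointers to get at the parsed values.
	return *b, *f, nil
}

func main() {
	bucket, filename, err := parseArgs(os.Args[1:])
	if err != nil {
		// The flag package has already printed the error or help text.
		os.Exit(2)
	}
	fmt.Printf("bucket=%q file=%q\n", bucket, filename)
}
```

The usage strings passed as the third argument are what the flag package prints when the program is run with -h, which is why the help output is defined by the same few lines.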

Conclusion

In this post, we’ve covered a lot of topics, such as how we can use existing packages with our Go program, how packages are stored locally, the effect that using lower and uppercase letters can have, the requirements for a program in Go, and the import statement. We also began to delve into assignment by inference, pointers, and flags and how we can parse them.

With these out of the way, we’re primed and pretty much tickety boo for the final part of the series on S3, which will cover the rest of the code, introducing further concepts and syntax, including compiling, running, and using this program.

Thanks for reading!


When Marvin Gaye Met Amazon Transcribe & PowerShell – Automating Subtitle Creation – Part II

“I Heard it Through the Grape Van”

With the background set in the previous post for what we’ll be aiming to achieve, it’s time to move forward with getting things into gear.
Today’s post covers how to upload the media file to S3, create the Transcribe job to process it, and finally download the results locally.

Quick Recap

This project demonstrates the use of the AWS Transcribe service and PowerShell to create an SRT (subtitle) file from a media file.

Our project makes use of:

  • PowerShell Core
  • AWS PowerShell Core Cmdlets
  • AWS S3
  • AWS Transcribe Service
  • An MP4 video file

Prerequisites

Before going into the nitty gritty, you need to ensure all of the following are in place:

  • An existing AWS account. You can sign up for a Free Tier account here
  • Credentials and configuration are set up. See here for information on how to do this.
  • PowerShell Core is installed (available cross platform here)
  • The AWS PowerShell Net Core module is installed.
  • You’ve a ready-to-hand MP4 media file
  • You have cloned the repo for the project from either its source or your own fork

Sequence of Events to Transcribe the File

The order of events that need to happen is relatively straightforward:

  • Upload the file to S3
  • Create a Transcribe job
  • Wait for job to finish
  • Download the JSON file

Upload the file to S3

We’ll start out defining some variables and defaults to make things a bit easier, then the Write-S3Object cmdlet takes care of itself:

Create a Transcribe job

Every Transcribe job has a job name associated with it. For this script, I’ve used the GUID class to create a unique one. We define this and the name of the results file that will be used when it’s downloaded from a completed job. Then the Start-TRSTranscriptionJob cmdlet is used to initiate the task. The $s3uri variable is used to tell Transcribe where to get the file it is to process.

Wait for job to finish

A basic loop is put in place which checks the status of the Transcribe job every five seconds. The loop continues until the job status changes from IN_PROGRESS, indicating that the job has either failed or completed.

Download the JSON file

When a job has successfully executed, visible by its COMPLETED status, it stores the result in an S3 bucket of its own choice. The location is not in your own personal bucket, and it expires after a time. By querying the TranscriptFileUri property of the job status, we can get the location where it is stored. You’ve then got the choice of using the S3 Manager cmdlet for downloading the file, or alternatively (as in this case), simply using Invoke-WebRequest.

Part III will cover the largest part of the process, converting the job results into the SRT file we’ll use with the original video.
Thanks for reading!


When Marvin Gaye Met Amazon Transcribe & PowerShell – Automating Subtitle Creation – Part I

“I Heard it Through the Grape Van”

It’s been a while since my last blog, so I had to try and think of something a bit more eye-catching than the previous ones for the title. 🙂  That said, the title and heading are actually very accurate… #mysterious

This set of posts covers how one of the AWS services, Transcribe, can be used, in this case in combo with PowerShell, to create a subtitles file for any video, which can then be used for viewing. There’s quite a bit of content, so as mentioned, it’s being split across several posts.

Today’s post provides a background to the main parts that will be used. These are two AWS services (Amazon S3 and Amazon Transcribe), a subtitle file, AWS Tools for PowerShell Core Edition, and a video of the legend himself, Marvin Gaye.

Amazon Transcribe/S3

Amongst the plethora of services AWS offer is Transcribe, or to be more precise, Amazon Transcribe. Part of AWS’s group of Machine Learning offerings, Transcribe’s role is fairly straightforward. Feed it a supported media file (FLAC, MP3, MP4 or WAV) from a bucket on S3 and it will process the file, endeavoring to provide as good a transcription of it as possible. Upon successful completion of a job, a JSON formatted file becomes available for download.

The file itself contains a summary of the conversion at its beginning:

This is then followed by a breakdown of the job. This consists either of data about the next word identified (start and end time, the word, a ‘confidence’ rating from the service that it has correctly identified the word, and its classification)…

…or, if it’s found an appropriate place that would use punctuation, the punctuation mark itself.

Unlike the other formats supported, MP4 can also (and usually does) consist of one or more streams in addition to the audio. Typically this will be video content, but it might also include additional streams for other audio (think different languages, or directors’ or producers’ comments, for example) or subtitles.

Subtitle Files

At their core, subtitle files simply contain textual descriptions of the content of their accompanying video file. This is typically dialogue, but also other notifications, such as the type of music being played, or other intonations. Accompanying these are timespan indicators, which are used to match this information up with the video content.

The most common file format in use is the SubRip format, better recognised by its extension of SRT. These files are arranged in a manner similar to below:

Line by line respectively, these consist of:
  • The numeric counter identifying each sequential subtitle
  • Start and end time for the subtitle to be visible, separated by the marker you see.
  • The text itself, typically between one and two lines, and ideally restricted to a number of characters per line
  • A blank line indicating the end of this sequence.

Looking at the two different forms of text data in Transcribe and SRT format respectively, you’ll probably have already noticed that the former contains enough information to allow, with a bit of transformation, the output to be produced in the latter’s format.

AWS Tools for PowerShell Core Edition

PowerShell Core is Microsoft’s cross platform implementation of PowerShell and as such can pretty much run on any platform that has .NET Core installed on it. AWS provide a module for this platform, AWS Tools for PowerShell Core Edition. Consisting of, at present, 4136 cmdlets, it pretty much covers all of the broad spectrum of services available from the provider. Amongst these are the set of ones for the Transcribe service, ironically numbering only three.

Marvin Gaye

Needing no introduction whatsoever, the posts over the next day or so make use of an MP4 file of the legend singing I Heard it Through the Grapevine a cappella. If you really feel the need to follow along exactly, then it’s fairly straightforward to find and download. It’s most definitely worth a listen in any case if you’ve not heard it already.

With all the background set, Part II will kick in properly with getting set up for the script and the beginning of its implementation.

 


Using PowerShell to get data for Microsoft Ignite

I was registered for Ignite yesterday (yipeeee!!), and decided to take a look at the session list. 

Navigation and search is a bit of a chore, so I set out to see if I could get the information I needed via PowerShell. If so, I would be free to obtain whatever data I wanted quickly.

Here’s what I came up with. After the script, a couple of examples of querying the data afterwards are given. Note that instead of querying the web services each time for data, I’ve just downloaded all the data and query it locally. This isn’t really best practice, but (IMO) the small size of the dataset mitigates this to some extent.

Recommendations for improvements or additions are more than welcome. 🙂 It will be posted to GitHub shortly.

 

JeffreyQuery

SpeakersQuery


Using .NET Event Handlers in a PowerShell GUI

GUI development tools, such as PowerShellStudio, make it very easy to manage events for controls on our winforms.

Once the control is on the form, we select it and click on the Events button (the lightning symbol), and the Properties panel gives us a list of the events available for us to manage. However, events are not just restricted to controls. There’s a world of other events out there that we can use to interact with our winforms projects.

In this article, we’ll create a forms project that downloads the latest 64 bit antimalware definitions from Microsoft and updates a progress control to show how far the download is to completion, using methods and events from a .NET class.

Updates to the latest antimalware definitions can be obtained through http://go.microsoft.com/fwlink/?LinkID=87341&clcid=0x409 and a look through MSDN shows us that we can use the .NET WebClient class to carry out downloads programmatically.

To start this process, create a new forms project, and drag a progress bar, label, and button onto the form. Then set the properties of the controls as below. Note that controls with a Text property will automatically be named for you if you set the Text property first.

Label
Text : Progress
Name : labelProgress

Button
Text : Download
Name : buttonDownload

Progress Bar
Name : progressbarDownload

Here’s how my form looks.

Blog - Adding Events - Form Design

Once this is complete, we can begin writing the event code.

In our form’s Load event, we create an instance of the System.Net.WebClient class. This is assigned to the script level variable, $webclient. This scope is required in order for the other parts of the solution to be able to process the object and its events.

The next two lines add event handlers for the DownloadProgressChanged and DownloadFileCompleted events. DownloadProgressChanged indicates a change in the state of the transfer with regards to the amount of content downloaded, whilst DownloadFileCompleted is fired on the completion of a download. The scriptblocks for these are $webclient_DownloadProgressChanged and $webclient_DownloadFileCompleted respectively.

The event handler for updating the progress of the download is written next:

To make it easier to read, $progressInfo is used for the rest of the code instead of $_. The variable contains the values given to us by the System.Net.DownloadProgressChangedEventArgs class instance that is passed into the handler.

The DownloadProgressChangedEventArgs class contains ProgressPercentage, BytesReceived, and TotalBytesToReceive properties. We use these for changing the progress meter value property, and also updating the text in the label below to show bytes received and the total size of the download.

The event handler for DownloadFileCompleted is next:

When DownloadFileCompleted is fired, the label text is changed to indicate the download’s completion.

Lastly, the download button’s Click event is set to begin an asynchronous download of the antimalware definition.

Blog - Adding Events - Code

Our project code

When we run the project and click on Download, we see this in action, with both the progress bar and the progress text below it being updated by the code we wrote earlier.

Blog - Adding Events - Downloader Running

The downloader in action

This same methodology can be employed for using other .NET events: create an instance of the object, add the event handler definition, and then write the scriptblock code to be used.

You can find exported project code and the project files at my repository on GitHub, and a short video of the project in action on the powershell.amsterdam YouTube channel.

 


URL Shortening

Love it or loathe it, URL shortening has been with us a while now and can certainly be handy. TinyURL are one such company to offer this service. Nicely for us, we do not need to register in order to use their API, and yet nicer still is that we can use it simply by entering a standard format of URL.

Before we see how we can use PowerShell to automate this process, let’s take a look at the format of URL that we need to use with TinyURL.

http://tinyurl.com/api-create.php?url=targetaddress

Where targetaddress refers to the URL that you wish to shorten.

And that’s it.

Let’s say we wanted to share a link containing information about this year’s PowerShell Summit Europe event in Stockholm. The full length URL for this is:

http://powershell.org/wp/community-events/summit/powershell-summit-europe-2015/

If we wanted to get the TinyURL equivalent of this, we’d use the following URL, pasting it into the address bar of our browser.

http://tinyurl.com/api-create.php?url=http://powershell.org/wp/community-events/summit/powershell-summit-europe-2015/

TinyURLExample

For making this happen via PowerShell, Invoke-WebRequest is our friend. All we need to do is provide the required address via the Uri parameter, and the Content property of the returned HtmlWebResponseObject will contain its shortened equivalent.

So for the case of the above we’d be using a command (note the pipeline symbol) of the type:

And can expect to get:

InvokeWebRequest

I’ve put together a cmdlet called Get-TinyURL for doing this. At its simplest, you can run it with the Uri parameter, and it will return a PSObject containing the original full address and its shortened equivalent.


GetTinyURL

It’s also been bulked out a bit to give some extra functionality, such as being able to read from and write to the clipboard if we want. With both options enabled, we can copy a full address into the clipboard, run the cmdlet, and automatically have the shortened URL available for pasting wherever we want it next.

pseufull
Navigate to desired URL and copy it to the clipboard

GetTinyClipboard
Run the required command

pseuemail
Paste where required

The code used is listed below, and will also be posted on GitHub in due course.
