Using OpenAI’s GPT-2 To Generate New Netflix Movie/TV Descriptions

--

I’ve seen many resources online that explain how to use OpenAI’s GPT-2, but I haven’t seen much on how to use the model to generate short text (tweets, descriptions, etc.). The article below is a step-by-step tutorial to help you do that. The results are showcased on my website, thismoviedoesnotexist.co. I plan to provide instructions on how I built it in a separate article.

Introduction

What is GPT-2?

Now, for those wondering what GPT-2 is, allow me to provide a brief overview. GPT-2 is a state-of-the-art language model developed by OpenAI. Its main application, which I’ve leveraged, is synthetic text generation. On multiple occasions it has pushed the boundary of what is algorithmically possible, as exemplified in the table below.

Without diving into the details of each dataset, these benchmarks were built to evaluate the language capabilities of a given model.

To give you an idea of its capabilities, here is an example of how GPT-2 generates unconstrained text from an arbitrary, user-supplied text prompt:

Figure 2: GPT-2 Example (https://openai.com/blog/better-language-models/)

Useful Resources

Before I move forward, I wanted to provide a few resources that were tremendously helpful to me while experimenting with this model:

  1. Adam King’s Talk to Transformer project. This was my first exposure to OpenAI’s work and was the catalyst for my exploration of this subject. I strongly encourage you to explore his project!
  2. Open-AI’s original blog post describing the model that was built along with its performance metrics.
  3. Max Woolf’s gpt-2-simple package, which he has made available to everyone. His work removes several of the steps needed to fine-tune the pre-trained GPT-2 model.

Preparing the Data

In my opinion, data preparation is the key piece when getting GPT-2 to do what you want. First, let’s take a glimpse at the training data:

Figure 3: Tabular Netflix data provided on Kaggle

The model requires a single, large text file; examples would be an entire movie script, a Wikipedia article, etc. As seen in Figure 2, many GPT-2 use cases that I have come across are entirely unconstrained, meaning the output isn’t required to end at a specific point. In this example, that isn’t the case: we aren’t inputting a large body of text, and we do require a constrained output. To handle this, we will have to manipulate our tabular descriptions into one large text file that allows the model to output a single, coherent movie/TV description.

Specifically, I’ve added large, distinct tags on either end of each description before they are concatenated. Samples are shown below:

Figure 4: Preprocessed Netflix Data

As the model reads through all of this text, it will begin to learn that descriptions start and end with these tags. So, in theory, if we ask GPT-2 to generate new text from a “<|startoftext|>” prompt, it will begin printing a new description and will output <|endoftext|> at some point. Once that happens, we know the new description has been completely generated and the model can stop outputting text.
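To make this concrete, here is a minimal sketch of that preprocessing step in Python, assuming the Kaggle data sits in a CSV with a description column (the file and column names below are placeholders, not necessarily the exact ones I used):

```python
import pandas as pd

# Assumed file/column names -- adjust to match the Kaggle CSV you downloaded
df = pd.read_csv("netflix_titles.csv")

with open("netflix_descriptions.txt", "w", encoding="utf-8") as f:
    for description in df["description"].dropna():
        # Wrap every description in distinct start/end tags so GPT-2 learns
        # where one description stops and the next begins
        f.write("<|startoftext|>" + description.strip() + "<|endoftext|>\n")
```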

Fortunately for us, the gpt-2-simple package allows us to incorporate these rules within its generate function’s hyperparameters. We’ll dive into this later in this article.

Fine Tuning GPT-2

OpenAI has released 4 different models for us to use, each larger than the last. Here are the choices:

  • 124 Million Parameters
  • 355 Million Parameters ← My Choice
  • 774 Million Parameters
  • 1.5 Billion Parameters

The 355 million parameter (i.e., weight) GPT-2 model was trained “to simply predict the next word in 40GB of Internet text”. Our job is to make micro-adjustments to these weights to complete the task at hand (generating Netflix content descriptions). As you’d imagine, loading such a model and fine-tuning it is computationally expensive. Luckily, Google Colab is a great resource, as it provides free GPU compute. I built off of Max Woolf’s notebook; mine can be found here. I’ll also go through the code below:

Figure 5: Fine-tuning code snippet
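Since the snippet above is shown as an image, here is a rough sketch of what the fine-tuning call looks like with gpt-2-simple. The dataset file, run name, and step count are assumptions for illustration, not necessarily the exact values I used:

```python
import gpt_2_simple as gpt2

# Download the pre-trained 355M-parameter model (only needed once)
gpt2.download_gpt2(model_name="355M")

sess = gpt2.start_tf_sess()

# Fine-tune on the tagged description file built in the previous section
gpt2.finetune(sess,
              dataset="netflix_descriptions.txt",  # assumed file name
              model_name="355M",
              steps=2000,           # illustrative step count
              run_name="netflix",   # assumed run name, reused at generation time
              sample_every=200,     # print sample output every 200 steps
              save_every=500)       # write a checkpoint every 500 steps
```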

Naturally, the model isn’t going to be perfect right from the get-go, so I wanted to showcase what the model learned throughout training.

TIP: For anyone working with an NLP deep learning model, I strongly suggest outputting samples as you train to gauge how well the model is learning. In this case, you would use the sample_every parameter.

Here are a couple of snapshots of what the model learned throughout the training process:

Step 600:

Figure 6: Example GPT-2 Results at Step 600

Step 1400:

Figure 7: Example GPT-2 Results at Step 1400

There are a couple of things to note:

  1. There is a significant improvement in the cohesiveness and the structure of the output as the model is fine-tuned.
  2. After many iterations, you start to see the model output <|startoftext|> and <|endoftext|> at the beginning and end of each sample. This shows us that the model will be able to output distinct descriptions using the rules we discussed in the previous section.

After the model has finished training, the final weights are available in the “checkpoints” directory.
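If you are training in a Colab notebook, gpt-2-simple also ships helpers for backing that checkpoint folder up to Google Drive so the weights survive the session. A minimal sketch, assuming the run name from the fine-tuning step:

```python
import gpt_2_simple as gpt2

# Mount Google Drive and copy checkpoint/netflix into it so the
# fine-tuned weights outlive the Colab session
gpt2.mount_gdrive()
gpt2.copy_checkpoint_to_gdrive(run_name="netflix")
```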


Generating New Descriptions

Now for the fun part: generating brand new movie/TV descriptions! First, here is the code:

Figure 8: Text Generation Code Snippet

Because of Max Woolf’s work, we are able to generate samples using one simple, elegant function. With that in mind, the best way to provide an explanation is to dive into the function’s hyperparameters. All of their definitions can be found on the gpt-2-simple GitHub page, but I wanted to include a few key parameters here as well (a sketch that puts them together follows the list):

  • run_name: This must be the same as the run_name in the finetune function. A folder with this name will be referenced so it is key to make sure this is correct.
  • temperature: This determines how “creative” the output will be. A good range is roughly 0.7 to 1.2: the lower the value, the tamer the output; the higher, the more creative (and less predictable) it becomes.
  • nsamples: The number of outputs to generate.
  • prefix: The text prompt used to generate new text. Since every training example starts with “<|startoftext|>”, the model knows to start generating a new content description after this prompt.
  • include_prefix: When False, the output won’t include the prefix above, saving you from stripping the tag yourself (with a regex, for example) afterwards.
  • truncate: The output is cut off once this text appears. By setting it to “<|endoftext|>”, each sample ends as soon as the model produces the end tag.
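Putting those parameters together, a generation call might look roughly like the sketch below. The run name and sample count are assumptions carried over from the fine-tuning sketch above:

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="netflix")  # load the fine-tuned weights

# Generate a handful of new descriptions as a Python list
descriptions = gpt2.generate(sess,
                             run_name="netflix",
                             temperature=0.8,
                             nsamples=10,
                             prefix="<|startoftext|>",
                             include_prefix=False,      # drop the start tag from the output
                             truncate="<|endoftext|>",  # cut each sample off at the end tag
                             return_as_list=True)

for d in descriptions:
    print(d.strip(), "\n")
```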

Now, let’s take a look at an example. I’ve taken a screenshot from my website, thismoviedoesnotexist.co, to showcase what GPT-2 can generate. In my opinion, I could actually see this being produced by Netflix!

Figure 9: Generated description on thismoviedoesnotexist.co

This site is still up and running so I encourage you to explore and find more promising and/or ridiculous content.

Final Thoughts and What’s Next

Ultimately, I want this article to be another resource that lowers the barrier to entry to cutting-edge NLP. With just a few lines of code, you can prepare your data, fine-tune a GPT-2 model, and generate brand new content. For me, the natural next step would be to leverage some of the larger GPT-2 models to see how the results vary.

Now, this article focused on the data science portion of this project, but I wanted to highlight that this was the first time I took a model and deployed it on my very own server/website. As a next step, I’ll be sharing similar instructions on how to do just that.

I had a blast building this project and writing this article. I’d like to start writing more so I would appreciate any feedback!

The end-to-end code can be found in my Google Colab notebook. This notebook is an expansion of what Max put together. If you have any questions or suggestions, feel free to reach out.
