How to embark on your first GAN journey in 5 easy steps!

Published in

Becoming Human: Artificial Intelligence Magazine

4 min read · Nov 11, 2019

Generative Adversarial Networks (GANs) are one of the hot topics in deep learning. Especially in computer vision, they have opened the gates to fun and entertaining projects. Even though real-world use cases for GANs are not yet as prominent or straightforward, they have an aura of fascination because they can generate novel images.

As much fun as it is to toy around with GANs, working with them can also be very frustrating. Minor changes in the parameter settings can have a large impact on the quality of the generated images, and finding adequate parameters can be tricky as well as costly and time-intensive.

To succeed with GANs, here are a few things to consider when starting out with your project.

It’s harder to generate good-looking images if you have a lot of variation in your dataset, e.g., different poses, backgrounds, etc.

Don’t reinvent the wheel. Sure, you could go through all the theory, read the papers, and implement your GAN from scratch, which is certainly fun, but it can be very time-consuming. For a quick start, I’d recommend using one of the many open-source implementations on GitHub (search for GAN on https://modelzoo.co/). Get your hands dirty, play around with it, get a feeling for it, and just have fun.
Keep it simple. To begin with, stick to a single class that you want to generate. The more classes you use, the more variation you have in your data, and the harder it is to generate good-looking images. You will also spend more time and money on tuning the parameters. Do you really want that? No. So keep it simple. Which brings me to the next point.
Start with a class for which it is easy to get a large number of high-quality images. In my experience, with 3,000–5,000 images you should be able to train your GAN, and the generated images should somewhat resemble the class and its main features. However, artifacts might occur, details might look blurred, and the images won’t be that crisp.
Spend time on data curation. Try to keep the data as uniform as possible and avoid large variance within the class. What do I mean by that? When you build a dataset for generating faces, try to use only faces in one pose, e.g., a frontal view, and don’t mix them with heads that are tilted. If you want to generate dogs, stick to a single breed. Spend time on making the data as uniform and high quality as possible. Data curation is certainly not the most fun part, but it will help your GAN tremendously in learning the underlying distribution and thereby generating nice images.
Be patient. According to the literature, using GANs to create high-quality images with more variation seems to take 30,000–70,000 images, so it might take time to curate such a dataset. For example, This Person Does Not Exist was trained on the Flickr-Faces-HQ dataset, which contains 70,000 images.
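Parts of the curation and dataset-size advice above can be automated. What follows is only a minimal sketch, not a definitive pipeline: the `ImageInfo` record, the `curate` and `size_hint` helpers, and the thresholds `min_side=256` and `max_aspect_ratio=1.2` are all hypothetical choices made for illustration. It assumes you have already collected each image's width, height, and a hash of its file bytes. It cannot judge pose or breed, so that part of the curation still needs a manual pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInfo:
    """Metadata for one candidate training image (hypothetical record type)."""
    path: str
    width: int
    height: int
    sha256: str  # hex digest of the file bytes, used to spot exact duplicates


def curate(images, min_side=256, max_aspect_ratio=1.2):
    """Keep images that are large enough, roughly square, and not exact duplicates.

    The default thresholds are assumptions for illustration, not recommendations
    from the article.
    """
    seen_hashes = set()
    kept = []
    for img in images:
        short, long_ = sorted((img.width, img.height))
        if short < min_side:
            continue  # too small to be resized to the training resolution cleanly
        if long_ > max_aspect_ratio * short:
            continue  # far from square, would need heavy cropping or padding
        if img.sha256 in seen_hashes:
            continue  # exact duplicate of an image that was already kept
        seen_hashes.add(img.sha256)
        kept.append(img)
    return kept


def size_hint(n_images):
    """Rough guidance based on the image counts quoted in the article."""
    if n_images < 3000:
        return "below the roughly 3,000 images suggested for a first attempt"
    if n_images < 30000:
        return "enough for a first attempt; expect some blur and artifacts"
    return "in the range reported for higher-quality, more varied results"


if __name__ == "__main__":
    candidates = [
        ImageInfo("a.jpg", 512, 512, "h1"),
        ImageInfo("b.jpg", 100, 100, "h2"),   # too small
        ImageInfo("c.jpg", 1024, 512, "h3"),  # too wide
        ImageInfo("d.jpg", 512, 512, "h1"),   # duplicate of a.jpg
    ]
    kept = curate(candidates)
    print([img.path for img in kept], "-", size_hint(len(kept)))
```

Running a filter like this before training gives you a quick count of how many usable images remain, which you can compare against the rough dataset sizes mentioned above.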