Week 7 (May 26 - June 1): Why?

Understanding failure

The first thing I wanted to do this week was figure out why I failed at training my GAN model. My suspicion was that I didn’t have enough training data, since I only had around 4,000 images to train it on.

I came across a blog post on ‘DeepLearning.AI’ explaining that it could take ‘upwards of 100,000 images’. I also found a Quora answer by ‘Anton Karazeev’, a former researcher at the Laboratory of Neural Networks and Deep Learning, saying that GANs trained to generate handwritten digits have required datasets of 60,000 images, and that GANs generating coloured images needed similarly large datasets. This seemed to confirm my suspicion that I simply didn’t have enough training data for my GAN. However, I did come across some instances of GANs being trained successfully on fewer than 4,000 images, such as a face generation model trained on the Toronto Face Dataset, which contains only 2,925 images.

Nevertheless, I strongly suspect that the small dataset at least partly contributed to the poor GAN results, so I’ve decided to research and try different AI models in the hope of better results.

New ideas

While browsing the r/ArtificialIntelligence community on Reddit, I came across a post that mentioned something interesting. One user said that if you don’t have a large dataset, a viable option is to use a ‘pre-trained’ model. This means the model has already been trained by its creators on hundreds of thousands or even millions of general images, and I would only have to fine-tune its knowledge using the relatively small set of images that I had.
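To make the idea concrete for myself, here’s a minimal sketch of what ‘pre-trained, then fine-tuned’ looks like in code. It uses PyTorch and torchvision with an image classifier rather than a generator, purely because that’s the simplest setting to show the principle; the model, layer sizes, and dataset here are illustrative, not something from my actual project.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a network whose weights were already learned on millions of
# general images (ImageNet), so it arrives knowing basic visual features.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained layers; their existing knowledge is kept as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace only the final layer and train it on a small custom dataset,
# e.g. a few thousand interior-design images (hypothetical 2-class setup).
model.fc = nn.Linear(model.fc.in_features, 2)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```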

From the same Reddit post I also learned that ‘Stable Diffusion’ is one of the most popular free-to-use pre-trained models. Stable Diffusion supports an image-to-image generator, which takes a text prompt and an example image and generates a new image guided by both.
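As a rough idea of how that image-to-image mode is typically driven, here’s a short sketch using the Hugging Face ‘diffusers’ library (my assumption for how I’d access Stable Diffusion from Python); the model ID, prompt, and file names are placeholders, not part of my project yet.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Download a pre-trained Stable Diffusion checkpoint (placeholder model ID).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# The example image the generation should be based on.
init_image = Image.open("example_floorplan.png").convert("RGB").resize((512, 512))

# Generate a new image guided by both the example image and the text prompt.
result = pipe(
    prompt="a clean architectural floorplan of a two-bedroom apartment",
    image=init_image,
    strength=0.6,        # how far the output may drift from the input image
    guidance_scale=7.5,  # how strongly the text prompt is followed
).images[0]
result.save("generated_floorplan.png")
```

A lower strength keeps the output closer to the example image, while higher values give the text prompt more freedom.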

Stable Diffusion, a free-to-use text-to-image model

Some of the images that Stable Diffusion generated

At this point, I’ve made up my mind to investigate how to fine-tune Stable Diffusion using my own interior design images, to see how well a pre-trained model performs when generating images of floorplans.


Next Steps

Next week, I’m going to spend time looking into Stable Diffusion’s documentation and generating my first floorplan images using the pre-trained model.
