Week 8 (July 14 - 20): Let's Try This Again

Another Angle

This week I’m going to investigate whether finetuning the pretrained model ‘Stable Diffusion’ will give me the results I’m looking for: an AI that can generate images of floorplans.

I discovered that I’m meant to use ModelsLab’s API to generate images with Stable Diffusion while providing my own floorplan images as training data. The theory is that this will let me steer a model pretrained on millions of images towards generating the kind of images I want. Stable Diffusion is also a ‘text-to-image’ AI, meaning I type text describing the image I want to generate, and it produces its best understanding of what that text should look like based on what it learnt during training.
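To make ‘text-to-image’ concrete: if you ran Stable Diffusion locally with the open-source diffusers library (not the route I’m taking, since ModelsLab hosts the model for me), generating an image from text is just a prompt in, image out. A minimal sketch:

```python
import torch
from diffusers import AutoPipelineForText2Image

# Load Stable Diffusion XL from the Hugging Face Hub (a large download),
# then move it to the GPU for generation.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# The text prompt is the only required input; the model renders its
# best interpretation of the description.
image = pipe(prompt="photo of a floor plan").images[0]
image.save("floorplan.png")
```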

The ModelsLab API, which supports the training/finetuning of thousands of different models in different forms such as video, voice, and image

ModelsLab’s documentation includes examples of how I can finetune models using my own images.

We provide a text prompt and links to images, as well as other variables that improve finetuning, such as a negative prompt, a seed, and the number of training steps.
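The request body is just a JSON object along these lines. The exact key names come from ModelsLab’s docs; the ones below (like ‘instance_prompt’ and ‘max_train_steps’) are illustrative, so check the documentation for the real names:

```python
# Illustrative finetuning request body; confirm key names against
# ModelsLab's documentation before using.
payload = {
    "key": "YOUR_API_KEY",                     # ModelsLab API key
    "instance_prompt": "photo of my subject",  # text prompt describing the images
    "negative_prompt": "blurry, low quality",  # what the model should avoid
    "images": ["https://.../image1.png"],      # public links to training images
    "seed": 0,                                 # fixed seed for reproducibility
    "max_train_steps": 2000,                   # number of training steps
}
```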

ModelsLab recommended using 7 images along with my text prompt when finetuning for best results. Since I didn’t need a large dataset, I could create my own images manually and use them immediately during testing, unlike when I trained my own GAN. I created my images from simple shapes using ‘Google Drawings’, a free application in Google Workspace.

An image I created using Google Drawings that I wanted my AI to be able to produce. I created 7 floorplans in this style

The other 6 training images were similar to this one, with a bed and a couple of other pieces of furniture. I changed the spacing and rotated the layout in each, but I kept them simple in the hope that the AI would recognize the patterns between them.

Next, I had to figure out how to upload my images and make them publicly accessible through a link. One way to do this is to upload the images to a public repository on GitHub, which I quickly picked up from GitHub’s documentation:

GitHub’s documentation on uploading files to a publicly accessible repository

Following the documentation, I uploaded my 7 custom floorplan designs to my own repository
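One wrinkle: the links have to point at the raw image bytes, not GitHub’s normal web page for the file. GitHub serves these through raw.githubusercontent.com, so building the links looks something like this (the username, repo name, and filenames here are placeholders):

```python
# Build raw-content links for files in a public GitHub repository.
# Pattern: https://raw.githubusercontent.com/<user>/<repo>/<branch>/<path>
USER, REPO, BRANCH = "my-username", "floorplan-images", "main"

image_links = [
    f"https://raw.githubusercontent.com/{USER}/{REPO}/{BRANCH}/floorplan{i}.png"
    for i in range(1, 8)  # the 7 training images
]
```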

Now that I could share my images through links, I could use those links as finetuning input to Stable Diffusion by following ModelsLab’s documentation. I wrote my code to match the API docs’ example, substituting my own image links and changing the model type to Stable Diffusion XL.

My own version of their example request: I customized the prompt to ‘photo of a floor plan’, included the GitHub links under ‘images’, and changed the ‘base_model_type’ to ‘sdxl’, which refers to Stable Diffusion XL
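Putting it together, here’s a sketch of what that request looks like in Python with the requests library. The endpoint URL and some field names are my best reading of ModelsLab’s docs rather than exact values, so treat them as illustrative:

```python
import requests

# Endpoint and key names are illustrative; confirm against
# ModelsLab's finetuning documentation.
ENDPOINT = "https://modelslab.com/api/v3/fine_tune"

payload = {
    "key": "YOUR_API_KEY",
    "instance_prompt": "photo of a floor plan",   # my customized prompt
    "negative_prompt": "blurry, low quality, text",
    "images": [
        "https://raw.githubusercontent.com/my-username/floorplan-images/main/floorplan1.png",
        # ...plus the other 6 raw GitHub links built earlier
    ],
    "base_model_type": "sdxl",                    # finetune Stable Diffusion XL
    "seed": 0,
    "max_train_steps": 2000,
}

response = requests.post(ENDPOINT, json=payload, timeout=60)
print(response.json())  # the response reports the status of the training job
```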

I ran my code, and after 20 minutes I checked the ModelsLab website to see the result of the training. This is one of the images that my finetuned model generated:

An image that the finetuned SDXL model generated

Obviously this looked nothing like the images I used to train the AI. I was pretty surprised, given that the original Stable Diffusion model was trained on 2.3 billion image-text pairs from the LAION-2B-EN dataset. My guess is that among those 2.3 billion images there were few or no floorplans, or that the ones present were labeled with text very different from ‘floor plan’.

At this point I’ve decided to brainstorm new ideas for using artificial intelligence in my app, because I no longer think I have a viable approach to image generation: I don’t have enough resources to train an AI myself (the available datasets aren’t large enough), and, as I discovered this week, finetuning a pretrained model doesn’t work either.

 

Next Steps

In the coming week I want to return to developing the onboarding and functional screens for my app, as well as do some brainstorming on other ways I can still incorporate AI into my interior design application.
