Week 8 (July 14 - 20): Let's Try This Again
Another Angle
This week I’m going to investigate whether finetuning the pretrained model ‘Stable Diffusion’ will give me the results I’m looking for: an AI that can generate images of floorplans.
I discovered that I’m meant to use ModelsLab’s API to generate images with Stable Diffusion while supplying some training data in the form of my own floorplan images. The theory is that this will let me steer a model pretrained on millions of images towards generating the kind of images I want. Stable Diffusion is also a ‘text-to-image’ AI, meaning I type text describing the image I want into Stable Diffusion, and it generates its best interpretation of what that text looks like using what it has learned from its training.
ModelsLab recommended that I use 7 images along with my text input when finetuning Stable Diffusion for best results. Since I didn’t need a large dataset, I could manually create my own images and use them immediately during testing, unlike when I trained my own GAN. I created my images out of simple shapes using ‘Google Drawings’, a free application in Google Workspace.
The other 6 images I used to train the AI were similar to this one, with a bed and a couple of other pieces of furniture. I changed the spacing and rotated the layout in the other images, but I kept them simple in the hope that the AI would recognize the patterns shared across them.
Next, I had to figure out how to upload my images and make them publicly accessible through a link. One way to do this is to upload the images to a public repository on GitHub, which I quickly figured out using GitHub’s documentation.
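As a rough illustration of what those links end up looking like, the sketch below builds raw.githubusercontent.com URLs for files in a public repository. The username, repository name, branch, and file names here are placeholders, not the ones I actually used.

```python
# Hypothetical example: building public links to training images that have
# been committed to a public GitHub repository. Files in a public repo can be
# fetched directly through raw.githubusercontent.com URLs.
GITHUB_USER = "my-username"            # placeholder username
REPO = "floorplan-training-images"     # placeholder repository name
BRANCH = "main"

image_files = [f"floorplan_{i}.png" for i in range(1, 8)]  # 7 training images

training_image_urls = [
    f"https://raw.githubusercontent.com/{GITHUB_USER}/{REPO}/{BRANCH}/{name}"
    for name in image_files
]

for url in training_image_urls:
    print(url)
```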
Now that I was able to share my images through links, I could use those links as the ‘image finetuning input’ to Stable Diffusion by following ModelsLab’s documentation. I followed the API docs and wrote my code to match theirs, substituting my own image links and changing the model type to Stable Diffusion XL.
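In rough form, the request looked something like the sketch below, using Python’s requests library. The endpoint URL, JSON field names, and example prompt are assumptions standing in for the real ones; the actual names come from ModelsLab’s API documentation, not from this post.

```python
import requests

API_KEY = "YOUR_MODELSLAB_API_KEY"                   # placeholder key
FINETUNE_ENDPOINT = "https://modelslab.com/api/..."  # hypothetical endpoint; see the docs

# The public GitHub links to the 7 training images (built as in the earlier sketch).
training_image_urls = [
    "https://raw.githubusercontent.com/my-username/floorplan-training-images/main/floorplan_1.png",
    # ...remaining 6 links
]

# Assumed field names for illustration only; the real payload structure
# is defined by ModelsLab's fine-tuning API.
payload = {
    "key": API_KEY,
    "instance_prompt": "a simple top-down floorplan of a bedroom",  # text paired with the images
    "base_model_type": "sdxl",        # assumed name for the Stable Diffusion XL option
    "images": training_image_urls,
}

response = requests.post(FINETUNE_ENDPOINT, json=payload, timeout=60)
print(response.status_code, response.json())
```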
I ran my code, and after 20 minutes I checked the ModelsLab website to see the results of the training. This is one of the images that my trained AI generated:
Obviously this looked nothing like the images I used to train the AI. I was pretty surprised, given that even the original Stable Diffusion model was trained on roughly 2.3 billion image-text pairs from the LAION-2B-EN dataset. My guess is that among those 2.3 billion images there were few or no floorplan images, or that the captions attached to those images used different wording than ‘floor plan’.
At this point I’ve decided to brainstorm new ways of using artificial intelligence in my app, because I no longer think I have a viable approach to image generation: I don’t have enough resources to train an AI myself (the datasets aren’t large enough), and as I discovered this week, pretrained models don’t work either.
Next Steps
In the coming week I want to return to developing the onboarding and functional screens for my app, and also brainstorm other ways I can still incorporate AI into my interior design application.