I’d like to build software that takes an image of a bathroom and answers questions such as “What condition is the bathroom in?” and “Which decade does the bathroom appear to be from?” I’ve tried the sceneExplain plugin with ChatGPT, but the results have been inaccurate: it describes 40-year-old, worn-out bathrooms as being in great condition.
I have a decent programming background, so I think I may need to train a model myself. However, I’m unsure which types of models are best suited to this task and which of them I could realistically train on my own.
Can anyone provide guidance on the best models for this task, or perhaps link me to a tutorial on how to train them?