Need a new SaaS idea? 20 Things You Can Build with GPT-4 Vision

Less than a month ago, OpenAI added vision to ChatGPT (GPT-4 Vision, or GPT-4V).

What it does: Basically you can upload an image and it can answer questions about that image.

There's not yet an API for this feature. There are rumors, however, that OpenAI will soon make some major API announcements, and this will probably be among them.

What you can build: We've all seen the huge number of apps built on OpenAI's GPT-4 API. Some of the winners were the folks who were first to make use of that API. Things will probably be the same with GPT-4 Vision.

Let's explore some of the things that are possible to build thanks to this new feature, so that you can be ready to rock 'n' roll when the API becomes available:

Code a user interface

You can upload an actual picture of a user interface and get GPT-4V to output the HTML/CSS/React code for it:

https://twitter.com/dr_cintas/status/1713261914375979415

What you can build: A "reverse UI" decoder tool. You can analyze/decode some of the world's most popular user interfaces. You can charge people to upload their own designs and convert those designs to HTML.

In other words, this has the potential to replace people who do "Figma to HTML" conversion.

Do homework

Upload an image from a textbook and GPT-4 Vision will solve the problem:

https://twitter.com/josephofiowa/status/1707157331388076509

What you can build: A tool where people upload an image from a textbook and specify the subject, and the tool solves the problem.

Help accountants

You can upload an image of a receipt to GPT-4V and get a bunch of information in return:

https://twitter.com/lhr0909/status/1710583907295371758

This is a godsend for accountants who deal with a lot of offline data.

What you can build: An (offline) document organizer for accountants. You can customize this for Shopify/QuickBooks and build it as an extension to their platform.

Expense tracker

This is an interesting prompt:

What you can build: Imagine scaling this to a whole restaurant. A camera takes a picture of every table and asks GPT-4V to calculate the price. The tool then sums everything up and gives the restaurant owner an overview of the total expenses for the day.
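
The summing step is easy to sketch once each table's photo has been turned into line items. This is a minimal sketch, not a real integration: it assumes a hypothetical vision step (there is no GPT-4V API yet) has already returned (item, price) pairs per table, and the table labels and prices below are made up for illustration.

```python
from decimal import Decimal


def daily_overview(tables):
    """Sum per-table line items into per-table totals and a day total.

    `tables` maps a table label to a list of (item, price) pairs, where
    each price is a string such as "12.50". In a real tool these pairs
    would come from the (hypothetical) vision step.
    """
    per_table = {
        label: sum((Decimal(price) for _, price in items), Decimal("0"))
        for label, items in tables.items()
    }
    return per_table, sum(per_table.values(), Decimal("0"))


# Made-up example data, not real model output.
example = {
    "table 1": [("burger", "12.50"), ("cola", "3.00")],
    "table 2": [("pasta", "14.25")],
}
per_table, total = daily_overview(example)
print(per_table["table 1"], total)  # 15.50 29.75
```

Prices are kept as Decimal rather than float so that money totals don't pick up rounding errors.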

Reverse recipe creator

GPT-4V is surprisingly good at taking a picture of a dish and telling you how to make it:

https://twitter.com/brianroemmele/status/1707410668067107120?s=46

What you can build: A "reverse recipe" creator where people can upload a picture of their favorite food. If the app gets popular, you can add functionality so that people can rate the "accuracy" of the recipe, suggest changes, etc.

Workout planner

Take a look at this:

https://twitter.com/rowancheung/status/1711955834425512172

Basically you can upload an image of the gym equipment you have and (potentially) an image of yourself, and it tells you what exercises to focus on.

What you can build: A "scrappy" workout planner where you can make workout plans from seemingly simple things people have in their home.

Puzzle solver (or any picture with a question)

It seems that you can upload ANY picture with a question into GPT-4V and it'll give you the answer in return:

https://twitter.com/teknium1/status/1706842988591374373?s=46

What you can build: A tool where people can upload an image with any question on it (a puzzle, a homework problem, etc.)

Movie scene detector

GPT-4V can actually detect scenes from movies:

https://twitter.com/skirano/status/1706904814364361007?s=46

What you can build: I've seen many YouTube/TikTok videos where people ask something like: "Which movie is this?" You could make a tool that would parse that video, extract a few pictures from it and then ask GPT-4V a question like the one in the tweet.
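
The "extract a few pictures" step could start by choosing which frames to pull. This is a rough sketch under stated assumptions: the function name is hypothetical, it only picks frame indices, and decoding the video and sending frames to GPT-4V are left out.

```python
def sample_frame_indices(total_frames, count):
    """Return up to `count` evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or count <= 0:
        return []
    count = min(count, total_frames)
    # Take the midpoint of each of `count` equal segments, which skips the
    # very first and last frames (these may be fades or title cards).
    return [(2 * i + 1) * total_frames // (2 * count) for i in range(count)]


print(sample_frame_indices(300, 3))  # [50, 150, 250]
```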

Improving interior design

You can take a picture of a room and ask GPT-4V to suggest any additions to it:

https://twitter.com/skirano/status/1707466657176637709

What you can build: A tool that will let people take a picture of their room and suggest improvements. You could then match those improvements with DALL-E 3 and create a visual representation of the room. You could also provide users with similar-looking rooms.

Speaking of DALL-E 3...

Image-to-DALL-E/MidJourney-prompt

You can upload an image to GPT-4V and ask it to create a DALL-E/Midjourney prompt from it:

What you can build: A "reverse prompt" tool where people upload images and get prompts in return. You can then run those prompts and compare the results to the original image. You can even create a "human vs AI" tool where you'd have a bunch of images side-by-side: the original on the left, and on the right a DALL-E/Midjourney image created from a GPT-4V prompt.

Travel helper

GPT-4V can accurately answer questions like these:

What you can build: A travel app where people take a picture of what they're seeing. As output they'll get a detailed description of the object, along with similar objects, their distance from the target object, etc.

Live road assistance

GPT-4V can do a good job of recognizing the conditions of the road:

What you can build: An app that connects to someone's dash cam and periodically analyzes the image for any conditions they need to be aware of.
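
"Periodically" here mostly means throttling: most dash cam frames get skipped and only one per interval is sent for analysis. This is a minimal sketch with a hypothetical function name and made-up timestamps; the actual image analysis call is not shown.

```python
def frames_to_analyze(timestamps, interval_seconds):
    """Pick timestamps that are at least `interval_seconds` apart.

    `timestamps` is an increasing sequence of frame times in seconds.
    """
    picked = []
    for t in timestamps:
        # Keep the first frame, then only frames far enough from the last pick.
        if not picked or t - picked[-1] >= interval_seconds:
            picked.append(t)
    return picked


print(frames_to_analyze([0, 1, 2, 5, 6, 11, 12], 5))  # [0, 5, 11]
```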

Meme generator

GPT-4V can do a pretty good job of analyzing memes:

What you can build: The reverse. Upload an image and ask it to create a few meme ideas for you.

Detective games

GPT-4V is pretty good at analyzing cues in an image:

What you can build: A game where a person can try and write all the things they notice about a picture. Then, ask GPT-4V about the same picture and compare their answers with a GPT-4 prompt.
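
Before wiring up a GPT-4 prompt for the comparison, a crude keyword overlap can act as a stand-in. To be clear, this is a swapped-in baseline, not the GPT-4 comparison described above; the function name and the sample notes are made up for illustration.

```python
def score_observations(player_notes, model_notes):
    """Count how many model observations the player also mentioned.

    Two notes "match" when they share at least one word of four or more
    letters; this is a crude stand-in for asking a language model.
    """
    def keywords(note):
        words = (w.strip(".,!?").lower() for w in note.split())
        return {w for w in words if len(w) >= 4}

    player_words = set()
    for note in player_notes:
        player_words |= keywords(note)
    matched = [note for note in model_notes if keywords(note) & player_words]
    return len(matched), len(model_notes)


player = ["There is a broken window", "Muddy footprints by the door"]
model = [
    "The window is broken.",
    "A knife is missing from the rack.",
    "Footprints lead to the door.",
]
print(score_observations(player, model))  # (2, 3)
```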

Convert a graph to code

Take a look at this prompt:

It turns out that GPT-4V is pretty good at translating images to code.

What you can build: A code builder that takes a graph/sketch and translates it into code.

Create content from graphs

You can take pictures that feature stats and convert them to useful answers/insights with GPT-4V:

What you can build: An "insights generator". Get people to choose a niche. Your tool will then search for stats for that niche, feed them into GPT-4V and ask it to generate some unique insights.

Convert an image to a text-like format

Take a look at this GPT-4V prompt:

What you can build: Many images have corresponding text formats. GPT-4V is pretty good at converting to those formats. You can build an image-to-text tool that will do this.

News summarizer

This is a pretty interesting GPT-4V prompt:

Basically, you can take a whole page and ask GPT-4V to analyze the news items.

What you can build: A "news summarizer" tool that will take your favorite site and tell you about the "themes", etc.

Emotion predictor

GPT-4V can be good at predicting emotions from an image:

What you can build: You can use this to analyze frames from an ad and give an "emotion" summary. You can sell this as a competitor research tool.

You can also use GPT-4V to predict what people will find to be more beautiful:

Insurance evaluator

GPT-4V can be pretty good at evaluating the damage something has endured:

What you can build: Anything useful to insurance evaluators.


These ideas are fantastic. I am excited to see how entrepreneurs utilize GPT-4 Vision for diverse applications. Thanks Darko for sharing this insightful piece.

FahimFida · 2 years ago

How risky is building a wrapper around this?

orliesaurus · 2 years ago

Awesome ideas! Thank you for sharing! Funny enough, I actually launched an insight generator back in September :)

ArtKrivtsov · 2 years ago

Wow, this is brilliant!!!
There are literally endless possibilities!

Jeyner · 2 years ago

Certainly, here are 20 concise SaaS ideas using GPT-4 Vision:

Content generation
Code generation
Video storyboards
Graphic design
E-commerce personalization
Virtual fashion stylist
Language translation
Visual content summarizer
Medical image analysis
Architectural blueprints
Virtual tours
Artwork authentication
Automated video editing
Social media scheduling
Food recipe generator
Home organization
Mood-based music playlists
Environmental impact analysis
Sports performance analysis
Document signature verification
These ideas leverage advanced image analysis and text generation capabilities for diverse applications.

billi9009 · 2 years ago

Let's try

OnlineConverter · 2 years ago

Great ideas! Here are some I came up with:
- An app that scans the edible products in your refrigerator and suggests dishes you can make with them. It should generate detailed cooking guides.
- An app that analyzes plant species and suggests tips on how to grow and care for them.

NickNaskida · 2 years ago

Here is a list of another 550+ products you can build using GPT-4: https://www.indiehackers.com/post/50-side-projects-making-1m-f51ce727df

iideaman · 2 years ago