Opportunity space in prompt-to-image

I saw this topic was requested: What Ideas Are Generating Revenue in AI?

TL;DR: While not strictly on topic, I thought I’d share some recent observations on potential opportunity spaces and use cases, specifically in AI text-to-image generation.

For context: I co-founded an open innovation company (iNRV8.net). We typically work with Fortune 500 companies in the CPG space (think Heinz, Unilever, P&G etc.) and help them ideate new concepts, brands and innovations across product development, marketing, comms, activation, packaging etc.

While we have a highly consultative element, we use our platform and network of global creators to solve the challenges these brands face. Working to strategic, objective-oriented goals, we frequently propose visual concepts and designs in response to detailed briefings.

As you will probably be able to tell ; ) I am coming from a non-technical side.

We recently ran a client project for an international brand and immediately saw the impact of Midjourney etc. Our creatives are all skilled designers, but I’d estimate 70% of the concepts were rendered using some form of text-to-image software, and this is totally new.

What is clear is that the quality of the problem solver is still paramount, i.e. a solution can only be as good as the person prompting and their rationale behind the prompt. The ability of Midjourney to churn out beautiful images is staggering, but the intent behind the designs is still key and of most value.

Here are some observations and possible opportunities:

Unlike text generators, e.g. ChatGPT, visual tools lack control. For example, I can ask a text AI to write me a long passage of copy and then add, amend or eliminate parts of it, right down to the level of a single word.

Problem: lack of easy control and inconsistency in images. Text-to-image is great at exploratory work, and while it is possible to control elements with image inputs, seed variation, image references and training models, minor iterations are not so easy. For example, say you are designing a concept bottle for a drinks brand. The design and concept process is often iterative: you need to keep certain elements exactly as they are while changing others. The ability to isolate the very specific parts you want to retain, adapt or remove would be highly advantageous and would allow these programs to be used predictably in a workflow.
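To make the "retain some parts, change others" idea concrete, here is a rough Python sketch. It is an assumption for illustration only, not how Midjourney or any other tool works: images are plain lists of rows of (R, G, B) tuples, and `composite_with_mask` and `keep_mask` are hypothetical names. The point is simply that a mask marks which pixels are locked, so a new generation can only change the rest.

```python
# Hypothetical sketch: images are lists of rows of (R, G, B) tuples,
# and keep_mask is a same-sized grid of booleans (True = keep original).

def composite_with_mask(original, regenerated, keep_mask):
    """Keep `original` pixels where the mask is True; use `regenerated` elsewhere."""
    if not (len(original) == len(regenerated) == len(keep_mask)):
        raise ValueError("images and mask must have the same height")
    result = []
    for orig_row, new_row, mask_row in zip(original, regenerated, keep_mask):
        if not (len(orig_row) == len(new_row) == len(mask_row)):
            raise ValueError("images and mask must have the same width")
        result.append(
            [o if keep else n for o, n, keep in zip(orig_row, new_row, mask_row)]
        )
    return result


# Example: lock the left column (say, the bottle label), let the right column change.
original = [[(10, 10, 10), (20, 20, 20)],
            [(30, 30, 30), (40, 40, 40)]]
regenerated = [[(99, 99, 99), (88, 88, 88)],
               [(77, 77, 77), (66, 66, 66)]]
keep_mask = [[True, False],
             [True, False]]
mixed = composite_with_mask(original, regenerated, keep_mask)
```

In the bottle example, the mask would cover the elements that are already signed off, and each new iteration would only be allowed to touch what lies outside it.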

Problem: you are working from a brand base which has an existing style, tone and colour palette, or which sits under a style guide. A way to ‘plug’ this guide in and still ‘ideate’ using the tool would be highly beneficial to brand agencies, designers, social media managers etc. In effect, a controlled way to constrain the possible generations.
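As a very crude illustration of "plugging in" a palette, the Python sketch below snaps every pixel of a generated image to the nearest colour in a brand palette. Everything here is an assumption for illustration: the palette values are made up, the function names are hypothetical, and plain RGB distance is only a rough stand-in for how similar two colours look. A real style guide covers far more than colour, so this shows only the simplest kind of constraint.

```python
# Hypothetical sketch: images are lists of rows of (R, G, B) tuples.

def nearest_brand_colour(pixel, palette):
    """Return the palette colour closest to `pixel` by squared RGB distance."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


def snap_to_palette(image, palette):
    """Replace every pixel with its nearest colour from the brand palette."""
    return [[nearest_brand_colour(p, palette) for p in row] for row in image]


# Made-up two-colour palette: a brand red and white.
brand_palette = [(255, 0, 0), (255, 255, 255)]
generated = [[(250, 10, 10), (200, 200, 200)]]
on_brand = snap_to_palette(generated, brand_palette)
```

Here the off-red pixel becomes the brand red and the light grey becomes white. Constraining the generation itself, rather than correcting it afterwards, is the harder and more valuable version of the same idea.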

Problem: lack of quality text or logos. The ability to upload a logo or set text which could then be accurately applied to a visual would be advantageous, a little like how a decal is used in 3D CAD software. Even if general text production improves, having an accurate brand name or logo is highly desirable.
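The decal idea can be sketched in a few lines of Python. This is an illustrative assumption, not any product's feature: the base image is a list of rows of (R, G, B) tuples, the logo is a list of rows of (R, G, B, A) tuples with alpha from 0 to 255, and `apply_decal` is a hypothetical name. The logo is never regenerated by the AI; its exact pixels are blended on top of the generated image at a chosen position.

```python
# Hypothetical sketch: base pixels are (R, G, B); logo pixels are (R, G, B, A).

def apply_decal(base, logo, top, left):
    """Alpha-blend `logo` onto a copy of `base` with its corner at (top, left)."""
    out = [list(row) for row in base]  # copy so the base image is untouched
    for y, logo_row in enumerate(logo):
        for x, (r, g, b, a) in enumerate(logo_row):
            by, bx = top + y, left + x
            # Skip logo pixels that fall outside the base image.
            if 0 <= by < len(out) and 0 <= bx < len(out[by]):
                br, bg, bb = out[by][bx]
                # Integer blend with rounding: a=255 is opaque, a=0 is invisible.
                out[by][bx] = (
                    (r * a + br * (255 - a) + 127) // 255,
                    (g * a + bg * (255 - a) + 127) // 255,
                    (b * a + bb * (255 - a) + 127) // 255,
                )
    return out


white = (255, 255, 255)
base = [[white, white], [white, white]]
logo = [[(0, 0, 0, 255), (0, 0, 0, 0)]]  # one opaque black pixel, one transparent
branded = apply_decal(base, logo, top=0, left=0)
```

Because the logo pixels are copied rather than generated, the brand mark stays accurate regardless of how well the model handles text. A usable tool would also need to handle perspective, curvature and lighting, which this sketch ignores.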

Problem: while paid upgrades allow you to keep your work private and off public servers/view, a belt-and-braces approach would be highly beneficial when managing third-party IP and strict NDAs (non-disclosure agreements).

The potential users are product, brand, ad and general creative agencies and freelancers. The emerging users are those who are problem solvers but who were limited by lack of design ability.

Photoshop's new fill tool is heading in the right direction but still lacks the required control.

I can’t verify how many people pay for premium AI generation services, but as the option is now common there appears to be a business case and a willingness to upgrade.

For someone interested in the space, perhaps this is useful inspiration.