Unlocking the power of metaphor: AI’s role in ideation

By: Manoshij Banerjee, independent consultant on digital behaviour and culture,
Mohammed Shahid Abdulla, faculty member, Information Systems, IIM Kozhikode

OpenAI co-founder Sam Altman, in a 2020 blog post titled Idea Generation, says that “the most common question startup founders ask is how to get ideas for startups”, whether the need is for new features in an existing product or for an entirely new product. Having ideas, Altman stresses, is essential not only to ‘start’ but also to keep the venture running through the tectonic shifts that markets undergo as a matter of course. According to a 2023 BCG report, 44% of companies now use AI to “identify new innovation themes, domains, adjacencies, technologies,” among others. The report also found that companies that successfully implement AI in innovation become “idea generation powerhouses,” generating more than 5 times as many ideas as other companies.

Yet, as Harvard professor Theodore Levitt pointed out, ideation is relatively abundant while implementation is scarce. Ideation has to be complemented with what he termed innovation, i.e. the practical intelligence to recognise the situational realities and bureaucratic impediments that prevent an idea from bearing fruit. What startup founders need, then, is innovative ideation, which management guru Peter Drucker held comes from a methodical analysis of opportunity in light of broader social or demographic trends. Innovation, in this sense, points to a path toward feasible implementation.

Researchers from INSEAD found that an ideator who generates “one brilliant idea and nine nonsense ideas [is preferable to] one that generates ten decent ideas”. In other words, a diverse set of ideas that attacks a problem along several lines is better than a few ideas along a single line of attack. A study from Cornell found that GPT-4 generated ideas much faster and more cheaply than students of a top US university, and that its ideas were of higher quality (as measured by purchase intent expressed by sample respondents) and more varied. GPT-4 produced 35 of the top 40 ideas generated in the experiment. In another study, the feasibility ratings that seven different AIs gave to 60 GPT-4-generated business models agreed with those of business strategy scholars two-thirds of the time.

Recent research from the Wharton Business School sheds light on how to prompt GPT-4, a Large Language Model (LLM), so that it produces the most unique and diverse ideas it can generate. Such prompts rely primarily on a chain-of-thought (CoT) sequence that asks the LLM to work in multiple, distinct steps: investigating the problem as a series of microtasks, or pointing out faults or infeasibilities in its current reasoning, much as in a conversation. This CoT prompting method increased the number of unique GPT-4 ideas by 27%, and the diversity of the resulting ideas closely resembled that of human groups. CoT prompting does, however, need a human to engage with an idea at several steps, with the ability to intelligently comprehend and refine the output. Interestingly, experiments indicate that even adding “Let’s think step by step”, which signals a CoT intent without any original reasoning from the human, significantly improves GPT-4’s output.
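To make the two prompting styles concrete, here is a minimal Python sketch of how such prompts might be assembled. It is an illustration only, not the prompts used in the studies cited above: the function names, the sample task and the list of microtasks are our own assumptions, and no LLM is actually called.

```python
# Illustrative sketch only: helper names, task and steps are hypothetical,
# and the resulting strings would still need to be sent to an LLM of choice.

COT_TRIGGER = "Let's think step by step."


def zero_shot_cot(task: str) -> str:
    """Append the step-by-step trigger phrase to a plain task description."""
    return f"{task.strip()}\n\n{COT_TRIGGER}"


def multi_step_cot(task: str, steps: list[str]) -> str:
    """Spell out distinct microtasks for the model to work through in order."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{task.strip()}\n\nWork through these steps in order:\n{numbered}"


if __name__ == "__main__":
    task = "Suggest ideas for a new feature in our product."  # assumed example task
    steps = [  # assumed microtasks, loosely following the description above
        "List a wide range of candidate ideas.",
        "Point out faults or infeasibilities in each idea.",
        "Revise the ideas so that each one is clearly different from the rest.",
    ]
    print(zero_shot_cot(task))
    print()
    print(multi_step_cot(task, steps))
```

The first helper only signals a CoT intent, while the second encodes the human's own breakdown of the problem; in practice the human would still read each response and refine the next prompt.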

In 2023, researchers from Stanford University found that, combined with the CoT technique, GPT-3 could understand literary and non-literary metaphors such as “A smile is a knife” and “A train is a large worm”, respectively. In another study, GPT-4 interpreted literary metaphors from Serbian poetry better (as judged by human poets) than a group of college students. Researchers point out that metaphors, such as “A wish is a rainbow”, involve cross-domain mappings, i.e. a primary and a secondary subject (here, wish and rainbow respectively) interact in a way that is understood or appreciated as novel.

An interesting finding is that a metaphorical mental model, in which analogies with another situation help inform business decisions, has been associated with business success; this is often termed second-order learning. Business impresario Ray Dalio says that lacking a second-order mental model can cause painfully bad decisions and lead a business’s founder to settle for a typical knee-jerk response, which is usually inferior. Such metaphorical reasoning in LLMs, triggered by CoT prompts, can help identify previously overlooked or unnoticed similarities between two unrelated situations.

However, it is only the aggregate performance of multiple AIs, aided by few-shot CoT prompting from relatively qualified humans, that stands a chance of refining an AI’s output to the needed entrepreneurial or venture level. Progress in AI is brisk, though recent research shows that GPT-4 has not yet developed robust abstraction abilities at humanlike levels. For instance, even when multiple deduction steps from CoT are available, LLMs cannot pursue these systematically. Other research shows that including diverse demonstrations within the CoT prompt strengthens the LLM’s reasoning chain and produces a more refined output. Despite needing a human to augment its thinking, AI may well be on its way into a startup’s core team, even more so given new LLMs modeled on Indian languages.
