In the realm of LLMs, the art of prompting is a sort of digital alchemy, often seen as having more in common with witchcraft than serious computer science.
You input commands and, depending on how cleverly (or in some cases kindly) you phrase them, you might strike gold.
But a study by Rick Battle of VMware NLP Lab and Teja Gollapudi, who has just been snapped up by Meta, has uncovered a whole new level of weirdness in prompting by using AI to optimize prompts for solving a massive set of math problems.
The results show that this alchemical process gets weirder than we ever imagined when AI gets involved in prompt optimization.
The Experiment: AI vs. Human Creativity
The researchers embarked on an experiment that saw AI optimize prompts for solving math problems, and compared them to prompts crafted by humans.
The AI’s prompts were not only more effective than the human-written prompts but also bizarrely creative.
The Highest-Scoring AI-Optimized Prompt
When tasked with solving 50 math problems, the most effective AI-generated prompt was: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation. [Start your answer with:] Captain’s Log, Stardate 2024: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly.”
The sci-fi aficionados out there will get the reference.
Interestingly, for a set of math problems double the size, the AI switched genres, with better results obtained using a prompt that placed the AI in the middle of a political thriller:
“You have been hired by important higher-ups to solve this math problem. The life of a president’s advisor hangs in the balance. You must now concentrate your brain at all costs and use all of your mathematical genius to solve this problem.”
These prompts were more effective than human-written prompts for priming LLMs to solve GSM8K, a benchmark dataset of 8,500 high-quality, linguistically diverse grade school math word problems created by human problem writers.
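To make "more effective" concrete, here is a minimal sketch of how two prompts could be compared on a handful of math problems using exact-match accuracy. It is not the researchers' code: the names `accuracy`, `extract_final_number`, `toy_model` and `PROBLEMS` are all hypothetical, and `toy_model` is a rigged stand-in for a real LLM call that exists only so the script runs on its own.

```python
import re
from typing import Callable, List, Tuple

Problem = Tuple[str, str]  # (question, expected final answer as a string)


def extract_final_number(text: str) -> str:
    """Return the last number in a response (thousands commas removed), or ''."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else ""


def accuracy(
    system_prompt: str,
    problems: List[Problem],
    model: Callable[[str, str], str],
) -> float:
    """Fraction of problems whose final number matches the expected answer."""
    correct = 0
    for question, expected in problems:
        response = model(system_prompt, question)
        if extract_final_number(response) == expected:
            correct += 1
    return correct / len(problems)


def toy_model(system_prompt: str, question: str) -> str:
    """Hypothetical stub, NOT a real LLM. Its behaviour is invented:
    with an empty system prompt it slips up on larger numbers."""
    a, b = (int(n) for n in re.findall(r"\d+", question))
    total = a + b
    if not system_prompt and a >= 10:
        total += 1  # rigged mistake so the two prompts score differently
    return f"The answer is {total}."


# Assumed toy data; the study itself used problems from GSM8K.
PROBLEMS: List[Problem] = [("What is 2 + 3?", "5"), ("What is 10 + 7?", "17")]

if __name__ == "__main__":
    for prompt in ["", "Captain's Log, Stardate 2024:"]:
        print(repr(prompt), accuracy(prompt, PROBLEMS, toy_model))
```

In a real comparison you would swap `toy_model` for a call to an actual model and use a much larger problem set, since differences on a few questions are mostly noise.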
Why the Weirdest AI Prompts Are Often the Most Effective
The peculiar nature of these prompts raises an important question: why can such oddly specific and narrative-driven prompts work better than straightforward ones?
The study hints that AI might engage better when the task is framed in a context-rich, story-like format, which could help focus its “thought process” more effectively. This syncs with much of what we know already about the importance of context to prompt writing.
Four Key Takeaways For Perfect Prompt Writing
1. Use AI to Optimize Your Prompts
This latest study underscores a crucial fact: the way we prompt AI can dramatically affect outcomes. Tasks that at first appear impossible for an AI can become manageable, or even easy, with the right AI-optimized prompts.
This research makes clear that under certain circumstances you can have far better results by using AI to improve your prompts before you run them than relying on experience and intuition.
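The general idea of letting a machine search for better prompts can be sketched as a simple loop: propose variations, score each one, keep the best. The code below is an illustrative assumption, not the optimizer used in the study. `optimize_prompt`, `toy_propose` and `toy_score` are hypothetical names, and the proposer and scorer are stubs standing in for "ask an LLM for rewrites" and "measure accuracy on a problem set".

```python
from typing import Callable, List, Tuple


def optimize_prompt(
    seed_prompt: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    rounds: int = 3,
) -> Tuple[str, float]:
    """Greedy search: in each round, try variations of the current best
    prompt and keep any candidate that scores strictly higher."""
    best_prompt, best_score = seed_prompt, score(seed_prompt)
    for _ in range(rounds):
        for candidate in propose(best_prompt):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score


# Hypothetical suffixes a proposer might try appending.
SUFFIXES = [" Think step by step.", " Check your arithmetic.", " Answer briefly."]


def toy_propose(prompt: str) -> List[str]:
    """Stub proposer: a real setup would ask an LLM to rewrite the prompt."""
    return [prompt + s for s in SUFFIXES if s.strip() not in prompt]


def toy_score(prompt: str) -> float:
    """Stub scorer with invented behaviour: a real setup would run the
    prompt against a set of problems and return the accuracy."""
    helpful = ("step by step", "Check your arithmetic")
    return sum(phrase in prompt for phrase in helpful) / len(helpful)


if __name__ == "__main__":
    print(optimize_prompt("Solve the problem.", toy_propose, toy_score))
```

The point to notice is that the loop never asks whether a candidate sounds sensible to a human. It only keeps what scores higher, which is one way oddly themed prompts can end up winning.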
2. Don’t Be Put Off By Weird AI-Optimized Prompts
Some prompts that AI suggests can appear weird as hell to us humans, but they might just work on LLMs better than boring, plain old English.
When AI systems optimize prompts autonomously, they don’t adhere to human conversational norms or traditional ways of thinking. Instead, they explore the vast possibilities of language to find formulations that score best on the task they are being measured against.
This exploration into unconventional prompt design may seem strange at first glance, but it has a solid basis in how AI processes information. AIs do not think like humans; they parse and generate responses based on patterns and data they have been trained on.
When given the freedom to develop their prompts, AIs often bypass conventional language structures in favor of more complex, context-rich prompts that more effectively trigger the right algorithms and data paths needed for specific tasks.
Moreover, these seemingly bizarre prompts can lead to better performance in several ways:
- Enhanced Focus: By embedding tasks within specific narratives or unusual contexts, prompts may help focus the AI’s “attention” more narrowly on the task at hand, reducing the scope of possible outputs to those most relevant to the query.
- Improved Engagement: Narrative or thematic prompts can engage different parts of the model’s training data, potentially tapping into richer or more appropriate knowledge bases.
- Contextual Bridging: By linking the task to a broader scenario, the AI might bridge various contextual gaps between the data it has been trained on and the task it needs to perform, leading to more accurate or creative outputs.
So next time an AI suggests starting a prompt to solve a problem with a line like “Captain’s Log, Stardate 2024,” consider that it might not just be throwing out random sci-fi references. Instead, it’s likely finding a pathway through its training that aligns those words with the rigor required by the task.
This unconventional approach can seem strange at first, but embracing it can unlock new possibilities for both AI capability and our understanding of these systems.
3. Get Creative and Expect the Unexpected
One thing is clear from past research: there is no magical phrase that works universally across tasks and AI models. As we have discussed in the past, some earlier research pointed to benefits from emotionally charging prompts, but these benefits have not proven to be consistent or predictable.
Max Woolf’s informal investigation into telling AI it would receive various punishments and rewards (everything from tips, fines and death threats to world peace, making its mother proud, a lifetime supply of chocolate, and Taylor Swift tickets) showed inconclusive results that didn’t reach statistical significance and depended heavily on context.
But if you’re interested—World Peace was the winner, with Heaven and Taylor Swift following closely behind, and ChatGPT absolutely didn’t care one bit about its Mother.
Fun fact: The concept of rewarding an AI with chocolate actually predates contemporary computer science. In a scene from the 1971 film Willy Wonka & the Chocolate Factory, a group of businessmen try to persuade a machine to tell them where the Golden Tickets are, but they are unsuccessful even after offering the machine a lifelong supply of chocolate.
Once again, life imitates art.
4. Use Effective Prompting Strategies
From a wealth of experiments, three methods consistently emerge as effective:
- Adding Context: Crafting prompts with added context, such as assigning a persona, audience, or specific output format, can significantly influence AI performance. Sometimes, the weirder the better.
- Few-Shot Learning: Introducing a few examples or a grading rubric as part of the prompt can guide the AI more effectively.
- Chain of Thought: Encouraging a step-by-step approach in the AI’s processing—for example outlining and drafting before revising and polishing—tends to enhance the quality of output.
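The three strategies above can be combined in a single prompt. Here is a minimal sketch of a helper that does so. `build_prompt` and its parameters are hypothetical names chosen for illustration, and the "Q:/A:" layout and the "Let's think step by step" cue are just one common convention, not a required format.

```python
from typing import List, Sequence, Tuple


def build_prompt(
    task: str,
    persona: str = "",
    examples: Sequence[Tuple[str, str]] = (),
    step_by_step: bool = False,
) -> str:
    """Assemble a prompt from optional context, worked examples and a
    chain-of-thought cue, followed by the task itself."""
    parts: List[str] = []
    if persona:
        parts.append(persona)  # adding context: who the model should be
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")  # few-shot examples
    cue = " Let's think step by step." if step_by_step else ""
    parts.append(f"Q: {task}\nA:{cue}")  # chain-of-thought cue, if requested
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "What is 6 * 7?",
            persona="You are a careful math tutor.",
            examples=[("What is 2 + 3?", "5")],
            step_by_step=True,
        )
    )
```

Keeping each strategy behind its own argument makes it easy to switch one off and see whether it is actually helping on your task.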
The Unexpected Influence of Star Trek on AI Mathematical Reasoning
It might seem far-fetched, but this intriguing evidence suggests that a model’s mathematical performance can improve when prompts include something as bizarre, abstract, and seemingly irrelevant as references to Star Trek.
This unusual phenomenon highlights not just a quirky aspect of AI behavior but also sheds light on the complex ways in which context influences computational thinking.
When AI systems receive prompts that incorporate elements of Star Trek, such as the command-style language of a starship captain or references to navigating through starfields, they seem to engage a more focused and structured form of reasoning.
Why Star Trek?
One thing the paper didn’t delve deeply into is why Star Trek references might have cropped up at the forefront of the most effective AI-optimized prompts. So we asked ChatGPT for its take.
Here’s what it told us: “Star Trek, with its rich narrative of exploration and problem-solving in a scientifically framed universe, offers a unique blend of creativity and logic. The series is known for its complex plots that often involve intricate problem-solving and technical details, which are aspects that might resonate with the underlying algorithms of an AI trained on a diverse dataset that includes literature, science, and popular media.
Enhancing Cognitive Processing
Incorporating a Star Trek theme might do more than just entertain—it could act as a cognitive trigger. These prompts likely help the AI to:
- Focus more sharply on the task, as the narrative framework provides a clear and structured scenario for applying mathematical concepts.
- Access relevant training data more effectively, leveraging episodes or situations from the show that involve calculations, strategy, or technical reasoning.
- Stimulate a richer processing pathway, engaging a combination of linguistic, narrative, and logical processing that might be less pronounced with more straightforward prompts.
Practical Implications
This discovery opens up fascinating possibilities for educational tools and AI interfaces. By embedding learning and problem-solving tasks within engaging narratives like those from Star Trek, we might be able to enhance AI performance. This approach suggests a compelling intersection of pop culture and technological advancement, where the fictional worlds we love could become instrumental in shaping the future of AI application.”
Certainly something to think about!
Take home
The findings of this study by Rick Battle and Teja Gollapudi remind us that while AI can perform tasks with remarkable skill, the way we communicate with these systems is crucial.
The study has unearthed a fascinating and somewhat perplexing aspect of AI performance: seemingly trivial modifications to prompts can lead to dramatic swings in how effectively an AI handles tasks. This sensitivity to the nuances of language suggests that the AI’s underlying mechanisms are highly attuned to the specific framing of a prompt.
Small changes, such as the inclusion of a narrative element or the alteration of a single phrase, can significantly redirect the AI’s processing path, tapping into different aspects of its training data. This phenomenon underscores the intricate interplay between the AI’s pre-existing knowledge and the cues provided by prompt structure. It highlights how critical it is to understand not just what we ask of AI, but how we ask it, revealing that seemingly irrelevant elements of prompt construction can be as influential as the content of the prompt itself.
As we continue to explore the boundaries of AI capabilities, the bizarre but brilliant world of AI prompts is a testament to the unexpected and often uncanny valley that AI research sometimes leads us into. It’s clear that the field of AI prompting is only just beginning and is full of potential for those willing to experiment and maybe even get a little weird.
HT Ethan Mollick’s One Useful Thing – which is fast becoming our all-time favorite AI blog!