
The first step in building your AI feature is to design its architecture. This involves defining the sequence of actions that will deliver value and identifying the data points needed to populate the {variables} in your prompts.

1. Design one prompt per task

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/d8baae53-7dc0-4bad-a65f-26898d6a633d/98431310-7ca4-494e-8ec2-6d865405c493/Silex_Brand_Symbol.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/d8baae53-7dc0-4bad-a65f-26898d6a633d/98431310-7ca4-494e-8ec2-6d865405c493/Silex_Brand_Symbol.png" width="40px" /> Don’t expect your feature to function effectively with just one massive “Frankenstein” prompt. Remember the “intern analogy”.

</aside>

As a rule of thumb, aim to keep your prompts small and focused, each designed to accomplish one task exceptionally well. Begin by breaking down your feature into a sequence of simple, manageable steps.

For example, if you're building a feature to summarize a 30-minute call transcript, relying on a single prompt is unlikely to produce consistent quality.

Instead, consider the following approach:

[Diagram: Insert_Notion_ Feature@2x.png — chained prompts for the call-summarization feature]

In this approach, prompts are chained together, with the output of one prompt becoming part of the input for the next.
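The chained approach can be sketched in a few lines of Python. This is a minimal sketch, not a full implementation: the caller supplies an `llm(prompt) -> str` function wrapping whatever model API they use, and the chunking here is naive character splitting.

```python
def summarize_call(transcript, llm, chunk_size=4000):
    """Chain small, focused prompts instead of one 'Frankenstein' prompt."""
    # Step 1: split the long transcript into manageable chunks.
    chunks = [transcript[i:i + chunk_size]
              for i in range(0, len(transcript), chunk_size)]

    # Step 2: one focused prompt per chunk, each doing a single task well.
    partials = [
        llm(f"Summarize the key points of this call excerpt:\n\n{chunk}")
        for chunk in chunks
    ]

    # Step 3: the outputs of one step become the input of the next.
    joined = "\n".join(partials)
    return llm(f"Merge these partial summaries into one concise summary:\n\n{joined}")
```

Each call in the chain stays small enough to prompt, test, and evaluate in isolation.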

This method offers several advantages:

- 🎮 Better Control: Each step has a more specific context and precise examples tailored to that step, leading to more accurate results.
- 💯 Easier Evaluation: The expected output at each step is simpler, making it much easier to evaluate.
- 👨‍🏭 Easier Troubleshooting: In a single "Frankenstein" prompt, identifying and correcting errors is challenging and often creates unintended issues elsewhere. Smaller prompts make troubleshooting far more manageable.
- 🎨 Improved UX Design: Design an experience that keeps the human at the center. Incorporating human-in-the-loop feedback grounds the value of your feature and mitigates the compounding error rate of fully automated steps.
- ⏳ Cost & Latency Optimization: Smaller, more focused tasks can often be handled by simpler models, which are cheaper and faster to run. When it's time to optimize, swapping models at specific steps is straightforward.
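To illustrate that last point, per-step model routing can be as simple as a lookup table. The model names below are placeholders, not recommendations:

```python
# Map each pipeline step to the cheapest model that handles it well.
STEP_MODELS = {
    "chunk_summary": "small-fast-model",    # simple extraction task
    "final_summary": "large-capable-model", # synthesis needs more reasoning
}

def model_for(step):
    # Default to the cheap, fast model for any unlisted step.
    return STEP_MODELS.get(step, "small-fast-model")
```

When a step underperforms, you upgrade only that entry instead of paying for the largest model everywhere.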

Beyond pure prompt engineering, other, more complex techniques can come into play to improve the accuracy and performance of your feature.

2. Retrieval-Augmented Generation is promising yet very complex

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/d8baae53-7dc0-4bad-a65f-26898d6a633d/a48337cf-ceab-4181-9c32-0497315f6bea/Silex_Brand_Symbol.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/d8baae53-7dc0-4bad-a65f-26898d6a633d/a48337cf-ceab-4181-9c32-0497315f6bea/Silex_Brand_Symbol.png" width="40px" /> Be cautious about RAG. It may seem simple, but it is far from trivial to make it work.

</aside>

Providing detailed, domain-specific context to a model through prompts is a critical best practice for improving the quality and relevance of the AI’s output.

RAG offers a compelling vision:

Making all of an organization's unstructured knowledge (PDFs, articles, Notion pages, Slack conversations, and other formats) accessible for AI to use.

<aside> 🤓

What is Retrieval-Augmented Generation? RAG is a technique that automates context insertion: based on the user's input, it searches a knowledge base to find the most relevant information and includes it in the prompt.

</aside>
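A bare-bones sketch of the retrieve-then-generate loop described above, assuming hypothetical `embed(text) -> vector` and `llm(prompt) -> str` helpers supplied by the caller. Production systems layer chunking, a vector index, and re-ranking on top of this, which is exactly where the complexity lives.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def answer(question, knowledge_base, embed, llm, top_k=3):
    """Retrieve the most relevant documents, then generate with that context."""
    q_vec = embed(question)
    # Rank every document by similarity to the user's question.
    ranked = sorted(knowledge_base,
                    key=lambda doc: cosine(q_vec, embed(doc)),
                    reverse=True)
    # Insert only the top matches into the prompt as context.
    context = "\n---\n".join(ranked[:top_k])
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

Note that answer quality now depends on retrieval quality: if the wrong documents rank highest, even a strong model answers from the wrong context.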

Consider a customer support chatbot.