Close Menu
Close Menu
Close Menu
The ultimate guide to prompt engineering your GPT-3.5-Turbo model
Natasha Gouws-Stewart

At Master of Code, we’ve been working with OpenAI’s GPT-3.5-Turbo model since its launch. After many successful implementations, we wanted to share our best practices on prompt engineering the model for your Generative AI solution.

We realize that new models may have launched after we publish this article, so it’s important to note that these prompt engineering approaches apply to OpenAI’s latest model, GPT-3.5-Turbo.

What is GPT prompt engineering?

GPT prompt engineering is the practice of strategically constructing prompts to guide the behavior of GPT language models, such as GPT-3, GPT-3.5-Turbo or GPT-4. It involves composing prompts in a way that will influence the model to generate your desired responses.

By leveraging prompt engineering techniques, you can elicit more accurate and contextually appropriate responses from the model. GPT prompt engineering is an iterative process that requires experimentation, analysis, and thorough testing to achieve your desired outcome.

Understanding the basics of the GPT-3.5-Turbo model

Before we jump right into GPT prompt engineering, it’s important to understand what parameters need to be set for the GPT-3.5-Turbo model.

GPT-3.5-Turbo model configuration example
GPT-3.5-Turbo model configuration example

Pre-prompt (Pre-header): here you’ll write your set of rules and instructions to the model that will determine how it behaves.

Max tokens: this limits the length of the model’s responses. To give you an idea, 100 tokens is roughly 75 words.

Temperature: determines how dynamic your chatbot’s responses are. Set it higher for dynamic responses, and lower for more static ones. A value of 0 means it will always generate the same output. Conversely, a value of 1 makes the model much more creative.

Top-P: similar to temperature, it determines how broad the chatbot’s vocabulary will be. We recommend having a top P of 1 and adjusting temperature for best results.

Presence and Frequency Penalties: these determine how often the same words appear in the chatbot’s response. From our testing, we recommend keeping these parameters at 0.

Setting up your parameters

Now that you understand the parameters you have, it’s time to set their values for your GPT-3.5 model. Here’s what’s worked for us so far:

Temperature: If your use case allows for high variability or personalization (such as product recommendations) from user to user, we recommend a temperature of 0.7 or higher. For more static responses, such as answers to FAQs on return policies or shipping rates, adjust it to 0.4. We’ve also found that with a higher temperature metric, the model tends to add 10 to 15 words on top of your word/token limit, so keep that in mind when setting your parameters.

Top-P: We recommend keeping this at 1, adjusting your temperature instead for the best results.

Max tokens: For short, concise responses (which in our experience is always best), choose a value between 30 and 50, depending on the main use cases of your chatbot.

Writing the pre-prompt for the GPT-3.5-Turbo model

GPT prompt engineering schema
GPT prompt engineering schema

This is where you get to put your prompt engineering skills into practice. At Master of Code, we refer to the pre-prompt as the pre-header as there are already other types of prompting associated with Conversational AI bots. Below are some of our best tips for writing your pre-header.

  • One of the first things to do in prompt engineering is to provide context for your model. Start by giving your bot an identity, including details of your bot persona such as name, characteristics, tone of voice and behavior. Also include the use cases it must handle, along with its scope.
  • Use positive instructions instead of negative ones (i.e. ‘Do’ as opposed to ‘Do not’). For example, when training a model that tended to ask too many questions at once, we changed the pre-header to say “When you ask the user for information, you only ask a maximum of 1 question at a time” from “Do not ask the user more than 1 question at any time”. This yielded better results.
  • Include examples in your pre-header instructions for accurate intent detection.
  • Experiment with synonyms to achieve desired behavior.
  • During the prompt engineering process, avoid giving conflicting or repetitive instructions.
  • Specify the number of words the bot should include in its responses. For example, “When the dialogue starts, you introduce yourself by name and proceed to help the user in no more than 30 words.” Controlling the output length is an important aspect of prompt engineering, as you don’t want the model to generate lengthy text that no one bothers to read.
  • Order your pre-header with: Bot persona, Scope, Intent instructions.
  • When it comes to prompt engineering, having your AI Trainers and Conversation Design teams collaborate is crucial. This ensures that both the business and user needs are accurately translated into the model, tone of voice is consistent with the brand, and instructions are clear, concise and stick to the scope.

Testing your GPT-3.5-Turbo model

Now that you’ve set up your GPT-3.5 model for success, it’s time to put your prompt engineering skills to the test. As mentioned before, prompt engineering is iterative, which means you’ll need to test and revise your pre-header – the model is most likely not going to behave how you intended the first time around. Here are some simple but important tips for testing and updating your pre-header:

  • Change one thing at a time. If you see your model isn’t behaving how you expect, don’t change everything at once. This allows you to test each change and keep track of what’s working and what isn’t.
  • Test every small change. The smallest update could cause your model to behave differently, so it’s crucial to test it. That means even testing that comma you added somewhere in the middle, before you make any other changes.
  • Don’t forget you can also change other parameters, such as temperature. You don’t only have to edit the pre-header to get your desired output.

And finally, if you think you’ve prompt-engineered your pre-header to perfection and the bot is behaving as expected, don’t be surprised if there are outliers in your testing data where the bot does its own thing every now and then. This is because Generative AI models can be unpredictable. To counter this, we use injections. These can overrun pre-header instructions and correct the model’s behavior. They’re sent to the model from the backend as an additional guide, and can be used at any point within the conversation.

By following these GPT prompt engineering best practices, you can enhance your GPT model and achieve more accurate and tailored responses for your specific use cases.

Interested in leveraging the power of Generative AI chatbots? Our skilled team is ready to help.

    By continuing, you're agreeing to the Master of Code
    Terms of Use and
    Privacy Policy and Google’s
    Terms and
    Privacy Policy

    Also Read

    All articles