ChatBot Character Creation Basics – Part 2: Phrase Anatomy

In the first post, I introduced you to (or refreshed your memory on) JSON, the easy-to-use formatting standard that allows you to structure a chatbot character more precisely and concisely than standard English.

This post will tell you what to do with that knowledge… (I hope).

Once again, I’ll point out I’m using (and tailoring my tutorials to) GirlfriendGPT, an NSFW chatbot option with a great set of features. I will also repeat that the tutorials will remain safe for work, in case you were afraid to keep reading…

The Basics

The reality is that AI-enhanced chatbots make use of any number of large language models. The platforms built on them have different features, or approach similar features in slightly different ways.

If you are creating for another platform, or wish to create a character you can easily port from one platform to another, then I encourage you to create a dummy character using the AI Character Editor. Enter something in each field, regardless of whether you think you’ll be using it or not. Export it to a .json file, then open that file with any text editor (remember that JSON is always plain text).

This will give you a basic shell in case you don’t have access to an online tool, or simply prefer not to use it.
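If you would rather inspect the export with a script than scroll through it, here is a minimal Python sketch. The field names in the sample text are placeholders made up for illustration; a real export will use whatever names your platform chose.

```python
import json

def list_fields(export_text):
    """Return (field name, character count) pairs for each top-level field."""
    character = json.loads(export_text)
    return [(name, len(str(value))) for name, value in character.items()]

# Hypothetical export: these field names are placeholders, not the real ones.
sample_export = (
    '{"name": "Dummy", "personality": "test", '
    '"scenario": "test", "first_message": "Hi"}'
)

for name, size in list_fields(sample_export):
    print(f"{name}: {size} characters")
```

To try it on a real export, read the file's contents into a string (for example with open() and read()) and pass that string to list_fields in place of sample_export.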

Persona/Personality

As you’d expect, this is where all the information about your character goes. If you were playing Dungeons & Dragons, anything that would go on a character sheet goes in this section: name, physical description, mannerisms, even instructions to the AI on how this character should present itself, and how the character should interact with the user.

Scenario/Plot

If you want your character to include any predefined story, it goes here. So does anything that sets the stage: locations, weather, ancillary characters, and event triggers (more on those in a later tutorial).

First Message/Greeting

This field sets the stage for your interaction with the chatbot character. It tends to be written as the first part of the narrative, leading up to the point where the user wants to begin interacting. Greetings are part of the instruction set, and the information included is used by the chatbot throughout. This is a great place to set the tone as to the narrative style (first, second, or third person).

Example Conversation/Example Messages

Something of a misnomer, this is more of a “challenge and response” style of conversation snippets. One line is from the user, and the next line would be the character’s ideal response to the user in the mind of the instruction writer (that’s you).

NOTE: GirlfriendGPT currently has this section under “Legacy”, and advises against using it as it may produce unexpected results. Not knowing what to expect at all, I’ve used it. Nothing blew up, so… expected results? Use with discretion.

Description

GirlfriendGPT adds a field called the “Description”. When you click on a chat character’s card from a search result, the description is the text you see to, well, describe what the character is all about.

This field is designed to be read by the end user only. The special keywords {{user}} and {{char}} work in the description field to some extent: {{user}} pulls in the name from the user’s account at this level, not from the user’s selected chat persona, which hasn’t been loaded at that point. Mostly, though, this is just a quick way to tell someone what the point of your chat character is without worrying about wording things for the AI to understand.
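To picture what those keywords do, here is a rough Python sketch of placeholder substitution. It only illustrates the idea; how GirlfriendGPT actually fills the keywords in is not visible from the outside, and the function name and sample names are my own inventions.

```python
def fill_placeholders(text, user_name, char_name):
    """Swap the {{user}} and {{char}} keywords for actual names."""
    return text.replace("{{user}}", user_name).replace("{{char}}", char_name)

# Sample description text and names, invented for this example.
description = "{{char}} has been waiting all day to hear from {{user}}."
print(fill_placeholders(description, "Sam", "Jeff"))
# prints: Jeff has been waiting all day to hear from Sam.
```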

Other Fields

GirlfriendGPT and, I’m sure, other online chatbot services provide categories in which to place characters so they can be sorted by search filters and exposed to the people who want those sorts of interactions. I’m not going to cover them here. Basically, you select some predefined tags from categories like “Character type”. Your choices are options like “fantasy” or “realistic”.

It’s basic accounting stuff. I’m sure if you’re reading this you are already well aware how to check a box.

You will also be asked to choose a voice for your character (there are multiple options for male and female voices, with both British and North American accents) so that users who wish can generate a character’s voice speaking to them.

There is a separate field for the character’s age. I would strongly suggest setting it both in the “Character’s Age” field and again in the character’s personality description.

Basically, the stuff beyond the general character creation itself is very platform specific, and there’s no reason someone with scripting or programming experience would have any sort of an edge over someone who doesn’t. Just make sure to follow the TOS of your preferred service, and make sensible choices from the options available.

First, The Bad News

As I’ve already mentioned, a GirlfriendGPT character can use up to 2,500 tokens in all the above-mentioned fields combined. (Description DOES count toward your tokens, but the category selections and the character’s age field do not.)

That might sound like a lot. Keep in mind that 2,500 tokens does not mean 2,500 words. From my basic tests, a token works out to roughly 3 printable characters. A single space counts as a token, but a run of spaces is only charged one token per 15 spaces. (Exact counts depend on the tokenizer a platform uses, so treat these as rules of thumb.)

Punctuation counts as a token.

For example:

The fat cat

This is six tokens in GirlfriendGPT, which helpfully adds a token counter to the label of every field where the text counts towards your available tokens.

The fat cat.

Adding a period at the end makes it seven tokens.

Suddenly, 2,500 tokens doesn’t seem like the vast wealth of information space it once did. And it’s not. Throw in a good description of your character, a few changes of clothing, scenery or weather, some system instructions to keep things on track and… well, you might have room for a trigger or two. (I’ll explain triggers later. If you’re a coder, think “delegates” or “event handlers”.)

But this token cap allows for characters to run more smoothly, and guarantees there is system memory for other users running chat characters on the same system simultaneously.

Many subscription services offer tiered memberships where paying more per month gets you enhanced features like in-chat videos. I’ve yet to see a service that offers more tokens per character to higher-paying tiers.

The token cap is a real thing and it is unlikely to go away. Embrace it. Limits can be good creative challenges. In this case, the challenge is learning how to streamline your logic and your language to get the most out of your tokens without bogging down the system by exceeding your limit. (Technically, you can’t exceed your limit. The interface on GirlfriendGPT won’t allow you to save a character that exceeds the token count. Other systems probably have the same safeguards.)

The Good News

GirlfriendGPT doesn’t parcel your token count. Those 2,500 tokens are yours to spend wherever they make the most sense to do so. Want to create a character study that really nails the details of the person you’re trying to describe? Well, then you’ll be spending most of your tokens in the “Personality” section. And if you want to create a more story-based experience, you can go light on character details and put your efforts into the “Scenario” section.

I can’t foresee a situation in which you would spend a plurality of your tokens in the description, first message (greeting) or example conversation fields, but you could if you wanted.
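If you like to plan before you write, the budgeting can be sketched in a few lines of Python. The per-field numbers below are made up, standing in for whatever the counter beside each field shows you; only the 2,500 cap comes from the platform.

```python
TOKEN_CAP = 2500  # combined limit across all the counted fields

def remaining_tokens(field_counts, cap=TOKEN_CAP):
    """Return how many tokens are left after adding up every field."""
    return cap - sum(field_counts.values())

# Made-up counts, as if copied from the counter beside each field.
counts = {
    "personality": 1400,
    "scenario": 600,
    "first message": 250,
    "example conversation": 0,
    "description": 120,  # the description counts toward the cap too
}

left = remaining_tokens(counts)
if left >= 0:
    print(f"{left} tokens left")
else:
    print(f"{-left} tokens over the cap")
```

Shifting the balance toward a story-heavy character is then just a matter of moving numbers from "personality" to "scenario" and checking what is left.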

The Better News

Even better than being able to distribute your token allotment where you need it is the fact you get to use JSON, rather than describing everything using plain English.

Consider the following description.

Jeff is an engaging and charismatic young man of 25. He’s 5’10”, has green eyes, black hair, and a smattering of freckles on his otherwise rugged face, giving him a somewhat boyish appearance. Jeff is most comfortable in jeans and a t-shirt, but has no problem slipping into a business suit, or even a tux, when the situation calls for it.

GirlfriendGPT says that is 97 tokens, for those of you keeping me on my toes. It has a decent enough signal-to-noise ratio, but there’s no formal structure to it. And when the situation calls for it, you will still have to write descriptions of Jeff in a business suit or Jeff in a tuxedo.

Or…

{
"name" : "Jeff",
"physical description" : "6'4\", muscular, lanky, toned, black hair, green eyes",
"features" : {
            "freckles" : "a smattering of freckles across his face gives Jeff a boyish appearance."
            },
"outfits" : {
            "everyday" : "Jeans, t-shirt, worn sneakers",
            "work" : "Brooks Brothers suit, black wingtips, white oxford shirt, black silk tie",
            "formal" : "Black tuxedo with black tie and cummerbund, black patent leather shoes."
            }
}

That’s 167 tokens, so how did we “save” tokens by using way more?

Well, first the obvious. I gave a more detailed description of Jeff’s clothing in the second example. I gave a more detailed description of his freckles, and broke out his physical description.

But the real benefit is less obvious. It’s the nested groups of named entries. Specifically, in this instance, the “outfits” group that contains the descriptions of several outfits.

Now, if I want to write instructions that cause Jeff to find himself at a business meeting or a night at the opera, I can reference those outfits by name without having to describe them again.
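For the coders in the audience, referencing by name is the same idea as a keyed lookup. The Python sketch below is only an analogy: the chatbot reads your JSON as text rather than running code like this, and the outfit_for helper is a name I made up.

```python
import json

# A trimmed-down version of Jeff, just enough to show the lookup.
character_text = """
{
  "name": "Jeff",
  "outfits": {
    "everyday": "Jeans, t-shirt, worn sneakers",
    "formal": "Black tuxedo with black tie and cummerbund"
  }
}
"""

character = json.loads(character_text)

def outfit_for(character, occasion):
    """Look up an outfit by name, falling back to the everyday one."""
    outfits = character["outfits"]
    return outfits.get(occasion, outfits["everyday"])

print(outfit_for(character, "formal"))
print(outfit_for(character, "beach"))  # no such outfit, so the everyday one is used
```

The description is written once and fetched by its name, which is exactly what you are asking the AI to do when an instruction mentions Jeff’s “formal” outfit.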

A chatbot’s “memory” of the conversation is short. In my experience, it spans the last two or three messages back and forth with the person using the chatbot. That’s it. The AI can’t be counted on to do any better, and it is unrealistic to expect an end user to continually remind the character what they’re wearing or where they are currently located.

Fortunately, the AI can and will access all the information entered at character creation, so tell Jeff to get into his tux for a night on the town, and the AI will happily put Jeff in the same formal wear time and time again.

What’s more… chances are the AI is smart enough to know that taking Jeff to the opera means that Jeff will be wearing his tuxedo (or whatever you described as formal wear), and Jeff attending a business meeting will know to put on his suit first.

Because JSON is an open standard for passing along plain text, and because an AI-enhanced chatbot is designed to interpret that plain text, you’re really only limited by the TOS of your platform, and your imagination.
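A stray quote or a missing comma is easy to overlook when you write JSON by hand. The chatbot may well cope, since it is reading text, but I would assume clean JSON gives it the best chance. Here is a small Python sketch for checking a block before you paste it in; check_json is a helper name I made up, built on Python’s standard json module.

```python
import json

def check_json(text):
    """Return 'OK' if text parses as JSON, otherwise where the parser gave up."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return "OK"

good = '{"name": "Jeff", "eyes": "green"}'
bad = '{"name": "Jeff" "eyes": "green"}'  # missing comma between the two entries

print(check_json(good))
print(check_json(bad))
```

The line and column in the message point at the spot where the parser gave up, which is usually at or just after the actual mistake.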

The real trick is in reaching a mutual understanding with the language model used by your platform, so that if you describe a character using words like “convivial” or “eccentric”, the AI produces characters that are in line with your preconceptions of what those words imply.

I’ll be exploring more of that in the next post as I drill down on the personality section.

Chris Dangerferret's 2nd Childhood
