Clarify usage of defineDotPrompt vs definePrompt #338
Thanks for writing this up @MichaelDoyle! Just adding some comments here to perhaps save others some time. There are a few questions raised inline too. The following is a sample dotprompt saved as weather.prompt; you'll see at the top of the file that I have used it with both gpt-4o and Gemini. It would be great if we could pull the list of available model name strings (e.g. "googleai/gemini-1.5-pro-latest") via the Genkit API, ideally across all LLM providers, i.e. Ollama, OpenAI, Google, Grok etc. If this is not possible via the API, then having them listed on GitHub in one place would be a good start.
---
# model: googleai/gemini-1.5-pro-latest
model: openai/gpt-4o
config:
temperature: 0.6
# input:
# schema:
# type: object
# properties:
# cities:
# type: array
# items:
# type: string
# No 'required' keyword here, making it optional
output:
format: text
tools:
- getWeather
---
{{role "system"}}
Always try to be as efficient as possible, and request tool calls in batches.
{{role "user"}}
I really enjoy traveling to places where it's not too hot and not too cold.
{{role "model"}}
Sure, I can help you with that.
{{role "user"}}
Help me decide which is a better place to visit today based on the weather.
I want to be outside as much as possible. Here are the cities I am considering:
New York
London
Amsterdam
A small thing to remember here is to call configureGenkit and not forget "apiVersion: 'v1beta'"; otherwise you'll get a googleai/gemini-1.5-pro-latest not found error.
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';
admin.initializeApp();
import { dotprompt } from '@genkit-ai/dotprompt';
import { configureGenkit } from '@genkit-ai/core';
import { googleAI } from '@genkit-ai/googleai';
import { openAI } from 'genkitx-openai';
configureGenkit({ plugins: [
dotprompt(),
googleAI({ apiKey: '<YOUR_API_KEY>', apiVersion: 'v1beta' }),
openAI({ apiKey: '<YOUR_API_KEY>' })
] });
The calling code:
const weatherPrompt = await prompt('weather');
const promptResult = await weatherPrompt.generate({});
I have tested this simple tool with Gemini and GPT-4o:
const getWeather = defineTool(
{
name: 'getWeather',
description: 'Get the weather for the given location.',
inputSchema: z.object({ city: z.string() }),
outputSchema: z.object({
temperatureF: z.number(),
conditions: z.string(),
}),
},
async (input) => {
const conditions = ['Sunny', 'Cloudy', 'Partially Cloudy', 'Raining'];
const c = Math.floor(Math.random() * conditions.length);
const temp = Math.floor(Math.random() * (120 - 32) + 32);
return {
temperatureF: temp,
conditions: conditions[c],
};
}
);
export { getWeather };
The tool is called as expected; however, "promptResult.toolRequests" never seems to return anything. Perhaps I am not calling it correctly?
if (promptResult.toolRequests) {
for (const toolRequest of promptResult.toolRequests()) {
const tool = this._tools[toolRequest.name];
if (tool) {
await tool.run(this._agent_doc_ref, toolRequest?.input);
this._last_tool_completion_datetime = new Date(new Date().toUTCString());
await this.updateAgentData({
last_action: toolRequest?.name,
[`last_tool_completion_datetime_${this._agent_chat_id}`]: this._last_tool_completion_datetime,
[`current_prompt_step_${this._agent_chat_id}`]: this._current_prompt_step,
retry: retryCount
});
hasCalledTool = true;
}
}
}
I have got around this by looking at the history instead:
const responseHistory = promptResult.toHistory();
toHistory() returns messages with a content array containing parts of type toolRequest and toolResponse, so at least we have access to them.
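For example, here is roughly how I can pull the tool calls back out of that history (just a sketch; the toolRequest part shape is an assumption based on what toHistory() returned for me and may differ between Genkit versions):
const toolCalls = promptResult
  .toHistory()
  .flatMap((message: any) => message.content ?? [])
  .filter((part: any) => part.toolRequest !== undefined)
  .map((part: any) => part.toolRequest);
// Each entry should then carry the tool name and input that the model requested.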
I am wondering about the best way to handle more complex tools, where we need to pass in parameters from the underlying code. For example, we might want to pass in the relevant IDatabase and the associated database record key. Below you'll see I created a wrapper function for this called createToolWithContext:
const createToolWithContext = (database: IDatabase, agentDocId: string) => {
  return <TInput, TOutput>(
    config: {
      name: string;
      description: string;
      inputSchema: z.ZodSchema<TInput>;
      outputSchema?: z.ZodSchema<TOutput>;
    },
    handler: (input: TInput, context: { database: IDatabase; agentDocId: string }) => Promise<TOutput>
  ) => {
return defineTool(config, (input: TInput) => handler(input, { database, agentDocId }));
};
};
/**
* Tool to save the user's first and last name.
*/
const saveName = (database: IDatabase, agentDocId: string) => createToolWithContext(database, agentDocId)(
{
name: 'SaveName',
description: 'Saves the user\'s first and last name.',
inputSchema: z.object({
firstName: z.string().optional(),
lastName: z.string().optional(),
}),
},
async (input, { database, agentDocId }) => {
// Split the agent document ID to get the user ID.
const userId = agentDocId.split('_')[0];
// Prepare the update data object.
const updateData: any = {};
if (input.firstName) {
updateData.firstName = input.firstName.trim();
}
if (input.lastName) {
updateData.lastName = input.lastName.trim();
}
// Update the user profile document in the database.
await database.set('userProfile', userId, updateData, true);
return { success: true };
}
);
I assume the correct approach is then to set the tools programmatically (when using dotprompt). Is this correct?
dotPrompt.tools = [saveName(database, agentDocId)];
const promptResult = await dotPrompt.generate({});
Lastly, thanks for the notes on history. I may pass that into the dotprompt as a parameter and render it. It would be great if there was some way to auto-summarise the history based on the number of tokens used and pass that in as history. The plumbing code is quite tedious to write and difficult to test. Is this something the Genkit API could handle with the call to saveHistory()?
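To make the request concrete, here is a rough sketch of the plumbing I mean; everything in it is hypothetical and not a Genkit API (estimateTokens and summaryPrompt would be my own helpers):
const MAX_HISTORY_TOKENS = 2000; // arbitrary budget, not a Genkit setting
let trimmedHistory = history;
if (estimateTokens(history) > MAX_HISTORY_TOKENS) {
  // Replace the older turns with a single summary message and keep the recent turns verbatim.
  const summary = await summaryPrompt.generate({ input: { history } });
  trimmedHistory = [
    { role: 'system', content: [{ text: `Conversation so far: ${summary.text()}` }] },
    ...history.slice(-4),
  ];
}
// trimmedHistory can then be passed to the dotprompt as the history input.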
First off - thank you so much for taking the time to do such a detailed write-up. We'll definitely leverage these insights as we continue to make improvements to the framework. See answers to your questions below:
Are you looking for code completion / compile-time checking for model names? Or a reflective way to interrogate Genkit programmatically? Currently, we do not provide the former, but we do provide the latter. Granted, you'll only be able to interrogate the registry for plugins that are loaded.
Good call out - the Gemini 1.5 family of models are now GA, and will be available in the v1 API starting in next week's Genkit release. Separately, we're working on improved messaging if
What would be most intuitive for you here? What will you do with access to the tool calls? In short, you are handling this correctly by looking through the message history. You'll only see toolRequests if you pass returnToolRequests: true.
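For example, roughly (treating the exact option placement as an assumption, since it may differ between versions):
const result = await weatherPrompt.generate({ returnToolRequests: true });
// With this flag, the requested tool calls are handed back to the caller
// (e.g. via result.toolRequests()) instead of Genkit running the tools for you.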
I think you have an interesting solution here; you should be able to reference the tool by name in your .prompt file. Depending on whether this code is run once and then thrown away (e.g. inside a cloud/firebase function) or is long lived, this may or may not behave the way you are expecting. You don't want to register the same tool name twice, and you also don't want to register a new one globally for every user request either. If you need something truly dynamic, you should be able to pass an action in directly.
Thanks for the suggestion/request. We'll give this one some thought. There are a few of us working through what it might look like in Genkit to support Agents in a more first-class way.
Thank you @MichaelDoyle I'll answer inline as follows:
The code completion would be a nice-to-have; a programmatic, reflective way to query Genkit would be ideal, covering all models, e.g. all supported OpenAI models, Ollama, etc.
Processing the history is a bit of a pain, and I'm not sure it would be performant for a long conversation. It would help if promptResult.toolRequests returned which tools were called, when each tool was called, with what parameters, and what the return value was: essentially all the history filtered by role="tools", I guess...
I would like to be able to switch between Firebase and any other Node environment. I have achieved this by abstracting executeFlow and the Agent class as follows:
./index.ts
export const startFlow = functions.https.onCall(async (data, context) => {
const userId = context.auth?.uid;
if (!userId) {
return { status: "FAIL2", message: "User not logged in: unauthorized user", data: {} };
}
const { response, agent_chat_id } = data;
return await executeFlow(userId, response, agent_chat_id);
});
./agentService.ts
export async function executeFlow(userId: string, response: string, agent_chat_id: number) {
try {
const database = new FirestoreDatabase();
const intent3nsquestionnaire = await prompt('intent3nsquestionnaire');
const dotPrompts = [intent3nsquestionnaire];
const agent = new AgentMultiPromptSequencer(
userId,
agent_chat_id,
database,
dotPrompts,
generateChatTitlePrompt
);
await agent.init();
const agentRes = await agent.handle_response(response);
return { status: "SUCCESS", message: "", data: { result: agentRes } };
} catch (error) {
console.error("CATCH ERROR executeFlow: ", error);
}
}
See questions 5 and 6 below. Now let's have a look at the ./AgentMultiPromptSequencer.ts class and our dynamic tool:
const dynamicTools = createDynamicTool(this._database, `${this._owner_id}_${this._agent_chat_id}`);
dotPrompt.tools = [dynamicTools.saveName];
let input = {
isRetryCountZeroOrLess: retryCount <= 0,
userResponse: response, retryCount: retryCount, questionId: this._current_prompt_step, history
};
let promptResult = await dotPrompt.generate({ input });
Where createDynamicTool is defined as:
export const createDynamicTool = (database: IDatabase, agentDocId: string) => {
return {
saveName: {
name: 'SaveName',
description: 'Call after response to the question like: can you tell me your first and last name',
inputSchema: z.object({
firstName: z.string().optional(),
lastName: z.string().optional(),
}),
handler: async (input: { firstName?: string; lastName?: string }) => {
const userId = agentDocId.split('_')[0];
const updateData: any = {};
if (input.firstName) {
updateData.firstName = input.firstName.trim();
}
if (input.lastName) {
updateData.lastName = input.lastName.trim();
}
await database.set('userProfile', userId, updateData, true);
return { success: true };
}
}
};
};
I've set out question 4 below on the above. This brings us to the .prompt file and Handlebars. It appears that custom helpers are not supported by Genkit's Handlebars. Examine the following .prompt file, especially the knownHelpersOnly: false option and the logic keywords:
---
model: openai/gpt-4
options:
knownHelpersOnly: false
config:
temperature: 0.6
maxRetryCount:
0: 3
1: 5
2: 2
3: 2
4: 2
5: 2
6: 3
7: 2
input:
schema:
type: object
properties:
userResponse:
type: string
default: ""
retryCount:
type: number
questionId:
type: number
history:
type: array
items:
type: object
properties:
content:
type: array
items:
type: string
role:
type: string
domainName:
type: string
default: ""
output:
format: text
---
{{role "system"}}
You are an expert question answering gathering assistant, your goal is to..
{{#each history}}
{{#if (eq this.role "user")}}
{{role "user"}}
{{this.content.[0]}}
{{/role}}
{{else if (eq this.role "model")}}
{{role "model"}}
{{this.content.[0]}}
{{/role}}
{{/if}}
{{/each}}
{{#switch questionId}}
{{#case 0}}
{{#if (lte retryCount 0)}}
{{role "system"}}
Greetings, "To get started, could you please provide your first and last name so I know how to address you?"
{{else}}
{{role "system"}}
We asked the user to provide their first and last name, and they responded: "{{userResponse}}". Please acknowledge their response logically and politely, and try to encourage them to answer the question we raised.
{{/if}}
{{/case}}
{{#case 1}}
{{#if (lte retryCount 0)}}
{{role "system"}}
In a beautifully reworded, concise, and natural way, ask the user the following question and any relevant follow-up questions to gather their intent. Question: "Please tell me your main purpose...."
{{else}}
{{role "system"}}
We asked the user: "Do you want me to sell something, prompt...?" They answered: "{{userResponse}}". Please respond logically to their answer and try to encourage them to clarify their intent further.
{{/if}}
{{/case}}
{{#case 2}}
etc...
{{/switch}}
This will throw parse errors on "eq" and "lte", something like "You specified knownHelpersOnly, but used the unknown helper eq - 4:6". So I thought I would define my own:
const Handlebars = require('handlebars');
Handlebars.registerHelper('eq', function(a, b) {
return a == b;
However, these helpers are just ignored even if we specify
options:
knownHelpersOnly: false
in the .prompt file. I guess I can hack my way around this by:
A few questions: 1.) Is it not possible to define our own custom Handlebars helpers? I would very much appreciate any guidance you might have. Thanks again!
Re: (5) HTTP Cloud Functions do support streaming, but we haven't yet added it to the Callable Functions protocol. It's something we're interested in pursuing.
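For anyone curious what that could look like today, a rough sketch of streaming from a plain HTTP (non-callable) function using the standard Express-style response object (the chunks here are just placeholders for incremental model output):
import * as functions from 'firebase-functions';

export const streamFlow = functions.https.onRequest(async (req, res) => {
  res.setHeader('Content-Type', 'text/plain; charset=utf-8');
  // Write partial results as they become available, then close the response.
  for (const chunk of ['partial ', 'results ', 'streamed ', 'back']) {
    res.write(chunk);
  }
  res.end();
});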
Thanks for the help. I am losing quite a bit of time investigating and trying to work around these issues; it would be really helpful if you could, if at all possible:
Many thanks
2, 3, 7 - Context and history: implemented in #421.
4 - I think you mostly have it. I would just change things around slightly so that you're returning an action:
import { action } from '@genkit-ai/core';
export const createDynamicTool = (database: IDatabase, agentDocId: string) => {
return action(
{
name: 'saveName', // give this a unique name however you'd like
description:
'Call after response to the question like: can you tell me your first and last name',
inputSchema: z.object({
firstName: z.string().optional(),
lastName: z.string().optional(),
}),
},
async (input) => {
const userId = agentDocId.split('_')[0];
const updateData: any = {};
if (input.firstName) {
updateData.firstName = input.firstName.trim();
}
if (input.lastName) {
updateData.lastName = input.lastName.trim();
}
await database.set('userProfile', userId, updateData, true);
return { success: true };
}
);
};
5 - Let me check in and see if there is an expert on Firebase functions who can weigh in on that one.
6 - You can use the registry to do this:
import { listActions } from '@genkit-ai/core/registry';
Object.keys(await listActions())
.filter((k) => k.startsWith('/model/'))
.map((k) => k.substring(7, k.length));
Example output (depends on what plugin(s) you have installed):
[
"googleai/gemini-1.5-pro-latest",
"googleai/gemini-1.5-flash-latest",
"googleai/gemini-pro",
"googleai/gemini-pro-vision",
"ollama/llama2",
"ollama/llama3",
"ollama/gemma",
"vertexai/imagen2",
"vertexai/gemini-1.0-pro",
"vertexai/gemini-1.0-pro-vision",
"vertexai/gemini-1.5-pro",
"vertexai/gemini-1.5-flash",
"vertexai/gemini-1.5-pro-preview",
"vertexai/gemini-1.5-flash-preview",
"vertexai/claude-3-haiku",
"vertexai/claude-3-sonnet",
"vertexai/claude-3-opus"
]
Thanks for the help. The latest bug occurs when setting renderedPrompt.returnToolRequests = false;
CATCH ERROR executeFlow: Error: running outside step context
Calling code:
The tools are defined as follows (note I had to define the output schema, otherwise it throws an error):
and the prompt as:
I must admit I am pretty close to giving up on Genkit. I do have a working version with my own hand-rolled code, but had hoped Genkit would save on the manual implementation of each API. Questions:
1a. (and, if it is now possible, using defineDotPrompt from code and through .prompt with some custom Handlebars extensions)?
thank you.
Great, thanks @MichaelDoyle @mbleigh. Do you have a timeline on fixing the bug thrown? It only happens when returnToolRequests is false; if it is true, it works as expected and the tool request call is in the history.
CATCH ERROR executeFlow: Error: running outside step context
I opened #574, or else I am afraid it will get lost here.
Heads up, this is still causing some confusion: #723. It looks like Genkit has developed quite a bit even over the last 3 weeks though, so my experience working with it may already be out of date. I do see some overlap, though, between some parts of this discussion and some of my own high-level questions about Genkit's architectural/design philosophy: #731.
Acknowledged! There are some efforts underway to revamp the documentation. I'll check in to make sure this is part of it.
@i2amsam @MichaelDoyle docs have been updated to remove definePrompt in favor of defineDotprompt. Any additional work needed here?
There is additional work in progress on prompt documentation, but I think you can close this particular one.
@kevinthecheung much better, well done. I'd say add a small table at the end of the page on when to use what and what each provides, to save people time.
Agreed - I'll close this out, because the fundamental issue is addressed. We can look at including a comparison of different prompt management styles as part of what @kevinthecheung is working on next with the prompt documentation.
Problem
The existence of both definePrompt and defineDotPrompt is causing confusion. See also: discussion #337. I believe this at least partially stems from the following documentation, which frames definePrompt as a starting point and dotprompt as a more advanced capability: https://firebase.google.com/docs/genkit/prompts
I don't think this is what we intend; rather, I think we intend for developers to start with dotprompt.
Background history/context:
When we first implemented the dotprompt library, we had a method definePrompt that was used to create and register a prompt action in the registry. Calling this action conveniently hydrated any input variables into the prompt and then called the model to generate a response.
Later, when we added dotprompt functionality to the Developer UI, we needed an action that would simply render the prompt template without doing the generate step. This led to the following changes:
definePrompt was repurposed as a lower level method that registers a prompt action which returns a GenerateRequest without doing the generate step.
A new defineDotPrompt method was introduced, which followed the old semantics (render the template and then call generate). It uses definePrompt under the covers to register the prompt, which allows the Developer UI to call the underlying prompt action in cases where it only needs the rendered prompt template.
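As a rough sketch of the distinction (using the pre-1.0 package names that appear elsewhere in this thread; the exact import paths, option names, and signatures are assumptions and vary between Genkit versions):
import * as z from 'zod';
import { definePrompt } from '@genkit-ai/ai';
import { defineDotprompt } from '@genkit-ai/dotprompt';

// Lower level: registers a prompt action that only renders a GenerateRequest;
// the caller is responsible for passing the result on to a generate call.
const helloPrompt = definePrompt(
  { name: 'helloPrompt', inputSchema: z.object({ name: z.string() }) },
  async (input) => ({
    messages: [{ role: 'user', content: [{ text: `Say hello to ${input.name}.` }] }],
  })
);

// Higher level: renders the template and then calls generate in one step.
const helloDotprompt = defineDotprompt(
  {
    name: 'helloDotprompt',
    model: 'googleai/gemini-1.5-pro-latest',
    input: { schema: z.object({ name: z.string() }) },
  },
  `Say hello to {{name}}.`
);
const result = await helloDotprompt.generate({ input: { name: 'Ada' } });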
definePrompt
to something more indicative of what it should be used for.The text was updated successfully, but these errors were encountered: