Two roles of tool schemas

Today I learned

Dec 12, 2024

Whenever there is a tool call message in a conversation history Anthropic’s implementation requires passing tool schemas even if the current message does not require tool usage. LiteLLM addresses this by introducing a "dummy" tool schema when none is explicitly provided, as seen in the transform_request function in LiteLLM code. That seems like workaround for a problem with the Anthropic SDK - but actually tool schemas are crucial for interpreting tool results. Without the schema, the model cannot fully understand the context or structure of the tool’s output. Therefore, schemas for all tools used in the current conversation thread should be included in the request.

However, tool schemas serve a dual purpose:

They enable the LLM to invoke these tools directly.
They allow the LLM to interpret the results of the tools.

This coupling of interpretation and invocation is error prone - because it is easy to forgot that you need to add all tool schemas used in the past to the list. It also complicates scenarios where you might want the model to understand results from past messages without granting it the ability to invoke the same tools in the current message.

For example, consider a web-browsing tool. Because a full page would not fit into the LLM context the LLM would be given possibility to `scroll` the page and load a new fragment of it. However that scrolling only makes sense work when the LLM is actively "on" the webpage. Allowing the LLM to call scrolling when the browser is not on a webpage can lead to unnecessary errors.

In SDKs for models like OpenAI’s and Anthropic’s, there is the tool_choice argument to direct the LLM to use a specific tool. However, this mechanism doesn’t support specifying a range of tools the LLM is allowed to choose from, so it helps only in some cases.

Conclusions

Schema Management: Always include schemas for tools used in the current conversation thread. This ensures accurate interpretation of tool results.
Separation of Concerns: If you need the LLM to interpret results without granting it the ability to invoke tools, consider preprocessing or filtering the tool list before passing it to the model.
SDK Enhancements: Future SDKs could benefit from a more granular approach to tool management, such as allowing developers to specify separate lists for tools the model can interpret versus tools it can invoke.

AI Adventures: A Programmer’s Journey

Discussion about this post