Last week, OctoAI’s Pedro Torruella and Alyss Noland hosted the first in a series of webinars on Harnessing Agentic AI: Function Calling. We had a lively discussion and received more than 50 audience questions 🤯 Below is a roundup of all the questions we had time for, plus a few extras:
Question: What are the pros and cons of building agents with open source models?
Question: How are you routing questions or text between the smaller/cheaper model and the larger capable model? Where does that functionality reside?
Answer: That logic lives in your code/backend. You can decide how to route yourself, or you can use a hierarchical agent model (much more complex) to help determine routing. We recommend you check out this thread from LlamaIndex if you want to go deeper on that topic.
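To make that concrete, here is a minimal routing sketch using an OpenAI-compatible client. The endpoint URL, API token, model names, and keyword heuristic are illustrative assumptions, not a prescription; in production you might swap the heuristic for a trained classifier or a hierarchical agent.

```python
# Minimal routing sketch (illustrative endpoint, token, and model names):
# send simple queries to a small, cheap model and harder ones to a larger model.
from openai import OpenAI

client = OpenAI(
    base_url="https://text.octoai.run/v1",   # assumed OpenAI-compatible endpoint
    api_key="YOUR_OCTOAI_TOKEN",             # placeholder token
)

SMALL_MODEL = "meta-llama-3-8b-instruct"     # illustrative model names; check
LARGE_MODEL = "meta-llama-3-70b-instruct"    # the docs for what is currently served

def route(question: str) -> str:
    """Naive heuristic router based on length and keywords."""
    hard_signals = ("analyze", "compare", "multi-step", "plan")
    if len(question) > 300 or any(k in question.lower() for k in hard_signals):
        return LARGE_MODEL
    return SMALL_MODEL

def answer(question: str) -> str:
    model = route(question)
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(answer("What is function calling?"))
```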
Question: Is it possible to orchestrate several AI agents to work together?
Question: Can you talk about the techniques you use to evaluate that the output is correct? Do you suppress cases where the answer may not be correct?
Answer: We have a previously recorded webinar on how to think about this:
One of our partners, OpenPipe, offers an evaluation tool for comparing outputs, and their Mixture of Agents approach can help here as well.
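As a rough illustration of one common evaluation pattern (LLM-as-judge, not OpenPipe's tool or API), you can grade each candidate answer against a reference and suppress or flag anything that fails. The endpoint, token, judge model, and prompt below are placeholder assumptions.

```python
# LLM-as-judge sketch: ask a strong model to grade an answer against a reference.
from openai import OpenAI

client = OpenAI(
    base_url="https://text.octoai.run/v1",   # assumed OpenAI-compatible endpoint
    api_key="YOUR_OCTOAI_TOKEN",             # placeholder token
)

JUDGE_PROMPT = """You are grading an AI answer.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Reply with only PASS or FAIL."""

def grade(question: str, reference: str, candidate: str) -> bool:
    resp = client.chat.completions.create(
        model="meta-llama-3-70b-instruct",   # illustrative judge model
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, reference=reference, candidate=candidate)}],
        temperature=0,
    )
    verdict = resp.choices[0].message.content.strip().upper()
    return verdict.startswith("PASS")

# Suppress (or flag for review) any answer that fails before returning it to users.
```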
Question: Do you have a prerecorded walkthrough similar to demo 2, i.e. a comprehensive, end-to-end overview of the code and process for a codebase like the one developed for the demo?
Answer: Yes! Here it is:
Question: When you say OctoAI optimizes models, what does that mean exactly? What do you do to optimize them?
Answer: Here’s a good write-up on the types of optimizations OctoAI applies to the models we serve in SaaS and privately via OctoStack. We are also investing in other inference acceleration techniques like Medusa and speculative decoding.
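To give a feel for why speculative decoding speeds things up, here is a toy, conceptual sketch (not OctoAI's implementation): a cheap draft model proposes a short block of tokens, the expensive target model verifies them, and any agreeing prefix is accepted at once, so you get several tokens per target-model step.

```python
# Toy sketch of speculative decoding (conceptual only): a draft model proposes
# several tokens and the target model verifies them, keeping the agreeing prefix.
from typing import Callable, List

def speculative_decode(draft: Callable[[List[str]], List[str]],
                       target: Callable[[List[str]], str],
                       prompt: List[str],
                       max_new: int = 8) -> List[str]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        for tok in draft(out):                 # draft proposes a short block
            if target(out) == tok:             # target agrees -> accept cheaply
                out.append(tok)
            else:                              # disagreement -> take target's token
                out.append(target(out))
                break
        else:
            out.append(target(out))            # full acceptance earns one extra token
    return out[:len(prompt) + max_new]

# Toy "models": the target spells out a fixed sentence; the draft guesses
# the next two words (sometimes wrongly).
SENTENCE = "agents call tools to act on the world".split()
target = lambda ctx: SENTENCE[len(ctx)] if len(ctx) < len(SENTENCE) else "<eos>"
draft = lambda ctx: ["agents", "call"] if not ctx else [target(ctx), "wrong"]

print(" ".join(speculative_decode(draft, target, [], max_new=6)))
```

In real systems the verification is done in a single batched forward pass of the target model and the accept/reject rule is probabilistic, but the intuition is the same: the large model checks many cheap guesses at once instead of generating every token itself.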
Question: How do you opt out of having your data/activity used to train AI models?
Question: What kind of training data is most effective for AI agents?
Learn more about OctoAI at octo.ai or subscribe to our YouTube channel for more helpful content on building generative AI applications.