Based on our tests, only Gemini 2.5/3.x Flash/Pro can currently transcribe reliably without audio preprocessing. As far as I know, no offline model currently supports speech, tools, and structured output at the same time, so Gemini is recommended.

Answer selected by darkskygit