steamship.agents.tools.audio_transcription package#

Submodules#

steamship.agents.tools.audio_transcription.assembly_speech_to_text_tool module#

Tool for generating text from audio.

class steamship.agents.tools.audio_transcription.assembly_speech_to_text_tool.AssemblySpeechToTextTool(*, name: str = 'AssemblySpeechToTextTool', agent_description: str = 'Used to generate text from spoken audio. Only use if the user has asked directly for a text version of an audio file. When using this tool, the input should be the audio file. The output is the text.', human_description: str = 'Generates text from spoken audio.', is_final: bool = False, cacheable: bool = True, blockifier_plugin_handle: str = 's2t-blockifier-assembly', blockifier_plugin_instance_handle: str | None = None, blockifier_plugin_config: dict = {})[source]#

Bases: AudioBlockifierTool

Tool to generate text from audio.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

blockifier_plugin_handle: str#

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.

steamship.agents.tools.audio_transcription.fetch_audio_urls_from_rss_tool module#

class steamship.agents.tools.audio_transcription.fetch_audio_urls_from_rss_tool.FetchAudioUrlsFromRssTool(*, name: str = 'FetchAudioUrlsFromRssTool', agent_description: str = 'Used to fetch the podcast episode URLs from a podcast RSS feed. The input is the URL of the RSS feed. The output is the URLs of the episode audio.', human_description: str = 'Fetches the episode URLs from a Podcast RSS feed.', is_final: bool = False, cacheable: bool = True)[source]#

Bases: Tool

Given an RSS feed, this tool will extract episode URLs.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.

run(tool_input: List[Block], context: AgentContext) → List[Block] | Task[Any][source]#

Run the tool given the provided input and context.

At the moment, only synchronous Tools (those that return List[Block]) are supported.

Support for asynchronous Tools (those that return Task[Any]) will be added shortly.
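This page does not show how the tool reads a feed, so the following is only a minimal sketch of the input-to-output mapping described above (RSS feed in, episode audio URLs out). It uses the standard library alone and assumes an RSS 2.0 feed whose episodes carry `<enclosure>` elements. The function name `extract_audio_urls` and the sample feed are hypothetical, not part of the Steamship API.

```python
import xml.etree.ElementTree as ET
from typing import List


def extract_audio_urls(rss_xml: str) -> List[str]:
    """Return the enclosure URLs of audio attachments in an RSS 2.0 document.

    Hypothetical helper for illustration; not the library's implementation.
    """
    root = ET.fromstring(rss_xml)
    urls: List[str] = []
    # In RSS 2.0, media files are attached to an <item> with
    # <enclosure url="..." type="..." length="...">.
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        media_type = enclosure.get("type", "")
        if url and media_type.startswith("audio/"):
            urls.append(url)
    return urls


# Made-up feed used only to exercise the helper.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
    <item>
      <title>Show notes only</title>
      <enclosure url="https://example.com/notes.pdf" type="application/pdf" length="789"/>
    </item>
  </channel>
</rss>"""

audio_urls = extract_audio_urls(SAMPLE_FEED)
print(audio_urls)
```

Filtering on the `audio/` media type prefix is a design choice of this sketch; the actual tool may select episode URLs differently.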

steamship.agents.tools.audio_transcription.whisper_speech_to_text_tool module#

Tool for generating text from audio.

class steamship.agents.tools.audio_transcription.whisper_speech_to_text_tool.WhisperSpeechToTextTool(*, name: str = 'WhisperSpeechToTextTool', agent_description: str = 'Used to generate text from spoken audio at a URL. Only use if the user has asked directly for a an text version of an audio file. The input is a URL. The output is the text from that URL.', human_description: str = 'Generates text from spoken audio.', is_final: bool = False, cacheable: bool = True, blockifier_plugin_handle: str = 'whisper-s2t-blockifier', blockifier_plugin_instance_handle: str | None = None, blockifier_plugin_config: dict = {})[source]#

Bases: AudioBlockifierTool

Tool to generate text from audio.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

blockifier_plugin_handle: str#

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.

Module contents#

class steamship.agents.tools.audio_transcription.AssemblySpeechToTextTool(*, name: str = 'AssemblySpeechToTextTool', agent_description: str = 'Used to generate text from spoken audio. Only use if the user has asked directly for a text version of an audio file. When using this tool, the input should be the audio file. The output is the text.', human_description: str = 'Generates text from spoken audio.', is_final: bool = False, cacheable: bool = True, blockifier_plugin_handle: str = 's2t-blockifier-assembly', blockifier_plugin_instance_handle: str | None = None, blockifier_plugin_config: dict = {})[source]#

Bases: AudioBlockifierTool

Tool to generate text from audio.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

blockifier_plugin_config: dict#

blockifier_plugin_handle: str#

blockifier_plugin_instance_handle: str | None#

cacheable: bool#: Whether runs of this Tool should be cached based on inputs (if caching is enabled in the AgentContext for a run). Setting this to False will prevent any Actions that involve this tool from being cached, meaning that every Action using this Tool will result in a call to run. By default, Tools are considered cacheable.

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

is_final: bool#

Whether actions performed by this tool should have their is_final bit marked.

Setting this to True means that the output of this tool will halt the reasoning loop. Its output will be returned directly to the user.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.
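The effect of `cacheable` can be pictured with a small stand-in. This sketch does not use the Steamship caching machinery, which this page does not describe. `CachingRunner` and `fake_transcribe` are hypothetical names, and keying the cache on the tool name plus its inputs is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple


class CachingRunner:
    """Hypothetical helper: memoizes tool runs keyed on tool name and inputs."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, Tuple[str, ...]], List[str]] = {}
        self.calls = 0  # number of times the underlying run function was invoked

    def run(
        self,
        name: str,
        cacheable: bool,
        run: Callable[[List[str]], List[str]],
        tool_input: List[str],
    ) -> List[str]:
        key = (name, tuple(tool_input))
        # Only cacheable tools may be served from the cache.
        if cacheable and key in self._cache:
            return self._cache[key]
        self.calls += 1
        output = run(tool_input)
        if cacheable:
            self._cache[key] = output
        return output


def fake_transcribe(inputs: List[str]) -> List[str]:
    # Stand-in for a real transcription call.
    return [f"transcript of {item}" for item in inputs]


runner = CachingRunner()

# cacheable=True: the second identical request is answered from the cache.
runner.run("SpeechToText", True, fake_transcribe, ["a.mp3"])
runner.run("SpeechToText", True, fake_transcribe, ["a.mp3"])
cached_calls = runner.calls

# cacheable=False: every request reaches the run function.
runner.run("SpeechToText", False, fake_transcribe, ["a.mp3"])
runner.run("SpeechToText", False, fake_transcribe, ["a.mp3"])
uncached_calls = runner.calls - cached_calls
```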

class steamship.agents.tools.audio_transcription.FetchAudioUrlsFromRssTool(*, name: str = 'FetchAudioUrlsFromRssTool', agent_description: str = 'Used to fetch the podcast episode URLs from a podcast RSS feed. The input is the URL of the RSS feed. The output is the URLs of the episode audio.', human_description: str = 'Fetches the episode URLs from a Podcast RSS feed.', is_final: bool = False, cacheable: bool = True)[source]#

Bases: Tool

Given an RSS feed, this tool will extract episode URLs.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

cacheable: bool#: Whether runs of this Tool should be cached based on inputs (if caching is enabled in the AgentContext for a run). Setting this to False will prevent any Actions that involve this tool from being cached, meaning that every Action using this Tool will result in a call to run. By default, Tools are considered cacheable.

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

is_final: bool#

Whether actions performed by this tool should have their is_final bit marked.

Setting this to True means that the output of this tool will halt the reasoning loop. Its output will be returned directly to the user.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.

run(tool_input: List[Block], context: AgentContext) → List[Block] | Task[Any][source]#

Run the tool given the provided input and context.

At the moment, only synchronous Tools (those that return List[Block]) are supported.

Support for asynchronous Tools (those that return Task[Any]) will be added shortly.
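Because `run` is typed as returning either `List[Block]` or `Task[Any]` while only the synchronous form is supported, calling code may want to guard against the asynchronous case. The following is a sketch under stated assumptions: `Block` and `Task` here are simplified stand-ins rather than the Steamship classes, and `unwrap_sync_result` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Block:
    """Simplified stand-in for a Steamship Block."""
    text: str


@dataclass
class Task:
    """Simplified stand-in for a Steamship Task."""
    task_id: str


def unwrap_sync_result(result: Union[List[Block], Task]) -> List[Block]:
    """Return the blocks from a synchronous run; reject asynchronous results."""
    if isinstance(result, list):
        return result
    raise NotImplementedError(
        "Asynchronous tool results (Task) are not supported in this sketch."
    )


blocks = unwrap_sync_result([Block(text="https://example.com/ep1.mp3")])
print([block.text for block in blocks])
```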

class steamship.agents.tools.audio_transcription.WhisperSpeechToTextTool(*, name: str = 'WhisperSpeechToTextTool', agent_description: str = 'Used to generate text from spoken audio at a URL. Only use if the user has asked directly for a an text version of an audio file. The input is a URL. The output is the text from that URL.', human_description: str = 'Generates text from spoken audio.', is_final: bool = False, cacheable: bool = True, blockifier_plugin_handle: str = 'whisper-s2t-blockifier', blockifier_plugin_instance_handle: str | None = None, blockifier_plugin_config: dict = {})[source]#

Bases: AudioBlockifierTool

Tool to generate text from audio.

agent_description: str#: Description for use in an agent in order to enable Action selection. It should include a short summary of what the Tool does, what the inputs to the Tool should be, and what the outputs of the tool are.

blockifier_plugin_config: dict#

blockifier_plugin_handle: str#

blockifier_plugin_instance_handle: str | None#

cacheable: bool#: Whether runs of this Tool should be cached based on inputs (if caching is enabled in the AgentContext for a run). Setting this to False will prevent any Actions that involve this tool from being cached, meaning that every Action using this Tool will result in a call to run. By default, Tools are considered cacheable.

human_description: str#: Human-friendly description. Used for logging, tool indices, etc.

is_final: bool#

Whether actions performed by this tool should have their is_final bit marked.

Setting this to True means that the output of this tool will halt the reasoning loop. Its output will be returned directly to the user.

name: str#: The short name for the tool. This will be used by Agents to refer to this tool during action selection.
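The `is_final` behaviour described for the tools on this page (a final tool halts the reasoning loop and its output goes straight to the user) can be illustrated with a toy loop. This is a sketch only: `StubTool` and `reasoning_loop` are hypothetical, and running the tools in a fixed order is an assumption that replaces the agent's real action selection.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StubTool:
    """Hypothetical stand-in for a tool with a name, a run function, and is_final."""
    name: str
    run: Callable[[str], str]
    is_final: bool = False


def reasoning_loop(tools: List[StubTool], user_input: str) -> Tuple[str, List[str]]:
    """Run tools in order, stopping right after the first tool marked is_final."""
    current = user_input
    used: List[str] = []
    for tool in tools:
        current = tool.run(current)
        used.append(tool.name)
        if tool.is_final:
            # The final tool's output is returned directly; later tools never run.
            break
    return current, used


tools = [
    StubTool("fetch", lambda s: s + " -> url"),
    StubTool("transcribe", lambda s: s + " -> text", is_final=True),
    StubTool("summarize", lambda s: s + " -> summary"),
]

output, used = reasoning_loop(tools, "feed")
print(output, used)
```

With `is_final=True` on the second stub, the third tool is never reached. Leaving every tool at the default `is_final=False` lets the loop run through all of them.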