One thing that is incredibly fucking hard about building multimodal agents is that they need shared context across modalities (in our case, voice and SMS).
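To make the problem concrete, here is a minimal sketch of one way to share context: keep a single timeline per caller, keyed by phone number, that both the SMS and voice channels write to and read from. This is an illustration under assumptions, not our actual implementation; `SharedHistory`, `Turn`, the phone number, and the sample messages are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from time import time


@dataclass
class Turn:
    modality: str  # "sms" or "voice"
    role: str      # "user" or "agent"
    text: str
    ts: float = field(default_factory=time)


class SharedHistory:
    """Hypothetical in-memory store: one timeline per caller, shared by every channel."""

    def __init__(self):
        self._turns = defaultdict(list)

    def add(self, phone, modality, role, text):
        self._turns[phone].append(Turn(modality, role, text))

    def context(self, phone, last_n=20):
        # Merge all modalities into one transcript the model can read.
        return "\n".join(
            f"[{t.modality}] {t.role}: {t.text}"
            for t in self._turns[phone][-last_n:]
        )


history = SharedHistory()
history.add("+15550100", "sms", "user", "Can I move my appointment to Friday?")
history.add("+15550100", "sms", "agent", "Sure, Friday at 3pm works.")
# Later the same person calls in; the voice agent is primed with the SMS turns.
history.add("+15550100", "voice", "user", "Hi, I just texted about Friday.")
print(history.context("+15550100"))
```

A real system would probably need persistent storage and some way to resolve identity beyond the phone number, but the idea is the same: every channel appends to, and reads from, one history.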



Here's an example of a text conversation I had with our agent. I then called in, had a phone conversation, and the agent
TommyTeachervip
· 06-11 22:16
Cross-modal communication is really exhausting.