ポスト
OpenAI is expected to demo a real-time voice assistant tomorrow. What does it take to deliver an immersive, or even magical experience? Almost all voice AI go through 3 stages: 1. Speech recognition or "ASR": audio -> text1, think Whisper; 2. LLM that plans what to say next:…
メニューを開く