Commit Graph

9 Commits

Author SHA1 Message Date
950572046e feat: Auto-save voice recordings for model training
- Add /api/recordings endpoint with full CRUD operations
- Create voice_recordings SQLite table for metadata
- Save audio files to server/recordings/ as .webm
- Store transcription, duration, microphone name, file size
- Auto-save on each Whisper recording completion
2026-02-14 01:56:53 -06:00
5da6179f75 feat: Add microphone selection and audio playback to FloatingVoice
- Add microphone device enumeration and selector dropdown
- Show current microphone name with click-to-change UI
- Microphone selection only available with Whisper GPU mode
- Add audio playback button to replay last recorded audio for debugging
- Improve dropdown animations with staggered item transitions
- Fix FloatingTerminal token request to type character by character
2026-02-14 01:47:08 -06:00
5be0fb91ab fix: Improve Whisper server startup with async polling and reduce logs
- Make server startup async to avoid Bun's 10s timeout
- Add frontend polling to detect when server is ready
- Use PowerShell Get-NetTCPConnection for reliable port detection
- Add starting state to prevent multiple simultaneous starts
- Reduce verbose logging, keep only essential info
- Add dev-dist and nul to gitignore
2026-02-14 01:03:02 -06:00
9f1e10b8d5 feat: Add typing animation to voice transcription
- Text appears letter by letter (15-25ms per character)
- Blinking cursor shows while text is animating
- Animation continues from last position for new chunks
- Smooth visual feedback for transcription progress
2026-02-14 00:28:26 -06:00
ac17a9f292 fix: Improve Whisper transcription with WebM to WAV conversion
- Add ffmpeg conversion from WebM/Opus to WAV (16kHz mono PCM)
- Optimize transcription parameters (VAD, temperature, beam_size)
- Add Honduras Spanish context prompt with local expressions
- Fix chunk accumulation display in voice panel
- Add 1.5s recording buffer after releasing Ctrl+Space
- Skip small audio chunks (<5KB) that cause ffmpeg errors
- Use large-v3 model for better accuracy
2026-02-14 00:16:01 -06:00
638e6ac8e0 feat: Add Whisper GPU speech-to-text with progressive transcription
- Add faster-whisper Python server for GPU-accelerated transcription
- Support dual mode: Web Speech API or Whisper GPU (toggleable)
- Progressive transcription every 3 seconds while recording
- Separate terminal server process (stable during hot-reload)
- Add Ctrl+V paste and Ctrl+C copy support in FloatingTerminal
- Add MCP tools: whisper_start, whisper_stop, whisper_toggle, whisper_status
- Update package.json with separate api/terminal/frontend processes
2026-02-13 23:47:52 -06:00
e867b7873e feat: Add page_refresh global tool and update voice shortcut to Ctrl+Space
- Add page_refresh tool to reload the page via MCP
- Change push-to-talk shortcut from Ctrl+S to Ctrl+Space
- Use capture phase for keyboard events to intercept before terminal
2026-02-13 21:41:56 -06:00
306aade623 feat: Add push-to-talk keyboard shortcut (Ctrl+S) to FloatingVoice
- Hold Ctrl+S for 500ms to start recording
- Release to stop recording and send to terminal
- Shows PTT indicator when using keyboard shortcut
2026-02-13 21:28:44 -06:00
8118356999 feat: Add FloatingVoice component for voice-to-text input
- Add FloatingVoice component with Web Speech API transcription
- Each component has its own independent WebSocket session
- Voice panel connects on open, disconnects on close
- Sends transcribed text to Claude Code with Enter key
2026-02-13 20:24:57 -06:00