🎤 feat: add custom speech config, browser TTS/STT features, and dynamic speech tab settings #2921
base: main
…ernal audio endpoints This commit updates the useTextToSpeech and useSpeechToText hooks in the Input directory to support external audio endpoints. It introduces the useGetExternalTextToSpeech and useGetExternalSpeechToText hooks, which determine whether the audio endpoints should be set to 'browser' or 'external' based on the value of the endpointTTS and endpointSTT Recoil states. The useTextToSpeech and useSpeechToText hooks now use these new hooks to determine whether to use external audio endpoints
The updateTokenWebsocket function and its import are no longer used in the OpenAIClient module. This commit removes both to clean up the codebase.
…chToText hooks This commit updates the useTextToSpeech and useSpeechToText hooks in the Input directory to support external audio endpoints. It introduces the useGetExternalTextToSpeech and useGetExternalSpeechToText hooks, which determine whether the audio endpoints should be set to 'browser' or 'external' based on the value of the endpointTTS and endpointSTT Recoil states. The useTextToSpeech and useSpeechToText hooks now use these new hooks to determine whether to use external audio endpoints
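The selection described in these commit messages can be pictured with a small sketch. It is not the PR's code: `SpeechEngine`, `resolveEngine`, and `shouldUseExternal` are hypothetical names, and plain arguments stand in for the endpointTTS and endpointSTT Recoil states that the real hooks read. Falling back to 'browser' for any unknown value is also an assumption.

```typescript
// Hypothetical sketch; the actual hooks read Recoil state instead of arguments.
type SpeechEngine = 'browser' | 'external';

// Assumption: any value other than the literal 'external' falls back to 'browser'.
function resolveEngine(stored: string | undefined): SpeechEngine {
  return stored === 'external' ? 'external' : 'browser';
}

// Plays the role described for useGetExternalTextToSpeech / useGetExternalSpeechToText:
// tells the caller whether the external audio endpoint should be used.
function shouldUseExternal(stored: string | undefined): boolean {
  return resolveEngine(stored) === 'external';
}
```

A caller would branch on `shouldUseExternal(value)` once per hook, so the browser path stays the default when nothing is configured.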
…tests: added AutomaticPlaybackSwitch.spec
This commit renames the AutomaticPlayback component to AutomaticPlaybackSwitch in the Speech directory. The new name better reflects the purpose of the component and aligns with the naming convention used in the codebase.
This commit updates the useSpeechToText hook in the client/src/components/Chat/Input/AudioRecorder.tsx file to include the interimTranscript state. This allows for real-time display of the speech-to-text transcription while the user is still speaking. The interimTranscript is now used to update the text area value during recording.
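One way to picture the interim-transcript behaviour described in this commit message is a helper that merges already-committed text with the in-progress transcript. This is only a sketch under assumptions: `composeTextAreaValue` is a hypothetical name, and the PR's hook may combine the values differently.

```typescript
// Hypothetical helper: builds the text area value shown while the user is still speaking.
function composeTextAreaValue(finalText: string, interimTranscript: string): string {
  const interim = interimTranscript.trim();
  // Nothing is being spoken right now: show the committed text unchanged.
  if (interim === '') {
    return finalText;
  }
  // Otherwise append the interim transcript, separated by a single space.
  return finalText === '' ? interim : `${finalText} ${interim}`;
}
```

Calling it on every recognition update keeps the displayed value in step with the speech without overwriting text the user already had.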
…h configuration This commit adds a new API endpoint in the file under the directory. This endpoint is responsible for retrieving the custom speech configuration using the function from the module
…speech configurations This commit modifies the useCustomConfigSpeechQuery function in the client/src/data-provider/queries.ts file to return an array of custom speech configurations instead of a single object. This change allows for better handling and manipulation of the data in the application
…speech configurations
please fix the broken test
slated for v0.7.4
Hi Berry, there are elevated errors when using this without config settings:
2024-06-21 09:31:55 info: Server listening on all interfaces at port 3080. Use http://localhost:3080 to access it
2024-06-21 09:31:59 error: Failed to get speechTab settings: Configuration or speechTab schema is missing
2024-06-21 09:31:59 error: Failed to get voices: Configuration or TTS schema is missing
2024-06-21 09:32:45 error: Failed to get speechTab settings: Configuration or speechTab schema is missing
2024-06-21 09:32:45 error: Failed to get voices: Configuration or TTS schema is missing
I should not be seeing any errors when I don't have speech enabled/configured and I open the Settings
Also, when I add the new format for speech settings, I should not get an error if I don't have speechTab settings:
2024-06-21 09:36:42 error: Failed to get speechTab settings: Configuration or speechTab schema is missing
Lastly, if I'm using the old setup, this error message is confusing and I will think the app has a bug:
2024-06-21 09:40:49 error: Invalid custom config file at /home/danny/LibreChat/librechat.yaml [
  {
    "code": "unrecognized_keys",
    "keys": [
      "tts",
      "stt"
    ],
    "path": [],
    "message": "Unrecognized key(s) in object: 'tts', 'stt'"
  }
]
This is a parsing error for the custom config file. We should still show it, but also add a note, when this error message is detected, that the format has changed. Also please accompany this PR with an update to the changelog: https://www.librechat.ai/changelog
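The request above (keep the parsing error, but add a note when the old keys are detected) could be sketched roughly as follows. The `ConfigIssue` shape is modeled on the JSON in the log; `LEGACY_SPEECH_KEYS`, `legacySpeechNote`, and the wording of the note are hypothetical and not taken from the PR.

```typescript
// Shape of one validation issue, modeled on the JSON printed in the error log.
interface ConfigIssue {
  code: string;
  keys?: string[];
  path: (string | number)[];
  message: string;
}

// Top-level keys used by the old speech setup, per the error message (assumed list).
const LEGACY_SPEECH_KEYS: string[] = ['tts', 'stt'];

// Returns a hint when a top-level "unrecognized_keys" issue names a legacy key, else null.
// The original parsing error should still be logged; this note is only an addition.
function legacySpeechNote(issues: ConfigIssue[]): string | null {
  const hit = issues.some(
    (issue) =>
      issue.code === 'unrecognized_keys' &&
      issue.path.length === 0 &&
      (issue.keys ?? []).some((key) => LEGACY_SPEECH_KEYS.indexOf(key) !== -1),
  );
  return hit
    ? "Note: the speech config format has changed; top-level 'tts' and 'stt' keys are no longer recognized."
    : null;
}
```

Returning `null` for every other error keeps unrelated config problems from being mislabeled as a format change.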
Summary
This PR introduces several key features and improvements related to the speech functionality in LibreChat:
- … librechat.yaml, allowing the ADMIN to set pre-configured "speech tab" settings
- … endpointSTT and endpointTTS to engineSTT and engineTTS respectively
- … /api/files/speech
Breaking Changes
- … SpeechToText and TextToSpeech in the store have been renamed to speechToText and textToSpeech. If you encounter any issues, please delete LibreChat's cache
- … speech: …
Change Type
Testing
External STT/TTS:
Local STT/TTS:
Test Configuration:
To reproduce the test process, follow these steps:
- … librechat.yaml file with appropriate speech settings
- … (engineSTT and engineTTS) function correctly
- … /api/files/speech
Checklist