VoiceInputFix
Improves voice command accuracy and response speed with a local offline model.
| Last updated | 21 minutes ago |
| Total downloads | 98 |
| Total rating | 1 |
| Categories | Mods Tools Misc |
| Dependency string | Mhz-VoiceInputFix-1.0.6 |
| Dependants | 0 other packages depend on this package |
This mod requires the following mods to function:
BepInEx-BepInExPack
BepInEx pack for Mono Unity games. Preconfigured and ready to use.
Preferred version: 5.4.2304
Mhz-SherpaOnnxRuntime
Native dependencies for Sherpa ONNX Speech Recognition Engine.
Preferred version: 1.0.0
README
VoiceInputFix (Fun-ASR / SenseVoice Edition)
Description
VoiceInputFix is a replacement for the default speech recognition system in YAPYAP. It replaces the original Vosk engine with Fun-ASR (SenseVoice) to provide:
- Much Better Chinese Recognition: The original Vosk engine has poor Chinese support. This mod significantly outperforms Vosk for Chinese commands.
- Multi-language Support: Automatically recognizes Chinese (Mandarin/Cantonese), English, Japanese, and Korean.
- Lower Latency: Faster response times for voice commands.
Note: The Fun-ASR model has high recognition accuracy for all languages. However, the training data of the original Vosk model included the in-game English voice commands (such as "tempest" and "aero"), so Vosk may match those commands better. For English players, the vanilla Vosk engine is still recommended. Other languages (Japanese, Korean, etc.) have not been extensively tested.
Dependencies
This mod requires the SherpaOnnxRuntime package to function. Please ensure it is installed. It provides the necessary core libraries:
- onnxruntime.dll
- sherpa-onnx-c-api.dll
- sherpa-onnx.dll
Download Models
You must download the following files for the mod to work:
- model.onnx (1.03GB): Download
- tokens.txt (940KB): Download (If the link opens in your browser, press Ctrl+S to save, or right-click the link and select "Save Link As...")
Alternative Models: This mod uses Sherpa-ONNX's SenseVoice series models. If you want to try other models, make sure they are from the sherpa-onnx-sense-voice series (search for csukuangfj/sherpa-onnx-sense-voice on Hugging Face). The model files must include model.onnx and tokens.txt, which should be placed in the models folder.
Installation
- Critical: VoiceInputFix.dll and the models folder MUST be in the same directory.
- You can place them directly into BepInEx/plugins/, or within a subfolder like BepInEx/plugins/VoiceInputFix/.
- Place the downloaded model.onnx and tokens.txt inside the models folder.
  - Example path: BepInEx/plugins/VoiceInputFix/models/model.onnx
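The layout rules above can be checked mechanically before launching the game. The following is a minimal sketch, not part of the mod: the function name `check_install` is hypothetical, and it only tests that the files named in the steps exist where the steps say they should.

```python
from pathlib import Path

# File names taken from the installation steps above.
REQUIRED_MODEL_FILES = ("model.onnx", "tokens.txt")


def check_install(plugin_dir):
    """Return a list of layout problems; an empty list means the layout looks right.

    `plugin_dir` is the folder that should hold VoiceInputFix.dll,
    e.g. BepInEx/plugins or BepInEx/plugins/VoiceInputFix.
    """
    plugin_dir = Path(plugin_dir)
    problems = []

    # The DLL and the models folder must sit side by side.
    if not (plugin_dir / "VoiceInputFix.dll").is_file():
        problems.append("VoiceInputFix.dll not found")

    models_dir = plugin_dir / "models"
    if not models_dir.is_dir():
        problems.append("models folder not found next to VoiceInputFix.dll")
        return problems  # nothing further to check without the folder

    # Both downloaded files must be inside the models folder.
    for name in REQUIRED_MODEL_FILES:
        if not (models_dir / name).is_file():
            problems.append("missing models/" + name)
    return problems
```

Calling `check_install("BepInEx/plugins/VoiceInputFix")` from the game folder returns an empty list when everything is in place, and otherwise a short description of each missing piece.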
Performance Characteristics
- Initial Loading: The game will experience a significant screen freeze when the engine initializes for the very first time. This is normal behavior while the 1GB model is being loaded into memory.
- Menu Reloading: When returning to the main menu and re-entering the game, there is a loading delay for the model, but it will not freeze the screen again.
- Gameplay: Once you enter a level, the recognition is smooth and available immediately with no impact on game performance.
Advanced Configuration
Config file path: BepInEx/config/Mhz.voiceinputfix.cfg
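BepInEx stores plugin settings as an INI-style text file, so the values documented in this section can be read with a standard INI parser. The sketch below is an illustration under assumptions: the `[General]` section name and the sample file contents are hypothetical (the README does not show the file), so the reader searches every section for the three documented keys instead of relying on a section name.

```python
import configparser

# Hypothetical file contents; the [General] section name is an assumption.
SAMPLE = """\
## Illustrative example only, not copied from a real config file.

[General]

# Default value: 0.015
SpeechThreshold = 0.025

# Default value: auto
Language = zh

# Default value: false
EnableDebugLog = false
"""

# Defaults as documented in this README.
DEFAULTS = {"SpeechThreshold": "0.015", "Language": "auto", "EnableDebugLog": "false"}


def read_settings(text):
    """Parse config text and return the three documented settings as typed values."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)

    found = dict(DEFAULTS)
    # Search every section, since the real section name is not documented here.
    for section in parser.sections():
        for key, value in parser.items(section):
            if key in found:
                found[key] = value

    return {
        "SpeechThreshold": float(found["SpeechThreshold"]),
        "Language": found["Language"],
        "EnableDebugLog": found["EnableDebugLog"].strip().lower() == "true",
    }
```

To inspect a real file, pass the contents of BepInEx/config/Mhz.voiceinputfix.cfg to `read_settings`; keys that are absent fall back to the documented defaults.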
- SpeechThreshold (Default: 0.015)
  This is the "Noise Gate". It determines the volume required to trigger recognition.
  - If the mod captures background noise or text stays on screen too long: Increase this value (e.g., 0.025).
  - If you have to shout to be heard: Decrease this value (e.g., 0.010).
- Language (Default: auto)
  Specifies the language for recognition. While "auto" works for multiple languages, selecting a specific one improves recognition accuracy and reduces errors.
  - auto: Detect all supported languages automatically: Chinese (Mandarin/Cantonese), English, Japanese, and Korean.
  - zh, en, ja, ko, yue: Enhance the recognition weight for the selected language. Note: This prioritizes matching the chosen language but does not strictly lock it; if an utterance sounds extremely similar to another language, it may still be recognized as such.
- EnableDebugLog (Default: false)
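To make the effect of SpeechThreshold concrete, here is a minimal sketch of a noise gate. How the mod actually measures volume is not documented; this example assumes the root-mean-square (RMS) level of audio samples normalized to the range -1.0 to 1.0, and the function names `rms` and `is_speech` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a frame of samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples, threshold=0.015):
    """Return True when the frame is loud enough to pass the gate.

    Assumption: the gate compares RMS level against the threshold.
    """
    return rms(samples) >= threshold


# Synthetic frames: constant-magnitude samples, so RMS equals the magnitude.
background = [0.005, -0.005] * 80   # quiet room noise, level about 0.005
soft_voice = [0.012, -0.012] * 80   # quiet speaker, level about 0.012
normal_voice = [0.02, -0.02] * 80   # normal speaker, level about 0.02
```

Under these assumptions, `background` never passes the default gate, `soft_voice` passes only after lowering the threshold to 0.010, and `normal_voice` stops passing if the threshold is raised to 0.025. That is the trade-off the two tuning tips describe: a higher value rejects more noise but demands louder speech.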
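VoiceInputFix (Fun-ASR / SenseVoice Edition), translated from the Chinese README
Introduction
VoiceInputFix is a mod that improves the speech recognition experience in YAPYAP. It replaces the game's original Vosk engine with Fun-ASR (SenseVoice). The main improvements are:
- Much better Chinese recognition: The original Vosk engine has poor Chinese support; this mod recognizes Chinese commands significantly better than the original.
- Multi-language support: By default it automatically recognizes Chinese (Mandarin/Cantonese), English, Japanese, and Korean.
- Lower latency: Voice commands get a faster response than with the original engine.
Note: The Fun-ASR model itself has high recognition accuracy across languages. However, the training data of the original Vosk model included the in-game English command words (such as "tempest" and "aero"), so it recognizes those words more accurately. English-speaking players are advised to keep using the original Vosk engine. Other languages (Japanese, Korean, etc.) have not been tested in depth.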
Required Dependencies
This mod requires the SherpaOnnxRuntime runtime-library mod to work. It contains the following core components:
- onnxruntime.dll
- sherpa-onnx-c-api.dll
- sherpa-onnx.dll
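Model Download
You must download the following two files for the mod to run: model.onnx and tokens.txt.
Using other models: This mod uses Sherpa-ONNX's SenseVoice series models. If you want to try other models, make sure they belong to the sherpa-onnx-sense-voice series (search for csukuangfj/sherpa-onnx-sense-voice on Hugging Face). The model files must include model.onnx and tokens.txt; place them in the models folder.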
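Installation Steps
- Core rule: VoiceInputFix.dll and the models folder must be in the same directory.
- You can place them directly in the BepInEx/plugins directory, or in any subfolder under plugins, for example BepInEx/plugins/VoiceInputFix/. Note: the folder path must not contain Chinese characters.
- Place the downloaded model.onnx and tokens.txt inside the models folder.
  - Example path: BepInEx/plugins/VoiceInputFix/models/model.onnx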
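Runtime Characteristics
- First load: The first time the speech engine starts, the screen freezes for a fairly long time (the image stops moving). The large model is being loaded into memory; this is normal, so please wait for loading to finish.
- Reloading: When you return to the main menu and re-enter the game, there is a short wait while the model reloads, but the screen does not freeze again.
- In-level behavior: Once you are in a level, the feature is available immediately, runs smoothly, and does not cost additional frame rate.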
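Advanced Configuration
Config file path: BepInEx/config/Mhz.voiceinputfix.cfg
- SpeechThreshold (Default: 0.015)
  This is the "decibel threshold" for speech detection.
  - If a noisy environment keeps the text from disappearing: Increase this value (e.g., 0.025 or 0.03).
  - If you have to speak very loudly to be recognized: Decrease this value (e.g., 0.010).
- Language (Default: auto)
  Specifies the recognition language. Although "auto" handles multiple languages automatically, specifying one language can noticeably improve the recognition rate and reduce errors.
  - auto: Automatically recognize all supported languages: Chinese (Mandarin/Cantonese), English, Japanese, and Korean.
  - zh, en, ja, ko, yue: Increase the recognition weight of the selected language. Note: This only means the system prefers to match that language; it does not lock it. If the pronunciation closely resembles another language, the system may still recognize it as that language.
- EnableDebugLog (Default: false)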