VoiceInputFix
Improves voice command accuracy and response speed with a local offline model. 基于本地离线模型,提升语音指令的识别准确率与响应速度。
| Last updated | 2 weeks ago |
| Total downloads | 2582 |
| Total rating | 2 |
| Categories | Mods Tools Misc |
| Dependency string | Mhz-VoiceInputFix-1.0.9 |
| Dependants | 0 other packages depend on this package |
This mod requires the following mods to function
BepInEx-BepInExPack
BepInEx pack for Mono Unity games. Preconfigured and ready to use.
Preferred version: 5.4.2304Mhz-SherpaOnnxRuntime
Native dependencies for Sherpa ONNX Speech Recognition Engine. Sherpa ONNX 语音识别引擎的原生依赖库。
Preferred version: 1.0.0README
VoiceInputFix (Fun-ASR / SenseVoice Edition)
Description
VoiceInputFix replaces the default YAPYAP speech recognition engine (Vosk) with Fun-ASR (SenseVoice), providing a modern and more reliable voice input experience:
Key Features
- Much Better Chinese Recognition — The original Vosk engine has poor Chinese support. This mod significantly outperforms Vosk for Chinese voice commands.
- Multi-language Support — Automatically detects Mandarin/Cantonese, English, Japanese, Korean, and allows specifying a recognition language.
- Lower Latency — Voice commands respond faster than the original engine.
Note
- The Fun-ASR model supports multilingual speech recognition.
- However, the original Vosk engine shows significantly more stable recognition for English commands in this game
(other non-Chinese languages may behave similarly, but have not been tested), possibly due to its predefined command setup or potential pre-training. - Players who primarily use English voice commands are still recommended to use the original Vosk engine.
Dependencies
This mod requires the SherpaOnnxRuntime package to function. It provides the following core libraries:
onnxruntime.dllsherpa-onnx-c-api.dllsherpa-onnx.dll
Download Models
You must download the following files for the mod to work:
- model.onnx (1.03GB) — Download
- tokens.txt (940KB) — Download
(If the link opens in your browser, press Ctrl+S to save, or right-click and select "Save Link As...")
Alternative Models
- Must be from the sherpa-onnx-sense-voice series (
csukuangfj/sherpa-onnx-sense-voiceon Hugging Face). - Model package must include
model.onnxandtokens.txtand be placed in themodelsfolder.
Installation
- Critical:
VoiceInputFix.dlland the models folder must be in the same directory. - Place them directly into
BepInEx/plugins/, or a subfolder likeBepInEx/plugins/VoiceInputFix/.
(Avoid Chinese characters in folder paths.) - Place
model.onnxandtokens.txtinside the models folder.
- Example path:
BepInEx/plugins/VoiceInputFix/models/model.onnx
Performance
- Initial Loading — First-time engine initialization may cause a significant screen freeze while loading the 1GB model. This is normal.
- Menu Reloading — Returning to the main menu and re-entering the game introduces a short loading delay, but no freeze occurs.
- Gameplay — Once a level is entered, recognition works immediately, smoothly, and with no performance impact.
Advanced Configuration
Config file: BepInEx/config/Mhz.voiceinputfix.cfg
-
SpeechThreshold (Default: 0.015) — Noise gate for triggering recognition.
- Increase if background noise causes text to linger (e.g., 0.025)
- Decrease if you need to shout to be recognized (e.g., 0.010)
-
Language (Default: auto) — Specify recognition language:
auto: Detects all supported languages (Mandarin/Cantonese, English, Japanese, Korean) automatically.zh, en, ja, ko, yue: Prioritize recognition for the selected language.
(This only prioritizes matching, not strictly locking. Extremely similar pronunciations may still be recognized as another language.)
-
EnableDebugLog (Default: false) — Enable to view detailed recognition logs in the BepInEx console.
中文说明
模组简介
VoiceInputFix 将《YAPYAP》默认 Vosk 引擎替换为 Fun-ASR (SenseVoice),显著提升语音输入体验:
- 中文识别大幅提升 — 原版 Vosk 对中文支持较差,本模组在中文指令识别上显著优于原版
- 多语言支持 — 默认自动识别 普通话/粤语/英语/日语/韩语,也可指定识别语言
- 更低延迟 — 语音指令响应速度明显快于原版
注意
- Fun-ASR 模型本身支持多语言语音识别
- 但原版 Vosk 引擎在游戏内英文指令上的识别表现明显更稳定
(其他非中文语言可能也类似,但尚未进行测试),可能与其对游戏内预设指令词的配置或潜在预训练有关 - 对于主要使用英文指令的玩家,仍建议继续使用原版 Vosk 引擎
必需依赖
本模组依赖 SherpaOnnxRuntime,提供以下核心组件:
onnxruntime.dllsherpa-onnx-c-api.dllsherpa-onnx.dll
模型下载
必须下载以下文件:
更换其他模型
- 必须来自 sherpa-onnx-sense-voice 系列
- 文件需包含
model.onnx和tokens.txt,放入models文件夹
安装步骤
- 核心原则:
VoiceInputFix.dll与 models 文件夹必须在同一目录 - 可放入
BepInEx/plugins/或子目录(路径请勿含中文字符) - 将
model.onnx与tokens.txt放入 models 文件夹
- 示例路径:
BepInEx/plugins/VoiceInputFix/models/model.onnx
运行特性
- 首次加载 — 首次启动时可能出现明显卡顿(加载 1GB 模型到内存,属于正常现象)
- 重连加载 — 返回主菜单再进入游戏,仅短暂等待,无卡顿
- 关卡表现 — 进入关卡后立即可用,运行流畅,无额外帧率占用
进阶配置
配置文件:BepInEx/config/Mhz.voiceinputfix.cfg
-
SpeechThreshold(默认 0.015)— 触发识别的噪声门限
- 环境吵闹或文字不消失:调高(如 0.025)
- 必须大声才能识别:调低(如 0.010)
-
Language(默认 auto)— 指定识别语言
auto:自动识别中(普通话/粤语)、英、日、韩语zh/en/ja/ko/yue:优先匹配指定语言,极端相似发音仍可能识别为其他语言
-
EnableDebugLog(默认 false)— 打开后在 BepInEx 控制台查看详细日志