## CosyVoice Engine Integration Guide

This document explains how to use the CosyVoice engine for speech synthesis in this project.

### Prerequisites

1. A local CosyVoice API service is deployed
2. API endpoint: `http://192.168.1.200:8000/tts/zero_shot`
3. The `httpx` dependency is installed

### Quick Start

#### Method 1: Create the engine with the factory

```python
import asyncio
from tts.factory import TTSEngineFactory

async def main():
    # Create a CosyVoice engine instance
    engine = TTSEngineFactory.create("cosyvoice")

    # Synthesize speech
    text = "你好,这是 CosyVoice 合成的语音。"
    audio = await engine.synthesize(
        text=text,
        voice="your_speaker_id"  # Replace with a real speaker ID
    )

    # Save the audio
    with open("output.wav", "wb") as f:
        f.write(audio.getvalue())

asyncio.run(main())
```

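To sanity-check what `synthesize` returned before writing it to disk, the bytes can be parsed with the standard-library `wave` module. The sketch below assumes the service returns PCM WAV data. The examples save to `output.wav`, but the actual format depends on your deployment. `describe_wav` and `make_silent_wav` are hypothetical helpers, and the silent clip only stands in for real engine output.

```python
import io
import wave


def describe_wav(audio: io.BytesIO) -> dict:
    """Return basic properties of WAV data held in a BytesIO buffer."""
    # Read from a fresh buffer so the caller's stream position is untouched.
    with wave.open(io.BytesIO(audio.getvalue()), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build a mono 16-bit silent clip; a stand-in for real engine output."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer
```

With a real engine, `describe_wav(audio)` could be called on the `BytesIO` returned by `synthesize` just before the file is written.
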
#### Method 2: Use the CosyVoice engine directly

```python
import asyncio
from tts.cosyvoice_engine import CosyVoiceEngine

async def main():
    # Create an engine instance; the API URL and timeout can be customized
    engine = CosyVoiceEngine(
        api_url="http://192.168.1.200:8000/tts/zero_shot",
        timeout=30.0
    )

    try:
        # Synthesize speech
        text = "你好,这是测试文本。"
        audio = await engine.synthesize(
            text=text,
            voice="female_standard_speaker"
        )

        # Save or process the audio
        with open("output.wav", "wb") as f:
            f.write(audio.getvalue())

    finally:
        # Close the connection
        await engine.close()

asyncio.run(main())
```

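The `try`/`finally` pattern above can be wrapped in an async context manager so that `close()` is never forgotten. This is a minimal sketch, not part of the project: `managed_engine` is a hypothetical helper, and `FakeEngine` is a stand-in that only mimics the `synthesize`/`close` methods so the example runs without a CosyVoice server.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_engine(engine):
    """Yield the engine and always await its close() on exit."""
    try:
        yield engine
    finally:
        await engine.close()


class FakeEngine:
    """Stand-in for CosyVoiceEngine so the sketch runs without a server."""

    def __init__(self):
        self.closed = False

    async def synthesize(self, text: str, voice: str) -> bytes:
        return f"{voice}:{text}".encode("utf-8")

    async def close(self):
        self.closed = True


async def demo() -> tuple:
    engine = FakeEngine()
    async with managed_engine(engine) as active:
        data = await active.synthesize(text="hello", voice="female_standard_speaker")
    # By this point close() has run, even if synthesize() had raised.
    return data, engine.closed
```

With the real engine, the same helper would presumably be used as `async with managed_engine(CosyVoiceEngine(...)) as engine:`.
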
### API Parameters

#### Synthesis interface (`synthesize`)

**Required parameters:**
- `text` (str): The text to synthesize
- `voice` (str): Speaker ID (`zero_shot_spk_id`)

**Optional parameters:**
- `language` (str): Language code; defaults to "zh-CN"
- `rate` (float): Speech rate; defaults to 1.0 (not yet supported)
- `pitch` (float): Pitch; defaults to 1.0 (not yet supported)

**Returns:**
- `BytesIO`: A byte stream object containing the audio data

**Raises:**
- `ValueError`: If the `voice` parameter is empty, or the API returns an error
- `httpx.RequestError`: Network connection error

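Of the two exception types, network errors are often transient, while a `ValueError` indicates a problem that a retry will not fix. One way to act on that distinction is a small retry wrapper with exponential backoff. The sketch below is an assumption, not project code: `retry_async` and `FlakyEngine` are hypothetical, and the demo raises the built-in `ConnectionError` so that it runs without `httpx` installed.

```python
import asyncio


async def retry_async(operation, retry_on, attempts: int = 3, base_delay: float = 0.5):
    """Await operation() and retry on the given exception types with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


class FlakyEngine:
    """Stand-in engine that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int):
        self.failures = failures
        self.calls = 0

    async def synthesize(self, text: str, voice: str) -> bytes:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated network failure")
        return b"audio-bytes"
```

With the real engine, the call would look something like `await retry_async(lambda: engine.synthesize(text=text, voice=voice), retry_on=httpx.RequestError)`. Because `ValueError` is not listed in `retry_on`, it propagates on the first attempt.
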
### CosyVoice API Request Example

```bash
curl -X POST "http://192.168.1.200:8000/tts/zero_shot" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "你好,世界",
    "zero_shot_spk_id": "female_standard_speaker"
  }'
```

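For reference, the same request can be built in Python. The project's engine uses `httpx`; the sketch below uses only the standard-library `urllib` so it has no dependencies. `build_zero_shot_request` and `send` are hypothetical helpers, and treating the response body as raw audio bytes is an assumption about the service.

```python
import json
import urllib.request

API_URL = "http://192.168.1.200:8000/tts/zero_shot"


def build_zero_shot_request(text: str, speaker_id: str,
                            api_url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    if not speaker_id:
        raise ValueError("voice (zero_shot_spk_id) is required for CosyVoice")
    payload = {"text": text, "zero_shot_spk_id": speaker_id}
    return urllib.request.Request(
        api_url,
        # ensure_ascii=False keeps Chinese text readable; the body is UTF-8 encoded.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body (assumed to be audio bytes)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Building the request separately from sending it makes the payload easy to inspect or log before any network traffic occurs.
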
### Configuring CosyVoice

To change the API URL or the timeout, use one of the following approaches:

1. **Environment variables** (recommended)
```python
import os
from tts.cosyvoice_engine import CosyVoiceEngine

api_url = os.getenv("COSYVOICE_API_URL", "http://192.168.1.200:8000/tts/zero_shot")
timeout = float(os.getenv("COSYVOICE_TIMEOUT", "30"))

engine = CosyVoiceEngine(api_url=api_url, timeout=timeout)
```

2. **Configuration file** (see `config/app.py`)
```python
from tts.cosyvoice_engine import CosyVoiceEngine

class CosyVoiceConfig:
    API_URL = "http://192.168.1.200:8000/tts/zero_shot"
    TIMEOUT = 30.0

# Class attributes are not part of an instance's __dict__, so pass them explicitly
engine = CosyVoiceEngine(
    api_url=CosyVoiceConfig.API_URL,
    timeout=CosyVoiceConfig.TIMEOUT,
)
```

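The two approaches can also be combined: a small settings object that reads the environment once and validates the values. The sketch below is illustrative only. `CosyVoiceSettings` is a hypothetical class, and the environment variable names are the ones used in the example above.

```python
import os
from dataclasses import dataclass

DEFAULT_API_URL = "http://192.168.1.200:8000/tts/zero_shot"


@dataclass(frozen=True)
class CosyVoiceSettings:
    api_url: str = DEFAULT_API_URL
    timeout: float = 30.0

    @classmethod
    def from_env(cls, environ=None) -> "CosyVoiceSettings":
        """Read COSYVOICE_API_URL and COSYVOICE_TIMEOUT, falling back to defaults."""
        env = os.environ if environ is None else environ
        timeout = float(env.get("COSYVOICE_TIMEOUT", "30"))
        if timeout <= 0:
            raise ValueError("COSYVOICE_TIMEOUT must be a positive number of seconds")
        return cls(api_url=env.get("COSYVOICE_API_URL", DEFAULT_API_URL), timeout=timeout)
```

Because the field names match the `api_url` and `timeout` keyword arguments used above, the settings could be passed along as `CosyVoiceEngine(**dataclasses.asdict(CosyVoiceSettings.from_env()))`.
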
### FastAPI Integration Example

Using CosyVoice in an API route:

```python
from fastapi import APIRouter, HTTPException
from tts.factory import TTSEngineFactory

router = APIRouter(prefix="/api/v1/tts", tags=["tts"])

@router.post("/cosyvoice/synthesize")
async def synthesize_with_cosyvoice(text: str, speaker_id: str):
    """
    Synthesize speech with CosyVoice.

    Args:
        text: The text to synthesize
        speaker_id: Speaker ID

    Returns:
        Metadata about the synthesized audio (status, size in bytes, content type)
    """
    try:
        engine = TTSEngineFactory.create("cosyvoice")
        audio = await engine.synthesize(text=text, voice=speaker_id)

        return {
            "status": "success",
            "audio_size": len(audio.getvalue()),
            "content_type": "audio/wav"
        }
    except ValueError as e:
        # Invalid input (for example, an empty speaker ID) maps to 400
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        # Keep the original error as the cause for server-side logs
        raise HTTPException(status_code=500, detail="TTS synthesis failed") from e
```

### Speaker ID Reference

Example speaker IDs (adjust these to match your deployment):

- `female_standard_speaker`: Standard female voice
- `female_gentle_speaker`: Gentle female voice
- `male_standard_speaker`: Standard male voice
- `male_gentle_speaker`: Gentle male voice

The actual speaker IDs depend on how your CosyVoice service is configured.

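Since an unknown speaker ID only fails once the request reaches the service, it can help to check it locally first. The sketch below is one possible approach: `require_known_speaker` is a hypothetical helper, and the ID set simply copies the examples above, so it would need to be replaced with the IDs of your own deployment.

```python
# Example IDs copied from the list above; replace with your deployment's IDs.
EXAMPLE_SPEAKER_IDS = frozenset({
    "female_standard_speaker",
    "female_gentle_speaker",
    "male_standard_speaker",
    "male_gentle_speaker",
})


def require_known_speaker(speaker_id: str, known_ids=EXAMPLE_SPEAKER_IDS) -> str:
    """Return speaker_id unchanged if it is known, otherwise raise ValueError."""
    if speaker_id not in known_ids:
        options = ", ".join(sorted(known_ids))
        raise ValueError(f"Unknown speaker ID {speaker_id!r}; expected one of: {options}")
    return speaker_id
```

Raising `ValueError` matches the exception type that `synthesize` is documented to raise for invalid input, so existing error handling would apply unchanged.
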
### Troubleshooting

#### Problem 1: "Failed to connect to CosyVoice API"

**Causes:**
- The CosyVoice service is not running
- The API URL is misconfigured
- Network connectivity problems

**Solution:**
```bash
# Check whether the service is running
curl http://192.168.1.200:8000/tts/zero_shot -X POST -H "Content-Type: application/json" -d "{\"text\":\"test\",\"zero_shot_spk_id\":\"test\"}"

# Check network connectivity
ping 192.168.1.200
```

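The same checks can be scripted from Python, which is useful when `ping` is blocked or when the check should run from inside the application's environment. This is a rough sketch using only the standard library. `endpoint_host_port` and `can_connect` are hypothetical helpers, and a successful TCP connection only shows that something is listening on the port, not that the CosyVoice service itself is healthy.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(api_url: str) -> tuple:
    """Extract (host, port) from the API URL, defaulting the port by scheme."""
    parts = urlsplit(api_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False
```

For the endpoint in this guide, the check would be `can_connect(*endpoint_host_port("http://192.168.1.200:8000/tts/zero_shot"))`.
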
#### Problem 2: "voice (zero_shot_spk_id) is required for CosyVoice"

**Cause:** The `voice` parameter was not provided.

**Solution:** Make sure `voice` is passed when calling `synthesize()`.

```python
audio = await engine.synthesize(
    text="测试",
    voice="valid_speaker_id"  # Provide a valid speaker ID
)
```

#### Problem 3: HTTP errors (400, 500, etc.)

**Cause:** The API returned an error response.

**Solution:**
- Check that the text is correctly formatted
- Verify that `speaker_id` is valid
- Check the CosyVoice service logs for detailed error information

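### Performance Optimization

1. **Connection reuse**: Creating engine instances through the factory allows HTTP connections to be reused
2. **Timeout configuration**: Adjust the `timeout` parameter to suit your network conditions
3. **Asynchronous processing**: Use the async interface to avoid blocking
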
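To illustrate the asynchronous point, several texts can be synthesized concurrently while capping the number of requests in flight, so the service is not overwhelmed. The sketch below is an assumption about how this might be done: `synthesize_many` is a hypothetical helper, the default limit of 4 is arbitrary, and `CountingEngine` is a stand-in that lets the example run without a server.

```python
import asyncio


async def synthesize_many(engine, texts, voice: str, max_concurrency: int = 4) -> list:
    """Synthesize several texts concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def synthesize_one(text: str):
        async with semaphore:
            return await engine.synthesize(text=text, voice=voice)

    # gather() returns results in the same order as the input texts.
    return await asyncio.gather(*(synthesize_one(text) for text in texts))


class CountingEngine:
    """Stand-in engine that records the peak number of simultaneous calls."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def synthesize(self, text: str, voice: str) -> bytes:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # Simulate network latency.
        self.active -= 1
        return text.encode("utf-8")
```

A suitable `max_concurrency` depends on how many parallel requests the CosyVoice deployment can handle, so it is worth measuring before raising it.
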
### Related Files

- `tts/cosyvoice_engine.py`: CosyVoice engine implementation
- `tts/factory.py`: TTS engine factory class
- `tts/base.py`: `TTSEngine` abstract base class
- `tts/examples.py`: Usage examples

### More Information

- [TTS Architecture Document](../docs/TTS_ARCHITECTURE.md)
- [TTS Implementation Guide](../docs/TTS_IMPLEMENTATION_SUMMARY.md)