# CosyVoice Integration - Implementation Summary

## 🎯 Implementation Complete

The `tts` folder now fully supports the CosyVoice engine.

## 📁 File Structure

```
tts/
├── cosyvoice_engine.py          ✨ New - CosyVoice engine implementation
├── test_cosyvoice.py            ✨ New - integration tests
├── COSYVOICE.md                 ✨ New - detailed usage guide
├── COSYVOICE_QUICK_START.md     ✨ New - quick reference
├── CONFIG_TEMPLATE.md           ✨ New - configuration template
├── IMPLEMENTATION_SUMMARY.md    ✨ New - implementation summary
├── factory.py                   ✏️ Modified - registers CosyVoice
├── __init__.py                  ✏️ Modified - exports CosyVoiceEngine
└── examples.py                  ✏️ Modified - adds example code
```

## 🚀 Quick Start

### 1. Install dependencies

```bash
pip install httpx
# Or update all dependencies
pip install -r requirements.txt
```

### 2. Simplest usage

```python
import asyncio
from tts.factory import TTSEngineFactory

async def main():
    # Create the CosyVoice engine
    engine = TTSEngineFactory.create("cosyvoice")

    # Synthesize speech
    audio = await engine.synthesize(
        text="你好,这是测试",  # "Hello, this is a test"
        voice="your_speaker_id"  # Replace with a real speaker ID
    )

    # Save the audio
    with open("output.wav", "wb") as f:
        f.write(audio.getvalue())

asyncio.run(main())
```

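Before writing the result to disk, it can help to confirm that the returned buffer really holds WAV data. The sketch below uses only the standard-library `wave` module. It assumes the engine returns WAV bytes, which the `output.wav` file name suggests but this summary does not state. `describe_wav` and `make_silent_wav` are hypothetical helpers, not part of the `tts` package.

```python
import io
import wave


def describe_wav(audio: io.BytesIO) -> dict:
    """Return basic properties of WAV data held in a BytesIO buffer.

    Raises wave.Error if the buffer does not contain WAV data.
    """
    audio.seek(0)  # read from the start regardless of the current position
    with wave.open(audio, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


def make_silent_wav(seconds: float = 0.5, rate: int = 16000) -> io.BytesIO:
    """Build a mono 16-bit silent WAV in memory (stand-in for engine output)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf
```

Calling `describe_wav(audio)` on the buffer returned by `engine.synthesize(...)` would report the sample rate and duration, or raise `wave.Error` if the service returned something else, such as an error page.
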
### 3. Using it in FastAPI

```python
from fastapi import APIRouter, HTTPException
from tts.factory import TTSEngineFactory

router = APIRouter()

# text and speaker_id are read from the query string
@router.post("/tts/synthesize")
async def synthesize(text: str, speaker_id: str):
    try:
        engine = TTSEngineFactory.create("cosyvoice")
        audio = await engine.synthesize(text=text, voice=speaker_id)
        return {"status": "success", "size": len(audio.getvalue())}
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
```

## 📋 API Specification

### CosyVoice API

```
POST http://192.168.1.200:8000/tts/zero_shot
Content-Type: application/json

{
  "text": "text to synthesize",
  "zero_shot_spk_id": "speaker ID"
}
```

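For debugging the service independently of the engine, the request above can be reproduced by hand. The sketch below uses the standard-library `urllib` instead of `httpx` so it has no dependencies; it is not how `CosyVoiceEngine` itself sends requests. `build_request` and `synthesize_blocking` are hypothetical names, and treating the response body as raw audio bytes is an assumption, since this summary does not describe the response format.

```python
import io
import json
import urllib.request

DEFAULT_API_URL = "http://192.168.1.200:8000/tts/zero_shot"


def build_request(
    text: str, spk_id: str, api_url: str = DEFAULT_API_URL
) -> urllib.request.Request:
    """Build the POST request from the API specification without sending it."""
    payload = {"text": text, "zero_shot_spk_id": spk_id}
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_blocking(text: str, spk_id: str, timeout: float = 30.0) -> io.BytesIO:
    """Send the request and return the response body in a BytesIO buffer.

    Assumption: the response body is the audio data itself.
    Raises urllib.error.URLError (or its subclass HTTPError) on failure.
    """
    request = build_request(text, spk_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return io.BytesIO(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running service.
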
### Engine.synthesize() method

```python
from io import BytesIO

# Signature of the engine's synthesize() method
async def synthesize(
    self,
    text: str,                # Required: text to synthesize
    voice: str,               # Required: the zero_shot_spk_id
    language: str = "zh-CN",  # Optional: language code
    rate: float = 1.0,        # Optional: speaking rate (not yet supported)
    pitch: float = 1.0,       # Optional: pitch (not yet supported)
) -> BytesIO:
    ...
```

## ⚙️ Configuration

### Option 1: Use the default configuration

```python
from tts.factory import TTSEngineFactory

engine = TTSEngineFactory.create("cosyvoice")
# Uses the default API address: http://192.168.1.200:8000/tts/zero_shot
```

### Option 2: Custom API address

```python
from tts.cosyvoice_engine import CosyVoiceEngine

engine = CosyVoiceEngine(
    api_url="http://your_api:port/endpoint",
    timeout=30.0
)
```

### Option 3: Environment variables

```python
import os
from tts.cosyvoice_engine import CosyVoiceEngine

# Fall back to the defaults when the variables are not set
api_url = os.getenv("COSYVOICE_API_URL",
                    "http://192.168.1.200:8000/tts/zero_shot")
timeout = float(os.getenv("COSYVOICE_TIMEOUT", "30"))

engine = CosyVoiceEngine(api_url=api_url, timeout=timeout)
```

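Reading environment variables directly means a malformed value only surfaces as a bare `float()` error or a failed request later on. One possible refinement is to validate both values up front. `CosyVoiceSettings` below is a hypothetical helper, not part of the `tts` package; only the two variable names and the defaults come from the example above.

```python
import math
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse

DEFAULT_API_URL = "http://192.168.1.200:8000/tts/zero_shot"
DEFAULT_TIMEOUT = 30.0


@dataclass(frozen=True)
class CosyVoiceSettings:  # hypothetical helper, not part of the tts package
    api_url: str = DEFAULT_API_URL
    timeout: float = DEFAULT_TIMEOUT

    @classmethod
    def from_env(cls, environ: Mapping[str, str]) -> "CosyVoiceSettings":
        """Read COSYVOICE_API_URL and COSYVOICE_TIMEOUT, validating both."""
        api_url = environ.get("COSYVOICE_API_URL", DEFAULT_API_URL)
        parsed = urlparse(api_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"COSYVOICE_API_URL is not an HTTP(S) URL: {api_url!r}")

        raw_timeout = environ.get("COSYVOICE_TIMEOUT", str(DEFAULT_TIMEOUT))
        try:
            timeout = float(raw_timeout)
        except ValueError:
            raise ValueError(
                f"COSYVOICE_TIMEOUT is not a number: {raw_timeout!r}"
            ) from None
        # Reject nan, inf, zero, and negative values
        if not math.isfinite(timeout) or timeout <= 0:
            raise ValueError(f"COSYVOICE_TIMEOUT must be a positive number, got {timeout}")

        return cls(api_url=api_url, timeout=timeout)
```

With this in place, `settings = CosyVoiceSettings.from_env(os.environ)` followed by `CosyVoiceEngine(api_url=settings.api_url, timeout=settings.timeout)` would fail fast with a descriptive message when the configuration is wrong.
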
## 🧪 Testing

Run the integration tests:

```bash
python tts/test_cosyvoice.py
```

The tests cover:
- ✓ Creation through the factory
- ✓ Direct instantiation
- ✓ Parameter validation
- ✓ The list of supported engines
- ✓ Engine comparison

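To illustrate what a parameter-validation check might look like, here is a minimal sketch. It does not reproduce `test_cosyvoice.py`. `StubEngine` is a stand-in that models only the missing-`voice` check, and the assumption that an empty string counts as "missing" is mine; the real engine may behave differently.

```python
import asyncio
from io import BytesIO


class StubEngine:
    """Stand-in for the real engine; only the voice check is modeled."""

    async def synthesize(self, text: str, voice: str = "") -> BytesIO:
        if not voice:
            # Assumption: an empty voice is rejected with this message
            raise ValueError("voice (zero_shot_spk_id) is required for CosyVoice")
        return BytesIO(b"")  # placeholder, no real audio is produced


async def rejects_missing_voice(engine) -> bool:
    """Return True when synthesize() raises ValueError for an empty voice."""
    try:
        await engine.synthesize(text="test", voice="")
    except ValueError:
        return True
    return False
```

Running `asyncio.run(rejects_missing_voice(engine))` against a real engine instance would show whether it rejects the call before any network request is made.
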
## 📚 Documentation

Where to find the detailed documentation:

| Document | Description |
|------|------|
| `COSYVOICE.md` | Complete usage guide covering all details |
| `COSYVOICE_QUICK_START.md` | Quick reference for the core information |
| `CONFIG_TEMPLATE.md` | Configuration templates and integration examples |
| `IMPLEMENTATION_SUMMARY.md` | Technical implementation details |

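## ✨ Key Features

- ✅ **Async support** - fully asynchronous, non-blocking design
- ✅ **Flexible configuration** - custom API address and timeout
- ✅ **Error handling** - detailed exception handling and error messages
- ✅ **Logging** - loguru integration for debugging
- ✅ **Factory pattern** - a single interface for managing engines
- ✅ **Production-grade** - ships with tests and documentation
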
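One practical consequence of the async design is that several texts can be synthesized concurrently. The sketch below works with any object exposing an async `synthesize(text=..., voice=...)` method. `synthesize_many` and the `max_concurrency` limit are hypothetical additions, and whether the CosyVoice service handles parallel requests well is not something this summary states.

```python
import asyncio
from io import BytesIO
from typing import Protocol, Sequence


class SupportsSynthesize(Protocol):
    async def synthesize(self, text: str, voice: str) -> BytesIO: ...


async def synthesize_many(
    engine: SupportsSynthesize,
    texts: Sequence[str],
    voice: str,
    max_concurrency: int = 4,
) -> list[BytesIO]:
    """Synthesize several texts concurrently; results keep the input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(text: str) -> BytesIO:
        async with semaphore:  # cap the number of in-flight requests
            return await engine.synthesize(text=text, voice=voice)

    # gather() returns results in the order the awaitables were passed
    return await asyncio.gather(*(one(t) for t in texts))
```

The semaphore keeps a long list of texts from opening an unbounded number of simultaneous requests against a single service.
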
## 🔧 Troubleshooting

### Problem: connection failure

```
ValueError: Failed to connect to CosyVoice API
```

**Checklist:**
1. Is the CosyVoice service running?
2. Is the API address correct?
3. Is the network connection working?
4. Do the firewall settings allow the connection?

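The first three checklist items can be narrowed down with a plain TCP probe, which distinguishes "nothing is listening at that address" from "the service answered but rejected the request". This is a minimal sketch using only the standard library; `host_port_from_url` and `is_reachable` are hypothetical helpers, not part of the `tts` package.

```python
import socket
from urllib.parse import urlparse


def host_port_from_url(api_url: str) -> tuple[str, int]:
    """Extract host and port, defaulting to 443 for https and 80 otherwise."""
    parsed = urlparse(api_url)
    if not parsed.hostname:
        raise ValueError(f"No host in URL: {api_url!r}")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(api_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the API host and port succeeds."""
    host, port = host_port_from_url(api_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or DNS failure
        return False
```

If `is_reachable("http://192.168.1.200:8000/tts/zero_shot")` returns `False`, the problem lies in the service, address, network, or firewall rather than in the request itself.
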
### Problem: missing voice parameter

```
ValueError: voice (zero_shot_spk_id) is required for CosyVoice
```

**Fix:** pass a valid `voice` argument

```python
audio = await engine.synthesize(text="text to synthesize", voice="valid_id")
```

### Problem: httpx is not installed

```
ModuleNotFoundError: No module named 'httpx'
```

**Fix:**
```bash
pip install httpx
```

## 📦 Dependencies

Added to `requirements.txt`:
- `httpx>=0.24.0` - async HTTP client

## 🔗 Supported Engines

```python
from tts.factory import TTSEngineFactory

# List all supported engines
engines = TTSEngineFactory.get_supported_engines()
# Returns: ['edge-tts', 'cosyvoice']

# Create an engine
engine = TTSEngineFactory.create("cosyvoice")
```

## 📝 Usage Examples

### Example 1: Basic usage

```python
import asyncio
from tts.factory import TTSEngineFactory

async def main():
    engine = TTSEngineFactory.create("cosyvoice")
    audio = await engine.synthesize(
        text="你好,世界",  # "Hello, world"
        voice="female_standard"
    )

    with open("hello.wav", "wb") as f:
        f.write(audio.getvalue())

asyncio.run(main())
```

### Example 2: FastAPI route

```python
from fastapi import APIRouter, HTTPException
from tts.factory import TTSEngineFactory

router = APIRouter(prefix="/api/tts")

@router.post("/cosyvoice")
async def synthesize_cosyvoice(text: str, speaker_id: str):
    try:
        engine = TTSEngineFactory.create("cosyvoice")
        # audio holds the synthesized data; this route only reports success
        audio = await engine.synthesize(text=text, voice=speaker_id)
        return {"status": "success"}
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
```

### Example 3: Custom configuration

```python
import asyncio
from tts.cosyvoice_engine import CosyVoiceEngine

async def main():
    engine = CosyVoiceEngine(
        api_url="http://192.168.1.200:8000/tts/zero_shot",
        timeout=30
    )

    try:
        audio = await engine.synthesize(
            text="自定义配置示例",  # "Custom configuration example"
            voice="speaker_001"
        )
    finally:
        await engine.close()  # Close the connection

asyncio.run(main())
```

## 🎓 Architecture

```
TTSEngine (abstract base class)
├── EdgeTTSEngine
└── CosyVoiceEngine (new)

TTSEngineFactory (factory class)
├── create() -> CosyVoiceEngine
├── register_engine()
├── get_supported_engines()
└── clear_instances()
```

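For readers unfamiliar with the pattern, the structure above can be sketched as a small registry. This is an illustration of the general technique, not the contents of `factory.py`: `EngineFactory` and `SilentEngine` are hypothetical names, and caching one instance per engine name is an assumption inferred from the existence of `clear_instances()`.

```python
from abc import ABC, abstractmethod
from io import BytesIO


class TTSEngine(ABC):
    @abstractmethod
    async def synthesize(self, text: str, voice: str) -> BytesIO: ...


class EngineFactory:  # illustrative stand-in for TTSEngineFactory
    _engines: dict[str, type[TTSEngine]] = {}
    _instances: dict[str, TTSEngine] = {}

    @classmethod
    def register_engine(cls, name: str, engine_cls: type[TTSEngine]) -> None:
        cls._engines[name] = engine_cls

    @classmethod
    def create(cls, name: str) -> TTSEngine:
        if name not in cls._engines:
            raise ValueError(f"Unsupported engine: {name!r}")
        # Assumption: one cached instance per engine name
        if name not in cls._instances:
            cls._instances[name] = cls._engines[name]()
        return cls._instances[name]

    @classmethod
    def get_supported_engines(cls) -> list[str]:
        return list(cls._engines)

    @classmethod
    def clear_instances(cls) -> None:
        cls._instances.clear()


class SilentEngine(TTSEngine):  # dummy engine used only for this demo
    async def synthesize(self, text: str, voice: str) -> BytesIO:
        return BytesIO(b"")


EngineFactory.register_engine("silent", SilentEngine)
```

The benefit of this design is that adding an engine only requires a new subclass and one `register_engine` call, while callers keep using `create(name)`.
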
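## ✅ Checklist

- [x] Implement the CosyVoice engine class
- [x] Register the engine in the factory
- [x] Add the httpx dependency
- [x] Update module exports
- [x] Create the test suite
- [x] Write detailed documentation
- [x] Provide configuration examples
- [x] Create usage examples
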
## 📞 Support

If you run into problems, see:
1. `COSYVOICE_QUICK_START.md` - quick reference
2. `COSYVOICE.md` - detailed documentation
3. `CONFIG_TEMPLATE.md` - configuration examples
4. `test_cosyvoice.py` - test code

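## 🎉 Summary

The CosyVoice engine integration is complete and includes:

1. ✨ **Core functionality** - a complete speech synthesis interface
2. 🏭 **Design pattern** - unified management through the factory pattern
3. 📚 **Complete documentation** - from quick start to in-depth guide
4. 🧪 **Test coverage** - comprehensive functional tests
5. ⚙️ **Flexible configuration** - multiple configuration options
6. 🔒 **Production-grade quality** - error handling, logging, connection management

Ready to use as-is, with no further changes required.
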
---

**Implementation date**: November 28, 2025
**Status**: ✅ Complete
**Version**: 1.0.0