Files
meme/tts/README_COSYVOICE.md
konjacpotato 6772699cfe
Some checks failed
Gitea Actions Demo / deploy (push) Failing after 2s
commit code
2025-12-29 19:34:39 +08:00

331 lines
7.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CosyVoice 集成 - 实现总结
## 🎯 实现完成
已成功在 `tts` 文件夹中实现对 CosyVoice 引擎的完整支持。
## 📁 文件结构
```
tts/
├── cosyvoice_engine.py ✨ 新增 - CosyVoice 引擎实现
├── test_cosyvoice.py ✨ 新增 - 集成测试
├── COSYVOICE.md ✨ 新增 - 详细使用指南
├── COSYVOICE_QUICK_START.md ✨ 新增 - 快速参考
├── CONFIG_TEMPLATE.md ✨ 新增 - 配置模板
├── IMPLEMENTATION_SUMMARY.md ✨ 新增 - 实现总结
├── factory.py ✏️ 修改 - 注册 CosyVoice
├── __init__.py ✏️ 修改 - 导出 CosyVoiceEngine
└── examples.py ✏️ 修改 - 添加示例代码
```
## 🚀 快速开始
### 1. 安装依赖
```bash
pip install httpx
# 或者更新所有依赖
pip install -r requirements.txt
```
### 2. 最简单的使用方式
```python
import asyncio
from tts.factory import TTSEngineFactory
async def main():
# 创建 CosyVoice 引擎
engine = TTSEngineFactory.create("cosyvoice")
# 合成语音
audio = await engine.synthesize(
text="你好,这是测试",
voice="your_speaker_id" # 替换为实际的发音人ID
)
# 保存音频
with open("output.wav", "wb") as f:
f.write(audio.getvalue())
asyncio.run(main())
```
### 3. FastAPI 中使用
```python
from fastapi import APIRouter, HTTPException
from tts.factory import TTSEngineFactory
router = APIRouter()
@router.post("/tts/synthesize")
async def synthesize(text: str, speaker_id: str):
try:
engine = TTSEngineFactory.create("cosyvoice")
audio = await engine.synthesize(text=text, voice=speaker_id)
return {"status": "success", "size": len(audio.getvalue())}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
```
## 📋 API 规范
### CosyVoice API
```
POST http://192.168.1.200:8000/tts/zero_shot
Content-Type: application/json
{
"text": "要合成的文本",
"zero_shot_spk_id": "发音人ID"
}
```
### Engine.synthesize() 方法
```python
audio: BytesIO = await engine.synthesize(
text: str, # 必需:要合成的文本
voice: str, # 必需zero_shot_spk_id
language: str = "zh-CN", # 可选:语言代码
rate: float = 1.0, # 可选:语速(暂不支持)
pitch: float = 1.0 # 可选:音调(暂不支持)
)
```
## ⚙️ 配置
### 方式 1: 使用默认配置
```python
engine = TTSEngineFactory.create("cosyvoice")
# 使用默认 API 地址: http://192.168.1.200:8000/tts/zero_shot
```
### 方式 2: 自定义 API 地址
```python
from tts.cosyvoice_engine import CosyVoiceEngine
engine = CosyVoiceEngine(
api_url="http://your_api:port/endpoint",
timeout=30.0
)
```
### 方式 3: 环境变量配置
```python
import os
from tts.cosyvoice_engine import CosyVoiceEngine
api_url = os.getenv("COSYVOICE_API_URL",
"http://192.168.1.200:8000/tts/zero_shot")
timeout = float(os.getenv("COSYVOICE_TIMEOUT", "30"))
engine = CosyVoiceEngine(api_url=api_url, timeout=timeout)
```
## 🧪 测试
运行集成测试:
```bash
python tts/test_cosyvoice.py
```
测试项目:
- ✓ 工厂模式创建
- ✓ 直接创建实例
- ✓ 参数验证
- ✓ 支持的引擎列表
- ✓ 引擎对比
## 📚 文档
详细文档位置:
| 文档 | 说明 |
|------|------|
| `COSYVOICE.md` | 完整使用指南,包括所有细节 |
| `COSYVOICE_QUICK_START.md` | 快速参考,核心信息速查 |
| `CONFIG_TEMPLATE.md` | 配置模板和集成示例 |
| `IMPLEMENTATION_SUMMARY.md` | 技术实现细节 |
## ✨ 主要特性
-**异步支持** - 完全异步设计,无阻塞
-**灵活配置** - 支持自定义 API 地址和超时时间
-**错误处理** - 详细的异常捕获和错误消息
-**日志记录** - 集成 loguru 进行调试
-**工厂模式** - 统一的引擎管理接口
-**生产级** - 完整的测试覆盖和文档
## 🔧 故障排查
### 问题:连接失败
```
ValueError: Failed to connect to CosyVoice API
```
**检查清单:**
1. CosyVoice 服务是否运行
2. API 地址是否正确
3. 网络连接是否正常
4. 防火墙设置
### 问题:缺少 voice 参数
```
ValueError: voice (zero_shot_spk_id) is required for CosyVoice
```
**解决:** 提供有效的 `voice` 参数
```python
audio = await engine.synthesize(text="文本", voice="valid_id")
```
### 问题httpx 未安装
```
ModuleNotFoundError: No module named 'httpx'
```
**解决:**
```bash
pip install httpx
```
## 📦 依赖
已添加到 `requirements.txt`:
- `httpx>=0.24.0` - 异步 HTTP 客户端
## 🔗 支持的引擎
```python
from tts.factory import TTSEngineFactory
# 获取所有支持的引擎
engines = TTSEngineFactory.get_supported_engines()
# 返回: ['edge-tts', 'cosyvoice']
# 创建引擎
engine = TTSEngineFactory.create("cosyvoice")
```
## 📝 使用示例
### 示例 1: 基础用法
```python
import asyncio
from tts.factory import TTSEngineFactory
async def main():
engine = TTSEngineFactory.create("cosyvoice")
audio = await engine.synthesize(
text="你好,世界",
voice="female_standard"
)
with open("hello.wav", "wb") as f:
f.write(audio.getvalue())
asyncio.run(main())
```
### 示例 2: FastAPI 路由
```python
from fastapi import APIRouter, HTTPException
from tts.factory import TTSEngineFactory
router = APIRouter(prefix="/api/tts")
@router.post("/cosyvoice")
async def synthesize_cosyvoice(text: str, speaker_id: str):
try:
engine = TTSEngineFactory.create("cosyvoice")
audio = await engine.synthesize(text=text, voice=speaker_id)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
```
### 示例 3: 自定义配置
```python
from tts.cosyvoice_engine import CosyVoiceEngine
async def main():
engine = CosyVoiceEngine(
api_url="http://192.168.1.200:8000/tts/zero_shot",
timeout=30
)
try:
audio = await engine.synthesize(
text="自定义配置示例",
voice="speaker_001"
)
finally:
await engine.close() # 关闭连接
```
## 🎓 架构
```
TTSEngine (抽象基类)
├── EdgeTTSEngine
└── CosyVoiceEngine (新增)
TTSEngineFactory (工厂类)
├── create() -> CosyVoiceEngine
├── register_engine()
├── get_supported_engines()
└── clear_instances()
```
## ✅ 检查清单
- [x] 实现 CosyVoice 引擎类
- [x] 在工厂中注册引擎
- [x] 添加 httpx 依赖
- [x] 更新模块导出
- [x] 创建测试套件
- [x] 编写详细文档
- [x] 提供配置示例
- [x] 创建使用示例
## 📞 支持
如有问题,请查看:
1. `COSYVOICE_QUICK_START.md` - 快速参考
2. `COSYVOICE.md` - 详细文档
3. `CONFIG_TEMPLATE.md` - 配置示例
4. `test_cosyvoice.py` - 测试代码
## 🎉 总结
成功完成了 CosyVoice 引擎的集成实现,包括:
1.**核心功能** - 完整的语音合成接口
2. 🏭 **设计模式** - 工厂模式统一管理
3. 📚 **完整文档** - 快速开始到深度指南
4. 🧪 **测试覆盖** - 全面的功能测试
5. ⚙️ **灵活配置** - 支持多种配置方式
6. 🔒 **生产级质量** - 错误处理、日志、连接管理
可以立即使用,无需额外修改!
---
**实现日期**: 2025年11月28日
**状态**: ✅ 完成
**版本**: 1.0.0