"""
TTS Module Documentation

This module provides a unified text-to-speech (TTS) interface with an extensible multi-engine architecture.
"""

# TTS Module User Guide

## Module Structure

```
tts/
├── __init__.py          # Package entry point
├── base.py              # TTS engine base class (abstract interface)
├── edge_tts_engine.py   # Edge-TTS engine implementation
├── factory.py           # TTS engine factory
├── service.py           # High-level TTS service interface
├── examples.py          # Usage examples
└── README.md            # This document
```

## Quick Start

### 1. Install dependencies

```bash
pip install edge-tts
```

### 2. Configure the TTS engine

Add the following to your `.env` file:

```env
# TTS engine configuration
TTS_ENGINE=edge-tts   # TTS engine to use
TTS_LANGUAGE=zh-CN    # Default language
# Default voice (leave empty to use the engine default)
TTS_VOICE=
TTS_RATE=1.0          # Speaking rate (1.0 is normal)
TTS_PITCH=1.0         # Pitch (1.0 is normal)
```

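As a rough illustration of how these variables might be consumed, the sketch below reads them from the process environment into a small dataclass. `TTSSettings` and `load_tts_settings` are hypothetical names that do not come from this module, and the sketch assumes the `.env` file has already been loaded into the environment by whatever loader the project uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for the TTS_* variables (not part of the module)."""

    engine: str = "edge-tts"
    language: str = "zh-CN"
    voice: Optional[str] = None
    rate: float = 1.0
    pitch: float = 1.0


def load_tts_settings(env: Mapping[str, str] = os.environ) -> TTSSettings:
    """Build settings from environment variables, falling back to defaults."""
    voice = env.get("TTS_VOICE", "").strip()
    return TTSSettings(
        engine=env.get("TTS_ENGINE", "edge-tts"),
        language=env.get("TTS_LANGUAGE", "zh-CN"),
        voice=voice or None,  # an empty value means "use the engine default"
        rate=float(env.get("TTS_RATE", "1.0")),
        pitch=float(env.get("TTS_PITCH", "1.0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching real environment variables.
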
### 3. Basic usage

#### Method 1: Use the high-level service (recommended)

```python
import asyncio

from tts.service import TTSService


async def main():
    # Synthesize speech with the default configuration
    audio = await TTSService.synthesize("你好,世界!")

    # Override individual parameters
    audio = await TTSService.synthesize(
        "Hello, World!",
        language="en-US",
        rate=1.2,  # faster than normal
    )

    # List the supported voices
    voices = await TTSService.get_supported_voices()

    # Get information about the active engine
    info = TTSService.get_engine_info()


asyncio.run(main())
```

#### Method 2: Use the engine factory directly

```python
import asyncio

from tts.factory import TTSEngineFactory


async def main():
    # Create (or reuse) an engine instance
    engine = TTSEngineFactory.create("edge-tts")

    # Synthesize speech
    audio = await engine.synthesize(
        "你好,世界!",
        language="zh-CN",
    )

    # List the voices available for a language
    voices = await engine.get_supported_voices("zh-CN")


asyncio.run(main())
```

#### Method 3: Use an engine class directly

```python
import asyncio

from tts.edge_tts_engine import EdgeTTSEngine


async def main():
    engine = EdgeTTSEngine()

    audio = await engine.synthesize(
        "你好,世界!",
        voice="zh-CN-XiaoxiaoNeural",
        language="zh-CN",
    )


asyncio.run(main())
```

## API Reference

### TTSService (recommended)

High-level service interface that automatically applies the settings from the configuration file.

```python
from io import BytesIO
from typing import Optional


async def synthesize(
    text: str,
    language: Optional[str] = None,
    voice: Optional[str] = None,
    rate: Optional[float] = None,
    pitch: Optional[float] = None,
) -> BytesIO:
    """Synthesize text into speech."""


async def get_supported_voices(language: Optional[str] = None) -> list[dict]:
    """Return the list of supported voices."""


def get_engine_info() -> dict:
    """Return information about the engine."""


def reset_engine() -> None:
    """Reset the engine (only needed when switching engines)."""
```

### TTSEngineFactory

Engine factory class that manages engine creation and lifecycle.

```python
@classmethod
def create(cls, engine_type: str | TTSEngineType) -> TTSEngine:
    """Create an engine instance (singleton pattern)."""


@classmethod
def register_engine(cls, engine_type: str, engine_class: type[TTSEngine]) -> None:
    """Register a new engine type."""


@classmethod
def get_supported_engines(cls) -> list[str]:
    """Return all supported engines."""
```

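The caching and registration behaviour described for the factory could look roughly like the sketch below. It is a simplified stand-in, not the actual contents of `factory.py`: `SketchFactory` and `DummyEngine` are hypothetical names, and only string engine identifiers are handled, whereas the real factory also accepts `TTSEngineType` values.

```python
from typing import Dict


class DummyEngine:
    """Stand-in engine used only for this illustration."""


class SketchFactory:
    """Minimal registry that instantiates each engine type at most once."""

    _engines: Dict[str, type] = {}      # engine name -> engine class
    _instances: Dict[str, object] = {}  # engine name -> cached instance

    @classmethod
    def register_engine(cls, engine_type: str, engine_class: type) -> None:
        cls._engines[engine_type] = engine_class

    @classmethod
    def create(cls, engine_type: str) -> object:
        if engine_type not in cls._engines:
            raise ValueError(f"Unsupported TTS engine: {engine_type}")
        if engine_type not in cls._instances:
            # First request for this type: build and cache the instance
            cls._instances[engine_type] = cls._engines[engine_type]()
        return cls._instances[engine_type]

    @classmethod
    def get_supported_engines(cls) -> list:
        return sorted(cls._engines)
```

With this shape, two calls to `create` with the same name return the same object, which is the property the "singleton pattern" note refers to.
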
### TTSEngine (base class)

The interface that every engine must implement.

```python
from io import BytesIO
from typing import Optional


async def synthesize(
    self,
    text: str,
    language: str = "zh-CN",
    voice: Optional[str] = None,
    rate: float = 1.0,
    pitch: float = 1.0,
) -> BytesIO:
    """Synthesize text into speech."""


async def get_supported_voices(self, language: str = "zh-CN") -> list[dict]:
    """Return the supported voices."""


def get_engine_name(self) -> str:
    """Return the engine name."""


def get_engine_version(self) -> str:
    """Return the engine version."""
```

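One plausible way to enforce such an interface is Python's `abc` module, which makes an engine that forgets a method fail at instantiation rather than at call time. Whether `base.py` actually relies on `ABC` is an assumption; `SketchEngine`, `IncompleteEngine`, and `can_instantiate` are hypothetical names used only for this illustration.

```python
from abc import ABC, abstractmethod
from io import BytesIO
from typing import Optional


class SketchEngine(ABC):
    """Hypothetical abstract base mirroring the four interface methods."""

    @abstractmethod
    async def synthesize(self, text: str, language: str = "zh-CN",
                         voice: Optional[str] = None,
                         rate: float = 1.0, pitch: float = 1.0) -> BytesIO: ...

    @abstractmethod
    async def get_supported_voices(self, language: str = "zh-CN") -> list: ...

    @abstractmethod
    def get_engine_name(self) -> str: ...

    @abstractmethod
    def get_engine_version(self) -> str: ...


class IncompleteEngine(SketchEngine):
    """Implements only one of the four methods, so it stays abstract."""

    def get_engine_name(self) -> str:
        return "incomplete"


def can_instantiate(engine_class: type) -> bool:
    """Return True if the class can be constructed without arguments."""
    try:
        engine_class()
        return True
    except TypeError:  # raised by ABC when abstract methods are missing
        return False
```

Here `can_instantiate(IncompleteEngine)` returns `False`, because three abstract methods are still unimplemented.
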
## Supported Languages and Voices

### Main languages supported by Edge-TTS

- **Chinese (Simplified)**: zh-CN - Xiaoxiao (zh-CN-XiaoxiaoNeural)
- **Chinese (Traditional)**: zh-TW
- **English (United States)**: en-US - Aria (en-US-AriaNeural)
- **English (United Kingdom)**: en-GB - Sonia (en-GB-SoniaNeural)
- **Japanese**: ja-JP
- **Korean**: ko-KR
- **French**: fr-FR
- **German**: de-DE
- **Spanish**: es-ES
- **Russian**: ru-RU

### Getting the full voice list

```python
import asyncio

from tts.service import TTSService


async def main():
    voices = await TTSService.get_supported_voices("zh-CN")
    for voice in voices:
        print(f"{voice['display_name']}: {voice['name']}")


asyncio.run(main())
```

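Once the list is available, picking a voice by a fragment of its name is a common next step. The helper below is a hypothetical sketch: it relies only on the `name` and `display_name` keys used in the example above and assumes each entry is a plain dictionary.

```python
from typing import Optional


def find_voice(voices: list, keyword: str) -> Optional[str]:
    """Return the `name` of the first voice whose name or display name
    contains `keyword` (case-insensitive), or None if nothing matches."""
    needle = keyword.lower()
    for voice in voices:
        haystack = f"{voice.get('name', '')} {voice.get('display_name', '')}"
        if needle in haystack.lower():
            return voice.get("name")
    return None
```

The returned identifier can then be passed as the `voice` argument of `synthesize`.
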
## Adding a New TTS Engine

### Step 1: Create the engine class

Create a new file, `tts/new_engine.py`:

```python
from io import BytesIO
from typing import Optional

from .base import TTSEngine


class NewTTSEngine(TTSEngine):
    """A new TTS engine implementation."""

    async def synthesize(
        self,
        text: str,
        language: str = "zh-CN",
        voice: Optional[str] = None,
        rate: float = 1.0,
        pitch: float = 1.0,
    ) -> BytesIO:
        # Implement the synthesis logic here
        raise NotImplementedError

    async def get_supported_voices(self, language: str = "zh-CN") -> list[dict]:
        # Implement the voice listing here
        raise NotImplementedError

    def get_engine_name(self) -> str:
        return "new-engine"

    def get_engine_version(self) -> str:
        return "1.0.0"
```

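When filling in `synthesize`, an engine usually has to translate the interface's `rate` multiplier (1.0 is normal) into its own units. The `edge-tts` package, for example, takes the rate as a signed percentage string such as `+20%`. How `edge_tts_engine.py` performs this conversion is not shown in this document, so the helper below is a hypothetical sketch of one reasonable mapping.

```python
def rate_to_percent(rate: float) -> str:
    """Convert a speed multiplier (1.0 = normal) to a signed percent string."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    percent = round((rate - 1.0) * 100)  # 1.2 -> 20, 0.8 -> -20
    return f"{percent:+d}%"              # always include the sign
```

A similar mapping would be needed for `pitch`, but the appropriate unit depends on the target engine, so it is left out here.
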
### Step 2: Register it in the factory

Edit `tts/factory.py`:

```python
from enum import Enum

from .edge_tts_engine import EdgeTTSEngine
from .new_engine import NewTTSEngine


class TTSEngineType(Enum):
    EDGE_TTS = "edge-tts"
    NEW_ENGINE = "new-engine"  # add the new engine


class TTSEngineFactory:
    _engines = {
        TTSEngineType.EDGE_TTS: EdgeTTSEngine,
        TTSEngineType.NEW_ENGINE: NewTTSEngine,  # register the engine class
    }
```

### Step 3: Update the configuration

Select the new engine in `.env`:

```env
TTS_ENGINE=new-engine
```

### Step 4: Use the new engine

```python
import asyncio

from tts.service import TTSService


async def main():
    # TTSService automatically uses the engine selected in the configuration
    audio = await TTSService.synthesize("Hello, World!")


asyncio.run(main())
```

## REST API Endpoints

### 1. Synthesize speech

```http
POST /api/v1/tts/synthesize
Content-Type: application/json

{
  "text": "你好,世界!",
  "language": "zh-CN",
  "voice": null,
  "rate": 1.0,
  "pitch": 1.0
}
```

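For callers working from Python, the request above can be assembled with the standard library alone. The sketch below only builds the request object; `build_synthesize_request` is a hypothetical helper, the base URL is whatever host serves the API, and the format of the response body is not specified in this document.

```python
import json
import urllib.request


def build_synthesize_request(base_url: str, text: str,
                             language: str = "zh-CN") -> urllib.request.Request:
    """Build (but do not send) a POST request for the synthesize endpoint."""
    payload = {
        "text": text,
        "language": language,
        "voice": None,  # serialized as JSON null
        "rate": 1.0,
        "pitch": 1.0,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; how to interpret the response (raw audio bytes or a JSON envelope) depends on the server implementation.
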
### 2. List voices

```http
GET /api/v1/tts/voices?language=zh-CN
```

### 3. List supported engines

```http
GET /api/v1/tts/engines
```

### 4. Get engine information

```http
GET /api/v1/tts/engine-info
```

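## Performance

1. **Engine caching**: `TTSEngineFactory` uses the singleton pattern to cache engine instances, so repeated `create()` calls reuse the same engine.
2. **Asynchronous processing**: all I/O operations are asynchronous, which allows many requests to be handled concurrently.
3. **Configuration caching**: settings are read from the configuration file once, at initialization.
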
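Because synthesis calls are coroutines, several texts can be processed concurrently with `asyncio.gather`. The sketch below adds a semaphore so that only a bounded number of requests run at once; `synthesize_many` and the default limit of 4 are assumptions for illustration, not part of this module.

```python
import asyncio
from typing import Awaitable, Callable, List


async def synthesize_many(
    synthesize: Callable[[str], Awaitable[object]],
    texts: List[str],
    max_concurrency: int = 4,
) -> list:
    """Run `synthesize` over all texts, at most `max_concurrency` at a time.

    Results are returned in the same order as `texts`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(text: str) -> object:
        async with semaphore:  # wait for a free slot before calling the engine
            return await synthesize(text)

    return await asyncio.gather(*(worker(t) for t in texts))
```

Inside an async function this could be called as `await synthesize_many(TTSService.synthesize, ["你好", "世界"])`.
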
## Error Handling

```python
import asyncio

from tts.service import TTSService


async def main():
    try:
        audio = await TTSService.synthesize("文本")
    except Exception as e:
        print(f"TTS synthesis failed: {e}")


asyncio.run(main())
```

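Since Edge-TTS depends on an online service, a failure may be transient, and retrying with a short backoff is one way to handle that. The wrapper below is a hypothetical sketch: `with_retries` is not part of this module, and it catches `Exception` broadly because the specific exception types raised by the module are not documented here.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await `operation()` up to `attempts` times with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next try
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A call might look like `await with_retries(lambda: TTSService.synthesize("文本"))`, with the surrounding `try`/`except` kept for the case where every attempt fails.
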
## License

See the main project's license.

## Changelog

### v1.0.0 (initial release)

- ✅ Edge-TTS engine implementation
- ✅ Factory pattern for engine extensibility
- ✅ High-level service interface
- ✅ REST API support
- ✅ Multi-language support