YouTube demo: https://youtu.be/47YhYkwGd60
A FastAPI-based web application that provides a user interface for interacting with a Large Language Model (LLM), generating speech from text (TTS) using an Orpheus-compatible model and SNAC vocoder, and transcribing speech to text (STT) using Whisper.
- Real-time streaming of LLM responses.
- Streaming Text-to-Speech (TTS) generation.
- Speech-to-Text (STT) transcription using local Whisper models.
- Configurable parameters for LLM and TTS via the UI.
- Push-to-talk functionality using the space bar.
- Dark mode interface.
- Python: Version 3.8 or higher is recommended.
- pip: Python package installer.
- Git: For cloning the repository.
- External AI Services: This application acts as a frontend and orchestration layer. It requires separate, already running AI model services for LLM responses and initial TTS audio code generation:
  - An LLM inference server (e.g., LM Studio running an LLM of your choice).
  - An Orpheus-compatible TTS model server that provides GGUF-style audio codes via a completions API (e.g., LM Studio running an Orpheus TTS model).
  - By default, these are expected to be accessible at http://127.0.0.1:1234.
- (Optional) NVIDIA GPU & CUDA: For GPU acceleration of local PyTorch models (SNAC vocoder, Whisper STT). The application will fall back to CPU if a GPU is not available.
- (Optional & Highly recommended) FFmpeg: Whisper may require ffmpeg to be installed on the system and available in the PATH for robust audio file format support during STT (a quick check is sketched after this list).
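Once the dependencies from the installation section are in place, a quick sanity check like the one below (a hypothetical helper script, not part of the repository) will report whether PyTorch can see a CUDA GPU and whether ffmpeg is on the PATH:

```python
# check_env.py -- optional sanity check for the GPU and FFmpeg prerequisites (not part of the repo)
import shutil

import torch  # installed via requirements.txt

# The application falls back to CPU when no CUDA device is visible.
print("CUDA available:", torch.cuda.is_available())

# Whisper relies on ffmpeg for decoding most audio formats.
print("ffmpeg on PATH:", shutil.which("ffmpeg") is not None)
```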
- Clone the Repository:
  git clone https://github.com/jjmlovesgit/Orpheus_Edge_FastAPI
  cd Orpheus_Edge_FastAPI
- Create and Activate a Python Virtual Environment (Recommended):
  python -m venv .venv
  - On Windows: .\.venv\Scripts\activate
  - On macOS/Linux: source .venv/bin/activate
- Install Dependencies: Make sure you have the requirements.txt file and then run:
  pip install -r requirements.txt
  This will install FastAPI, Uvicorn, PyTorch, NumPy, Requests, OpenAI-Whisper, SNAC, and other necessary packages.
- Model Downloads (First Run): The SNAC vocoder model (for TTS) and the specified Whisper model (for STT) will be downloaded automatically by the application on its first run if they are not already cached locally by the respective libraries. An internet connection is required for this initial download (an optional warm-up sketch follows below).
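If you would rather fetch these models before the first run (for example on a machine that will later be offline), a warm-up script along these lines should populate the same caches. The SNAC checkpoint name used here, hubertsiuzdak/snac_24khz, is an assumption based on the checkpoint commonly paired with Orpheus; check the application code for the identifier it actually loads.

```python
# warm_cache.py -- optional pre-download of the local STT/TTS models (hypothetical helper)
import whisper         # openai-whisper
from snac import SNAC  # SNAC vocoder

# Match WHISPER_MODEL_NAME (default "base.en"); downloads the weights into Whisper's local cache.
whisper.load_model("base.en")

# Assumed SNAC checkpoint id -- verify against the one the application actually loads.
SNAC.from_pretrained("hubertsiuzdak/snac_24khz")

print("Model caches populated.")
```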
The application uses environment variables for configuration, with sensible defaults provided in the code. You can set these variables in your shell before running the application, or via your deployment environment's configuration tools.
Key environment variables:
- SERVER_BASE_URL: The base URL for your external LLM and Orpheus-compatible TTS server.
  - Default: http://127.0.0.1:1234
- LMSTUDIO_MODEL: (Used by llm_router.py) The model identifier for your LLM on the external server.
  - Default: "dolphin3.0-llama3.1-8b-abliterated" (adjust as per your LLM server setup)
- LMSTUDIO_SYSTEM_PROMPT: (Used by llm_router.py) The system prompt for the LLM.
  - Default: a specific story-crafting prompt (see llm_router.py)
- TTS_MODEL: The model identifier for your Orpheus-compatible TTS model on the external server.
  - Default: "orpheus-3b-0.1" (adjust as per your TTS server setup)
- WHISPER_MODEL_NAME: The Whisper model to use for STT (e.g., "tiny.en", "base.en", "small.en", "medium.en"). Larger models are more accurate but require more resources.
  - Default: "base.en"
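The snippet below is only an illustration of how configuration values of this kind are typically consumed; it assumes a plain os.getenv lookup with the defaults listed above, so exporting a variable in your shell before launching Uvicorn overrides the corresponding default.

```python
# Illustration of the assumed configuration pattern (not the repository's actual code).
import os

SERVER_BASE_URL = os.getenv("SERVER_BASE_URL", "http://127.0.0.1:1234")
LMSTUDIO_MODEL = os.getenv("LMSTUDIO_MODEL", "dolphin3.0-llama3.1-8b-abliterated")
TTS_MODEL = os.getenv("TTS_MODEL", "orpheus-3b-0.1")
WHISPER_MODEL_NAME = os.getenv("WHISPER_MODEL_NAME", "base.en")

print(SERVER_BASE_URL, LMSTUDIO_MODEL, TTS_MODEL, WHISPER_MODEL_NAME)
```

For example, on macOS/Linux running WHISPER_MODEL_NAME=small.en uvicorn main:app --reload starts the app with a larger Whisper model; on Windows, set the variable first with set (cmd) or $env: (PowerShell).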
Before starting this FastAPI application, you MUST ensure that your external LLM server and your Orpheus-compatible TTS server are running and accessible at the URL specified by SERVER_BASE_URL (or its default, http://127.0.0.1:1234).
- LM Studio Example: If using LM Studio, load your desired LLM and start the server (usually on port 1234).
- Orpheus TTS Server Example: If using a llama.cpp based server for an Orpheus TTS model, ensure it's running and serving the completions API endpoint.
This application will not function correctly if these backend AI services are not available.
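To confirm the backend is reachable before launching, a small request against its model listing endpoint usually suffices; the /v1/models path below is an assumption about an LM Studio-style, OpenAI-compatible server, not an endpoint this application itself provides.

```python
# check_backend.py -- optional reachability check for the external LLM/TTS server (hypothetical helper)
import os

import requests  # installed via requirements.txt

base_url = os.getenv("SERVER_BASE_URL", "http://127.0.0.1:1234")

try:
    # LM Studio and other OpenAI-compatible servers typically expose GET /v1/models.
    response = requests.get(f"{base_url}/v1/models", timeout=5)
    response.raise_for_status()
    print("Backend reachable; models:", response.json())
except requests.RequestException as exc:
    print(f"Backend at {base_url} is not reachable: {exc}")
```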
Once prerequisites are met, external AI services are running, and dependencies are installed:
- Navigate to the project's root directory (where main.py is located).
- Ensure your Python virtual environment is activated.
- To run the application, use the following command in your terminal:
  uvicorn main:app --reload
- main: Refers to your Python file main.py.
- app: Refers to the FastAPI instance you created in that file (e.g., app = FastAPI()).
- --reload: This flag enables auto-reloading, which is very useful during development as the server will automatically restart when you make changes to your code. For production deployments, you would typically omit this flag.
By default, when you run Uvicorn with the command uvicorn main:app --reload (without specifying a --host or --port), it binds to 127.0.0.1 (localhost) on port 8000.
You can access the application by opening your web browser and navigating to:
http://127.0.0.1:8000
Optional - Running on a different host or port:
If you want to make the server accessible from other devices on your network or run it on a different port, you can specify the --host and --port arguments:
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Using --host 0.0.0.0 will make the server listen on all available network interfaces. You would then access it via your machine's IP address (e.g., http://<your-machine-ip>:8000).
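If you prefer starting the server from Python instead of the command line, an equivalent launcher can be written in a few lines; run.py here is a hypothetical file, not something shipped in the repository.

```python
# run.py -- optional launcher equivalent to `uvicorn main:app --host 0.0.0.0 --port 8000 --reload`
import uvicorn

if __name__ == "__main__":
    # The import-string form ("main:app") is required when reload=True.
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)
```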