This project demonstrates a multi-LLM orchestration system written in Go. It shows how to:
- Orchestrate multiple LLMs (OpenAI GPT-4o-mini models) with different prompts.
- Stream answers to the caller using Server-Sent Events (SSE).
- Enrich answers with domain data (a MongoDB collection of fictional flight data).
- Support multilingual queries (English and Spanish).
- Run everything locally with Docker Compose.
```mermaid
graph TD
    subgraph Client
        A[Browser / curl]
    end
    subgraph Server
        H[HTTP Handler /api]
        H -->|SSE| A
        H --> O[Orchestrator]
        O -->|prompt| L1[(LLM 1)]
        O -->|prompt| L2[(LLM 2)]
        O -->|aggregate| L3[(LLM 3)]
        O -->|query| DB[(MongoDB flights)]
    end
    subgraph Docker
        DB ---|network| Server
    end
```
- LLM 1 – concise, formal replies (or a list of flights when the question is about flights).
- LLM 2 – verbose, friendly replies (or duration & cost when the question is about flights).
- LLM 3 – aggregation layer that combines the LLM1 and LLM2 responses.
When the user's question mentions flights (in English or Spanish), the orchestrator:
- Extracts origin / destination city names using a simple synonym map.
- Queries MongoDB for matching flights (case-insensitive, supports wildcard searches).
- Feeds the flight list to both LLMs with different prompts.
- LLM3 aggregates both responses into a unified, well-formatted answer.
- Streams back the final aggregated response via SSE.
For non-flight questions, LLM1/LLM2 are given the user's question with their respective style prompts, and LLM3 combines the formal and friendly perspectives into one balanced response.
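As a sketch of the MongoDB lookup step, a case-insensitive query with the official Go driver could look like the following; the field names (`origin`, `destination`, `price`) are assumptions for illustration, not taken from the repo:

```go
package main

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/bson/primitive"
	"go.mongodb.org/mongo-driver/mongo"
)

// Flight mirrors one document in flightdb.flights (field names assumed).
type Flight struct {
	Origin      string  `bson:"origin"`
	Destination string  `bson:"destination"`
	Price       float64 `bson:"price"`
}

// findFlights matches origin/destination case-insensitively; an anchored
// regex doubles as a wildcard search when only a prefix is known.
func findFlights(ctx context.Context, coll *mongo.Collection, origin, dest string) ([]Flight, error) {
	filter := bson.M{}
	if origin != "" {
		filter["origin"] = primitive.Regex{Pattern: "^" + origin, Options: "i"}
	}
	if dest != "" {
		filter["destination"] = primitive.Regex{Pattern: "^" + dest, Options: "i"}
	}
	cur, err := coll.Find(ctx, filter)
	if err != nil {
		return nil, err
	}
	var flights []Flight
	if err := cur.All(ctx, &flights); err != nil {
		return nil, err
	}
	return flights, nil
}
```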
The system supports queries in both English and Spanish:
- Flight queries work in both languages (e.g., "flights to London" / "vuelos a Londres")
- General questions are processed in the language they're asked
- City name variations are automatically mapped (e.g., "Londres" → "London", "Madrid" → "Madrid")
- Airport codes are supported (e.g., "JFK" → "New York", "MAD" → "Madrid")
The LLMs maintain the original language in their responses, providing a seamless multilingual experience.
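A minimal sketch of the city-normalization step, assuming a plain Go map; the entries below are illustrative, following the examples above (Londres → London, JFK → New York):

```go
package main

import "strings"

// citySynonyms maps Spanish names and IATA airport codes to the
// canonical city names stored in MongoDB (illustrative subset).
var citySynonyms = map[string]string{
	"londres":    "London",
	"nueva york": "New York",
	"jfk":        "New York",
	"mad":        "Madrid",
}

// normalizeCity returns the canonical city name for a user-supplied token.
func normalizeCity(token string) string {
	key := strings.ToLower(strings.TrimSpace(token))
	if canonical, ok := citySynonyms[key]; ok {
		return canonical
	}
	// Unknown tokens pass through unchanged (e.g. "Madrid" → "Madrid").
	return strings.TrimSpace(token)
}
```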
- Go 1.22+
- Docker & Docker Compose (optional but easiest)
- OpenAI API key (required; set the `OPENAI_API_KEY` environment variable)
- Internet connection (to access the OpenAI API)
```bash
# 1. Clone the repo
$ git clone https://github.com/Cris245/go-llm-chat.git
$ cd go-llm-chat

# 2. Set your OpenAI API key (choose one method):
# Option A: Export environment variable
$ export OPENAI_API_KEY="sk-…"

# Option B: Create .env file
$ echo "OPENAI_API_KEY=sk-…" > .env

# 3. Start everything
$ docker-compose up --build
```

Docker Compose spins up:
- MongoDB on `mongodb://mongo:27017` (aliased as `MONGO_URI`).
- Go server on `http://localhost:8080`.
On first start the server seeds the `flightdb.flights` collection with a set of 20 sample flights (Madrid ↔ Paris, London ↔ Berlin, Tokyo → LA, …). Seeding is done via upsert, so restarts won't duplicate data.
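A minimal sketch of what the upsert-based seeding could look like with the official Go MongoDB driver; the field names and the origin/destination natural key are assumptions:

```go
package main

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// Flight mirrors one seed document (field names assumed).
type Flight struct {
	Origin      string  `bson:"origin"`
	Destination string  `bson:"destination"`
	Price       float64 `bson:"price"`
}

// seedFlights upserts each sample flight: an existing document is
// updated in place, a missing one is inserted, so restarts never duplicate.
func seedFlights(ctx context.Context, coll *mongo.Collection, flights []Flight) error {
	opts := options.Update().SetUpsert(true)
	for _, f := range flights {
		// The origin/destination pair acts as the natural key.
		filter := bson.M{"origin": f.Origin, "destination": f.Destination}
		update := bson.M{"$set": f}
		if _, err := coll.UpdateOne(ctx, filter, update, opts); err != nil {
			return err
		}
	}
	return nil
}
```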
- Start MongoDB locally (`brew services start mongodb-community@7` or similar).
- Export environment variables:

```bash
export MONGO_URI="mongodb://localhost:27017"
export OPENAI_API_KEY="sk-…"
```

- Run the server:

```bash
go run ./cmd/server
```
`POST /api` with a plain-text body. The response is an SSE stream.

| Event Type | Meaning | Example Data |
|---|---|---|
| `Status` | Internal status update (invoking LLM) | `Invoking LLM 1` |
| `Message` | Final aggregated answer | See example below |
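For illustration, a minimal SSE handler emitting both event types might look like the sketch below; the helper names are hypothetical, not the project's `internal/sse` API:

```go
package main

import (
	"fmt"
	"net/http"
)

// sendEvent writes one SSE event and flushes it so the client sees it immediately.
func sendEvent(w http.ResponseWriter, event, data string) {
	fmt.Fprintf(w, "event: %s\ndata: %s\n\n", event, data)
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	sendEvent(w, "Status", "Invoking LLM 1")
	// ... orchestration runs here ...
	sendEvent(w, "Message", "final aggregated answer")
}
```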
List all flights:

```bash
curl -N -X POST -d "Que vuelos hay en general?" http://localhost:8080/api
```

Ask for a specific route:

```bash
curl -N -X POST -d "hay vuelos a londres?" http://localhost:8080/api
```

General (non-flight) question:

```bash
curl -N -X POST -d "Explain quantum teleportation in simple terms" http://localhost:8080/api
```

The `-N` flag keeps the connection open so you see the `Status` events followed by the `Message`.
"Incorrect API key provided" error:
- Ensure your OpenAI API key is valid and has sufficient credits
- Check that the key is properly set in your environment or `.env` file
- Verify the key doesn't have extra characters or line breaks
"No flights found" for valid queries:
- The system includes price filtering - try removing price constraints
- Check that city names are spelled correctly
- Supported cities: Madrid, Paris, London, Barcelona, Valencia, Seville, Tokyo, New York, Los Angeles, Berlin, Rome
Connection refused on localhost:8080:
- Ensure Docker containers are running: `docker-compose ps`
- Check container logs: `docker-compose logs app`
- Verify port 8080 is not used by another application
The project includes a load testing script to validate concurrent request handling:
```bash
# Test 5 simultaneous requests
./scripts/load_test.sh 5

# Test 10 simultaneous requests
./scripts/load_test.sh 10
```

The script sends multiple concurrent requests and validates that:
- All requests receive responses
- No requests are lost or dropped
- Server maintains performance under load
- SSE streams work correctly with multiple clients
Challenge: Coordinating three LLMs with different roles while maintaining response quality.
Solution: Implemented parallel processing with `sync.WaitGroup` and channels, ensuring LLM1 and LLM2 run concurrently, then LLM3 aggregates their results.
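A sketch of this fan-out/fan-in pattern; `callLLM` is a hypothetical stand-in for the project's `llmclient` wrapper, and the prompts are illustrative:

```go
package main

import (
	"context"
	"sync"
)

// orchestrate runs the two styled prompts concurrently, then asks a
// third call to aggregate both answers.
func orchestrate(ctx context.Context, question string,
	callLLM func(ctx context.Context, prompt string) (string, error)) (string, error) {

	prompts := []string{
		"Answer concisely and formally: " + question,
		"Answer verbosely and in a friendly tone: " + question,
	}
	answers := make([]string, len(prompts))
	errs := make([]error, len(prompts))

	var wg sync.WaitGroup
	for i, p := range prompts {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			answers[i], errs[i] = callLLM(ctx, p) // LLM1 and LLM2 in parallel
		}(i, p)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return "", err
		}
	}

	// LLM3 aggregates both results into one response.
	return callLLM(ctx, "Combine these answers into one balanced reply:\n1) "+
		answers[0]+"\n2) "+answers[1])
}
```

Writing to distinct slice indices avoids a mutex, and `wg.Wait()` guarantees both results are visible before aggregation reads them.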
Challenge: Extracting flight parameters (cities, dates, prices) from natural language in multiple languages. Solution: Built comprehensive regex patterns and city synonym maps supporting airport codes (JFK → New York), multi-language variations (Londres → London), and price constraints.
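For example, a price-constraint extractor along these lines; the patterns shown are illustrative, not the orchestrator's actual regexes:

```go
package main

import (
	"regexp"
	"strconv"
)

// priceRe matches phrases like "under 200", "less than 150", "menos de 99.50".
var priceRe = regexp.MustCompile(`(?i)(?:under|below|less than|menos de|por debajo de)\s+(\d+(?:\.\d+)?)`)

// maxPrice returns the upper price bound found in a query, if any.
func maxPrice(query string) (float64, bool) {
	m := priceRe.FindStringSubmatch(query)
	if m == nil {
		return 0, false
	}
	p, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0, false
	}
	return p, true
}
```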
Challenge: Providing live status updates during multi-step LLM processing. Solution: Implemented Server-Sent Events with proper event channels, allowing clients to see real-time progress through each LLM invocation.
Challenge: Ensuring the system can handle multiple simultaneous users without losing requests or mixing responses. Solution: Used Go's goroutines and proper channel management to handle concurrent requests independently, with each request getting its own SSE stream.
Challenge: Managing flight data with upsert operations and ensuring data consistency across restarts. Solution: Implemented upsert-based seeding that prevents duplicates while ensuring fresh data on each startup.
Challenge: Building a robust system that gracefully handles LLM failures, network issues, and partial responses. Solution: Implemented fallback mechanisms where if LLM3 aggregation fails, the system combines LLM1 and LLM2 responses directly.
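A sketch of that fallback path; the function names are illustrative:

```go
package main

import "context"

// aggregate merges the two styled answers via LLM3, falling back to a
// direct combination of both when the aggregation call fails.
func aggregate(ctx context.Context,
	callLLM func(context.Context, string) (string, error), formal, friendly string) string {

	combined, err := callLLM(ctx, "Combine these answers into one balanced reply:\n"+
		formal+"\n"+friendly)
	if err != nil {
		// Fallback: surface both perspectives rather than failing the request.
		return "Formal answer:\n" + formal + "\n\nFriendly answer:\n" + friendly
	}
	return combined
}
```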
```
cmd/
  server/            # main.go – HTTP + SSE + orchestration wiring
internal/
  db/                # MongoDB client, models & seed data
  llmclient/         # Thin wrapper around OpenAI ChatCompletion
  orchestrator/      # Core logic (detect flights, prompt LLMs, merge)
  sse/               # Minimal SSE helper
scripts/
  load_test.sh       # Concurrent request testing script
Dockerfile           # Builds the Go binary for prod
docker-compose.yml   # Mongo + server services
```
- Ability to connect to an actual GDS (Global Distribution System) and PSS (Passenger Service System) to get real flight information.
- Hot reload for Docker development so each code change is redeployed automatically.
- Simple web UI to make the system more user-friendly and visually clear.
- API rate monitoring.