The `BaseMemoryService` defines the interface for managing this searchable, long-term knowledge store.

## Choosing the Right Memory Service

The ADK offers three distinct `MemoryService` implementations, each tailored to different use cases. Use the table below to decide which is the best fit for your agent.

| **Feature** | **InMemoryMemoryService** | **VertexAiMemoryBankService** | **OpenMemoryService** |
| :--- | :--- | :--- | :--- |
| **Persistence** | None (data is lost on restart) | Yes (Managed by Vertex AI) | Yes (Self-hosted backend) |
| **Primary Use Case** | Prototyping, local development, and simple testing. | Building meaningful, evolving memories from user conversations. | Self-hosted or on-premise deployments with data sovereignty or cost requirements. |
| **Memory Extraction** | Stores full conversation | Extracts [meaningful information](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/generate-memories) from conversations and consolidates it with existing memories (powered by LLM) | Stores session events with multi-sector embeddings and graceful decay |
| **Search Capability** | Basic keyword matching. | Advanced semantic search. | Advanced semantic search with multi-sector embeddings |
| **Setup Complexity** | None. It's the default. | Low. Requires an [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview) instance in Vertex AI. | Medium. Requires self-hosted OpenMemory backend (Docker or Node.js). |
| **Dependencies** | None. | Google Cloud Project, Vertex AI API | Self-hosted OpenMemory server, `httpx` (via `google-adk[openmemory]`) |
| **When to use it** | When you want to search across multiple sessions' chat histories for prototyping. | When you want your agent to remember and learn from past interactions. | When you need self-hosted, open-source memory with full data control, on-premise deployments, or cost-effective alternatives to cloud services. |

## In-Memory Memory

The `InMemoryMemoryService` stores session information in the application's memory and performs basic keyword matching for searches. It requires no setup, but nothing persists across application restarts, so it is best suited for prototyping and local testing.
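
A minimal sketch of wiring it up (nothing beyond the import is required, since this is also the default when no memory service is specified):

```py
from google.adk.memory import InMemoryMemoryService

# No configuration needed; memories live in process memory and are
# discarded when the process exits.
memory_service = InMemoryMemoryService()
```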

## Vertex AI Memory Bank

The `VertexAiMemoryBankService` connects your agent to [Vertex AI Memory Bank](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview), a fully managed Google Cloud service that extracts meaningful information from conversations and consolidates it with existing memories. It requires a Google Cloud project with the Vertex AI API enabled and an Agent Engine instance.

## OpenMemory

The `OpenMemoryService` connects your agent to [OpenMemory](https://openmemory.cavira.app/), a self-hosted, open-source memory system that provides brain-inspired multi-sector embeddings, graceful memory decay, and server-side filtering for efficient multi-user agent deployments.

### How It Works

OpenMemory provides a production-ready, self-hosted memory backend that integrates seamlessly with ADK's `BaseMemoryService` interface. The service handles two key operations:

* **Storing Memories:** Automatically converts ADK session events to OpenMemory memories with enriched content format (embedding author/timestamp metadata).
* **Retrieving Memories:** Leverages OpenMemory's multi-sector embeddings for semantic search and retrieval, with server-side filtering by `user_id` for multi-tenant isolation.
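
Both operations map onto the standard `BaseMemoryService` interface, so the calls look the same as for any other ADK memory backend. Here is a minimal sketch, assuming a completed `session` object and a `memory_service` configured as shown later on this page:

```py
# Ingest a finished session; the service converts its events into
# OpenMemory memories with author/timestamp metadata embedded.
await memory_service.add_session_to_memory(session)

# Semantic search, filtered server-side by user_id for tenant isolation.
response = await memory_service.search_memory(
    app_name="my_app",
    user_id="user_1",
    query="What did we discuss about the weather?",
)
for memory in response.memories:
    print(memory.content)  # the stored content for each retrieved memory
```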

### Key Features

* **Multi-sector embeddings:** Factual, emotional, temporal, and relational memory sectors for richer context understanding.
* **Graceful memory decay:** Automatic reinforcement keeps relevant context sharp while allowing less important memories to fade.
* **Server-side filtering:** Efficient multi-user isolation through indexed database queries.
* **Self-hosted:** Full data ownership with no vendor lock-in, perfect for on-premise deployments.
* **Cost-effective:** Reported to be 6-10× cheaper than comparable SaaS memory APIs without sacrificing performance.

### Installation

Install ADK with OpenMemory support:

```bash
pip install "google-adk[openmemory]"
```

This installs `httpx` for making HTTP requests to the OpenMemory API.

### Prerequisites

Before you can use OpenMemory, you need:

1. **A self-hosted OpenMemory backend:** You can run OpenMemory using Docker or by setting up the Node.js backend manually. See the [Self-Hosted Setup](#self-hosted-setup) section below.
2. **Environment Variables (Optional):** You can configure OpenMemory via environment variables or pass them directly to the service:

    ```bash
    export OPENMEMORY_BASE_URL="http://localhost:3000"
    export OPENMEMORY_API_KEY="your-api-key"  # Optional, only if server requires authentication
    ```
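
If you rely on those variables, one way to pick them up is an explicit read at construction time (an illustrative sketch; the constructor arguments are the ones documented below):

```py
import os

from google.adk.memory import OpenMemoryService

# Falls back to localhost when OPENMEMORY_BASE_URL is unset;
# api_key stays None if the server doesn't require authentication.
memory_service = OpenMemoryService(
    base_url=os.environ.get("OPENMEMORY_BASE_URL", "http://localhost:3000"),
    api_key=os.environ.get("OPENMEMORY_API_KEY"),
)
```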

### Configuration

You can configure OpenMemory in two ways:

#### Option 1: Using the CLI (Recommended for `adk web` and `adk api_server`)

To connect your agent to OpenMemory using the CLI, use the `--memory_service_uri` flag when starting the ADK server. The URI format is `openmemory://<host>:<port>`.

```bash title="bash"
# Basic usage
adk web path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000"

# With API key
adk web path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000?api_key=your-secret-key"

# API server
adk api_server path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000"
```

**Supported URI formats:**

* `openmemory://localhost:3000` → Connects to `http://localhost:3000`
* `openmemory://localhost:3000?api_key=secret` → Connects with API key authentication
* `openmemory://https://example.com` → Connects to `https://example.com`

#### Option 2: Using Python Code

Alternatively, you can configure OpenMemory by manually instantiating the `OpenMemoryService` and passing it to the `Runner`:

```py
from google.adk import Agent, Runner
from google.adk.artifacts import InMemoryArtifactService
from google.adk.memory import OpenMemoryService
from google.adk.sessions import InMemorySessionService
from google.genai import types

# Configure OpenMemory with defaults
memory_service = OpenMemoryService(
    base_url="http://localhost:3000",
    api_key="your-key"  # Optional, only if server requires authentication
)

# Create agent
agent = Agent(
    name="my_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant."
)

# Use with Runner
runner = Runner(
    app_name="my_app",
    agent=agent,
    session_service=InMemorySessionService(),
    artifact_service=InMemoryArtifactService(),
    memory_service=memory_service
)

# Run with memory: create a session, then send a message through the runner
session = await runner.session_service.create_session(
    app_name="my_app", user_id="user_1"
)
message = types.Content(
    role="user", parts=[types.Part(text="Hello, remember this conversation!")]
)
async for event in runner.run_async(
    user_id="user_1", session_id=session.id, new_message=message
):
    if event.is_final_response():
        print(event.content.parts[0].text)
```

### Advanced Configuration

You can customize OpenMemory behavior using `OpenMemoryServiceConfig`:

```py
from google.adk.memory import OpenMemoryService, OpenMemoryServiceConfig

# Create custom configuration
config = OpenMemoryServiceConfig(
    search_top_k=20,              # Number of memories to retrieve (default: 10)
    timeout=10.0,                 # Request timeout in seconds (default: 30.0)
    user_content_salience=0.9,    # Importance score for user messages (default: 0.8)
    model_content_salience=0.75,  # Importance score for model responses (default: 0.7)
    default_salience=0.6,         # Fallback salience value (default: 0.6)
    enable_metadata_tags=True     # Toggle session/app tagging (default: True)
)

memory_service = OpenMemoryService(
    base_url="http://localhost:3000",
    api_key="your-api-key",
    config=config
)
```

**Configuration Parameters:**

* `search_top_k` (int, default: 10): Maximum number of memories to retrieve per search query.
* `timeout` (float, default: 30.0): HTTP request timeout in seconds.
* `user_content_salience` (float, default: 0.8): Importance score (0.0-1.0) assigned to user messages when storing memories.
* `model_content_salience` (float, default: 0.7): Importance score (0.0-1.0) assigned to model responses when storing memories.
* `default_salience` (float, default: 0.6): Fallback salience value for content without a recognized author.
* `enable_metadata_tags` (bool, default: True): Whether to include session and app tags for filtering memories by application context.

### Self-Hosted Setup

OpenMemory can be deployed using Docker (recommended) or by setting up the Node.js backend manually.

#### Option 1: Docker (Recommended)

The easiest way to run OpenMemory is using Docker:

```bash
# Run OpenMemory container
docker run -p 3000:3000 cavira/openmemory

# Or use the production build
docker run -p 3000:3000 cavira/openmemory:production
```

Verify it's running:

```bash
curl http://localhost:3000/health
```

#### Option 2: Node.js Backend

For more control, you can set up the OpenMemory backend manually:

1. **Clone the OpenMemory repository:**

    ```bash
    git clone https://github.com/CaviraOSS/OpenMemory.git
    cd OpenMemory/backend
    ```

2. **Install dependencies:**

    ```bash
    npm install
    ```

3. **Configure environment variables:**

    Create a `.env` file in `OpenMemory/backend/`:

    ```bash
    # Embedding Provider (e.g., Gemini)
    OM_EMBEDDINGS=gemini
    GEMINI_API_KEY=your-gemini-api-key
    EMBED_MODE=simple

    # Server Configuration
    OM_PORT=3000
    OM_API_KEY=openmemory-secret-key  # Optional, for API authentication

    # Database
    DB_PATH=./data/openmemory.db
    ```

4. **Start the server:**

    ```bash
    npm start
    # Server will run on http://localhost:3000
    ```

For more detailed setup instructions, see the [OpenMemory documentation](https://openmemory.cavira.app/).

### Advanced Usage

#### Multi-User Isolation

OpenMemory uses server-side filtering by `user_id` for efficient multi-tenant isolation. The `user_id` is passed as a top-level parameter to leverage OpenMemory's indexed database column, ensuring fast queries and proper tenant isolation in production deployments.
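
As a hedged illustration (the IDs are placeholders), two users searching the same backend never see each other's memories:

```py
# Server-side filtering on the indexed user_id column keeps tenants apart;
# the same query returns different results per user.
alice_memories = await memory_service.search_memory(
    app_name="my_app", user_id="alice", query="travel preferences"
)
bob_memories = await memory_service.search_memory(
    app_name="my_app", user_id="bob", query="travel preferences"
)
```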

#### App-Level Filtering

When `enable_metadata_tags=True` (default), OpenMemory automatically tags memories with session and app information. This allows you to filter memories by application context, enabling different memory spaces for different applications.
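
Conversely, if you want a single shared memory space across applications, you can opt out of tagging via the config option documented above (a minimal sketch):

```py
from google.adk.memory import OpenMemoryService, OpenMemoryServiceConfig

# Disable app/session tagging so all applications using this backend
# share one memory space.
memory_service = OpenMemoryService(
    base_url="http://localhost:3000",
    config=OpenMemoryServiceConfig(enable_metadata_tags=False),
)
```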

#### Enriched Content Format

OpenMemory uses an enriched content format where author and timestamp metadata are embedded directly in the content string during storage:

```
[Author: user, Time: 2025-11-04T12:34:56] What is the weather today?
```

On retrieval, the service automatically parses this metadata and returns clean content to users. This design avoids N+1 API calls for metadata while preserving context information efficiently.
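
If you ever need to inspect raw stored content yourself, the bracketed prefix is straightforward to parse. Here is an illustrative (not official) parser for the format shown above:

```py
import re

# Matches the enriched format: "[Author: <author>, Time: <iso8601>] <text>"
ENRICHED_RE = re.compile(
    r"^\[Author: (?P<author>[^,]+), Time: (?P<time>[^\]]+)\]\s*(?P<text>.*)$"
)

raw = "[Author: user, Time: 2025-11-04T12:34:56] What is the weather today?"
match = ENRICHED_RE.match(raw)
if match:
    print(match.group("author"))  # user
    print(match.group("time"))    # 2025-11-04T12:34:56
    print(match.group("text"))    # What is the weather today?
```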

### Sample Agent

See the [OpenMemory sample agent](https://github.com/google/adk-python/tree/main/contributing/samples/open_memory) in the ADK Python repository for a complete example that demonstrates:

* Setting up OpenMemoryService with custom configuration
* Storing session events to memory
* Retrieving memories across different sessions
* Using memory in agent conversations

The sample includes setup instructions and shows how to run a complete memory-enabled agent workflow.

## Using Memory in Your Agent

When a memory service is configured, your agent can use a tool or callback to retrieve memories. ADK includes two pre-built tools for retrieving memories: `PreloadMemoryTool`, which loads relevant memories into the model's context on every turn, and `LoadMemoryTool`, which the model can call on demand with a search query.

### Can an agent have access to more than one memory service?

* **Through Standard Configuration: No.** The framework (`adk web`, `adk api_server`) is designed to be configured with one single memory service at a time via the `--memory_service_uri` flag. This single service is then provided to the agent and accessed through the built-in `self.search_memory()` method. From a configuration standpoint, you can only choose one backend (`InMemory`, `VertexAiMemoryBankService`, `OpenMemoryService`) for all agents served by that process.

* **Within Your Agent's Code: Yes, absolutely.** There is nothing preventing you from manually importing and instantiating another memory service directly inside your agent's code. This allows you to access multiple memory sources within a single agent turn.

For example, your agent could use the framework-configured `VertexAiMemoryBankService` to recall conversational history, and also manually instantiate an `OpenMemoryService` to look up information in a self-hosted memory store.

#### Example: Using Two Memory Services

Here’s how you could implement that in your agent's code:

```python
from google.adk.agents import Agent
from google.adk.memory import InMemoryMemoryService, VertexAiMemoryBankService, OpenMemoryService
from google.genai import types

class MultiMemoryAgent(Agent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Manually instantiate additional memory services
        self.vertexai_memorybank_service = VertexAiMemoryBankService(
            project="PROJECT_ID",
            location="LOCATION",
            agent_engine_id="AGENT_ENGINE_ID"
        )
        # Or use OpenMemoryService for self-hosted memory
        self.openmemory_service = OpenMemoryService(
            base_url="http://localhost:3000"
        )

    async def run(self, request: types.Content, **kwargs) -> types.Content:
        user_query = request.parts[0].text

        # 1. Search past conversations using the framework-configured service
        conversation_context = await self.search_memory(query=user_query)

        # 2. Search the document knowledge base using the manually created service
        document_context = await self.vertexai_memorybank_service.search_memory(query=user_query)

        # 3. Search self-hosted memory using OpenMemory
        openmemory_context = await self.openmemory_service.search_memory(query=user_query)

        # Combine the context from all sources to generate a better response
        prompt = "From our past conversations, I remember:\n"
        prompt += f"{conversation_context.memories}\n\n"
        prompt += "From the technical manuals, I found:\n"
        prompt += f"{document_context.memories}\n\n"
        prompt += "From the self-hosted memory, I found:\n"
        prompt += f"{openmemory_context.memories}\n\n"
        prompt += f"Based on all this, here is my answer to '{user_query}':"

        return await self.llm.generate_content_async(prompt)
```