4 changes: 4 additions & 0 deletions src/data/nav/chat.ts
@@ -193,6 +193,10 @@ export default {
name: 'Livestream chat',
link: '/docs/guides/chat/build-livestream',
},
{
name: 'Export chat messages',
link: '/docs/guides/chat/export-chat',
},
],
},
],
134 changes: 134 additions & 0 deletions src/pages/docs/guides/chat/export-chat.mdx
@@ -0,0 +1,134 @@
---
title: "Guide: Export chat data to your own systems"
meta_description: "Learn how to export chat data from Ably Chat to your own systems."
meta_keywords: "chat, data, export, stream, storage, Ably, chat SDK, realtime messaging, dependability, cost optimisation"
---

Ably Chat is designed to be a simple and easy to use realtime chat solution that handles any scale from 1:1 and small group chats to large livestream chats with millions of users.
Contributor

Suggested change
Ably Chat is designed to be a simple and easy to use realtime chat solution that handles any scale from 1:1 and small group chats to large livestream chats with millions of users.
Ably Chat is a simple and easy to use realtime chat solution that handles any scale from 1:1 and small group chats to large livestream chats with millions of users.


Ably holds data for the purpose of providing realtime experiences. While Ably Chat provides flexible data retention for messages (30 days by default, up to a year on request), applications often need longer-term storage or additional control over their data.
Contributor

Suggested change
Ably holds data for the purpose of providing realtime experiences. While Ably Chat provides flexible data retention for messages (30 days by default, up to a year on request), applications often need longer-term storage or additional control over their data.
Ably holds data for the purpose of providing realtime experiences. While Ably Chat provides flexible data retention for messages (30 days by default, up to a year on request), some applications may need longer-term storage or additional control over their data.

More a nit pick, so feel free to ignore :)


This guide presents different ways to use Ably Chat and store chat data in your own systems, which can help you meet your data retention requirements, as well as help you build more complex use cases such as search, analytics, and more.

## Different ways to export data from Ably Chat

We will explain each in detail, and provide code examples for each. This is an overview of the different ways to export data from Ably Chat.
Contributor

I think this line isn't needed. Perhaps just keep the line "This article covers the following options"?


This article covers the following options:
Contributor

Suggested change
This article covers the following options:
This guide covers the following options:


1. Using [outbound webhooks](/docs/platform/integrations/webhooks). [HTTP endpoint](/docs/platform/integrations/webhooks/generic), [AWS Lambda](/docs/platform/integrations/webhooks/lambda), and others.
2. Using [outbound streaming](/docs/platform/integrations/streaming). Stream to your own [Kafka](/docs/platform/integrations/streaming/kafka), [Kinesis](/docs/platform/integrations/streaming/kinesis), and others.
3. Using an [Ably queue](/docs/platform/integrations/queues).
4. Publishing via your own servers.
5. Using the Chat History endpoint.

## Decoding and storing messages

Regardless of the delivery mechanism, you will need to decode the received messages into Chat messages. Details of the mapping from Ably Pub/Sub messages to Chat messages are available in the [Integrations](/docs/chat/integrations) documentation.
Contributor

This part is confusing for me. If I'm an Ably Chat user, using the Ably Chat SDK... why would I have to decode anything? I should be already managing Chat messages, right?

Contributor Author

This is where we are right now. Integrations (webhooks, outbound streaming, etc.) are Pub/Sub features, so you need to do the encoding/decoding yourself.

In the future it would be nice if those were chat-specific to make it simpler but they're not right now.

Contributor

The agreed future direction is that the actual payloads will be the same across products, but that the SDKs (e.g. Chat) will have methods that convert these to their local representations.


After performing the decoding and you have a chat `Message` object, you can:
Contributor

Suggested change
After performing the decoding and you have a chat `Message` object, you can:
After performing the decoding to get your chat `Message` object, you can:


1. Save it to your own database. You can index by `serial`, which is the globally unique identifier for a message and is also used to sort messages in the canonical global order.
2. If a message with the same `serial` already exists, you have received an update, delete, or reaction summary update. To check whether you need to update the stored message, compare the `version.serial` of the message you received with the `version.serial` of the message in your database. A lexicographically higher `version.serial` means a newer version.
3. If you want to store reaction summaries, always update the reactions field when receiving a reaction summary update (action `4` or `message.summary`).
Contributor

What is a "reaction summary update"? Is it a special message type, with different metadata? Is there a link to related documentation?

Contributor Author

@vladvelici vladvelici Nov 6, 2025

It's a message with action message.summary. Over the wire it's a full regular message, but at the time of feature implementation in Chat this was a partial message that only had the summary (and identifying basics like timestamp and serial).

The chat SDK still exposes these via room.messages.reactions.subscribe() as a reactions summary event.

In Pub/Sub you get them via channel.subscribe().

Over integrations they look like messages with action message.summary.

Not sure what doc to point you to, probably https://ably.com/docs/chat/rooms/message-reactions, and maybe also annotations: https://ably.com/docs/messages/annotations#subscribe-to-annotation-summaries-.

Contributor

Perhaps we should add some links here to direct users to sections on message serials/versions?


<Code>
```javascript
const saveOrUpdateMessage = async (message) => {
  // Check if the message already exists in your own database by `serial`
  const existingMessage = await getMessageBySerial(message.serial);
  if (!existingMessage) {
    // message not yet in your database => save it
    await saveMessage(message);
    return;
  }

  if (message.version.serial < existingMessage.version.serial) {
    // if received version is older, discard
    return;
  } else if (message.version.serial === existingMessage.version.serial && message.action !== 'message.summary') {
    // if the message is the same version, and the action is not a summary, discard
    return;
  }

  // message is newer or it's a summary event => update the message
  await updateMessage(existingMessage, message);
};
```
</Code>
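
The snippet above calls `getMessageBySerial`, `saveMessage`, and `updateMessage` without defining them, since they depend on your database. As a rough illustration only, the following sketch shows in-memory stand-ins for those helpers. The `Map`-based store and the merge rules are assumptions made for this example (in particular, keeping previously stored reactions when an update arrives without them); a real implementation would query your own database and would usually be asynchronous.

<Code>
```javascript
// Hypothetical in-memory stand-ins for the database helpers used above.
// A real implementation would read from and write to your own database.
const messagesBySerial = new Map();

// Look up a stored message by its `serial`; returns undefined if absent.
const getMessageBySerial = (serial) => messagesBySerial.get(serial);

// Store a message that has not been seen before.
const saveMessage = (message) => {
  messagesBySerial.set(message.serial, { ...message });
};

// Apply a newer version or a reaction summary to a stored message.
const updateMessage = (existingMessage, message) => {
  const updated =
    message.action === 'message.summary'
      ? // summary event: only refresh the reactions field
        { ...existingMessage, reactions: message.reactions }
      : // update or delete: take the new version, keep known reactions (assumption)
        { ...existingMessage, ...message, reactions: message.reactions ?? existingMessage.reactions };
  messagesBySerial.set(existingMessage.serial, updated);
};
```
</Code>

These stand-ins are synchronous for brevity. Using `await` on their return values, as `saveOrUpdateMessage` does, still works because `await` accepts plain values.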

## Using a webhook via integration rules

Ably can forward messages to your own system via a webhook. This is the simplest option to set up if you don't already have other systems in place for message ingestion. This section covers the simple HTTP endpoint webhook, but the same principles apply to other webhook integrations such as AWS Lambda, Azure Function, Google Function, and others.

Read the guide on [outbound webhooks](/docs/platform/integrations/webhooks) for more details on how to set up the webhook with Ably for the platform of your choice.

All webhook integrations allow you to use a regex filter on the channel name to control which channels the webhook should be triggered for. Use a common prefix in the name of chat rooms that you want to trigger a webhook for, and use the prefix as the filter.

Use `channel.message` as the event type.
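
To sanity-check a prefix-based filter before configuring it, you can try the pattern locally. The sketch below assumes a hypothetical naming scheme in which every room to be exported starts with `export:`. It only demonstrates JavaScript regular expression behaviour; how Ably applies the filter, and how room names map to the underlying channel names, should be confirmed in the integrations documentation.

<Code>
```javascript
// Hypothetical naming scheme: every room you want exported starts with "export:".
const EXPORT_PREFIX = 'export:';

// Escape regex metacharacters so the prefix is matched literally.
const escapeRegex = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Build a filter pattern anchored at the start of the name.
const buildPrefixFilter = (prefix) => `^${escapeRegex(prefix)}.*`;

const filterSource = buildPrefixFilter(EXPORT_PREFIX);
const filter = new RegExp(filterSource);

// Made-up room names: only the prefixed ones should match the filter.
const roomNames = ['export:support-1234', 'export:livestream-42', 'lobby'];
const exported = roomNames.filter((name) => filter.test(name));

console.log(filterSource, exported);
```
</Code>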

You need to consider:
- Redundancy. In case of failure, Ably will retry delivering the message to your webhook, but only for a short period of time. You can see errors in the [`[meta]log` channel](/docs/platform/errors#meta).
- Ordering. Messages can arrive out of order. This is mitigated by the fact that they are globally sortable by their `serial` and `version.serial`. In rare cases, out-of-order delivery can cause inconsistencies for reaction summaries, if those are of interest to you.
Contributor

We should call out at-least-once semantics here

Contributor Author

I can't spot any reference to that in the platform docs for integrations, so maybe leave it out? I'm thinking this kind of stuff shouldn't come from chat guides and we already cover how to deduplicate by serial.

WDYT?

Contributor

Discussed briefly here in relation to Lambda: https://ably.com/docs/platform/architecture/idempotency#protocol-support-for-exactly-once-delivery

Also, because there are retries, you do have to account for it.

- Consistency. Missing webhook calls will lead to inconsistencies between your database and Ably, which can be difficult to resolve.
- [At-least-once delivery](/docs/platform/architecture/idempotency#protocol-support-for-exactly-once-delivery). You need to be able to handle duplicate messages. Deduplication can be done by checking `serial` and `version.serial`.
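
The ordering and at-least-once points above can be checked with a small experiment: if stale and duplicate deliveries are discarded by comparing `version.serial`, replaying the same deliveries in a different order, with repeats, should leave the store in the same state. The sketch below uses an in-memory `Map` and made-up serial values (real serials are opaque strings), and it leaves out reaction summaries, which need the separate handling described earlier.

<Code>
```javascript
// Apply one decoded delivery to an in-memory store keyed by `serial`.
// Summary events are left out of this sketch.
const applyDelivery = (store, message) => {
  const existing = store.get(message.serial);
  // Keep the stored copy unless the delivery carries a strictly newer version.
  if (existing && message.version.serial <= existing.version.serial) {
    return false; // duplicate or stale delivery: nothing to do
  }
  store.set(message.serial, message);
  return true;
};

// Made-up deliveries: a create, an update, and a delete of the same message.
const created = { serial: '01', version: { serial: '01' }, action: 'message.create', text: 'helo' };
const updated = { serial: '01', version: { serial: '02' }, action: 'message.update', text: 'hello' };
const deleted = { serial: '01', version: { serial: '03' }, action: 'message.delete', text: '' };

// Replay once in order, and once out of order with duplicates.
const inOrder = new Map();
[created, updated, deleted].forEach((m) => applyDelivery(inOrder, m));

const shuffled = new Map();
[updated, deleted, created, updated, deleted].forEach((m) => applyDelivery(shuffled, m));

// Both stores should hold the same final version of the message.
console.log(inOrder.get('01').action, shuffled.get('01').action);
```
</Code>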

## Using outbound streaming

Ably can stream messages directly to your own queueing or streaming service: Kinesis, Kafka, AMQP, SQS. Read the guide on [outbound streaming](/docs/platform/integrations/streaming) for more details on how to set up the streaming integration with Ably for the service of your choice.

Pros:
- Use your existing queue system to process and save messages from Ably.
- You have control over saving messages to your own database.

You need to consider:
- You need to maintain and be responsible for a reliable queue system. If you don't already have such a system it increases complexity on your end.
- Consistency. If your queue system is not reachable, you will lose messages. Errors can be seen in the [`[meta]log` channel](/docs/platform/errors#meta).

## Using an Ably queue

Ably can forward messages from chat room channels to an Ably Queue, which you can then consume from your own servers to save messages to your own database. Read the guide on [Ably queues](/docs/platform/integrations/queues) for more details on how to set up the queue integration with Ably.

Ably ensures that each message is delivered to only one consumer even if multiple consumers are connected.

Benefits of using an Ably queue:

- You can consume it from your servers, meaning overall this is fault-tolerant. Ably takes care of the complexity of maintaining a queue.
- You can use multiple queues and configure which channels go to which queue via regex filters on the channel name.
- If your systems suffer any downtime, you will not miss anything (up to the queue max size).

You need to consider:
- During peak times you may need to scale up your consumers to avoid overloading the queue past the maximum queue length allowed.
- Each message has a TTL in the queue.
- Oldest messages are dropped if the maximum queue length is exceeded. Check the [dead letter queue](/docs/platform/integrations/queues#deadletter) to see if this is happening.

## Publishing via your own servers

Change the publish path: instead of publishing Chat messages, updates, and deletes to Ably directly, proxy them through your own server. This gives you the opportunity to also save the messages as they are produced, and also apply different validation schemes if needed.

Benefits:
- Full control over publishing.
- Opportunity to add extra validation before publishing to Ably.
- You can publish messages directly via the Chat REST API, and avoid having to encode/decode Chat Messages to and from Ably Pub/Sub messages. You can bypass using an SDK entirely or you can use the Chat SDK for publishing.
Contributor

I'd say just use the Chat SDK. The whole encode/decode part sounds confusing to me (as an external developer).

Contributor Author

Not having to do it is a benefit when you publish via your own servers or fetch via history.

But we don't have this benefit in the other methods of integrating right now.

I'd leave this paragraph in, perhaps mention this in the next section as well. WDYT?

Contributor

The main problem I see is that, as a developer who wants to use Ably Chat, it is difficult for me to understand why I should care about Pub/Sub. I want to use the chat; how it is implemented in Ably should be transparent to me. I see that I have to encode to or decode from Pub/Sub, and that sounds complex and unnecessary. I just want to manage my messages.

Contributor

We have a progressive disclosure of complexity that we aim for - there will be some use-cases where we have to introduce Pub/Sub to the mix.

In this case, once we do the SDK changes to take integration payloads as they are and turn them directly into chat messages... people won't need to worry about Pub/Sub.


You need to consider:
- You need to handle updates and deletes on your own.
- Storing message reactions can be difficult since you will not have access to the aggregate (summaries) Ably provides.
Contributor

Do we offer an API for this in the SDK or REST APIs?

Contributor Author

You can:

  • Fetch history (includes summaries)
  • Fetch single message by serial (includes summaries)
  • Fetch "own summary" but not really useful here

But in this context they're all a big annoyance since you need to actively fetch instead and you need to decide when to do it.

- Your own servers are in the middle of the message publish path, so they can become a bottleneck in availability and will add latency in the publish path.
- Your own servers will need to handle the scale you operate at for realtime publishes.
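
As a rough sketch of this publish path, the handler below validates an incoming message, publishes it, and then stores it. Everything here is hypothetical: `sendToRoom` and `saveMessage` are placeholders you would inject (for example, a wrapper around the Chat SDK or the Chat REST API, and your database layer), the length limit is an arbitrary application rule, and the sketch assumes the publish call resolves with an object that exposes the message `serial`.

<Code>
```javascript
// Hypothetical server-side handler for publishing via your own servers.
// `sendToRoom` and `saveMessage` are injected placeholders, not Ably APIs.
const MAX_TEXT_LENGTH = 2000; // assumed application limit

// Return an error string for invalid text, or null when the text is acceptable.
const validateText = (text) => {
  if (typeof text !== 'string' || text.trim() === '') return 'text must be a non-empty string';
  if (text.length > MAX_TEXT_LENGTH) return `text must be at most ${MAX_TEXT_LENGTH} characters`;
  return null;
};

const handleSendMessage = async ({ sendToRoom, saveMessage }, { roomName, clientId, text }) => {
  const error = validateText(text);
  if (error) return { ok: false, error };

  // Publish first so the stored record can carry the `serial` of the published message.
  const published = await sendToRoom(roomName, { clientId, text });
  await saveMessage({ roomName, clientId, text, serial: published.serial });
  return { ok: true, serial: published.serial };
};
```
</Code>

Publishing before saving means a failed save leaves a message in Ably that is missing from your database, so a real implementation would need a retry or reconciliation strategy for that case.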

## Using the Chat History endpoint

You can fetch the message history of a chat room using the [Chat History endpoint](/docs/api/chat-rest#tag/rooms/paths/~1chat~1%7Bversion%7D~1rooms~1%7BroomName%7D~1messages/get) or the [Chat SDK](/docs/chat/rooms/history). The chat room history endpoint is a paginated HTTP endpoint that allows you to retrieve messages from a chat room. The Chat SDK provides a convenient way to fetch the history of a chat room.

If your use case is to archive chats that have ended, such as to export the chat history of a support ticket that is closed, you can use the chat history endpoint to export the messages to your own system. Read the docs on [chat history](/docs/chat/rooms/history) for more details.

The intended use of the chat history endpoint is to retrieve messages for pre-filling a chat window, and not for ingesting messages into other systems. As a result, there are some important things to consider:

- The history endpoint is not a changelog, it is a snapshot of the messages in the room at the time the request is made.
- The history API returns messages in their canonical global order (sorted by `serial`).
- For each message, only the latest version of the message is returned.
- You will need to decide when and which rooms to import messages from.
- You can import the same room multiple times (deduplicate by `serial` and `version.serial`), but you will need to always fetch from the first message to make sure you don't miss any updates or deletes of older messages.
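
A full export via history comes down to walking every page from the first message onwards and passing each message to your storage logic. The sketch below assumes the page object exposes `items`, `hasNext()`, and `next()`, which is intended to mirror the paginated result returned by the Chat SDK; check the [chat history](/docs/chat/rooms/history) docs for the exact shape and for the options that control ordering and page size. The `exportHistory` name and the `save` callback are made up for this example.

<Code>
```javascript
// Walk every page of a room's history and hand each message to `save`.
// `firstPage` is assumed to expose `items`, `hasNext()` and `next()`.
const exportHistory = async (firstPage, save) => {
  let page = firstPage;
  let count = 0;
  while (page) {
    for (const message of page.items) {
      await save(message); // e.g. the saveOrUpdateMessage function shown earlier
      count += 1;
    }
    // Move to the next page, or stop when there are no more pages.
    page = page.hasNext() ? await page.next() : null;
  }
  return count;
};
```
</Code>

With the Chat SDK, `firstPage` would typically be the result of a `room.messages.history(...)` call, and `save` could be the `saveOrUpdateMessage` function from earlier so that repeated imports of the same room are deduplicated.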

Contributor

Also, every message retrieved via the history API is a billable history, so implementing this is not free.

Contributor Author

Should I add a note about this? I think perhaps this is better suited for the history page or pricing docs?

Contributor

Every message sent via an integration is too - so history isn't alone in this regard

For use cases where there is a clear start and end of the chat, exporting the chat via history requests is a simple, reliable solution. If there is no clear start and end for chats, please consider using one of the other methods mentioned in this guide.