diff --git a/src/data/nav/chat.ts b/src/data/nav/chat.ts
index 0a1d08d6e9..fd918d6413 100644
--- a/src/data/nav/chat.ts
+++ b/src/data/nav/chat.ts
@@ -193,6 +193,10 @@ export default {
           name: 'Livestream chat',
           link: '/docs/guides/chat/build-livestream',
         },
+        {
+          name: 'Export chat messages',
+          link: '/docs/guides/chat/export-chat',
+        },
       ],
     },
   ],
diff --git a/src/pages/docs/guides/chat/export-chat.mdx b/src/pages/docs/guides/chat/export-chat.mdx
new file mode 100644
index 0000000000..4705f2a698
--- /dev/null
+++ b/src/pages/docs/guides/chat/export-chat.mdx
@@ -0,0 +1,134 @@
+---
+title: "Guide: Export chat data to your own systems"
+meta_description: "Learn how to export chat data from Ably Chat to your own systems."
+meta_keywords: "chat, data, export, stream, storage, Ably, chat SDK, realtime messaging, dependability, cost optimisation"
+---
+
+Ably Chat is designed to be a simple and easy-to-use realtime chat solution that handles any scale, from 1:1 and small group chats to large livestream chats with millions of users.
+
+Ably holds data for the purpose of providing realtime experiences. While Ably Chat provides flexible data retention for messages (30 days by default, up to a year on request), applications often need longer-term storage or additional control over their data.
+
+This guide presents different ways to store Ably Chat data in your own systems, which can help you meet your data retention requirements and build more complex use cases such as search and analytics.
+
+## Different ways to export data from Ably Chat
+
+This section is an overview of the different ways to export data from Ably Chat. Each option is explained in detail later in the guide.
+
+This article covers the following options:
+
+1. Using [outbound webhooks](/docs/platform/integrations/webhooks): an [HTTP endpoint](/docs/platform/integrations/webhooks/generic), [AWS Lambda](/docs/platform/integrations/webhooks/lambda), and others.
+2. Using [outbound streaming](/docs/platform/integrations/streaming): stream to your own [Kafka](/docs/platform/integrations/streaming/kafka), [Kinesis](/docs/platform/integrations/streaming/kinesis), and others.
+3. Using an [Ably queue](/docs/platform/integrations/queues).
+4. Publishing via your own servers.
+5. Using the Chat History endpoint.
+
+## Decoding and storing messages
+
+Regardless of the delivery mechanism, you will need to decode the received messages into Chat messages. Details of the mapping from Ably Pub/Sub messages to Chat messages are available in the [Integrations](/docs/chat/integrations) documentation.
+
+Once you have decoded a payload into a chat `Message` object, you can:
+
+1. Save it to your own database. Index by `serial`: this is the globally unique identifier for a message, and it is also used to sort messages in the canonical global order.
+2. If a message with the same `serial` already exists, you have received an update, delete, or reaction summary update. To check whether you need to update the stored message, compare the `version.serial` of the received message with the `version.serial` of the message in your database. A lexicographically higher value means a newer version.
+3. If you want to store reaction summaries, always update the reactions field when receiving a reaction summary update (action `4` or `message.summary`).
+
+```javascript
+const saveOrUpdateMessage = async (message) => {
+  // Check if the message already exists in your own database by `serial`
+  const existingMessage = await getMessageBySerial(message.serial);
+  if (!existingMessage) {
+    // message not yet in your database => save it
+    await saveMessage(message);
+    return;
+  }
+
+  if (message.version.serial < existingMessage.version.serial) {
+    // if received version is older, discard
+    return;
+  } else if (message.version.serial === existingMessage.version.serial && message.action !== 'message.summary') {
+    // if the message is the same version, and the action is not a summary, discard
+    return;
+  }
+
+  // message is newer or it's a summary event => update the message
+  await updateMessage(existingMessage, message);
+};
+```
+
+
+## Using a webhook via integration rules
+
+Ably can forward messages to your own system via a webhook. This is the simplest option to set up if you don't already have other systems in place for message ingestion. This section covers the simple HTTP endpoint webhook, but the same principles apply to other webhook integrations such as AWS Lambda, Azure Function, Google Function, and others.
+
+Read the guide on [outbound webhooks](/docs/platform/integrations/webhooks) for more details on how to set up the webhook with Ably for the platform of your choice.
+
+All webhook integrations allow you to use a regex filter on the channel name to control which channels the webhook should be triggered for. Use a common prefix in the name of chat rooms that you want to trigger a webhook for, and use the prefix as the filter.
+
+Use `channel.message` as the event type.
+
+You need to consider:
+- Redundancy. In case of failure, Ably will retry delivering the message to your webhook, but only for a short period of time. You can see errors in the [`[meta]log` channel](/docs/platform/errors#meta).
+- Ordering. Messages can arrive out of order. This is mitigated by the fact that they are globally sortable by their `serial` and `version.serial`. In rare cases, this can cause inconsistencies for reaction summaries if those are of interest to you.
+- Consistency. Missing webhook calls will lead to inconsistencies between your database and Ably, which can be difficult to resolve.
+- [At-least-once delivery](/docs/platform/architecture/idempotency#protocol-support-for-exactly-once-delivery). You need to be able to handle duplicate messages. Deduplication can be done by checking `serial` and `version.serial`.
+
+## Using outbound streaming
+
+Ably can stream messages directly to your own queueing or streaming service, such as Kinesis, Kafka, AMQP, or SQS. Read the guide on [outbound streaming](/docs/platform/integrations/streaming) for more details on how to set up the streaming integration with Ably for the service of your choice.
+
+Pros:
+- Use your existing queue system to process and save messages from Ably.
+- You have control over saving messages to your own database.
+
+You need to consider:
+- You need to maintain and be responsible for a reliable queue system. If you don't already have such a system, it increases complexity on your end.
+- Consistency. If your queue system is not reachable, you will lose messages. Errors can be seen in the [`[meta]log` channel](/docs/platform/errors#meta).
+
+## Using an Ably queue
+
+Ably can forward messages from chat room channels to an Ably Queue, which you can then consume from your own servers to save messages to your own database. Read the guide on [Ably queues](/docs/platform/integrations/queues) for more details on how to set up the queue integration with Ably.
+
+Ably ensures that each message is delivered to only one consumer even if multiple consumers are connected.
+
+Benefits of using an Ably queue:
+
+- You can consume it from your servers, meaning overall this is fault-tolerant. Ably takes care of the complexity of maintaining a queue.
+- You can use multiple queues and configure which channels go to which queue via regex filters on the channel name.
+- If your systems suffer any downtime, you will not miss anything (up to the queue max size).
+
+You need to consider:
+- During peak times you may need to scale up your consumers to avoid overloading the queue past the maximum queue length allowed.
+- Each message has a TTL in the queue.
+- Oldest messages are dropped if the maximum queue length is exceeded. Check the [dead letter queue](/docs/platform/integrations/queues#deadletter) to see if this is happening.
+
+## Publishing via your own servers
+
+Change the publish path: instead of publishing Chat messages, updates, and deletes to Ably directly, proxy them through your own server. This gives you the opportunity to save the messages as they are produced, and to apply different validation schemes if needed.
+
+Benefits:
+- Full control over publishing.
+- Opportunity to add extra validation before publishing to Ably.
+- You can publish messages directly via the Chat REST API, and avoid having to encode/decode Chat Messages to and from Ably Pub/Sub messages. You can bypass using an SDK entirely or you can use the Chat SDK for publishing.
+
+You need to consider:
+- You need to handle updates and deletes on your own.
+- Storing message reactions can be difficult since you will not have access to the aggregates (summaries) Ably provides.
+- Your own servers are in the middle of the message publish path, so they can become a bottleneck in availability and will add latency in the publish path.
+- Your own servers will need to handle the scale you operate at for realtime publishes.
+
+## Using the Chat History endpoint
+
+You can fetch the message history of a chat room using the [Chat History endpoint](/docs/api/chat-rest#tag/rooms/paths/~1chat~1%7Bversion%7D~1rooms~1%7BroomName%7D~1messages/get) or the [Chat SDK](/docs/chat/rooms/history). The chat room history endpoint is a paginated HTTP endpoint that allows you to retrieve messages from a chat room. The Chat SDK provides a convenient way to fetch the history of a chat room.
+
+If your use case is to archive chats that have ended, such as exporting the chat history of a support ticket that is closed, you can use the chat history endpoint to export the messages to your own system. Read the docs on [chat history](/docs/chat/rooms/history) for more details.
+
+The intended use of the chat history endpoint is to retrieve messages for pre-filling a chat window, and not for ingesting messages into other systems. As a result, there are some important things to consider:
+
+- The history endpoint is not a changelog; it is a snapshot of the messages in the room at the time the request is made.
+- The history API returns messages in their canonical global order (sorted by `serial`).
+- For each message, only the latest version of the message is returned.
+- You will need to decide when and which rooms to import messages from.
+- You can import the same room multiple times (deduplicate by `serial` and `version.serial`), but you will need to always fetch from the first message to make sure you don't miss any updates or deletes of older messages.
+
+For use cases where there is a clear start and end of the chat, exporting the chat via history requests is a simple, reliable solution. If there is no clear start and end for chats, please consider using one of the other methods mentioned in this guide.
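
The repeated-import approach described for the history endpoint could be sketched roughly as follows. This is an illustration under assumptions, not Ably's API: `fetchPage` is a hypothetical callback standing in for whichever history call you use (REST or SDK), assumed to return `{ items, next }` where `next` is `null` on the last page; `store` is a plain `Map` standing in for your database; `makeInMemoryPager` exists only so the sketch can be tried without a network.

```javascript
// Walks every page of a room's history, starting from the first message,
// and upserts each message into `store`, keyed by `serial`.
// ASSUMPTION: `fetchPage(cursor)` is a hypothetical callback, not an Ably API.
// It is expected to resolve to { items: Message[], next: cursor | null }.
async function exportRoomHistory(fetchPage, store) {
  let cursor = null; // null means "start from the first page"
  let written = 0;
  do {
    const page = await fetchPage(cursor);
    for (const message of page.items) {
      const existing = store.get(message.serial);
      // History returns only the latest version of each message, so a
      // lexicographically higher `version.serial` means the stored copy is stale.
      if (!existing || message.version.serial > existing.version.serial) {
        store.set(message.serial, message);
        written += 1;
      }
    }
    cursor = page.next;
  } while (cursor !== null);
  return written; // number of messages inserted or updated in this run
}

// Hypothetical in-memory pager for local experiments: `pages` is an array of
// message arrays, and the cursor is simply the index of the next page.
function makeInMemoryPager(pages) {
  return async (cursor) => {
    const index = cursor ?? 0;
    return {
      items: pages[index] ?? [],
      next: index + 1 < pages.length ? index + 1 : null,
    };
  };
}
```

Running `exportRoomHistory` a second time against the same room only rewrites messages whose `version.serial` has advanced, which is why re-importing from the first message is safe, if not cheap.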