chat: bring own database guide #2949
base: main
Changes from all commits
7f8f974
63f4414
b328602
50da9da
838b68a
7fb2c92
0ef556f
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,134 @@ | ||||||
| --- | ||||||
| title: "Guide: Export chat data to your own systems" | ||||||
| meta_description: "Learn how to export chat data from Ably Chat to your own systems." | ||||||
| meta_keywords: "chat, data, export, stream, storage, Ably, chat SDK, realtime messaging, dependability, cost optimisation" | ||||||
| --- | ||||||
|
|
||||||
| Ably Chat is designed to be a simple and easy to use realtime chat solution that handles any scale from 1:1 and small group chats to large livestream chats with millions of users. | ||||||
|
|
||||||
| Ably holds data for the purpose of providing realtime experiences. While Ably Chat provides flexible data retention for messages (30 days by default, up to a year on request), applications often need longer-term storage or additional control over their data. | ||||||
|
Contributor
Suggested change
More a nit pick, so feel free to ignore :) |
||||||
|
|
||||||
| This guide presents different ways to use Ably Chat and store chat data in your own systems, which can help you meet your data retention requirements, as well as help you build more complex use cases such as search, analytics, and more. | ||||||
|
|
||||||
| ## Different ways to export data from Ably Chat | ||||||
|
|
||||||
| This section gives an overview of the different ways to export data from Ably Chat. Each option is explained in detail later in this guide. | ||||||
|
Contributor
I think this line isn't needed, Perhaps just the line, |
||||||
|
|
||||||
| This article covers the following options: | ||||||
Contributor
Suggested change
|
||||||
|
|
||||||
| 1. Using [outbound webhooks](/docs/platform/integrations/webhooks). [HTTP endpoint](/docs/platform/integrations/webhooks/generic), [AWS Lambda](/docs/platform/integrations/webhooks/lambda), and others. | ||||||
| 2. Using [outbound streaming](/docs/platform/integrations/streaming). Stream to your own [Kafka](/docs/platform/integrations/streaming/kafka), [Kinesis](/docs/platform/integrations/streaming/kinesis), and others. | ||||||
| 3. Using an [Ably queue](/docs/platform/integrations/queues). | ||||||
| 4. Publishing via your own servers. | ||||||
| 5. Using the Chat History endpoint. | ||||||
|
|
||||||
| ## Decoding and storing messages | ||||||
|
|
||||||
| Regardless of the delivery mechanism, you will need to decode the received messages into Chat messages. Details of the mapping from Ably Pub/Sub messages to Chat messages are available in the [Integrations](/docs/chat/integrations) documentation. | ||||||
|
Contributor
This part is confusing for me. If I'm an Ably Chat user, using the Ably Chat SDK... why would I have to decode anything? I should be already managing Chat messages, right?
Contributor
Author
This is where we are right now. Integrations (webhooks, outbound streaming, etc.) are Pub/Sub features, so you need to do the encoding/decoding yourself. In the future it would be nice if those were chat-specific to make it simpler, but they're not right now.
Contributor
The future agreed direction is that the actual payloads will be the same across product, but that the SDKs (e.g. Chat) will have methods that convert these to their local representations |
||||||
|
|
||||||
| Once you have decoded the payload into a chat `Message` object, you can: | ||||||
|
Contributor
Suggested change
|
||||||
|
|
||||||
| 1. Save it to your own database. You can index by `serial`, which is the globally unique identifier for a message and is also used to sort messages in the canonical global order. | ||||||
| 2. If a message with the same `serial` already exists, you have received an update, delete, or reaction summary update. To check whether you need to update the stored message, compare the `version.serial` of the received message with the `version.serial` of the message in your database. A lexicographically higher `version.serial` means a newer version. | ||||||
| 3. If you want to store reaction summaries, always update the reactions field when receiving a reaction summary update (action `4` or `message.summary`). | ||||||
|
Contributor
What is a "reaction summary update"? Is it an special message type, with different metadata? Is there a link to related documentation?
Contributor
Author
It's a message with action The chat SDK still exposes these via In Pub/Sub you get them via Over integrations they look like messages with action Not sure what doc to point you to, probably https://ably.com/docs/chat/rooms/message-reactions, and maybe also annotations: https://ably.com/docs/messages/annotations#subscribe-to-annotation-summaries-.
Contributor
Perhaps we should add some links here to direct users to sections on message serials/versions? |
||||||
|
|
||||||
| <Code> | ||||||
| ```javascript | ||||||
| const saveOrUpdateMessage = async (message) => { | ||||||
| // Check if the message already exists in your own database by `serial` | ||||||
| const existingMessage = await getMessageBySerial(message.serial); | ||||||
|
||||||
| if (!existingMessage) { | ||||||
| // message not yet in your database => save it | ||||||
| await saveMessage(message); | ||||||
|
||||||
| return; | ||||||
| } | ||||||
|
|
||||||
| if (message.version.serial < existingMessage.version.serial) { | ||||||
| // if received version is older, discard | ||||||
| return; | ||||||
| } else if (message.version.serial === existingMessage.version.serial && message.action !== 'message.summary') { | ||||||
| // if the message is the same version, and the action is not a summary, discard | ||||||
| return; | ||||||
| } | ||||||
|
|
||||||
| // message is newer or it's a summary event => update the message | ||||||
| await updateMessage(existingMessage, message); | ||||||
| }; | ||||||
| ``` | ||||||
| </Code> | ||||||
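The snippet above leaves `getMessageBySerial`, `saveMessage`, and `updateMessage` undefined. As a rough sketch only, the same decision logic can be exercised against an in-memory `Map` standing in for a database. The names `decideAction`, `applyMessage`, and `store` are hypothetical, and the `reactions` field name is an assumption taken from the wording of step 3, not a confirmed payload field.

```javascript
// Hypothetical helper: decides what to do with an incoming message, given the
// copy (if any) already stored under the same `serial`.
const decideAction = (existing, incoming) => {
  if (!existing) return 'insert';
  // Serials are compared as plain strings: lexicographically higher means newer.
  if (incoming.version.serial < existing.version.serial) return 'discard';
  if (
    incoming.version.serial === existing.version.serial &&
    incoming.action !== 'message.summary'
  ) {
    return 'discard'; // same version and not a summary: nothing new to store
  }
  return 'update';
};

// In-memory stand-in for a database table keyed by `serial` (illustration only).
const store = new Map();

const applyMessage = (incoming) => {
  const existing = store.get(incoming.serial);
  const action = decideAction(existing, incoming);
  if (action === 'insert') {
    store.set(incoming.serial, incoming);
  } else if (action === 'update') {
    const merged =
      incoming.action === 'message.summary'
        ? { ...existing, reactions: incoming.reactions } // assumed field name
        : { ...existing, ...incoming };
    store.set(incoming.serial, merged);
  }
  return action;
};
```

Keeping the decision in a pure function makes it straightforward to unit test separately from whichever database client you use.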
|
|
||||||
| ## Using a webhook via integration rules | ||||||
|
|
||||||
| Ably can forward messages to your own system via a webhook. This is the simplest option to set up if you don't already have other systems in place for message ingestion. This section covers the simple HTTP endpoint webhook, but the same principles apply to other webhook integrations such as AWS Lambda, Azure Function, Google Function, and others. | ||||||
|
|
||||||
| Read the guide on [outbound webhooks](/docs/platform/integrations/webhooks) for more details on how to set up the webhook with Ably for the platform of your choice. | ||||||
|
|
||||||
| All webhook integrations allow you to use a regex filter on the channel name to control which channels the webhook should be triggered for. Use a common prefix in the name of chat rooms that you want to trigger a webhook for, and use the prefix as the filter. | ||||||
|
|
||||||
| Use `channel.message` as the event type. | ||||||
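As an illustration of the prefix approach, the sketch below builds a filter string and checks it locally against a few names. The `support:` prefix and the helper names are hypothetical. The pattern is anchored only at the start because the channel name behind a chat room may carry additional characters after the room name; confirm the exact channel naming in the integrations docs before relying on it.

```javascript
// Hypothetical convention: every room that should be exported is named "support:<id>".
const EXPORT_PREFIX = 'support:';

// Filter string to enter in the integration rule's channel filter field.
const channelFilter = `^${EXPORT_PREFIX}.*`;

// Local check of which names the filter would match.
const matchesFilter = (channelName) => new RegExp(channelFilter).test(channelName);
```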
|
|
||||||
| You need to consider: | ||||||
| - Redundancy. In case of failure, Ably will retry delivering the message to your webhook, but only for a short period of time. You can see errors in the [`[meta]log` channel](/docs/platform/errors#meta). | ||||||
| - Ordering. Messages can arrive out of order. This is mitigated by the fact that they are globally sortable by their `serial` and `version.serial`. In rare cases, out-of-order delivery can cause inconsistencies in reaction summaries, if those are of interest to you. | ||||||
|
Contributor
We should call out at-least-once semantics here
Contributor
Author
I can't spot any reference to that in the platform docs for integrations, so maybe leave it out? I'm thinking this kind of stuff shouldn't come from chat guides and we already cover how to deduplicate by serial. WDYT?
Contributor
Discussed briefly here in relation to Lambda: https://ably.com/docs/platform/architecture/idempotency#protocol-support-for-exactly-once-delivery Also because there's retries you do have to account for it |
||||||
| - Consistency. Missing webhook calls will lead to inconsistencies between your database and Ably, which can be difficult to resolve. | ||||||
| - [At-least-once delivery](/docs/platform/architecture/idempotency#protocol-support-for-exactly-once-delivery). You need to be able to handle duplicate messages. Deduplication can be done by checking `serial` and `version.serial`. | ||||||
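The considerations above can be sketched as a small handler that unpacks a webhook body and skips redelivered events. This is a sketch under assumptions, not the documented payload contract: it assumes an enveloped body of the form `{ items: [{ source, data: { channelId, messages } }] }` and Pub/Sub messages carrying `serial`, `version.serial`, and `action`. The function names and the in-memory `seen` set are hypothetical; in practice the check would run against your database.

```javascript
// Assumed enveloped webhook body (verify against the webhooks docs):
// { items: [ { source: 'channel.message', data: { channelId, messages: [...] } } ] }
const extractMessages = (body) => {
  const out = [];
  for (const item of body.items ?? []) {
    if (item.source !== 'channel.message') continue; // ignore other event types
    const { channelId, messages = [] } = item.data ?? {};
    for (const message of messages) out.push({ channelId, message });
  }
  return out;
};

// At-least-once delivery: the same event may arrive more than once.
// Illustration only: a real system would check its database instead of a Set.
const seen = new Set();

const isFirstDelivery = (message) => {
  // Summary events (action `4` or `message.summary`) are always processed.
  const isSummary = message.action === 4 || message.action === 'message.summary';
  if (isSummary) return true;
  const key = `${message.serial}|${message.version?.serial}`;
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
};

// Returns only the messages that still need to be decoded and stored.
const handleWebhook = (body) =>
  extractMessages(body).filter(({ message }) => isFirstDelivery(message));
```

Each returned entry would then be decoded into a chat message and passed to a function such as `saveOrUpdateMessage`.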
|
|
||||||
| ## Using outbound streaming | ||||||
|
|
||||||
| Ably can stream messages directly to your own queueing or streaming service: Kinesis, Kafka, AMQP, SQS. Read the guide on [outbound streaming](/docs/platform/integrations/streaming) for more details on how to set up the streaming integration with Ably for the service of your choice. | ||||||
|
|
||||||
| Pros: | ||||||
| - Use your existing queue system to process and save messages from Ably. | ||||||
| - You have control over saving messages to your own database. | ||||||
|
|
||||||
| You need to consider: | ||||||
|
||||||
| - You need to maintain and be responsible for a reliable queue system. If you don't already have such a system it increases complexity on your end. | ||||||
| - Consistency. If your queue system is not reachable, you will lose messages. Errors can be seen in the [`[meta]log` channel](/docs/platform/errors#meta). | ||||||
|
|
||||||
| ## Using an Ably queue | ||||||
|
|
||||||
| Ably can forward messages from chat room channels to an Ably Queue, which you can then consume from your own servers to save messages to your own database. Read the guide on [Ably queues](/docs/platform/integrations/queues) for more details on how to set up the queue integration with Ably. | ||||||
|
|
||||||
| Ably ensures that each message is delivered to only one consumer even if multiple consumers are connected. | ||||||
|
|
||||||
| Benefits of using an Ably queue: | ||||||
|
|
||||||
| - You can consume it from your servers, meaning overall this is fault-tolerant. Ably takes care of the complexity of maintaining a queue. | ||||||
| - You can use multiple queues and configure which channels go to which queue via regex filters on the channel name. | ||||||
| - If your systems suffer any downtime, you will not miss anything (up to the queue max size). | ||||||
|
||||||
|
|
||||||
| You need to consider: | ||||||
| - During peak times you may need to scale up your consumers to avoid overloading the queue past the maximum queue length allowed. | ||||||
| - Each message has a TTL in the queue. | ||||||
|
||||||
| - Oldest messages are dropped if the maximum queue length is exceeded. Check the [dead letter queue](/docs/platform/integrations/queues#deadletter) to see if this is happening. | ||||||
|
|
||||||
| ## Publishing via your own servers | ||||||
|
|
||||||
| Change the publish path: instead of publishing Chat messages, updates, and deletes to Ably directly, proxy them through your own server. This gives you the opportunity to also save the messages as they are produced, and also apply different validation schemes if needed. | ||||||
|
|
||||||
| Benefits: | ||||||
| - Full control over publishing. | ||||||
| - Opportunity to add extra validation before publishing to Ably. | ||||||
| - You can publish messages directly via the Chat REST API, and avoid having to encode/decode Chat Messages to and from Ably Pub/Sub messages. You can bypass using an SDK entirely or you can use the Chat SDK for publishing. | ||||||
|
Contributor
I'd say just use the chat SDK, all the encode/decode part for me (as an external developer) sounds confusing.
Contributor
Author
Avoiding to have to do it is a benefit when you publish via your own servers or fetch via history. But we don't have this benefit in the other methods of integrating right now. I'd leave this paragraph in, perhaps mention this in the next section as well. WDYT?
Contributor
The main problem I see is that for me, as a developer that wants to use Ably Chat, is difficult to understand why should I care about PubSub... I want to use the chat, how is that implemented in Ably should be transparent for me. I see that I have to encode or decode from PubSub and that sounds complex and unnecessary. I just want to manage my messages.
Contributor
We have a progressive disclosure of complexity that we aim for - there will be some use-cases where we have to introduce Pub/Sub to the mix. In this case, once we do the SDK changes to take integration payloads as they are and turn them directly into chat messages... people won't need to worry about Pub/Sub. |
||||||
|
|
||||||
| You need to consider: | ||||||
| - You need to handle updates and deletes on your own. | ||||||
| - Storing message reactions can be difficult since you will not have access to the aggregate (summaries) Ably provides. | ||||||
|
Contributor
Do we offer an API for this in the SDK or REST APIs?
Contributor
Author
You can:
But in this context they're all a big annoyance since you need to actively fetch instead and you need to decide when to do it. |
||||||
| - Your own servers are in the middle of the message publish path, so they can become a bottleneck in availability and will add latency in the publish path. | ||||||
| - Your own servers will need to handle the scale you operate at for realtime publishes. | ||||||
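A minimal sketch of the proxy idea follows, with every dependency injected so that no particular SDK or REST call is implied. `makePublishHandler`, `validate`, `publishToAbly`, and `saveMessage` are all hypothetical names; the sketch assumes the publish call resolves with at least the `serial` that Ably assigned to the message.

```javascript
// Hypothetical factory for a server-side publish handler.
// - validate(draft): returns an error string, or null when the draft is acceptable
// - publishToAbly(roomName, draft): publishes via the Chat SDK or Chat REST API
//   (assumed to resolve with an object containing the assigned `serial`)
// - saveMessage(record): writes the record to your own database
const makePublishHandler = ({ validate, publishToAbly, saveMessage }) =>
  async (roomName, draft) => {
    const problem = validate(draft);
    if (problem) return { ok: false, error: problem };

    // Publish first so that the stored record carries the serial assigned by Ably.
    const published = await publishToAbly(roomName, draft);
    const record = { roomName, ...draft, ...published };
    await saveMessage(record);
    return { ok: true, message: record };
  };
```

Publishing before saving keeps the stored `serial` authoritative, at the cost of a message being live but unsaved if the database write fails; a production handler would need a retry or reconciliation path for that case.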
|
|
||||||
| ## Using the Chat History endpoint | ||||||
|
|
||||||
| You can fetch the message history of a chat room using the [Chat History endpoint](/docs/api/chat-rest#tag/rooms/paths/~1chat~1%7Bversion%7D~1rooms~1%7BroomName%7D~1messages/get) or the [Chat SDK](/docs/chat/rooms/history). The chat room history endpoint is a paginated HTTP endpoint that allows you to retrieve messages from a chat room. The Chat SDK provides a convenient way to fetch the history of a chat room. | ||||||
|
|
||||||
| If your use case is to archive chats that have ended, such as to export the chat history of a support ticket that is closed, you can use the chat history endpoint to export the messages to your own system. Read the docs on [chat history](/docs/chat/rooms/history) for more details. | ||||||
|
|
||||||
| The intended use of the chat history endpoint is to retrieve messages for pre-filling a chat window, and not for ingesting messages into other systems. As a result, there are some important things to consider: | ||||||
|
|
||||||
| - The history endpoint is not a changelog, it is a snapshot of the messages in the room at the time the request is made. | ||||||
| - The history API returns messages in their canonical global order (sorted by `serial`). | ||||||
| - For each message, only the latest version of the message is returned. | ||||||
| - You will need to decide when and which rooms to import messages from. | ||||||
| - You can import the same room multiple times (deduplicate by `serial` and `version.serial`), but you will need to always fetch from the first message to make sure you don't miss any updates or deletes of older messages. | ||||||
|
|
||||||
|
Contributor
Also, every message retrieved via the history API is a billable history, so implementing this is not free.
Contributor
Author
Should I add a note about this? I think perhaps this is better suited for the history page or pricing docs?
Contributor
Every message sent via an integration is too - so history isn't alone in this regard |
||||||
| For use cases where there is a clear start and end of the chat, exporting the chat via history requests is a simple, reliable solution. If there is no clear start and end for chats, please consider using one of the other methods mentioned in this guide. | ||||||
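For the archive-on-close case, the export can be sketched as a loop over paginated results that always starts from the first page. The sketch assumes each page exposes `items`, `hasNext()`, and `next()`; check the history docs for the exact paginated result shape. `exportRoomHistory` and `fetchFirstPage` are hypothetical names, and `saveOrUpdateMessage` refers to a function like the one shown earlier in this guide.

```javascript
// Walks every page of a room's history and hands each message to the
// provided save function. Returns the number of messages processed.
// `fetchFirstPage` is a hypothetical callback that performs the first
// history request, via the Chat SDK or the REST endpoint.
const exportRoomHistory = async (fetchFirstPage, saveOrUpdateMessage) => {
  let page = await fetchFirstPage();
  let count = 0;
  while (page) {
    for (const message of page.items) {
      // Deduplication by `serial` and `version.serial` happens inside the
      // save function, so re-running the export for a room is safe.
      await saveOrUpdateMessage(message);
      count += 1;
    }
    page = page.hasNext() ? await page.next() : null;
  }
  return count;
};
```

Because history is a snapshot and not a changelog, the loop restarts from the first page on every run instead of resuming from the last stored `serial`.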