defmodule RAG.DataCollector do
  # Fetches every Markdown file listed at the given GitHub "contents" API URLs
  # and returns a flat list of chunk texts.
  def fetch_and_chunk_docs(urls) do
    Enum.flat_map(urls, &process_directory/1)
  end

  defp process_directory(url) do
    # Turns one entry of the directory listing into a list of chunk texts.
    extract_chunks = fn file ->
      case file do
        %{"type" => "file", "name" => name, "download_url" => download_url} ->
          if String.ends_with?(name, ".md") do
            Req.get!(download_url).body
            |> TextChunker.split(format: :markdown)
            |> Enum.map(&Map.get(&1, :text))
          else
            []
          end

        _ ->
          []
      end
    end

    # The contents API answers with a JSON array, which Req decodes into a list of maps.
    Req.get!(url).body
    |> Enum.flat_map(extract_chunks)
  end
end

guides = [
  "https://api.github.com/repos/phoenixframework/phoenix_live_view/contents/guides/server",
  "https://api.github.com/repos/phoenixframework/phoenix_live_view/contents/guides/client"
]
chunks = RAG.DataCollector.fetch_and_chunk_docs(guides)

["# Assigns and HEEx templates\n\nAll of the data in a LiveView is stored in the socket, which is a server \nside struct called `Phoenix.LiveView.Socket`. Your own data is stored\nunder the `assigns` key of said struct. The server data is never shared\nwith the client beyond what your template renders.\n\nPhoenix template language is called HEEx (HTML+EEx). EEx is Embedded \nElixir, an Elixir string template engine. Those templates\nare either files with the `.heex` extension or they are created\ndirectly in source files via the `~H` sigil. You can learn more about\nthe HEEx syntax by checking the docs for [the `~H` sigil](`Phoenix.Component.sigil_H/2`).\n\nThe `Phoenix.Component.assign/2` and `Phoenix.Component.assign/3`\nfunctions help store those values. Those values can be accessed\nin the LiveView as `socket.assigns.name` but they are accessed\ninside HEEx templates as `@name`.\n\nIn this section, we are going to cover how LiveView minimizes\nthe payload over the wire by understanding the interplay between\nassigns and templates.\n",
"\n## Change tracking\n\nWhen you first render a `.heex` template, it will send all of the\nstatic and dynamic parts of the template to the client. Imagine the\nfollowing template:\n\n```heex\n<h1><%= expand_title(@title) %></h1>\n```\n\nIt has two static parts, `<h1>` and `</h1>` and one dynamic part\nmade of `expand_title(@title)`. Further rendering of this template\nwon't resend the static parts and it will only resend the dynamic\npart if it changes.\n\nThe tracking of changes is done via assigns. If the `@title` assign\nchanges, then LiveView will execute the dynamic parts of the template,\n`expand_title(@title)`, and send\nthe new content. If `@title` is the same, nothing is executed and\nnothing is sent.\n\nChange tracking also works when accessing map/struct fields.\nTake this template:\n\n```heex\n<div id={\"user_\#{@user.id}\"}>\n <%= @user.name %>\n</div>\n```\n\nIf the `@user.name` changes but `@user.id` doesn't, then LiveView\nwill re-render only `@user.name` and it will not execute or resend `@user.id`\nat all.\n\nThe change tracking also works when rendering other templates as\nlong as they are also `.heex` templates:\n\n```heex\n<%= render \"child_template.html\", assigns %>\n```\n\nOr when using function components:\n\n```heex\n<.show_name name={@user.name} />\n```\n\nThe assign tracking feature also implies that you MUST avoid performing\ndirect operations in the template. For example, if you perform a database\nquery in your template:\n\n```heex\n<%= for user <- Repo.all(User) do %>\n <%= user.name %>\n<% end %>\n```\n\nThen Phoenix will never re-render the section above, even if the number of\nusers in the database changes. Instead, you need to store the users as\nassigns in your LiveView before it renders the template:\n\n assign(socket, :users, Repo.all(User))\n\nGenerally speaking, **data loading should never happen inside the template**,\nregardless if you are using LiveView or not. The difference is that LiveView\nenforces this best practice.\n",
"\n## Pitfalls\n\nThere are some common pitfalls to keep in mind when using the `~H` sigil\nor `.heex` templates inside LiveViews.\n\n### Variables\n\nDue to the scope of variables, LiveView has to disable change tracking\nwhenever variables are used in the template, with the exception of\nvariables introduced by Elixir block constructs such as `case`,\n`for`, `if`, and others. Therefore, you **must avoid** code like\nthis in your HEEx templates:\n\n```heex\n<% some_var = @x + @y %>\n<%= some_var %>\n```\n\nInstead, use a function:\n\n```heex\n<%= sum(@x, @y) %>\n```\n\nSimilarly, **do not** define variables at the top of your `render` function\nfor LiveViews or LiveComponents. Since LiveView cannot track `sum` or `title`,\nif either value changes, both must be re-rendered by LiveView.\n\n def render(assigns) do\n sum = assigns.x + assigns.y\n title = assigns.title\n\n ~H\"\"\"\n <h1><%= title %></h1>\n\n <%= sum %>\n \"\"\"\n end\n\nInstead use the `assign/2`, `assign/3`, `assign_new/3`, and `update/3`\nfunctions to compute it. Any assign defined or updated this way will be marked as\nchanged, while other assigns like `@title` will still be tracked by LiveView.\n\n assign(assigns, sum: assigns.x + assigns.y)\n\nThe same functions can be used inside function components too:\n\n attr :x, :integer, required: true\n attr :y, :integer, required: true\n attr :title, :string, required: true\n def sum_component(assigns) do\n assigns = assign(assigns, sum: assigns.x + assigns.y)\n\n ~H\"\"\"\n <h1><%= @title %></h1>\n\n <%= @sum %>\n \"\"\"\n end\n\nGenerally speaking, avoid accessing variables inside `HEEx` templates, as code that\naccess variables is always executed on every render. The exception are variables\nintroduced by Elixir's block constructs. For example, accessing the `post` variable\ndefined by the comprehension below works as expected:\n\n```heex\n<%= for post <- @posts do %>\n ...\n<% end %>\n```\n",
"\n### The `assigns` variable\n\nWhen talking about variables, it is also worth discussing the `assigns`\nspecial variable. Every time you use the `~H` sigil, you must define an\n`assigns` variable, which is also available on every `.heex` template.\nHowever, we must avoid accessing this variable directly inside templates\nand instead use `@` for accessing specific keys. This also applies to\nfunction components. Let's see some examples.\n\nSometimes you might want to pass all assigns from one function component to\nanother. For example, imagine you have a complex `card` component with \nheader, content and footer section. You might refactor your component\ninto three smaller components internally:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <.card_header {assigns} />\n <.card_body {assigns} />\n <.card_footer {assigns} />\n </div>\n \"\"\"\nend\n\ndefp card_header(assigns) do\n ...\nend\n\ndefp card_body(assigns) do\n ...\nend\n\ndefp card_footer(assigns) do\n ...\nend\n```\n\nBecause of the way function components handle attributes, the above code will\nnot perform change tracking and it will always re-render all three components\non every change.\n\nGenerally, you should avoid passing all assigns and instead be explicit about\nwhich assigns the child components need:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <.card_header title={@title} class={@title_class} />\n <.card_body>\n <%= render_slot(@inner_block) %>\n </.card_body>\n <.card_footer on_close={@on_close} />\n </div>\n \"\"\"\nend\n```\n\nIf you really need to pass all assigns you should instead use the regular\nfunction call syntax. This is the only case where accessing `assigns` inside\ntemplates is acceptable:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <%= card_header(assigns) %>\n <%= card_body(assigns) %>\n <%= card_footer(assigns) %>\n </div>\n \"\"\"\nend\n",
"```\n\nThi |
We consider two types of documents in the Phoenix_LiveView repo: Markdown files and Elixir modules (which contain a moduledoc of particular interest).
GitHub serves raw file contents from the endpoint "https://raw.githubusercontent.com/...". We can also get the list of files from the GitHub API at the endpoint "https://api.github.com/repos/...".
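For the Elixir modules, one possible way to pull the moduledoc out of a source file fetched from the raw endpoint is to parse the source and look for the @moduledoc attribute. This is only a sketch, not necessarily what was done here: the ModuledocSketch module and its extract/1 function are hypothetical names, and it only picks up moduledocs written as plain strings (no interpolation).

```elixir
defmodule ModuledocSketch do
  # Returns the @moduledoc strings found in a piece of Elixir source code.
  # Hypothetical helper: it parses the source into an AST and collects every
  # `@moduledoc "..."` whose value is a plain binary. `@moduledoc false` and
  # interpolated strings are skipped.
  def extract(source) when is_binary(source) do
    case Code.string_to_quoted(source) do
      {:ok, ast} ->
        {_ast, docs} =
          Macro.prewalk(ast, [], fn
            {:@, _, [{:moduledoc, _, [doc]}]} = node, acc when is_binary(doc) ->
              {node, [doc | acc]}

            node, acc ->
              {node, acc}
          end)

        Enum.reverse(docs)

      {:error, _reason} ->
        # Source that does not parse yields no documentation.
        []
    end
  end
end
```

The extracted strings are Markdown, so they could then go through the same chunking step as the guides.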
To chunk the text, I tested the TextChunker package. It divides the text into smaller chunks in a hierarchical and iterative manner using a set of separators.
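To illustrate what hierarchical, separator-based splitting means, here is a minimal sketch. It is not TextChunker's implementation: the SplitSketch module, its separator list and the max_size parameter are made up for illustration, and unlike a real chunker it drops the separators and neither merges small pieces nor adds overlap.

```elixir
defmodule SplitSketch do
  # Separators ordered from coarsest to finest (an assumption of this sketch).
  @separators ["\n\n", "\n", " "]

  # Splits `text` into pieces of at most `max_size` graphemes.
  def split(text, max_size, separators \\ @separators)

  # No separator left: fall back to a hard cut every `max_size` graphemes.
  def split(text, max_size, []) do
    text
    |> String.graphemes()
    |> Enum.chunk_every(max_size)
    |> Enum.map(&Enum.join/1)
  end

  def split(text, max_size, [separator | finer]) do
    if String.length(text) <= max_size do
      # Small enough already: keep the piece whole.
      [text]
    else
      # Split on the current separator, then retry each piece
      # with the next, finer separators.
      text
      |> String.split(separator, trim: true)
      |> Enum.flat_map(&split(&1, max_size, finer))
    end
  end
end
```

Calling SplitSketch.split(text, 2000) first tries paragraph breaks, then line breaks, then spaces, and only cuts mid-word when a single word is longer than the limit.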
I used [format: :markdown] for ".md" documents and no option for the ".html" documents. Chunk sizes are not "small": from 600 to 2000 codepoints.
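A size range like this can be checked on any list of chunk strings. The sketch below assumes the chunks are plain binaries, as returned by fetch_and_chunk_docs; ChunkStats is a hypothetical helper name. It counts codepoints, not graphemes, so a letter followed by a combining accent counts as two.

```elixir
defmodule ChunkStats do
  # Size of each chunk in Unicode codepoints.
  def codepoint_sizes(chunks) do
    Enum.map(chunks, fn chunk -> chunk |> String.codepoints() |> length() end)
  end

  # Smallest and largest chunk size, or {0, 0} for an empty list.
  def min_max(chunks) do
    chunks
    |> codepoint_sizes()
    |> Enum.min_max(fn -> {0, 0} end)
  end
end
```

For example, ChunkStats.min_max(chunks) gives a {smallest, largest} tuple for the chunks collected from the guides.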
The result is: