
Commit a38aeed

docs - add more instructions regarding the configuration file / update steps in README
1 parent 2feee49 commit a38aeed

File tree

2 files changed: +42 -12 lines

README.md

Lines changed: 25 additions & 9 deletions
@@ -20,10 +20,6 @@ further explore the starred repositories.

This project/tool uses semantic search and an AI agent as an attempt to solve the above problems.

-## Architecture & Implementation Details
-
-[TBD]
-
## Install (User)

Read below to install `uv`. You haven't done it yet? Come on guys!!
@@ -47,27 +43,47 @@ You should make a copy of it and perhaps call it `rsg-config.toml` (The name of

### Step 1 - Obtain the Github Personal Access Token

-[TBD]
+This tool fetches your starred GitHub repositories. To access them without running into rate limits,
+you need a GitHub Personal Access Token.
+
+Read this to learn how to obtain it:
+
+https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
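
If you want to make sure the token works before moving on, a quick sanity check against the GitHub API does the trick. This is not something the tool itself needs; it is just a sketch, assuming the token is in the `GITHUB_PAT` environment variable:

```bash
# should return your user profile as JSON if the token is valid
curl -s -H "Authorization: Bearer $GITHUB_PAT" https://api.github.com/user

# should list a few of your starred repositories
curl -s -H "Authorization: Bearer $GITHUB_PAT" "https://api.github.com/user/starred?per_page=5"
```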

### Step 2 - Edit the `rsg-config.toml`

- You should provide the Github PAT obtained in Step 1
- You should fill the `[embedder]` section (Supported provider types are - ollama, openai, azure_openai)
- You should fill the `[agent.litellm_params]` section

-[TBD] - Don't think above instructions are enough! To update and explain in detail the settings
+In `rsg-config.example.toml`, I have added the necessary comments to help you fill out the various configuration settings.
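
For example, assuming the example file sits in your current directory, making your own copy is just:

```bash
# start from the commented example and fill in your own values
cp rsg-config.example.toml rsg-config.toml
```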

### Step 3 - Build the database

+The real work starts with this step.
+
+At the moment, I use a naive RAG technique:
+
+- Information about your starred GitHub repos is downloaded using the GitHub API
+- Then the `readme` files of these repos are downloaded. Note - some repos do not have a `readme` file.
+- Then these `readme` files are chunked and their embeddings are stored in a vector store
+
+The data above (including the vector store) is stored in your computer's data directory, for example `$HOME/.local/share/rsg`
+on macOS and Linux.
+
+You can change the location of the data by setting the environment variable `RSG_DATA_HOME`.
+
```bash
uvx --from repo-stargazer rsg build --config rsg-config.toml
```
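
If you want the data somewhere other than the default location, a build run could look like this (the directory name is just an example, any writable path works):

```bash
# keep the downloaded repo data and the vector store under a custom directory
export RSG_DATA_HOME="$HOME/rsg-data"
uvx --from repo-stargazer rsg build --config rsg-config.toml
```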

### Step 4 - Run the agent using adk web & ui

-The agent is built using Google ADK and I have done somewhat of a hack to be able run the agent
-by the built-in fastapi server & user interface. The server & user interface is meant for development needs but
-for now it is the only UI there is
+Let's see all of it in action.
+
+For the user interface, I am still using the development UI that comes as part of Google ADK.
+
+In the near future, I will provide a decent UI without any developer-specific elements.

```bash
uvx --from repo-stargazer rsg run-adk-server --config rsg-config.toml
```
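
Once the server is up, open the ADK dev UI in your browser. The exact address is printed in the server log; the port below is only an assumption based on the usual ADK/uvicorn default, so use whatever the log reports:

```bash
# open the ADK dev UI - replace the port if the server log shows a different one
open http://localhost:8000        # macOS
# xdg-open http://localhost:8000  # Linux
```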

rsg-config.example.toml

Lines changed: 17 additions & 3 deletions
@@ -1,18 +1,32 @@
+# Read https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
+# to learn how to get the GitHub Personal Access Token
github_pat = ""

-
[embedder]
-provider_type = "ollama"
+# possible values for provider_type are - ollama, azure_openai, openai
+provider_type = "ollama"
model_name = "mxbai-embed-large:latest"
api_endpoint = "http://host.docker.internal:11434"
+
+# depending on the provider_type you may have to supply other fields
+# they are
+# api_key <- required by azure_openai and openai
+# api_version <- required by azure_openai
+# api_deployment <- required by azure_openai
+
chunk_size = 1000
chunk_overlap = 120

[agent.litellm_params]
+# you can consult LiteLLM documentation to learn
+# about how to specify the models
+# generally it is "provider"/"model"
model = "azure/gpt-4o"

[agent.litellm_params.provider_config]
+# you should provide the necessary fields as per LiteLLM
+# documentation for the model provided above
api_key = ""
api_base = ""
-api_version = "2024-10-01-preview"
+api_version = ""

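One note on the `[embedder]` defaults above: with `provider_type = "ollama"`, the embedding model has to be available in your local Ollama instance before you run the build step. A minimal sketch, assuming Ollama is installed and running:

```bash
# fetch the embedding model referenced in the example config
ollama pull mxbai-embed-large:latest
```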
