fix(RAG): Fix completeness for aggregator queries #1009
Conversation
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds an AggregatorQuery query type across schemas and prompts, introduces aggregatorQueryPrompt and aggregatorQueryJsonStream to build structured per-item context and stream JSON, and extends chat orchestration to parse/stream intermediate docIds and fetch documents for aggregator flows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant U as User
participant API as Chat API
participant SR as Search (Vespa)
participant PR as Provider (aggregatorQueryJsonStream)
participant LLM as LLM
participant DS as Doc Store
U->>API: Query (potential Aggregator)
API->>SR: Search with filters/context
SR-->>API: Items (VespaSearchResult[])
API->>PR: aggregatorQueryJsonStream(query, userCtx, metadata, items, params)
PR->>PR: Build structured context (buildStructuredContextFromItems -> answerContextMap)
PR->>LLM: System: aggregatorQueryPrompt + indexed citations<br/>User: query (JSON mode)
LLM-->>PR: Streamed JSON (aggregated fields, docIds, partials)
PR-->>API: Async stream of ConverseResponse
loop While streaming
API->>API: processIteratorForAggregatorQueries parses stream
alt New docIds detected
API->>DS: Fetch docs by docId
DS-->>API: Docs
API-->>U: Stream partial aggregated result + new docIds
else No new docIds
API-->>U: Stream partial aggregated result
end
end
API-->>U: Final aggregated JSON result + final docId list
```
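To make the loop in the diagram concrete, here is a minimal consumer sketch in TypeScript. The function names and stream shapes come from this PR (aggregatorQueryJsonStream, processIteratorForAggregatorQueries, GetDocumentsByDocIds); the import aliases, argument plumbing, and yielded payloads are assumptions for illustration only.

```ts
// Illustrative sketch of the consumer loop described above; not the PR's actual code.
// Import paths/aliases are assumed; see the code-graph sections below for real locations.
import { aggregatorQueryJsonStream } from "@/ai/provider"
import { processIteratorForAggregatorQueries } from "@/api/chat/chat"
import { GetDocumentsByDocIds } from "@/search/vespa"

async function* streamAggregatorAnswer(
  query: string,
  userCtx: string,
  userMetadata: any, // UserMetadataType in the PR
  items: any[],      // VespaSearchResult[] in the PR
  params: any,       // ModelParams in the PR
  span: any,         // tracing span, type elided
) {
  // 1. Ask the LLM for an aggregated JSON answer over the structured per-item context.
  const llmStream = aggregatorQueryJsonStream(query, userCtx, userMetadata, items, params)

  const collected: string[] = []
  // 2. The wrapper parses the streamed JSON and surfaces only newly seen docIds.
  for await (const chunk of processIteratorForAggregatorQueries(llmStream)) {
    if (chunk.docIds?.length) {
      collected.push(...chunk.docIds)
      // 3. Fetch the newly referenced documents and stream a partial result.
      const docs = await GetDocumentsByDocIds(chunk.docIds, span)
      yield { docIds: chunk.docIds, partialDocs: docs.root.children ?? [] }
    }
  }

  // 4. Final aggregated result with the full docId list, in the order the model emitted it.
  yield { docIds: collected, done: true }
}
```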
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 inconclusive), ✅ Passed checks (2 passed).
Summary of Changes

Hello @kalpadhwaryu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the system's ability to respond to complex user queries that involve aggregating or summarizing information from multiple sources.
Code Review
This pull request introduces a new AggregatorQuery type to handle more complex data aggregation and extraction queries within the RAG system. The changes are extensive, touching prompts, type definitions, and the core chat processing logic. My review focuses on improving the maintainability and correctness of the new implementation. I've identified some potential bugs in the pagination logic, use of magic numbers, and some areas for code cleanup like unused variables and parameters. Additionally, I've suggested minor improvements to prompts for better clarity.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/ai/types.ts (1)
396-406: Fix FiltersSchema.sortDirection to accept only "asc" | "desc" or null
In server/ai/types.ts (around line 402):

```diff
-  sortDirection: z.string().optional(),
+  sortDirection: z.enum(["asc", "desc"]).nullable().optional(),
```
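For reference, a quick sketch of how the proposed schema behaves under zod's safeParse (the object is reduced to the single field under discussion; the schema name is illustrative):

```ts
import { z } from "zod"

// Proposed shape from the review: only "asc", "desc", null, or an omitted field validate.
const Filters = z.object({
  sortDirection: z.enum(["asc", "desc"]).nullable().optional(),
})

Filters.safeParse({ sortDirection: "asc" }).success        // true
Filters.safeParse({ sortDirection: null }).success         // true
Filters.safeParse({}).success                              // true (field omitted)
Filters.safeParse({ sortDirection: "ascending" }).success  // false (rejected by the enum)
```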
🧹 Nitpick comments (7)
server/ai/prompts.ts (2)
1459-1471: Output schema lists sortDirection as 'asc'/'desc'/null — make earlier blocks consistent

The final "Output JSON" correctly shows sortDirection as "'asc' | 'desc' | null". Ensure all earlier type blocks (including AggregatorQuery and SearchWithFilters sections) use the same string enum to avoid LLM conflicts.
2475-2569: Aggregator prompt: tighten docId handling and remove citation noise
- Decision policy says to ignore items without docId, but provider currently inserts "Doc ID: Unknown". Recommend stating “Ignore items lacking docId; do not consider them” and avoid patterns that look like citations.
- The prompt doesn’t use [index] citations; consider removing “[Index]” style from the context to reduce confusion.
server/ai/provider/index.ts (2)
2170-2174: Drop indexToCitation for aggregator flow (no citations used here)

The aggregator prompt doesn’t use [index] citations; converting “Index …” adds noise. Pass the structured context directly.
Apply:
```diff
-    params.systemPrompt = aggregatorQueryPrompt(
-      userCtx,
-      indexToCitation(structuredContext),
-      userMetadata.dateForAI,
-    )
+    params.systemPrompt = aggregatorQueryPrompt(
+      userCtx,
+      structuredContext,
+      userMetadata.dateForAI,
+    )
```
2160-2162: Type items as Vespa type instead of any[]

For safety and editor help, prefer the concrete VespaSearchResult (or existing project type). It will also make the docId presence check type-safe.
Apply:
```diff
-  items: any[], // VespaSearchResult[]
+  items: VespaSearchResults[], // import the concrete type
```

If VespaSearchResults differs, import the correct type from wherever answerContextMap expects it.
server/api/chat/chat.ts (3)
710-718: Narrow the parsed type for safer handling

The current union { docIds: string[] | null } | { [k: string]: any } is too permissive and hard to reason about. Prefer a single structural type with an optional docIds key.

```diff
-type AggregatorParsed = { docIds: string[] | null } | { [k: string]: any }
+type AggregatorParsed = { docIds?: string[] | null } & Record<string, unknown>
```
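A small sketch of why the single structural type is easier to work with (illustrative only; the JSON literal is made up):

```ts
type AggregatorParsed = { docIds?: string[] | null } & Record<string, unknown>

// With the old union, `parsed.docIds` collapses to `any` via the index-signature arm;
// with the intersection, it stays `string[] | null | undefined`.
const parsed: AggregatorParsed = JSON.parse('{"docIds":["doc-1"],"count":2}')

if (Array.isArray(parsed.docIds)) {
  parsed.docIds.forEach((id) => console.log(id)) // id is typed as string
}
```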
712-773: Validate IDs, avoid buffer bloat, return unique IDs
- Validate that parsed IDs are non-empty strings.
- Stream/return unique IDs (right now return value may include duplicates).
- Remove unused currentDocIds.
- Prevent unbounded growth if the model keeps emitting JSON.
```diff
 export async function* processIteratorForAggregatorQueries(
   iterator: AsyncIterableIterator<ConverseResponse>,
 ): AsyncIterableIterator<
   ConverseResponse & { docIds?: string[] }
 > {
-  let buffer = ""
-  let currentDocIds: string[] = []
+  let buffer = ""
   let parsed: AggregatorParsed = { docIds: null }
   // token we require before attempting to parse a final JSON object
   const DOCIDS_TOKEN = '"docIds":'
   // Track already-yielded ids to only stream new ones
   const yielded = new Set<string>()
+  const MAX_DOCIDS = (config?.maxUserRequestCount as number) || 100
   for await (const chunk of iterator) {
     if (chunk.text) {
       buffer += chunk.text
       try {
         const parsable = cleanBuffer(buffer)
         parsed = jsonParseLLMOutput(parsable, DOCIDS_TOKEN) as AggregatorParsed
         // Expect shape: { docIds: string[] } OR { docIds: [] }
-        const docIds = Array.isArray((parsed as any)?.docIds)
-          ? ((parsed as any).docIds as string[])
-          : null
+        const docIdsRaw = Array.isArray((parsed as any)?.docIds)
+          ? ((parsed as any).docIds as unknown[])
+          : null
+        const docIds = docIdsRaw
+          ? docIdsRaw.filter((v): v is string => typeof v === "string" && v.trim().length > 0)
+          : null
         if (docIds) {
           // Stream only the new docIds
-          const newOnes: string[] = []
+          const newOnes: string[] = []
           for (const id of docIds) {
             if (!yielded.has(id)) {
-              yielded.add(id)
-              newOnes.push(id)
+              if (yielded.size < MAX_DOCIDS) {
+                yielded.add(id)
+                newOnes.push(id)
+              }
             }
           }
           if (newOnes.length > 0) {
             yield { docIds: newOnes }
           }
-          currentDocIds = docIds
         } else if ((parsed as any)?.docIds === null) {
           // If model explicitly outputs null or incomplete JSON, keep waiting
           // Do nothing; continue accumulating
         }
       } catch {
         // Not parseable yet; keep accumulating
       }
     }
     if (chunk.cost) {
       // Pass through token/cost info if present
       yield { cost: chunk.cost }
     }
   }
-  // Return final array (empty if none)
-  return (
-    Array.isArray((parsed as any)?.docIds) ? (parsed as any).docIds : []
-  ) as any
+  // Return final unique list (empty if none)
+  return Array.from(yielded) as any
 }
```
3105-3140: Helper shape is fine; remove unused params

app, entity, and chunksCount are not used. Drop them or add a _ prefix to avoid lint noise.

```diff
-async function* processResultsForAggregatorQuery(
-  items: VespaSearchResult[],
-  input: string,
-  userCtx: string,
-  userMetadata: UserMetadataType,
-  app: Apps[] | null,
-  entity: any,
-  chunksCount: number | undefined,
+async function* processResultsForAggregatorQuery(
+  items: VespaSearchResult[],
+  input: string,
+  userCtx: string,
+  userMetadata: UserMetadataType,
+  _app: Apps[] | null,
+  _entity: any,
+  _chunksCount: number | undefined,
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- server/ai/agentPrompts.ts (3 hunks)
- server/ai/prompts.ts (5 hunks)
- server/ai/provider/index.ts (3 hunks)
- server/ai/types.ts (3 hunks)
- server/api/chat/chat.ts (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
server/ai/provider/index.ts (4)
- server/types.ts (1): UserMetadataType (608-608)
- server/ai/context.ts (1): answerContextMap (699-767)
- server/ai/types.ts (2): ModelParams (285-310), ConverseResponse (312-324)
- server/ai/prompts.ts (1): aggregatorQueryPrompt (2475-2568)
server/api/chat/chat.ts (6)
- server/ai/types.ts (1): ConverseResponse (312-324)
- server/ai/provider/index.ts (2): jsonParseLLMOutput (650-777), aggregatorQueryJsonStream (2156-2191)
- server/types.ts (1): UserMetadataType (608-608)
- server/shared/types.ts (1): Apps (38-38)
- server/search/vespa.ts (3): searchVespa (69-133), searchVespaAgent (135-159), GetDocumentsByDocIds (201-201)
- server/api/chat/utils.ts (2): getChannelIdsFromAgentPrompt (72-90), expandEmailThreadsInResults (169-193)
server/ai/prompts.ts (1)
- server/ai/context.ts (1): userContext (855-874)
🔇 Additional comments (8)
server/ai/agentPrompts.ts (1)
1404-1410: Enum list duplication: keep SearchWithFilters and AggregatorQuery listed, but ensure examples align

Minor: Keep the enum list concise and consistent with types; no change required, just ensure example JSON blocks use the same sortDirection contract everywhere.
server/ai/types.ts (2)
421-429: AggregatorQuery schema looks correct and consistent with other variants. No issues; good addition to the discriminated union.

439-452: Union extension is correct. Including AggregatorQuery in QueryRouterResponseSchema is proper.
server/api/chat/chat.ts (5)
14-15: Import looks good. No issues with the new provider import.

1761-1765: LGTM. Context building call change is fine.

3167-3168: LGTM. Aggregator flag wiring looks correct.

4086-4086: LGTM. Aggregator classification is correctly considered in routing.

4090-4091: LGTM. Including AggregatorQuery in the metadata path is correct.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/ai/provider/index.ts (1)
1323-1489: Normalize sortDirection to string enum in server/ai/prompts.ts

Replace all `<boolean> or null` occurrences at lines 1278, 1296, 1318, and 1354 with `<'asc' | 'desc' | null>` to match the existing usage on line 1468.
🧹 Nitpick comments (4)
server/ai/prompts.ts (1)
2475-2569: Aggregator selection prompt looks good; minor clarity and consistency tweaks.
- The docId eligibility/decision policy is clear. Ensure the provider filters out items without docId to avoid confusing the model (see provider comment).
- Optional: add “Return docIds in the determined usefulness order” to tie Output explicitly to the Ordering rule.
Suggested copy tweak:
```diff
@@
-# Output Rules
-- Return ONLY a JSON object with the following shape: { "docIds": ["<docId1>", "<docId2>", "..."] }
+# Output Rules
+- Return ONLY a JSON object with the following shape: { "docIds": ["<docId1>", "<docId2>", "..."] } in the usefulness order defined above
 - If NO relevant items are found or the context cannot answer the query: { "docIds": [] }
```

server/ai/provider/index.ts (2)
2156-2191: Avoid indexToCitation for aggregator context; use stronger typing.
- indexToCitation won't match "Document Index: N" (colon breaks the regex "Index (\d+)"). It’s unnecessary in this flow and could create confusion; pass structuredContext directly.
- Prefer a concrete type over any[] for items to catch schema drift at compile time.
Apply:
```diff
-    params.systemPrompt = aggregatorQueryPrompt(
-      userCtx,
-      indexToCitation(structuredContext),
-      userMetadata.dateForAI,
-    )
+    params.systemPrompt = aggregatorQueryPrompt(
+      userCtx,
+      structuredContext,
+      userMetadata.dateForAI,
+    )
```

And consider (type import path as appropriate in your codebase):
```diff
-export const aggregatorQueryJsonStream = (
+export const aggregatorQueryJsonStream = (
   userQuery: string,
   userCtx: string,
   userMetadata: UserMetadataType,
-  items: any[], // VespaSearchResult[]
+  items: /* VespaSearchResults[] */ any[], // TODO: replace any[] with your Vespa result type
   params: ModelParams,
 ): AsyncIterableIterator<ConverseResponse> => {
```
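As a quick illustration of the mismatch flagged above (assuming indexToCitation matches on a pattern like /Index (\d+)/, as the comment suggests):

```ts
// Illustrative only: a pattern expecting "Index <digits>" never fires on the
// "Document Index: N" lines emitted by buildStructuredContextFromItems,
// because the colon sits between "Index" and the number.
const indexPattern = /Index (\d+)/

indexPattern.test("Index 3")            // true: the citation-style form
indexPattern.test("Document Index: 3")  // false: the aggregator context form
```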
2156-2191: Reduce token bloat by limiting summary chunks in builder (optional)

Given the aggregator only needs to decide relevant docIds, you can cap content size while retaining key metadata.
Example change in builder (if answerContextMap supports it):
```diff
-const contextFromAnswerContextMap = answerContextMap(item, userMetadata)
+const contextFromAnswerContextMap = answerContextMap(item, userMetadata, /* maxSummaryChunks */ 2)
```

server/api/chat/chat.ts (1)
710-771: Harden parsing: guard and sanitize docIds before yielding/returning

Add type guards to keep only non-empty strings and avoid accidental non-string entries from the LLM.
```diff
-          const docIds = Array.isArray((parsed as any)?.docIds)
-            ? ((parsed as any).docIds as string[])
-            : null
+          const docIds = Array.isArray((parsed as any)?.docIds)
+            ? ((parsed as any).docIds as unknown[]).filter(
+                (v): v is string => typeof v === "string" && v.trim().length > 0,
+              )
+            : null
@@
-  return (
-    Array.isArray((parsed as any)?.docIds) ? (parsed as any).docIds : []
-  ) as any
+  const finalIds = Array.isArray((parsed as any)?.docIds)
+    ? ((parsed as any).docIds as unknown[]).filter(
+        (v): v is string => typeof v === "string" && v.trim().length > 0,
+      )
+    : []
+  return finalIds as any
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- server/ai/prompts.ts (5 hunks)
- server/ai/provider/index.ts (3 hunks)
- server/api/chat/chat.ts (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
server/ai/provider/index.ts (4)
- server/types.ts (1): UserMetadataType (608-608)
- server/ai/context.ts (1): answerContextMap (699-767)
- server/ai/types.ts (2): ModelParams (285-310), ConverseResponse (312-324)
- server/ai/prompts.ts (1): aggregatorQueryPrompt (2475-2568)
server/api/chat/chat.ts (5)
- server/ai/types.ts (2): ConverseResponse (312-324), Intent (394-394)
- server/ai/provider/index.ts (2): jsonParseLLMOutput (650-777), aggregatorQueryJsonStream (2156-2191)
- server/types.ts (1): UserMetadataType (608-608)
- server/search/vespa.ts (3): searchVespa (69-133), searchVespaAgent (135-159), GetDocumentsByDocIds (201-201)
- server/api/chat/utils.ts (2): getChannelIdsFromAgentPrompt (72-90), expandEmailThreadsInResults (169-193)
server/ai/prompts.ts (1)
- server/ai/context.ts (1): userContext (855-874)
🔇 Additional comments (18)
server/ai/prompts.ts (3)
1238-1238: LGTM: AggregatorQuery added to router classification.
The new type inclusion is consistent with downstream usage.
1366-1366: LGTM: Type enum list updated with AggregatorQuery.
Matches the new classification rules.
1323-1358: Fix sortDirection type to 'asc' | 'desc' | null and tighten wording.
- The JSON skeleton here still shows "sortDirection": <boolean or null>, but your final output spec below uses "'asc' | 'desc' | null". This mismatch will cause brittle LLM outputs. Align to the string enum.
- Also change "Both <app> or <entity> are present" to "Both <app> and <entity> are present" (currently ambiguous).
Apply within this block:
```diff
@@
-  - Both <app> or <entity> are present (same requirement as GetItems), and
+  - Both <app> and <entity> are present (same requirement as GetItems), and
@@
-  "sortDirection": <boolean or null>,
+  "sortDirection": "<'asc' | 'desc' | null>",
```

Also mirror the same 'asc'/'desc'/null change in the earlier JSON skeletons within this prompt for:
- SearchWithoutFilters ("filters.sortDirection")
- GetItems ("filters.sortDirection")
- SearchWithFilters ("filters.sortDirection")
server/ai/provider/index.ts (3)
90-91: LGTM: aggregator prompt imported.
Import wiring is correct.
37-38: LGTM: answerContextMap import.
Consistent with builder usage.
2135-2154: Filter out items without docId and remove leading whitespace in the template

Decision Policy requires ignoring items lacking docId. Current code still includes them (empty string) and introduces inconsistent indentation in the multi-line string.
Apply:
```diff
-const buildStructuredContextFromItems = (
-  items: any[],
-  userMetadata: UserMetadataType,
-): string => {
-  return items
-    .map((item, index) => {
-      // Use answerContextMap to create structured context for each item
-      const contextFromAnswerContextMap = answerContextMap(item, userMetadata)
-
-      // Add document index and docId to the context
-      const documentIndex = index + 1
-      const docId = item?.fields?.docId || ""
-
-      return `Document Index: ${documentIndex}
-      Doc ID: ${docId}
-      ${contextFromAnswerContextMap}
-      `
-    })
-    .join("\n\n")
-}
+const buildStructuredContextFromItems = (
+  items: any[],
+  userMetadata: UserMetadataType,
+): string => {
+  const valid = items.filter((it) => it?.fields?.docId)
+  if (valid.length === 0) return ""
+  return valid
+    .map((item, index) => {
+      const contextFromAnswerContextMap = answerContextMap(item, userMetadata)
+      const documentIndex = index + 1
+      const docId = item.fields.docId
+      return `Document Index: ${documentIndex}\nDoc ID: ${docId}\n${contextFromAnswerContextMap}`
+    })
+    .join("\n\n")
+}
```

server/api/chat/chat.ts (12)
14-15: Import added correctly
The aggregatorQueryJsonStream import looks good and is used below.
1760-1763: LGTM: context call fixed. Passing userMetadata into buildContext is correct and consistent with the function signature.
3103-3132: Aggregator streaming wrapper is wired correctly. Good separation: build stream options, call aggregatorQueryJsonStream, and yield via the new iterator.
3161-3161: Aggregator classification flag. Adding isAggregatorQuery is correct and used below for routing.
3799-3801: Promote magic numbers to named constants

Move 40 (page size) and 10 (max iterations) to top-level constants for clarity and single-point change.

```diff
-  const pageSizeForAggregatorQuery = 40
-  const maxIterationsForAggregatorQuery = 10
+  // Top-level (near other config destructures):
+  // export const AGGREGATOR_QUERY_PAGE_SIZE = 40
+  // export const AGGREGATOR_QUERY_MAX_ITERATIONS = 10
+  const pageSizeForAggregatorQuery = AGGREGATOR_QUERY_PAGE_SIZE
+  const maxIterationsForAggregatorQuery = AGGREGATOR_QUERY_MAX_ITERATIONS
```
4074-4074: LGTM: aggregator flag added. This mirrors the change in generateMetadataQueryAnswer.
4079-4079: Routing includes aggregator queries. Including isAggregatorQuery here ensures proper metadata path routing.
3858-3863: Fix pagination: use constant page size; avoid growing limit

limit grows each iteration, causing overlap and increasing load. Keep a fixed limit with a moving offset.

```diff
-          ...searchOptions,
-          limit:
-            pageSizeForAggregatorQuery +
-            pageSizeForAggregatorQuery * iteration,
-          offset: pageSizeForAggregatorQuery * iteration,
+          ...searchOptions,
+          limit: pageSizeForAggregatorQuery,
+          offset: pageSizeForAggregatorQuery * iteration,
```
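To see why the growing limit overlaps, a small worked example (page size 40, three iterations; illustrative only):

```ts
const pageSize = 40

for (let iteration = 0; iteration < 3; iteration++) {
  // Current behaviour: limit grows, so iteration 2 re-fetches items 80-119
  // that iteration 1 already returned.
  const growing = { offset: pageSize * iteration, limit: pageSize + pageSize * iteration }
  // Proposed fix: fixed-size, disjoint windows.
  const fixed = { offset: pageSize * iteration, limit: pageSize }
  console.log(iteration, growing, fixed)
  // 0 -> growing covers [0, 40)    fixed covers [0, 40)
  // 1 -> growing covers [40, 120)  fixed covers [40, 80)
  // 2 -> growing covers [80, 200)  fixed covers [80, 120)
}
```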
3874-3883: Fix pagination in agent path as well

Mirror the fixed page size here too.

```diff
-            ...searchOptions,
-            offset: pageSizeForAggregatorQuery * iteration,
-            limit:
-              pageSizeForAggregatorQuery +
-              pageSizeForAggregatorQuery * iteration,
+            ...searchOptions,
+            limit: pageSizeForAggregatorQuery,
+            offset: pageSizeForAggregatorQuery * iteration,
```
3897-3897: Telemetry: wrong variable used for offset attribute

This uses pageSize (10) instead of pageSizeForAggregatorQuery (40).

```diff
-      iterationSpan?.setAttribute("offset", pageSize * iteration)
+      iterationSpan?.setAttribute("offset", pageSizeForAggregatorQuery * iteration)
```
3934-3940: Deduplicate and cap finalDocIds before fetching documents

Avoid duplicate fetches and bound the results to user/request limits.

```diff
-      let finalItems: VespaSearchResult[] = []
-      if (finalDocIds.length !== 0) {
-        const allFinalDocs = await GetDocumentsByDocIds(finalDocIds, span!)
-        finalItems = allFinalDocs.root.children || []
-      }
+      // Dedup and cap
+      const uniqueFinalIds = Array.from(new Set(finalDocIds))
+      const cap =
+        (classification.filters.count as number | undefined) ??
+        (config.maxUserRequestCount as number)
+      const boundedIds =
+        typeof cap === "number" && cap > 0
+          ? uniqueFinalIds.slice(0, cap)
+          : uniqueFinalIds
+
+      let finalItems: VespaSearchResult[] = []
+      if (boundedIds.length !== 0) {
+        const allFinalDocs = await GetDocumentsByDocIds(boundedIds, span!)
+        finalItems = allFinalDocs.root.children || []
+      }
```
3917-3925: The previous search failed due to unrecognized TSX types. Please manually verify that DocIdsUpdate isn't defined or emitted by running:

```sh
rg -n 'export enum ChatSSEvents' -g '*.ts' -g '*.tsx'
rg -n 'DocIdsUpdate' -g '*.ts' -g '*.tsx'
rg -n 'docIds\s*:' -g '*.ts' -g '*.tsx'
rg -n 'writeSSE' -g 'server/api/chat/**/*.ts'
```
Description
Fix completeness for aggregator queries.