(fix) arrays handling in analytics connector #2

teetangh · 2025-04-27T15:10:24Z

No description provided.

…for aggregation of both object arrays and primitive arrays. Refine value extraction logic in getData function to support nested paths and improve data representation. Introduce a helper function for nested value retrieval.

gemini-code-assist

Hello @teetangh, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses the handling of arrays in the analytics connector, specifically within the buildSchema and getData functions. Previously, arrays were skipped during schema construction. Now, the code aggregates array data into string fields, differentiating between arrays of objects and arrays of primitives. The getData function has also been modified to handle nested results and extract values from aggregated array fields. A new utility function, getNestedValue, has been added to assist with accessing nested values within objects, including those within arrays.

Highlights

Array Handling in Schema Building: The buildSchema function now processes arrays, creating aggregated string fields for both arrays of objects (e.g., 'schedule.day') and arrays of primitives (e.g., 'aliases'). Empty arrays are still skipped.
Data Extraction from Arrays: The getData function has been updated to extract data from aggregated array fields. It now identifies the base part and nested part of a field ID (e.g., 'schedule' and 'day' in 'schedule.day') and aggregates values accordingly.
Nested Value Retrieval: A new utility function, getNestedValue, has been introduced to handle nested value retrieval within objects, including those nested within arrays. This function replaces the previous reduce based approach.

Changelog

src/analytics/connector.gs
- Modified buildSchema to handle arrays of objects and primitives, creating aggregated string fields in the schema.
- Modified getData to handle nested results and extract values from aggregated array fields, using the new getNestedValue function.
- Added a new utility function getNestedValue to retrieve nested values within objects, including those within arrays.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

In code we dive, arrays in sight,
To string them together, with all our might.
Nested values we chase,
In this digital space,
Making data shine ever so bright.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

The pull request addresses the handling of arrays in the analytics connector, which is a valuable improvement. The changes introduce logic to process arrays of objects and primitives, creating aggregated string fields for schema generation. The code is generally well-structured and includes logging for debugging purposes. However, there are a few areas that could be improved for clarity and efficiency.

Summary of Findings

Potential Performance Bottleneck: The use of schema.some within loops to check for existing fields could lead to performance issues, especially with large schemas. Consider using a more efficient data structure for faster lookups.
Inconsistent Logging: The logging messages use different formats and levels of detail. Standardizing the logging format would improve readability and maintainability.
Code Duplication: The logic for adding fields to the schema is repeated for both arrays of objects and arrays of primitives. This could be refactored into a single function to reduce code duplication.

Merge Readiness

The pull request introduces important functionality for handling arrays, but there are some potential performance and maintainability issues that should be addressed before merging. Specifically, the performance of the schema.some checks should be improved, the logging format should be standardized, and the duplicated code for adding fields to the schema should be refactored. I am unable to approve this pull request, and recommend that others review and approve this code before merging.

gemini-code-assist · 2025-04-27T15:10:45Z

src/analytics/connector.gs

+            if (!schema.some(f => f.name === aggFieldName)) {
+              schema.push({
+                name: aggFieldName,
+                label: aggFieldName,
+                dataType: 'STRING', 
+                semantics: { conceptType: 'DIMENSION' } 
+              });
+              Logger.log('addFieldToSchema (Array of Objects): Added aggregated field: %s', aggFieldName);
+            }


The schema.some method is used to check if a field already exists before pushing it to the schema. This approach has a time complexity of O(n) for each field being added, where n is the number of fields in the schema. For large schemas, this can become a performance bottleneck. Consider using a Set or Map to store the field names for O(1) lookup time.

Suggestion: Use a Set to track existing field names for faster lookups.

const existingFields = new Set(schema.map(f => f.name)); Array.from(allKeys).forEach(key => { const aggFieldName = `${arrayPrefix}.${key}`; if (!existingFields.has(aggFieldName)) { // O(1) lookup schema.push({ name: aggFieldName, label: aggFieldName, dataType: 'STRING', semantics: { conceptType: 'DIMENSION' } }); Logger.log('addFieldToSchema (Array of Objects): Added aggregated field: %s', aggFieldName); } });

gemini-code-assist · 2025-04-27T15:10:46Z

src/analytics/connector.gs

+           if (!schema.some(f => f.name === aggFieldName)) {
+             schema.push({
+               name: aggFieldName,
+               label: aggFieldName,
+               dataType: 'STRING', 
+               semantics: { conceptType: 'DIMENSION' } 
+             });
+             Logger.log('addFieldToSchema (Array of Primitives): Added aggregated field: %s', aggFieldName);
+           }


Similar to the array of objects case, the schema.some method is used here as well. Consider using the same optimization with a Set to improve performance for large schemas.

Suggestion: Use a Set to track existing field names for faster lookups.

const existingFields = new Set(schema.map(f => f.name)); const aggFieldName = fieldName; if (!existingFields.has(aggFieldName)) { // O(1) lookup schema.push({ name: aggFieldName, label: aggFieldName, dataType: 'STRING', semantics: { conceptType: 'DIMENSION' } }); Logger.log('addFieldToSchema (Array of Primitives): Added aggregated field: %s', aggFieldName); }

gemini-code-assist · 2025-04-27T15:10:46Z

src/analytics/connector.gs

+        if (value.length > 0 && typeof value[0] === 'object' && value[0] !== null) {
+          // Array of Objects: Create aggregated string fields
+          const arrayPrefix = fieldName; // Use the current fieldName as the prefix (e.g., 'schedule')
+          const allKeys = new Set();
+          Object.keys(value[0]).forEach(key => allKeys.add(key)); // Assume consistent keys based on first item
+
+          Logger.log('addFieldToSchema (Array of Objects): Found keys [%s] for prefix \'%s\'', Array.from(allKeys).join(', '), arrayPrefix);
+
+          Array.from(allKeys).forEach(key => {
+            const aggFieldName = `${arrayPrefix}.${key}`; // e.g., schedule.day
+            // Check if field already exists before pushing
+            if (!schema.some(f => f.name === aggFieldName)) {
+              schema.push({
+                name: aggFieldName,
+                label: aggFieldName,
+                dataType: 'STRING', 
+                semantics: { conceptType: 'DIMENSION' } 
+              });
+              Logger.log('addFieldToSchema (Array of Objects): Added aggregated field: %s', aggFieldName);
+            }
+          });
+        } else if (value.length > 0) {
+           // Array of Primitives: Create one aggregated string field
+           const aggFieldName = fieldName; // Use the current fieldName (e.g., 'aliases')
+           if (!schema.some(f => f.name === aggFieldName)) {
+             schema.push({
+               name: aggFieldName,
+               label: aggFieldName,
+               dataType: 'STRING', 
+               semantics: { conceptType: 'DIMENSION' } 
+             });
+             Logger.log('addFieldToSchema (Array of Primitives): Added aggregated field: %s', aggFieldName);
+           }
+        } else {
+          // Empty array: Skip (consistent with original logic)
+          Logger.log('addFieldToSchema: Skipping empty array for key: %s', fieldName);
+        }


The code for handling arrays of objects and arrays of primitives is very similar. Consider refactoring this logic into a separate function to reduce code duplication and improve maintainability. This function could take the array and a flag indicating whether it's an array of objects or primitives as input.

Suggestion: Refactor the array handling logic into a separate function.

} else if (value.length > 0 && typeof value[0] === 'object' && value[0] !== null) { handleArrayOfObjects(value, fieldName, schema); } else if (value.length > 0) { handleArrayOfPrimitives(value, fieldName, schema); } else { // Empty array: Skip (consistent with original logic) Logger.log('addFieldToSchema: Skipping empty array for key: %s', fieldName); }

gemini-code-assist bot reviewed Apr 27, 2025

View reviewed changes

gemini-code-assist bot suggested changes Apr 27, 2025

View reviewed changes

teetangh self-assigned this May 14, 2025

teetangh marked this pull request as draft May 14, 2025 04:51

teetangh added bug Something isn't working invalid This doesn't seem right labels May 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

(fix) arrays handling in analytics connector #2

(fix) arrays handling in analytics connector #2

Uh oh!

teetangh commented Apr 27, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 27, 2025

Uh oh!

gemini-code-assist bot Apr 27, 2025

Uh oh!

gemini-code-assist bot Apr 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

(fix) arrays handling in analytics connector #2

Are you sure you want to change the base?

(fix) arrays handling in analytics connector #2

Uh oh!

Conversation

teetangh commented Apr 27, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Changelog

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Summary of Findings

Merge Readiness

Uh oh!

gemini-code-assist bot Apr 27, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 27, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 27, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants