This repository was archived by the owner on Aug 22, 2025. It is now read-only.

Description
If every observed value of a document field is an empty array ([]), the schema inferred for that field is ArrayType[NullType], which cannot be persisted or used in any meaningful way.
In the absence of evidence about the type of the array elements, a more logical behavior would be to allow the schema of a subset of fields to be overridden, e.g., with a JSON string (the schema.json representation in Spark). If a default behavior is needed, such fields should map to ArrayType[StringType] rather than ArrayType[NullType]. The benefits are that this mapping can be persisted and that it can represent any Mongo array, including heterogeneous ones.
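As a rough illustration of the proposed default, the sketch below works on the dictionary form of a Spark schema JSON and rewrites every array whose element type is the null type into an array of strings. It is not the connector's code: the helper name `replace_null_arrays`, the `NULL_TYPE_NAMES` tuple, and the sample schema are hypothetical, and the spelling of NullType in the JSON ("null" or "void") is an assumption that may depend on the Spark version.

```python
import json

# Assumption: spellings of NullType that may appear in Spark's schema JSON.
NULL_TYPE_NAMES = ("null", "void")


def replace_null_arrays(node, default="string"):
    """Return a copy of a schema-JSON node with ArrayType[NullType]
    mapped to ArrayType[default]. Hypothetical helper, for illustration."""
    if not isinstance(node, dict):
        return node  # primitive type name such as "string" or "long"
    kind = node.get("type")
    if kind == "struct":
        # Recurse into the data type of every field, copying as we go.
        fields = [
            {**field, "type": replace_null_arrays(field["type"], default)}
            for field in node["fields"]
        ]
        return {**node, "fields": fields}
    if kind == "array":
        elem = node["elementType"]
        if elem in NULL_TYPE_NAMES:
            new_elem = default  # no evidence of the element type: use the default
        else:
            new_elem = replace_null_arrays(elem, default)
        return {**node, "elementType": new_elem}
    return node  # other complex types (e.g. map) are left untouched here


# Hypothetical inferred schema in which "tags" was only ever observed as [].
inferred = json.loads("""
{"type": "struct", "fields": [
  {"name": "_id", "type": "string", "nullable": true, "metadata": {}},
  {"name": "tags",
   "type": {"type": "array", "elementType": "null", "containsNull": true},
   "nullable": true, "metadata": {}}
]}
""")

patched = replace_null_arrays(inferred)
print(json.dumps(patched["fields"][1]["type"]))
```

The patched dictionary could then be converted back into a schema (for example with `StructType.fromJson` in PySpark) and supplied explicitly when reading, which is one way the per-field override suggested above might look in practice.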