Aggregation with collectSet on empty dataset produces Null  #452

@ayoub-benali

Consider the following example:

import frameless.functions.aggregate.{collectSet, max, min}
import frameless.syntax._
import frameless.TypedDataset

case class Foo(bar: Int)
val ds = TypedDataset.create(List.empty[Foo])

ds
  .agg(
    min(ds('bar)),
    collectSet(ds('bar))
  )
  .collect
  .run

It produces WrappedArray(null), whereas the expected value is WrappedArray(), because the initial dataset is empty.
Please note that this bug happens only when collectSet is combined with another aggregation. Aggregating only with min produces the expected result.

Using only collectSet:

ds.agg(collectSet(ds('bar))).collect.run

produces WrappedArray(Vector()), whereas the expected value is WrappedArray().

I haven't tested the other collect functions, such as collectList.
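
For comparison, a global aggregation in SQL (one without a grouping key) conventionally returns exactly one row even when the input is empty: a min over no values is NULL, and a collected set is empty. The sketch below models that convention in plain Scala, with no Spark or frameless involved. The names GlobalAggSketch and aggregateAll are hypothetical, and it is only an assumption that frameless inherits these semantics from Spark here.

```scala
// Plain-Scala model of a global (no GROUP BY) aggregation.
// Hypothetical names; this is not frameless or Spark code.
object GlobalAggSketch {

  // One output row: (min of the column, distinct values of the column).
  // Option stands in for a nullable SQL result.
  type Row = (Option[Int], Set[Int])

  // A global aggregate emits exactly one row, even for empty input:
  // min has nothing to report (None, i.e. SQL NULL) and the set is empty.
  def aggregateAll(values: Seq[Int]): List[Row] =
    List((values.reduceOption(_ min _), values.toSet))

  def main(args: Array[String]): Unit = {
    println(aggregateAll(Seq.empty))    // single row for empty input
    println(aggregateAll(Seq(3, 1, 3))) // single row for non-empty input
  }
}
```

If that assumption holds, the single Vector() row in the collectSet-only case may follow from the underlying SQL semantics, while the null row in the combined case would point at how the nullable min column is decoded. Whether frameless should mirror the one-row convention is a separate question.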
