38 changes: 30 additions & 8 deletions docs/gettingstarted/quickstart.ipynb
@@ -282,9 +282,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reduce Function\n",
"## The `map_rows` Function\n",
"\n",
"Finally, we'll end with the flexible `reduce` function. `reduce` functions similarly to pandas' `apply` but flattens (reduces) the inputs from nested layers into array inputs to the given apply function. For example, let's find the mean flux for each dataframe in \"nested\":"
"Finally, we'll end with the flexible `map_rows` function. `map_rows` works similarly to pandas' `apply`, but it applies the given function row by row and flattens the inputs from nested layers into array inputs to that function. For example, let's find the mean flux for each dataframe in \"nested\":"
]
},
{
@@ -297,7 +297,8 @@
"\n",
"# use hierarchical column names to access the flux column\n",
"# passed as an array to np.mean\n",
"nf.reduce(np.mean, \"lightcurve.brightness\")"
"# row_container signals how to pass the data to the function, in this case as direct arguments\n",
"nf.map_rows(np.mean, \"lightcurve.brightness\", row_container=\"args\")"
]
},
{
@@ -313,15 +314,15 @@
"metadata": {},
"outputs": [],
"source": [
"def show_inputs(*args):\n",
"    return args"
"def show_inputs(row):\n",
"    return row"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Applying some inputs via reduce, we see how it sends inputs to a given function. The output frame `nf_inputs` consists of two columns containing the output of the “ra” column and the “lightcurve.time” column."
"Applying `map_rows` to some inputs shows how it sends them to a given function. The output frame `nf_inputs` consists of two columns containing the values of the “ra” column and the “lightcurve.time” column."
]
},
{
@@ -330,8 +331,12 @@
"metadata": {},
"outputs": [],
"source": [
"nf_inputs = nf.reduce(show_inputs, \"ra\", \"lightcurve.time\")\n",
"nf_inputs"
"# row_container=\"dict\" passes the data as a dictionary to the function\n",
"nf_inputs = nf.map_rows(show_inputs, columns=[\"ra\", \"lightcurve.time\"], row_container=\"dict\")\n",
"nf_inputs\n",
"\n",
"# map_rows returns a dataframe view of the dicts; within show_inputs, the two columns can be accessed as\n",
"# row[\"ra\"] and row[\"lightcurve.time\"]"
]
},
{
@@ -343,6 +348,23 @@
"nf_inputs.loc[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# row_container=\"args\" passes the data as arguments to the function\n",
"\n",
"\n",
"def show_inputs(*args):\n",
"    return args\n",
"\n",
"\n",
"nf_inputs = nf.map_rows(show_inputs, columns=[\"ra\", \"lightcurve.time\"], row_container=\"args\")\n",
"nf_inputs"
]
},
{
"cell_type": "markdown",
"metadata": {},
2 changes: 1 addition & 1 deletion docs/reference/nestedframe.rst
@@ -37,7 +37,7 @@ Extended Pandas.DataFrame Interface
NestedFrame.query
NestedFrame.dropna
NestedFrame.sort_values
NestedFrame.reduce
NestedFrame.map_rows
NestedFrame.drop
NestedFrame.min
NestedFrame.max
7 changes: 0 additions & 7 deletions docs/tutorials/data_manipulation.ipynb
@@ -105,13 +105,6 @@
"## Adding or Replacing Nested Columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> *A Note on Performance: These operations involve full reconstruction of the nested columns so expect impacted performance when doing this at scale. It may be appropriate to do these operations within reduce functions directly (e.g. subtracting a value from a column) if performance is key.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
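Note on the `row_container` options used in the quickstart changes: the difference between `"dict"` and `"args"` can be pictured with a plain-Python sketch. This is not the nested-pandas implementation. `map_rows_sketch` and the `rows` data are hypothetical names invented for illustration, and the sketch only assumes what the notebook comments state: the function is called once per row, receiving the selected columns either as one dictionary or as positional arguments.

```python
from statistics import mean

# Hypothetical stand-in data: each row has a base column "ra" and
# nested "lightcurve" columns stored as per-row lists.
rows = [
    {"ra": 10.0, "lightcurve.time": [1.0, 2.0, 3.0], "lightcurve.brightness": [4.0, 6.0]},
    {"ra": 20.0, "lightcurve.time": [4.0, 5.0], "lightcurve.brightness": [1.0, 2.0, 3.0]},
]


def map_rows_sketch(func, rows, columns, row_container="dict"):
    """Illustrative only: call func once per row with the selected columns."""
    results = []
    for row in rows:
        # Keep only the requested columns, in the requested order.
        selected = {name: row[name] for name in columns}
        if row_container == "dict":
            # The function receives a single dict keyed by column name.
            results.append(func(selected))
        elif row_container == "args":
            # The function receives each column as a positional argument.
            results.append(func(*selected.values()))
        else:
            raise ValueError(f"unsupported row_container: {row_container!r}")
    return results


# "args": the nested column arrives directly as the array argument of mean.
mean_brightness = map_rows_sketch(mean, rows, ["lightcurve.brightness"], row_container="args")

# "dict": the function indexes the row by hierarchical column name.
as_dict = map_rows_sketch(lambda row: row, rows, ["ra", "lightcurve.time"], row_container="dict")

# "args": the same columns arrive as a tuple of positional arguments.
as_args = map_rows_sketch(lambda *args: args, rows, ["ra", "lightcurve.time"], row_container="args")
```

In this sketch the `"args"` form suits functions such as a mean that already take arrays positionally, while the `"dict"` form lets a user function refer to columns by name, as in `row["ra"]` and `row["lightcurve.time"]`.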