Support for mapHeaders to filter columns and edit the printed headers by hmalphettes · Pull Request #28 · max-mapper/csv-write-stream

hmalphettes · 2016-08-23T02:03:14Z

Setting a mapHeaders function in the options lets you skip some columns
and edit the printed header line: return null to drop a column, or a string to rename its header.

For example:

var writer = csv({
  mapHeaders: function (header) {
    return header === 'skipthisone' ? null
    : header.substring(0,1).toUpperCase() + header.substring(1)
  }
})

This is the reverse of csv-parser's mapHeaders:
mafintosh/csv-parser#54
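
To make the intended behaviour concrete, here is a minimal standalone sketch of how such a function would act on a list of headers. The helper name `applyMapHeaders` is hypothetical and is not part of csv-write-stream or of this PR; the sketch also assumes that an `undefined` return value is treated like `null`.

```javascript
// Hypothetical helper, not part of csv-write-stream: applies a mapHeaders
// function to a header list and reports which columns survive.
function applyMapHeaders (headers, mapHeaders) {
  var kept = []
  headers.forEach(function (header) {
    var printed = mapHeaders(header)
    // Assumption: null (or undefined) means "skip this column".
    if (printed == null) return
    kept.push({ key: header, printed: printed })
  })
  return kept
}

var columns = applyMapHeaders(['hello', 'skipthisone', 'foo'], function (header) {
  return header === 'skipthisone' ? null
    : header.substring(0, 1).toUpperCase() + header.substring(1)
})

// Header line built from the surviving columns only.
var headerLine = columns.map(function (c) { return c.printed }).join(',')
```

With these inputs, `headerLine` is `Hello,Foo`, while the original keys `hello` and `foo` are still available for reading values from each row.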


notslang · 2016-08-23T02:37:08Z

Having a function to edit the headers seems overly complex... you could just pipe the stream of objects into a transform that rewrites the keys or removes keys you don't want before sending them to csv-write-stream.

hmalphettes · 2016-08-24T07:46:05Z

@slang800 ok, that works too.

I would argue that, no matter what, that transform stream will cause some overhead.
And such a transform stream is almost as complex as csv-write-stream itself.

If one wants to optimise such a transform stream, one would need to redo the same kind of work that the csv-write-stream compiler is doing:

- extract the headers and transform them
- compile a function that extracts the values from the stream of objects and rows and pushes a transformed stream of rows

So at this point all that is left is the CSV formatting, and one would end up with csv-write-stream.

Hence I think there is value in handling this transform in the csv-write-stream itself.

notslang · 2016-08-24T09:28:31Z

It's not complex and you shouldn't need to extract the headers or redo any work. It should look something like this:

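var map = require('through2')
// Object-mode transform: drops `skipthisone` and capitalises the remaining keys.
var transform = map({objectMode: true}, function (obj, enc, cb) {
  var i, key, len, newKey, keys
  delete obj.skipthisone
  keys = Object.keys(obj)
  for (i = 0, len = keys.length; i < len; i++) {
    key = keys[i]
    newKey = key.substring(0, 1).toUpperCase() + key.substring(1)
    // Skip keys that are already capitalised; otherwise the delete below
    // would remove the value that was just assigned.
    if (newKey === key) continue
    obj[newKey] = obj[key]
    delete obj[key]
  }
  cb(null, obj)
})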

...Unless I'm totally misunderstanding what you want to do.
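
For comparison, the same idea can be sketched without through2, using Node's built-in `stream.Transform` and building a new object instead of mutating the incoming one, so keys that are already capitalised are handled safely. The helper name `renameKeys` is hypothetical, introduced only for this illustration.

```javascript
var Transform = require('stream').Transform

// Hypothetical helper: returns a new object with `skipthisone` dropped and
// every other key capitalised. The input object is left untouched.
function renameKeys (obj) {
  var out = {}
  Object.keys(obj).forEach(function (key) {
    if (key === 'skipthisone') return
    out[key.substring(0, 1).toUpperCase() + key.substring(1)] = obj[key]
  })
  return out
}

// Object-mode transform that applies renameKeys to every row.
var transform = new Transform({
  objectMode: true,
  transform: function (obj, enc, cb) {
    cb(null, renameKeys(obj))
  }
})
```

It would sit between the object source and the CSV writer, along the lines of `source.pipe(transform).pipe(writer)`, where `source` and `writer` stand for whatever streams are already in use.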

hmalphettes · 2016-08-24T10:30:17Z

Thanks @slang800

csv-write-stream could have been written in the same fashion as what you suggest:

      this.push(Object.keys(record).map(k => {
        const value = record[k];
        if (typeof value === 'string') {
          return '"' + value.replace(/"/g, '""') + '"';
        }

        return value;
      }).join() + '\n');

Instead, it generates a block of JavaScript code in the _compile method.

That approach is benchmarked to be much faster than making V8 discover the object it transforms with Object.keys or any other for loop on every row.

For example this is what it generates for {hello: "world", foo: "bar", baz: "taco"}:

function toRow(obj) {
  var a0 = obj.hello == null ? "" : obj.hello
  var a1 = obj.foo == null ? "" : obj.foo
  var a2 = obj.baz == null ? "" : obj.baz
  var result = (/[,\r\n"]/.test(a0) ? esc(a0+"") : a0)+","+(/[,\r\n"]/.test(a1) ? esc(a1+"") : a1)+","+(/[,\r\n"]/.test(a2) ? esc(a2+"") : a2)
  return result +"\n"
}

So I am trying to make sure that transforming the headers and skipping columns won't be a perf bottleneck by incorporating that into the writer.

This PR does that for an extra O(1) and adds zero overhead to the serialisation itself.
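
As a rough illustration of that argument, the sketch below compiles a row serialiser once and consults `mapHeaders` only during compilation, so the generated per-row code has the same shape as the `toRow` function shown above, minus the skipped columns. This is not the PR's actual code: the names `compile` and `esc`, the use of `new Function`, and the choice to treat `undefined` like `null` are all assumptions made for the example.

```javascript
// Hypothetical sketch, not the PR's actual code.
// Quotes a cell and doubles any embedded double quotes.
function esc (cell) {
  return '"' + cell.replace(/"/g, '""') + '"'
}

function compile (headers, mapHeaders) {
  var kept = []
  var printed = []
  headers.forEach(function (header) {
    var name = mapHeaders ? mapHeaders(header) : header
    // Assumption: null/undefined skips the column, so it never reaches
    // the generated code.
    if (name == null) return
    kept.push(header)
    printed.push(name)
  })

  // One variable per kept column, reading the ORIGINAL key from the row.
  var body = kept.map(function (header, i) {
    var prop = 'obj[' + JSON.stringify(header) + ']'
    return 'var a' + i + ' = ' + prop + ' == null ? "" : ' + prop + '\n'
  }).join('')
  var cells = kept.map(function (header, i) {
    return '(/[,\\r\\n"]/.test(a' + i + ') ? esc(a' + i + '+"") : a' + i + ')'
  })
  body += 'return ' + (cells.length ? cells.join('+","+') : '""') + '+"\\n"\n'

  // esc is passed as the first argument so the generated code can call it.
  var generated = new Function('esc', 'obj', body)

  return {
    // Header line uses the MAPPED names, escaped where needed.
    headerLine: printed.map(function (h) {
      return /[,\r\n"]/.test(h) ? esc(h) : h
    }).join(',') + '\n',
    toRow: function (obj) { return generated(esc, obj) }
  }
}

var compiled = compile(['hello', 'skipthisone', 'foo'], function (header) {
  return header === 'skipthisone' ? null
    : header.substring(0, 1).toUpperCase() + header.substring(1)
})
var row = compiled.toRow({ hello: 'world', skipthisone: 'x', foo: 'a,b' })
```

Here `compiled.headerLine` is `Hello,Foo` followed by a newline, and `row` is `world,"a,b"` followed by a newline. The header mapping runs once per compile, not once per row.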

hmalphettes wants to merge 1 commit into max-mapper:master from sutoiku:master
