CSV for Node.js

CSV Examples

Introduction

This package offers several API flavors. Every example is available on GitHub.

Using the stream API

The Node.js stream API is scalable and offers the greatest control over the data flow. This comes at the cost of being more verbose and harder to write. Data is consumed inside the readable event with the stream.read function. It is then written by calling the stream.write function. The stream example illustrates how to initialize each package and how to plug them together.

// Import the package
import {generate, parse, transform, stringify} from 'csv';

let i = 0;

const generator = generate({seed: 1, columns: 2, length: 20});
const parser = parse();
const transformer = transform(function(data){
  i++;
  return data.map(function(value){return value.toUpperCase();});
});
const stringifier = stringify();

// Read generated CSV data and send it to the parser
generator.on('readable', function(){
  let data;
  while((data = generator.read()) !== null){
    parser.write(data);
  }
});
// When generation is over, close the parser
generator.on('end', function(){
  parser.end();
});

// Read parsed records and send them to the transformer
parser.on('readable', function(){
  let data;
  while((data = parser.read()) !== null){
    transformer.write(data);
  }
});
// When parsing is over, close the transformer
parser.on('end', function(){
  transformer.end();
});

// Read transformed records and send them to the stringifier
transformer.on('readable', function(){
  let data;
  while((data = transformer.read()) !== null){
    stringifier.write(data);
  }
});
// When transformation is over, close the stringifier
transformer.on('end', function(){
  stringifier.end();
});

// Read CSV data and print it to stdout
stringifier.on('readable', function(){
  let data;
  while((data = stringifier.read()) !== null){
    process.stdout.write(data);
  }
});
// When stringifying is over, print a summary to stderr
stringifier.on('end', function(){
  process.stderr.write('=> ' + i + ' records\n');
});
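
The readable/end wiring above is repeated for each pair of streams. As a minimal sketch, it could be factored into a small helper. The name forward is hypothetical and not part of the csv package. Like the example above, it ignores the return value of write, so it does not handle backpressure.

```javascript
// Hypothetical helper (not part of the csv package): forward every chunk
// read from `source` to `destination`, then end `destination` once
// `source` has no more data. The return value of write() is ignored,
// as in the example above, so backpressure is not handled.
function forward(source, destination) {
  source.on('readable', function () {
    let data;
    while ((data = source.read()) !== null) {
      destination.write(data);
    }
  });
  source.on('end', function () {
    destination.end();
  });
  // Return the destination so that calls can be nested
  return destination;
}
```

Assuming the generator, parser, transformer and stringifier streams created above, the three wiring blocks would reduce to forward(forward(forward(generator, parser), transformer), stringifier).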

Using the pipe API

Piping in Node.js is part of the stream API and behaves just like Unix pipes, where the output of a process, here a function, is redirected as the input of the following process. A pipe example is provided with an unconventional syntax:

// Import the package
import {generate, parse, transform, stringify} from 'csv';

// Run the pipeline
generate ({seed: 1, length: 20}).pipe(
parse ()).pipe(
transform (function(record){
                return record.map(function(value){
                  return value.toUpperCase();
              });})).pipe(
stringify ()).pipe(process.stdout);

A more conventional pipe example is:

// Import the package
import * as csv from 'csv';

// Run the pipeline
csv
// Generate 20 records
  .generate({
    delimiter: '|',
    length: 20
  })
// Transform CSV data into records
  .pipe(csv.parse({
    delimiter: '|'
  }))
// Transform each value into uppercase
  .pipe(csv.transform((record) => {
    return record.map((value) => {
      return value.toUpperCase();
    });
  }))
// Convert records into a CSV stream
  .pipe(csv.stringify({
    quoted: true
  }))
// Print the CSV stream to stdout
  .pipe(process.stdout);
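
Note that pipe does not forward errors from one stream to the next. Node.js also ships a pipeline function that reports a failure from any stream in the chain. The sketch below illustrates it with built-in streams only. The upper-case Transform and the in-memory sink are stand-ins for the csv streams, and the names createUpperCaseStream and runPipeline are hypothetical.

```javascript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Stand-in for a csv transform stream: upper-cases each text chunk
function createUpperCaseStream() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Run source -> upper-case -> in-memory sink and resolve with the collected
// text. The returned promise rejects if any stream in the chain fails.
async function runPipeline(lines) {
  const chunks = [];
  const sink = new Writable({
    write(chunk, encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
  await pipeline(Readable.from(lines), createUpperCaseStream(), sink);
  return chunks.join('');
}

runPipeline(['a,b\n', 'c,d\n'])
  .then((output) => process.stdout.write(output))
  .catch((err) => {
    console.error('Pipeline failed:', err.message);
    process.exitCode = 1;
  });
```

The same approach should apply to the csv streams, since they are regular Node.js streams, by passing them to pipeline in place of the stand-ins.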

Using the callback API

The callback API is also available in the csv module. The whole dataset is made available in the second callback argument, so this API does not scale to large datasets. The callback example initializes each CSV function sequentially, with the output of the previous one. Note that, for the sake of clarity, the example does not handle errors. The nested callbacks are already hard enough to read.

// Import the package
import {generate, parse, transform, stringify} from 'csv';

// Run the pipeline
generate({seed: 1, columns: 2, length: 20}, function(err, data){
  parse(data, function(err, data){
    transform(data, function(data){
      return data.map(function(value){return value.toUpperCase();});
    }, function(err, data){
      stringify(data, function(err, data){
        process.stdout.write(data);
      });
    });
  });
});
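
The example above leaves errors unhandled. One possible way to add error handling without nesting further is a small helper that runs callback-style steps in order and stops at the first error. This is only a sketch: the waterfall helper is hypothetical, and the steps below are stand-ins that mimic the (err, data) convention rather than calls to the csv package.

```javascript
// Hypothetical helper (not part of the csv package): run callback-style
// steps in order, feeding each result to the next step, and stop at the
// first error.
function waterfall(steps, done) {
  let index = 0;
  function next(err, data) {
    if (err) return done(err);
    if (index === steps.length) return done(null, data);
    const step = steps[index++];
    step(data, next);
  }
  next(null, undefined);
}

// Stand-in steps following the (err, data) callback convention.
// With the csv package, a step would look like:
//   (data, next) => parse(data, next)
waterfall([
  (_, next) => next(null, 'a,b'),
  (data, next) => next(null, data.split(',')),
  (data, next) => next(null, data.map((value) => value.toUpperCase())),
], function (err, data) {
  if (err) {
    console.error('Pipeline failed:', err.message);
    return;
  }
  console.log(data.join(',')); // prints "A,B"
});
```

Errors are checked in a single place, the final callback, instead of once per nesting level.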

Using the sync API

The sync API behaves like pure functions. For a given input, it always produces the same output.

Because of its simplicity, this is the recommended approach if you don't need scalability and if your dataset fits in memory.

The module to import is csv/sync. The sync example illustrates its usage.

import assert from 'assert';
import {generate, parse, transform, stringify} from 'csv/sync';

// Run the pipeline
const input = generate({seed: 1, columns: 2, length: 5});
const rawRecords = parse(input);
const refinedRecords = transform(rawRecords, function(data){
  return data.map(function(value){return value.toUpperCase();});
});
const output = stringify(refinedRecords);
// Check the final result
assert.equal(output,
  `OMH,ONKCHHJMJADOA
D,GEACHIN
NNMIN,CGFDKB
NIL,JNNMJADNMINL
KB,DMIM
`);
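
The handler passed to transform in the examples is itself a plain function, so it can be named and checked on its own. The sketch below assumes a hypothetical name, upperCaseRecord, and uses an illustrative record rather than actual generator output.

```javascript
// Hypothetical named version of the inline handler used in the examples
function upperCaseRecord(record) {
  return record.map(function (value) { return value.toUpperCase(); });
}

// Illustrative record, not actual generator output
const record = ['omh', 'onkchhjmjadoa'];
const first = upperCaseRecord(record);
const second = upperCaseRecord(record);

console.log(first.join(','));                      // OMH,ONKCHHJMJADOA
console.log(first.join(',') === second.join(',')); // true: same input, same output
console.log(record.join(','));                     // omh,onkchhjmjadoa: input left untouched
```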