CSV Parse for Node.js


Option cast

The cast option operates at the field level. It can transform a field's value or change its type.

The cast value is expected to be a function. It is called for every field and receives the field value along with a context object, which gives it full control over the field. The test/option.cast.coffee test illustrates how to use it and which functionality it supports.
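As a minimal sketch of a type-changing cast function, the snippet below converts integer-looking fields to numbers and leaves everything else untouched. The name castIntegers is hypothetical, and the function is exercised on its own here rather than through the parser; it is meant to be passed as the cast option.

```javascript
// Hypothetical helper (not part of csv-parse): convert integer-looking
// fields to numbers. Intended to be passed as `cast: castIntegers`.
function castIntegers(value) {
  // Only digits with an optional leading minus sign are converted;
  // any other value is returned unchanged.
  return /^-?\d+$/.test(value) ? Number(value) : value;
}

// Exercise the function directly on a few sample field values.
const casted = ['42', '-7', '3.14', 'abc', ''].map((v) => castIntegers(v));
console.log(casted);
```

Keeping the function free of parser state makes it easy to test in isolation before wiring it into the options.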

Context

The cast function is called with 2 arguments: the field value and the context object.

The context object exposes the following properties:

  • column (number|string)
    The column name if the columns option is defined, or the field position otherwise.
  • empty_lines (number)
    Internal counter of empty lines encountered up to this field.
  • header (boolean)
    A boolean indicating if the provided value is a part of the header.
  • index (number)
    The field position within the current record starting at 0.
  • invalid_field_length (number)
    Number of records with a non-uniform length when relax_column_count is true. It was named skipped_lines until version 3.
  • lines (number)
    The number of lines which have been processed including the current line.
  • quoting (boolean)
    A boolean indicating if the field was surrounded by quotes.
  • records (number)
    The number of records which have been fully parsed. It was named count until version 3.
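To illustrate how these properties can drive a decision, here is a small sketch that relies on header and quoting: quoted fields are kept as strings, while unquoted numeric fields become numbers. The function name castUnquotedNumbers and the hand-written context objects are assumptions made for the illustration; in real use the parser supplies the context.

```javascript
// Hypothetical cast function: quoted fields stay literal strings,
// unquoted numeric fields are converted to numbers.
function castUnquotedNumbers(value, context) {
  if (context.header) return value;      // leave header fields alone
  if (context.quoting) return value;     // the field was quoted: keep the string
  if (value.trim() === '') return value; // Number('') would otherwise give 0
  const number = Number(value);
  return Number.isNaN(number) ? value : number;
}

// Hand-written contexts limited to the properties used above.
const unquotedContext = { header: false, quoting: false };
const quotedContext = { header: false, quoting: true };

const results = [
  castUnquotedNumbers('12.5', unquotedContext),
  castUnquotedNumbers('007', quotedContext),
  castUnquotedNumbers('n/a', unquotedContext),
];
console.log(results);
```

Checking quoting first is a design choice: it lets the CSV author force a value such as a zip code to remain a string simply by quoting it.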

The following example uses the context to turn the first field into an ISO 8601 timestamp string and to replace the second field with the context object it received:

import assert from 'node:assert';
import { parse } from 'csv-parse/sync';

const data = `
  2000-01-01,date1
  2050-11-27,date2
`.trim();
const records = parse(data, {
  // The cast option expects a function which
  // is called with two arguments,
  // the parsed value and a context object
  cast: function(value, context){
    // You can return any value
    if(context.index === 0){
      // Such as a string
      return `${value}T05:00:00.000Z`;
    }else{
      // Or the `context` object literal
      return context;
    }
  },
  trim: true
});
assert.deepStrictEqual(records, [
  [ '2000-01-01T05:00:00.000Z', {
    bytes: 16, comment_lines: 0, empty_lines: 0, invalid_field_length: 0,
    lines: 1, records: 0, columns: false, error: undefined, header: false,
    index: 1, column: 1, quoting: false, raw: undefined
  } ],
  [ '2050-11-27T05:00:00.000Z', {
    bytes: 35, comment_lines: 0, empty_lines: 0, invalid_field_length: 0,
    lines: 2, records: 1, columns: false, error: undefined, header: false,
    index: 1, column: 1, quoting: false, raw: undefined
  } ]
]);
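The example above returns a timestamp string. If an actual Date instance is preferred, a variation along the following lines could be used. This is only a sketch: the name castFirstFieldToDate is hypothetical, the time suffix is carried over from the example, and the context objects are written by hand so the function can be run without the parser.

```javascript
// Hypothetical cast function: return a Date for the first field of
// each record and leave the other fields unchanged.
function castFirstFieldToDate(value, context) {
  if (context.index !== 0) return value;
  const date = new Date(`${value}T05:00:00.000Z`);
  // Fall back to the raw string when the field is not a valid date.
  return Number.isNaN(date.getTime()) ? value : date;
}

// Hand-written contexts carrying only the `index` property.
const first = castFirstFieldToDate('2000-01-01', { index: 0 });
const second = castFirstFieldToDate('date1', { index: 1 });
console.log(first instanceof Date, second);
```

Returning the original string for unparseable input avoids propagating Invalid Date objects into the records.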

Using the cast and columns functions together

The cast function is called for each and every field, whether it is considered a header or not. The columns function is called once the first record is created (if treated as a header). For this reason, cast is executed before columns.

To distinguish a header field from a data field in the cast function, use the context.header property from the second argument to the cast function:

import assert from 'node:assert';
import {parse} from 'csv-parse/sync';

assert.deepEqual(
  parse('a,b,c\n1,2,3\n4,5,6', {
    cast: (value, context) => {
      if(context.header) return value;
      if (context.column === 'B') return Number(value);
      return String(value);
    },
    columns: (header) => {
      return header.map((label) => label.toUpperCase());
    },
    trim: true,
  }),
  [
    { A: '1', B: 2, C: '3' },
    { A: '4', B: 5, C: '6' }
  ]);

Note that the above example can be rewritten to implement the columns transformation directly inside cast, by setting columns: true and replacing if(context.header) return value; with if(context.header) return value.toUpperCase();:

import assert from 'node:assert';
import {parse} from 'csv-parse/sync';

assert.deepEqual(
  parse('a,b,c\n1,2,3\n4,5,6', {
    cast: (value, context) => {
      if(context.header) return value.toUpperCase();
      if (context.column === 'B') return Number(value);
      return String(value);
    },
    columns: true,
    trim: true,
  }),
  [
    { A: '1', B: 2, C: '3' },
    { A: '4', B: 5, C: '6' }
  ]);

About

The Node.js CSV project is an open source product hosted on GitHub and developed by Adaltas.