CSV Parse for Node.js

Option bom

The bom option strips the byte order mark (BOM) from the input string or buffer. When activated, the BOM is detected automatically and removed if present; parsing proceeds normally whether or not a BOM was found.

It is recommended to always activate this option when working with UTF-8 files.

About

The UTF-8 BOM is a sequence of bytes at the start of a text stream (EF BB BF, the UTF-8 encoding of the code point U+FEFF, written \ufeff in a JavaScript string) that lets the reader reliably determine that the file is encoded in UTF-8.
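As a minimal sketch of the relationship between the byte sequence and the character, the snippet below relies only on the Node.js Buffer global. The helper name startsWithUtf8Bom is hypothetical and is not part of csv-parse:

```javascript
// Hypothetical helper (not part of csv-parse): check whether a buffer
// begins with the three UTF-8 BOM bytes EF BB BF.
function startsWithUtf8Bom(buffer) {
  return (
    buffer.length >= 3 &&
    buffer[0] === 0xef &&
    buffer[1] === 0xbb &&
    buffer[2] === 0xbf
  );
}

// Encoding the character U+FEFF as UTF-8 yields the bytes EF BB BF.
const bytes = Buffer.from('\ufeffa,b,c\n', 'utf8');
console.log(bytes.subarray(0, 3).toString('hex')); // efbbbf
console.log(startsWithUtf8Bom(bytes)); // true

// Decoding with Buffer#toString keeps the BOM as one invisible character.
const text = bytes.toString('utf8');
console.log(text.charCodeAt(0).toString(16)); // feff
console.log(text.length); // 7
```

The decoded string is one character longer than it looks, which is why a BOM left in place can go unnoticed.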

Example

The default value is false. The following example simply activates the option:

import assert from 'node:assert';
import { parse } from 'csv-parse/sync';

const data = '\ufeffa,b,c\n';
const records = parse(data, {
  bom: true
});
assert.deepStrictEqual(records, [
  [ 'a', 'b', 'c' ]
]);

Hidden BOM in output

The option is disabled by default. When importing UTF-8 input, such as when reading from a file encoded as UTF-8, it is safe to activate the option, even if you are not sure it includes the BOM header.

Handling a BOM header without this option may cause unexpected behavior. The BOM will be present in the output but invisible, either in the values or in the object property names when using the columns option.

The following example illustrates how the property name is not the one printed in the console:

import assert from 'node:assert';
import { parse } from 'csv-parse/sync';

const data = "\ufeffkey\nvalue";
const records = parse(data, {
  bom: false,
  columns: true
});
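// Printed to the console, the record looks perfectly fine: {"key":"value"}
// However, the serialized property name starts with the invisible BOM character
assert.equal(JSON.stringify(records[0]), '{"\ufeffkey":"value"}');
// The first property name includes the BOM
assert.equal(Object.keys(records[0])[0], '\ufeffkey');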
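The practical consequence is that looking up the property by its visible name fails. A small sketch of how such keys could be spotted follows. The helper keysWithBom is hypothetical and not part of csv-parse, and the record literal stands in for what the example above returns:

```javascript
// Hypothetical helper (not part of csv-parse): list the property names
// that start with the BOM character U+FEFF.
function keysWithBom(record) {
  return Object.keys(record).filter((key) => key.charCodeAt(0) === 0xfeff);
}

// Stand-in for a record parsed with `bom: false` and `columns: true`.
const record = { '\ufeffkey': 'value' };

console.log(record.key); // undefined, the visible name does not match
console.log(record['\ufeffkey']); // value
console.log(keysWithBom(record).length); // 1
```

A check of this kind can help diagnose a first column that seems to be missing. Activating the bom option avoids the problem at the source.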

About

The Node.js CSV project is an open source product hosted on GitHub and developed by Adaltas.