Option encoding
The encoding option declares the input and output encodings. Its default value is utf8, which is also used when the value is true. The values null and false disable string serialization and return buffers instead of strings.
- Type: string|null
- Optional
- Default: utf8
- Since: 4.13.0
- Related: bom (see Available Options)
Default behavior
The list of encodings supported by Node.js is available in its source code. At the time of this writing, it includes 'utf8', 'ucs2', 'utf16le', 'latin1', 'ascii', 'base64', and 'hex'.
The default encoding in Node.js is UTF-8. When using UTF-8, you do not need to specify anything.
When an alternative encoding is used, it can be detected from the BOM (byte order mark) present at the beginning of the input data, or it can be defined with this option.
Working with options
When providing options, their values must reflect the encoding of the data source. If a value is a string, the parser converts it into a buffer using the selected encoding.
However, if the value is a buffer, you must make sure the buffer was created with the right encoding. For example, when the input is encoded as UTF-16LE, a delimiter option passed as a buffer must also be encoded as UTF-16LE.
BOM automatic detection
The BOM is a special Unicode character sequence placed at the beginning of a text stream to indicate its encoding.
Because the BOM is specific to Unicode, only the UTF-8 and UTF-16LE encodings are natively detected by the parser. Here is an example detecting the encoding, UTF-16LE in this case:
Notice how the BOM is declared as \uFEFF. You can see how it is converted to the hexadecimal representation FF FE with the command node -e 'console.info(Buffer.from("\ufeff", "utf16le"))'. You can refer to the byte order mark by encoding table on Wikipedia for further investigation.
Buffer output
A value of null or false disables output encoding and returns the raw buffer.