Working with the UTF-8 bytes of JavaScript strings
This post assumes you understand UTF-8.
Recently, I wanted to get the UTF-8 bytes of a JavaScript string for a demo I was working on.
I took advantage of JavaScript’s built-in TextEncoder, which turns a string into a Uint8Array of the string’s bytes.
new TextEncoder().encode("hi 🌍");
// => Uint8Array(7) [104, 105, 32, 240, 159, 140, 141]
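One thing worth noticing in that output: the byte count is not the same as the string's length. Here is a small sketch of the difference. The helper name utf8ByteLength is my own invention for illustration, not a built-in.

```javascript
// Hypothetical helper (not part of any standard API): counts the UTF-8
// bytes in a string by encoding it and measuring the resulting array.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

const text = "hi 🌍";

// .length counts UTF-16 code units; the emoji takes two of them.
console.log(text.length); // 5

// The same emoji takes four bytes in UTF-8.
console.log(utf8ByteLength(text)); // 7
```

If you only need the count, this is wasteful for large strings because it allocates the whole array, but it is probably fine for a demo.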
You can use TextDecoder to reverse the process.
const bytes = new Uint8Array([240, 159, 145, 139, 32, 104, 105]);
new TextDecoder().decode(bytes);
// => "👋 hi"
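By default, TextDecoder does not complain about invalid UTF-8. It substitutes the replacement character U+FFFD instead. If you would rather detect bad input, you can pass the fatal option. The sketch below assumes you want null back on failure; the helper name decodeUtf8Strict is hypothetical.

```javascript
// Hypothetical helper: decodes bytes as UTF-8, returning null instead of
// silently substituting U+FFFD when the input is malformed.
function decodeUtf8Strict(bytes) {
  try {
    // With { fatal: true }, decode() throws on invalid UTF-8.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    // For this sketch, any decoding failure is reported as null.
    return null;
  }
}

const valid = new TextEncoder().encode("👋 hi");
console.log(decodeUtf8Strict(valid)); // "👋 hi"

// Cut the four-byte emoji in half to produce invalid UTF-8.
const truncated = valid.slice(0, 2);
console.log(decodeUtf8Strict(truncated)); // null

// The default (non-fatal) decoder substitutes U+FFFD rather than throwing.
console.log(new TextDecoder().decode(truncated).includes("\uFFFD")); // true
```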
That’s it!
If you’re curious, I also wrote up how to do this for UTF-16 and UTF-32, which are more complicated.