Working with the UTF-8 bytes of JavaScript strings

This post assumes you understand UTF-8.

Recently, I wanted to get the UTF-8 bytes of a JavaScript string for a demo I was working on.

I took advantage of the built-in TextEncoder (available in browsers and Node.js), which turns a string into a Uint8Array of the string’s UTF-8 bytes.

new TextEncoder().encode("hi 🌍");
// => Uint8Array(7) [104, 105, 32, 240, 159, 140, 141]
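
As a small illustration of why the byte count matters, here is a sketch that compares a string's length with the number of UTF-8 bytes it encodes to. The helper name utf8ByteLength is made up for this example; it is not a built-in.

```javascript
// Hypothetical helper (not a built-in): counts the UTF-8 bytes in a string.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

const text = "hi 🌍";

// String length counts UTF-16 code units; the emoji takes two of them.
console.log(text.length); // 5

// The same emoji takes four bytes in UTF-8.
console.log(utf8ByteLength(text)); // 7
```

The two numbers only agree when every character is ASCII, so it is safer to measure the encoded bytes whenever a size limit is expressed in bytes.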

You can use TextDecoder to reverse the process.

const bytes = new Uint8Array([240, 159, 145, 139, 32, 104, 105]);

new TextDecoder().decode(bytes);
// => "👋 hi"
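
One thing worth knowing when decoding: by default, TextDecoder does not throw on malformed UTF-8 and substitutes the replacement character U+FFFD instead. If you would rather get an error, you can pass the fatal option. The sketch below assumes a helper named decodeStrict, which is my own name for illustration.

```javascript
// Hypothetical helper: decodes UTF-8 and throws a TypeError on malformed input.
function decodeStrict(bytes) {
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

// The first three bytes of a four-byte emoji, so the sequence is incomplete.
const truncated = new Uint8Array([240, 159, 145]);

// Default behavior: the result contains U+FFFD rather than throwing.
const lenient = new TextDecoder().decode(truncated);
console.log(lenient.includes("\uFFFD")); // true

// Strict behavior: malformed input raises a TypeError.
let strictFailed = false;
try {
  decodeStrict(truncated);
} catch (err) {
  strictFailed = err instanceof TypeError;
}
console.log(strictFailed); // true
```

Valid input behaves the same either way, so the strict variant only matters when the bytes come from a source you don't control.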

That’s it!

If you’re curious, I also wrote up how to do this for UTF-16 and UTF-32, which are more complicated.