The world's smallest PNG

by Evan Hahn, updated Jan 21, 2024 (originally posted Jan 4, 2024)

The smallest PNG file is 67 bytes. It’s a single black pixel. Here’s what it looks like, zoomed in 200×:

A single black pixel.

Wow, what a beauty.

This file has four sections:

The PNG signature, the same for every PNG: 8 bytes
The image’s metadata, which includes its dimensions: 25 bytes
The image’s pixel data: 22 bytes
An “end of image” marker: 12 bytes

The rest of this post describes this file in more detail and tries to explain how PNGs work along the way.

There’s a big twist at the end, if that excites you. But I hope you’re just excited to learn about PNGs.

Part 1: the PNG signature

Every single PNG, including this one, starts with the same 8 bytes. Encoded in hex, those bytes are:

89 50 4E 47 0D 0A 1A 0A

This is called the PNG signature. Try doing a hex dump on any PNG and you’ll see that it starts with these bytes.

PNG decoders use the signature to ensure that they’re reading a PNG image. Typically, they reject the file if it doesn’t start with the signature. Data can get corrupted in various ways (ever had a file with the wrong extension?) and this helps address that.¹

Fun fact: if you decode these bytes as ASCII, you’ll see the letters “PNG” in there:

.PNG....
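Here is a minimal Python sketch of both ideas: checking for the signature and viewing it as ASCII. The names `PNG_SIGNATURE`, `has_png_signature`, and `ascii_preview` are made up for this example and don't come from any library.

```python
# The 8 signature bytes shown above.
PNG_SIGNATURE = bytes.fromhex("89 50 4E 47 0D 0A 1A 0A")


def has_png_signature(data: bytes) -> bool:
    """Return True if the data starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def ascii_preview(data: bytes) -> str:
    """Show printable ASCII characters as-is and everything else as a dot."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


print(has_png_signature(PNG_SIGNATURE + b"rest of file"))  # True
print(ascii_preview(PNG_SIGNATURE))  # .PNG....
```

A real decoder does much more than this, but a prefix check like this one is usually the first step.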

So that’s the first 8 bytes. One part done! Here’s our “checklist”:

~~PNG signature~~
Image metadata chunk
Pixel data chunk
“End of image” chunk

What about the rest?

Part 2: image metadata

The next part of the PNG is the image metadata, which is one of several chunks. What’s a chunk?

Quick intro to chunks

Other than the PNG signature at the start, PNGs are made up of chunks.

Chunks have two logical pieces: a type and some data bytes. Types are things like “image header” or “text metadata”. The data depends on the type—the text metadata chunk is encoded differently from the image header chunk.

These logical pieces are encoded with four fields. These fields are always in the same order for every chunk. They are:

Length: the number of bytes in the chunk’s data field (field #3 below). Encoded as a 4-byte integer.²
Chunk type: the type of chunk this is. There are lots of different chunk types. Encoded as a 4-byte ASCII string, such as “IHDR” for “image header” or “tEXt” for “text metadata”.
Data: the data for the chunk. See the “length” field for how many bytes there will be. Varies based on the chunk type. For example, the IHDR chunk encodes the image’s dimensions. May be empty, but usually isn’t.
Checksum: a checksum of the chunk’s type and data fields (the length field is not included), to make sure no data was corrupted. 4 bytes.

As you can see, each chunk is a minimum of 12 bytes long (4 for the length, 4 for the type, and 4 for the checksum).

Note that the “length” field is the size of the “data” field, not the entire chunk. If you want to know the whole size of the chunk, just add 12—4 bytes for the length, 4 bytes for the type, and 4 bytes for the checksum.
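The four fields can be sketched as a small Python helper. This is an illustration, not code from a PNG library: `make_chunk` is a made-up name, and it assumes the checksum is a CRC32 over the type and data fields, which is how the PNG specification defines it.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Encode one chunk as length + type + data + checksum."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly 4 ASCII bytes")
    # 4-byte big-endian length of the data field only.
    length = struct.pack(">I", len(data))
    # CRC32 of the type and data (assumed from the PNG specification).
    checksum = struct.pack(">I", zlib.crc32(chunk_type + data))
    return length + chunk_type + data + checksum


empty = make_chunk(b"IEND")
print(len(empty))  # 12, the minimum chunk size

text = make_chunk(b"tEXt", b"Comment\x00hi")
print(len(text))  # 22, which is 10 data bytes plus 12
```

The "add 12" rule falls straight out of the return line: three fixed 4-byte fields wrapped around the data.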

Chunks have to appear in a specific order, though there is some wiggle room. For example, the image metadata chunk has to appear before the pixel data chunk. Once you reach the “end of image” chunk, the PNG is done.

Our tiny PNG will have just three of these chunks.

The image header chunk

The first chunk of every PNG, including ours, is of type IHDR, short for “image header”.

Each chunk starts with the length of the data in that chunk.

The IHDR chunk always has 13 bytes of associated data, as we’ll see in a moment. 13 is 0D in hex, which gets encoded like this:

00 00 00 0D

The chunk type is next. This is another four bytes. “IHDR” is encoded as:

49 48 44 52

This is just ASCII encoding. Chunk types are made up of ASCII letters. The capitalization of each letter is significant. For example, the first letter is capitalized, which means it’s a required chunk.
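As a small sketch of that capitalization rule in Python: uppercase and lowercase ASCII letters differ only in one bit (0x20), so the check comes down to testing that bit of the first byte. The function name `is_required_chunk` is invented for this example.

```python
def is_required_chunk(chunk_type: bytes) -> bool:
    """Return True when the first letter of the chunk type is uppercase.

    In ASCII, an uppercase letter has the 0x20 bit cleared and its
    lowercase counterpart has it set.
    """
    return (chunk_type[0] & 0x20) == 0


print(is_required_chunk(b"IHDR"))  # True
print(is_required_chunk(b"tEXt"))  # False
```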

Next, the chunk’s data. IHDR’s data happens to be 13 total bytes, arranged as follows:

The first eight bytes encode the image’s width and height. Because this is a 1×1 image, that’s encoded as 00 00 00 01 00 00 00 01.
The next two bytes are the bit depth and color type.
These values are probably the most confusing part of this PNG.
There are five possible color types. Our image is black-and-white so we use the “greyscale” color type (encoded as 00). If our image had color, we might use the “truecolor” type (encoded with 02). There are three other color types which we don’t need today, but you can read more about them in the PNG specification.
Once you’ve picked a color type, you need to pick a bit depth. The bit depth depends on the color type, but usually means the number of bits per color channel in an image. For example, hex colors like #FE9802 have a bit depth of eight—eight bits for red, eight bits for green, and eight bits for blue. Our all-black image doesn’t need all that…we only need one bit! The pixel is either completely black (0) or completely white (1)—in our case, it’s completely black.
If we picked a more “expressive” color type and bit depth, we could make the same 1×1 image visually, but the file could be bigger because there could be more bits per pixel that we don’t actually need. For example, if we used the “truecolor” type and 16 bits per channel, each pixel would require 48 bits instead of just one—not necessary to encode “completely black”.
With a bit depth of 1 and a color type of 0, we encode these two values with 01 00.
The next byte is the compression method. All PNGs set this to 00 for now. This is here just in case they want to add another compression method later. As far as I know, nobody has.
Same story for the filter method. It’s always 00.
The last part of the chunk’s data is the interlace method. PNGs support progressive decoding which allows images to be partly rendered as they download. We aren’t going to use that feature so we set this to 00.
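The bits-per-pixel arithmetic above can be sketched in a few lines of Python. The channel counts for the five color types are taken from the PNG specification. The names `CHANNELS`, `bits_per_pixel`, and `scanline_data_bytes` are made up here, and the sketch does not check whether a given bit depth is actually allowed for a color type.

```python
# Number of channels for each of the five PNG color types.
CHANNELS = {
    0: 1,  # greyscale
    2: 3,  # truecolor (red, green, blue)
    3: 1,  # indexed color (one palette index per pixel)
    4: 2,  # greyscale with alpha
    6: 4,  # truecolor with alpha
}


def bits_per_pixel(color_type: int, bit_depth: int) -> int:
    return CHANNELS[color_type] * bit_depth


def scanline_data_bytes(width: int, color_type: int, bit_depth: int) -> int:
    """Bytes of pixel data in one row, rounded up to a whole byte."""
    return (width * bits_per_pixel(color_type, bit_depth) + 7) // 8


print(bits_per_pixel(0, 1))           # 1
print(bits_per_pixel(2, 16))          # 48
print(scanline_data_bytes(1, 0, 1))   # 1
print(scanline_data_bytes(1, 2, 16))  # 6
```

For a 1×1 image, the greyscale 1-bit choice needs one byte of pixel data per row, while 16-bit truecolor needs six.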

Finally, every chunk ends with a four-byte checksum. It uses a common checksum function called CRC32, computed over the chunk’s type and data fields (the length field is not included). Computing that checksum gives us the following bytes:

37 6E F9 24
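If you want to check those bytes yourself, here is a short sketch using Python's standard `zlib.crc32`. The variable names are just for illustration.

```python
import zlib

chunk_type = b"IHDR"
# Width 1, height 1, bit depth 1, color type 0, then three zero bytes.
chunk_data = bytes.fromhex("00000001 00000001 01 00 00 00 00")

# The CRC32 input is the type followed by the data; the length is left out.
checksum = zlib.crc32(chunk_type + chunk_data)
print(checksum.to_bytes(4, "big").hex(" ").upper())  # 37 6E F9 24
```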

All together, here’s the whole IHDR chunk:

Bytes	What?
`00 00 00 0D`	data length of 13 bytes
`49 48 44 52`	“IHDR” as ASCII
`00 00 00 01`	width
`00 00 00 01`	height
`01`	bit depth
`00`	color type
`00`	compression method
`00`	filter method
`00`	interlace method
`37 6E F9 24`	checksum
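Going the other way, the table can be read back out of the raw bytes. This is a sketch using Python's `struct` module, with variable names chosen to match the table rows.

```python
import struct

ihdr_chunk = bytes.fromhex(
    "0000000D 49484452 00000001 00000001 01 00 00 00 00 376EF924"
)

# The length and type come first, 4 bytes each.
(length,) = struct.unpack(">I", ihdr_chunk[:4])
chunk_type = ihdr_chunk[4:8]

# The data is two 4-byte unsigned integers followed by five single bytes.
fields = struct.unpack(">IIBBBBB", ihdr_chunk[8 : 8 + length])
width, height, bit_depth, color_type, compression, filter_method, interlace = fields

print(chunk_type, length)     # b'IHDR' 13
print(width, height)          # 1 1
print(bit_depth, color_type)  # 1 0
print(len(ihdr_chunk))        # 25
```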

So that’s our first chunk! Let’s take another look at our checklist:

~~PNG signature~~
~~Image metadata chunk~~
Pixel data chunk
“End of image” chunk

Two more chunks to go—pixel data is next.

Part 3: pixel data chunk

Our next chunk is IDAT, short for “image data”. This is where the actual pixels are encoded…or just one pixel, in our case.

Remember that each chunk has four parts: the data’s length, the chunk type, the data, and a checksum.

This chunk will have 10 bytes of data. We’ll talk about what that data is shortly, but I promise it’s 10 bytes. Let’s encode that length:

00 00 00 0A

Next, let’s encode “IDAT” for the chunk type:

49 44 41 54

Again, this is just ASCII, and I’m showing the hex-encoded values.

Now for the interesting part: the image data.

First step: uncompressed pixels

Image data is encoded in a series of “scanlines”, and then compressed.

A scanline represents a horizontal line of pixels. For example, a 123×456 image has 456 scanlines. In our case, we have just one scanline, because our image is just one pixel tall.

Scanlines start with something called a filter type which can improve compression, depending on your image. Our image is so small that this is irrelevant, so we use filter type 0, or “None”.³

After the filter type, each pixel is encoded with one or more bits, depending on the bit depth. In our case, we just need one bit per pixel—recall that we have a bit depth of 1; all black or all white.

If your pixel data doesn’t line up with a byte boundary—in other words, if it’s not a multiple of 8 bits—you pad the end of your scanline with zeroes. That’s true in our case, so we add seven padding bits to fill out a byte.

Putting that together (a zero byte to start the scanline, the single zero bit, and seven zero padding bits), our single scanline is:

00 00
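Here is a rough Python sketch of that packing step for 1-bit images. `pack_1bit_scanline` is a made-up helper. It always writes filter type 0 and does no actual filtering, so it only covers the simple case described here.

```python
def pack_1bit_scanline(pixels: list[int]) -> bytes:
    """Encode one row of 1-bit pixels (0 = black, 1 = white)."""
    packed = bytearray((len(pixels) + 7) // 8)  # starts as all zero bits
    for i, pixel in enumerate(pixels):
        if pixel:
            # The leftmost pixel goes in the most significant bit of a byte.
            packed[i // 8] |= 0x80 >> (i % 8)
    # Filter type 0 ("None") comes first; unused trailing bits stay zero.
    return b"\x00" + bytes(packed)


print(pack_1bit_scanline([0]).hex(" "))      # 00 00
print(pack_1bit_scanline([0] * 8).hex(" "))  # 00 00
print(pack_1bit_scanline([1]).hex(" "))      # 00 80
```

One black pixel and eight black pixels produce the same two bytes, because the padding bits are zero too.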

Now it’s time to “compress” the data.

Second step: “compression”

Next, we compress the scanline data…well, not quite.

More accurately, we run it through a compression algorithm. Most of the time, compression algorithms produce smaller outputs—that’s the whole point! But sometimes, “compressing” tiny inputs actually produces bigger outputs because of some small overhead. Unfortunately for us, that’s what happens here. But the PNG file format makes us do it.

PNG image data is encoded in the zlib format using the DEFLATE compression algorithm. DEFLATE is also used with gzip and ZIP, two very popular compression formats.

I won’t go in depth on DEFLATE here (in part because I am not an expert⁴), but here’s what our chunk’s data contains:

The zlib header: 2 bytes
One compressed DEFLATE block that encodes two literal zeroes⁵: 4 bytes
The zlib checksum (this is separate from the PNG chunk checksum!): 4 bytes

For more on how DEFLATE works, check out “An Explanation of the DEFLATE Algorithm”. I also recommend infgen, a useful tool for inspecting DEFLATE streams.

All together, here are the ten data bytes:

78 01 63 60 00 00 00 02 00 01

Again, unfortunate that we had to run our two-byte scanline through an algorithm that made it five times bigger, but PNG makes us do it!
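To poke at those ten bytes, here is a small sketch using Python's standard `zlib` module. It splits the data into the three parts listed above and checks each one. The zlib header rules and the Adler-32 checksum come from the zlib format specification, not from this post.

```python
import zlib

idat_data = bytes.fromhex("78 01 63 60 00 00 00 02 00 01")
header, deflate_block, checksum = idat_data[:2], idat_data[2:6], idat_data[6:]

# A valid zlib header, read as a 16-bit big-endian number, is a multiple of 31.
print(int.from_bytes(header, "big") % 31)  # 0

# The low 4 bits of the first header byte name the method; 8 means DEFLATE.
print(header[0] & 0x0F)  # 8

# Decompressing gives back the two-byte scanline.
print(zlib.decompress(idat_data).hex(" "))  # 00 00

# The zlib checksum is Adler-32 over the uncompressed data.
print(zlib.adler32(b"\x00\x00").to_bytes(4, "big").hex(" "))  # 00 02 00 01
```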

With that, we can compute the PNG’s checksum field and finish off the chunk.

Bytes	What?
`00 00 00 0A`	data length of 10 bytes
`49 44 41 54`	“IDAT” as ASCII
`78 01`	zlib header
`63 60 00 00`	“compressed” DEFLATE block
`00 02 00 01`	zlib checksum
`73 75 01 18`	chunk checksum

Just one more chunk to go! Taking a final look at our checklist before everything is crossed off:

~~PNG signature~~
~~Image metadata chunk~~
~~Pixel data chunk~~
“End of image” chunk

Let’s finish this up.

Part 4: the end

Poetically, PNGs end like they begin: with a small number of constant bytes.

IEND is the final chunk, short for “image trailer”.

The zero length is encoded with 4 zeroes:

00 00 00 00

“IEND” is then encoded:

49 45 4E 44

There’s no data in IEND chunks, so we just move onto the checksum. Because everything else in the chunk is constant, this checksum is always the same:

AE 42 60 82

Here’s the whole trailer chunk:

Bytes	What?
`00 00 00 00`	data length of 0
`49 45 4E 44`	“IEND” as ASCII
`AE 42 60 82`	checksum

And our PNG is done!
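Putting all four parts together, here is a sketch that assembles the file in Python. `make_chunk` is a made-up helper. The ten IDAT data bytes are copied from Part 3 instead of being produced by `zlib.compress`, because a compressor's exact output can vary with its settings.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Encode length + type + data + CRC32 of (type + data)."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


signature = bytes.fromhex("89504E470D0A1A0A")
# Width 1, height 1, bit depth 1, color type 0, and three zero bytes.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 1, 0, 0, 0, 0))
idat = make_chunk(b"IDAT", bytes.fromhex("78016360000000020001"))
iend = make_chunk(b"IEND")

png = signature + ihdr + idat + iend
print(len(signature), len(ihdr), len(idat), len(iend))  # 8 25 22 12
print(len(png))  # 67
```

Writing `png` to a file opened in binary mode should give you an image that viewers can open.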

Admiring our work

Here it is one more time, scaled up 200×:

A single black pixel.

Beautiful. It starts with the classic PNG signature, follows up with a bit of metadata, “compresses” the pixel data, and signs off with an empty chunk.

And that’s the world’s smallest PNG!

…or is it?

The twist: there are lots of champions

Throughout this post, I’ve said that this is the world’s smallest PNG. But that’s not quite true: it’s tied for first. There are several “world’s smallest PNGs”!

As long as we encode all pixel data in a single byte, we can tie for the world’s smallest PNG.

For example, you could encode this 8×1 black image, which is also 67 bytes:

A black rectangle, 8 pixels wide and 1 pixel tall.

This works because all eight bits are used to encode pixel data.

With our 1×1 image, recall that seven bits were effectively “wasted” on padding. Here’s basically what happened:

Bits	What?
`0`	a black pixel
`0000000`	padding

An 8×1 image can encode eight black pixels like so:

Bits	What?
`00000000`	eight black pixels
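Because the scanline bytes are identical, the 8×1 image can reuse the exact same IDAT data as the 1×1 image; only the width in IHDR changes. Here is a sketch of that. The names `make_chunk` and `black_row_png` are made up, and the helper only handles widths 1 to 8.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Encode length + type + data + CRC32 of (type + data)."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def black_row_png(width: int) -> bytes:
    """Build an all-black 1-bit greyscale PNG, `width` wide and 1 pixel tall."""
    if not 1 <= width <= 8:
        raise ValueError("this sketch only handles widths 1 to 8")
    ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", width, 1, 1, 0, 0, 0, 0))
    # Same "compressed" scanline (00 00) as the 1x1 image.
    idat = make_chunk(b"IDAT", bytes.fromhex("78016360000000020001"))
    return bytes.fromhex("89504E470D0A1A0A") + ihdr + idat + make_chunk(b"IEND")


print(len(black_row_png(1)))  # 67
print(len(black_row_png(8)))  # 67
print(black_row_png(1) == black_row_png(8))  # False: the IHDR chunks differ
```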

Instead of adding more pixels, you could also add more color resolution. Many grey colors can be encoded in a single byte, letting us tie for first. For example, this 1×1 grey pixel is also 67 bytes:

A single grey pixel.

Again, this “uses up” the whole byte we have available, unlike our 1×1 image.

For more on this, my former coworker Jordan Rose published “The Biggest Smallest PNG” in response to this post. It shows the biggest 67-byte PNG: a 1×2064 black line.

Summary

PNGs start with a “signature”. The rest of the file is made up of chunks. Each chunk has a length, type, data, and checksum. Some chunks are always required, like the image header (IHDR) chunk.
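That structure is simple enough to walk with a short loop. Here is a sketch of a chunk reader in Python. `iter_chunks` is a made-up name, and it does only minimal error handling compared to a real decoder.

```python
import struct
import zlib
from typing import Iterator

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def iter_chunks(png: bytes) -> Iterator[tuple[bytes, bytes]]:
    """Yield (type, data) for each chunk, checking every checksum."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    offset = len(PNG_SIGNATURE)
    while offset < len(png):
        (length,) = struct.unpack_from(">I", png, offset)
        chunk_type = png[offset + 4 : offset + 8]
        data = png[offset + 8 : offset + 8 + length]
        (stored_crc,) = struct.unpack_from(">I", png, offset + 8 + length)
        if zlib.crc32(chunk_type + data) != stored_crc:
            raise ValueError(f"bad checksum in {chunk_type!r} chunk")
        yield chunk_type, data
        offset += 12 + length  # 4 length + 4 type + data + 4 checksum
        if chunk_type == b"IEND":
            break


smallest = bytes.fromhex(
    "89504E470D0A1A0A"
    "0000000D" "49484452" "0000000100000001" "0100000000" "376EF924"
    "0000000A" "49444154" "78016360000000020001" "73750118"
    "00000000" "49454E44" "AE426082"
)
for chunk_type, data in iter_chunks(smallest):
    print(chunk_type.decode("ascii"), len(data))
# IHDR 13
# IDAT 10
# IEND 0
```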

The smallest PNGs use the minimum number of chunks and the smallest possible data.

Our PNGs are made up of four parts:

The constant PNG signature (8 bytes)
The IHDR chunk, containing metadata (25 bytes)
The IDAT chunk, image pixel data (22 bytes)
The IEND chunk, an image trailer (12 bytes)

If you’re interested in learning more about PNGs interactively, I built PNG Chunk Explorer, which lets you analyze PNGs. Try uploading your own images to see what they’re made of! (It doesn’t work well on mobile, apologies.)

I also built Single Color Image, which generates monochromatic PNGs of arbitrary sizes. For example, you could generate a 12×34 purple rectangle. The images should be small but I haven’t yet implemented the most sophisticated compression, so you might need to run its results through a PNG compressor to achieve the smallest sizes.

Finally, I also wrote about the largest possible PNG. There’s no theoretical file size limit, but there is a maximum number of pixels, and many decoders impose various limits.

I hope this long post has given you a good understanding of the PNG file format. If you read this far and have anything to say, let me know!

¹ If you have some data and don’t know what it is, you can look at the first 8 bytes. If you see the PNG signature, it’s probably a PNG. See the MIME Sniffing specification.
² Specifically, the length is encoded as a 4-byte unsigned big-endian integer. PNGs also impose an additional restriction: the most significant bit is unused, so the range is 0 to 2³¹−1. According to the spec, this helps “accommodate languages that have difficulty with unsigned four-byte values.”
³ To be pedantic, filter type 0 is technically a filter type. It’s just a very boring one.
⁴ If you are an expert, please contact me. I want to learn!!
⁵ The DEFLATE block is made up of 4 bytes, or 32 bits. The first bit signifies that this is the final DEFLATE block. The next 2 say that this is a block that uses hard-coded Huffman codes; no “dictionary” is included in the payload. The next 8 bits encode a literal zero, and then another 8 bits encode the same. The next 7 bits are the “end of block” marker, and the final 6 pad the data so that it’s byte-aligned.
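The bit layout in the last footnote can be traced by hand with a few lines of Python. This is a sketch rather than a real DEFLATE decoder. The helper names `take` and `as_code` are made up, and the bit-ordering rules (bytes are filled from the least significant bit, Huffman codes are read most significant bit first) are taken from the DEFLATE specification.

```python
block = bytes.fromhex("63 60 00 00")

# DEFLATE fills each byte starting from its least significant bit.
bits = [(byte >> i) & 1 for byte in block for i in range(8)]


def take(count: int) -> list[int]:
    """Remove and return the next `count` bits."""
    taken = bits[:count]
    del bits[:count]
    return taken


def as_code(code_bits: list[int]) -> str:
    """Huffman codes are read most significant bit first."""
    return "".join(str(bit) for bit in code_bits)


final_block = take(1)[0]
low, high = take(2)  # header fields are stored least significant bit first
block_type = low | (high << 1)
first_literal = as_code(take(8))
second_literal = as_code(take(8))
end_of_block = as_code(take(7))
padding = as_code(bits)

print(final_block)     # 1: this is the final block
print(block_type)      # 1: hard-coded (fixed) Huffman codes
print(first_literal)   # 00110000: the fixed code for a literal zero byte
print(second_literal)  # 00110000
print(end_of_block)    # 0000000
print(padding)         # 000000
```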