What is the difference between CRC32 and CRC32C?

Both are 32-bit cyclic redundancy checks, but they use different generator polynomials. CRC32 uses the IEEE 802.3 polynomial (0x04C11DB7) and is found in ZIP, PNG, Ethernet, and gzip. CRC32C uses the Castagnoli polynomial (0x1EDC6F41), which has better error-detection properties and is accelerated by Intel SSE4.2 hardware — making it the choice for databases, storage systems (iSCSI, Btrfs), and AWS S3.

Can I use CRC32 as a security hash to detect tampering?

No. CRC32 and Adler-32 are non-cryptographic and non-keyed. An attacker can trivially compute a four-byte suffix that forces any file to produce a chosen CRC32 value, and they can craft two different files with the same checksum in seconds. For tamper detection or authenticity, use SHA-256, SHA-3, or BLAKE3. For keyed message authentication, use HMAC-SHA256.

Why does Adler-32 use modulo 65521 instead of 65536?

65521 is the largest prime number below 2^16. With a power-of-two modulus such as 65536, the reduction simply discards carries, so each bit of a sum depends only on input bits at the same or lower positions, and certain structured inputs produce identical checksums. Reducing modulo a prime folds those carries back into the low-order bits, which spreads the output values more uniformly and improves detection of accidental errors. It does not make Adler-32 collision-resistant in the cryptographic sense.

The CRC32 of the string '123456789' should be 0xCBF43926. Is that a reliable test?

Yes. 0xCBF43926 is the standard check value for CRC32 (IEEE): the result of running the algorithm over the nine ASCII bytes 0x31 through 0x39. If your implementation returns any other value for that input, it is incorrect. Similarly, CRC32C of '123456789' must return 0xE3069283, and Adler-32 of 'Wikipedia' (ASCII) must return 0x11E60398. Always validate implementations against these known vectors, and add a few cases of your own (empty input, data fed in several chunks), since one passing vector does not prove every code path.
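As a quick sanity check, the two vectors that Python's standard library can verify directly are shown below. This is a minimal sketch: it relies only on the `zlib` module, which provides CRC32 (IEEE) and Adler-32 but not CRC32C, so the CRC32C vector is not covered here.

```python
import zlib

# Known-answer checks against the standard library implementations.
crc = zlib.crc32(b"123456789")
adler = zlib.adler32(b"Wikipedia")

print(f"CRC32    = 0x{crc:08X}")    # 0xCBF43926
print(f"Adler-32 = 0x{adler:08X}")  # 0x11E60398
```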

When would I choose Adler-32 over CRC32?

Adler-32 is most useful on resource-constrained embedded systems where you cannot afford a 1 KB lookup table in ROM or RAM, since it needs only additions and a modulo operation. For bulk data over roughly 1 KB, its error-detection quality is comparable to CRC32. For inputs shorter than about 128 bytes, CRC32 is the better choice: the two Adler-32 sums are still too small to wrap around the modulus, so many of the 32 output bits carry little information.

Does my ZIP file use CRC32 on compressed or uncompressed data?

The ZIP format stores the CRC32 of the original, uncompressed file content — not the compressed deflate stream. This means your decompressor first decompresses the data, then computes CRC32 on the result, and compares it to the stored value. This catches both corruption in the compressed payload and bugs in the decompressor itself.
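To see this in practice, the sketch below builds a small archive in memory with Python's `zipfile` module and compares the stored CRC32 with one computed over the original bytes. The entry name `sample.txt` and the payload are placeholders chosen for illustration.

```python
import io
import zipfile
import zlib

payload = b"hello, checksum " * 1000

# Write a deflate-compressed entry into an in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("sample.txt", payload)

# Read the archive back and inspect the entry's metadata.
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("sample.txt")
    stored_crc = info.CRC              # value recorded in the archive
    compressed_size = info.compress_size
    bad_member = zf.testzip()          # decompresses and re-checks every CRC

expected_crc = zlib.crc32(payload)     # CRC32 of the uncompressed bytes
print(stored_crc == expected_crc, compressed_size, bad_member)
```

`testzip()` returns `None` when every entry decompresses and matches its stored CRC32, which is the same decompress-then-verify order described above.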

CRC32 & Adler-32 Calculator

Non-cryptographic checksums for ZIP files, network frames & integrity checks

CRC32 and Adler-32: The Workhorses of Lightweight Data Integrity

When a ZIP file lands on your hard drive intact, when an Ethernet frame traverses a noisy cable without corruption, when a PNG image renders pixel-perfect after crossing the internet — you have a non-cryptographic checksum to thank. CRC32, its modern sibling CRC32C, and Adler-32 are three of the most widely deployed integrity mechanisms in computing history. They are not designed to resist adversaries (that job belongs to SHA-256 or BLAKE3), but they are spectacularly good at one specific task: detecting accidental corruption quickly and cheaply.

The Mathematics of CRC32

CRC stands for Cyclic Redundancy Check. Despite the intimidating name, the concept is elegant: treat the entire input as one enormous binary polynomial, divide it by a fixed generator polynomial using carry-less (XOR) arithmetic, and the 32-bit remainder is your checksum. Any single flipped bit is guaranteed to change the remainder, as is any error burst of 32 bits or fewer; other random corruption slips through with a probability of about 1 in 2^32.

The standard CRC32 used in ZIP archives, gzip, Ethernet (IEEE 802.3), and PNG files uses the IEEE 802.3 generator polynomial: 0x04C11DB7. In practice, implementations use its bit-reversed form 0xEDB88320, because these formats process each byte least-significant bit first and the reversed constant lets the register shift right. The algorithm initializes a 32-bit register to 0xFFFFFFFF, processes each byte by XOR-ing it into the low byte of the register, then shifts right 8 times, XOR-ing in the polynomial whenever a 1-bit falls off the bottom. The final result is XOR-ed with 0xFFFFFFFF again (a bitwise NOT) to give the output. This pre- and post-conditioning ensures that a run of leading or trailing zeros actually changes the checksum.
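The steps above translate almost line for line into code. The following is a minimal, unoptimized sketch of the bit-at-a-time algorithm; the function name `crc32_bitwise` is made up for this example, and the result is cross-checked against `zlib.crc32`.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # bit-reversed form of 0x04C11DB7

def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC32 (IEEE), least-significant bit first."""
    crc = 0xFFFFFFFF                      # pre-conditioning
    for byte in data:
        crc ^= byte                       # fold the byte into the low 8 bits
        for _ in range(8):
            if crc & 1:                   # a 1-bit is about to fall off
                crc = (crc >> 1) ^ POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # post-conditioning (bitwise NOT)

check = crc32_bitwise(b"123456789")
print(f"0x{check:08X}")                   # 0xCBF43926
print(check == zlib.crc32(b"123456789"))  # True
```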

To avoid testing and XOR-ing the polynomial bit by bit, practical software implementations precompute a 256-entry lookup table, one entry per possible byte value, which reduces the cost to one table lookup, one shift, and one XOR per input byte. That is enough for hundreds of megabytes per second on a modern core; faster variants process several bytes per step with larger tables or use dedicated CPU instructions.
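A table-driven version can be sketched as follows. The helper names `make_table` and `crc32_table` are illustrative; each table entry is simply the result of running the eight shift-and-XOR steps on one possible byte value.

```python
def make_table(poly: int = 0xEDB88320) -> list[int]:
    """Precompute the CRC of every possible byte value (256 entries)."""
    table = []
    for i in range(256):
        entry = i
        for _ in range(8):
            entry = (entry >> 1) ^ poly if entry & 1 else entry >> 1
        table.append(entry)
    return table

CRC32_TABLE = make_table()

def crc32_table(data: bytes) -> int:
    """Byte-at-a-time CRC32 using the precomputed table."""
    crc = 0xFFFFFFFF
    for byte in data:
        # One lookup, one shift and one XOR per input byte.
        crc = (crc >> 8) ^ CRC32_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

print(f"0x{crc32_table(b'123456789'):08X}")  # 0xCBF43926
```

At 256 entries of 4 bytes each, the table occupies 1 KB, which is the memory cost that matters on small embedded targets.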

CRC32C: The Castagnoli Variant

CRC32C uses a different generator polynomial: 0x1EDC6F41 (reversed: 0x82F63B78), chosen by Guy Castagnoli and colleagues in 1993 specifically for its superior error-detection profile. Both polynomials catch every error burst of up to 32 bits, but CRC32C keeps a larger minimum Hamming distance at common message lengths, so it detects more multi-bit error patterns for the same 32-bit cost.

More practically, Intel's SSE4.2 instruction set (introduced in 2008) added a hardware CRC32 instruction that computes CRC32C natively, consuming up to eight bytes per instruction. This made CRC32C the preferred checksum for storage systems: it is used in iSCSI, SCTP (a network protocol), Google's LevelDB and RocksDB key-value stores, and the Btrfs filesystem. If you are choosing between CRC32 and CRC32C for a new application, CRC32C wins unless you need compatibility with existing ZIP/PNG/zlib ecosystems.

The test vector for both: the ASCII string 123456789 hashes to 0xCBF43926 under CRC32 and 0xE3069283 under CRC32C. These are the standard check values for the two algorithms; if your implementation does not hit these numbers, something is wrong.
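Because the two algorithms differ only in the polynomial, one routine can reproduce both check values. This is a slow bit-at-a-time sketch for validation purposes, not a substitute for hardware-accelerated CRC32C; the function and constant names are assumptions made for the example.

```python
CRC32_POLY = 0xEDB88320   # reversed IEEE 802.3 polynomial
CRC32C_POLY = 0x82F63B78  # reversed Castagnoli polynomial

def crc_reflected(data: bytes, poly: int) -> int:
    """Reflected 32-bit CRC with all-ones initial value and final XOR."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

vector = b"123456789"
print(f"CRC32  = 0x{crc_reflected(vector, CRC32_POLY):08X}")   # 0xCBF43926
print(f"CRC32C = 0x{crc_reflected(vector, CRC32C_POLY):08X}")  # 0xE3069283
```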

Adler-32: Speed Over Polynomial Rigor

Mark Adler (co-author of gzip and zlib) designed Adler-32 in 1995 as a faster alternative to CRC32 for the zlib compression library. The algorithm is almost childishly simple: maintain two 16-bit running sums, A and B, initialized to 1 and 0 respectively. For each input byte, add it to A; add the new value of A to B. Both sums are taken modulo 65521 (the largest prime below 2¹⁶). The final 32-bit output packs B in the high 16 bits and A in the low 16 bits.
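The whole algorithm fits in a few lines. The sketch below reduces both sums after every byte for clarity (production code typically defers the modulo across many bytes); the name `adler32_simple` is chosen for this example, and the result is compared with `zlib.adler32`.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16

def adler32_simple(data: bytes) -> int:
    """Straightforward Adler-32: two running sums modulo 65521."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER   # sum of bytes, plus one
        b = (b + a) % MOD_ADLER      # sum of the successive values of A
    return (b << 16) | a             # B in the high half, A in the low half

value = adler32_simple(b"Wikipedia")
print(f"0x{value:08X}")                       # 0x11E60398
print(value == zlib.adler32(b"Wikipedia"))    # True
```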

Why 65521 instead of 65536? Using a prime modulus eliminates patterns that would cause many distinct inputs to produce the same checksum. With a power-of-two modulus, certain symmetric inputs would alias — the prime breaks that symmetry.

The trade-off is detection quality. Adler-32 is weaker than CRC32 for very short inputs (under ~128 bytes) because A and B haven't had enough bytes to accumulate meaningful variation. For bulk data transfer — which is exactly what zlib processes — it performs adequately and runs faster on software implementations because it avoids table lookups entirely. The PNG specification, for instance, uses CRC32 for per-chunk integrity but wraps the whole compressed payload in a zlib Adler-32 as well.
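The short-input weakness is easy to quantify. For an n-byte input, A can be at most 1 + 255n, so the upper bits of A stay zero until the input is long enough for the sum to reach the modulus. A small illustration using `zlib.adler32` follows; the helper name `adler_parts` is made up for this sketch.

```python
import zlib

def adler_parts(data: bytes) -> tuple[int, int]:
    """Split an Adler-32 value into its (A, B) halves."""
    value = zlib.adler32(data)
    return value & 0xFFFF, value >> 16

# Sixteen bytes of 0xFF give the largest sums any 16-byte input can produce.
a, b = adler_parts(b"\xff" * 16)
print(a, b)  # 4081 34696: A uses only 12 of its 16 bits

# Smallest input length at which A can reach the modulus and wrap.
min_wrap_len = -(-(65521 - 1) // 255)  # ceiling division
print(min_wrap_len)  # 257
```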

Where These Algorithms Actually Appear in Production

CRC32 (IEEE): Every ZIP entry records the CRC32 of the original uncompressed content, both in its local file header (or a trailing data descriptor) and in the central directory at the end of the archive. PNG chunks (IHDR, IDAT, IEND, etc.) each carry a CRC32 trailer. The Ethernet FCS (Frame Check Sequence) is CRC32. SATA frames and USB 3.x data packets are also protected by 32-bit CRCs. So is gzip: every .gz member ends with the CRC32 of the original data, followed by its length modulo 2^32.
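The gzip trailer can be inspected directly. Per the gzip format, the last eight bytes of a member are the CRC32 of the uncompressed data and the uncompressed length modulo 2^32, both little-endian. A minimal sketch with a placeholder payload:

```python
import gzip
import struct
import zlib

payload = b"integrity check example\n" * 100
gz_bytes = gzip.compress(payload)

# Trailer layout: CRC32 (4 bytes), then ISIZE (4 bytes), little-endian.
trailer_crc, trailer_size = struct.unpack("<II", gz_bytes[-8:])

print(trailer_crc == zlib.crc32(payload))  # True
print(trailer_size == len(payload))        # True
```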

CRC32C: Used by Google Spanner's storage layer, Apache Kafka, NVMe drives (in the optional end-to-end data integrity feature), and AWS's S3 which now accepts x-amz-checksum-crc32c headers as a first-class integrity mechanism for object uploads.

Adler-32: Embedded in every zlib stream, which means every HTTP response sent with the deflate content coding and every PNG file's IDAT data carries an Adler-32 trailer. ZIP archives, and therefore Java JARs, are the exception: their entries hold raw deflate data without the zlib wrapper and rely on CRC32 instead.
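The Adler-32 trailer of a zlib stream is just as easy to observe: the last four bytes are the checksum of the uncompressed data in big-endian order, and what sits between the two-byte header and that trailer is a raw deflate body. A short sketch, with the payload chosen arbitrarily:

```python
import struct
import zlib

payload = b"zlib wraps deflate data with a header and a checksum " * 50
stream = zlib.compress(payload)

# The final 4 bytes of a zlib stream hold Adler-32, big-endian.
(trailer,) = struct.unpack(">I", stream[-4:])
print(trailer == zlib.adler32(payload))  # True

# Dropping the 2-byte header and 4-byte trailer leaves raw deflate,
# which decompresses on its own when wbits is negative.
raw_body = stream[2:-4]
print(zlib.decompress(raw_body, wbits=-15) == payload)  # True
```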

What These Algorithms Cannot Do

Non-cryptographic checksums guarantee nothing against intentional manipulation. Given a target CRC32 value, an attacker can append four bytes to any file to produce exactly that checksum — this is a trivial algebraic inversion. They can also construct two different files with the same CRC32 in seconds on commodity hardware. For anything adversarial — file authenticity, tamper detection, digital signatures — use SHA-256, SHA-3, or BLAKE3. CRC32 is not a substitute for a cryptographic hash.
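The underlying reason forgery is easy is that CRC32 is an affine function over bits: for equal-length messages, the checksum of an XOR combination can be predicted from the individual checksums without touching the data. The sketch below only demonstrates that structural property (it is not a forgery tool), and the three messages are invented for the example.

```python
import zlib

def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

m1 = b"transfer $0000100 to alice"
m2 = b"transfer $9999999 to alice"
m3 = b"transfer $0000100 to carol"
assert len(m1) == len(m2) == len(m3)

# Predict the checksum of m1 ^ m2 ^ m3 from the three checksums alone.
predicted = zlib.crc32(m1) ^ zlib.crc32(m2) ^ zlib.crc32(m3)
actual = zlib.crc32(xor_bytes(m1, m2, m3))
print(predicted == actual)  # True
```

A cryptographic hash offers no such relationship between inputs and outputs, which is why it resists this kind of manipulation.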

Also worth noting: all three algorithms are non-keyed. There is no secret involved. If you need to authenticate a message as coming from a specific sender, you want HMAC-SHA256, not a CRC.
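For comparison, a keyed tag with HMAC-SHA256 takes only a few lines using Python's standard `hmac` and `hashlib` modules. This is a minimal sketch: the randomly generated key and the `verify` helper are illustrative, and key distribution and storage are out of scope.

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)            # shared secret, 256 bits
message = b"amount=100;recipient=alice"

# Sender: compute the authentication tag.
tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))                # True
print(verify(key, message + b";x=1", tag))      # False
```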

Performance Characteristics

On modern hardware, a software CRC32 table implementation processes roughly 500–1000 MB/s per core. Using SSE4.2 hardware instructions, CRC32C reaches 10–30 GB/s. Adler-32 in software sits in the 800–1500 MB/s range because the modular arithmetic (especially the modulo prime) is the bottleneck rather than memory access. For embedded systems with no hardware acceleration, Adler-32's lack of a lookup table makes it preferable when ROM or RAM is scarce.
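Throughput depends heavily on the CPU, the library build, and the buffer size, so it is worth measuring on your own machine. The sketch below times `zlib.crc32` and `zlib.adler32` over an in-memory buffer; the helper `throughput_mb_s` is made up for this example, and because zlib builds may use optimized or hardware-assisted code paths, the figures describe that library rather than the plain table algorithm.

```python
import time
import zlib

def throughput_mb_s(func, data: bytes, repeats: int = 10) -> float:
    """Return the measured throughput of func(data) in MB/s."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(data)
    elapsed = time.perf_counter() - start
    return len(data) * repeats / elapsed / 1e6

data = bytes(1024 * 1024)  # 1 MiB of zero bytes; content does not affect speed

crc_rate = throughput_mb_s(zlib.crc32, data)
adler_rate = throughput_mb_s(zlib.adler32, data)
print(f"CRC32:    {crc_rate:8.0f} MB/s")
print(f"Adler-32: {adler_rate:8.0f} MB/s")
```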

The 32-bit output size is a deliberate engineering choice. At 32 bits, the probability of an undetected random error is roughly 1 in 4 billion — adequate for the frame sizes and file sizes these algorithms were designed to protect. For multi-gigabyte objects where you want even stronger accidental-error detection, CRC64 variants exist, but they remain niche compared to the ubiquitous 32-bit forms.

Understanding these three algorithms — their polynomials, their edge cases, their strengths, and their deliberate limitations — gives you a solid foundation for making the right integrity-checking choice in your own storage, networking, or compression code. Sometimes the right tool really is a 1970s cyclic polynomial, running faster than your memory bus.
