Signing data structures the wrong way

The article explores the critical issues of canonical encoding and domain separation in cryptography, highlighting how improper data packaging leads to vulnerabilities and introducing Snowpack as a systematic solution.
How do you package data before feeding it into a cryptographic algorithm, like Sign, Encrypt, MAC or Hash? This question has lingered for decades without a satisfactory solution. There are at least two important problems to solve. First, the encoding ought to produce canonical outputs, as systems like Bitcoin have struggled when two different encodings decode to the same in-memory data. Second, and more importantly, the encoding system ought to address the problem of domain separation.
To get a sense for this issue, let’s look at a simple example, using a well-known IDL like protobufs. Imagine a distributed system that has two types of messages: TreeRoots and KeyRevokes. By a stroke of bad luck, these two data structures line up field-for-field. If a node signs a TreeRoot, an attacker might try to forge a KeyRevoke message that serializes byte-for-byte into the same message, then staple the signature onto it. A verifier might be fooled into “verifying” a statement that the signer never intended.
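The confusion described above can be sketched in Go. This is a minimal illustration, not protobuf itself: the field layouts of TreeRoot and KeyRevoke, the encodeFields helper, and the choice of Ed25519 are all assumptions made for the example. The point is only that a positional encoding carries no type information, so a signature over one type verifies over the other.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// TreeRoot and KeyRevoke are hypothetical message types whose fields
// happen to line up: a uint64 followed by a byte string.
type TreeRoot struct {
	Epoch uint64
	Hash  []byte
}

type KeyRevoke struct {
	Seqno uint64
	KeyID []byte
}

// encodeFields is a naive positional encoding (an assumption for this
// sketch): fields are written in order, with no type or field names.
func encodeFields(n uint64, b []byte) []byte {
	var buf bytes.Buffer
	var tmp [8]byte
	binary.BigEndian.PutUint64(tmp[:], n)
	buf.Write(tmp[:])
	binary.BigEndian.PutUint64(tmp[:], uint64(len(b)))
	buf.Write(tmp[:])
	buf.Write(b)
	return buf.Bytes()
}

func (t TreeRoot) Encode() []byte  { return encodeFields(t.Epoch, t.Hash) }
func (k KeyRevoke) Encode() []byte { return encodeFields(k.Seqno, k.KeyID) }

// forgeryVerifies signs a TreeRoot, then reports whether that signature
// also verifies over a KeyRevoke carrying the same field values.
func forgeryVerifies() bool {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	root := TreeRoot{Epoch: 7, Hash: []byte("abc")}
	sig := ed25519.Sign(priv, root.Encode())

	// The attacker never touches the private key; the signature is reused.
	forged := KeyRevoke{Seqno: 7, KeyID: []byte("abc")}
	return ed25519.Verify(pub, forged.Encode(), sig)
}

func main() {
	fmt.Println("forged KeyRevoke accepted:", forgeryVerifies())
}
```

Because both types serialize to the same bytes, the verifier has no way to tell which statement the signer meant.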
This is not a theoretical attack. It has a long historical record of success in Bitcoin, Ethereum, TLS, JWTs, and AWS. The systems that have taken stabs at domain separation use ad-hoc techniques. A more systematic approach is warranted. When building FOKS, we invented one: Snowpack.
The main idea behind Snowpack is to put random, immutable domain separators directly into the IDL. A simple compiler transpiles the IDL to a target language. In the target language, a runtime library provides a method to sign such an object: it concatenates the domain separator with the serialization of the object, then feeds the resulting byte stream into the signing primitive. Verification reconstructs the same concatenation and checks it against the supplied signature.
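A rough sketch of that sign and verify flow in Go follows. It is not Snowpack's runtime library: the separator constants, the big-endian 8-byte prefix, the function names, and the use of Ed25519 are assumptions chosen for illustration. It shows the separator being prepended to the serialized object on both the signing and the verifying side.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// Hypothetical 64-bit domain separators. In the scheme described above they
// would be random values fixed in the IDL; these constants are made up.
const (
	treeRootDomain  uint64 = 0x9f3c51a27be4d806
	keyRevokeDomain uint64 = 0x2b7e90c4d15a63f8
)

// signingInput prepends the fixed-width separator to the serialized object.
// The fixed 8-byte width keeps the concatenation unambiguous.
func signingInput(domain uint64, payload []byte) []byte {
	out := make([]byte, 8, 8+len(payload))
	binary.BigEndian.PutUint64(out, domain)
	return append(out, payload...)
}

// Sign signs separator||payload rather than the bare payload.
func Sign(priv ed25519.PrivateKey, domain uint64, payload []byte) []byte {
	return ed25519.Sign(priv, signingInput(domain, payload))
}

// Verify rebuilds the same concatenation for the type the verifier expects.
func Verify(pub ed25519.PublicKey, domain uint64, payload, sig []byte) bool {
	return ed25519.Verify(pub, signingInput(domain, payload), sig)
}

// demo signs a payload as a TreeRoot and checks it under both separators.
func demo() (sameType, crossType bool) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	payload := []byte("identical serialized bytes")
	sig := Sign(priv, treeRootDomain, payload)
	sameType = Verify(pub, treeRootDomain, payload, sig)
	crossType = Verify(pub, keyRevokeDomain, payload, sig)
	return
}

func main() {
	same, cross := demo()
	fmt.Println("verifies as TreeRoot:", same)
	fmt.Println("verifies as KeyRevoke:", cross)
}
```

Even when two types serialize to identical bytes, the differing prefix means a signature made for one does not verify for the other.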
In Go and TypeScript, the type system enforces the security guarantees. These 64-bit domain separators are not required for all structs, but untagged structs cannot be fed into Sign or Verify without type errors. As long as the random domain separators are unique, there is no chance of the signer and verifier misaligning on what data types they are dealing with.
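One way such enforcement could look in Go is an interface that only tagged types satisfy. The sketch below is an assumption about the general shape, not the code Snowpack generates: the Signable interface, the method names, and the separator value are hypothetical.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// Signable is a hypothetical interface: only types that carry a domain
// separator satisfy it, so only they can be passed to Sign or Verify.
type Signable interface {
	DomainSeparator() uint64
	Encode() []byte
}

// KeyRevoke is tagged: it declares a (made-up) 64-bit separator.
type KeyRevoke struct{ KeyID []byte }

func (KeyRevoke) DomainSeparator() uint64 { return 0x2b7e90c4d15a63f8 }
func (k KeyRevoke) Encode() []byte        { return k.KeyID }

// Note is untagged: it can be encoded but has no DomainSeparator method.
type Note struct{ Text string }

func (n Note) Encode() []byte { return []byte(n.Text) }

// input builds separator||encoding for any tagged object.
func input(obj Signable) []byte {
	out := make([]byte, 8)
	binary.BigEndian.PutUint64(out, obj.DomainSeparator())
	return append(out, obj.Encode()...)
}

func Sign(priv ed25519.PrivateKey, obj Signable) []byte {
	return ed25519.Sign(priv, input(obj))
}

func Verify(pub ed25519.PublicKey, obj Signable, sig []byte) bool {
	return ed25519.Verify(pub, input(obj), sig)
}

// roundTrip signs and verifies a tagged struct.
func roundTrip() bool {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	msg := KeyRevoke{KeyID: []byte("key-1")}
	return Verify(pub, msg, Sign(priv, msg))
}

func main() {
	fmt.Println("tagged struct round-trips:", roundTrip())
	// The following line would be rejected by the compiler, because Note
	// does not implement Signable (it lacks DomainSeparator):
	//   Sign(priv, Note{Text: "hi"})
}
```

The untagged Note type remains usable for ordinary encoding; it simply cannot reach the signing primitive.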
Snowpack also ensures canonical encodings. It encodes structures as JSON-like positional arrays. This system supports removal and addition of fields, ensuring forwards- and backwards-compatibility for both RPCs and cryptographic inputs. Old decoders can still decode new encodings by seeing zero values for expected fields that are absent, and vice versa.
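The compatibility property can be illustrated with plain JSON arrays standing in for the positional format. This is only an analogy under stated assumptions: the V1 and V2 schemas, the helper names, and the use of encoding/json are invented for the example and say nothing about Snowpack's actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// V1 is a hypothetical old schema, encoded as [seqno, name].
type V1 struct {
	Seqno uint64
	Name  string
}

// V2 adds a third positional field, encoded as [seqno, name, expiry].
type V2 struct {
	Seqno  uint64
	Name   string
	Expiry uint64
}

func (v V1) MarshalArray() ([]byte, error) {
	return json.Marshal([]any{v.Seqno, v.Name})
}

func (v V2) MarshalArray() ([]byte, error) {
	return json.Marshal([]any{v.Seqno, v.Name, v.Expiry})
}

// slot decodes position i into dst. A missing position leaves dst at its
// zero value, which is what lets a newer decoder read an older encoding.
func slot(arr []json.RawMessage, i int, dst any) error {
	if i >= len(arr) {
		return nil
	}
	return json.Unmarshal(arr[i], dst)
}

// DecodeV1 reads the first two positions and ignores any later ones.
func DecodeV1(data []byte) (V1, error) {
	var v V1
	var arr []json.RawMessage
	if err := json.Unmarshal(data, &arr); err != nil {
		return v, err
	}
	if err := slot(arr, 0, &v.Seqno); err != nil {
		return v, err
	}
	return v, slot(arr, 1, &v.Name)
}

// DecodeV2 reads three positions, tolerating a missing third one.
func DecodeV2(data []byte) (V2, error) {
	var v V2
	var arr []json.RawMessage
	if err := json.Unmarshal(data, &arr); err != nil {
		return v, err
	}
	if err := slot(arr, 0, &v.Seqno); err != nil {
		return v, err
	}
	if err := slot(arr, 1, &v.Name); err != nil {
		return v, err
	}
	return v, slot(arr, 2, &v.Expiry)
}

func main() {
	newBytes, _ := V2{Seqno: 5, Name: "alice", Expiry: 99}.MarshalArray()
	old, err := DecodeV1(newBytes)
	fmt.Println(string(newBytes), old, err)
}
```

Since fields are identified by position rather than by name, adding a field only extends the array, and a decoder that knows fewer positions simply stops early.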
Source: Hacker News











