One thing that strikes me as weird is the reference to ASN.1. I always thought that bitcoin only uses DER encoding for the signatures themselves (because that is what is usual for ECDSA, even though it is suboptimal for multiple reasons) and that the rest of the protocol, including the transaction format, is specified in terms of bytes and varints. Have I missed something?
I thought that was the entire point (though it is possible that I misunderstood myself): that the transaction identifier is formed by taking a hash of the entire transaction and the signature (which, of course, could not have been signed); if anything in the data being signed were modified then this would be a very different issue, so the only options for malleability are the signature and any structure connecting the signature to the data.
The script language itself is malleable due to being executed. `NOP NOP NOP PUSHDATA` has the same result as `PUSHDATA`, despite having different bytes and a different resulting hash. The `PUSHDATA` opcodes are also in themselves malleable: you can do a `PUSHDATA2` (the next 2 bytes give the length of the data to push) or a `PUSHDATA4` (the next 4 bytes give the length) and get exactly the same output. These can largely be fixed with policy, but there are a lot of cases that add this behaviour back in. Segregated Witness simply doesn't include this data in the TXID hash (but it is hashed in a commitment for the block to avoid other attacks).
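A rough Python sketch of the script point, in case it helps. It is a toy, not Bitcoin's actual interpreter: `run`, `double_sha256`, `variants` and `sig` are names made up for illustration, `sig` is placeholder bytes rather than a real signature, and hashing only the script bytes is a stand-in for hashing the full serialized transaction. The opcode values are the real ones.

```python
import hashlib
import struct

# Real Bitcoin opcode values; everything else here is illustrative.
OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4, OP_NOP = 0x4C, 0x4D, 0x4E, 0x61

def run(script: bytes) -> list:
    """Toy interpreter that understands only data pushes and NOP."""
    stack, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if op == OP_NOP:
            continue  # no effect on the stack
        if 1 <= op <= 75:
            n = op  # the opcode itself is the length
        elif op == OP_PUSHDATA1:
            n = script[i]  # 1-byte length prefix
            i += 1
        elif op == OP_PUSHDATA2:
            (n,) = struct.unpack_from("<H", script, i)  # 2-byte LE length
            i += 2
        elif op == OP_PUSHDATA4:
            (n,) = struct.unpack_from("<I", script, i)  # 4-byte LE length
            i += 4
        else:
            raise ValueError(f"unsupported opcode {op:#x}")
        stack.append(script[i:i + n])
        i += n
    return stack

def double_sha256(data: bytes) -> str:
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()

sig = bytes(range(8))  # placeholder, NOT a real signature

# Five byte-level encodings that all push the same data.
variants = {
    "direct": bytes([len(sig)]) + sig,
    "nop+direct": bytes([OP_NOP] * 3) + bytes([len(sig)]) + sig,
    "pushdata1": bytes([OP_PUSHDATA1, len(sig)]) + sig,
    "pushdata2": bytes([OP_PUSHDATA2]) + struct.pack("<H", len(sig)) + sig,
    "pushdata4": bytes([OP_PUSHDATA4]) + struct.pack("<I", len(sig)) + sig,
}

for name, script in variants.items():
    print(f"{name:11} stack={run(script)} hash={double_sha256(script)[:16]}")
```

Every variant leaves the same single item on the stack, yet each one hashes differently, which is the whole problem when the hash is what identifies the transaction.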
For purposes of my intentionally super-zoomed-out view of this (as I think that is what is most valuable from a cryptography perspective), the script is part of the "structure connecting the signature to the data".
What I meant is that there is no ASN.1 involved in the format of the transaction itself; the only thing that is serialized in DER is the argument to OP_PUSH_DATA as part of the scriptSig.
That script is how the signature is attached to the data and is how the transaction is verified, though; it is no different at a conceptual level than "someone decided to attach the signature to the data using JSON and didn't realize that the order of the two fields mattered". Instead of using JSON, they are using a virtual machine, but that just provides even more opportunity for shenanigans ;P.
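The JSON analogy is easy to demonstrate. This is only a sketch of the analogy, not of anything Bitcoin does: the field names `data` and `sig`, their values, and the use of a single SHA-256 as the "identifier" are all made up for illustration.

```python
import hashlib
import json

# Two serializations of the same two fields, differing only in field order.
a = '{"data": "pay 1 coin to Bob", "sig": "abcd"}'
b = '{"sig": "abcd", "data": "pay 1 coin to Bob"}'

# A verifier that parses the JSON sees identical content...
same_meaning = json.loads(a) == json.loads(b)

# ...but an identifier computed over the raw bytes differs.
same_id = hashlib.sha256(a.encode()).digest() == hashlib.sha256(b.encode()).digest()

print(same_meaning, same_id)  # prints: True False
```

Anyone relaying the message can swap the field order without touching the signed data, and the identifier changes underneath you.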
I think the reason the design is this way is that it was really convenient to get the ASN.1 format out of OpenSSL for the first implementation. If the protocol were designed today, there would be no ASN.1 involved at all (see other comments for the problems associated with ASN.1 parsing).