Mandatory comment about ASN.1, a protocol from 1984, already did what Protobuf d...

theamk · 2025-12-02T01:48:49 1764640129

ASN.1 has too much stuff. The moment you write "I made ASN.1 decoder/encoder", someone will throw TeletexString or BMPString at it. Or inheritance, as morshu9001 sad. So at this point:

- You can support all those features, and your ASM.1 library will be horribly bloated and over-engineered.

- You can support your favorite subset, but then you cannot say it's ASN.1 anymore. It will be "ASN.brabel", which only has one implementation (yours). And who wants that?

(unless you are Google and have immense developer influence... But in this case, why not design things from scratch, since we are making all-new protocol anyway?)

themafia · 2025-12-02T07:07:38 1764659258

> someone will throw TeletexString or BMPString

ASCII with escapes and UCS-2.

> horribly bloated and over-engineered.

It's no more or less complicated than XML, JSON or CSV. Which is why you can use ASN.1 to serialize to and from all these formats. ASN.1 provides you an additional level of schema above these. It simply allows you to describe your problem.

I find ASN.1 far more sane and useful than something like JSON Schema which is just as "bloated and over-engineered." It turns out describing data is not a simple problem.

kragen · 2025-12-02T10:37:19 1764671839

ASN.1 is far, far more complicated than JSON or any particular flavor of CSV, in part because it does provide an extra level of schema that those other formats don't.

theamk · 2025-12-02T17:53:16 1764697996

Nope, TeletexString is ITU T.61, a.k.a codepage 1036. So Backspace (0x08) is OK, but Tab (0x09) is not.

What, your implementation does not include CP1036 to Unicode translation table? Sorry, it's no longer ASN.1, it's now ASN.themafia.

Oh, it does? Then how about xmlhstring vs hstring, are you handing difference properly?

What about REAL type? Does your asn.1 library include support for arbitrary-precision floating point numbers, both base 2 and 10? And no, you cannot use "double", I am sure there is an application out there which uses base 10 reals, or has 128 bits of mantissa.

ASN.1 is full of overengineered, ancient things like those, and the worst part - once you actually start using it to interoperate with other software, there is a good chance you'll see them. If you want something that people actually implement fully, choose something else.

themafia · 2025-12-02T22:27:08 1764714428

> Nope, TeletexString is ITU T.61

Yes, read the standard, it's ASCII with special escape sequences. Which I don't have to render, I only have to convey them correctly across the network.

> What, your implementation does not include CP1036 to Unicode translation table? Sorry, it's no longer ASN.1, it's now ASN.themafia.

Why would I need the table?

> Oh, it does? Then how about xmlhstring vs hstring, are you handing difference properly?

What exactly needs to be "handled?"

> Does your asn.1 library include support for arbitrary-precision floating point numbers

Yes, because there are third party libraries which supply this functionality, so it's hardly any special effort to implement.

> ancient things like those

So no one ever needs arbitrary precision integers? You'll eventually need them for some application. Now all you have is ad-hoc implementations and almost no software to interoperate with them or verify them.

> If you want something that people actually implement fully, choose something else.

Name anything else with the same features yet is easier to implement "fully." Seriously, go read the JSON Schema specification. This is a _hard_ problem. If you think you've found an easy solution it's likely you've just left most of the functionality on the floor. And now we have to ask "is software X compatible with software Y?" Obviating the entire point of a standard.

zzo38computer · 2025-12-01T19:53:07 1764618787

I also think ASN.1 DER is better (there are other formats, but in my opinion, DER is the only good one, because BER is too messy). I use it in some of my stuff, and when I can, my new designs also use ASN.1 DER rather than using JSON and Protobuf etc. (Some types are missing from standard ASN.1 but I made up a variant called "ASN.1X" which adds some additional types such as key/value list and some others. With the key/value list type added, it is now a superset of the data model of JSON, so you can convert JSON to ASN.1X DER.)

(I wrote a implementation of DER encoding/decoding in C, which is public domain and FOSS.)

pphysch · 2025-12-01T19:58:16 1764619096

> ASN.1, a protocol from 1984, already did what Protobuf does, with more flexibility.

After working heavily with SNMP across a wide variety of OEMs, this flexibility becomes a downside. Or SNMP/MIBs were specified at the wrong abstraction level, where the ASN.1 flexibility gives mfgs too much power to do insane and unconventional things.

morshu9001 · 2025-12-01T20:21:35 1764620495

Yeah same, ASN.1 was a nightmare when I was dealing with LTE

cryptonector · 2025-12-02T05:39:17 1764653957

I've been working on and with Kerberos and PKIX for decades. I don't find ASN.1 to be a problem as long as you have good tooling or are willing to build it. The specs are a pleasure to read -- clear, concise, precise, and approachable (once you have a mental model for it anyways).

Of course, I am an ASN.1 compiler maintainer, but hey, I had to become one because the compiler I was using was awesome but not good enough, so I made it good enough.

I'm curious what made it a nightmare for you.

whatevaa · 2025-12-02T07:58:01 1764662281

You just said it. You had to become compiler maintainer to make it good enough.

cryptonector · 2025-12-02T18:35:21 1764700521

Here's the problem though: people have used the absence of tooling to justify the creation of new, supposedly-superior schemas and codecs that by definition have strictly less tooling available on day zero and which invariably turn out to be worse than ASN.1/DER were in 1984 because the authors also refused to study the literature to see what good ideas they could pick up. That's how we end up with:

- PB being a TLV encoding, just like DER, with all the same problems

   (Instead PB should have been inspired by XDR or OER, but not DER.)

 - PB's IDL requiring explicitly tagging every field of every data structure(!) even though ASN.1 never required tagging every field, and even though ASN.1 eventually adopted automatic tagging.

 - PB's very naive approach to extensibility that is just like 1984 ASN.1's.

It's a mistake.

Some people, when faced with a dearth of tooling, will write said tooling. Other people will say that the technology in question is a nightmare, and some of those people will then go on to invent a worse wheel.

I'd be ecstatic to use something other than ASN.1 if it wasn't a poor reinvention of it.

morshu9001 · 2025-12-02T21:08:05 1764709685

Protobuf ended up having more tooling in the end though, and it didn't take very long to get there. This is like how JSON replaced XML for many use cases.

cryptonector · 2025-12-02T21:10:52 1764709852

If they had put the same energy towards building tooling for an existing IDL/codec then they would have had strictly less work to do. Besides being inefficient in the use of their resources they also saddled us with 15th system (probably more like a 25th system, but you get the reference), and a poor one at that. There is really nothing much good to say about PB.

morshu9001 · 2025-12-02T23:30:42 1764718242

I rely on protos for lots of stuff at work and honestly couldn't imagine having to do all this in ASN.1, even if tooling were completely solved.

cryptonector · 2025-12-02T23:35:21 1764718521

I use both (and JSON, and I've used XML, and I've used XDR, and ...). Check this out and weep for not having anything like this for PB: https://github.com/heimdal/heimdal/blob/master/lib/asn1/READ...

morshu9001 · 2025-12-03T00:11:10 1764720670

Not sure what this is. Transcoding to/from JSON is something protobuf does easily, but this readme is about a lot more than that.

cryptonector · 2025-12-03T03:13:13 1764731593

Yes, it's about a lot more than that. It's about automatically and recursively encoding/decoding through "typed hole". A typed hole is where you have a struct with one field that denotes the type of the other, and the other is basically a byte string whose value is an encoding of a value of a type identified by the other field. Typed holes are surprisingly common in protocols. Typically you first decode the outer value, then you inspect the typed hole's type ID field, then you decode the typed hole's value accordingly, and this is code you have to write by hand. Whereas with automatic handling of typed holes just one invocation of the codec is sufficient (as opposed to one codec invocation for the outermost value plus one invocation for every typed hole).

morshu9001 · 2025-12-04T23:31:38 1764891098

Why isn't the other value just a oneof? I get if your holed value is passthru data encoded in some special way that isn't standard asn1 or proto, but at that point it's heavily application-dependent and not really the outer protocol's job to support.

cryptonector · 2025-12-05T04:37:32 1764909452

You can do CHOICE in ASN.1, yes, and you can even make it an extensible CHOICE. In that case the tag is the type determinant, and it looks a lot like a typed hole. But! sometimes you want a typed hole where the type determinant is something like a URN, or a URI, or some other type where the value space is a) large, and b) structured so you can avoid needing a registry. And sometimes the protocol you're writing inherently can't have a type registry -- think of an RPC layer where you have headers that provide things like authentication and negotiation of things, session-like things, while the application provides the procedures (the 'P' in RPC) and so you need to identify the application without a registry of oneof tags.

morshu9001 · 2025-12-02T14:49:20 1764686960

This was the main reason. The asn.1 language has a ton of unnecessary features that make it harder to implement, but the stuff I dealt with was using those features so I couldn't just ignore it. I didn't write a compiler but did hack around some asn1c outputted code to make it faster for our use case. And had to use asn1c in the first place because there was no complete Rust asn1 compiler at the time, though I tried DIY'ing it and gave up.

I also remember it being complicated to use, but it's been too long to recall why exactly, probably the feature bloat. Once I used proto3, I realized it's all you need.

cryptonector · 2025-12-02T22:54:30 1764716070

> The asn.1 language has a ton of unnecessary features that make it harder to implement

Only if you want to implement them. You could get quite far with just a subset of UNIVERSAL types, including UTF8String, SEQUENCE/SET, SEQUENCE OF / SET OF, etc. There's a ton of features in x.680 you can easily drop.

I've implemented a subset of x.681, x.682, and x.683 to get automatic, recursive decoding through all typed holes in PKIX certificates, CRLs, CSRs, etc. Only a subset, and it got me quite far. I had a pretty good open source x.680 implementation to build on.

This is the story of how Heimdal's authors wrote its ASN.1 compiler: they wanted tooling, there wasn't a good option, they built enough for PKIX and Kerberos. They added things as they went along. OpenSSL does not-quite-DER things? Add support in the Heimdal decoder. They hacked a lot of things for a while which I later fixed, like they didn't support DEFAULT, so they changed DEFAULTed members to OPTIONAL, and they hacked IMPLICIT support, which I finished. And so on. It still doesn't have things like REAL (who needs it in security protocols? no one). Its support for GeneralString is totally half-assed just like... MIT Kerberos, OpenSSL, etc. We do what we need to. Someone could take that code, polish it up, add features, support more programming languages, and make some good money. In fact, Fabrice Belllard has his own not-open-source, commercial ASN.1 compiler and stack, and it must be quite good -- very smart!

cryptonector · 2025-12-02T05:40:18 1764654018

That's not ASN.1's fault though.

whatevaa · 2025-12-02T07:57:11 1764662231

Json doesn't support comments specifically to not allow parsing directives, that means less customization. More customization of interoperability protocols is not always a good thing.

cryptonector · 2025-12-02T18:30:05 1764700205

The compiler I [occasionally] work on does not abuse comments for directives. All directives in that compiler are out of band.

pphysch · 2025-12-02T19:02:44 1764702164

No, but it is an argument against "ASN.1 is superior to protobufs".

Many modern high-volume telemetry systems use gRPC for a good reason, it wins in the "pragmatic" department.

strongpigeon · 2025-12-01T20:21:34 1764620494

> Protobuf is ok but if you actually look at how the serializers work, it's just too complex for what it achieves.

Yeah. I do remember a lot of workloads at Google where most of the CPU time was spent serializing/deserializing protos.

ses1984 · 2025-12-02T01:50:07 1764640207

I feel like most high throughput distributed systems eventually reach a point where some part of it is constrained by de/serialization.

Not much is faster than protobuf except for zero copy formats.

kragen · 2025-12-02T10:39:17 1764671957

But zero-copy formats like FlatBuffers or Cap'n Proto can be much faster. Like, faster by an arbitrarily large factor, for data at rest.

morshu9001 · 2025-12-01T20:20:50 1764620450

ASN.1 is way overengineered to the point of making it hard to support. You don't need inheritance for example.

zzo38computer · 2025-12-01T20:35:04 1764621304

it is not necessary to use or to implement all of the data types and other features of ASN.1; you can implement only the features that you are using. Since DER uses the same framing for all data types, it is possible to skip past any fields that you do not care about (although in some cases you will still need to check its type, to determine whether or not an optional field is present; fortunately the type can be checked easily, even if it is not a type you implement).

morshu9001 · 2025-12-01T20:41:12 1764621672

Yes but I don't want to worry about what parts of the spec are implemented on each end. If you removed all the unnecessary stuff and formed a new standard, it'd basically be protobuf.

zzo38computer · 2025-12-01T20:55:17 1764622517

I do not agree. Which parts are necessary depends on the application; there is not one good way to do for everyone (and Protobuf is too limited). You will need to implement the parts specific to your schema/application on each end, and if the format does not have the data types that you want then you must add them in a more messy way (especially when using JSON).

morshu9001 · 2025-12-01T21:33:14 1764624794

In what ASN1 application is protobuf spec too limited? I've used protobuf for tons of different things, it's always felt right. Though I understand certain encodings of ASN1 can have better performance for specific things.

zzo38computer · 2025-12-01T22:03:21 1764626601

Numbers bigger than 64-bits, character sets other than Unicode (and ASCII), OIDs, etc.

morshu9001 · 2025-12-02T00:04:15 1764633855

These are only scalars that you'd encode into bytes. I guess it's slightly annoying that both ends have to agree on how to serialize rather than protobuf itself doing it, but it's not a big enough problem.

Also I don't see special ASN1 support for non-Unicode string encodings, only subsets of Unicode like ascii or printable ascii. It's a big can of worms once you bring in things like Latin-1.

zzo38computer · 2025-12-02T00:53:06 1764636786

ASN.1 has support for ISO 2022 as well as ASCII and Unicode (ASCII is a subset of Unicode as well as a subset of ISO 2022). (My own nonstandard extensions add a few more (such as TRON character code and packed BCD), and the standard unrestricted character string type can be used if you really need arbitrary character sets.) (Unicode is not a very good character set, anyways.)

Also, DER allows to indicate the type of data within the file (unless you are using implicit types). Protobuf has only a limited case of this (you cannot always identify the types), and it requires different framing for different types. However, DER uses the same framing for all types, and strings are not inherently limited to 2GB by the file format.

Furthermore, there are other non-scalar types as well.

In any of these cases, you do not have to use all of the types (nor do you need to implement all of the types); you only need to use the types that are applicable for your use.

I will continue to use ASN.1; Protobuf is not good enough in my opinion.

cryptonector · 2025-12-02T05:36:51 1764653811

To be fair, if you don't need to support anything other than Unicode, then this is not an advantage, and over time we're all going to need non-Unicode less and less. That said I'm a big fan of ASN.1 (see my comment history).

morshu9001 · 2025-12-02T18:13:39 1764699219

I'm still confused how these ISO 2022 strings even work, and the ASN1 docs discourage using the UniversalString and GraphicString types. All these different string types are intimidating if I just want unicode/ascii, and even if I were using an obscure encoding, I'd use generic bytes instead of wanting asn1 to care about it.

cryptonector · 2025-12-02T18:36:28 1764700588

GeneralString relies on control characters to "load" character sets into the C0 and C1 registers. This is madness -- specifically it's pre-Unicode madness, but before Unicode it made sense.

morshu9001 · 2025-12-02T23:18:47 1764717527

Oh gosh. Fair enough that this exists and something uses it, but I'd absolutely want to handle that on the ends only, not get asn1 involved in parsing it.

zzo38computer · 2025-12-03T21:32:38 1764797558

ASN.1 does not necessarily need to get involved in parsing the values; for some applications doing so is unnecessary anyways (this is true for many fields of many types and not only this one, though). ASN.1 will need to be involved in parsing the framing; whether or not it is involved in parsing the values depends on whether the application requires it for that specific value (for example, it is commonly not necessary to parse OIDs (you can usually just treat them as opaque data which can be compared for equality (or looked up in a table), although sometimes it is useful to display them), although some implementations insist on doing so anyways).

cryptonector · 2025-12-05T16:49:18 1764953358

Correct, ASN.1 does not tell you how to implement all its semantics. You can totally have tooling that exposes a GeneralString's blob payload to the application and lets the app handle the codeset switching aspects of GeneralString.

I want to object that anyone who really has to handle multi-codeset GeneralString values would want a better library but...

...what would that be if not a converter to/from Unicode? But what if one really wants an array/list of {codeset, string} pairs? At that point open-coding support for those escapes is probably just as well since that one might have the only application in the world that wants that!!

:laugh:

zzo38computer · 2025-12-05T21:10:38 1764969038

> You can totally have tooling that exposes a GeneralString's blob payload to the application and lets the app handle the codeset switching aspects of GeneralString.

You can do that for all types, and my own implementation does just expose the payloads for all types, although it also has functions to encode and decode many of them, they are functions that must be called separately (e.g. asn1_decode_number and asn1_decode_date).

For some applications, there is no need to decode the payload anyways, and you can just treat it as opaque data if you do not need to display them (the same is true for many other types).

> I want to object that anyone who really has to handle multi-codeset GeneralString values would want a better library

I did start to try to write such a thing (it is not published yet), and the library for ISO 2022 is separate than the library for ASN.1, although they can be used together. It is intended to be usable for multiple uses. (I might also add support for character codes other than ISO 2022 (such as the encodings of Unicode and TRON code), although it is mainly intended to support ISO 2022.)

> ...what would that be if not a converter to/from Unicode?

For one thing, not all of the codes can be correctly converted to/from Unicode (especially control characters), and even if they can be, this is does not preserve some of the details, because the way the character set works is different from Unicode (e.g. some things might be considered to match in Unicode but not other character sets and vice-versa; this would also be true of case-folding, missing details, character properties, etc). For example, some information may be lost when converting to Unicode.

(This does not necessarily mean that converting to/from Unicode is never useful, although you should consider if you can do it in a better way; for example, if your program accepts input that is meant to be added to a General string (or Graphic string) in a DER file then it would be better to accept ISO 2022 input directly if possible. Converting to TRON code (especially for CJK text) might be better, but even that depends on what you are using it for; some uses do not require conversion at all.)

If you really want to store Unicode text anyways, you should consider if UTF-8 is a valid type according to the schema you are using (and use that type instead if so; note that some schemas might not care so much about the type in some cases); if not, consider prepending the three bytes <1B 25 47> when encoding it as a ISO 2022 string. (As far as I can tell, this is not really supposed to be allowed in ASN.1, but I suppose it is possible if you really need to. You might also check if it is only ASCII and avoid adding this prefix if so.)

> But what if one really wants an array/list of {codeset, string} pairs? At that point open-coding support for those escapes is probably just as well since that one might have the only application in the world that wants that!!

I think that what you will want will depend on the specific application. Some will want that, some will want something else, and also different ways that you might want handling control characters, etc.

cryptonector · 2025-12-02T23:38:07 1764718687

Oh GeneralString is madness. It's pre-Unicode madness. It exists because Unicode didn't exist in 1984, but people still wanted to be able to exchange text in multiple scripts, which necessitated being able to "switch codesets" in the middle. It's... yeah, it's.. it's nuts. I've _not_ implemented GeneralString, and practically no one needs to even when specs say to. E.g., in Kerberos the strings are GeneralString, but all the implementations just-send-8 and do not attempt to interpret any codeset switching escapes.

zzo38computer · 2025-12-02T19:55:24 1764705324

> I'm still confused how these ISO 2022 strings even work

There is C0, G0, C1, and G1 sets (C0 and C1 are control characters and G0 and G1 are graphic characters), and escape sequences are used to select the C or G set for bytes with or without the high bit set. Graphic string does not allow control characters and General string does allow control characters.

You probably do not need all control characters; your schema should probably restrict which control characters are allowed in each context (although the ASN.1 schema format does not seem to have any way to do this). This way, you will only handle the control characters which are appropriate for your use.

This is messy, although canonical form simplifies it by adding some restrictions (this is one of the reasons why DER is better than BER, in my opinion). TRON code is better and is much simpler than the working of ISO 2022. (Unicode has a different kind of mess; although decoding is simpler, actually handling the decoded characters in text is its own big mess for many reasons. Unicode is a stateful character set, even though the encoding is stateless; TRON code is the other way around (and with a significantly simpler stateful encoding than ISO 2022).)

> the ASN1 docs discourage using the UniversalString and GraphicString types

UniversalString is UTF-32BE and GraphicString is ISO 2022 without control characters. By knowing what they are, you should know in which circumstances they should be considered useful or not useful; I think that they should not be discouraged in general (although usually if you want Unicode, you would use UTF-8 rather than UTF-32, there are some circumstances where you might want to use UTF-32, such as if the data or program is already UTF-32 for other reasons).

(The data type which probably should be avoided is the UTC time type, which is not Y2K compliant.)

> All these different string types are intimidating if I just want unicode/ascii

If you only want ASCII, use the IA5 type (or Visible if you do not want control characters); if you only want Unicode, use the UTF-8 string type (or Universal if you want UTF-32 instead for some reason). ("IA5" is another name for ASCII that as far as I can tell hardly anyone other than ITU uses.)

However, Unicode is not a very good character set, and they should not force or expect you to use it.

As I had mentioned before, you do not need to use or implement all of the ASN.1 data types; only use the ones appropriate for your application (so, if you do not like most of the types, then don't use those types). I also made up some additional nonstandard ASN.1 types (called ASN.1X), which also might be useful for some applications; you are not required to use or implement these either.

cryptonector · 2025-12-02T21:16:19 1764710179

> However, Unicode is not a very good character set, and they should not force or expect you to use it.

Unicode is an excellent character set, and for 99% of cases (much more probably) it's absolutely the best choice. So one should choose Unicode (and UTF-8) in all cases unless there is an excellent reason to do otherwise. As time passes there will be fewer and fewer cases where Unicode is not sufficient, so really we are asymptotically approaching the point at which Unicode is the only good choice to make.

This is all independent of ASN.1. But it is true that ASN.1 has built-in types for Unicode and non-Unicode strings that many other protocols lack.

Have you written up anything about ASN.1X anywhere? I'd love to take a look.

zzo38computer · 2025-12-02T22:04:00 1764713040

> Have you written up anything about ASN.1X anywhere? I'd love to take a look.

ASN1_BCD_STRING (64): Represents a string with the following characters: "0123456789*#+-. " (excluding the quotation marks). Each octet encodes two characters, where the high nybble corresponds to the first character and the low nybble corresponds to the second character.

ASN1_PC_STRING (65): Represents a string of characters in the PC character set. Note that the control characters can also be used as graphic characters.

ASN1_TRON_STRING (66): Represents a string of characters in the TRON character set, encoded as TRON-8.

ASN1_KEY_VALUE_LIST (67): Represents a set of keys (with no duplicate keys) and with a value associated with each key. The encoding is the same as for a SET of the keys, but with the corresponding value immediately after each key (when they are sorted, only the keys are sorted and the values are kept with the corresponding keys).

ASN1_UTC_TIMESTAMP (68): Represents a number of UTC seconds (and optionally fractions of seconds), excluding leap seconds, relative to epoch.

ASN1_SI_TIMESTAMP (69): Represents a number of SI seconds (and optionally fractions of seconds), including leap seconds, relative to epoch.

ASN1_UTC_TIME_INTERVAL (70): Represents a time interval as a number of UTC seconds. The number of seconds does not include leap seconds.

ASN1_SI_TIME_INTERVAL (71): Represents a time interval as a number of SI seconds (which may include fractions).

ASN1_OUT_OF_BAND (72): This type is not for use for general-purpose data. It represents something which is transmitted out of band (e.g. a file descriptor) with whatever transport mechanism is being used. The transport mechanism defines how a value of this type is supposed to be encoded with whatever ASN.1 encoding is being used.

ASN1_MORSE_STRING (73): Represents a string of characters in the Morse character set. The encoding is like a relative object identifier, where 0 means an empty space, and others is like bijective base 2 with 1 for dots and 2 for dashes, with the high bit for the first dot/dash, e.g. 4 means A and 8 means U.

ASN1_REFERENCE (74): A reference to another node within the same file. (Not all implementations will support this feature.) The encoding is like a Relative Object Identifier; the first number is how many times to go to the parent node (where 0 means the reference itself), and then the rest of the numbers specify which child node of the current node to go to where 0 means the first child, 1 means the second child, etc. It can reference a primitive or constructed node of a BER file, but you cannot specify a child index for a child of a primitive node, since primitive nodes cannot have child nodes. At least one number (how many levels of parents) is required, but any number of numbers is potentially possible.

ASN1_IDENTIFIED_DATA (75): Data which has a format and/or meaning which is identified within the data. The encoding is always constructed and consists of two or three items. The first item is a set of object identifiers, object descriptors (used only for display), and/or sequences where the first item of the sequence is a object identifier. The receiver ignores any items in this set that it does not understand. The second item in a ASN1_IDENTIFIED_DATA can be any single item of any type; it is interpreted according to the object identifiers in the first set that the receiver understands. The third item is optional, and if it is present it is a key/value list of extensions; the keys are object identifiers and the values are of any type according to the object identifiers. The default value of this key/value list is an empty key/value list.

ASN1_RATIONAL (76): Stored as constructed, containing two integers, being the numerator and the denominator. The denominator must be greater than zero. If it is canonical form, then it must be lowest terms.

ASN1_TRANSLATION_LIST (77): A key/value list where the keys identify languages. If the key is null then it means the default in case no language present in this list is applicable. The types of the values depends on the application (usually they will be some kind of character strings).

In addition, the same number for the BMP string type can also be used for a UTF-16 string type, and there is a "OBJECT IDENTIFIER RELATIVE TO" type which encodes a OID as either relative or absolute (in canonical form, it is always relative when possible) in order to save space; the schema will specify what it is relative to. ANY and ANY DEFINED BY are allowed despite being removed from the most recent versions of standard ASN.1. (The schema format for these extensions is not defined, since I am not using the ASN.1 schema format; however, someone who does use it might do so if they need it.)

There is also SDER, which is a superset of DER but a subset of BER, in case you do not want the mess of BER but do not want to require strictly canonical form either; and also SDSER which uses the same encoding for types and values than SDER but but length works differently in order to support streaming better.

As is usual, you do not have to use any or all of these types, but someone might find them useful for some uses. I have used some of them in my own stuff.

cryptonector · 2025-12-02T23:10:41 1764717041

ASN1_BCD_STRING can be just IA5String with a constraint attached...

Your time types can be just an INTEGER with a constraint attached... (In Heimdal we use INTEGER constraints to pick a representation in the programming language.) E.g.,

  -- 64-bit signed count of seconds where 0 is the Unix epoch
  ASN1_UTC_TIMESTAMP ::= INTEGER (-18446744073709551616..18446744073709551615)

ASN1_OUT_OF_BAND can just be a NULL with an APPLICATION tag or whatever:

  Out-of-Band ::= [APPLICATION 100] NULL

or maybe an ENUMERATED or BIT STRING with named bits to indicate what kind of thing is referenced out of band. You might even use this with a SEQUENCE type instead where one member identifies an out of band datum as an index, and the other identifies the kind.

ASN1_REFERENCE is... interesting. I've not needed it, but some RPC protocols support intra-payload and even circular references, so if you have a need for that (hopefully you don't), then your ASN1_REFERENCE would be useful indeed.

ASN1_IDENTIFIED_DATA... ASN.1 has EMBEDDED-PDV, open types, and the TYPE-IDENTIFIER class -- there are many ways to do this in ASN.1. See https://github.com/heimdal/heimdal/blob/master/lib/asn1/READ...

ASN1_RATIONAL is just a tagged sequence of numerator and denominator, with a constraint that the denominator must not be zero.

OBJECT IDENTIFIER RELATIVE TO is just a CHOICE of OBJECT IDENTIFIER and RELATIVE IDENTIFIER.

Re: SDER... yeah, so Heimdal's codec produces DER but accepts a subset of BER for interop with OpenSSL and others. If you really want streaming then you'll want a variant of OER with fixed-length lengths (which IMO OER should have had, dammit), which then looks a lot like XDR but with different alignment and more types.

I had kind of expected a subset of x.680.

zzo38computer · 2025-12-04T04:47:21 1764823641

> ASN1_BCD_STRING can be just IA5String with a constraint attached...

The abstract meaning matches, but the format is differently.

> ASN1_OUT_OF_BAND can just be a NULL with an APPLICATION tag or whatever

There are some uses of having a dedicated "out of band" type, such as being able to find them regardless of the schema (e.g. it might be used by a protocol that can use data with any schemas, but allows out of band data with any schema for some reason, and might want to modify the representations of out of band data when sending it to someone else).

> ASN1_IDENTIFIED_DATA... ASN.1 has EMBEDDED-PDV, open types, and the TYPE-IDENTIFIER class -- there are many ways to do this in ASN.1

EMBEDDED-PDV and those other things are different situations than I am doing, although it is similar, the use is not quite the same. ASN1_IDENTIFIED_DATA is simpler in some ways but also allows some things that EMBEDDED-PDV does not do.

Programs can also use ASN1_IDENTIFIED_DATA to identify the schema of a file that uses this type (and potentially be able to e.g. uncompress or decrypt it; this is the reason why the identifiers are allowed to be sequences and not only plain OIDs), or a part of another file.

> ASN1_REFERENCE is... interesting. I've not needed it, but some RPC protocols support intra-payload and even circular references, so if you have a need for that (hopefully you don't), then your ASN1_REFERENCE would be useful indeed.

Yes, it is what I thought too. So far I have not needed it either, but it might sometimes be useful.

> ASN1_RATIONAL is just a tagged sequence of numerator and denominator, with a constraint...

It can be defined as such in standard ASN.1, and the format is the same as that, but the abstract meaning is different. There is also a further constraint for the canonical form.

(Currently, the only place I have used this type is the tempo ratio in the .BGM lumps in Super ZZ Zero, but it would have other uses too, such as when converting data from other formats that have a rational number type.)

> OBJECT IDENTIFIER RELATIVE TO is just a CHOICE of OBJECT IDENTIFIER and RELATIVE IDENTIFIER.

It can be implemented that way in standard ASN.1 and has the same DER representation as your described type, although the abstract meaning is essentially the same as OBJECT IDENTIFIER and there is an additional constraint in the canonical form (as far as I know, this additional constraint cannot be written in standard ASN.1, but Super ZZ Zero cares about it being in canonical form (except for sound card identifiers in .BGM lumps, but this is an implementation detail for that specific part of the program)).

> I had kind of expected a subset of x.680.

Currently I am not using the schema format for ASN.1X (nor do I use the schema format of standard ASN.1); if someone else does then they might implement a variant of X.680 for use with ASN.1X. I probably would remove some stuff (and add some stuff) if I did make a variant, though.

(The use of ASN.1X is also not defined for JER, XER, OER, etc; if someone needs to, then they might do that.)

cryptonector · 2025-12-05T16:51:38 1764953498

I couldn't find anything about "Super ZZ Zero". Is this open source? Do you have a link?

zzo38computer · 2025-12-05T20:21:10 1764966070

Yes, it is open source, and I do have a link: https://github.com/zzo38/superzz0 (it is a game creation system similar to ZZT and MegaZeux)

cryptonector · 2025-12-05T23:17:32 1764976652

Thanks!

cryptonector · 2025-12-02T22:57:04 1764716224

Re: ASN1_KEY_VALUE_LIST, why not just do something like:

  ValueList ::= SET OF KeyValue

  KeyValue ::= SEQUENCE { key MyKeyType, value MyValueType }

  MyKeyType   ::= ... -- whatever you want here
  MyValueType ::= ... -- ditto

?

cryptonector · 2025-12-02T22:57:29 1764716249

What are the numbers in parenthesis? UNIVERSAL tag values?

cryptonector · 2025-12-02T05:35:55 1764653755

Open types, constrained types, parameterized types, not needing tags, etc.

cryptonector · 2025-12-02T05:31:48 1764653508

Thank you! Now I don't have to be the one saying this. Props if you use OER over DER. But since OP needs available tooling they might as well go to flatbuffers, which is much better than PB much like OER is much better than DER.

dgan · 2025-12-01T20:04:15 1764619455

I honestly looked up for a encoder/decoder for python/c++ application, and couldnt find anything usable; i guess i would need to contact the purchase department for a license (?), while with protobuf i can make the decision myself & all alone

adastra22 · 2025-12-02T05:14:33 1764652473

ASN.1 is a nightmare, and I would never use it for a greenfield project.

cryptonector · 2025-12-02T05:34:40 1764653680

I... wouldn't use it for a greenfield project either unless I got good at porting Luke Howard's Swift ASN.1 stack to whatever language I might be using that isn't C. For C I'd just use Heimdal's awesome ASN.1 compiler and be done. Even then I would be tempted to use flatbuffers instead, or else I'd have to go implement OER (a bunch of work I don't really care to do).

The problem with ASN.1 -- the only real problem with ASN.1, is lack of excellent tooling.

IshKebab · 2025-12-02T17:58:42 1764698322

I don't see how you can seriously criticise Protobuf's very simple encoding scheme as being too complex while recommending ASN.1!!

Totally mad.

bloppe · 2025-12-01T19:49:19 1764618559

What makes it too complex in your opinion?

kragen · 2025-12-02T10:40:25 1764672025

Experience with ASN.1.

kstrauser · 2025-12-02T16:24:07 1764692647

Same. My take on ASN.1 is that no one would pay me what I would ask to work on ASN.1. I’d only touch it if I had to parse files from an outside source, and a package already exists in my language of choice that’s capable of parsing those files.