Add conformance tests for overlong varints as tags.

No wire format should ever contain an overlong varint, so the only question here is how to react to non-standard and potentially corrupted data.

Today there are four main ways that implementations handle this when parsing tags:

1) parse up to 10 bytes, cast to uint32
2) parse up to 10 bytes, reject if the value is above uint32_max
3) parse up to 5 bytes, cast to uint32
4) parse up to 5 bytes, reject if the value is above uint32_max

Of our primary supported implementations, these four strategies are used by Java, Go, C++ and upb respectively.

After examining the situation, the decision is:

- Coercing down by silently ignoring bits in the tag is prone to interpretation confusion and silent misparsing, which makes the Java approach dangerous.
- Needing to support parsing up to 10 bytes (even when they may all be 0x80 and carry no content) would have real performance implications for the upb and C++ parsers. Since this should really never happen, taking a performance hit on every parse for a hypothetical case is considered undesirable.

For that reason, the conformance test is set to match upb's behavior, which is a slight mismatch with C++ and Go behavior today (in different ways), and a larger mismatch with Java behavior today.

Because fixing this 'bug' could in theory be disruptive to a customer (though it would probably mean they have bad data that was accidentally parsing), we may hold back fixing the behavior until a breaking-change release; this change to the conformance suite only establishes the decision on preferred behavior.

PiperOrigin-RevId: 841856475
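
For illustration only, the four tag-parsing strategies listed in the message can be sketched in Python as one function with two knobs. This is not code from any of the protobuf implementations: `read_varint`, `parse_tag`, `OverlongTagError`, and the `max_bytes` / `reject_overflow` parameters are hypothetical names chosen for this sketch, and the real parsers differ in many details.

```python
UINT32_MAX = 0xFFFFFFFF


class OverlongTagError(ValueError):
    """Raised when a tag varint is too long or does not fit in uint32."""


def read_varint(buf, max_bytes):
    """Decode a base-128 varint from the start of buf.

    Reads at most max_bytes bytes and returns (value, bytes_consumed).
    """
    value = 0
    for i in range(max_bytes):
        if i >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[i]
        value |= (byte & 0x7F) << (7 * i)  # low 7 bits carry the payload
        if byte & 0x80 == 0:               # continuation bit clear: done
            return value, i + 1
    raise OverlongTagError(f"varint longer than {max_bytes} bytes")


def parse_tag(buf, max_bytes, reject_overflow):
    """Parse a tag using one of the four strategies (hypothetical sketch).

    (10, False) -> strategy 1, (10, True) -> strategy 2,
    (5, False)  -> strategy 3, (5, True)  -> strategy 4.
    """
    value, consumed = read_varint(buf, max_bytes)
    if value > UINT32_MAX:
        if reject_overflow:
            raise OverlongTagError("tag exceeds uint32_max")
        value &= UINT32_MAX  # silently drops the high bits
    return value, consumed


# Field 1, wire type 0, padded to 6 bytes with empty continuation groups.
padded = bytes([0x88, 0x80, 0x80, 0x80, 0x80, 0x00])
# 5-byte varint whose value is 8 + 2**32, so it does not fit in uint32.
too_big = bytes([0x88, 0x80, 0x80, 0x80, 0x10])
```

In this sketch, `parse_tag(padded, 10, False)` accepts the padded input as tag 8, while either 5-byte variant raises. `parse_tag(too_big, 5, False)` silently truncates to tag 8, which is the kind of misparse the message describes; `parse_tag(too_big, 5, True)`, corresponding to strategy 4, rejects it.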
Protobuf Team Bot committed
448b53feed0e0d8ed3113cc8c76ef75ff0072813
Parent: 54a48aa
Committed by Copybara-Service <copybara-worker@google.com>
on 12/8/2025, 7:55:52 PM