From 8693a005c08e1c84d693fe7baa154f8785007520 Mon Sep 17 00:00:00 2001
From: "Neal H. Walfield"
Date: Wed, 6 Nov 2019 14:42:50 +0100
Subject: openpgp: Replace RFC 2822 parser with a de facto parser
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

  - RFC 4880 says that "by convention, [a User ID Packet] includes an
    RFC 2822 [RFC2822] mail name-addr."  This is not the actual
    convention, and attempting to parse User IDs using an RFC 2822
    parser means that many common User IDs cannot be parsed.

  - Disparities between the actual convention and the stated
    convention include:

    - Neither users nor the software they use to create keys
      correctly quotes User IDs:

      - 'Nachname, Vorname ' is not valid, because it contains an
        unquoted comma.  It should be 'Nachname\, Vorname ' or
        '"Nachname, Vorname" '.  (The same goes for dots, single
        quotes, etc.)

      - 'user@example.org ' is not valid, because it contains an
        unquoted at symbol.

    - 'Bj=?utf-8?q?=C3=B6?=rn ' is encoded using RFC 2047, which is
      what RFC 2822 mandates when using non-ASCII characters, but no
      OpenPGP software would decode this User ID.  In practice,
      everyone just uses UTF-8 (in this case: 'Björn ').

    - There are many examples of User IDs containing raw email
      addresses ('user@example.org').  But these are not
      "name-addr"s.  At best, they are RFC 2822 "mailbox"es.

    - Some User IDs only contain a name (e.g., "Frank PGP").

  - RFC 2822 also includes a lot of complexity that no one uses or
    needs.  For instance, CFWS (comments and folding whitespace) can
    be placed everywhere, and the rules for parsing them are
    complex.

  - Instead of continuing to bend the RFC 2822 parser to our will, we
    accept reality.

  - This patch replaces the RFC 2822 parser with a significantly
    simpler parser, which is based on actual convention (i.e., User
    IDs in the wild).

  - This parser is based on dkg's mail to the OpenPGP working group
    mailing list.
    Message-ID: <87woe7zx7o.fsf@fifthhorseman.net>
    https://mailarchive.ietf.org/arch/msg/openpgp/wNo27-0STfGR9JZSlC7s6OYOJkI

  - This initial version has one notable regression with respect to
    the RFC 2822 parser: it doesn't handle User IDs holding URIs.
---
 tool/src/sq.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'tool/src')

diff --git a/tool/src/sq.rs b/tool/src/sq.rs
index ccc56738..b3f78255 100644
--- a/tool/src/sq.rs
+++ b/tool/src/sq.rs
@@ -292,7 +292,7 @@ fn real_main() -> Result<(), failure::Error> {
             let addr = m.value_of("address").map(|a| a.to_string())
                 .or_else(|| {
                     if let Some(Ok(Some(a))) =
-                        tpk.userids().nth(0).map(|u| u.userid().address())
+                        tpk.userids().nth(0).map(|u| u.userid().email())
                     {
                         Some(a)
                     } else {
-- 
cgit v1.2.3
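
The de facto convention described in the commit message can be illustrated
with a minimal sketch.  This is not the parser added by this patch, and it
is not the grammar from dkg's mail; `ParsedUserId` and `parse_user_id` are
hypothetical names, and the example addresses are made up.  The sketch only
shows the three shapes the message lists: a name followed by an address in
angle brackets, a bare address, and a bare name.  It does no quoting, no
RFC 2047 decoding and no CFWS handling.

```rust
/// Hypothetical result type for this sketch; not part of Sequoia's API.
#[derive(Debug, Default, PartialEq)]
pub struct ParsedUserId {
    pub name: Option<String>,
    pub email: Option<String>,
}

/// Trims `s` and returns `None` if nothing is left.
fn non_empty(s: &str) -> Option<String> {
    let s = s.trim();
    if s.is_empty() {
        None
    } else {
        Some(s.to_string())
    }
}

/// Splits a User ID along the lines of the convention seen in the wild.
/// Assumption: the address, if any, is the last `<...>` group or the
/// whole string; everything else is taken verbatim as the name.
pub fn parse_user_id(userid: &str) -> ParsedUserId {
    let s = userid.trim();

    // Shape 1: "Name <address>".  Commas, dots and at symbols in the
    // name need no quoting, and UTF-8 is passed through untouched.
    if s.ends_with('>') {
        if let Some(lt) = s.rfind('<') {
            return ParsedUserId {
                name: non_empty(&s[..lt]),
                email: non_empty(&s[lt + 1..s.len() - 1]),
            };
        }
    }

    // Shape 2: a raw address such as "user@example.org".
    if s.contains('@') && !s.contains(char::is_whitespace) {
        return ParsedUserId {
            name: None,
            email: non_empty(s),
        };
    }

    // Shape 3: a name only, e.g. "Frank PGP".
    ParsedUserId {
        name: non_empty(s),
        email: None,
    }
}
```

Under these assumptions, `parse_user_id("Frank PGP")` yields a name and no
address, while `parse_user_id("user@example.org")` yields an address and no
name.  A strict RFC 2822 name-addr parser would reject both inputs.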