From 8693a005c08e1c84d693fe7baa154f8785007520 Mon Sep 17 00:00:00 2001
From: "Neal H. Walfield" <neal@pep.foundation>
Date: Wed, 6 Nov 2019 14:42:50 +0100
Subject: openpgp: Replace RFC 2822 parser with a de facto parser
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

  - RFC 4880 says that "by convention, [a User ID Packet] includes an
    RFC 2822 [RFC2822] mail name-addr."  This is not the actual
    convention, and attempting to parse User IDs using an RFC 2822
    parser means that many common User IDs cannot be parsed.

    - Disparities between the actual convention and the stated
      convention include:

      - Neither users nor the software they use to create keys
        correctly quotes User IDs:

        - 'Nachname, Vorname <name@example.org>' is not valid, because
          it contains an unquoted comma.  It should be 'Nachname\,
          Vorname <name@example.org>' or '"Nachname, Vorname"
          <name@example.org>'.  (The same goes for dots, single
          quotes, etc.)

        - 'user@example.org <user@example.org>' is not valid, because
          it contains an unquoted at symbol.

        - 'Bj=?utf-8?q?=C3=B6?=rn <bjoern@example.net>' is encoded
          using RFC 2047, which is what RFC 2822 mandates when using
          non-ASCII characters, but no OpenPGP software would decode
          this User ID.  In practice, everyone just uses UTF-8 (in
          this case: 'Björn <bjoern@example.net>').

      - There are many examples of User IDs containing raw email
        addresses ('user@example.org'), but these are not
        "name-addr"s.  At best, they are RFC 2822 "mailbox"es.

      - Some User IDs only contain a name (e.g., "Frank PGP").

    - RFC 2822 also includes a lot of complexity that no one uses or
      needs.  For instance, CFWS (comments and folding whitespace) can
      be placed everywhere, and the rules for parsing them are
      complex.

  - Instead of continuing to bend the RFC 2822 parser to our will, we
    accept reality.

  - This patch replaces the RFC 2822 parser with a significantly
    simpler parser, which is based on actual convention (i.e., User
    IDs in the wild).

    - This parser is based on dkg's mail to the OpenPGP working group
      mailing list.

        Message-ID: <87woe7zx7o.fsf@fifthhorseman.net>
        https://mailarchive.ietf.org/arch/msg/openpgp/wNo27-0STfGR9JZSlC7s6OYOJkI

  - This initial version has one notable regression with respect to
    the RFC 2822 parser: it doesn't handle User IDs holding URIs.
---
 net/src/wkd.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'net')
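Note: to illustrate the kind of "de facto" parsing the commit message
describes, here is a minimal, std-only sketch.  It is not the parser
added by this patch.  The names ParsedUserId, looks_like_email and
parse_user_id are hypothetical, and the accepted shapes (a bare
address, or "Name (Comment) <address>" with every part optional) are
an assumption drawn from the examples above.  Like the initial version
described above, it makes no attempt to handle URIs.

```rust
/// Hypothetical result type for this sketch; not the sequoia-openpgp API.
#[derive(Debug, PartialEq)]
struct ParsedUserId {
    name: Option<String>,
    comment: Option<String>,
    email: Option<String>,
}

/// Loose address check (an assumption, far weaker than RFC 2822):
/// exactly one '@', non-empty local part and domain, and no
/// whitespace or angle brackets.
fn looks_like_email(s: &str) -> bool {
    let mut parts = s.split('@');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(local), Some(domain), None) => {
            !local.is_empty()
                && !domain.is_empty()
                && !s.chars().any(|c| c.is_whitespace() || c == '<' || c == '>')
        }
        _ => false,
    }
}

/// Trims `s` and maps the empty string to `None`.
fn non_empty(s: &str) -> Option<String> {
    let s = s.trim();
    if s.is_empty() { None } else { Some(s.to_string()) }
}

/// Splits a User ID into name, comment and email without requiring
/// any RFC 2822 quoting.  UTF-8 is passed through untouched.
fn parse_user_id(uid: &str) -> ParsedUserId {
    let uid = uid.trim();

    // Case 1: the whole User ID is a raw address.
    if looks_like_email(uid) {
        return ParsedUserId {
            name: None,
            comment: None,
            email: Some(uid.to_string()),
        };
    }

    // Case 2: an optional trailing "<address>".
    let (rest, email) = match (uid.rfind('<'), uid.ends_with('>')) {
        (Some(open), true) => {
            let addr = &uid[open + 1..uid.len() - 1];
            if looks_like_email(addr) {
                (&uid[..open], Some(addr.to_string()))
            } else {
                (uid, None)
            }
        }
        _ => (uid, None),
    };
    let rest = rest.trim();

    // An optional trailing "(comment)" before the address.
    let (name, comment) = match (rest.rfind('('), rest.ends_with(')')) {
        (Some(open), true) => {
            (&rest[..open], non_empty(&rest[open + 1..rest.len() - 1]))
        }
        _ => (rest, None),
    };

    // Whatever is left is taken verbatim as the name: commas, dots and
    // '@' need no quoting here.
    ParsedUserId { name: non_empty(name), comment, email }
}
```

With this sketch, 'Nachname, Vorname <name@example.org>' yields the
name 'Nachname, Vorname' and the email 'name@example.org', a bare
'user@example.org' yields only an email, and 'Frank PGP' yields only a
name.  Taking the name verbatim is the point of the design: the parser
accepts what key-generation software actually emits rather than what
the RFC 2822 grammar permits.
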
diff --git a/net/src/wkd.rs b/net/src/wkd.rs
index 02dee7ec..5b3422a2 100644
--- a/net/src/wkd.rs
+++ b/net/src/wkd.rs
@@ -240,7 +240,7 @@ fn parse_body<S: AsRef<str>>(body: &[u8], email_address: S)
         // method to maintain
         .filter(|tpk| {tpk.userids()
             .any(|uidb|
-                if let Ok(Some(a)) = uidb.userid().address() {
+                if let Ok(Some(a)) = uidb.userid().email() {
                     a == email_address
                 } else { false })
         }).cloned().collect();
@@ -350,7 +350,7 @@ pub fn insert<P, S, V>(base_path: P, domain: S, variant: V,
 
     // First, check which UserIDs are in `domain`.
     let addresses = tpk.userids().filter_map(|uidb| {
-        uidb.userid().address().unwrap_or(None).and_then(|addr| {
+        uidb.userid().email().unwrap_or(None).and_then(|addr| {
             if EmailAddress::from(&addr).ok().map(|e| e.domain == domain)
                 .unwrap_or(false)
             {
-- 
cgit v1.2.3