Internet-Draft DKIM Replay: Problem Statement March 2023
Crocker Expires 9 September 2023
Intended Status:
D. Crocker
Brandenburg InternetWorking

DKIM Replay: Problem Statement


DomainKeys Identified Mail (DKIM, RFC6376) permits claiming some responsibility for a message by cryptographically associating a domain name with the message. For data covered by the cryptographic signature, this also enables detecting changes made during transit. DKIM survives basic email relaying. In a Replay Attack, a recipient of a DKIM-signed message re-posts the message to other recipients, while retaining the original, validating signature, and thereby leveraging the reputation of the original signer. This document discusses the resulting damage to email delivery, interoperability, and associated mail flows. A significant challenge to mitigating this problem is that it is difficult for receivers to differentiate between legitimate forwarding flows and a DKIM Replay Attack.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 9 September 2023.

1. Introduction

DKIM is a well-established email protocol RFC6376:

1.1. The problem

The presence of a DKIM signature serves as a basis for developing an assessment of mail received, over time, using that signature. That assessment constitutes a reputation, which then serves to guide future handling of mail arriving with a DKIM signature for that domain name. The presence of a validated DKIM signature was designed to ensure that the developed reputation is the result of activity only by the domain owner, and not by other, independent parties. That is, it defines a 'clean' channel of behavior by the domain owner, with no 'noise' introduced by other actors.

A receiving filtering system applies a rich array of rules and heuristics to assess email and to protect users against spam, phishing, and other abuses. DKIM provides an identity that this system can use for reputation assessment and prediction of future sender behavior.

During development of the DKIM specification, DKIM Replay was identified as being of only hypothetical concern. However, that attack has become commonplace:

Internet Mail permits sending a message to addresses that are not listed in the content To:, Cc:, or Bcc: header fields. Although DKIM covers portions of the message content, and can cover these header fields, it does not cover the envelope addresses that the email transport service uses to determine handling. A signed message can therefore be replayed to thousands or millions of other recipients, none of whom were specified by the original author.

That is, DKIM Replay takes a message with a valid DKIM signature, and distributes it widely to many additional recipients, without breaking the signature.
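
The gap between envelope and content can be sketched as follows. This is a minimal illustration and not an implementation of DKIM: an HMAC stands in for the real public-key signature, canonicalization is omitted, and the names toy_sign, toy_verify, SIGNING_KEY, and SIGNED_FIELDS are hypothetical. It is meant to show only that the verifier's inputs do not include the envelope recipient.

```python
import hashlib
import hmac

# Hypothetical stand-ins: real DKIM (RFC6376) uses public-key signatures and
# canonicalization; an HMAC is used here only to keep the sketch short.
SIGNING_KEY = b"demo-key"
SIGNED_FIELDS = ("From", "To", "Subject")


def toy_sign(headers: dict, body: str) -> str:
    """Compute a keyed hash over the signed header fields and a body hash."""
    body_hash = hashlib.sha256(body.encode()).hexdigest()
    covered = "".join(
        f"{name.lower()}:{headers.get(name, '')}\r\n" for name in SIGNED_FIELDS
    )
    data = (covered + body_hash).encode()
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()


def toy_verify(envelope_rcpt: str, headers: dict, body: str, signature: str) -> bool:
    """Check the content only; envelope_rcpt is accepted but never examined."""
    return hmac.compare_digest(toy_sign(headers, body), signature)


headers = {
    "From": "author@signer.example",
    "To": "first@recipient.example",
    "Subject": "Hello",
}
body = "Message text\r\n"
signature = toy_sign(headers, body)

# Same content and signature, two different envelope recipients.
original_ok = toy_verify("first@recipient.example", headers, body, signature)
replayed_ok = toy_verify("someone@elsewhere.example", headers, body, signature)
```

Both checks succeed in this sketch, because nothing in the signed data depends on where the message is delivered.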

Therefore, DKIM Replay is impossible to detect or prevent with current standards and practices. Simply put, email authentication does not distinguish benign re-posting flows from a DKIM Replay Attack.

ARC RFC8617 is a protocol to securely propagate authentication results seen by Mediators that re-post a message, such as mailing lists. It can be used to adjust DMARC RFC7489 validation as described in section 7.2.1. Because ARC is heavily based on DKIM, it has the same "replay" issue as described in section 9.5.

1.2. Glossary

Modern email operation often involves many actors and many different actions. This section attempts to identify those relevant to Replay Attacks.

This document is only Informative and omits the normative language defined in RFC2119. Mail architectural terminology that is used here is from RFC5598 and RFC5321.

RFC5598 defines mail interactions conceptually from three perspectives of activities, divided into three types of roles:

Also, as noted in RFC5598, a given implementation might perform multiple roles.

It is useful to broadly identify participants in mail handling by functionality as defined in RFC5598 as:

In addition, a user interacts with the handling service via a:

The following is a subset of the Mail Handling Services defined in RFC5598 to be used in this document:

Any of these actors, as well as those below, can add trace and operational header fields.

Modern email often includes additional services. Four that are relevant to DKIM Replay are:

The above services can use email authentication as defined in the following specifications:

2. Mail Flow Scenarios

This section categorizes mail flows by functional description, email authentication, and recipient header fields.

2.1. Basic types of flows

2.2. Direct examples

2.3. Indirect Examples

Indirect mail flows break SPF validation, unless the Mediator is listed in the SPF record. This is almost never the case.
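
A simplified sketch of why this happens, assuming the Mediator keeps the author's return-path domain and that the SPF policy lists only address ranges. The function toy_spf_pass, the policy list, and the addresses (taken from documentation ranges) are hypothetical; real SPF evaluation (RFC7208) involves DNS lookups and more mechanisms.

```python
import ipaddress

# Hypothetical, simplified SPF policy for the author's domain:
# only address-range authorization is modeled.
AUTHOR_DOMAIN_SPF = [ipaddress.ip_network("192.0.2.0/24")]


def toy_spf_pass(connecting_ip: str, authorized_networks) -> bool:
    """Return True if the connecting client address is in an authorized network."""
    address = ipaddress.ip_address(connecting_ip)
    return any(address in network for network in authorized_networks)


# Direct flow: the author's own server connects to the receiver.
direct_result = toy_spf_pass("192.0.2.25", AUTHOR_DOMAIN_SPF)

# Indirect flow: the Mediator's server connects, and it is not listed.
indirect_result = toy_spf_pass("198.51.100.7", AUTHOR_DOMAIN_SPF)
```

In this sketch the direct flow passes and the indirect flow fails, even though both carry the same message.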

3. DKIM Replay

3.1. Scenario

A spammer will find a mailbox provider that has a high reputation and that signs its messages with DKIM. The spammer sends a message with spam content from there to a mailbox the spammer controls. This received message is sometimes updated with additional header fields, such as To: and Subject:, that do not damage the existing DKIM signature, if those fields were not covered by the DKIM signature. The resulting message is then sent at scale to target recipients. Because the message signature is for a domain name with a high reputation, the message with spam content is more likely to get through to the inbox. This is an example of a spam classification false negative: spam is incorrectly assessed as not being spam.
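
Which header fields are open to this kind of change depends on the h= tag of the DKIM-Signature header field, which lists the signed header fields (RFC6376). The following is a minimal sketch, with hypothetical helper names and a made-up signature value, that lists the fields a signature leaves uncovered. It does not validate anything and ignores header folding.

```python
def covered_fields(dkim_signature_value: str) -> set:
    """Return the lower-cased header field names listed in the h= tag."""
    tags = {}
    for item in dkim_signature_value.split(";"):
        if "=" in item:
            name, value = item.split("=", 1)
            tags[name.strip()] = value.strip()
    listed = tags.get("h", "").split(":")
    return {field.strip().lower() for field in listed if field.strip()}


def uncovered(dkim_signature_value: str, candidate_fields) -> list:
    """Return the candidate header fields that the signature does not cover."""
    covered = covered_fields(dkim_signature_value)
    return [name for name in candidate_fields if name.lower() not in covered]


# Hypothetical signature value; the bh= and b= values are placeholders.
example_signature = (
    "v=1; a=rsa-sha256; d=signer.example; s=sel1; "
    "h=from:date:message-id; bh=PLACEHOLDER; b=PLACEHOLDER"
)

open_to_change = uncovered(example_signature, ["From", "To", "Subject", "Date"])
```

For this made-up signature, To: and Subject: are not listed in h=, so adding them would leave the signature intact.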

When large amounts of such spam are sent to a single mailbox provider -- or through a filtering service with access to data across multiple mailbox providers -- the operator's filtering engine will eventually react by dropping the reputation of the original DKIM signer. Benign mail from the signer's domain then starts to go to the spam folder. For the benign mail, this is an example of a spam classification false positive.
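
The effect on the signer can be pictured with a toy model. The scoring rule used here (the share of non-spam among all mail seen under a signing domain) and the message counts are assumptions chosen for illustration; actual filtering engines use far richer signals.

```python
from collections import defaultdict


class ToyReputation:
    """Hypothetical per-domain reputation: share of non-spam among signed mail seen."""

    def __init__(self):
        self.seen = defaultdict(int)
        self.spam = defaultdict(int)

    def record(self, signing_domain: str, is_spam: bool, count: int = 1) -> None:
        """Record `count` messages that validated for `signing_domain`."""
        self.seen[signing_domain] += count
        if is_spam:
            self.spam[signing_domain] += count

    def score(self, signing_domain: str) -> float:
        """Return 1.0 for an unknown domain, else the non-spam share."""
        total = self.seen[signing_domain]
        if total == 0:
            return 1.0
        return 1.0 - self.spam[signing_domain] / total


reputation = ToyReputation()

# Benign mail sent by the domain owner.
reputation.record("signer.example", is_spam=False, count=900)
score_before = reputation.score("signer.example")

# Replayed spam carrying the same valid signature.
reputation.record("signer.example", is_spam=True, count=9100)
score_after = reputation.score("signer.example")
```

The domain owner's own behavior did not change between the two measurements; only the replayed traffic did, yet the score attached to the domain falls.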

In both cases, mail that is potentially wanted by the recipient becomes much harder to find, reducing its utility to the recipient (and the author). In the first case, the wanted mail is mixed with potentially large quantities of spam. In the second case, the wanted mail is put in the spam folder.

3.2. Direct Flows

Legitimate mail might have a valid DKIM signature and no associated SPF record.

So might a Replay Attack.

3.3. Indirect Flows

Examples of benign indirect flows are outbound and inbound gateways, mailing lists, and forwarders. This legitimate mail might have a valid DKIM signature, and SPF validation that is not aligned with the content From: header field.

So might a Replay Attack.

4. Replay technical characteristics

A message that has been replayed will typically show these characteristics:

5. Basic solution space

As can be seen from the above discussion, there is no straightforward way to detect DKIM Replay for an individual message, and possibly nothing completely reliable even in the aggregate. The challenge, then, is to look for passive analysis that might provide a good heuristic, as well as active measures by the author's system to add protections.

Here are some potential solutions to the problem, and their pros and cons:

Include the SMTP RCPT-TO address in the DKIM signature:
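
A minimal sketch of the idea, assuming a single recipient per signature. This is not DKIM as specified in RFC6376: an HMAC stands in for a public-key signature, and sign_for_recipient, verify_for_recipient, and SIGNING_KEY are hypothetical names.

```python
import hashlib
import hmac

SIGNING_KEY = b"demo-key"  # hypothetical; stands in for a real private key


def sign_for_recipient(rcpt_to: str, content: str) -> str:
    """Keyed hash over the envelope recipient together with the content."""
    data = f"{rcpt_to}\r\n{content}".encode()
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()


def verify_for_recipient(rcpt_to: str, content: str, signature: str) -> bool:
    """Recompute the hash using the recipient seen at delivery time."""
    return hmac.compare_digest(sign_for_recipient(rcpt_to, content), signature)


content = "From: author@signer.example\r\n\r\nMessage text\r\n"
signature = sign_for_recipient("first@recipient.example", content)

intended_ok = verify_for_recipient("first@recipient.example", content, signature)
replayed_ok = verify_for_recipient("someone@elsewhere.example", content, signature)
```

In this sketch the replayed copy no longer verifies. A copy that is legitimately forwarded to a new address would fail in the same way, which reflects the difficulty of separating forwarding from replay.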

Cache known DKIM signatures, to support aggregate analysis:
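
One way to picture this option is a receiver-side counter keyed on the signing domain and the signature value. The class name, the threshold, and the keying choice below are assumptions for illustration; a real deployment would also need expiry and shared state across servers.

```python
from collections import Counter


class SignatureCache:
    """Hypothetical cache counting how often an identical signature is seen."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, signing_domain: str, signature_b: str) -> bool:
        """Record one delivery; return True once the count exceeds the threshold."""
        key = (signing_domain, signature_b)
        self.counts[key] += 1
        return self.counts[key] > self.threshold


cache = SignatureCache(threshold=3)

# The same signed message arriving five times; the b= value is a placeholder.
flags = [cache.observe("signer.example", "PLACEHOLDER-B-VALUE") for _ in range(5)]
```

Here the fourth and fifth arrivals are flagged. A message fanned out by a legitimate mailing list would be flagged too, so a count alone is a heuristic and not proof of replay.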

Strip DKIM signatures on mailbox delivery:

Shorten DKIM signature key lifetime:

Add a per-hop signature, specifying the destination domain for the next hop:

6. Security Considerations

Author's Address

Dave Crocker
Brandenburg InternetWorking
675 Spruce Drive
Sunnyvale, CA 94086
United States of America