Private authentication protocols

Introduction

Authentication protocols are used to verify that network connections are not being monitored through a man-in-the-middle attack (MitM). But the commonly used constructions for authentication—often some framework surrounding a digital signature or key exchange protocol—reveal considerable amounts of identifying information to the participants (and MitMs). This information can potentially be used to track otherwise anonymous users around the network and correlate users across multiple services, if keys are reused.

Ultimately, key-based authentication protocols are trying to answer the question, "Does the remote party know the corresponding private key for an identity key we accept?" A protocol which answers this question and nothing else would naturally provide for authentication that is undetectable by MitMs: just make its usage mandatory and use random keys when no identity is expected. Such a protocol would also provide no avenue for leaking any identifying information beyond the absolute minimum needed to achieve authentication.

This is a work in progress to explore the possibilities and properties such protocols have. It lacks formal definitions and security proofs for the example protocols we have in mind, but these are being worked on.

Definition

We define a private authentication protocol (PAP) as a protocol that allows a challenger to establish whether their peer possesses a private key corresponding to one of (potentially) several acceptable public keys. It is intended to be used over an encrypted but unauthenticated connection.

There are two parties: the challenger and the responder. The challenger has a set of acceptable public keys $\mathbf{Q}$, with a publicly-known upper bound $n$. $\mathbf{Q}$ can be empty if no authentication is desired. The responder has a set of private keys $\mathbf{p}$ (possibly with a known upper bound $m$), with corresponding set of public keys $\mathbf{P}$. These sets may be empty if the responder has no key material. The challenger at the end outputs a Boolean: success or failure.

A PAP must have the following properties:

Two additional properties may be provided:

Note that while a PAP does not require the responder to output whether they are successful, doing so is also not in conflict with any of the required properties above. When a PAP has forward challenger privacy however, it is actually impossible for a responder to know whether they are successful.

A PAP as defined here is unidirectional. If the responder also wants to authenticate the challenger, it can be run a second time with the roles reversed. If done sequentially, the outcome of the protocol in one direction can be used as input for the next one, i.e. if the first run fails the first challenger can act as second responder with empty $\mathbf{p}$ (no private keys).

Discussion

These properties are primarily useful in the context of optional authentication. Imagine a setting where ephemeral encryption is automatic and mandatory but authentication is opportunistic: if an end-point expects a known identity it will authenticate, otherwise it won't and only get resistance against passive observation.

Selective interception attack

In this configuration, when the attempt at authentication is observable to an active attacker, a selective interception attack is possible that evades detection:

Challenger privacy allows mitigating this vulnerability: due to it, MitMs cannot distinguish PAP runs which do and don't desire authentication. Thus if all connections (even those that don't seek authentication) use a PAP, the MitM is forced to either drop all connections (becoming powerless while causing collateral damage) or risk being detected on every connection (as every PAP-employing connection could be an attempt to authenticate).

Reducing surveillance

As unauthenticated connections are an explicit use case, private authentication protocols assure the responder's privacy in the unauthenticated case. Responder privacy implies that the challenger cannot learn whether two separate protocol runs (in separate connections) were with peers that possess the same private key, effectively preventing the challenger from surveilling its unauthenticated peers and following them around.

Responder privacy also implies that the challenger does not learn which of its acceptable public keys the responder's private key corresponded to, in case there are multiple. To see why this may be useful, note that the anti-surveillance property from the previous paragraph breaks down whenever the challenger can run many instances of the protocol with separate acceptable keys, for a large set of (e.g. leaked) keys that may include the responder's public key. In order to combat this, the responder can limit the number of independent protocol runs it is willing to participate in. If the challenger could learn which acceptable public key the responder's private key corresponded to, this would need to be a limit on the total number of keys in all protocol runs combined, rather than the total number of protocol runs. If the challenger has hundreds of acceptable public keys, and the responder is one of them, the responder must support participating in a protocol with hundreds of acceptable keys—but doesn't have to accept participating in more than one protocol run.

Example protocols

Common notation and assumptions

We assume an encrypted but unauthenticated connection already exists between the two participants. We also assume a unique session id $s$ exists, only known to the participants. Both could for example be set up per a Diffie-Hellman negotiation.

$G$ and $M$ are two generators of an additively-denoted cyclic group in which the discrete logarithm problem is hard, and $M$'s discrete logarithm w.r.t. $G$ is not known to anyone. The $\cdot$ symbol denotes scalar multiplication (repeated application of the group operation). Lowercase variables refer to integers modulo the group order, and uppercase variables refer to group elements. $h$ refers to a hash function onto the integers modulo the group order, modeled as a random oracle. Sets are denoted in bold, and considered serialized by concatenating their elements in sorted order.

The set of acceptable public keys $\mathbf{Q}$ consists of group elements $Q_{0}$, $Q_{1}$, ..., $Q_{n-1}$. The set of the responder's private keys is $\mathbf{p}$, consisting of integers $p_{0}$, $p_{1}$, ..., $p_{m-1}$. The corresponding set of public keys is $\mathbf{P}$, consisting of $P_{0} = p_{0} \cdot G$, $P_{1} = p_{1} \cdot G$, ..., $P_{m-1} = p_{m-1} \cdot G$.

In case fewer than $n$ acceptable public keys exist, the $Q_{i}$ values are padded with randomly generated ones. In case no authentication is desired, all of them are randomly generated. Similarly, if a protocol has an upper bound on the number of private keys $m$, and fewer keys than that are present, it is padded with randomly generated ones.

Terms used in the security properties:

Partial solutions

Here we give a few examples of near solutions which don't provide all desired properties simultaneously. This demonstrates how easy some of the properties are in isolation, while being nontrivial to combine them.

Example 1: no responder privacy

If we do not care about responder privacy, it is very simple. The responder just reports their public key and a signature with it. For a single private key version ($m=1$) that is:

This clearly provides challenger privacy, as the challenger does not send anything at all. It however does not provide responder privacy, as the responder unconditionally reveals their public key.

Example 2: only responder privacy when failing, no challenger privacy

It seems worthwhile to try to only have the responder reveal their key in case of success. In order to do that, it must know whether its key is acceptable. In this first attempt, we compromise by giving up challenger privacy.

Now there is obviously no challenger privacy as $\mathbf{Q}$ is revealed directly. Yet, we have not recovered responder privacy either, as the matching public key is revealed in case of success.

Example 3: single key, no challenger privacy

In case there is at most a single acceptable public key $Q_{0}$ ($n=1$), responder privacy can be recovered:

Yet, challenger privacy is still obviously lacking. It is possible to improve upon this somewhat by e.g. sending $h(Q_{0} || s)$ instead of $Q_{0}$ directly. While that indeed means the key is not revealed directly anymore, an attacker who has a list of candidate keys can still test for matches, and challenger privacy requires that no information about the key can be inferred at all.

Private authentication protocols

We conjecture that the following protocols do provide all the properties needed for a PAP under reasonable assumptions.

Example 4: simple multi-key PAP

To achieve responder privacy in the multi-key case, while simultaneously retaining challenger privacy, we need a different approach. The idea is to perform a Diffie-Hellman key exchange between an ephemeral key ($d$ below) and the acceptable public keys, and use the results to blind a secret value $y$, whose hash is revealed to the responder. If the responder can recover $y$, they must have one of the corresponding private keys.

The result is a multi-acceptable-key ($n \geq 1$), unbounded-private-keys ($m$ need not be publicly known), single-roundtrip PAP with $O(n)$ communication cost. Scalar and hash operations scale with $O(mn)$, but group operations only with $O(n)$.

Conjectured properties:

It has no forward challenger privacy, as responders learn which of their private key(s) was acceptable.

Example 5: Countersign

If we want forward challenger privacy, we must go even further. We again perform a Diffie-Hellman exchange between an ephemeral key and the acceptable public key (now restricted to just a single one), but then use a (unidirectional) variant of the socialist millionaire protocol to verify both sides reached the same shared secret. This is similar to the technique used in the Off-the-Record protocol for authentication, as well as in SPAKE2 for verifying passwords.

The result is a protocol we call Countersign: a single-acceptable-key ($n=1$), multi-private-key ($m \geq 1$), single-roundtrip PAP with both forward challenger privacy and forward responder privacy. It has $O(m)$ communication and computational cost.

Conjectured properties:

Because of forward challenger privacy, this protocol does not let the responder learn whether they are successful themselves.

Note that one cannot simply run this protocol multiple times with different acceptable keys to obtain an $n>1$ PAP, because the challenger would learn which acceptable key succeeded, violating responder privacy.

Also interesting is that there appears to be an inherent trade-off between unconditional challenger privacy and unconditional responder privacy, and both cannot exist simultaneously. Responder privacy seems to imply that the messages sent by the challenger must form a secure commitment to the set $\mathbf{Q}$. If this wasn't the case and challengers could "open" it to other keys, then nothing would prevent them from learning about intersections between $\mathbf{P}$ and these other keys as well. Thus responder privacy seems to imply that the challenger must bind to their keys, while challenger privacy requires hiding the keys. Binding to and hiding the same data cannot both be achieved unconditionally.

Example 6: constructing a PAP from private set intersection

There is a striking similarity between PAPs and private set intersection protocols (PSIs); both are related to finding properties of the intersection between two sets of elements in a private way. The differences are:

With certain restrictions, it is possible to exploit this similarity and build PAPs out of (variations of) PSIs. We first need to convert the private keys $\mathbf{p}$ and public keys $\mathbf{Q}$ to sets of symmetric elements that can be compared. To do so, we repeat the Diffie-Hellman trick from the previous protocols:

The PAP problem is now reduced to the challenger learning whether the intersection between $\mathbf{e}$ and $\mathbf{f}$ is non-empty. Depending on the conditions there are various ways to accomplish this with PSIs. In all cases, the PAP properties for correctness, security against impersonation, challenger privacy, and (forward) responder privacy then follow from CDH plus the PSI's security and privacy properties.

Comparison of applicability

Countersign is the most private protocol in the single-acceptable-key setting. It appears most useful in situations where a connection initiator knows who they are trying to connect to (implicitly limiting to $n=1$).

The multi-key protocol is significantly more flexible, but lacks forward challenger privacy, and thus more strongly relies on keeping private key material private, even after decommissioning.

A potentially useful composition of the two is using Countersign for the connection initiator trying to authenticate the connection acceptor, but then using the multi-key protocol for the other direction. In case the first protocol fails, the second direction can run without private keys.

Acknowledgments

Thanks to Greg Maxwell for the idea behind Countersign, and the rationale for private authentication protocols and optional authentication. Thanks to Tim Ruffing for the simple multi-key PAP, discussions, and feedback. Thanks to Mark Erhardt for proofreading.

Changelog