Difference between revisions of "What is a Mix Network?"

From xx network wiki
Jump to navigation Jump to search
m (Protected "What is a Mix Network" ([Edit=Allow only administrators] (indefinite) [Move=Allow only administrators] (indefinite)))
m (Clean up the claims of what a mixnet provides and expand on what metadata means in the first paragraph.)
 
(3 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{DISPLAYTITLE:What is a Mix Network?}}
A mix network (also known as a mixnet) is a type of anonymous communication network that protects the communication between individuals on the network with protocols that resist traffic analysis. While most current internet traffic is encrypted to protect its contents, analyzing traffic patterns can leak a staggering amount of metadata that allows attackers to infer identity and relationships between users as well as the potential nature of the content of their messages. Encryption provides message confidentiality and integrity but does nothing to hide who is sending and receiving said messages. The identities of the communicating parties using the Internet Protocol (IPv4/IPv6) are typically leaked to many intermediary Internet infrastructure operators in the network.


A mix network (also known as mixnet) is a type of anonymous communication network that reduces the amount of metadata leaked onto the network. Another way of expressing this idea of anonymous communication is to say that the communications protocol resists traffic analysis. These days, most Internet traffic is encrypted and it is important to note that traffic analysis can yield a great deal of metadata. Encryption only gives us message confidentiality and integrity but does nothing to hide who is talking to whom. The identities of the communication parties using the Internet Protocol (IPv4/IPv6) are in fact leaked to many network intermediary Internet infrastructure operators.
Other examples of metadata leaked by common network protocols include:


Other examples of the metadata leaked by all common network protocols are:
* Geographic location
* Message sender
* Message receiver
* Message sent time
* Message received time
* Message size
* Sequence of messages
* Frequency of sent messages
* Frequency of received messages


* geographic location
An encrypted messaging system that leaks the identity of message senders and recipients also leaks entire social graphs of its users. Often these so-called “secure messaging” systems are a group of centralized servers owned by one company, such as Amazon web hosting. These systems act as focal points for metadata collection by a variety of parties, including retailers and government actors.
* message sender
* message receiver
* message sent time
* message receive time
* message size
* sequence of messages
* frequency of sent messages
* frequency of received messages


Additionally, we would like to point out that an encrypted messaging system that leaks message senders and receives, does in fact leak entire social graphs of its users. Often these so-called “secure messaging” systems are a group of centralized servers owned by one company or perhaps all hosted on Amazon; these systems act as focal points for metadata collection by nation states.
Since David Chaum published his seminal paper on anonymous communication, “[https://www.freehaven.net/anonbib/cache/chaum-mix.pdf Untraceable electronic mail, return addresses, and digital pseudonyms]” in 1979, over 40 years of academic research have taken place to explore potential anonymous communication options. However, mix networks remain the most practical in terms of computational overhead, latency, and bandwidth overhead.


Since David Chaum’s seminal paper on anonymous communication “Untraceable electronic mail, return addresses, and digital pseudonyms” <nowiki>https://www.freehaven.net/anonbib/cache/chaum-mix.pdf</nowiki> published in 1979, there have been over 40 years of academic research. However mix networks remain the most practical in terms of computational overhead, latency and bandwidth overhead. All of the other options for anonymous network designs seem to have impractical tradeoffs at this time. Certainly Tor is the most widely used anonymous communications network however it does not offer strong anonymity properties. Tor is trivially broken by short-term timing correlation attacks and we can say with confidence that Tor’s privacy notions are trivially violated by sufficiently global adversaries such as the NSA in collaboration with the Five Eyes <nowiki>https://en.wikipedia.org/wiki/Five_Eyes</nowiki> Therefore in our view mix networks remain the very best option for strong anonymity on a massive scale.
Other options currently available for anonymous network design tend to have downsides that limit their performance. Tor is the most widely used anonymous communications network; however, it does not offer strong anonymity properties. Tor can be trivially broken with short-term timing correlation attacks. With clear benefits and protection against such attacks, mix networks remain the best option for strong anonymity on a massive scale.


All that having been said, all anonymous communication networks have a remarkable kind of tension between latency, bandwidth and anonymity which means that you can have strong anonymity and low bandwidth or strong anonymity and low latency but not both. The details of this are described in the “Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency - Choose Two”. <nowiki>https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8418599</nowiki>
That being said, all anonymous communication networks face a kind of tension in striking a balance between latency, bandwidth, and anonymity. This typically means that you can have strong anonymity and low bandwidth or strong anonymity and low latency but not both. The details of this are described in the paper “[https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8418599 Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency - Choose Two]”.


Mix networks are message oriented and are usually only designed to support applications which use a low amount of bandwidth and have at least some latency tolerance. In the case of the Elixxir mix network (also known as the xx network), the first application of the network is the xx messenger, an encrypted chat client capable of group and one to one chats.
Mix networks are message-oriented and are usually only designed to support low-bandwidth applications with at least some latency tolerance. In the case of the xx network (also known as the Elixxir mix network), the first application of the network is in xx messenger, an encrypted messaging platform where users can participate in one-to-one and group chats.


Mix networks are composed of two distinct types of network nodes:
Mix networks are composed of two distinct types of network nodes:


# Mix nodes (also known as mixes): Cryptographic routers which are composed into a network known as a mix network.
# Mix nodes (also known as mixes): Cryptographic routers that are part of a network known as a mix network.
# PKI: Infrastructure responsible for distributing public key materials and network connection information which is needed for the mix network to operate.
# PKI: Infrastructure responsible for distributing public key materials and network connection information that is needed for the mix network to operate.


The root of all authority within the mix network is encapsulated by it’s PKI system therefore it’s important for this component to be decentralized. If the PKI system were compromised by an adversary wishing to violate the privacy notions of the mix network then they could do so by replacing all the mix nodes with their own where all the secret key material is known to them.
The root of all authority within a mix network is captured in its PKI system, making decentralization of this component a top priority for developers. If the PKI system were compromised by an adversary wishing to violate the privacy of a mix network, they could do so by replacing the network’s mix nodes with their own where all the secret key material is known to them.


A mix is a specialized cryptographic router whose purpose is to hide the linkage between ingress and egress messages from the perspective of a passive network observer. This is done by the mix node receiving many messages and then using bitwise unlinkability and added latency to prevent a passive network observer from knowing which ingress message corresponds with which egress message.
A ''mix'' is a specialized cryptographic router that hides the link between incoming and outgoing messages from the perspective of a passive network observer. The mix node receives many messages and then uses bitwise unlinkability and added latency to prevent a passive network observer from knowing which incoming message corresponds with which outgoing one.


Messages undergo cryptographic transformations by the mix node which yields the property of bitwise unlinkability. This means that egress messages cannot be linked to ingress messages by any of their bit patterns in the message.
Messages undergo cryptographic transformations by the mix node, a process that creates bitwise unlinkability. xx network messages are highly secure in this regard: outgoing messages cannot be linked to incoming ones via bit patterns contained in the message data.


Mix nodes also need to add some latency in order to prevent trivial timing correlation. If we take the Elixxir mix network as an example, a batch mix with 1000 message slots. A passive adversary would have a probability of 1/1000 = .001 of guessing which egress message a given ingress message corresponds to. It can be said that some latency is incurred by awaiting the arrival of the messages to fill the ingress message slots. Currently the Elixxir mix net has an overall latency of about 2.5 seconds. Whereas continuous time mixing strategies such as Stop and Go or the Poisson mix add random latency for each hop through the mix network and must be tuned such that mix nodes always have some messages dwelling within their mix queues.
Mix nodes also need to add latency in order to prevent attacks that are correlated with timing patterns. The xx network, for example, is a batch mix with 1000 message slots. A passive adversary would have a probability of 1/1000, or a .1% chance, of guessing which outgoing message corresponds with an incoming message.


A single mix is enough to give us the strong anonymity properties we are looking for. However a single mix is an undesirable single point of failure. Therefore mix networks use many mixes, at least three, for each route through the network. Ideally these mix nodes would be operated by several entities that can be relied upon to not collude against the users of the network.
Some latency is created by waiting for messages to arrive. Currently, the xx network has an overall latency of about 2.5 seconds. In comparison, continuous-time mixing strategies such as Stop and Go or the Poisson mix add random latency for each hop through the mix network. Furthermore, these approaches must be tuned so that mix nodes always have some messages dwelling within their mix queues, which is a more complicated balance to achieve.


A mix cascade is a fixed ordering of mix nodes that collaborate in routing messages. Let’s say we’ve got three mix nodes in a cascade: an entry mix, a middle mix and an exit mix. The entry mix sees the client’s locations (e.g. IPv4 addresses). The middle mix is not able to see client sources or destinations. The exit mix only sees the message destinations. However as long as one of these mixes is honest we still have the input/output message unlinkability property needed. That is referred to as the Any Trust security model, a term coined by Bryon Ford, in some of his papers. (Add citations?)
A single mix is enough to ensure a strongly anonymous platform that protects user data. However, a single mix also means a single point of failure. Therefore, mix networks use many mixes, at least three, for each route through the network. Ideally, these mix nodes would be operated by several entities that are proven to be reliable and trustworthy.


There are dozens of mixing strategies and several ways to categorize them, for example:
[[File:Three Node Mix Network.png|509x330px|center|alt=Three Node Mix Network]]


# Continuous Time Mixing Strategies Versus Batch Mixing Strategies
A mix cascade is a fixed ordering of mix nodes that collaborate in routing messages. For example, there could be three mix nodes in a cascade: an entry mix, a middle mix, and an exit mix. The entry mix sees the client’s locations (e.g., IPv4 addresses). The middle mix is not able to see client sources or destinations. The exit mix only sees the message destinations. However, as long as one of these mixes is honest and accurate, the input/output message unlinkability property needed to guarantee privacy is upheld. This type of network is referred to as the ''Any Trust security model'', a term coined by Bryon Ford in his papers such as “[https://dedis.cs.yale.edu/dissent/papers/eurosec12.pdf Scalable Anonymous Group Communication in the Anytrust Model]”.
# Verified Mixing Versus Decryption Mix Networks


The Stop and Go mixing strategy is an example of a continuous time mixing strategy; in this mixing strategy the network client specifies the latency for each hop by means of egress and ingress times for each mix hop through the network. Whereas a threshold mixing strategy is an example of a batch mixing strategy as I mentioned above. Paul Syverson made an interesting observation that continuous time mixing strategies imply no statistical interference between input and output messages whereas all other mixing strategies imply statistical interference. See “Sleeping dogs lie on a bed of onions but wake
There are dozens of mixing strategies and several ways to categorize them, including:


when mixed” <nowiki>https://petsymposium.org/2011/papers/hotpets11-final10Syverson.pdf</nowiki>
# Continuous-time mixing strategies vs. Batch mixing strategies
# Verified mixing vs. Decryption mix networks


Likewise there are many possibilities for mix network topologies such as:
The Stop and Go mixing strategy is an example of a continuous-time mixing strategy, where the network client specifies the latency for each hop by means of egress and ingress times for each mix hop through the network. On the other hand, a threshold mixing strategy is an example of a batch mixing strategy as described above. Globally recognized computer scientist Paul Syverson made an interesting observation that continuous-time mixing strategies imply no statistical interference between input and output messages, whereas all other mixing strategies imply statistical interference. See his paper [https://petsymposium.org/2011/papers/hotpets11-final10Syverson.pdf “Sleeping dogs lie on a bed of onions but wake when mixed]” for more details.
 
Similarly, there are many possibilities for mix network topologies, including:


# Multi-cascade
# Multi-cascade
Line 57: Line 59:
# Stratified Topology with Restricted Routes
# Stratified Topology with Restricted Routes


For an interesting discussion of the pros and cons of various network topologies see “Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks” <nowiki>https://www.freehaven.net/anonbib/cache/topology-pet2010.pdf</nowiki>
For an interesting discussion of the pros and cons of various network topologies, see “[https://www.freehaven.net/anonbib/cache/topology-pet2010.pdf Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks]”.


cMix refers to a verified mixing strategy which means that the mix nodes can provide a proof that mixing was done correctly. cMix is designed to be used with the Multi-cascade topology where the mix network’s PKI publishes many mix cascades that clients can choose to use for routing their messages over the network.
cMix refers to a verified mixing strategy where the mix nodes can provide proof that mixing was done correctly. cMix is designed to be used with the multi-cascade topology where the mix network’s PKI publishes numerous mix cascades that clients can choose for routing their messages over the network.


Read more about cMix here → link to cMix wiki page.
[[What is cMix?|Read more about cMix.]]

Latest revision as of 16:53, 28 June 2022

A mix network (also known as a mixnet) is a type of anonymous communication network that protects the communication between individuals on the network with protocols that resist traffic analysis. While most current internet traffic is encrypted to protect its contents, analyzing traffic patterns can leak a staggering amount of metadata that allows attackers to infer identity and relationships between users as well as the potential nature of the content of their messages. Encryption provides message confidentiality and integrity but does nothing to hide who is sending and receiving said messages. The identities of the communicating parties using the Internet Protocol (IPv4/IPv6) are typically leaked to many intermediary Internet infrastructure operators in the network.

Other examples of metadata leaked by common network protocols include:

  • Geographic location
  • Message sender
  • Message receiver
  • Message sent time
  • Message received time
  • Message size
  • Sequence of messages
  • Frequency of sent messages
  • Frequency of received messages

An encrypted messaging system that leaks the identity of message senders and recipients also leaks entire social graphs of its users. Often these so-called “secure messaging” systems are a group of centralized servers owned by one company, such as Amazon web hosting. These systems act as focal points for metadata collection by a variety of parties, including retailers and government actors.

Since David Chaum published his seminal paper on anonymous communication, “Untraceable electronic mail, return addresses, and digital pseudonyms” in 1979, over 40 years of academic research have taken place to explore potential anonymous communication options. However, mix networks remain the most practical in terms of computational overhead, latency, and bandwidth overhead.

Other options currently available for anonymous network design tend to have downsides that limit their performance. Tor is the most widely used anonymous communications network; however, it does not offer strong anonymity properties. Tor can be trivially broken with short-term timing correlation attacks. With clear benefits and protection against such attacks, mix networks remain the best option for strong anonymity on a massive scale.

That being said, all anonymous communication networks face a kind of tension in striking a balance between latency, bandwidth, and anonymity. This typically means that you can have strong anonymity and low bandwidth or strong anonymity and low latency but not both. The details of this are described in the paper “Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency - Choose Two”.

Mix networks are message-oriented and are usually only designed to support low-bandwidth applications with at least some latency tolerance. In the case of the xx network (also known as the Elixxir mix network), the first application of the network is in xx messenger, an encrypted messaging platform where users can participate in one-to-one and group chats.

Mix networks are composed of two distinct types of network nodes:

  1. Mix nodes (also known as mixes): Cryptographic routers that are part of a network known as a mix network.
  2. PKI: Infrastructure responsible for distributing public key materials and network connection information that is needed for the mix network to operate.

The root of all authority within a mix network is captured in its PKI system, making decentralization of this component a top priority for developers. If the PKI system were compromised by an adversary wishing to violate the privacy of a mix network, they could do so by replacing the network’s mix nodes with their own where all the secret key material is known to them.

A mix is a specialized cryptographic router that hides the link between incoming and outgoing messages from the perspective of a passive network observer. The mix node receives many messages and then uses bitwise unlinkability and added latency to prevent a passive network observer from knowing which incoming message corresponds with which outgoing one.

Messages undergo cryptographic transformations by the mix node, a process that creates bitwise unlinkability. xx network messages are highly secure in this regard: outgoing messages cannot be linked to incoming ones via bit patterns contained in the message data.

Mix nodes also need to add latency in order to prevent attacks that are correlated with timing patterns. The xx network, for example, is a batch mix with 1000 message slots. A passive adversary would have a probability of 1/1000, or a .1% chance, of guessing which outgoing message corresponds with an incoming message.

Some latency is created by waiting for messages to arrive. Currently, the xx network has an overall latency of about 2.5 seconds. In comparison, continuous-time mixing strategies such as Stop and Go or the Poisson mix add random latency for each hop through the mix network. Furthermore, these approaches must be tuned so that mix nodes always have some messages dwelling within their mix queues, which is a more complicated balance to achieve.

A single mix is enough to ensure a strongly anonymous platform that protects user data. However, a single mix also means a single point of failure. Therefore, mix networks use many mixes, at least three, for each route through the network. Ideally, these mix nodes would be operated by several entities that are proven to be reliable and trustworthy.

Three Node Mix Network

A mix cascade is a fixed ordering of mix nodes that collaborate in routing messages. For example, there could be three mix nodes in a cascade: an entry mix, a middle mix, and an exit mix. The entry mix sees the client’s locations (e.g., IPv4 addresses). The middle mix is not able to see client sources or destinations. The exit mix only sees the message destinations. However, as long as one of these mixes is honest and accurate, the input/output message unlinkability property needed to guarantee privacy is upheld. This type of network is referred to as the Any Trust security model, a term coined by Bryon Ford in his papers such as “Scalable Anonymous Group Communication in the Anytrust Model”.

There are dozens of mixing strategies and several ways to categorize them, including:

  1. Continuous-time mixing strategies vs. Batch mixing strategies
  2. Verified mixing vs. Decryption mix networks

The Stop and Go mixing strategy is an example of a continuous-time mixing strategy, where the network client specifies the latency for each hop by means of egress and ingress times for each mix hop through the network. On the other hand, a threshold mixing strategy is an example of a batch mixing strategy as described above. Globally recognized computer scientist Paul Syverson made an interesting observation that continuous-time mixing strategies imply no statistical interference between input and output messages, whereas all other mixing strategies imply statistical interference. See his paper “Sleeping dogs lie on a bed of onions but wake when mixed” for more details.

Similarly, there are many possibilities for mix network topologies, including:

  1. Multi-cascade
  2. Cascade
  3. Free route
  4. Stratified Topology
  5. Stratified Topology with Restricted Routes

For an interesting discussion of the pros and cons of various network topologies, see “Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks”.

cMix refers to a verified mixing strategy where the mix nodes can provide proof that mixing was done correctly. cMix is designed to be used with the multi-cascade topology where the mix network’s PKI publishes numerous mix cascades that clients can choose for routing their messages over the network.

Read more about cMix.