doc/handbook/chapters/keyconcepts.texi

   1
   2 @cindex Key Concepts
   3 @node Key Concepts
   4 @chapter Key Concepts
   5
   6 This chapter explains the fundamental concepts of GNUnet.
   7 @c FIXME: Use @uref{https://docs.gnunet.org/bib/, research papers}
   8 @c once we have the new bibliography + subdomain setup.
   9 Most of them are also described in our research papers.
  10 First, some of the concepts used in the GNUnet framework are detailed.
  11 The second part describes concepts specific to anonymous file-sharing.
  12
  13 @menu
  14 * Authentication::
  15 * Accounting to Encourage Resource Sharing::
  16 * Confidentiality::
  17 * Anonymity::
  18 * Deniability::
  19 * Peer Identities::
  20 * Zones in the GNU Name System (GNS Zones)::
  21 * Egos::
  22 @end menu
  23
  24 @cindex Authentication
  25 @node Authentication
  26 @section Authentication
  27
  28 Almost all peer-to-peer communication in GNUnet takes place between
  29 mutually authenticated peers. Authentication uses ECDHE, that is, a
  30 DH (Diffie-Hellman) key exchange using ephemeral elliptic curve
  31 cryptography. The ephemeral ECC (Elliptic Curve Cryptography) keys are
  32 signed using @uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA}.
  33 The shared secret from ECDHE is used to create a pair of session keys
  34 @c FIXME: Long word for HKDF. More FIXMEs: Explain MITM etc.
  35 (using HKDF), which are then used to encrypt the communication between the
  36 two peers using both 256-bit AES (Advanced Encryption Standard)
  37 and 256-bit Twofish (with independently derived secret keys).
  38 As only the two participating hosts know the shared secret, this
  39 authenticates each packet without requiring a signature on each one.
  40 GNUnet uses SHA-512 (Secure Hash Algorithm) hash codes to verify the
  41 integrity of messages.
  42
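The key derivation step described above can be sketched as follows. This
is a minimal illustration of HKDF (extract, then expand) written in
Python with only the standard library; it is not GNUnet's
implementation. The choice of SHA-512 as the underlying hash, the
function names and the label strings are assumptions made for this
example.

@example
import hashlib
import hmac


def hkdf_extract(salt, input_key_material):
    # Extract step: condense the shared secret into a pseudorandom key.
    return hmac.new(salt, input_key_material, hashlib.sha512).digest()


def hkdf_expand(pseudorandom_key, info, length):
    # Expand step: stretch the pseudorandom key into "length" bytes.
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(pseudorandom_key,
                         block + info + bytes([counter]),
                         hashlib.sha512).digest()
        output += block
        counter += 1
    return output[:length]


def derive_cipher_keys(shared_secret, salt):
    # Hypothetical labels: a distinct "info" string per cipher yields
    # two independently derived 256-bit keys from one shared secret.
    prk = hkdf_extract(salt, shared_secret)
    aes_key = hkdf_expand(prk, b"example aes key", 32)
    twofish_key = hkdf_expand(prk, b"example twofish key", 32)
    return aes_key, twofish_key
@end example

Both peers run the same derivation on the same ECDHE shared secret and
therefore obtain the same keys without ever transmitting them.
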
  43 @c FIXME: A while back I got the feedback that I should try and integrate
  44 @c explanation boxes in the long-run. So we could explain
  45 @c "man-in-the-middle" and "man-in-the-middle attacks" and other words
  46 @c which are not common knowledge. MITM is not common knowledge. To be
  47 @c selfcontained, we should be able to explain words and concepts used in
  48 @c a chapter or paragraph without hinting at Wikipedia and other online
  49 @c sources which might not be available or accessible to everyone.
  50 @c On the other hand we could write an introductionary chapter or book
  51 @c that we could then reference in each chapter, which sound like it
  52 @c could be more reusable.
  53 In GNUnet, the identity of a host is its public key. For that reason,
  54 man-in-the-middle attacks will not break the authentication or accounting
  55 goals. Essentially, for GNUnet, the IP address of a host has nothing to
  56 do with its identity. As the public key is the only thing that truly
  57 matters, faking an IP address, a port or any other property of the
  58 underlying transport protocol is irrelevant. In fact, GNUnet peers can
  59 use multiple IP addresses (IPv4 and IPv6) on multiple ports, or even not
  60 use the IP protocol at all (by running directly on layer 2).
  61 @c FIXME: "IP protocol" feels wrong, but could be what people expect, as
  62 @c IP is "the number" and "IP protocol" the protocol itself in general
  63 @c knowledge?
  64
  65 @c NOTE: For consistency we will use @code{HELLO}s throughout this Manual.
  66 GNUnet uses a special type of message to communicate the binding of a
  67 public (ECC) key to its peer's current network addresses. These messages
  68 are commonly called @code{HELLO}s or peer advertisements.
  69 They contain the public key of the peer and its current network
  70 addresses for various transport services.
  71 A transport service is a special kind of shared library that
  72 provides (possibly unreliable, out-of-order) message delivery between
  73 peers.
  74 For the UDP and TCP transport services, a network address is an IP
  75 address and a port.
  76 GNUnet can also use other transports (HTTP, HTTPS, WLAN, etc.) which use
  77 various other forms of addresses. Note that any node can have many
  78 different active transport services at the same time,
  79 and each of these can have a different address.
  80 Binding messages expire after at most a week (the timeout can be
  81 shorter if the user configures the node appropriately).
  82 This expiration ensures that the network will eventually get rid of
  83 outdated advertisements.
  84
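The content and expiration of such an advertisement can be modelled
roughly as follows. This is an illustrative Python sketch, not the wire
format or API used by GNUnet; the type @code{Hello}, the helper names
and the sample addresses are hypothetical.

@example
from typing import NamedTuple

MAX_LIFETIME_SECONDS = 7 * 24 * 60 * 60  # "at most a week"


class Hello(NamedTuple):
    public_key: bytes   # identity of the advertised peer
    addresses: tuple    # (transport name, address) pairs
    expiration: float   # absolute time in seconds


def make_hello(public_key, addresses, now,
               lifetime=MAX_LIFETIME_SECONDS):
    # Clamp the lifetime so that no advertisement outlives one week.
    lifetime = min(lifetime, MAX_LIFETIME_SECONDS)
    return Hello(public_key, tuple(addresses), now + lifetime)


def drop_expired(hellos, now):
    # Keep only advertisements that are still valid at time "now".
    return [hello for hello in hellos if hello.expiration > now]
@end example

A peer that periodically calls @code{drop_expired} on its stored
advertisements forgets outdated addresses without any explicit
revocation message.
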
  85 For more information, refer to the following paper:
  86
  87 Ronaldo A. Ferreira, Christian Grothoff, and Paul Ruth.
  88 A Transport Layer Abstraction for Peer-to-Peer Networks.
  89 Proceedings of the 3rd International Symposium on Cluster Computing
  90 and the Grid (GRID 2003), 2003.
  91 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/transport.pdf, https://git.gnunet.org/bibliography.git/plain/docs/transport.pdf})
  92
  93 @cindex Accounting to Encourage Resource Sharing
  94 @node Accounting to Encourage Resource Sharing
  95 @section Accounting to Encourage Resource Sharing
  96
  97 Most distributed P2P networks suffer from a lack of defenses or
  98 precautions against attacks in the form of freeloading.
  99 While the intentions of an attacker and a freeloader are different, their
 100 effect on the network is the same; they both render it useless.
 101 Most simple attacks on networks such as Gnutella
 102 involve flooding the network with traffic, particularly
 103 with queries that are, in the worst case, multiplied by the network.
 104
 105 In order to ensure that freeloaders or attackers have a minimal impact
 106 on the network, GNUnet's file-sharing implementation (@code{FS}) tries
 107 to distinguish good (contributing) nodes from malicious (freeloading)
 108 nodes. In GNUnet, every file-sharing node keeps track of the behavior
 109 of every other node it has been in contact with. Many requests
 110 (depending on the application) are transmitted with a priority (or
 111 importance) level.  That priority is used to establish how important
 112 the sender believes this request is. If a peer responds to an
 113 important request, the recipient will increase its trust in the
 114 responder: the responder contributed resources.  If a peer is too busy
 115 to answer all requests, it needs to prioritize.  For that, peers do
 116 not take the priorities of the requests received at face value.
 117 First, they check how much they trust the sender, and depending on
 118 that amount of trust they assign the request a (possibly lower)
 119 effective priority. Then, they drop the requests with the lowest
 120 effective priority to satisfy their resource constraints. This way,
 121 GNUnet's economic model ensures that nodes that are not currently
 122 considered to have a surplus in contributions will not be served if
 123 the network load is high.
 124
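The prioritization described above can be sketched in a few lines of
Python. The text only states that the effective priority is possibly
lower than the requested one; capping it at the trust value with
@code{min} is an assumption made for this example, as are the function
names.

@example
def effective_priority(requested_priority, trust_in_sender):
    # Assumption: the sender's claim is capped by the trust it earned.
    return min(requested_priority, trust_in_sender)


def select_requests(requests, trust, capacity):
    # requests: list of (sender, requested_priority) pairs
    # trust: mapping from sender to the locally recorded trust value
    ranked = sorted(
        requests,
        key=lambda request: effective_priority(request[1],
                                               trust.get(request[0], 0)),
        reverse=True,
    )
    # Serve the "capacity" best requests and drop the rest.
    return ranked[:capacity]
@end example

With this rule, an unknown sender gains nothing by claiming a very high
priority: without recorded trust its effective priority is zero, so its
requests are the first to be dropped under load.
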
 125 For more information, refer to the following paper:
 126 Christian Grothoff. An Excess-Based Economic Model for Resource
 127 Allocation in Peer-to-Peer Networks. Wirtschaftsinformatik, June 2003.
 128 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/ebe.pdf, https://git.gnunet.org/bibliography.git/plain/docs/ebe.pdf})
 129
 130 @cindex Confidentiality
 131 @node Confidentiality
 132 @section Confidentiality
 133
 134 Adversaries (malicious, bad actors) outside of GNUnet are not supposed
 135 to know what kind of actions a peer is involved in. Only the specific
 136 neighbor of a peer that is the corresponding sender or recipient of a
 137 message may know its contents, and even then application protocols may
 138 place further restrictions on that knowledge.  In order to ensure
 139 confidentiality, GNUnet uses link encryption, that is, each message
 140 exchanged between two peers is encrypted using a pair of keys only
 141 known to these two peers.  Encrypting traffic like this makes any kind
 142 of traffic analysis much harder. Naturally, for some applications, it
 143 may still be desirable that even neighbors cannot determine the concrete
 144 contents of a message.  In GNUnet, this problem is addressed by the
 145 specific application-level protocols. See, for example,
 146 @ref{Anonymity}, @ref{How file-sharing achieves Anonymity},
 147 and @ref{Deniability}.
 148
 149 @cindex Anonymity
 150 @node Anonymity
 151 @section Anonymity
 152
 153 @menu
 154 * How file-sharing achieves Anonymity::
 155 @end menu
 156
 157 Providing anonymity for users is the central goal for the anonymous
 158 file-sharing application. Many other design decisions follow in the
 159 footsteps of this requirement.
 160 Anonymity is never absolute. While there are various
 161 scientific metrics
 162 (Claudia Díaz, Stefaan Seys, Joris Claessens,
 163 and Bart Preneel. Towards measuring anonymity.
 164 2002.
 165 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/article-89.pdf, https://git.gnunet.org/bibliography.git/plain/docs/article-89.pdf}))
 166 that can help quantify the level of anonymity that a given mechanism
 167 provides, there is no such thing as "complete anonymity".
 168
 169 GNUnet's file-sharing implementation allows users to select for each
 170 operation (publish, search, download) the desired level of anonymity.
 171 The metric used is based on the amount of cover traffic needed to hide
 172 the request.
 173
 174 While there is no clear way to relate the amount of available cover
 175 traffic to traditional scientific metrics such as the anonymity set or
 176 information leakage, it is probably the best metric available to a
 177 peer with a purely local view of the world, in that it does not rely
 178 on unreliable external information or a particular adversary model.
 179
 180 The default anonymity level is @code{1}, which uses anonymous routing
 181 but imposes no minimal requirements on cover traffic. It is possible
 182 to forgo anonymity when this is not required. The anonymity level of
 183 @code{0} allows GNUnet to use more efficient, non-anonymous routing.
 184
 185 @cindex How file-sharing achieves Anonymity
 186 @node How file-sharing achieves Anonymity
 187 @subsection How file-sharing achieves Anonymity
 188
 189 Contrary to other designs, we do not believe that users achieve strong
 190 anonymity just because their requests are obfuscated by a couple of
 191 indirections. This is not sufficient if the adversary uses traffic
 192 analysis.
 193 The threat model used for anonymous file sharing in GNUnet assumes that
 194 the adversary is quite powerful.
 195 In particular, we assume that the adversary can see all the traffic on
 196 the Internet. While we assume that the adversary
 197 cannot break our encryption, we also assume that the adversary has many
 198 participating nodes in the network and that it can thus see many of the
 199 node-to-node interactions since it controls some of the nodes.
 200
 201 The system tries to achieve anonymity based on the idea that users can be
 202 anonymous if they can hide their actions in the traffic created by other
 203 users.
 204 Hiding actions in the traffic of other users requires participating in the
 205 traffic, bringing back the traditional technique of using indirection and
 206 source rewriting. Source rewriting is required to gain anonymity since
 207 otherwise an adversary could tell if a message originated from a host by
 208 looking at the source address. If all packets look like they originate
 209 from one node, the adversary cannot tell which ones originate from that
 210 node and which ones were routed.
 211 Note that in this mindset, any node can decide to break the
 212 source-rewriting paradigm without violating the protocol, as this
 213 only reduces the amount of traffic that a node can hide its own traffic
 214 in.
 215
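The source-rewriting idea can be illustrated with a small Python sketch.
It is a toy model under simplifying assumptions (one reply per query, no
encryption, no load-dependent decisions), and the class and method names
are hypothetical rather than taken from GNUnet.

@example
class Node:
    def __init__(self, name):
        self.name = name
        self.pending = dict()  # query id -> previous hop, None if ours

    def originate(self, query_id):
        # Our own query: remember that the reply is for us.
        self.pending[query_id] = None
        return (query_id, self.name)

    def forward(self, message):
        # Source rewriting: remember the previous hop, then name
        # ourselves as the sender of the outgoing message.
        query_id, previous_hop = message
        self.pending[query_id] = previous_hop
        return (query_id, self.name)

    def handle_reply(self, query_id):
        # Return the hop the reply goes to, or None if it is for us.
        return self.pending.pop(query_id)
@end example

Messages produced by @code{originate} and by @code{forward} carry the
same sender, so an observer of the outgoing traffic cannot tell from the
source field which queries the node issued itself.
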
 216 If we want to hide our actions in the traffic of other nodes, we must make
 217 our traffic indistinguishable from the traffic that we route for others.
 218 As our queries must have us as the receiver of the reply
 219 (otherwise they would be useless), we must put ourselves as the receiver
 220 of replies that actually go to other hosts; in other words, we must
 221 indirect replies.
 222 Unlike other systems, in anonymous file-sharing as implemented on top of
 223 GNUnet we do not have to indirect the replies if we don't think we need
 224 more traffic to hide our own actions.
 225 This increases the efficiency of the network as we can indirect less under
 226 higher load.
 227
 228 Refer to the following paper for more:
 229 Krista Bennett and Christian Grothoff.
 230 GAP --- practical anonymous networking. In Proceedings of
 231 Designing Privacy Enhancing Technologies, 2003.
 232 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/aff.pdf, https://git.gnunet.org/bibliography.git/plain/docs/aff.pdf})
 233
 234 @cindex Deniability
 235 @node Deniability
 236 @section Deniability
 237
 238 Even if the user that downloads data and the server that provides data are
 239 anonymous, the intermediaries may still be targets. In particular, if the
 240 intermediaries can find out which queries or which content they are
 241 processing, a strong adversary could try to force them to censor
 242 certain materials.
 243
 244 With the file-encoding used by GNUnet's anonymous file-sharing, this
 245 problem does not arise.
 246 The reason is that queries and replies are transmitted in
 247 an encrypted format such that intermediaries cannot tell what the query
 248 is for or what the content is about.  Note that this is not the same
 249 encryption as the link encryption between the nodes.  GNUnet has
 250 encryption on the network layer (link encryption, confidentiality,
 251 authentication) and again on the application layer (provided
 252 by @command{gnunet-publish}, @command{gnunet-download},
 253 @command{gnunet-search} and @command{gnunet-fs-gtk}).
 254
 255 Refer to the following paper for more:
 256 Christian Grothoff, Krista Grothoff, Tzvetan Horozov,
 257 and Jussi T. Lindgren.
 258 An Encoding for Censorship-Resistant Sharing.
 259 2009.
 260 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/ecrs.pdf, https://git.gnunet.org/bibliography.git/plain/docs/ecrs.pdf})
 261
 262 @cindex Peer Identities
 263 @node Peer Identities
 264 @section Peer Identities
 265
 266 Peer identities are used to identify peers in the network and are unique
 267 for each peer. The identity of a peer is simply its public key, which is
 268 generated along with a private key when the peer is started for the
 269 first time. While the identity is binary data, it is often expressed as
 270 an ASCII string. For example, the following is a peer identity as you
 271 might see it in various places:
 272
 273 @example
 274 UAT1S6PMPITLBKSJ2DGV341JI6KF7B66AC4JVCN9811NNEGQLUN0
 275 @end example
 276
 277 @noindent
 278 You can find your peer identity by running @command{gnunet-peerinfo -s}.
 279
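The sample identity above has 52 characters drawn from the digits and
the letters A to V, which is what a 256-bit key yields when each
character carries 5 bits. The following Python sketch shows such an
encoding. The alphabet and the bit order are assumptions inferred from
the sample, so the encoding actually used by a given GNUnet version may
differ.

@example
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"


def encode_base32(data):
    # Treat the bytes as one big-endian number and pad it on the right
    # with zero bits until the bit count is a multiple of 5.
    bits = len(data) * 8
    pad = (-bits) % 5
    value = int.from_bytes(data, "big") << pad
    count = (bits + pad) // 5
    chars = []
    for i in range(count):
        # Emit the 5-bit groups from most to least significant.
        shift = 5 * (count - 1 - i)
        chars.append(ALPHABET[(value >> shift) & 31])
    return "".join(chars)
@end example

A 32-byte public key needs 4 padding bits and therefore encodes to
exactly 52 characters, matching the length of the sample.
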
 280 @cindex Zones in the GNU Name System (GNS Zones)
 281 @node Zones in the GNU Name System (GNS Zones)
 282 @section Zones in the GNU Name System (GNS Zones)
 283
 284 @c FIXME: Explain or link to an explanation of the concept of public keys
 285 @c and private keys.
 286 @c FIXME: Rewrite for the latest GNS changes.
 287 GNS (Matthias Wachs, Martin Schanzenbach, and Christian Grothoff.
 288 A Censorship-Resistant, Privacy-Enhancing and Fully Decentralized Name
 289 System. In Proceedings of the 13th International Conference on Cryptology
 290 and Network Security (CANS 2014). 2014.
 291 @uref{https://git.gnunet.org/bibliography.git/plain/docs/gns2014wachs.pdf, https://git.gnunet.org/bibliography.git/plain/docs/gns2014wachs.pdf})
 292 zones are similar to DNS zones, but instead of a hierarchy of
 293 authorities governing their use, GNS zones are controlled by a private
 294 key.
 295 When you create a record in a DNS zone, that information is stored in your
 296 nameserver. Anyone trying to resolve your domain then gets pointed
 297 (hopefully) by the centralized authority to your nameserver.
 298 GNS, being fully decentralized by design, instead stores that information
 299 in the DHT (distributed hash table). The validity of the records is assured
 300 cryptographically by signing them with the private key of the respective zone.
 301
 302 Anyone trying to resolve records in a zone of your domain can then verify
 303 the signature of the records they get from the DHT and be assured that
 304 they are indeed from the respective zone.
 305 To make this work, there is a 1:1 correspondence between zones and
 306 their public-private key pairs.
 307 So when we talk about the owner of a GNS zone, that's really the owner of
 308 the private key.
 309 And a user accessing a zone needs to somehow specify the corresponding
 310 public key first.
 311
 312 @cindex Egos
 313 @node Egos
 314 @section Egos
 315
 316 @c what is the difference between peer identity and egos? It seems
 317 @c like both are linked to public-private key pair.
 318 Egos are your "identities" in GNUnet. Any user can assume multiple
 319 identities, for example to separate their activities online. Egos can
 320 correspond to "pseudonyms" or "real-world identities". Technically, an
 321 ego is first of all a public-private key pair.
 322