doc/handbook/chapters/developer.texi

   1 @c ***********************************************************************
   2 @node GNUnet Developer Handbook
   3 @chapter GNUnet Developer Handbook
   4
   5 This book is intended to be an introduction for programmers that want to
   6 extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
   7 application.
   8
   9 For developers, GNUnet is:
  10
  11 @itemize @bullet
  12 @item Developed by a community that believes in the GNU philosophy
  13 @item Free Software (Free as in Freedom), licensed under the
  14 GNU Affero General Public License
  15 (@uref{https://www.gnu.org/licenses/licenses.html#AGPL})
  16 @item A set of standards, including coding conventions and
  17 architectural rules
  18 @item A set of layered protocols, specifying both the communication
  19 between peers and the communication between components
  20 of a single peer
  21 @item A set of libraries with well-defined APIs suitable for
  22 writing extensions
  23 @end itemize
  24
  25 In particular, the architecture specifies that a peer consists of many
  26 processes communicating via protocols. Processes can be written in almost
  27 any language.
  28 @code{C}, @code{Java} and @code{Guile} APIs exist for accessing existing
  29 services and for writing extensions.
  30 It is possible to write extensions in other languages by
  31 implementing the necessary IPC protocols.
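As a rough illustration of what implementing such a protocol involves, the
following C sketch frames a message behind a four-byte header that carries
the total message size and a message type, both as 16-bit values in network
byte order. To our understanding this mirrors the layout of GNUnet's
@code{struct GNUNET_MessageHeader}, but the function name
@code{example_frame} and the type value in the usage note are hypothetical;
the headers shipped with GNUnet remain authoritative.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header layout: 16-bit total size, then 16-bit type,
   both big endian (network byte order). */
#define EXAMPLE_HEADER_LEN 4

/* Write header and payload into buf.  Returns the number of bytes
   written, or 0 if the message does not fit into buf or cannot be
   described by a 16-bit size field. */
static size_t
example_frame (uint8_t *buf, size_t buf_len, uint16_t type,
               const void *payload, size_t payload_len)
{
  size_t total;

  if (payload_len > UINT16_MAX - EXAMPLE_HEADER_LEN)
    return 0;
  total = EXAMPLE_HEADER_LEN + payload_len;
  if (total > buf_len)
    return 0;
  buf[0] = (uint8_t) (total >> 8);   /* size, high byte */
  buf[1] = (uint8_t) (total & 0xff); /* size, low byte */
  buf[2] = (uint8_t) (type >> 8);    /* type, high byte */
  buf[3] = (uint8_t) (type & 0xff);  /* type, low byte */
  if (payload_len > 0)
    memcpy (buf + EXAMPLE_HEADER_LEN, payload, payload_len);
  return total;
}
```
@end verbatim

For instance, @code{example_frame (buf, sizeof (buf), 0x0102, "hi", 2)}
writes six bytes: the size 6, the (made-up) type 0x0102 and the two payload
bytes. A binding in another language would have to produce the same bytes.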
  32
  33 GNUnet can be extended and improved along many possible dimensions, and
  34 anyone interested in Free Software and Freedom-enhancing Networking is
  35 welcome to join the effort. This Developer Handbook attempts to provide
  36 an initial introduction to some of the key design choices and central
  37 components of the system.
  38 This part of the GNUnet documentation is far from complete,
  39 and we welcome informed contributions, be it in the form of
  40 new chapters, sections or insightful comments.
  41
  42 @menu
  43 * Developer Introduction::
  44 * Internal dependencies::
  45 * Code overview::
  46 * System Architecture::
  47 * Subsystem stability::
  48 * Naming conventions and coding style guide::
  49 * Build-system::
  50 * Developing extensions for GNUnet using the gnunet-ext template::
  51 * Writing testcases::
  52 * Building GNUnet and its dependencies::
  53 * TESTING library::
  54 * Performance regression analysis with Gauger::
  55 * TESTBED Subsystem::
  56 * libgnunetutil::
  57 * Automatic Restart Manager (ARM)::
  58 * TRANSPORT Subsystem::
  59 * NAT library::
  60 * Distance-Vector plugin::
  61 * SMTP plugin::
  62 * Bluetooth plugin::
  63 * WLAN plugin::
  64 * ATS Subsystem::
  65 * CORE Subsystem::
  66 * CADET Subsystem::
  67 * NSE Subsystem::
  68 * HOSTLIST Subsystem::
  69 * IDENTITY Subsystem::
  70 * NAMESTORE Subsystem::
  71 * PEERINFO Subsystem::
  72 * PEERSTORE Subsystem::
  73 * SET Subsystem::
  74 * STATISTICS Subsystem::
  75 * Distributed Hash Table (DHT)::
  76 * GNU Name System (GNS)::
  77 * GNS Namecache::
  78 * REVOCATION Subsystem::
  79 * File-sharing (FS) Subsystem::
  80 * REGEX Subsystem::
  81 * REST Subsystem::
  82 * RPS Subsystem::
  83 @end menu
  84
  85 @node Developer Introduction
  86 @section Developer Introduction
  87
  88 This Developer Handbook is intended as a first introduction to GNUnet for
  89 new developers who want to extend the GNUnet framework. After the
  90 introduction, each of the GNUnet subsystems (directories in the
  91 @file{src/} tree) is (supposed to be) covered in its own chapter. In
  92 addition to this documentation, GNUnet developers should be aware of the
  93 services available to them on the GNUnet server.
  94
  95 New developers can have a look at the GNUnet tutorials for C and Java
  96 available in the @file{src/} directory of the repository or under the
  97 following links:
  98
  99 @c ** FIXME: Link to files in source, not online.
 100 @c ** FIXME: Where is the Java tutorial?
 101 @itemize @bullet
 102 @item @xref{Top, Introduction,, gnunet-c-tutorial, The GNUnet C Tutorial}.
 103 @item @uref{https://docs.gnunet.org/tutorial/gnunet-tutorial.html, GNUnet C tutorial}
 104 @item GNUnet Java tutorial
 105 @end itemize
 106
 107 In addition to the GNUnet Reference Documentation you are reading,
 108 the GNUnet server at @uref{https://gnunet.org} contains
 109 various resources for GNUnet developers and those
 110 who aspire to become regular contributors.
 111 They are all conveniently reachable via the "Developer"
 112 entry in the navigation menu. Some additional tools (such as static
 113 analysis reports) require special developer access to perform certain
 114 operations. If you want (or require) access, you should contact
 115 @uref{http://grothoff.org/christian/, Christian Grothoff},
 116 GNUnet's maintainer.
 117
 118 @c FIXME: A good part of this belongs on the website or should be
 119 @c extended in subsections explaining usage of this. A simple list
 120 @c is just taking space people have to read.
 121 The public subsystems on the GNUnet server that help developers are:
 122
 123 @itemize @bullet
 124
 125 @item The version control system (git) keeps our code and enables
 126 distributed development.
 127 It is publicly accessible at @uref{https://git.gnunet.org/}.
 128 Only developers with write access can commit code; everyone else is
 129 encouraged to submit patches to the GNUnet-developers mailing list:
 130 @uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, https://lists.gnu.org/mailman/listinfo/gnunet-developers}
 131
 132 @item The bugtracking system (Mantis).
 133 We use it to track feature requests, open bug reports and their
 134 resolutions.
 135 It can be accessed at
 136 @uref{https://bugs.gnunet.org/, https://bugs.gnunet.org/}.
 137 Anyone can report bugs.
 138
 139 @item The current quality of our automated test suite is assessed using
 140 code coverage analysis. Testcases that
 141 improve our code coverage are always welcome.
 142
 143 @item We try to find bugs automatically by running static analysis with
 144 various tools. Note that not everything that is flagged by the
 145 analysis is a bug; sometimes even good code can be marked as possibly
 146 problematic. Nevertheless, developers are encouraged to at least be
 147 aware of all issues in their code that are listed.
 148
 149 @item We use Gauger for automatic performance regression visualization.
 150 @c FIXME: LINK!
 151 @xref{Performance regression analysis with Gauger}, for details on how to use Gauger.
 152
 153 @end itemize
 154
 155
 156
 157 @c ***********************************************************************
 158 @menu
 159 * Project overview::
 160 @end menu
 161
 162 @node Project overview
 163 @subsection Project overview
 164
 165 The GNUnet project currently consists of several sub-projects. This
 166 section gives an initial overview of the various
 167 sub-projects. Note that this description also lists projects that are far
 168 from complete, including even those that have literally not a single line
 169 of code in them yet.
 170
 171 GNUnet sub-projects in order of likely relevance are currently:
 172
 173 @table @asis
 174
 175 @item @command{gnunet}
 176 Core of the P2P framework, including file-sharing, VPN and
 177 chat applications; this is what the Developer Handbook covers mostly
 178 @item @command{gnunet-gtk}
 179 Gtk+-based user interfaces, including:
 180
 181 @itemize @bullet
 182 @item @command{gnunet-fs-gtk} (file-sharing),
 183 @item @command{gnunet-statistics-gtk} (statistics over time),
 184 @item @command{gnunet-peerinfo-gtk}
 185 (information about current connections and known peers),
 186 @item @command{gnunet-namestore-gtk} (GNS record editor),
 187 @item @command{gnunet-conversation-gtk} (voice chat GUI) and
 188 @item @command{gnunet-setup} (setup tool for "everything")
 189 @end itemize
 190
 191 @item @command{gnunet-fuse}
 192 Mounting directories shared via GNUnet's file-sharing
 193 on GNU/Linux distributions
 194 @item @command{gnunet-update}
 195 Installation and update tool
 196 @item @command{gnunet-ext}
 197 Template for starting 'external' GNUnet projects
 198 @item @command{gnunet-java}
 199 Java APIs for writing GNUnet services and applications
 200 @item @command{gnunet-java-ext}
Template for starting 'external' GNUnet projects in Java
 201 @item @command{eclectic}
 202 Code to run GNUnet nodes on testbeds for research, development,
 203 testing and evaluation
 204 @c ** FIXME: Solve the status and location of gnunet-qt
 205 @item @command{gnunet-qt}
 206 Qt-based GNUnet GUI (is it deprecated?)
 207 @item @command{gnunet-cocoa}
 208 Cocoa-based GNUnet GUI (is it deprecated?)
 209 @item @command{gnunet-guile}
 210 Guile bindings for GNUnet
 211 @item @command{gnunet-python}
 212 Python bindings for GNUnet
 213
 214 @end table
 215
 216 We are also working on various supporting libraries and tools:
 217 @c ** FIXME: What about gauger, and what about libmwmodem?
 218
 219 @table @asis
 220 @item @command{libextractor}
 221 GNU libextractor (metadata extraction)
 222 @item @command{libmicrohttpd}
 223 GNU libmicrohttpd (embedded HTTP(S) server library)
 224 @item @command{gauger}
 225 Tool for performance regression analysis
 226 @item @command{monkey}
 227 Tool for automated debugging of distributed systems
 228 @item @command{libmwmodem}
 229 Library for accessing satellite connection quality reports
 230 @item @command{libgnurl}
 231 gnURL (feature-restricted variant of cURL/libcurl)
 232 @item @command{www}
 233 the gnunet.org website (Jinja2 based)
 234 @item @command{bibliography}
 235 Our collected bibliography, papers, references, and so forth
 236 @item @command{gnunet-videos-}
 237 Videos about and around GNUnet activities
 238 @end table
 239
 240 Finally, there are various external projects (see links for a list of
 241 those that have a public website) which build on top of the GNUnet
 242 framework.
 243
 244 @c ***********************************************************************
 245 @node Internal dependencies
 246 @section Internal dependencies
 247
 248 This section tries to give an overview of what processes a typical GNUnet
 249 peer running a particular application would consist of. All of the
 250 processes listed here should be automatically started by
 251 @command{gnunet-arm -s}.
 252 The list is given as a rough first guide to users for failure diagnostics.
 253 Ideally, end-users should never have to worry about these internal
 254 dependencies.
 255
 256 In terms of internal dependencies, a minimum file-sharing system consists
 257 of the following GNUnet processes (in order of dependency):
 258
 259 @itemize @bullet
 260 @item gnunet-service-arm
 261 @item gnunet-service-resolver (required by all)
 262 @item gnunet-service-statistics (required by all)
 263 @item gnunet-service-peerinfo
 264 @item gnunet-service-transport (requires peerinfo)
 265 @item gnunet-service-core (requires transport)
 266 @item gnunet-daemon-hostlist (requires core)
 267 @item gnunet-daemon-topology (requires hostlist, peerinfo)
 268 @item gnunet-service-datastore
 269 @item gnunet-service-dht (requires core)
 270 @item gnunet-service-identity
 271 @item gnunet-service-fs (requires identity, cadet, dht, datastore, core)
 272 @end itemize
 273
 274 @noindent
 275 A minimum VPN system consists of the following GNUnet processes (in
 276 order of dependency):
 277
 278 @itemize @bullet
 279 @item gnunet-service-arm
 280 @item gnunet-service-resolver (required by all)
 281 @item gnunet-service-statistics (required by all)
 282 @item gnunet-service-peerinfo
 283 @item gnunet-service-transport (requires peerinfo)
 284 @item gnunet-service-core (requires transport)
 285 @item gnunet-daemon-hostlist (requires core)
 286 @item gnunet-service-dht (requires core)
 287 @item gnunet-service-cadet (requires dht, core)
 288 @item gnunet-service-dns (requires dht)
 289 @item gnunet-service-regex (requires dht)
 290 @item gnunet-service-vpn (requires regex, dns, cadet, dht)
 291 @end itemize
 292
 293 @noindent
 294 A minimum GNS system consists of the following GNUnet processes (in
 295 order of dependency):
 296
 297 @itemize @bullet
 298 @item gnunet-service-arm
 299 @item gnunet-service-resolver (required by all)
 300 @item gnunet-service-statistics (required by all)
 301 @item gnunet-service-peerinfo
 302 @item gnunet-service-transport (requires peerinfo)
 303 @item gnunet-service-core (requires transport)
 304 @item gnunet-daemon-hostlist (requires core)
 305 @item gnunet-service-dht (requires core)
 306 @item gnunet-service-cadet (requires dht, core)
 307 @item gnunet-service-dns (requires dht)
 308 @item gnunet-service-regex (requires dht)
 309 @item gnunet-service-vpn (requires regex, dns, cadet, dht)
 310 @item gnunet-service-identity
 311 @item gnunet-service-namestore (requires identity)
 312 @item gnunet-service-gns (requires vpn, dns, dht, namestore, identity)
 313 @end itemize
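"In order of dependency" means that every process appears after all the
processes it requires. The following C sketch encodes a subset of the
processes that occur in all three lists above and checks that property.
The struct, table and function names are hypothetical and exist only for
this illustration; this is not how ARM itself tracks dependencies.

@verbatim
```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_DEPS 2

/* One list entry: a process and the processes it requires. */
struct example_process
{
  const char *name;
  const char *deps[EXAMPLE_MAX_DEPS]; /* unused slots are NULL */
};

/* Subset of the lists above, in the order given there. */
static const struct example_process example_list[] = {
  { "gnunet-service-arm", { NULL } },
  { "gnunet-service-resolver", { NULL } },
  { "gnunet-service-statistics", { NULL } },
  { "gnunet-service-peerinfo", { NULL } },
  { "gnunet-service-transport", { "gnunet-service-peerinfo" } },
  { "gnunet-service-core", { "gnunet-service-transport" } },
  { "gnunet-daemon-hostlist", { "gnunet-service-core" } },
  { "gnunet-service-dht", { "gnunet-service-core" } },
};

static const size_t example_list_len =
  sizeof (example_list) / sizeof (example_list[0]);

/* Return 1 if every dependency of list[i] is named at an index below i,
   and 0 otherwise. */
static int
example_in_dependency_order (const struct example_process *list, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (size_t d = 0; d < EXAMPLE_MAX_DEPS && NULL != list[i].deps[d]; d++)
    {
      int found = 0;

      for (size_t j = 0; j < i; j++)
        if (0 == strcmp (list[j].name, list[i].deps[d]))
          found = 1;
      if (! found)
        return 0;
    }
  return 1;
}
```
@end verbatim

When diagnosing a failure, the same reasoning applies in reverse: if a
process is not running, first check the processes it requires.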
 314
 315 @c ***********************************************************************
 316 @node Code overview
 317 @section Code overview
 318
 319 This section gives a brief overview of the GNUnet source code.
 320 Specifically, we sketch the function of each of the subdirectories in
 321 the @file{gnunet/src/} directory. The order given is roughly bottom-up
 322 (in terms of the layers of the system).
 323
 324 @table @asis
 325 @item @file{util/} --- libgnunetutil
 326 Library with general utility functions; all
 327 GNUnet binaries link against this library. It covers anything from memory
 328 allocation and data structures to cryptography and inter-process
 329 communication. The goal is to provide an OS-independent interface and
 330 more 'secure' or convenient implementations of commonly used primitives.
 331 The API is spread over more than a dozen headers; developers should study
 332 those closely to avoid duplicating existing functions.
 333 @xref{libgnunetutil}.
 334 @item @file{hello/} --- libgnunethello
 335 HELLO messages are used to
 336 describe under which addresses a peer can be reached (for example,
 337 protocol, IP, port). This library manages parsing and generating of HELLO
 338 messages.
 339 @item @file{block/} --- libgnunetblock
 340 The DHT and other components of GNUnet
 341 store information in units called 'blocks'. Each block has a type and the
 342 type defines a particular format and how that binary format is to be
 343 linked to a hash code (the key for the DHT and for databases). The block
 344 library is a wrapper around block plugins which provide the necessary
 345 functions for each block type.
 346 @item @file{statistics/} --- statistics service
 347 The statistics service enables associating
 348 values (of type uint64_t) with a component name and a string. The main
 349 uses are debugging (counting events), performance tracking and user
 350 entertainment (what did my peer do today?).
 351 @item @file{arm/} --- Automatic Restart Manager (ARM)
 352 The automatic-restart-manager (ARM) service
 353 is the GNUnet master service. Its role is to start gnunet-services, to
 354 restart them when they crash and finally to shut down the system when
 355 requested.
 356 @item @file{peerinfo/} --- peerinfo service
 357 The peerinfo service keeps track of which peers are known
 358 to the local peer and also tracks the validated addresses of each of
 359 those peers (in the form of a HELLO message). The peer is not
 360 necessarily connected to all peers known to the peerinfo service.
 361 Peerinfo provides persistent storage for peer identities --- peers are
 362 not forgotten just because of a system restart.
 363 @item @file{datacache/} --- libgnunetdatacache
 364 The datacache library provides (temporary) block storage for the DHT.
 365 Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
 366 All data stored in the cache is lost when the peer is stopped or
 367 restarted (datacache uses temporary tables).
 368 @item @file{datastore/} --- datastore service
 369 The datastore service stores file-sharing blocks in
 370 databases for extended periods of time. In contrast to the datacache, data
 371 is not lost when peers restart. However, quota restrictions may still
 372 cause old, expired or low-priority data to be eventually discarded.
 373 Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
 374 @item @file{template/} --- service template
 375 Template for writing a new service. Does nothing.
 376 @item @file{ats/} --- Automatic Transport Selection
 377 The automatic transport selection (ATS) service
 378 is responsible for deciding which address (i.e.
 379 which transport plugin) should be used for communication with other peers,
 380 and at what bandwidth.
 381 @item @file{nat/} --- libgnunetnat
 382 Library that provides basic functions for NAT traversal.
 383 The library supports NAT traversal with
 384 manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
 385 traversal. The library also includes an API for testing if the current
 386 configuration works and the @code{gnunet-nat-server} which provides an
 387 external service to test the local configuration.
 388 @item @file{fragmentation/} --- libgnunetfragmentation
 389 Some transports (UDP and WLAN, mostly) have restrictions on the maximum
 390 transfer unit (MTU) for packets. The fragmentation library can be used to
 391 break larger packets into chunks of at most 1k and transmit the resulting
 392 fragments reliably (with acknowledgment, retransmission, timeouts,
 393 etc.).
 394 @item @file{transport/} --- transport service
 395 The transport service is responsible for managing the
 396 basic P2P communication. It uses plugins to support P2P communication
 397 over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
 398 validates peer addresses, enforces bandwidth restrictions, limits the
 399 total number of connections and enforces connectivity restrictions (i.e.
 400 friends-only).
 401 @item @file{peerinfo-tool/} --- gnunet-peerinfo
 402 This directory contains the gnunet-peerinfo binary which can be used to
 403 inspect the peers and HELLOs known to the peerinfo service.
 404 @item @file{core/}
 405 The core service is responsible for establishing encrypted, authenticated
 406 connections with other peers, encrypting and decrypting messages and
 407 forwarding messages to higher-level services that are interested in them.
 408 @item @file{testing/} --- libgnunettesting
 409 The testing library allows starting (and stopping) peers
 410 for writing testcases.
 411 It also supports automatic generation of configurations for peers
 412 ensuring that the ports and paths are disjoint. libgnunettesting is also
 413 the foundation for the testbed service.
 414 @item @file{testbed/} --- testbed service
 415 The testbed service is used for creating small or large scale deployments
 416 of GNUnet peers for evaluation of protocols.
 417 It facilitates peer deployments on multiple
 418 hosts (for example, in a cluster) and establishing various network
 419 topologies (both underlay and overlay).
 420 @item @file{nse/} --- Network Size Estimation
 421 The network size estimation (NSE) service
 422 implements a protocol for (securely) estimating the current size of the
 423 P2P network.
 424 @item @file{dht/} --- distributed hash table
 425 The distributed hash table (DHT) service provides a
 426 distributed implementation of a hash table to store blocks under hash
 427 keys in the P2P network.
 428 @item @file{hostlist/} --- hostlist service
 429 The hostlist service allows learning about
 430 other peers in the network by downloading HELLO messages from an HTTP
 431 server, can be configured to run such an HTTP server and also implements
 432 a P2P protocol to advertise and automatically learn about other peers
 433 that offer a public hostlist server.
 434 @item @file{topology/} --- topology service
 435 The topology service is responsible for
 436 maintaining the mesh topology. It tries to maintain connections to friends
 437 (depending on the configuration) and also tries to ensure that the peer
 438 has a decent number of active connections at all times. If necessary, new
 439 connections are added. All peers should run the topology service,
 440 otherwise they may end up not being connected to any other peer (unless
 441 some other service ensures that core establishes the required
 442 connections). The topology service also tells the transport service which
 443 connections are permitted (for friend-to-friend networking).
 444 @item @file{fs/} --- file-sharing
 445 The file-sharing (FS) service implements GNUnet's
 446 file-sharing application. Both anonymous file-sharing (using GAP) and
 447 non-anonymous file-sharing (using the DHT) are supported.
 448 @item @file{cadet/} --- cadet service
 449 The CADET service provides a general-purpose routing abstraction to create
 450 end-to-end encrypted tunnels in mesh networks. We wrote a paper
 451 documenting key aspects of the design.
 452 @item @file{tun/} --- libgnunettun
 453 Library for building IPv4, IPv6 packets and creating
 454 checksums for UDP, TCP and ICMP packets. The header
 455 defines C structs for common Internet packet formats and in particular
 456 structs for interacting with TUN (virtual network) interfaces.
 457 @item @file{mysql/} --- libgnunetmysql
 458 Library for creating and executing prepared MySQL
 459 statements and to manage the connection to the MySQL database.
 460 Essentially a lightweight wrapper for the interaction between GNUnet
 461 components and libmysqlclient.
 462 @item @file{dns/}
 463 Service that allows intercepting and modifying DNS requests of
 464 the local machine. Currently used for IPv4-IPv6 protocol translation
 465 (DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
 466 service can also be configured to offer an exit service for DNS traffic.
 467 @item @file{vpn/} --- VPN service
 468 The virtual private network (VPN) service provides a virtual
 469 tunnel interface (VTUN) for IP routing over GNUnet.
 470 Needs some other peers to run an "exit" service to work.
 471 Can be activated using the "gnunet-vpn" tool or integrated with DNS using
 472 the "pt" daemon.
 473 @item @file{exit/}
 474 Daemon to allow traffic from the VPN to exit this
 475 peer to the Internet or to specific IP-based services of the local peer.
 476 Currently, an exit service can only be restricted to IPv4 or IPv6, not to
 477 specific ports and/or IP address ranges. If this is not acceptable,
 478 additional firewall rules must be added manually. The exit daemon currently only
 479 works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
 480 system via a DNS service.
 481 @item @file{pt/}
 482 Protocol translation daemon. This daemon enables 4-to-6,
 483 6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
 484 essentially uses "DNS" to intercept DNS replies and then maps results to
 485 those offered by the VPN, which then sends them using CADET to some daemon
 486 offering an appropriate exit service.
 487 @item @file{identity/}
 488 Management of egos (alter egos) of a user; identities are
 489 essentially named ECC private keys and used for zones in the GNU name
 490 system and for namespaces in file-sharing, but might find other uses later.
 491 @item @file{revocation/}
 492 Key revocation service; can be used to revoke the
 493 private key of an identity if it has been compromised.
 494 @item @file{namecache/}
 495 Cache for resolution results for the GNU name system;
 496 data is encrypted and can be shared among users;
 497 loss of the data should ideally only result in a
 498 performance degradation (persistence not required).
 499 @item @file{namestore/}
 500 Database for the GNU name system with per-user private information;
 501 persistence required.
 502 @item @file{gns/}
 503 GNU name system, a GNU approach to DNS and PKI.
 504 @item @file{dv/}
 505 A plugin for distance-vector (DV)-based routing.
 506 DV consists of a service and a transport plugin to provide peers
 507 with the illusion of a direct P2P connection for connections
 508 that use multiple (typically up to 3) hops in the actual underlay network.
 509 @item @file{regex/}
 510 Service for the (distributed) evaluation of regular expressions.
 511 @item @file{scalarproduct/}
 512 The scalar product service offers an API to perform a secure multiparty
 513 computation which calculates a scalar product between two peers
 514 without exposing the private input vectors of the peers to each other.
 515 @item @file{consensus/}
 516 The consensus service will allow a set of peers to agree
 517 on a set of values via a distributed set union computation.
 518 @item @file{rest/}
 519 The REST API allows access to GNUnet services using RESTful interaction.
 520 The services provide plugins that can be exposed by the REST server.
 521 @c FIXME: Where did this disappear to?
 522 @c @item @file{experimentation/}
 523 @c The experimentation daemon coordinates distributed
 524 @c experimentation to evaluate transport and ATS properties.
 525 @end table
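The @file{fragmentation/} entry above describes breaking larger packets into
chunks of at most 1k. The arithmetic behind such a split can be sketched in C
as follows. The constant and function names are hypothetical, the limit of
1024 bytes is taken from the description above rather than from the library,
and the real library additionally handles headers, acknowledgments,
retransmission and timeouts, none of which is modelled here.

@verbatim
```c
#include <stddef.h>

/* Assumed fragment limit: "at most 1k", per the description above. */
#define EXAMPLE_MAX_FRAGMENT 1024

/* Number of fragments needed for a message of msg_len bytes. */
static size_t
example_fragment_count (size_t msg_len)
{
  return msg_len / EXAMPLE_MAX_FRAGMENT
         + (0 != msg_len % EXAMPLE_MAX_FRAGMENT ? 1 : 0);
}

/* Length in bytes of fragment number idx (counting from 0),
   or 0 if the message has no such fragment. */
static size_t
example_fragment_length (size_t msg_len, size_t idx)
{
  size_t offset;

  if (idx >= example_fragment_count (msg_len))
    return 0;
  offset = idx * EXAMPLE_MAX_FRAGMENT;
  if (msg_len - offset < EXAMPLE_MAX_FRAGMENT)
    return msg_len - offset; /* last, shorter fragment */
  return EXAMPLE_MAX_FRAGMENT;
}
```
@end verbatim

Under these assumptions a 2500-byte message is split into three fragments of
1024, 1024 and 452 bytes.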
 526
 527 @c ***********************************************************************
 528 @node System Architecture
 529 @section System Architecture
 530
 531 @c FIXME: For those irritated by the textflow, we are missing images here,
 532 @c in the short term we should add them back, in the long term this should
 533 @c work without images or have images with alt-text.
 534
 535 GNUnet developers like LEGOs. The blocks are indestructible, can be
 536 stacked together to construct complex buildings and it is generally easy
 537 to swap one block for a different one that has the same shape. GNUnet's
 538 architecture is based on LEGOs:
 539
 540 @image{images/service_lego_block,5in,,picture of a LEGO block stack - 3 APIs upon IPC/network protocol provided by a service}
 541
 542 This chapter documents the GNUnet LEGO system, also known as GNUnet's
 543 system architecture.
 544
 545 The most common GNUnet component is a service. Services offer an API (or
 546 several, depending on what you count as "an API") which is implemented as
 547 a library. The library communicates with the main process of the service
 548 using a service-specific network protocol. The main process of the service
 549 typically doesn't fully provide everything that is needed --- it has holes
 550 to be filled by APIs to other services.
 551
 552 User interfaces and daemons are a special kind of component in GNUnet.
 553 Like services, they have holes to be filled by APIs of other services.
 554 Unlike services, daemons do not implement their own network protocol and
 555 they have no API:
 556
 557 @image{images/daemon_lego_block,5in,,A daemon in GNUnet is a component that does not offer an API for others to build upon}
 558
 559 The GNUnet system provides a range of services, daemons and user
 560 interfaces, which are then combined into a layered GNUnet instance (also
 561 known as a peer).
 562
 563 @image{images/service_stack,5in,,A GNUnet peer consists of many layers of services}
 564
 565 Note that while it is generally possible to swap one service for another
 566 compatible service, there is often only one implementation. However,
 567 during development we often have a "new" version of a service in parallel
 568 with an "old" version. Until the "new" version works, developers
 569 working on other parts of the system can continue their development by
 570 simply using the "old" service. Alternative design ideas can also be
 571 easily investigated by swapping out individual components. This is
 572 typically achieved by simply changing the name of the "BINARY" in the
 573 respective configuration section.
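As a minimal sketch, assuming a hypothetical alternative DHT implementation
installed under the name @code{gnunet-service-dht-new}, such an override
might look like this in the peer's configuration file. The replacement
binary name is invented for this illustration.

@verbatim
```ini
[dht]
# Hypothetical replacement binary; the stock value is gnunet-service-dht.
BINARY = gnunet-service-dht-new
```
@end verbatim

Reverting to the "old" service then only requires restoring the original
value.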
 574
 575 Key properties of GNUnet services are that they must be separate
 576 processes and that they must protect themselves by applying tight error
 577 checking against the network protocol they implement (thereby achieving a
 578 certain degree of robustness).
 579
 580 On the other hand, the APIs are implemented to tolerate failures of the
 581 service, isolating their host process from errors by the service. If the
 582 service process crashes, other services and daemons around it should not
 583 also fail, but instead wait for the service process to be restarted by
 584 ARM.
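What "tight error checking" can mean in practice is sketched below in C: a
received message is validated completely before any payload byte is
interpreted. The sketch assumes a four-byte header holding a 16-bit total
size followed by a 16-bit type, both in network byte order; the function
name and the layout constant are hypothetical, and real services perform
further, message-specific checks.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_HEADER_LEN 4 /* assumed: 16-bit size plus 16-bit type */

/* Check one received message held in buf[0..buf_len).
   Returns the payload length, or -1 if the message must be rejected. */
static long
example_check_message (const uint8_t *buf, size_t buf_len,
                       uint16_t expected_type)
{
  uint16_t size;
  uint16_t type;

  if (buf_len < EXAMPLE_HEADER_LEN)
    return -1; /* too short to even hold a header */
  size = (uint16_t) ((buf[0] << 8) | buf[1]);
  type = (uint16_t) ((buf[2] << 8) | buf[3]);
  if (size < EXAMPLE_HEADER_LEN)
    return -1; /* declared size cannot be smaller than the header */
  if ((size_t) size != buf_len)
    return -1; /* truncated message or trailing bytes */
  if (type != expected_type)
    return -1; /* not the message this handler understands */
  return (long) size - EXAMPLE_HEADER_LEN;
}
```
@end verbatim

Rejecting a malformed message this way keeps the error local to one
connection instead of letting it corrupt the state of the service.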
 585
 586
 587 @c ***********************************************************************
 588 @node Subsystem stability
 589 @section Subsystem stability
 590
 591 This section documents the current stability of the various GNUnet
 592 subsystems. Stability here describes the expected degree of compatibility
 593 with future versions of GNUnet. For each subsystem we distinguish between
 594 compatibility on the P2P network level (communication protocol between
 595 peers), the IPC level (communication between the service and the service
 596 library) and the API level (stability of the API). P2P compatibility is
 597 relevant in terms of which applications are likely going to be able to
 598 communicate with future versions of the network. IPC communication is
 599 relevant for the implementation of language bindings that re-implement the
 600 IPC messages. Finally, API compatibility is relevant to developers that
 601 hope to be able to avoid changes to applications built on top of the APIs
 602 of the framework.
 603
 604 The following table summarizes our current view of the stability of the
 605 respective protocols or APIs:
 606
 607 @multitable @columnfractions .20 .20 .20 .20
 608 @headitem Subsystem @tab P2P @tab IPC @tab C API
 609 @item util @tab n/a @tab n/a @tab stable
 610 @item arm @tab n/a @tab stable @tab stable
 611 @item ats @tab n/a @tab unstable @tab testing
 612 @item block @tab n/a @tab n/a @tab stable
 613 @item cadet @tab testing @tab testing @tab testing
 614 @item consensus @tab experimental @tab experimental @tab experimental
 615 @item core @tab stable @tab stable @tab stable
 616 @item datacache @tab n/a @tab n/a @tab stable
 617 @item datastore @tab n/a @tab stable @tab stable
 618 @item dht @tab stable @tab stable @tab stable
 619 @item dns @tab stable @tab stable @tab stable
 620 @item dv @tab testing @tab testing @tab n/a
 621 @item exit @tab testing @tab n/a @tab n/a
 622 @item fragmentation @tab stable @tab n/a @tab stable
 623 @item fs @tab stable @tab stable @tab stable
 624 @item gns @tab stable @tab stable @tab stable
 625 @item hello @tab n/a @tab n/a @tab testing
 626 @item hostlist @tab stable @tab stable @tab n/a
 627 @item identity @tab stable @tab stable @tab n/a
 628 @item multicast @tab experimental @tab experimental @tab experimental
 629 @item mysql @tab stable @tab n/a @tab stable
 630 @item namestore @tab n/a @tab stable @tab stable
 631 @item nat @tab n/a @tab n/a @tab stable
 632 @item nse @tab stable @tab stable @tab stable
 633 @item peerinfo @tab n/a @tab stable @tab stable
 634 @item psyc @tab experimental @tab experimental @tab experimental
 635 @item pt @tab n/a @tab n/a @tab n/a
 636 @item regex @tab stable @tab stable @tab stable
 637 @item revocation @tab stable @tab stable @tab stable
 638 @item social @tab experimental @tab experimental @tab experimental
 639 @item statistics @tab n/a @tab stable @tab stable
 640 @item testbed @tab n/a @tab testing @tab testing
 641 @item testing @tab n/a @tab n/a @tab testing
 642 @item topology @tab n/a @tab n/a @tab n/a
 643 @item transport @tab stable @tab stable @tab stable
 644 @item tun @tab n/a @tab n/a @tab stable
 645 @item vpn @tab testing @tab n/a @tab n/a
 646 @end multitable
 647
 648 Here is a rough explanation of the values:
 649
 650 @table @samp
 651 @item stable
 652 No incompatible changes are planned at this time; for IPC/APIs, if
 653 there are incompatible changes, they will be minor and might only require
 654 minimal changes to existing code; for P2P, changes will be avoided if at
 655 all possible for the 0.10.x-series
 656
 657 @item testing
 658 No incompatible changes are
 659 planned at this time, but the code is still known to be in flux; so while
 660 we have no concrete plans, our expectation is that there will still be
 661 minor modifications; for P2P, changes will likely be extensions that
 662 should not break existing code
 663
 664 @item unstable
 665 Changes are planned and will happen; however, they
 666 will not be totally radical and the result should still resemble what is
 667 there now; nevertheless, anticipated changes will break protocol/API
 668 compatibility
 669
 670 @item experimental
 671 Changes are planned and the result may look nothing like
 672 what the API/protocol looks like today
 673
 674 @item unknown
Someone should think about where this subsystem is headed
 676
 677 @item n/a
 678 This subsystem does not have an API/IPC-protocol/P2P-protocol
 679 @end table
 680
 681 @c ***********************************************************************
 682 @node Naming conventions and coding style guide
 683 @section Naming conventions and coding style guide
 684
 685 Here you can find some rules to help you write code for GNUnet.
 686
 687 @c ***********************************************************************
 688 @menu
 689 * Naming conventions::
 690 * Coding style::
 691 @end menu
 692
 693 @node Naming conventions
 694 @subsection Naming conventions
 695
 696
 697 @c ***********************************************************************
 698 @menu
 699 * include files::
 700 * binaries::
 701 * logging::
 702 * configuration::
 703 * exported symbols::
 704 * private (library-internal) symbols (including structs and macros)::
 705 * testcases::
 706 * performance tests::
 707 * src/ directories::
 708 @end menu
 709
 710 @node include files
 711 @subsubsection include files
 712
 713 @itemize @bullet
 714 @item _lib: library without need for a process
 715 @item _service: library that needs a service process
 716 @item _plugin: plugin definition
 717 @item _protocol: structs used in network protocol
 718 @item exceptions:
 719 @itemize @bullet
 720 @item gnunet_config.h --- generated
 721 @item platform.h --- first included
 722 @item plibc.h --- external library
 723 @item gnunet_common.h --- fundamental routines
 724 @item gnunet_directories.h --- generated
 725 @item gettext.h --- external library
 726 @end itemize
 727 @end itemize
 728
 729 @c ***********************************************************************
 730 @node binaries
 731 @subsubsection binaries
 732
 733 @itemize @bullet
 734 @item gnunet-service-xxx: service process (has listen socket)
 735 @item gnunet-daemon-xxx: daemon process (no listen socket)
 736 @item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
 737 @item gnunet-yyy: command-line tool for end-users
 738 @item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
 739 @item libgnunetxxx.so: library for API xxx
 740 @end itemize
 741
 742 @c ***********************************************************************
 743 @node logging
 744 @subsubsection logging
 745
 746 @itemize @bullet
@item services and daemons use their directory name in
@code{GNUNET_log_setup} (e.g. 'core') and log using
plain '@code{GNUNET_log}'.
@item command-line tools use their full name in
@code{GNUNET_log_setup} (e.g. 'gnunet-publish') and log using
plain '@code{GNUNET_log}'.
@item service access libraries log using
'@code{GNUNET_log_from}' and use '@code{DIRNAME-api}' for the
component (e.g. 'core-api')
@item pure libraries (without associated service) use
'@code{GNUNET_log_from}' with the component set to their
library name (without lib or '@file{.so}'),
which should also be their directory name (e.g. '@file{nat}')
@item plugins should use '@code{GNUNET_log_from}'
with the directory name and the plugin name combined to produce
the component name (e.g. 'transport-tcp').
 763 @item logging should be unified per-file by defining a
 764 @code{LOG} macro with the appropriate arguments,
 765 along these lines:
 766
 767 @example
#define LOG(kind, ...) \
  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
 770 @end example
 771
 772 @end itemize
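
As a rough illustration of how such a per-file macro is used, the following
sketch compiles without GNUnet. @code{demo_log_from} and @code{last_log} are
hypothetical stand-ins for @code{GNUNET_log_from} and its output; they are
assumptions made only for this example and are not part of any GNUnet API.

@example
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for GNUNET_log_from(): formats
   "component: message" into last_log instead of logging. */
static char last_log[256];

static void
demo_log_from (int kind,
               const char *component,
               const char *format,
               ...)
@{
  va_list ap;
  int off;

  (void) kind; /* the stand-in ignores the log level */
  off = snprintf (last_log, sizeof (last_log), "%s: ", component);
  va_start (ap, format);
  vsnprintf (last_log + off,
             sizeof (last_log) - (size_t) off,
             format,
             ap);
  va_end (ap);
@}

/* One LOG macro per file fixes the component name in one place. */
#define LOG(kind, ...) \
  demo_log_from (kind, "example-api", __VA_ARGS__)

static void
report_progress (unsigned int done,
                 unsigned int total)
@{
  LOG (0, "processed %u of %u items", done, total);
@}
@end example

@noindent
Real code would include the GNUnet headers and put
@code{GNUNET_log_from} in the macro body instead of the stand-in.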
 773
 774 @c ***********************************************************************
 775 @node configuration
 776 @subsubsection configuration
 777
 778 @itemize @bullet
 779 @item paths (that are substituted in all filenames) are in PATHS
 780 (have as few as possible)
 781 @item all options for a particular module (@file{src/MODULE})
 782 are under @code{[MODULE]}
 783 @item options for a plugin of a module
 784 are under @code{[MODULE-PLUGINNAME]}
 785 @end itemize
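
A configuration fragment following these rules might look roughly like
the sketch below. The option names and values are shown for illustration
only and are assumptions; they may not match the options of your GNUnet
version.

@example
[PATHS]
GNUNET_HOME = /var/lib/gnunet

[transport]
PLUGINS = tcp

[transport-tcp]
PORT = 2086
@end example

@noindent
Here @code{[transport]} holds the options of the module in
@file{src/transport}, and @code{[transport-tcp]} holds the options of its
@code{tcp} plugin.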
 786
 787 @c ***********************************************************************
 788 @node exported symbols
 789 @subsubsection exported symbols
 790
 791 @itemize @bullet
 792 @item must start with @code{GNUNET_modulename_} and be defined in
 793 @file{modulename.c}
 794 @item exceptions: those defined in @file{gnunet_common.h}
 795 @end itemize
 796
 797 @c ***********************************************************************
 798 @node private (library-internal) symbols (including structs and macros)
 799 @subsubsection private (library-internal) symbols (including structs and macros)
 800
 801 @itemize @bullet
 802 @item must NOT start with any prefix
@item must not be exported in a way that linkers could use them or other
 804 libraries might see them via headers; they must be either
 805 declared/defined in C source files or in headers that are in the
 806 respective directory under @file{src/modulename/} and NEVER be declared
 807 in @file{src/include/}.
 808 @end itemize
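
The two naming rules can be sketched as follows. The module name
@code{example}, the function @code{GNUNET_EXAMPLE_saturating_add} and the
helper @code{would_overflow} are hypothetical and exist only for this
illustration; both parts are shown in one block so that the sketch
compiles on its own.

@example
#include <limits.h>

/* Exported symbol: would be declared in a header under
   src/include/ (hypothetical module "example"). */
int
GNUNET_EXAMPLE_saturating_add (int a,
                               int b);

/* The rest would live in src/example/example.c. */

/* Private helper: no prefix, and 'static' keeps it invisible
   to the linker and to other libraries. */
static int
would_overflow (int a,
                int b)
@{
  if ( (0 < b) &&
       (a > (INT_MAX - b)) )
    return 1;
  if ( (0 > b) &&
       (a < (INT_MIN - b)) )
    return 1;
  return 0;
@}

int
GNUNET_EXAMPLE_saturating_add (int a,
                               int b)
@{
  if (0 != would_overflow (a, b))
    return (0 < b) ? INT_MAX : INT_MIN;
  return a + b;
@}
@end example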
 809
 810 @node testcases
 811 @subsubsection testcases
 812
 813 @itemize @bullet
@item must be called @file{test_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one test
 816 @end itemize
 817
 818 @c ***********************************************************************
 819 @node performance tests
 820 @subsubsection performance tests
 821
 822 @itemize @bullet
@item must be called @file{perf_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one performance
test
 826 @item Must only be run if @code{HAVE_BENCHMARKS} is satisfied
 827 @end itemize
 828
 829 @c ***********************************************************************
 830 @node src/ directories
 831 @subsubsection src/ directories
 832
 833 @itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
 843 @end itemize
 844
 845 @cindex Coding style
 846 @node Coding style
 847 @subsection Coding style
 848
 849 @c XXX: Adjust examples to GNU Standards!
 850 @itemize @bullet
 851 @item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
 852 @item Indentation is done with spaces, two per level, no tabs;
 853 @item C99 struct initialization is fine;
@item Declare only one variable per line. For example,
 855
 856 @noindent
 857 instead of
 858
 859 @example
 860 int i,j;
 861 @end example
 862
 863 @noindent
 864 write:
 865
 866 @example
 867 int i;
 868 int j;
 869 @end example
 870
 871 @c TODO: include actual example from a file in source
 872
 873 @noindent
 874 This helps keep diffs small and forces developers to think precisely about
 875 the type of every variable.
 876 Note that @code{char *} is different from @code{const char*} and
 877 @code{int} is different from @code{unsigned int} or @code{uint32_t}.
 878 Each variable type should be chosen with care.
 879
 880 @item While @code{goto} should generally be avoided, having a
 881 @code{goto} to the end of a function to a block of clean up
 882 statements (free, close, etc.) can be acceptable.
 883
 884 @item Conditions should be written with constants on the left (to avoid
 885 accidental assignment) and with the @code{true} target being either the
 886 @code{error} case or the significantly simpler continuation. For example:
 887
 888 @example
if (0 != stat ("filename",
 890                &sbuf))
 891 @{
 892   error();
 893 @}
 894 else
 895 @{
 896   /* handle normal case here */
 897 @}
 898 @end example
 899
 900 @noindent
 901 instead of
 902
 903 @example
if (stat ("filename", &sbuf) == 0) @{
 905   /* handle normal case here */
 906  @} else @{
 907   error();
 908  @}
 909 @end example
 910
 911 @noindent
 912 If possible, the error clause should be terminated with a @code{return} (or
 913 @code{goto} to some cleanup routine) and in this case, the @code{else} clause
 914 should be omitted:
 915
 916 @example
 917 if (0 != stat ("filename",
 918                &sbuf))
 919 @{
 920   error();
 921   return;
 922 @}
 923 /* handle normal case here */
 924 @end example
 925
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
@code{NULL}, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
 930
 931 @item Combined assignments and tests are allowed if they do not hinder
 932 code clarity. For example, one can write:
 933
 934 @example
 935 if (NULL == (value = lookup_function()))
 936 @{
 937   error();
 938   return;
 939 @}
 940 @end example
 941
 942 @item Use @code{break} and @code{continue} wherever possible to avoid
 943 deep(er) nesting. Thus, we would write:
 944
 945 @example
 946 next = head;
 947 while (NULL != (pos = next))
 948 @{
 949   next = pos->next;
 950   if (! should_free (pos))
 951     continue;
 952   GNUNET_CONTAINER_DLL_remove (head,
 953                                tail,
 954                                pos);
 955   GNUNET_free (pos);
 956 @}
 957 @end example
 958
 959 instead of
 960
 961 @example
 962 next = head; while (NULL != (pos = next)) @{
 963   next = pos->next;
 964   if (should_free (pos)) @{
 965     /* unnecessary nesting! */
 966     GNUNET_CONTAINER_DLL_remove (head, tail, pos);
 967     GNUNET_free (pos);
 968    @}
 969   @}
 970 @end example
 971
 972 @item We primarily use @code{for} and @code{while} loops.
 973 A @code{while} loop is used if the method for advancing in the loop is
 974 not a straightforward increment operation. In particular, we use:
 975
 976 @example
 977 next = head;
 978 while (NULL != (pos = next))
 979 @{
 980   next = pos->next;
 981   if (! should_free (pos))
 982     continue;
 983   GNUNET_CONTAINER_DLL_remove (head,
 984                                tail,
 985                                pos);
 986   GNUNET_free (pos);
 987 @}
 988 @end example
 989
to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
However, for loops that do follow the simple @code{for} paradigm we do
use @code{for}, even if it involves linked lists:
 995
 996 @example
 997 /* simple iteration over a linked list */
 998 for (pos = head;
 999      NULL != pos;
1000      pos = pos->next)
1001 @{
1002    use (pos);
1003 @}
1004 @end example
1005
1006
1007 @item The first argument to all higher-order functions in GNUnet must be
1008 declared to be of type @code{void *} and is reserved for a closure. We do
1009 not use inner functions, as trampolines would conflict with setups that
1010 use non-executable stacks.
The first statement in a higher-order function, which usually should
be part of the variable declarations, should assign the
@code{cls} argument to the precise expected type. For example:
1014
1015 @example
1016 int
1017 callback (void *cls,
1018           char *args)
1019 @{
1020   struct Foo *foo = cls;
1021   int other_variables;
1022
1023    /* rest of function */
1024 @}
1025 @end example
1026
@item As shown in the example above, there should be a line break after
the return type of a function.  Each parameter should
be on a new line.
1030
1031 @item It is good practice to write complex @code{if} expressions instead
1032 of using deeply nested @code{if} statements. However, except for addition
1033 and multiplication, all operators should use parens. This is fine:
1034
1035 @example
1036 if ( (1 == foo) ||
1037      ( (0 == bar) &&
1038        (x != y) ) )
1039   return x;
1040 @end example
1041
1042
1043 However, this is not:
1044
1045 @example
1046 if (1 == foo)
1047   return x;
1048 if (0 == bar && x != y)
1049   return x;
1050 @end example
1051
1052 @noindent
1053 Note that splitting the @code{if} statement above is debatable as the
1054 @code{return x} is a very trivial statement. However, once the logic after
1055 the branch becomes more complicated (and is still identical), the "or"
1056 formulation should be used for sure.
1057
1058 @item There should be two empty lines between the end of the function and
1059 the comments describing the following function. There should be a single
1060 empty line after the initial variable declarations of a function. If a
1061 function has no local variables, there should be no initial empty line. If
1062 a long function consists of several complex steps, those steps might be
1063 separated by an empty line (possibly followed by a comment describing the
1064 following step). The code should not contain empty lines in arbitrary
1065 places; if in doubt, it is likely better to NOT have an empty line (this
1066 way, more code will fit on the screen).
1067 @end itemize
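
The following sketch combines several of the rules above in one small,
self-contained program fragment: one variable per line, constants on the
left, error cases handled first, a @code{goto} to a cleanup block, and a
callback whose first argument is the closure. All names
(@code{LineCallback}, @code{struct Counter}, @code{count_line},
@code{for_each_line}) are hypothetical and chosen only for this
illustration; none of them is part of GNUnet.

@example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Callback type: the closure is always the first argument. */
typedef int
(*LineCallback) (void *cls,
                 const char *line);

struct Counter
@{
  unsigned int lines;
  size_t bytes;
@};

static int
count_line (void *cls,
            const char *line)
@{
  struct Counter *counter = cls;

  counter->lines++;
  counter->bytes += strlen (line);
  return 0;
@}

/* Calls cb for every line of filename; returns 0 on success
   and -1 on error. */
static int
for_each_line (const char *filename,
               LineCallback cb,
               void *cb_cls)
@{
  FILE *f;
  char *buf;
  int ret;

  ret = -1;
  if (NULL == (f = fopen (filename, "r")))
    return -1;
  if (NULL == (buf = malloc (1024)))
    goto cleanup;
  while (NULL != fgets (buf, 1024, f))
  @{
    if (0 != cb (cb_cls, buf))
      goto cleanup;
  @}
  ret = 0;
cleanup:
  free (buf); /* free (NULL) is a no-op */
  fclose (f);
  return ret;
@}
@end example

@noindent
A caller would pass a pointer to a @code{struct Counter} as the closure,
for example @code{for_each_line ("filename", &count_line, &counter)}.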
1068
1069 @c ***********************************************************************
1070 @node Build-system
1071 @section Build-system
1072
1073 If you have code that is likely not to compile or build rules you might
1074 want to not trigger for most developers, use @code{if HAVE_EXPERIMENTAL}
1075 in your @file{Makefile.am}.
1076 Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
1077 code.
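
A @file{Makefile.am} fragment using this conditional might look roughly
like the following sketch. The program names @code{gnunet-foo} and
@code{gnunet-foo-experimental} are hypothetical and serve only as
placeholders.

@example
bin_PROGRAMS = gnunet-foo

if HAVE_EXPERIMENTAL
  bin_PROGRAMS += gnunet-foo-experimental
endif

gnunet_foo_SOURCES = gnunet-foo.c
gnunet_foo_experimental_SOURCES = gnunet-foo-experimental.c
@end example

@noindent
With this layout, the experimental program is only built when the
@code{HAVE_EXPERIMENTAL} conditional is true.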
1078
1079 If you want to compile all testcases but NOT run them, run configure with
1080 the @code{--enable-test-suppression} option.
1081
1082 If you want to run all testcases, including those that take a while, run
1083 configure with the @code{--enable-expensive-testcases} option.
1084
1085 If you want to compile and run benchmarks, run configure with the
1086 @code{--enable-benchmarks} option.
1087
1088 If you want to obtain code coverage results, run configure with the
1089 @code{--enable-coverage} option and run the @file{coverage.sh} script in
1090 the @file{contrib/} directory.
1091
1092 @cindex gnunet-ext
1093 @node Developing extensions for GNUnet using the gnunet-ext template
1094 @section Developing extensions for GNUnet using the gnunet-ext template
1095
For developers who want to write extensions for GNUnet we provide the
gnunet-ext template as an easy-to-use skeleton.
1098
1099 gnunet-ext contains the build environment and template files for the
1100 development of GNUnet services, command line tools, APIs and tests.
1101
1102 First of all you have to obtain gnunet-ext from git:
1103
1104 @example
1105 git clone https://git.gnunet.org/gnunet-ext.git
1106 @end example
1107
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
1112
1113 @example
1114 ./bootstrap
1115 ./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
1116 @end example
1117
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
1122
1123 @example
1124 export LD_LIBRARY_PATH=/path/to/gnunet/lib
1125 @end example
1126
1127 @cindex writing testcases
1128 @node Writing testcases
1129 @section Writing testcases
1130
1131 Ideally, any non-trivial GNUnet code should be covered by automated
1132 testcases. Testcases should reside in the same place as the code that is
1133 being tested. The name of source files implementing tests should begin
1134 with @code{test_} followed by the name of the file that contains
1135 the code that is being tested.
1136
1137 Testcases in GNUnet should be integrated with the autotools build system.
1138 This way, developers and anyone building binary packages will be able to
1139 run all testcases simply by running @code{make check}. The final
1140 testcases shipped with the distribution should output at most some brief
1141 progress information and not display debug messages by default. The
1142 success or failure of a testcase must be indicated by returning zero
1143 (success) or non-zero (failure) from the main method of the testcase.
1144 The integration with the autotools is relatively straightforward and only
1145 requires modifications to the @file{Makefile.am} in the directory
1146 containing the testcase. For a testcase testing the code in @file{foo.c}
1147 the @file{Makefile.am} would contain the following lines:
1148
1149 @example
1150 check_PROGRAMS = test_foo
1151 TESTS = $(check_PROGRAMS)
1152 test_foo_SOURCES = test_foo.c
1153 test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
1154 @end example
1155
1156 Naturally, other libraries used by the testcase may be specified in the
1157 @code{LDADD} directive as necessary.
1158
1159 Often testcases depend on additional input files, such as a configuration
1160 file. These support files have to be listed using the @code{EXTRA_DIST}
1161 directive in order to ensure that they are included in the distribution.
1162
1163 Example:
1164
1165 @example
1166 EXTRA_DIST = test_foo_data.conf
1167 @end example
1168
1169 Executing @code{make check} will run all testcases in the current
1170 directory and all subdirectories. Testcases can be compiled individually
1171 by running @code{make test_foo} and then invoked directly using
1172 @code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
1173 typically necessary to run @code{make install} before running any
1174 testcases. Thus the canonical command @code{make check install} has to be
1175 changed to @code{make install check} for GNUnet.
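
A minimal sketch of the body of such a testcase is shown below. The
function under test, @code{foo_double}, and the helper
@code{run_foo_checks} are hypothetical; in a real tree the tested function
would come from @file{foo.c} rather than being defined in the test itself.

@example
#include <stdio.h>

/* Hypothetical function under test. */
static int
foo_double (int x)
@{
  return 2 * x;
@}

/* Runs all checks; returns 0 on success and 1 on the first
   failure, printing only a brief message. */
static int
run_foo_checks (void)
@{
  if (4 != foo_double (2))
  @{
    fprintf (stderr, "foo_double (2) returned a wrong value\n");
    return 1;
  @}
  if (0 != foo_double (0))
  @{
    fprintf (stderr, "foo_double (0) returned a wrong value\n");
    return 1;
  @}
  return 0;
@}
@end example

@noindent
The @code{main} function of @file{test_foo.c} would then simply return
the result of @code{run_foo_checks ()}, so that @code{make check} sees
zero on success and non-zero on failure.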
1176
1177 @c ***********************************************************************
1178 @cindex Building GNUnet
1179 @node Building GNUnet and its dependencies
1180 @section Building GNUnet and its dependencies
1181
1182 In the following section we will outline how to build GNUnet and
1183 some of its dependencies. We will assume a fair amount of knowledge
1184 for building applications under UNIX-like systems. Furthermore we
1185 assume that the build environment is sane and that you are aware of
1186 any implications actions in this process could have.
1187 Instructions here can be seen as notes for developers (an extension to
1188 the 'HACKING' section in README) as well as package maintainers.
1189 @b{Users should rely on the available binary packages.}
1190 We will use Debian as an example Operating System environment. Substitute
1191 accordingly with your own Operating System environment.
1192
1193 For the full list of dependencies, consult the appropriate, up-to-date
1194 section in the @file{README} file.
1195
1196 First, we need to build or install (depending on your OS) the following
1197 packages. If you build them from source, build them in this exact order:
1198
1199 @example
1200 libgpgerror, libgcrypt, libnettle, libunbound, GnuTLS (with libunbound
1201 support)
1202 @end example
1203
After we have built and installed those packages, we continue with
packages closer to GNUnet in this step: libgnurl (our libcurl fork),
GNU libmicrohttpd, and GNU libextractor. Again, if your package manager
provides one of these packages, use the package it provides
unless you have good reasons not to (package version too old, conflicts,
etc). We advise against compiling widely used packages such as GnuTLS
yourself if your OS already provides a variant, unless you are prepared
to maintain those packages yourself.
1212
1213 In the optimistic case, this command will give you all the dependencies:
1214
1215 @example
1216 sudo apt-get install libgnurl libmicrohttpd libextractor
1217 @end example
1218
From experience we know that at the very least libgnurl is not
available in some environments. You could substitute libgnurl
with libcurl, but we recommend installing libgnurl, as it gives
you a preconfigured libcurl with the small feature set GNUnet requires.
In the past, the namespaces of libcurl and libgnurl were shared, which
caused problems when you wanted to install both of them on one
operating system. This has been resolved, and they can now be installed
side by side.
1227
1228 @cindex libgnurl
1229 @cindex compiling libgnurl
GNUnet and some of its functions depend on a limited subset of cURL/libcurl.
1231 Rather than trying to enforce a certain configuration on the world, we
1232 opted to maintain a microfork of it that ensures we can link against the
1233 right set of features. We called this specialized set of libcurl
1234 ``libgnurl''. It is fully ABI compatible with libcurl and currently used
1235 by GNUnet and some of its dependencies.
1236
We download libgnurl and its digital signature from the GNU fileserver
into a temporary directory.

Note: the temporary directory might be @file{/tmp}, the value of
@env{TMPDIR} or @env{TMP}, or any other location. For consistency we
assume that @env{TMPDIR} is set and points to @file{/tmp} for the
remainder of this section.
1243
1244 @example
cd $TMPDIR
1246 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.60.0.tar.Z
1247 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.60.0.tar.Z.sig
1248 @end example
1249
1250 Next, verify the digital signature of the file:
1251
1252 @example
1253 gpg --verify gnurl-7.60.0.tar.Z.sig
1254 @end example
1255
1256 If gpg fails, you might try with @command{gpg2} on your OS. If the error
1257 states that ``the key can not be found'' or it is unknown, you have to
1258 retrieve the key (A88C8ADD129828D7EAC02E52E22F9BBFEE348588) from a
1259 keyserver first:
1260
1261 @example
1262 gpg --keyserver pgp.mit.edu --recv-keys A88C8ADD129828D7EAC02E52E22F9BBFEE348588
1263 @end example
1264
1265 and rerun the verification command.
1266
libgnurl requires the following packages at runtime:
GnuTLS (with DANE support / libunbound), libidn, and zlib. At compile time
it additionally requires libtool, groff, perl, pkg-config, and Python 2.7.
1270
1271 Once you have verified that all the required packages are present on your
1272 system, we can proceed to compile libgnurl:
1273
1274 @example
1275 tar -xvf gnurl-7.60.0.tar.Z
1276 cd gnurl-7.60.0
1277 sh configure --disable-ntlm-wb
1278 make
1279 make -C tests test
1280 sudo make install
1281 @end example
1282
1283 After you've compiled and installed libgnurl, we can proceed to building
1284 GNUnet.
1285
1286
1287
1288
1289 First, in addition to the GNUnet sources you might require downloading the
1290 latest version of various dependencies, depending on how recent the
1291 software versions in your distribution of GNU/Linux are.
1292 Most distributions do not include sufficiently recent versions of these
1293 dependencies.
Thus, a typical installation on a "modern" GNU/Linux distribution
1295 requires you to install the following dependencies (ideally in this
1296 order):
1297
1298 @itemize @bullet
1299 @item libgpgerror and libgcrypt
1300 @item libnettle and libunbound (possibly from distribution), GnuTLS
1301 @item libgnurl (read the README)
1302 @item GNU libmicrohttpd
1303 @item GNU libextractor
1304 @end itemize
1305
1306 Make sure to first install the various mandatory and optional
1307 dependencies including development headers from your distribution.
1308
Another dependency that you should strongly consider installing is a
database (MySQL, sqlite or Postgres).
The following instructions will assume that you installed at least sqlite.
For most distributions you should be able to find pre-built packages for
the database. Again, make sure to install the client libraries @b{and} the
respective development headers (if they are packaged separately) as well.
1315
You can find specific, detailed instructions for installing the
dependencies (and possibly the rest of the GNUnet installation) in the
platform-specific descriptions, which can be found in the Index.
Please consult them now.
If your distribution is not listed, please study the build
instructions for Debian stable carefully as you try to install the
dependencies for your own distribution.
Contributing additional instructions for further platforms is always
appreciated.
Please keep in mind that operating system development tends to move at
a rather fast pace. Because of this, some of
the instructions could be outdated by the time you are reading this.
1328 If you find a mistake, please tell us about it (or even better: send
1329 a patch to the documentation to fix it!).
1330
1331 Before proceeding further, please double-check the dependency list.
1332 Note that in addition to satisfying the dependencies, you might have to
1333 make sure that development headers for the various libraries are also
1334 installed.
There may be files for other distributions, or you might be able to find
1336 equivalent packages for your distribution.
1337
1338 While it is possible to build and install GNUnet without having root
1339 access, we will assume that you have full control over your system in
1340 these instructions.
1341 First, you should create a system user @emph{gnunet} and an additional
1342 group @emph{gnunetdns}. On the GNU/Linux distributions Debian and Ubuntu,
1343 type:
1344
1345 @example
1346 sudo adduser --system --home /var/lib/gnunet --group \
1347 --disabled-password gnunet
1348 sudo addgroup --system gnunetdns
1349 @end example
1350
1351 @noindent
1352 On other Unixes and GNU systems, this should have the same effect:
1353
1354 @example
sudo useradd --system --user-group --home-dir /var/lib/gnunet gnunet
sudo groupadd --system gnunetdns
1357 @end example
1358
1359 Now compile and install GNUnet using:
1360
1361 @example
1362 tar xvf gnunet-@value{VERSION}.tar.gz
1363 cd gnunet-@value{VERSION}
1364 ./configure --with-sudo=sudo --with-nssdir=/lib
1365 make
1366 sudo make install
1367 @end example
1368
1369 If you want to be able to enable DEBUG-level log messages, add
1370 @code{--enable-logging=verbose} to the end of the
1371 @command{./configure} command.
1372 @code{DEBUG}-level log messages are in English only and
1373 should only be useful for developers (or for filing
1374 really detailed bug reports).
1375
1376 @noindent
1377 Next, edit the file @file{/etc/gnunet.conf} to contain the following:
1378
1379 @example
1380 [arm]
1381 START_SYSTEM_SERVICES = YES
1382 START_USER_SERVICES = NO
1383 @end example
1384
1385 @noindent
1386 You may need to update your @code{ld.so} cache to include
1387 files installed in @file{/usr/local/lib}:
1388
1389 @example
1390 # ldconfig
1391 @end example
1392
1393 @noindent
1394 Then, switch from user @code{root} to user @code{gnunet} to start
1395 the peer:
1396
1397 @example
1398 # su -s /bin/sh - gnunet
1399 $ gnunet-arm -c /etc/gnunet.conf -s
1400 @end example
1401
1402 You may also want to add the last line in the gnunet user's @file{crontab}
1403 prefixed with @code{@@reboot} so that it is executed whenever the system
1404 is booted:
1405
1406 @example
1407 @@reboot /usr/local/bin/gnunet-arm -c /etc/gnunet.conf -s
1408 @end example
1409
1410 @noindent
1411 This will only start the system-wide GNUnet services.
1412 Type @command{exit} to get back your root shell.
1413 Now, you need to configure the per-user part. For each
1414 user that should get access to GNUnet on the system, run
1415 (replace alice with your username):
1416
1417 @example
1418 sudo adduser alice gnunet
1419 @end example
1420
1421 @noindent
1422 to allow them to access the system-wide GNUnet services. Then, each
1423 user should create a configuration file @file{~/.config/gnunet.conf}
1424 with the lines:
1425
1426 @example
1427 [arm]
1428 START_SYSTEM_SERVICES = NO
1429 START_USER_SERVICES = YES
1430 DEFAULTSERVICES = gns
1431 @end example
1432
1433 @noindent
1434 and start the per-user services using
1435
1436 @example
1437 $ gnunet-arm -c ~/.config/gnunet.conf -s
1438 @end example
1439
1440 @noindent
1441 Again, adding a @code{crontab} entry to autostart the peer is advised:
1442
1443 @example
1444 @@reboot /usr/local/bin/gnunet-arm -c $HOME/.config/gnunet.conf -s
1445 @end example
1446
1447 @noindent
1448 Note that some GNUnet services (such as SOCKS5 proxies) may need a
1449 system-wide TCP port for each user.
1450 For those services, systems with more than one user may require each user
1451 to specify a different port number in their personal configuration file.
1452
1453 Finally, the user should perform the basic initial setup for the GNU Name
1454 System (GNS) certificate authority. This is done by running:
1455
1456 @example
1457 $ gnunet-gns-proxy-setup-ca
1458 @end example
1459
1460 @noindent
This command sets up the GNS Certificate Authority with the user's
browser. Now, to activate GNS in the
normal DNS resolution process, you need to edit your
@file{/etc/nsswitch.conf} where you should find a line like this:
1465
1466 @example
1467 hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
1468 @end example
1469
1470 @noindent
1471 The exact details may differ a bit, which is fine. Add the text
1472 @emph{"gns [NOTFOUND=return]"} after @emph{"files"}.
1473 Keep in mind that we included a backslash ("\") here just for
1474 markup reasons. You should write the text below on @b{one line}
1475 and @b{without} the "\":
1476
1477 @example
1478 hosts: files gns [NOTFOUND=return] mdns4_minimal \
1479 [NOTFOUND=return] dns mdns4
1480 @end example
1481
1482 @c FIXME: Document new behavior.
You might want to make sure that @file{/lib/libnss_gns.so.2} exists on
your system; it should have been created during the installation.
1485
1486
1487 @c **********************************************************************
1488 @cindex TESTING library
1489 @node TESTING library
1490 @section TESTING library
1491
The TESTING library is used for writing testcases which involve starting
one or more peers. While peers can also be started by testcases
using the ARM subsystem, using the TESTING library provides an elegant way to
1495 do this. The configurations of the peers are auto-generated from a given
1496 template to have non-conflicting port numbers ensuring that peers'
1497 services do not run into bind errors. This is achieved by testing ports'
1498 availability by binding a listening socket to them before allocating them
1499 to services in the generated configurations.
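
The idea of probing a port by binding a socket to it can be sketched in
plain POSIX C as follows. This is a simplified illustration, not the code
used by the TESTING library: it only checks TCP on the loopback address,
and the names @code{port_is_free} and @code{find_free_port} are
hypothetical.

@example
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a TCP socket can be bound to port on the
   loopback address, 0 otherwise. */
static int
port_is_free (uint16_t port)
@{
  struct sockaddr_in addr;
  int sock;
  int ret;

  if (-1 == (sock = socket (AF_INET, SOCK_STREAM, 0)))
    return 0;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  ret = (0 == bind (sock,
                    (const struct sockaddr *) &addr,
                    sizeof (addr))) ? 1 : 0;
  close (sock);
  return ret;
@}

/* Returns the first free port in [low, high], or 0 if there
   is none. */
static uint16_t
find_free_port (uint16_t low,
                uint16_t high)
@{
  uint32_t port; /* wider than uint16_t to avoid wrap-around */

  for (port = low;
       port <= high;
       port++)
  @{
    if (0 != port_is_free ((uint16_t) port))
      return (uint16_t) port;
  @}
  return 0;
@}
@end example

@noindent
A caller could, for example, use @code{find_free_port (12000, 56000)} and
write the result into a generated configuration. Note that the port is
released again before it is handed out, so another process could still
claim it in the meantime.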
1500
Another advantage of using TESTING is that it shortens the testcase
startup time, as the hostkeys for peers are copied from a pre-computed set
of hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
1506
TESTING also allows for certain services to be shared among peers. This
feature is invaluable when testing with multiple peers as it helps to
reduce the number of services run per peer and hence the total
number of processes run per testcase.
1511
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases such as connecting peers in a topology are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the local host, whereas
testcases using TESTBED can create peers across multiple
hosts.
1518
1519 @menu
1520 * API::
1521 * Finer control over peer stop::
1522 * Helper functions::
1523 * Testing with multiple processes::
1524 @end menu
1525
1526 @cindex TESTING API
1527 @node API
1528 @subsection API
1529
TESTING abstracts a group of peers as a TESTING system. All peers in a
system have a common hostname, and no two services of these peers share
the same port or UNIX domain socket path.
1533
A TESTING system can be created with the function
@code{GNUNET_TESTING_system_create()}, which returns a handle to the
system. This function takes a directory path which is used for generating
the configurations of peers, an IP address from which connections to the
peers' services should be allowed, the hostname to be used in the peers'
configuration, and an array of shared service specifications of type
@code{struct GNUNET_TESTING_SharedService}.
1541
1542 The shared service specification must specify the name of the service to
1543 share, the configuration pertaining to that shared service and the
1544 maximum number of peers that are allowed to share a single instance of
1545 the shared service.
1546
A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
ports from the default range @code{12000} - @code{56000} while
auto-generating configurations for peers.
This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters: the start and end of the port range to use.
1554
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag to remove the files created in the directory used
to generate configurations.
1559
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated, and the index of the hostkey that is to be copied for
the peer. When successful, this function returns a handle to the
peer which can be used to start and stop it and to obtain the identity of
the peer. If unsuccessful, a NULL pointer is returned along with an error
message. This function ensures that the generated configuration has
non-conflicting ports and paths.
1569
Peers can be started and stopped by calling the functions
@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for usage in new
peers.
1576
1577 @c ***********************************************************************
1578 @node Finer control over peer stop
1579 @subsection Finer control over peer stop
1580
Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when trying to
shut down multiple peers, as this function sends the termination signal to
the given peer process and then waits for it to terminate. It is faster
in this case to send the termination signals to all peers first and then
wait on them. This is accomplished with the function
@code{GNUNET_TESTING_peer_kill()}, which sends a termination signal to the
peer, and the function @code{GNUNET_TESTING_peer_wait()}, which waits on
the peer.
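
The difference between stopping peers one by one and signalling all of
them before waiting can be illustrated with plain POSIX process calls.
The sketch below does not use the TESTING API; @code{spawn_idle_child()}
and @code{stop_all()} are hypothetical helpers, and the idle child
processes merely stand in for peers.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child process that idles until it is signalled.
   Returns the child's PID in the parent, or -1 if fork() failed. */
pid_t
spawn_idle_child (void)
{
  pid_t pid = fork ();

  if (0 == pid)
  {
    for (;;)
      pause (); /* child: sleep until a signal terminates us */
  }
  return pid;
}

/* Phase 1: send SIGTERM to every process.
   Phase 2: wait for each of them.
   Returns the number of processes that were terminated by SIGTERM. */
size_t
stop_all (const pid_t *pids, size_t n)
{
  size_t stopped = 0;

  for (size_t i = 0; i < n; i++)
    if (pids[i] > 0) /* never signal PID 0 or -1 (process groups) */
      (void) kill (pids[i], SIGTERM);
  for (size_t i = 0; i < n; i++)
  {
    int status;

    if (pids[i] <= 0)
      continue;
    if ( (pids[i] == waitpid (pids[i], &status, 0)) &&
         WIFSIGNALED (status) &&
         (SIGTERM == WTERMSIG (status)) )
      stopped++;
  }
  return stopped;
}
```

Because all signals are delivered before the first wait, the processes
shut down concurrently and the total time is bounded by the slowest one
rather than by the sum of all shutdown times.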
1590
Even finer control can be achieved by choosing to stop a peer
asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
This function takes a callback parameter and a closure for it in addition
to the handle to the peer to stop. The callback function is called with
the given closure when the peer is stopped. Using this function
eliminates blocking while waiting for the peer to terminate.
1597
An asynchronous peer stop can be canceled by calling the function
@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the
callback that would be called when the peer is stopped.
1603
1604 @c ***********************************************************************
1605 @node Helper functions
1606 @subsection Helper functions
1607
Most testcases can benefit from an abstraction which configures a
peer and starts it. This is provided by the function
@code{GNUNET_TESTING_peer_run()}. This function takes the testing
directory pathname, a configuration template, a callback and its closure.
It creates a peer in the given testing directory by using the
configuration template, starts the peer and calls the given callback with
the given closure.
1615
The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
the peer, which starts the rest of the configured services. A similar
function, @code{GNUNET_TESTING_service_run()}, can be used to start just a
single service of a peer. In this case, the peer's ARM service is not
started; instead, only the given service is run.
1621
1622 @c ***********************************************************************
1623 @node Testing with multiple processes
1624 @subsection Testing with multiple processes
1625
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1635
1636 The following code illustrates spawning and killing an ARM process from a
1637 testcase:
1638
1639 @example
static void
run (void *cls,
     char *const *args,
     const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  /* spawn ARM with the configuration file given to this test */
  arm_pid = GNUNET_OS_start_process (NULL,
                                     NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c",
                                     cfgfile,
                                     NULL);
  /* do real test work here */

  /* ask ARM to shut down cleanly, then wait for it to exit */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror
      (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}

/* called from the main() function of the testcase */
GNUNET_PROGRAM_run (argc, argv,
                    "NAME-OF-TEST",
                    "nohelp",
                    options,
                    &run,
                    cls);
1665 @end example
1666
1667
1668 An alternative way that works well to test plugins is to implement a
1669 mock-version of the environment that the plugin expects and then to
1670 simply load the plugin directly.
1671
1672 @c ***********************************************************************
1673 @node Performance regression analysis with Gauger
1674 @section Performance regression analysis with Gauger
1675
1676 To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
1677 simple logging tool that allows remote hosts to send performance data to
1678 a central server, where this data can be analyzed and visualized. Gauger
1679 shows graphs of the repository revisions and the performance data recorded
1680 for each revision, so sudden performance peaks or drops can be identified
1681 and linked to a specific revision number.
1682
1683 In the case of GNUnet, the buildbots log the performance data obtained
1684 during the tests after each build. The data can be accessed on GNUnet's
1685 Gauger page.
1686
The menu on the left allows you to select either the results of just one
build bot (under "Hosts") or review the data from all hosts for a given
test result (under "Metrics"). If the absolute values of the results
differ widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea about the
performance evolution across all hosts.
1693
Using Gauger in GNUnet and having the performance of a module tracked over
time is very easy. First, of course, the testcase must generate some
consistent metric that is worth logging. Highly volatile metrics or
metrics that depend on randomness are probably not ideal candidates for
meaningful regression detection.
1699
To start logging any value, just include @code{gauger.h} in your testcase
code. Then, use the macro @code{GAUGER()} to make the Buildbots log
whatever value is of interest to you to @code{gnunet.org}'s Gauger
server. No setup is necessary, as most Buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
need to contact a member of the GNUnet development team (a file will need
to be removed manually from the respective directory).
1707
1708 The code in the test should look like this:
1709
1710 @example
[other includes]
#include <gauger.h>

int
main (int argc, char *argv[])
@{
  [run test, generate data]
  GAUGER ("YOUR_MODULE",
          "METRIC_NAME",
          (float) value,
          "UNIT");
@}
1721 @end example
1722
1723 Where:
1724
1725 @table @asis
1726
@item @strong{YOUR_MODULE}
is a category on the Gauger page and should be the name of the module or
subsystem, like "Core" or "DHT".
@item @strong{METRIC_NAME}
is the name of the metric being collected and should be concise and
descriptive, like "PUT operations in sqlite-datastore".
@item @strong{value}
is the value of the metric that is logged for this run.
@item @strong{UNIT}
is the unit in which the value is measured, for instance "kb/s" or
"kb of RAM/node".
1736 @end table
1737
1738 If you wish to use Gauger for your own project, you can grab a copy of the
1739 latest stable release or check out Gauger's Subversion repository.
1740
1741 @cindex TESTBED Subsystem
1742 @node TESTBED Subsystem
1743 @section TESTBED Subsystem
1744
1745 The TESTBED subsystem facilitates testing and measuring of multi-peer
1746 deployments on a single host or over multiple hosts.
1747
1748 The architecture of the testbed module is divided into the following:
1749 @itemize @bullet
1750
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting and stopping
peers, etc.

@item Testbed service (controller): A service which is started through the
Testbed API. This service handles operations to create, destroy, start
and stop peers, connect them, and modify their configurations.
1758
1759 @item Testbed helper: When a controller has to be started on a host, the
1760 testbed API starts the testbed helper on that host which in turn starts
1761 the controller. The testbed helper receives a configuration for the
1762 controller through its stdin and changes it to ensure the controller
1763 doesn't run into any port conflict on that host.
1764 @end itemize
1765
1766
The testbed service (controller) is different from the other GNUnet
services in that it is not started by ARM and is not supposed to be run
as a daemon. It is started by the testbed API through a testbed helper.
In a typical scenario involving multiple hosts, a controller is started
on each host. Controllers take up the actual task of creating peers and
starting and stopping them on the hosts they run on.
1773
When running deployments on the local host only, the testbed API starts the
testbed helper directly as a child process. When running deployments on
remote hosts, the testbed API starts testbed helpers on each remote host
through a remote shell. By default, the testbed API uses SSH as the remote
shell. This can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1782
1783 @example
1784 export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
1785 -o NoHostAuthenticationForLocalhost=yes %h"
1786 @end example
1787
The command string above may contain substitution placemarks, which
begin with a `%'.
At present the following substitutions are supported:
1791
1792 @itemize @bullet
1793 @item %h: hostname
1794 @item %u: username
1795 @item %p: port
1796 @end itemize
1797
1798 Note that the substitution placemark is replaced only when the
1799 corresponding field is available and only once. Specifying
1800
1801 @example
1802 %u@@%h
1803 @end example
1804
doesn't work either. If you want to use username substitutions for
@command{SSH}, use the argument @code{-l} before the
username substitution.
1808
1809 For example:
1810 @example
1811 ssh -l %u -p %p %h
1812 @end example
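
The substitution rules can be modelled with a short C sketch. It is a
simplified illustration and not taken from the GNUnet sources: the
function names are hypothetical, and the sketch assumes that the command
has already been split into arguments and that a placemark is replaced
only when it forms a whole argument and the corresponding value is
available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the replacement for a single argument.  The argument itself is
   returned when it is not exactly one of the placemarks or when the
   corresponding value is not available (NULL). */
const char *
substitute_placemark (const char *arg,
                      const char *hostname,
                      const char *username,
                      const char *port)
{
  const char *value = NULL;

  assert (NULL != arg);
  if (0 == strcmp (arg, "%h"))
    value = hostname;
  else if (0 == strcmp (arg, "%u"))
    value = username;
  else if (0 == strcmp (arg, "%p"))
    value = port;
  return (NULL != value) ? value : arg;
}

/* Apply the substitution to every argument of a command line. */
void
substitute_argv (const char **argv,
                 size_t argc,
                 const char *hostname,
                 const char *username,
                 const char *port)
{
  for (size_t i = 0; i < argc; i++)
    argv[i] = substitute_placemark (argv[i], hostname, username, port);
}
```

Under these assumptions, the arguments of @code{ssh -l %u -p %p %h} are
replaced one by one, a placemark without a known value is left
untouched, and a combined argument such as @code{%u@@%h} is not expanded
into a username and a hostname.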
1813
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags that make the remote shell produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:
1821
1822 @example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
1824 @end example
1825
1826 Password-less logins should be ensured by using SSH keys.
1827
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like @file{.bashrc} or @file{.profile} may not be
executed. If this is the case, the testbed API can be forced to execute
an interactive shell by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
1833
1834 An example could be:
1835
1836 @example
1837 export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1838 @end example
1839
1840 The testbed API will then execute the remote shell program as:
1841
1842 @example
1843 $GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1844 gnunet-helper-testbed
1845 @end example
1846
1847 On some systems, problems may arise while starting testbed helpers if
1848 GNUnet is installed into a custom location since the helper may not be
1849 found in the standard path. This can be addressed by setting the variable
1850 `@code{HELPER_BINARY_PATH}' to the path of the testbed helper.
1851 Testbed API will then use this path to start helper binaries both
1852 locally and remotely.
1853
The testbed API can be accessed by including the
@file{gnunet_testbed_service.h} file and linking with
@code{-lgnunettestbed}.
1857
1858 @c ***********************************************************************
1859 @menu
1860 * Supported Topologies::
1861 * Hosts file format::
1862 * Topology file format::
1863 * Testbed Barriers::
1864 * TESTBED Caveats::
1865 @end menu
1866
1867 @node Supported Topologies
1868 @subsection Supported Topologies
1869
While testing multi-peer deployments, the peers often need to be
connected in some topology. This requirement is addressed by the
function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
two peers in the testbed.
1874
1875 The API also provides a helper function
1876 @code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1877 of peers in any of the following supported topologies:
1878
1879 @itemize @bullet
1880
1881 @item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1882 each other
1883
1884 @item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1885 line
1886
1887 @item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1888 ring topology
1889
1890 @item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2 dimensional torus topology. The number of peers need not be a
perfect square; in that case the resulting torus may not have uniform
poloidal and toroidal lengths
1894
1895 @item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
1896 to form a random graph. The number of links to be present should be given
1897
@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D Torus with some random links among them. The number of random
links is to be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links is to be given
1905
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well connected peers.
1909 (See Emergence of Scaling in Random Networks. Science 286,
1910 509-512, 1999
1911 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf}))
1912
1913 @item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1914 is loaded from a file. The path to the file has to be given.
1915 @xref{Topology file format}, for the format of this file.
1916
1917 @item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1918 @end itemize
1919
1920
1921 The above supported topologies can be specified respectively by setting
1922 the variable @code{OVERLAY_TOPOLOGY} to the following values in the
1923 configuration passed to Testbed API functions
1924 @code{GNUNET_TESTBED_test_run()} and
1925 @code{GNUNET_TESTBED_run()}:
1926
1927 @itemize @bullet
1928 @item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1931 @item @code{2D_TORUS}
1932 @item @code{RANDOM}
1933 @item @code{SMALL_WORLD}
1934 @item @code{SMALL_WORLD_RING}
1935 @item @code{SCALE_FREE}
1936 @item @code{FROM_FILE}
1937 @item @code{NONE}
1938 @end itemize
1939
1940
1941 Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1942 require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1943 random links to be generated in the configuration. The option will be
1944 ignored for the rest of the topologies.
1945
Topology @code{SCALE_FREE} requires two options:
@code{SCALE_FREE_TOPOLOGY_CAP}, set to the maximum number of peers
which can connect to a peer, and @code{SCALE_FREE_TOPOLOGY_M}, set to
the minimum number of peers a peer should be connected to.
1950
1951 Similarly, the topology @code{FROM_FILE} requires the option
1952 @code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1953 the topology information. This option is ignored for the rest of the
1954 topologies. @xref{Topology file format}, for the format of this file.
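
To make the simplest topologies of this section concrete, the following
sketch generates the links of a line, a ring and a clique and returns
their number. It is an illustration only: @code{struct Link} and the
@code{generate_*()} functions are hypothetical, and links are treated as
undirected pairs, which may differ from how the testbed counts overlay
connections internally.

```c
#include <assert.h>
#include <stddef.h>

/* An undirected link between the peers with indices a and b. */
struct Link
{
  unsigned int a;
  unsigned int b;
};

/* Line: peer i is linked to peer i+1.  Writes n-1 links. */
size_t
generate_line (unsigned int n, struct Link *links)
{
  size_t count = 0;

  for (unsigned int i = 0; i + 1 < n; i++)
  {
    links[count].a = i;
    links[count].b = i + 1;
    count++;
  }
  return count;
}

/* Ring: a line whose last peer is linked back to the first.
   Writes n links; needs at least 3 peers to avoid duplicate links. */
size_t
generate_ring (unsigned int n, struct Link *links)
{
  size_t count;

  assert (n >= 3);
  count = generate_line (n, links);
  links[count].a = n - 1;
  links[count].b = 0;
  return count + 1;
}

/* Clique: every pair of distinct peers is linked.
   Writes n*(n-1)/2 links. */
size_t
generate_clique (unsigned int n, struct Link *links)
{
  size_t count = 0;

  for (unsigned int i = 0; i < n; i++)
    for (unsigned int j = i + 1; j < n; j++)
    {
      links[count].a = i;
      links[count].b = j;
      count++;
    }
  return count;
}
```

The caller has to provide an array that is large enough for the chosen
topology. The quadratic growth of the clique is worth keeping in mind
when choosing a topology for a large number of peers.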
1955
1956 @c ***********************************************************************
1957 @node Hosts file format
1958 @subsection Hosts file format
1959
The testbed API offers the function
@code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
details about the hosts which testbed can use for deploying peers.
This function is useful for keeping the data about hosts
separate instead of hard-coding it.
1965
1966 Another helper function from testbed API, @code{GNUNET_TESTBED_run()}
1967 also takes a hosts file name as its parameter. It uses the above
1968 function to populate the hosts data structures and start controllers to
1969 deploy peers.
1970
1971 These functions require the hosts file to be of the following format:
1972 @itemize @bullet
1973 @item Each line is interpreted to have details about a host
@item Host details should include the username to use for logging into the
host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
1977 @item These details should be given in the following format:
1978 @example
1979 <username>@@<hostname>:<port>
1980 @end example
1981 @end itemize
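
A line of this format can be split into its three fields with
@code{sscanf()}. The sketch below is not the parser used by the testbed
API; @code{struct HostEntry} and @code{parse_host_line()} are
hypothetical names, the field sizes are arbitrary, and the sketch
assumes that the hostname contains no colon (so numerical IPv6
addresses are not handled).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a hosts file. */
struct HostEntry
{
  char username[64];
  char hostname[256];
  unsigned int port;
};

/* Parse "<username>@<hostname>:<port>".
   Returns 0 on success and -1 if the line is malformed. */
int
parse_host_line (const char *line, struct HostEntry *out)
{
  char trailing;
  int n;

  assert ((NULL != line) && (NULL != out));
  memset (out, 0, sizeof (*out));
  /* The final " %c" only matches if something other than white space
     follows the port; in that case sscanf() returns 4 and the line is
     rejected. */
  n = sscanf (line,
              " %63[^@ \t\n]@%255[^: \t\n]:%5u %c",
              out->username,
              out->hostname,
              &out->port,
              &trailing);
  if ( (3 != n) || (0 == out->port) || (out->port > 65535) )
    return -1;
  return 0;
}
```

Rejecting lines on which one of the three values is missing mirrors the
requirement above that all of them are given.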
1982
Note that having canonical hostnames may cause problems while resolving
the IP addresses (See this bug). Hence it is advised to provide the hosts'
numerical IP addresses as hostnames whenever possible.
1986
1987 @c ***********************************************************************
1988 @node Topology file format
1989 @subsection Topology file format
1990
1991 A topology file describes how peers are to be connected. It should adhere
1992 to the following format for testbed to parse it correctly.
1993
1994 Each line should begin with the target peer id. This should be followed by
1995 a colon(`:') and origin peer ids separated by `|'. All spaces except for
1996 newline characters are ignored. The API will then try to connect each
1997 origin peer to the target peer.
1998
For example, the following file will result in 5 overlay connections:
[2->1], [3->1], [4->3], [0->3], [2->0]

@example
1:2|3
3:4| 0
0: 2
@end example
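
The format described in this section can be parsed with a few lines of
C. The following sketch is not the testbed's own parser;
@code{struct Link} and @code{parse_topology()} are hypothetical names,
and the sketch assumes that peer ids are non-negative decimal numbers
that fit into an @code{unsigned int}.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* A connection that is to be established from origin to target. */
struct Link
{
  unsigned int origin;
  unsigned int target;
};

/* Parse topology data of the form "target:origin|origin|...", one
   target per line.  Stores at most max_links links.  Returns the number
   of links found, or -1 on a syntax error or if links is too small. */
int
parse_topology (const char *data, struct Link *links, size_t max_links)
{
  const char *p = data;
  size_t count = 0;

  assert ((NULL != data) && (NULL != links));
  while ('\0' != *p)
  {
    char *end;
    unsigned long target;

    while (isspace ((unsigned char) *p)) /* skip blanks and empty lines */
      p++;
    if ('\0' == *p)
      break;
    if (! isdigit ((unsigned char) *p))
      return -1;
    target = strtoul (p, &end, 10);
    p = end;
    while ((' ' == *p) || ('\t' == *p))
      p++;
    if (':' != *p)
      return -1;
    p++;
    for (;;) /* one iteration per origin peer id */
    {
      unsigned long origin;

      while ((' ' == *p) || ('\t' == *p))
        p++;
      if (! isdigit ((unsigned char) *p))
        return -1;
      origin = strtoul (p, &end, 10);
      p = end;
      if (count == max_links)
        return -1;
      links[count].origin = (unsigned int) origin;
      links[count].target = (unsigned int) target;
      count++;
      while ((' ' == *p) || ('\t' == *p))
        p++;
      if ('|' != *p)
        break;
      p++;
    }
    if (('\n' != *p) && ('\0' != *p))
      return -1;
  }
  return (int) count;
}
```

Each parsed link corresponds to one connection attempt from an origin
peer to the target peer named at the start of its line.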
2002
2003 @c ***********************************************************************
2004 @node Testbed Barriers
2005 @subsection Testbed Barriers
2006
The testbed subsystem's barriers API facilitates coordination among the
peers run by the testbed and the experiment driver. The concept is
similar to the barrier synchronisation mechanism found in parallel
programming or multi-threading paradigms: a peer waits at a barrier upon
reaching it until the barrier is reached by a predefined number of peers.
This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for
the barrier to be crossed. Similarly, a barrier is said to be reached if
the required quorum of peers has reached the barrier. A barrier which is
reached is deemed crossed after all the peers waiting on it are notified.
2017
2018 The barriers API provides the following functions:
2019 @itemize @bullet
2020 @item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
2021 initialize a barrier in the experiment
2022 @item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
2023 a barrier which has been initialized before
2024 @item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
2025 barrier service that the caller has reached a barrier and is waiting for
2026 it to be crossed
2027 @item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
2028 stop waiting for a barrier to be crossed
2029 @end itemize
2030
2031
Among the above functions, the first two, namely
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}, are used by experiment drivers. All
barriers should be initialised by the experiment driver by calling
@code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
identify the barrier, the quorum required for the barrier to be crossed
and a notification callback for notifying the experiment driver when the
barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called upon an initialised barrier before it is crossed.
2042
The remaining two functions, @code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}, are used in the peers'
processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
barrier service running on the same host the peer is running on and
registers that the caller has reached the barrier and is waiting for the
barrier to be crossed. Note that this function can only be used by peers
which are started by testbed, as this function tries to access the local
barrier service which is part of the testbed controller service. Calling
@code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
2054
2055
2056 @c ***********************************************************************
2057 @menu
2058 * Implementation::
2059 @end menu
2060
2061 @node Implementation
2062 @subsubsection Implementation
2063
Since barriers involve coordination between the experiment driver and
peers, the barrier service in the testbed controller is split into two
components. The first component responds to the messages generated by the
barrier API used by the experiment driver (functions
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
messages generated by the barrier API used by peers (functions
@code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}).
2073
Calling @code{GNUNET_TESTBED_barrier_init()} sends a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
this way barrier initialisation is propagated through the controller
hierarchy. While propagating initialisation, any errors at a
subcontroller, such as a timeout during further propagation, are reported
up the hierarchy back to the experiment driver.
2082
Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.
2087
The second component is implemented as a separate service in the binary
`gnunet-service-testbed', which already contains the testbed controller
service. Although this deviates from the GNUnet process architecture of
having one service per binary, it is needed in this case as this component
needs access to the barrier data created by the first component. This
component responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT}
messages from local peers when they call
@code{GNUNET_TESTBED_barrier_wait()}. Upon receiving a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the
service checks whether the requested barrier has been initialised.
If it has not, an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.
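
The bookkeeping that the second component performs for a wait request
can be sketched as follows. This is a simplified model and not the code
of `gnunet-service-testbed': all type and function names are
hypothetical, the quorum is modelled as a plain number of peers, and
messages, connections and notifications are reduced to return values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BARRIERS 8

/* Outcome of handling one wait request. */
enum WaitResult
{
  WAIT_ERROR_UNINITIALISED = -1, /* unknown barrier: report an error */
  WAIT_REGISTERED = 0,           /* counted, quorum not yet attained */
  WAIT_QUORUM_REACHED = 1        /* this request attained the quorum */
};

struct Barrier
{
  const char *name;     /* name identifying the barrier */
  unsigned int quorum;  /* number of peers required */
  unsigned int reached; /* number of peers waiting so far */
};

struct BarrierTable
{
  struct Barrier barriers[MAX_BARRIERS];
  size_t count;
};

/* Register a barrier.  Returns 0 on success, -1 if the table is full. */
int
barrier_init (struct BarrierTable *t, const char *name, unsigned int quorum)
{
  assert (quorum > 0);
  if (MAX_BARRIERS == t->count)
    return -1;
  t->barriers[t->count].name = name;
  t->barriers[t->count].quorum = quorum;
  t->barriers[t->count].reached = 0;
  t->count++;
  return 0;
}

/* Handle a wait request of a local peer for the named barrier. */
enum WaitResult
barrier_wait (struct BarrierTable *t, const char *name)
{
  for (size_t i = 0; i < t->count; i++)
  {
    struct Barrier *b = &t->barriers[i];

    if (0 != strcmp (b->name, name))
      continue;
    b->reached++;
    return (b->reached >= b->quorum) ? WAIT_QUORUM_REACHED
                                     : WAIT_REGISTERED;
  }
  return WAIT_ERROR_UNINITIALISED;
}
```

In this model, a wait request for a barrier that was never initialised
is answered with an error, while every other request is counted until
the quorum is attained.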
2103
When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
parent informing it that the barrier is crossed. If the controller has
started further subcontrollers, it delays this message until it receives
a similar notification from each of those subcontrollers. Finally, the
barriers API at the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the barrier
is reached at all the controllers.
2113
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller to the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between this downward
propagation of the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
message and its upward propagation: the upward propagation is needed
for ensuring that the barrier is reached by all the controllers, and the
downward propagation is for signalling that the barrier is crossed.
2126
2127 @cindex TESTBED Caveats
2128 @node TESTBED Caveats
2129 @subsection TESTBED Caveats
2130
2131 This section documents a few caveats when using the GNUnet testbed
2132 subsystem.
2133
2134 @c ***********************************************************************
2135 @menu
2136 * CORE must be started::
2137 * ATS must want the connections::
2138 @end menu
2139
2140 @node CORE must be started
2141 @subsubsection CORE must be started
2142
An uncomplicated issue is bug #3993
(@uref{https://bugs.gnunet.org/view.php?id=3993, https://bugs.gnunet.org/view.php?id=3993}):
Your configuration MUST somehow ensure that for each peer the
@code{CORE} service is started when the peer is set up, otherwise
@code{TESTBED} may fail to connect peers when the topology is initialized,
as @code{TESTBED} will start some @code{CORE} services but not
necessarily all (but it relies on all of them running). The easiest way
is to set
2151
2152 @example
2153 [core]
2154 IMMEDIATE_START = YES
2155 @end example
2156
2157 @noindent
2158 in the configuration file.
2159 Alternatively, having any service that directly or indirectly depends on
2160 @code{CORE} being started with @code{IMMEDIATE_START} will also do.
2161 This issue largely arises if users try to over-optimize by not
2162 starting any services with @code{IMMEDIATE_START}.
2163
2164 @c ***********************************************************************
2165 @node ATS must want the connections
2166 @subsubsection ATS must want the connections
2167
2168 When TESTBED sets up connections, it only offers the respective HELLO
2169 information to the TRANSPORT service. It is then up to the ATS service to
2170 @strong{decide} to use the connection. The ATS service will typically
2171 eagerly establish any connection if the number of total connections is
2172 low (relative to bandwidth). Details may further depend on the
2173 specific ATS backend that was configured. If ATS decides to NOT establish
2174 a connection (even though TESTBED provided the required information), then
2175 that connection will count as failed for TESTBED. Note that you can
2176 configure TESTBED to tolerate a certain number of connection failures
2177 (see '-e' option of gnunet-testbed-profiler). This issue largely arises
2178 for dense overlay topologies, especially if you try to create cliques
2179 with more than 20 peers.
2180
2181 @cindex libgnunetutil
2182 @node libgnunetutil
2183 @section libgnunetutil
2184
2185 libgnunetutil is the fundamental library that all GNUnet code builds upon.
2186 Ideally, this library should contain most of the platform dependent code
2187 (except for user interfaces and really special needs that only few
2188 applications have). It is also supposed to offer basic services that most
2189 if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util.h} header. The functions provided by libgnunetutil fall
into the following categories (in roughly the order of importance
for new developers):
2194
2195 @itemize @bullet
2196 @item logging (common_logging.c)
2197 @item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
2199 @item internationalization (common_gettext.c)
2200 @item String manipulation (string.c)
2201 @item file access (disk.c)
2202 @item buffered disk IO (bio.c)
2203 @item time manipulation (time.c)
2204 @item configuration parsing (configuration.c)
2205 @item command-line handling (getopt*.c)
2206 @item cryptography (crypto_*.c)
2207 @item data structures (container_*.c)
2208 @item CPS-style scheduling (scheduler.c)
2209 @item Program initialization (program.c)
2210 @item Networking (network.c, client.c, server*.c, service.c)
2211 @item message queuing (mq.c)
2212 @item bandwidth calculations (bandwidth.c)
2213 @item Other OS-related (os*.c, plugin.c, signal.c)
2214 @item Pseudonym management (pseudonym.c)
2215 @end itemize
2216
2217 It should be noted that only developers that fully understand this entire
2218 API will be able to write good GNUnet code.
2219
2220 Ideally, porting GNUnet should only require porting the gnunetutil
2221 library. More testcases for the gnunetutil APIs are therefore a great
2222 way to make porting of GNUnet easier.
2223
2224 @menu
2225 * Logging::
2226 * Interprocess communication API (IPC)::
2227 * Cryptography API::
2228 * Message Queue API::
2229 * Service API::
2230 * Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
2231 * CONTAINER_MDLL API::
2232 @end menu
2233
2234 @cindex Logging
2235 @cindex log levels
2236 @node Logging
2237 @subsection Logging
2238
2239 GNUnet is able to log its activity, mostly for the purposes of debugging
2240 the program at various levels.
2241
2242 @file{gnunet_common.h} defines several @strong{log levels}:
2243 @table @asis
2244
2245 @item ERROR for errors
2246 (really problematic situations, often leading to crashes)
2247 @item WARNING for warnings
2248 (troubling situations that might have negative consequences, although
2249 not fatal)
2250 @item INFO for various information.
2251 Used somewhat rarely, as GNUnet statistics is used to hold and display
2252 most of the information that users might find interesting.
@item DEBUG for debugging.
Does not produce much output on normal builds, but when extra logging is
enabled at compile time, a staggering amount of data is produced under
this log level.
2257 @end table
2258
2259
2260 Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
2261 are supposed to log nothing under DEBUG level. The
2262 @code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
amounts of log data, which is inconvenient when one tries to hunt down a
specific problem.
2266
2267 To mitigate this problem, GNUnet provides facilities to apply a filter to
2268 reduce the logs:
2269 @table @asis
2270
@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or
"--log=loglevel" option. If used, it makes the process set a global log
level to "loglevel". Thus it is possible to run some processes
with -L DEBUG, for example, and others with -L ERROR to enable specific
settings to diagnose problems with a particular process.
@item Configuration files
Because GNUnet
service and daemon processes are usually launched by gnunet-arm, it is not
possible to pass different custom command line options directly to every
one of them. The options passed to @code{gnunet-arm} only affect
gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service
or a daemon, and put a value of "-L loglevel" there. This will make the
respective service or daemon set its log level to "loglevel" (as the
value of OPTIONS will be passed as a command-line argument).
2291
2292 To specify the same log level for all services without creating separate
2293 "OPTIONS" entries in the configuration for each one, the user can specify
2294 a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
2295 file. The value of GLOBAL_POSTFIX will be appended to all command lines
2296 used by the ARM service to run other services. It can contain any option
2297 valid for all GNUnet commands, thus in particular the "-L loglevel"
2298 option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
2299 to set log level for it, one has to specify "OPTIONS" key in the [arm]
2300 section.
2301 @item Environment variables.
2302 Setting global per-process log levels with "-L loglevel" does not offer
2303 sufficient log filtering granularity, as one service will call interface
2304 libraries and supporting libraries of other GNUnet services, potentially
2305 producing lots of debug log messages from these libraries. Also, changing
2306 the config file is not always convenient (especially when running the
2307 GNUnet test suite).@ To fix that, and to allow GNUnet to use different
2308 log filtering at runtime without re-compiling the whole source tree, the
2309 log calls were changed to be configurable at run time. To configure them
2310 one has to define environment variables "GNUNET_FORCE_LOGFILE",
2311 "GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
2312 @itemize @bullet
2313
2314 @item "GNUNET_LOG" only affects the logging when no global log level is
2315 configured by any other means (that is, the process does not explicitly
2316 set its own log level, there are no "-L loglevel" options on command line
2317 or in configuration files), and can be used to override the default
2318 WARNING log level.
2319
2320 @item "GNUNET_FORCE_LOG" will completely override any other log
2321 configuration options given.
2322
2323 @item "GNUNET_FORCE_LOGFILE" will completely override the location of the
2324 file to log messages to. It should contain a relative or absolute file
2325 name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing
2326 "--log-file=logfile" or "-l logfile" option (see below). It supports "[]"
2327 format in file names, but not "@{@}" (see below).
2328 @end itemize
2329
2330
2331 Because environment variables are inherited by child processes when they
2332 are launched, starting or re-starting the ARM service with these
2333 variables will propagate them to all other services.
2334
2335 "GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
2336 formatted @strong{logging definition} string, which looks like this:@
2337
2338 @c FIXME: Can we close this with [/component] instead?
2339 @example
2340 [component];[file];[function];[from_line[-to_line]];loglevel[/component...]
2341 @end example
2342
That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a
slash to its end (although it is not forbidden either).@ All definition
fields (component, file, function, lines and loglevel) must be present,
but all of them except the loglevel may be empty. An empty field means
"match anything". Note that even if fields are empty, the semicolon (';')
separators must be present.@ The loglevel field must
contain one of the log level names (ERROR, WARNING, INFO or DEBUG).@
The lines field may contain one non-negative number, in which case it
matches only one line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including
both start and end line).@ GNUnet mostly defaults the component name to
the name of the service that is implemented in a process ('transport',
'core', 'peerinfo', etc), but logging calls can specify custom component
names using @code{GNUNET_log_from}.@ File name and function name are
provided by the compiler (@code{__FILE__} and @code{__FUNCTION__}
built-ins).
2359
2360 Component, file and function fields are interpreted as non-extended
2361 regular expressions (GNU libc regex functions are used). Matching is
2362 case-sensitive, "^" and "$" will match the beginning and the end of the
2363 text. If a field is empty, its contents are automatically replaced with
2364 a ".*" regular expression, which matches anything. Matching is done in
2365 the default way, which means that the expression matches as long as it's
2366 contained anywhere in the string. Thus "GNUNET_" will match both
2367 "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
2368 the expression matches at the start and/or at the end of the string.
2369 The semicolon (';') can't be escaped, and GNUnet will not use it in
2370 component names (it can't be used in function names and file names
2371 anyway).
2372
2373 @end table
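
As an illustration of the configuration-file approach described above,
fragments along the following lines could be used. This is only a sketch:
the choice of the @code{transport} service and of the particular log
levels is an arbitrary assumption made for the example.

@example
# Only the transport service logs at DEBUG level.
[transport]
OPTIONS = -L DEBUG

# The ARM service itself is configured through its own OPTIONS key.
[arm]
OPTIONS = -L WARNING
@end example

@noindent
To give the same level to all services started by ARM instead, a single
GLOBAL_POSTFIX entry might be used:

@example
[arm]
# Appended to the command line of every service started by ARM,
# but not applied to the ARM service itself.
GLOBAL_POSTFIX = -L INFO
@end example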
2374
2375
Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log
definition match the call arguments, then the call log level is compared
to the log level of that definition. If the call log level is less than
or equal to the definition log level, the call is allowed to proceed.
Otherwise the logging call is forbidden, and nothing is logged. If no
definitions matched at all, GNUnet will use the global log level or (if a
global log level is not specified) will default to WARNING (that is, it
will allow the call to proceed if its level is less than or equal to the
global log level or to WARNING).
2386
That is, definitions are evaluated from left to right, and the first
matching definition is used to allow or deny the logging call. Thus it is
advised to place narrow definitions at the beginning of the logdef
string, and generic definitions at the end.
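
The first-match rule can be sketched in a few lines of standalone C. This
is not GNUnet's implementation: the @code{enum level} values, the
@code{struct logdef} type and the functions @code{field_matches} and
@code{allowed} are hypothetical names chosen for this illustration, the
file and line fields are left out, and the regular expressions are
compiled on every call for brevity. Only the POSIX @code{regcomp},
@code{regexec} and @code{regfree} functions are assumed to be available.

@example
#include <regex.h>
#include <stdio.h>

/* Hypothetical levels for this sketch; smaller means more severe. */
enum level @{ ERROR, WARNING, INFO, DEBUG @};

/* One definition entry; file and line fields are left out for brevity. */
struct logdef @{
  const char *component;  /* regular expression, "" matches anything */
  const char *function;   /* regular expression, "" matches anything */
  enum level level;
@};

/* Return 1 if 'pattern' matches anywhere in 'text', 0 otherwise. */
static int
field_matches (const char *pattern, const char *text)
@{
  regex_t rx;
  int ok;

  if ('\0' == pattern[0])
    return 1;
  if (0 != regcomp (&rx, pattern, REG_NOSUB))
    return 0;
  ok = (0 == regexec (&rx, text, 0, NULL, 0));
  regfree (&rx);
  return ok;
@}

/* The first matching definition decides; otherwise use the global level. */
static int
allowed (const struct logdef *defs, size_t n, enum level global,
         const char *comp, const char *func, enum level call)
@{
  for (size_t i = 0; i < n; i++)
    if (field_matches (defs[i].component, comp) &&
        field_matches (defs[i].function, func))
      return call <= defs[i].level;
  return call <= global;
@}

int
main (void)
@{
  /* corresponds to "transport.*;;.*send.*;;DEBUG/;;;;WARNING" */
  const struct logdef defs[] = @{
    @{ "transport.*", ".*send.*", DEBUG @},
    @{ "", "", WARNING @}
  @};

  printf ("%d\n", allowed (defs, 2, WARNING, "transport", "do_send", DEBUG));
  printf ("%d\n", allowed (defs, 2, WARNING, "core", "do_send", DEBUG));
  printf ("%d\n", allowed (defs, 2, WARNING, "core", "do_send", WARNING));
  return 0;
@}
@end example

@noindent
The program prints 1, 0 and 1: the DEBUG call from the transport
component is allowed by the first entry, while calls from @code{core}
fall through to the generic second entry, which only admits WARNING and
more severe levels. Swapping the two entries would make the generic
entry match first and hide the transport debug output.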
2391
2392 Whether a call is allowed or not is only decided the first time this
2393 particular call is made. The evaluation result is then cached, so that
2394 any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect the process performance.
2397 Log definition parsing is only done once, at the first call to
2398 @code{GNUNET_log_setup ()} made by the process (which is usually done soon after
2399 it starts).
2400
2401 At the moment of writing there is no way to specify logging definitions
2402 from configuration files, only via environment variables.
2403
2404 At the moment GNUnet will stop processing a log definition when it
2405 encounters an error in definition formatting or an error in regular
2406 expression syntax, and will not report the failure in any way.
2407
2408
2409 @c ***********************************************************************
2410 @menu
2411 * Examples::
2412 * Log files::
2413 * Updated behavior of GNUNET_log::
2414 @end menu
2415
2416 @node Examples
2417 @subsubsection Examples
2418
2419 @table @asis
2420
@item Start GNUnet process tree, running all processes with DEBUG level
(one should be careful with it, as log files will grow at an alarming
rate!)

@example
GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s
@end example

@item Start GNUnet process tree, running the core service under DEBUG
level (everything else will use configured or default level).

@example
GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s
@end example
2427
2428 @item Start GNUnet process tree, allowing any logging calls from
2429 gnunet-service-transport_validation.c (everything else will use
2430 configured or default level).
2431
2432 @example
GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" \
2434 gnunet-arm -s
2435 @end example
2436
2437 @item Start GNUnet process tree, allowing any logging calls from
2438 gnunet-gnunet-service-fs_push.c (everything else will use configured or
2439 default level).
2440
2441 @example
2442 GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2443 @end example
2444
2445 @item Start GNUnet process tree, allowing any logging calls from the
2446 GNUNET_NETWORK_socket_select function (everything else will use
2447 configured or default level).
2448
2449 @example
2450 GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2451 @end example
2452
2453 @item Start GNUnet process tree, allowing any logging calls from the
2454 components that have "transport" in their names, and are made from
2455 function that have "send" in their names. Everything else will be allowed
2456 to be logged only if it has WARNING level.
2457
2458 @example
2459 GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2460 @end example
2461
2462 @end table
2463
2464
2465 On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:
2468
2469 @example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
2471 @end example
2472
(note the absence of double quotes in the environment variable definition,
as opposed to earlier examples, which use the shell).
Another limitation on Windows is that GNUNET_FORCE_LOGFILE @strong{MUST}
be set in order for GNUNET_FORCE_LOG to work.
2477
2478
2479 @cindex Log files
2480 @node Log files
2481 @subsubsection Log files
2482
GNUnet can be told to log everything into a file instead of stderr (which
is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed via the command line, or from the "OPTIONS" and
"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
with this option is subject to GNUnet filename expansion. If specified in
"GLOBAL_POSTFIX", it is also subject to ARM service filename expansion;
in particular, it may contain a "@{@}" (left and right curly brace)
sequence, which will be replaced by ARM with the name of the service.
This is used to keep logs from more than one service separate, while only
specifying one template containing "@{@}" in GLOBAL_POSTFIX.
2493
As part of a secondary file name expansion, the first occurrence of the
"[]" sequence ("left square brace" followed by "right square brace") in
the file name will be replaced with the process identifier of the process
when it initializes its logging subsystem. As a result, all processes will
log into different files. This is convenient for isolating messages of a
particular process, and prevents I/O races when multiple processes try to
write into the file at the same time. This expansion is done
independently of the "@{@}" expansion that the ARM service does (see above).
2502
2503 The log file name that is specified via "-l" can contain format characters
2504 from the 'strftime' function family. For example, "%Y" will be replaced
2505 with the current year. Using "basename-%Y-%m-%d.log" would include the
2506 current year, month and day in the log file. If a GNUnet process runs for
2507 long enough to need more than one log file, it will eventually clean up
2508 old log files. Currently, only the last three log files (plus the current
2509 log file) are preserved. So once the fifth log file goes into use (so
2510 after 4 days if you use "%Y-%m-%d" as above), the first log file will be
2511 automatically deleted. Note that if your log file name only contains "%Y",
2512 then log files would be kept for 4 years and the logs from the first year
2513 would be deleted once year 5 begins. If you do not use any date-related
2514 string format codes, logs would never be automatically deleted by GNUnet.
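
As a sketch of how these expansions combine, the following fragment uses
one template for all services started by ARM. The directory
@file{/var/log/gnunet/} is a hypothetical location chosen for this
example and would have to exist and be writable.

@example
[arm]
# "@{@}" is replaced by ARM with the service name; the strftime codes
# are expanded to the current year, month and day.
GLOBAL_POSTFIX = -l /var/log/gnunet/@{@}-%Y-%m-%d.log
@end example

@noindent
Under these assumptions the transport service would be expected to write
to a file such as @file{/var/log/gnunet/transport-2024-01-31.log}, and
because the name contains date codes, old files would be rotated out as
described above.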
2515
2516
2517 @c ***********************************************************************
2518
2519 @node Updated behavior of GNUNET_log
2520 @subsubsection Updated behavior of GNUNET_log
2521
2522 It's currently quite common to see constructions like this all over the
2523 code:
2524
2525 @example
2526 #if MESH_DEBUG
2527 GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2528 #endif
2529 @end example
2530
2531 The reason for the #if is not to avoid displaying the message when
2532 disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2533 compiler including it in the binary at all, when compiling GNUnet for
2534 platforms with restricted storage space / memory (MIPS routers,
2535 ARM plug computers / dev boards, etc).
2536
This presents several problems: the code gets ugly and hard to write, and
it is very easy to forget to include the #if guards, creating inconsistent
code. A new change in GNUNET_log aims to solve these problems.

@strong{This change requires running @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2543
2544 Here is an example of code with dense debug statements:
2545
2546 @example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
2556 @end example
2557
2558
2559 Pretty hard to follow, huh?
2560
2561 From now on, it is not necessary to include the #if / #endif statements to
2562 achieve the same behavior. The @code{GNUNET_log} and @code{GNUNET_log_from}
2563 macros take
2564 care of it for you, depending on the configure option:
2565
2566 @itemize @bullet
2567 @item If @code{--enable-logging} is set to @code{no}, the binary will
2568 contain no log messages at all.
2569 @item If @code{--enable-logging} is set to @code{yes}, the binary will
2570 contain no DEBUG messages, and therefore running with @command{-L DEBUG}
2571 will have
2572 no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
2573 @item If @code{--enable-logging} is set to @code{verbose}, or
2574 @code{veryverbose} the binary will contain DEBUG messages (still, it will
2575 be necessary to run with @command{-L DEBUG} or set the DEBUG config option
2576 to show
2577 them).
2578 @end itemize
2579
2580
2581 If you are a developer:
2582 @itemize @bullet
2583 @item please make sure that you @code{./configure
2584 --enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
2585 @item please remove the @code{#if} statements around @code{GNUNET_log
2586 (GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
2587 code.
2588 @end itemize
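
To illustrate how a logging macro can make the @code{#if} guards
unnecessary, here is a minimal standalone sketch. It is not GNUnet's
actual definition of @code{GNUNET_log}: the names @code{EXTRA_LOGGING},
@code{KIND_WARNING}, @code{KIND_DEBUG} and @code{LOG} are hypothetical
stand-ins. The idea is that the test on the build-time constant is known
to the compiler, so for DEBUG calls in a non-verbose build it can discard
the call together with its format string.

@example
#include <stdio.h>

/* Hypothetical build-time switch; 1 would correspond to a verbose build. */
#ifndef EXTRA_LOGGING
#define EXTRA_LOGGING 0
#endif

/* Hypothetical message kinds for this sketch. */
enum kind @{ KIND_WARNING = 2, KIND_DEBUG = 16 @};

/* DEBUG messages are only kept if the build-time switch is set. */
#define LOG(kind, ...)                                   \
  do @{                                                   \
    if ((EXTRA_LOGGING > 0) || (KIND_DEBUG != (kind)))   \
      fprintf (stderr, __VA_ARGS__);                     \
  @} while (0)

int
main (void)
@{
  LOG (KIND_WARNING, "always compiled in\n");
  LOG (KIND_DEBUG, "only present in verbose builds\n");
  return 0;
@}
@end example

@noindent
Compiled as is, the program only writes the warning to stderr; compiled
with @code{-DEXTRA_LOGGING=1}, it writes both lines. The calling code
looks the same in both cases, which is the point of moving the decision
into the macro.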
2589
2590 Since now activating DEBUG automatically makes it VERBOSE and activates
2591 @strong{all} debug messages by default, you probably want to use the
2592 @uref{https://docs.gnunet.org/#Logging, https://docs.gnunet.org/#Logging}
2593 functionality to filter only relevant messages.
2594 A suitable configuration could be:
2595
2596 @example
2597 $ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2598 @end example
2599
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course, you can adapt it to your particular needs; this is only
a quick example.
2603
2604 @cindex Interprocess communication API
@cindex IPC
2606 @node Interprocess communication API (IPC)
2607 @subsection Interprocess communication API (IPC)
2608
In GNUnet, a variety of new message types might be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement the message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2617
2618
2619 @c ***********************************************************************
2620 @menu
2621 * Define new message types::
2622 * Define message struct::
2623 * Client - Establish connection::
2624 * Client - Initialize request message::
2625 * Client - Send request and receive response::
2626 * Server - Startup service::
2627 * Server - Add new handles for specified messages::
2628 * Server - Process request message::
2629 * Server - Response to client::
2630 * Server - Notification of clients::
2631 * Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2632 @end menu
2633
2634 @node Define new message types
2635 @subsubsection Define new message types
2636
2637 First of all, you should define the new message type in
2638 @file{gnunet_protocols.h}:
2639
2640 @example
/* Request to look up addresses of peers in server. */
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
/* Response to the address lookup request. */
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2645 @end example
2646
2647 @c ***********************************************************************
2648 @node Define message struct
2649 @subsubsection Define message struct
2650
After the type definition, the specified message structure should also be
described in the header file, e.g. @file{transport.h} in our case.
2653
2654 @example
GNUNET_NETWORK_STRUCT_BEGIN
struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2663 @end example
2664
2665
2666 Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}
2667 which both ensure correct alignment when sending structs over the network.
2668
2671
2672 @c ***********************************************************************
2673 @node Client - Establish connection
2674 @subsubsection Client - Establish connection
2675
2676
2677
First, on the client side, the underlying API is employed to create a
new connection to a service. In our example, the client connects to the
transport service.
2681
2682 @example
2683 struct GNUNET_CLIENT_Connection *client;
2684 client = GNUNET_CLIENT_connect ("transport", cfg);
2685 @end example
2686
2687 @c ***********************************************************************
2688 @node Client - Initialize request message
2689 @subsubsection Client - Initialize request message
2690
2691
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as timeout, address
and name of transport.
2696
2697 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans)
  + 1;
/* allocate zeroed memory for the fixed part, the address and the name */
msg = GNUNET_malloc (len);
/* 'header' is an embedded struct, so its fields are accessed with '.' */
msg->header.size = htons (len);
msg->header.type = htons
(GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2712 @end example
2713
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert from host
byte order (little endian on many platforms) into network byte order
(big endian). For the usage of big and little endian order and the
corresponding conversion functions, please refer to the section
"Conversion between Network Byte Order (Big Endian) and Host Byte Order".
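
The effect of these conversions can be seen in a small standalone
program. It only uses the standard @code{htons}, @code{htonl},
@code{ntohs} and @code{ntohl} functions; the variable names and the
values 300 and 16 are arbitrary choices for this illustration.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

int
main (void)
@{
  uint16_t size_host = 300;     /* 0x012c */
  uint32_t addrlen_host = 16;
  uint16_t size_net = htons (size_host);
  uint32_t addrlen_net = htonl (addrlen_host);
  const unsigned char *wire = (const unsigned char *) &size_net;

  /* In network byte order the most significant byte comes first,
     regardless of the byte order of the host. */
  printf ("wire bytes: %02x %02x\n", wire[0], wire[1]);
  /* Converting back yields the original host values. */
  printf ("round trip: %u %u\n",
          (unsigned int) ntohs (size_net),
          (unsigned int) ntohl (addrlen_net));
  return 0;
@}
@end example

@noindent
On both little endian and big endian hosts this prints the wire bytes
@code{01 2c} and the round-trip values 300 and 16, which is why every
multi-byte field of a message is converted before sending and after
receiving.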
2719
2720 @c ***********************************************************************
2721 @node Client - Send request and receive response
2722 @subsubsection Client - Send request and receive response
2723
2724
2725 @b{FIXME: This is very outdated, see the tutorial for the current API!}
2726
2727 Next, the client would send the constructed message as a request to the
2728 service and wait for the response from the service. To accomplish this
2729 goal, there are a number of API calls that can be used. In this example,
2730 @code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2731 appropriate function to use.
2732
2733 @example
GNUNET_CLIENT_transmit_and_get_response
(client, &msg->header, timeout, GNUNET_YES, &address_response_processor,
arp_ctx);
2737 @end example
2738
The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
2742
2743 @node Server - Startup service
2744 @subsubsection Server - Startup service
2745
To be able to receive the request message, the server runs a standard
GNUnet service startup sequence using @code{GNUNET_SERVICE_run}, as follows:
2748
2749 @example
int main (int argc, char **argv) @{
  return (GNUNET_OK ==
          GNUNET_SERVICE_run (argc, argv, "transport",
                              GNUNET_SERVICE_OPTION_NONE,
                              &run, NULL)) ? 0 : 1;
@}
2753 @end example
2754
2755 @c ***********************************************************************
2756 @node Server - Add new handles for specified messages
2757 @subsubsection Server - Add new handles for specified messages
2758
2759
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
2762
2763 @example
static void run (void *cls,
                 struct GNUNET_SERVER_Handle *serv,
                 const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2768 @end example
2769
2770
2771 Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2772 function to add new handlers in the service. The parameter
2773 @code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2774 to tell the service which function should be called when a particular
2775 type of message is received, and should be defined in this way:
2776
2777 @example
2778 static struct GNUNET_SERVER_MessageHandler handlers[] = @{
2779   @{&handle_start,
2780    NULL,
2781    GNUNET_MESSAGE_TYPE_TRANSPORT_START,
2782    0@},
2783   @{&handle_send,
2784    NULL,
2785    GNUNET_MESSAGE_TYPE_TRANSPORT_SEND,
2786    0@},
2787   @{&handle_try_connect,
2788    NULL,
2789    GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
2790    sizeof (struct TryConnectMessage)
2791   @},
2792   @{&handle_address_lookup,
2793    NULL,
2794    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP,
2795    0@},
2796   @{NULL,
2797    NULL,
2798    0,
2799    0@}
2800 @};
2801 @end example
2802
2803
As shown, the first member of each struct is a callback
function, which is called to process messages of the type given
as the third member. The second member is the closure for the callback
function, which is set to @code{NULL} in most cases. The last
member is the expected size of messages of this type; usually we
set it to 0 to accept variable size, but for special cases the exact size
of the message can be set instead. In addition, the list is terminated by
the entry @code{@{NULL, NULL, 0, 0@}}.
2812
2813 @c ***********************************************************************
2814 @node Server - Process request message
2815 @subsubsection Server - Process request message
2816
2817
After the initialization of the transport service, the request message
can be processed. Before handling the main message data, the validity of
the message should be checked, e.g., whether the size of the message
is correct.
2822
2823 @example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage)) @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2829 @end example
2830
2831
Note that, opposite to the construction of the request message in
the client, in the server the functions @code{ntohl} and @code{ntohs}
should be employed when extracting data from the message, so
that the data in network byte order (big endian) is converted back into
host byte order. For more details, please refer to the section
"Conversion between Network Byte Order (Big Endian) and Host Byte Order".
2838
2839 Moreover in this example, the name of the transport stored in the message
2840 is a 0-terminated string, so we should also check whether the name of the
2841 transport in the received message is 0-terminated:
2842
2843 @example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof
                  (struct AddressLookupMessage)
                  - addressLen - 1] != '\0') @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client,
                              GNUNET_SYSERR);
  return;
@}
2852 @end example
2853
Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
service that processing of the request is done and that it can receive
the next message. The argument @code{GNUNET_SYSERR} here indicates that
the service didn't understand the request message, and the processing of
this request would be terminated.

In contrast to the situation above, when the argument is equal
to @code{GNUNET_OK}, the service would continue to process the request
message.
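
The two checks above can be combined with a bounds check on the address
length, as in the following standalone sketch. It does not use the
GNUnet API: @code{struct lookup_header} is a simplified, hypothetical
stand-in for the fixed part of the message, and @code{check_lookup} is a
hypothetical helper written for this example.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified fixed part of a lookup message. */
struct lookup_header @{
  uint16_t size;     /* total message size, network byte order */
  uint16_t type;     /* message type, network byte order */
  uint32_t addrlen;  /* length of the address, network byte order */
@};

/* Return the 0-terminated transport name, or NULL if malformed. */
static const char *
check_lookup (const void *buf, size_t buflen)
@{
  struct lookup_header h;
  size_t size;
  size_t addrlen;

  if (buflen < sizeof (h))
    return NULL;
  memcpy (&h, buf, sizeof (h));
  size = ntohs (h.size);
  addrlen = ntohl (h.addrlen);
  if ((size < sizeof (h)) || (size > buflen))
    return NULL;
  /* the address must leave room for at least the terminating 0 */
  if (addrlen >= size - sizeof (h))
    return NULL;
  if ('\0' != ((const char *) buf)[size - 1])
    return NULL;
  return (const char *) buf + sizeof (h) + addrlen;
@}

int
main (void)
@{
  unsigned char buf[64];
  struct lookup_header h;
  const char addr[4] = @{ 10, 0, 0, 1 @};
  const char *name = "tcp";
  size_t len = sizeof (h) + sizeof (addr) + strlen (name) + 1;

  h.size = htons ((uint16_t) len);
  h.type = htons (29);
  h.addrlen = htonl ((uint32_t) sizeof (addr));
  memcpy (buf, &h, sizeof (h));
  memcpy (buf + sizeof (h), addr, sizeof (addr));
  memcpy (buf + sizeof (h) + sizeof (addr), name, strlen (name) + 1);

  printf ("%s\n", (NULL != check_lookup (buf, len)) ? "ok" : "bad");
  buf[len - 1] = 'x';  /* destroy the 0-terminator */
  printf ("%s\n", (NULL != check_lookup (buf, len)) ? "ok" : "bad");
  return 0;
@}
@end example

@noindent
The program prints "ok" for the well-formed message and "bad" once the
terminator is overwritten. Checking the address length before using it
as an index matters because the value comes from the peer and cannot be
trusted.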
2863
2864 @c ***********************************************************************
2865 @node Server - Response to client
2866 @subsubsection Server - Response to client
2867
2868
Once the processing of the current request is done, the server should
send a response to the client. A new @code{struct AddressLookupMessage}
is produced by the server in a similar way as the client did and sent to
the client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.
2875 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
/* allocate zeroed memory for the whole reply */
msg = GNUNET_malloc (len);
/* 'header' is an embedded struct, so its fields are accessed with '.' */
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2883
2884 // ...
2885
2886 struct GNUNET_SERVER_TransmitContext *tc;
2887 tc = GNUNET_SERVER_transmit_context_create (client);
2888 GNUNET_SERVER_transmit_context_append_data
2889 (tc,
2890  NULL,
2891  0,
2892  GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2893 GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2894 @end example
2895
2896
Note that the server API also provides a number of other functions that
a service can use to send messages.
2899
2900 @c ***********************************************************************
2901 @node Server - Notification of clients
2902 @subsubsection Server - Notification of clients
2903
2904
2905 Often a service needs to (repeatedly) transmit notifications to a client
or a group of clients. In these cases, the client has typically
registered once for a set of events and then needs to receive a message
2908 whenever such an event happens (until the client disconnects). The use of
2909 a notification context can help manage message queues to clients and
2910 handle disconnects. Notification contexts can be used to send
2911 individualized messages to a particular client or to broadcast messages
2912 to a group of clients. An individualized notification might look like
2913 this:
2914
2915 @example
GNUNET_SERVER_notification_context_unicast (nc,
                                            client,
                                            msg,
                                            GNUNET_YES);
2920 @end example
2921
2922
2923 Note that after processing the original registration message for
2924 notifications, the server code still typically needs to call
2925 @code{GNUNET_SERVER_receive_done} so that the client can transmit further
2926 messages to the server.
2927
2928 @c ***********************************************************************
2929 @node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2930 @subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2931 @c %** subsub? it's a referenced page on the ipc document.
2932
2933
Here, big endian is the Network Byte Order, while the Host Byte Order is
the order used by the local CPU, which is little endian on many common
architectures. What is the difference between the two?

Consider an integer that occupies 4 bytes in RAM. On a little endian
host, the least significant byte is stored at the lowest address and the
most significant byte at the highest address. Network Byte Order does
the opposite: the most significant byte is stored at the lowest address
and the least significant byte at the highest address.

Hosts communicate by exchanging data packets over the network. For a
value to keep its meaning during transmission, regardless of the byte
order of the sending and the receiving host, its bytes must be converted
to Network Byte Order before sending and back to Host Byte Order after
receiving.

The following ten functions are convenient for converting between byte
orders in GNUnet code. The first four are the standard socket functions;
the others are provided by GNUnet:
2955
2956 @table @asis
2957
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a short integer from host byte order to network byte order.

@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a long integer from host byte order to network byte order.

@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a short integer from network byte order to host byte order.

@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a long integer from network byte order to host byte order.

@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long integer from network byte order to host byte order.

@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long integer from host byte order to network byte order.

@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.

@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.

@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.

@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
2983 @end table
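
The following minimal sketch illustrates the conversions using only the
standard socket functions. The helpers @code{put_u32}, @code{get_u32},
@code{wire_byte}, @code{roundtrip_u32}, @code{my_htonll} and
@code{my_ntohll} are hypothetical names introduced for this example; the
64-bit helpers only show one portable way to perform such a conversion
and are not GNUnet's implementation.

@verbatim
```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit host-order value in wire format (big endian). */
static void
put_u32 (unsigned char *buf, uint32_t host_value)
{
  uint32_t nbo = htonl (host_value);

  memcpy (buf, &nbo, sizeof (nbo));
}

/* Read a 32-bit wire-format value back into host byte order. */
static uint32_t
get_u32 (const unsigned char *buf)
{
  uint32_t nbo;

  memcpy (&nbo, buf, sizeof (nbo));
  return ntohl (nbo);
}

/* Return byte number index of the wire representation of host_value. */
static unsigned char
wire_byte (uint32_t host_value, int index)
{
  unsigned char buf[4];

  put_u32 (buf, host_value);
  return buf[index];
}

/* Convert to wire format and back again. */
static uint32_t
roundtrip_u32 (uint32_t host_value)
{
  unsigned char buf[4];

  put_u32 (buf, host_value);
  return get_u32 (buf);
}

/* Portable 64-bit conversion: most significant byte goes first. */
static uint64_t
my_htonll (uint64_t host_value)
{
  unsigned char bytes[8];
  uint64_t nbo;

  for (int i = 0; i < 8; i++)
    bytes[i] = (unsigned char) (host_value >> (56 - 8 * i));
  memcpy (&nbo, bytes, sizeof (nbo));
  return nbo;
}

/* Inverse of my_htonll: rebuild the value from big endian bytes. */
static uint64_t
my_ntohll (uint64_t nbo)
{
  unsigned char bytes[8];
  uint64_t host_value = 0;

  memcpy (bytes, &nbo, sizeof (nbo));
  for (int i = 0; i < 8; i++)
    host_value = (host_value << 8) | bytes[i];
  return host_value;
}
```
@end verbatim

Whatever the byte order of the host, the first byte on the wire is the
most significant one, and converting to network byte order and back
yields the original value.
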
2984
2985 @cindex Cryptography API
2986 @node Cryptography API
2987 @subsection Cryptography API
2988
2989
The gnunetutil API provides the cryptographic primitives used in GNUnet.
GNUnet uses 2048-bit RSA keys for the session key exchange and for signing
messages by peers and most other public-key operations. Most researchers
in cryptography consider 2048-bit RSA keys as secure and practically
unbreakable for a long time. The API provides functions to create a fresh
key pair, read a private key from a file (or create a new file if the
file does not exist), encrypt, decrypt, sign and verify, and to extract
the public key into a format suitable for network transmission.
2998
For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and decryption.
3005
3006 GNUnet uses SHA-512 for computing one-way hash codes. The API provides
3007 functions to compute a hash over a block in memory or over a file on disk.
3008
3009 The crypto API also provides functions for randomizing a block of memory,
3010 obtaining a single random number and for generating a permutation of the
3011 numbers 0 to n-1. Random number generation distinguishes between WEAK and
3012 STRONG random number quality; WEAK random numbers are pseudo-random
3013 whereas STRONG random numbers use entropy gathered from the operating
3014 system.
3015
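As an illustration of what generating a permutation of the numbers 0 to
n-1 involves, here is a minimal sketch using the Fisher-Yates shuffle.
It does not use the GNUnet crypto API: the functions @code{weak_random},
@code{permute} and @code{permutation_is_valid} are hypothetical, and the
C library's @code{rand} merely stands in for a pseudo-random ("weak")
generator.

@verbatim
```c
#include <assert.h>
#include <stdlib.h>

/* Pseudo-random number in [0, bound); stand-in for a "weak" generator.
   The modulo introduces a slight bias, acceptable for a sketch. */
static unsigned int
weak_random (unsigned int bound)
{
  return (unsigned int) rand () % bound;
}

/* Fill perm[0..n-1] with a random permutation of 0..n-1. */
static void
permute (unsigned int *perm, unsigned int n)
{
  for (unsigned int i = 0; i < n; i++)
    perm[i] = i;
  for (unsigned int i = n; i > 1; i--)
  {
    unsigned int j = weak_random (i);   /* 0 <= j < i */
    unsigned int tmp = perm[i - 1];

    perm[i - 1] = perm[j];
    perm[j] = tmp;
  }
}

/* Return 1 if permute() yields each of 0..n-1 exactly once (n > 0). */
static int
permutation_is_valid (unsigned int n)
{
  unsigned int *perm = malloc (n * sizeof (*perm));
  unsigned char *seen = calloc (n, 1);
  int ok = (NULL != perm) && (NULL != seen);

  if (ok)
  {
    permute (perm, n);
    for (unsigned int i = 0; ok && (i < n); i++)
    {
      ok = (perm[i] < n) && (! seen[perm[i]]);
      if (ok)
        seen[perm[i]] = 1;
    }
  }
  free (perm);
  free (seen);
  return ok;
}
```
@end verbatim

A "strong" variant would differ only in the source of randomness, drawing
the index from entropy provided by the operating system.
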
Finally, the crypto API provides a means to deterministically generate a
1024-bit RSA key from a hash code. These functions should most likely not
be used by most applications; most importantly,
@code{GNUNET_CRYPTO_rsa_key_create_from_hash} does not create an RSA key
that should be considered secure for traditional applications of RSA.
3021
3022 @cindex Message Queue API
3023 @node Message Queue API
3024 @subsection Message Queue API
3025
3026
3027 @strong{ Introduction }@
Often, applications need to queue messages that are to be sent to other
GNUnet peers, clients or services. Since GNUnet's message-based
communication APIs, by design, do not allow messages to be queued, it is
common to implement custom message queues manually when they are needed.
However, writing very similar code in multiple places is tedious and
leads to code duplication.
3034
3035 MQ (for Message Queue) is an API that provides the functionality to
3036 implement and use message queues. We intend to eventually replace all of
3037 the custom message queue implementations in GNUnet with MQ.
3038
3039 @strong{ Basic Concepts }@
3040 The two most important entities in MQ are queues and envelopes.
3041
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow a list of message
handlers to be specified. The message queue will then also wait for
incoming messages and dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata
(Where is the envelope queued? What should happen after it has been
sent?). An envelope can be queued in only one message queue.
3051
3052 @strong{ Creating Queues }@
The following is a list of currently available message queues. Note that
to avoid layering issues, message queues for higher level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself provides
the queue implementation.
3057
3058 @table @asis
3059
3060 @item @code{GNUNET_MQ_queue_for_connection_client}
3061 Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
3062 Also supports receiving with message handlers.
3063
3064 @item @code{GNUNET_MQ_queue_for_server_client}
3065 Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
3066 not support incoming message handlers.
3067
@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does
not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and
receiving messages with one of GNUnet's communication APIs,
implementation callbacks are called. Refer to "Implementing Queues" for a
more detailed explanation.
3076 @end table
3077
3078
3079 @strong{ Allocating Envelopes }@
A GNUnet message (as defined by the @code{GNUNET_MessageHeader}) has
three parts: the size, the type, and the body.
3082
3083 MQ provides macros to allocate an envelope containing a message
3084 conveniently, automatically setting the size and type fields of the
3085 message.
3086
3087 Consider the following simple message, with the body consisting of a
3088 single number value.
3091
3092 @example
3093 struct NumberMessage @{
3094   /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
3095   struct GNUNET_MessageHeader header;
3096   uint32_t number GNUNET_PACKED;
3097 @};
3098 @end example
3099
3100 An envelope containing an instance of the NumberMessage can be
3101 constructed like this:
3102
3103 @example
3104 struct GNUNET_MQ_Envelope *ev;
3105 struct NumberMessage *msg;
3106 ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
3107 msg->number = htonl (42);
3108 @end example
3109
3110 In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
3111 the newly allocated envelope. The first argument must be a pointer to some
3112 @code{struct} containing a @code{struct GNUNET_MessageHeader header}
3113 field, while the second argument is the desired message type, in host
3114 byte order.
3115
3116 The @code{msg} pointer now points to an allocated message, where the
3117 message type and the message size are already set. The message's size is
inferred from the type of the @code{msg} pointer: it will be set to
@code{sizeof (*msg)}, properly converted to network byte order.
3120
If the message body's size is dynamic, the macro
3122 @code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
3123 message has additional space allocated after the @code{msg} structure.
3124
3125 If no structure has been defined for the message,
3126 @code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
3127 after the message header. The first argument then must be a pointer to a
3128 @code{GNUNET_MessageHeader}.
3129
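To make the role of these macros more concrete, here is a simplified,
self-contained sketch of how an envelope and its message can be
allocated together while the header is filled in automatically. The
types and the macros @code{ENVELOPE_MSG} and @code{ENVELOPE_MSG_EXTRA}
are hypothetical stand-ins, not the MQ implementation; in particular,
the sketch passes the envelope variable as a macro argument, whereas
@code{GNUNET_MQ_msg} returns the envelope.

@verbatim
```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins; not the GNUnet definitions. */
struct MessageHeader
{
  uint16_t size;   /* size of the whole message, network byte order */
  uint16_t type;   /* message type, network byte order */
};

struct Envelope
{
  struct MessageHeader *mh;   /* message stored right after the envelope */
};

/* Allocate envelope and message in one block and fill in the header. */
static struct Envelope *
envelope_create (uint16_t size, uint16_t type)
{
  struct Envelope *ev = calloc (1, sizeof (*ev) + size);

  if (NULL == ev)
    return NULL;
  ev->mh = (struct MessageHeader *) &ev[1];
  ev->mh->size = htons (size);
  ev->mh->type = htons (type);
  return ev;
}

/* The message size is sizeof (*mvar) plus esize extra bytes. */
#define ENVELOPE_MSG_EXTRA(ev, mvar, esize, type) \
  do { \
    (ev) = envelope_create ((uint16_t) (sizeof (*(mvar)) + (esize)), \
                            (type)); \
    (mvar) = (NULL == (ev)) ? NULL : (void *) (ev)->mh; \
  } while (0)

#define ENVELOPE_MSG(ev, mvar, type) \
  ENVELOPE_MSG_EXTRA (ev, mvar, 0, type)

struct NumberMessage
{
  struct MessageHeader header;
  uint32_t number;
};

/* Build a NumberMessage and verify what the macro filled in. */
static int
number_message_ok (uint16_t type, uint32_t number)
{
  struct Envelope *ev;
  struct NumberMessage *msg;
  int ok;

  ENVELOPE_MSG (ev, msg, type);
  if (NULL == ev)
    return 0;
  msg->number = htonl (number);
  ok = (ntohs (msg->header.size) == sizeof (*msg))
       && (ntohs (msg->header.type) == type)
       && (ntohl (msg->number) == number);
  free (ev);
  return ok;
}
```
@end verbatim

The point of the design is that the caller never computes the header
fields by hand: the size follows from the type of the message pointer,
and both fields are stored in network byte order.
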
3130 @strong{Envelope Properties}@
A few functions in MQ allow additional properties to be set on envelopes:

@table @asis

@item @code{GNUNET_MQ_notify_sent}
Specifies a function that will be called once the envelope's message has
been sent irrevocably. An envelope can be canceled precisely up to the
point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
3143
3144 @end table
3145
3146
3147 @strong{Sending Envelopes}@
3148 Once an envelope has been constructed, it can be queued for sending with
3149 @code{GNUNET_MQ_send}.
3150
3151 Note that in order to avoid memory leaks, an envelope must either be sent
3152 (the queue will free it) or destroyed explicitly with
3153 @code{GNUNET_MQ_discard}.
3154
3155 @strong{Canceling Envelopes}@
3156 An envelope queued with @code{GNUNET_MQ_send} can be canceled with
3157 @code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
3158 been called, canceling a message results in undefined behavior.
3159 Thus it is unsafe to cancel an envelope that does not have a notify sent
callback. When canceling an envelope, it is not necessary to call
3161 @code{GNUNET_MQ_discard}, and the envelope can't be sent again.
3162
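The rules for sending, discarding and canceling envelopes can be
summarized as a small state machine. The following sketch is a toy model
for illustration only; the states, the one-letter operation codes and the
functions @code{env_apply} and @code{env_sequence_ok} are hypothetical.
The model also assumes that an envelope may only be discarded while it
has never been queued.

@verbatim
```c
#include <assert.h>

enum EnvState { ENV_NEW, ENV_QUEUED, ENV_SENT, ENV_FREED };

/* Apply one operation; return 1 if it is allowed in state *st. */
static int
env_apply (enum EnvState *st, char op)
{
  switch (op)
  {
  case 's':   /* send: queue a freshly allocated envelope */
    if (ENV_NEW != *st)
      return 0;
    *st = ENV_QUEUED;
    return 1;
  case 'd':   /* discard: free an envelope that was never queued */
    if (ENV_NEW != *st)
      return 0;
    *st = ENV_FREED;
    return 1;
  case 'c':   /* cancel: only while queued; also frees the envelope */
    if (ENV_QUEUED != *st)
      return 0;
    *st = ENV_FREED;
    return 1;
  case 't':   /* transmitted: the queue sent (and freed) the envelope */
    if (ENV_QUEUED != *st)
      return 0;
    *st = ENV_SENT;
    return 1;
  default:
    return 0;
  }
}

/* Return 1 if every operation in ops is allowed for a new envelope. */
static int
env_sequence_ok (const char *ops)
{
  enum EnvState st = ENV_NEW;

  for (; '\0' != *ops; ops++)
    if (! env_apply (&st, *ops))
      return 0;
  return 1;
}
```
@end verbatim

For example, the sequence "sc" (send, then cancel) is accepted, while
"stc" (cancel after transmission) and "scs" (send again after cancel) are
rejected.
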
3163 @strong{ Implementing Queues }@
3164 @code{TODO}
3165
3166 @cindex Service API
3167 @node Service API
3168 @subsection Service API
3169
3170
Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol. For this, each service must listen on either a TCP port or a
UNIX domain socket; to do so, the service implementation uses the server
API. This use of the server API is exposed directly to the users of the
service API. Thus, when using the service API, one usually also uses
large parts of the server API. The service API provides various
convenience functions, such as parsing command-line arguments and the
configuration file, which are not found in the server API.
The dual to the service/server API is the client API, which can be used to
access services.
3184
3185 The most common way to start a service is to use the
3186 @code{GNUNET_SERVICE_run} function from the program's main function.
3187 @code{GNUNET_SERVICE_run} will then parse the command line and
3188 configuration files and, based on the options found there,
3189 start the server. It will then give back control to the main
3190 program, passing the server and the configuration to the
3191 @code{GNUNET_SERVICE_Main} callback. @code{GNUNET_SERVICE_run}
3192 will also take care of starting the scheduler loop.
3193 If this is inappropriate (for example, because the scheduler loop
3194 is already running), @code{GNUNET_SERVICE_start} and
3195 related functions provide an alternative to @code{GNUNET_SERVICE_run}.
3196
3197 When starting a service, the service_name option is used to determine
3198 which sections in the configuration file should be used to configure the
3199 service. A typical value here is the name of the @file{src/}
3200 sub-directory, for example @file{statistics}.
3201 The same string would also be given to
3202 @code{GNUNET_CLIENT_connect} to access the service.
3203
3204 Once a service has been initialized, the program should use the
3205 @code{GNUNET_SERVICE_Main} callback to register message handlers
3206 using @code{GNUNET_SERVER_add_handlers}.
3207 The service will already have registered a handler for the
3208 "TEST" message.
3209
3210 @findex GNUNET_SERVICE_Options
3211 The option bitfield (@code{enum GNUNET_SERVICE_Options})
3212 determines how a service should behave during shutdown.
3213 There are three key strategies:
3214
3215 @table @asis
3216
@item instant (@code{GNUNET_SERVICE_OPTION_NONE})
Upon receiving the shutdown signal from the scheduler, the service
immediately terminates the server, closing all existing connections with
clients.
@item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself during shutdown. The main program
needs to take the appropriate action by calling
@code{GNUNET_SERVER_destroy} or @code{GNUNET_SERVICE_stop} (depending on
how the service was initialized) to terminate the service. This method is
used by @code{gnunet-service-arm} and is rather uncommon.
@item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
Upon receiving the shutdown signal from the scheduler, the service
immediately tells the server to stop listening for incoming clients.
Requests from normal existing clients are still processed and the
server/service terminates once all normal clients have disconnected.
Clients that are not expected to ever disconnect (such as clients that
monitor performance values) can be marked as 'monitor' clients using
@code{GNUNET_SERVER_client_mark_monitor}. Those clients will continue to
be processed until all 'normal' clients have disconnected. Then, the
server will terminate, closing the monitor connections. This mode is
used, for example, by 'statistics', allowing existing 'normal' clients to
set (possibly persistent) statistic values before terminating.
3239
3240 @end table
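
The "soft" strategy can be modeled with a few counters. The following
sketch is a toy model and does not use the service or server API; the
structure @code{ToyServer} and all function names are hypothetical. It
only captures the rule described above: after the shutdown signal, the
server stops listening and terminates as soon as no 'normal' client is
left, closing any 'monitor' connections at that point.

@verbatim
```c
#include <assert.h>

struct ToyServer
{
  unsigned int normal_clients;    /* clients expected to disconnect */
  unsigned int monitor_clients;   /* clients marked as 'monitor' */
  int listening;                  /* still accepting new clients? */
  int shutdown_requested;
  int terminated;
};

/* Terminate once shutdown was requested and no normal client is left. */
static void
toy_check_done (struct ToyServer *s)
{
  if (s->shutdown_requested && (0 == s->normal_clients))
  {
    s->monitor_clients = 0;   /* monitor connections are closed */
    s->terminated = 1;
  }
}

/* Shutdown signal: stop listening, keep serving existing clients. */
static void
toy_soft_shutdown (struct ToyServer *s)
{
  s->listening = 0;
  s->shutdown_requested = 1;
  toy_check_done (s);
}

/* One normal client disconnects. */
static void
toy_client_disconnect (struct ToyServer *s)
{
  if (s->normal_clients > 0)
    s->normal_clients--;
  toy_check_done (s);
}

/* Request a soft shutdown, then let some normal clients disconnect;
   returns whether the server has terminated afterwards. */
static int
terminated_after (unsigned int normal,
                  unsigned int monitors,
                  unsigned int disconnects)
{
  struct ToyServer s = { normal, monitors, 1, 0, 0 };

  toy_soft_shutdown (&s);
  for (unsigned int i = 0; i < disconnects; i++)
    toy_client_disconnect (&s);
  return s.terminated;
}
```
@end verbatim

With two normal clients and one monitor client, the model terminates only
after both normal clients have disconnected; with monitor clients only,
it terminates immediately.
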
3241
3242 @c ***********************************************************************
3243 @node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3244 @subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3245
3246
A commonly used data structure in GNUnet is a (multi-)hash map. It is most
often used to map a peer identity to some data structure, but also to map
arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). As it is commonly used, the hash map is
actually sometimes responsible for a large share of GNUnet's overall
memory consumption (for some processes, 30% is not uncommon). The
following text documents some API quirks (and their implications for
applications) that were recently introduced to minimize the footprint of
the hash map.
3256
3257
3258 @c ***********************************************************************
3259 @menu
3260 * Analysis::
3261 * Solution::
3262 * Migration::
3263 * Conclusion::
3264 * Availability::
3265 @end menu
3266
3267 @node Analysis
3268 @subsubsection Analysis
3269
3270
The main reason for the "excessive" memory consumption by the hash map is
that GNUnet uses 512-bit cryptographic hash codes, and the (multi-)hash
map also uses the same 512-bit @code{struct GNUNET_HashCode}. As a
result, storing just the keys requires 64 bytes of memory for each key.
As some applications like to keep a large number of entries in the hash
map (after all, that's what maps are good for), 64 bytes per hash is
significant: keeping a pointer to the value and having a linked list for
collisions consume between 8 and 16 bytes, and @code{malloc} may add
about the same overhead per allocation, putting us in the 16 to 32 byte
per entry ballpark. Adding a 64-byte key then triples the overall memory
requirement for the hash map.

To make things "worse", most of the time storing the key in the hash map
is not required: it is typically already in memory elsewhere! In most
cases, the values stored in the hash map are some application-specific
struct that @emph{also} contains the hash. Here is a simplified example:
3287
3288 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};
3292
3293 // ...
3294 val = GNUNET_malloc (sizeof (struct MyValue));
3295 val->key = key;
3296 val->my_data = 42;
3297 GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
3298 @end example
3299
3300 This is a common pattern as later the entries might need to be removed,
3301 and at that time it is convenient to have the key immediately at hand:
3302
3303 @example
3304 GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
3305 @end example
3306
3307
3308 Note that here we end up with two times 64 bytes for the key, plus maybe
3309 64 bytes total for the rest of the 'struct MyValue' and the map entry in
3310 the hash map. The resulting redundant storage of the key increases
3311 overall memory consumption per entry from the "optimal" 128 bytes to 192
3312 bytes. This is not just an extreme example: overheads in practice are
3313 actually sometimes close to those highlighted in this example. This is
3314 especially true for maps with a significant number of entries, as there
3315 we tend to really try to keep the entries small.
3316
3317 @c ***********************************************************************
3318 @node Solution
3319 @subsubsection Solution
3320
3321
The solution that has now been implemented is to @strong{optionally}
allow the hash map to not make a (deep) copy of the hash but instead keep
a pointer to the hash/key in the entry. This reduces the memory
consumption for the key from 64 bytes to 4 to 8 bytes. However, it can
only work if the key is actually stored in the entry (which is the case
most of the time) and if the entry does not modify the key (which, in all
of the code I am aware of, has always been the case if the key is stored
in the entry). Finally, when the client stores an entry in the hash map,
it @strong{must} provide a pointer to the key within the entry, not just
a pointer to a transient location of the key. If the client code does not
meet these requirements, the result is a dangling pointer and undefined
behavior of the (multi-)hash map API.
3334
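The saving can be checked with a small size comparison. The structures
below are hypothetical stand-ins and not the actual layout of the
(multi-)hash map entries; they only contrast an entry that embeds a
64-byte key with one that stores a pointer to it.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 512-bit hash code (64 bytes). */
struct HashCode
{
  uint32_t bits[512 / 8 / sizeof (uint32_t)];
};

/* Map entry holding its own copy of the key. */
struct EntryWithCopy
{
  struct HashCode key;
  void *value;
  struct EntryWithCopy *next;   /* collision chain */
};

/* Map entry that only points to a key stored inside the value. */
struct EntryWithPointer
{
  const struct HashCode *key;
  void *value;
  struct EntryWithPointer *next;
};

/* Bytes saved per entry by storing a pointer instead of a copy. */
static size_t
bytes_saved_per_entry (void)
{
  return sizeof (struct EntryWithCopy)
         - sizeof (struct EntryWithPointer);
}
```
@end verbatim

On common platforms the difference is 64 bytes minus the size of one
pointer, which matches the reduction of the key's cost from 64 bytes to
4 to 8 bytes.
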
3335 @c ***********************************************************************
3336 @node Migration
3337 @subsubsection Migration
3338
3339
3340 To use the new feature, first check that the values contain the respective
3341 key (and never modify it). Then, all calls to
3342 @code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
3343 audited and most likely changed to pass a pointer into the value's struct.
3344 For the initial example, the new code would look like this:
3345
3346 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
3354 GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
3355 @end example
3356
3357
Note that @code{&key} was changed to @code{&val->key} in the argument to
3359 the @code{put} call. This is critical as often @code{key} is on the stack
3360 or in some other transient data structure and thus having the hash map
3361 keep a pointer to @code{key} would not work. Only the key inside of
3362 @code{val} has the same lifetime as the entry in the map (this must of
3363 course be checked as well). Naturally, @code{val->key} must be
3364 initialized before the @code{put} call. Once all @code{put} calls have
3365 been converted and double-checked, you can change the call to create the
3366 hash map from
3367
3368 @example
3369 map =
3370 GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
3371 @end example
3372
3373 to
3374
3375 @example
3376 map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
3377 @end example
3378
3379 If everything was done correctly, you now use about 60 bytes less memory
3380 per entry in @code{map}. However, if now (or in the future) any call to
3381 @code{put} does not ensure that the given key is valid until the entry is
3382 removed from the map, undefined behavior is likely to be observed.
3383
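Why the pointer must refer to the key inside the value can be
demonstrated with a toy one-slot map that, like the optimized hash map,
remembers only a pointer to the key. All names in this sketch
(@code{ToyMap}, @code{toy_put}, @code{toy_get},
@code{lookup_survives_reuse}) are hypothetical. To stay well-defined, the
sketch overwrites the transient key buffer instead of freeing it; in real
code the buffer would often cease to exist, leaving a dangling pointer.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct Key
{
  uint8_t bytes[64];
};

struct Value
{
  struct Key key;
  unsigned int my_data;
};

/* One-slot "map" that remembers only a pointer to the key. */
struct ToyMap
{
  const struct Key *key;
  struct Value *value;
};

static void
toy_put (struct ToyMap *map, const struct Key *key, struct Value *val)
{
  map->key = key;   /* no copy is made */
  map->value = val;
}

static struct Value *
toy_get (const struct ToyMap *map, const struct Key *key)
{
  if ( (NULL != map->key) &&
       (0 == memcmp (map->key, key, sizeof (*key))) )
    return map->value;
  return NULL;
}

/* Returns 1 if the entry is still found after the caller's transient
   key buffer was reused; use_value_key selects the pointer stored. */
static int
lookup_survives_reuse (int use_value_key)
{
  struct Key transient;
  struct Key wanted;
  struct Value val;
  struct ToyMap map = { NULL, NULL };

  memset (&transient, 0xAB, sizeof (transient));
  wanted = transient;
  val.key = transient;
  val.my_data = 42;
  toy_put (&map, use_value_key ? &val.key : &transient, &val);
  memset (&transient, 0, sizeof (transient));   /* buffer is reused */
  return NULL != toy_get (&map, &wanted);
}
```
@end verbatim

Storing the pointer to the key inside the value keeps the lookup working,
whereas storing the pointer to the transient buffer silently breaks it.
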
3384 @c ***********************************************************************
3385 @node Conclusion
3386 @subsubsection Conclusion
3387
3388
The new optimization is often applicable and can result in a
3390 reduction in memory consumption of up to 30% in practice. However, it
3391 makes the code less robust as additional invariants are imposed on the
3392 multi hash map client. Thus applications should refrain from enabling the
3393 new mode unless the resulting performance increase is deemed significant
3394 enough. In particular, it should generally not be used in new code (wait
3395 at least until benchmarks exist).
3396
3397 @c ***********************************************************************
3398 @node Availability
3399 @subsubsection Availability
3400
3401
3402 The new multi hash map code was committed in SVN 24319 (which made its
3403 way into GNUnet version 0.9.4).
3404 Various subsystems (transport, core, dht, file-sharing) were
3405 previously audited and modified to take advantage of the new capability.
3406 In particular, memory consumption of the file-sharing service is expected
3407 to drop by 20-30% due to this change.
3408
3409
3410 @cindex CONTAINER_MDLL API
3411 @node CONTAINER_MDLL API
3412 @subsection CONTAINER_MDLL API
3413
3414
3415 This text documents the GNUNET_CONTAINER_MDLL API. The
3416 GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
3417 that it provides operations for the construction and manipulation of
3418 doubly-linked lists. The key difference to the (simpler) DLL-API is that
3419 the MDLL-version allows a single element (instance of a "struct") to be
3420 in multiple linked lists at the same time.
3421
3422 Like the DLL API, the MDLL API stores (most of) the data structures for
3423 the doubly-linked list with the respective elements; only the 'head' and
3424 'tail' pointers are stored "elsewhere" --- and the application needs to
3425 provide the locations of head and tail to each of the calls in the
3426 MDLL API. The key difference for the MDLL API is that the "next" and
3427 "previous" pointers in the struct can no longer be simply called "next"
3428 and "prev" --- after all, the element may be in multiple doubly-linked
3429 lists, so we cannot just have one "next" and one "prev" pointer!
3430
3431 The solution is to have multiple fields that must have a name of the
3432 format "next_XX" and "prev_XX" where "XX" is the name of one of the
3433 doubly-linked lists. Here is a simple example:
3434
3435 @example
3436 struct MyMultiListElement @{
3437   struct MyMultiListElement *next_ALIST;
3438   struct MyMultiListElement *prev_ALIST;
3439   struct MyMultiListElement *next_BLIST;
3440   struct MyMultiListElement *prev_BLIST;
  void *data;
3443 @};
3444 @end example
3445
3446
3447 Note that by convention, we use all-uppercase letters for the list names.
3448 In addition, the program needs to have a location for the head and tail
3449 pointers for both lists, for example:
3450
3451 @example
3452 static struct MyMultiListElement *head_ALIST;
3453 static struct MyMultiListElement *tail_ALIST;
3454 static struct MyMultiListElement *head_BLIST;
3455 static struct MyMultiListElement *tail_BLIST;
3456 @end example
3457
3458
3459 Using the MDLL-macros, we can now insert an element into the ALIST:
3460
3461 @example
3462 GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
3463 @end example
3464
3465
Passing "ALIST" as the first argument to MDLL specifies which of the
next/prev fields in the @code{struct MyMultiListElement} should be used.
The extra "ALIST" argument and the "_ALIST" in the names of the next/prev
members are the only differences between the MDLL and DLL APIs.
Like the DLL API, the MDLL API offers macros for inserting (at head,
at tail, after a given element) and removing elements from the list.
Iterating over the list should be done by directly accessing the
"next_XX" and/or "prev_XX" members.
3474
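The way the list name selects the struct members can be illustrated with
a simplified re-implementation based on token pasting. The macros
@code{MDLL_INSERT} and @code{MDLL_REMOVE}, the struct @code{Elem} and the
function @code{mdll_demo} are hypothetical stand-ins written for this
sketch; they omit the sanity checks a production macro would perform.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

struct Elem
{
  struct Elem *next_ALIST;
  struct Elem *prev_ALIST;
  struct Elem *next_BLIST;
  struct Elem *prev_BLIST;
  int id;
};

/* Insert element at the head of list mdll; ## pastes the list name. */
#define MDLL_INSERT(mdll, head, tail, element) do { \
    (element)->next_ ## mdll = (head); \
    (element)->prev_ ## mdll = NULL; \
    if (NULL == (tail)) \
      (tail) = (element); \
    else \
      (head)->prev_ ## mdll = (element); \
    (head) = (element); \
  } while (0)

/* Remove element from list mdll only; other lists are untouched. */
#define MDLL_REMOVE(mdll, head, tail, element) do { \
    if (NULL == (element)->prev_ ## mdll) \
      (head) = (element)->next_ ## mdll; \
    else \
      (element)->prev_ ## mdll->next_ ## mdll = (element)->next_ ## mdll; \
    if (NULL == (element)->next_ ## mdll) \
      (tail) = (element)->prev_ ## mdll; \
    else \
      (element)->next_ ## mdll->prev_ ## mdll = (element)->prev_ ## mdll; \
    (element)->next_ ## mdll = NULL; \
    (element)->prev_ ## mdll = NULL; \
  } while (0)

/* Returns the ids of list 'A' or 'B', head to tail, as decimal digits. */
static int
mdll_demo (int which)
{
  struct Elem e1 = { NULL, NULL, NULL, NULL, 1 };
  struct Elem e2 = { NULL, NULL, NULL, NULL, 2 };
  struct Elem e3 = { NULL, NULL, NULL, NULL, 3 };
  struct Elem *head_ALIST = NULL;
  struct Elem *tail_ALIST = NULL;
  struct Elem *head_BLIST = NULL;
  struct Elem *tail_BLIST = NULL;
  int digits = 0;

  MDLL_INSERT (ALIST, head_ALIST, tail_ALIST, &e1);
  MDLL_INSERT (ALIST, head_ALIST, tail_ALIST, &e2);
  MDLL_INSERT (ALIST, head_ALIST, tail_ALIST, &e3);  /* ALIST: 3 2 1 */
  MDLL_INSERT (BLIST, head_BLIST, tail_BLIST, &e2);
  MDLL_INSERT (BLIST, head_BLIST, tail_BLIST, &e1);  /* BLIST: 1 2 */
  MDLL_REMOVE (ALIST, head_ALIST, tail_ALIST, &e2);  /* ALIST: 3 1 */
  if ('A' == which)
    for (struct Elem *e = head_ALIST; NULL != e; e = e->next_ALIST)
      digits = 10 * digits + e->id;
  else
    for (struct Elem *e = head_BLIST; NULL != e; e = e->next_BLIST)
      digits = 10 * digits + e->id;
  return digits;
}
```
@end verbatim

Removing the second element from ALIST leaves its membership in BLIST
intact, which is exactly what the separate next/prev members per list
make possible.
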
3475 @cindex Automatic Restart Manager
3476 @cindex ARM
3477 @node Automatic Restart Manager (ARM)
3478 @section Automatic Restart Manager (ARM)
3479
3480
GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts services impacted by
the changes as needed. It is also responsible for restarting services in
case of crashes, and it is planned to incorporate automatic debugging for
diagnosing service crashes, giving developers insight into the reasons
for a crash. The purpose of this document is to give GNUnet developers an
idea of how ARM works and how to interact with it.
3489
3490 @menu
3491 * Basic functionality::
3492 * Key configuration options::
3493 * ARM - Availability::
3494 * Reliability::
3495 @end menu
3496
3497 @c ***********************************************************************
3498 @node Basic functionality
3499 @subsection Basic functionality
3500
3501
3502 @itemize @bullet
@item ARM source code can be found under @file{src/arm}. Service
processes are managed by the functions in @file{gnunet-service-arm.c},
which is controlled with @file{gnunet-arm.c} (the main function in that
file is ARM's entry point).

@item The functions responsible for communicating with ARM and for
starting and stopping services (including the ARM service itself) are
provided by the ARM API in @file{arm_api.c}. The function
@code{GNUNET_ARM_connect()} returns to the caller an ARM handle after
setting it to the caller's context (configuration and scheduler in use).
This handle can be used afterwards by the caller to communicate with ARM.
The functions @code{GNUNET_ARM_start_service()} and
@code{GNUNET_ARM_stop_service()} are used for starting and stopping
services respectively.

@item A typical example of using these basic ARM services can be found in
the file @file{test_arm_api.c}. The test case connects to ARM, starts it,
then uses it to start the service "resolver", stops the "resolver" and
then stops ARM.
3519 @end itemize
3520
3521 @c ***********************************************************************
3522 @node Key configuration options
3523 @subsection Key configuration options
3524
3525
Configurations for ARM and services should be available in a .conf file
(for an example, see @file{test_arm_api_data.conf}). When running ARM,
the configuration file to use should be passed to the command:
3529
3530 @example
3531 $ gnunet-arm -s -c configuration_to_use.conf
3532 @end example
3533
If no configuration is passed, the default configuration file will be used
(see @file{GNUNET_PREFIX/share/gnunet/defaults.conf}, which is created
from @file{contrib/defaults.conf}).

Each service has a section that starts with the service name in square
brackets, for example: "[arm]". The following options configure how ARM
configures or interacts with the various services:
3540
3541 @table @asis
3542
@item PORT
Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME
Specifies on which host the service is deployed. Note that ARM can only
start services that are running on the local system (but will not check
that the hostname matches the local machine name). This option is used by
the @code{gnunet_client_lib.h} implementation to determine which system
to connect to. The default is "localhost".

@item BINARY
The name of the service binary file.

@item OPTIONS
Options to be passed to the service.

@item PREFIX
A command to prepend to the actual command, for example to run a service
under "valgrind" or "gdb".

@item DEBUG
Run in debug mode (very verbose).

@item START_ON_DEMAND
ARM will listen on the UNIX domain socket and/or TCP port of the service
and start the service on demand.

@item IMMEDIATE_START
ARM will always start this service when the peer is started.

@item ACCEPT_FROM
IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6
IPv6 addresses the service accepts connections from.
3571
3572 @end table
3573
3574
3575 Options that impact the operation of ARM overall are in the "[arm]"
3576 section. ARM is a normal service and has (except for START_ON_DEMAND) all of the
3577 options that other services do. In addition, ARM has the
3578 following options:
3579
3580 @table @asis
3581
@item GLOBAL_PREFIX
Command to be prepended to the command of every service that is going
to run.

@item GLOBAL_POSTFIX
Global option that will be supplied to every service that is going to
run.
3587
3588 @end table
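
As a rough illustration of how these options fit together, the following
Python sketch parses a small configuration with a hypothetical
@code{[myservice]} section next to @code{[arm]}. The service name, the
binary name, the port and all values are assumptions made up for this
example, and Python's @code{configparser} only stands in for GNUnet's own
configuration parser, whose syntax may differ in details.

```python
import configparser

# Hypothetical configuration: the section name "myservice", the binary
# name and all values are made up for illustration.
EXAMPLE_CONF = """
[arm]
GLOBAL_PREFIX = valgrind

[myservice]
PORT = 2099
BINARY = gnunet-service-myservice
START_ON_DEMAND = YES
IMMEDIATE_START = NO
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
"""


def load_service_options(text, service):
    # Return the ARM-related options of one service section as a dict.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names in upper case
    parser.read_string(text)
    section = parser[service]
    return dict(
        port=section.getint("PORT"),
        # HOSTNAME is absent above, so the documented default applies.
        hostname=section.get("HOSTNAME", "localhost"),
        binary=section.get("BINARY"),
        on_demand=section.getboolean("START_ON_DEMAND", fallback=False),
        immediate=section.getboolean("IMMEDIATE_START", fallback=False),
        global_prefix=parser.get("arm", "GLOBAL_PREFIX", fallback=""),
    )


if __name__ == "__main__":
    print(load_service_options(EXAMPLE_CONF, "myservice"))
```

In this sketch the missing @code{HOSTNAME} falls back to "localhost", as
described above, and the @code{GLOBAL_PREFIX} from the "[arm]" section is
looked up separately because it applies to all services.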
3589
3590 @c ***********************************************************************
3591 @node ARM - Availability
3592 @subsection ARM - Availability
3593
3594
As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, the "client", that
wants to connect to another service, the "server". The "client" asks ARM
to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" then establishes a
connection with the "server", and the two start to communicate.

One problem with that scheme is that it is slow!
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before its request is served.

One solution would be for ARM to start all services as default services.
That would solve the problem, but it is not practical, because some of
the services started this way may never be used, or may only be used
after a relatively long time.

The approach followed by ARM to solve this problem is as follows:
3610
3611 @itemize @bullet
3612
@item For each service that has a PORT field in the configuration file
(that is, a service that accepts incoming connections from clients) and
that is not one of the default services, ARM creates listening sockets
for all addresses associated with that service.

@item The "client" will immediately establish a connection with
the "server".

@item ARM, pretending to be the "server", listens on the respective
port and notices the incoming connection from the "client", but does not
accept it.

@item Once there is an incoming connection, ARM starts the "server",
passing on the listen sockets (now the service is started and can do its
work).

@item Other client services can now connect directly to the
"server".
3631
3632 @end itemize
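
The steps above can be sketched in a few lines of Python. This is only
an illustration under simplifying assumptions: a single process plays
the roles of ARM, the "client" and the "server", the function names are
hypothetical, and the listen socket is handed over by a plain function
call, whereas ARM passes it on to a newly started service process.

```python
import select
import socket


def arm_create_listen_socket():
    # ARM binds and listens on behalf of the not-yet-running "server".
    # Port 0 lets the OS pick a free port; a real service would use PORT.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(5)
    return sock


def arm_wait_for_client(listen_sock, timeout=5.0):
    # A listening socket becomes readable when a connection is pending.
    # ARM only notices the connection here; it does not call accept().
    readable, _, _ = select.select([listen_sock], [], [], timeout)
    return bool(readable)


def server_main(listen_sock):
    # The started "server" takes over the listen socket and accepts the
    # connection that has been waiting in the backlog.
    conn, _ = listen_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"reply to " + request)


def demo():
    listen_sock = arm_create_listen_socket()
    port = listen_sock.getsockname()[1]
    # The "client" connects immediately; the kernel completes the TCP
    # handshake even though nobody has accepted the connection yet.
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    assert arm_wait_for_client(listen_sock)
    server_main(listen_sock)  # stands in for ARM starting the service
    reply = client.recv(1024)
    client.close()
    listen_sock.close()
    return reply
```

The point of the design is that the "client" never has to wait for the
"server" to be running before it connects: its connection is queued on
the socket that ARM created and is served as soon as the service has
been started.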
3633
3634 @c ***********************************************************************
3635 @node Reliability
3636 @subsection Reliability
3637
One of the features provided by ARM is the automatic restart of crashed
services. ARM needs to know which of the running services died. The
function @code{maint_child_death()} in @file{gnunet-service-arm.c} is
responsible for that. It is scheduled to run upon receiving a SIGCHLD
signal. The function then iterates over ARM's list of running services
and determines which services have died (crashed). ARM restarts every
crashed service.

Now consider the case of a service with a serious problem that causes it
to crash each time it is started by ARM. If ARM kept blindly restarting
such a service, we would get the pattern
start-crash-restart-crash-restart-crash and so forth, which is of course
not practical.

For that reason, ARM schedules the service to be restarted after a delay
that grows exponentially with each crash/restart of that service. To
clarify the idea, consider the following example:
3653
3654 @itemize @bullet
3655
@item Service S crashes.

@item ARM receives the SIGCHLD and inspects its list of services to find
the dead one(s).

@item ARM finds S dead and schedules it for restart after the "backoff"
time, which is initially set to 1ms. ARM then doubles the backoff time
for S (now backoff(S) = 2ms).

@item Because there is a severe problem with S, it crashes again.

@item Again ARM receives the SIGCHLD and detects that S has crashed
again. ARM schedules it for restart, this time after the new backoff
time of 2ms, and doubles the backoff time again (now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour). After
reaching it, backoff(S) remains at half an hour, so ARM does not spend
much time trying to restart a problematic service.
3676 @end itemize
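
The backoff schedule from this example can be written down as a short
Python sketch. The function and constant names are hypothetical; only
the initial value of 1ms, the doubling and the half-hour threshold are
taken from the description above, and the actual code in
@file{gnunet-service-arm.c} may differ in details.

```python
from datetime import timedelta

INITIAL_BACKOFF = timedelta(milliseconds=1)
# Half an hour, as stated above for EXPONENTIAL_BACKOFF_THRESHOLD.
BACKOFF_THRESHOLD = timedelta(minutes=30)


def next_backoff(current):
    # Double the delay, but never exceed the threshold.
    return min(current * 2, BACKOFF_THRESHOLD)


def restart_delays(crashes):
    # Delays to wait before each of the first `crashes` restarts.
    delays = []
    backoff = INITIAL_BACKOFF
    for _ in range(crashes):
        delays.append(backoff)
        backoff = next_backoff(backoff)
    return delays


if __name__ == "__main__":
    for number, delay in enumerate(restart_delays(25), start=1):
        print(number, delay)
```

Since 2^21 ms is already more than half an hour, the delay in this
sketch stays at the threshold from the 22nd restart onwards.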
3677
3678 @cindex TRANSPORT Subsystem
3679 @node TRANSPORT Subsystem
3680 @section TRANSPORT Subsystem
3681
3682
This section documents how the GNUnet transport subsystem works. The
GNUnet transport subsystem consists of three main components: the
transport API (the interface used by the rest of the system to access the
transport service), the transport service itself (most of the interesting
functions, such as choosing transports, happen here) and the transport
plugins. A transport plugin is a concrete implementation of how two
GNUnet peers communicate; many plugins exist, for example for
communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
transport subsystem uses supporting code, especially the NAT/UPnP
library, to help with tasks such as NAT traversal.
3693
3694 Key tasks of the transport service include:
3695
3696 @itemize @bullet
3697
3698 @item Create our HELLO message, notify clients and neighbours if our HELLO
3699 changes (using NAT library as necessary)
3700
3701 @item Validate HELLOs from other peers (send PING), allow other peers to
3702 validate our HELLO's addresses (send PONG)
3703
3704 @item Upon request, establish connections to other peers (using address
3705 selection from ATS subsystem) and maintain them (again using PINGs and
3706 PONGs) as long as desired
3707
3708 @item Accept incoming connections, give ATS service the opportunity to
3709 switch communication channels
3710
3711 @item Notify clients about peers that have connected to us or that have
3712 been disconnected from us
3713
3714 @item If a (stateful) connection goes down unexpectedly (without explicit
3715 DISCONNECT), quickly attempt to recover (without notifying clients) but do
3716 notify clients quickly if reconnecting fails
3717
3718 @item Send (payload) messages arriving from clients to other peers via
3719 transport plugins and receive messages from other peers, forwarding
3720 those to clients
3721
3722 @item Enforce inbound traffic limits (using flow-control if it is
3723 applicable); outbound traffic limits are enforced by CORE, not by us (!)
3724
@item Enforce restrictions on P2P connections as specified by the blacklist
configuration and blacklisting clients
3727 @end itemize
3728
3729 Note that the term "clients" in the list above really refers to the
3730 GNUnet-CORE service, as CORE is typically the only client of the
3731 transport service.
3732
3733 @menu
3734 * Address validation protocol::
3735 @end menu
3736
3737 @node Address validation protocol
3738 @subsection Address validation protocol
3739
3740
3741 This section documents how the GNUnet transport service validates
3742 connections with other peers. It is a high-level description of the
3743 protocol necessary to understand the details of the implementation. It
3744 should be noted that when we talk about PING and PONG messages in this
3745 section, we refer to transport-level PING and PONG messages, which are
3746 different from core-level PING and PONG messages (both in implementation
3747 and function).
3748
The goal of transport-level address validation is to minimize the chances
of a successful man-in-the-middle attack against GNUnet peers on the
transport level. Such an attack would not allow the adversary to decrypt
the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities to
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a @code{HELLO} to
Alice that claims to be for Bob, but contains Mallory's IP address
instead of Bob's (for some transport).
Mallory would then forward the traffic to Bob (by initiating a
connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
without traversal support and hence has no address of her own (and thus
Alice must always initiate the connection to Bob).
3763
3764 An additional constraint is that @code{HELLO} messages do not contain a
3765 cryptographic signature since other peers must be able to edit
3766 (i.e. remove) addresses from the @code{HELLO} at any time (this was
3767 not true in GNUnet 0.8.x). A basic @strong{assumption} is that each peer
3768 knows the set of possible network addresses that it @strong{might}
3769 be reachable under (so for example, the external IP address of the
3770 NAT plus the LAN address(es) with the respective ports).
3771
The solution is the following. If Alice wants to validate that a given
address for Bob is valid (i.e. that a connection to it is actually
established @strong{directly} with the intended target), she sends a
@code{PING} message over that connection to Bob. Note that in this case,
Alice initiated the connection, so only Alice knows for sure which address
was used (Alice may be behind NAT, so whatever address Bob sees may not be
an address Alice knows she has).
Bob checks that the address given in the @code{PING} is actually one
of Bob's addresses (i.e. does not belong to Mallory), and if it is,
sends back a @code{PONG} (with a signature that says that Bob
owns/uses the address from the @code{PING}).
Alice checks the signature and is happy if it is valid and the address
in the @code{PONG} is the address Alice used.
This is similar to the 0.8.x protocol where the @code{HELLO} contained a
signature from Bob for each address used by Bob.
Here, the purpose code for the signature is
@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
remember Bob's address and consider the address valid for a while (12h in
the current implementation). Note that after this exchange, Alice only
considers Bob's address to be valid; the connection itself is not
considered 'established'. In particular, Alice may have many addresses
for Bob that Alice considers valid.
3793
3794 The @code{PONG} message is protected with a nonce/challenge against replay
3795 attacks (@uref{http://en.wikipedia.org/wiki/Replay_attack, replay})
3796 and uses an expiration time for the signature (but those are almost
3797 implementation details).
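
The checks made by Bob and Alice can be illustrated with the following
Python sketch. It is a toy model, not the wire protocol: all function
names, the message layout and the signature lifetime are assumptions,
and an HMAC over a key shared by both sides stands in for Bob's
public-key signature, which is what the real protocol uses and which
Alice would verify with Bob's public key.

```python
import hashlib
import hmac
import os

PURPOSE = b"TRANSPORT_PONG_OWN"  # label only, not the real purpose value
SIGNATURE_LIFETIME = 3600  # seconds; an assumed value for this sketch


def sign(key, nonce, address, expires):
    # HMAC stand-in for Bob's signature over the PONG fields.
    data = PURPOSE + nonce + address.encode() + str(expires).encode()
    return hmac.new(key, data, hashlib.sha256).digest()


def make_ping(address):
    # Alice challenges the address she used to reach Bob.
    return dict(nonce=os.urandom(16), address=address)


def handle_ping(ping, own_addresses, key, now):
    # Bob answers only if the address in the PING is one of his own.
    if ping["address"] not in own_addresses:
        return None
    expires = now + SIGNATURE_LIFETIME
    signature = sign(key, ping["nonce"], ping["address"], expires)
    return dict(nonce=ping["nonce"], address=ping["address"],
                expires=expires, signature=signature)


def check_pong(ping, pong, key, now):
    # Alice accepts the address only if every check passes.
    if pong is None or pong["nonce"] != ping["nonce"]:
        return False  # no reply, or a replayed PONG
    if pong["address"] != ping["address"] or pong["expires"] <= now:
        return False  # wrong address, or expired signature
    expected = sign(key, pong["nonce"], pong["address"], pong["expires"])
    return hmac.compare_digest(expected, pong["signature"])
```

The nonce ties a @code{PONG} to one particular @code{PING}, so a
@code{PONG} recorded earlier does not pass the check for a new
challenge.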
3798
3799 @cindex NAT library
3800 @node NAT library
3801 @section NAT library
3802
3803
3804 The goal of the GNUnet NAT library is to provide a general-purpose API for
3805 NAT traversal @strong{without} third-party support. So protocols that
3806 involve contacting a third peer to help establish a connection between
3807 two peers are outside of the scope of this API. That does not mean that
3808 GNUnet doesn't support involving a third peer (we can do this with the
3809 distance-vector transport or using application-level protocols), it just
3810 means that the NAT API is not concerned with this possibility. The API is
3811 written so that it will work for IPv6-NAT in the future as well as
3812 current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3813 that are not behind NAT --- in that case, the mapping provided is simply
3814 the identity.
3815
3816 NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3817 set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3818 library will return (via callback) a (possibly longer) list of addresses
3819 the peer @strong{might} be reachable under. Internally, depending on the
3820 configuration, the NAT library will try to punch a hole (using UPnP) or
3821 just "know" that the NAT was manually punched and generate the respective
3822 external IP address (the one that should be globally visible) based on
3823 the given information.
3824
3825 The NAT library also supports ICMP-based NAT traversal. Here, the other
3826 peer can request connection-reversal by this peer (in this special case,
3827 the peer is even allowed to configure a port number of zero). If the NAT
3828 library detects a connection-reversal request, it returns the respective
3829 target address to the client as well. It should be noted that
3830 connection-reversal is currently only intended for TCP, so other plugins
3831 @strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3832 NAT library also supports requesting connection reversal from a remote
3833 peer (@code{GNUNET_NAT_run_client}).
3834
3835 Once initialized, the NAT handle can be used to test if a given address is
3836 possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3837 This is used for validating our addresses when generating PONGs.
3838
3839 Finally, the NAT library contains an API to test if our NAT configuration
3840 is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
3841 the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client, initializes
the NAT traversal, then contacts a @code{gnunet-nat-server} (running by
default on @code{gnunet.org}) and asks for a connection to be established.
3845 This way, it is easy to test if the current NAT configuration is valid.
3846
3847 @node Distance-Vector plugin
3848 @section Distance-Vector plugin
3849
3850
3851 The Distance Vector (DV) transport is a transport mechanism that allows
3852 peers to act as relays for each other, thereby connecting peers that would
3853 otherwise be unable to connect. This gives a larger connection set to
3854 applications that may work better with more peers to choose from (for
3855 example, File Sharing and/or DHT).
3856
3857 The Distance Vector transport essentially has two functions. The first is
3858 "gossiping" connection information about more distant peers to directly
3859 connected peers. The second is taking messages intended for non-directly
3860 connected peers and encapsulating them in a DV wrapper that contains the
3861 required information for routing the message through forwarding peers. Via
3862 gossiping, optimal routes through the known DV neighborhood are discovered
3863 and utilized and the message encapsulation provides some benefits in
3864 addition to simply getting the message from the correct source to the
3865 proper destination.
3866
The gossiping function of DV provides an up-to-date routing table of
peers that are available up to some number of hops. We call this a
fisheye view of the network (as for a fish, nearby objects are known while
more distant ones are unknown). Gossip messages are sent only to directly
connected peers, but they are sent about other known peers within the
"fisheye distance". Whenever two peers connect, they immediately gossip
to each other about their appropriate other neighbors. They also gossip
about the newly connected peer to previously
connected neighbors. In order to keep the routing tables up to date,
disconnect notifications are propagated as gossip as well (because
disconnects may not be sent/received, timeouts are also used to remove
stale routing table entries).
3879
3880 Routing of messages via DV is straightforward. When the DV transport is
3881 notified of a message destined for a non-direct neighbor, the appropriate
3882 forwarding peer is selected, and the base message is encapsulated in a DV
3883 message which contains information about the initial peer and the intended
3884 recipient. At each forwarding hop, the initial peer is validated (the
3885 forwarding peer ensures that it has the initial peer in its neighborhood,
3886 otherwise the message is dropped). Next the base message is
3887 re-encapsulated in a new DV message for the next hop in the forwarding
3888 chain (or delivered to the current peer, if it has arrived at the
3889 destination).
3890
Assume a three-peer network with peers Alice, Bob and Carol. Assume that
3892
3893 @example
3894 Alice <-> Bob and Bob <-> Carol
3895 @end example
3896
3897 @noindent
are direct connections (e.g. over TCP or UDP transports), but that
Alice cannot directly connect to Carol.
This may be the case due to NAT or firewall restrictions, or perhaps
because of one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob for Carol. The message is encapsulated, setting Alice as
the initiator and Carol as the destination, and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol.
3912 The DV transport at Carol receives this message, unwraps the original
3913 message, and delivers it to Carol as though it came directly from Alice.
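
The Alice, Bob and Carol example can be played through with a small
Python model. This is a sketch under strong simplifications: the class
and method names are hypothetical, gossip only covers the direct
neighbors of a neighbor (a fisheye distance of one extra hop), and
peers call each other's methods directly instead of sending messages
over a transport.

```python
class DVPeer:
    def __init__(self, name):
        self.name = name
        self.direct = dict()  # neighbor name -> DVPeer
        self.routes = dict()  # distant peer name -> next-hop name
        self.inbox = []       # (initiator, payload) delivered locally

    def connect(self, other):
        # Establish a direct link, then gossip about other neighbors.
        self.direct[other.name] = other
        other.direct[self.name] = self
        self.gossip()
        other.gossip()

    def gossip(self):
        # Tell each direct neighbor about our other direct neighbors.
        for neighbor in self.direct.values():
            for known in self.direct:
                if known != neighbor.name and known not in neighbor.direct:
                    neighbor.routes[known] = self.name

    def send(self, destination, payload):
        # Encapsulate the payload with initiator and destination.
        self.forward(dict(initiator=self.name, destination=destination,
                          payload=payload))

    def forward(self, msg):
        dest = msg["destination"]
        if dest == self.name:
            self.inbox.append((msg["initiator"], msg["payload"]))
        elif dest in self.direct:
            self.direct[dest].receive(dict(msg))  # re-wrap for next hop
        elif dest in self.routes:
            self.direct[self.routes[dest]].receive(dict(msg))
        # otherwise there is no route and the message is dropped

    def receive(self, msg):
        # Drop messages whose initiator is not in our neighborhood.
        known = set(self.direct) | set(self.routes)
        if msg["initiator"] in known:
            self.forward(msg)
```

After @code{alice.connect(bob)} and @code{bob.connect(carol)}, Alice's
routing table in this model maps Carol to Bob, and
@code{alice.send("carol", ...)} ends up in Carol's inbox with Alice as
the initiator.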
3914
3915 @cindex SMTP plugin
3916 @node SMTP plugin
3917 @section SMTP plugin
3918
3919 @c TODO: Update!
3920
This section describes the SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branches. SMTP support is currently not
available in GNUnet 0.9.x. This section also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
Readers in the year 2019 and later will notice from the mention of
versions 0.7, 0.8 and 0.9 that this section has to be taken with the
usual grain of salt and needs to be updated eventually.
3930
3931 @itemize @bullet
3932 @item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3934 @item How do I configure my peer?
3935 @item How do I test if it works?
3936 @item How fast is it?
3937 @item Is there any additional documentation?
3938 @end itemize
3939
3940
3941 @menu
3942 * Why use SMTP for a peer-to-peer transport?::
3943 * How does it work?::
3944 * How do I configure my peer?::
3945 * How do I test if it works?::
3946 * How fast is it?::
3947 @end menu
3948
3949 @node Why use SMTP for a peer-to-peer transport?
3950 @subsection Why use SMTP for a peer-to-peer transport?
3951
3952
3953 There are many reasons why one would not want to use SMTP:
3954
3955 @itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to
non-participating third parties.
3963 @end itemize
3964
3965 So why would anybody want to use SMTP?
3966 @itemize @bullet
3967 @item SMTP can be used to contact peers behind NAT boxes (in virtual
3968 private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3971 @item SMTP uses E-mail addresses which are independent of a specific IP,
3972 which can be useful to address peers that use dynamic IP addresses.
3973 @item SMTP can be used to initiate a connection (e.g. initial address
3974 exchange) and peers can then negotiate the use of a more efficient
3975 protocol (e.g. TCP) for the actual communication.
3976 @end itemize
3977
3978 In summary, SMTP can for example be used to send a message to a peer
3979 behind a NAT box that has a dynamic IP to tell the peer to establish a
3980 TCP connection to a peer outside of the private network. Even an
3981 extraordinary overhead for this first message would be irrelevant in this
3982 type of situation.
3983
3984 @node How does it work?
3985 @subsection How does it work?
3986
3987
When a GNUnet peer needs to send a message to another GNUnet peer that has
advertised (only) an SMTP transport address, GNUnet base64-encodes the
message and sends it in an E-mail to the advertised address. The
advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward them
to the GNUnet peer process. The filter can be specified individually by
each peer and be changed over time. This makes it impossible to censor
GNUnet E-mail messages by searching for a generic filter.
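
The encapsulation can be sketched with Python's standard library. The
function names and the exact layout of the mail are assumptions made
for illustration; the sketch only shows the two ideas described above,
namely base64-encoding the message and tagging the mail with the
advertised filter header. Handing the mail to an SMTP server is left
out.

```python
import base64
from email import message_from_bytes
from email.message import EmailMessage


def wrap_message(payload, to_address, from_address, filter_header):
    # filter_header is the advertised filter, e.g. "X-mailer: GNUnet".
    name, value = filter_header.split(":", 1)
    mail = EmailMessage()
    mail["From"] = from_address
    mail["To"] = to_address
    mail[name.strip()] = value.strip()
    # encodebytes() wraps the base64 text into short lines.
    mail.set_content(base64.encodebytes(payload).decode("ascii"))
    return mail.as_bytes()


def unwrap_message(raw_mail, filter_header):
    # Return the payload if the mail carries the filter, else None.
    name, value = filter_header.split(":", 1)
    mail = message_from_bytes(raw_mail)
    if mail.get(name.strip()) != value.strip():
        return None
    return base64.b64decode(mail.get_payload(decode=True))
```

On the receiving side, the check in @code{unwrap_message} corresponds
to the filtering step that separates GNUnet mail from regular mail.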
3996
3997 @node How do I configure my peer?
3998 @subsection How do I configure my peer?
3999
4000
4001 First, you need to configure @code{procmail} to filter your inbound E-mail
4002 for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
4003 example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
4004 used by @command{procmail} to detect GNUnet messages. You are free to
4005 choose whichever filter you like, but you should make sure that it does
4006 not occur in your other E-mail. In our example, we will use
4007 @code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
4008 looks like this:
4009
4010 @example
4011 :0:
4012 * ^X-mailer: GNUnet
4013 /tmp/gnunet.smtp
4014 # where do you want your other e-mail delivered to
4015 # (default: /var/spool/mail/)
:0:
/var/spool/mail/
4017 @end example
4018
4019 After adding this file, first make sure that your regular E-mail still
4020 works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
4021 configuration. In the section @code{SMTP} you need to specify your E-mail
4022 address under @code{EMAIL}, your mail server (for outgoing mail) under
4023 @code{SERVER}, the filter (X-mailer: GNUnet in the example) under
4024 @code{FILTER} and the name of the pipe under @code{PIPE}.@ The completed
4025 section could then look like this:
4026
4027 @example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
4030 @end example
4031
4032 Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
4033 the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
4034 you specified to contact your peer until the advertisement times out.
4035 Thus, if you are not sure if everything works properly or if you are not
4036 planning to be online for a long time, you may want to configure this
4037 timeout to be short, e.g. just one hour. For this, set
4038 @code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
4039
This should be it, but you will probably want to test it first.
4041
4042 @node How do I test if it works?
4043 @subsection How do I test if it works?
4044
4045
Any transport can be subjected to some rudimentary tests using the
@code{gnunet-transport-check} tool. The tool sends a message to the local
node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check whether firewalls
or other network obstacles prohibit proper operation, it is a great
test case for the SMTP transport since it exercises nearly all of the
functionality.
4053
4054 @code{gnunet-transport-check} should only be used without running
4055 @code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
4056 tests all transports that are specified in the configuration file. But
4057 you can specifically test SMTP by giving the option
4058 @code{--transport=smtp}.
4059
4060 Note that this test always checks if a transport can receive and send.
4061 While you can configure most transports to only receive or only send
4062 messages, this test will only work if you have configured the transport
4063 to send and receive messages.
4064
4065 @node How fast is it?
4066 @subsection How fast is it?
4067
4068
We have measured the performance of the UDP, TCP and SMTP transport layers
directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
overhead of the protocol, whereas evaluating the transport from the
application puts the overhead into perspective from a practical point of
view.
4075
The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration, which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering whereas the sendmail benchmarks rely on
procmail to filter and deliver the mail.

We used the transport layer to
send a message of b bytes (excluding transport protocol headers) directly
to the local machine. This way, network latency and packet loss on the
wire have no impact on the timings. n messages were sent sequentially over
the transport layer, sending message i+1 after the i-th message was
received. All messages were sent over the same connection and the time to
establish the connection was not taken into account since this overhead is
minuscule in practice, as long as a connection is used for a
significant number of messages.
4093
4094 @multitable @columnfractions .20 .15 .15 .15 .15 .15
@headitem Message size @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
4096 @tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
4097 @item  11 bytes @tab 31 ms @tab 55 ms @tab  781 s @tab 77 s @tab 24 s
4098 @item  407 bytes @tab 37 ms @tab 62 ms @tab  789 s @tab 78 s @tab 25 s
4099 @item 1,221 bytes @tab 46 ms @tab 73 ms @tab  804 s @tab 78 s @tab 25 s
4100 @end multitable
4101
The benchmarks show that UDP and TCP are, as expected, both significantly
faster than any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost. The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection.

Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be improved by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps, as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
4130
4131 @cindex Bluetooth plugin
4132 @node Bluetooth plugin
4133 @section Bluetooth plugin
4134
4135
This section describes the new Bluetooth transport plugin for GNUnet. The
plugin is still in the testing stage, so don't expect it to work
perfectly. If you have any questions or problems, ask on the IRC channel.
4140
4141 @itemize @bullet
4142 @item What do I need to use the Bluetooth plugin transport?
@item How does it work?
4144 @item What possible errors should I be aware of?
4145 @item How do I configure my peer?
4146 @item How can I test it?
4147 @end itemize
4148
4149 @menu
4150 * What do I need to use the Bluetooth plugin transport?::
4151 * How does it work2?::
4152 * What possible errors should I be aware of?::
4153 * How do I configure my peer2?::
4154 * How can I test it?::
4155 * The implementation of the Bluetooth transport plugin::
4156 @end menu
4157
4158 @node What do I need to use the Bluetooth plugin transport?
4159 @subsection What do I need to use the Bluetooth plugin transport?
4160
4161
4162 If you are a GNU/Linux user and you want to use the Bluetooth
4163 transport plugin you should install the
4164 @command{BlueZ development libraries} (if they aren't already
4165 installed).
4166 For instructions about how to install the libraries you should
4167 check out the BlueZ site
4168 (@uref{http://www.bluez.org/, http://www.bluez.org}). If you don't know if
4169 you have the necessary libraries, don't worry, just run the GNUnet
4170 configure script and you will be able to see a notification at the end
4171 which will warn you if you don't have the necessary libraries.
4172
If you are a Windows user you should have
@emph{MinGW}/@emph{MSys2} installed with the latest updates (especially the
@emph{ws2bth} header). If this is your first build of GNUnet on Windows
you should check out the SBuild repository. It semi-automatically
assembles a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
packages which are needed for the GNUnet build, so this will ease your
work!

Finally, you just have to make sure that you have the correct drivers
for your Bluetooth device installed and that your device is on and in
discoverable mode. The Windows Bluetooth Stack supports only the RFCOMM
protocol, so we cannot turn on your device programmatically!
4183
4184 @c FIXME: Change to unique title
4185 @node How does it work2?
4186 @subsection How does it work2?
4187
4188
4189 The Bluetooth transport plugin uses virtually the same code as the WLAN
4190 plugin and only the helper binary is different. The helper takes a single
4191 argument, which represents the interface name and is specified in the
4192 configuration file. Here are the basic steps that are followed by the
4193 helper binary used on GNU/Linux:
4194
4195 @itemize @bullet
4196 @item it verifies if the name corresponds to a Bluetooth interface name
4197 @item it verifies if the interface is up (if it is not, it tries to bring
4198 it up)
4199 @item it tries to enable the page and inquiry scan in order to make the
4200 device discoverable and to accept incoming connection requests
4201 @emph{The above operations require root access so you should start the
4202 transport plugin with root privileges.}
@item it finds an available port number, registers an SDP service (which
other devices use to find out on which port the server is listening)
and switches the socket to listening mode
4206 @item it sends a HELLO message with its address
4207 @item finally it forwards traffic from the reading sockets to the STDOUT
4208 and from the STDIN to the writing socket
4209 @end itemize
4210
Once in a while the device performs an inquiry scan to discover
nearby devices and sends them HELLO messages at random for peer
discovery.
4214
4215 @node What possible errors should I be aware of?
4216 @subsection What possible errors should I be aware of?
4217
4218
@emph{This section is intended for GNU/Linux users.}

There are many ways in which things could go wrong. The following tools
and scenarios should help with debugging.
4223
4224 @itemize @bullet
4225
@item @code{bluetoothd -n -d}: use this command to enable logging in the
foreground and to print the log messages

@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments it prints information about the
state of the interfaces. If you receive an error that the device
could not be brought up, try to bring it up manually and see if
it works (use @code{hciconfig -a hciX up}). If you cannot and the
Bluetooth address has the form 00:00:00:00:00:00, something is wrong
with the D-Bus daemon or with the Bluetooth daemon. Use the
@code{bluetoothd} tool to see the logs.

@item @code{sdptool}: can be used to control and interrogate SDP servers.
If you encounter problems with the SDP server (for example, the SDP
server is down), check whether the D-Bus daemon is running correctly and
whether the Bluetooth daemon started correctly (use the @code{bluetoothd}
tool). Also, sometimes the SDP service works but the device fails to
register its service. Use @code{sdptool browse [dev-address]} to see if
the service is registered. There should be a service with the name of the
interface and GNUnet as provider.

@item @code{hcitool}: another useful tool which can be used to configure
the device and to send particular commands to it.

@item @code{hcidump}: can be used for low-level debugging
4251 @end itemize
4252
4253 @c FIXME: A more unique name
4254 @node How do I configure my peer2?
4255 @subsection How do I configure my peer2?
4256
4257
4258 On GNU/Linux, you just have to be sure that the interface name
4259 corresponds to the one that you want to use.
4260 Use the @code{hciconfig} tool to check that.
4261 By default it is set to hci0 but you can change it.
4262
4263 A basic configuration looks like this:
4264
4265 @example
[transport-bluetooth]
# Name of the interface (typically hciX)
INTERFACE = hci0
# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
4271 @end example
4272
4273 In order to use the Bluetooth transport plugin when the transport service
4274 is started, you must add the plugin name to the default transport service
4275 plugins list. For example:
4276
4277 @example
[transport]
...
PLUGINS = dns bluetooth
...
4279 @end example
4280
If you want to use only the Bluetooth plugin, set
@emph{PLUGINS = bluetooth}.
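As a quick sanity check of the two fragments above, a small script can
confirm that an interface is set and that @code{bluetooth} appears in the
plugin list. This is only a sketch: it uses Python's generic
@code{configparser} module rather than GNUnet's own configuration parser
(which may treat some syntax differently), and the helper name
@code{bluetooth_enabled} is made up for illustration.

```python
import configparser

SAMPLE = """
[transport-bluetooth]
INTERFACE = hci0
TESTMODE = 0

[transport]
PLUGINS = dns bluetooth
"""


def bluetooth_enabled(text):
    """Return (interface, enabled) for a GNUnet-style configuration text."""
    # Generic INI parsing; this is an assumption, not GNUnet's real parser.
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names exactly as written (upper case)
    cfg.read_string(text)
    interface = cfg.get("transport-bluetooth", "INTERFACE", fallback=None)
    plugins = cfg.get("transport", "PLUGINS", fallback="").split()
    return interface, "bluetooth" in plugins


print(bluetooth_enabled(SAMPLE))
```

If the second value is false, the transport service would not load the
Bluetooth plugin with that configuration.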
4283
On Windows, you cannot specify which device to use. The only thing that
you have to do is add @emph{bluetooth} to the plugins list of the
transport service.
4287
4288 @node How can I test it?
4289 @subsection How can I test it?
4290
4291
4292 If you have two Bluetooth devices on the same machine and you are using
4293 GNU/Linux you must:
4294
4295 @itemize @bullet
4296
@item create two different configuration files (one which will use the
first interface (@emph{hci0}) and the other which will use the second
interface (@emph{hci1})). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.
4304
4305 @item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
4306 start the transport service. (Make sure that you have "bluetooth" on the
4307 transport plugins list if the Bluetooth transport service doesn't start.)
4308
4309 @item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
4310 ID. If you already know your peer ID (you saved it from the first
4311 command), this can be skipped.
4312
4313 @item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
4314 sending data for benchmarking to the other peer.
4315
4316 @end itemize
4317
4318
4319 This scenario will try to connect the second peer to the first one and
4320 then start sending data for benchmarking.
4321
On Windows you cannot test the plugin functionality using two Bluetooth
devices on the same machine, because after you install the drivers
conflicts will occur between the Bluetooth stacks. (At least that is
what happened on my machine: I was not able to use the BlueSoleil stack and
the WIDCOMM one at the same time.)

If you have two different machines and your configuration files are good,
you can use the same scenario presented at the beginning of this section.
4330
4331 Another way to test the plugin functionality is to create your own
4332 application which will use the GNUnet framework with the Bluetooth
4333 transport service.
4334
4335 @node The implementation of the Bluetooth transport plugin
4336 @subsection The implementation of the Bluetooth transport plugin
4337
4338
This section describes the implementation of the Bluetooth transport plugin.

As noted earlier, the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. The purpose of the helper binary of the Bluetooth
transport plugin is also the same as that of the one used for the WLAN
transport plugin: it accesses the interface and then forwards traffic in
both directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both GNU/Linux and Windows
platforms.
4351
4352 @itemize @bullet
4353 @item Linux functionality
4354 @item Windows functionality
4355 @item Pending Features
4356 @end itemize
4357
4358
4359
4360 @menu
4361 * Linux functionality::
4362 * THE INITIALIZATION::
4363 * THE LOOP::
4364 * Details about the broadcast implementation::
4365 * Windows functionality::
4366 * Pending features::
4367 @end menu
4368
4369 @node Linux functionality
4370 @subsubsection Linux functionality
4371
4372
In order to implement the plugin functionality on GNU/Linux I
used the BlueZ stack.
For the communication with the other devices I used the RFCOMM
protocol. I also used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and works in two stages:
4379
4380 @c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
4381 @c %** starting a new section?
4382 @node THE INITIALIZATION
4383 @subsubsection THE INITIALIZATION
4384
4385 @itemize @bullet
4386 @item first, it checks if we have root privileges
4387 (@emph{Remember that we need to have root privileges in order to be able
4388 to bring the interface up if it is down or to change its state.}).
4389
4390 @item second, it verifies if the interface with the given name exists.
4391
4392 @strong{If the interface with that name exists and it is a Bluetooth
4393 interface:}
4394
@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method

The @emph{open_device} method:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
@item searches for the device ID using the interface name
@item saves the device MAC address
@item checks if the interface is down and tries to bring it up
@item checks if the interface is in discoverable mode and, if not, tries
to make it discoverable
@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
@end itemize
4411
4412 @item drops the root privileges
4413
4414 @strong{If the interface is not a Bluetooth interface the helper exits
4415 with a suitable error}
4416 @end itemize
4417
4418 @c %** Same as for @node entry above
4419 @node THE LOOP
4420 @subsubsection THE LOOP
4421
The helper binary uses a list where it saves all the connected neighbour
devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:
4428
4429 @itemize @bullet
@item Whenever it receives something from STDIN it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there is
no connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Whenever it receives something on the listening socket it
accepts the connection and saves the socket in a list of reading
sockets.
@item Whenever it receives something from a reading
socket it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
4441 @end itemize
4442
In the main loop we use the select function to wait until one of the
file descriptors saved in one of the two file descriptor sets is
ready to use. The first set (@emph{rfds}) is the reading set and
it can contain the reading sockets, the STDIN file
descriptor or the listening socket. The second set (@emph{wfds}) is the
writing set and it can contain the sending socket or the STDOUT file
descriptor. After the select function returns, we check which file
descriptor is ready to use and handle that kind of event accordingly.
@emph{For example:} if it is the listening socket then we
accept a new connection and save the socket in the reading list; if it is
the STDOUT file descriptor, then we write to STDOUT the message from the
@emph{write_std} buffer.
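The event dispatch described above can be sketched with Python's
@code{select} module. This is an illustration only, not the helper's code:
TCP sockets on the loopback interface stand in for RFCOMM sockets, the
STDIN/STDOUT handling and the writing set are left out, and the function
name @code{poll_once} is hypothetical.

```python
import select
import socket


def poll_once(listener, readers, timeout=1.0):
    """Wait for readiness once and handle each ready descriptor."""
    rfds = [listener] + readers  # the reading set
    ready, _, _ = select.select(rfds, [], [], timeout)
    events = []
    for sock in ready:
        if sock is listener:
            # Listening socket is ready: accept and remember the new socket.
            conn, _addr = listener.accept()
            readers.append(conn)
            events.append(("accept", b""))
        else:
            # A reading socket is ready: fetch the pending bytes.
            events.append(("data", sock.recv(4096)))
    return events


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
readers = []

client = socket.create_connection(listener.getsockname())
first = poll_once(listener, readers)   # the pending connection is accepted
client.sendall(b"hello")
second = poll_once(listener, readers)  # the accepted socket becomes readable
print(first, second)

client.close()
for sock in readers:
    sock.close()
listener.close()
```

The same pattern extends to a writing set: descriptors with queued output
would be passed as the second argument of @code{select.select}.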
4455
To find out on which port a device is listening we connect to the local
SDP server and search the registered service for that device.

@emph{You should be aware that if the device fails to connect
to another one when trying to send a message it will attempt one more
time. If it fails again, it skips the message.}
@emph{You should also know that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}
4464
4465 @node Details about the broadcast implementation
4466 @subsubsection Details about the broadcast implementation
4467
4468
First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since an inquiry scan
takes a long time and it would take some time to send a message to all the
discoverable devices, I decided to tackle the problem in a different way.
Here is how I did it:
4474
4475 @itemize @bullet
@item The first time I have to broadcast a message I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends I take the first address from the list
and try to connect to it. If that fails, I try to connect to the next one.
If it succeeds, I save the socket to a list and send the message to the
device.
@item When I have to broadcast another message, I first search the list
for a new device which I am not connected to. If there is no new device on
the list I go to the beginning of the list and send the message to the
old devices. After 5 cycles I make a new inquiry scan to check
whether there are new discoverable devices and save them to the list. If
there are no new discoverable devices I reset the cycling counter and go
again through the old list and send messages to the devices saved in it.
4489 @end itemize
4490
4491 @strong{Therefore}:
4492
4493 @itemize @bullet
@item every time I have a broadcast message I look in the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I am connected to
all the devices on the list I make a new inquiry scan.
@emph{The number of list cycles after an inquiry scan can be
increased by redefining the MAX_LOOPS variable}
@item when there are no new devices I send messages to the old ones.
4501 @end itemize
4502
4503 Doing so, the broadcast control messages will reach the devices but with
4504 delay.
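The cycling strategy can be modelled in a few lines. The following sketch
is one possible reading of the description above, not the helper's actual
code: it picks a single target per broadcast message, and the class name
@code{BroadcastList} as well as the @code{inquiry_scan} and @code{connect}
callbacks are hypothetical. Only the name @code{MAX_LOOPS} comes from the
text.

```python
MAX_LOOPS = 5  # list cycles between two inquiry scans (name from the text)


class BroadcastList:
    def __init__(self, inquiry_scan, connect):
        self.inquiry_scan = inquiry_scan  # hypothetical: returns nearby addresses
        self.connect = connect            # hypothetical: returns True on success
        self.devices = []                 # every address seen so far
        self.connected = []               # addresses we hold a socket for
        self.pos = 0                      # position among the old devices
        self.loops = 0                    # completed cycles since the last scan
        self.scanned = False

    def next_target(self):
        """Pick the device that receives the next broadcast message."""
        if not self.scanned or self.loops >= MAX_LOOPS:
            for addr in self.inquiry_scan():
                if addr not in self.devices:
                    self.devices.append(addr)
            self.scanned = True
            self.loops = 0
        # Prefer a device we are not connected to yet.
        for addr in self.devices:
            if addr not in self.connected and self.connect(addr):
                self.connected.append(addr)
                return addr
        # No new device: walk through the old ones and count full cycles.
        if not self.connected:
            return None
        addr = self.connected[self.pos % len(self.connected)]
        self.pos += 1
        if self.pos % len(self.connected) == 0:
            self.loops += 1
        return addr


scans = []


def fake_scan():
    scans.append(1)  # record that a scan took place
    return ["AA", "BB"]


state = BroadcastList(fake_scan, lambda addr: True)
targets = [state.next_target() for _ in range(13)]
print(targets[:4], len(scans))
```

With two reachable devices, the first two messages go to the new devices,
the following ten cycle through the list five times, and the thirteenth
message triggers the second inquiry scan.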
4505
@emph{NOTICE:} When I have to send a message to a certain device, I first
check the broadcast list to see if I am connected to that device. If
not, I try to connect to it and in case of success I save the address and
the socket on the list. If I am already connected to that device I
simply use the socket.
4511
4512 @node Windows functionality
4513 @subsubsection Windows functionality
4514
4515
For Windows I decided to use the Microsoft Bluetooth stack, which has the
advantage of being included by default since Windows XP SP2. The main
disadvantage is that it only supports the RFCOMM protocol, so we do not
have low-level control over the Bluetooth device. Therefore it is the
user's responsibility to check that the device is up and in discoverable
mode. Also, there are no debugging tools for reading
the data coming from and going to a Bluetooth device, which
hindered my work. Another thing that slowed down the implementation of the
plugin (besides my unfamiliarity with the win32 API) was that
there were some Bluetooth-related bugs in MinGW. They are now fixed,
but keep in mind that you should have the latest updates
(especially the @emph{ws2bth} header).
4528
4529 Besides the fact that it uses the Windows Sockets, the Windows
4530 implementation follows the same principles as the GNU/Linux one:
4531
4532 @itemize @bullet
@item It has an initialization part where it initializes
Windows Sockets, creates an RFCOMM socket which is bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with SDP:
4537 @itemize @bullet
4538 @item an easy way which works with very simple service records
4539 @item a hard way which is useful when you need to update or to delete the
4540 record
4541 @end itemize
4542 @end itemize
4543
Since I only needed the SDP service to find out on which port the device
is listening and that did not change, I decided to use the easy way.
In order to register the service I used the @emph{WSASetService} function
and I generated the @emph{Universally Unique Identifier} with the
Windows tool @emph{guidgen.exe}.
4549
4550 In the loop section the only difference from the GNU/Linux implementation
4551 is that I used the @code{GNUNET_NETWORK} library for
4552 functions like @emph{accept}, @emph{bind}, @emph{connect} or
4553 @emph{select}. I decided to use the
4554 @code{GNUNET_NETWORK} library because I also needed to interact
4555 with the STDIN and STDOUT handles and on Windows
4556 the select function is only defined for sockets,
4557 and it will not work for arbitrary file handles.
4558
Another difference between the GNU/Linux and Windows implementations is
that on GNU/Linux the Bluetooth address is represented in 48 bits,
while on Windows it is represented in 64 bits.
Therefore I had to make some changes to the @emph{plugin_transport_wlan}
header.
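The size difference can be illustrated as follows. The sketch assumes that
the 48-bit address occupies the low-order bits of the 64-bit value, as is
the case for the Windows @code{BTH_ADDR} type. The function names are
hypothetical, the byte order chosen for the 48-bit form is only for
display, and none of this is taken from the plugin's code.

```python
import struct


def mac_to_int(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' into an integer that fits in 48 bits."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets")
    value = 0
    for octet in parts:
        value = (value << 8) | int(octet, 16)
    return value


def pack_48(value):
    """Six bytes, most significant first (a 48-bit representation)."""
    return value.to_bytes(6, "big")


def pack_64(value):
    """Eight bytes: little-endian unsigned 64-bit, upper 16 bits unused."""
    return struct.pack("<Q", value)


addr = mac_to_int("00:1A:7D:DA:71:13")
print(hex(addr), len(pack_48(addr)), len(pack_64(addr)))
```

Any structure shared between both platforms therefore has to reserve room
for the larger of the two representations.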
4563
4564 Also, currently on Windows the Bluetooth plugin doesn't have support for
4565 broadcast messages. When it receives a broadcast message it will skip it.
4566
4567 @node Pending features
4568 @subsubsection Pending features
4569
4570
4571 @itemize @bullet
@item Implement the broadcast functionality on Windows @emph{(currently
in progress)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
4577 @end itemize
4578
4579 If you have a new idea about a feature of the plugin or suggestions about
4580 how I could improve the implementation you are welcome to comment or to
4581 contact me.
4582
4583 @node WLAN plugin
4584 @section WLAN plugin
4585
4586
4587 This section documents how the wlan transport plugin works. Parts which
4588 are not implemented yet or could be better implemented are described at
4589 the end.
4590
4591 @cindex ATS Subsystem
4592 @node ATS Subsystem
4593 @section ATS Subsystem
4594
4595
ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4605
4606 The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
4608 does not make the actual allocation decision. This last critical step is
4609 left to an ATS plugin, as we have implemented (currently three) different
4610 allocation strategies which differ significantly in their performance and
4611 maturity, and it is still unclear if any particular plugin is generally
4612 superior.
4613
4614 @cindex CORE Subsystem
4615 @node CORE Subsystem
4616 @section CORE Subsystem
4617
4618
4619 The CORE subsystem in GNUnet is responsible for securing link-layer
4620 communications between nodes in the GNUnet overlay network. CORE builds
4621 on the TRANSPORT subsystem which provides for the actual, insecure,
4622 unreliable link-layer communication (for example, via UDP or WLAN), and
4623 then adds fundamental security to the connections:
4624
4625 @itemize @bullet
4626 @item confidentiality with so-called perfect forward secrecy; we use
4627 ECDHE
(@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie-Hellman})
4629 powered by Curve25519
4630 (@uref{http://cr.yp.to/ecdh.html, Curve25519}) for the key
4631 exchange and then use symmetric encryption, encrypting with both AES-256
4632 (@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}) and
4633 Twofish (@uref{http://en.wikipedia.org/wiki/Twofish, Twofish})
4634 @item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4635 is achieved by signing the ephemeral keys using Ed25519
4636 (@uref{http://ed25519.cr.yp.to/, Ed25519}), a deterministic
4637 variant of ECDSA
4638 (@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA})
4639 @item integrity protection (using SHA-512
4640 (@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}) to do
4641 encrypt-then-MAC
4642 (@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}))
4643 @item Replay
4644 (@uref{http://en.wikipedia.org/wiki/Replay_attack, replay})
4645 protection (using nonces, timestamps, challenge-response,
4646 message counters and ephemeral keys)
4647 @item liveness (keep-alive messages, timeout)
4648 @end itemize
4649
4650 @menu
4651 * Limitations::
4652 * When is a peer "connected"?::
4653 * libgnunetcore::
4654 * The CORE Client-Service Protocol::
4655 * The CORE Peer-to-Peer Protocol::
4656 @end menu
4657
4658 @cindex core subsystem limitations
4659 @node Limitations
4660 @subsection Limitations
4661
4662
4663 CORE does not perform
4664 @uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4665 only possible to communicate with peers that happen to already be
4666 "directly" connected with each other. CORE also does not have an
4667 API to allow applications to establish such "direct" connections --- for
4668 this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4669 establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4670 trying to keep a few "direct" connections open at all times. Applications
4671 that need to talk to particular peers should use the CADET subsystem, as
4672 it can establish arbitrary "indirect" connections.
4673
4674 Because CORE does not perform routing, CORE must only be used directly by
4675 applications that either perform their own routing logic (such as
4676 anonymous file-sharing) or that do not require routing, for example
4677 because they are based on flooding the network. CORE communication is
4678 unreliable and delivery is possibly out-of-order. Applications that
4679 require reliable communication should use the CADET service. Each
4680 application can only queue one message per target peer with the CORE
4681 service at any time; messages cannot be larger than approximately
4682 63 kilobytes. If messages are small, CORE may group multiple messages
4683 (possibly from different applications) prior to encryption. If permitted
4684 by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4685 option), CORE may delay transmissions to facilitate grouping of multiple
4686 small messages. If cork is not enabled, CORE will transmit the message as
4687 soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4688 bandwidth and congestion control). CORE does not allow flow control;
4689 applications are expected to process messages at line-speed. If flow
4690 control is needed, applications should use the CADET service.
4691
4692 @cindex when is a peer connected
4693 @node When is a peer "connected"?
4694 @subsection When is a peer "connected"?
4695
4696
4697 In addition to the security features mentioned above, CORE also provides
4698 one additional key feature to applications using it, and that is a
4699 limited form of protocol-compatibility checking. CORE distinguishes
4700 between TRANSPORT-level connections (which enable communication with other
4701 peers) and application-level connections. Applications using the CORE API
4702 will (typically) learn about application-level connections from CORE, and
4703 not about TRANSPORT-level connections. When a typical application uses
4704 CORE, it will specify a set of message types
4705 (from @code{gnunet_protocols.h}) that it understands. CORE will then
4706 notify the application about connections it has with other peers if and
4707 only if those applications registered an intersecting set of message
4708 types with their CORE service. Thus, it is quite possible that CORE only
4709 exposes a subset of the established direct connections to a particular
4710 application --- and different applications running above CORE might see
4711 different sets of connections at the same time.
4712
Applications that do not register a handler for any
message type are a special case.
4715 CORE assumes that these applications merely want to monitor connections
4716 (or "all" messages via other callbacks) and will notify those applications
4717 about all connections. This is used, for example, by the
4718 @code{gnunet-core} command-line tool to display the active connections.
4719 Note that it is also possible that the TRANSPORT service has more active
4720 connections than the CORE service, as the CORE service first has to
4721 perform a key exchange with connecting peers before exchanging information
4722 about supported message types and notifying applications about the new
4723 connection.
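The visibility rule can be summarised with a small model. This is a sketch
of the rule as described in this section, not code from CORE: the function
name @code{visible_peers}, the peer names and the numeric message types are
all made up for illustration.

```python
def visible_peers(my_types, neighbours):
    """Return the peers for which CORE would notify this application.

    my_types:   set of message types the application registered handlers for
    neighbours: mapping from peer name to the set of message types
                registered at that peer
    """
    if not my_types:
        # No handlers registered: treated as a monitor, sees every connection.
        return sorted(neighbours)
    return sorted(p for p, types in neighbours.items() if my_types & types)


# Hypothetical neighbours and message type numbers.
neighbours = dict(alice=set([1, 2]), bob=set([7]), carol=set([2, 9]))
print(visible_peers(set([2]), neighbours))
print(visible_peers(set(), neighbours))
```

An application handling only type 2 is told about the peers sharing that
type, while an application without handlers is told about all three.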
4724
4725 @cindex libgnunetcore
4726 @node libgnunetcore
4727 @subsection libgnunetcore
4728
4729
4730 The CORE API (defined in @file{gnunet_core_service.h}) is the basic
4731 messaging API used by P2P applications built using GNUnet. It provides
4732 applications the ability to send and receive encrypted messages to the
4733 peer's "directly" connected neighbours.
4734
As CORE connections are generally "direct" connections, applications must
4736 not assume that they can connect to arbitrary peers this way, as "direct"
4737 connections may not always be possible. Applications using CORE are
4738 notified about which peers are connected. Creating new "direct"
4739 connections must be done using the TRANSPORT API.
4740
The CORE API provides unreliable, out-of-order delivery. While the
implementation tries to ensure timely, in-order delivery, neither message
loss nor reordering is detected, and both must be tolerated by the
application. Most importantly, CORE will NOT perform retransmission if
messages could not be delivered.
4746
Note that CORE allows applications to queue one message per connected
peer. The rate at which each connection operates is influenced by the
preferences expressed by local applications as well as restrictions
imposed by the other peer. Local applications can express their
preferences for particular connections using the "performance" API of the
ATS service.

Applications that require more sophisticated transmission capabilities
such as TCP-like behavior, or that need to send messages to arbitrary
remote peers, should use the CADET API.
4757
4758 The typical use of the CORE API is to connect to the CORE service using
4759 @code{GNUNET_CORE_connect}, process events from the CORE service (such as
4760 peers connecting, peers disconnecting and incoming messages) and send
4761 messages to connected peers using
4762 @code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4763 cancel pending transmission requests if they receive a disconnect event
4764 for a peer that had a transmission pending; furthermore, queuing more
4765 than one transmission request per peer per application using the
4766 service is not permitted.
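The two rules for transmission requests (one pending request per peer,
cancellation on disconnect) amount to simple bookkeeping on the application
side. The following model is a sketch only: the class @code{TransmitSlots}
and its methods are hypothetical and do not correspond to functions of the
CORE API.

```python
class TransmitSlots:
    """At most one pending transmission request per connected peer."""

    def __init__(self):
        self.pending = dict()  # peer -> message waiting for transmission

    def request(self, peer, message):
        if peer in self.pending:
            raise RuntimeError("a request for this peer is already queued")
        self.pending[peer] = message

    def ready(self, peer):
        """CORE signalled readiness: hand over the message, free the slot."""
        return self.pending.pop(peer)

    def on_disconnect(self, peer):
        """Cancel whatever was pending for the peer that went away."""
        return self.pending.pop(peer, None) is not None


slots = TransmitSlots()
slots.request("peer-a", b"ping")
try:
    slots.request("peer-a", b"again")  # second request for the same peer
    second_accepted = True
except RuntimeError:
    second_accepted = False
cancelled = slots.on_disconnect("peer-a")
print(second_accepted, cancelled)
```

After the disconnect handler has run, the slot is free again, so a later
reconnect of the same peer starts from a clean state.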
4767
4768 The CORE API also allows applications to monitor all communications of the
4769 peer prior to encryption (for outgoing messages) or after decryption (for
4770 incoming messages). This can be useful for debugging, diagnostics or to
4771 establish the presence of cover traffic (for anonymity). As monitoring
4772 applications are often not interested in the payload, the monitoring
4773 callbacks can be configured to only provide the message headers (including
4774 the message type and size) instead of copying the full data stream to the
4775 monitoring client.
4776
4777 The init callback of the @code{GNUNET_CORE_connect} function is called
4778 with the hash of the public key of the peer. This public key is used to
4779 identify the peer globally in the GNUnet network. Applications are
4780 encouraged to check that the provided hash matches the hash that they are
4781 using (as theoretically the application may be using a different
4782 configuration file with a different private key, which would result in
hard-to-find bugs).
4784
4785 As with most service APIs, the CORE API isolates applications from crashes
4786 of the CORE service. If the CORE service crashes, the application will see
4787 disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.
4789
@cindex core client-service protocol
4791 @node The CORE Client-Service Protocol
4792 @subsection The CORE Client-Service Protocol
4793
4794
4795 This section describes the protocol between an application using the CORE
4796 service (the client) and the CORE service process itself.
4797
4798
4799 @menu
4800 * Setup2::
4801 * Notifications::
4802 * Sending::
4803 @end menu
4804
4805 @node Setup2
4806 @subsubsection Setup2
4807
4808
When a client connects to the CORE service, it first sends an
4810 @code{InitMessage} which specifies options for the connection and a set of
4811 message type values which are supported by the application. The options
4812 bitmask specifies which events the client would like to be notified about.
4813 The options include:
4814
@table @asis
@item GNUNET_CORE_OPTION_NOTHING
No notifications
@item GNUNET_CORE_OPTION_STATUS_CHANGE
Peers connecting and disconnecting
@item GNUNET_CORE_OPTION_FULL_INBOUND
All inbound messages (after decryption) with full payload
@item GNUNET_CORE_OPTION_HDR_INBOUND
Just the @code{MessageHeader} of all inbound messages
@item GNUNET_CORE_OPTION_FULL_OUTBOUND
All outbound messages (prior to encryption) with full payload
@item GNUNET_CORE_OPTION_HDR_OUTBOUND
Just the @code{MessageHeader} of all outbound messages
@end table
4827
4828 Typical applications will only monitor for connection status changes.
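Because the options form a bitmask, several of them can be combined in a
single value. The sketch below shows the idea with Python's
@code{enum.IntFlag}. The option names follow the table above, but the bit
values are placeholders chosen for this example and are not copied from the
GNUnet headers; the helper @code{wants} is hypothetical.

```python
import enum


class CoreOption(enum.IntFlag):
    # Placeholder bit values for illustration; check the GNUnet headers
    # for the real constants.
    NOTHING = 0
    STATUS_CHANGE = 1
    FULL_INBOUND = 2
    HDR_INBOUND = 4
    FULL_OUTBOUND = 8
    HDR_OUTBOUND = 16


def wants(options, flag):
    """True if the bitmask asks for the given kind of notification."""
    return bool(options & flag)


# A typical application only asks for connection status changes.
typical = CoreOption.STATUS_CHANGE
# A monitoring tool might add the headers of all traffic.
monitor = (CoreOption.STATUS_CHANGE
           | CoreOption.HDR_INBOUND
           | CoreOption.HDR_OUTBOUND)
print(int(typical), int(monitor))
```

The service side would test individual bits in the same way before deciding
which notifications to send to a client.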
4829
4830 The CORE service responds to the @code{InitMessage} with an
4831 @code{InitReplyMessage} which contains the peer's identity. Afterwards,
4832 both CORE and the client can send messages.
4833
4834 @node Notifications
4835 @subsubsection Notifications
4836
4837
CORE will send @code{ConnectNotifyMessage}s and
@code{DisconnectNotifyMessage}s whenever peers connect to or disconnect
from CORE (assuming their type maps overlap with the message types
registered by the client). When CORE receives a message that matches
the set of message types specified during the @code{InitMessage} (or if
monitoring is enabled for inbound messages in the options), it sends a
@code{NotifyTrafficMessage} with the peer identity of the sender and the
decrypted payload. The same message format (except with
@code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
used to notify clients monitoring outbound messages; here, the peer
identity given is that of the receiver.
4849
4850 @node Sending
4851 @subsubsection Sending
4852
4853
4854 When a client wants to transmit a message, it first requests a
4855 transmission slot by sending a @code{SendMessageRequest} which specifies
4856 the priority, deadline and size of the message. Note that these values
4857 may be ignored by CORE. When CORE is ready for the message, it answers
4858 with a @code{SendMessageReady} response. The client can then transmit the
4859 payload with a @code{SendMessage} message. Note that the actual message
4860 size in the @code{SendMessage} is allowed to be smaller than the size in
the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which supersedes the previous
@code{SendMessageRequest}; the previous request is then no longer valid.
The client can tell which @code{SendMessageRequest} the CORE service's
@code{SendMessageReady} message answers because all of these messages
contain a "unique" request ID (based on a counter incremented by the
client for each request).
4868
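The request ID bookkeeping described above can be sketched from the
client's side as follows. This is only an illustration, not the CORE
client library: @code{struct send_slot}, @code{send_slot_request} and
@code{send_slot_ready} are hypothetical names, and the 16-bit width of
the request ID is an assumption.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client-side bookkeeping for one target peer. */
struct send_slot {
  uint16_t next_id;     /* counter incremented for each request */
  uint16_t pending_id;  /* ID of the most recent SendMessageRequest */
  bool has_pending;     /* false until a request has been issued */
};

/* Issue a fresh request; any earlier request is superseded. */
uint16_t send_slot_request(struct send_slot *s)
{
  s->pending_id = s->next_id++;
  s->has_pending = true;
  return s->pending_id;
}

/* Returns true if a SendMessageReady carrying ready_id answers the
   latest request, in which case the payload may now be sent. */
bool send_slot_ready(struct send_slot *s, uint16_t ready_id)
{
  if (!s->has_pending || ready_id != s->pending_id)
    return false;       /* stale: it answers a superseded request */
  s->has_pending = false;
  return true;
}
```
@end verbatim

A client following this pattern simply ignores a
@code{SendMessageReady} whose ID belongs to a superseded request.
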
4869 @cindex CORE Peer-to-Peer Protocol
4870 @node The CORE Peer-to-Peer Protocol
4871 @subsection The CORE Peer-to-Peer Protocol
4872
4873
4874
4875 @menu
4876 * Creating the EphemeralKeyMessage::
4877 * Establishing a connection::
4878 * Encryption and Decryption::
4879 * Type maps::
4880 @end menu
4881
4882 @cindex EphemeralKeyMessage creation
4883 @node Creating the EphemeralKeyMessage
4884 @subsubsection Creating the EphemeralKeyMessage
4885
4886
4887 When the CORE service starts, each peer creates a fresh ephemeral (ECC)
4888 public-private key pair and signs the corresponding
4889 @code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity; the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
key is ONLY used for an ECDHE
(@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie-Hellman})
4894 exchange by the CORE service to establish symmetric session keys. A peer
4895 will use the same @code{EphemeralKeyMessage} for all peers for
4896 @code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
4897 will create a fresh ephemeral key (forgetting the old one) and broadcast
4898 the new @code{EphemeralKeyMessage} to all connected peers, resulting in
4899 fresh symmetric session keys. Note that peers independently decide on
4900 when to discard ephemeral keys; it is not a protocol violation to discard
4901 keys more often. Ephemeral keys are also never stored to disk; restarting
4902 a peer will thus always create a fresh ephemeral key. The use of ephemeral
4903 keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
4904
4905 Just before transmission, the @code{EphemeralKeyMessage} is patched to
reflect the current @code{sender_status}, which specifies the current state of
4907 the connection from the point of view of the sender. The possible values
4908 are:
4909
4910 @itemize @bullet
4911 @item @code{KX_STATE_DOWN} Initial value, never used on the network
4912 @item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
4913 key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} This peer has received a valid
ephemeral key of the other peer, but we are waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response.
4917 @item @code{KX_STATE_UP} The connection is fully up from the point of
4918 view of the sender (now performing keep-alives)
4919 @item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
4920 operation; the other peer has so far failed to confirm a working
4921 connection using the new ephemeral key
4922 @end itemize
4923
4924 @node Establishing a connection
4925 @subsubsection Establishing a connection
4926
4927
Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
When a peer receives an @code{EphemeralKeyMessage} whose status
indicates that the sender does not have the receiver's ephemeral key, the
receiver sends its own @code{EphemeralKeyMessage} in response.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge, and if it matches, set the
connection to @code{KX_STATE_UP}.
4941
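The reactions described above can be summarised in a small decision
function. The following is a simplified sketch, not the code in
@file{gnunet-service-core_kx.c}: @code{struct kx_reaction},
@code{kx_on_ephemeral_key} and @code{kx_on_pong} are hypothetical names,
the numeric enum values are arbitrary, and we assume that only
@code{KX_STATE_KEY_SENT} signals that the sender lacks our key.

@verbatim
```c
#include <stdbool.h>

/* State names as listed in the handbook; numeric values are arbitrary. */
enum kx_state {
  KX_STATE_DOWN,
  KX_STATE_KEY_SENT,
  KX_STATE_KEY_RECEIVED,
  KX_STATE_UP,
  KX_STATE_REKEY_SENT
};

/* What to send after receiving a valid EphemeralKeyMessage. */
struct kx_reaction {
  bool send_own_key;  /* answer with our own EphemeralKeyMessage */
  bool send_ping;     /* send an encrypted PingMessage challenge */
};

/* ours: our state for this peer; sender_status: state reported by
   the remote peer inside its EphemeralKeyMessage. */
struct kx_reaction kx_on_ephemeral_key(enum kx_state ours,
                                       enum kx_state sender_status)
{
  struct kx_reaction r;

  /* Assumption: only KEY_SENT means "the sender lacks our key". */
  r.send_own_key = (sender_status == KX_STATE_KEY_SENT);
  /* Challenge the sender unless we already consider the link up. */
  r.send_ping = (ours != KX_STATE_UP);
  return r;
}

/* A PongMessage brings the connection up only if it echoes the
   challenge we sent; otherwise the state is left unchanged. */
enum kx_state kx_on_pong(enum kx_state ours,
                         unsigned int sent_challenge,
                         unsigned int echoed_challenge)
{
  return (sent_challenge == echoed_challenge) ? KX_STATE_UP : ours;
}
```
@end verbatim
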
4942 @node Encryption and Decryption
4943 @subsubsection Encryption and Decryption
4944
4945
4946 All functions related to the key exchange and encryption/decryption of
4947 messages can be found in @file{gnunet-service-core_kx.c} (except for the
4948 cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a
@uref{https://en.wikipedia.org/wiki/Key_derivation_function, key derivation function}
is used to derive two pairs of encryption and decryption keys for AES-256
and Twofish, as well as initialization vectors and authentication keys
(for @uref{https://en.wikipedia.org/wiki/HMAC, HMAC}).
The HMAC is computed over the encrypted payload.
Encrypted messages include an @code{iv_seed} and the HMAC in the header.
4957
4958 Each encrypted message in the CORE service includes a sequence number and
4959 a timestamp in the encrypted payload. The CORE service remembers the
4960 largest observed sequence number and a bit-mask which represents which of
4961 the previous 32 sequence numbers were already used.
4962 Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} (5 minutes) off are also discarded. This of
4965 course means that system clocks need to be reasonably synchronized for
4966 peers to be able to communicate. Additionally, as the ephemeral key
4967 changes every 12 hours, a peer would not even be able to decrypt messages
4968 older than 12 hours.
4969
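The sequence number check described above is a sliding window over the
last 32 sequence numbers. A minimal sketch, assuming 32-bit sequence
numbers that do not wrap around, could look as follows;
@code{struct replay_window} and @code{replay_window_accept} are
hypothetical names and not taken from the CORE sources.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>

struct replay_window {
  bool started;        /* false until the first message is accepted */
  uint32_t last_seq;   /* largest sequence number observed so far */
  uint32_t seen_mask;  /* bit i set: (last_seq - 1 - i) was seen */
};

/* Returns true if seq is new and recent enough to be processed. */
bool replay_window_accept(struct replay_window *w, uint32_t seq)
{
  if (!w->started) {
    w->started = true;
    w->last_seq = seq;
    w->seen_mask = 0;
    return true;
  }
  if (seq == w->last_seq)
    return false;                      /* exact duplicate */
  if (seq > w->last_seq) {
    uint32_t shift = seq - w->last_seq;

    /* Shifting a 32-bit value by 32 or more is undefined in C. */
    if (shift > 32)
      w->seen_mask = 0;
    else if (shift == 32)
      w->seen_mask = UINT32_C(1) << 31;
    else
      w->seen_mask = (w->seen_mask << shift)
                     | (UINT32_C(1) << (shift - 1));
    w->last_seq = seq;
    return true;
  }
  uint32_t age = w->last_seq - seq;    /* at least 1 here */
  if (age > 32)
    return false;                      /* older than the window */
  uint32_t bit = UINT32_C(1) << (age - 1);
  if (w->seen_mask & bit)
    return false;                      /* already used */
  w->seen_mask |= bit;
  return true;
}
```
@end verbatim

The timestamp check is independent of this window and is not shown.
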
4970 @node Type maps
4971 @subsubsection Type maps
4972
4973
4974 Once an encrypted connection has been established, peers begin to exchange
4975 type maps. Type maps are used to allow the CORE service to determine which
4976 (encrypted) connections should be shown to which applications. A type map
4977 is an array of 65536 bits representing the different types of messages
4978 understood by applications using the CORE service. Each CORE service
4979 maintains this map, simply by setting the respective bit for each message
4980 type supported by any of the applications using the CORE service. Note
4981 that bits for message types embedded in higher-level protocols (such as
4982 MESH) will not be included in these type maps.
4983
4984 Typically, the type map of a peer will be sparse. Thus, the CORE service
4985 attempts to compress its type map using @code{gzip}-style compression
4986 ("deflate") prior to transmission. However, if the compression fails to
4987 compact the map, the map may also be transmitted without compression
4988 (resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
4989 @code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
4990 Upon receiving a type map, the respective CORE service notifies
4991 applications about the connection to the other peer if they support any
4992 message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
4994 application, it updates its type map (setting or unsetting the respective
4995 bits) and notifies its neighbours about the change.
4996 The CORE services of the neighbours then in turn generate connect and
4997 disconnect events for the peer that sent the type map for their respective
4998 applications. As CORE messages may be lost, the CORE service confirms
4999 receiving a type map by sending back a
5000 @code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
5001 (with the correct hash of the type map) is not received, the sender will
5002 retransmit the type map (with exponential back-off).
5003
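A type map as described above is just a bit array indexed by message
type. The following sketch shows one possible in-memory representation
and the check that decides whether an application should be told about
a peer. The names @code{struct type_map}, @code{type_map_set},
@code{type_map_test} and @code{type_map_app_interested} are
hypothetical; compression and the wire format are not modelled.

@verbatim
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_MAP_BITS  65536u
#define TYPE_MAP_WORDS (TYPE_MAP_BITS / 32u)

/* One bit per 16-bit message type. */
struct type_map {
  uint32_t bits[TYPE_MAP_WORDS];
};

/* Mark a message type as supported. */
void type_map_set(struct type_map *m, uint16_t type)
{
  m->bits[type / 32u] |= UINT32_C(1) << (type % 32u);
}

/* Check whether a message type is marked as supported. */
bool type_map_test(const struct type_map *m, uint16_t type)
{
  return ((m->bits[type / 32u] >> (type % 32u)) & 1u) != 0;
}

/* Should an application that registered the n types in app_types be
   notified about a peer advertising peer_map?  Applications that
   registered no type at all are always notified. */
bool type_map_app_interested(const struct type_map *peer_map,
                             const uint16_t *app_types, size_t n)
{
  if (n == 0)
    return true;
  for (size_t i = 0; i < n; i++)
    if (type_map_test(peer_map, app_types[i]))
      return true;
  return false;
}
```
@end verbatim

Clearing a bit when an application disconnects would additionally
require knowing that no other local application still supports that
type; this bookkeeping is omitted here.
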
5004 @cindex CADET Subsystem
5005 @cindex CADET
5006 @cindex cadet
5007 @node CADET Subsystem
5008 @section CADET Subsystem
5009
5010 The CADET subsystem in GNUnet is responsible for secure end-to-end
5011 communications between nodes in the GNUnet overlay network. CADET builds
5012 on the CORE subsystem which provides for the link-layer communication and
5013 then adds routing, forwarding and additional security to the connections.
5014 CADET offers the same cryptographic services as CORE, but on an
5015 end-to-end level. This is done so peers retransmitting traffic on behalf
5016 of other peers cannot access the payload data.
5017
5018 @itemize @bullet
5019 @item CADET provides confidentiality with so-called perfect forward
5020 secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
5021 use symmetric encryption, encrypting with both AES-256 and Twofish
5022 @item authentication is achieved by signing the ephemeral keys using
5023 Ed25519, a deterministic variant of ECDSA
5024 @item integrity protection (using SHA-512 to do encrypt-then-MAC, although
5025 only 256 bits are sent to reduce overhead)
5026 @item replay protection (using nonces, timestamps, challenge-response,
5027 message counters and ephemeral keys)
5028 @item liveness (keep-alive messages, timeout)
5029 @end itemize
5030
In addition to the CORE-like security benefits, CADET offers other
properties that make it a more universal service than CORE.
5033
5034 @itemize @bullet
5035 @item CADET can establish channels to arbitrary peers in GNUnet. If a
5036 peer is not immediately reachable, CADET will find a path through the
5037 network and ask other peers to retransmit the traffic on its behalf.
5038 @item CADET offers (optional) reliability mechanisms. In a reliable
5039 channel traffic is guaranteed to arrive complete, unchanged and in-order.
5040 @item CADET takes care of flow and congestion control mechanisms, not
5041 allowing the sender to send more traffic than the receiver or the network
is able to process.
5043 @end itemize
5044
5045 @menu
5046 * libgnunetcadet::
5047 @end menu
5048
5049 @cindex libgnunetcadet
5050 @node libgnunetcadet
5051 @subsection libgnunetcadet
5052
5053
5054 The CADET API (defined in @file{gnunet_cadet_service.h}) is the
5055 messaging API used by P2P applications built using GNUnet.
5056 It provides applications the ability to send and receive encrypted
5057 messages to any peer participating in GNUnet.
The API is heavily based on the CORE API.
5059
5060 CADET delivers messages to other peers in "channels".
5061 A channel is a permanent connection defined by a destination peer
5062 (identified by its public key) and a port number.
5063 Internally, CADET tunnels all channels towards a destination peer
5064 using one session key and relays the data on multiple "connections",
5065 independent from the channels.
5066
5067 Each channel has optional parameters, the most important being the
5068 reliability flag.
If a channel is created as reliable and a message gets lost at the
TRANSPORT/CORE level, CADET will retransmit the lost message and
deliver it in order to the destination application.
5072
5073 @pindex GNUNET_CADET_connect
5074 To communicate with other peers using CADET, it is necessary to first
5075 connect to the service using @code{GNUNET_CADET_connect}.
5076 This function takes several parameters in form of callbacks, to allow the
5077 client to react to various events, like incoming channels or channels that
5078 terminate, as well as specify a list of ports the client wishes to listen
5079 to (at the moment it is not possible to start listening on further ports
once connected, but nothing prevents a client from connecting several times
to CADET, even using one connection per listening port).
5082 The function returns a handle which has to be used for any further
5083 interaction with the service.
5084
5085 @pindex GNUNET_CADET_channel_create
5086 To connect to a remote peer, a client has to call the
5087 @code{GNUNET_CADET_channel_create} function. The most important parameters
given are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network and
establish the proper low-level connections and do the necessary key
exchanges to ensure authenticated, secure and verified communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
5094 returns a handle to interact with the created channel.
5095
5096 @pindex GNUNET_CADET_notify_transmit_ready
5097 For every message the client wants to send to the remote application,
5098 @code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
5099 channel on which the message should be sent and the size of the message
5100 (but not the message itself!). Once CADET is ready to send the message,
5101 the provided callback will fire, and the message contents are provided to
5102 this callback.
5103
Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback may give 0 bytes to CADET
if no message is to be sent.
5112
@pindex GNUNET_CADET_notify_transmit_ready_cancel
If a transmission was requested but is no longer needed before the
callback fires, it can be canceled with
5116 @code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
5117 given back by @code{GNUNET_CADET_notify_transmit_ready}.
5118 As in the case of CORE, only one message can be requested at a time: a
5119 client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
5120 the callback is called or the request is canceled.
5121
5122 @pindex GNUNET_CADET_channel_destroy
5123 When a channel is no longer needed, a client can call
5124 @code{GNUNET_CADET_channel_destroy} to get rid of it.
5125 Note that CADET will try to transmit all pending traffic before notifying
5126 the remote peer of the destruction of the channel, including
5127 retransmitting lost messages if the channel was reliable.
5128
5129 Incoming channels, channels being closed by the remote peer, and traffic
5130 on any incoming or outgoing channels are given to the client when CADET
5131 executes the callbacks given to it at the time of
5132 @code{GNUNET_CADET_connect}.
5133
5134 @pindex GNUNET_CADET_disconnect
5135 Finally, when an application no longer wants to use CADET, it should call
5136 @code{GNUNET_CADET_disconnect}, but first all channels and pending
5137 transmissions must be closed (otherwise CADET will complain).
5138
5139 @cindex NSE Subsystem
5140 @node NSE Subsystem
5141 @section NSE Subsystem
5142
5143
5144 NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
5145 other subsystems and users with a rough estimate of the number of peers
5146 currently participating in the GNUnet overlay.
5147 The computed value is not a precise number as producing a precise number
5148 in a decentralized, efficient and secure way is impossible.
5149 While NSE's estimate is inherently imprecise, NSE also gives the expected
5150 range. For a peer that has been running in a stable network for a
5151 while, the real network size will typically (99.7% of the time) be in the
5152 range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
5153 algorithm used to calculate the estimate;
5154 all of the details can be found in this technical report.
5155
5156 @c FIXME: link to the report.
5157
5158 @menu
5159 * Motivation::
5160 * Principle::
5161 * libgnunetnse::
5162 * The NSE Client-Service Protocol::
5163 * The NSE Peer-to-Peer Protocol::
5164 @end menu
5165
5166 @node Motivation
5167 @subsection Motivation
5168
5169
5170 Some subsystems, like DHT, need to know the size of the GNUnet network to
5171 optimize some parameters of their own protocol. The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of peers
5173 infeasible. Although there are several decentralized algorithms to count
5174 the number of peers in a system, so far there is none to do so securely.
5175 Other protocols may allow any malicious peer to manipulate the final
5176 result or to take advantage of the system to perform
5177 @dfn{Denial of Service} (DoS) attacks against the network.
5178 GNUnet's NSE protocol avoids these drawbacks.
5179
5180
5181
5182 @menu
5183 * Security::
5184 @end menu
5185
5186 @cindex NSE security
5187 @cindex nse security
5188 @node Security
5189 @subsubsection Security
5190
5191
5192 The NSE subsystem is designed to be resilient against these attacks.
5193 It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
5194 to prevent one peer from impersonating a large number of participants,
5195 which would otherwise allow an adversary to artificially inflate the
5196 estimate.
5197 The DoS protection comes from the time-based nature of the protocol:
5198 the estimates are calculated periodically and out-of-time traffic is
5199 either ignored or stored for later retransmission by benign peers.
5200 In particular, peers cannot trigger global network communication at will.
5201
5202 @cindex NSE principle
5203 @cindex nse principle
5204 @node Principle
5205 @subsection Principle
5206
5207
5208 The algorithm calculates the estimate by finding the globally closest
5209 peer ID to a random, time-based value.
5210
5211 The idea is that the closer the ID is to the random value, the more
5212 "densely packed" the ID space is, and therefore, more peers are in the
5213 network.
5214
5215
5216
5217 @menu
5218 * Example::
5219 * Algorithm::
5220 * Target value::
5221 * Timing::
5222 * Controlled Flooding::
5223 * Calculating the estimate::
5224 @end menu
5225
5226 @node Example
5227 @subsubsection Example
5228
5229
5230 Suppose all peers have IDs between 0 and 100 (our ID space), and the
5231 random value is 42.
If the closest peer has the ID 70 we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space. On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky, and there is only one peer,
which happens to have the ID 44. Thus, the current estimate is calculated
as the average over multiple rounds, and not just a single sample.
5240
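The intuition of this toy example can be written down as a one-line
calculation: divide the size of the ID space by the distance between the
target and the closest ID. The function name @code{toy_estimate} is
hypothetical, and this is only the intuition of the example, not the
formula NSE actually uses.

@verbatim
```c
/* Toy estimate from the example: size of the ID space divided by the
   distance between the target value and the closest peer ID. */
unsigned int toy_estimate(unsigned int space, unsigned int target,
                          unsigned int closest)
{
  unsigned int distance =
    closest > target ? closest - target : target - closest;

  if (distance == 0)
    return space;  /* cap: at most one peer per possible ID */
  return space / distance;
}
```
@end verbatim

With an ID space of 100 and a target of 42, a closest ID of 70 yields 3
and a closest ID of 44 yields 50, matching the numbers in the example.
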
5241 @node Algorithm
5242 @subsubsection Algorithm
5243
5244
5245 Given that example, one can imagine that the job of the subsystem is to
5246 efficiently communicate the ID of the closest peer to the target value
5247 to all the other peers, who will calculate the estimate from it.
5248
5249 @node Target value
5250 @subsubsection Target value
5251
5252
5253
5254 The target value itself is generated by hashing the current time, rounded
5255 down to an agreed value. If the rounding amount is 1h (default) and the
time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated once per rounding amount (in this example, every hour).
5258 Every repetition is called a round.
5259
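The rounding step can be sketched as follows, assuming the time is
expressed in seconds since midnight (or since any epoch aligned to the
round length). The function name @code{round_start} is hypothetical, and
the subsequent hashing of the rounded value is not shown.

@verbatim
```c
#include <stdint.h>

/* Round a timestamp down to the start of its round.  Both values are
   in seconds; interval must be non-zero. */
uint64_t round_start(uint64_t now, uint64_t interval)
{
  return now - (now % interval);
}
```
@end verbatim

For the example above, 12:34:56 is 45296 seconds after midnight; with a
round length of 3600 seconds the result is 43200, that is 12:00:00.
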
5260 @node Timing
5261 @subsubsection Timing
5262
5263
The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period,
let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger it will be earlier and if it is
smaller (the most likely case) it will be later. This ensures that the
peers closest to the target value start broadcasting their ID first.
5275
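To make the ordering concrete, the following sketch maps the difference
between a peer's hypothetical estimate and the previous estimate to a
delay within the round. The shape of the curve is purely an assumption
chosen for illustration; it is not the function used by the NSE service,
and the name @code{broadcast_delay} is hypothetical. Both estimates are
assumed to be given as base-2 logarithms of a network size.

@verbatim
```c
/* Delay, in seconds from the start of the round, before a peer starts
   flooding its own message.  Illustrative curve only. */
double broadcast_delay(double round_len, double own_log2,
                       double prev_log2)
{
  /* diff > 0: this peer is closer to the target than expected. */
  double diff = own_log2 - prev_log2;
  double mag = diff < 0.0 ? -diff : diff;
  /* Squash the difference into the open interval (-1, 1). */
  double skew = diff / (1.0 + mag);

  /* skew = 0 gives the middle of the round; a positive skew moves
     the broadcast earlier, a negative skew moves it later. */
  return (round_len / 2.0) * (1.0 - skew);
}
```
@end verbatim

Any monotonic function with the same three properties (midpoint for
equal estimates, earlier for larger ones, later for smaller ones) would
illustrate the idea equally well.
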
5276 @node Controlled Flooding
5277 @subsubsection Controlled Flooding
5278
5279
5280
When a peer receives a value, it first verifies that the value is closer
than the closest value it has seen so far; otherwise it answers the incoming
message with a message containing the better value. Then it checks a proof of
5284 work that must be included in the incoming message, to ensure that the
5285 other peer's ID is not made up (otherwise a malicious peer could claim to
5286 have an ID of exactly the target value every round). Once validated, it
5287 compares the broadcast time of the received value with the current time
5288 and if it's not too early, sends the received value to its neighbors.
5289 Otherwise it stores the value until the correct broadcast time comes.
5290 This prevents unnecessary traffic of sub-optimal values, since a better
5291 value can come before the broadcast time, rendering the previous one
5292 obsolete and saving the traffic that would have been used to broadcast it
5293 to the neighbors.
5294
5295 @node Calculating the estimate
5296 @subsubsection Calculating the estimate
5297
5298
5299
5300 Once the closest ID has been spread across the network each peer gets the
5301 exact distance between this ID and the target value of the round and
5302 calculates the estimate with a mathematical formula described in the tech
5303 report. The estimate generated with this method for a single round is not
very precise. Recall the example, where the only peer has the
ID 44 and we happen to generate the target value 42, which makes us think
there are 50 peers in the network. Therefore, the NSE subsystem remembers
the last 64 estimates and calculates an average over them, giving a result
which usually has one bit of uncertainty (the real size could be half of
the estimate or twice as much). Note that the actual network size is
5310 calculated in powers of two of the raw input, thus one bit of uncertainty
5311 means a factor of two in the size estimate.
5312
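Remembering the last 64 per-round estimates and averaging them can be
sketched with a small ring buffer. This sketch uses a plain arithmetic
mean over logarithmic samples; whether the service weights the samples
differently is not covered here. The names
@code{struct estimate_history}, @code{history_add} and
@code{history_mean} are hypothetical.

@verbatim
```c
#include <stddef.h>

#define HISTORY_SIZE 64

struct estimate_history {
  double samples[HISTORY_SIZE];  /* per-round estimates (log2 of size) */
  size_t count;                  /* valid samples, at most HISTORY_SIZE */
  size_t next;                   /* slot overwritten by the next sample */
};

/* Record the estimate of one round, dropping the oldest if full. */
void history_add(struct estimate_history *h, double sample)
{
  h->samples[h->next] = sample;
  h->next = (h->next + 1) % HISTORY_SIZE;
  if (h->count < HISTORY_SIZE)
    h->count++;
}

/* Arithmetic mean of the stored samples; 0.0 if there are none. */
double history_mean(const struct estimate_history *h)
{
  double sum = 0.0;

  if (h->count == 0)
    return 0.0;
  for (size_t i = 0; i < h->count; i++)
    sum += h->samples[i];
  return sum / (double) h->count;
}
```
@end verbatim
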
5313 @cindex libgnunetnse
5314 @node libgnunetnse
5315 @subsection libgnunetnse
5316
5317
5318
5319 The NSE subsystem has the simplest API of all services, with only two
5320 calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
5321
5322 The connect call gets a callback function as a parameter and this function
5323 is called each time the network agrees on an estimate. This usually is
5324 once per round, with some exceptions: if the closest peer has a late
5325 local clock and starts spreading its ID after everyone else agreed on a
5326 value, the callback might be activated twice in a round, the second value
5327 being always bigger than the first. The default round time is set to
5328 1 hour.
5329
5330 The disconnect call disconnects from the NSE subsystem and the callback
5331 is no longer called with new estimates.
5332
5333
5334
5335 @menu
5336 * Results::
5337 * libgnunetnse - Examples::
5338 @end menu
5339
5340 @node Results
5341 @subsubsection Results
5342
5343
5344
5345 The callback provides two values: the average and the
5346 @uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
of the last 64 rounds. The values provided by the callback function are
logarithmic; this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize
5360
The expected standard deviation for 64 rounds in a network of stable size
is 0.2. Thus, we can say that normally:
5363
5364 @itemize @bullet
5365 @item 68% of the time the real size is in the range [-13%, +15%]
5366 @item 95% of the time the real size is in the range [-24%, +32%]
5367 @item 99.7% of the time the real size is in the range [-34%, +52%]
5368 @end itemize
5369
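The percentages in the list above follow from raising 2 to plus or
minus one, two and three standard deviations of 0.2. As a worked
calculation, with values rounded to two decimals:

@verbatim
```latex
\begin{align*}
2^{-0.2} &\approx 0.87, & 2^{+0.2} &\approx 1.15
  && \text{(68\%: } -13\%,\ +15\% \text{)} \\
2^{-0.4} &\approx 0.76, & 2^{+0.4} &\approx 1.32
  && \text{(95\%: } -24\%,\ +32\% \text{)} \\
2^{-0.6} &\approx 0.66, & 2^{+0.6} &\approx 1.52
  && \text{(99.7\%: } -34\%,\ +52\% \text{)}
\end{align*}
```
@end verbatim
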
As said in the introduction, we can be quite sure that usually the real
size is between two thirds and one and a half times the estimate. This can
of course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size is
changing rapidly).
5377
5378 @node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples
5380
5381
5382
Let's close with a couple of examples.
5384
5385 @table @asis
5386
5387 @item Average: 10, std dev: 1 Here the estimate would be
5388 2^10 = 1024 peers. (The range in which we can be 95% sure is:
5389 [2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
5390 is not a hundred peers and absolutely sure that it is not a million peers,
5391 but somewhere around a thousand.)
5392
@item Average: 22, std dev: 0.2 Here the estimate would be
2^22 = 4 Million peers. (The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.3M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.)
5397
5398 @end table
5399
To put this in perspective, recall that the LHC Higgs boson
results were announced with "5 sigma" and "6 sigma" certainties. In this
case a 5 sigma minimum would be 2 million and a 6 sigma minimum,
1.8 million.
5404
5405 @node The NSE Client-Service Protocol
5406 @subsection The NSE Client-Service Protocol
5407
5408
5409
As with the API, the client-service protocol is very simple: it has only
two different messages, defined in @code{src/nse/nse.h}:
5412
5413 @itemize @bullet
5414 @item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
5415 and is sent from the client to the service upon connection.
5416 @item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
5417 the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
5419 deviation for the respective round.
5420 @end itemize
5421
5422 When the @code{GNUNET_NSE_disconnect} API call is executed, the client
5423 simply disconnects from the service, with no message involved.
5424
5425 @cindex NSE Peer-to-Peer Protocol
5426 @node The NSE Peer-to-Peer Protocol
5427 @subsection The NSE Peer-to-Peer Protocol
5428
5429
5430 @pindex GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD
5431 The NSE subsystem only has one message in the P2P protocol, the
5432 @code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
5433
The key contents of this message are the timestamp to identify the round
5435 (differences in system clocks may cause some peers to send messages way
5436 too early or way too late, so the timestamp allows other peers to
5437 identify such messages easily), the
5438 @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
5439 used to make it difficult to mount a
5440 @uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
5441 public key, which is used to verify the signature on the message.
5442
5443 Every peer stores a message for the previous, current and next round. The
5444 messages for the previous and current round are given to peers that
5445 connect to us. The message for the next round is simply stored until our
5446 system clock advances to the next round. The message for the current round
5447 is what we are flooding the network with right now.
5448 At the beginning of each round the peer does the following:
5449
5450 @itemize @bullet
5451 @item calculates its own distance to the target value
5452 @item creates, signs and stores the message for the current round (unless
5453 it has a better message in the "next round" slot which came early in the
5454 previous round)
5455 @item calculates, based on the stored round message (own or received) when
5456 to start flooding it to its neighbors
5457 @end itemize
5458
5459 Upon receiving a message the peer checks the validity of the message
5460 (round, proof of work, signature). The next action depends on the
5461 contents of the incoming message:
5462
5463 @itemize @bullet
5464 @item if the message is worse than the current stored message, the peer
5465 sends the current message back immediately, to stop the other peer from
5466 spreading suboptimal results
5467 @item if the message is better than the current stored message, the peer
5468 stores the new message and calculates the new target time to start
5469 spreading it to its neighbors (excluding the one the message came from)
5470 @item if the message is for the previous round, it is compared to the
5471 message stored in the "previous round slot", which may then be updated
5472 @item if the message is for the next round, it is compared to the message
5473 stored in the "next round slot", which again may then be updated
5474 @end itemize
5475
Finally, when it is time to send the stored message for the current round
to the neighbors, a random delay is added for each neighbor, to avoid
traffic spikes and minimize cross-messages.
5479
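The per-message decisions listed above can be sketched as a single
dispatch function. The sketch assumes that the message has already
passed validation (round, proof of work, signature), that rounds are
identified by a simple counter, and that a smaller distance to the
target means a better message. All names (@code{enum flood_action},
@code{struct flood_msg}, @code{struct round_slots} and
@code{flood_handle}) are hypothetical and not taken from the NSE
sources.

@verbatim
```c
#include <stdint.h>

enum flood_action {
  FLOOD_IGNORE,              /* nothing to do */
  FLOOD_REPLY_WITH_OURS,     /* send our better message back */
  FLOOD_STORE_AND_SCHEDULE,  /* store it and pick a new flood time */
  FLOOD_UPDATED_PREVIOUS,    /* "previous round" slot was updated */
  FLOOD_UPDATED_NEXT         /* "next round" slot was updated */
};

struct flood_msg {
  uint64_t round;     /* round counter derived from the timestamp */
  uint32_t distance;  /* distance to the target; smaller is better */
};

struct round_slots {
  uint64_t current_round;
  struct flood_msg prev, cur, next;  /* one stored message per round */
};

static int better(const struct flood_msg *a, const struct flood_msg *b)
{
  return a->distance < b->distance;
}

/* Decide what to do with an already validated incoming message. */
enum flood_action flood_handle(struct round_slots *s,
                               const struct flood_msg *m)
{
  if (m->round == s->current_round) {
    if (better(m, &s->cur)) {
      s->cur = *m;
      return FLOOD_STORE_AND_SCHEDULE;
    }
    if (better(&s->cur, m))
      return FLOOD_REPLY_WITH_OURS;
    return FLOOD_IGNORE;               /* equally good: nothing new */
  }
  if (m->round + 1 == s->current_round) {
    if (better(m, &s->prev)) {
      s->prev = *m;
      return FLOOD_UPDATED_PREVIOUS;
    }
    return FLOOD_IGNORE;
  }
  if (m->round == s->current_round + 1) {
    if (better(m, &s->next)) {
      s->next = *m;
      return FLOOD_UPDATED_NEXT;
    }
    return FLOOD_IGNORE;
  }
  return FLOOD_IGNORE;                 /* too old or too far ahead */
}
```
@end verbatim

An empty slot can be represented by a stored distance of
@code{UINT32_MAX}, so that any real message replaces it.
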
5480 @cindex HOSTLIST Subsystem
5481 @node HOSTLIST Subsystem
5482 @section HOSTLIST Subsystem
5483
5484
5485
5486 Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
5488 store and exchange peer addresses.
5489 GNUnet provides several methods for peers to obtain this information:
5490
5491 @itemize @bullet
5492 @item out-of-band exchange of HELLO messages (manually, using for example
5493 gnunet-peerinfo)
5494 @item HELLO messages shipped with GNUnet (automatic with distribution)
5495 @item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
5496 @item topology gossiping (learning from other peers we already connected
5497 to), and
5498 @item the HOSTLIST daemon covered in this section, which is particularly
5499 relevant for bootstrapping new peers.
5500 @end itemize
5501
5502 New peers have no existing connections (and thus cannot learn from gossip
5503 among peers), may not have other peers in their LAN and might be started
5504 with an outdated set of HELLO messages from the distribution.
5505 In this case, getting new peers to connect to the network requires either
5506 manual effort or the use of a HOSTLIST to obtain HELLOs.
5507
5508 @menu
5509 * HELLOs::
5510 * Overview for the HOSTLIST subsystem::
5511 * Interacting with the HOSTLIST daemon::
5512 * Hostlist security address validation::
5513 * The HOSTLIST daemon::
5514 * The HOSTLIST server::
5515 * The HOSTLIST client::
5516 * Usage::
5517 @end menu
5518
5519 @node HELLOs
5520 @subsection HELLOs
5521
5522
5523
The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards. Besides the identity of the peer (based on the cryptographic public
key), a HELLO message may contain address information that specifies ways
to contact a peer. By obtaining HELLO messages, a peer can learn how to
contact other peers.
5530
5531 @node Overview for the HOSTLIST subsystem
5532 @subsection Overview for the HOSTLIST subsystem
5533
5534
5535
The HOSTLIST subsystem provides a way to distribute and obtain contact
information to connect to other peers using a simple HTTP GET request.
Its implementation is split into three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.
5550
5551
5552 @menu
5553 * Features::
5554 * HOSTLIST - Limitations::
5555 @end menu
5556
5557 @node Features
5558 @subsubsection Features
5559
5560
5561
5562 The HOSTLIST daemon can:
5563
5564 @itemize @bullet
@item provide HELLO messages with validated addresses obtained from
PEERINFO for other peers to download
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist to other peers
via gossip
@item automatically learn about hostlist servers from the gossip of other
peers
5573 @end itemize
5574
5575 @node HOSTLIST - Limitations
5576 @subsubsection HOSTLIST - Limitations
5577
5578
5579
5580 The HOSTLIST daemon does not:
5581
5582 @itemize @bullet
5583 @item verify the cryptographic information in the HELLO messages
5584 @item verify the address information in the HELLO messages
5585 @end itemize
5586
5587 @node Interacting with the HOSTLIST daemon
5588 @subsection Interacting with the HOSTLIST daemon
5589
5590
5591
The HOSTLIST subsystem is currently implemented as a daemon, so there is
no need for the user to interact with it. Consequently, there is no
command line tool and no API to communicate with the daemon. In the
future, we can envision changing this to allow users to manually trigger
the download of a hostlist.

Since there is no command line interface to interact with HOSTLIST, the
only way to interact with it is to use STATISTICS to obtain or
modify information about the status of HOSTLIST:
5601
5602 @example
5603 $ gnunet-statistics -s hostlist
5604 @end example
5605
5606 @noindent
In particular, HOSTLIST includes a @strong{persistent} value in statistics
that specifies when the hostlist server might be queried next. As this
value increases exponentially during runtime, developers may want to
reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
daemon (as HOSTLIST does not monitor STATISTICS for changes to the
download frequency).
5614
5615 @node Hostlist security address validation
5616 @subsection Hostlist security address validation
5617
5618
5619
5620 Since information obtained from other parties cannot be trusted without
5621 validation, we have to distinguish between @emph{validated} and
5622 @emph{not validated} addresses. Before using (and so trusting)
5623 information from other parties, this information has to be double-checked
5624 (validated). Address validation is not done by HOSTLIST but by the
5625 TRANSPORT service.
5626
The HOSTLIST component is functionally located between the PEERINFO and
the TRANSPORT subsystem. When acting as a server, the daemon obtains valid
(@emph{validated}) peer information (HELLO messages) from the PEERINFO
service and provides it to other peers. When acting as a client, it
contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.
5634
5635 @cindex HOSTLIST daemon
5636 @node The HOSTLIST daemon
5637 @subsection The HOSTLIST daemon
5638
5639
5640
5641 The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5642 started by the ARM service and (if configured) starts the HOSTLIST client
5643 and server components.
5644
5645 @pindex GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
If the daemon provides a hostlist itself, it can advertise its own
hostlist to other peers. To do so, it sends a
@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service, and then forward these messages to the
HOSTLIST client. The client then uses all available URLs to download peer
information when necessary.

When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
this kind of message. Next, it starts (if configured) the client and
server. It passes them pointers to its connect, disconnect and receive
handler slots, in which the client and server store their own functions,
so that the daemon can notify them about CORE events.
5662
5663 To clean up on shutdown, the daemon has a cleaning task, shutting down all
5664 subsystems and disconnecting from CORE.
5665
5666 @cindex HOSTLIST server
5667 @node The HOSTLIST server
5668 @subsection The HOSTLIST server
5669
5670
5671
5672 The server provides a way for other peers to obtain HELLOs. Basically it
5673 is a small web server other peers can connect to and download a list of
5674 HELLOs using standard HTTP; it may also advertise the URL of the hostlist
5675 to other peers connecting on CORE level.
5676
5677
5678 @menu
5679 * The HTTP Server::
5680 * Advertising the URL::
5681 @end menu
5682
5683 @node The HTTP Server
5684 @subsubsection The HTTP Server
5685
5686
5687
During startup, the server starts a web server listening on the port
specified with the @code{HTTPPORT} value (default 8080). In addition, it
connects to the PEERINFO service to obtain peer information. The HOSTLIST
server uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for all peers and adds their information to a new hostlist if
they are suitable (expired addresses and HELLOs without addresses are both
not suitable) and the maximum size for a hostlist is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
When PEERINFO finishes (with a last @code{NULL} callback), the server
destroys the previous hostlist response available for download on the web
server and replaces it with the updated hostlist. The hostlist format is
basically a sequence of HELLO messages (as obtained from PEERINFO) without
any special tokenization. Since each HELLO message contains a size field,
the response can easily be split into separate HELLO messages by the
client.
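
The splitting step can be sketched as follows. This is a minimal,
self-contained illustration and not the code from
@file{hostlist-client.c}: it assumes that every message starts with the
usual 4-byte GNUnet message header (a 16-bit size covering header and
payload, followed by a 16-bit type, both in network byte order). The
names @code{split_hostlist} and @code{hello_cb} are hypothetical.

@example
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: receives one complete message. */
typedef void (*hello_cb) (void *cls, const uint8_t *msg, uint16_t size);

/* Walk a downloaded hostlist and hand each message to cb.
   Returns the number of complete messages, or -1 if a size field
   is smaller than the header itself. */
static int
split_hostlist (const uint8_t *buf, size_t len, hello_cb cb, void *cls)
@{
  size_t off = 0;
  int count = 0;

  while (len - off >= 4)
  @{
    /* The first two header bytes hold the size, big-endian. */
    uint16_t size = (uint16_t) ((buf[off] << 8) | buf[off + 1]);

    if (size < 4)
      return -1;            /* malformed header */
    if (size > len - off)
      break;                /* truncated trailing message */
    if (NULL != cb)
      cb (cls, &buf[off], size);
    off += size;
    count++;
  @}
  return count;
@}
@end example

A caller would pass the downloaded buffer together with a callback that
offers each message to TRANSPORT. A truncated trailing message is simply
not reported.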
5703
A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response, and the server will terminate the
connection with the result code @code{HTTP 200 OK}.
The connection will be closed immediately if no hostlist is available.
5708
5709 @node Advertising the URL
5710 @subsubsection Advertising the URL
5711
5712
5713
5714 The server also advertises the URL to download the hostlist to other peers
5715 if hostlist advertisement is enabled.
5716 When a new peer connects and has hostlist learning enabled, the server
5717 sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5718 peer using the CORE service.
5719
5720 @cindex HOSTLIST client
5721 @node The HOSTLIST client
5722 @subsection The HOSTLIST client
5723
5724
5725
5726 The client provides the functionality to download the list of HELLOs from
5727 a set of URLs.
5728 It performs a standard HTTP request to the URLs configured and learned
5729 from advertisement messages received from other peers. When a HELLO is
5730 downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5731 service for validation.
5732
5733 The client supports two modes of operation:
5734
5735 @itemize @bullet
5736 @item download of HELLOs (bootstrapping)
5737 @item learning of URLs
5738 @end itemize
5739
5740 @menu
5741 * Bootstrapping::
5742 * Learning::
5743 @end menu
5744
5745 @node Bootstrapping
5746 @subsubsection Bootstrapping
5747
5748
5749
For bootstrapping, the client schedules a task to download the hostlist
from the set of known URLs.
The downloads are only performed if the number of current
connections is smaller than a minimum number of connections
(at the moment 4).
The interval between downloads increases exponentially; however, the
exponential growth stops once the interval becomes longer than an hour.
At that point, the interval is capped at
(number of connections * 1h).
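
The scheduling rule can be sketched as follows. This is an illustration
under stated assumptions, not the daemon's code: the function names, the
millisecond unit, the initial delay of one second and the treatment of
zero connections as one (so that the delay never collapses to zero) are
all assumptions of this sketch. Only the threshold of 4 connections and
the cap of (number of connections * 1h) come from the text above.

@example
#include <stdint.h>

#define MIN_CONNECTIONS 4u              /* threshold from the text */
#define ONE_HOUR_MS (60ull * 60ull * 1000ull)

/* Return the delay (in ms) to use after current_ms.
   The delay doubles each round; once it exceeds one hour it is
   pinned to connections * 1h. */
static uint64_t
next_download_delay (uint64_t current_ms, unsigned int connections)
@{
  uint64_t next = (0 == current_ms) ? 1000 : 2 * current_ms;

  if (next > ONE_HOUR_MS)
    next = ONE_HOUR_MS * (connections > 0 ? connections : 1);
  return next;
@}

/* Downloads only happen while the peer is poorly connected. */
static int
should_download (unsigned int connections)
@{
  return connections < MIN_CONNECTIONS;
@}
@end example

With these definitions, a well-connected peer both stops downloading
(four or more connections) and, should it drop below the threshold again,
waits proportionally longer between attempts.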
5759
Once the decision has been taken to download HELLOs, the daemon chooses a
random URL from the list of known URLs. URLs can be specified in the
configuration or be learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function, which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded (@code{MAX_BYTES_PER_HOSTLISTS} =
500000).
When a full HELLO has been downloaded, the HOSTLIST client offers this
HELLO message to the TRANSPORT service for validation.
When the download has finished or failed, statistical information about
the quality of this URL is updated.
5772
5773 @cindex HOSTLIST learning
5774 @node Learning
5775 @subsubsection Learning
5776
5777
5778
5779 The client also manages hostlist advertisements from other peers. The
5780 HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5781 messages to the client subsystem, which extracts the URL from the message.
5782 Next, a test of the newly obtained URL is performed by triggering a
5783 download from the new URL. If the URL works correctly, it is added to the
5784 list of working URLs.
5785
The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, by the number of successful downloads and the
number of HELLOs obtained) is discarded. During shutdown, the list of URLs
is saved to a file for persistence; it is loaded again on startup. URLs
from the configuration file are never discarded.
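
The eviction rule can be sketched with a small bounded list. All names
here (@code{url_entry}, @code{url_list}, @code{url_list_add}), the limit
of 4 entries and the single numeric quality score are hypothetical. The
text above only states that the list is bounded, that the worst-ranked
URL is discarded, and that configured URLs are never discarded.

@example
#include <stddef.h>

#define MAX_URLS 4   /* illustrative limit, not the real value */

struct url_entry
@{
  const char *url;
  unsigned int quality;   /* higher is better (hypothetical score) */
  int from_config;        /* configured URLs are never discarded */
@};

struct url_list
@{
  struct url_entry entries[MAX_URLS];
  size_t count;
@};

/* Add e, evicting the lowest-quality learned URL if the list is
   full.  Returns 1 if e was stored, 0 if nothing could be evicted. */
static int
url_list_add (struct url_list *l, struct url_entry e)
@{
  size_t worst = MAX_URLS;   /* sentinel: no candidate found yet */

  if (l->count < MAX_URLS)
  @{
    l->entries[l->count++] = e;
    return 1;
  @}
  for (size_t i = 0; i < l->count; i++)
  @{
    if (l->entries[i].from_config)
      continue;
    if ( (MAX_URLS == worst) ||
         (l->entries[i].quality < l->entries[worst].quality) )
      worst = i;
  @}
  if (MAX_URLS == worst)
    return 0;
  l->entries[worst] = e;
  return 1;
@}
@end example

In this sketch a newly learned URL always replaces the worst learned
entry, even if its own score is lower; a real implementation may prefer
to compare the two first.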
5792
5793 @node Usage
5794 @subsection Usage
5795
5796
5797
To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
option of the ARM service. This is done in the default configuration.

For more information on how to configure the HOSTLIST subsystem see the
installation handbook:

@itemize @bullet
@item Configuring the hostlist to bootstrap
@item Configuring your peer to provide a hostlist
@end itemize
5805
5806 @cindex IDENTITY Subsystem
5807 @node IDENTITY Subsystem
5808 @section IDENTITY Subsystem
5809
5810
5811
5812 Identities of "users" in GNUnet are called egos.
5813 Egos can be used as pseudonyms ("fake names") or be tied to an
5814 organization (for example, "GNU") or even the actual identity of a human.
5815 GNUnet users are expected to have many egos. They might have one tied to
5816 their real identity, some for organizations they manage, and more for
5817 different domains where they want to operate under a pseudonym.
5818
The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does
not manage identities of other users (public keys). Public keys for other
5822 users need names to become manageable. GNUnet uses the
5823 @dfn{GNU Name System} (GNS) to give names to other users and manage their
5824 public keys securely. This chapter is about the IDENTITY service,
5825 which is about the management of private keys.
5826
5827 On the network, an ego corresponds to an ECDSA key (over Curve25519,
5828 using RFC 6979, as required by GNS). Thus, users can perform actions
5829 under a particular ego by using (signing with) a particular private key.
5830 Other users can then confirm that the action was really performed by that
5831 ego by checking the signature against the respective public key.
5832
5833 The IDENTITY service allows users to associate a human-readable name with
5834 each ego. This way, users can use names that will remind them of the
5835 purpose of a particular ego.
5836 The IDENTITY service will store the respective private keys and
5837 allows applications to access key information by name.
5838 Users can change the name that is locally (!) associated with an ego.
5839 Egos can also be deleted, which means that the private key will be removed
5840 and it thus will not be possible to perform actions with that ego in the
5841 future.
5842
5843 Additionally, the IDENTITY subsystem can associate service functions with
5844 egos.
5845 For example, GNS requires the ego that should be used for the shorten
5846 zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5847 The IDENTITY service has a mapping of such service strings to the name of
5848 the ego that the user wants to use for this service, for example
5849 "my-short-zone-ego".
5850
5851 Finally, the IDENTITY API provides access to a special ego, the
5852 anonymous ego. The anonymous ego is special in that its private key is not
5853 really private, but fixed and known to everyone.
Thus, anyone can perform actions as anonymous. This trick is useful
because code does not have to contain a special case to distinguish
between anonymous and pseudonymous egos.
5857
5858 @menu
5859 * libgnunetidentity::
5860 * The IDENTITY Client-Service Protocol::
5861 @end menu
5862
5863 @cindex libgnunetidentity
5864 @node libgnunetidentity
5865 @subsection libgnunetidentity
5866
5867
5868
5869 @menu
5870 * Connecting to the service::
5871 * Operations on Egos::
5872 * The anonymous Ego::
5873 * Convenience API to lookup a single ego::
5874 * Associating egos with service functions::
5875 @end menu
5876
5877 @node Connecting to the service
5878 @subsubsection Connecting to the service
5879
5880
5881
5882 First, typical clients connect to the identity service using
5883 @code{GNUNET_IDENTITY_connect}. This function takes a callback as a
5884 parameter.
5885 If the given callback parameter is non-null, it will be invoked to notify
5886 the application about the current state of the identities in the system.
5887
5888 @itemize @bullet
5889 @item First, it will be invoked on all known egos at the time of the
5890 connection. For each ego, a handle to the ego and the user's name for the
5891 ego will be passed to the callback. Furthermore, a @code{void **} context
5892 argument will be provided which gives the client the opportunity to
5893 associate some state with the ego.
5894 @item Second, the callback will be invoked with NULL for the ego, the name
5895 and the context. This signals that the (initial) iteration over all egos
5896 has completed.
5897 @item Then, the callback will be invoked whenever something changes about
5898 an ego.
5899 If an ego is renamed, the callback is invoked with the ego handle of the
5900 ego that was renamed, and the new name. If an ego is deleted, the callback
5901 is invoked with the ego handle and a name of NULL. In the deletion case,
5902 the application should also release resources stored in the context.
5903 @item When the application destroys the connection to the identity service
5904 using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
5905 with the ego and a name of NULL (equivalent to deletion of the egos).
5906 This should again be used to clean up the per-ego context.
5907 @end itemize
5908
5909 The ego handle passed to the callback remains valid until the callback is
5910 invoked with a name of NULL, so it is safe to store a reference to the
5911 ego's handle.
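
The callback contract described above can be illustrated with a
stand-alone sketch. It does not link against libgnunetidentity:
@code{struct Ego} is an opaque placeholder for the library's ego handle,
and @code{struct app}, @code{struct ego_state} and @code{ego_cb} are
hypothetical names. Only the argument order (closure, ego, context, name)
and the meaning of the @code{NULL} values follow the description above.

@example
#include <stdlib.h>
#include <string.h>

struct Ego;   /* opaque stand-in for the library's ego handle */

/* Per-ego state the application keeps in the context slot. */
struct ego_state
@{
  char *name;
@};

struct app
@{
  int initial_done;     /* set once the initial iteration is over */
  unsigned int known;   /* number of egos currently tracked */
@};

static void
ego_cb (void *cls, struct Ego *ego, void **ctx, const char *name)
@{
  struct app *app = cls;
  struct ego_state *st;

  if (NULL == ego)
  @{
    app->initial_done = 1;        /* end of the initial iteration */
    return;
  @}
  st = *ctx;
  if (NULL == name)
  @{
    /* Ego deleted, or we are disconnecting: release our state. */
    if (NULL != st)
    @{
      free (st->name);
      free (st);
      *ctx = NULL;
      app->known--;
    @}
    return;
  @}
  if (NULL == st)
  @{
    st = calloc (1, sizeof (*st));   /* first time we see this ego */
    if (NULL == st)
      return;
    *ctx = st;
    app->known++;
  @}
  free (st->name);                   /* new or renamed ego */
  st->name = malloc (strlen (name) + 1);
  if (NULL != st->name)
    strcpy (st->name, name);
@}
@end example

Keeping the per-ego state behind the context pointer means the same
branch handles both deletion and disconnect, as both arrive with a name
of @code{NULL}.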
5912
5913 @node Operations on Egos
5914 @subsubsection Operations on Egos
5915
5916
5917
5918 Given an ego handle, the main operations are to get its associated private
5919 key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
5920 public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
5921
5922 The other operations on egos are pretty straightforward.
5923 Using @code{GNUNET_IDENTITY_create}, an application can request the
5924 creation of an ego by specifying the desired name.
5925 The operation will fail if that name is
5926 already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
5927 existing ego can be changed. Finally, egos can be deleted using
5928 @code{GNUNET_IDENTITY_delete}. All of these operations will trigger
5929 updates to the callback given to the @code{GNUNET_IDENTITY_connect}
5930 function of all applications that are connected with the identity service
5931 at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
5932 operations before the respective continuations would be called.
5933 It is not guaranteed that the operation will not be completed anyway,
5934 only the continuation will no longer be called.
5935
5936 @node The anonymous Ego
5937 @subsubsection The anonymous Ego
5938
5939
5940
5941 A special way to obtain an ego handle is to call
5942 @code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5943 "anonymous" user --- anyone knows and can get the private key for this
5944 user, so it is suitable for operations that are supposed to be anonymous
5945 but require signatures (for example, to avoid a special path in the code).
5946 The anonymous ego is always valid and accessing it does not require a
5947 connection to the identity service.
5948
5949 @node Convenience API to lookup a single ego
5950 @subsubsection Convenience API to lookup a single ego
5951
5952
As applications commonly only have to look up a single ego, there is a
convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
look up a single ego by name. Note that this is the user's name for the
ego, not the service function. The resulting ego will be returned via a
callback and will only be valid during that callback. The operation can
be canceled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
(cancellation is only legal before the callback is invoked).
5960
5961 @node Associating egos with service functions
5962 @subsubsection Associating egos with service functions
5963
5964
5965 The @code{GNUNET_IDENTITY_set} function is used to associate a particular
5966 ego with a service function. The name used by the service and the ego are
5967 given as arguments.
Afterwards, the service can use its name to look up the associated ego
using @code{GNUNET_IDENTITY_get}.
5970
5971 @node The IDENTITY Client-Service Protocol
5972 @subsection The IDENTITY Client-Service Protocol
5973
5974
5975
5976 A client connecting to the identity service first sends a message with
5977 type
5978 @code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
5979 client will receive information about changes to the egos by receiving
5980 messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
5981 Those messages contain the private key of the ego and the user's name of
5982 the ego (or zero bytes for the name to indicate that the ego was deleted).
5983 A special bit @code{end_of_list} is used to indicate the end of the
5984 initial iteration over the identity service's egos.
5985
The client can trigger changes to the egos by sending @code{CREATE},
@code{RENAME} or @code{DELETE} messages.
The @code{CREATE} message contains the private key and the desired name.
The @code{RENAME} message contains the old name and the new name.
The @code{DELETE} message only needs to include the name of the ego to
delete.
The service responds to each of these messages with a @code{RESULT_CODE}
message which indicates success or error of the operation, and possibly
a human-readable error message.
5994
5995 Finally, the client can bind the name of a service function to an ego by
5996 sending a @code{SET_DEFAULT} message with the name of the service function
5997 and the private key of the ego.
5998 Such bindings can then be resolved using a @code{GET_DEFAULT} message,
5999 which includes the name of the service function. The identity service
6000 will respond to a GET_DEFAULT request with a SET_DEFAULT message
6001 containing the respective information, or with a RESULT_CODE to
6002 indicate an error.
6003
6004 @cindex NAMESTORE Subsystem
6005 @node NAMESTORE Subsystem
6006 @section NAMESTORE Subsystem
6007
The NAMESTORE subsystem provides persistent storage for local GNS zone
information. All local GNS zone information is managed by NAMESTORE. It
provides both the functionality to administer local GNS information (e.g.
delete and add records) as well as to retrieve GNS information (e.g. to
list name information in a client).
NAMESTORE only manages the persistent storage of zone information
belonging to the user running the service: GNS information from other
users obtained from the DHT is stored by the NAMECACHE subsystem.
6016
NAMESTORE uses a plugin-based database backend to store GNS information
with good performance. sqlite, MySQL and PostgreSQL are the supported
database backends.
NAMESTORE clients interact with the IDENTITY subsystem to obtain
cryptographic information about zones based on egos as described with the
IDENTITY subsystem, but internally NAMESTORE refers to zones using the
ECDSA private key.
In addition, it collaborates with the NAMECACHE subsystem: when local
information is modified, it stores the zone information in the GNS cache
to increase look-up performance for local information.

NAMESTORE provides functionality to look up and store records, to iterate
over a specific or all zones and to monitor zones for changes. NAMESTORE
functionality can be accessed using the NAMESTORE API or the NAMESTORE
command line tool.
6032
6033 @menu
6034 * libgnunetnamestore::
6035 @end menu
6036
6037 @cindex libgnunetnamestore
6038 @node libgnunetnamestore
6039 @subsection libgnunetnamestore
6040
To interact with NAMESTORE, clients first connect to the NAMESTORE service
using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
As a result, they obtain a NAMESTORE handle they can use for operations,
or @code{NULL} if the connection failed.

To disconnect from NAMESTORE, clients use
@code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.

NAMESTORE internally uses the ECDSA private key to refer to zones. These
private keys can be obtained from the IDENTITY subsystem.
Here, @emph{egos} can be used to refer to zones, or the default ego
assigned to the GNS subsystem can be used to obtain the master zone's
private key.
6054
6055
6056 @menu
6057 * Editing Zone Information::
6058 * Iterating Zone Information::
6059 * Monitoring Zone Information::
6060 @end menu
6061
6062 @node Editing Zone Information
6063 @subsubsection Editing Zone Information
6064
6065
6066
6067 NAMESTORE provides functions to lookup records stored under a label in a
6068 zone and to store records under a label in a zone.
6069
To store (and delete) records, the client uses the
@code{GNUNET_NAMESTORE_records_store} function and has to provide the
namestore handle to use, the private key of the zone, the label to store
the records under, the records and the number of records, plus a callback
function.
After the operation is performed, NAMESTORE will call the provided
callback function with the result: @code{GNUNET_SYSERR} on failure
(including timeout, queue drop or failure to validate), @code{GNUNET_NO}
if the content was already there or was not found, and @code{GNUNET_YES}
(or another positive value) on success, together with an additional error
message.

Records are deleted by using the store command with 0 records to store.
It is important to note that records are not merged when records already
exist under the label.
So a client first has to retrieve the existing records, merge them with
the new records and then store the result.
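
The merge step that the client has to perform itself can be sketched as
follows. This is a stand-alone illustration: @code{struct record} is a
simplified placeholder and not the library's record structure, and
treating two records as duplicates when type and value match is an
assumption of this sketch. The merged array is what the client would then
pass to a single store call for the label.

@example
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a GNS record. */
struct record
@{
  unsigned int type;
  const char *value;
@};

static int
record_equal (const struct record *a, const struct record *b)
@{
  return (a->type == b->type) && (0 == strcmp (a->value, b->value));
@}

/* Copy old followed by those entries of add that are not already
   present into out (capacity out_cap).  Returns the merged count,
   or (size_t) -1 if out is too small. */
static size_t
merge_records (const struct record *old, size_t n_old,
               const struct record *add, size_t n_add,
               struct record *out, size_t out_cap)
@{
  size_t n = 0;

  for (size_t i = 0; i < n_old; i++)
  @{
    if (n == out_cap)
      return (size_t) -1;
    out[n++] = old[i];
  @}
  for (size_t i = 0; i < n_add; i++)
  @{
    int dup = 0;

    for (size_t j = 0; j < n; j++)
      dup |= record_equal (&out[j], &add[i]);
    if (dup)
      continue;
    if (n == out_cap)
      return (size_t) -1;
    out[n++] = add[i];
  @}
  return n;
@}
@end example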
6086
To perform a lookup operation, the client uses the
@code{GNUNET_NAMESTORE_records_lookup} function. Here it has to pass the
namestore handle, the private key of the zone and the label. It also has
to provide a callback function which will be called with the result of
the lookup operation:
the zone for the records, the label, and the records together with the
number of records included.

A special operation is used to set the preferred nickname for a zone.
This nickname is stored with the zone and is automatically merged with
all labels and records stored in a zone. Here the client uses the
@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
the zone, the nickname as a string, plus a callback to be called with the
result of the operation.
6101
6102 @node Iterating Zone Information
6103 @subsubsection Iterating Zone Information
6104
6105
6106
A client can iterate over all information in a zone or all zones managed
by NAMESTORE.
Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
function and passes the namestore handle, the zone to iterate over and a
callback function to call with the result.
If the client wants to iterate over all zones, it passes @code{NULL} for
the zone.
A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
continue iteration.

NAMESTORE calls the callback for every result and expects the client to
call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
When NAMESTORE has reached the last item, it will call the callback with a
@code{NULL} value to indicate the end of the iteration.
6121
6122 @node Monitoring Zone Information
6123 @subsubsection Monitoring Zone Information
6124
6125
6126
Clients can also monitor zones to be notified about changes. Here the
client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
passes the private key of the zone and a callback function to call
with updates for a zone.
The client can request to first obtain the existing zone information by
iterating over the zone, and can specify a synchronization callback to be
called when the client and the namestore are synced.
6134
6135 On an update, NAMESTORE will call the callback with the private key of the
6136 zone, the label and the records and their number.
6137
6138 To stop monitoring, the client calls
6139 @code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
6140 from the function to start the monitoring.
6141
6142 @cindex PEERINFO Subsystem
6143 @node PEERINFO Subsystem
6144 @section PEERINFO Subsystem
6145
6146
6147
The PEERINFO subsystem is used to store verified (validated) information
about known peers in a persistent way. It obtains these addresses, for
example, from the TRANSPORT service, which is in charge of address
validation.
Validation means that the information in the HELLO message is checked by
connecting to the addresses and performing a cryptographic handshake to
authenticate the peer instance claiming to be reachable at these
addresses.
PEERINFO does not validate the HELLO messages itself but only stores them
and gives them to interested clients.

As future work, we are considering moving from storing just HELLO messages
to providing a generic persistent per-peer information store.
More and more subsystems tend to need to store per-peer information in
a persistent way.
To avoid duplicating this functionality, we plan to provide a PEERSTORE
service providing this functionality.
6164
6165 @menu
6166 * PEERINFO - Features::
6167 * PEERINFO - Limitations::
6168 * DeveloperPeer Information::
6169 * Startup::
6170 * Managing Information::
6171 * Obtaining Information::
6172 * The PEERINFO Client-Service Protocol::
6173 * libgnunetpeerinfo::
6174 @end menu
6175
6176 @node PEERINFO - Features
6177 @subsection PEERINFO - Features
6178
6179
6180
6181 @itemize @bullet
6182 @item Persistent storage
6183 @item Client notification mechanism on update
6184 @item Periodic clean up for expired information
6185 @item Differentiation between public and friend-only HELLO
6186 @end itemize
6187
6188 @node PEERINFO - Limitations
6189 @subsection PEERINFO - Limitations
6190
6191
6192 @itemize @bullet
6193 @item Does not perform HELLO validation
6194 @end itemize
6195
6196 @node DeveloperPeer Information
6197 @subsection DeveloperPeer Information
6198
6199
6200
The PEERINFO subsystem stores this information in the form of HELLO
messages, which you can think of as business cards.
These HELLO messages contain the public key of a peer and the addresses
under which the peer can be reached.
6205 The addresses include an expiration date describing how long they are
6206 valid. This information is updated regularly by the TRANSPORT service by
6207 revalidating the address.
6208 If an address is expired and not renewed, it can be removed from the
6209 HELLO message.
6210
Some peers do not want to have their HELLO messages distributed to other
peers, especially when GNUnet's friend-to-friend mode is enabled.
To prevent this undesired distribution, PEERINFO distinguishes between
@emph{public} and @emph{friend-only} HELLO messages.
Public HELLO messages can be freely distributed to other (possibly
unknown) peers (for example using the hostlist, gossiping, broadcasting),
whereas friend-only HELLO messages may not be distributed to other peers.
Friend-only HELLO messages have an additional flag @code{friend_only} set
internally. For public HELLO messages this flag is not set.
PEERINFO does not and cannot check if a client is allowed to obtain a
specific HELLO type.

The HELLO messages can be managed using the GNUnet HELLO library.
Other GNUnet subsystems can obtain this information from PEERINFO and use
it for their purposes.
Clients are, for example, the HOSTLIST component providing this
information to other peers in the form of a hostlist, or the TRANSPORT
subsystem using this information to maintain connections to other peers.
6229
6230 @node Startup
6231 @subsection Startup
6232
6233
6234
During startup, the PEERINFO service loads persistent HELLOs from disk.
First, PEERINFO parses the directory configured in the @code{HOSTS} value
of the @code{PEERINFO} configuration section, which is used to store
PEERINFO information.
For all files found in this directory, valid HELLO messages are extracted.
In addition, it loads HELLO messages shipped with the GNUnet distribution.
These HELLOs are used to simplify network bootstrapping by providing
valid peer information with the distribution.
The use of these HELLOs can be prevented by setting the
@code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration
section to @code{NO}. Files containing invalid information are removed.
6245
6246 @node Managing Information
6247 @subsection Managing Information
6248
6249
6250
The PEERINFO service stores information about known peers and a single
HELLO message for every peer.
A peer does not need to have a HELLO if no information is available.
HELLO information from different sources, for example a HELLO obtained
from a remote HOSTLIST and a second HELLO stored on disk, is merged
into a single HELLO message per peer, which is what clients receive.
During this merge process the HELLO is immediately written to
disk to ensure persistence.
6259
PEERINFO also periodically scans the directory where the information
is stored for empty HELLO messages with expired TRANSPORT addresses.
This periodic task scans all files in the directory and recreates the
HELLO messages it finds.
Expired TRANSPORT addresses are removed from the HELLO, and if the
HELLO no longer contains any valid address, it is discarded and removed
from disk.
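
The merge and expiry rules above can be sketched in plain C. This is a
toy model, not the PEERINFO implementation: the types
@code{toy_address} and @code{toy_hello} and the functions
@code{toy_merge} and @code{toy_expire} are hypothetical, addresses are
plain strings, and the rule that a duplicate address keeps the later
expiration time is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ADDRS 8

/* Hypothetical stand-in for one TRANSPORT address inside a HELLO. */
struct toy_address {
  char uri[64];        /* textual address */
  uint64_t expiration; /* absolute expiry time in seconds */
};

/* Hypothetical stand-in for the single HELLO kept per peer. */
struct toy_hello {
  struct toy_address addrs[TOY_MAX_ADDRS];
  size_t count;
};

/* Merge src into dst. A duplicate address keeps the later expiration
   (assumed rule); new addresses are appended while there is room. */
void toy_merge(struct toy_hello *dst, const struct toy_hello *src)
{
  assert(dst != NULL && src != NULL);
  for (size_t i = 0; i < src->count; i++) {
    size_t j;
    for (j = 0; j < dst->count; j++)
      if (0 == strcmp(dst->addrs[j].uri, src->addrs[i].uri))
        break;
    if (j < dst->count) {
      if (src->addrs[i].expiration > dst->addrs[j].expiration)
        dst->addrs[j].expiration = src->addrs[i].expiration;
    } else if (dst->count < TOY_MAX_ADDRS) {
      dst->addrs[dst->count++] = src->addrs[i];
    }
  }
}

/* Drop expired addresses and return the number of addresses left.
   A result of 0 means the HELLO would be discarded and removed from disk. */
size_t toy_expire(struct toy_hello *h, uint64_t now)
{
  size_t kept = 0;
  assert(h != NULL);
  for (size_t i = 0; i < h->count; i++)
    if (h->addrs[i].expiration > now)
      h->addrs[kept++] = h->addrs[i];
  h->count = kept;
  return kept;
}
```

A caller would merge every newly obtained HELLO into the stored one and
run the expiry step from a periodic task.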
6267
6268 @node Obtaining Information
6269 @subsection Obtaining Information
6270
6271
6272
When a client requests information from PEERINFO, PEERINFO performs a
lookup for the respective peer, or for all peers if desired, and transmits
this information to the client.
The client can specify whether friend-only HELLOs are to be included,
and PEERINFO filters the HELLO messages accordingly before transmitting
the information.
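
The lookup and filter step can be illustrated with a small, self-contained
C sketch. The type @code{toy_hello} and the functions
@code{toy_may_send} and @code{toy_lookup} are hypothetical and only model
the behaviour described above; peer identities are shortened to strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified HELLO: a peer name plus the friend_only flag. */
struct toy_hello {
  const char *peer;
  int friend_only;
};

/* Return 1 if the HELLO may be sent to a client with the given preference. */
int toy_may_send(const struct toy_hello *h, int include_friend_only)
{
  assert(h != NULL);
  return !h->friend_only || include_friend_only;
}

/* Count the HELLOs a lookup would transmit; peer == NULL means all peers. */
size_t toy_lookup(const struct toy_hello *hs, size_t n,
                  const char *peer, int include_friend_only)
{
  size_t sent = 0;
  for (size_t i = 0; i < n; i++) {
    if (NULL != peer && 0 != strcmp(hs[i].peer, peer))
      continue; /* not the requested peer */
    if (toy_may_send(&hs[i], include_friend_only))
      sent++;
  }
  return sent;
}
```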
6279
To notify clients about changes to PEERINFO information, PEERINFO
maintains a list of clients interested in these notifications.
Such a notification occurs if a HELLO for a peer was updated (due to a
merge, for example) or a new peer was added.
6284
6285 @node The PEERINFO Client-Service Protocol
6286 @subsection The PEERINFO Client-Service Protocol
6287
6288
6289
To connect to and disconnect from the PEERINFO service, PEERINFO
uses the util client/server infrastructure, so no special message
types are used here.
6293
6294 To add information for a peer, the plain HELLO message is transmitted to
6295 the service without any wrapping. All pieces of information required are
6296 stored within the HELLO message.
6297 The PEERINFO service provides a message handler accepting and processing
6298 these HELLO messages.
6299
When obtaining PEERINFO information using the iterate functionality,
specific messages are used. To obtain information for all peers, a
@code{struct ListAllPeersMessage} with message type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
@code{include_friend_only}, indicating whether friend-only HELLO messages
should be included, is transmitted. If information for a specific peer is
required, a @code{struct ListPeerMessage} with message type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
used.
6309
For both variants, the PEERINFO service replies, for each HELLO message it
wants to transmit, with a @code{struct InfoMessage} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
The final message is a @code{struct GNUNET_MessageHeader} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. Once the client receives this
message, it can proceed with the next request, if any is pending.
6316
6317 @node libgnunetpeerinfo
6318 @subsection libgnunetpeerinfo
6319
6320
6321
6322 The PEERINFO API consists mainly of three different functionalities:
6323
6324 @itemize @bullet
6325 @item maintaining a connection to the service
6326 @item adding new information to the PEERINFO service
6327 @item retrieving information from the PEERINFO service
6328 @end itemize
6329
6330 @menu
6331 * Connecting to the PEERINFO Service::
6332 * Adding Information to the PEERINFO Service::
6333 * Obtaining Information from the PEERINFO Service::
6334 @end menu
6335
6336 @node Connecting to the PEERINFO Service
6337 @subsubsection Connecting to the PEERINFO Service
6338
6339
6340
To connect to the PEERINFO service, call
@code{GNUNET_PEERINFO_connect}, which takes a configuration handle as
its argument. To disconnect from PEERINFO, call
@code{GNUNET_PEERINFO_disconnect} with the PEERINFO
handle returned by the connect function.
6346
6347 @node Adding Information to the PEERINFO Service
6348 @subsubsection Adding Information to the PEERINFO Service
6349
6350
6351
@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
storage. This function takes the PEERINFO handle, the HELLO
message to store, and a continuation with a closure to be called with the
result of the operation.
@code{GNUNET_PEERINFO_add_peer} returns a handle to this operation,
which allows the operation to be canceled with the respective cancel
function @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information
from PEERINFO, you can iterate over all information stored with PEERINFO,
or you can ask PEERINFO to notify you when new peer information becomes
available.
6361
6362 @node Obtaining Information from the PEERINFO Service
6363 @subsubsection Obtaining Information from the PEERINFO Service
6364
6365
6366
To iterate over information in PEERINFO, use
@code{GNUNET_PEERINFO_iterate}.
This function expects the PEERINFO handle, a flag indicating whether
HELLO messages intended for friend-only mode should be included, a timeout
limiting how long the operation may take, and a callback with a callback
closure to be called for the results.
If you want to obtain information for a specific peer, you can specify
the peer identity; if this identity is @code{NULL}, information for all
peers is returned. The function returns a handle that allows the operation
to be canceled using @code{GNUNET_PEERINFO_iterate_cancel}.
6377
To get notified when peer information changes, you can use
@code{GNUNET_PEERINFO_notify}.
This function expects a configuration handle and a flag indicating whether
friend-only HELLO messages should be included. The PEERINFO service
reports every change by invoking the callback function. The function
returns a handle that can be used to cancel notifications
with @code{GNUNET_PEERINFO_notify_cancel}.
6385
6386 @cindex PEERSTORE Subsystem
6387 @node PEERSTORE Subsystem
6388 @section PEERSTORE Subsystem
6389
6390
6391
6392 GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
6393 GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
6394 store and retrieve arbitrary data.
6395 Each data record stored with PEERSTORE contains the following fields:
6396
6397 @itemize @bullet
6398 @item subsystem: Name of the subsystem responsible for the record.
6399 @item peerid: Identity of the peer this record is related to.
6400 @item key: a key string identifying the record.
6401 @item value: binary record value.
6402 @item expiry: record expiry date.
6403 @end itemize
6404
6405 @menu
6406 * Functionality::
6407 * Architecture::
6408 * libgnunetpeerstore::
6409 @end menu
6410
6411 @node Functionality
6412 @subsection Functionality
6413
6414
6415
6416 Subsystems can store any type of value under a (subsystem, peerid, key)
6417 combination. A "replace" flag set during store operations forces the
6418 PEERSTORE to replace any old values stored under the same
6419 (subsystem, peerid, key) combination with the new value.
Additionally, an expiry date is set after which the record is
@emph{possibly} deleted by PEERSTORE.
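
The effect of the "replace" flag can be sketched with a small in-memory
model. Everything here is hypothetical (@code{toy_record},
@code{toy_store}, @code{toy_put}); in particular, the sketch assumes that
a store without the flag keeps earlier values under the same
combination, which the text above implies but does not state outright.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAPACITY 8

/* Hypothetical record; real values are binary and peerids are not strings. */
struct toy_record {
  const char *subsystem, *peerid, *key, *value;
  unsigned long long expiry;
};

struct toy_store {
  struct toy_record recs[TOY_CAPACITY];
  size_t count;
};

/* Return 1 if both records share the (subsystem, peerid, key) combination. */
int toy_same_slot(const struct toy_record *a, const struct toy_record *b)
{
  return 0 == strcmp(a->subsystem, b->subsystem)
         && 0 == strcmp(a->peerid, b->peerid)
         && 0 == strcmp(a->key, b->key);
}

/* Store rec. With replace set, older values under the same combination
   are dropped first. Returns 0 on success, -1 if the store is full. */
int toy_put(struct toy_store *st, struct toy_record rec, int replace)
{
  assert(st != NULL);
  if (replace) {
    size_t kept = 0;
    for (size_t i = 0; i < st->count; i++)
      if (!toy_same_slot(&st->recs[i], &rec))
        st->recs[kept++] = st->recs[i];
    st->count = kept;
  }
  if (st->count >= TOY_CAPACITY)
    return -1;
  st->recs[st->count++] = rec;
  return 0;
}
```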
6422
Subsystems can iterate over all values stored under any of the following
combinations of fields:
6425
6426 @itemize @bullet
6427 @item (subsystem)
6428 @item (subsystem, peerid)
6429 @item (subsystem, key)
6430 @item (subsystem, peerid, key)
6431 @end itemize
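
The four combinations amount to treating a missing peerid or key as a
wildcard. The following sketch shows that matching rule on an in-memory
array; @code{toy_record}, @code{toy_matches} and @code{toy_count} are
hypothetical names, and string peerids are a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: the real peerid is a binary peer
   identity and the real value is an arbitrary byte buffer. */
struct toy_record {
  const char *subsystem;
  const char *peerid;
  const char *key;
  const char *value;
};

/* Return 1 if rec matches the query. The subsystem is mandatory;
   a NULL peerid or key acts as a wildcard. */
int toy_matches(const struct toy_record *rec, const char *subsystem,
                const char *peerid, const char *key)
{
  assert(rec != NULL && subsystem != NULL);
  if (0 != strcmp(rec->subsystem, subsystem))
    return 0;
  if (peerid != NULL && 0 != strcmp(rec->peerid, peerid))
    return 0;
  if (key != NULL && 0 != strcmp(rec->key, key))
    return 0;
  return 1;
}

/* Count the records an iteration over the given combination would visit. */
size_t toy_count(const struct toy_record *recs, size_t n,
                 const char *subsystem, const char *peerid, const char *key)
{
  size_t hits = 0;
  for (size_t i = 0; i < n; i++)
    hits += (size_t) toy_matches(&recs[i], subsystem, peerid, key);
  return hits;
}
```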
6432
6433 Subsystems can also request to be notified about any new values stored
6434 under a (subsystem, peerid, key) combination by sending a "watch"
6435 request to PEERSTORE.
6436
6437 @node Architecture
6438 @subsection Architecture
6439
6440
6441
6442 PEERSTORE implements the following components:
6443
6444 @itemize @bullet
6445 @item PEERSTORE service: Handles store, iterate and watch operations.
6446 @item PEERSTORE API: API to be used by other subsystems to communicate and
6447 issue commands to the PEERSTORE service.
6448 @item PEERSTORE plugins: Handles the persistent storage. At the moment,
6449 only an "sqlite" plugin is implemented.
6450 @end itemize
6451
6452 @cindex libgnunetpeerstore
6453 @node libgnunetpeerstore
6454 @subsection libgnunetpeerstore
6455
6456
6457
6458 libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
6459 wishing to communicate with the PEERSTORE service use this API to open a
6460 connection to PEERSTORE. This is done by calling
6461 @code{GNUNET_PEERSTORE_connect} which returns a handle to the newly
6462 created connection.
6463 This handle has to be used with any further calls to the API.
6464
6465 To store a new record, the function @code{GNUNET_PEERSTORE_store} is to
6466 be used which requires the record fields and a continuation function that
6467 will be called by the API after the STORE request is sent to the
6468 PEERSTORE service.
6469 Note that calling the continuation function does not mean that the record
6470 is successfully stored, only that the STORE request has been successfully
6471 sent to the PEERSTORE service.
6472 @code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
6473 request only before the continuation function has been called.
6474
6475 To iterate over stored records, the function
6476 @code{GNUNET_PEERSTORE_iterate} is
6477 to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator
6478 callback function will be called with each matching record found and a
6479 NULL record at the end to signal the end of result set.
6480 @code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
6481 request before the iterator callback is called with a NULL record.
6482
To be notified of new values stored under a (subsystem, peerid, key)
combination, the function @code{GNUNET_PEERSTORE_watch} is to be used.
This registers the watcher with the PEERSTORE service; any new
records matching the given combination will trigger the callback
function passed to @code{GNUNET_PEERSTORE_watch}. This continues until
@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
service is destroyed.
6490
When the connection is no longer needed, the function
@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
PEERSTORE service.
Any pending ITERATE or WATCH requests will be destroyed.
If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
delay the disconnection until all pending STORE requests are sent to
the PEERSTORE service; otherwise, the pending STORE requests will be
destroyed as well.
6499
6500 @cindex SET Subsystem
6501 @node SET Subsystem
6502 @section SET Subsystem
6503
6504
6505
The SET service implements efficient set operations between two peers
over a CADET channel.
6508 Currently, set union and set intersection are the only supported
6509 operations. Elements of a set consist of an @emph{element type} and
6510 arbitrary binary @emph{data}.
6511 The size of an element's data is limited to around 62 KB.
6512
6513 @menu
6514 * Local Sets::
6515 * Set Modifications::
6516 * Set Operations::
6517 * Result Elements::
6518 * libgnunetset::
6519 * The SET Client-Service Protocol::
6520 * The SET Intersection Peer-to-Peer Protocol::
6521 * The SET Union Peer-to-Peer Protocol::
6522 @end menu
6523
6524 @node Local Sets
6525 @subsection Local Sets
6526
6527
6528
6529 Sets created by a local client can be modified and reused for multiple
6530 operations. As each set operation requires potentially expensive special
6531 auxiliary data to be computed for each element of a set, a set can only
6532 participate in one type of set operation (i.e. union or intersection).
6533 The type of a set is determined upon its creation.
If the elements of a set are needed for an operation of a different
type, all of the set's elements must be copied to a new set of the
appropriate type.
6537
6538 @node Set Modifications
6539 @subsection Set Modifications
6540
6541
6542
Even when set operations are active, one can add elements to and remove
elements from a set.
6545 However, these changes will only be visible to operations that have been
6546 created after the changes have taken place. That is, every set operation
6547 only sees a snapshot of the set from the time the operation was started.
6548 This mechanism is @emph{not} implemented by copying the whole set, but by
6549 attaching @emph{generation information} to each element and operation.
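
One way to picture generation information is the following sketch. It is
not the SET service's data structure: @code{toy_element},
@code{toy_set} and the functions below are hypothetical, and the exact
rule (an operation with generation g sees elements added at or before g
and not removed at or before g) is an assumption chosen to match the
snapshot behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element bookkeeping: generation numbers instead of copies. */
struct toy_element {
  int id;               /* stands in for the element's data */
  unsigned int added;   /* generation in which the element was added */
  unsigned int removed; /* generation of removal, 0 if still present */
};

struct toy_set {
  struct toy_element elems[16];
  size_t count;
  unsigned int generation; /* current generation, must start at 1 */
};

void toy_add(struct toy_set *s, int id)
{
  assert(s->count < 16);
  s->elems[s->count].id = id;
  s->elems[s->count].added = s->generation;
  s->elems[s->count].removed = 0;
  s->count++;
}

void toy_remove(struct toy_set *s, int id)
{
  for (size_t i = 0; i < s->count; i++)
    if (s->elems[i].id == id && 0 == s->elems[i].removed)
      s->elems[i].removed = s->generation;
}

/* Starting an operation freezes the current generation as its snapshot
   and advances the set so that later changes get a newer generation. */
unsigned int toy_start_operation(struct toy_set *s)
{
  return s->generation++;
}

/* Return 1 if the element belongs to the snapshot of generation op_gen. */
int toy_visible(const struct toy_element *e, unsigned int op_gen)
{
  return e->added <= op_gen && (0 == e->removed || e->removed > op_gen);
}

size_t toy_snapshot_size(const struct toy_set *s, unsigned int op_gen)
{
  size_t n = 0;
  for (size_t i = 0; i < s->count; i++)
    n += (size_t) toy_visible(&s->elems[i], op_gen);
  return n;
}
```

In this model no element is ever copied when an operation starts; each
operation only remembers one number.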
6550
6551 @node Set Operations
6552 @subsection Set Operations
6553
6554
6555
6556 Set operations can be started in two ways: Either by accepting an
6557 operation request from a remote peer, or by requesting a set operation
6558 from a remote peer.
6559 Set operations are uniquely identified by the involved @emph{peers}, an
6560 @emph{application id} and the @emph{operation type}.
6561
6562 The client is notified of incoming set operations by @emph{set listeners}.
6563 A set listener listens for incoming operations of a specific operation
6564 type and application id.
6565 Once notified of an incoming set request, the client can accept the set
6566 request (providing a local set for the operation) or reject it.
6567
6568 @node Result Elements
6569 @subsection Result Elements
6570
6571
6572
6573 The SET service has three @emph{result modes} that determine how an
6574 operation's result set is delivered to the client:
6575
6576 @itemize @bullet
@item @strong{Full Result Set.} All elements of the set resulting from the
set operation are returned to the client.
@item @strong{Added Elements.} Only elements that result from the
operation and are not already in the local peer's set are returned.
Note that for some operations (like set intersection) this result mode
will never return any elements.
This can be useful if only the remote peer is actually interested in
the result of the set operation.
@item @strong{Removed Elements.} Only elements that are in the local
peer's initial set but not in the operation's result set are returned.
Note that for some operations (like set union) this result mode will
never return any elements. This can be useful if only the remote peer is
actually interested in the result of the set operation.
6590 @end itemize
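
The relation between the three result modes can be written as two set
differences. The sketch below uses plain integer arrays and hypothetical
helper names (@code{toy_contains}, @code{toy_difference}); it says
nothing about how the service computes results internally.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if x occurs in set[0..n). */
int toy_contains(const int *set, size_t n, int x)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return 1;
  return 0;
}

/* Write every element of a that is not in b to out; return how many.
   "Added Elements"   = toy_difference(result, local)
   "Removed Elements" = toy_difference(local, result) */
size_t toy_difference(const int *a, size_t na,
                      const int *b, size_t nb, int *out)
{
  size_t n = 0;
  assert(out != NULL);
  for (size_t i = 0; i < na; i++)
    if (!toy_contains(b, nb, a[i]))
      out[n++] = a[i];
  return n;
}
```

With a local set of 1, 2, 3 and a remote set of 3, 4, the union yields one
added element (4) and no removed elements, while the intersection yields
no added elements and two removed elements (1 and 2).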
6591
6592 @cindex libgnunetset
6593 @node libgnunetset
6594 @subsection libgnunetset
6595
6596
6597
6598 @menu
6599 * Sets::
6600 * Listeners::
6601 * Operations::
6602 * Supplying a Set::
6603 * The Result Callback::
6604 @end menu
6605
6606 @node Sets
6607 @subsubsection Sets
6608
6609
6610
6611 New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6612 configuration (as each set has its own client connection) and the
6613 operation type must be specified.
6614 The set exists until either the client calls @code{GNUNET_SET_destroy} or
6615 the client's connection to the service is disrupted.
6616 In the latter case, the client is notified by the return value of
6617 functions dealing with sets. This return value must always be checked.
6618
6619 Elements are added and removed with @code{GNUNET_SET_add_element} and
6620 @code{GNUNET_SET_remove_element}.
6621
6622 @node Listeners
6623 @subsubsection Listeners
6624
6625
6626
Listeners are created with @code{GNUNET_SET_listen}. Each time a
remote peer suggests a set operation with an application id and operation
type matching a listener, the listener's callback is invoked.
The client then must synchronously call either @code{GNUNET_SET_accept}
or @code{GNUNET_SET_reject}. Note that the operation will not be started
until the client calls @code{GNUNET_SET_commit}
(@pxref{Supplying a Set}).
6634
6635 @node Operations
6636 @subsubsection Operations
6637
6638
6639
6640 Operations to be initiated by the local peer are created with
6641 @code{GNUNET_SET_prepare}. Note that the operation will not be started
6642 until the client calls @code{GNUNET_SET_commit}
(@pxref{Supplying a Set}).
6644
6645 @node Supplying a Set
6646 @subsubsection Supplying a Set
6647
6648
6649
To create symmetry between the two ways of starting a set operation
(accepting and initiating it), the operation handles returned by
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have a
set to operate on, so they cannot do any work yet.
6654
6655 The client must call @code{GNUNET_SET_commit} to specify a set to use for
6656 an operation. @code{GNUNET_SET_commit} may only be called once per set
6657 operation.
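
The two-step life cycle (obtain a handle, then commit a set exactly once)
can be modelled as below. This is a hypothetical state sketch, not the
libgnunetset API: @code{toy_operation}, @code{toy_prepare},
@code{toy_commit} and @code{toy_can_run} are invented names, and the
return codes are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation handle: created without a set. */
struct toy_operation {
  const void *set; /* NULL until a set has been committed */
};

/* Models accept/prepare: the returned handle cannot do any work yet. */
struct toy_operation toy_prepare(void)
{
  struct toy_operation op = { NULL };
  return op;
}

/* Attach a set to the operation. Returns 0 on success and -1 if a set
   was already committed, since commit may only be called once. */
int toy_commit(struct toy_operation *op, const void *set)
{
  assert(op != NULL && set != NULL);
  if (NULL != op->set)
    return -1;
  op->set = set;
  return 0;
}

/* An operation can only start working once it has a set. */
int toy_can_run(const struct toy_operation *op)
{
  return NULL != op->set;
}
```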
6658
6659 @node The Result Callback
6660 @subsubsection The Result Callback
6661
6662
6663
Clients must specify both a result mode and a result callback with
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result
callback is invoked with a status indicating either that an element was
received, or that the operation failed or succeeded.
6668 The interpretation of the received element depends on the result mode.
6669 The callback needs to know which result mode it is used in, as the
6670 arguments do not indicate if an element is part of the full result set,
6671 or if it is in the difference between the original set and the final set.
6672
6673 @node The SET Client-Service Protocol
6674 @subsection The SET Client-Service Protocol
6675
6676
6677
6678 @menu
6679 * Creating Sets::
6680 * Listeners2::
6681 * Initiating Operations::
6682 * Modifying Sets::
6683 * Results and Operation Status::
6684 * Iterating Sets::
6685 @end menu
6686
6687 @node Creating Sets
6688 @subsubsection Creating Sets
6689
6690
6691
6692 For each set of a client, there exists a client connection to the service.
6693 Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message
6694 over a new client connection. Multiple operations for one set are
6695 multiplexed over one client connection, using a request id supplied by
6696 the client.
6697
6698 @node Listeners2
6699 @subsubsection Listeners2
6700
6701
6702
Each listener also requires a separate client connection. By sending the
@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service
of the application id and operation type it is interested in. A client
rejects an incoming request by sending @code{GNUNET_SERVICE_SET_REJECT}
on the listener's client connection.
In contrast, when accepting an incoming request, a
@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
connection of the set that is supplied for the set operation.
6711
6712 @node Initiating Operations
6713 @subsubsection Initiating Operations
6714
6715
6716
Operations with remote peers are initiated by sending a
@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
connection over which this message is sent determines the set to use.
6720
6721 @node Modifying Sets
6722 @subsubsection Modifying Sets
6723
6724
6725
6726 Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6727 @code{GNUNET_SERVICE_SET_REMOVE} messages.
6728
6729
6734
6735 @node Results and Operation Status
6736 @subsubsection Results and Operation Status
6737
6738
6739 The service notifies the client of result elements and success/failure of
6740 a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6741
6742 @node Iterating Sets
6743 @subsubsection Iterating Sets
6744
6745
6746
6747 All elements of a set can be requested by sending
6748 @code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
6749 @code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
6750 iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
6751 After each received element, the client
6752 must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
6753 iteration may be active for a set at any given time.
6754
6755 @node The SET Intersection Peer-to-Peer Protocol
6756 @subsection The SET Intersection Peer-to-Peer Protocol
6757
6758
6759
The intersection protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
initiating the operation to the peer listening for inbound requests.
It includes the number of elements of the initiating peer, which is used
to decide which side will send a Bloom filter first.

The listening peer checks if the operation type and application
identifier are acceptable for its current state.
If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a
status of @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET
channel).
6770
If the application accepts the request, the listener sends back a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
more elements in the set than the initiator.
Otherwise, it immediately starts with the Bloom filter exchange.
If the initiator receives a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
it begins the Bloom filter exchange, unless the set size is indicated to
be zero, in which case the intersection is considered finished after
just the initial handshake.
6780
6781
6782 @menu
6783 * The Bloom filter exchange::
6784 * Salt::
6785 @end menu
6786
6787 @node The Bloom filter exchange
6788 @subsubsection The Bloom filter exchange
6789
6790
6791
6792 In this phase, each peer transmits a Bloom filter over the remaining
6793 keys of the local set to the other peer using a
6794 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
6795 message additionally includes the number of elements left in the sender's
6796 set, as well as the XOR over all of the keys in that set.
6797
6798 The number of bits 'k' set per element in the Bloom filter is calculated
6799 based on the relative size of the two sets.
6800 Furthermore, the size of the Bloom filter is calculated based on 'k' and
6801 the number of elements in the set to maximize the amount of data filtered
6802 per byte transmitted on the wire (while avoiding an excessively high
6803 number of iterations).
6804
6805 The receiver of the message removes all elements from its local set that
6806 do not pass the Bloom filter test.
6807 It then checks if the set size of the sender and the XOR over the keys
6808 match what is left of its own set. If they do, it sends a
6809 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
6810 that the latest set is the final result.
6811 Otherwise, the receiver starts another Bloom filter exchange, except
6812 this time as the sender.
6813
6814 @node Salt
6815 @subsubsection Salt
6816
6817
6818
Bloom filter operations are probabilistic: with some non-zero probability
the test may incorrectly say an element is in the set, even though it is
not.
6822
To mitigate this problem, the intersection protocol iterates, exchanging
Bloom filters using a different random 32-bit salt in each iteration (the
salt is also included in the message).
With different salts, the Bloom filter test may give a false positive for
different elements.
Combining the results of the iterations, the probability of failure
approaches zero.
6829
6830 The iterations terminate once both peers have established that they have
6831 sets of the same size, and where the XOR over all keys computes the same
512-bit value (leaving a failure probability of @math{2^{-511}}).
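
The exchange described in the two subsections above can be sketched with a
toy Bloom filter. This is not the wire protocol: the filter is only 64
bits wide, keys are 64-bit integers instead of 512-bit hashes, the number
of bits per element is fixed at 3 instead of being derived from the set
sizes, the salts are consecutive numbers instead of random values, and
all @code{toy_} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_K 3 /* bits set per element (fixed here for simplicity) */

/* Salted 64-bit mixer; stands in for the real hash function. */
uint64_t toy_mix(uint64_t key, uint32_t salt, unsigned int i)
{
  uint64_t z = key + 0x9e3779b97f4a7c15ULL * ((uint64_t) salt + i + 1);
  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
  z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
  return z ^ (z >> 31);
}

/* Build a 64-bit Bloom filter over keys[0..n). */
uint64_t toy_bf_build(const uint64_t *keys, size_t n, uint32_t salt)
{
  uint64_t bf = 0;
  for (size_t j = 0; j < n; j++)
    for (unsigned int i = 0; i < TOY_K; i++)
      bf |= 1ULL << (toy_mix(keys[j], salt, i) % 64);
  return bf;
}

/* Return 1 if key may be in the filter, 0 if it is definitely not. */
int toy_bf_test(uint64_t bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < TOY_K; i++)
    if (0 == (bf & (1ULL << (toy_mix(key, salt, i) % 64))))
      return 0;
  return 1;
}

/* Receiver side: drop all keys that fail the test; return the new size. */
size_t toy_filter(uint64_t *keys, size_t n, uint64_t bf, uint32_t salt)
{
  size_t kept = 0;
  for (size_t j = 0; j < n; j++)
    if (toy_bf_test(bf, keys[j], salt))
      keys[kept++] = keys[j];
  return kept;
}

/* XOR over all remaining keys, used as the agreement check. */
uint64_t toy_xor(const uint64_t *keys, size_t n)
{
  uint64_t x = 0;
  for (size_t j = 0; j < n; j++)
    x ^= keys[j];
  return x;
}

/* Alternate sender and receiver with a new salt per round until both
   sides agree on size and XOR, or max_rounds is exhausted.
   Returns 1 on agreement, 0 otherwise. */
int toy_intersect(uint64_t *a, size_t *na, uint64_t *b, size_t *nb,
                  unsigned int max_rounds)
{
  assert(a != NULL && b != NULL && na != NULL && nb != NULL);
  for (unsigned int round = 0; round < max_rounds; round++) {
    uint32_t salt = 1000u + round;
    if (0 == round % 2)
      *nb = toy_filter(b, *nb, toy_bf_build(a, *na, salt), salt);
    else
      *na = toy_filter(a, *na, toy_bf_build(b, *nb, salt), salt);
    if (*na == *nb && toy_xor(a, *na) == toy_xor(b, *nb))
      return 1;
  }
  return 0;
}
```

Because a Bloom filter never rejects an element that was inserted,
elements common to both sets survive every round; only false positives
need further rounds to be eliminated. With 64-bit toy keys the XOR check
is far weaker than the 512-bit check of the real protocol.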
6833
6834 @node The SET Union Peer-to-Peer Protocol
6835 @subsection The SET Union Peer-to-Peer Protocol
6836
6837
6838
The SET union protocol is based on efficient set reconciliation
without prior context, as described in the paper by Eppstein et al.
You should read this paper first if you want to understand the protocol.
6842
The union protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
initiating the operation to the peer listening for inbound requests.
It includes the number of elements of the initiating peer, which is
currently not used.
6848
6849 The listening peer checks if the operation type and application
6850 identifier are acceptable for its current state. If not, it responds with
6851 a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
6852 @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6853
If the application accepts the request, the listener sends back a strata
estimator using a message of type
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The
initiator evaluates the strata estimator and initiates the exchange of
invertible Bloom filters (IBFs), sending a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6858
During the IBF exchange, if the receiver cannot invert the Bloom filter or
detects a cycle, it sends a larger IBF in response (up to a defined
maximum limit; if that limit is reached, the operation fails).
Elements decoded while processing the IBF are transmitted to the other
peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from
the other peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
messages, depending on the sign observed during decoding of the IBF.
Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
message with the respective element in a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}
message. If the IBF fully decodes, the peer responds with a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of another
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6871
6872 All Bloom filter operations use a salt to mingle keys before hashing them
6873 into buckets, such that future iterations have a fresh chance of
6874 succeeding if they failed due to collisions before.
6875
6876 @cindex STATISTICS Subsystem
6877 @node STATISTICS Subsystem
6878 @section STATISTICS Subsystem
6879
6880
6881
6882 In GNUnet, the STATISTICS subsystem offers a central place for all
6883 subsystems to publish unsigned 64-bit integer run-time statistics.
6884 Keeping this information centrally means that there is a unified way for
6885 the user to obtain data on all subsystems, and individual subsystems do
6886 not have to always include a custom data export method for performance
6887 metrics and other statistics. For example, the TRANSPORT system uses
6888 STATISTICS to update information about the number of directly connected
6889 peers and the bandwidth that has been consumed by the various plugins.
6890 This information is valuable for diagnosing connectivity and performance
6891 issues.
6892
6893 Following the GNUnet service architecture, the STATISTICS subsystem is
6894 divided into an API which is exposed through the header
6895 @strong{gnunet_statistics_service.h} and the STATISTICS service
6896 @strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
6897 command-line tool can be used to obtain (and change) information about
6898 the values stored by the STATISTICS service. The STATISTICS service does
6899 not communicate with other peers.
6900
Data is stored in the STATISTICS service in the form of tuples
@strong{(subsystem, name, value, persistence)}. The subsystem identifies
the GNUnet subsystem to which the data belongs. The name is the label
under which the value is stored; it uniquely identifies the record
among the other records belonging to the same subsystem.
In some parts of the code, the pair @strong{(subsystem, name)} is called
a @strong{statistic} as it identifies the values stored in the STATISTICS
service. The persistence flag determines if the record has to be preserved
across service restarts. A record is said to be persistent if this flag
is set for it; if not, the record is treated as a non-persistent record
and it is lost after service restart. Persistent records are written to
the file @strong{statistics.data} before shutdown and read from it
upon startup. The file is located in the HOME directory of the peer.
6914
6915 An anomaly of the STATISTICS service is that it does not terminate
6916 immediately upon receiving a shutdown signal if it has any clients
6917 connected to it. It waits for all the clients that are not monitors to
6918 close their connections before terminating itself.
6919 This is to prevent the loss of data during peer shutdown --- delaying the
6920 STATISTICS service shutdown helps other services to store important data
6921 to STATISTICS during shutdown.
6922
6923 @menu
6924 * libgnunetstatistics::
6925 * The STATISTICS Client-Service Protocol::
6926 @end menu
6927
6928 @cindex libgnunetstatistics
6929 @node libgnunetstatistics
6930 @subsection libgnunetstatistics
6931
6932
6933
@strong{libgnunetstatistics} is the library containing the API for the
STATISTICS subsystem. Any process that needs to use STATISTICS should use
this API to open a connection to the STATISTICS service.
This is done by calling the function @code{GNUNET_STATISTICS_create()}.
This function takes the name of the subsystem that is trying to use
STATISTICS and a configuration.
6940 All values written to STATISTICS with this connection will be placed in
6941 the section corresponding to the given subsystem's name.
6942 The connection to STATISTICS can be destroyed with the function
6943 @code{GNUNET_STATISTICS_destroy()}. This function allows for the
6944 connection to be destroyed immediately or upon transferring all
6945 pending write requests to the service.
6946
Note: The STATISTICS subsystem can be disabled by setting
@code{DISABLE = YES} under the @code{[STATISTICS]} section in the
configuration. With such a configuration, all calls to
@code{GNUNET_STATISTICS_create()} return @code{NULL}, as the STATISTICS
subsystem is unavailable, and no other functions from the API can be used.
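
As a configuration fragment, the setting described in the note above
would look roughly like this (a sketch showing only this one option):

```
[STATISTICS]
DISABLE = YES
```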
6952
6953
6954 @menu
6955 * Statistics retrieval::
6956 * Setting statistics and updating them::
6957 * Watches::
6958 @end menu
6959
6960 @node Statistics retrieval
6961 @subsubsection Statistics retrieval
6962
6963
6964
Once a connection to the statistics service is obtained, information
about any other system which uses statistics can be retrieved with the
function @code{GNUNET_STATISTICS_get()}.
This function takes the connection handle, the name of the subsystem
whose information we are interested in (a @code{NULL} value will
retrieve information of all available subsystems using STATISTICS), the
name of the statistic we are interested in (a @code{NULL} value will
retrieve all available statistics), a continuation callback which is
called when all of the requested information is retrieved, an iterator
callback which is called for each parameter in the retrieved information,
and a closure for the aforementioned callbacks. The library then invokes
the iterator callback for each value matching the request.
6977
A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
This is helpful when retrieving statistics takes too long and especially
when we want to shut down and clean up everything.
6982
6983 @node Setting statistics and updating them
6984 @subsubsection Setting statistics and updating them
6985
6986
6987
So far we have seen how to retrieve statistics. Here we will learn how to
set statistics and update them so that other subsystems can retrieve
them.
6991
6992 A new statistic can be set using the function
6993 @code{GNUNET_STATISTICS_set()}.
6994 This function takes the name of the statistic and its value and a flag to
6995 make the statistic persistent.
6996 The value of the statistic should be of the type @code{uint64_t}.
6997 The function does not take the name of the subsystem; it is determined
6998 from the previous @code{GNUNET_STATISTICS_create()} invocation. If
6999 the given statistic is already present, its value is overwritten.
7000
An existing statistic can be updated, i.e., its value can be increased or
decreased by an amount with the function
@code{GNUNET_STATISTICS_update()}.
The parameters to this function are similar to those of
@code{GNUNET_STATISTICS_set()}, except that it takes the amount of the
change as an @code{int64_t} instead of the new value.
7007
7008 The library will combine multiple set or update operations into one
7009 message if the client performs requests at a rate that is faster than the
7010 available IPC with the STATISTICS service. Thus, the client does not have
7011 to worry about sending requests too quickly.
7012
7013 @node Watches
7014 @subsubsection Watches
7015
7016
7017
An interesting feature of STATISTICS is that it can serve notifications
whenever a statistic of interest is modified.
7020 This is achieved by registering a watch through the function
7021 @code{GNUNET_STATISTICS_watch()}.
7022 The parameters of this function are similar to those of
7023 @code{GNUNET_STATISTICS_get()}.
7024 Changes to the respective statistic's value will then cause the given
7025 iterator callback to be called.
7026 Note: A watch can only be registered for a specific statistic. Hence
7027 the subsystem name and the parameter name cannot be @code{NULL} in a
7028 call to @code{GNUNET_STATISTICS_watch()}.
7029
A registered watch keeps delivering notifications about value changes
until @code{GNUNET_STATISTICS_watch_cancel()} is called with the same
parameters that were used for registering the watch.
7033
7034 @node The STATISTICS Client-Service Protocol
7035 @subsection The STATISTICS Client-Service Protocol
7036
7037
7038
7039 @menu
7040 * Statistics retrieval2::
7041 * Setting and updating statistics::
7042 * Watching for updates::
7043 @end menu
7044
7045 @node Statistics retrieval2
7046 @subsubsection Statistics retrieval2
7047
7048
7049
7050 To retrieve statistics, the client transmits a message of type
7051 @code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem
7052 name and statistic parameter to the STATISTICS service.
The service responds with a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each statistic
parameter that matches the client request. The end of the retrieved
information is signaled by the service by sending a message of
type @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
7058
7059 @node Setting and updating statistics
7060 @subsubsection Setting and updating statistics
7061
7062
7063
7064 The subsystem name, parameter name, its value and the persistence flag are
7065 communicated to the service through the message
7066 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
7067
When the service receives a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
name and checks for a statistic parameter matching the name given in
the message.
If a statistic parameter is found, the value is overwritten by the new
value from the message; if not, a new statistic parameter is
created with the given name and value.
7075
7076 In addition to just setting an absolute value, it is possible to perform a
7077 relative update by sending a message of type
7078 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
7079 (@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in
7080 the message should be treated as an update value.
7081
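The difference between an absolute set and a relative update can be
sketched with a small model of the service side. Everything here is an
assumption made for illustration: @code{struct model_service},
@code{handle_set()} and @code{MODEL_SETFLAG_RELATIVE} are hypothetical
names, the fixed-size table is a simplification, and clamping at zero
when a negative update exceeds the stored value is a choice of this
sketch rather than documented behaviour.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_STATS 16
#define MODEL_NAME_LEN 64

/* Hypothetical flag bit standing in for the relative-update flag. */
#define MODEL_SETFLAG_RELATIVE 1u

struct model_stat
{
  char subsystem[MODEL_NAME_LEN];
  char name[MODEL_NAME_LEN];
  uint64_t value;
};

struct model_service
{
  struct model_stat stats[MODEL_MAX_STATS];
  size_t used;
};

/* Find the statistic for (subsystem, name); create it with value 0 if it
   does not exist yet.  Returns NULL if the model runs out of space. */
static struct model_stat *
find_or_create (struct model_service *svc,
                const char *subsystem,
                const char *name)
{
  struct model_stat *s;

  for (size_t i = 0; i < svc->used; i++)
  {
    s = &svc->stats[i];
    if ((0 == strcmp (s->subsystem, subsystem)) &&
        (0 == strcmp (s->name, name)))
      return s;
  }
  if ((svc->used == MODEL_MAX_STATS) ||
      (strlen (subsystem) >= MODEL_NAME_LEN) ||
      (strlen (name) >= MODEL_NAME_LEN))
    return NULL;
  s = &svc->stats[svc->used++];
  strcpy (s->subsystem, subsystem);
  strcpy (s->name, name);
  s->value = 0;
  return s;
}

/* Handle one SET request.  Without the relative flag, `value` replaces
   the stored value.  With it, `value` is read as a signed 64-bit delta in
   two's complement; results below zero are clamped to zero (an assumption
   of this sketch).  Returns 0 on success, -1 on failure. */
static int
handle_set (struct model_service *svc,
            const char *subsystem,
            const char *name,
            uint64_t value,
            unsigned int flags)
{
  struct model_stat *s;

  assert (NULL != svc);
  s = find_or_create (svc, subsystem, name);
  if (NULL == s)
    return -1;
  if (0 == (flags & MODEL_SETFLAG_RELATIVE))
  {
    s->value = value; /* absolute set: overwrite */
  }
  else if (value > (uint64_t) INT64_MAX)
  {
    /* negative delta: recover its magnitude with unsigned arithmetic */
    uint64_t magnitude = (uint64_t) 0 - value;

    s->value = (magnitude > s->value) ? 0 : s->value - magnitude;
  }
  else
  {
    s->value += value; /* non-negative delta */
  }
  return 0;
}
```
@end verbatim

A caller of this model encodes a decrease of three as
@code{(uint64_t) (int64_t) -3} together with the relative flag, which
mirrors how a signed update amount can travel in an unsigned value field.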
7082 @node Watching for updates
7083 @subsubsection Watching for updates
7084
7085
7086
The @code{GNUNET_STATISTICS_watch()} function registers the watch at the
service by sending a message of
type @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
notifications through messages of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
parameter's value is changed.
7092
7093 @cindex DHT
7094 @cindex Distributed Hash Table
7095 @node Distributed Hash Table (DHT)
7096 @section Distributed Hash Table (DHT)
7097
7098
7099
7100 GNUnet includes a generic distributed hash table that can be used by
7101 developers building P2P applications in the framework.
7102 This section documents high-level features and how developers are
7103 expected to use the DHT.
7104 We have a research paper detailing how the DHT works.
7105 Also, Nate's thesis includes a detailed description and performance
7106 analysis (in chapter 6).
7107
7108 Key features of GNUnet's DHT include:
7109
7110 @itemize @bullet
7111 @item stores key-value pairs with values up to (approximately) 63k in size
7112 @item works with many underlay network topologies (small-world, random
7113 graph), underlay does not need to be a full mesh / clique
7114 @item support for extended queries (more than just a simple 'key'),
filtering duplicate replies within the network (Bloom filter) and content
7116 validation (for details, please read the subsection on the block library)
7117 @item can (optionally) return paths taken by the PUT and GET operations
7118 to the application
7119 @item provides content replication to handle churn
7120 @end itemize
7121
7122 GNUnet's DHT is randomized and unreliable. Unreliable means that there is
7123 no strict guarantee that a value stored in the DHT is always
7124 found --- values are only found with high probability.
7125 While this is somewhat true in all P2P DHTs, GNUnet developers should be
7126 particularly wary of this fact (this will help you write secure,
7127 fault-tolerant code). Thus, when writing any application using the DHT,
7128 you should always consider the possibility that a value stored in the
7129 DHT by you or some other peer might simply not be returned, or returned
7130 with a significant delay.
7131 Your application logic must be written to tolerate this (naturally, some
7132 loss of performance or quality of service is expected in this case).
7133
7134 @menu
7135 * Block library and plugins::
7136 * libgnunetdht::
7137 * The DHT Client-Service Protocol::
7138 * The DHT Peer-to-Peer Protocol::
7139 @end menu
7140
7141 @node Block library and plugins
7142 @subsection Block library and plugins
7143
7144
7145
7146 @menu
7147 * What is a Block?::
7148 * The API of libgnunetblock::
7149 * Queries::
7150 * Sample Code::
7151 * Conclusion2::
7152 @end menu
7153
7154 @node What is a Block?
7155 @subsubsection What is a Block?
7156
7157
7158
Blocks are small (< 63k) pieces of data stored under a key
(@code{struct GNUNET_HashCode}). Blocks have a type
(@code{enum GNUNET_BlockType}) which defines
7161 their data format. Blocks are used in GNUnet as units of static data
7162 exchanged between peers and stored (or cached) locally.
7163 Uses of blocks include file-sharing (the files are broken up into blocks),
7164 the VPN (DNS information is stored in blocks) and the DHT (all
7165 information in the DHT and meta-information for the maintenance of the
7166 DHT are both stored using blocks).
7167 The block subsystem provides a few common functions that must be
7168 available for any type of block.
7169
7170 @cindex libgnunetblock API
7171 @node The API of libgnunetblock
7172 @subsubsection The API of libgnunetblock
7173
7174
7175
7176 The block library requires for each (family of) block type(s) a block
7177 plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
7178 functions that are needed by the DHT (and possibly other subsystems) to
7179 manage the block.
7180 These block plugins are typically implemented within their respective
7181 subsystems.
7182 The main block library is then used to locate, load and query the
7183 appropriate block plugin.
7184 Which plugin is appropriate is determined by the block type (which is
7185 just a 32-bit integer). Block plugins contain code that specifies which
7186 block types are supported by a given plugin. The block library loads all
7187 block plugins that are installed at the local peer and forwards the
7188 application request to the respective plugin.
7189
7190 The central functions of the block APIs (plugin and main library) are to
7191 allow the mapping of blocks to their respective key (if possible) and the
7192 ability to check that a block is well-formed and matches a given
7193 request (again, if possible).
7194 This way, GNUnet can avoid storing invalid blocks, storing blocks under
7195 the wrong key and forwarding blocks in response to a query that they do
7196 not answer.
7197
One key function of block plugins is to allow GNUnet to detect
7199 duplicate replies (via the Bloom filter). All plugins MUST support
7200 detecting duplicate replies (by adding the current response to the
7201 Bloom filter and rejecting it if it is encountered again).
7202 If a plugin fails to do this, responses may loop in the network.
7203
7204 @node Queries
7205 @subsubsection Queries
7206
7207
7208 The query format for any block in GNUnet consists of four main components.
7209 First, the type of the desired block must be specified. Second, the query
7210 must contain a hash code. The hash code is used for lookups in hash
tables and databases and need not be unique for the block (however, if
7212 possible a unique hash should be used as this would be best for
7213 performance).
7214 Third, an optional Bloom filter can be specified to exclude known results;
7215 replies that hash to the bits set in the Bloom filter are considered
7216 invalid. False-positives can be eliminated by sending the same query
7217 again with a different Bloom filter mutator value, which parameterizes
7218 the hash function that is used.
7219 Finally, an optional application-specific "eXtended query" (xquery) can
7220 be specified to further constrain the results. It is entirely up to
7221 the type-specific plugin to determine whether or not a given block
7222 matches a query (type, hash, Bloom filter, and xquery).
Naturally, not all xqueries are valid and some types of blocks may not
7224 support Bloom filters either, so the plugin also needs to check if the
7225 query is valid in the first place.
7226
7227 Depending on the results from the plugin, the DHT will then discard the
7228 (invalid) query, forward the query, discard the (invalid) reply, cache the
7229 (valid) reply, and/or forward the (valid and non-duplicate) reply.
7230
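The role of the Bloom filter and its mutator in suppressing duplicate
replies can be sketched as follows. This is a toy model under stated
assumptions: block hashes are reduced to 64-bit integers, the filter size
and number of bits per element are arbitrary, and @code{struct toy_bloom},
@code{mix64()}, @code{bit_index()} and @code{check_duplicate()} are
hypothetical names that do not appear in the block library.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BF_BITS 1024u /* filter size in bits (arbitrary for this sketch) */
#define TOY_BF_K 4u       /* bit positions set per element */

struct toy_bloom
{
  uint8_t bits[TOY_BF_BITS / 8];
};

/* 64-bit mixing function (the splitmix64 finalizer). */
static uint64_t
mix64 (uint64_t x)
{
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Position of the i-th bit for a block hash under a given mutator.
   Changing the mutator changes the positions, so a false positive under
   one mutator is unlikely to repeat under another. */
static uint32_t
bit_index (uint64_t block_hash, uint32_t mutator, uint32_t i)
{
  uint64_t mingled = block_hash ^ mix64 (mutator);

  return (uint32_t) (mix64 (mingled + i) % TOY_BF_BITS);
}

/* Returns 1 if all bits for the block are set, 0 otherwise. */
static int
bloom_test (const struct toy_bloom *bf, uint64_t block_hash, uint32_t mutator)
{
  for (uint32_t i = 0; i < TOY_BF_K; i++)
  {
    uint32_t idx = bit_index (block_hash, mutator, i);

    if (0 == (bf->bits[idx / 8] & (1u << (idx % 8))))
      return 0;
  }
  return 1;
}

/* Sets all bits for the block. */
static void
bloom_add (struct toy_bloom *bf, uint64_t block_hash, uint32_t mutator)
{
  for (uint32_t i = 0; i < TOY_BF_K; i++)
  {
    uint32_t idx = bit_index (block_hash, mutator, i);

    bf->bits[idx / 8] |= (uint8_t) (1u << (idx % 8));
  }
}

/* Returns 1 if the reply should be dropped as (probably) already seen;
   otherwise records it in the filter and returns 0. */
static int
check_duplicate (struct toy_bloom *bf, uint64_t block_hash, uint32_t mutator)
{
  assert (NULL != bf);
  if (bloom_test (bf, block_hash, mutator))
    return 1;
  bloom_add (bf, block_hash, mutator);
  return 0;
}
```
@end verbatim

In this model, the first call to @code{check_duplicate()} for a given
block hash accepts the reply and the second call with the same mutator
rejects it. Re-issuing the query with a fresh filter and a different
mutator gives a previously misclassified block another chance to pass.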
7231 @node Sample Code
7232 @subsubsection Sample Code
7233
7234
7235
7236 The source code in @strong{plugin_block_test.c} is a good starting point
7237 for new block plugins --- it does the minimal work by implementing a
7238 plugin that performs no validation at all.
7239 The respective @strong{Makefile.am} shows how to build and install a
7240 block plugin.
7241
7242 @node Conclusion2
7243 @subsubsection Conclusion2
7244
7245
7246
7247 In conclusion, GNUnet subsystems that want to use the DHT need to define a
7248 block format and write a plugin to match queries and replies. For testing,
7249 the @code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts
7250 any query as valid and any reply as matching any query.
7251 This type is also used for the DHT command line tools.
7252 However, it should NOT be used for normal applications due to the lack
7253 of error checking that results from this primitive implementation.
7254
7255 @cindex libgnunetdht
7256 @node libgnunetdht
7257 @subsection libgnunetdht
7258
7259
7260
7261 The DHT API itself is pretty simple and offers the usual GET and PUT
7262 functions that work as expected. The specified block type refers to the
7263 block library which allows the DHT to run application-specific logic for
7264 data stored in the network.
7265
7266
7267 @menu
7268 * GET::
7269 * PUT::
7270 * MONITOR::
7271 * DHT Routing Options::
7272 @end menu
7273
7274 @node GET
7275 @subsubsection GET
7276
7277
7278
7279 When using GET, the main consideration for developers (other than the
7280 block library) should be that after issuing a GET, the DHT will
7281 continuously cause (small amounts of) network traffic until the operation
7282 is explicitly canceled.
7283 So GET does not simply send out a single network request once; instead,
7284 the DHT will continue to search for data. This is needed to achieve good
7285 success rates and also handles the case where the respective PUT
7286 operation happens after the GET operation was started.
7287 Developers should not cancel an existing GET operation and then
7288 explicitly re-start it to trigger a new round of network requests;
7289 this is simply inefficient, especially as the internal automated version
7290 can be more efficient, for example by filtering results in the network
7291 that have already been returned.
7292
7293 If an application that performs a GET request has a set of replies that it
already knows and would like to filter, it can call
7295 @code{GNUNET_DHT_get_filter_known_results} with an array of hashes over
7296 the respective blocks to tell the DHT that these results are not
7297 desired (any more).
7298 This way, the DHT will filter the respective blocks using the block
7299 library in the network, which may result in a significant reduction in
7300 bandwidth consumption.
7301
7302 @node PUT
7303 @subsubsection PUT
7304
7305
7306
7307 @c inconsistent use of ``must'' above it's written ``MUST''
7308 In contrast to GET operations, developers @strong{must} manually re-run
7309 PUT operations periodically (if they intend the content to continue to be
7310 available). Content stored in the DHT expires or might be lost due to
7311 churn.
7312 Furthermore, GNUnet's DHT typically requires multiple rounds of PUT
7313 operations before a key-value pair is consistently available to all
7314 peers (the DHT randomizes paths and thus storage locations, and only
after multiple rounds of PUTs will there be a sufficient number of
7316 replicas in large DHTs). An explicit PUT operation using the DHT API will
7317 only cause network traffic once, so in order to ensure basic availability
7318 and resistance to churn (and adversaries), PUTs must be repeated.
7319 While the exact frequency depends on the application, a rule of thumb is
7320 that there should be at least a dozen PUT operations within the content
7321 lifetime. Content in the DHT typically expires after one day, so
7322 DHT PUT operations should be repeated at least every 1-2 hours.
7323
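The rule of thumb above is simple arithmetic: a one-day lifetime divided
by a dozen PUTs gives one PUT every two hours. A minimal sketch of that
calculation follows; the function names @code{put_interval_seconds()} and
@code{put_is_due()} are hypothetical helpers for this example and are not
part of the DHT API.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Seconds between PUTs so that `puts_per_lifetime` PUTs fit into one
   content lifetime. */
static uint64_t
put_interval_seconds (uint64_t lifetime_seconds, uint32_t puts_per_lifetime)
{
  assert (puts_per_lifetime > 0);
  return lifetime_seconds / puts_per_lifetime;
}

/* Returns 1 if the next PUT is due at time `now` (all times in seconds
   on the same clock), 0 otherwise. */
static int
put_is_due (uint64_t last_put, uint64_t interval, uint64_t now)
{
  return (now >= last_put) && (now - last_put >= interval);
}
```
@end verbatim

An application would typically drive such a check from a periodic task
and issue a new PUT whenever @code{put_is_due()} reports that the
interval has elapsed.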
7324 @node MONITOR
7325 @subsubsection MONITOR
7326
7327
7328
7329 The DHT API also allows applications to monitor messages crossing the
7330 local DHT service.
7331 The types of messages used by the DHT are GET, PUT and RESULT messages.
7332 Using the monitoring API, applications can choose to monitor these
7333 requests, possibly limiting themselves to requests for a particular block
7334 type.
7335
The monitoring API is not only useful for diagnostics; it can also be
7337 used to trigger application operations based on PUT operations.
7338 For example, an application may use PUTs to distribute work requests to
7339 other peers.
7340 The workers would then monitor for PUTs that give them work, instead of
7341 looking for work using GET operations.
7342 This can be beneficial, especially if the workers have no good way to
7343 guess the keys under which work would be stored.
7344 Naturally, additional protocols might be needed to ensure that the desired
7345 number of workers will process the distributed workload.
7346
7347 @node DHT Routing Options
7348 @subsubsection DHT Routing Options
7349
7350
7351
The following routing options exist for GET and PUT requests; the first
two are the ones applications typically need:
7353
7354 @table @asis
@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE
This option means that all peers should process the request, even if
their peer ID is not closest to the key. For a PUT request, this means
that all peers that a request traverses may make a copy of the data.
Similarly for a GET request, all peers will check their local database
for a result. Setting this option can thus significantly improve caching
and reduce bandwidth consumption, at the expense of a larger DHT
database. If in doubt, we recommend that this option should be used.
@item GNUNET_DHT_RO_RECORD_ROUTE
This option instructs the DHT to record the path that a GET or a PUT
request is taking through the overlay network. The resulting paths are
then returned to the application with the respective result. This allows
the receiver of a result to construct a path to the originator of the
data, which might then be used for routing. Naturally, setting this
option requires additional bandwidth and disk space, so applications
should only set this if the paths are needed by the application logic.
@item GNUNET_DHT_RO_FIND_PEER
This option is an internal option used by the DHT's peer discovery
mechanism and should not be used by applications.
@item GNUNET_DHT_RO_BART
This option is currently not implemented. It may in the future offer
performance improvements for clique topologies.
7375 @end table
7376
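Routing options of this kind are typically combined as bit flags. The
following sketch shows how a peer might interpret the two
application-relevant options. The enum @code{model_route_option}, its
numeric values and both helper functions are assumptions made for this
example; consult the DHT service header for the actual definitions.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit flags mirroring two of the options described above. */
enum model_route_option
{
  MODEL_RO_NONE = 0,
  MODEL_RO_DEMULTIPLEX_EVERYWHERE = 1,
  MODEL_RO_RECORD_ROUTE = 2
};

/* Should this peer process the request against its local database?
   It does so if it is closest to the key, or if the request asks every
   peer on the path to process it. */
static int
should_process_locally (uint32_t options, int am_closest)
{
  return am_closest || (0 != (options & MODEL_RO_DEMULTIPLEX_EVERYWHERE));
}

/* Should the path taken by the request be recorded? */
static int
should_record_path (uint32_t options)
{
  return 0 != (options & MODEL_RO_RECORD_ROUTE);
}
```
@end verbatim

Options are combined with a bitwise OR, for example
@code{MODEL_RO_DEMULTIPLEX_EVERYWHERE | MODEL_RO_RECORD_ROUTE} in this
model.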
7377 @node The DHT Client-Service Protocol
7378 @subsection The DHT Client-Service Protocol
7379
7380
7381
7382 @menu
7383 * PUTting data into the DHT::
7384 * GETting data from the DHT::
7385 * Monitoring the DHT::
7386 @end menu
7387
7388 @node PUTting data into the DHT
7389 @subsubsection PUTting data into the DHT
7390
7391
7392
7393 To store (PUT) data into the DHT, the client sends a
7394 @code{struct GNUNET_DHT_ClientPutMessage} to the service.
7395 This message specifies the block type, routing options, the desired
7396 replication level, the expiration time, key,
7397 value and a 64-bit unique ID for the operation. The service responds with
7398 a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same
7399 64-bit unique ID. Note that the service sends the confirmation as soon as
7400 it has locally processed the PUT request. The PUT may still be
7401 propagating through the network at this time.
7402
7403 In the future, we may want to change this to provide (limited) feedback
7404 to the client, for example if we detect that the PUT operation had no
7405 effect because the same key-value pair was already stored in the DHT.
7406 However, changing this would also require additional state and messages
7407 in the P2P interaction.
7408
7409 @node GETting data from the DHT
7410 @subsubsection GETting data from the DHT
7411
7412
7413
7414 To retrieve (GET) data from the DHT, the client sends a
7415 @code{struct GNUNET_DHT_ClientGetMessage} to the service. The message
7416 specifies routing options, a replication level (for replicating the GET,
7417 not the content), the desired block type, the key, the (optional)
extended query and a unique 64-bit request ID.
7419
7420 Additionally, the client may send any number of
7421 @code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the
7422 service about results that the client is already aware of.
7423 These messages consist of the key, the unique 64-bit ID of the request,
7424 and an arbitrary number of hash codes over the blocks that the client is
7425 already aware of. As messages are restricted to 64k, a client that
7426 already knows more than about a thousand blocks may need to send
several of these messages. The client should transmit these
messages as quickly as possible after the original GET request so that
the DHT can filter those results in the network early on. However, as
7430 these messages are sent after the original request, it is conceivable
7431 that the DHT service may return blocks that match those already known
7432 to the client anyway.
7433
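The "about a thousand blocks" figure follows from the message size limit.
The sketch below computes how many result-seen messages a client needs
for a given number of known blocks. All three constants are assumptions
for illustration (a 65535-byte message limit, 64-byte hash codes and 80
bytes of fixed fields per message), and the function names are
hypothetical.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for this sketch, in bytes. */
#define MODEL_MAX_MESSAGE_SIZE 65535u
#define MODEL_HASH_SIZE 64u
#define MODEL_FIXED_OVERHEAD 80u

/* Number of hash codes that fit into one message. */
static size_t
hashes_per_message (void)
{
  return (MODEL_MAX_MESSAGE_SIZE - MODEL_FIXED_OVERHEAD) / MODEL_HASH_SIZE;
}

/* Number of messages needed to announce `known_results` hash codes. */
static size_t
messages_needed (size_t known_results)
{
  size_t per_msg = hashes_per_message ();

  assert (per_msg > 0);
  return (known_results + per_msg - 1) / per_msg; /* round up */
}

/* Number of hash codes carried by message `index` (0-based). */
static size_t
hashes_in_message (size_t known_results, size_t index)
{
  size_t per_msg = hashes_per_message ();
  size_t start = index * per_msg;

  if (start >= known_results)
    return 0;
  return (known_results - start < per_msg) ? known_results - start : per_msg;
}
```
@end verbatim
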
7434 In response to a GET request, the service will send @code{struct
7435 GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
7436 block type, expiration, key, unique ID of the request and of course the
7437 value (a block). Depending on the options set for the respective
7438 operations, the replies may also contain the path the GET and/or the PUT
7439 took through the network.
7440
7441 A client can stop receiving replies either by disconnecting or by sending
7442 a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the
7443 key and the 64-bit unique ID of the original request. Using an
7444 explicit "stop" message is more common as this allows a client to run
7445 many concurrent GET operations over the same connection with the DHT
7446 service --- and to stop them individually.
7447
7448 @node Monitoring the DHT
7449 @subsubsection Monitoring the DHT
7450
7451
7452
7453 To begin monitoring, the client sends a
7454 @code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service.
7455 In this message, flags can be set to enable (or disable) monitoring of
7456 GET, PUT and RESULT messages that pass through a peer. The message can
7457 also restrict monitoring to a particular block type or a particular key.
7458 Once monitoring is enabled, the DHT service will notify the client about
7459 any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for
7460 GET events, @code{struct GNUNET_DHT_MonitorPutMessage} for PUT events
7461 and @code{struct GNUNET_DHT_MonitorGetRespMessage} for RESULTs. Each of
7462 these messages contains all of the information about the event.
7463
7464 @node The DHT Peer-to-Peer Protocol
7465 @subsection The DHT Peer-to-Peer Protocol
7466
7467
7468
7469 @menu
7470 * Routing GETs or PUTs::
7471 * PUTting data into the DHT2::
7472 * GETting data from the DHT2::
7473 @end menu
7474
7475 @node Routing GETs or PUTs
7476 @subsubsection Routing GETs or PUTs
7477
7478
7479
7480 When routing GETs or PUTs, the DHT service selects a suitable subset of
7481 neighbours for forwarding. The exact number of neighbours can be zero or
7482 more and depends on the hop counter of the query (initially zero) in
relation to (the log of) the network size estimate, the desired
7484 replication level and the peer's connectivity.
7485 Depending on the hop counter and our network size estimate, the selection
of the peers may be randomized or based on proximity to the key.
7487 Furthermore, requests include a set of peers that a request has already
7488 traversed; those peers are also excluded from the selection.
7489
7490 @node PUTting data into the DHT2
7491 @subsubsection PUTting data into the DHT2
7492
7493
7494
7495 To PUT data into the DHT, the service sends a @code{struct PeerPutMessage}
7496 of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective
7497 neighbour.
7498 In addition to the usual information about the content (type, routing
7499 options, desired replication level for the content, expiration time, key
7500 and value), the message contains a fixed-size Bloom filter with
7501 information about which peers (may) have already seen this request.
7502 This Bloom filter is used to ensure that DHT messages never loop back to
7503 a peer that has already processed the request.
7504 Additionally, the message includes the current hop counter and, depending
7505 on the routing options, the message may include the full path that the
7506 message has taken so far.
7507 The Bloom filter should already contain the identity of the previous hop;
7508 however, the path should not include the identity of the previous hop and
7509 the receiver should append the identity of the sender to the path, not
7510 its own identity (this is done to reduce bandwidth).
7511
7512 @node GETting data from the DHT2
7513 @subsubsection GETting data from the DHT2
7514
7515
7516
7517 A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
7518 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the
7519 usual information about the request (type, routing options, desired
7520 replication level for the request, the key and the extended query), a GET
7521 request also contains a hop counter, a Bloom filter over the peers
7522 that have processed the request already and depending on the routing
7523 options the full path traversed by the GET.
7524 Finally, a GET request includes a variable-size second Bloom filter and a
7525 so-called Bloom filter mutator value which together indicate which
7526 replies the sender has already seen. During the lookup, each block that
matches the block type, key and extended query is additionally subjected
7528 to a test against this Bloom filter.
7529 The block plugin is expected to take the hash of the block and combine it
7530 with the mutator value and check if the result is not yet in the Bloom
7531 filter. The originator of the query will from time to time modify the
7532 mutator to (eventually) allow false-positives filtered by the Bloom filter
7533 to be returned.
7534
7535 Peers that receive a GET request perform a local lookup (depending on
7536 their proximity to the key and the query options) and forward the request
7537 to other peers.
7538 They then remember the request (including the Bloom filter for blocking
duplicate results) and, when they obtain a matching, non-filtered response,
7540 a @code{struct PeerResultMessage} of type
7541 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous
7542 hop.
7543 Whenever a result is forwarded, the block plugin is used to update the
7544 Bloom filter accordingly, to ensure that the same result is never
7545 forwarded more than once.
7546 The DHT service may also cache forwarded results locally if the
7547 "CACHE_RESULTS" option is set to "YES" in the configuration.
7548
7549 @cindex GNS
7550 @cindex GNU Name System
7551 @node GNU Name System (GNS)
7552 @section GNU Name System (GNS)
7553
7554
7555
7556 The GNU Name System (GNS) is a decentralized database that enables users
7557 to securely resolve names to values.
7558 Names can be used to identify other users (for example, in social
7559 networking), or network services (for example, VPN services running at a
7560 peer in GNUnet, or purely IP-based services on the Internet).
7561 Users interact with GNS by typing in a hostname that ends in a
7562 top-level domain that is configured in the ``GNS'' section, matches
7563 an identity of the user or ends in a Base32-encoded public key.
7564
7565 Videos giving an overview of most of the GNS and the motivations behind
it are available here and here.
7567 The remainder of this chapter targets developers that are familiar with
7568 high level concepts of GNS as presented in these talks.
7569 @c TODO: Add links to here and here and to these.
7570
7571 GNS-aware applications should use the GNS resolver to obtain the
7572 respective records that are stored under that name in GNS.
7573 Each record consists of a type, value, expiration time and flags.
7574
7575 The type specifies the format of the value. Types below 65536 correspond
to DNS record types; larger values are used for GNS-specific records.
7577 Applications can define new GNS record types by reserving a number and
7578 implementing a plugin (which mostly needs to convert the binary value
7579 representation to a human-readable text format and vice-versa).
7580 The expiration time specifies how long the record is to be valid.
7581 The GNS API ensures that applications are only given non-expired values.
7582 The flags are typically irrelevant for applications, as GNS uses them
7583 internally to control visibility and validity of records.
7584
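Two of the rules above lend themselves to a short sketch: the numeric
split between DNS and GNS-specific record types, and the guarantee that
applications only see non-expired values. The record layout
@code{struct model_record} and the functions @code{is_dns_type()} and
@code{drop_expired()} are hypothetical and much simpler than the real GNS
record structures; expiration is modelled as a plain absolute timestamp.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record: only type and expiration. */
struct model_record
{
  uint32_t type;       /* record type number */
  uint64_t expiration; /* absolute expiration time, arbitrary unit */
};

/* Types below 65536 correspond to DNS record types. */
static int
is_dns_type (uint32_t type)
{
  return type < 65536u;
}

/* Compacts the array in place, keeping only records that have not
   expired at time `now`.  Returns the number of records kept. */
static size_t
drop_expired (struct model_record *rd, size_t count, uint64_t now)
{
  size_t kept = 0;

  assert ((NULL != rd) || (0 == count));
  for (size_t i = 0; i < count; i++)
    if (rd[i].expiration > now)
      rd[kept++] = rd[i];
  return kept;
}
```
@end verbatim
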
7585 Records are stored along with a signature.
7586 The signature is generated using the private key of the authoritative
7587 zone. This allows any GNS resolver to verify the correctness of a
7588 name-value mapping.
7589
7590 Internally, GNS uses the NAMECACHE to cache information obtained from
7591 other users, the NAMESTORE to store information specific to the local
7592 users, and the DHT to exchange data between users.
7593 A plugin API is used to enable applications to define new GNS
7594 record types.
7595
7596 @menu
7597 * libgnunetgns::
7598 * libgnunetgnsrecord::
7599 * GNS plugins::
7600 * The GNS Client-Service Protocol::
7601 * Hijacking the DNS-Traffic using gnunet-service-dns::
7602 * Serving DNS lookups via GNS on W32::
7603 * Importing DNS Zones into GNS::
7604 @end menu
7605
7606 @node libgnunetgns
7607 @subsection libgnunetgns
7608
7609
7610
7611 The GNS API itself is extremely simple. Clients first connect to the
7612 GNS service using @code{GNUNET_GNS_connect}.
7613 They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel
7614 pending lookups using @code{GNUNET_GNS_lookup_cancel}.
7615 Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}.
7616
7617 @menu
7618 * Looking up records::
7619 * Accessing the records::
7620 * Creating records::
7621 * Future work::
7622 @end menu
7623
7624 @node Looking up records
7625 @subsubsection Looking up records
7626
7627
7628
7629 @code{GNUNET_GNS_lookup} takes a number of arguments:
7630
7631 @table @asis
@item handle
This is simply the GNS connection handle from
@code{GNUNET_GNS_connect}.
@item name
The client needs to specify the name to be resolved. This can be any
valid DNS or GNS hostname.
@item zone
The client needs to specify the public key of the GNS zone against which
the resolution should be done.
Note that a key must be provided; the client should
look up plausible values using its configuration,
the identity service and by attempting to interpret the
TLD as a base32-encoded public key.
@item type
This is the desired GNS or DNS record type
to look for. While all records for the given name will be returned, this
can be important if the client wants to resolve record types that
themselves delegate resolution, such as CNAME, PKEY or GNS2DNS.
Resolving a record of any of these types will only work if the respective
record type is specified in the request, as the GNS resolver will
otherwise follow the delegation and return the records from the
respective destination, instead of the delegating record.
@item only_cached
This argument should typically be set to
@code{GNUNET_NO}. Setting it to @code{GNUNET_YES} disables resolution via
the overlay network.
@item shorten_zone_key
If GNS encounters new names during resolution,
their respective zones can automatically be learned and added to the
"shorten zone". If this is desired, clients must pass the private key of
the shorten zone. If @code{NULL} is passed, shortening is disabled.
@item proc
This argument identifies the function to call with the result. It is
given @code{proc_cls}, the number of records found (possibly zero) and
the array of the records as arguments.
@code{proc} will only be called once. After @code{proc} has been called,
the lookup must no longer be canceled.
@item proc_cls
The closure for @code{proc}.
7664 @end table
7665
7666 @node Accessing the records
7667 @subsubsection Accessing the records
7668
7669
7670
7671 The @code{libgnunetgnsrecord} library provides an API to manipulate the
7672 GNS record array that is given to proc. In particular, it offers
7673 functions such as converting record values to human-readable
7674 strings (and back). However, most @code{libgnunetgnsrecord} functions are
7675 not interesting to GNS client applications.
7676
7677 For DNS records, the @code{libgnunetdnsparser} library provides
7678 functions for parsing (and serializing) common types of DNS records.
7679
7680 @node Creating records
7681 @subsubsection Creating records
7682
7683
7684
7685 Creating GNS records is typically done by building the respective record
7686 information (possibly with the help of @code{libgnunetgnsrecord} and
7687 @code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
7688 publish the information. The GNS API is not involved in this
7689 operation.
7690
7691 @node Future work
7692 @subsubsection Future work
7693
7694
7695
7696 In the future, we want to expand @code{libgnunetgns} to allow
7697 applications to observe shortening operations performed during GNS
7698 resolution, for example so that users can receive visual feedback when
7699 this happens.
7700
7701 @node libgnunetgnsrecord
7702 @subsection libgnunetgnsrecord
7703
7704
7705
7706 The @code{libgnunetgnsrecord} library is used to manipulate GNS
7707 records (in plaintext or in their encrypted format).
7708 Applications mostly interact with @code{libgnunetgnsrecord} by using the
7709 functions to convert GNS record values to strings or vice-versa, or to
look up a GNS record type number by name (or vice-versa).
7711 The library also provides various other functions that are mostly
7712 used internally within GNS, such as converting keys to names, checking for
7713 expiration, encrypting GNS records to GNS blocks, verifying GNS block
7714 signatures and decrypting GNS records from GNS blocks.
7715
We will now discuss the four commonly used functions of the API.
7717 @code{libgnunetgnsrecord} does not perform these operations itself,
7718 but instead uses plugins to perform the operation.
7719 GNUnet includes plugins to support common DNS record types as well as
7720 standard GNS record types.
7721
7722 @menu
7723 * Value handling::
7724 * Type handling::
7725 @end menu
7726
7727 @node Value handling
7728 @subsubsection Value handling
7729
7730
7731
7732 @code{GNUNET_GNSRECORD_value_to_string} can be used to convert
the (binary) representation of a GNS record value to a human-readable,
7734 0-terminated UTF-8 string.
7735 NULL is returned if the specified record type is not supported by any
7736 available plugin.
7737
7738 @code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a
human-readable string to the respective (binary) representation of
7740 a GNS record value.
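
As a rough illustration of what such a conversion involves for a single
record type, the following sketch converts the 4-byte binary value of a
DNS A record to a string and back using the POSIX functions
@code{inet_ntop} and @code{inet_pton}. It does not call
@code{libgnunetgnsrecord}; the names @code{a_value_to_string} and
@code{a_string_to_value} are hypothetical and only mirror the direction
of the two API functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct in_addr */
#include <arpa/inet.h>    /* inet_ntop, inet_pton, INET_ADDRSTRLEN */

/* Hypothetical helper: binary A record value (4 bytes) to a
   heap-allocated, 0-terminated string.  Returns NULL on error,
   similar to the "NULL if unsupported" convention described above.
   The caller must free() the result. */
char *
a_value_to_string (const void *data, size_t data_size)
{
  char buf[INET_ADDRSTRLEN];
  char *copy;

  assert (NULL != data);
  if (sizeof (struct in_addr) != data_size)
    return NULL;
  if (NULL == inet_ntop (AF_INET, data, buf, sizeof (buf)))
    return NULL;
  copy = malloc (strlen (buf) + 1);
  if (NULL != copy)
    strcpy (copy, buf);
  return copy;
}

/* Hypothetical helper: string to binary A record value.
   Returns 0 on success, -1 if the string is not an IPv4 address. */
int
a_string_to_value (const char *s, struct in_addr *value)
{
  assert ((NULL != s) && (NULL != value));
  return (1 == inet_pton (AF_INET, s, value)) ? 0 : -1;
}
```

A real plugin has to handle every record type it supports; this sketch
covers only the simplest case.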
7741
7742 @node Type handling
7743 @subsubsection Type handling
7744
7745
7746
7747 @code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the
7748 numeric value associated with a given typename. For example, given the
typename "A" (for DNS A records), the function will return the number 1.
7750 A list of common DNS record types is
7751 @uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}.
7752 Note that not all DNS record types are supported by GNUnet GNSRECORD
7753 plugins at this time.
7754
7755 @code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the
7756 typename associated with a given numeric value.
7757 For example, given the type number 1, the function will return the
7758 typename "A".
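
The mapping itself can be pictured as a small table. The sketch below is
not the library code: the function names are hypothetical, the table
holds only a few well-known DNS type numbers, and the choice of
@code{UINT32_MAX} and @code{NULL} as error values is an assumption made
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>   /* strcasecmp */

/* A few well-known DNS record types and their assigned numbers. */
static const struct
{
  const char *name;
  uint32_t number;
} type_table[] = {
  { "A", 1 }, { "NS", 2 }, { "CNAME", 5 }, { "SOA", 6 },
  { "MX", 15 }, { "TXT", 16 }, { "AAAA", 28 }, { "SRV", 33 },
};

#define TYPE_COUNT (sizeof (type_table) / sizeof (type_table[0]))

/* Returns the type number, or UINT32_MAX if the name is unknown. */
uint32_t
typename_to_number (const char *name)
{
  assert (NULL != name);
  for (size_t i = 0; i < TYPE_COUNT; i++)
    if (0 == strcasecmp (name, type_table[i].name))
      return type_table[i].number;
  return UINT32_MAX;
}

/* Returns the typename, or NULL if the number is unknown. */
const char *
number_to_typename (uint32_t number)
{
  for (size_t i = 0; i < TYPE_COUNT; i++)
    if (number == type_table[i].number)
      return type_table[i].name;
  return NULL;
}
```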
7759
7760 @node GNS plugins
7761 @subsection GNS plugins
7762
7763
7764
7765 Adding a new GNS record type typically involves writing (or extending) a
7766 GNSRECORD plugin. The plugin needs to implement the
7767 @code{gnunet_gnsrecord_plugin.h} API which provides basic functions that
7768 are needed by GNSRECORD to convert typenames and values of the respective
7769 record type to strings (and back).
7770 These gnsrecord plugins are typically implemented within their respective
7771 subsystems.
7772 Examples for such plugins can be found in the GNSRECORD, GNS and
7773 CONVERSATION subsystems.
7774
7775 The @code{libgnunetgnsrecord} library is then used to locate, load and
7776 query the appropriate gnsrecord plugin.
7777 Which plugin is appropriate is determined by the record type (which is
just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all
gnsrecord plugins that are installed at the local peer and forwards the
7780 application request to the plugins. If the record type is not
7781 supported by the plugin, it should simply return an error code.
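
The dispatch described above can be sketched as a loop over function
pointers. The interface shown here is a hypothetical simplification and
not the one declared in @code{gnunet_gnsrecord_plugin.h}; the two toy
plugins and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much simplified plugin interface: one function that
   converts a binary record value to a string.  It returns 0 on
   success and -1 if the plugin does not support the record type. */
typedef int (*value_to_string_fn) (uint32_t type, const void *data,
                                   size_t size, char *out,
                                   size_t out_size);

/* Toy plugin for type 1 (A): four bytes as dotted decimal. */
static int
plugin_dns (uint32_t type, const void *data, size_t size, char *out,
            size_t out_size)
{
  const unsigned char *b = data;

  if ((1 != type) || (4 != size))
    return -1;
  snprintf (out, out_size, "%u.%u.%u.%u", (unsigned int) b[0],
            (unsigned int) b[1], (unsigned int) b[2],
            (unsigned int) b[3]);
  return 0;
}

/* Toy plugin for type 16 (TXT): the value already is text. */
static int
plugin_txt (uint32_t type, const void *data, size_t size, char *out,
            size_t out_size)
{
  if ((16 != type) || (size >= out_size))
    return -1;
  memcpy (out, data, size);
  out[size] = '\0';
  return 0;
}

static const value_to_string_fn plugins[] = { plugin_dns, plugin_txt };

/* Ask every loaded plugin in turn; the first one that supports the
   record type answers the request. */
int
dispatch_value_to_string (uint32_t type, const void *data, size_t size,
                          char *out, size_t out_size)
{
  assert ((NULL != out) && (out_size > 0));
  for (size_t i = 0; i < sizeof (plugins) / sizeof (plugins[0]); i++)
    if (0 == plugins[i] (type, data, size, out, out_size))
      return 0;
  return -1;   /* no plugin supports this record type */
}
```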
7782
The central functions of the GNSRECORD APIs (plugin and main library) are the
7784 same four functions for converting between values and strings, and
7785 typenames and numbers documented in the previous subsection.
7786
7787 @node The GNS Client-Service Protocol
7788 @subsection The GNS Client-Service Protocol
7789
7790
7791 The GNS client-service protocol consists of two simple messages, the
7792 @code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP}
7793 message contains a unique 32-bit identifier, which will be included in the
7794 corresponding response. Thus, clients can send many lookup requests in
7795 parallel and receive responses out-of-order.
7796 A @code{LOOKUP} request also includes the public key of the GNS zone,
7797 the desired record type and fields specifying whether shortening is
7798 enabled or networking is disabled. Finally, the @code{LOOKUP} message
7799 includes the name to be resolved.
7800
7801 The response includes the number of records and the records themselves
7802 in the format created by @code{GNUNET_GNSRECORD_records_serialize}.
7803 They can thus be deserialized using
7804 @code{GNUNET_GNSRECORD_records_deserialize}.
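
The role of the 32-bit identifier can be sketched with a small table of
pending requests. This is not the code of @code{libgnunetgns}; the
structures, the function names and the fixed table size are hypothetical
and only show how responses that arrive out-of-order are matched to
their requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One outstanding LOOKUP, identified by its 32-bit request ID. */
struct pending
{
  int in_use;
  uint32_t id;
  const char *name;   /* name being resolved */
};

struct client
{
  uint32_t next_id;
  struct pending slots[MAX_PENDING];
};

/* Register a new lookup.  Returns its request ID, or 0 if the table
   is full (0 is never handed out as an ID in this sketch). */
uint32_t
client_send_lookup (struct client *c, const char *name)
{
  assert (NULL != c);
  for (size_t i = 0; i < MAX_PENDING; i++)
    if (! c->slots[i].in_use)
    {
      if (0 == ++c->next_id)
        c->next_id = 1;
      c->slots[i].in_use = 1;
      c->slots[i].id = c->next_id;
      c->slots[i].name = name;
      return c->slots[i].id;
    }
  return 0;
}

/* Match a LOOKUP_RESULT to its request.  Returns the name that was
   looked up, or NULL if no request with this ID is pending. */
const char *
client_handle_result (struct client *c, uint32_t id)
{
  assert (NULL != c);
  for (size_t i = 0; i < MAX_PENDING; i++)
    if (c->slots[i].in_use && (id == c->slots[i].id))
    {
      c->slots[i].in_use = 0;
      return c->slots[i].name;
    }
  return NULL;
}
```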
7805
7806 @node Hijacking the DNS-Traffic using gnunet-service-dns
7807 @subsection Hijacking the DNS-Traffic using gnunet-service-dns
7808
7809
7810
7811 This section documents how the gnunet-service-dns (and the
7812 gnunet-helper-dns) intercepts DNS queries from the local system.
This is merely one of several ways to obtain GNS queries.
It is also possible to change @code{resolv.conf} to point to a machine
running @code{gnunet-dns2gns} or to modify libc's name service switch
(NSS) configuration to include a GNS resolution plugin.
The method described in this section is more of a last-ditch catch-all
approach.
7819
@code{gnunet-service-dns} enables intercepting DNS traffic using
policy-based routing.
We MARK every outgoing DNS packet if it was not sent by our application.
Using a second routing table in the Linux kernel, these marked packets are
then routed through our virtual network interface and can thus be
captured unchanged.
7826
7827 Our application then reads the query and decides how to handle it.
7828 If the query can be addressed via GNS, it is passed to
7829 @code{gnunet-service-gns} and resolved internally using GNS.
7830 In the future, a reverse query for an address of the configured virtual
7831 network could be answered with records kept about previous forward
7832 queries.
7833 Queries that are not hijacked by some application using the DNS service
7834 will be sent to the original recipient.
7835 The answer to the query will always be sent back through the virtual
7836 interface with the original nameserver as source address.
7837
7838
7839 @menu
7840 * Network Setup Details::
7841 @end menu
7842
7843 @node Network Setup Details
7844 @subsubsection Network Setup Details
7845
7846
7847
7848 The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 \
    -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
ip rule add fwmark 3 table 2
ip route add default via $VIRTUALDNS table 2
@end example

The first rule makes sure that all packets coming from a port our
application opened beforehand (@code{$LOCALPORT}) are routed normally.
The second rule marks every other packet sent to a DNS server with
mark 3 (chosen arbitrarily). The third command adds a routing policy
that sends packets carrying mark 3 to routing table 2, and the fourth
command adds a default route via @code{$VIRTUALDNS} to that table.
7863
7864 @node Serving DNS lookups via GNS on W32
7865 @subsection Serving DNS lookups via GNS on W32
7866
7867
7868
This section documents how libw32nsp (and
gnunet-gns-helper-service-w32) resolve DNS queries on the
local system. This only applies to GNUnet running on W32.
7872
7873 W32 has a concept of "Namespaces" and "Namespace providers".
7874 These are used to present various name systems to applications in a
7875 generic way.
Namespaces include DNS, mDNS, NLA and others. For each namespace any
number of providers can be registered, and they are queried in order
of priority (which is adjustable).
7879
Applications can resolve names by using the WSALookupService*() family
of functions.

However, these are WSA-only facilities. Common BSD socket functions for
name resolution are gethostbyname and getaddrinfo (among others).
These functions are implemented internally (by default by mswsock,
which also implements the default DNS provider) as wrappers around
WSALookupService*() functions (see "Sample Code for a Service Provider"
on MSDN).
7889
On W32, GNUnet builds libw32nsp, a namespace provider that can be
installed into the system using w32nsp-install (and uninstalled using
w32nsp-uninstall), as described in the "Installation Handbook".

libw32nsp is very simple and has almost no dependencies. As a response to
NSPLookupServiceBegin(), it only checks that the provider GUID passed to
it by the caller matches the GNUnet DNS Provider GUID, then connects to
gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the
name resolution request there, returning the connected socket to the
caller.

When the caller invokes NSPLookupServiceNext(), libw32nsp reads a
completely formed reply from that socket, unmarshalls it, then gives
it back to the caller.

At the moment gnunet-gns-helper-service-w32 is implemented to only ever
give one reply, and subsequent calls to NSPLookupServiceNext() will fail
with WSA_NODATA (the first call to NSPLookupServiceNext() might also fail
if GNS failed to find the name, or if there was an error connecting to
it).
7910
7911 gnunet-gns-helper-service-w32 does most of the processing:
7912
7913 @itemize @bullet
7914 @item Maintains a connection to GNS.
7915 @item Reads GNS config and loads appropriate keys.
@item Checks the service GUID and decides on the type of record to look
up, refusing to make a lookup outright when an unsupported service GUID
is passed.
@item Launches the lookup.
7920 @end itemize
7921
When the lookup result arrives, gnunet-gns-helper-service-w32 forms a
complete reply (including filling a WSAQUERYSETW structure and, possibly,
a binary blob with a hostent structure for a gethostbyname() client),
marshalls it, and sends it back to libw32nsp. If no records were found,
it sends an empty header.
7927
This works for most normal applications that use gethostbyname() or
getaddrinfo() to resolve names, but has no effect on
applications that use alternative means of resolving names (such as
sending queries to a DNS server directly by themselves).
This includes some well-known utilities, like "ping" and "nslookup".
7933
7934 @node Importing DNS Zones into GNS
7935 @subsection Importing DNS Zones into GNS
7936
7937 This section discusses the challenges and problems faced when writing the
7938 Ascension tool. It also takes a look at possible improvements in the
7939 future.
7940
7941 Consider the following diagram that shows the workflow of Ascension:
7942
@image{images/ascension_ssd,6in,,Ascension's workflow}

Furthermore, the interaction between the components of GNUnet is shown in
the diagram below:
@center @image{images/ascension_interaction,,6in,Ascension component interaction}
7948
7949 @menu
7950 * Conversions between DNS and GNS::
7951 * DNS Zone Size::
7952 * Performance::
7953 @end menu
7954
7955 @cindex DNS Conversion
7956 @node Conversions between DNS and GNS
7957 @subsubsection Conversions between DNS and GNS
7958
The differences between the two name systems lie in the details and are
not always obvious. For instance, an SRV record is converted to a BOX
record, a record type that is unique to GNS.

The following example shows an existing SRV record and the BOX record it
is converted to:
7964
7965 @example
7966 # SRV
7967 # _service._proto.name. TTL class SRV priority weight port target
7968 _sip._tcp.example.com. 14000 IN SRV     0 0 5060 www.example.com.
7969 # BOX
7970 # TTL BOX flags port protocol recordtype priority weight port target
7971 14000 BOX n 5060 6 33 0 0 5060 www.example.com
7972 @end example
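
The conversion shown above can be sketched in a few lines. This is not
the code used by Ascension (which is written in Python); the function
name @code{srv_to_box} is hypothetical, and the field order simply
follows the BOX line in the example. The values 6 and 17 are the IP
protocol numbers of TCP and UDP, and 33 is the DNS record type number
of SRV.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Map the protocol label of an SRV owner name to its IP protocol
   number; returns -1 for labels this sketch does not know. */
static int
protocol_number (const char *proto)
{
  if (0 == strcmp (proto, "_tcp"))
    return 6;
  if (0 == strcmp (proto, "_udp"))
    return 17;
  return -1;
}

/* Build "TTL BOX n port protocol 33 priority weight port target"
   from the fields of an SRV record.  A trailing dot of the target is
   dropped, as in the example above.  Returns 0 on success and -1 on
   an unknown protocol or if the output buffer is too small. */
int
srv_to_box (const char *proto, unsigned int ttl, unsigned int priority,
            unsigned int weight, unsigned int port, const char *target,
            char *out, size_t out_size)
{
  int proto_num;
  size_t len;
  int n;

  assert ((NULL != proto) && (NULL != target) && (NULL != out));
  proto_num = protocol_number (proto);
  if (proto_num < 0)
    return -1;
  len = strlen (target);
  if ((len > 0) && ('.' == target[len - 1]))
    len--;
  n = snprintf (out, out_size, "%u BOX n %u %d 33 %u %u %u %.*s",
                ttl, port, proto_num, priority, weight, port,
                (int) len, target);
  return ((n < 0) || ((size_t) n >= out_size)) ? -1 : 0;
}
```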
7973
Other record types that need to undergo such a transformation are MX
and SOA.

The transformation of a SOA record into GNS works as described in the
following example. Note in particular the rname and mname keys.
7979
7980 @example
7981 # BIND syntax for a clean SOA record
@@   IN SOA master.example.com. hostmaster.example.com. (
7983     2017030300 ; serial
7984     3600       ; refresh
7985     1800       ; retry
7986     604800     ; expire
7987     600 )      ; ttl
7988 # Recordline for adding the record
$ gnunet-namestore -z example.com -a -n @@ -t SOA -V \
7990     rname=master.example.com mname=hostmaster.example.com  \
7991     2017030300,3600,1800,604800,600 -e 7200s
7992 @end example
7993
The transformation of MX records is straightforward:
7995 @example
7996 # mail.example.com. 3600 IN MX 10 mail.example.com.
7997 $ gnunet-namestore -z example.com -n mail -R 3600 MX n 10,mail
7998 @end example
7999
Finally, one of the biggest difficulties was the handling of NS records
that are found in top-level domain zones. The intended behaviour is to
add GNS2DNS records for them so that gnunet-gns can resolve records for
those domains on its own. These GNS2DNS records require the values from
DNS GLUE records, provided they are within the same zone.

The following two examples, both taken from the 'com' TLD, show one NS
record with a GLUE record and one without:
8008
8009 @example
8010 # ns1.example.com 86400 IN A 127.0.0.1
8011 # example.com 86400 IN NS ns1.example.com.
8012 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n \
8013     example.com@@127.0.0.1
8014
8015 # example.com 86400 IN NS ns1.example.org.
8016 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n \
8017     example.com@@ns1.example.org
8018 @end example
8019
As you can see, one of the GNS2DNS records lists an IP address and the
other one a DNS name. For the first one, there is a GLUE record that
provides the address directly; for the second one, another DNS query is
issued to figure out the IP address of ns1.example.org.
8024
8025 A solution was found by creating a hierarchical zone structure in GNS and linking
8026 the zones using PKEY records to one another. This allows the resolution of the
8027 name servers to work within GNS while not taking control over unwanted zones.
8028
8029 Currently the following record types are supported:
8030 @itemize @bullet
8031 @item A
8032 @item AAAA
8033 @item CNAME
8034 @item MX
8035 @item NS
8036 @item SRV
8037 @item TXT
8038 @end itemize
8039
This restriction is not due to technical limitations but rather to
practical ones. The problem occurs with DNSSEC-enabled DNS zones. As
records within those zones are signed periodically, and every new
signature is an update to the zone, there are many revisions of each
zone. This is a problem for bigger zones, as many records are signed
again without any other change. Also, adding records of unknown types
that require a different format takes time, as each of them causes a CLI
call of the namestore. Furthermore, certain record types need to be
transformed into a GNS-compatible format which, depending on the record
type, takes more time.

In addition, a blacklist was added to drop, for instance, DNSSEC-related
records. If a record type is in neither the whitelist nor the blacklist,
dropping it is considered a loss of data and a message is shown to the
user. This helps with transparency and also with contributing, as the
unsupported record types can then be added accordingly.
8055
8056 @node DNS Zone Size
8057 @subsubsection DNS Zone Size
Another major problem exists with very large zones. When migrating a
small zone, the delay between the addition of records and their expiry is
negligible. However, when working with big zones that easily have more
than a few million records, this delay becomes a problem.

Records will start to expire well before the zone has finished migrating.
This is usually not a problem but can cause a high CPU load when a peer
is restarted and the records have expired.

A good solution has not been found yet. One idea that was floated is to
add the records with the s (shadow) flag to keep them resolvable even if
they have expired. However, this would introduce the problem of how to
detect that a record has been removed from the zone, which would require
deletion of said record(s).

Another problem that still persists is how to refresh records. Expired
records are still displayed when calling gnunet-namestore but do not
resolve with gnunet-gns. Zonemaster will sign the expired records again
and make sure that the records are still valid. A recent change to
gnunet-gns improved the suffix lookup, which allows for a fast lookup
even with thousands of local egos.
8079
8080 Currently the pace of adding records in general is around 10 records per second.
8081 Crypto is the upper limit for adding of records. The performance of your machine
8082 can be tested with the perf_crypto_* tools. There is still a big discrepancy
8083 between the pace of Ascension and the theoretical limit.
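
To put the numbers above into perspective: at 10 records per second, a
zone with one million records takes 100000 seconds (almost 28 hours) to
migrate. The following sketch performs this calculation and checks
whether the first records would expire before the last ones have been
added. The function names are hypothetical, and the expiration time of
one day used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to add `records` records at `rate` records per
   second, rounded up to full seconds. */
uint64_t
migration_seconds (uint64_t records, uint64_t rate)
{
  assert (rate > 0);
  return (records + rate - 1) / rate;
}

/* Returns 1 if the migration takes longer than the relative
   expiration time of the records, so that the first records expire
   before the migration has finished; 0 otherwise. */
int
expires_during_migration (uint64_t records, uint64_t rate,
                          uint64_t expiration_seconds)
{
  return migration_seconds (records, rate) > expiration_seconds;
}
```

With an expiration time of 86400 seconds, a zone of one million records
does not finish in time at this rate, while a zone of one thousand
records does.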
8084
8085 A performance metric for measuring improvements has not yet been implemented in
8086 Ascension.
8087
8088 @node Performance
8089 @subsubsection Performance
The performance when migrating a zone using the Ascension tool is limited
by a handful of factors. First of all, Ascension is written in Python 3
and calls the CLI tools of GNUnet. This is comparable to a fork and exec
call, which costs a few CPU cycles. Furthermore, all the records that are
added to the same label are signed using the zone's private key. This
signing operation is very resource-heavy and was optimized during
development by adding the '-R' (Recordline) option to gnunet-namestore,
which allows specifying multiple records in a single invocation of the
CLI tool. Assuming that in a TLD zone every domain has at least two
8098 name servers this halves the amount of signatures needed.
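
The effect of this batching on the number of signing operations can be
sketched as follows. The function is hypothetical and assumes that the
input contains one label per record and is sorted so that equal labels
are adjacent; without batching, the number of signatures would simply be
the number of records.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the signing operations needed when all records of a label
   are added in one call.  `labels` holds one entry per record and
   must be sorted so that equal labels are adjacent. */
size_t
signatures_batched (const char *const *labels, size_t n)
{
  size_t count = 0;

  assert ((NULL != labels) || (0 == n));
  for (size_t i = 0; i < n; i++)
    if ((0 == i) || (0 != strcmp (labels[i], labels[i - 1])))
      count++;
  return count;
}
```

For two domains with two name server records each, this yields two
signing operations instead of four.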
8099
Another possible improvement is the use of multiple threads or
asynchronous subprocesses when invoking the GNUnet CLI tools. This could
be implemented by simply creating more workers in the program, but the
resulting performance improvements were not tested.

Ascension was tested using different hardware and database backends.
Performance differences between SQLite and PostgreSQL are marginal and
almost non-existent. What did make a huge impact on record adding
performance was the storage medium. On a traditional mechanical hard
drive, adding records was slow compared to a solid state disk.
8110
In conclusion, there are still many bottlenecks in the program, namely the
single-threaded implementation and the inefficient, sequential calls of
gnunet-namestore. In the future, a solution that uses the C API would be
cleaner and better.
8115
8116 @cindex GNS Namecache
8117 @node GNS Namecache
8118 @section GNS Namecache
8119
8120 The NAMECACHE subsystem is responsible for caching (encrypted) resolution
8121 results of the GNU Name System (GNS). GNS makes zone information available
8122 to other users via the DHT. However, as accessing the DHT for every
8123 lookup is expensive (and as the DHT's local cache is lost whenever the
8124 peer is restarted), GNS uses the NAMECACHE as a more persistent cache for
8125 DHT lookups.
8126 Thus, instead of always looking up every name in the DHT, GNS first
8127 checks if the result is already available locally in the NAMECACHE.
Only if there is no result in the NAMECACHE does GNS query the DHT.
8129 The NAMECACHE stores data in the same (encrypted) format as the DHT.
8130 It thus makes no sense to iterate over all items in the
8131 NAMECACHE --- the NAMECACHE does not have a way to provide the keys
8132 required to decrypt the entries.
8133
Blocks in the NAMECACHE share the same expiration mechanism as blocks in
the DHT --- the block expires whenever any of the records in
the (encrypted) block expires.
The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required
work to expire blocks; clients do not have to worry about this.
8140 Also, given that NAMECACHE stores only GNS blocks that local users
8141 requested, there is no configuration option to limit the size of the
8142 NAMECACHE. It is assumed to be always small enough (a few MB) to fit on
8143 the drive.
8144
8145 The NAMECACHE supports the use of different database backends via a
8146 plugin API.
8147
8148 @menu
8149 * libgnunetnamecache::
8150 * The NAMECACHE Client-Service Protocol::
8151 * The NAMECACHE Plugin API::
8152 @end menu
8153
8154 @node libgnunetnamecache
8155 @subsection libgnunetnamecache
8156
8157
8158
8159 The NAMECACHE API consists of five simple functions. First, there is
8160 @code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
8161 This returns the handle required for all other operations on the
8162 NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
8163 block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that
8165 were stored in the NAMECACHE. Both operations can be canceled using
8166 @code{GNUNET_NAMECACHE_cancel}. Note that canceling a
8167 @code{GNUNET_NAMECACHE_block_cache} operation can result in the block
8168 being stored in the NAMECACHE --- or not. Cancellation primarily ensures
8169 that the continuation function with the result of the operation will no
8170 longer be invoked.
8171 Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
8172 NAMECACHE.
8173
8174 The maximum size of a block that can be stored in the NAMECACHE is
8175 @code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
8176
8177 @node The NAMECACHE Client-Service Protocol
8178 @subsection The NAMECACHE Client-Service Protocol
8179
8180
8181
8182 All messages in the NAMECACHE IPC protocol start with the
8183 @code{struct GNUNET_NAMECACHE_Header} which adds a request
8184 ID (32-bit integer) to the standard message header.
8185 The request ID is used to match requests with the
8186 respective responses from the NAMECACHE, as they are allowed to happen
8187 out-of-order.
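
A minimal sketch of such a header is shown below. The layout is an
assumption made for this example: a standard message header consisting
of a 16-bit size and a 16-bit type in network byte order, followed by
the 32-bit request ID. The structure and function names are
hypothetical, and the message type value used in any call is a
placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons, htonl, ntohl */

/* Assumed standard message header: size and type of the message,
   both in network byte order. */
struct message_header
{
  uint16_t size;
  uint16_t type;
};

/* Assumed NAMECACHE header: standard header plus request ID. */
struct namecache_header
{
  struct message_header header;
  uint32_t r_id;   /* request ID in network byte order */
};

/* Fill in the header of a message that is total_size bytes long. */
void
namecache_header_init (struct namecache_header *h, uint16_t total_size,
                       uint16_t type, uint32_t request_id)
{
  assert (NULL != h);
  h->header.size = htons (total_size);
  h->header.type = htons (type);
  h->r_id = htonl (request_id);
}

/* Extract the request ID in host byte order, for matching a
   response to its request. */
uint32_t
namecache_header_request_id (const struct namecache_header *h)
{
  assert (NULL != h);
  return ntohl (h->r_id);
}
```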
8188
8189
8190 @menu
8191 * Lookup::
8192 * Store::
8193 @end menu
8194
8195 @node Lookup
8196 @subsubsection Lookup
8197
8198
8199
The @code{struct LookupBlockMessage} is used to look up a block stored in
8201 the cache.
8202 It contains the query hash. The NAMECACHE always responds with a
8203 @code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
8204 response, it sets the expiration time in the response to zero.
8205 Otherwise, the response is expected to contain the expiration time, the
8206 ECDSA signature, the derived key and the (variable-size) encrypted data
8207 of the block.
8208
8209 @node Store
8210 @subsubsection Store
8211
8212
8213
8214 The @code{struct BlockCacheMessage} is used to cache a block in the
8215 NAMECACHE.
8216 It has the same structure as the @code{struct LookupBlockResponseMessage}.
8217 The service responds with a @code{struct BlockCacheResponseMessage} which
8218 contains the result of the operation (success or failure).
8219 In the future, we might want to make it possible to provide an error
8220 message as well.
8221
8222 @node The NAMECACHE Plugin API
8223 @subsection The NAMECACHE Plugin API
8224
8225
8226 The NAMECACHE plugin API consists of two functions, @code{cache_block} to
store a block in the database, and @code{lookup_block} to look up a block
8228 in the database.
8229
8230
8231 @menu
8232 * Lookup2::
8233 * Store2::
8234 @end menu
8235
8236 @node Lookup2
8237 @subsubsection Lookup2
8238
8239
8240
8241 The @code{lookup_block} function is expected to return at most one block
8242 to the iterator, and return @code{GNUNET_NO} if there were no non-expired
8243 results.
8244 If there are multiple non-expired results in the cache, the lookup is
8245 supposed to return the result with the largest expiration time.
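
The selection rule can be sketched over an in-memory array that stands
in for the database. The structure and the function name
@code{select_block} are hypothetical; a real plugin would express the
same rule in a database query and return @code{GNUNET_NO} where this
sketch returns @code{NULL}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cached_block
{
  uint64_t query;       /* stand-in for the query hash */
  uint64_t expiration;  /* absolute expiration time in seconds */
};

/* Return the non-expired block stored under `query` that has the
   largest expiration time, or NULL if there is no such block. */
const struct cached_block *
select_block (const struct cached_block *blocks, size_t n,
              uint64_t query, uint64_t now)
{
  const struct cached_block *best = NULL;

  assert ((NULL != blocks) || (0 == n));
  for (size_t i = 0; i < n; i++)
  {
    const struct cached_block *b = &blocks[i];

    if ((b->query != query) || (b->expiration <= now))
      continue;   /* different key, or already expired */
    if ((NULL == best) || (b->expiration > best->expiration))
      best = b;
  }
  return best;
}
```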
8246
8247 @node Store2
8248 @subsubsection Store2
8249
8250
8251
8252 The @code{cache_block} function is expected to try to store the block in
8253 the database, and return @code{GNUNET_SYSERR} if this was not possible
8254 for any reason.
8255 Furthermore, @code{cache_block} is expected to implicitly perform cache
8256 maintenance and purge blocks from the cache that have expired. Note that
8257 @code{cache_block} might encounter the case where the database already has
8258 another block stored under the same key. In this case, the plugin must
8259 ensure that the block with the larger expiration time is preserved.
Obviously, this can be done either by simply adding new blocks and selecting
8261 for the most recent expiration time during lookup, or by checking which
8262 block is more recent during the store operation.
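
The second strategy, deciding during the store operation, can be
sketched as follows. The fixed-size array stands in for the database,
and all names are hypothetical; the return value -1 corresponds to the
@code{GNUNET_SYSERR} case described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 4

struct cached_block
{
  int used;
  uint64_t query;       /* stand-in for the query hash */
  uint64_t expiration;  /* absolute expiration time in seconds */
};

/* Store a block in a cache of CACHE_SLOTS entries.
   Returns 0 on success and -1 if the block could not be stored. */
int
cache_store (struct cached_block *cache, uint64_t query,
             uint64_t expiration, uint64_t now)
{
  struct cached_block *target = NULL;

  assert (NULL != cache);
  /* Implicit cache maintenance: purge everything that has expired. */
  for (size_t i = 0; i < CACHE_SLOTS; i++)
    if (cache[i].used && (cache[i].expiration <= now))
      cache[i].used = 0;
  /* Prefer the slot that already holds this key, otherwise take the
     first free slot. */
  for (size_t i = 0; i < CACHE_SLOTS; i++)
  {
    if (cache[i].used && (cache[i].query == query))
    {
      target = &cache[i];
      break;
    }
    if ((! cache[i].used) && (NULL == target))
      target = &cache[i];
  }
  if (NULL == target)
    return -1;   /* cache is full */
  if (target->used && (target->expiration >= expiration))
    return 0;    /* the stored block expires later: keep it */
  target->used = 1;
  target->query = query;
  target->expiration = expiration;
  return 0;
}
```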
8263
8264 @cindex REVOCATION Subsystem
8265 @node REVOCATION Subsystem
8266 @section REVOCATION Subsystem
8267
8268
The REVOCATION subsystem is responsible for key revocation of Egos.
If users learn that their private key has been compromised, or if they
have lost it, they can use the REVOCATION system to inform all of the
other users that their private key is no longer valid.
8273 The subsystem thus includes ways to query for the validity of keys and to
8274 propagate revocation messages.
8275
8276 @menu
8277 * Dissemination::
8278 * Revocation Message Design Requirements::
8279 * libgnunetrevocation::
8280 * The REVOCATION Client-Service Protocol::
8281 * The REVOCATION Peer-to-Peer Protocol::
8282 @end menu
8283
8284 @node Dissemination
8285 @subsection Dissemination
8286
8287
8288
8289 When a revocation is performed, the revocation is first of all
8290 disseminated by flooding the overlay network.
8291 The goal is to reach every peer, so that when a peer needs to check if a
8292 key has been revoked, this will be purely a local operation where the
8293 peer looks at its local revocation list. Flooding the network is also the
8294 most robust form of key revocation --- an adversary would have to control
8295 a separator of the overlay graph to restrict the propagation of the
8296 revocation message. Flooding is also very easy to implement --- peers that
8297 receive a revocation message for a key that they have never seen before
8298 simply pass the message to all of their neighbours.
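
The flooding rule can be illustrated with a toy simulation. The sketch
below is not the REVOCATION service: it models a fixed network of four
peers with an adjacency matrix, all names are hypothetical, and message
delivery is simulated by a recursive function call.

```c
#include <assert.h>
#include <stddef.h>

#define PEERS 4

struct network
{
  int adjacent[PEERS][PEERS];  /* 1 if two peers are neighbours */
  int seen[PEERS];             /* 1 if the peer knows the revocation */
  unsigned int messages;       /* total number of messages sent */
};

/* Deliver the revocation to `peer`.  A peer that has seen it before
   drops it; otherwise it remembers it and passes it on to all of its
   neighbours. */
void
flood (struct network *net, int peer)
{
  assert ((NULL != net) && (peer >= 0) && (peer < PEERS));
  if (net->seen[peer])
    return;
  net->seen[peer] = 1;
  for (int n = 0; n < PEERS; n++)
    if (net->adjacent[peer][n])
    {
      net->messages++;
      flood (net, n);
    }
}
```

In a ring of four peers (four edges), starting the flood at one peer
reaches every peer and sends eight messages, one in each direction of
every edge.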
8299
8300 Flooding can only distribute the revocation message to peers that are
8301 online.
8302 In order to notify peers that join the network later, the revocation
8303 service performs efficient set reconciliation over the sets of known
8304 revocation messages whenever two peers (that both support REVOCATION
8305 dissemination) connect.
8306 The SET service is used to perform this operation efficiently.
8307
8308 @node Revocation Message Design Requirements
8309 @subsection Revocation Message Design Requirements
8310
8311
8312
8313 However, flooding is also quite costly, creating O(|E|) messages on a
8314 network with |E| edges.
8315 Thus, revocation messages are required to contain a proof-of-work, the
8316 result of an expensive computation (which, however, is cheap to verify).
8317 Only peers that have expended the CPU time necessary to provide
8318 this proof will be able to flood the network with the revocation message.
8319 This ensures that an attacker cannot simply flood the network with
8320 millions of revocation messages. The proof-of-work required by GNUnet is
8321 set to take days on a typical PC to compute; if the ability to quickly
revoke a key is needed, users have the option to pre-compute revocation
messages to store off-line and use instantly after their key has been
compromised.
8324
8325 Revocation messages must also be signed by the private key that is being
8326 revoked. Thus, they can only be created while the private key is in the
8327 possession of the respective user. This is another reason to create a
8328 revocation message ahead of time and store it in a secure location.
8329
8330 @node libgnunetrevocation
8331 @subsection libgnunetrevocation
8332
8333
8334
8335 The REVOCATION API consists of two parts, to query and to issue
8336 revocations.
8337
8338
8339 @menu
8340 * Querying for revoked keys::
8341 * Preparing revocations::
8342 * Issuing revocations::
8343 @end menu
8344
8345 @node Querying for revoked keys
8346 @subsubsection Querying for revoked keys
8347
8348
8349
8350 @code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
8351 key has been revoked.
8352 The given callback will be invoked with the result of the check.
8353 The query can be canceled using @code{GNUNET_REVOCATION_query_cancel} on
8354 the return value.
8355
8356 @node Preparing revocations
8357 @subsubsection Preparing revocations
8358
8359
8360
8361 It is often desirable to create a revocation record ahead-of-time and
8362 store it in an off-line location to be used later in an emergency.
8363 This is particularly true for GNUnet revocations, where performing the
8364 revocation operation itself is computationally expensive and thus is
8365 likely to take some time.
8366 Thus, if users want the ability to perform revocations quickly in an
8367 emergency, they must pre-compute the revocation message.
8368 The revocation API enables this with two functions that are used to
8369 compute the revocation message, but not trigger the actual revocation
8370 operation.
8371
8372 @code{GNUNET_REVOCATION_check_pow} should be used to calculate the
8373 proof-of-work required in the revocation message. This function takes the
8374 public key, the required number of bits for the proof of work (which in
8375 GNUnet is a network-wide constant) and finally a proof-of-work number as
8376 arguments.
8377 The function then checks if the given proof-of-work number is a valid
8378 proof of work for the given public key. Clients preparing a revocation
are expected to call this function repeatedly (typically with a
monotonically increasing sequence of proof-of-work numbers)
until a given number satisfies the check.
8382 That number should then be saved for later use in the revocation
8383 operation.
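
The search loop described above can be sketched as follows. Everything
in this example is an assumption made for illustration: the functions do
not have the signature of @code{GNUNET_REVOCATION_check_pow}, and a
64-bit FNV-1a hash serves as a cheap stand-in for the real, much more
expensive proof-of-work function. Only the structure of the loop, trying
increasing numbers until the check succeeds, reflects the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the proof-of-work hash: 64-bit FNV-1a over the
   key bytes followed by the eight bytes of the nonce. */
static uint64_t
toy_hash (const uint8_t *key, size_t key_len, uint64_t nonce)
{
  uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */

  for (size_t i = 0; i < key_len; i++)
  {
    h ^= key[i];
    h *= 1099511628211ULL;                /* FNV prime */
  }
  for (int i = 0; i < 8; i++)
  {
    h ^= (uint8_t) (nonce >> (8 * i));
    h *= 1099511628211ULL;
  }
  return h;
}

/* Number of leading zero bits of a 64-bit value. */
static unsigned int
leading_zero_bits (uint64_t h)
{
  unsigned int n = 0;

  while ((n < 64) && (0 == (h & (1ULL << (63 - n)))))
    n++;
  return n;
}

/* Returns 1 if `nonce` is a valid proof of work for `key`, that is,
   if the hash has at least `bits` leading zero bits. */
int
check_pow (const uint8_t *key, size_t key_len, uint64_t nonce,
           unsigned int bits)
{
  assert (bits <= 64);
  return leading_zero_bits (toy_hash (key, key_len, nonce)) >= bits;
}

/* Try the nonces 0, 1, 2, ... until one satisfies the check. */
uint64_t
find_pow (const uint8_t *key, size_t key_len, unsigned int bits)
{
  uint64_t nonce = 0;

  while (! check_pow (key, key_len, nonce, bits))
    nonce++;
  return nonce;
}
```

The number returned by the search would then be saved together with the
signature for later use.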
8384
8385 @code{GNUNET_REVOCATION_sign_revocation} is used to generate the
8386 signature that is required in a revocation message.
8387 It takes the private key that (possibly in the future) is to be revoked
8388 and returns the signature.
8389 The signature can again be saved to disk for later use, which will then
8390 allow performing a revocation even without access to the private key.
8391
8392 @node Issuing revocations
8393 @subsubsection Issuing revocations
8394
8395
Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
8398 @code{GNUNET_REVOCATION_revoke} can be used to perform the
8399 actual revocation. The given callback is called upon completion of the
8400 operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
8401 library from calling the continuation; however, in that case it is
8402 undefined whether or not the revocation operation will be executed.
8403
8404 @node The REVOCATION Client-Service Protocol
8405 @subsection The REVOCATION Client-Service Protocol
8406
8407
8408 The REVOCATION protocol consists of four simple messages.
8409
8410 A @code{QueryMessage} containing a public ECDSA key is used to check if a
8411 particular key has been revoked. The service responds with a
8412 @code{QueryResponseMessage} which simply contains a bit that says if the
8413 given public key is still valid, or if it has been revoked.
8414
8415 The second possible interaction is for a client to revoke a key by
8416 passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
8417 contains the ECDSA public key to be revoked, a signature by the
corresponding private key and the proof-of-work. The service responds
with a @code{RevocationResponseMessage} which can be used to indicate
that the @code{RevokeMessage} was invalid (e.g. proof of work incorrect),
8421 or otherwise indicates that the revocation has been processed
8422 successfully.
8423
8424 @node The REVOCATION Peer-to-Peer Protocol
8425 @subsection The REVOCATION Peer-to-Peer Protocol
8426
8427
8428
8429 Revocation uses two disjoint ways to spread revocation information among
8430 peers.
8431 First of all, P2P gossip exchanged via CORE-level neighbours is used to
8432 quickly spread revocations to all connected peers.
8433 Second, whenever two peers (that both support revocations) connect,
8434 the SET service is used to compute the union of the respective revocation
8435 sets.
8436
8437 In both cases, the exchanged messages are @code{RevokeMessage}s which
8438 contain the public key that is being revoked, a matching ECDSA signature,
8439 and a proof-of-work.
8440 Whenever a peer learns about a new revocation this way, it first
8441 validates the signature and the proof-of-work, then stores it to disk
(typically to a file @file{$GNUNET_DATA_HOME/revocation.dat}) and finally
8443 spreads the information to all directly connected neighbours.
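
The validate, store and spread sequence can be modelled with a few lines
of Python. This is a sketch under stated assumptions: the function name
@code{handle_revocation} and the callbacks @code{is_valid}, @code{store}
and @code{send} are hypothetical stand-ins for the signature and
proof-of-work check, the write to disk and the transmission to a CORE
neighbour.

@example
def handle_revocation(msg, known, neighbours, is_valid, store, send):
    """Process one incoming revocation; return True if it was new."""
    if msg in known:
        return False      # already known, nothing to do
    if not is_valid(msg):
        return False      # bad signature or proof-of-work: drop it
    known.add(msg)
    store(msg)            # persist before telling anyone else
    for peer in neighbours:
        send(peer, msg)   # spread to all directly connected neighbours
    return True
@end example

@noindent
Because a revocation that is already known is not forwarded again, the
flooding in this model stops once every reachable peer has stored the
message.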
8444
8445 For computing the union using the SET service, the peer with the smaller
8446 hashed peer identity will connect (as a "client" in the two-party set
8447 protocol) to the other peer after one second (to reduce traffic spikes
8448 on connect) and initiate the computation of the set union.
8449 All revocation services use a common hash to identify the SET operation
8450 over revocation sets.
8451
8452 The current implementation accepts revocation set union operations from
8453 all peers at any time; however, well-behaved peers should only initiate
8454 this operation once after establishing a connection to a peer with a
8455 larger hashed peer identity.
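
The rule for deciding which side initiates the set union can be
illustrated as follows. The sketch assumes that the hashed peer identity
is the SHA-512 digest of the raw identity and that digests are compared
byte-wise; both are assumptions for this example, as are the function
names.

@example
import hashlib

def hashed_identity(peer_id):
    """Assumed hashing of a peer identity given as bytes."""
    return hashlib.sha512(peer_id).digest()

def should_initiate(my_id, other_id):
    """True if this peer acts as the client of the set union."""
    return hashed_identity(my_id) < hashed_identity(other_id)
@end example

@noindent
For two distinct peers exactly one side obtains @code{True}, so only one
set union operation is started per connection. In the scheme described
above, that side would additionally wait one second before connecting.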
8456
8457 @cindex FS
8458 @cindex FS Subsystem
8459 @node File-sharing (FS) Subsystem
8460 @section File-sharing (FS) Subsystem
8461
8462
8463
8464 This chapter describes the details of how the file-sharing service works.
8465 As with all services, it is split into an API (libgnunetfs), the service
8466 process (gnunet-service-fs) and user interface(s).
8467 The file-sharing service uses the datastore service to store blocks and
8468 the DHT (and indirectly datacache) for lookups for non-anonymous
8469 file-sharing.
8470 Furthermore, the file-sharing service uses the block library (and the
8471 block fs plugin) for validation of DHT operations.
8472
8473 In contrast to many other services, libgnunetfs is rather complex since
8474 the client library includes a large number of high-level abstractions;
this is necessary since the FS service itself largely only operates on
8476 the block level.
8477 The FS library is responsible for providing a file-based abstraction to
8478 applications, including directories, meta data, keyword search,
8479 verification, and so on.
8480
8481 The method used by GNUnet to break large files into blocks and to use
8482 keyword search is called the
8483 "Encoding for Censorship Resistant Sharing" (ECRS).
ECRS is largely implemented in the FS library; block validation is also
8485 reflected in the block FS plugin and the FS service.
8486 ECRS on-demand encoding is implemented in the FS service.
8487
8488 NOTE: The documentation in this chapter is quite incomplete.
8489
8490 @menu
8491 * Encoding for Censorship-Resistant Sharing (ECRS)::
8492 * File-sharing persistence directory structure::
8493 @end menu
8494
8495 @cindex ECRS
8496 @cindex Encoding for Censorship-Resistant Sharing
8497 @node Encoding for Censorship-Resistant Sharing (ECRS)
8498 @subsection Encoding for Censorship-Resistant Sharing (ECRS)
8499
8500
8501
8502 When GNUnet shares files, it uses a content encoding that is called ECRS,
8503 the Encoding for Censorship-Resistant Sharing.
8504 Most of ECRS is described in the (so far unpublished) research paper
8505 attached to this page. ECRS obsoletes the previous ESED and ESED II
8506 encodings which were used in GNUnet before version 0.7.0.
8507 The rest of this page assumes that the reader is familiar with the
8508 attached paper. What follows is a description of some minor extensions
8509 that GNUnet makes over what is described in the paper.
8510 The reason why these extensions are not in the paper is that we felt
8511 that they were obvious or trivial extensions to the original scheme and
8512 thus did not warrant space in the research report.
8513
8514 @menu
8515 * Namespace Advertisements::
8516 * KSBlocks::
8517 @end menu
8518
8519 @node Namespace Advertisements
8520 @subsubsection Namespace Advertisements
8521
8522
8523 @c %**FIXME: all zeroses -> ?
8524
8525 An @code{SBlock} with identifier all zeros is a signed
8526 advertisement for a namespace. This special @code{SBlock} contains
8527 metadata describing the content of the namespace.
8528 Instead of the name of the identifier for a potential update, it contains
8529 the identifier for the root of the namespace.
8530 The URI should always be empty. The @code{SBlock} is signed with the
8531 content provider's RSA private key (just like any other SBlock). Peers
8532 can search for @code{SBlock}s in order to find out more about a namespace.
8533
8534 @node KSBlocks
8535 @subsubsection KSBlocks
8536
8537
8538
8539 GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead
8540 of encrypting a CHK and metadata, encrypt an @code{SBlock} instead.
8541 In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
8542 using the global keyword search.
8543 Usually the encrypted @code{SBlock} is a namespace advertisement.
The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
peers to discover namespaces via keyword searches and to associate
8546 useful information with namespaces. When GNUnet finds @code{KSBlocks}
8547 during a normal keyword search, it adds the information to an internal
8548 list of discovered namespaces. Users looking for interesting namespaces
8549 can then inspect this list, reducing the need for out-of-band discovery
8550 of namespaces.
8551 Naturally, namespaces (or more specifically, namespace advertisements) can
8552 also be referenced from directories, but @code{KSBlock}s should make it
8553 easier to advertise namespaces for the owner of the pseudonym since they
8554 eliminate the need to first create a directory.
8555
8556 Collections are also advertised using @code{KSBlock}s.
8557
8558 @c https://old.gnunet.org/sites/default/files/ecrs.pdf
8559
8560 @node File-sharing persistence directory structure
8561 @subsection File-sharing persistence directory structure
8562
8563
8564
8565 This section documents how the file-sharing library implements
8566 persistence of file-sharing operations and specifically the resulting
8567 directory structure.
8568 This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
8569 was set when calling @code{GNUNET_FS_start}.
8570 In this case, the file-sharing library will try hard to ensure that all
8571 major operations (searching, downloading, publishing, unindexing) are
8572 persistent, that is, can live longer than the process itself.
8573 More specifically, an operation is supposed to live until it is
8574 explicitly stopped.
8575
8576 If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
8577 @code{SUSPEND} event is generated and then when the process calls
8578 @code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
8579 Additionally, even if an application crashes (segfault, SIGKILL, system
8580 crash) and hence @code{GNUNET_FS_stop} is never called and no
8581 @code{SUSPEND} events are generated, operations are still resumed (with
8582 @code{RESUME} events).
8583 This is implemented by constantly writing the current state of the
8584 file-sharing operations to disk.
8585 Specifically, the current state is always written to disk whenever
anything significant changes (the exception is block-wise progress in
8587 publishing and unindexing, since those operations would be slowed down
8588 significantly and can be resumed cheaply even without detailed
8589 accounting).
8590 Note that if the process crashes (or is killed) during a serialization
8591 operation, FS does not guarantee that this specific operation is
8592 recoverable (no strict transactional semantics, again for performance
8593 reasons). However, all other unrelated operations should resume nicely.
8594
8595 Since we need to serialize the state continuously and want to recover as
8596 much as possible even after crashing during a serialization operation,
8597 we do not use one large file for serialization.
8598 Instead, several directories are used for the various operations.
8599 When @code{GNUNET_FS_start} executes, the master directories are scanned
8600 for files describing operations to resume.
8601 Sometimes, these operations can refer to related operations in child
8602 directories which may also be resumed at this point.
8603 Note that corrupted files are cleaned up automatically.
8604 However, dangling files in child directories (those that are not
8605 referenced by files from the master directories) are not automatically
8606 removed.
8607
8608 Persistence data is kept in a directory that begins with the "STATE_DIR"
8609 prefix from the configuration file
8610 (by default, "$SERVICEHOME/persistence/") followed by the name of the
8611 client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk")
8612 followed by the actual name of the master or child directory.
8613
8614 The names for the master directories follow the names of the operations:
8615
8616 @itemize @bullet
8617 @item "search"
8618 @item "download"
8619 @item "publish"
8620 @item "unindex"
8621 @end itemize
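
How these paths are composed, and how the master directories could be
scanned on startup, is sketched below in Python. The helper names are
hypothetical and the code only lists candidate files; it does not parse
the serialization format used by the FS library.

@example
import os

MASTER_DIRECTORIES = ("search", "download", "publish", "unindex")

def operation_directory(state_dir, client_name, directory):
    """Return STATE_DIR/<client name>/<master or child directory>."""
    return os.path.join(state_dir, client_name, directory)

def find_master_operations(state_dir, client_name):
    """List (operation, file name) pairs found in the master directories."""
    found = []
    for operation in MASTER_DIRECTORIES:
        path = operation_directory(state_dir, client_name, operation)
        if not os.path.isdir(path):
            continue  # nothing was persisted for this kind of operation
        for name in sorted(os.listdir(path)):
            found.append((operation, name))
    return found
@end example

@noindent
For a client named "gnunet-gtk", the first helper yields, for example,
the "search" directory below the client's directory under the configured
state directory.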
8622
8623 Each of the master directories contains names (chosen at random) for each
8624 active top-level (master) operation.
8625 Note that a download that is associated with a search result is not a
8626 top-level operation.
8627
8628 In contrast to the master directories, the child directories are only
8629 consulted when another operation refers to them.
8630 For each search, a subdirectory (named after the master search
8631 synchronization file) contains the search results.
8632 Search results can have an associated download, which is then stored in
8633 the general "download-child" directory.
8634 Downloads can be recursive, in which case children are stored in
8635 subdirectories mirroring the structure of the recursive download
8636 (either starting in the master "download" directory or in the
8637 "download-child" directory depending on how the download was initiated).
8638 For publishing operations, the "publish-file" directory contains
8639 information about the individual files and directories that are part of
8640 the publication.
8641 However, this directory structure is flat and does not mirror the
8642 structure of the publishing operation.
8643 Note that unindex operations cannot have associated child operations.
8644
8645 @cindex REGEX subsystem
8646 @node REGEX Subsystem
8647 @section REGEX Subsystem
8648
8649
8650
8651 Using the REGEX subsystem, you can discover peers that offer a particular
8652 service using regular expressions.
The peers that offer a service specify it using a regular expression.
8654 Peers that want to patronize a service search using a string.
8655 The REGEX subsystem will then use the DHT to return a set of matching
8656 offerers to the patrons.
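
The relationship between announced regular expressions and search
strings can be modelled locally, without the DHT, as in the following
Python sketch. It assumes that a regular expression has to match the
whole search string and it uses Python's @code{re} syntax, which may
differ from the syntax accepted by the REGEX subsystem; the peer names
and expressions are made up for illustration.

@example
import re

def matching_offerers(announcements, search_string):
    """Return the peers whose announced regex matches the whole string."""
    return sorted(
        peer
        for peer, regex in announcements
        if re.fullmatch(regex, search_string)
    )

# Hypothetical announcements: (peer, regular expression) pairs.
announcements = [
    ("peer-A", r"GNVPN-0001-PAD(0|1)*"),
    ("peer-B", r"GNVPN-0001-PAD1(0|1)(0|1)"),
]
@end example

@noindent
A patron searching for a string is handed the set of offerers whose
expression accepts that string; the subsystem computes this set in a
distributed fashion instead of iterating over all announcements.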
8657
8658 For the technical details, we have Max's defense talk and Max's Master's
8659 thesis.
8660
8661 @c An additional publication is under preparation and available to
8662 @c team members (in Git).
8663 @c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
8664
8665 @menu
8666 * How to run the regex profiler::
8667 @end menu
8668
8669 @node How to run the regex profiler
8670 @subsection How to run the regex profiler
8671
8672
8673
8674 The gnunet-regex-profiler can be used to profile the usage of mesh/regex
8675 for a given set of regular expressions and strings.
8676 Mesh/regex allows you to announce your peer ID under a certain regex and
8677 search for peers matching a particular regex using a string.
8678 See @uref{https://bib.gnunet.org/full/date.html#2012_5f2, szengel2012ms} for a full
8679 introduction.
8680
8681 First of all, the regex profiler uses GNUnet testbed, thus all the
8682 implications for testbed also apply to the regex profiler
8683 (for example you need password-less ssh login to the machines listed in
8684 your hosts file).
8685
8686 @strong{Configuration}
8687
8688 Moreover, an appropriate configuration file is needed.
8689 Generally you can refer to the
@file{contrib/regex_profiler_infiniband.conf} file in the source code
8691 of GNUnet for an example configuration.
8692 In the following paragraph the important details are highlighted.
8693
8694 Announcing of the regular expressions is done by the
8695 gnunet-daemon-regexprofiler, therefore you have to make sure it is
8696 started, by adding it to the START_ON_DEMAND set of ARM:
8697
8698 @example
8699 [regexprofiler]
8700 START_ON_DEMAND = YES
8701 @end example
8702
8703 @noindent
8704 Furthermore you have to specify the location of the binary:
8705
8706 @example
8707 [regexprofiler]
8708 # Location of the gnunet-daemon-regexprofiler binary.
8709 BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
8710 # Regex prefix that will be applied to all regular expressions and
8711 # search string.
8712 REGEX_PREFIX = "GNVPN-0001-PAD"
8713 @end example
8714
8715 @noindent
8716 When running the profiler with a large scale deployment, you probably
8717 want to reduce the workload of each peer.
8718 Use the following options to do this.
8719
8720 @example
8721 [dht]
8722 # Force network size estimation
8723 FORCE_NSE = 1
8724
8725 [dhtcache]
8726 DATABASE = heap
8727 # Disable RC-file for Bloom filter? (for benchmarking with limited IO
8728 # availability)
8729 DISABLE_BF_RC = YES
8730 # Disable Bloom filter entirely
8731 DISABLE_BF = YES
8732
8733 [nse]
8734 # Minimize proof-of-work CPU consumption by NSE
8735 WORKBITS = 1
8736 @end example
8737
8738 @noindent
8739 @strong{Options}
8740
8741 To finally run the profiler some options and the input data need to be
8742 specified on the command line.
8743
8744 @example
8745 gnunet-regex-profiler -c config-file -d log-file -n num-links \
8746 -p path-compression-length -s search-delay -t matching-timeout \
8747 -a num-search-strings hosts-file policy-dir search-strings-file
8748 @end example
8749
8750 @noindent
Where:

@itemize @bullet
@item @code{config-file} is the configuration file created earlier.
@item @code{log-file} is the file to which statistics output is written.
@item @code{num-links} is the number of random links between started
peers.
@item @code{path-compression-length} is the maximum path compression
length in the DFA.
@item @code{search-delay} is the time to wait between the peers
finishing linking and the start of string matching.
@item @code{matching-timeout} is the timeout after which the search is
cancelled.
@item @code{num-search-strings} is the number of strings in the
@code{search-strings-file}.
@item @code{hosts-file} contains the list of hosts for the testbed, one
per line, in the following format:

@itemize @bullet
@item @code{user@@host_ip:port}
@end itemize
@item @code{policy-dir} is a folder of text files, each containing one or
more regular expressions. A peer is started for each file in that folder
and announces the regular expressions in its file.
@item @code{search-strings-file} is a text file containing search
strings, one per line.
@end itemize
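
@noindent
As an illustration of the hosts file format, a file for two testbed
machines might look as follows. The user names, addresses and ports are
placeholders chosen for this example (the addresses are taken from a
range reserved for documentation), not values from a real deployment.

@example
alice@@192.0.2.10:22
bob@@192.0.2.11:22
@end example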
8779
8780 @noindent
8781 You can create regular expressions and search strings for every AS in the
8782 Internet using the attached scripts. You need one of the
8783 @uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
8784 data files for this. Run
8785
8786 @example
8787 create_regex.py <filename> <output path>
8788 @end example
8789
8790 @noindent
8791 to create the regular expressions and
8792
8793 @example
8794 create_strings.py <input path> <outfile>
8795 @end example
8796
8797 @noindent
8798 to create a search strings file from the previously created
8799 regular expressions.
8800
8801 @cindex REST subsystem
8802 @node REST Subsystem
8803 @section REST Subsystem
8804
8805
8806
8807 Using the REST subsystem, you can expose REST-based APIs or services.
8808 The REST service is designed as a pluggable architecture.
8809 To create a new REST endpoint, simply add a library in the form
8810 ``plugin_rest_*''.
8811 The REST service will automatically load all REST plugins on startup.
8812
8813 @strong{Configuration}
8814
8815 The REST service can be configured in various ways.
8816 The reference config file can be found in
8817 @file{src/rest/rest.conf}:
8818 @example
8819 [rest]
8820 REST_PORT=7776
8821 REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
8822 REST_ALLOW_ORIGIN=*
8823 REST_ALLOW_CREDENTIALS=true
8824 @end example
8825
The port as well as the @dfn{cross-origin resource sharing} (CORS)
headers that are supposed to be advertised by the REST service are
configurable.
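
To make the role of these options more concrete, the following Python
sketch reads the reference configuration shown above and derives the
standard CORS response headers from it. The mapping from option names to
header names is an assumption based on the option names; it is not taken
from the REST service source code, and @code{cors_headers} is a
hypothetical helper.

@example
import configparser

REST_CONF = """
[rest]
REST_PORT=7776
REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
REST_ALLOW_ORIGIN=*
REST_ALLOW_CREDENTIALS=true
"""

def cors_headers(conf_text):
    """Map the [rest] options to the assumed CORS response headers."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    rest = parser["rest"]
    return [
        ("Access-Control-Allow-Origin", rest["REST_ALLOW_ORIGIN"]),
        ("Access-Control-Allow-Headers", rest["REST_ALLOW_HEADERS"]),
        ("Access-Control-Allow-Credentials", rest["REST_ALLOW_CREDENTIALS"]),
    ]
@end example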
8831
8832 @menu
8833 * Namespace considerations::
8834 * Endpoint documentation::
8835 @end menu
8836
8837 @node Namespace considerations
8838 @subsection Namespace considerations
8839
8840 The @command{gnunet-rest-service} will load all plugins that are installed.
8841 As such it is important that the endpoint namespaces do not clash.
8842
8843 For example, plugin X might expose the endpoint ``/xxx'' while plugin Y
8844 exposes endpoint ``/xxx/yyy''.
8845 This is a problem if plugin X is also supposed to handle a call
8846 to ``/xxx/yyy''.
8847 Currently the REST service will not complain or warn about such clashes,
8848 so please make sure that endpoints are unambiguous.
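
Since the service does not detect such clashes itself, a plugin author
may want to check a list of endpoint namespaces before installing a new
plugin. The following Python sketch shows one way to do this; the
function name is hypothetical and the check only covers the case
described above, where one endpoint is a path prefix of another.

@example
def find_namespace_clashes(endpoints):
    """Return (shorter, longer) pairs where one endpoint contains another."""
    clashes = []
    ordered = sorted(endpoints)
    for i, first in enumerate(ordered):
        # Compare whole path components, so "/xxx" and "/xxxy" do not clash.
        prefix = first.rstrip("/") + "/"
        for second in ordered[i + 1:]:
            if second == first or second.startswith(prefix):
                clashes.append((first, second))
    return clashes
@end example

@noindent
Applied to the endpoints from the example above, the function reports
the pair consisting of ``/xxx'' and ``/xxx/yyy''.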
8849
8850 @node Endpoint documentation
8851 @subsection Endpoint documentation
8852
This is a work in progress. Endpoints should be documented
appropriately, preferably using annotations.
8855
8856
8857 @cindex RPS Subsystem
8858 @node RPS Subsystem
8859 @section RPS Subsystem
8860
8861 In literature, Random Peer Sampling (RPS) refers to the problem of
8862 reliably@footnote{"Reliable" in this context means having no bias,
8863 neither spatial, nor temporal, nor through malicious activity.} drawing
8864 random samples from an unstructured p2p network.
8865
8866 Doing so in a reliable manner is not only hard because of inherent
8867 problems but also because of possible malicious peers that could try to
8868 bias the selection.
8869
It is useful for all kinds of gossip protocols that require the
selection of random peers from the whole network, for example for
gathering statistics, spreading and aggregating information in the
network, load balancing and overlay topology management.
8874
8875 The approach chosen in the RPS service implementation in GNUnet follows
the @uref{https://bib.gnunet.org/full/date.html#2009_5f0, Brahms}
8877 design.
8878
8879 The current state is "work in progress". There are a lot of things that
8880 need to be done, primarily finishing the experimental evaluation and a
8881 re-design of the API.
8882
The abstract idea is to connect to (or start) the RPS service and
request random peers, which are returned once they represent, with high
probability, a random selection from the whole network.
8886
An additional feature over the original Brahms design is the selection of
8888 sub-groups: The GNUnet implementation of RPS enables clients to ask for
8889 random peers from a group that is defined by a common shared secret.
8890 (The secret could of course also be public, depending on the use-case.)
8891
Another addition to the original protocol was made: the sampler
mechanism that was introduced in Brahms was slightly adapted and is used
to actually sample the peers that are returned to the client.
This is necessary as the original design only keeps peers connected to
random other peers in the network. For the peers returned to client
requests to be independently random, they cannot be drawn from the
connected peers.
8899 The adapted sampler makes sure that each request for random peers is
8900 independent from the others.
8901
8902 @menu
8903 * Brahms::
8904 @end menu
8905
8906 @node Brahms
8907 @subsection Brahms
The high-level concept of Brahms is two-fold: combining push-pull gossip
with locally fixing an assumed bias using cryptographic min-wise
permutations.
The central data structure is the view, a peer's current local sample.
This view is used to select peers to push to and pull from.
This simple mechanism can be biased easily. For this reason Brahms
'fixes' the bias by using the so-called sampler, a data structure that
takes a list of elements as input and outputs a random one of them
independently of the frequency in the input set. Both an element that
was put into the sampler a single time and an element that was put into
it a million times have the same probability of being the output.
This is achieved by exploiting min-wise independent permutations.
In the RPS service we use HMACs: on the initialisation of a sampler
element, a key is chosen at random. On each input the HMAC with the
random key is computed. The sampler element keeps the element with the
minimal HMAC.
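
A sampler element as described above can be sketched in a few lines of
Python. The choice of HMAC-SHA256, the class name and the option to pass
a fixed key (useful only to make runs reproducible) are assumptions of
this example and not a description of the RPS service code.

@example
import hashlib
import hmac
import os

class SamplerElement:
    """Keeps the input whose keyed hash is minimal."""

    def __init__(self, key=None):
        # A fresh random key per sampler element, unless one is supplied.
        self.key = os.urandom(32) if key is None else key
        self.peer_id = None
        self.min_hash = None

    def update(self, peer_id):
        """Feed one peer identity (bytes) from the input stream."""
        digest = hmac.new(self.key, peer_id, hashlib.sha256).digest()
        if self.min_hash is None or digest < self.min_hash:
            self.min_hash = digest
            self.peer_id = peer_id

def sample(stream, key=None):
    """Run a stream of peer identities through one sampler element."""
    element = SamplerElement(key)
    for peer_id in stream:
        element.update(peer_id)
    return element.peer_id
@end example

@noindent
Feeding the same identity a thousand times computes the same HMAC a
thousand times, so repetition cannot change which element is kept. With
a fixed key, the output depends only on the set of distinct inputs, and
the random choice of the key is what makes each distinct input equally
likely to be the minimum.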
8924
8925 In order to fix the bias in the view, a fraction of the elements in the
8926 view are sampled through the sampler from the random stream of peer IDs.
8927
According to the theoretical analysis of Bortnikov et al., this suffices
to keep the network connected and to keep random peers in the view.
8930