doc/handbook/chapters/developer.texi

   1 @c ***********************************************************************
   2 @node GNUnet Developer Handbook
   3 @chapter GNUnet Developer Handbook
   4
   5 This book is intended to be an introduction for programmers who want to
   6 extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
   7 application.
   8
   9 For developers, GNUnet is:
  10
  11 @itemize @bullet
  12 @item developed by a community that believes in the GNU philosophy
  13 @item Free Software (Free as in Freedom), licensed under the
  14 GNU Affero General Public License
  15 (@uref{https://www.gnu.org/licenses/licenses.html#AGPL})
  16 @item A set of standards, including coding conventions and
  17 architectural rules
  18 @item A set of layered protocols, both specifying the communication
  19 between peers as well as the communication between components
  20 of a single peer
  21 @item A set of libraries with well-defined APIs suitable for
  22 writing extensions
  23 @end itemize
  24
  25 In particular, the architecture specifies that a peer consists of many
  26 processes communicating via protocols. Processes can be written in almost
  27 any language.
  28 @code{C}, @code{Java} and @code{Guile} APIs exist for accessing existing
  29 services and for writing extensions.
  30 It is possible to write extensions in other languages by
  31 implementing the necessary IPC protocols.
  32
  33 GNUnet can be extended and improved along many possible dimensions, and
  34 anyone interested in Free Software and Freedom-enhancing Networking is
  35 welcome to join the effort. This Developer Handbook attempts to provide
  36 an initial introduction to some of the key design choices and central
  37 components of the system.
  38 This part of the GNUnet documentation is far from complete,
  39 and we welcome informed contributions, be it in the form of
  40 new chapters, sections or insightful comments.
  41
  42 @menu
  43 * Developer Introduction::
  44 * Internal dependencies::
  45 * Code overview::
  46 * System Architecture::
  47 * Subsystem stability::
  48 * Naming conventions and coding style guide::
  49 * Build-system::
  50 * Developing extensions for GNUnet using the gnunet-ext template::
  51 * Writing testcases::
  52 * Building GNUnet and its dependencies::
  53 * TESTING library::
  54 * Performance regression analysis with Gauger::
  55 * TESTBED Subsystem::
  56 * libgnunetutil::
  57 * Automatic Restart Manager (ARM)::
  58 * TRANSPORT Subsystem::
  59 * NAT library::
  60 * Distance-Vector plugin::
  61 * SMTP plugin::
  62 * Bluetooth plugin::
  63 * WLAN plugin::
  64 * ATS Subsystem::
  65 * CORE Subsystem::
  66 * CADET Subsystem::
  67 * NSE Subsystem::
  68 * HOSTLIST Subsystem::
  69 * IDENTITY Subsystem::
  70 * NAMESTORE Subsystem::
  71 * PEERINFO Subsystem::
  72 * PEERSTORE Subsystem::
  73 * SET Subsystem::
  74 * STATISTICS Subsystem::
  75 * Distributed Hash Table (DHT)::
  76 * GNU Name System (GNS)::
  77 * GNS Namecache::
  78 * REVOCATION Subsystem::
  79 * File-sharing (FS) Subsystem::
  80 * REGEX Subsystem::
  81 * REST Subsystem::
  82 * RPS Subsystem::
  83 @end menu
  84
  85 @node Developer Introduction
  86 @section Developer Introduction
  87
  88 This Developer Handbook is intended as a first introduction to GNUnet for
  89 new developers who want to extend the GNUnet framework. After the
  90 introduction, each of the GNUnet subsystems (directories in the
  91 @file{src/} tree) is (supposed to be) covered in its own chapter. In
  92 addition to this documentation, GNUnet developers should be aware of the
  93 services available to them on the GNUnet server.
  94
  95 New developers can have a look at the GNUnet tutorials for C and Java
  96 available in the @file{src/} directory of the repository or under the
  97 following links:
  98
  99 @c ** FIXME: Link to files in source, not online.
 100 @c ** FIXME: Where is the Java tutorial?
 101 @itemize @bullet
 102 @item @xref{Top, Introduction,, gnunet-c-tutorial, The GNUnet C Tutorial}.
 103 @item @uref{https://docs.gnunet.org/tutorial/gnunet-tutorial.html, GNUnet C tutorial}
 104 @item GNUnet Java tutorial
 105 @end itemize
 106
 107 In addition to the GNUnet Reference Documentation you are reading,
 108 the GNUnet server at @uref{https://gnunet.org} contains
 109 various resources for GNUnet developers and those
 110 who aspire to become regular contributors.
 111 They are all conveniently reachable via the "Developer"
 112 entry in the navigation menu. Some additional tools (such as static
 113 analysis reports) require special developer access to perform certain
 114 operations. If you want (or require) access, you should contact
 115 @uref{http://grothoff.org/christian/, Christian Grothoff},
 116 GNUnet's maintainer.
 117
 118 @c FIXME: A good part of this belongs on the website or should be
 119 @c extended in subsections explaining usage of this. A simple list
 120 @c is just taking space people have to read.
 121 The public subsystems on the GNUnet server that help developers are:
 122
 123 @itemize @bullet
 124
 125 @item The version control system (git) keeps our code and enables
 126 distributed development.
 127 It is publicly accessible at @uref{https://git.gnunet.org/}.
 128 Only developers with write access can commit code; everyone else is
 129 encouraged to submit patches to the GNUnet-developers mailing list:
 130 @uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, https://lists.gnu.org/mailman/listinfo/gnunet-developers}
 131
 132 @item The bugtracking system (Mantis).
 133 We use it to track feature requests, open bug reports and their
 134 resolutions.
 135 It can be accessed at
 136 @uref{https://bugs.gnunet.org/, https://bugs.gnunet.org/}.
 137 Anyone can report bugs.
 138
 139 @item The current quality of our automated test suite is assessed using
 140 code coverage analysis. Testcases that
 141 improve our code coverage are always welcome.
 142
 143 @item We try to find bugs automatically by running static analysis with
 144 various tools. Note that not everything that is flagged by the
 145 analysis is a bug; sometimes even good code can be marked as possibly
 146 problematic. Nevertheless, developers are encouraged to at least be
 147 aware of all issues in their code that are listed.
 148
 149 @item We use Gauger for automatic performance regression visualization.
 150 Details on how to use Gauger are in
 151 @ref{Performance regression analysis with Gauger}.
 152
 153 @end itemize
 154
 155
 156
 157 @c ***********************************************************************
 158 @menu
 159 * Project overview::
 160 @end menu
 161
 162 @node Project overview
 163 @subsection Project overview
 164
 165 The GNUnet project currently consists of several sub-projects. This
 166 section gives an initial overview of the various
 167 sub-projects. Note that this description also lists projects that are far
 168 from complete, including even those that have literally not a single line
 169 of code in them yet.
 170
 171 GNUnet sub-projects in order of likely relevance are currently:
 172
 173 @table @asis
 174
 175 @item @command{gnunet}
 176 Core of the P2P framework, including file-sharing, VPN and
 177 chat applications; this is what the Developer Handbook covers mostly
 178 @item @command{gnunet-gtk}
 179 Gtk+-based user interfaces, including:
 180
 181 @itemize @bullet
 182 @item @command{gnunet-fs-gtk} (file-sharing),
 183 @item @command{gnunet-statistics-gtk} (statistics over time),
 184 @item @command{gnunet-peerinfo-gtk}
 185 (information about current connections and known peers),
 186 @item @command{gnunet-namestore-gtk} (GNS record editor),
 187 @item @command{gnunet-conversation-gtk} (voice chat GUI) and
 188 @item @command{gnunet-setup} (setup tool for "everything")
 189 @end itemize
 190
 191 @item @command{gnunet-fuse}
 192 Mounting directories shared via GNUnet's file-sharing
 193 on GNU/Linux distributions
 194 @item @command{gnunet-update}
 195 Installation and update tool
 196 @item @command{gnunet-ext}
 197 Template for starting 'external' GNUnet projects
 198 @item @command{gnunet-java}
 199 Java APIs for writing GNUnet services and applications
 200 @item @command{gnunet-java-ext}
 201 @item @command{eclectic}
 202 Code to run GNUnet nodes on testbeds for research, development,
 203 testing and evaluation
 204 @c ** FIXME: Solve the status and location of gnunet-qt
 205 @item @command{gnunet-qt}
 206 Qt-based GNUnet GUI (is it deprecated?)
 207 @item @command{gnunet-cocoa}
 208 Cocoa-based GNUnet GUI (is it deprecated?)
 209 @item @command{gnunet-guile}
 210 Guile bindings for GNUnet
 211 @item @command{gnunet-python}
 212 Python bindings for GNUnet
 213
 214 @end table
 215
 216 We are also working on various supporting libraries and tools:
 217 @c ** FIXME: What about gauger, and what about libmwmodem?
 218
 219 @table @asis
 220 @item @command{libextractor}
 221 GNU libextractor (meta data extraction)
 222 @item @command{libmicrohttpd}
 223 GNU libmicrohttpd (embedded HTTP(S) server library)
 224 @item @command{gauger}
 225 Tool for performance regression analysis
 226 @item @command{monkey}
 227 Tool for automated debugging of distributed systems
 228 @item @command{libmwmodem}
 229 Library for accessing satellite connection quality reports
 230 @item @command{libgnurl}
 231 gnURL (feature-restricted variant of cURL/libcurl)
 232 @item @command{www}
 233 the gnunet.org website (Jinja2 based)
 234 @item @command{bibliography}
 235 Our collected bibliography, papers, references, and so forth
 236 @item @command{gnunet-videos}
 237 Videos about and around GNUnet activities
 238 @end table
 239
 240 Finally, there are various external projects (see links for a list of
 241 those that have a public website) which build on top of the GNUnet
 242 framework.
 243
 244 @c ***********************************************************************
 245 @node Internal dependencies
 246 @section Internal dependencies
 247
 248 This section tries to give an overview of what processes a typical GNUnet
 249 peer running a particular application would consist of. All of the
 250 processes listed here should be automatically started by
 251 @command{gnunet-arm -s}.
 252 The list is given as a rough first guide to users for failure diagnostics.
 253 Ideally, end-users should never have to worry about these internal
 254 dependencies.
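
For a first diagnosis it can help to ask ARM which services it is
currently running and to compare the answer with the lists below.
A minimal sketch, assuming a configured peer and that the
@command{gnunet-arm} options @option{-I} (list running services) and
@option{-e} (stop the peer) behave in your installed version as
described in its manual page:

@example
$ gnunet-arm -s    # start the peer (ARM and the default services)
$ gnunet-arm -I    # list the services ARM reports as running
$ gnunet-arm -e    # stop the peer again
@end example

A service that appears in one of the lists below but is missing from
the output of the second command is a reasonable place to start looking.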
 255
 256 In terms of internal dependencies, a minimum file-sharing system consists
 257 of the following GNUnet processes (in order of dependency):
 258
 259 @itemize @bullet
 260 @item gnunet-service-arm
 261 @item gnunet-service-resolver (required by all)
 262 @item gnunet-service-statistics (required by all)
 263 @item gnunet-service-peerinfo
 264 @item gnunet-service-transport (requires peerinfo)
 265 @item gnunet-service-core (requires transport)
 266 @item gnunet-daemon-hostlist (requires core)
 267 @item gnunet-daemon-topology (requires hostlist, peerinfo)
 268 @item gnunet-service-datastore
 269 @item gnunet-service-dht (requires core)
 270 @item gnunet-service-identity
 271 @item gnunet-service-fs (requires identity, cadet, dht, datastore, core)
 272 @end itemize
 273
 274 @noindent
 275 A minimum VPN system consists of the following GNUnet processes (in
 276 order of dependency):
 277
 278 @itemize @bullet
 279 @item gnunet-service-arm
 280 @item gnunet-service-resolver (required by all)
 281 @item gnunet-service-statistics (required by all)
 282 @item gnunet-service-peerinfo
 283 @item gnunet-service-transport (requires peerinfo)
 284 @item gnunet-service-core (requires transport)
 285 @item gnunet-daemon-hostlist (requires core)
 286 @item gnunet-service-dht (requires core)
 287 @item gnunet-service-cadet (requires dht, core)
 288 @item gnunet-service-dns (requires dht)
 289 @item gnunet-service-regex (requires dht)
 290 @item gnunet-service-vpn (requires regex, dns, cadet, dht)
 291 @end itemize
 292
 293 @noindent
 294 A minimum GNS system consists of the following GNUnet processes (in
 295 order of dependency):
 296
 297 @itemize @bullet
 298 @item gnunet-service-arm
 299 @item gnunet-service-resolver (required by all)
 300 @item gnunet-service-statistics (required by all)
 301 @item gnunet-service-peerinfo
 302 @item gnunet-service-transport (requires peerinfo)
 303 @item gnunet-service-core (requires transport)
 304 @item gnunet-daemon-hostlist (requires core)
 305 @item gnunet-service-dht (requires core)
 306 @item gnunet-service-cadet (requires dht, core)
 307 @item gnunet-service-dns (requires dht)
 308 @item gnunet-service-regex (requires dht)
 309 @item gnunet-service-vpn (requires regex, dns, cadet, dht)
 310 @item gnunet-service-identity
 311 @item gnunet-service-namestore (requires identity)
 312 @item gnunet-service-gns (requires vpn, dns, dht, namestore, identity)
 313 @end itemize
 314
 315 @c ***********************************************************************
 316 @node Code overview
 317 @section Code overview
 318
 319 This section gives a brief overview of the GNUnet source code.
 320 Specifically, we sketch the function of each of the subdirectories in
 321 the @file{gnunet/src/} directory. The order given is roughly bottom-up
 322 (in terms of the layers of the system).
 323
 324 @table @asis
 325 @item @file{util/} --- libgnunetutil
 326 Library with general utility functions; all
 327 GNUnet binaries link against this library. It covers anything from memory
 328 allocation and data structures to cryptography and inter-process
 329 communication. The goal is to provide an OS-independent interface and
 330 more 'secure' or convenient implementations of commonly used primitives.
 331 The API is spread over more than a dozen headers; developers should study
 332 those closely to avoid duplicating existing functions.
 333 @xref{libgnunetutil}.
 334 @item @file{hello/} --- libgnunethello
 335 HELLO messages are used to
 336 describe under which addresses a peer can be reached (for example,
 337 protocol, IP, port). This library manages parsing and generating of HELLO
 338 messages.
 339 @item @file{block/} --- libgnunetblock
 340 The DHT and other components of GNUnet
 341 store information in units called 'blocks'. Each block has a type and the
 342 type defines a particular format and how that binary format is to be
 343 linked to a hash code (the key for the DHT and for databases). The block
 344 library is a wrapper around block plugins which provide the necessary
 345 functions for each block type.
 346 @item @file{statistics/} --- statistics service
 347 The statistics service enables associating
 348 values (of type @code{uint64_t}) with a component name and a string. The main
 349 uses are debugging (counting events), performance tracking and user
 350 entertainment (what did my peer do today?).
 351 @item @file{arm/} --- Automatic Restart Manager (ARM)
 352 The automatic-restart-manager (ARM) service
 353 is the GNUnet master service. Its role is to start gnunet-services, to
 354 restart them when they crash and finally to shut down the system when
 355 requested.
 356 @item @file{peerinfo/} --- peerinfo service
 357 The peerinfo service keeps track of which peers are known
 358 to the local peer and also tracks the validated addresses of each of
 359 those peers (in the form of a HELLO message). The peer is not
 360 necessarily connected to all peers known to the peerinfo service.
 361 Peerinfo provides persistent storage for peer identities --- peers are
 362 not forgotten just because of a system restart.
 363 @item @file{datacache/} --- libgnunetdatacache
 364 The datacache library provides (temporary) block storage for the DHT.
 365 Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
 366 All data stored in the cache is lost when the peer is stopped or
 367 restarted (datacache uses temporary tables).
 368 @item @file{datastore/} --- datastore service
 369 The datastore service stores file-sharing blocks in
 370 databases for extended periods of time. In contrast to the datacache, data
 371 is not lost when peers restart. However, quota restrictions may still
 372 cause old, expired or low-priority data to be eventually discarded.
 373 Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
 374 @item @file{template/} --- service template
 375 Template for writing a new service. Does nothing.
 376 @item @file{ats/} --- Automatic Transport Selection
 377 The automatic transport selection (ATS) service
 378 is responsible for deciding which address (i.e.
 379 which transport plugin) should be used for communication with other peers,
 380 and at what bandwidth.
 381 @item @file{nat/} --- libgnunetnat
 382 Library that provides basic functions for NAT traversal.
 383 The library supports NAT traversal with
 384 manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
 385 traversal. The library also includes an API for testing if the current
 386 configuration works and the @code{gnunet-nat-server} which provides an
 387 external service to test the local configuration.
 388 @item @file{fragmentation/} --- libgnunetfragmentation
 389 Some transports (UDP and WLAN, mostly) have restrictions on the maximum
 390 transfer unit (MTU) for packets. The fragmentation library can be used to
 391 break larger packets into chunks of at most 1k and transmit the resulting
 392 fragments reliably (with acknowledgment, retransmission, timeouts,
 393 etc.).
 394 @item @file{transport/} --- transport service
 395 The transport service is responsible for managing the
 396 basic P2P communication. It uses plugins to support P2P communication
 397 over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
 398 validates peer addresses, enforces bandwidth restrictions, limits the
 399 total number of connections and enforces connectivity restrictions (i.e.
 400 friends-only).
 401 @item @file{peerinfo-tool/} --- gnunet-peerinfo
 402 This directory contains the gnunet-peerinfo binary which can be used to
 403 inspect the peers and HELLOs known to the peerinfo service.
 404 @item @file{core/}
 405 The core service is responsible for establishing encrypted, authenticated
 406 connections with other peers, encrypting and decrypting messages and
 407 forwarding messages to higher-level services that are interested in them.
 408 @item @file{testing/} --- libgnunettesting
 409 The testing library allows starting (and stopping) peers
 410 for writing testcases.
 411 It also supports automatic generation of configurations for peers
 412 ensuring that the ports and paths are disjoint. libgnunettesting is also
 413 the foundation for the testbed service.
 414 @item @file{testbed/} --- testbed service
 415 The testbed service is used for creating small or large scale deployments
 416 of GNUnet peers for evaluation of protocols.
 417 It facilitates peer deployments on multiple
 418 hosts (for example, in a cluster) and establishing various network
 419 topologies (both underlay and overlay).
 420 @item @file{nse/} --- Network Size Estimation
 421 The network size estimation (NSE) service
 422 implements a protocol for (securely) estimating the current size of the
 423 P2P network.
 424 @item @file{dht/} --- distributed hash table
 425 The distributed hash table (DHT) service provides a
 426 distributed implementation of a hash table to store blocks under hash
 427 keys in the P2P network.
 428 @item @file{hostlist/} --- hostlist service
 429 The hostlist service allows learning about
 430 other peers in the network by downloading HELLO messages from an HTTP
 431 server, can be configured to run such an HTTP server and also implements
 432 a P2P protocol to advertise and automatically learn about other peers
 433 that offer a public hostlist server.
 434 @item @file{topology/} --- topology service
 435 The topology service is responsible for
 436 maintaining the mesh topology. It tries to maintain connections to friends
 437 (depending on the configuration) and also tries to ensure that the peer
 438 has a decent number of active connections at all times. If necessary, new
 439 connections are added. All peers should run the topology service,
 440 otherwise they may end up not being connected to any other peer (unless
 441 some other service ensures that core establishes the required
 442 connections). The topology service also tells the transport service which
 443 connections are permitted (for friend-to-friend networking).
 444 @item @file{fs/} --- file-sharing
 445 The file-sharing (FS) service implements GNUnet's
 446 file-sharing application. Both anonymous file-sharing (using GAP) and
 447 non-anonymous file-sharing (using the DHT) are supported.
 448 @item @file{cadet/} --- cadet service
 449 The CADET service provides a general-purpose routing abstraction to create
 450 end-to-end encrypted tunnels in mesh networks. We wrote a paper
 451 documenting key aspects of the design.
 452 @item @file{tun/} --- libgnunettun
 453 Library for building IPv4, IPv6 packets and creating
 454 checksums for UDP, TCP and ICMP packets. The header
 455 defines C structs for common Internet packet formats and in particular
 456 structs for interacting with TUN (virtual network) interfaces.
 457 @item @file{mysql/} --- libgnunetmysql
 458 Library for creating and executing prepared MySQL
 459 statements and to manage the connection to the MySQL database.
 460 Essentially a lightweight wrapper for the interaction between GNUnet
 461 components and libmysqlclient.
 462 @item @file{dns/}
 463 Service that allows intercepting and modifying DNS requests of
 464 the local machine. Currently used for IPv4-IPv6 protocol translation
 465 (DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
 466 service can also be configured to offer an exit service for DNS traffic.
 467 @item @file{vpn/} --- VPN service
 468 The virtual public network (VPN) service provides a virtual
 469 tunnel interface (VTUN) for IP routing over GNUnet.
 470 Needs some other peers to run an "exit" service to work.
 471 Can be activated using the "gnunet-vpn" tool or integrated with DNS using
 472 the "pt" daemon.
 473 @item @file{exit/}
 474 Daemon to allow traffic from the VPN to exit this
 475 peer to the Internet or to specific IP-based services of the local peer.
 476 Currently, an exit service can only be restricted to IPv4 or IPv6, not to
 477 specific ports and/or IP address ranges. If this is not acceptable,
 478 additional firewall rules must be added manually. The exit daemon currently only
 479 works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
 480 system via a DNS service.
 481 @item @file{pt/}
 482 Protocol translation daemon. This daemon enables 4-to-6,
 483 6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
 484 essentially uses "DNS" to intercept DNS replies and then maps results to
 485 those offered by the VPN, which then sends them using CADET to some daemon
 486 offering an appropriate exit service.
 487 @item @file{identity/}
 488 Management of egos (alter egos) of a user; identities are
 489 essentially named ECC private keys and used for zones in the GNU name
 490 system and for namespaces in file-sharing, but might find other uses later.
 491 @item @file{revocation/}
 492 Key revocation service; it can be used to revoke the
 493 private key of an identity if it has been compromised.
 494 @item @file{namecache/}
 495 Cache for resolution results for the GNU name system;
 496 data is encrypted and can be shared among users,
 497 loss of the data should ideally only result in a
 498 performance degradation (persistence not required).
 499 @item @file{namestore/}
 500 Database for the GNU name system with per-user private information,
 501 persistence required.
 502 @item @file{gns/}
 503 GNU name system, a GNU approach to DNS and PKI.
 504 @item @file{dv/}
 505 A plugin for distance-vector (DV)-based routing.
 506 DV consists of a service and a transport plugin to provide peers
 507 with the illusion of a direct P2P connection for connections
 508 that use multiple (typically up to 3) hops in the actual underlay network.
 509 @item @file{regex/}
 510 Service for the (distributed) evaluation of regular expressions.
 511 @item @file{scalarproduct/}
 512 The scalar product service offers an API to perform a secure multiparty
 513 computation which calculates a scalar product between two peers
 514 without exposing the private input vectors of the peers to each other.
 515 @item @file{consensus/}
 516 The consensus service will allow a set of peers to agree
 517 on a set of values via a distributed set union computation.
 518 @item @file{rest/}
 519 The REST API allows access to GNUnet services using RESTful interaction.
 520 The services provide plugins that can be exposed by the REST server.
 521 @c FIXME: Where did this disappear to?
 522 @c @item @file{experimentation/}
 523 @c The experimentation daemon coordinates distributed
 524 @c experimentation to evaluate transport and ATS properties.
 525 @end table
 526
 527 @c ***********************************************************************
 528 @node System Architecture
 529 @section System Architecture
 530
 531 @c FIXME: For those irritated by the textflow, we are missing images here,
 532 @c in the short term we should add them back, in the long term this should
 533 @c work without images or have images with alt-text.
 534
 535 GNUnet developers like LEGOs. The blocks are indestructible, can be
 536 stacked together to construct complex buildings and it is generally easy
 537 to swap one block for a different one that has the same shape. GNUnet's
 538 architecture is based on LEGOs:
 539
 540 @image{images/service_lego_block,5in,,picture of a LEGO block stack - 3 APIs upon IPC/network protocol provided by a service}
 541
 542 This chapter documents the GNUnet LEGO system, also known as GNUnet's
 543 system architecture.
 544
 545 The most common GNUnet component is a service. Services offer an API (or
 546 several, depending on what you count as "an API") which is implemented as
 547 a library. The library communicates with the main process of the service
 548 using a service-specific network protocol. The main process of the service
 549 typically doesn't fully provide everything that is needed --- it has holes
 550 to be filled by APIs to other services.
 551
 552 User interfaces and daemons are a special kind of component in GNUnet.
 553 Like services, they have holes to be filled by APIs of other services.
 554 Unlike services, daemons do not implement their own network protocol and
 555 they have no API:
 556
 557 @image{images/daemon_lego_block,5in,,A daemon in GNUnet is a component that does not offer an API for others to build upon}
 558
 559 The GNUnet system provides a range of services, daemons and user
 560 interfaces, which are then combined into a layered GNUnet instance (also
 561 known as a peer).
 562
 563 @image{images/service_stack,5in,,A GNUnet peer consists of many layers of services}
 564
 565 Note that while it is generally possible to swap one service for another
 566 compatible service, there is often only one implementation. However,
 567 during development we often have a "new" version of a service in parallel
 568 with an "old" version. While the "new" version is not working, developers
 569 working on other parts of the service can continue their development by
 570 simply using the "old" service. Alternative design ideas can also be
 571 easily investigated by swapping out individual components. This is
 572 typically achieved by simply changing the name of the "BINARY" in the
 573 respective configuration section.
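
As a sketch of such a swap, assuming the usual INI-style GNUnet
configuration syntax: the name of the replacement binary below is
hypothetical and only illustrates the idea; it does not refer to an
existing program.

@example
[transport]
# The stock setting would name the regular service binary,
# for example gnunet-service-transport.
# Hypothetical alternative implementation under evaluation:
BINARY = gnunet-service-transport-experimental
@end example

Reverting to the "old" service then only requires restoring the
original value of @code{BINARY} and restarting the peer.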
 574
 575 Key properties of GNUnet services are that they must be separate
 576 processes and that they must protect themselves by applying tight error
 577 checking against the network protocol they implement (thereby achieving a
 578 certain degree of robustness).
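
To illustrate what "tight error checking" can mean in practice, the
following self-contained sketch validates an incoming message before
any handler looks at its payload. It is not taken from the GNUnet
sources: the header layout (a 16-bit size followed by a 16-bit type,
both in network byte order) and all @code{demo_} names are assumptions
made for this example.

@verbatim
#include <arpa/inet.h>  /* ntohs */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte header: total message size and message type,
   both stored in network byte order. */
struct demo_header
{
  uint16_t size;
  uint16_t type;
};

enum demo_result
{
  DEMO_OK = 0,
  DEMO_TRUNCATED, /* fewer bytes received than a header needs */
  DEMO_BAD_SIZE,  /* claimed size is impossible for this message */
  DEMO_BAD_TYPE   /* message type is not the one we expected */
};

/* Check the 'len' bytes at 'buf' before handing them to a handler
   that expects 'expected_type' and at least 'min_size' bytes. */
enum demo_result
demo_check_message (const void *buf, size_t len,
                    uint16_t expected_type, size_t min_size)
{
  struct demo_header hdr;
  size_t claimed;

  if (len < sizeof (hdr))
    return DEMO_TRUNCATED;
  memcpy (&hdr, buf, sizeof (hdr)); /* copy: buf may be unaligned */
  claimed = ntohs (hdr.size);
  if ((claimed < sizeof (hdr)) || (claimed < min_size) || (claimed > len))
    return DEMO_BAD_SIZE;
  if (ntohs (hdr.type) != expected_type)
    return DEMO_BAD_TYPE;
  return DEMO_OK;
}
@end verbatim

A receiver would call @code{demo_check_message} on every incoming
buffer and drop the message (or the connection) on any result other
than @code{DEMO_OK}, so that malformed input never reaches the code
that interprets the payload.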
 579
 580 On the other hand, the APIs are implemented to tolerate failures of the
 581 service, isolating their host process from errors by the service. If the
 582 service process crashes, other services and daemons around it should not
 583 also fail, but instead wait for the service process to be restarted by
 584 ARM.
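
One common way for a client library to wait for a restarted service
without flooding it with connection attempts is exponential backoff.
The sketch below shows only the delay computation. It illustrates the
general technique and is not GNUnet's actual implementation; the
function name and the millisecond unit are assumptions made for this
example.

@verbatim
#include <stdint.h>

/* Return the delay to wait before the next reconnect attempt.
   'current_ms' is the previous delay (0 before the first retry);
   'max_ms' is the upper bound for the delay. */
uint64_t
demo_next_backoff_ms (uint64_t current_ms, uint64_t max_ms)
{
  uint64_t next;

  if (0 == current_ms)
    next = 1;               /* first retry: start small */
  else if (current_ms > max_ms / 2)
    next = max_ms;          /* doubling would exceed the cap */
  else
    next = current_ms * 2;  /* safe: current_ms <= max_ms / 2 */
  return (next > max_ms) ? max_ms : next;
}
@end verbatim

A caller would reset the delay to 0 after a successful reconnect, so
that a later crash of the service again starts with short waits.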
 585
 586
 587 @c ***********************************************************************
 588 @node Subsystem stability
 589 @section Subsystem stability
 590
 591 This section documents the current stability of the various GNUnet
 592 subsystems. Stability here describes the expected degree of compatibility
 593 with future versions of GNUnet. For each subsystem we distinguish between
 594 compatibility on the P2P network level (communication protocol between
 595 peers), the IPC level (communication between the service and the service
 596 library) and the API level (stability of the API). P2P compatibility is
 597 relevant in terms of which applications are likely going to be able to
 598 communicate with future versions of the network. IPC communication is
 599 relevant for the implementation of language bindings that re-implement the
 600 IPC messages. Finally, API compatibility is relevant to developers that
 601 hope to be able to avoid changes to applications built on top of the APIs
 602 of the framework.
 603
 604 The following table summarizes our current view of the stability of the
 605 respective protocols or APIs:
 606
 607 @multitable @columnfractions .20 .20 .20 .20
 608 @headitem Subsystem @tab P2P @tab IPC @tab C API
 609 @item util @tab n/a @tab n/a @tab stable
 610 @item arm @tab n/a @tab stable @tab stable
 611 @item ats @tab n/a @tab unstable @tab testing
 612 @item block @tab n/a @tab n/a @tab stable
 613 @item cadet @tab testing @tab testing @tab testing
 614 @item consensus @tab experimental @tab experimental @tab experimental
 615 @item core @tab stable @tab stable @tab stable
 616 @item datacache @tab n/a @tab n/a @tab stable
 617 @item datastore @tab n/a @tab stable @tab stable
 618 @item dht @tab stable @tab stable @tab stable
 619 @item dns @tab stable @tab stable @tab stable
 620 @item dv @tab testing @tab testing @tab n/a
 621 @item exit @tab testing @tab n/a @tab n/a
 622 @item fragmentation @tab stable @tab n/a @tab stable
 623 @item fs @tab stable @tab stable @tab stable
 624 @item gns @tab stable @tab stable @tab stable
 625 @item hello @tab n/a @tab n/a @tab testing
 626 @item hostlist @tab stable @tab stable @tab n/a
 627 @item identity @tab stable @tab stable @tab n/a
 628 @item multicast @tab experimental @tab experimental @tab experimental
 629 @item mysql @tab stable @tab n/a @tab stable
 630 @item namestore @tab n/a @tab stable @tab stable
 631 @item nat @tab n/a @tab n/a @tab stable
 632 @item nse @tab stable @tab stable @tab stable
 633 @item peerinfo @tab n/a @tab stable @tab stable
 634 @item psyc @tab experimental @tab experimental @tab experimental
 635 @item pt @tab n/a @tab n/a @tab n/a
 636 @item regex @tab stable @tab stable @tab stable
 637 @item revocation @tab stable @tab stable @tab stable
 638 @item social @tab experimental @tab experimental @tab experimental
 639 @item statistics @tab n/a @tab stable @tab stable
 640 @item testbed @tab n/a @tab testing @tab testing
 641 @item testing @tab n/a @tab n/a @tab testing
 642 @item topology @tab n/a @tab n/a @tab n/a
 643 @item transport @tab stable @tab stable @tab stable
 644 @item tun @tab n/a @tab n/a @tab stable
 645 @item vpn @tab testing @tab n/a @tab n/a
 646 @end multitable
 647
 648 Here is a rough explanation of the values:
 649
 650 @table @samp
 651 @item stable
 652 No incompatible changes are planned at this time; for IPC/APIs, if
 653 there are incompatible changes, they will be minor and might only require
 654 minimal changes to existing code; for P2P, changes will be avoided if at
 655 all possible for the 0.10.x-series
 656
 657 @item testing
 658 No incompatible changes are
 659 planned at this time, but the code is still known to be in flux; so while
 660 we have no concrete plans, our expectation is that there will still be
 661 minor modifications; for P2P, changes will likely be extensions that
 662 should not break existing code
 663
 664 @item unstable
 665 Changes are planned and will happen; however, they
 666 will not be totally radical and the result should still resemble what is
 667 there now; nevertheless, anticipated changes will break protocol/API
 668 compatibility
 669
 670 @item experimental
 671 Changes are planned and the result may look nothing like
 672 what the API/protocol looks like today
 673
 674 @item unknown
Someone should think about where this subsystem is headed
 676
 677 @item n/a
 678 This subsystem does not have an API/IPC-protocol/P2P-protocol
 679 @end table
 680
 681 @c ***********************************************************************
 682 @node Naming conventions and coding style guide
 683 @section Naming conventions and coding style guide
 684
 685 Here you can find some rules to help you write code for GNUnet.
 686
 687 @c ***********************************************************************
 688 @menu
 689 * Naming conventions::
 690 * Coding style::
 691 @end menu
 692
 693 @node Naming conventions
 694 @subsection Naming conventions
 695
 696
 697 @c ***********************************************************************
 698 @menu
 699 * include files::
 700 * binaries::
 701 * logging::
 702 * configuration::
 703 * exported symbols::
 704 * private (library-internal) symbols (including structs and macros)::
 705 * testcases::
 706 * performance tests::
 707 * src/ directories::
 708 @end menu
 709
 710 @node include files
 711 @subsubsection include files
 712
 713 @itemize @bullet
 714 @item _lib: library without need for a process
 715 @item _service: library that needs a service process
 716 @item _plugin: plugin definition
 717 @item _protocol: structs used in network protocol
 718 @item exceptions:
 719 @itemize @bullet
 720 @item gnunet_config.h --- generated
 721 @item platform.h --- first included
 722 @item gnunet_common.h --- fundamental routines
 723 @item gnunet_directories.h --- generated
 724 @item gettext.h --- external library
 725 @end itemize
 726 @end itemize
 727
 728 @c ***********************************************************************
 729 @node binaries
 730 @subsubsection binaries
 731
 732 @itemize @bullet
 733 @item gnunet-service-xxx: service process (has listen socket)
 734 @item gnunet-daemon-xxx: daemon process (no listen socket)
 735 @item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
 736 @item gnunet-yyy: command-line tool for end-users
 737 @item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
 738 @item libgnunetxxx.so: library for API xxx
 739 @end itemize
 740
 741 @c ***********************************************************************
 742 @node logging
 743 @subsubsection logging
 744
 745 @itemize @bullet
@item services and daemons use their directory name in
@code{GNUNET_log_setup} (e.g. 'core') and log using
plain @code{GNUNET_log}.
@item command-line tools use their full name in
@code{GNUNET_log_setup} (e.g. 'gnunet-publish') and log using
plain @code{GNUNET_log}.
@item service access libraries log using
@code{GNUNET_log_from} and use @code{DIRNAME-api} for the
component (e.g. 'core-api')
@item pure libraries (without associated service) use
@code{GNUNET_log_from} with the component set to their
library name (without 'lib' or '@file{.so}'),
which should also be their directory name (e.g. '@file{nat}')
@item plugins should use @code{GNUNET_log_from}
with the directory name and the plugin name combined to produce
the component name (e.g. 'transport-tcp').
 762 @item logging should be unified per-file by defining a
 763 @code{LOG} macro with the appropriate arguments,
 764 along these lines:
 765
 766 @example
#define LOG(kind,...) \
  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
 769 @end example
 770
 771 @end itemize
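
The per-file @code{LOG} macro can be tried out without GNUnet installed.
The following is a minimal sketch under stated assumptions, not GNUnet
code: @code{log_from_stub} is a hypothetical stand-in for
@code{GNUNET_log_from} that merely formats the message into a buffer, and
"example-api" is the placeholder component name. Note the trailing
backslash, which C requires when a macro definition continues on a second
line.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for GNUNET_log_from(); it only formats the
   message into a buffer so that the result can be inspected. */
static char last_message[256];

static void
log_from_stub (int kind,
               const char *component,
               const char *format,
               ...)
{
  va_list ap;
  int off;

  off = snprintf (last_message,
                  sizeof (last_message),
                  "[%s:%d] ",
                  component,
                  kind);
  if ( (0 > off) ||
       (sizeof (last_message) <= (size_t) off) )
    return;
  va_start (ap, format);
  vsnprintf (last_message + off,
             sizeof (last_message) - (size_t) off,
             format,
             ap);
  va_end (ap);
}

/* One definition per source file; the component name is fixed here. */
#define LOG(kind, ...) \
  log_from_stub (kind, "example-api", __VA_ARGS__)
```

With this in place, every call in the file reads @code{LOG (kind, "format", ...)}
and the component name is maintained in exactly one spot.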
 772
 773 @c ***********************************************************************
 774 @node configuration
 775 @subsubsection configuration
 776
 777 @itemize @bullet
@item paths (that are substituted in all filenames) are in @code{[PATHS]}
(keep these to a minimum)
 780 @item all options for a particular module (@file{src/MODULE})
 781 are under @code{[MODULE]}
 782 @item options for a plugin of a module
 783 are under @code{[MODULE-PLUGINNAME]}
 784 @end itemize
 785
 786 @c ***********************************************************************
 787 @node exported symbols
 788 @subsubsection exported symbols
 789
 790 @itemize @bullet
 791 @item must start with @code{GNUNET_modulename_} and be defined in
 792 @file{modulename.c}
 793 @item exceptions: those defined in @file{gnunet_common.h}
 794 @end itemize
 795
 796 @c ***********************************************************************
 797 @node private (library-internal) symbols (including structs and macros)
 798 @subsubsection private (library-internal) symbols (including structs and macros)
 799
 800 @itemize @bullet
 801 @item must NOT start with any prefix
@item must not be exported in a way that linkers could use them or other
libraries might see them via headers; they must be either
declared/defined in C source files or in headers that are in the
respective directory under @file{src/modulename/} and NEVER be declared
in @file{src/include/}.
 807 @end itemize
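
As a rough illustration of the two naming rules for exported and private
symbols, consider a hypothetical module called @file{example}. Neither
the module nor @code{GNUNET_example_record_hit} exists in GNUnet; both
are made up for this sketch, as are @code{struct Counter} and
@code{bump}.

```c
/* Sketch of src/example/example.c for a hypothetical module "example". */

/* Private: no prefix, and never declared under src/include/. */
struct Counter
{
  unsigned int hits;
};

/* Private state; 'static' keeps it invisible to the linker. */
static struct Counter counter;

/* Private helper: no prefix and 'static'. */
static unsigned int
bump (struct Counter *c)
{
  c->hits++;
  return c->hits;
}

/* Exported: carries the GNUNET_example_ prefix.  Its prototype would
   live in a public header under src/include/ (assumed here). */
unsigned int
GNUNET_example_record_hit (void)
{
  return bump (&counter);
}
```

Only the prefixed function is visible to other modules; everything else
stays internal to the source file.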
 808
 809 @node testcases
 810 @subsubsection testcases
 811
 812 @itemize @bullet
 813 @item must be called @file{test_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one test
 815 @end itemize
 816
 817 @c ***********************************************************************
 818 @node performance tests
 819 @subsubsection performance tests
 820
 821 @itemize @bullet
 822 @item must be called @file{perf_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one performance
test
 825 @item Must only be run if @code{HAVE_BENCHMARKS} is satisfied
 826 @end itemize
 827
 828 @c ***********************************************************************
 829 @node src/ directories
 830 @subsubsection src/ directories
 831
 832 @itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
 842 @end itemize
 843
 844 @cindex Coding style
 845 @node Coding style
 846 @subsection Coding style
 847
 848 @c XXX: Adjust examples to GNU Standards!
 849 @itemize @bullet
 850 @item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
 851 @item Indentation is done with spaces, two per level, no tabs;
 852 @item C99 struct initialization is fine;
 853 @item declare only one variable per line, for example:
 854
 855 @noindent
 856 instead of
 857
 858 @example
 859 int i,j;
 860 @end example
 861
 862 @noindent
 863 write:
 864
 865 @example
 866 int i;
 867 int j;
 868 @end example
 869
 870 @c TODO: include actual example from a file in source
 871
 872 @noindent
 873 This helps keep diffs small and forces developers to think precisely about
 874 the type of every variable.
 875 Note that @code{char *} is different from @code{const char*} and
 876 @code{int} is different from @code{unsigned int} or @code{uint32_t}.
 877 Each variable type should be chosen with care.
 878
 879 @item While @code{goto} should generally be avoided, having a
 880 @code{goto} to the end of a function to a block of clean up
 881 statements (free, close, etc.) can be acceptable.
 882
 883 @item Conditions should be written with constants on the left (to avoid
 884 accidental assignment) and with the @code{true} target being either the
 885 @code{error} case or the significantly simpler continuation. For example:
 886
 887 @example
if (0 != stat ("filename",
               &sbuf))
@{
  error ();
 892 @}
 893 else
 894 @{
 895   /* handle normal case here */
 896 @}
 897 @end example
 898
 899 @noindent
 900 instead of
 901
 902 @example
if (stat ("filename", &sbuf) == 0) @{
 904   /* handle normal case here */
 905  @} else @{
 906   error();
 907  @}
 908 @end example
 909
 910 @noindent
 911 If possible, the error clause should be terminated with a @code{return} (or
 912 @code{goto} to some cleanup routine) and in this case, the @code{else} clause
 913 should be omitted:
 914
 915 @example
 916 if (0 != stat ("filename",
 917                &sbuf))
 918 @{
  error ();
 920   return;
 921 @}
 922 /* handle normal case here */
 923 @end example
 924
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
@code{NULL}, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
 929
 930 @item Combined assignments and tests are allowed if they do not hinder
 931 code clarity. For example, one can write:
 932
 933 @example
if (NULL == (value = lookup_function ()))
@{
  error ();
 937   return;
 938 @}
 939 @end example
 940
 941 @item Use @code{break} and @code{continue} wherever possible to avoid
 942 deep(er) nesting. Thus, we would write:
 943
 944 @example
 945 next = head;
 946 while (NULL != (pos = next))
 947 @{
 948   next = pos->next;
 949   if (! should_free (pos))
 950     continue;
 951   GNUNET_CONTAINER_DLL_remove (head,
 952                                tail,
 953                                pos);
 954   GNUNET_free (pos);
 955 @}
 956 @end example
 957
 958 instead of
 959
 960 @example
 961 next = head; while (NULL != (pos = next)) @{
 962   next = pos->next;
 963   if (should_free (pos)) @{
 964     /* unnecessary nesting! */
 965     GNUNET_CONTAINER_DLL_remove (head, tail, pos);
 966     GNUNET_free (pos);
 967    @}
 968   @}
 969 @end example
 970
 971 @item We primarily use @code{for} and @code{while} loops.
 972 A @code{while} loop is used if the method for advancing in the loop is
 973 not a straightforward increment operation. In particular, we use:
 974
 975 @example
 976 next = head;
 977 while (NULL != (pos = next))
 978 @{
 979   next = pos->next;
 980   if (! should_free (pos))
 981     continue;
 982   GNUNET_CONTAINER_DLL_remove (head,
 983                                tail,
 984                                pos);
 985   GNUNET_free (pos);
 986 @}
 987 @end example
 988
to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
However, for loops that do follow the simple @code{for} paradigm we do
use @code{for}, even if it involves linked lists:
 994
 995 @example
 996 /* simple iteration over a linked list */
 997 for (pos = head;
 998      NULL != pos;
 999      pos = pos->next)
1000 @{
  use (pos);
1002 @}
1003 @end example
1004
1005
1006 @item The first argument to all higher-order functions in GNUnet must be
1007 declared to be of type @code{void *} and is reserved for a closure. We do
1008 not use inner functions, as trampolines would conflict with setups that
1009 use non-executable stacks.
The first statement in a higher-order function should assign the
@code{cls} argument to the precise expected type; it is normally written
as the initializer of a variable declaration. For example:
1013
1014 @example
1015 int
1016 callback (void *cls,
1017           char *args)
1018 @{
1019   struct Foo *foo = cls;
1020   int other_variables;
1021
  /* rest of function */
1023 @}
1024 @end example
1025
@item As shown in the example above, there should be a line break after
the return type of a function. Each parameter should
be on a new line.
1029
1030 @item It is good practice to write complex @code{if} expressions instead
1031 of using deeply nested @code{if} statements. However, except for addition
1032 and multiplication, all operators should use parens. This is fine:
1033
1034 @example
1035 if ( (1 == foo) ||
1036      ( (0 == bar) &&
1037        (x != y) ) )
1038   return x;
1039 @end example
1040
1041
1042 However, this is not:
1043
1044 @example
1045 if (1 == foo)
1046   return x;
1047 if (0 == bar && x != y)
1048   return x;
1049 @end example
1050
1051 @noindent
1052 Note that splitting the @code{if} statement above is debatable as the
1053 @code{return x} is a very trivial statement. However, once the logic after
1054 the branch becomes more complicated (and is still identical), the "or"
1055 formulation should be used for sure.
1056
1057 @item There should be two empty lines between the end of the function and
1058 the comments describing the following function. There should be a single
1059 empty line after the initial variable declarations of a function. If a
1060 function has no local variables, there should be no initial empty line. If
1061 a long function consists of several complex steps, those steps might be
1062 separated by an empty line (possibly followed by a comment describing the
1063 following step). The code should not contain empty lines in arbitrary
1064 places; if in doubt, it is likely better to NOT have an empty line (this
1065 way, more code will fit on the screen).
1066 @end itemize
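
The @code{goto}-based cleanup mentioned in the list above has no example
of its own, so here is a minimal sketch. It uses only standard C;
@code{copy_first_line} and its parameters are hypothetical names chosen
for illustration and are not part of GNUnet. The sketch also follows the
other rules from this section: constants on the left, one variable per
line, combined assignment and test, and error cases in the 'true' branch.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/**
 * Copy the first line of a file (without the newline) into a
 * caller-provided buffer.  Returns 0 on success, -1 on error.
 */
int
copy_first_line (const char *filename,
                 char *dst,
                 size_t dst_size)
{
  FILE *f = NULL;
  char *line = NULL;
  int ret = -1;

  if ( (0 == dst_size) ||
       (INT_MAX < dst_size) )
    return -1;
  if (NULL == (f = fopen (filename, "r")))
    goto cleanup;
  if (NULL == (line = malloc (dst_size)))
    goto cleanup;
  if (NULL == fgets (line, (int) dst_size, f))
    goto cleanup;
  line[strcspn (line, "\n")] = '\0';
  memcpy (dst, line, strlen (line) + 1);
  ret = 0;
cleanup:
  /* Single exit point: free (NULL) is a no-op, fclose (NULL) is not. */
  free (line);
  if (NULL != f)
    fclose (f);
  return ret;
}
```

Every error path jumps to the same label, so each resource is released
in exactly one place and the success path stays free of nested
@code{if} blocks.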
1067
1068 @c ***********************************************************************
1069 @node Build-system
1070 @section Build-system
1071
If you have code that is likely not to compile, or build rules that you
do not want to trigger for most developers, wrap them in
@code{if HAVE_EXPERIMENTAL} in your @file{Makefile.am}.
Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
code.
1077
1078 If you want to compile all testcases but NOT run them, run configure with
1079 the @code{--enable-test-suppression} option.
1080
1081 If you want to run all testcases, including those that take a while, run
1082 configure with the @code{--enable-expensive-testcases} option.
1083
1084 If you want to compile and run benchmarks, run configure with the
1085 @code{--enable-benchmarks} option.
1086
1087 If you want to obtain code coverage results, run configure with the
1088 @code{--enable-coverage} option and run the @file{coverage.sh} script in
1089 the @file{contrib/} directory.
1090
1091 @cindex gnunet-ext
1092 @node Developing extensions for GNUnet using the gnunet-ext template
1093 @section Developing extensions for GNUnet using the gnunet-ext template
1094
For developers who want to write extensions for GNUnet we provide the
gnunet-ext template as an easy-to-use skeleton.
1097
1098 gnunet-ext contains the build environment and template files for the
1099 development of GNUnet services, command line tools, APIs and tests.
1100
1101 First of all you have to obtain gnunet-ext from git:
1102
1103 @example
1104 git clone https://git.gnunet.org/gnunet-ext.git
1105 @end example
1106
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
1111
1112 @example
1113 ./bootstrap
1114 ./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
1115 @end example
1116
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
1121
1122 @example
1123 export LD_LIBRARY_PATH=/path/to/gnunet/lib
1124 @end example
1125
1126 @cindex writing testcases
1127 @node Writing testcases
1128 @section Writing testcases
1129
1130 Ideally, any non-trivial GNUnet code should be covered by automated
1131 testcases. Testcases should reside in the same place as the code that is
1132 being tested. The name of source files implementing tests should begin
1133 with @code{test_} followed by the name of the file that contains
1134 the code that is being tested.
1135
1136 Testcases in GNUnet should be integrated with the autotools build system.
1137 This way, developers and anyone building binary packages will be able to
1138 run all testcases simply by running @code{make check}. The final
1139 testcases shipped with the distribution should output at most some brief
1140 progress information and not display debug messages by default. The
1141 success or failure of a testcase must be indicated by returning zero
1142 (success) or non-zero (failure) from the main method of the testcase.
1143 The integration with the autotools is relatively straightforward and only
1144 requires modifications to the @file{Makefile.am} in the directory
1145 containing the testcase. For a testcase testing the code in @file{foo.c}
1146 the @file{Makefile.am} would contain the following lines:
1147
1148 @example
1149 check_PROGRAMS = test_foo
1150 TESTS = $(check_PROGRAMS)
1151 test_foo_SOURCES = test_foo.c
1152 test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
1153 @end example
1154
1155 Naturally, other libraries used by the testcase may be specified in the
1156 @code{LDADD} directive as necessary.
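
For completeness, the source of such a testcase might be structured as
follows. This is only a sketch: @code{foo_add} is a hypothetical stand-in
for whatever @file{foo.c} actually provides, and @code{run_checks} is a
made-up helper name. The @code{main} function of @file{test_foo.c} would
simply return the result of @code{run_checks ()}, so that
@code{make check} sees zero on success and non-zero on failure.

```c
#include <stdio.h>

/* Stand-in for the function under test; in a real tree this would be
   provided by foo.c and declared in its header. */
static int
foo_add (int a,
         int b)
{
  return a + b;
}

/**
 * Run all checks.  Returns 0 if every check passed and 1 otherwise,
 * which is the value main() should hand back to 'make check'.
 */
int
run_checks (void)
{
  if (5 != foo_add (2, 3))
  {
    fprintf (stderr,
             "foo_add (2, 3) returned an unexpected result\n");
    return 1;
  }
  if (0 != foo_add (-1, 1))
  {
    fprintf (stderr,
             "foo_add (-1, 1) returned an unexpected result\n");
    return 1;
  }
  return 0;
}
```

The sketch prints nothing on success and a single line on failure, in
keeping with the rule that shipped testcases should stay quiet by
default.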
1157
1158 Often testcases depend on additional input files, such as a configuration
1159 file. These support files have to be listed using the @code{EXTRA_DIST}
1160 directive in order to ensure that they are included in the distribution.
1161
1162 Example:
1163
1164 @example
1165 EXTRA_DIST = test_foo_data.conf
1166 @end example
1167
1168 Executing @code{make check} will run all testcases in the current
1169 directory and all subdirectories. Testcases can be compiled individually
1170 by running @code{make test_foo} and then invoked directly using
1171 @code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
1172 typically necessary to run @code{make install} before running any
1173 testcases. Thus the canonical command @code{make check install} has to be
1174 changed to @code{make install check} for GNUnet.
1175
1176 @c ***********************************************************************
1177 @cindex Building GNUnet
1178 @node Building GNUnet and its dependencies
1179 @section Building GNUnet and its dependencies
1180
1181 In the following section we will outline how to build GNUnet and
1182 some of its dependencies. We will assume a fair amount of knowledge
1183 for building applications under UNIX-like systems. Furthermore we
1184 assume that the build environment is sane and that you are aware of
1185 any implications actions in this process could have.
1186 Instructions here can be seen as notes for developers (an extension to
1187 the 'HACKING' section in README) as well as package maintainers.
1188 @b{Users should rely on the available binary packages.}
1189 We will use Debian as an example Operating System environment. Substitute
1190 accordingly with your own Operating System environment.
1191
1192 For the full list of dependencies, consult the appropriate, up-to-date
1193 section in the @file{README} file.
1194
1195 First, we need to build or install (depending on your OS) the following
1196 packages. If you build them from source, build them in this exact order:
1197
1198 @example
1199 libgpgerror, libgcrypt, libnettle, libunbound, GnuTLS (with libunbound
1200 support)
1201 @end example
1202
After we have built and installed those packages, we continue with
packages closer to GNUnet in this step: libgnurl (our libcurl fork),
GNU libmicrohttpd, and GNU libextractor. Again, if your package manager
provides one of these packages, use the packaged version
unless you have good reasons not to (package version too old, conflicts, etc).
We advise against compiling widely used packages such as GnuTLS
yourself if your OS already provides them, unless you are prepared to
maintain those packages yourself.
1211
1212 In the optimistic case, this command will give you all the dependencies
1213 on Debian, Debian derived systems or any Linux Operating System using
1214 the apt package manager:
1215
1216 @example
1217 sudo apt-get install libgnurl libmicrohttpd libextractor
1218 @end example
1219
From experience we know that at the very least libgnurl is not
available in some environments. You could substitute libgnurl
with libcurl, but we recommend installing libgnurl, as it gives
you a preconfigured libcurl with the small feature set GNUnet requires.
libgnurl has been developed to co-exist with libcurl installations;
installing it will cause no filename or namespace collisions.
1226
1227 @cindex libgnurl
1228 @cindex compiling libgnurl
GNUnet and some of its functions depend on a limited subset of cURL/libcurl.
1230 Rather than trying to enforce a certain configuration on the world, we
1231 opted to maintain a microfork of it that ensures that we can link
1232 against the right set of features.
1233 We called this specialized set of libcurl "libgnurl".
1234 It is fully ABI compatible with libcurl and currently used by
1235 GNUnet and some of its dependencies.
1236
1237 We download libgnurl and its digital signature from the GNU fileserver,
1238 assuming @env{TMPDIR} exists.
1239
1240 @quotation
1241 Note: TMPDIR might be @file{/tmp}, @env{TMPDIR}, @env{TMP} or any other
1242 location. For consistency we assume @env{TMPDIR} points to @file{/tmp}
1243 for the remainder of this section.
1244 @end quotation
1245
1246 @example
cd $TMPDIR
1248 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.65.3.tar.Z
1249 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.65.3.tar.Z.sig
1250 @end example
1251
1252 Next, verify the digital signature of the file:
1253
1254 @example
1255 gpg --verify gnurl-7.65.3.tar.Z.sig
1256 @end example
1257
1258 If gpg fails, you might try with @command{gpg2} on your OS. If the error
1259 states that ``the key can not be found'' or it is unknown, you have to
1260 retrieve the key (A88C8ADD129828D7EAC02E52E22F9BBFEE348588) from a
1261 keyserver first:
1262
1263 @example
1264 gpg --keyserver pgp.mit.edu --recv-keys A88C8ADD129828D7EAC02E52E22F9BBFEE348588
1265 @end example
1266
1267 or
1268
1269 @example
1270 gpg --keyserver hkps://keys.openpgp.org --recv-keys A88C8ADD129828D7EAC02E52E22F9BBFEE348588
1271 @end example
1272
1273 and rerun the verification command.
1274
1275 libgnurl will require the following packages to be present at runtime:
1276 GnuTLS (with DANE support / libunbound), libidn, zlib and at compile time:
1277 libtool, perl, pkg-config, and (for tests) python (2.7, or
1278 any version of python 3).
1279
Once you have verified that all the required packages are present on your
system, we can proceed to compile libgnurl. This assumes you will install
gnurl under the default prefix. To change this, pass @code{--prefix=} to
the @file{configure-gnurl} script (which is a simple wrapper around
@file{configure}).
1284
1285 @example
1286 tar -xvf gnurl-7.65.3.tar.Z
1287 cd gnurl-7.65.3
1288 sh ./configure-gnurl
1289 make
1290 make -C tests test
1291 sudo make install
1292 @end example
1293
1294 After you've compiled and installed libgnurl, we can proceed to building
1295 GNUnet.
1296
1297
1298
1299
First, in addition to the GNUnet sources you might need to download the
latest version of various dependencies, depending on how recent the
software versions in your distribution of GNU/Linux are.
Most distributions do not include sufficiently recent versions of these
dependencies.
Thus, a typical installation on a "modern" GNU/Linux distribution
requires you to install the following dependencies (ideally in this
order):
1308
1309 @itemize @bullet
1310 @item libgpgerror and libgcrypt
1311 @item libnettle and libunbound (possibly from distribution), GnuTLS
1312 @item libgnurl (read the README)
1313 @item GNU libmicrohttpd
1314 @item GNU libextractor
1315 @end itemize
1316
1317 Make sure to first install the various mandatory and optional
1318 dependencies including development headers from your distribution.
1319
Another dependency that you should strongly consider installing is a
database (MySQL, SQLite3 or Postgres).
The following instructions will assume that you installed at least
SQLite3 (commonly distributed as ``sqlite'' or ``sqlite3'').
For most distributions you should be able to find pre-built packages for
the database. Again, make sure to install the client libraries @b{and} the
respective development headers (if they are packaged separately) as well.
1327
1328 @c TODO: Do these platform specific descriptions still exist? If not,
1329 @c we should find a way to sync website parts with this texinfo.
You can find specific, detailed instructions for installing the
dependencies (and possibly the rest of the GNUnet installation) in the
platform-specific descriptions, which can be found in the Index.
Please consult them now.
If your distribution is not listed, please study the build
instructions for Debian stable carefully as you try to install the
dependencies for your own distribution.
Contributing additional instructions for further platforms is always
appreciated.
Please keep in mind that operating system development tends to move at
a rather fast pace. Because of this, some of
the instructions could be outdated by the time you are reading this.
If you find a mistake, please tell us about it (or even better: send
a patch to the documentation to fix it!).
1344
Before proceeding further, please double-check the dependency list.
Note that in addition to satisfying the dependencies, you might have to
make sure that development headers for the various libraries are also
installed.
There may be files for other distributions, or you might be able to find
equivalent packages for your distribution.
1351
1352 While it is possible to build and install GNUnet without having root
1353 access, we will assume that you have full control over your system in
1354 these instructions.
1355 First, you should create a system user @emph{gnunet} and an additional
1356 group @emph{gnunetdns}. On the GNU/Linux distributions Debian and Ubuntu,
1357 type:
1358
1359 @example
1360 sudo adduser --system --home /var/lib/gnunet --group \
1361 --disabled-password gnunet
1362 sudo addgroup --system gnunetdns
1363 @end example
1364
1365 @noindent
1366 On other Unix-like systems, this should have the same effect:
1367
1368 @example
sudo useradd --system --user-group --home-dir /var/lib/gnunet gnunet
sudo groupadd --system gnunetdns
1371 @end example
1372
1373 Now compile and install GNUnet using:
1374
1375 @example
1376 tar xvf gnunet-@value{VERSION}.tar.gz
1377 cd gnunet-@value{VERSION}
1378 ./configure --with-sudo=sudo --with-nssdir=/lib
1379 make
1380 sudo make install
1381 @end example
1382
1383 If you want to be able to enable DEBUG-level log messages, add
1384 @code{--enable-logging=verbose} to the end of the
1385 @command{./configure} command.
1386 @code{DEBUG}-level log messages are in English only and
1387 should only be useful for developers (or for filing
1388 really detailed bug reports).
1389
1390 @noindent
1391 Next, edit the file @file{/etc/gnunet.conf} to contain the following:
1392
1393 @example
1394 [arm]
1395 START_SYSTEM_SERVICES = YES
1396 START_USER_SERVICES = NO
1397 @end example
1398
1399 @noindent
1400 You may need to update your @code{ld.so} cache to include
1401 files installed in @file{/usr/local/lib}:
1402
1403 @example
1404 # ldconfig
1405 @end example
1406
1407 @noindent
1408 Then, switch from user @code{root} to user @code{gnunet} to start
1409 the peer:
1410
1411 @example
1412 # su -s /bin/sh - gnunet
1413 $ gnunet-arm -c /etc/gnunet.conf -s
1414 @end example
1415
1416 You may also want to add the last line in the gnunet user's @file{crontab}
1417 prefixed with @code{@@reboot} so that it is executed whenever the system
1418 is booted:
1419
1420 @example
1421 @@reboot /usr/local/bin/gnunet-arm -c /etc/gnunet.conf -s
1422 @end example
1423
1424 @noindent
1425 This will only start the system-wide GNUnet services.
1426 Type @command{exit} to get back your root shell.
1427 Now, you need to configure the per-user part. For each
1428 user that should get access to GNUnet on the system, run
1429 (replace alice with your username):
1430
1431 @example
1432 sudo adduser alice gnunet
1433 @end example
1434
1435 @noindent
1436 to allow them to access the system-wide GNUnet services. Then, each
1437 user should create a configuration file @file{~/.config/gnunet.conf}
1438 with the lines:
1439
1440 @example
1441 [arm]
1442 START_SYSTEM_SERVICES = NO
1443 START_USER_SERVICES = YES
1444 DEFAULTSERVICES = gns
1445 @end example
1446
1447 @noindent
1448 and start the per-user services using
1449
1450 @example
1451 $ gnunet-arm -c ~/.config/gnunet.conf -s
1452 @end example
1453
1454 @noindent
1455 Again, adding a @code{crontab} entry to autostart the peer is advised:
1456
1457 @example
1458 @@reboot /usr/local/bin/gnunet-arm -c $HOME/.config/gnunet.conf -s
1459 @end example
1460
1461 @noindent
1462 Note that some GNUnet services (such as socks5 proxies) may need a
1463 system-wide TCP port for each user.
1464 For those services, systems with more than one user may require each user
1465 to specify a different port number in their personal configuration file.
1466
1467 Finally, the user should perform the basic initial setup for the GNU Name
1468 System (GNS) certificate authority. This is done by running:
1469
1470 @example
1471 $ gnunet-gns-proxy-setup-ca
1472 @end example
1473
1474 @noindent
This sets up the GNS Certificate Authority with the user's browser.
Now, to activate GNS in the
normal DNS resolution process, you need to edit your
@file{/etc/nsswitch.conf} where you should find a line like this:
1479
1480 @example
1481 hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
1482 @end example
1483
1484 @noindent
1485 The exact details may differ a bit, which is fine. Add the text
1486 @emph{"gns [NOTFOUND=return]"} after @emph{"files"}.
1487 Keep in mind that we included a backslash ("\") here just for
1488 markup reasons. You should write the text below on @b{one line}
1489 and @b{without} the "\":
1490
1491 @example
1492 hosts: files gns [NOTFOUND=return] mdns4_minimal \
1493 [NOTFOUND=return] dns mdns4
1494 @end example
1495
1496 @c FIXME: Document new behavior.
You might want to make sure that @file{/lib/libnss_gns.so.2} exists on
your system; it should have been created during the installation.
1499
1500
1501 @c **********************************************************************
1502 @cindex TESTING library
1503 @node TESTING library
1504 @section TESTING library
1505
The TESTING library is used for writing testcases which involve starting
one or more peers. While testcases can also start peers using the ARM
subsystem, the TESTING library provides a more elegant way to do this.
The configurations of the peers are auto-generated from a given template
with non-conflicting port numbers, ensuring that the peers' services do
not run into bind errors. This is achieved by testing the availability of
each port by binding a listening socket to it before allocating it to a
service in the generated configurations.
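
The port check described above can be illustrated with plain POSIX
sockets. The following is a minimal standalone sketch, not the code used
by TESTING: the helper names @code{try_listen()}, @code{port_is_free()}
and @code{find_free_port()} are hypothetical, and only the loopback
address is probed.

@example
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a listening TCP socket to 127.0.0.1:port.
   Returns the socket on success, -1 if the port is unavailable. */
static int
try_listen (uint16_t port)
@{
  struct sockaddr_in addr;
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  if ( (0 != bind (fd, (const struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 1)) )
  @{
    close (fd);
    return -1;
  @}
  return fd;
@}

/* Returns 1 if a listening socket could be bound to the port, else 0. */
static int
port_is_free (uint16_t port)
@{
  int fd = try_listen (port);

  if (fd < 0)
    return 0;
  close (fd);
  return 1;
@}

/* Returns the first free port in [low, high], or 0 if there is none. */
static uint16_t
find_free_port (uint16_t low, uint16_t high)
@{
  assert (low <= high);
  for (uint32_t p = low; p <= high; p++)
    if (port_is_free ((uint16_t) p))
      return (uint16_t) p;
  return 0;
@}
@end example

A configuration generator could call @code{find_free_port()} once per
service. Such a check is inherently racy: another process may take the
port between the check and the moment the service binds to it.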
1514
Another advantage of TESTING is that it shortens the testcase startup
time: the hostkeys for peers are copied from a pre-computed set of
hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
1520
TESTING also allows certain services to be shared among peers. This
feature is invaluable when testing with multiple peers, as it reduces the
number of services run per peer and hence the total number of processes
run per testcase.
1525
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases, such as connecting peers in a topology, are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the local host, whereas
testcases using TESTBED can create peers across multiple hosts.
1532
1533 @menu
1534 * API::
1535 * Finer control over peer stop::
1536 * Helper functions::
1537 * Testing with multiple processes::
1538 @end menu
1539
1540 @cindex TESTING API
1541 @node API
1542 @subsection API
1543
TESTING abstracts a group of peers as a TESTING system. All peers in a
system share a common hostname, and no two services of these peers use the
same port or UNIX domain socket path.
1547
A TESTING system is created with the function
@code{GNUNET_TESTING_system_create()}, which returns a handle to the
system. This function takes a directory path which is used for generating
the configurations of peers, an IP address from which connections to the
peers' services should be allowed, the hostname to be used in the peers'
configurations, and an array of shared service specifications of type
@code{struct GNUNET_TESTING_SharedService}.
1555
1556 The shared service specification must specify the name of the service to
1557 share, the configuration pertaining to that shared service and the
1558 maximum number of peers that are allowed to share a single instance of
1559 the shared service.
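
As a rough illustration of the effect of sharing, the number of
instances of a shared service needed for a group of peers can be
estimated as below. This is a sketch which assumes that an instance is
filled up to its sharing limit before another one is started; the
function name is hypothetical and not part of the TESTING API.

@example
#include <assert.h>

/* Number of service instances needed so that at most "share" peers
   use each instance (integer division, rounded up). */
static unsigned int
shared_instances_needed (unsigned int n_peers, unsigned int share)
@{
  assert (share > 0);
  return (n_peers + share - 1) / share;
@}
@end example

With 10 peers and a limit of 4 peers per instance, this gives 3
instances instead of the 10 that would run without sharing.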
1560
A TESTING system created with @code{GNUNET_TESTING_system_create()}
chooses ports from the default range @code{12000} - @code{56000} while
auto-generating configurations for peers.
This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters: the start and end of the port range to use.
1568
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag indicating whether to remove the files created in
the directory used to generate configurations.
1573
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated, and the index of the hostkey to copy for the peer. When
successful, this function returns a handle to the peer which can be used
to start and stop it and to obtain the identity of the peer. If
unsuccessful, a NULL pointer is returned together with an error message.
This function ensures that the generated configuration has
non-conflicting ports and paths.
1583
Peers can be started and stopped by calling the functions
@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for use by new
peers.
1590
1591 @c ***********************************************************************
1592 @node Finer control over peer stop
1593 @subsection Finer control over peer stop
1594
Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when shutting
down multiple peers, as it sends the termination signal to the given peer
process and then waits for it to terminate. In this case it is faster to
send the termination signals to all peers first and then wait on them.
This is accomplished with the function @code{GNUNET_TESTING_peer_kill()},
which sends a termination signal to the peer, and the function
@code{GNUNET_TESTING_peer_wait()}, which waits on the peer.
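
The benefit of signalling first and waiting afterwards can be seen in a
standalone POSIX sketch. It does not use the TESTING API: plain child
processes stand in for peers, and the function names
@code{spawn_idle_child()} and @code{stop_all()} are hypothetical.

@example
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that idles until it is signalled.
   Returns the child's pid in the parent, or -1 on error. */
static pid_t
spawn_idle_child (void)
@{
  pid_t pid = fork ();

  if (0 == pid)
  @{
    for (;;)
      pause ();   /* child: sleep until SIGTERM terminates it */
  @}
  return pid;
@}

/* Send SIGTERM to all children first, then wait for each of them, so
   that the children shut down in parallel.
   Returns the number of children terminated by SIGTERM. */
static unsigned int
stop_all (const pid_t *pids, unsigned int n)
@{
  unsigned int stopped = 0;

  assert (NULL != pids);
  for (unsigned int i = 0; i < n; i++)
    (void) kill (pids[i], SIGTERM);
  for (unsigned int i = 0; i < n; i++)
  @{
    int status;

    if ( (pids[i] == waitpid (pids[i], &status, 0)) &&
         WIFSIGNALED (status) &&
         (SIGTERM == WTERMSIG (status)) )
      stopped++;
  @}
  return stopped;
@}
@end example

With this ordering the total shutdown time is roughly that of the
slowest child, rather than the sum over all children.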
1604
Even finer control can be achieved by stopping a peer asynchronously with
the function @code{GNUNET_TESTING_peer_stop_async()}.
This function takes a callback parameter and a closure for it in addition
to the handle of the peer to stop. The callback function is called with
the given closure when the peer is stopped. Using this function
eliminates blocking while waiting for the peer to terminate.
1611
An asynchronous peer stop can be canceled by calling the function
@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the callback
that would be called when the peer is stopped.
1617
1618 @c ***********************************************************************
1619 @node Helper functions
1620 @subsection Helper functions
1621
Most testcases can benefit from an abstraction which configures a peer
and starts it. This is provided by the function
@code{GNUNET_TESTING_peer_run()}. This function takes the testing
directory pathname, a configuration template, a callback and its closure.
It creates a peer in the given testing directory using the configuration
template, starts the peer and calls the given callback with the given
closure.
1629
The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
the peer, which in turn starts the rest of the configured services. A
similar function, @code{GNUNET_TESTING_service_run()}, can be used to
start just a single service of a peer. In this case, the peer's ARM
service is not started; instead, only the given service is run.
1635
1636 @c ***********************************************************************
1637 @node Testing with multiple processes
1638 @subsection Testing with multiple processes
1639
When testing GNUnet, the split of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1649
1650 @c TODO: Is this still compiling and working as intended?
1651 The following code illustrates spawning and killing an ARM process from a
1652 testcase:
1653
1654 @example
static void
run (void *cls,
     char *const *args,
     const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  /* Spawn ARM with the configuration file given to the testcase. */
  arm_pid = GNUNET_OS_start_process (NULL,
                                     NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c",
                                     cfgfile,
                                     NULL);
  /* do real test work here */
  /* Ask ARM to shut down, then wait for it to exit. */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}

/* Called from main(): parses the options and invokes run(). */
GNUNET_PROGRAM_run (argc, argv,
                    "NAME-OF-TEST",
                    "nohelp",
                    options,
                    &run,
                    cls);
1680 @end example
1681
1682
1683 An alternative way that works well to test plugins is to implement a
1684 mock-version of the environment that the plugin expects and then to
1685 simply load the plugin directly.
1686
1687 @c ***********************************************************************
1688 @node Performance regression analysis with Gauger
1689 @section Performance regression analysis with Gauger
1690
1691 To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
1692 simple logging tool that allows remote hosts to send performance data to
1693 a central server, where this data can be analyzed and visualized. Gauger
1694 shows graphs of the repository revisions and the performance data recorded
1695 for each revision, so sudden performance peaks or drops can be identified
1696 and linked to a specific revision number.
1697
1698 In the case of GNUnet, the buildbots log the performance data obtained
1699 during the tests after each build. The data can be accessed on GNUnet's
1700 Gauger page.
1701
The menu on the left lets you select either the results of just one
build bot (under "Hosts") or review the data from all hosts for a given
test result (under "Metrics"). If the absolute values of the results
differ widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea of the
performance evolution across all hosts.
1708
Using Gauger in GNUnet to track the performance of a module over time is
very easy. First, of course, the testcase must generate some consistent
metric that makes sense to log. Highly volatile or randomness-dependent
metrics are probably not ideal candidates for meaningful regression
detection.
1714
To start logging any value, just include @code{gauger.h} in your testcase
code. Then, use the macro @code{GAUGER()} to make the Buildbots log
whatever value is of interest to you to @code{gnunet.org}'s Gauger
server. No setup is necessary, as most Buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
need to contact a member of the GNUnet development team (a file will need
to be removed manually from the respective directory).
1722
1723 The code in the test should look like this:
1724
1725 @example
[other includes]
#include <gauger.h>

int
main (int argc, char *argv[])
@{
  [run test, generate data]
  GAUGER ("YOUR_MODULE",
          "METRIC_NAME",
          (float) value,
          "UNIT");
@}
1736 @end example
1737
1738 Where:
1739
1740 @table @asis
1741
@item @strong{YOUR_MODULE}
is a category on the Gauger page and should be the name of the module or
subsystem, like "Core" or "DHT".
@item @strong{METRIC_NAME}
is the name of the metric being collected and should be concise and
descriptive, like "PUT operations in sqlite-datastore".
@item @strong{value}
is the value of the metric that is logged for this run.
@item @strong{UNIT}
is the unit in which the value is measured, for instance "kb/s" or
"kb of RAM/node".
1751 @end table
1752
1753 If you wish to use Gauger for your own project, you can grab a copy of the
1754 latest stable release or check out Gauger's Subversion repository.
1755
1756 @cindex TESTBED Subsystem
1757 @node TESTBED Subsystem
1758 @section TESTBED Subsystem
1759
1760 The TESTBED subsystem facilitates testing and measuring of multi-peer
1761 deployments on a single host or over multiple hosts.
1762
1763 The architecture of the testbed module is divided into the following:
1764 @itemize @bullet
1765
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting and stopping
peers, etc.

@item Testbed service (controller): A service which is started through the
Testbed API. This service handles operations to create, destroy, start
and stop peers, connect them, and modify their configurations.
1773
1774 @item Testbed helper: When a controller has to be started on a host, the
1775 testbed API starts the testbed helper on that host which in turn starts
1776 the controller. The testbed helper receives a configuration for the
1777 controller through its stdin and changes it to ensure the controller
1778 doesn't run into any port conflict on that host.
1779 @end itemize
1780
1781
The testbed service (controller) is different from the other GNUnet
services in that it is not started by ARM and is not supposed to be run
as a daemon. It is started by the testbed API through a testbed helper.
In a typical scenario involving multiple hosts, a controller is started
on each host. Controllers take up the actual task of creating peers, and
of starting and stopping them on the hosts they run on.
1788
When running deployments on a single local host, the testbed API starts
the testbed helper directly as a child process. When running deployments
on remote hosts, the testbed API starts testbed helpers on each remote
host through a remote shell. By default, the testbed API uses SSH as the
remote shell. This can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1797
1798 @example
1799 export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
1800 -o NoHostAuthenticationForLocalhost=yes %h"
1801 @end example
1802
The command string above may contain placeholders, which begin with a
`%' and are substituted before the command is run.
At present, the following substitutions are supported:
1806
1807 @itemize @bullet
1808 @item %h: hostname
1809 @item %u: username
1810 @item %p: port
1811 @end itemize
1812
Note that a placeholder is replaced only when the corresponding field is
available, and only once. Specifying
1815
1816 @example
1817 %u@@%h
1818 @end example
1819
does not work either. If you want to use username substitutions for
@command{SSH}, pass the argument @code{-l} before the
username substitution.
1823
1824 For example:
1825 @example
1826 ssh -l %u -p %p %h
1827 @end example
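
The substitution rules can be made concrete with a small standalone
sketch. It is not taken from the testbed sources: the function name
@code{expand_rsh_cmd()} is hypothetical, and the sketch assumes that a
placeholder is only recognised when it forms a complete
whitespace-separated argument, which would be one explanation for why
@code{%u@@%h} is not expanded.

@example
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand the template "cmd" into "out".  An argument that is exactly
   "%h", "%u" or "%p" is replaced by host, user or port; each
   placeholder is replaced at most once and only if its value is not
   NULL.  Returns 0 on success, -1 if "out" is too small. */
static int
expand_rsh_cmd (const char *cmd, const char *host, const char *user,
                const char *port, char *out, size_t out_len)
@{
  const char *marks[3] = @{ "%h", "%u", "%p" @};
  const char *vals[3] = @{ host, user, port @};
  int used[3] = @{ 0, 0, 0 @};
  size_t off = 0;

  assert ( (NULL != cmd) && (NULL != out) && (out_len > 0) );
  out[0] = '\0';
  cmd += strspn (cmd, " ");
  while ('\0' != *cmd)
  @{
    size_t len = strcspn (cmd, " ");   /* length of this argument */
    const char *rep = cmd;
    size_t rep_len = len;
    int n;

    for (int i = 0; i < 3; i++)
      if ( (2 == len) && (0 == strncmp (cmd, marks[i], 2)) &&
           (NULL != vals[i]) && (! used[i]) )
      @{
        rep = vals[i];
        rep_len = strlen (vals[i]);
        used[i] = 1;
        break;
      @}
    /* Append the argument, separated by a single space. */
    n = snprintf (out + off, out_len - off, "%s%.*s",
                  (0 == off) ? "" : " ", (int) rep_len, rep);
    if ( (n < 0) || ((size_t) n >= out_len - off) )
      return -1;
    off += (size_t) n;
    cmd += len;
    cmd += strspn (cmd, " ");
  @}
  return 0;
@}
@end example

Under these assumptions, the template @code{ssh -l %u -p %p %h} is fully
expanded when all three fields are known, whereas @code{%p} is left in
place if no port is available.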
1828
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags that make the remote shell produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:
1836
1837 @example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
1839 @end example
1840
1841 Password-less logins should be ensured by using SSH keys.
1842
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like @file{.bashrc} or @file{.profile} may not be
executed. If this is the case, the testbed API can be forced to execute
an interactive shell by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
1848
1849 An example could be:
1850
1851 @example
1852 export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1853 @end example
1854
1855 The testbed API will then execute the remote shell program as:
1856
1857 @example
1858 $GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1859 gnunet-helper-testbed
1860 @end example
1861
1862 On some systems, problems may arise while starting testbed helpers if
1863 GNUnet is installed into a custom location since the helper may not be
1864 found in the standard path. This can be addressed by setting the variable
1865 `@code{HELPER_BINARY_PATH}' to the path of the testbed helper.
1866 Testbed API will then use this path to start helper binaries both
1867 locally and remotely.
1868
The testbed API can be accessed by including the
@file{gnunet_testbed_service.h} file and linking with
@code{-lgnunettestbed}.
1872
1873 @c ***********************************************************************
1874 @menu
1875 * Supported Topologies::
1876 * Hosts file format::
1877 * Topology file format::
1878 * Testbed Barriers::
1879 * TESTBED Caveats::
1880 @end menu
1881
1882 @node Supported Topologies
1883 @subsection Supported Topologies
1884
When testing multi-peer deployments, the peers often need to be connected
in some topology. This requirement is addressed by the function
@code{GNUNET_TESTBED_overlay_connect()}, which connects any two given
peers in the testbed.
1889
1890 The API also provides a helper function
1891 @code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1892 of peers in any of the following supported topologies:
1893
1894 @itemize @bullet
1895
1896 @item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1897 each other
1898
1899 @item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1900 line
1901
1902 @item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1903 ring topology
1904
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2-dimensional torus topology. If the number of peers is not a
perfect square, the resulting torus may not have uniform poloidal and
toroidal lengths

@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
to form a random graph. The number of links to be present must be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D torus with some random links among them. The number of random
links must be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links must be given
1920
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well-connected peers.
1924 (See Emergence of Scaling in Random Networks. Science 286,
1925 509-512, 1999
1926 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf}))
1927
1928 @item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1929 is loaded from a file. The path to the file has to be given.
1930 @xref{Topology file format}, for the format of this file.
1931
1932 @item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1933 @end itemize
1934
1935
The topologies listed above are selected by setting the variable
@code{OVERLAY_TOPOLOGY} to the corresponding value below (given in the
same order) in the configuration passed to the Testbed API functions
@code{GNUNET_TESTBED_test_run()} and
@code{GNUNET_TESTBED_run()}:

@itemize @bullet
@item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1946 @item @code{2D_TORUS}
1947 @item @code{RANDOM}
1948 @item @code{SMALL_WORLD}
1949 @item @code{SMALL_WORLD_RING}
1950 @item @code{SCALE_FREE}
1951 @item @code{FROM_FILE}
1952 @item @code{NONE}
1953 @end itemize
1954
1955
1956 Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1957 require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1958 random links to be generated in the configuration. The option will be
1959 ignored for the rest of the topologies.
1960
1961 Topology @code{SCALE_FREE} requires the options
1962 @code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
1963 which can connect to a peer and @code{SCALE_FREE_TOPOLOGY_M} to be set to
1964 how many peers a peer should be at least connected to.
1965
1966 Similarly, the topology @code{FROM_FILE} requires the option
1967 @code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1968 the topology information. This option is ignored for the rest of the
1969 topologies. @xref{Topology file format}, for the format of this file.
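
For instance, a configuration selecting a ring topology with additional
random links might contain the fragment below. This is only a sketch:
the values are made up, and it assumes that these options are read from
the @code{[testbed]} section of the configuration.

@example
[testbed]
OVERLAY_TOPOLOGY = SMALL_WORLD_RING
OVERLAY_RANDOM_LINKS = 10
@end example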
1970
1971 @c ***********************************************************************
1972 @node Hosts file format
1973 @subsection Hosts file format
1974
The testbed API offers the function
@code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
details about the hosts which testbed can use for deploying peers.
This function is useful for keeping the data about hosts
separate instead of hard-coding it in the program.
1980
1981 Another helper function from testbed API, @code{GNUNET_TESTBED_run()}
1982 also takes a hosts file name as its parameter. It uses the above
1983 function to populate the hosts data structures and start controllers to
1984 deploy peers.
1985
1986 These functions require the hosts file to be of the following format:
1987 @itemize @bullet
1988 @item Each line is interpreted to have details about a host
@item Host details should include the username to use for logging into the
host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
1992 @item These details should be given in the following format:
1993 @example
1994 <username>@@<hostname>:<port>
1995 @end example
1996 @end itemize
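
To make the format concrete, one line of such a file could be parsed as
sketched below. This is a standalone illustration and not the parser
used by the testbed API: the type @code{struct HostEntry}, the function
@code{parse_host_line()} and the buffer sizes are assumptions, and
hostnames containing a colon (such as IPv6 literals) are not handled.

@example
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct HostEntry
@{
  char username[64];
  char hostname[256];
  unsigned int port;
@};

/* Parse a line of the form "<username>@@<hostname>:<port>".
   Returns 0 on success, -1 if the line is malformed. */
static int
parse_host_line (const char *line, struct HostEntry *he)
@{
  char extra;

  assert ( (NULL != line) && (NULL != he) );
  memset (he, 0, sizeof (*he));
  /* The trailing " %c" only matches if something other than
     whitespace follows the port, so 4 conversions mean garbage. */
  if (3 != sscanf (line, "%63[^@@]@@%255[^:]:%u %c",
                   he->username, he->hostname, &he->port, &extra))
    return -1;
  if ( (0 == he->port) || (he->port > 65535) )
    return -1;
  return 0;
@}
@end example

A line such as @code{alice@@192.0.2.10:22} yields the username
@code{alice}, the hostname @code{192.0.2.10} and the port 22, while a
line lacking the username or the port is rejected.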
1997
Note that having canonical hostnames may cause problems while resolving
the IP addresses (See this bug). Hence it is advised to provide the hosts'
numeric IP addresses as hostnames whenever possible.
2001
2002 @c ***********************************************************************
2003 @node Topology file format
2004 @subsection Topology file format
2005
2006 A topology file describes how peers are to be connected. It should adhere
2007 to the following format for testbed to parse it correctly.
2008
Each line should begin with the target peer id. This should be followed by
a colon (`:') and the origin peer ids separated by `|'. All spaces except
for newline characters are ignored. The API will then try to connect each
origin peer to the target peer.
2013
For example, the following file results in 5 overlay connections,
[2->1], [3->1], [4->3], [0->3] and [2->0]:

@example
1:2|3
3:4| 0
0: 2
@end example
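
The line format described above could be parsed along the following
lines. This is a standalone sketch, not the parser used by the testbed
API: @code{struct Link}, @code{parse_topology_line()} and the line
length limit are assumptions made for the illustration.

@example
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

struct Link
@{
  unsigned long origin;   /* peer that initiates the connection */
  unsigned long target;   /* peer that is connected to */
@};

/* Parse one line of the form "target:origin|origin|...", ignoring
   whitespace.  Stores the links in "links" (room for "max" entries).
   Returns the number of links, or -1 on a syntax error, an overlong
   line or too many origins. */
static int
parse_topology_line (const char *line, struct Link *links, int max)
@{
  char buf[256];
  size_t n = 0;
  char *pos;
  char *end;
  unsigned long target;
  int count = 0;

  assert ( (NULL != line) && (NULL != links) );
  for (; '\0' != *line; line++)
  @{
    if (isspace ((unsigned char) *line))
      continue;               /* spaces carry no meaning */
    if (n + 1 >= sizeof (buf))
      return -1;
    buf[n++] = *line;
  @}
  buf[n] = '\0';
  if (! isdigit ((unsigned char) buf[0]))
    return -1;
  target = strtoul (buf, &end, 10);
  if (':' != *end)
    return -1;
  for (pos = end + 1; count < max; pos = end + 1)
  @{
    if (! isdigit ((unsigned char) *pos))
      return -1;
    links[count].origin = strtoul (pos, &end, 10);
    links[count].target = target;
    count++;
    if ('\0' == *end)
      return count;
    if ('|' != *end)
      return -1;
  @}
  return -1;                  /* more origins than "max" */
@}
@end example

Each returned link corresponds to one connection attempt from the origin
peer to the target peer.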
2017
2018 @c ***********************************************************************
2019 @node Testbed Barriers
2020 @subsection Testbed Barriers
2021
The testbed subsystem's barriers API facilitates coordination among the
peers run by the testbed and the experiment driver. The concept is
similar to the barrier synchronisation mechanism found in parallel
programming or multi-threading paradigms: a peer waits at a barrier upon
reaching it until the barrier is reached by a predefined number of peers.
This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for
the barrier to be crossed. Similarly, a barrier is said to be reached if
the required quorum of peers have reached the barrier. A barrier which is
reached is deemed crossed after all the peers waiting on it are notified.
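
The distinction between a barrier being reached and being crossed can be
modelled with a few lines of C. This is a conceptual sketch only; the
types and functions below are hypothetical and unrelated to the actual
barriers API.

@example
#include <assert.h>
#include <stddef.h>

enum BarrierState
@{
  BARRIER_OPEN,      /* initialised, quorum not yet met */
  BARRIER_REACHED,   /* quorum met, waiters not yet notified */
  BARRIER_CROSSED    /* all waiters have been notified */
@};

struct Barrier
@{
  unsigned int quorum;    /* peers required to reach the barrier */
  unsigned int waiting;   /* peers currently waiting at the barrier */
  enum BarrierState state;
@};

static void
barrier_init (struct Barrier *b, unsigned int quorum)
@{
  assert ( (NULL != b) && (quorum > 0) );
  b->quorum = quorum;
  b->waiting = 0;
  b->state = BARRIER_OPEN;
@}

/* A peer arrives at the barrier and starts waiting. */
static void
barrier_wait (struct Barrier *b)
@{
  assert (NULL != b);
  b->waiting++;
  if ( (BARRIER_OPEN == b->state) && (b->waiting >= b->quorum) )
    b->state = BARRIER_REACHED;
@}

/* Notify all waiting peers once the barrier is reached.
   Returns the number of peers notified (0 if not yet reached). */
static unsigned int
barrier_notify (struct Barrier *b)
@{
  unsigned int notified;

  assert (NULL != b);
  if (BARRIER_REACHED != b->state)
    return 0;
  notified = b->waiting;
  b->waiting = 0;
  b->state = BARRIER_CROSSED;
  return notified;
@}
@end example

With a quorum of 2, the first call to @code{barrier_wait()} leaves the
barrier open, the second one makes it reached, and a subsequent
@code{barrier_notify()} marks it as crossed.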
2032
2033 The barriers API provides the following functions:
2034 @itemize @bullet
2035 @item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
2036 initialize a barrier in the experiment
2037 @item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
2038 a barrier which has been initialized before
2039 @item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
2040 barrier service that the caller has reached a barrier and is waiting for
2041 it to be crossed
2042 @item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
2043 stop waiting for a barrier to be crossed
2044 @end itemize
2045
2046
2047 Among the above functions, the first two, namely
2048 @code{GNUNET_TESTBED_barrier_init()} and
2049 @code{GNUNET_TESTBED_barrier_cancel()} are used by experiment drivers. All
2050 barriers should be initialised by the experiment driver by calling
2051 @code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
2052 identify the barrier, the quorum required for the barrier to be crossed
2053 and a notification callback for notifying the experiment driver when the
barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.
2057
The remaining two functions, @code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}, are used in the peers'
processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
2061 barrier service running on the same host the peer is running on and
2062 registers that the caller has reached the barrier and is waiting for the
2063 barrier to be crossed. Note that this function can only be used by peers
2064 which are started by testbed as this function tries to access the local
2065 barrier service which is part of the testbed controller service. Calling
2066 @code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
2067 in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
2068 notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
2069
2070
2071 @c ***********************************************************************
2072 @menu
2073 * Implementation::
2074 @end menu
2075
2076 @node Implementation
2077 @subsubsection Implementation
2078
Since barriers involve coordination between the experiment driver and
peers, the barrier service in the testbed controller is split into two
components. The first component responds to the messages generated by the
barrier API used by the experiment driver (functions
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
messages generated by the barrier API used by peers (functions
@code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}).
2088
Calling @code{GNUNET_TESTBED_barrier_init()} sends a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
this way, barrier initialisation is propagated down the controller
hierarchy. While propagating initialisation, any errors at a
subcontroller, such as a timeout during further propagation, are reported
up the hierarchy back to the experiment driver.
2097
Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.
2102
The second component is implemented as a separate service in the binary
`gnunet-service-testbed', which already contains the testbed controller
service. Although this deviates from the GNUnet process architecture of
having one service per binary, it is needed in this case as this component
needs access to barrier data created by the first component. This
component responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT}
messages from local peers when they call
@code{GNUNET_TESTBED_barrier_wait()}. Upon receiving a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the service
checks whether the requested barrier has been initialised. If it has not,
an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.
2118
When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
parent informing it that the barrier is crossed. If the controller has
2123 started further subcontrollers, it delays this message until it receives
2124 a similar notification from each of those subcontrollers. Finally, the
2125 barriers API at the experiment driver receives the
2126 @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} when the barrier is
2127 reached at all the controllers.
2128
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller down the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between the downward
and the upward propagation of the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message: the upward
propagation ensures that the barrier is reached by all the controllers,
while the downward propagation signals that the barrier is crossed.
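
The two directions of propagation can be pictured with a small tree
model. This is a conceptual sketch which replaces message passing by
direct function calls; @code{struct Controller} and both functions are
hypothetical and do not appear in the testbed sources.

@example
#include <assert.h>
#include <stddef.h>

#define MAX_SUBCONTROLLERS 4

struct Controller
@{
  int local_reached;    /* local peers attained the quorum */
  int crossed;          /* crossing was announced to local peers */
  struct Controller *sub[MAX_SUBCONTROLLERS];
  unsigned int n_sub;
@};

/* Upward direction: a controller may report "reached" to its parent
   only if it is reached locally and all of its subcontrollers have
   reported the same. */
static int
reports_reached (const struct Controller *c)
@{
  assert (NULL != c);
  if (! c->local_reached)
    return 0;
  for (unsigned int i = 0; i < c->n_sub; i++)
    if (! reports_reached (c->sub[i]))
      return 0;
  return 1;
@}

/* Downward direction: announce the crossing to a controller and all
   of its subcontrollers.  Returns the number of controllers informed. */
static unsigned int
announce_crossed (struct Controller *c)
@{
  unsigned int informed = 1;

  assert (NULL != c);
  c->crossed = 1;
  for (unsigned int i = 0; i < c->n_sub; i++)
    informed += announce_crossed (c->sub[i]);
  return informed;
@}
@end example

In this model the experiment driver would call
@code{announce_crossed()} on the master controller only after
@code{reports_reached()} holds for it, mirroring the echo of the status
message described above.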
2141
2142 @cindex TESTBED Caveats
2143 @node TESTBED Caveats
2144 @subsection TESTBED Caveats
2145
2146 This section documents a few caveats when using the GNUnet testbed
2147 subsystem.
2148
2149 @c ***********************************************************************
2150 @menu
2151 * CORE must be started::
2152 * ATS must want the connections::
2153 @end menu
2154
2155 @node CORE must be started
2156 @subsubsection CORE must be started
2157
A simple issue is bug #3993
(@uref{https://bugs.gnunet.org/view.php?id=3993, https://bugs.gnunet.org/view.php?id=3993}):
Your configuration MUST somehow ensure that for each peer the
@code{CORE} service is started when the peer is set up, otherwise
2162 @code{TESTBED} may fail to connect peers when the topology is initialized,
2163 as @code{TESTBED} will start some @code{CORE} services but not
2164 necessarily all (but it relies on all of them running). The easiest way
2165 is to set
2166
2167 @example
2168 [core]
2169 IMMEDIATE_START = YES
2170 @end example
2171
2172 @noindent
2173 in the configuration file.
2174 Alternatively, having any service that directly or indirectly depends on
2175 @code{CORE} being started with @code{IMMEDIATE_START} will also do.
2176 This issue largely arises if users try to over-optimize by not
2177 starting any services with @code{IMMEDIATE_START}.
2178
2179 @c ***********************************************************************
2180 @node ATS must want the connections
2181 @subsubsection ATS must want the connections
2182
2183 When TESTBED sets up connections, it only offers the respective HELLO
2184 information to the TRANSPORT service. It is then up to the ATS service to
2185 @strong{decide} to use the connection. The ATS service will typically
2186 eagerly establish any connection if the number of total connections is
2187 low (relative to bandwidth). Details may further depend on the
2188 specific ATS backend that was configured. If ATS decides to NOT establish
2189 a connection (even though TESTBED provided the required information), then
2190 that connection will count as failed for TESTBED. Note that you can
2191 configure TESTBED to tolerate a certain number of connection failures
2192 (see '-e' option of gnunet-testbed-profiler). This issue largely arises
2193 for dense overlay topologies, especially if you try to create cliques
2194 with more than 20 peers.
2195
2196 @cindex libgnunetutil
2197 @node libgnunetutil
2198 @section libgnunetutil
2199
libgnunetutil is the fundamental library that all GNUnet code builds upon.
Ideally, this library should contain most of the platform-dependent code
(except for user interfaces and really special needs that only few
applications have). It is also supposed to offer basic services that most
if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util_lib.h} header. The functions provided by libgnunetutil
fall roughly into the following categories (in roughly the order of
importance for new developers):
2209
2210 @itemize @bullet
2211 @item logging (common_logging.c)
2212 @item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
2214 @item internationalization (common_gettext.c)
2215 @item String manipulation (string.c)
2216 @item file access (disk.c)
2217 @item buffered disk IO (bio.c)
2218 @item time manipulation (time.c)
2219 @item configuration parsing (configuration.c)
2220 @item command-line handling (getopt*.c)
2221 @item cryptography (crypto_*.c)
2222 @item data structures (container_*.c)
2223 @item CPS-style scheduling (scheduler.c)
2224 @item Program initialization (program.c)
2225 @item Networking (network.c, client.c, server*.c, service.c)
2226 @item message queuing (mq.c)
2227 @item bandwidth calculations (bandwidth.c)
2228 @item Other OS-related (os*.c, plugin.c, signal.c)
2229 @item Pseudonym management (pseudonym.c)
2230 @end itemize
2231
It should be noted that only developers who fully understand this entire
API will be able to write good GNUnet code.
2234
2235 Ideally, porting GNUnet should only require porting the gnunetutil
2236 library. More testcases for the gnunetutil APIs are therefore a great
2237 way to make porting of GNUnet easier.
2238
2239 @menu
2240 * Logging::
2241 * Interprocess communication API (IPC)::
2242 * Cryptography API::
2243 * Message Queue API::
2244 * Service API::
2245 * Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
2246 * CONTAINER_MDLL API::
2247 @end menu
2248
2249 @cindex Logging
2250 @cindex log levels
2251 @node Logging
2252 @subsection Logging
2253
2254 GNUnet is able to log its activity, mostly for the purposes of debugging
2255 the program at various levels.
2256
2257 @file{gnunet_common.h} defines several @strong{log levels}:
2258 @table @asis
2259
2260 @item ERROR for errors
2261 (really problematic situations, often leading to crashes)
2262 @item WARNING for warnings
2263 (troubling situations that might have negative consequences, although
2264 not fatal)
2265 @item INFO for various information.
2266 Used somewhat rarely, as GNUnet statistics is used to hold and display
2267 most of the information that users might find interesting.
2268 @item DEBUG for debugging.
Does not produce much output on normal builds, but when extra logging is
enabled at compile time, a staggering amount of data is output under
this log level.
2272 @end table
2273
2274
Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
are supposed to log nothing under DEBUG level. The
@code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
amounts of log data, which is inconvenient when one tries to hunt down a
specific problem.
2281
2282 To mitigate this problem, GNUnet provides facilities to apply a filter to
2283 reduce the logs:
2284 @table @asis
2285
@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set the log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or
"--log=loglevel" option. If used, it makes the process set a global log
level to "loglevel". Thus it is possible to run some processes
with -L DEBUG, for example, and others with -L ERROR, to diagnose
problems with a particular process.
@item Configuration files.
Because GNUnet
service and daemon processes are usually launched by gnunet-arm, it is not
possible to pass different custom command line options directly to every
one of them. The options passed to @code{gnunet-arm} only affect
gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service
or a daemon, and put a value of "-L loglevel" there. This will make the
respective service or daemon set its log level to "loglevel" (as the
value of OPTIONS will be passed as a command-line argument).
2306
2307 To specify the same log level for all services without creating separate
2308 "OPTIONS" entries in the configuration for each one, the user can specify
2309 a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
2310 file. The value of GLOBAL_POSTFIX will be appended to all command lines
2311 used by the ARM service to run other services. It can contain any option
2312 valid for all GNUnet commands, thus in particular the "-L loglevel"
option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
to set the log level for it, one has to specify the "OPTIONS" key in the
[arm] section.
2316 @item Environment variables.
2317 Setting global per-process log levels with "-L loglevel" does not offer
2318 sufficient log filtering granularity, as one service will call interface
2319 libraries and supporting libraries of other GNUnet services, potentially
2320 producing lots of debug log messages from these libraries. Also, changing
2321 the config file is not always convenient (especially when running the
GNUnet test suite). To fix that, and to allow GNUnet to use different
log filtering at runtime without re-compiling the whole source tree, the
log calls were changed to be configurable at run time. To configure them,
one has to define the environment variables "GNUNET_FORCE_LOGFILE",
"GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
2327 @itemize @bullet
2328
2329 @item "GNUNET_LOG" only affects the logging when no global log level is
2330 configured by any other means (that is, the process does not explicitly
2331 set its own log level, there are no "-L loglevel" options on command line
2332 or in configuration files), and can be used to override the default
2333 WARNING log level.
2334
2335 @item "GNUNET_FORCE_LOG" will completely override any other log
2336 configuration options given.
2337
2338 @item "GNUNET_FORCE_LOGFILE" will completely override the location of the
2339 file to log messages to. It should contain a relative or absolute file
2340 name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing
2341 "--log-file=logfile" or "-l logfile" option (see below). It supports "[]"
2342 format in file names, but not "@{@}" (see below).
2343 @end itemize
2344
2345
2346 Because environment variables are inherited by child processes when they
2347 are launched, starting or re-starting the ARM service with these
2348 variables will propagate them to all other services.
2349
The "GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
formatted @strong{logging definition} string, which looks like this:
2352
2353 @c FIXME: Can we close this with [/component] instead?
2354 @example
2355 [component];[file];[function];[from_line[-to_line]];loglevel[/component...]
2356 @end example
2357
That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a
slash to its end (although it is not forbidden either).

All definition fields (component, file, function, lines and loglevel) must
be present, but all of them except the loglevel can be empty. An empty
field means "match anything". Note that even if fields are empty, the
semicolon (';') separators must be present. The loglevel field must
contain one of the log level names (ERROR, WARNING, INFO or DEBUG).

The lines field may contain one non-negative number, in which case it
matches only one line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including
both start and end line).

GNUnet mostly defaults the component name to the
name of the service that is implemented in a process ('transport',
'core', 'peerinfo', etc), but logging calls can specify custom component
names using @code{GNUNET_log_from}. File name and function name are
provided by the compiler (@code{__FILE__} and @code{__FUNCTION__}
built-ins).
2374
2375 Component, file and function fields are interpreted as non-extended
2376 regular expressions (GNU libc regex functions are used). Matching is
2377 case-sensitive, "^" and "$" will match the beginning and the end of the
2378 text. If a field is empty, its contents are automatically replaced with
2379 a ".*" regular expression, which matches anything. Matching is done in
2380 the default way, which means that the expression matches as long as it's
2381 contained anywhere in the string. Thus "GNUNET_" will match both
2382 "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
2383 the expression matches at the start and/or at the end of the string.
2384 The semicolon (';') can't be escaped, and GNUnet will not use it in
2385 component names (it can't be used in function names and file names
2386 anyway).
2387
2388 @end table
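
The configuration-file settings described in the table above can be
sketched as follows. These fragments are only illustrations: the section
name @code{[transport]} and the chosen log levels are assumptions made
for the example. The first fragment sets the log level of a single
service:

@example
[transport]
OPTIONS = -L DEBUG
@end example

The second fragment passes the same option to all services started by
ARM, and sets the level of the ARM service itself separately, because
ARM is unaffected by GLOBAL_POSTFIX:

@example
[arm]
GLOBAL_POSTFIX = -L INFO
OPTIONS = -L INFO
@end example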
2389
2390
Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log
definition match the call arguments, then the call's log level is compared
to the log level of that definition. If the call's log level is less than
or equal to the definition's log level, the call is allowed to proceed.
Otherwise the logging call is forbidden, and nothing is logged. If no
definition matches at all, GNUnet will use the global log level or (if a
global log level is not specified) will default to WARNING (that is, it
will allow the call to proceed if its level is less than or equal to the
global log level or to WARNING).
2401
That is, definitions are evaluated from left to right, and the first
matching definition is used to allow or deny the logging call. Thus it is
advised to place narrow definitions at the beginning of the logdef
string, and generic definitions at the end.
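
The evaluation order described above can be sketched in a few lines of
self-contained C. This is only an illustration under stated assumptions,
not GNUnet's actual implementation: @code{struct sketch_logdef}, the
@code{sketch_*} functions and the numeric level values are hypothetical,
and the sketch compiles each regular expression on every call instead of
caching the result.

@example
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-ins for the log levels; smaller is more severe. */
enum sketch_level @{ SK_ERROR = 1, SK_WARNING = 2, SK_INFO = 3, SK_DEBUG = 4 @};

/* One parsed definition entry (names are illustrative only). */
struct sketch_logdef
@{
  const char *component;  /* basic regex; "" means "match anything" */
  const char *file;
  const char *function;
  int from_line;          /* inclusive range of matching lines */
  int to_line;
  enum sketch_level level;
@};

/* Return 1 if the basic (non-extended) regex matches anywhere in text. */
static int
sketch_field_matches (const char *pattern, const char *text)
@{
  regex_t re;
  int matched;

  if ('\0' == pattern[0])
    pattern = ".*";
  if (0 != regcomp (&re, pattern, REG_NOSUB))
    return 0;
  matched = (0 == regexec (&re, text, 0, NULL, 0));
  regfree (&re);
  return matched;
@}

/* Return 1 if the call may log, 0 if it is suppressed. */
int
sketch_call_allowed (const struct sketch_logdef *defs, size_t n,
                     enum sketch_level fallback,
                     const char *component, const char *file,
                     const char *function, int line,
                     enum sketch_level call_level)
@{
  size_t i;

  for (i = 0; i < n; i++)
    @{
      const struct sketch_logdef *d = &defs[i];

      if (sketch_field_matches (d->component, component) &&
          sketch_field_matches (d->file, file) &&
          sketch_field_matches (d->function, function) &&
          (line >= d->from_line) && (line <= d->to_line))
        return call_level <= d->level;  /* first match decides */
    @}
  return call_level <= fallback;        /* global level or WARNING */
@}
@end example

With the two entries of "transport.*;;.*send.*;;DEBUG/;;;;WARNING", a
DEBUG call from a function named @code{do_send} in component
@code{transport} is allowed by the first entry, while the same call from
component @code{core} falls through to the second entry and is
suppressed.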
2406
2407 Whether a call is allowed or not is only decided the first time this
2408 particular call is made. The evaluation result is then cached, so that
2409 any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect the process performance.
2412 Log definition parsing is only done once, at the first call to
2413 @code{GNUNET_log_setup ()} made by the process (which is usually
2414 done soon after it starts).
2415
2416 At the moment of writing there is no way to specify logging definitions
2417 from configuration files, only via environment variables.
2418
2419 At the moment GNUnet will stop processing a log definition when it
2420 encounters an error in definition formatting or an error in regular
2421 expression syntax, and will not report the failure in any way.
2422
2423
2424 @c ***********************************************************************
2425 @menu
2426 * Examples::
2427 * Log files::
2428 * Updated behavior of GNUNET_log::
2429 @end menu
2430
2431 @node Examples
2432 @subsubsection Examples
2433
2434 @table @asis
2435
@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running all processes with DEBUG level (one
should be careful with it, as log files will grow at an alarming rate!)
@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running the core service under DEBUG level
(everything else will use configured or default level).
2442
2443 @item Start GNUnet process tree, allowing any logging calls from
2444 gnunet-service-transport_validation.c (everything else will use
2445 configured or default level).
2446
2447 @example
GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" \
2449 gnunet-arm -s
2450 @end example
2451
@item Start GNUnet process tree, allowing any logging calls from
gnunet-service-fs_push.c (everything else will use configured or
default level).
2455
2456 @example
2457 GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2458 @end example
2459
2460 @item Start GNUnet process tree, allowing any logging calls from the
2461 GNUNET_NETWORK_socket_select function (everything else will use
2462 configured or default level).
2463
2464 @example
2465 GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2466 @end example
2467
@item Start GNUnet process tree, allowing any logging calls from the
components that have "transport" in their names, and are made from
functions that have "send" in their names. Everything else will be allowed
to be logged only if it has WARNING level or a more severe one.
2472
2473 @example
2474 GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2475 @end example
2476
2477 @end table
2478
2479
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:

@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
@end example

(Note the absence of double quotes in the environment variable definition,
as opposed to earlier examples, which use the shell.)
Another limitation on Windows is that GNUNET_FORCE_LOGFILE @strong{MUST}
be set in order for GNUNET_FORCE_LOG to work.
2492
2493
2494 @cindex Log files
2495 @node Log files
2496 @subsubsection Log files
2497
GNUnet can be told to log everything into a file instead of stderr (which
is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed via the command line, or from the "OPTIONS" and
"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
with this option is subject to GNUnet filename expansion. If specified in
"GLOBAL_POSTFIX", it is also subject to ARM service filename expansion.
In particular, it may contain a "@{@}" (left and right curly brace)
sequence, which will be replaced by ARM with the name of the service.
This is used to keep logs from more than one service separate, while only
specifying one template containing "@{@}" in GLOBAL_POSTFIX.
2508
As part of a secondary file name expansion, the first occurrence of the
"[]" sequence ("left square brace" followed by "right square brace") in
the file name will be replaced with the process identifier of the process
when it initializes its logging subsystem. As a result, all processes will
log into different files. This is convenient for isolating messages of a
particular process, and prevents I/O races when multiple processes try to
write into the file at the same time. This expansion is done
independently of the "@{@}" expansion that the ARM service does (see
above).
2517
2518 The log file name that is specified via "-l" can contain format characters
2519 from the 'strftime' function family. For example, "%Y" will be replaced
2520 with the current year. Using "basename-%Y-%m-%d.log" would include the
2521 current year, month and day in the log file. If a GNUnet process runs for
2522 long enough to need more than one log file, it will eventually clean up
2523 old log files. Currently, only the last three log files (plus the current
2524 log file) are preserved. So once the fifth log file goes into use (so
2525 after 4 days if you use "%Y-%m-%d" as above), the first log file will be
2526 automatically deleted. Note that if your log file name only contains "%Y",
2527 then log files would be kept for 4 years and the logs from the first year
2528 would be deleted once year 5 begins. If you do not use any date-related
2529 string format codes, logs would never be automatically deleted by GNUnet.
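
As an illustration of how the three expansions described above combine,
a template like the following could be used. This is only a sketch: the
directory @file{/tmp/gnunet-logs} is an arbitrary choice for the example
and is assumed to exist already.

@example
[arm]
GLOBAL_POSTFIX = -l /tmp/gnunet-logs/@{@}-[]-%Y-%m-%d.log
@end example

With such a template, ARM replaces "@{@}" with the name of each service,
each process replaces "[]" with its process identifier, and the
'strftime' codes insert the current date, so that a new log file is
started every day and older files are cleaned up as described above.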
2530
2531
2532 @c ***********************************************************************
2533
2534 @node Updated behavior of GNUNET_log
2535 @subsubsection Updated behavior of GNUNET_log
2536
2537 It's currently quite common to see constructions like this all over the
2538 code:
2539
2540 @example
2541 #if MESH_DEBUG
2542 GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2543 #endif
2544 @end example
2545
2546 The reason for the #if is not to avoid displaying the message when
2547 disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2548 compiler including it in the binary at all, when compiling GNUnet for
2549 platforms with restricted storage space / memory (MIPS routers,
2550 ARM plug computers / dev boards, etc).
2551
This presents several problems: the code gets ugly and hard to write, and
it is very easy to forget to include the #if guards, creating inconsistent
code. A new change in GNUNET_log aims to solve these problems.
2555
@strong{This change requires running @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2558
2559 Here is an example of code with dense debug statements:
2560
2561 @example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
2571 @end example
2572
2573
2574 Pretty hard to follow, huh?
2575
2576 From now on, it is not necessary to include the #if / #endif statements to
2577 achieve the same behavior. The @code{GNUNET_log} and @code{GNUNET_log_from}
2578 macros take care of it for you, depending on the configure option:
2579
2580 @itemize @bullet
2581 @item If @code{--enable-logging} is set to @code{no}, the binary will
2582 contain no log messages at all.
2583 @item If @code{--enable-logging} is set to @code{yes}, the binary will
2584 contain no DEBUG messages, and therefore running with @command{-L DEBUG}
2585 will have
2586 no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
2587 @item If @code{--enable-logging} is set to @code{verbose}, or
2588 @code{veryverbose} the binary will contain DEBUG messages (still, it will
2589 be necessary to run with @command{-L DEBUG} or set the DEBUG config option
2590 to show them).
2591 @end itemize
2592
2593
2594 If you are a developer:
2595 @itemize @bullet
2596 @item please make sure that you @code{./configure
2597 --enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
2598 @item please remove the @code{#if} statements around @code{GNUNET_log
2599 (GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
2600 code.
2601 @end itemize
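
The general technique behind this change, letting the logging macro
compare the level of each call against a compile-time constant, can be
illustrated with a small self-contained sketch. The macro
@code{SKETCH_LOG}, the constant @code{SKETCH_MAX_COMPILED_LEVEL} and the
level values are hypothetical and are not the actual GNUnet definitions.
Because the condition is constant, an optimizing compiler can typically
drop disabled calls from the binary, so no @code{#if} guard is needed at
the call site.

@example
#include <stdio.h>

/* Hypothetical levels and names; not the actual GNUnet definitions. */
#define SKETCH_LEVEL_WARNING 2
#define SKETCH_LEVEL_DEBUG 4

/* Highest level compiled into the binary; a build system would set it. */
#ifndef SKETCH_MAX_COMPILED_LEVEL
#define SKETCH_MAX_COMPILED_LEVEL SKETCH_LEVEL_WARNING
#endif

int sketch_emitted;  /* counts the messages that were actually logged */

#define SKETCH_LOG(level, ...)                          \
  do @{                                                  \
    if ((level) <= SKETCH_MAX_COMPILED_LEVEL)           \
      @{                                                 \
        sketch_emitted++;                               \
        fprintf (stderr, __VA_ARGS__);                  \
      @}                                                 \
  @} while (0)

/* The call sites need no #if / #endif guards. */
void
sketch_client_disconnected (void)
@{
  SKETCH_LOG (SKETCH_LEVEL_DEBUG, "client disconnected\n");
  SKETCH_LOG (SKETCH_LEVEL_WARNING, "client left without goodbye\n");
@}
@end example

With the default setting in this sketch, only the WARNING message is
emitted. Building with @code{-DSKETCH_MAX_COMPILED_LEVEL=4} would enable
the DEBUG message as well.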
2602
2603 Since now activating DEBUG automatically makes it VERBOSE and activates
2604 @strong{all} debug messages by default, you probably want to use the
2605 @uref{https://docs.gnunet.org/#Logging, https://docs.gnunet.org/#Logging}
2606 functionality to filter only relevant messages.
2607 A suitable configuration could be:
2608
2609 @example
2610 $ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2611 @end example
2612
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course you can adapt it to your particular needs; this is only
a quick example.
2616
2617 @cindex Interprocess communication API
@cindex IPC
2619 @node Interprocess communication API (IPC)
2620 @subsection Interprocess communication API (IPC)
2621
In GNUnet a variety of new message types might be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement the message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2630
2631
2632 @c ***********************************************************************
2633 @menu
2634 * Define new message types::
2635 * Define message struct::
2636 * Client - Establish connection::
2637 * Client - Initialize request message::
2638 * Client - Send request and receive response::
2639 * Server - Startup service::
2640 * Server - Add new handles for specified messages::
2641 * Server - Process request message::
2642 * Server - Response to client::
2643 * Server - Notification of clients::
2644 * Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2645 @end menu
2646
2647 @node Define new message types
2648 @subsubsection Define new message types
2649
2650 First of all, you should define the new message type in
2651 @file{gnunet_protocols.h}:
2652
2653 @example
// Request to look up addresses of peers in the server.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
// Response to the address lookup request.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2658 @end example
2659
2660 @c ***********************************************************************
2661 @node Define message struct
2662 @subsubsection Define message struct
2663
2664 After the type definition, the specified message structure should also be
2665 described in the header file, e.g. transport.h in our case.
2666
2667 @example
GNUNET_NETWORK_STRUCT_BEGIN
struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2676 @end example
2677
2678
Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED},
which together ensure a consistent, padding-free layout when sending
structs over the network.
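
Why packing matters can be seen in a small self-contained sketch. It
assumes a GCC-compatible compiler and uses the
@code{__attribute__ ((packed))} extension directly; the struct and
function names are hypothetical and are not part of GNUnet.

@example
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message layout, declared without packing. */
struct sketch_plain
@{
  uint16_t size;
  uint16_t type;
  uint8_t flag;
  uint64_t timeout;
@};

/* Same members; the GCC/Clang attribute removes all padding. */
struct sketch_packed
@{
  uint16_t size;
  uint16_t type;
  uint8_t flag;
  uint64_t timeout;
@} __attribute__ ((packed));

size_t
sketch_plain_size (void)
@{
  return sizeof (struct sketch_plain);
@}

size_t
sketch_packed_size (void)
@{
  return sizeof (struct sketch_packed);
@}

size_t
sketch_packed_timeout_offset (void)
@{
  return offsetof (struct sketch_packed, timeout);
@}
@end example

The packed variant always occupies 2 + 2 + 1 + 8 = 13 bytes, with
@code{timeout} at offset 5. The size of the plain variant depends on the
padding rules of the platform, which is exactly what a network format
has to avoid.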
2681
2684
2685 @c ***********************************************************************
2686 @node Client - Establish connection
2687 @subsubsection Client - Establish connection
2688
2689
2690
First, on the client side, the underlying API is used to create a
new connection to a service; in our example, we connect to the transport
service.
2694
2695 @example
2696 struct GNUNET_CLIENT_Connection *client;
2697 client = GNUNET_CLIENT_connect ("transport", cfg);
2698 @end example
2699
2700 @c ***********************************************************************
2701 @node Client - Initialize request message
2702 @subsubsection Client - Initialize request message
2703
2704
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as the timeout, the
address and the name of the transport.
2709
2710 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans)
  + 1;
/* allocate the fixed part plus the address and the transport name */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2725 @end example
2726
Note that here the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied to convert from host byte
order (little endian on many platforms) to network byte order (big
endian). For the usage of the two byte orders and the corresponding
conversion functions, please refer to the section on conversion between
network byte order (big endian) and host byte order.
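
The construction steps above can also be tried outside of GNUnet with a
self-contained sketch. It uses hypothetical stand-in structs
(@code{struct sketch_header}, @code{struct sketch_lookup}) instead of the
real GNUnet types, plain @code{malloc} instead of GNUnet's allocator, and
omits the timeout field; only the size and byte-order handling is meant
to carry over.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a message header (network byte order). */
struct sketch_header
@{
  uint16_t size;
  uint16_t type;
@} __attribute__ ((packed));

struct sketch_lookup
@{
  struct sketch_header header;
  uint32_t addrlen;
  /* followed by addrlen bytes of address, then a 0-terminated name */
@} __attribute__ ((packed));

#define SKETCH_TYPE_LOOKUP 29

/* Returns a malloc()ed message or NULL; the caller frees it. */
void *
sketch_build_lookup (const void *address, uint32_t addrlen,
                     const char *name)
@{
  size_t nlen = strlen (name) + 1;
  size_t len = sizeof (struct sketch_lookup) + addrlen + nlen;
  struct sketch_lookup *msg;
  char *tail;

  if (len > UINT16_MAX)
    return NULL;              /* would not fit the 16-bit size field */
  msg = malloc (len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (SKETCH_TYPE_LOOKUP);
  msg->addrlen = htonl (addrlen);
  tail = (char *) &msg[1];    /* first byte after the fixed part */
  memcpy (tail, address, addrlen);
  memcpy (tail + addrlen, name, nlen);
  return msg;
@}
@end example

The explicit check against @code{UINT16_MAX} is an addition of this
sketch: the size field of the header is only 16 bits wide, so larger
messages cannot be represented.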
2732
2733 @c ***********************************************************************
2734 @node Client - Send request and receive response
2735 @subsubsection Client - Send request and receive response
2736
2737
2738 @b{FIXME: This is very outdated, see the tutorial for the current API!}
2739
2740 Next, the client would send the constructed message as a request to the
2741 service and wait for the response from the service. To accomplish this
2742 goal, there are a number of API calls that can be used. In this example,
2743 @code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2744 appropriate function to use.
2745
2746 @example
GNUNET_CLIENT_transmit_and_get_response
  (client, &msg->header, timeout, GNUNET_YES, &address_response_processor,
   arp_ctx);
2750 @end example
2751
The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
2755
2756 @node Server - Startup service
2757 @subsubsection Server - Startup service
2758
On the server side, we first run a standard GNUnet service
startup sequence using @code{GNUNET_SERVICE_run}, as follows:
2761
2762 @example
int main (int argc, char **argv) @{
  GNUNET_SERVICE_run (argc, argv, "transport",
                      GNUNET_SERVICE_OPTION_NONE, &run, NULL);
@}
2766 @end example
2767
2768 @c ***********************************************************************
2769 @node Server - Add new handles for specified messages
2770 @subsubsection Server - Add new handles for specified messages
2771
2772
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
2775
2776 @example
static void run (void *cls,
                 struct GNUNET_SERVER_Handle *serv,
                 const struct GNUNET_CONFIGURATION_Handle *cfg) @{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2781 @end example
2782
2783
2784 Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2785 function to add new handlers in the service. The parameter
2786 @code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2787 to tell the service which function should be called when a particular
2788 type of message is received, and should be defined in this way:
2789
2790 @example
2791 static struct GNUNET_SERVER_MessageHandler handlers[] = @{
2792   @{&handle_start,
2793    NULL,
2794    GNUNET_MESSAGE_TYPE_TRANSPORT_START,
2795    0@},
2796   @{&handle_send,
2797    NULL,
2798    GNUNET_MESSAGE_TYPE_TRANSPORT_SEND,
2799    0@},
2800   @{&handle_try_connect,
2801    NULL,
2802    GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
2803    sizeof (struct TryConnectMessage)
2804   @},
2805   @{&handle_address_lookup,
2806    NULL,
2807    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP,
2808    0@},
2809   @{NULL,
2810    NULL,
2811    0,
2812    0@}
2813 @};
2814 @end example
2815
2816
As shown, the first member of each entry is a callback
function, which is called to process messages of the type given
as the third member. The second member is the closure for the callback
function, which is set to @code{NULL} in most cases, and the last
member is the expected size of messages of this type. Usually we
set it to 0 to accept variable-size messages; for special cases the exact
size of the message can be given instead. In addition, the array ends
with the terminator entry @code{@{NULL, NULL, 0, 0@}}.
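
How such a terminated handler array can be walked is shown in the
following self-contained sketch. The types and functions
(@code{struct sketch_handler}, @code{sketch_dispatch},
@code{sketch_count}) are hypothetical and only mirror the four members
described above; they are not the GNUnet server API.

@example
#include <stddef.h>
#include <stdint.h>

typedef void (*sketch_callback) (void *cls, uint16_t type, uint16_t size);

/* Hypothetical handler entry with the four members described above. */
struct sketch_handler
@{
  sketch_callback callback;
  void *callback_cls;
  uint16_t type;
  uint16_t expected_size;   /* 0 accepts any size */
@};

/* Example callback: counts invocations in the int that cls points to. */
void
sketch_count (void *cls, uint16_t type, uint16_t size)
@{
  (void) type;
  (void) size;
  (*(int *) cls)++;
@}

/* Returns 1 if handled, 0 if no handler exists, -1 on a size mismatch. */
int
sketch_dispatch (const struct sketch_handler *handlers,
                 uint16_t type, uint16_t size)
@{
  size_t i;

  /* the entry with a NULL callback terminates the array */
  for (i = 0; NULL != handlers[i].callback; i++)
    @{
      if (handlers[i].type != type)
        continue;
      if ((0 != handlers[i].expected_size) &&
          (handlers[i].expected_size != size))
        return -1;
      handlers[i].callback (handlers[i].callback_cls, type, size);
      return 1;
    @}
  return 0;
@}
@end example

In this sketch an entry with an expected size of 0 accepts messages of
any size, while a non-zero expected size rejects every message whose size
differs.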
2825
2826 @c ***********************************************************************
2827 @node Server - Process request message
2828 @subsubsection Server - Process request message
2829
2830
After the initialization of the transport service, the request message can
be processed. Before handling the main message data, the validity of the
message should be checked, e.g., whether the size of the message
is correct.
2835
2836 @example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage)) @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2842 @end example
2843
2844
Note that, in contrast to the construction of the request message in
the client, in the server the functions @code{ntohl} and @code{ntohs}
should be employed when extracting data from the message, so
that the data in network byte order (big endian) is converted back into
host byte order. For more details, please refer to the section on
conversion between network byte order (big endian) and host byte order.
2851
2852 Moreover in this example, the name of the transport stored in the message
2853 is a 0-terminated string, so we should also check whether the name of the
2854 transport in the received message is 0-terminated:
2855
2856 @example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof
                  (struct AddressLookupMessage)
                  - addressLen - 1] != '\0') @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client,
                              GNUNET_SYSERR);
  return;
@}
2865 @end example
2866
Here, @code{GNUNET_SERVER_receive_done} must be called to tell the server
that processing of the request is finished and that the next message can
be received. The argument @code{GNUNET_SYSERR} indicates that the service
did not understand the request message, so processing of this request is
terminated.

In contrast, when the argument is @code{GNUNET_OK}, the service continues
to process request messages.
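
The two checks above can be combined into a single validation routine.
The following is a minimal, self-contained sketch in plain C. It does not
use the GNUnet headers: @code{struct ExampleHeader},
@code{struct ExampleLookupMessage} and @code{example_check_lookup} are
hypothetical names chosen for illustration, and the layout (header,
address length, address bytes, 0-terminated transport name) is an
assumption modelled on the description in this section.

@verbatim
```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a message header (illustration only). */
struct ExampleHeader
{
  uint16_t size; /* total message size, network byte order */
  uint16_t type; /* message type, network byte order */
};

/* Hypothetical request: fixed part, followed by the address bytes and
   a 0-terminated transport name. */
struct ExampleLookupMessage
{
  struct ExampleHeader header;
  uint32_t addrlen; /* length of the address, network byte order */
};

/* Return 1 if buf holds a well-formed request, 0 otherwise. */
int
example_check_lookup (const void *buf, size_t buflen)
{
  struct ExampleLookupMessage msg;
  const char *payload;
  size_t size;
  size_t addrlen;

  if (buflen < sizeof (msg))
    return 0;
  memcpy (&msg, buf, sizeof (msg));
  /* Convert from network byte order before using the values. */
  size = ntohs (msg.header.size);
  if ((size < sizeof (msg)) || (size > buflen))
    return 0;
  addrlen = ntohl (msg.addrlen);
  /* There must be room for the address and at least the terminator. */
  if (addrlen >= size - sizeof (msg))
    return 0;
  payload = (const char *) buf + sizeof (msg);
  /* The last byte of the message must terminate the transport name. */
  return payload[size - sizeof (msg) - 1] == '\0';
}
```
@end verbatim

A real handler would report a failed check with @code{GNUNET_break_op}
and @code{GNUNET_SERVER_receive_done}, as shown in the examples above.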
2876
2877 @c ***********************************************************************
2878 @node Server - Response to client
2879 @subsubsection Server - Response to client
2880
2881
Once processing of the current request is done, the server should send a
response to the client. The server builds a new
@code{struct AddressLookupMessage} in much the same way as the client did
and sends it to the client, but here the message type is
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.
2888 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
/* allocate the message before filling in the header */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2896
2897 // ...
2898
2899 struct GNUNET_SERVER_TransmitContext *tc;
2900 tc = GNUNET_SERVER_transmit_context_create (client);
2901 GNUNET_SERVER_transmit_context_append_data
2902 (tc,
2903  NULL,
2904  0,
2905  GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2906 GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2907 @end example
2908
2909
Note that a number of other APIs are also available to the service for
sending messages.
2912
2913 @c ***********************************************************************
2914 @node Server - Notification of clients
2915 @subsubsection Server - Notification of clients
2916
2917
2918 Often a service needs to (repeatedly) transmit notifications to a client
or a group of clients. In these cases, the client has typically registered
once for a set of events and then needs to receive a message
2921 whenever such an event happens (until the client disconnects). The use of
2922 a notification context can help manage message queues to clients and
2923 handle disconnects. Notification contexts can be used to send
2924 individualized messages to a particular client or to broadcast messages
2925 to a group of clients. An individualized notification might look like
2926 this:
2927
2928 @example
GNUNET_SERVER_notification_context_unicast (nc,
                                            client,
                                            msg,
                                            GNUNET_YES);
2933 @end example
2934
2935
2936 Note that after processing the original registration message for
2937 notifications, the server code still typically needs to call
2938 @code{GNUNET_SERVER_receive_done} so that the client can transmit further
2939 messages to the server.
2940
2941 @c ***********************************************************************
2942 @node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2943 @subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2944 @c %** subsub? it's a referenced page on the ipc document.
2945
2946
Network byte order is big endian. Host byte order is whatever byte order
the local CPU uses; on many common platforms (such as x86) it is little
endian. What is the difference between the two?

A multi-byte value, for example a 4-byte integer, occupies several
consecutive bytes in RAM. In big endian (network byte order), the most
significant byte is stored at the lowest address and the least
significant byte at the highest address. Little endian takes the opposite
approach: the least significant byte is stored at the lowest address and
the most significant byte at the highest address.
2958
Hosts communicate by exchanging data packets over the network, and the
two hosts involved may use different byte orders. To make sure that both
sides interpret the transmitted data identically, multi-byte values must
be converted to network byte order before sending and converted back to
host byte order after receiving.
2965
There are ten convenient functions for converting between byte orders in
GNUnet:
2968
2969 @table @asis
2970
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a 16-bit integer from host byte order to network byte order.

@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a 32-bit integer from host byte order to network byte order.

@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a 16-bit integer from network byte order to host byte order.

@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a 32-bit integer from network byte order to host byte order.

@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a @code{long long} integer from network byte order to host byte
order.

@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a @code{long long} integer from host byte order to network byte
order.

@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.

@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.

@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.

@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
2996 @end table
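
To make the effect of these conversions concrete, here is a small sketch
in plain C that only relies on the POSIX functions @code{htonl} and
@code{ntohl}. The helpers @code{example_put_u32}, @code{example_get_u32}
and @code{example_htonll} are hypothetical names used for illustration.
In particular, @code{example_htonll} is not the GNUnet implementation of
@code{GNUNET_htonll}; it merely shows one portable way to produce a
big-endian 64-bit value.

@verbatim
```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value into buf in network byte order. */
void
example_put_u32 (unsigned char *buf, uint32_t host_value)
{
  uint32_t nbo = htonl (host_value);

  memcpy (buf, &nbo, sizeof (nbo));
}

/* Read a 32-bit value stored in network byte order from buf. */
uint32_t
example_get_u32 (const unsigned char *buf)
{
  uint32_t nbo;

  memcpy (&nbo, buf, sizeof (nbo));
  return ntohl (nbo);
}

/* Portable 64-bit host-to-network conversion (hypothetical helper). */
uint64_t
example_htonll (uint64_t host_value)
{
  unsigned char bytes[8];
  uint64_t nbo;

  /* Most significant byte goes to the lowest address. */
  for (int i = 0; i < 8; i++)
    bytes[i] = (unsigned char) (host_value >> (56 - 8 * i));
  memcpy (&nbo, bytes, sizeof (nbo));
  return nbo;
}
```
@end verbatim

Regardless of the byte order of the host, the buffer written by
@code{example_put_u32} starts with the most significant byte of the
value.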
2997
2998 @cindex Cryptography API
2999 @node Cryptography API
3000 @subsection Cryptography API
3001
3002
The gnunetutil API provides the cryptographic primitives used in GNUnet.
GNUnet uses 2048-bit RSA keys for the session key exchange, for signing
messages by peers and for most other public-key operations. Most
researchers in cryptography consider 2048-bit RSA keys secure and
practically unbreakable for a long time. The API provides functions to
create a fresh key pair, read a private key from a file (or create a new
file if the file does not exist), encrypt, decrypt, sign, verify and
extract the public key into a format suitable for network transmission.
3011
For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and decryption.
3018
3019 GNUnet uses SHA-512 for computing one-way hash codes. The API provides
3020 functions to compute a hash over a block in memory or over a file on disk.
3021
3022 The crypto API also provides functions for randomizing a block of memory,
3023 obtaining a single random number and for generating a permutation of the
3024 numbers 0 to n-1. Random number generation distinguishes between WEAK and
3025 STRONG random number quality; WEAK random numbers are pseudo-random
3026 whereas STRONG random numbers use entropy gathered from the operating
3027 system.
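
As an illustration of what generating a permutation of the numbers 0 to
n-1 involves, the following plain C sketch uses the Fisher-Yates shuffle.
It is not the GNUnet implementation: @code{example_permute} is a
hypothetical name, and the C library function @code{rand} merely stands
in for a WEAK-quality generator (the modulo reduction also introduces a
slight bias, which a careful implementation would avoid).

@verbatim
```c
#include <stdlib.h>

/* Fill perm[0..n-1] with a pseudo-random permutation of 0..n-1. */
void
example_permute (unsigned int *perm, unsigned int n)
{
  /* Start from the identity permutation. */
  for (unsigned int i = 0; i < n; i++)
    perm[i] = i;
  /* Swap each position with a randomly chosen earlier (or same) one. */
  for (unsigned int i = n; i > 1; i--)
  {
    unsigned int j = (unsigned int) rand () % i; /* 0 <= j < i */
    unsigned int tmp = perm[i - 1];

    perm[i - 1] = perm[j];
    perm[j] = tmp;
  }
}
```
@end verbatim

Code that needs unpredictable permutations should use the STRONG quality
offered by the crypto API rather than a generator like @code{rand}.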
3028
3029 Finally, the crypto API provides a means to deterministically generate a
3030 1024-bit RSA key from a hash code. These functions should most likely not
3031 be used by most applications; most importantly,
3032 GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that
3033 should be considered secure for traditional applications of RSA.
3034
3035 @cindex Message Queue API
3036 @node Message Queue API
3037 @subsection Message Queue API
3038
3039
3040 @strong{ Introduction }@
3041 Often, applications need to queue messages that
3042 are to be sent to other GNUnet peers, clients or services. As all of
3043 GNUnet's message-based communication APIs, by design, do not allow
3044 messages to be queued, it is common to implement custom message queues
3045 manually when they are needed. However, writing very similar code in
3046 multiple places is tedious and leads to code duplication.
3047
3048 MQ (for Message Queue) is an API that provides the functionality to
3049 implement and use message queues. We intend to eventually replace all of
3050 the custom message queue implementations in GNUnet with MQ.
3051
3052 @strong{ Basic Concepts }@
3053 The two most important entities in MQ are queues and envelopes.
3054
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow a list of message
handlers to be specified. The message queue will then also wait for
incoming messages and dispatch them appropriately.
3060
An envelope holds the memory for a message, as well as metadata
(Where is the envelope queued? What should happen after it has been
sent?). Any envelope can only be queued in one message queue.
3064
3065 @strong{ Creating Queues }@
The following is a list of currently available message queues. Note that
to avoid layering issues, message queues for higher level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself provides
the queue implementation.
3070
3071 @table @asis
3072
3073 @item @code{GNUNET_MQ_queue_for_connection_client}
3074 Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
3075 Also supports receiving with message handlers.
3076
3077 @item @code{GNUNET_MQ_queue_for_server_client}
3078 Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
3079 not support incoming message handlers.
3080
@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does
not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and
receiving messages with one of GNUnet's communication APIs, implementation
callbacks are called. Refer to "Implementing Queues" for a more detailed
explanation.
3089 @end table
3090
3091
3092 @strong{ Allocating Envelopes }@
3093 A GNUnet message (as defined by the GNUNET_MessageHeader) has three
3094 parts: The size, the type, and the body.
3095
3096 MQ provides macros to allocate an envelope containing a message
3097 conveniently, automatically setting the size and type fields of the
3098 message.
3099
3100 Consider the following simple message, with the body consisting of a
3101 single number value.
3104
3105 @example
3106 struct NumberMessage @{
3107   /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
3108   struct GNUNET_MessageHeader header;
3109   uint32_t number GNUNET_PACKED;
3110 @};
3111 @end example
3112
3113 An envelope containing an instance of the NumberMessage can be
3114 constructed like this:
3115
3116 @example
3117 struct GNUNET_MQ_Envelope *ev;
3118 struct NumberMessage *msg;
3119 ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
3120 msg->number = htonl (42);
3121 @end example
3122
3123 In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
3124 the newly allocated envelope. The first argument must be a pointer to some
3125 @code{struct} containing a @code{struct GNUNET_MessageHeader header}
3126 field, while the second argument is the desired message type, in host
3127 byte order.
3128
The @code{msg} pointer now points to an allocated message, where the
message type and the message size are already set. The message's size is
inferred from the type of the @code{msg} pointer: it will be set to
@code{sizeof (*msg)}, properly converted to network byte order.

If the message body's size is dynamic, the macro
@code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
message has additional space allocated after the @code{msg} structure.
3137
3138 If no structure has been defined for the message,
3139 @code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
3140 after the message header. The first argument then must be a pointer to a
3141 @code{GNUNET_MessageHeader}.
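
To illustrate what the allocation macros take care of, the following
plain C sketch allocates a message with optional extra space and fills in
the header in network byte order. It only models the message, not the
envelope, and it is not how MQ is implemented: @code{struct ExampleHeader},
@code{struct ExampleNumberMessage} and @code{example_msg_alloc} are
hypothetical names used for illustration.

@verbatim
```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a message header (illustration only). */
struct ExampleHeader
{
  uint16_t size; /* total message size, network byte order */
  uint16_t type; /* message type, network byte order */
};

struct ExampleNumberMessage
{
  struct ExampleHeader header;
  uint32_t number; /* network byte order */
};

/* Allocate a zeroed message of base_size + extra bytes and set the
   size and type fields.  Returns NULL if the total size does not fit
   into the 16-bit size field or if the allocation fails. */
struct ExampleHeader *
example_msg_alloc (size_t base_size, size_t extra, uint16_t type)
{
  struct ExampleHeader *hdr;
  size_t total;

  if ((base_size < sizeof (struct ExampleHeader)) ||
      (base_size > UINT16_MAX) ||
      (extra > UINT16_MAX - base_size))
    return NULL;
  total = base_size + extra;
  hdr = calloc (1, total);
  if (NULL == hdr)
    return NULL;
  hdr->size = htons ((uint16_t) total);
  hdr->type = htons (type);
  return hdr;
}
```
@end verbatim

As with @code{GNUNET_MQ_msg}, the caller passes the type in host byte
order and only has to fill in the body afterwards.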
3142
3143 @strong{Envelope Properties}@
A few functions in MQ allow additional properties to be set on envelopes:
3145
3146 @table @asis
3147
@item @code{GNUNET_MQ_notify_sent}
Specifies a function that will be called once the envelope's message has
been sent irrevocably. An envelope can be canceled precisely up to the
point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
3156
3157 @end table
3158
3159
3160 @strong{Sending Envelopes}@
3161 Once an envelope has been constructed, it can be queued for sending with
3162 @code{GNUNET_MQ_send}.
3163
3164 Note that in order to avoid memory leaks, an envelope must either be sent
3165 (the queue will free it) or destroyed explicitly with
3166 @code{GNUNET_MQ_discard}.
3167
3168 @strong{Canceling Envelopes}@
An envelope queued with @code{GNUNET_MQ_send} can be canceled with
@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
been called, canceling a message results in undefined behavior.
Thus it is unsafe to cancel an envelope that does not have a notify sent
callback. After canceling an envelope, it is not necessary to call
@code{GNUNET_MQ_discard}, and the envelope cannot be sent again.
3175
3176 @strong{ Implementing Queues }@
3177 @code{TODO}
3178
3179 @cindex Service API
3180 @node Service API
3181 @subsection Service API
3182
3183
Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol. To do so, each service must listen on either a TCP port or a
UNIX domain socket; the service implementation uses the server API for
this purpose. This use of the server API is exposed directly to the users
of the service API. Thus, when using the service API, one is usually also
using large parts of the server API. The service API provides various
convenience functions, such as parsing command-line arguments and the
configuration file, which are not found in the server API.
The dual to the service/server API is the client API, which can be used to
access services.
3197
3198 The most common way to start a service is to use the
3199 @code{GNUNET_SERVICE_run} function from the program's main function.
3200 @code{GNUNET_SERVICE_run} will then parse the command line and
3201 configuration files and, based on the options found there,
3202 start the server. It will then give back control to the main
3203 program, passing the server and the configuration to the
3204 @code{GNUNET_SERVICE_Main} callback. @code{GNUNET_SERVICE_run}
3205 will also take care of starting the scheduler loop.
3206 If this is inappropriate (for example, because the scheduler loop
3207 is already running), @code{GNUNET_SERVICE_start} and
3208 related functions provide an alternative to @code{GNUNET_SERVICE_run}.
3209
When starting a service, the @code{service_name} option is used to
determine which sections in the configuration file should be used to
configure the service. A typical value here is the name of the
@file{src/} sub-directory, for example @file{statistics}.
The same string would also be given to
@code{GNUNET_CLIENT_connect} to access the service.
3216
3217 Once a service has been initialized, the program should use the
3218 @code{GNUNET_SERVICE_Main} callback to register message handlers
3219 using @code{GNUNET_SERVER_add_handlers}.
3220 The service will already have registered a handler for the
3221 "TEST" message.
3222
3223 @findex GNUNET_SERVICE_Options
3224 The option bitfield (@code{enum GNUNET_SERVICE_Options})
3225 determines how a service should behave during shutdown.
3226 There are three key strategies:
3227
3228 @table @asis
3229
3230 @item instant (@code{GNUNET_SERVICE_OPTION_NONE})
3231 Upon receiving the shutdown
3232 signal from the scheduler, the service immediately terminates the server,
3233 closing all existing connections with clients.
3234 @item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself
during shutdown. The main program will need to take the appropriate
action by calling @code{GNUNET_SERVER_destroy} or
@code{GNUNET_SERVICE_stop} (depending on how the service was initialized)
to terminate the service. This method is used by
@code{gnunet-service-arm} and is rather uncommon.
3240 @item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
Upon receiving the shutdown signal from the scheduler,
the service immediately tells the server to stop
listening for incoming clients. Requests from normal existing clients are
still processed and the server/service terminates once all normal clients
have disconnected. Clients that are not expected to ever disconnect (such
as clients that monitor performance values) can be marked as 'monitor'
clients using @code{GNUNET_SERVER_client_mark_monitor}. Those clients
will continue to be processed until all 'normal' clients have
disconnected. Then, the server will terminate, closing the monitor
connections. This mode is used, for example, by 'statistics', allowing
existing 'normal' clients to set (possibly persistent) statistic values
before terminating.
3252
3253 @end table
3254
3255 @c ***********************************************************************
3256 @node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3257 @subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3258
3259
A commonly used data structure in GNUnet is a (multi-)hash map. It is most
often used to map a peer identity to some data structure, but also to map
arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). As it is commonly used, the hash map is
actually sometimes responsible for a large share of GNUnet's overall
memory consumption (for some processes, 30% is not uncommon). The
following text documents some API quirks (and their implications for
applications) that were recently introduced to minimize the footprint of
the hash map.
3269
3270
3271 @c ***********************************************************************
3272 @menu
3273 * Analysis::
3274 * Solution::
3275 * Migration::
3276 * Conclusion::
3277 * Availability::
3278 @end menu
3279
3280 @node Analysis
3281 @subsubsection Analysis
3282
3283
The main reason for the "excessive" memory consumption by the hash map is
that GNUnet uses 512-bit cryptographic hash codes, and the
(multi-)hash map also uses the same 512-bit
@code{struct GNUNET_HashCode}. As a result, storing just the keys
requires 64 bytes of memory for each key. As some applications like to
keep a large number of entries in the hash map (after all, that's what
maps are good for), 64 bytes per hash is significant: keeping a pointer
to the value and having a linked list for collisions consume between 8
and 16 bytes, and @code{malloc} may add about the same overhead per
allocation, putting us in the 16 to 32 byte per entry ballpark. Adding a
64-byte key then triples the overall memory requirement for the hash map.
3295
To make things "worse", most of the time storing the key in the hash map
is not required: it is typically already in memory elsewhere! In most
cases, the values stored in the hash map are some application-specific
struct that @emph{also} contains the hash. Here is a simplified example:
3300
3301 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};
3305
3306 // ...
3307 val = GNUNET_malloc (sizeof (struct MyValue));
3308 val->key = key;
3309 val->my_data = 42;
3310 GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
3311 @end example
3312
3313 This is a common pattern as later the entries might need to be removed,
3314 and at that time it is convenient to have the key immediately at hand:
3315
3316 @example
3317 GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
3318 @end example
3319
3320
3321 Note that here we end up with two times 64 bytes for the key, plus maybe
3322 64 bytes total for the rest of the 'struct MyValue' and the map entry in
3323 the hash map. The resulting redundant storage of the key increases
3324 overall memory consumption per entry from the "optimal" 128 bytes to 192
3325 bytes. This is not just an extreme example: overheads in practice are
3326 actually sometimes close to those highlighted in this example. This is
3327 especially true for maps with a significant number of entries, as there
3328 we tend to really try to keep the entries small.
3329
3330 @c ***********************************************************************
3331 @node Solution
3332 @subsubsection Solution
3333
3334
The solution that has now been implemented is to @strong{optionally}
allow the hash map to not make a (deep) copy of the hash but instead keep
a pointer to the hash/key in the entry. This reduces the memory
consumption for the key from 64 bytes to 4 to 8 bytes (the size of a
pointer). However, it can only work if the key is actually stored in the
entry (which is the case most of the time) and if the entry does not
modify the key (which, in all of the code I'm aware of, has always been
the case if the key is stored in the entry). Finally, when the client
stores an entry in the hash map, it @strong{must} provide a pointer to
the key within the entry, not just a pointer to a transient location of
the key. If the client code does not meet these requirements, the result
is a dangling pointer and undefined behavior of the (multi-)hash map API.
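
The size difference between the two approaches can be checked with a
small plain C sketch. The structures below are hypothetical and much
simpler than the real map entries of the multi-hash map; they only model
an entry that embeds a 64-byte key versus one that points at a key stored
elsewhere.

@verbatim
```c
#include <stddef.h>

#define EXAMPLE_KEY_BYTES 64 /* 512 bits, like the hash code above */

struct ExampleKey
{
  unsigned char bits[EXAMPLE_KEY_BYTES];
};

/* Hypothetical map entry that stores a deep copy of the key. */
struct ExampleEntryCopy
{
  struct ExampleKey key;
  void *value;
  struct ExampleEntryCopy *next;
};

/* Hypothetical map entry that points at a key stored elsewhere. */
struct ExampleEntryPointer
{
  const struct ExampleKey *key;
  void *value;
  struct ExampleEntryPointer *next;
};

/* Bytes saved per entry by keeping a pointer instead of a copy. */
size_t
example_saved_bytes (void)
{
  return sizeof (struct ExampleEntryCopy)
         - sizeof (struct ExampleEntryPointer);
}
```
@end verbatim

On typical platforms the saving is the key size minus the size of one
pointer, that is, 56 bytes with 8-byte pointers and 60 bytes with 4-byte
pointers.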
3347
3348 @c ***********************************************************************
3349 @node Migration
3350 @subsubsection Migration
3351
3352
3353 To use the new feature, first check that the values contain the respective
3354 key (and never modify it). Then, all calls to
3355 @code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
3356 audited and most likely changed to pass a pointer into the value's struct.
3357 For the initial example, the new code would look like this:
3358
3359 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
3367 GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
3368 @end example
3369
3370
Note that @code{&key} was changed to @code{&val->key} in the argument to
the @code{put} call. This is critical as often @code{key} is on the stack
or in some other transient data structure and thus having the hash map
keep a pointer to @code{key} would not work. Only the key inside of
@code{val} has the same lifetime as the entry in the map (this must of
course be checked as well). Naturally, @code{val->key} must be
initialized before the @code{put} call. Once all @code{put} calls have
been converted and double-checked, you can change the call to create the
hash map from
3380
3381 @example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
3384 @end example
3385
3386 to
3387
3388 @example
3389 map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
3390 @end example
3391
3392 If everything was done correctly, you now use about 60 bytes less memory
3393 per entry in @code{map}. However, if now (or in the future) any call to
3394 @code{put} does not ensure that the given key is valid until the entry is
3395 removed from the map, undefined behavior is likely to be observed.
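
While auditing @code{put} calls, a debug-only check can help catch calls
that still pass a transient key. The following plain C sketch shows the
idea; @code{struct ExampleKey}, @code{struct ExampleValue} and
@code{example_key_is_embedded} are hypothetical names and not part of the
GNUnet API.

@verbatim
```c
#include <stddef.h>

struct ExampleKey
{
  unsigned char bits[64];
};

struct ExampleValue
{
  struct ExampleKey key;
  unsigned int my_data;
};

/* Return 1 if key_ptr points at the key embedded in val, so that the
   key lives exactly as long as the value; return 0 otherwise. */
int
example_key_is_embedded (const struct ExampleKey *key_ptr,
                         const struct ExampleValue *val)
{
  return key_ptr == &val->key;
}
```
@end verbatim

An assertion on such a check right before each @code{put} call would
flag callers that pass a pointer to a key on the stack.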
3396
3397 @c ***********************************************************************
3398 @node Conclusion
3399 @subsubsection Conclusion
3400
3401
The new optimization is often applicable and can result in a
reduction in memory consumption of up to 30% in practice. However, it
makes the code less robust as additional invariants are imposed on the
multi hash map client. Thus applications should refrain from enabling the
new mode unless the resulting memory savings are deemed significant
enough. In particular, it should generally not be used in new code (wait
at least until benchmarks exist).
3409
3410 @c ***********************************************************************
3411 @node Availability
3412 @subsubsection Availability
3413
3414
3415 The new multi hash map code was committed in SVN 24319 (which made its
3416 way into GNUnet version 0.9.4).
3417 Various subsystems (transport, core, dht, file-sharing) were
3418 previously audited and modified to take advantage of the new capability.
3419 In particular, memory consumption of the file-sharing service is expected
3420 to drop by 20-30% due to this change.
3421
3422
3423 @cindex CONTAINER_MDLL API
3424 @node CONTAINER_MDLL API
3425 @subsection CONTAINER_MDLL API
3426
3427
3428 This text documents the GNUNET_CONTAINER_MDLL API. The
3429 GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
3430 that it provides operations for the construction and manipulation of
3431 doubly-linked lists. The key difference to the (simpler) DLL-API is that
3432 the MDLL-version allows a single element (instance of a "struct") to be
3433 in multiple linked lists at the same time.
3434
3435 Like the DLL API, the MDLL API stores (most of) the data structures for
3436 the doubly-linked list with the respective elements; only the 'head' and
3437 'tail' pointers are stored "elsewhere" --- and the application needs to
3438 provide the locations of head and tail to each of the calls in the
3439 MDLL API. The key difference for the MDLL API is that the "next" and
3440 "previous" pointers in the struct can no longer be simply called "next"
3441 and "prev" --- after all, the element may be in multiple doubly-linked
3442 lists, so we cannot just have one "next" and one "prev" pointer!
3443
3444 The solution is to have multiple fields that must have a name of the
3445 format "next_XX" and "prev_XX" where "XX" is the name of one of the
3446 doubly-linked lists. Here is a simple example:
3447
3448 @example
3449 struct MyMultiListElement @{
3450   struct MyMultiListElement *next_ALIST;
3451   struct MyMultiListElement *prev_ALIST;
3452   struct MyMultiListElement *next_BLIST;
3453   struct MyMultiListElement *prev_BLIST;
  void *data;
3456 @};
3457 @end example
3458
3459
3460 Note that by convention, we use all-uppercase letters for the list names.
3461 In addition, the program needs to have a location for the head and tail
3462 pointers for both lists, for example:
3463
3464 @example
3465 static struct MyMultiListElement *head_ALIST;
3466 static struct MyMultiListElement *tail_ALIST;
3467 static struct MyMultiListElement *head_BLIST;
3468 static struct MyMultiListElement *tail_BLIST;
3469 @end example
3470
3471
3472 Using the MDLL-macros, we can now insert an element into the ALIST:
3473
3474 @example
3475 GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
3476 @end example
3477
3478
Passing "ALIST" as the first argument to MDLL specifies which of the
next/prev fields in the @code{struct MyMultiListElement} should be used.
The extra "ALIST" argument and the "_ALIST" in the names of the
next/prev-members are the only differences between the MDLL and DLL-API.
Like the DLL-API, the MDLL-API offers functions for inserting (at head,
at tail, after a given element) and removing elements from the list.
Iterating over the list should be done by directly accessing the
"next_XX" and/or "prev_XX" members.
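
To show how a list name can select the right pair of fields, here is a
self-contained plain C sketch built on preprocessor token pasting. The
macros @code{EXAMPLE_MDLL_INSERT} and @code{EXAMPLE_MDLL_REMOVE} are
hypothetical stand-ins written for this illustration; they are not the
GNUnet definitions, although they take their arguments in the same order
as the call shown above.

@verbatim
```c
#include <stddef.h>

struct ExampleElement
{
  struct ExampleElement *next_ALIST;
  struct ExampleElement *prev_ALIST;
  struct ExampleElement *next_BLIST;
  struct ExampleElement *prev_BLIST;
  int value;
};

/* Insert element at the head of list mdll; "##" pastes the list name
   onto "next_" and "prev_" to pick the matching fields. */
#define EXAMPLE_MDLL_INSERT(mdll, head, tail, element) \
  do { \
    (element)->next_ ## mdll = (head); \
    (element)->prev_ ## mdll = NULL; \
    if (NULL == (tail)) \
      (tail) = (element); \
    else \
      (head)->prev_ ## mdll = (element); \
    (head) = (element); \
  } while (0)

/* Remove element from list mdll, fixing up head and tail as needed. */
#define EXAMPLE_MDLL_REMOVE(mdll, head, tail, element) \
  do { \
    if (NULL == (element)->prev_ ## mdll) \
      (head) = (element)->next_ ## mdll; \
    else \
      (element)->prev_ ## mdll->next_ ## mdll = (element)->next_ ## mdll; \
    if (NULL == (element)->next_ ## mdll) \
      (tail) = (element)->prev_ ## mdll; \
    else \
      (element)->next_ ## mdll->prev_ ## mdll = (element)->prev_ ## mdll; \
    (element)->next_ ## mdll = NULL; \
    (element)->prev_ ## mdll = NULL; \
  } while (0)

/* Iterate over ALIST by following the next_ALIST members directly. */
unsigned int
example_count_alist (const struct ExampleElement *head)
{
  unsigned int n = 0;

  for (const struct ExampleElement *pos = head;
       NULL != pos;
       pos = pos->next_ALIST)
    n++;
  return n;
}
```
@end verbatim

Because each list uses its own pair of fields, removing an element from
ALIST leaves its membership in BLIST untouched.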
3487
3488 @cindex Automatic Restart Manager
3489 @cindex ARM
3490 @node Automatic Restart Manager (ARM)
3491 @section Automatic Restart Manager (ARM)
3492
3493
GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts services impacted by
the changes as needed. It is also responsible for restarting services in
case of crashes, and it is planned to incorporate automatic debugging for
diagnosing service crashes, providing developers with insights about
crash reasons. The purpose of this document is to give GNUnet developers
an idea about how ARM works and how to interact with it.
3502
3503 @menu
3504 * Basic functionality::
3505 * Key configuration options::
3506 * ARM - Availability::
3507 * Reliability::
3508 @end menu
3509
3510 @c ***********************************************************************
3511 @node Basic functionality
3512 @subsection Basic functionality
3513
3514
3515 @itemize @bullet
@item ARM source code can be found under @file{src/arm}. Service
processes are managed by the functions in @file{gnunet-service-arm.c},
which is controlled with @file{gnunet-arm.c} (the main function in that
file is ARM's entry point).

@item The functions responsible for communicating with ARM and for
starting and stopping services (including the ARM service itself) are
provided by the ARM API in @file{arm_api.c}. The function
@code{GNUNET_ARM_connect()} returns an ARM handle to the caller after
setting it to the caller's context (configuration and scheduler in use).
The caller can use this handle afterwards to communicate with ARM. The
functions @code{GNUNET_ARM_start_service()} and
@code{GNUNET_ARM_stop_service()} are used for starting and stopping
services, respectively.

@item A typical example of using these basic ARM services can be found in
the file @file{test_arm_api.c}. The test case connects to ARM, starts it,
then uses it to start the service "resolver", stops the "resolver" and
then stops ARM.
3532 @end itemize
3533
3534 @c ***********************************************************************
3535 @node Key configuration options
3536 @subsection Key configuration options
3537
3538
3539 Configurations for ARM and services should be available in a .conf file
3540 (As an example, see test_arm_api_data.conf). When running ARM, the
3541 configuration file to use should be passed to the command:
3542
3543 @example
3544 $ gnunet-arm -s -c configuration_to_use.conf
3545 @end example
3546
If no configuration is passed, the default configuration file will be used
(see @file{GNUNET_PREFIX/share/gnunet/defaults.conf}, which is created
from @file{contrib/defaults.conf}). Each service has a section that
starts with the service name in square brackets, for example "[arm]".
The following options configure how ARM configures or interacts with the
various services:
3553
3554 @table @asis
3555
@item PORT
Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME
Specifies on which host the service is deployed. Note
that ARM can only start services that are running on the local system
(but will not check that the hostname matches the local machine name).
This option is used by the @code{gnunet_client_lib.h} implementation to
determine which system to connect to. The default is "localhost".

@item BINARY
The name of the service binary file.

@item OPTIONS
Options to be passed to the service.

@item PREFIX
A command to prepend to the actual command, for example to run a service
with "valgrind" or "gdb".

@item DEBUG
Run in debug mode (very verbose).

@item START_ON_DEMAND
ARM will listen on the UNIX domain socket and/or TCP port of
the service and start the service on demand.

@item IMMEDIATE_START
ARM will always start this service when the peer is started.

@item ACCEPT_FROM
IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6
IPv6 addresses the service accepts connections from.
3584
3585 @end table
3586
3587
3588 Options that impact the operation of ARM overall are in the "[arm]"
3589 section. ARM is a normal service and has (except for START_ON_DEMAND) all of the
3590 options that other services do. In addition, ARM has the
3591 following options:
3592
3593 @table @asis
3594
@item GLOBAL_PREFIX
Command to be prepended to all services that are going to run.

@item GLOBAL_POSTFIX
Global option that will be supplied to all the services that are going
to run.

3600
3601 @end table
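
As an illustration only, a configuration fragment that combines the
per-service options with the ARM-wide options described above might look
as follows. The port number, the binary name and all values shown are
assumptions chosen for this example; they are not copied from a shipped
configuration file.

@example
[arm]
# Assumed value: run every service under valgrind.
GLOBAL_PREFIX = valgrind

[resolver]
# Assumed values for an on-demand service.
PORT = 2089
HOSTNAME = localhost
BINARY = gnunet-service-resolver
START_ON_DEMAND = YES
IMMEDIATE_START = NO
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
@end example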
3602
3603 @c ***********************************************************************
3604 @node ARM - Availability
3605 @subsection ARM - Availability
3606
3607
As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, the "client", that
wants to connect to another service, the "server". The "client" asks ARM
to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" establishes a
connection with the "server", and then they start to communicate.

One problem with that scheme is that it is slow.
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before its request is served. One solution
would be for ARM to start all services as default services. That
would solve the problem, but it is not practical, because some of the
services started this way may never be used or may only
be used after a relatively long time.

The approach followed by ARM to solve this problem is as follows:
3623
3624 @itemize @bullet
3625
@item For each service that has a PORT field in the configuration file and
that is not one of the default services (a service that accepts incoming
connections from clients), ARM creates listening sockets for all addresses
associated with that service.
3630
3631 @item The "client" will immediately establish a connection with
3632 the "server".
3633
@item ARM, pretending to be the "server", listens on the
respective port and notices the incoming connection from the "client",
but does not accept it.
3637
3638 @item Once there is an incoming connection, ARM will start the "server",
3639 passing on the listen sockets (now, the service is started and can do its
3640 work).
3641
@item Other client services can now connect directly to the
"server".
3644
3645 @end itemize
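
The steps above can be modelled with a small Python sketch. It is not
ARM's implementation: the function names are hypothetical, threads stand
in for processes, and the "server" is a stub that answers a single
request. What it shows is the key idea, namely that the supervisor only
waits for the listening socket to become readable and then hands the
socket, with the connection still pending, to the newly started service.

@example
import select
import socket
import threading


def serve_once(listen_sock):
    # Stand-in for the real "server": it inherits the listening socket
    # and is the first to accept the pending connection.
    conn, _addr = listen_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"reply to " + request)


def arm_listen_and_start(listen_sock, started):
    # Wait until the listening socket is readable, which means that a
    # connection is pending.  accept() is never called here.
    select.select([listen_sock], [], [])
    started.append("server")
    service = threading.Thread(target=serve_once, args=(listen_sock,))
    service.start()
    service.join()


def demo():
    listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a port
    listen_sock.listen(5)
    port = listen_sock.getsockname()[1]

    started = []
    arm = threading.Thread(target=arm_listen_and_start,
                           args=(listen_sock, started))
    arm.start()

    # The "client" connects at once; the kernel queues the connection
    # until the freshly started "server" accepts it.
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    reply = client.recv(1024)
    client.close()
    arm.join()
    listen_sock.close()
    return started, reply
@end example

Calling @code{demo()} returns the list of started services together with
the reply received by the client, without the client ever having waited
for the service to come up before connecting.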
3646
3647 @c ***********************************************************************
3648 @node Reliability
3649 @subsection Reliability
3650
One of the features provided by ARM is the automatic restart of crashed
services. ARM needs to know which of the running services died. The function
"gnunet-service-arm.c/maint_child_death()" is responsible for that. The
function is scheduled to run upon receiving a SIGCHLD signal. It
then iterates over ARM's list of running services and determines
which services have died (crashed). ARM restarts all crashed services.

Now consider the case of a service with a serious problem that causes it
to crash each time it is started by ARM. If ARM kept blindly restarting
such a service, we would get the pattern
start-crash-restart-crash-restart-crash and so forth, which is of course
not practical.

For that reason, ARM schedules the service to be restarted after waiting
for a delay that grows exponentially with each crash/restart of that
service. To clarify the idea, consider the following example:
3666
3667 @itemize @bullet
3668
3669 @item Service S crashed.
3670
3671 @item ARM receives the SIGCHLD and inspects its list of services to find
3672 the dead one(s).
3673
@item ARM finds S dead and schedules it for restarting after the "backoff"
time, which is initially set to 1ms. ARM then doubles the backoff time
corresponding to S (now backoff(S) = 2ms).
3677
3678 @item Because there is a severe problem with S, it crashed again.
3679
@item Again ARM receives the SIGCHLD and detects that it is S that has
crashed again. ARM schedules it for restarting, but after its new backoff
time (which became 2ms), and doubles its backoff time
(now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour).
After reaching it, backoff(S) remains at half an hour,
so ARM will not spend much time trying to restart a
problematic service.
3689 @end itemize
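
The backoff schedule from this example can be written down as a short
Python sketch. The constant and function names are made up for the
illustration, and the sketch only covers what is described above (start
at 1ms, double after every crash, cap at half an hour); it makes no claim
about how @code{gnunet-service-arm.c} stores or resets the value.

@example
INITIAL_BACKOFF_MS = 1
# Half an hour, expressed in milliseconds.
BACKOFF_THRESHOLD_MS = 30 * 60 * 1000


def next_backoff(current_ms):
    # Double the delay, but never exceed the threshold.
    return min(current_ms * 2, BACKOFF_THRESHOLD_MS)


def restart_delays(crashes):
    # Delay (in ms) that is waited before each of `crashes` restarts.
    delays = []
    backoff = INITIAL_BACKOFF_MS
    for _ in range(crashes):
        delays.append(backoff)
        backoff = next_backoff(backoff)
    return delays
@end example

For instance, @code{restart_delays(4)} yields the delays 1, 2, 4 and 8
(in milliseconds). Because the delay doubles each time, the half-hour cap
is reached after a little more than twenty consecutive crashes.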
3690
3691 @cindex TRANSPORT Subsystem
3692 @node TRANSPORT Subsystem
3693 @section TRANSPORT Subsystem
3694
3695
3696 This chapter documents how the GNUnet transport subsystem works. The
3697 GNUnet transport subsystem consists of three main components: the
3698 transport API (the interface used by the rest of the system to access the
transport service), the transport service itself (most of the interesting
functions, such as choosing transports, happen here) and the transport
3701 plugins. A transport plugin is a concrete implementation for how two
3702 GNUnet peers communicate; many plugins exist, for example for
3703 communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
3704 transport subsystem uses supporting code, especially the NAT/UPnP
3705 library to help with tasks such as NAT traversal.
3706
3707 Key tasks of the transport service include:
3708
3709 @itemize @bullet
3710
3711 @item Create our HELLO message, notify clients and neighbours if our HELLO
3712 changes (using NAT library as necessary)
3713
3714 @item Validate HELLOs from other peers (send PING), allow other peers to
3715 validate our HELLO's addresses (send PONG)
3716
3717 @item Upon request, establish connections to other peers (using address
3718 selection from ATS subsystem) and maintain them (again using PINGs and
3719 PONGs) as long as desired
3720
3721 @item Accept incoming connections, give ATS service the opportunity to
3722 switch communication channels
3723
3724 @item Notify clients about peers that have connected to us or that have
3725 been disconnected from us
3726
3727 @item If a (stateful) connection goes down unexpectedly (without explicit
3728 DISCONNECT), quickly attempt to recover (without notifying clients) but do
3729 notify clients quickly if reconnecting fails
3730
3731 @item Send (payload) messages arriving from clients to other peers via
3732 transport plugins and receive messages from other peers, forwarding
3733 those to clients
3734
3735 @item Enforce inbound traffic limits (using flow-control if it is
3736 applicable); outbound traffic limits are enforced by CORE, not by us (!)
3737
@item Enforce restrictions on P2P connections as specified by the blacklist
configuration and blacklisting clients
3740 @end itemize
3741
3742 Note that the term "clients" in the list above really refers to the
3743 GNUnet-CORE service, as CORE is typically the only client of the
3744 transport service.
3745
3746 @menu
3747 * Address validation protocol::
3748 @end menu
3749
3750 @node Address validation protocol
3751 @subsection Address validation protocol
3752
3753
3754 This section documents how the GNUnet transport service validates
3755 connections with other peers. It is a high-level description of the
3756 protocol necessary to understand the details of the implementation. It
3757 should be noted that when we talk about PING and PONG messages in this
3758 section, we refer to transport-level PING and PONG messages, which are
3759 different from core-level PING and PONG messages (both in implementation
3760 and function).
3761
3762 The goal of transport-level address validation is to minimize the chances
3763 of a successful man-in-the-middle attack against GNUnet peers on the
3764 transport level. Such an attack would not allow the adversary to decrypt
3765 the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities by
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a @code{HELLO} to
Alice that claims to be for Bob, but contains Mallory's IP address
instead of Bob's (for some transport).
Mallory would then forward the traffic to Bob (by initiating a
connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
3774 without traversal support and hence has no address of her own (and thus
3775 Alice must always initiate the connection to Bob).
3776
3777 An additional constraint is that @code{HELLO} messages do not contain a
3778 cryptographic signature since other peers must be able to edit
3779 (i.e. remove) addresses from the @code{HELLO} at any time (this was
3780 not true in GNUnet 0.8.x). A basic @strong{assumption} is that each peer
3781 knows the set of possible network addresses that it @strong{might}
3782 be reachable under (so for example, the external IP address of the
3783 NAT plus the LAN address(es) with the respective ports).
3784
3785 The solution is the following. If Alice wants to validate that a given
3786 address for Bob is valid (i.e. is actually established @strong{directly}
3787 with the intended target), she sends a PING message over that connection
3788 to Bob. Note that in this case, Alice initiated the connection so only
3789 Alice knows which address was used for sure (Alice may be behind NAT, so
3790 whatever address Bob sees may not be an address Alice knows she has).
3791 Bob checks that the address given in the @code{PING} is actually one
of Bob's addresses (i.e. does not belong to Mallory), and if it is,
3793 sends back a @code{PONG} (with a signature that says that Bob
3794 owns/uses the address from the @code{PING}).
3795 Alice checks the signature and is happy if it is valid and the address
3796 in the @code{PONG} is the address Alice used.
3797 This is similar to the 0.8.x protocol where the @code{HELLO} contained a
3798 signature from Bob for each address used by Bob.
3799 Here, the purpose code for the signature is
3800 @code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
3801 remember Bob's address and consider the address valid for a while (12h in
3802 the current implementation). Note that after this exchange, Alice only
3803 considers Bob's address to be valid, the connection itself is not
3804 considered 'established'. In particular, Alice may have many addresses
3805 for Bob that Alice considers valid.
3806
3807 The @code{PONG} message is protected with a nonce/challenge against replay
3808 attacks (@uref{http://en.wikipedia.org/wiki/Replay_attack, replay})
3809 and uses an expiration time for the signature (but those are almost
3810 implementation details).
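
The message flow can be illustrated with the following Python sketch. It
is a model of the exchange, not GNUnet code: all function names are
hypothetical, the 60 second signature lifetime is an assumption, and an
HMAC over a key shared by both functions merely stands in for Bob's
public-key signature (a real peer verifies the signature with Bob's
public key and never holds his private key).

@example
import hashlib
import hmac

VALIDITY_SECONDS = 12 * 60 * 60  # address stays valid for 12h
PURPOSE = b"TRANSPORT_PONG_OWN"  # shortened label of the purpose code


def sign(key, address, nonce, expires):
    # ASSUMPTION: HMAC stands in for Bob's public-key signature.
    data = PURPOSE + b"|" + address.encode() + b"|" + nonce
    data += b"|" + str(expires).encode()
    return hmac.new(key, data, hashlib.sha256).digest()


def bob_handle_ping(key, own_addresses, address, nonce, now):
    # Bob answers only if the address in the PING is one of his own.
    if address not in own_addresses:
        return None
    expires = now + 60  # assumed lifetime of the signature, in seconds
    return (address, nonce, expires, sign(key, address, nonce, expires))


def alice_check_pong(key, used_address, sent_nonce, pong, now):
    # Returns the time until which Alice considers the address valid,
    # or None if the PONG is missing or does not check out.
    if pong is None:
        return None
    address, nonce, expires, signature = pong
    expected = sign(key, address, nonce, expires)
    if not hmac.compare_digest(signature, expected):
        return None
    if address != used_address or nonce != sent_nonce or now > expires:
        return None
    return now + VALIDITY_SECONDS
@end example

If Mallory's address appears in the PING, @code{bob_handle_ping} returns
@code{None} and Alice never obtains a valid PONG for it. A PONG replayed
against a later PING fails the nonce comparison in
@code{alice_check_pong}.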
3811
3812 @cindex NAT library
3813 @node NAT library
3814 @section NAT library
3815
3816
3817 The goal of the GNUnet NAT library is to provide a general-purpose API for
3818 NAT traversal @strong{without} third-party support. So protocols that
3819 involve contacting a third peer to help establish a connection between
3820 two peers are outside of the scope of this API. That does not mean that
3821 GNUnet doesn't support involving a third peer (we can do this with the
3822 distance-vector transport or using application-level protocols), it just
3823 means that the NAT API is not concerned with this possibility. The API is
3824 written so that it will work for IPv6-NAT in the future as well as
3825 current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3826 that are not behind NAT --- in that case, the mapping provided is simply
3827 the identity.
3828
3829 NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3830 set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3831 library will return (via callback) a (possibly longer) list of addresses
3832 the peer @strong{might} be reachable under. Internally, depending on the
3833 configuration, the NAT library will try to punch a hole (using UPnP) or
3834 just "know" that the NAT was manually punched and generate the respective
3835 external IP address (the one that should be globally visible) based on
3836 the given information.
3837
3838 The NAT library also supports ICMP-based NAT traversal. Here, the other
3839 peer can request connection-reversal by this peer (in this special case,
3840 the peer is even allowed to configure a port number of zero). If the NAT
3841 library detects a connection-reversal request, it returns the respective
3842 target address to the client as well. It should be noted that
3843 connection-reversal is currently only intended for TCP, so other plugins
3844 @strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3845 NAT library also supports requesting connection reversal from a remote
3846 peer (@code{GNUNET_NAT_run_client}).
3847
3848 Once initialized, the NAT handle can be used to test if a given address is
3849 possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3850 This is used for validating our addresses when generating PONGs.
3851
3852 Finally, the NAT library contains an API to test if our NAT configuration
3853 is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
3854 the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client, initializes
the NAT traversal and then contacts a @code{gnunet-nat-server} (running by
default on @code{gnunet.org}) and asks for a connection to be established.
3858 This way, it is easy to test if the current NAT configuration is valid.
3859
3860 @node Distance-Vector plugin
3861 @section Distance-Vector plugin
3862
3863
3864 The Distance Vector (DV) transport is a transport mechanism that allows
3865 peers to act as relays for each other, thereby connecting peers that would
3866 otherwise be unable to connect. This gives a larger connection set to
3867 applications that may work better with more peers to choose from (for
3868 example, File Sharing and/or DHT).
3869
3870 The Distance Vector transport essentially has two functions. The first is
3871 "gossiping" connection information about more distant peers to directly
3872 connected peers. The second is taking messages intended for non-directly
3873 connected peers and encapsulating them in a DV wrapper that contains the
3874 required information for routing the message through forwarding peers. Via
3875 gossiping, optimal routes through the known DV neighborhood are discovered
3876 and utilized and the message encapsulation provides some benefits in
3877 addition to simply getting the message from the correct source to the
3878 proper destination.
3879
3880 The gossiping function of DV provides an up to date routing table of
3881 peers that are available up to some number of hops. We call this a
3882 fisheye view of the network (like a fish, nearby objects are known while
3883 more distant ones unknown). Gossip messages are sent only to directly
connected peers, but they describe other known peers within the
3885 "fisheye distance". Whenever two peers connect, they immediately gossip
3886 to each other about their appropriate other neighbors. They also gossip
3887 about the newly connected peer to previously
3888 connected neighbors. In order to keep the routing tables up to date,
3889 disconnect notifications are propagated as gossip as well (because
disconnects may not be sent/received, timeouts are also used to remove
stale routing table entries).
3892
3893 Routing of messages via DV is straightforward. When the DV transport is
3894 notified of a message destined for a non-direct neighbor, the appropriate
3895 forwarding peer is selected, and the base message is encapsulated in a DV
3896 message which contains information about the initial peer and the intended
3897 recipient. At each forwarding hop, the initial peer is validated (the
3898 forwarding peer ensures that it has the initial peer in its neighborhood,
3899 otherwise the message is dropped). Next the base message is
3900 re-encapsulated in a new DV message for the next hop in the forwarding
3901 chain (or delivered to the current peer, if it has arrived at the
3902 destination).
3903
3904 Assume a three peer network with peers Alice, Bob and Carol. Assume that
3905
3906 @example
3907 Alice <-> Bob and Bob <-> Carol
3908 @end example
3909
3910 @noindent
3911 are direct (e.g. over TCP or UDP transports) connections, but that
3912 Alice cannot directly connect to Carol.
3913 This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob for Carol. The message is encapsulated setting Alice as
the initiator and Carol as the destination and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
3924 re-wraps the message in a new DV message for Carol.
3925 The DV transport at Carol receives this message, unwraps the original
3926 message, and delivers it to Carol as though it came directly from Alice.
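
The Alice, Bob and Carol example can be replayed with a small Python
sketch. It is a toy model with hypothetical names, not the plugin's
code: routing tables are plain dictionaries that map a target to a pair
of hop count and next hop, one call to @code{gossip} represents one round
of gossip between all directly connected peers, and the fisheye distance
limit and timeouts are left out.

@example
def build_tables(peers, connections):
    # Every peer starts out knowing only its direct neighbours.
    tables = dict((peer, dict()) for peer in peers)
    links = []
    for a, b in connections:
        tables[a][b] = (1, b)
        tables[b][a] = (1, a)
        links.append((a, b))
        links.append((b, a))
    return tables, links


def gossip(tables, links):
    # One gossip round: every peer tells each direct neighbour about the
    # peers it can reach; the neighbour records the sender as next hop.
    for sender, receiver in links:
        for target, (hops, _via) in list(tables[sender].items()):
            known = tables[receiver].get(target)
            if target != receiver and (known is None or hops + 1 < known[0]):
                tables[receiver][target] = (hops + 1, sender)


def route(tables, source, destination, payload):
    # Wrap the payload with initiator and destination, then forward it
    # hop by hop.  Returns (path, payload), or None if it is dropped.
    message = (source, destination, payload)
    current = source
    path = [source]
    while current != destination:
        # A forwarding peer drops messages from an unknown initiator.
        if current != source and source not in tables[current]:
            return None
        entry = tables[current].get(destination)
        if entry is None:
            return None
        current = entry[1]
        path.append(current)
    return path, message[2]
@end example

Before any gossip, routing from Alice to Carol fails because Alice has no
entry for Carol. After one call to @code{gossip}, Alice's table lists
Carol at two hops via Bob, and the message travels along the path Alice,
Bob, Carol.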
3927
3928 @cindex SMTP plugin
3929 @node SMTP plugin
3930 @section SMTP plugin
3931
3932 @c TODO: Update!
3933
3934 This section describes the new SMTP transport plugin for GNUnet as it
3935 exists in the 0.7.x and 0.8.x branch. SMTP support is currently not
3936 available in GNUnet 0.9.x. This page also describes the transport layer
3937 abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
3938 some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
Readers in the year 2019 will notice from the mention of
versions 0.7, 0.8 and 0.9 that this section has to be taken with the
usual grain of salt and needs to be updated eventually.
3943
3944 @itemize @bullet
3945 @item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3947 @item How do I configure my peer?
3948 @item How do I test if it works?
3949 @item How fast is it?
3950 @item Is there any additional documentation?
3951 @end itemize
3952
3953
3954 @menu
3955 * Why use SMTP for a peer-to-peer transport?::
3956 * How does it work?::
3957 * How do I configure my peer?::
3958 * How do I test if it works?::
3959 * How fast is it?::
3960 @end menu
3961
3962 @node Why use SMTP for a peer-to-peer transport?
3963 @subsection Why use SMTP for a peer-to-peer transport?
3964
3965
3966 There are many reasons why one would not want to use SMTP:
3967
3968 @itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to
non-participating third parties.
3976 @end itemize
3977
3978 So why would anybody want to use SMTP?
3979 @itemize @bullet
3980 @item SMTP can be used to contact peers behind NAT boxes (in virtual
3981 private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3984 @item SMTP uses E-mail addresses which are independent of a specific IP,
3985 which can be useful to address peers that use dynamic IP addresses.
3986 @item SMTP can be used to initiate a connection (e.g. initial address
3987 exchange) and peers can then negotiate the use of a more efficient
3988 protocol (e.g. TCP) for the actual communication.
3989 @end itemize
3990
3991 In summary, SMTP can for example be used to send a message to a peer
3992 behind a NAT box that has a dynamic IP to tell the peer to establish a
3993 TCP connection to a peer outside of the private network. Even an
3994 extraordinary overhead for this first message would be irrelevant in this
3995 type of situation.
3996
3997 @node How does it work?
3998 @subsection How does it work?
3999
4000
4001 When a GNUnet peer needs to send a message to another GNUnet peer that has
4002 advertised (only) an SMTP transport address, GNUnet base64-encodes the
4003 message and sends it in an E-mail to the advertised address. The
4004 advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward them
to the GNUnet peer process. The filter can be specified individually by
4007 each peer and be changed over time. This makes it impossible to censor
4008 GNUnet E-mail messages by searching for a generic filter.
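
The encoding step can be sketched in Python using only the standard
library. This is an illustration under assumptions, not the plugin's
code: the function names are hypothetical, the filter is the example
header @code{X-mailer: GNUnet} used in the configuration section, and
handing the mail to an SMTP server as well as the addressing headers are
left out.

@example
import base64
from email import message_from_string
from email.message import EmailMessage

FILTER_NAME = "X-mailer"  # example filter, chosen freely by each peer
FILTER_VALUE = "GNUnet"


def wrap(payload):
    # Sender side: base64-encode the binary message and tag the mail
    # with the filter header advertised by the receiving peer.
    mail = EmailMessage()
    mail[FILTER_NAME] = FILTER_VALUE
    mail.set_content(base64.b64encode(payload).decode("ascii"))
    return mail.as_string()


def unwrap(raw_mail):
    # Receiver side: ignore mail without the filter, decode the rest.
    mail = message_from_string(raw_mail)
    if mail.get(FILTER_NAME) != FILTER_VALUE:
        return None
    return base64.b64decode(mail.get_payload(decode=True))
@end example

A round trip through @code{wrap} and @code{unwrap} returns the original
bytes, while ordinary mail without the filter header is ignored.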
4009
4010 @node How do I configure my peer?
4011 @subsection How do I configure my peer?
4012
4013
4014 First, you need to configure @code{procmail} to filter your inbound E-mail
4015 for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
4016 example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
4017 used by @command{procmail} to detect GNUnet messages. You are free to
4018 choose whichever filter you like, but you should make sure that it does
4019 not occur in your other E-mail. In our example, we will use
4020 @code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
4021 looks like this:
4022
4023 @example
4024 :0:
4025 * ^X-mailer: GNUnet
4026 /tmp/gnunet.smtp
4027 # where do you want your other e-mail delivered to
4028 # (default: /var/spool/mail/)
:0:
/var/spool/mail/
4030 @end example
4031
4032 After adding this file, first make sure that your regular E-mail still
4033 works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
4034 configuration. In the section @code{SMTP} you need to specify your E-mail
4035 address under @code{EMAIL}, your mail server (for outgoing mail) under
4036 @code{SERVER}, the filter (X-mailer: GNUnet in the example) under
@code{FILTER} and the name of the pipe under @code{PIPE}. The completed
section could then look like this:
4039
4040 @example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
4043 @end example
4044
4045 Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
4046 the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
4047 you specified to contact your peer until the advertisement times out.
4048 Thus, if you are not sure if everything works properly or if you are not
4049 planning to be online for a long time, you may want to configure this
4050 timeout to be short, e.g. just one hour. For this, set
4051 @code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
4052
This should be it, but you will probably want to test it first.
4054
4055 @node How do I test if it works?
4056 @subsection How do I test if it works?
4057
4058
4059 Any transport can be subjected to some rudimentary tests using the
4060 @code{gnunet-transport-check} tool. The tool sends a message to the local
4061 node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, this is a great
test case for the SMTP transport since it tests nearly all of
the functionality.
4066
4067 @code{gnunet-transport-check} should only be used without running
4068 @code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
4069 tests all transports that are specified in the configuration file. But
4070 you can specifically test SMTP by giving the option
4071 @code{--transport=smtp}.
4072
4073 Note that this test always checks if a transport can receive and send.
4074 While you can configure most transports to only receive or only send
4075 messages, this test will only work if you have configured the transport
4076 to send and receive messages.
4077
4078 @node How fast is it?
4079 @subsection How fast is it?
4080
4081
4082 We have measured the performance of the UDP, TCP and SMTP transport layer
4083 directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
4085 overhead of the protocol, whereas evaluating the transport from the
4086 application puts the overhead into perspective from a practical point of
4087 view.
4088
4089 The loopback measurements of the SMTP transport were performed on three
4090 different machines spanning a range of modern SMTP configurations. We
4091 used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration, which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering whereas the sendmail benchmarks rely on
4097 procmail to filter and deliver the mail. We used the transport layer to
4098 send a message of b bytes (excluding transport protocol headers) directly
4099 to the local machine. This way, network latency and packet loss on the
4100 wire have no impact on the timings. n messages were sent sequentially over
4101 the transport layer, sending message i+1 after the i-th message was
4102 received. All messages were sent over the same connection and the time to
4103 establish the connection was not taken into account since this overhead is
4104 minuscule in practice --- as long as a connection is used for a
4105 significant number of messages.
4106
4107 @multitable @columnfractions .20 .15 .15 .15 .15 .15
4108 @headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
4109 @tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
4110 @item  11 bytes @tab 31 ms @tab 55 ms @tab  781 s @tab 77 s @tab 24 s
4111 @item  407 bytes @tab 37 ms @tab 62 ms @tab  789 s @tab 78 s @tab 25 s
4112 @item 1,221 bytes @tab 46 ms @tab 73 ms @tab  804 s @tab 78 s @tab 25 s
4113 @end multitable
4114
The benchmarks show that UDP and TCP are, as expected, both significantly
faster than any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost. The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection. Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be reduced by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps, as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
4143
4144 @cindex Bluetooth plugin
4145 @node Bluetooth plugin
4146 @section Bluetooth plugin
4147
4148
4149 This page describes the new Bluetooth transport plugin for GNUnet. The
4150 plugin is still in the testing stage so don't expect it to work
4151 perfectly. If you have any questions or problems just post them here or
4152 ask on the IRC channel.
4153
4154 @itemize @bullet
4155 @item What do I need to use the Bluetooth plugin transport?
@item How does it work?
4157 @item What possible errors should I be aware of?
4158 @item How do I configure my peer?
4159 @item How can I test it?
4160 @end itemize
4161
4162 @menu
4163 * What do I need to use the Bluetooth plugin transport?::
4164 * How does it work2?::
4165 * What possible errors should I be aware of?::
4166 * How do I configure my peer2?::
4167 * How can I test it?::
4168 * The implementation of the Bluetooth transport plugin::
4169 @end menu
4170
4171 @node What do I need to use the Bluetooth plugin transport?
4172 @subsection What do I need to use the Bluetooth plugin transport?
4173
4174
4175 If you are a GNU/Linux user and you want to use the Bluetooth
4176 transport plugin you should install the
4177 @command{BlueZ} development libraries (if they aren't already
4178 installed).
4179 For instructions about how to install the libraries you should
4180 check out the BlueZ site
4181 (@uref{http://www.bluez.org/, http://www.bluez.org}). If you don't know if
4182 you have the necessary libraries, don't worry, just run the GNUnet
4183 configure script and you will be able to see a notification at the end
4184 which will warn you if you don't have the necessary libraries.
4185
4186 @c If you are a Windows user you should have installed the
4187 @c @emph{MinGW}/@emph{MSys2} with the latest updates (especially the
4188 @c @emph{ws2bth} header). If this is your first build of GNUnet on Windows
4189 @c you should check out the SBuild repository. It will semi-automatically
4190 @c assembles a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
4191 @c packages which are needed for the GNUnet build. So this will ease your
4192 @c work!@ Finally you just have to be sure that you have the correct drivers
4193 @c for your Bluetooth device installed and that your device is on and in a
4194 @c discoverable mode. The Windows Bluetooth Stack supports only the RFCOMM
4195 @c protocol so we cannot turn on your device programatically!
4196
4197 @c FIXME: Change to unique title
4198 @node How does it work2?
4199 @subsection How does it work2?
4200
4201
4202 The Bluetooth transport plugin uses virtually the same code as the WLAN
4203 plugin and only the helper binary is different. The helper takes a single
4204 argument, which represents the interface name and is specified in the
4205 configuration file. Here are the basic steps that are followed by the
4206 helper binary used on GNU/Linux:
4207
4208 @itemize @bullet
@item it verifies whether the name corresponds to a Bluetooth interface
@item it verifies whether the interface is up (if it is not, it tries to
bring it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and able to accept incoming connection requests.
@emph{The above operations require root access, so you should start the
transport plugin with root privileges.}
@item it finds an available port number, registers an SDP service that
other devices use to find out on which port the server is listening,
and switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally, it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
4222 @end itemize
4223
Once in a while the device performs an inquiry scan to discover nearby
devices and sends HELLO messages to them at random for peer discovery.
4227
4228 @node What possible errors should I be aware of?
4229 @subsection What possible errors should I be aware of?
4230
4231
@emph{This section applies to GNU/Linux users.}

There are many ways in which things can go wrong. Below I present some
tools that you can use for debugging, along with some typical scenarios.
4236
4237 @itemize @bullet
4238
@item @code{bluetoothd -n -d}: runs the Bluetooth daemon in the
foreground with debug logging, so that the log messages are printed to
the terminal

@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments, it prints information about the
state of the interfaces. If you receive an error saying that the device
could not be brought up, try to bring it up manually (use
@code{hciconfig -a hciX up}) and see whether that works. If it does not
and the Bluetooth address has the form 00:00:00:00:00:00, something is
wrong with the D-Bus daemon or with the Bluetooth daemon. Use the
@code{bluetoothd} tool to see the logs.

@item @code{sdptool}: can be used to control and interrogate SDP servers.
If you encounter problems with the SDP server (for example, the SDP
server is down), check whether the D-Bus daemon is running correctly and
whether the Bluetooth daemon started correctly (use the
@code{bluetoothd} tool). Also, sometimes the SDP server works but the
device could not register its service. Use
@code{sdptool browse [dev-address]} to see whether the service is
registered. There should be a service with the name of the interface and
GNUnet as provider.

@item @code{hcitool}: another useful tool, which can be used to configure
the device and to send particular commands to it.

@item @code{hcidump}: can be used for low-level debugging
4264 @end itemize
4265
4266 @c FIXME: A more unique name
4267 @node How do I configure my peer2?
4268 @subsection How do I configure my peer2?
4269
4270
4271 On GNU/Linux, you just have to be sure that the interface name
4272 corresponds to the one that you want to use.
4273 Use the @code{hciconfig} tool to check that.
4274 By default it is set to hci0 but you can change it.
4275
4276 A basic configuration looks like this:
4277
4278 @example
4279 [transport-bluetooth]
4280 # Name of the interface (typically hciX)
4281 INTERFACE = hci0
4282 # Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
4284 @end example
4285
4286 In order to use the Bluetooth transport plugin when the transport service
4287 is started, you must add the plugin name to the default transport service
4288 plugins list. For example:
4289
4290 @example
[transport]
...
PLUGINS = dns bluetooth
...
4292 @end example
4293
If you want to use only the Bluetooth plugin, set
@code{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use. The only thing that
you need to do is to add @emph{bluetooth} to the plugins list of the
transport service.
4300
4301 @node How can I test it?
4302 @subsection How can I test it?
4303
4304
4305 If you have two Bluetooth devices on the same machine and you are using
4306 GNU/Linux you must:
4307
4308 @itemize @bullet
4309
@item create two different configuration files, one that uses the
first interface (@emph{hci0}) and one that uses the second
interface (@emph{hci1}). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.
4317
4318 @item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
4319 start the transport service. (Make sure that you have "bluetooth" on the
4320 transport plugins list if the Bluetooth transport service doesn't start.)
4321
4322 @item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
4323 ID. If you already know your peer ID (you saved it from the first
4324 command), this can be skipped.
4325
4326 @item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
4327 sending data for benchmarking to the other peer.
4328
4329 @end itemize
4330
4331
4332 This scenario will try to connect the second peer to the first one and
4333 then start sending data for benchmarking.
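
As a rough sketch, the two configuration files for this scenario might
differ only in the interface name. The section and option names below are
the ones shown in the configuration section; anything else that two peers
on one machine need (for example separate home directories or ports) is
assumed to be configured elsewhere and is not shown here.

@example
# peer1.conf
[transport]
PLUGINS = bluetooth

[transport-bluetooth]
INTERFACE = hci0
TESTMODE = 0
@end example

@example
# peer2.conf
[transport]
PLUGINS = bluetooth

[transport-bluetooth]
INTERFACE = hci1
TESTMODE = 0
@end example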
4334
4335 @c On Windows you cannot test the plugin functionality using two Bluetooth
4336 @c devices from the same machine because after you install the drivers there
4337 @c will occur some conflicts between the Bluetooth stacks. (At least that is
4338 @c what happened on my machine : I wasn't able to use the Bluesoleil stack and
4339 @c the WINDCOMM one in the same time).
4340
If you have two different machines and your configuration files are
correct, you can use the same scenario presented at the beginning of this
section.
4343
4344 Another way to test the plugin functionality is to create your own
4345 application which will use the GNUnet framework with the Bluetooth
4346 transport service.
4347
4348 @node The implementation of the Bluetooth transport plugin
4349 @subsection The implementation of the Bluetooth transport plugin
4350
4351
This section describes the implementation of the Bluetooth transport
plugin.

First, I want to remind you that the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. The purpose of the helper binary is also the same as for the
WLAN transport plugin: it accesses the interface and then forwards
traffic in both directions between the Bluetooth interface and
stdin/stdout of the process involved.

The Bluetooth transport plugin can be used on both GNU/Linux and Windows
platforms.
4364
4365 @itemize @bullet
4366 @item Linux functionality
4367 @c @item Windows functionality
4368 @item Pending Features
4369 @end itemize
4370
4371
4372
4373 @menu
4374 * Linux functionality::
4375 * THE INITIALIZATION::
4376 * THE LOOP::
4377 * Details about the broadcast implementation::
4378 @c * Windows functionality::
4379 * Pending features::
4380 @end menu
4381
4382 @node Linux functionality
4383 @subsubsection Linux functionality
4384
4385
To implement the plugin functionality on GNU/Linux, I used the BlueZ
stack. For the communication with the other devices I used the RFCOMM
protocol, and I used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and runs in two stages:
4392
4393 @c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
4394 @c %** starting a new section?
4395 @node THE INITIALIZATION
4396 @subsubsection THE INITIALIZATION
4397
4398 @itemize @bullet
@item first, it checks whether we have root privileges
(@emph{Remember that we need root privileges in order to bring the
interface up if it is down or to change its state.}).

@item second, it verifies whether the interface with the given name
exists.

@strong{If the interface with that name exists and it is a Bluetooth
interface:}

@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method.

In the @emph{open_device} method, it:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
@item searches for the device ID using the interface name
@item saves the device MAC address
@item checks whether the interface is down and, if so, tries to bring it
up
@item checks whether the interface is in discoverable mode and, if not,
tries to make it discoverable
@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
@end itemize

@item it drops the root privileges

@strong{If the interface is not a Bluetooth interface, the helper exits
with a suitable error.}
4429 @end itemize
4430
4431 @c %** Same as for @node entry above
4432 @node THE LOOP
4433 @subsubsection THE LOOP
4434
The helper binary uses a list in which it saves all the connected
neighbour devices (@emph{neighbours.devices}) and two buffers
(@emph{write_pout} and @emph{write_std}). The first message that is
sent is a control message with the device's MAC address in order to
announce the peer's presence to the neighbours. Here is a short
description of what happens in the main loop:
4441
4442 @itemize @bullet
@item Every time it receives something from STDIN, it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there
is no connection with that device, it creates a new one and saves it to
the list) and sends the message.
@item Every time it receives something on the listening socket, it
accepts the connection and saves the socket in a list of reading
sockets.
@item Every time it receives something from a reading
socket, it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
4454 @end itemize
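
The CRC check mentioned in the last item can be illustrated with a small,
self-contained C sketch. It assumes the common CRC-32 (reflected
polynomial 0xEDB88320); this handbook does not say which CRC variant the
helper actually uses, and the names @code{crc32_compute} and
@code{frame_crc_ok} are hypothetical.

@example
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
   final XOR 0xFFFFFFFF). Assumption: the helper's CRC may differ. */
static uint32_t
crc32_compute (const void *buf, size_t len)
@{
  const uint8_t *p = buf;
  uint32_t crc = 0xFFFFFFFFu;

  for (size_t i = 0; i < len; i++)
  @{
    crc ^= p[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  @}
  return crc ^ 0xFFFFFFFFu;
@}

/* Returns 1 if the payload matches the CRC attached by the sender. */
static int
frame_crc_ok (const void *payload, size_t len, uint32_t expected_crc)
@{
  assert (NULL != payload || 0 == len);
  return crc32_compute (payload, len) == expected_crc;
@}
@end example

A receiver would call @code{frame_crc_ok} on each parsed message and only
copy messages that pass the check into the @emph{write_std} buffer.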
4455
In the main loop we use the @code{select} function to wait until one of
the file descriptors in one of two file descriptor sets is ready. The
first set (@emph{rfds}) is the reading set; it can contain the reading
sockets, the STDIN file descriptor and the listening socket. The second
set (@emph{wfds}) is the writing set; it can contain the sending socket
and the STDOUT file descriptor. After @code{select} returns, we check
which file descriptor is ready and handle the event accordingly.
@emph{For example:} if it is the listening socket, we accept a new
connection and save the socket in the reading list; if it is the STDOUT
file descriptor, we write the message from the @emph{write_std} buffer
to STDOUT.
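
The wait step of this loop can be sketched with plain POSIX calls. The
following is a simplified illustration, not the helper's code: it watches
only one STDIN-like descriptor, one listening descriptor and one
STDOUT-like descriptor, and the names @code{ready_set} and
@code{wait_for_events} are hypothetical.

@example
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Which descriptors select() reported as ready in one iteration. */
struct ready_set
@{
  int stdin_readable;
  int listen_readable;
  int stdout_writable;
@};

/* One wait step. pending_stdout says whether the write_std buffer
   currently holds a message. Returns the select() result (-1 on
   error). */
static int
wait_for_events (int stdin_fd, int listen_fd, int stdout_fd,
                 int pending_stdout, struct ready_set *out)
@{
  fd_set rfds;
  fd_set wfds;
  int maxfd = stdin_fd > listen_fd ? stdin_fd : listen_fd;
  int ret;

  assert (NULL != out);
  FD_ZERO (&rfds);
  FD_ZERO (&wfds);
  FD_SET (stdin_fd, &rfds);
  FD_SET (listen_fd, &rfds);
  if (pending_stdout)
  @{
    /* Only ask for writability when there is something to write. */
    FD_SET (stdout_fd, &wfds);
    if (stdout_fd > maxfd)
      maxfd = stdout_fd;
  @}
  ret = select (maxfd + 1, &rfds, &wfds, NULL, NULL);
  if (ret < 0)
    return ret;
  out->stdin_readable = FD_ISSET (stdin_fd, &rfds) != 0;
  out->listen_readable = FD_ISSET (listen_fd, &rfds) != 0;
  out->stdout_writable = pending_stdout && FD_ISSET (stdout_fd, &wfds);
  return ret;
@}
@end example

The caller then dispatches on the flags in @code{ready_set}, for example
accepting a connection when @code{listen_readable} is set. A real loop
would also add every reading socket to @emph{rfds}.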
4468
To find out on which port a device is listening, we connect to the local
SDP server and search for the service registered for that device.
4471
4472 @emph{You should be aware of the fact that if the device fails to connect
4473 to another one when trying to send a message it will attempt one more
4474 time. If it fails again, then it skips the message.}
@emph{Also you should know that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}
4477
4478 @node Details about the broadcast implementation
4479 @subsubsection Details about the broadcast implementation
4480
4481
First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since an inquiry scan
takes a long time, and it would take even longer to send a message to all
the discoverable devices, I decided to tackle the problem in a different
way. Here is how I did it:
4487
4488 @itemize @bullet
@item The first time I have to broadcast a message, I perform an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends, I take the first address from the list
and try to connect to it. If that fails, I try to connect to the next
one. If it succeeds, I save the socket in a list and send the message to
the device.
@item When I have to broadcast another message, I first search the list
for a new device to which I am not yet connected. If there is no new
device in the list, I go to the beginning of the list and send the
message to the old devices. After 5 cycles I perform a new inquiry scan
to check whether there are new discoverable devices and save them in the
list. If there are no new discoverable devices, I reset the cycle counter
and go through the old list again, sending messages to the devices saved
in it.
4502 @end itemize
4503
4504 @strong{Therefore}:
4505
4506 @itemize @bullet
@item every time I have a broadcast message, I look in the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I am connected to
all the devices in the list, I perform a new inquiry scan.
@emph{The number of list cycles after an inquiry scan can be
increased by redefining the MAX_LOOPS variable}
@item when there are no new devices, I send messages to the old ones.
4514 @end itemize
4515
This way, the broadcast control messages will reach the devices, but with
some delay.
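
The selection rule described above can be sketched as a small state
machine. This is only an illustration under simplifying assumptions: the
device list is a fixed-size array of "connected" flags, a failed
connection attempt is not modelled (the caller would have to flag or
remove unreachable devices), and all names except @code{MAX_LOOPS} are
hypothetical.

@example
#include <assert.h>
#include <stddef.h>

#define MAX_LOOPS 5       /* list cycles between two inquiry scans */
#define MAX_DEVICES 16    /* hypothetical capacity for this sketch */
#define NEED_INQUIRY (-1)

struct bcast_state
@{
  int connected[MAX_DEVICES]; /* 1 if we hold a socket to device i */
  size_t count;               /* devices found by inquiry scans */
  size_t pos;                 /* next already-connected device */
  unsigned int loops;         /* completed passes over the list */
@};

/* Pick the device that receives the next broadcast message.
   Returns an index into the list, or NEED_INQUIRY when the caller
   should run an inquiry scan (and add any new devices) first. */
static int
next_broadcast_target (struct bcast_state *s)
@{
  int target;

  assert (NULL != s && s->count <= MAX_DEVICES);
  if (0 == s->count)
    return NEED_INQUIRY;        /* very first broadcast */
  for (size_t i = 0; i < s->count; i++)
    if (! s->connected[i])
      return (int) i;           /* prefer a device not yet reached */
  if (s->loops >= MAX_LOOPS)
  @{
    s->loops = 0;               /* reset the cycle counter */
    return NEED_INQUIRY;
  @}
  target = (int) s->pos;
  s->pos = (s->pos + 1) % s->count;
  if (0 == s->pos)
    s->loops++;                 /* wrapped: one full pass done */
  return target;
@}
@end example

The caller sets @code{connected[i]} after a successful connection, so that
later broadcasts move on to the next new device and eventually start
cycling through the old ones.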
4518
@emph{NOTICE:} When I have to send a message to a certain device, I first
check the broadcast list to see whether I am connected to that device. If
not, I try to connect to it and, on success, save the address and
the socket in the list. If I am already connected to that device, I
simply use the existing socket.
4524
4525 @c @node Windows functionality
4526 @c @subsubsection Windows functionality
4527
4528
4529 @c For Windows I decided to use the Microsoft Bluetooth stack which has the
4530 @c advantage of coming standard from Windows XP SP2. The main disadvantage is
4531 @c that it only supports the RFCOMM protocol so we will not be able to have
4532 @c a low level control over the Bluetooth device. Therefore it is the user
4533 @c responsibility to check if the device is up and in the discoverable mode.
4534 @c Also there are no tools which could be used for debugging in order to read
4535 @c the data coming from and going to a Bluetooth device, which obviously
4536 @c hindered my work. Another thing that slowed down the implementation of the
4537 @c plugin (besides that I wasn't too accommodated with the win32 API) was that
4538 @c there were some bugs on MinGW regarding the Bluetooth. Now they are solved
4539 @c but you should keep in mind that you should have the latest updates
4540 @c (especially the @emph{ws2bth} header).
4541
4542 @c Besides the fact that it uses the Windows Sockets, the Windows
4543 @c implementation follows the same principles as the GNU/Linux one:
4544
4545 @c @itemize @bullet
4546 @c @item It has a initalization part where it initializes the
4547 @c Windows Sockets, creates a RFCOMM socket which will be binded and switched
4548 @c to the listening mode and registers a SDP service. In the Microsoft
4549 @c Bluetooth API there are two ways to work with the SDP:
4550 @c @itemize @bullet
4551 @c @item an easy way which works with very simple service records
4552 @c @item a hard way which is useful when you need to update or to delete the
4553 @c record
4554 @c @end itemize
4555 @c @end itemize
4556
4557 @c Since I only needed the SDP service to find out on which port the device
4558 @c is listening on and that did not change, I decided to use the easy way.
4559 @c In order to register the service I used the @emph{WSASetService} function
4560 @c and I generated the @emph{Universally Unique Identifier} with the
4561 @c @emph{guidgen.exe} Windows's tool.
4562
4563 @c In the loop section the only difference from the GNU/Linux implementation
4564 @c is that I used the @code{GNUNET_NETWORK} library for
4565 @c functions like @emph{accept}, @emph{bind}, @emph{connect} or
4566 @c @emph{select}. I decided to use the
4567 @c @code{GNUNET_NETWORK} library because I also needed to interact
4568 @c with the STDIN and STDOUT handles and on Windows
4569 @c the select function is only defined for sockets,
4570 @c and it will not work for arbitrary file handles.
4571
4572 @c Another difference between GNU/Linux and Windows implementation is that in
4573 @c GNU/Linux, the Bluetooth address is represented in 48 bits
4574 @c while in Windows is represented in 64 bits.
4575 @c Therefore I had to do some changes on @emph{plugin_transport_wlan} header.
4576
4577 @c Also, currently on Windows the Bluetooth plugin doesn't have support for
4578 @c broadcast messages. When it receives a broadcast message it will skip it.
4579
4580 @node Pending features
4581 @subsubsection Pending features
4582
4583
4584 @itemize @bullet
4585 @c @item Implement the broadcast functionality on Windows @emph{(currently
4586 @c working on)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
4590 @end itemize
4591
4592 If you have a new idea about a feature of the plugin or suggestions about
4593 how I could improve the implementation you are welcome to comment or to
4594 contact me.
4595
4596 @node WLAN plugin
4597 @section WLAN plugin
4598
4599
This section documents how the WLAN transport plugin works. Parts which
4601 are not implemented yet or could be better implemented are described at
4602 the end.
4603
4604 @cindex ATS Subsystem
4605 @node ATS Subsystem
4606 @section ATS Subsystem
4607
4608
ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4618
4619 The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
4621 does not make the actual allocation decision. This last critical step is
4622 left to an ATS plugin, as we have implemented (currently three) different
4623 allocation strategies which differ significantly in their performance and
4624 maturity, and it is still unclear if any particular plugin is generally
4625 superior.
4626
4627 @cindex CORE Subsystem
4628 @node CORE Subsystem
4629 @section CORE Subsystem
4630
4631
4632 The CORE subsystem in GNUnet is responsible for securing link-layer
4633 communications between nodes in the GNUnet overlay network. CORE builds
4634 on the TRANSPORT subsystem which provides for the actual, insecure,
4635 unreliable link-layer communication (for example, via UDP or WLAN), and
4636 then adds fundamental security to the connections:
4637
4638 @itemize @bullet
4639 @item confidentiality with so-called perfect forward secrecy; we use
4640 ECDHE
(@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie-Hellman})
4642 powered by Curve25519
4643 (@uref{http://cr.yp.to/ecdh.html, Curve25519}) for the key
4644 exchange and then use symmetric encryption, encrypting with both AES-256
4645 (@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}) and
4646 Twofish (@uref{http://en.wikipedia.org/wiki/Twofish, Twofish})
4647 @item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4648 is achieved by signing the ephemeral keys using Ed25519
4649 (@uref{http://ed25519.cr.yp.to/, Ed25519}), a deterministic
4650 variant of ECDSA
4651 (@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA})
4652 @item integrity protection (using SHA-512
4653 (@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}) to do
4654 encrypt-then-MAC
4655 (@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}))
@item replay
(@uref{http://en.wikipedia.org/wiki/Replay_attack, replay})
protection (using nonces, timestamps, challenge-response,
message counters and ephemeral keys)
4660 @item liveness (keep-alive messages, timeout)
4661 @end itemize
4662
4663 @menu
4664 * Limitations::
4665 * When is a peer "connected"?::
4666 * libgnunetcore::
4667 * The CORE Client-Service Protocol::
4668 * The CORE Peer-to-Peer Protocol::
4669 @end menu
4670
4671 @cindex core subsystem limitations
4672 @node Limitations
4673 @subsection Limitations
4674
4675
4676 CORE does not perform
4677 @uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4678 only possible to communicate with peers that happen to already be
4679 "directly" connected with each other. CORE also does not have an
4680 API to allow applications to establish such "direct" connections --- for
4681 this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4682 establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4683 trying to keep a few "direct" connections open at all times. Applications
4684 that need to talk to particular peers should use the CADET subsystem, as
4685 it can establish arbitrary "indirect" connections.
4686
4687 Because CORE does not perform routing, CORE must only be used directly by
4688 applications that either perform their own routing logic (such as
4689 anonymous file-sharing) or that do not require routing, for example
4690 because they are based on flooding the network. CORE communication is
4691 unreliable and delivery is possibly out-of-order. Applications that
4692 require reliable communication should use the CADET service. Each
4693 application can only queue one message per target peer with the CORE
4694 service at any time; messages cannot be larger than approximately
4695 63 kilobytes. If messages are small, CORE may group multiple messages
4696 (possibly from different applications) prior to encryption. If permitted
4697 by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4698 option), CORE may delay transmissions to facilitate grouping of multiple
4699 small messages. If cork is not enabled, CORE will transmit the message as
4700 soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4701 bandwidth and congestion control). CORE does not allow flow control;
4702 applications are expected to process messages at line-speed. If flow
4703 control is needed, applications should use the CADET service.
4704
4705 @cindex when is a peer connected
4706 @node When is a peer "connected"?
4707 @subsection When is a peer "connected"?
4708
4709
4710 In addition to the security features mentioned above, CORE also provides
4711 one additional key feature to applications using it, and that is a
4712 limited form of protocol-compatibility checking. CORE distinguishes
4713 between TRANSPORT-level connections (which enable communication with other
4714 peers) and application-level connections. Applications using the CORE API
4715 will (typically) learn about application-level connections from CORE, and
4716 not about TRANSPORT-level connections. When a typical application uses
4717 CORE, it will specify a set of message types
4718 (from @code{gnunet_protocols.h}) that it understands. CORE will then
4719 notify the application about connections it has with other peers if and
4720 only if those applications registered an intersecting set of message
4721 types with their CORE service. Thus, it is quite possible that CORE only
4722 exposes a subset of the established direct connections to a particular
4723 application --- and different applications running above CORE might see
4724 different sets of connections at the same time.
4725
4726 A special case are applications that do not register a handler for any
4727 message type.
4728 CORE assumes that these applications merely want to monitor connections
4729 (or "all" messages via other callbacks) and will notify those applications
4730 about all connections. This is used, for example, by the
4731 @code{gnunet-core} command-line tool to display the active connections.
4732 Note that it is also possible that the TRANSPORT service has more active
4733 connections than the CORE service, as the CORE service first has to
4734 perform a key exchange with connecting peers before exchanging information
4735 about supported message types and notifying applications about the new
4736 connection.
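
The notification rule can be summarised in a few lines of C. This is a
minimal sketch, assuming that each side's supported message types are
available as a plain array of 16-bit type numbers; the function name
@code{client_sees_peer} is hypothetical and the real CORE service uses its
own type map representation.

@example
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if CORE would report the peer to this client. A client
   that registered no message types is treated as a monitor and is
   told about every peer. */
static int
client_sees_peer (const uint16_t *mine, size_t n_mine,
                  const uint16_t *theirs, size_t n_theirs)
@{
  if (0 == n_mine)
    return 1;                   /* monitoring client */
  for (size_t i = 0; i < n_mine; i++)
    for (size_t j = 0; j < n_theirs; j++)
      if (mine[i] == theirs[j])
        return 1;               /* at least one common type */
  return 0;
@}
@end example

Two applications on the same peer can therefore see different sets of
connections, because each one is matched against the remote type sets
separately.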
4737
4738 @cindex libgnunetcore
4739 @node libgnunetcore
4740 @subsection libgnunetcore
4741
4742
4743 The CORE API (defined in @file{gnunet_core_service.h}) is the basic
4744 messaging API used by P2P applications built using GNUnet. It provides
4745 applications the ability to send and receive encrypted messages to the
4746 peer's "directly" connected neighbours.
4747
As CORE connections are generally "direct" connections, applications must
4749 not assume that they can connect to arbitrary peers this way, as "direct"
4750 connections may not always be possible. Applications using CORE are
4751 notified about which peers are connected. Creating new "direct"
4752 connections must be done using the TRANSPORT API.
4753
The CORE API provides unreliable, out-of-order delivery. While the
implementation tries to ensure timely, in-order delivery, message
losses and reordering are not detected and must be tolerated by the
application. Most importantly, CORE will NOT perform retransmission if
messages could not be delivered.
4759
4760 Note that CORE allows applications to queue one message per connected
4761 peer. The rate at which each connection operates is influenced by the
preferences expressed by local applications as well as restrictions
4763 imposed by the other peer. Local applications can express their
4764 preferences for particular connections using the "performance" API of the
4765 ATS service.
4766
Applications that require more sophisticated transmission capabilities,
such as TCP-like behavior, or that need to send messages to arbitrary
remote peers should use the CADET API.
4770
4771 The typical use of the CORE API is to connect to the CORE service using
4772 @code{GNUNET_CORE_connect}, process events from the CORE service (such as
4773 peers connecting, peers disconnecting and incoming messages) and send
4774 messages to connected peers using
4775 @code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4776 cancel pending transmission requests if they receive a disconnect event
4777 for a peer that had a transmission pending; furthermore, queuing more
4778 than one transmission request per peer per application using the
4779 service is not permitted.
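
The two rules at the end of the previous paragraph (at most one pending
transmission request per peer, and cancellation on disconnect) can be
modelled as follows. This sketch does not use the CORE API at all; the
slot table and every function name in it are hypothetical and only
illustrate the bookkeeping an application has to do.

@example
#include <stddef.h>

#define MAX_PEERS 8   /* hypothetical limit for this sketch */

struct peer_slot
@{
  int connected;
  int pending;        /* 1 while a transmission request is queued */
@};

static struct peer_slot slots[MAX_PEERS];

/* Connect event: the peer becomes usable with an empty slot. */
static void
on_connect (size_t peer)
@{
  if (peer < MAX_PEERS)
  @{
    slots[peer].connected = 1;
    slots[peer].pending = 0;
  @}
@}

/* Queue a transmission request. Fails if the peer is not connected
   or already has a request pending. */
static int
request_transmit (size_t peer)
@{
  if (peer >= MAX_PEERS || ! slots[peer].connected
      || slots[peer].pending)
    return -1;
  slots[peer].pending = 1;
  return 0;
@}

/* Disconnect event: a pending request must be cancelled. */
static void
on_disconnect (size_t peer)
@{
  if (peer < MAX_PEERS)
  @{
    slots[peer].pending = 0;
    slots[peer].connected = 0;
  @}
@}
@end example

In a real application, @code{on_disconnect} is also the place to cancel
the outstanding request with the CORE service and to release the message
that was waiting to be sent.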
4780
4781 The CORE API also allows applications to monitor all communications of the
4782 peer prior to encryption (for outgoing messages) or after decryption (for
4783 incoming messages). This can be useful for debugging, diagnostics or to
4784 establish the presence of cover traffic (for anonymity). As monitoring
4785 applications are often not interested in the payload, the monitoring
4786 callbacks can be configured to only provide the message headers (including
4787 the message type and size) instead of copying the full data stream to the
4788 monitoring client.
4789
4790 The init callback of the @code{GNUNET_CORE_connect} function is called
4791 with the hash of the public key of the peer. This public key is used to
4792 identify the peer globally in the GNUnet network. Applications are
4793 encouraged to check that the provided hash matches the hash that they are
4794 using (as theoretically the application may be using a different
4795 configuration file with a different private key, which would result in
4796 hard to find bugs).
4797
4798 As with most service APIs, the CORE API isolates applications from crashes
4799 of the CORE service. If the CORE service crashes, the application will see
4800 disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.
4802
@cindex core client-service protocol
4804 @node The CORE Client-Service Protocol
4805 @subsection The CORE Client-Service Protocol
4806
4807
4808 This section describes the protocol between an application using the CORE
4809 service (the client) and the CORE service process itself.
4810
4811
4812 @menu
4813 * Setup2::
4814 * Notifications::
4815 * Sending::
4816 @end menu
4817
4818 @node Setup2
4819 @subsubsection Setup2
4820
4821
When a client connects to the CORE service, it first sends an
4823 @code{InitMessage} which specifies options for the connection and a set of
4824 message type values which are supported by the application. The options
4825 bitmask specifies which events the client would like to be notified about.
4826 The options include:
4827
@table @asis
@item GNUNET_CORE_OPTION_NOTHING
No notifications
@item GNUNET_CORE_OPTION_STATUS_CHANGE
Peers connecting and disconnecting
@item GNUNET_CORE_OPTION_FULL_INBOUND
All inbound messages (after decryption) with full payload
@item GNUNET_CORE_OPTION_HDR_INBOUND
Just the @code{MessageHeader} of all inbound messages
@item GNUNET_CORE_OPTION_FULL_OUTBOUND
All outbound messages (prior to encryption) with full payload
@item GNUNET_CORE_OPTION_HDR_OUTBOUND
Just the @code{MessageHeader} of all outbound messages
@end table
4840
4841 Typical applications will only monitor for connection status changes.
4842
4843 The CORE service responds to the @code{InitMessage} with an
4844 @code{InitReplyMessage} which contains the peer's identity. Afterwards,
4845 both CORE and the client can send messages.
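
To make the shape of such an init message concrete, here is a hedged
sketch that serializes an options bitmask followed by a list of message
types. The flag values, the message type number and the exact field layout
are assumptions made for this example; the authoritative definitions are
in the GNUnet headers. Only the general pattern (a size/type header, all
fields in network byte order) is meant to carry over.

@example
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder flag values for this sketch; the real constants may
   use different bits. */
enum sketch_option
@{
  OPT_NOTHING       = 0,
  OPT_STATUS_CHANGE = 1 << 0,
  OPT_FULL_INBOUND  = 1 << 1,
  OPT_HDR_INBOUND   = 1 << 2,
  OPT_FULL_OUTBOUND = 1 << 3,
  OPT_HDR_OUTBOUND  = 1 << 4
@};

#define SKETCH_TYPE_INIT 64   /* hypothetical message type number */

/* Serialize header (size, type), options and the supported message
   types in network byte order. Returns the message size, or 0 if it
   does not fit into the buffer or into a 16-bit size field. */
static size_t
build_init_message (uint8_t *buf, size_t cap, uint32_t options,
                    const uint16_t *types, size_t n_types)
@{
  size_t size;
  uint16_t v16;
  uint32_t v32;

  if (n_types > (UINT16_MAX - 8) / 2)
    return 0;
  size = 8 + 2 * n_types;
  if (size > cap)
    return 0;
  v16 = htons ((uint16_t) size);
  memcpy (buf, &v16, 2);
  v16 = htons (SKETCH_TYPE_INIT);
  memcpy (buf + 2, &v16, 2);
  v32 = htonl (options);
  memcpy (buf + 4, &v32, 4);
  for (size_t i = 0; i < n_types; i++)
  @{
    v16 = htons (types[i]);
    memcpy (buf + 8 + 2 * i, &v16, 2);
  @}
  return size;
@}
@end example

A typical application in this model would pass only
@code{OPT_STATUS_CHANGE} together with the message types it handles.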
4846
4847 @node Notifications
4848 @subsubsection Notifications
4849
4850
4851 The CORE will send @code{ConnectNotifyMessage}s and
4852 @code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
4853 the CORE (assuming their type maps overlap with the message types
4854 registered by the client). When the CORE receives a message that matches
4855 the set of message types specified during the @code{InitMessage} (or if
monitoring is enabled for inbound messages in the options), it sends a
4857 @code{NotifyTrafficMessage} with the peer identity of the sender and the
4858 decrypted payload. The same message format (except with
4859 @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
4860 used to notify clients monitoring outbound messages; here, the peer
4861 identity given is that of the receiver.
4862
4863 @node Sending
4864 @subsubsection Sending
4865
4866
4867 When a client wants to transmit a message, it first requests a
4868 transmission slot by sending a @code{SendMessageRequest} which specifies
4869 the priority, deadline and size of the message. Note that these values
4870 may be ignored by CORE. When CORE is ready for the message, it answers
4871 with a @code{SendMessageReady} response. The client can then transmit the
4872 payload with a @code{SendMessage} message. Note that the actual message
4873 size in the @code{SendMessage} is allowed to be smaller than the size in
4874 the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which supersedes the previous
@code{SendMessageRequest}; the earlier request is then no longer valid. The client can
4877 tell which @code{SendMessageRequest} the CORE service's
4878 @code{SendMessageReady} message is for as all of these messages contain a
4879 "unique" request ID (based on a counter incremented by the client
4880 for each request).
4881
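The request ID bookkeeping on the client side can be sketched as follows.
This is an illustrative model only: the structure and function names are
hypothetical, and the 16-bit width of the counter is an assumption. It
shows how a @code{SendMessageReady} for a superseded request is recognised
and ignored.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Client-side state for the request/ready handshake (hypothetical). */
struct ExampleSender
{
  uint16_t next_id;     /* counter incremented for each request */
  uint16_t pending_id;  /* ID of the most recent request */
  int has_pending;      /* 1 while a request awaits its "ready" */
};

/* Issue a new request.  Any earlier pending request is superseded.
   Returns the ID to place into the SendMessageRequest. */
uint16_t
example_request (struct ExampleSender *s)
{
  assert (NULL != s);
  s->pending_id = s->next_id++;
  s->has_pending = 1;
  return s->pending_id;
}

/* Handle a SendMessageReady carrying 'id'.  Returns 1 if the client
   may now send the payload, 0 if the reply belongs to a superseded
   (or unknown) request and must be ignored. */
int
example_on_ready (struct ExampleSender *s, uint16_t id)
{
  assert (NULL != s);
  if ((! s->has_pending) || (id != s->pending_id))
    return 0;
  s->has_pending = 0;
  return 1;
}
```
@end verbatim
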
4882 @cindex CORE Peer-to-Peer Protocol
4883 @node The CORE Peer-to-Peer Protocol
4884 @subsection The CORE Peer-to-Peer Protocol
4885
4886
4887
4888 @menu
4889 * Creating the EphemeralKeyMessage::
4890 * Establishing a connection::
4891 * Encryption and Decryption::
4892 * Type maps::
4893 @end menu
4894
4895 @cindex EphemeralKeyMessage creation
4896 @node Creating the EphemeralKeyMessage
4897 @subsubsection Creating the EphemeralKeyMessage
4898
4899
4900 When the CORE service starts, each peer creates a fresh ephemeral (ECC)
4901 public-private key pair and signs the corresponding
@code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity; the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
key is ONLY used for an ECDHE
(@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie--Hellman})
4907 exchange by the CORE service to establish symmetric session keys. A peer
4908 will use the same @code{EphemeralKeyMessage} for all peers for
4909 @code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
4910 will create a fresh ephemeral key (forgetting the old one) and broadcast
4911 the new @code{EphemeralKeyMessage} to all connected peers, resulting in
4912 fresh symmetric session keys. Note that peers independently decide on
4913 when to discard ephemeral keys; it is not a protocol violation to discard
4914 keys more often. Ephemeral keys are also never stored to disk; restarting
4915 a peer will thus always create a fresh ephemeral key. The use of ephemeral
4916 keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
4917
4918 Just before transmission, the @code{EphemeralKeyMessage} is patched to
reflect the current @code{sender_status}, which specifies the current state of
4920 the connection from the point of view of the sender. The possible values
4921 are:
4922
4923 @itemize @bullet
4924 @item @code{KX_STATE_DOWN} Initial value, never used on the network
@item @code{KX_STATE_KEY_SENT} The sender has sent its ephemeral key but
does not yet know the key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} The sender has received a valid
ephemeral key of the other peer, but is waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response
4930 @item @code{KX_STATE_UP} The connection is fully up from the point of
4931 view of the sender (now performing keep-alives)
4932 @item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
4933 operation; the other peer has so far failed to confirm a working
4934 connection using the new ephemeral key
4935 @end itemize
4936
4937 @node Establishing a connection
4938 @subsubsection Establishing a connection
4939
4940
Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
When a peer receives an @code{EphemeralKeyMessage} whose status indicates
that the sender does not have the receiver's ephemeral key, the receiver
sends its own @code{EphemeralKeyMessage} in response.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge, and if it matches, set the
connection to @code{KX_STATE_UP}.
4954
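The handshake can be pictured as a small state machine. The sketch below
is a simplified model under stated assumptions: it ignores rekeying,
encryption and signature checks, the @code{EX_} state names merely mirror
the states listed earlier, and all function names are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* States mirroring the text; numeric values are arbitrary here. */
enum ExampleKxState
{
  EX_KX_STATE_DOWN,
  EX_KX_STATE_KEY_SENT,
  EX_KX_STATE_KEY_RECEIVED,
  EX_KX_STATE_UP
};

struct ExampleKx
{
  enum ExampleKxState state;
  uint32_t ping_challenge;  /* challenge placed into our PingMessage */
};

/* TRANSPORT reported the connection: our ephemeral key is sent. */
void
example_on_transport_connect (struct ExampleKx *kx)
{
  assert (NULL != kx);
  kx->state = EX_KX_STATE_KEY_SENT;
}

/* A valid EphemeralKeyMessage arrived: remember that we hold the other
   peer's key and that we pinged it with 'challenge'. */
void
example_on_key (struct ExampleKx *kx, uint32_t challenge)
{
  assert (NULL != kx);
  kx->ping_challenge = challenge;
  kx->state = EX_KX_STATE_KEY_RECEIVED;
}

/* A PongMessage arrived.  Only a matching challenge brings the
   connection up; returns 1 on success, 0 otherwise. */
int
example_on_pong (struct ExampleKx *kx, uint32_t challenge)
{
  assert (NULL != kx);
  if ((EX_KX_STATE_KEY_RECEIVED != kx->state) ||
      (challenge != kx->ping_challenge))
    return 0;
  kx->state = EX_KX_STATE_UP;
  return 1;
}
```
@end verbatim
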
4955 @node Encryption and Decryption
4956 @subsubsection Encryption and Decryption
4957
4958
4959 All functions related to the key exchange and encryption/decryption of
4960 messages can be found in @file{gnunet-service-core_kx.c} (except for the
4961 cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a
@uref{https://en.wikipedia.org/wiki/Key_derivation_function, key derivation function}
is used to derive two pairs of encryption and decryption keys for AES-256
and Twofish, as well as initialization vectors and authentication keys
(for @uref{https://en.wikipedia.org/wiki/HMAC, HMAC}).
The HMAC is computed over the encrypted payload.
Encrypted messages include an @code{iv_seed} and the HMAC in the header.
4970
4971 Each encrypted message in the CORE service includes a sequence number and
4972 a timestamp in the encrypted payload. The CORE service remembers the
4973 largest observed sequence number and a bit-mask which represents which of
4974 the previous 32 sequence numbers were already used.
4975 Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} (5 minutes) off are also discarded. This of
4978 course means that system clocks need to be reasonably synchronized for
4979 peers to be able to communicate. Additionally, as the ephemeral key
4980 changes every 12 hours, a peer would not even be able to decrypt messages
4981 older than 12 hours.
4982
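The sequence number check described above is a sliding-window replay
filter. The following sketch shows one way to implement such a window
with a 32-bit bitmask. It is not the code from
@file{gnunet-service-core_kx.c}; the names are hypothetical, and the bit
layout is an assumption chosen for clarity.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replay-protection state for one session (hypothetical names). */
struct ExampleReplayWindow
{
  uint32_t last_seq;  /* largest sequence number accepted so far */
  uint32_t bitmap;    /* bit i set: (last_seq - 1 - i) was accepted */
};

/* Return 1 and update the window if 'seq' is fresh; return 0 if it is
   a duplicate or older than the 32 tracked sequence numbers. */
int
example_accept_seq (struct ExampleReplayWindow *w, uint32_t seq)
{
  uint32_t dist;

  assert (NULL != w);
  if (seq == w->last_seq)
    return 0;                     /* duplicate of the newest message */
  if (seq < w->last_seq)
  {
    uint32_t bit;

    dist = w->last_seq - seq;     /* how far behind the newest */
    if (dist > 32)
      return 0;                   /* too old to be tracked */
    bit = UINT32_C (1) << (dist - 1);
    if (0 != (w->bitmap & bit))
      return 0;                   /* already seen */
    w->bitmap |= bit;
    return 1;
  }
  dist = seq - w->last_seq;       /* how far the window advances */
  /* Old entries slide back by 'dist'; the previous newest sequence
     number lands at bit (dist - 1) if it is still inside the window. */
  if (dist > 32)
    w->bitmap = 0;
  else if (32 == dist)
    w->bitmap = UINT32_C (1) << 31;
  else
    w->bitmap = (w->bitmap << dist) | (UINT32_C (1) << (dist - 1));
  w->last_seq = seq;
  return 1;
}
```
@end verbatim

In this sketch a zero-initialised window treats sequence number 0 as
already used, so the first valid sequence number is 1.
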
4983 @node Type maps
4984 @subsubsection Type maps
4985
4986
4987 Once an encrypted connection has been established, peers begin to exchange
4988 type maps. Type maps are used to allow the CORE service to determine which
4989 (encrypted) connections should be shown to which applications. A type map
4990 is an array of 65536 bits representing the different types of messages
4991 understood by applications using the CORE service. Each CORE service
4992 maintains this map, simply by setting the respective bit for each message
4993 type supported by any of the applications using the CORE service. Note
4994 that bits for message types embedded in higher-level protocols (such as
4995 MESH) will not be included in these type maps.
4996
4997 Typically, the type map of a peer will be sparse. Thus, the CORE service
4998 attempts to compress its type map using @code{gzip}-style compression
4999 ("deflate") prior to transmission. However, if the compression fails to
5000 compact the map, the map may also be transmitted without compression
5001 (resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
5002 @code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
5003 Upon receiving a type map, the respective CORE service notifies
5004 applications about the connection to the other peer if they support any
5005 message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
5007 application, it updates its type map (setting or unsetting the respective
5008 bits) and notifies its neighbours about the change.
5009 The CORE services of the neighbours then in turn generate connect and
5010 disconnect events for the peer that sent the type map for their respective
5011 applications. As CORE messages may be lost, the CORE service confirms
5012 receiving a type map by sending back a
5013 @code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
5014 (with the correct hash of the type map) is not received, the sender will
5015 retransmit the type map (with exponential back-off).
5016
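A type map of 65536 bits fits into 8192 bytes. The sketch below shows how
such a bit array can be maintained and how a received map can be matched
against the message types a local application registered. The structure
and function names are hypothetical; compression and the confirmation
exchange are left out.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_TYPE_COUNT 65536

/* One bit per 16-bit message type: 65536 bits = 8192 bytes. */
struct ExampleTypeMap
{
  uint8_t bits[EXAMPLE_TYPE_COUNT / 8];
};

/* Reset the map so that no message type is announced. */
void
example_typemap_clear (struct ExampleTypeMap *tm)
{
  assert (NULL != tm);
  memset (tm, 0, sizeof (*tm));
}

/* Mark message type 'type' as supported. */
void
example_typemap_set (struct ExampleTypeMap *tm, uint16_t type)
{
  assert (NULL != tm);
  tm->bits[type / 8] |= (uint8_t) (1u << (type % 8));
}

/* Return 1 if message type 'type' is marked as supported. */
int
example_typemap_test (const struct ExampleTypeMap *tm, uint16_t type)
{
  assert (NULL != tm);
  return 0 != (tm->bits[type / 8] & (1u << (type % 8)));
}

/* Return 1 if an application that registered the 'n' types in 'types'
   should be told about the peer that sent 'remote'.  An application
   that registered no types at all is always notified. */
int
example_typemap_overlaps (const struct ExampleTypeMap *remote,
                          const uint16_t *types,
                          size_t n)
{
  size_t i;

  if (0 == n)
    return 1;
  for (i = 0; i < n; i++)
    if (example_typemap_test (remote, types[i]))
      return 1;
  return 0;
}
```
@end verbatim
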
5017 @cindex CADET Subsystem
5018 @cindex CADET
5019 @cindex cadet
5020 @node CADET Subsystem
5021 @section CADET Subsystem
5022
5023 The CADET subsystem in GNUnet is responsible for secure end-to-end
5024 communications between nodes in the GNUnet overlay network. CADET builds
5025 on the CORE subsystem which provides for the link-layer communication and
5026 then adds routing, forwarding and additional security to the connections.
5027 CADET offers the same cryptographic services as CORE, but on an
5028 end-to-end level. This is done so peers retransmitting traffic on behalf
5029 of other peers cannot access the payload data.
5030
5031 @itemize @bullet
5032 @item CADET provides confidentiality with so-called perfect forward
5033 secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
5034 use symmetric encryption, encrypting with both AES-256 and Twofish
@item authentication is achieved by signing the ephemeral keys using
Ed25519, an EdDSA signature scheme that produces deterministic signatures
5037 @item integrity protection (using SHA-512 to do encrypt-then-MAC, although
5038 only 256 bits are sent to reduce overhead)
5039 @item replay protection (using nonces, timestamps, challenge-response,
5040 message counters and ephemeral keys)
5041 @item liveness (keep-alive messages, timeout)
5042 @end itemize
5043
In addition to the CORE-like security benefits, CADET offers other
5045 properties that make it a more universal service than CORE.
5046
5047 @itemize @bullet
5048 @item CADET can establish channels to arbitrary peers in GNUnet. If a
5049 peer is not immediately reachable, CADET will find a path through the
5050 network and ask other peers to retransmit the traffic on its behalf.
5051 @item CADET offers (optional) reliability mechanisms. In a reliable
5052 channel traffic is guaranteed to arrive complete, unchanged and in-order.
5053 @item CADET takes care of flow and congestion control mechanisms, not
5054 allowing the sender to send more traffic than the receiver or the network
5055 are able to process.
5056 @end itemize
5057
5058 @menu
5059 * libgnunetcadet::
5060 @end menu
5061
5062 @cindex libgnunetcadet
5063 @node libgnunetcadet
5064 @subsection libgnunetcadet
5065
5066
5067 The CADET API (defined in @file{gnunet_cadet_service.h}) is the
5068 messaging API used by P2P applications built using GNUnet.
5069 It provides applications the ability to send and receive encrypted
5070 messages to any peer participating in GNUnet.
The API is heavily based on the CORE API.
5072
5073 CADET delivers messages to other peers in "channels".
5074 A channel is a permanent connection defined by a destination peer
5075 (identified by its public key) and a port number.
5076 Internally, CADET tunnels all channels towards a destination peer
5077 using one session key and relays the data on multiple "connections",
5078 independent from the channels.
5079
5080 Each channel has optional parameters, the most important being the
5081 reliability flag.
Should a message get lost at the TRANSPORT/CORE level and the channel
was created as reliable, CADET will retransmit the lost message and
deliver it in order to the destination application.
5085
5086 @pindex GNUNET_CADET_connect
5087 To communicate with other peers using CADET, it is necessary to first
5088 connect to the service using @code{GNUNET_CADET_connect}.
This function takes several parameters in the form of callbacks, to allow
the client to react to various events, like incoming channels or channels
that terminate, as well as a list of ports the client wishes to listen
to (at the moment it is not possible to start listening on further ports
once connected, but nothing prevents a client from connecting several
times to CADET, even using one connection per listening port).
5095 The function returns a handle which has to be used for any further
5096 interaction with the service.
5097
5098 @pindex GNUNET_CADET_channel_create
To connect to a remote peer, a client has to call the
@code{GNUNET_CADET_channel_create} function. The most important parameters
given are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network,
establish the proper low-level connections and do the necessary key
exchanges to ensure an authenticated, secure and verified communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
returns a handle to interact with the created channel.
5108
5109 @pindex GNUNET_CADET_notify_transmit_ready
5110 For every message the client wants to send to the remote application,
5111 @code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
5112 channel on which the message should be sent and the size of the message
5113 (but not the message itself!). Once CADET is ready to send the message,
5114 the provided callback will fire, and the message contents are provided to
5115 this callback.
5116
Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback may give 0 bytes to CADET
if no message is to be sent.
5125
@pindex GNUNET_CADET_notify_transmit_ready_cancel
If a requested transmission is no longer needed before the callback
fires, it can be canceled with
5129 @code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
5130 given back by @code{GNUNET_CADET_notify_transmit_ready}.
5131 As in the case of CORE, only one message can be requested at a time: a
5132 client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
5133 the callback is called or the request is canceled.
5134
5135 @pindex GNUNET_CADET_channel_destroy
5136 When a channel is no longer needed, a client can call
5137 @code{GNUNET_CADET_channel_destroy} to get rid of it.
5138 Note that CADET will try to transmit all pending traffic before notifying
5139 the remote peer of the destruction of the channel, including
5140 retransmitting lost messages if the channel was reliable.
5141
5142 Incoming channels, channels being closed by the remote peer, and traffic
5143 on any incoming or outgoing channels are given to the client when CADET
5144 executes the callbacks given to it at the time of
5145 @code{GNUNET_CADET_connect}.
5146
5147 @pindex GNUNET_CADET_disconnect
5148 Finally, when an application no longer wants to use CADET, it should call
5149 @code{GNUNET_CADET_disconnect}, but first all channels and pending
5150 transmissions must be closed (otherwise CADET will complain).
5151
5152 @cindex NSE Subsystem
5153 @node NSE Subsystem
5154 @section NSE Subsystem
5155
5156
5157 NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
5158 other subsystems and users with a rough estimate of the number of peers
5159 currently participating in the GNUnet overlay.
5160 The computed value is not a precise number as producing a precise number
5161 in a decentralized, efficient and secure way is impossible.
5162 While NSE's estimate is inherently imprecise, NSE also gives the expected
5163 range. For a peer that has been running in a stable network for a
5164 while, the real network size will typically (99.7% of the time) be in the
5165 range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
5166 algorithm used to calculate the estimate;
5167 all of the details can be found in this technical report.
5168
5169 @c FIXME: link to the report.
5170
5171 @menu
5172 * Motivation::
5173 * Principle::
5174 * libgnunetnse::
5175 * The NSE Client-Service Protocol::
5176 * The NSE Peer-to-Peer Protocol::
5177 @end menu
5178
5179 @node Motivation
5180 @subsection Motivation
5181
5182
5183 Some subsystems, like DHT, need to know the size of the GNUnet network to
5184 optimize some parameters of their own protocol. The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of peers
5186 infeasible. Although there are several decentralized algorithms to count
5187 the number of peers in a system, so far there is none to do so securely.
5188 Other protocols may allow any malicious peer to manipulate the final
5189 result or to take advantage of the system to perform
5190 @dfn{Denial of Service} (DoS) attacks against the network.
5191 GNUnet's NSE protocol avoids these drawbacks.
5192
5193
5194
5195 @menu
5196 * Security::
5197 @end menu
5198
5199 @cindex NSE security
5200 @cindex nse security
5201 @node Security
5202 @subsubsection Security
5203
5204
5205 The NSE subsystem is designed to be resilient against these attacks.
5206 It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
5207 to prevent one peer from impersonating a large number of participants,
5208 which would otherwise allow an adversary to artificially inflate the
5209 estimate.
5210 The DoS protection comes from the time-based nature of the protocol:
5211 the estimates are calculated periodically and out-of-time traffic is
5212 either ignored or stored for later retransmission by benign peers.
5213 In particular, peers cannot trigger global network communication at will.
5214
5215 @cindex NSE principle
5216 @cindex nse principle
5217 @node Principle
5218 @subsection Principle
5219
5220
5221 The algorithm calculates the estimate by finding the globally closest
5222 peer ID to a random, time-based value.
5223
5224 The idea is that the closer the ID is to the random value, the more
5225 "densely packed" the ID space is, and therefore, more peers are in the
5226 network.
5227
5228
5229
5230 @menu
5231 * Example::
5232 * Algorithm::
5233 * Target value::
5234 * Timing::
5235 * Controlled Flooding::
5236 * Calculating the estimate::
5237 @end menu
5238
5239 @node Example
5240 @subsubsection Example
5241
5242
5243 Suppose all peers have IDs between 0 and 100 (our ID space), and the
5244 random value is 42.
If the closest peer has the ID 70 we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space. On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky, and there is only one peer
and it happens to have the ID 44. Thus, the current estimate is calculated
5252 as the average over multiple rounds, and not just a single sample.
5253
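The intuition of this example can be written down as a short function.
This is only the back-of-the-envelope reasoning from the example (ID space
divided by distance), not the formula the NSE subsystem uses; the function
name is hypothetical.

@verbatim
```c
#include <assert.h>

/* Rough single-sample estimate: if the closest peer is 'distance'
   away from the target in an ID space of size 'space', assume peers
   are spaced about 'distance' apart. */
unsigned int
example_naive_estimate (unsigned int space,
                        unsigned int target,
                        unsigned int closest_id)
{
  unsigned int distance;

  assert (0 != space);
  distance = (closest_id > target)
             ? closest_id - target
             : target - closest_id;
  if (0 == distance)
    return space;   /* avoid dividing by zero; treat as fully packed */
  return space / distance;
}
```
@end verbatim

With an ID space of 100 and the target 42, a closest ID of 70 gives
100 / 28 = 3 (integer division) and a closest ID of 44 gives 100 / 2 = 50,
matching the numbers above.
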
5254 @node Algorithm
5255 @subsubsection Algorithm
5256
5257
5258 Given that example, one can imagine that the job of the subsystem is to
5259 efficiently communicate the ID of the closest peer to the target value
5260 to all the other peers, who will calculate the estimate from it.
5261
5262 @node Target value
5263 @subsubsection Target value
5264
5265
5266
5267 The target value itself is generated by hashing the current time, rounded
5268 down to an agreed value. If the rounding amount is 1h (default) and the
5269 time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated once per rounding amount (in this example, every hour).
5271 Every repetition is called a round.
5272
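Rounding the time down to the start of the round is a single modulo
operation. The sketch below works on plain seconds and uses a hypothetical
function name; the subsequent hashing step is omitted.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Round 'now_s' (in seconds) down to the start of its round, where a
   round lasts 'interval_s' seconds. */
uint64_t
example_round_start (uint64_t now_s, uint64_t interval_s)
{
  assert (0 != interval_s);
  return now_s - (now_s % interval_s);
}
```
@end verbatim

For 12:34:56 (45296 seconds after midnight) and a rounding amount of one
hour (3600 seconds), the result is 43200 seconds, that is 12:00:00.
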
5273 @node Timing
5274 @subsubsection Timing
5275
5276
The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period,
let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger it will be earlier and if it is
smaller (the most likely case) it will be later. This ensures that the
peers closest to the target value start broadcasting their ID first.
5288
5289 @node Controlled Flooding
5290 @subsubsection Controlled Flooding
5291
5292
5293
5294 When a peer receives a value, first it verifies that it is closer than the
5295 closest value it had so far, otherwise it answers the incoming message
5296 with a message containing the better value. Then it checks a proof of
5297 work that must be included in the incoming message, to ensure that the
5298 other peer's ID is not made up (otherwise a malicious peer could claim to
5299 have an ID of exactly the target value every round). Once validated, it
5300 compares the broadcast time of the received value with the current time
5301 and if it's not too early, sends the received value to its neighbors.
5302 Otherwise it stores the value until the correct broadcast time comes.
5303 This prevents unnecessary traffic of sub-optimal values, since a better
5304 value can come before the broadcast time, rendering the previous one
5305 obsolete and saving the traffic that would have been used to broadcast it
5306 to the neighbors.
5307
5308 @node Calculating the estimate
5309 @subsubsection Calculating the estimate
5310
5311
5312
5313 Once the closest ID has been spread across the network each peer gets the
5314 exact distance between this ID and the target value of the round and
5315 calculates the estimate with a mathematical formula described in the tech
5316 report. The estimate generated with this method for a single round is not
very precise. Remember the case of the example, where the only peer has the
ID 44 and we happen to generate the target value 42, thinking there are
50 peers in the network. Therefore, the NSE subsystem remembers the last
64 estimates and calculates an average over them, giving a result which
5321 usually has one bit of uncertainty (the real size could be half of the
5322 estimate or twice as much). Note that the actual network size is
5323 calculated in powers of two of the raw input, thus one bit of uncertainty
5324 means a factor of two in the size estimate.
5325
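Keeping the last 64 estimates is naturally done with a ring buffer. The
following sketch computes a plain arithmetic mean over the stored values;
whether the service weights the samples differently is not stated here,
so treat the unweighted mean and all names as assumptions.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_HISTORY_SIZE 64

/* Ring buffer holding the most recent per-round estimates. */
struct ExampleHistory
{
  double samples[EXAMPLE_HISTORY_SIZE];
  size_t count;  /* number of valid samples, at most 64 */
  size_t next;   /* slot that the next sample overwrites */
};

/* Record the estimate of a finished round, dropping the oldest one
   once 64 estimates are stored. */
void
example_history_add (struct ExampleHistory *h, double estimate)
{
  assert (NULL != h);
  h->samples[h->next] = estimate;
  h->next = (h->next + 1) % EXAMPLE_HISTORY_SIZE;
  if (h->count < EXAMPLE_HISTORY_SIZE)
    h->count++;
}

/* Average over all stored estimates; needs at least one sample. */
double
example_history_mean (const struct ExampleHistory *h)
{
  double sum = 0.0;
  size_t i;

  assert ((NULL != h) && (0 != h->count));
  for (i = 0; i < h->count; i++)
    sum += h->samples[i];
  return sum / (double) h->count;
}
```
@end verbatim
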
5326 @cindex libgnunetnse
5327 @node libgnunetnse
5328 @subsection libgnunetnse
5329
5330
5331
5332 The NSE subsystem has the simplest API of all services, with only two
5333 calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
5334
5335 The connect call gets a callback function as a parameter and this function
5336 is called each time the network agrees on an estimate. This usually is
5337 once per round, with some exceptions: if the closest peer has a late
5338 local clock and starts spreading its ID after everyone else agreed on a
5339 value, the callback might be activated twice in a round, the second value
5340 being always bigger than the first. The default round time is set to
5341 1 hour.
5342
5343 The disconnect call disconnects from the NSE subsystem and the callback
5344 is no longer called with new estimates.
5345
5346
5347
5348 @menu
5349 * Results::
5350 * libgnunetnse - Examples::
5351 @end menu
5352
5353 @node Results
5354 @subsubsection Results
5355
5356
5357
5358 The callback provides two values: the average and the
5359 @uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
5360 of the last 64 rounds. The values provided by the callback function are
logarithmic; this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize
5373
The expected standard deviation for 64 rounds in a network of stable size
5375 is 0.2. Thus, we can say that normally:
5376
5377 @itemize @bullet
5378 @item 68% of the time the real size is in the range [-13%, +15%]
5379 @item 95% of the time the real size is in the range [-24%, +32%]
5380 @item 99.7% of the time the real size is in the range [-34%, +52%]
5381 @end itemize
5382
As said in the introduction, we can be quite sure that usually the real
size is between two thirds and three halves of the estimate. This can of
course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size is
changing rapidly).
5390
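An application can turn the two logarithmic values into a confidence
interval by first widening the range in the logarithmic domain and then
raising 2 to the bounds. The sketch below does the first step for any
input and the second step only for whole-number exponents; fractional
exponents would need a function such as @code{exp2} from
@file{math.h}. All names are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Bounds of the estimate in the logarithmic (base 2) domain. */
struct ExampleLogRange
{
  double lo;
  double hi;
};

/* Range covering the real size within 'sigmas' standard deviations,
   still expressed as exponents of 2. */
struct ExampleLogRange
example_log_range (double average, double stddev, unsigned int sigmas)
{
  struct ExampleLogRange r;

  assert (stddev >= 0.0);
  r.lo = average - sigmas * stddev;
  r.hi = average + sigmas * stddev;
  return r;
}

/* 2^exponent for whole-number exponents below 64. */
uint64_t
example_pow2 (unsigned int exponent)
{
  assert (exponent < 64);
  return UINT64_C (1) << exponent;
}
```
@end verbatim

For an average of 10 and a standard deviation of 1, two standard
deviations give the exponents 8 and 12, that is a range of 256 to 4096
peers around the estimate of 1024.
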
5391 @node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples
5393
5394
5395
Let's close with a couple of examples.
5397
@table @asis

@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. (The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.)

@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 million peers. (The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.3M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.)

@end table
5412
To put this in perspective: the LHC Higgs boson results were announced
with "5 sigma" and "6 sigma" certainties. In this case, a 5 sigma minimum
would be 2 million and a 6 sigma minimum 1.8 million.
5417
5418 @node The NSE Client-Service Protocol
5419 @subsection The NSE Client-Service Protocol
5420
5421
5422
As with the API, the client-service protocol is very simple; it has only
two different messages, defined in @file{src/nse/nse.h}:
5425
5426 @itemize @bullet
5427 @item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
5428 and is sent from the client to the service upon connection.
5429 @item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
5430 the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
5432 deviation for the respective round.
5433 @end itemize
5434
5435 When the @code{GNUNET_NSE_disconnect} API call is executed, the client
5436 simply disconnects from the service, with no message involved.
5437
5438 @cindex NSE Peer-to-Peer Protocol
5439 @node The NSE Peer-to-Peer Protocol
5440 @subsection The NSE Peer-to-Peer Protocol
5441
5442
5443 @pindex GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD
5444 The NSE subsystem only has one message in the P2P protocol, the
5445 @code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
5446
The key contents of this message are the timestamp to identify the round
5448 (differences in system clocks may cause some peers to send messages way
5449 too early or way too late, so the timestamp allows other peers to
5450 identify such messages easily), the
5451 @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
5452 used to make it difficult to mount a
5453 @uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
5454 public key, which is used to verify the signature on the message.
5455
5456 Every peer stores a message for the previous, current and next round. The
5457 messages for the previous and current round are given to peers that
5458 connect to us. The message for the next round is simply stored until our
5459 system clock advances to the next round. The message for the current round
5460 is what we are flooding the network with right now.
5461 At the beginning of each round the peer does the following:
5462
5463 @itemize @bullet
5464 @item calculates its own distance to the target value
5465 @item creates, signs and stores the message for the current round (unless
5466 it has a better message in the "next round" slot which came early in the
5467 previous round)
5468 @item calculates, based on the stored round message (own or received) when
5469 to start flooding it to its neighbors
5470 @end itemize
5471
5472 Upon receiving a message the peer checks the validity of the message
5473 (round, proof of work, signature). The next action depends on the
5474 contents of the incoming message:
5475
5476 @itemize @bullet
5477 @item if the message is worse than the current stored message, the peer
5478 sends the current message back immediately, to stop the other peer from
5479 spreading suboptimal results
5480 @item if the message is better than the current stored message, the peer
5481 stores the new message and calculates the new target time to start
5482 spreading it to its neighbors (excluding the one the message came from)
5483 @item if the message is for the previous round, it is compared to the
5484 message stored in the "previous round slot", which may then be updated
5485 @item if the message is for the next round, it is compared to the message
5486 stored in the "next round slot", which again may then be updated
5487 @end itemize
5488
Finally, when it is time to send the stored message for the current round
to the neighbors, a random delay is added for each neighbor, to avoid
traffic spikes and minimize cross-messages.
5492
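The handling of an incoming flood message can be sketched as a lookup of
the matching round slot followed by a comparison. The sketch assumes that
validity (round, proof of work, signature) was already checked and that
"better" can be expressed as a larger closeness value; that
representation and all names are assumptions for illustration.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ExampleAction
{
  EXAMPLE_IGNORE,        /* unknown round, or nothing new learned */
  EXAMPLE_REPLY_BETTER,  /* send our current message back */
  EXAMPLE_STORE          /* incoming message replaced the stored one */
};

/* Best message known for three consecutive rounds, represented only
   by its closeness to the target (larger means closer). */
struct ExampleRounds
{
  uint64_t current_round;  /* index of the round in progress */
  uint32_t best[3];        /* [0] previous, [1] current, [2] next */
};

/* Process a validated flood message for 'round' with 'closeness'. */
enum ExampleAction
example_handle_flood (struct ExampleRounds *r,
                      uint64_t round,
                      uint32_t closeness)
{
  size_t slot;

  assert (NULL != r);
  if (round + 1 == r->current_round)
    slot = 0;
  else if (round == r->current_round)
    slot = 1;
  else if (round == r->current_round + 1)
    slot = 2;
  else
    return EXAMPLE_IGNORE;
  if (closeness > r->best[slot])
  {
    r->best[slot] = closeness;
    return EXAMPLE_STORE;
  }
  /* Only for the current round do we push back a better message. */
  if ((1 == slot) && (closeness < r->best[slot]))
    return EXAMPLE_REPLY_BETTER;
  return EXAMPLE_IGNORE;
}
```
@end verbatim

A caller that receives @code{EXAMPLE_STORE} for the current round would
then recompute the time at which to start spreading the new message.
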
5493 @cindex HOSTLIST Subsystem
5494 @node HOSTLIST Subsystem
5495 @section HOSTLIST Subsystem
5496
5497
5498
5499 Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
5501 store and exchange peer addresses.
5502 GNUnet provides several methods for peers to obtain this information:
5503
5504 @itemize @bullet
5505 @item out-of-band exchange of HELLO messages (manually, using for example
5506 gnunet-peerinfo)
5507 @item HELLO messages shipped with GNUnet (automatic with distribution)
5508 @item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
5509 @item topology gossiping (learning from other peers we already connected
5510 to), and
5511 @item the HOSTLIST daemon covered in this section, which is particularly
5512 relevant for bootstrapping new peers.
5513 @end itemize
5514
5515 New peers have no existing connections (and thus cannot learn from gossip
5516 among peers), may not have other peers in their LAN and might be started
5517 with an outdated set of HELLO messages from the distribution.
5518 In this case, getting new peers to connect to the network requires either
5519 manual effort or the use of a HOSTLIST to obtain HELLOs.
5520
5521 @menu
5522 * HELLOs::
5523 * Overview for the HOSTLIST subsystem::
5524 * Interacting with the HOSTLIST daemon::
5525 * Hostlist security address validation::
5526 * The HOSTLIST daemon::
5527 * The HOSTLIST server::
5528 * The HOSTLIST client::
5529 * Usage::
5530 @end menu
5531
5532 @node HELLOs
5533 @subsection HELLOs
5534
5535
5536
5537 The basic information peers require to connect to other peers is
5538 contained in so-called HELLO messages, which you can think of as business
5539 cards. Besides the identity of the peer (based on the cryptographic public
5540 key), a HELLO message may contain address information that specifies ways
5541 to contact a peer. By obtaining HELLO messages, a peer can learn how to
5542 contact other peers.
5543
5544 @node Overview for the HOSTLIST subsystem
5545 @subsection Overview for the HOSTLIST subsystem
5546
5547
5548
5549 The HOSTLIST subsystem provides a way to distribute and obtain contact
5550 information to connect to other peers using a simple HTTP GET request.
5551 Its implementation is split into three parts: the main file for the
5552 daemon itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to
5553 download peer information (@file{hostlist-client.c}) and the server
5554 component used to provide this information to other peers
5555 (@file{hostlist-server.c}). The server is basically a small HTTP web
5556 server (based on GNU libmicrohttpd) which provides a list of HELLOs known
5557 to the local peer for download. The client component is basically an HTTP
5558 client (based on libcurl) which can download hostlists from one or more
5559 websites. The hostlist format is a binary blob containing a sequence of
5560 HELLO messages. Note that any HTTP server can theoretically serve a
5561 hostlist; the built-in hostlist server simply makes it convenient to
5562 offer this service.
5563
5564
5565 @menu
5566 * Features::
5567 * HOSTLIST - Limitations::
5568 @end menu
5569
5570 @node Features
5571 @subsubsection Features
5572
5573
5574
5575 The HOSTLIST daemon can:
5576
5577 @itemize @bullet
5578 @item provide HELLO messages with validated addresses obtained from
5579 PEERINFO for other peers to download
5580 @item download HELLO messages and forward these messages to the TRANSPORT
5581 subsystem for validation
5582 @item advertise the URL of this peer's hostlist address to other peers
5583 via gossip
5584 @item automatically learn about hostlist servers from the gossip of other
5585 peers
5586 @end itemize
5587
5588 @node HOSTLIST - Limitations
5589 @subsubsection HOSTLIST - Limitations
5590
5591
5592
5593 The HOSTLIST daemon does not:
5594
5595 @itemize @bullet
5596 @item verify the cryptographic information in the HELLO messages
5597 @item verify the address information in the HELLO messages
5598 @end itemize
5599
5600 @node Interacting with the HOSTLIST daemon
5601 @subsection Interacting with the HOSTLIST daemon
5602
5603
5604
5605 The HOSTLIST subsystem is currently implemented as a daemon, so there is
5606 no need for the user to interact with it and therefore there is no
5607 command line tool and no API to communicate with the daemon. In the
5608 future, we can envision changing this to allow users to manually trigger
5609 the download of a hostlist.
5610
5611 Since there is no command line interface to interact with HOSTLIST, the
5612 only way to interact with the hostlist is to use STATISTICS to obtain or
5613 modify information about the status of HOSTLIST:
5614
5615 @example
5616 $ gnunet-statistics -s hostlist
5617 @end example
5618
5619 @noindent
5620 In particular, HOSTLIST includes a @strong{persistent} value in statistics
5621 that specifies when the hostlist server might be queried next. As this
5622 value increases exponentially during runtime, developers may want to
5623 reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
5624 to be shut down if changes to this value are to have any effect on the
5625 daemon (as HOSTLIST does not monitor STATISTICS for changes to the
5626 download frequency).
5627
5628 @node Hostlist security address validation
5629 @subsection Hostlist security address validation
5630
5631
5632
5633 Since information obtained from other parties cannot be trusted without
5634 validation, we have to distinguish between @emph{validated} and
5635 @emph{not validated} addresses. Before using (and so trusting)
5636 information from other parties, this information has to be double-checked
5637 (validated). Address validation is not done by HOSTLIST but by the
5638 TRANSPORT service.
5639
5640 The HOSTLIST component is functionally located between the PEERINFO and
5641 the TRANSPORT subsystem. When acting as a server, the daemon obtains valid
5642 (@emph{validated}) peer information (HELLO messages) from the PEERINFO
5643 service and provides it to other peers. When acting as a client, it
5644 contacts the HOSTLIST servers specified in the configuration, downloads
5645 the (unvalidated) list of HELLO messages and forwards this information
5646 to the TRANSPORT service to validate the addresses.
5647
5648 @cindex HOSTLIST daemon
5649 @node The HOSTLIST daemon
5650 @subsection The HOSTLIST daemon
5651
5652
5653
5654 The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5655 started by the ARM service and (if configured) starts the HOSTLIST client
5656 and server components.
5657
5658 @pindex GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
5659 If the daemon provides a hostlist itself, it can advertise its own
5660 hostlist to other peers. To do so, it sends a
5661 @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
5662 when they connect to this peer on the CORE level. This hostlist
5663 advertisement message contains the URL to access the HOSTLIST HTTP
5664 server of the sender. The daemon may also subscribe to this type of
5665 message from the CORE service and then forward messages of this kind to
5666 the HOSTLIST client. The client then uses all available URLs to download
5667 peer information when necessary.
5668
5669 When starting, the HOSTLIST daemon first connects to the CORE subsystem
5670 and, if hostlist learning is enabled, registers a CORE handler to receive
5671 messages of this kind. Next it starts (if configured) the client and
5672 server. It passes pointers to the CORE connect, disconnect and receive
5673 handlers, in which the client and server store their functions, so that
5674 the daemon can notify them about CORE events.
5675
5676 To clean up on shutdown, the daemon has a cleaning task, shutting down all
5677 subsystems and disconnecting from CORE.
5678
5679 @cindex HOSTLIST server
5680 @node The HOSTLIST server
5681 @subsection The HOSTLIST server
5682
5683
5684
5685 The server provides a way for other peers to obtain HELLOs. Basically it
5686 is a small web server other peers can connect to and download a list of
5687 HELLOs using standard HTTP; it may also advertise the URL of the hostlist
5688 to other peers connecting on CORE level.
5689
5690
5691 @menu
5692 * The HTTP Server::
5693 * Advertising the URL::
5694 @end menu
5695
5696 @node The HTTP Server
5697 @subsubsection The HTTP Server
5698
5699
5700
5701 During startup, the server starts a web server listening on the port
5702 specified with the HTTPPORT value (default 8080). In addition it connects
5703 to the PEERINFO service to obtain peer information. The HOSTLIST server
5704 uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
5705 information for all peers and adds their information to a new hostlist if
5706 they are suitable (expired addresses and HELLOs without addresses are both
5707 not suitable) and the maximum size for a hostlist is not exceeded
5708 (@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5709 When PEERINFO finishes (with a last NULL callback), the server destroys
5710 the previous hostlist response available for download on the web server
5711 and replaces it with the updated hostlist. The hostlist format is
5712 basically a sequence of HELLO messages (as obtained from PEERINFO) without
5713 any special tokenization. Since each HELLO message contains a size field,
5714 the response can easily be split into separate HELLO messages by the
5715 client.
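
The splitting step can be sketched as follows. This is a minimal,
self-contained illustration and not the code from
@file{hostlist-client.c}: it assumes that every message starts with the
common GNUnet message header (a 16-bit size followed by a 16-bit type,
both in network byte order), and the names @code{split_hostlist},
@code{MessageCallback} and @code{count_message} are hypothetical.

@example
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the common message header: 16-bit size, 16-bit type. */
#define HEADER_SIZE 4

/* Hypothetical callback, invoked once per complete message. */
typedef void (*MessageCallback) (void *cls, const uint8_t *msg,
                                 uint16_t size);

/* Split BUF (LEN bytes) into messages using the big-endian size field at
   the start of each message.  Returns the number of bytes consumed (a
   trailing partial message is left for the caller to complete), or -1 if
   a size field is smaller than the header itself. */
static long
split_hostlist (const uint8_t *buf, size_t len, MessageCallback cb,
                void *cls)
@{
  size_t off = 0;

  while (len - off >= HEADER_SIZE)
  @{
    uint16_t size = (uint16_t) ((buf[off] << 8) | buf[off + 1]);

    if (size < HEADER_SIZE)
      return -1;  /* malformed input */
    if (size > len - off)
      break;      /* incomplete message: wait for more data */
    cb (cls, &buf[off], size);
    off += size;
  @}
  return (long) off;
@}

/* Example callback: counts messages; CLS points to an unsigned int. */
static void
count_message (void *cls, const uint8_t *msg, uint16_t size)
@{
  (void) msg;
  (void) size;
  (*(unsigned int *) cls)++;
@}
@end example

@noindent
Bytes that remain after the returned offset belong to a message that has
not been received completely yet, so a caller would keep them and retry
once more data has arrived.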
5716
5717 A HOSTLIST client connecting to the HOSTLIST server will receive the
5718 hostlist as an HTTP response, and the server will terminate the
5719 connection with the result code @code{HTTP 200 OK}.
5720 The connection will be closed immediately if no hostlist is available.
5721
5722 @node Advertising the URL
5723 @subsubsection Advertising the URL
5724
5725
5726
5727 The server also advertises the URL to download the hostlist to other peers
5728 if hostlist advertisement is enabled.
5729 When a new peer connects and has hostlist learning enabled, the server
5730 sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5731 peer using the CORE service.
5732
5733 @cindex HOSTLIST client
5734 @node The HOSTLIST client
5735 @subsection The HOSTLIST client
5736
5737
5738
5739 The client provides the functionality to download the list of HELLOs from
5740 a set of URLs.
5741 It performs a standard HTTP request to the URLs configured and learned
5742 from advertisement messages received from other peers. When a HELLO is
5743 downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5744 service for validation.
5745
5746 The client supports two modes of operation:
5747
5748 @itemize @bullet
5749 @item download of HELLOs (bootstrapping)
5750 @item learning of URLs
5751 @end itemize
5752
5753 @menu
5754 * Bootstrapping::
5755 * Learning::
5756 @end menu
5757
5758 @node Bootstrapping
5759 @subsubsection Bootstrapping
5760
5761
5762
5763 For bootstrapping, it schedules a task to download the hostlist from the
5764 set of known URLs.
5765 The downloads are only performed if the number of current
5766 connections is smaller than a minimum number of connections
5767 (at the moment 4).
5768 The interval between downloads increases exponentially; however, the
5769 exponential growth stops once the interval becomes longer than an hour.
5770 At that point, the interval is capped at
5771 (#number of connections * 1h).
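
The delay computation described above might look roughly like the
following sketch. It is not taken from the client source; the function
name @code{next_download_delay} is hypothetical, and the sketch
additionally assumes that the multiplier is at least 1, so that a peer
without any connections still ends up with a cap of one hour rather than
zero.

@example
#include <stdint.h>

#define HOUR_MS (60ULL * 60ULL * 1000ULL)

/* Double DELAY_MS and clamp the result to CONNECTIONS hours.
   ASSUMPTION: the multiplier is never below 1, so the cap is never
   shorter than one hour. */
static uint64_t
next_download_delay (uint64_t delay_ms, unsigned int connections)
@{
  uint64_t factor = (connections > 0) ? connections : 1;
  uint64_t cap = factor * HOUR_MS;
  uint64_t next = delay_ms * 2;

  if (next > cap)
    next = cap;
  return next;
@}
@end example

@noindent
Because the cap is never below one hour in this sketch, intervals shorter
than an hour always simply double.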
5772
5773 Once the decision has been taken to download HELLOs, the daemon chooses a
5774 random URL from the list of known URLs. URLs can be configured in the
5775 configuration or be learned from advertisement messages.
5776 The client uses an HTTP client library (libcurl) to initiate the download
5777 using the libcurl multi interface.
5778 Libcurl passes the data to the @code{callback_download} function, which
5779 stores the data in a buffer if space is available and the maximum size for
5780 a hostlist download is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000).
5781 When a full HELLO has been downloaded, the HOSTLIST client offers this
5782 HELLO message to the TRANSPORT service for validation.
5783 When the download has finished or failed, statistical information about
5784 the quality of this URL is updated.
5785
5786 @cindex HOSTLIST learning
5787 @node Learning
5788 @subsubsection Learning
5789
5790
5791
5792 The client also manages hostlist advertisements from other peers. The
5793 HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5794 messages to the client subsystem, which extracts the URL from the message.
5795 Next, a test of the newly obtained URL is performed by triggering a
5796 download from the new URL. If the URL works correctly, it is added to the
5797 list of working URLs.
5798
5799 The size of the list of URLs is restricted, so if an additional server is
5800 added and the list is full, the URL with the worst quality ranking
5801 (determined, for example, by successful downloads and the number of
5802 HELLOs) is discarded. During shutdown the list of URLs is saved to a file
5803 for persistence and loaded on startup. URLs from the configuration file
5804 are never discarded.
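
Choosing the entry to discard could be sketched as below. The structure
@code{LearnedUrl}, its @code{quality} field and the function
@code{find_worst_url} are hypothetical; how the quality value is computed
is deliberately left open, and the sketch assumes that only learned URLs
(not those from the configuration file) are kept in the list it scans.

@example
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one learned hostlist URL. */
struct LearnedUrl
@{
  const char *url;
  uint64_t quality;  /* higher is better */
@};

/* Return the index of the entry with the lowest quality, or -1 if the
   list is empty.  Ties go to the earliest entry. */
static long
find_worst_url (const struct LearnedUrl *list, size_t count)
@{
  long worst = -1;

  for (size_t i = 0; i < count; i++)
    if ( (-1 == worst) ||
         (list[i].quality < list[(size_t) worst].quality) )
      worst = (long) i;
  return worst;
@}
@end example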
5805
5806 @node Usage
5807 @subsection Usage
5808
5809
5810
5811 To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
5812 section for the ARM services. This is done in the default configuration.
5813
5814 For more information on how to configure the HOSTLIST subsystem see the
5815 installation handbook:@
5816 Configuring the hostlist to bootstrap@
5817 Configuring your peer to provide a hostlist
5818
5819 @cindex IDENTITY Subsystem
5820 @node IDENTITY Subsystem
5821 @section IDENTITY Subsystem
5822
5823
5824
5825 Identities of "users" in GNUnet are called egos.
5826 Egos can be used as pseudonyms ("fake names") or be tied to an
5827 organization (for example, "GNU") or even the actual identity of a human.
5828 GNUnet users are expected to have many egos. They might have one tied to
5829 their real identity, some for organizations they manage, and more for
5830 different domains where they want to operate under a pseudonym.
5831
5832 The IDENTITY service allows users to manage their egos. The identity
5833 service manages the private keys of the local user's egos; it does not
5834 manage identities of other users (public keys). Public keys for other
5835 users need names to become manageable. GNUnet uses the
5836 @dfn{GNU Name System} (GNS) to give names to other users and manage their
5837 public keys securely. This chapter is about the IDENTITY service,
5838 which is about the management of private keys.
5839
5840 On the network, an ego corresponds to an ECDSA key (over Curve25519,
5841 using RFC 6979, as required by GNS). Thus, users can perform actions
5842 under a particular ego by using (signing with) a particular private key.
5843 Other users can then confirm that the action was really performed by that
5844 ego by checking the signature against the respective public key.
5845
5846 The IDENTITY service allows users to associate a human-readable name with
5847 each ego. This way, users can use names that will remind them of the
5848 purpose of a particular ego.
5849 The IDENTITY service will store the respective private keys and
5850 allows applications to access key information by name.
5851 Users can change the name that is locally (!) associated with an ego.
5852 Egos can also be deleted, which means that the private key will be removed
5853 and it thus will not be possible to perform actions with that ego in the
5854 future.
5855
5856 Additionally, the IDENTITY subsystem can associate service functions with
5857 egos.
5858 For example, GNS requires the ego that should be used for the shorten
5859 zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5860 The IDENTITY service has a mapping of such service strings to the name of
5861 the ego that the user wants to use for this service, for example
5862 "my-short-zone-ego".
5863
5864 Finally, the IDENTITY API provides access to a special ego, the
5865 anonymous ego. The anonymous ego is special in that its private key is not
5866 really private, but fixed and known to everyone.
5867 Thus, anyone can perform actions as anonymous. This is useful because,
5868 with this trick, code does not have to contain a special case to
5869 distinguish between anonymous and pseudonymous egos.
5870
5871 @menu
5872 * libgnunetidentity::
5873 * The IDENTITY Client-Service Protocol::
5874 @end menu
5875
5876 @cindex libgnunetidentity
5877 @node libgnunetidentity
5878 @subsection libgnunetidentity
5879
5880
5881
5882 @menu
5883 * Connecting to the service::
5884 * Operations on Egos::
5885 * The anonymous Ego::
5886 * Convenience API to lookup a single ego::
5887 * Associating egos with service functions::
5888 @end menu
5889
5890 @node Connecting to the service
5891 @subsubsection Connecting to the service
5892
5893
5894
5895 First, typical clients connect to the identity service using
5896 @code{GNUNET_IDENTITY_connect}. This function takes a callback as a
5897 parameter.
5898 If the given callback parameter is non-NULL, it will be invoked to notify
5899 the application about the current state of the identities in the system.
5900
5901 @itemize @bullet
5902 @item First, it will be invoked on all known egos at the time of the
5903 connection. For each ego, a handle to the ego and the user's name for the
5904 ego will be passed to the callback. Furthermore, a @code{void **} context
5905 argument will be provided which gives the client the opportunity to
5906 associate some state with the ego.
5907 @item Second, the callback will be invoked with NULL for the ego, the name
5908 and the context. This signals that the (initial) iteration over all egos
5909 has completed.
5910 @item Then, the callback will be invoked whenever something changes about
5911 an ego.
5912 If an ego is renamed, the callback is invoked with the ego handle of the
5913 ego that was renamed, and the new name. If an ego is deleted, the callback
5914 is invoked with the ego handle and a name of NULL. In the deletion case,
5915 the application should also release resources stored in the context.
5916 @item When the application destroys the connection to the identity service
5917 using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
5918 with the ego and a name of NULL (equivalent to deletion of the egos).
5919 This should again be used to clean up the per-ego context.
5920 @end itemize
5921
5922 The ego handle passed to the callback remains valid until the callback is
5923 invoked with a name of NULL, so it is safe to store a reference to the
5924 ego's handle.
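
The cases listed above can be told apart by looking only at which
arguments are NULL. The following self-contained sketch illustrates that
decision; it does not call the IDENTITY API. The forward declaration is
merely an opaque stand-in for the ego handle type, and
@code{enum EgoEvent} and @code{classify_ego_event} are hypothetical
names.

@example
#include <stddef.h>

/* Opaque stand-in for the ego handle type of the IDENTITY API. */
struct GNUNET_IDENTITY_Ego;

enum EgoEvent
@{
  EGO_EVENT_PRESENT,     /* ego listed, created or renamed */
  EGO_EVENT_GONE,        /* ego deleted, or connection being closed */
  EGO_EVENT_END_OF_LIST  /* initial iteration has completed */
@};

/* Decide what an invocation of the callback means, based on the ego
   handle and the name it was given. */
static enum EgoEvent
classify_ego_event (const struct GNUNET_IDENTITY_Ego *ego,
                    const char *name)
@{
  if (NULL == ego)
    return EGO_EVENT_END_OF_LIST;
  if (NULL == name)
    return EGO_EVENT_GONE;
  return EGO_EVENT_PRESENT;
@}
@end example

@noindent
In the @code{EGO_EVENT_GONE} case a real callback would also free
whatever it stored in the per-ego context and drop any saved reference to
the ego handle.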
5925
5926 @node Operations on Egos
5927 @subsubsection Operations on Egos
5928
5929
5930
5931 Given an ego handle, the main operations are to get its associated private
5932 key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
5933 public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
5934
5935 The other operations on egos are pretty straightforward.
5936 Using @code{GNUNET_IDENTITY_create}, an application can request the
5937 creation of an ego by specifying the desired name.
5938 The operation will fail if that name is
5939 already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
5940 existing ego can be changed. Finally, egos can be deleted using
5941 @code{GNUNET_IDENTITY_delete}. All of these operations will trigger
5942 updates to the callback given to the @code{GNUNET_IDENTITY_connect}
5943 function of all applications that are connected with the identity service
5944 at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
5945 operations before the respective continuations would be called.
5946 It is not guaranteed that the operation will not be completed anyway,
5947 only the continuation will no longer be called.
5948
5949 @node The anonymous Ego
5950 @subsubsection The anonymous Ego
5951
5952
5953
5954 A special way to obtain an ego handle is to call
5955 @code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5956 "anonymous" user --- anyone knows and can get the private key for this
5957 user, so it is suitable for operations that are supposed to be anonymous
5958 but require signatures (for example, to avoid a special path in the code).
5959 The anonymous ego is always valid and accessing it does not require a
5960 connection to the identity service.
5961
5962 @node Convenience API to lookup a single ego
5963 @subsubsection Convenience API to lookup a single ego
5964
5965
5966 As applications commonly only have to look up a single ego, there is a
5967 convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
5968 look up a single ego by name. Note that this is the user's name for the
5969 ego, not the service function. The resulting ego will be returned via a
5970 callback and will only be valid during that callback. The operation can
5971 be canceled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
5972 (cancellation is only legal before the callback is invoked).
5973
5974 @node Associating egos with service functions
5975 @subsubsection Associating egos with service functions
5976
5977
5978 The @code{GNUNET_IDENTITY_set} function is used to associate a particular
5979 ego with a service function. The name used by the service and the ego are
5980 given as arguments.
5981 Afterwards, the service can use its name to look up the associated ego
5982 using @code{GNUNET_IDENTITY_get}.
5983
5984 @node The IDENTITY Client-Service Protocol
5985 @subsection The IDENTITY Client-Service Protocol
5986
5987
5988
5989 A client connecting to the identity service first sends a message with
5990 type
5991 @code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
5992 client will receive information about changes to the egos by receiving
5993 messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
5994 Those messages contain the private key of the ego and the user's name of
5995 the ego (or zero bytes for the name to indicate that the ego was deleted).
5996 A special bit @code{end_of_list} is used to indicate the end of the
5997 initial iteration over the identity service's egos.
5998
5999 The client can trigger changes to the egos by sending @code{CREATE},
6000 @code{RENAME} or @code{DELETE} messages.
6001 The CREATE message contains the private key and the desired name.@
6002 The RENAME message contains the old name and the new name.@
6003 The DELETE message only needs to include the name of the ego to delete.@
6004 The service responds to each of these messages with a @code{RESULT_CODE}
6005 message which indicates success or error of the operation, and possibly
6006 a human-readable error message.
6007
6008 Finally, the client can bind the name of a service function to an ego by
6009 sending a @code{SET_DEFAULT} message with the name of the service function
6010 and the private key of the ego.
6011 Such bindings can then be resolved using a @code{GET_DEFAULT} message,
6012 which includes the name of the service function. The identity service
6013 will respond to a GET_DEFAULT request with a SET_DEFAULT message
6014 containing the respective information, or with a RESULT_CODE to
6015 indicate an error.
6016
6017 @cindex NAMESTORE Subsystem
6018 @node NAMESTORE Subsystem
6019 @section NAMESTORE Subsystem
6020
6021 The NAMESTORE subsystem provides persistent storage for local GNS zone
6022 information. All local GNS zone information is managed by NAMESTORE. It
6023 provides both the functionality to administer local GNS information (e.g.
6024 delete and add records) as well as to retrieve GNS information (e.g. to
6025 list name information in a client).
6026 NAMESTORE only manages the persistent storage of zone information
6027 belonging to the user running the service: GNS information from other
6028 users obtained from the DHT is stored by the NAMECACHE subsystem.
6029
6030 NAMESTORE uses a plugin-based database backend to store GNS information
6031 with good performance. SQLite, MySQL and PostgreSQL are supported as
6032 database backends.
6033 NAMESTORE clients interact with the IDENTITY subsystem to obtain
6034 cryptographic information about zones based on egos as described with the
6035 IDENTITY subsystem, but internally NAMESTORE refers to zones using the
6036 ECDSA private key.
6037 In addition, it collaborates with the NAMECACHE subsystem and
6038 stores zone information in the GNS cache when local information is
6039 modified, to increase lookup performance for local information.
6040
6041 NAMESTORE provides functionality to look up and store records, to iterate
6042 over a specific or all zones and to monitor zones for changes. NAMESTORE
6043 functionality can be accessed using the NAMESTORE API or the NAMESTORE
6044 command line tool.
6045
6046 @menu
6047 * libgnunetnamestore::
6048 @end menu
6049
6050 @cindex libgnunetnamestore
6051 @node libgnunetnamestore
6052 @subsection libgnunetnamestore
6053
6054 To interact with NAMESTORE, clients first connect to the NAMESTORE service
6055 using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
6056 As a result they obtain a NAMESTORE handle which they can use for
6057 operations, or NULL if the connection failed.
6058
6059 To disconnect from NAMESTORE, clients use
6060 @code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.
6061
6062 NAMESTORE internally uses the ECDSA private key to refer to zones. These
6063 private keys can be obtained from the IDENTITY subsystem.
6064 Here @emph{egos} can be used to refer to zones, or the default ego
6065 assigned to the GNS subsystem can be used to obtain the master zone's
6066 private key.
6067
6068
6069 @menu
6070 * Editing Zone Information::
6071 * Iterating Zone Information::
6072 * Monitoring Zone Information::
6073 @end menu
6074
6075 @node Editing Zone Information
6076 @subsubsection Editing Zone Information
6077
6078
6079
6080 NAMESTORE provides functions to lookup records stored under a label in a
6081 zone and to store records under a label in a zone.
6082
6083 To store (and delete) records, the client uses the
6084 @code{GNUNET_NAMESTORE_records_store} function and has to provide the
6085 namestore handle to use, the private key of the zone, the label to store
6086 the records under, the records and number of records plus a callback
6087 function.
6088 After the operation is performed, NAMESTORE will call the provided
6089 callback function with the result: @code{GNUNET_SYSERR} on failure
6090 (including timeout/queue drop/failure to validate), @code{GNUNET_NO} if
6091 content was already there or not found, @code{GNUNET_YES} (or other
6092 positive value) on success, plus an additional error message.
6093
6094 Records are deleted by using the store command with 0 records to store.
6095 It is important to note that records are not merged when records already
6096 exist under the label.
6097 So a client first has to retrieve the existing records, merge them with
6098 the new records and then store the result.
6099
6100 To perform a lookup operation, the client uses the
6101 @code{GNUNET_NAMESTORE_records_lookup} function. Here it has to pass the
6102 namestore handle, the private key of the zone and the label. It also has
6103 to provide a callback function which will be called with the result of
6104 the lookup operation:
6105 the zone for the records, the label, and the records including the
6106 number of records included.
6107
6108 A special operation is used to set the preferred nickname for a zone.
6109 This nickname is stored with the zone and is automatically merged with
6110 all labels and records stored in a zone. Here the client uses the
6111 @code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
6112 the zone, the nickname as a string plus the callback with the result of
6113 the operation.
6114
6115 @node Iterating Zone Information
6116 @subsubsection Iterating Zone Information
6117
6118
6119
6120 A client can iterate over all information in a zone or all zones managed
6121 by NAMESTORE.
6122 Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
6123 function and passes the namestore handle, the zone to iterate over and a
6124 callback function to call with the result.
6125 If the client wants to iterate over all zones, it passes NULL for the zone.
6126 A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
6127 continue iteration.
6128
6129 NAMESTORE calls the callback for every result and expects the client to
6130 call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
6131 @code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
6132 When NAMESTORE has reached the last item, it will call the callback with
6133 a NULL value to indicate the end of the iteration.
6134
6135 @node Monitoring Zone Information
6136 @subsubsection Monitoring Zone Information
6137
6138
6139
6140 Clients can also monitor zones to be notified about changes. Here the
6141 client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
6142 passes the private key of the zone and a callback function to call
6143 with updates for a zone.
6144 The client can specify to obtain zone information first by iterating over
6145 the zone and specify a synchronization callback to be called when the
6146 client and the namestore are synced.
6147
6148 On an update, NAMESTORE will call the callback with the private key of the
6149 zone, the label and the records and their number.
6150
6151 To stop monitoring, the client calls
6152 @code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
6153 from the function to start the monitoring.
6154
6155 @cindex PEERINFO Subsystem
6156 @node PEERINFO Subsystem
6157 @section PEERINFO Subsystem
6158
6159
6160
6161 The PEERINFO subsystem is used to store verified (validated) information
6162 about known peers in a persistent way. It obtains these addresses, for
6163 example, from the TRANSPORT service, which is in charge of address
6164 validation. Validation means that the information in the HELLO message is
6165 checked by connecting to the addresses and performing a cryptographic
6166 handshake to authenticate the peer instance claiming to be reachable at
6167 these addresses.
6168 Peerinfo does not validate the HELLO messages itself but only stores them
6169 and gives them to interested clients.
6170
6171 As future work, we are considering moving from storing just HELLO messages
6172 to providing a generic persistent per-peer information store.
6173 More and more subsystems tend to need to store per-peer information in
6174 a persistent way.
6175 To avoid duplicating this functionality, we plan to provide a PEERSTORE
6176 service providing this functionality.
6177
6178 @menu
6179 * PEERINFO - Features::
6180 * PEERINFO - Limitations::
6181 * DeveloperPeer Information::
6182 * Startup::
6183 * Managing Information::
6184 * Obtaining Information::
6185 * The PEERINFO Client-Service Protocol::
6186 * libgnunetpeerinfo::
6187 @end menu
6188
6189 @node PEERINFO - Features
6190 @subsection PEERINFO - Features
6191
6192
6193
6194 @itemize @bullet
6195 @item Persistent storage
6196 @item Client notification mechanism on update
6197 @item Periodic clean up for expired information
6198 @item Differentiation between public and friend-only HELLO
6199 @end itemize
6200
6201 @node PEERINFO - Limitations
6202 @subsection PEERINFO - Limitations
6203
6204
6205 @itemize @bullet
6206 @item Does not perform HELLO validation
6207 @end itemize
6208
6209 @node DeveloperPeer Information
6210 @subsection DeveloperPeer Information
6211
6212
6213
6214 The PEERINFO subsystem stores this information in the form of HELLO
6215 messages, which you can think of as business cards.
6216 These HELLO messages contain the public key of a peer and the addresses
6217 a peer can be reached under.
6218 The addresses include an expiration date describing how long they are
6219 valid. This information is updated regularly by the TRANSPORT service by
6220 revalidating the address.
6221 If an address is expired and not renewed, it can be removed from the
6222 HELLO message.
6223
6224 Some peers do not want to have their HELLO messages distributed to other
6225 peers, especially when GNUnet's friend-to-friend mode is enabled.
6226 To prevent this undesired distribution, PEERINFO distinguishes between
6227 @emph{public} and @emph{friend-only} HELLO messages.
6228 Public HELLO messages can be freely distributed to other (possibly
6229 unknown) peers (for example using the hostlist, gossiping, broadcasting),
6230 whereas friend-only HELLO messages may not be distributed to other peers.
6231 Friend-only HELLO messages have an additional flag @code{friend_only} set
6232 internally. For public HELLO messages this flag is not set.
6233 PEERINFO does not and cannot check whether a client is allowed to obtain a
6234 specific HELLO type.
6235
6236 The HELLO messages can be managed using the GNUnet HELLO library.
6237 Other GNUnet systems can obtain this information from PEERINFO and use
6238 it for their purposes.
6239 Clients are, for example, the HOSTLIST component providing this
6240 information to other peers in the form of a hostlist or the TRANSPORT
6241 subsystem using this information to maintain connections to other peers.
6242
6243 @node Startup
6244 @subsection Startup
6245
6246
6247
6248 During startup the PEERINFO service loads persistent HELLOs from disk.
6249 First, PEERINFO parses the directory that the HOSTS option of the
6250 @code{PEERINFO} configuration section designates for storing PEERINFO information.
6251 From every file found in this directory, valid HELLO messages are extracted.
6252 In addition, it loads HELLO messages shipped with the GNUnet distribution.
6253 These HELLOs are used to simplify network bootstrapping by providing
6254 valid peer information with the distribution.
6255 The use of these HELLOs can be prevented by setting the
6256 @code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration section to
6257 @code{NO}. Files containing invalid information are removed.
6258
6259 @node Managing Information
6260 @subsection Managing Information
6261
6262
6263
6264 The PEERINFO service stores information about known peers and a single
6265 HELLO message for every peer.
6266 A peer does not need to have a HELLO if no information is available.
6267 HELLO information from different sources, for example a HELLO obtained
6268 from a remote HOSTLIST and a second HELLO stored on disk, is combined
6269 and merged into one single HELLO message per peer, which is given to
6270 clients. During this merge process the HELLO is immediately written to
6271 disk to ensure persistence.
6272
6273 In addition, PEERINFO periodically scans the directory where the information
6274 is stored for empty HELLO messages with expired TRANSPORT addresses.
6275 This periodic task scans all files in the directory and recreates the
6276 HELLO messages it finds.
6277 Expired TRANSPORT addresses are removed from the HELLO, and if the
6278 HELLO does not contain any valid addresses, it is discarded and removed
6279 from the disk.
6280
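The merge and expiry behaviour described above can be illustrated with a
small, self-contained C sketch. This is not GNUnet code: the types
@code{struct Address} and @code{struct Hello} and the functions
@code{hello_add}, @code{hello_merge} and @code{hello_prune} are hypothetical,
addresses are plain strings, and the rule of keeping the later expiration
when both HELLOs list the same address is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ADDRS 8

/* Hypothetical, simplified stand-ins for the real HELLO structures. */
struct Address {
  char uri[64];        /* e.g. "tcp://192.0.2.1:2086" */
  uint64_t expires;    /* absolute expiration time, in seconds */
};

struct Hello {
  struct Address addrs[MAX_ADDRS];
  size_t n;
};

/* Add one address to h; if it is already present, keep the later expiry
   (assumed policy). Addresses beyond MAX_ADDRS are silently dropped. */
void hello_add(struct Hello *h, const struct Address *a)
{
  for (size_t i = 0; i < h->n; i++) {
    if (0 == strcmp(h->addrs[i].uri, a->uri)) {
      if (a->expires > h->addrs[i].expires)
        h->addrs[i].expires = a->expires;
      return;
    }
  }
  if (h->n < MAX_ADDRS)
    h->addrs[h->n++] = *a;
}

/* Merge two HELLOs for the same peer into one single HELLO. */
struct Hello hello_merge(const struct Hello *a, const struct Hello *b)
{
  struct Hello out = { .n = 0 };

  for (size_t i = 0; i < a->n; i++)
    hello_add(&out, &a->addrs[i]);
  for (size_t i = 0; i < b->n; i++)
    hello_add(&out, &b->addrs[i]);
  return out;
}

/* Drop addresses that expired at or before now and return how many are
   left. A result of 0 means the HELLO would be discarded. */
size_t hello_prune(struct Hello *h, uint64_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < h->n; i++)
    if (h->addrs[i].expires > now)
      h->addrs[kept++] = h->addrs[i];
  h->n = kept;
  return kept;
}
```

In this model, a periodic task would call @code{hello_prune} on every stored
HELLO and delete the corresponding file whenever the function returns 0.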
6281 @node Obtaining Information
6282 @subsection Obtaining Information
6283
6284
6285
6286 When a client requests information from PEERINFO, PEERINFO performs a
6287 lookup for the respective peer, or for all peers if desired, and transmits
6288 this information to the client.
6289 The client can specify whether friend-only HELLOs are to be included,
6290 and PEERINFO filters the HELLO messages accordingly before transmitting
6291 the information.
6292
6293 To notify clients about changes to PEERINFO information, PEERINFO
6294 maintains a list of clients interested in these notifications.
6295 Such a notification occurs if a HELLO for a peer was updated (due to a
6296 merge, for example) or a new peer was added.
6297
6298 @node The PEERINFO Client-Service Protocol
6299 @subsection The PEERINFO Client-Service Protocol
6300
6301
6302
6303 To connect to and disconnect from the PEERINFO service, PEERINFO
6304 utilizes the util client/server infrastructure, so no special message
6305 types are used here.
6306
6307 To add information for a peer, the plain HELLO message is transmitted to
6308 the service without any wrapping. All pieces of information required are
6309 stored within the HELLO message.
6310 The PEERINFO service provides a message handler accepting and processing
6311 these HELLO messages.
6312
6313 When obtaining PEERINFO information using the iterate functionality,
6314 specific messages are used. To obtain information for all peers, a
6315 @code{struct ListAllPeersMessage} with message type
6316 @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
6317 @code{include_friend_only}, indicating whether friend-only HELLO messages
6318 should be included, is transmitted. If information for a specific peer is
6319 required, a @code{struct ListPeerMessage} with type
6320 @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
6321 used.
6322
6323 For both variants, the PEERINFO service replies for each HELLO message it
6324 wants to transmit with a @code{struct InfoMessage} with type
6325 @code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
6326 The final message is a @code{struct GNUNET_MessageHeader} with type
6327 @code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. Once the client receives this
6328 message, it can proceed with the next request if any is pending.
6329
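The reply stream of an iteration can be pictured as a sequence of framed
messages followed by a header-only terminator. The following C sketch walks
such a stream. It assumes the usual GNUnet message header layout (a 16-bit
size that includes the header, followed by a 16-bit type, both in network
byte order); the type codes @code{MSG_INFO} and @code{MSG_END} and the
function @code{count_replies} are hypothetical placeholders, not the real
protocol constants or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes; the real values live in gnunet_protocols.h. */
enum { MSG_INFO = 1, MSG_END = 2 };

/* Read a 16-bit big-endian (network byte order) integer. */
static uint16_t read_be16(const uint8_t *p)
{
  return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Walk a buffer of concatenated messages, each starting with a 4-byte
   header (16-bit size including the header, then 16-bit type).
   Returns the number of MSG_INFO replies seen before the terminating
   MSG_END, or -1 if the stream is malformed or not terminated. */
int count_replies(const uint8_t *buf, size_t len)
{
  size_t off = 0;
  int replies = 0;

  while (len - off >= 4) {
    uint16_t size = read_be16(buf + off);
    uint16_t type = read_be16(buf + off + 2);

    if (size < 4 || size > len - off)
      return -1;                 /* truncated or bogus message */
    if (MSG_END == type)
      return replies;            /* iteration finished */
    if (MSG_INFO != type)
      return -1;                 /* unexpected message type */
    replies++;                   /* payload is buf[off+4 .. off+size) */
    off += size;
  }
  return -1;                     /* ran out of data before MSG_END */
}
```

A client following this pattern would only issue its next pending request
after the terminator has been seen.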
6330 @node libgnunetpeerinfo
6331 @subsection libgnunetpeerinfo
6332
6333
6334
6335 The PEERINFO API consists mainly of three different functionalities:
6336
6337 @itemize @bullet
6338 @item maintaining a connection to the service
6339 @item adding new information to the PEERINFO service
6340 @item retrieving information from the PEERINFO service
6341 @end itemize
6342
6343 @menu
6344 * Connecting to the PEERINFO Service::
6345 * Adding Information to the PEERINFO Service::
6346 * Obtaining Information from the PEERINFO Service::
6347 @end menu
6348
6349 @node Connecting to the PEERINFO Service
6350 @subsubsection Connecting to the PEERINFO Service
6351
6352
6353
6354 To connect to the PEERINFO service, call
6355 @code{GNUNET_PEERINFO_connect}, which takes a configuration handle as
6356 an argument. To disconnect from PEERINFO, call
6357 @code{GNUNET_PEERINFO_disconnect} with the PEERINFO
6358 handle returned by the connect function.
6359
6360 @node Adding Information to the PEERINFO Service
6361 @subsubsection Adding Information to the PEERINFO Service
6362
6363
6364
6365 @code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
6366 storage. This function takes the PEERINFO handle, the HELLO
6367 message to store and a continuation with a closure to be called with the
6368 result of the operation.
6369 @code{GNUNET_PEERINFO_add_peer} returns a handle to this operation,
6370 which allows the operation to be canceled with the respective cancel function
6371 @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from
6372 PEERINFO you can iterate over all information stored with PEERINFO or you
6373 can tell PEERINFO to notify you when new peer information is available.
6374
6375 @node Obtaining Information from the PEERINFO Service
6376 @subsubsection Obtaining Information from the PEERINFO Service
6377
6378
6379
6380 To iterate over information in PEERINFO, use
6381 @code{GNUNET_PEERINFO_iterate}.
6382 This function expects the PEERINFO handle, a flag indicating whether HELLO
6383 messages intended for friend-only mode should be included, a timeout
6384 limiting how long the operation may take, and a callback with a callback
6385 closure to be called for the results.
6386 If you want to obtain information for a specific peer, you can specify
6387 the peer identity; if this identity is NULL, information for all peers is
6388 returned. The function returns a handle that can be used to cancel the
6389 operation using @code{GNUNET_PEERINFO_iterate_cancel}.
6390
6391 To get notified when peer information changes, you can use
6392 @code{GNUNET_PEERINFO_notify}.
6393 This function expects a configuration handle and a flag indicating whether
6394 friend-only HELLO messages should be included. The PEERINFO service will
6395 invoke the callback function to notify you about every change.
6396 The function returns a handle that can be used to cancel notifications
6397 with @code{GNUNET_PEERINFO_notify_cancel}.
6398
6399 @cindex PEERSTORE Subsystem
6400 @node PEERSTORE Subsystem
6401 @section PEERSTORE Subsystem
6402
6403
6404
6405 GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
6406 GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
6407 store and retrieve arbitrary data.
6408 Each data record stored with PEERSTORE contains the following fields:
6409
6410 @itemize @bullet
6411 @item subsystem: Name of the subsystem responsible for the record.
6412 @item peerid: Identity of the peer this record is related to.
6413 @item key: a key string identifying the record.
6414 @item value: binary record value.
6415 @item expiry: record expiry date.
6416 @end itemize
6417
6418 @menu
6419 * Functionality::
6420 * Architecture::
6421 * libgnunetpeerstore::
6422 @end menu
6423
6424 @node Functionality
6425 @subsection Functionality
6426
6427
6428
6429 Subsystems can store any type of value under a (subsystem, peerid, key)
6430 combination. A "replace" flag set during store operations forces the
6431 PEERSTORE to replace any old values stored under the same
6432 (subsystem, peerid, key) combination with the new value.
6433 Additionally, an expiry date is set after which the record is @emph{possibly}
6434 deleted by PEERSTORE.
6435
6436 Subsystems can iterate over all values stored under any of the following
6437 combinations of fields:
6438
6439 @itemize @bullet
6440 @item (subsystem)
6441 @item (subsystem, peerid)
6442 @item (subsystem, key)
6443 @item (subsystem, peerid, key)
6444 @end itemize
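The four field combinations listed above amount to treating a missing
@code{peerid} or @code{key} as a wildcard. The following self-contained C
sketch shows that matching rule. It is only an illustration: the
@code{struct Record} type (which omits the value and expiry fields and uses a
string in place of a peer identity) and the functions @code{record_matches}
and @code{count_matches} are hypothetical and not part of the PEERSTORE API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; the value and expiry are omitted. */
struct Record {
  const char *subsystem;
  const char *peerid;   /* textual stand-in for a peer identity */
  const char *key;
};

/* True if rec matches the query. The subsystem is mandatory; a NULL
   peerid or key acts as a wildcard, which yields the four combinations
   (subsystem), (subsystem, peerid), (subsystem, key) and
   (subsystem, peerid, key). */
bool record_matches(const struct Record *rec, const char *subsystem,
                    const char *peerid, const char *key)
{
  if (0 != strcmp(rec->subsystem, subsystem))
    return false;
  if (NULL != peerid && 0 != strcmp(rec->peerid, peerid))
    return false;
  if (NULL != key && 0 != strcmp(rec->key, key))
    return false;
  return true;
}

/* Count the records an iteration with the given query would visit. */
size_t count_matches(const struct Record *recs, size_t n,
                     const char *subsystem, const char *peerid,
                     const char *key)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    if (record_matches(&recs[i], subsystem, peerid, key))
      hits++;
  return hits;
}
```

A real storage plugin would typically express the same rule as a database
query rather than a linear scan; the scan is used here only to keep the
example short.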
6445
6446 Subsystems can also request to be notified about any new values stored
6447 under a (subsystem, peerid, key) combination by sending a "watch"
6448 request to PEERSTORE.
6449
6450 @node Architecture
6451 @subsection Architecture
6452
6453
6454
6455 PEERSTORE implements the following components:
6456
6457 @itemize @bullet
6458 @item PEERSTORE service: Handles store, iterate and watch operations.
6459 @item PEERSTORE API: API to be used by other subsystems to communicate and
6460 issue commands to the PEERSTORE service.
6461 @item PEERSTORE plugins: Handle the persistent storage. At the moment,
6462 only an "sqlite" plugin is implemented.
6463 @end itemize
6464
6465 @cindex libgnunetpeerstore
6466 @node libgnunetpeerstore
6467 @subsection libgnunetpeerstore
6468
6469
6470
6471 libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
6472 wishing to communicate with the PEERSTORE service use this API to open a
6473 connection to PEERSTORE. This is done by calling
6474 @code{GNUNET_PEERSTORE_connect} which returns a handle to the newly
6475 created connection.
6476 This handle has to be used with any further calls to the API.
6477
6478 To store a new record, use the function @code{GNUNET_PEERSTORE_store}.
6479 It requires the record fields and a continuation function that
6480 will be called by the API after the STORE request is sent to the
6481 PEERSTORE service.
6482 Note that calling the continuation function does not mean that the record
6483 is successfully stored, only that the STORE request has been successfully
6484 sent to the PEERSTORE service.
6485 @code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
6486 request, but only before the continuation function has been called.
6487
6488 To iterate over stored records, use the function
6489 @code{GNUNET_PEERSTORE_iterate}.
6490 @emph{peerid} and @emph{key} can be set to NULL. An iterator
6491 callback function will be called with each matching record found and a
6492 NULL record at the end to signal the end of the result set.
6493 @code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
6494 request before the iterator callback is called with a NULL record.
6495
6496 To be notified of new values stored under a (subsystem, peerid, key)
6497 combination, use the function @code{GNUNET_PEERSTORE_watch}.
6498 This registers the watcher with the PEERSTORE service; any new
6499 records matching the given combination will trigger the callback
6500 function passed to @code{GNUNET_PEERSTORE_watch}. This continues until
6501 @code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
6502 service is destroyed.
6503
6504 When the connection is no longer needed, the function
6505 @code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
6506 PEERSTORE service.
6507 Any pending ITERATE or WATCH requests will be destroyed.
6508 If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
6509 delay the disconnection until all pending STORE requests are sent to
6510 the PEERSTORE service; otherwise, the pending STORE requests will be
6511 destroyed as well.
6512
6513 @cindex SET Subsystem
6514 @node SET Subsystem
6515 @section SET Subsystem
6516
6517
6518
6519 The SET service implements efficient set operations between two peers
6520 over a CADET channel.
6521 Currently, set union and set intersection are the only supported
6522 operations. Elements of a set consist of an @emph{element type} and
6523 arbitrary binary @emph{data}.
6524 The size of an element's data is limited to around 62 KB.
6525
6526 @menu
6527 * Local Sets::
6528 * Set Modifications::
6529 * Set Operations::
6530 * Result Elements::
6531 * libgnunetset::
6532 * The SET Client-Service Protocol::
6533 * The SET Intersection Peer-to-Peer Protocol::
6534 * The SET Union Peer-to-Peer Protocol::
6535 @end menu
6536
6537 @node Local Sets
6538 @subsection Local Sets
6539
6540
6541
6542 Sets created by a local client can be modified and reused for multiple
6543 operations. As each set operation requires potentially expensive special
6544 auxiliary data to be computed for each element of a set, a set can only
6545 participate in one type of set operation (i.e. union or intersection).
6546 The type of a set is determined upon its creation.
6547 If the elements of a set are needed for an operation of a different
6548 type, all of the set's elements must be copied to a new set of the appropriate
6549 type.
6550
6551 @node Set Modifications
6552 @subsection Set Modifications
6553
6554
6555
6556 Even when set operations are active, one can add elements to and remove
6557 elements from a set.
6558 However, these changes will only be visible to operations that have been
6559 created after the changes have taken place. That is, every set operation
6560 only sees a snapshot of the set from the time the operation was started.
6561 This mechanism is @emph{not} implemented by copying the whole set, but by
6562 attaching @emph{generation information} to each element and operation.
6563
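The idea of taking a snapshot through generation information rather than by
copying can be sketched in a few lines of C. The example below is a model
under stated assumptions, not the service's implementation: the
@code{struct Element} and @code{struct Set} types, the fixed capacity, and
the functions @code{set_add}, @code{set_remove}, @code{operation_start} and
@code{operation_sees} are all hypothetical, and elements are represented by
plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NEVER UINT32_MAX

/* Hypothetical element bookkeeping: the generation in which the element
   was added, and the one in which it was removed (NEVER if still there). */
struct Element {
  uint32_t id;
  uint32_t gen_added;
  uint32_t gen_removed;
};

struct Set {
  struct Element elems[16];
  size_t n;
  uint32_t current_gen;  /* advanced on every modification */
};

void set_add(struct Set *s, uint32_t id)
{
  s->current_gen++;
  if (s->n < 16)
    s->elems[s->n++] = (struct Element){ id, s->current_gen, NEVER };
}

void set_remove(struct Set *s, uint32_t id)
{
  s->current_gen++;
  for (size_t i = 0; i < s->n; i++)
    if (s->elems[i].id == id && NEVER == s->elems[i].gen_removed)
      s->elems[i].gen_removed = s->current_gen;
}

/* An operation records the set's generation when it starts ... */
uint32_t operation_start(const struct Set *s)
{
  return s->current_gen;
}

/* ... and only sees elements that existed in that generation. */
bool operation_sees(const struct Set *s, uint32_t op_gen, uint32_t id)
{
  for (size_t i = 0; i < s->n; i++)
    if (s->elems[i].id == id &&
        s->elems[i].gen_added <= op_gen &&
        op_gen < s->elems[i].gen_removed)
      return true;
  return false;
}
```

Because removed elements are only marked, an operation that started earlier
still sees them, while the set itself is never copied.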
6564 @node Set Operations
6565 @subsection Set Operations
6566
6567
6568
6569 Set operations can be started in two ways: Either by accepting an
6570 operation request from a remote peer, or by requesting a set operation
6571 from a remote peer.
6572 Set operations are uniquely identified by the involved @emph{peers}, an
6573 @emph{application id} and the @emph{operation type}.
6574
6575 The client is notified of incoming set operations by @emph{set listeners}.
6576 A set listener listens for incoming operations of a specific operation
6577 type and application id.
6578 Once notified of an incoming set request, the client can accept the set
6579 request (providing a local set for the operation) or reject it.
6580
6581 @node Result Elements
6582 @subsection Result Elements
6583
6584
6585
6586 The SET service has three @emph{result modes} that determine how an
6587 operation's result set is delivered to the client:
6588
6589 @itemize @bullet
6590 @item @strong{Full Result Set.} All elements of the set resulting from the
6591 set operation are returned to the client.
6592 @item @strong{Added Elements.} Only elements that result from the
6593 operation and are not already in the local peer's set are returned.
6594 Note that for some operations (like set intersection) this result mode
6595 will never return any elements.
6596 This can be useful if only the remote peer is actually interested in
6597 the result of the set operation.
6598 @item @strong{Removed Elements.} Only elements that are in the local
6599 peer's initial set but not in the operation's result set are returned.
6600 Note that for some operations (like set union) this result mode will
6601 never return any elements. This can be useful if only the remote peer is
6602 actually interested in the result of the set operation.
6603 @end itemize
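The three result modes can be compared with a short, self-contained C
sketch that computes which elements a client would receive for a given
local set and operation result. The enumeration @code{enum ResultMode} and
the function @code{deliver} are hypothetical names chosen for this
illustration, and elements are modelled as plain integers rather than typed
binary data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ResultMode { RESULT_FULL, RESULT_ADDED, RESULT_REMOVED };

static bool contains(const uint32_t *set, size_t n, uint32_t x)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return true;
  return false;
}

/* Write the elements that would be delivered to the client into out
   (which must have room for n_local + n_result entries) and return
   how many there are. */
size_t deliver(enum ResultMode mode,
               const uint32_t *local, size_t n_local,
               const uint32_t *result, size_t n_result,
               uint32_t *out)
{
  size_t n = 0;

  switch (mode) {
  case RESULT_FULL:    /* everything in the result set */
    for (size_t i = 0; i < n_result; i++)
      out[n++] = result[i];
    break;
  case RESULT_ADDED:   /* in the result, but not in the local set */
    for (size_t i = 0; i < n_result; i++)
      if (!contains(local, n_local, result[i]))
        out[n++] = result[i];
    break;
  case RESULT_REMOVED: /* in the local set, but not in the result */
    for (size_t i = 0; i < n_local; i++)
      if (!contains(result, n_result, local[i]))
        out[n++] = local[i];
    break;
  }
  return n;
}
```

With a local set of 1, 2, 3 and a union result of 1 to 5, the added-elements
mode yields 4 and 5 while the removed-elements mode yields nothing; for an
intersection result of 2, 3 the roles are reversed.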
6604
6605 @cindex libgnunetset
6606 @node libgnunetset
6607 @subsection libgnunetset
6608
6609
6610
6611 @menu
6612 * Sets::
6613 * Listeners::
6614 * Operations::
6615 * Supplying a Set::
6616 * The Result Callback::
6617 @end menu
6618
6619 @node Sets
6620 @subsubsection Sets
6621
6622
6623
6624 New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6625 configuration (as each set has its own client connection) and the
6626 operation type must be specified.
6627 The set exists until either the client calls @code{GNUNET_SET_destroy} or
6628 the client's connection to the service is disrupted.
6629 In the latter case, the client is notified by the return value of
6630 functions dealing with sets. This return value must always be checked.
6631
6632 Elements are added and removed with @code{GNUNET_SET_add_element} and
6633 @code{GNUNET_SET_remove_element}.
6634
6635 @node Listeners
6636 @subsubsection Listeners
6637
6638
6639
6640 Listeners are created with @code{GNUNET_SET_listen}. Each time a
6641 remote peer suggests a set operation with an application id and operation
6642 type matching a listener, the listener's callback is invoked.
6643 The client then must synchronously call either @code{GNUNET_SET_accept}
6644 or @code{GNUNET_SET_reject}. Note that the operation will not be started
6645 until the client calls @code{GNUNET_SET_commit}
6646 (see Section "Supplying a Set").
6647
6648 @node Operations
6649 @subsubsection Operations
6650
6651
6652
6653 Operations to be initiated by the local peer are created with
6654 @code{GNUNET_SET_prepare}. Note that the operation will not be started
6655 until the client calls @code{GNUNET_SET_commit}
6656 (see Section "Supplying a Set").
6657
6658 @node Supplying a Set
6659 @subsubsection Supplying a Set
6660
6661
6662
6663 To create symmetry between the two ways of starting a set operation
6664 (accepting and initiating it), the operation handles returned by
6665 @code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have a
6666 set to operate on; thus they cannot do any work yet.
6667
6668 The client must call @code{GNUNET_SET_commit} to specify a set to use for
6669 an operation. @code{GNUNET_SET_commit} may only be called once per set
6670 operation.
6671
6672 @node The Result Callback
6673 @subsubsection The Result Callback
6674
6675
6676
6677 Clients must specify both a result mode and a result callback with
6678 @code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result
6679 callback is invoked with a status indicating either that an element was
6680 received, or that the operation failed or succeeded.
6681 The interpretation of the received element depends on the result mode.
6682 The callback needs to know which result mode it is used in, as the
6683 arguments do not indicate if an element is part of the full result set,
6684 or if it is in the difference between the original set and the final set.
6685
6686 @node The SET Client-Service Protocol
6687 @subsection The SET Client-Service Protocol
6688
6689
6690
6691 @menu
6692 * Creating Sets::
6693 * Listeners2::
6694 * Initiating Operations::
6695 * Modifying Sets::
6696 * Results and Operation Status::
6697 * Iterating Sets::
6698 @end menu
6699
6700 @node Creating Sets
6701 @subsubsection Creating Sets
6702
6703
6704
6705 For each set of a client, there exists a client connection to the service.
6706 Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message
6707 over a new client connection. Multiple operations for one set are
6708 multiplexed over one client connection, using a request id supplied by
6709 the client.
6710
6711 @node Listeners2
6712 @subsubsection Listeners2
6713
6714
6715
6716 Each listener also requires a separate client connection. By sending the
6717 @code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service
6718 of the application id and operation type it is interested in. A client
6719 rejects an incoming request by sending @code{GNUNET_SERVICE_SET_REJECT}
6720 on the listener's client connection.
6721 In contrast, when accepting an incoming request, a
6722 @code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
6723 connection of the set that is supplied for the set operation.
6724
6725 @node Initiating Operations
6726 @subsubsection Initiating Operations
6727
6728
6729
6730 Operations with remote peers are initiated by sending a
6731 @code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
6732 connection that this message is sent by determines the set to use.
6733
6734 @node Modifying Sets
6735 @subsubsection Modifying Sets
6736
6737
6738
6739 Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6740 @code{GNUNET_SERVICE_SET_REMOVE} messages.
6741
6742
6743 @c %@menu
6744 @c %* Results and Operation Status::
6745 @c %* Iterating Sets::
6746 @c %@end menu
6747
6748 @node Results and Operation Status
6749 @subsubsection Results and Operation Status
6750
6751
6752 The service notifies the client of result elements and success/failure of
6753 a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6754
6755 @node Iterating Sets
6756 @subsubsection Iterating Sets
6757
6758
6759
6760 All elements of a set can be requested by sending
6761 @code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
6762 @code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
6763 iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
6764 After each received element, the client
6765 must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
6766 iteration may be active for a set at any given time.
6767
6768 @node The SET Intersection Peer-to-Peer Protocol
6769 @subsection The SET Intersection Peer-to-Peer Protocol
6770
6771
6772
6773 The intersection protocol operates over CADET and starts with a
6774 @code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
6775 initiating the operation to the peer listening for inbound requests.
6776 It includes the number of elements of the initiating peer, which is used
6777 to decide which side will send a Bloom filter first.
6778
6779 The listening peer checks if the operation type and application
6780 identifier are acceptable for its current state.
6781 If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a
6782 status of @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6783
6784 If the application accepts the request, the listener sends back a
6785 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
6786 more elements in the set than the client.
6787 Otherwise, it immediately starts with the Bloom filter exchange.
6788 If the initiator receives a
6789 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
6790 it begins the Bloom filter exchange, unless the set size is indicated to
6791 be zero, in which case the intersection is considered finished after
6792 just the initial handshake.
6793
6794
6795 @menu
6796 * The Bloom filter exchange::
6797 * Salt::
6798 @end menu
6799
6800 @node The Bloom filter exchange
6801 @subsubsection The Bloom filter exchange
6802
6803
6804
6805 In this phase, each peer transmits a Bloom filter over the remaining
6806 keys of the local set to the other peer using a
6807 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
6808 message additionally includes the number of elements left in the sender's
6809 set, as well as the XOR over all of the keys in that set.
6810
6811 The number of bits 'k' set per element in the Bloom filter is calculated
6812 based on the relative size of the two sets.
6813 Furthermore, the size of the Bloom filter is calculated based on 'k' and
6814 the number of elements in the set to maximize the amount of data filtered
6815 per byte transmitted on the wire (while avoiding an excessively high
6816 number of iterations).
6817
6818 The receiver of the message removes all elements from its local set that
6819 do not pass the Bloom filter test.
6820 It then checks if the set size of the sender and the XOR over the keys
6821 match what is left of its own set. If they do, it sends a
6822 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
6823 that the latest set is the final result.
6824 Otherwise, the receiver starts another Bloom filter exchange, except
6825 this time as the sender.
6826
6827 @node Salt
6828 @subsubsection Salt
6829
6830
6831
6832 Bloom filter operations are probabilistic: with some non-zero probability
6833 the test may incorrectly say an element is in the set, even though it is
6834 not.
6835
6836 To mitigate this problem, the intersection protocol iterates exchanging
6837 Bloom filters using a different random 32-bit salt in each iteration (the
6838 salt is also included in the message).
6839 With different salts, set operations may fail for different elements.
6840 When the results of the iterations are combined, the probability of
6841 failure rapidly approaches zero.
6842
6843 The iterations terminate once both peers have established that they have
6844 sets of the same size, and where the XOR over all keys computes the same
6845 512-bit value (leaving a failure probability of @math{2^{-511}}).
6846
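One round of the salted Bloom filter exchange, together with the
count-and-XOR termination check, can be modelled in plain C. The sketch
below makes several simplifying assumptions that are not taken from the
protocol itself: keys are 64-bit integers instead of 512-bit hashes, the
filter size and the number of bits per element are fixed constants
(@code{BF_BITS}, @code{BF_K}) rather than computed from the set sizes, and
the mixing function is a toy stand-in for real hashing. All type and
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 256u   /* filter size in bits (fixed here for brevity) */
#define BF_K 3u        /* bits set per element */

struct Bloom { uint8_t bits[BF_BITS / 8]; };

/* Toy 64-bit mixer in the style of splitmix64; stands in for hashing. */
static uint64_t mix(uint64_t x)
{
  x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27; x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Position of the i-th bit for a key under the given salt. */
static unsigned bit_index(uint64_t key, uint32_t salt, unsigned i)
{
  return (unsigned) (mix(key ^ ((uint64_t) salt << 32) ^ i) % BF_BITS);
}

/* Sender side: build a filter over the remaining keys of the local set. */
void bloom_build(struct Bloom *bf, const uint64_t *keys, size_t n,
                 uint32_t salt)
{
  memset(bf, 0, sizeof *bf);
  for (size_t j = 0; j < n; j++)
    for (unsigned i = 0; i < BF_K; i++) {
      unsigned b = bit_index(keys[j], salt, i);
      bf->bits[b / 8] |= (uint8_t) (1u << (b % 8));
    }
}

bool bloom_test(const struct Bloom *bf, uint64_t key, uint32_t salt)
{
  for (unsigned i = 0; i < BF_K; i++) {
    unsigned b = bit_index(key, salt, i);
    if (0 == (bf->bits[b / 8] & (1u << (b % 8))))
      return false;
  }
  return true;
}

/* Receiver side: drop keys that fail the filter test (in place) and
   return the number of keys left. */
size_t bloom_filter_set(const struct Bloom *bf, uint32_t salt,
                        uint64_t *keys, size_t n)
{
  size_t kept = 0;

  for (size_t j = 0; j < n; j++)
    if (bloom_test(bf, keys[j], salt))
      keys[kept++] = keys[j];
  return kept;
}

/* XOR over all keys; compared together with the element count to decide
   whether both peers have converged on the same set. */
uint64_t xor_keys(const uint64_t *keys, size_t n)
{
  uint64_t x = 0;

  for (size_t j = 0; j < n; j++)
    x ^= keys[j];
  return x;
}
```

In this model, a peer would repeat @code{bloom_build} and
@code{bloom_filter_set} with a fresh salt in each round, and stop once its
own element count and @code{xor_keys} value equal those reported by the other
side. Elements that are in both sets always pass the filter, so only false
positives remain to be eliminated in later rounds.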
6847 @node The SET Union Peer-to-Peer Protocol
6848 @subsection The SET Union Peer-to-Peer Protocol
6849
6850
6851
6852 The SET union protocol is based on Eppstein's efficient set reconciliation
6853 without prior context. You should read this paper first if you want to
6854 understand the protocol.
6855
6856 The union protocol operates over CADET and starts with a
6857 @code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
6858 initiating the operation to the peer listening for inbound requests.
6859 It includes the number of elements of the initiating peer, which is
6860 currently not used.
6861
6862 The listening peer checks if the operation type and application
6863 identifier are acceptable for its current state. If not, it responds with
6864 a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
6865 @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6866
6867 If the application accepts the request, it sends back a strata estimator
6868 using a message of type @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The
6869 initiator evaluates the strata estimator and initiates the exchange of
6870 invertible Bloom filters, sending a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6871
6872 During the IBF exchange, if the receiver cannot invert the Bloom filter or
6873 detects a cycle, it sends a larger IBF in response (up to a defined
6874 maximum limit; if that limit is reached, the operation fails).
6875 Elements decoded while processing the IBF are transmitted to the other
6876 peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from
6877 the other peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
6878 messages, depending on the sign observed during decoding of the IBF.
6879 Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
6880 message with the respective element in a
6881 @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS} message. If the IBF fully
6882 decodes, the peer responds with a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE}
6883 message instead of another @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6884
6885 All Bloom filter operations use a salt to mingle keys before hashing them
6886 into buckets, such that future iterations have a fresh chance of
6887 succeeding if they failed due to collisions before.
6888
6889 @cindex STATISTICS Subsystem
6890 @node STATISTICS Subsystem
6891 @section STATISTICS Subsystem
6892
6893
6894
6895 In GNUnet, the STATISTICS subsystem offers a central place for all
6896 subsystems to publish unsigned 64-bit integer run-time statistics.
6897 Keeping this information centrally means that there is a unified way for
6898 the user to obtain data on all subsystems, and individual subsystems do
6899 not have to always include a custom data export method for performance
6900 metrics and other statistics. For example, the TRANSPORT system uses
6901 STATISTICS to update information about the number of directly connected
6902 peers and the bandwidth that has been consumed by the various plugins.
6903 This information is valuable for diagnosing connectivity and performance
6904 issues.
6905
6906 Following the GNUnet service architecture, the STATISTICS subsystem is
6907 divided into an API which is exposed through the header
6908 @strong{gnunet_statistics_service.h} and the STATISTICS service
6909 @strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
6910 command-line tool can be used to obtain (and change) information about
6911 the values stored by the STATISTICS service. The STATISTICS service does
6912 not communicate with other peers.
6913
6914 Data is stored in the STATISTICS service in the form of tuples
6915 @strong{(subsystem, name, value, persistence)}. The subsystem determines
6916 to which GNUnet subsystem the data belongs. The name identifies the
6917 value: it uniquely distinguishes the record
6918 from other records belonging to the same subsystem.
6919 In some parts of the code, the pair @strong{(subsystem, name)} is called
6920 a @strong{statistic} as it identifies the values stored in the STATISTICS
6921 service. The persistence flag determines if the record has to be preserved
6922 across service restarts. A record is said to be persistent if this flag
6923 is set for it; if not, the record is treated as a non-persistent record
6924 and it is lost after service restart. Persistent records are written to
6925 the file @strong{statistics.data} before shutdown and read from it
6926 upon startup. The file is located in the HOME directory of the peer.
6927
6928 An anomaly of the STATISTICS service is that it does not terminate
6929 immediately upon receiving a shutdown signal if it has any clients
6930 connected to it. It waits for all the clients that are not monitors to
6931 close their connections before terminating itself.
6932 This is to prevent the loss of data during peer shutdown --- delaying the
6933 STATISTICS service shutdown helps other services to store important data
6934 to STATISTICS during shutdown.
6935
6936 @menu
6937 * libgnunetstatistics::
6938 * The STATISTICS Client-Service Protocol::
6939 @end menu
6940
6941 @cindex libgnunetstatistics
6942 @node libgnunetstatistics
6943 @subsection libgnunetstatistics
6944
6945
6946
6947 @strong{libgnunetstatistics} is the library containing the API for the
6948 STATISTICS subsystem. Any process that needs to use STATISTICS should use
6949 this API to open a connection to the STATISTICS service.
6950 This is done by calling the function @code{GNUNET_STATISTICS_create()}.
6951 This function takes the name of the subsystem that wants to use STATISTICS
6952 and a configuration.
6953 All values written to STATISTICS with this connection will be placed in
6954 the section corresponding to the given subsystem's name.
6955 The connection to STATISTICS can be destroyed with the function
6956 @code{GNUNET_STATISTICS_destroy()}. This function allows for the
6957 connection to be destroyed immediately or upon transferring all
6958 pending write requests to the service.
6959
6960 Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
6961 under the @code{[STATISTICS]} section in the configuration. With such a
6962 configuration all calls to @code{GNUNET_STATISTICS_create()} return
6963 @code{NULL} as the STATISTICS subsystem is unavailable and no other
6964 functions from the API can be used.
6965
6966
6967 @menu
6968 * Statistics retrieval::
6969 * Setting statistics and updating them::
6970 * Watches::
6971 @end menu
6972
6973 @node Statistics retrieval
6974 @subsubsection Statistics retrieval
6975
6976
6977
6978 Once a connection to the statistics service is obtained, information
6979 about any other system which uses statistics can be retrieved with the
6980 function GNUNET_STATISTICS_get().
6981 This function takes the connection handle, the name of the subsystem
6982 whose information we are interested in (a @code{NULL} value will
6983 retrieve information of all available subsystems using STATISTICS), the
6984 name of the statistic we are interested in (a @code{NULL} value will
6985 retrieve all available statistics), a continuation callback which is
6986 called when all of requested information is retrieved, an iterator
6987 callback which is called for each parameter in the retrieved information
6988 and a closure for the aforementioned callbacks. The library then invokes
6989 the iterator callback for each value matching the request.
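
The matching rules described above can be illustrated with a small,
self-contained model. This is only a sketch: it does not call
libgnunetstatistics, and @code{struct stat_entry}, @code{stat_get()} and the
two callback types are hypothetical names chosen for this example. It shows
how a @code{NULL} subsystem or statistic name acts as a wildcard, how the
iterator runs once per matching value, and how the continuation runs once at
the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one statistic (not a GNUnet type). */
struct stat_entry
{
  const char *subsystem;
  const char *name;
  uint64_t value;
};

/* Called once per matching statistic. */
typedef void (*stat_iterator) (void *cls, const struct stat_entry *e);

/* Called once after all matches have been delivered. */
typedef void (*stat_continuation) (void *cls, int success);

/* A NULL pattern matches everything, otherwise require an exact match. */
int
field_matches (const char *pattern, const char *actual)
{
  return (NULL == pattern) || (0 == strcmp (pattern, actual));
}

/* Model of a GET: iterate over matches, then signal completion.
   Returns the number of matching entries. */
size_t
stat_get (const struct stat_entry *table, size_t len,
          const char *subsystem, const char *name,
          stat_iterator it, stat_continuation cont, void *cls)
{
  size_t matches = 0;

  for (size_t i = 0; i < len; i++)
  {
    if (! field_matches (subsystem, table[i].subsystem))
      continue;
    if (! field_matches (name, table[i].name))
      continue;
    matches++;
    if (NULL != it)
      it (cls, &table[i]);
  }
  if (NULL != cont)
    cont (cls, 1);
  return matches;
}

/* Example iterator: add every value to a uint64_t passed as closure. */
void
sum_values (void *cls, const struct stat_entry *e)
{
  *(uint64_t *) cls += e->value;
}
```

In the real library the request is asynchronous, so the callbacks run later
from the scheduler rather than before the call returns as they do in this
model.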
6990
6991 A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
6992 canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
6993 This is helpful when retrieving statistics takes too long and especially
6994 when we want to shut down and clean up everything.
6995
6996 @node Setting statistics and updating them
6997 @subsubsection Setting statistics and updating them
6998
6999
7000
7001 So far we have seen how to retrieve statistics; here we will learn how to
7002 set statistics and update them so that other subsystems can retrieve
7003 them.
7004
7005 A new statistic can be set using the function
7006 @code{GNUNET_STATISTICS_set()}.
7007 This function takes the name of the statistic and its value and a flag to
7008 make the statistic persistent.
7009 The value of the statistic should be of the type @code{uint64_t}.
7010 The function does not take the name of the subsystem; it is determined
7011 from the previous @code{GNUNET_STATISTICS_create()} invocation. If
7012 the given statistic is already present, its value is overwritten.
7013
7014 An existing statistic can be updated, i.e. its value can be increased or
7015 decreased by an amount with the function
7016 @code{GNUNET_STATISTICS_update()}.
7017 The parameters to this function are similar to
7018 @code{GNUNET_STATISTICS_set()}, except that it takes the amount to be
7019 changed as a type @code{int64_t} instead of the value.
7020
7021 The library will combine multiple set or update operations into one
7022 message if the client performs requests at a rate that is faster than the
7023 available IPC with the STATISTICS service. Thus, the client does not have
7024 to worry about sending requests too quickly.
7025
7026 @node Watches
7027 @subsubsection Watches
7028
7029
7030
7031 An interesting feature of STATISTICS is that it can send notifications
7032 whenever a statistic of interest is modified.
7033 This is achieved by registering a watch through the function
7034 @code{GNUNET_STATISTICS_watch()}.
7035 The parameters of this function are similar to those of
7036 @code{GNUNET_STATISTICS_get()}.
7037 Changes to the respective statistic's value will then cause the given
7038 iterator callback to be called.
7039 Note: A watch can only be registered for a specific statistic. Hence
7040 the subsystem name and the parameter name cannot be @code{NULL} in a
7041 call to @code{GNUNET_STATISTICS_watch()}.
7042
7043 A registered watch will keep reporting value changes until
7044 @code{GNUNET_STATISTICS_watch_cancel()} is called with the same
7045 parameters that were used for registering the watch.
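
The watch life cycle can be sketched with a small stand-alone model. It is
not the library's implementation: @code{struct watch_list},
@code{watch_add()}, @code{watch_cancel()} and @code{watch_notify()} are
hypothetical names used only here. The sketch mirrors two rules from the text:
a watch needs both a subsystem name and a statistic name, and cancelling
requires the same parameters that were used for registering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_WATCHES 8

typedef void (*watch_cb) (void *cls, const char *subsystem,
                          const char *name, uint64_t value);

/* One registered watch (hypothetical type, not from the GNUnet headers). */
struct watch
{
  const char *subsystem;
  const char *name;
  watch_cb cb;
  void *cls;
};

struct watch_list
{
  struct watch w[MAX_WATCHES];
  size_t count;
};

/* Register a watch; both names are mandatory. Returns 0 or -1. */
int
watch_add (struct watch_list *l, const char *subsystem, const char *name,
           watch_cb cb, void *cls)
{
  if ((NULL == subsystem) || (NULL == name) || (NULL == cb))
    return -1;
  if (l->count >= MAX_WATCHES)
    return -1;
  l->w[l->count++] = (struct watch) { subsystem, name, cb, cls };
  return 0;
}

/* Cancel a watch; all parameters must equal those used to register it. */
int
watch_cancel (struct watch_list *l, const char *subsystem, const char *name,
              watch_cb cb, void *cls)
{
  if ((NULL == subsystem) || (NULL == name))
    return -1;
  for (size_t i = 0; i < l->count; i++)
  {
    struct watch *w = &l->w[i];

    if ((0 == strcmp (w->subsystem, subsystem)) &&
        (0 == strcmp (w->name, name)) &&
        (w->cb == cb) && (w->cls == cls))
    {
      l->w[i] = l->w[--l->count];   /* move the last entry into the gap */
      return 0;
    }
  }
  return -1;
}

/* Tell every watch on (subsystem, name) about the new value. */
void
watch_notify (const struct watch_list *l, const char *subsystem,
              const char *name, uint64_t new_value)
{
  for (size_t i = 0; i < l->count; i++)
  {
    const struct watch *w = &l->w[i];

    if ((0 == strcmp (w->subsystem, subsystem)) &&
        (0 == strcmp (w->name, name)))
      w->cb (w->cls, subsystem, name, new_value);
  }
}

/* Example callback: count how often we were notified. */
void
count_cb (void *cls, const char *subsystem, const char *name, uint64_t value)
{
  (void) subsystem;
  (void) name;
  (void) value;
  *(unsigned int *) cls += 1;
}
```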
7046
7047 @node The STATISTICS Client-Service Protocol
7048 @subsection The STATISTICS Client-Service Protocol
7049
7050
7051
7052 @menu
7053 * Statistics retrieval2::
7054 * Setting and updating statistics::
7055 * Watching for updates::
7056 @end menu
7057
7058 @node Statistics retrieval2
7059 @subsubsection Statistics retrieval
7060
7061
7062
7063 To retrieve statistics, the client transmits a message of type
7064 @code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem
7065 name and statistic parameter to the STATISTICS service.
7066 The service responds with a message of type
7067 @code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the statistics
7068 parameters that match the client request. The end of the retrieved
7069 information is signaled by the service by sending a message of
7070 type @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
7071
7072 @node Setting and updating statistics
7073 @subsubsection Setting and updating statistics
7074
7075
7076
7077 The subsystem name, parameter name, its value and the persistence flag are
7078 communicated to the service through the message
7079 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
7080
7081 When the service receives a message of type
7082 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
7083 name and checks for a statistic parameter whose name matches the one given
7084 in the message.
7085 If a statistic parameter is found, the value is overwritten by the new
7086 value from the message; if not found then a new statistic parameter is
7087 created with the given name and value.
7088
7089 In addition to just setting an absolute value, it is possible to perform a
7090 relative update by sending a message of type
7091 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
7092 (@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in
7093 the message should be treated as an update value.
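
The handling of absolute and relative SET messages can be sketched as follows.
This is a simplified model, not the service's code: the table layout, the
function names and the flag value @code{SKETCH_FLAG_RELATIVE} are hypothetical.
The sketch also assumes that a relative decrease larger than the current value
clamps the statistic at zero; the text above does not specify this case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STATS 16
/* Hypothetical flag value for this sketch, not taken from GNUnet headers. */
#define SKETCH_FLAG_RELATIVE 1u

struct stat_slot
{
  char name[64];
  uint64_t value;
  int in_use;
};

struct stat_table
{
  struct stat_slot slots[MAX_STATS];
};

/* Find a statistic by name or create it with value 0; NULL if table full. */
struct stat_slot *
find_or_create (struct stat_table *t, const char *name)
{
  struct stat_slot *free_slot = NULL;

  for (int i = 0; i < MAX_STATS; i++)
  {
    struct stat_slot *s = &t->slots[i];

    if (s->in_use && (0 == strcmp (s->name, name)))
      return s;
    if ((! s->in_use) && (NULL == free_slot))
      free_slot = s;
  }
  if (NULL == free_slot)
    return NULL;
  strncpy (free_slot->name, name, sizeof (free_slot->name) - 1);
  free_slot->name[sizeof (free_slot->name) - 1] = '\0';
  free_slot->value = 0;
  free_slot->in_use = 1;
  return free_slot;
}

/* Apply one SET message. Without the relative flag, 'raw' is the new value.
   With the flag, 'raw' carries a signed 64-bit delta. Returns 0 or -1. */
int
apply_set (struct stat_table *t, const char *name, uint64_t raw,
           unsigned int flags)
{
  struct stat_slot *s = find_or_create (t, name);

  if (NULL == s)
    return -1;
  if (0 == (flags & SKETCH_FLAG_RELATIVE))
  {
    s->value = raw;               /* absolute: overwrite */
    return 0;
  }
  int64_t delta = (int64_t) raw;

  if (delta < 0)
  {
    /* magnitude of delta, computed without signed overflow */
    uint64_t dec = (uint64_t) (-(delta + 1)) + 1;

    s->value = (dec > s->value) ? 0 : s->value - dec;  /* assumed clamp */
  }
  else
  {
    s->value += (uint64_t) delta;
  }
  return 0;
}
```

Because a missing statistic is created on demand, a relative update on an
unknown name behaves like a relative update starting from zero in this model.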
7094
7095 @node Watching for updates
7096 @subsubsection Watching for updates
7097
7098
7099
7100 @code{GNUNET_STATISTICS_watch()} registers the watch at the service by
7101 sending a message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}.
7102 The service then sends notifications through messages of type
7103 @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
7104 parameter's value is changed.
7105
7106 @cindex DHT
7107 @cindex Distributed Hash Table
7108 @node Distributed Hash Table (DHT)
7109 @section Distributed Hash Table (DHT)
7110
7111
7112
7113 GNUnet includes a generic distributed hash table that can be used by
7114 developers building P2P applications in the framework.
7115 This section documents high-level features and how developers are
7116 expected to use the DHT.
7117 We have a research paper detailing how the DHT works.
7118 Also, Nate's thesis includes a detailed description and performance
7119 analysis (in chapter 6).
7120
7121 Key features of GNUnet's DHT include:
7122
7123 @itemize @bullet
7124 @item stores key-value pairs with values up to (approximately) 63k in size
7125 @item works with many underlay network topologies (small-world, random
7126 graph), underlay does not need to be a full mesh / clique
7127 @item support for extended queries (more than just a simple 'key'),
7128 filtering duplicate replies within the network (bloomfilter) and content
7129 validation (for details, please read the subsection on the block library)
7130 @item can (optionally) return paths taken by the PUT and GET operations
7131 to the application
7132 @item provides content replication to handle churn
7133 @end itemize
7134
7135 GNUnet's DHT is randomized and unreliable. Unreliable means that there is
7136 no strict guarantee that a value stored in the DHT is always
7137 found --- values are only found with high probability.
7138 While this is somewhat true in all P2P DHTs, GNUnet developers should be
7139 particularly wary of this fact (this will help you write secure,
7140 fault-tolerant code). Thus, when writing any application using the DHT,
7141 you should always consider the possibility that a value stored in the
7142 DHT by you or some other peer might simply not be returned, or returned
7143 with a significant delay.
7144 Your application logic must be written to tolerate this (naturally, some
7145 loss of performance or quality of service is expected in this case).
7146
7147 @menu
7148 * Block library and plugins::
7149 * libgnunetdht::
7150 * The DHT Client-Service Protocol::
7151 * The DHT Peer-to-Peer Protocol::
7152 @end menu
7153
7154 @node Block library and plugins
7155 @subsection Block library and plugins
7156
7157
7158
7159 @menu
7160 * What is a Block?::
7161 * The API of libgnunetblock::
7162 * Queries::
7163 * Sample Code::
7164 * Conclusion2::
7165 @end menu
7166
7167 @node What is a Block?
7168 @subsubsection What is a Block?
7169
7170
7171
7172 Blocks are small (< 63k) pieces of data stored under a key (@code{struct
7173 GNUNET_HashCode}). Blocks have a type (@code{enum GNUNET_BlockType}) which defines
7174 their data format. Blocks are used in GNUnet as units of static data
7175 exchanged between peers and stored (or cached) locally.
7176 Uses of blocks include file-sharing (the files are broken up into blocks),
7177 the VPN (DNS information is stored in blocks) and the DHT (all
7178 information in the DHT and meta-information for the maintenance of the
7179 DHT are both stored using blocks).
7180 The block subsystem provides a few common functions that must be
7181 available for any type of block.
7182
7183 @cindex libgnunetblock API
7184 @node The API of libgnunetblock
7185 @subsubsection The API of libgnunetblock
7186
7187
7188
7189 The block library requires for each (family of) block type(s) a block
7190 plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
7191 functions that are needed by the DHT (and possibly other subsystems) to
7192 manage the block.
7193 These block plugins are typically implemented within their respective
7194 subsystems.
7195 The main block library is then used to locate, load and query the
7196 appropriate block plugin.
7197 Which plugin is appropriate is determined by the block type (which is
7198 just a 32-bit integer). Block plugins contain code that specifies which
7199 block types are supported by a given plugin. The block library loads all
7200 block plugins that are installed at the local peer and forwards the
7201 application request to the respective plugin.
7202
7203 The central functions of the block APIs (plugin and main library) are to
7204 allow the mapping of blocks to their respective key (if possible) and the
7205 ability to check that a block is well-formed and matches a given
7206 request (again, if possible).
7207 This way, GNUnet can avoid storing invalid blocks, storing blocks under
7208 the wrong key and forwarding blocks in response to a query that they do
7209 not answer.
7210
7211 One key function of block plugins is that they allow GNUnet to detect
7212 duplicate replies (via the Bloom filter). All plugins MUST support
7213 detecting duplicate replies (by adding the current response to the
7214 Bloom filter and rejecting it if it is encountered again).
7215 If a plugin fails to do this, responses may loop in the network.
7216
7217 @node Queries
7218 @subsubsection Queries
7219
7220
7221 The query format for any block in GNUnet consists of four main components.
7222 First, the type of the desired block must be specified. Second, the query
7223 must contain a hash code. The hash code is used for lookups in hash
7224 tables and databases and need not be unique for the block (however, if
7225 possible a unique hash should be used as this would be best for
7226 performance).
7227 Third, an optional Bloom filter can be specified to exclude known results;
7228 replies that hash to the bits set in the Bloom filter are considered
7229 invalid. False-positives can be eliminated by sending the same query
7230 again with a different Bloom filter mutator value, which parameterizes
7231 the hash function that is used.
7232 Finally, an optional application-specific "eXtended query" (xquery) can
7233 be specified to further constrain the results. It is entirely up to
7234 the type-specific plugin to determine whether or not a given block
7235 matches a query (type, hash, Bloom filter, and xquery).
7236 Naturally, not all xqueries are valid and some types of blocks may not
7237 support Bloom filters either, so the plugin also needs to check if the
7238 query is valid in the first place.
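
The interplay of Bloom filter and mutator can be sketched in a few lines.
This is an illustrative model only: GNUnet hashes replies to 512-bit hash
codes and uses its own Bloom filter implementation, whereas this sketch uses
64-bit values, a fixed filter size and a generic mixing function. All names
(@code{struct reply_filter}, @code{filter_add()}, @code{filter_test()}) are
hypothetical. The point is that the mutator changes which bits a reply maps
to, so a reply that was a false positive under one mutator is unlikely to be
one under another.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 1024u   /* filter size in bits (arbitrary for this sketch) */
#define BF_K 4u         /* bit positions set per element */

struct reply_filter
{
  uint8_t bits[BF_BITS / 8];
  uint32_t mutator;
};

/* Generic 64-bit mixer (splitmix64 finalizer), for illustration only. */
uint64_t
mix64 (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Bit index number i for a reply hash under the filter's mutator. */
uint32_t
bit_index (const struct reply_filter *f, uint64_t reply_hash, uint32_t i)
{
  uint64_t h = mix64 (reply_hash ^ ((uint64_t) f->mutator << 32) ^ i);

  return (uint32_t) (h % BF_BITS);
}

/* Start a fresh, empty filter with the given mutator. */
void
filter_init (struct reply_filter *f, uint32_t mutator)
{
  memset (f->bits, 0, sizeof (f->bits));
  f->mutator = mutator;
}

/* Record a reply as known. */
void
filter_add (struct reply_filter *f, uint64_t reply_hash)
{
  for (uint32_t i = 0; i < BF_K; i++)
  {
    uint32_t b = bit_index (f, reply_hash, i);

    f->bits[b / 8] |= (uint8_t) (1u << (b % 8));
  }
}

/* Returns 1 if the reply is (probably) known, 0 if it is definitely new. */
int
filter_test (const struct reply_filter *f, uint64_t reply_hash)
{
  for (uint32_t i = 0; i < BF_K; i++)
  {
    uint32_t b = bit_index (f, reply_hash, i);

    if (0 == (f->bits[b / 8] & (1u << (b % 8))))
      return 0;
  }
  return 1;
}
```

A querier that re-issues a request would call @code{filter_init()} with a new
mutator and add the replies it really has seen again.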
7239
7240 Depending on the results from the plugin, the DHT will then discard the
7241 (invalid) query, forward the query, discard the (invalid) reply, cache the
7242 (valid) reply, and/or forward the (valid and non-duplicate) reply.
7243
7244 @node Sample Code
7245 @subsubsection Sample Code
7246
7247
7248
7249 The source code in @strong{plugin_block_test.c} is a good starting point
7250 for new block plugins --- it does the minimal work by implementing a
7251 plugin that performs no validation at all.
7252 The respective @strong{Makefile.am} shows how to build and install a
7253 block plugin.
7254
7255 @node Conclusion2
7256 @subsubsection Conclusion
7257
7258
7259
7260 In conclusion, GNUnet subsystems that want to use the DHT need to define a
7261 block format and write a plugin to match queries and replies. For testing,
7262 the @code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts
7263 any query as valid and any reply as matching any query.
7264 This type is also used for the DHT command line tools.
7265 However, it should NOT be used for normal applications due to the lack
7266 of error checking that results from this primitive implementation.
7267
7268 @cindex libgnunetdht
7269 @node libgnunetdht
7270 @subsection libgnunetdht
7271
7272
7273
7274 The DHT API itself is pretty simple and offers the usual GET and PUT
7275 functions that work as expected. The specified block type refers to the
7276 block library which allows the DHT to run application-specific logic for
7277 data stored in the network.
7278
7279
7280 @menu
7281 * GET::
7282 * PUT::
7283 * MONITOR::
7284 * DHT Routing Options::
7285 @end menu
7286
7287 @node GET
7288 @subsubsection GET
7289
7290
7291
7292 When using GET, the main consideration for developers (other than the
7293 block library) should be that after issuing a GET, the DHT will
7294 continuously cause (small amounts of) network traffic until the operation
7295 is explicitly canceled.
7296 So GET does not simply send out a single network request once; instead,
7297 the DHT will continue to search for data. This is needed to achieve good
7298 success rates and also handles the case where the respective PUT
7299 operation happens after the GET operation was started.
7300 Developers should not cancel an existing GET operation and then
7301 explicitly re-start it to trigger a new round of network requests;
7302 this is simply inefficient, especially as the internal automated version
7303 can be more efficient, for example by filtering results in the network
7304 that have already been returned.
7305
7306 If an application that performs a GET request has a set of replies that it
7307 already knows and would like to filter, it can call
7308 @code{GNUNET_DHT_get_filter_known_results} with an array of hashes over
7309 the respective blocks to tell the DHT that these results are not
7310 desired (any more).
7311 This way, the DHT will filter the respective blocks using the block
7312 library in the network, which may result in a significant reduction in
7313 bandwidth consumption.
7314
7315 @node PUT
7316 @subsubsection PUT
7317
7318
7319
7320 @c inconsistent use of ``must'' above it's written ``MUST''
7321 In contrast to GET operations, developers @strong{must} manually re-run
7322 PUT operations periodically (if they intend the content to continue to be
7323 available). Content stored in the DHT expires or might be lost due to
7324 churn.
7325 Furthermore, GNUnet's DHT typically requires multiple rounds of PUT
7326 operations before a key-value pair is consistently available to all
7327 peers (the DHT randomizes paths and thus storage locations, and only
7328 after multiple rounds of PUTs will there be a sufficient number of
7329 replicas in large DHTs). An explicit PUT operation using the DHT API will
7330 only cause network traffic once, so in order to ensure basic availability
7331 and resistance to churn (and adversaries), PUTs must be repeated.
7332 While the exact frequency depends on the application, a rule of thumb is
7333 that there should be at least a dozen PUT operations within the content
7334 lifetime. Content in the DHT typically expires after one day, so
7335 DHT PUT operations should be repeated at least every 1-2 hours.
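
The rule of thumb amounts to a simple division. The helper below is a
hypothetical illustration and not part of the DHT API; it computes the longest
repeat interval that still fits a given number of PUTs into the content
lifetime.

```c
#include <assert.h>
#include <stdint.h>

/* Longest interval (in seconds) between PUTs such that at least
   'min_puts' PUT operations fall within 'lifetime_seconds'. */
uint64_t
put_interval_seconds (uint64_t lifetime_seconds, uint64_t min_puts)
{
  assert (min_puts > 0);
  return lifetime_seconds / min_puts;
}
```

For a lifetime of one day (86400 seconds) and a dozen PUTs this yields 7200
seconds, that is, the upper end of the 1-2 hour range given above.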
7336
7337 @node MONITOR
7338 @subsubsection MONITOR
7339
7340
7341
7342 The DHT API also allows applications to monitor messages crossing the
7343 local DHT service.
7344 The types of messages used by the DHT are GET, PUT and RESULT messages.
7345 Using the monitoring API, applications can choose to monitor these
7346 requests, possibly limiting themselves to requests for a particular block
7347 type.
7348
7349 The monitoring API is not only useful for diagnostics; it can also be
7350 used to trigger application operations based on PUT operations.
7351 For example, an application may use PUTs to distribute work requests to
7352 other peers.
7353 The workers would then monitor for PUTs that give them work, instead of
7354 looking for work using GET operations.
7355 This can be beneficial, especially if the workers have no good way to
7356 guess the keys under which work would be stored.
7357 Naturally, additional protocols might be needed to ensure that the desired
7358 number of workers will process the distributed workload.
7359
7360 @node DHT Routing Options
7361 @subsubsection DHT Routing Options
7362
7363
7364
7365 The following routing options exist for GET and PUT requests:
7366
7367 @table @asis
7368 @item @code{GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE}
7369 This option means that all peers should process the request, even if their
7370 peer ID is not closest to the key. For a PUT request, this means that all
7371 peers that a request traverses may make a copy of the data. Similarly, for a
7372 GET request, all peers will check their local database for a result. Setting
7373 this option can thus significantly improve caching and reduce bandwidth
7374 consumption, at the expense of a larger DHT database. If in doubt, we
7375 recommend using this option.
7376 @item @code{GNUNET_DHT_RO_RECORD_ROUTE}
7377 This option instructs the DHT to record the path that a GET or a PUT request
7378 takes through the overlay network. The resulting paths are then returned to
7379 the application with the respective result. This allows the receiver of a
7380 result to construct a path to the originator of the data, which might then
7381 be used for routing. Naturally, setting this option requires additional
7382 bandwidth and disk space, so applications should only set it if the paths
7383 are needed by the application logic.
7384 @item @code{GNUNET_DHT_RO_FIND_PEER}
7385 This option is internal to the DHT's peer discovery; applications should not use it.
7386 @item @code{GNUNET_DHT_RO_BART}
7387 This option is not yet implemented; it may later improve performance for clique topologies.
7388 @end table
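
Routing options are combined as bit flags. The sketch below models the effect
of the two application-relevant options on a peer that handles a request. The
enum and its numeric values are assumptions made for this example (the
authoritative definitions are in the DHT service header), and the two helper
functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the routing options; the values are assumed here. */
enum sketch_route_option
{
  SKETCH_RO_NONE = 0,
  SKETCH_RO_DEMULTIPLEX_EVERYWHERE = 1,
  SKETCH_RO_RECORD_ROUTE = 2
};

/* Does this peer process the request locally (store or look up)?
   It does if it is closest to the key or if demultiplexing is on. */
int
processes_request (uint32_t options, int is_closest_peer)
{
  return is_closest_peer ||
         (0 != (options & SKETCH_RO_DEMULTIPLEX_EVERYWHERE));
}

/* Must the path be tracked for this request? */
int
records_path (uint32_t options)
{
  return 0 != (options & SKETCH_RO_RECORD_ROUTE);
}
```

An application that wants both behaviours would pass the bitwise OR of the two
options.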
7389
7390 @node The DHT Client-Service Protocol
7391 @subsection The DHT Client-Service Protocol
7392
7393
7394
7395 @menu
7396 * PUTting data into the DHT::
7397 * GETting data from the DHT::
7398 * Monitoring the DHT::
7399 @end menu
7400
7401 @node PUTting data into the DHT
7402 @subsubsection PUTting data into the DHT
7403
7404
7405
7406 To store (PUT) data into the DHT, the client sends a
7407 @code{struct GNUNET_DHT_ClientPutMessage} to the service.
7408 This message specifies the block type, routing options, the desired
7409 replication level, the expiration time, key,
7410 value and a 64-bit unique ID for the operation. The service responds with
7411 a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same
7412 64-bit unique ID. Note that the service sends the confirmation as soon as
7413 it has locally processed the PUT request. The PUT may still be
7414 propagating through the network at this time.
7415
7416 In the future, we may want to change this to provide (limited) feedback
7417 to the client, for example if we detect that the PUT operation had no
7418 effect because the same key-value pair was already stored in the DHT.
7419 However, changing this would also require additional state and messages
7420 in the P2P interaction.
7421
7422 @node GETting data from the DHT
7423 @subsubsection GETting data from the DHT
7424
7425
7426
7427 To retrieve (GET) data from the DHT, the client sends a
7428 @code{struct GNUNET_DHT_ClientGetMessage} to the service. The message
7429 specifies routing options, a replication level (for replicating the GET,
7430 not the content), the desired block type, the key, the (optional)
7431 extended query and a unique 64-bit request ID.
7432
7433 Additionally, the client may send any number of
7434 @code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the
7435 service about results that the client is already aware of.
7436 These messages consist of the key, the unique 64-bit ID of the request,
7437 and an arbitrary number of hash codes over the blocks that the client is
7438 already aware of. As messages are restricted to 64k, a client that
7439 already knows more than about a thousand blocks may need to send
7440 several of these messages. Naturally, the client should transmit these
7441 messages as quickly as possible after the original GET request such that
7442 the DHT can filter those results in the network early on. However, as
7443 these messages are sent after the original request, it is conceivable
7444 that the DHT service may return blocks that match those already known
7445 to the client anyway.
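
The "about a thousand blocks" figure follows from the message size limit. The
following sketch works through the arithmetic under stated assumptions: hash
codes are 64 bytes (512 bits), a message holds at most 65535 bytes, and the
fixed part of the message (header, key, request ID) takes 80 bytes. The last
number is a guess made for this example, not the actual struct size, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MESSAGE_SIZE 65535u   /* messages are restricted to 64k */
#define HASH_SIZE 64u             /* one 512-bit hash code */
#define ASSUMED_FIXED_SIZE 80u    /* assumed fixed part of the message */

/* How many known-result hashes fit into a single message? */
size_t
hashes_per_message (void)
{
  return (MAX_MESSAGE_SIZE - ASSUMED_FIXED_SIZE) / HASH_SIZE;
}

/* How many messages are needed to report 'known_results' hashes? */
size_t
messages_needed (size_t known_results)
{
  size_t per_message = hashes_per_message ();

  return (known_results + per_message - 1) / per_message;  /* round up */
}
```

Under these assumptions 1022 hashes fit into one message, so a client that
knows 3000 blocks would send three messages.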
7446
7447 In response to a GET request, the service will send @code{struct
7448 GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
7449 block type, expiration, key, unique ID of the request and of course the
7450 value (a block). Depending on the options set for the respective
7451 operations, the replies may also contain the path the GET and/or the PUT
7452 took through the network.
7453
7454 A client can stop receiving replies either by disconnecting or by sending
7455 a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the
7456 key and the 64-bit unique ID of the original request. Using an
7457 explicit "stop" message is more common as this allows a client to run
7458 many concurrent GET operations over the same connection with the DHT
7459 service --- and to stop them individually.
7460
7461 @node Monitoring the DHT
7462 @subsubsection Monitoring the DHT
7463
7464
7465
7466 To begin monitoring, the client sends a
7467 @code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service.
7468 In this message, flags can be set to enable (or disable) monitoring of
7469 GET, PUT and RESULT messages that pass through a peer. The message can
7470 also restrict monitoring to a particular block type or a particular key.
7471 Once monitoring is enabled, the DHT service will notify the client about
7472 any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for
7473 GET events, @code{struct GNUNET_DHT_MonitorPutMessage}s for PUT events
7474 and @code{struct GNUNET_DHT_MonitorGetRespMessage}s for RESULTs. Each of
7475 these messages contains all of the information about the event.
7476
7477 @node The DHT Peer-to-Peer Protocol
7478 @subsection The DHT Peer-to-Peer Protocol
7479
7480
7481
7482 @menu
7483 * Routing GETs or PUTs::
7484 * PUTting data into the DHT2::
7485 * GETting data from the DHT2::
7486 @end menu
7487
7488 @node Routing GETs or PUTs
7489 @subsubsection Routing GETs or PUTs
7490
7491
7492
7493 When routing GETs or PUTs, the DHT service selects a suitable subset of
7494 neighbours for forwarding. The exact number of neighbours can be zero or
7495 more and depends on the hop counter of the query (initially zero) in
7496 relation to the (log of) the network size estimate, the desired
7497 replication level and the peer's connectivity.
7498 Depending on the hop counter and our network size estimate, the selection
7499 of the peers may be randomized or based on proximity to the key.
7500 Furthermore, requests include a set of peers that a request has already
7501 traversed; those peers are also excluded from the selection.
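
The two selection steps can be sketched as follows. The sketch makes an
assumption the text does not state explicitly: early hops (hop counter below
the logarithm of the network size estimate) pick peers at random and later
hops pick by proximity to the key. It also uses an exact list of visited
peers, whereas the protocol carries a Bloom filter that can exclude additional
peers by false positive. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum selection_mode
{
  SELECT_RANDOM,   /* spread the request through the network */
  SELECT_CLOSEST   /* converge on the peers nearest to the key */
};

/* Assumed rule: random walk first, greedy routing afterwards. */
enum selection_mode
selection_mode_for (uint32_t hop_count, uint32_t log2_network_size)
{
  return (hop_count < log2_network_size) ? SELECT_RANDOM : SELECT_CLOSEST;
}

/* Copy all neighbours that the request has not yet traversed into 'out'
   (which must have room for n_neighbours entries). Returns the count. */
size_t
filter_candidates (const uint32_t *neighbours, size_t n_neighbours,
                   const uint32_t *visited, size_t n_visited,
                   uint32_t *out)
{
  size_t kept = 0;

  for (size_t i = 0; i < n_neighbours; i++)
  {
    int seen = 0;

    for (size_t j = 0; j < n_visited; j++)
      if (neighbours[i] == visited[j])
        seen = 1;
    if (! seen)
      out[kept++] = neighbours[i];
  }
  return kept;
}
```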
7502
7503 @node PUTting data into the DHT2
7504 @subsubsection PUTting data into the DHT
7505
7506
7507
7508 To PUT data into the DHT, the service sends a @code{struct PeerPutMessage}
7509 of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective
7510 neighbour.
7511 In addition to the usual information about the content (type, routing
7512 options, desired replication level for the content, expiration time, key
7513 and value), the message contains a fixed-size Bloom filter with
7514 information about which peers (may) have already seen this request.
7515 This Bloom filter is used to ensure that DHT messages never loop back to
7516 a peer that has already processed the request.
7517 Additionally, the message includes the current hop counter and, depending
7518 on the routing options, the message may include the full path that the
7519 message has taken so far.
7520 The Bloom filter should already contain the identity of the previous hop;
7521 however, the path should not include the identity of the previous hop and
7522 the receiver should append the identity of the sender to the path, not
7523 its own identity (this is done to reduce bandwidth).
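
The path convention is easy to get wrong, so here is a minimal model of it.
The types and the function are hypothetical and peers are reduced to small
integers. The receiver appends the identity of the peer it got the message
from; it never appends itself, and the sender does not add itself before
transmitting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PATH_LEN 16

/* Stand-in for a peer identity (a public key in the real protocol). */
struct peer_id
{
  uint32_t id;
};

struct put_path
{
  struct peer_id hops[MAX_PATH_LEN];
  size_t len;
};

/* Called by the receiver of a PUT: extend the path by the previous hop.
   Returns 0 on success, -1 if the path is full. */
int
path_on_receive (struct put_path *p, struct peer_id sender)
{
  if (p->len >= MAX_PATH_LEN)
    return -1;
  p->hops[p->len++] = sender;
  return 0;
}
```

If peer A originates a PUT that travels via B to C, then B appends A on
receipt and C appends B, so C sees the path A, B without any hop having
transmitted its own identity.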
7524
7525 @node GETting data from the DHT2
7526 @subsubsection GETting data from the DHT
7527
7528
7529
7530 A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
7531 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the
7532 usual information about the request (type, routing options, desired
7533 replication level for the request, the key and the extended query), a GET
7534 request also contains a hop counter, a Bloom filter over the peers
7535 that have processed the request already and depending on the routing
7536 options the full path traversed by the GET.
7537 Finally, a GET request includes a variable-size second Bloom filter and a
7538 so-called Bloom filter mutator value which together indicate which
7539 replies the sender has already seen. During the lookup, each block that
7540 matches the block type, key and extended query is additionally subjected
7541 to a test against this Bloom filter.
7542 The block plugin is expected to take the hash of the block and combine it
7543 with the mutator value and check if the result is not yet in the Bloom
7544 filter. The originator of the query will from time to time modify the
7545 mutator to (eventually) allow false-positives filtered by the Bloom filter
7546 to be returned.
7547
7548 Peers that receive a GET request perform a local lookup (depending on
7549 their proximity to the key and the query options) and forward the request
7550 to other peers.
7551 They then remember the request (including the Bloom filter for blocking
7552 duplicate results) and when they obtain a matching, non-filtered response
7553 a @code{struct PeerResultMessage} of type
7554 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous
7555 hop.
7556 Whenever a result is forwarded, the block plugin is used to update the
7557 Bloom filter accordingly, to ensure that the same result is never
7558 forwarded more than once.
7559 The DHT service may also cache forwarded results locally if the
7560 "CACHE_RESULTS" option is set to "YES" in the configuration.
7561
7562 @cindex GNS
7563 @cindex GNU Name System
7564 @node GNU Name System (GNS)
7565 @section GNU Name System (GNS)
7566
7567
7568
7569 The GNU Name System (GNS) is a decentralized database that enables users
7570 to securely resolve names to values.
7571 Names can be used to identify other users (for example, in social
7572 networking), or network services (for example, VPN services running at a
7573 peer in GNUnet, or purely IP-based services on the Internet).
7574 Users interact with GNS by typing in a hostname that ends in a
7575 top-level domain that is configured in the ``GNS'' section, matches
7576 an identity of the user or ends in a Base32-encoded public key.
7577
7578 Videos giving an overview of most of the GNS and the motivations behind
7579 it are available here and here.
7580 The remainder of this chapter targets developers that are familiar with
7581 the high-level concepts of GNS as presented in these talks.
7582 @c TODO: Add links to here and here and to these.
7583
7584 GNS-aware applications should use the GNS resolver to obtain the
7585 respective records that are stored under that name in GNS.
7586 Each record consists of a type, value, expiration time and flags.
7587
7588 The type specifies the format of the value. Types below 65536 correspond
7589 to DNS record types; larger values are used for GNS-specific records.
7590 Applications can define new GNS record types by reserving a number and
7591 implementing a plugin (which mostly needs to convert the binary value
7592 representation to a human-readable text format and vice-versa).
7593 The expiration time specifies how long the record is to be valid.
7594 The GNS API ensures that applications are only given non-expired values.
7595 The flags are typically irrelevant for applications, as GNS uses them
7596 internally to control visibility and validity of records.
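
Two of the rules above, the split of the type number space and the filtering
of expired records, can be written down in a few lines. This is a sketch with
hypothetical names (@code{struct sketch_record}, @code{is_dns_type()},
@code{copy_unexpired()}), not the record API. It assumes absolute expiration
times and treats a record as valid while the current time is strictly before
its expiration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced record: only the fields needed for this sketch. */
struct sketch_record
{
  uint32_t type;
  uint64_t expiration;   /* absolute time, unit irrelevant here */
};

/* Types below 65536 are DNS record types, the rest are GNS-specific. */
int
is_dns_type (uint32_t type)
{
  return type < 65536u;
}

/* Copy only records that have not expired at time 'now' into 'out'
   (which must have room for n entries). Returns the number copied. */
size_t
copy_unexpired (const struct sketch_record *in, size_t n, uint64_t now,
                struct sketch_record *out)
{
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    if (in[i].expiration > now)
      out[kept++] = in[i];
  return kept;
}
```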
7597
7598 Records are stored along with a signature.
7599 The signature is generated using the private key of the authoritative
7600 zone. This allows any GNS resolver to verify the correctness of a
7601 name-value mapping.
7602
7603 Internally, GNS uses the NAMECACHE to cache information obtained from
7604 other users, the NAMESTORE to store information specific to the local
7605 users, and the DHT to exchange data between users.
7606 A plugin API is used to enable applications to define new GNS
7607 record types.
7608
7609 @menu
7610 * libgnunetgns::
7611 * libgnunetgnsrecord::
7612 * GNS plugins::
7613 * The GNS Client-Service Protocol::
7614 * Hijacking the DNS-Traffic using gnunet-service-dns::
7615 @c * Serving DNS lookups via GNS on W32::
7616 * Importing DNS Zones into GNS::
7617 @end menu
7618
7619 @node libgnunetgns
7620 @subsection libgnunetgns
7621
7622
7623
7624 The GNS API itself is extremely simple. Clients first connect to the
7625 GNS service using @code{GNUNET_GNS_connect}.
7626 They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel
7627 pending lookups using @code{GNUNET_GNS_lookup_cancel}.
7628 Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}.
7629
7630 @menu
7631 * Looking up records::
7632 * Accessing the records::
7633 * Creating records::
7634 * Future work::
7635 @end menu
7636
7637 @node Looking up records
7638 @subsubsection Looking up records
7639
7640
7641
7642 @code{GNUNET_GNS_lookup} takes a number of arguments:
7643
7644 @table @asis
7645 @item handle
7646 This is simply the GNS connection handle from @code{GNUNET_GNS_connect}.
7647 @item name
7648 The name to be resolved. This can be any valid DNS or GNS hostname.
7649 @item zone
7650 The client needs to specify the public key of the GNS zone against which the
7651 resolution should be done. Note that a key must be provided; the client
7652 should look up plausible values using its configuration, the identity
7653 service and by attempting to interpret the TLD as a base32-encoded public key.
7654 @item type
7655 This is the desired GNS or DNS record type to look for. While all records for
7656 the given name will be returned, this can be important if the client wants to
7657 resolve record types that themselves delegate resolution, such as CNAME, PKEY
7658 or GNS2DNS. Resolving a record of any of these types will only work if the
7659 respective record type is specified in the request, as the GNS resolver will
7660 otherwise follow the delegation and return the records from the respective
7661 destination, instead of the delegating record.
7662 @item only_cached
7663 This argument should typically be set to @code{GNUNET_NO}. Setting it to
7664 @code{GNUNET_YES} disables resolution via the overlay network.
7665 @item shorten_zone_key
7666 If GNS encounters new names during resolution, their respective zones can
7667 automatically be learned and added to the "shorten zone". If this is desired,
7668 clients must pass the private key of the shorten zone. If @code{NULL} is
7669 passed, shortening is disabled.
7670 @item proc
7671 This argument identifies the function to call with the result. It is given
7672 proc_cls, the number of records found (possibly zero) and the array of the
7673 records as arguments. proc will only be called once. After proc has been
7674 called, the lookup must no longer be canceled.
7675 @item proc_cls
7676 The closure for proc.
7677 @end table
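
The effect of the @code{type} argument on delegating records can be
illustrated with a toy model. The following Python sketch is not the GNS
resolver: the zone layout, the @code{resolve} function and the record
tuples are hypothetical, and only CNAME delegation is modelled. It merely
mimics the rule described above, namely that a delegating record is
returned only when its own type was requested.

@example
```python
# Toy model of how the requested type interacts with delegation.
# All names and the zone layout are hypothetical, not GNUnet API.

# zone name -> label -> list of (record type, value)
ZONES = dict(
    root=dict(
        www=[("A", "192.0.2.1")],
        alias=[("CNAME", "www")],
    ),
)


def resolve(zone, label, wanted, depth=0):
    """Return the records for label, following CNAMEs unless requested."""
    if depth > 8:
        return []  # give up on delegation loops
    records = ZONES[zone].get(label, [])
    for rtype, value in records:
        if rtype == "CNAME" and wanted != "CNAME":
            # Delegation is followed; the CNAME itself is not returned.
            return resolve(zone, value, wanted, depth + 1)
    return records
```
@end example

In this model, asking for type A under the label @code{alias} yields the
A record of @code{www}, while asking for type CNAME yields the CNAME
record itself.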
7678
7679 @node Accessing the records
7680 @subsubsection Accessing the records
7681
7682
7683
7684 The @code{libgnunetgnsrecord} library provides an API to manipulate the
7685 GNS record array that is given to proc. In particular, it offers
7686 functions such as converting record values to human-readable
7687 strings (and back). However, most @code{libgnunetgnsrecord} functions are
7688 not interesting to GNS client applications.
7689
7690 For DNS records, the @code{libgnunetdnsparser} library provides
7691 functions for parsing (and serializing) common types of DNS records.
7692
7693 @node Creating records
7694 @subsubsection Creating records
7695
7696
7697
7698 Creating GNS records is typically done by building the respective record
7699 information (possibly with the help of @code{libgnunetgnsrecord} and
@code{libgnunetdnsparser}) and then using @code{libgnunetnamestore} to
7701 publish the information. The GNS API is not involved in this
7702 operation.
7703
7704 @node Future work
7705 @subsubsection Future work
7706
7707
7708
7709 In the future, we want to expand @code{libgnunetgns} to allow
7710 applications to observe shortening operations performed during GNS
7711 resolution, for example so that users can receive visual feedback when
7712 this happens.
7713
7714 @node libgnunetgnsrecord
7715 @subsection libgnunetgnsrecord
7716
7717
7718
7719 The @code{libgnunetgnsrecord} library is used to manipulate GNS
7720 records (in plaintext or in their encrypted format).
7721 Applications mostly interact with @code{libgnunetgnsrecord} by using the
7722 functions to convert GNS record values to strings or vice-versa, or to
look up a GNS record type number by name (or vice-versa).
7724 The library also provides various other functions that are mostly
7725 used internally within GNS, such as converting keys to names, checking for
7726 expiration, encrypting GNS records to GNS blocks, verifying GNS block
7727 signatures and decrypting GNS records from GNS blocks.
7728
We will now discuss the four commonly used functions of the API.
7730 @code{libgnunetgnsrecord} does not perform these operations itself,
7731 but instead uses plugins to perform the operation.
7732 GNUnet includes plugins to support common DNS record types as well as
7733 standard GNS record types.
7734
7735 @menu
7736 * Value handling::
7737 * Type handling::
7738 @end menu
7739
7740 @node Value handling
7741 @subsubsection Value handling
7742
7743
7744
7745 @code{GNUNET_GNSRECORD_value_to_string} can be used to convert
7746 the (binary) representation of a GNS record value to a human readable,
7747 0-terminated UTF-8 string.
7748 NULL is returned if the specified record type is not supported by any
7749 available plugin.
7750
7751 @code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a
7752 human readable string to the respective (binary) representation of
7753 a GNS record value.
7754
7755 @node Type handling
7756 @subsubsection Type handling
7757
7758
7759
7760 @code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the
7761 numeric value associated with a given typename. For example, given the
typename "A" (for DNS A records), the function will return the number 1.
7763 A list of common DNS record types is
7764 @uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}.
7765 Note that not all DNS record types are supported by GNUnet GNSRECORD
7766 plugins at this time.
7767
7768 @code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the
7769 typename associated with a given numeric value.
7770 For example, given the type number 1, the function will return the
7771 typename "A".
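
The two directions of the mapping can be pictured as a pair of lookup
tables. The Python sketch below is only a stand-in for illustration: the
helper names are hypothetical, the table is limited to a few DNS types with
their IANA-assigned numbers, and returning @code{None} for unknown input is
a choice of this sketch, not a statement about the C functions.

@example
```python
# Stand-in for the typename/number mapping; DNS type numbers per IANA.
# The function names below are hypothetical Python helpers.

DNS_TYPES = (
    ("A", 1), ("NS", 2), ("CNAME", 5), ("SOA", 6),
    ("MX", 15), ("TXT", 16), ("AAAA", 28), ("SRV", 33),
)

NAME_TO_NUMBER = dict(DNS_TYPES)
NUMBER_TO_NAME = dict((number, name) for name, number in DNS_TYPES)


def typename_to_number(name):
    """Return the numeric type, or None if the name is unknown."""
    return NAME_TO_NUMBER.get(name)


def number_to_typename(number):
    """Return the typename, or None if the number is unknown."""
    return NUMBER_TO_NAME.get(number)
```
@end example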
7772
7773 @node GNS plugins
7774 @subsection GNS plugins
7775
7776
7777
7778 Adding a new GNS record type typically involves writing (or extending) a
7779 GNSRECORD plugin. The plugin needs to implement the
7780 @code{gnunet_gnsrecord_plugin.h} API which provides basic functions that
7781 are needed by GNSRECORD to convert typenames and values of the respective
7782 record type to strings (and back).
7783 These gnsrecord plugins are typically implemented within their respective
7784 subsystems.
7785 Examples for such plugins can be found in the GNSRECORD, GNS and
7786 CONVERSATION subsystems.
7787
7788 The @code{libgnunetgnsrecord} library is then used to locate, load and
7789 query the appropriate gnsrecord plugin.
7790 Which plugin is appropriate is determined by the record type (which is
just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all
gnsrecord plugins that are installed at the local peer and forwards the
application request to the plugins. If the record type is not
supported by a plugin, it should simply return an error code.

The central functions of the GNSRECORD APIs (plugin and main library) are the
7797 same four functions for converting between values and strings, and
7798 typenames and numbers documented in the previous subsection.
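
The dispatch described above, where a request is forwarded to the plugins
and a plugin declines record types it does not know, might be sketched as
follows. This is a simplified Python model under stated assumptions: the
plugin functions, the registry list and the use of @code{None} as the
error value are all hypothetical and do not mirror the C plugin API.

@example
```python
# Toy plugin dispatch: every plugin is asked until one handles the type.
# The plugin functions and the registry are hypothetical.
import socket

TYPE_A = 1
TYPE_TXT = 16


def dns_a_plugin(rtype, data):
    """Convert an A record value (4 bytes) to dotted-quad notation."""
    if rtype != TYPE_A or len(data) != 4:
        return None  # error: not handled by this plugin
    return socket.inet_ntoa(data)


def txt_plugin(rtype, data):
    """Convert a TXT record value to a string."""
    if rtype != TYPE_TXT:
        return None
    return data.decode("utf-8")


PLUGINS = [dns_a_plugin, txt_plugin]


def value_to_string(rtype, data):
    """Forward the request to each plugin; None if none supports it."""
    for plugin in PLUGINS:
        result = plugin(rtype, data)
        if result is not None:
            return result
    return None
```
@end example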
7799
7800 @node The GNS Client-Service Protocol
7801 @subsection The GNS Client-Service Protocol
7802
7803
7804 The GNS client-service protocol consists of two simple messages, the
7805 @code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP}
7806 message contains a unique 32-bit identifier, which will be included in the
7807 corresponding response. Thus, clients can send many lookup requests in
7808 parallel and receive responses out-of-order.
7809 A @code{LOOKUP} request also includes the public key of the GNS zone,
7810 the desired record type and fields specifying whether shortening is
7811 enabled or networking is disabled. Finally, the @code{LOOKUP} message
7812 includes the name to be resolved.
7813
7814 The response includes the number of records and the records themselves
7815 in the format created by @code{GNUNET_GNSRECORD_records_serialize}.
7816 They can thus be deserialized using
7817 @code{GNUNET_GNSRECORD_records_deserialize}.
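
The role of the 32-bit identifier can be illustrated with a small
bookkeeping model. The Python sketch below is not part of
@code{libgnunetgns}; the class and method names are hypothetical and no
real messages are sent. It only shows how a client could match responses
that arrive out of order to the lookups that caused them.

@example
```python
# Toy client-side bookkeeping for parallel lookups.
# Class and method names are hypothetical, not part of libgnunetgns.


class LookupClient:
    def __init__(self):
        self.next_id = 0
        self.pending = dict()  # request id -> callback

    def lookup(self, name, callback):
        """Register a callback and return the request that would be sent."""
        request_id = self.next_id
        self.next_id = (self.next_id + 1) % (2 ** 32)  # 32-bit identifier
        self.pending[request_id] = callback
        return (request_id, name)

    def handle_result(self, request_id, records):
        """Dispatch a LOOKUP_RESULT to the matching callback, once."""
        callback = self.pending.pop(request_id, None)
        if callback is not None:
            callback(records)
```
@end example

Because the entry is removed from the table when the result is handled,
each callback in this model runs at most once per lookup.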
7818
7819 @node Hijacking the DNS-Traffic using gnunet-service-dns
7820 @subsection Hijacking the DNS-Traffic using gnunet-service-dns
7821
7822
7823
7824 This section documents how the gnunet-service-dns (and the
7825 gnunet-helper-dns) intercepts DNS queries from the local system.
7826 This is merely one method for how we can obtain GNS queries.
7827 It is also possible to change @code{resolv.conf} to point to a machine
7828 running @code{gnunet-dns2gns} or to modify libc's name system switch
7829 (NSS) configuration to include a GNS resolution plugin.
7830 The method described in this chapter is more of a last-ditch catch-all
7831 approach.
7832
7833 @code{gnunet-service-dns} enables intercepting DNS traffic using policy
7834 based routing.
7835 We MARK every outgoing DNS-packet if it was not sent by our application.
7836 Using a second routing table in the Linux kernel these marked packets are
7837 then routed through our virtual network interface and can thus be
7838 captured unchanged.
7839
7840 Our application then reads the query and decides how to handle it.
7841 If the query can be addressed via GNS, it is passed to
7842 @code{gnunet-service-gns} and resolved internally using GNS.
7843 In the future, a reverse query for an address of the configured virtual
7844 network could be answered with records kept about previous forward
7845 queries.
7846 Queries that are not hijacked by some application using the DNS service
7847 will be sent to the original recipient.
7848 The answer to the query will always be sent back through the virtual
7849 interface with the original nameserver as source address.
7850
7851
7852 @menu
7853 * Network Setup Details::
7854 @end menu
7855
7856 @node Network Setup Details
7857 @subsubsection Network Setup Details
7858
7859
7860
7861 The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT \
  --dport 53 -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK \
  --set-mark 3
ip rule add fwmark 3 table2
ip route add default via $VIRTUALDNS table2
@end example

The first rule makes sure that all packets coming from a port our
application opened beforehand (@code{$LOCALPORT}) will be routed normally.
The second rule marks every other packet to a DNS server with mark 3
(chosen arbitrarily). The third command adds a routing policy that sends
packets carrying mark 3 to a separate routing table, and the fourth
command adds a default route via @code{$VIRTUALDNS} to that table.
7876
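The combined effect of these rules can be summarised as a small decision
function. The Python sketch below is only a model for reasoning about
which packets end up on the virtual interface; the port number and the
function names are hypothetical, and the kernel's actual packet handling
is of course more involved.

@example
```python
# Toy model of the packet classes created by the rules above.
# LOCALPORT and the function names are hypothetical example values.

LOCALPORT = 40000
DNS_MARK = 3


def classify(proto, sport, dport):
    """Return the fwmark a locally generated packet would carry."""
    if proto == "udp" and dport == 53:
        if sport == LOCALPORT:
            return 0  # rule 1: our own queries are accepted unmarked
        return DNS_MARK  # rule 2: every other DNS packet is marked
    return 0  # non-DNS traffic is not touched


def routing_table(mark):
    """Select the routing table according to the mark (ip rule)."""
    if mark == DNS_MARK:
        return "table2"
    return "main"
```
@end example
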
7877 @c @node Serving DNS lookups via GNS on W32
7878 @c @subsection Serving DNS lookups via GNS on W32
7879
7880
7881
7882 @c This section documents how the libw32nsp (and
7883 @c gnunet-gns-helper-service-w32) do DNS resolutions of DNS queries on the
7884 @c local system. This only applies to GNUnet running on W32.
7885
7886 @c W32 has a concept of "Namespaces" and "Namespace providers".
7887 @c These are used to present various name systems to applications in a
7888 @c generic way.
7889 @c Namespaces include DNS, mDNS, NLA and others. For each namespace any
7890 @c number of providers could be registered, and they are queried in an order
7891 @c of priority (which is adjustable).
7892
7893 @c Applications can resolve names by using WSALookupService*() family of
7894 @c functions.
7895
7896 @c However, these are WSA-only facilities. Common BSD socket functions for
7897 @c namespace resolutions are gethostbyname and getaddrinfo (among others).
7898 @c These functions are implemented internally (by default - by mswsock,
7899 @c which also implements the default DNS provider) as wrappers around
7900 @c WSALookupService*() functions (see "Sample Code for a Service Provider"
7901 @c on MSDN).
7902
7903 @c On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be
7904 @c installed into the system by using w32nsp-install (and uninstalled by
7905 @c w32nsp-uninstall), as described in "Installation Handbook".
7906
7907 @c libw32nsp is very simple and has almost no dependencies. As a response to
7908 @c NSPLookupServiceBegin(), it only checks that the provider GUID passed to
7909 @c it by the caller matches GNUnet DNS Provider GUID,
7910 @c then connects to
7911 @c gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the
7912 @c name resolution request there, returning the connected socket to the
7913 @c caller.
7914
7915 @c When the caller invokes NSPLookupServiceNext(), libw32nsp reads a
7916 @c completely formed reply from that socket, unmarshalls it, then gives
7917 @c it back to the caller.
7918
7919 @c At the moment gnunet-gns-helper-service-w32 is implemented to ever give
7920 @c only one reply, and subsequent calls to NSPLookupServiceNext() will fail
7921 @c with WSA_NODATA (first call to NSPLookupServiceNext() might also fail if
7922 @c GNS failed to find the name, or there was an error connecting to it).
7923
7924 @c gnunet-gns-helper-service-w32 does most of the processing:
7925
7926 @c @itemize @bullet
7927 @c @item Maintains a connection to GNS.
7928 @c @item Reads GNS config and loads appropriate keys.
7929 @c @item Checks service GUID and decides on the type of record to look up,
7930 @c refusing to make a lookup outright when unsupported service GUID is
7931 @c passed.
7932 @c @item Launches the lookup
7933 @c @end itemize
7934
7935 @c When lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
7936 @c reply (including filling a WSAQUERYSETW structure and, possibly, a binary
7937 @c blob with a hostent structure for gethostbyname() client), marshalls it,
7938 @c and sends it back to libw32nsp. If no records were found, it sends an
7939 @c empty header.
7940
7941 @c This works for most normal applications that use gethostbyname() or
7942 @c getaddrinfo() to resolve names, but fails to do anything with
7943 @c applications that use alternative means of resolving names (such as
7944 @c sending queries to a DNS server directly by themselves).
7945 @c This includes some of well known utilities, like "ping" and "nslookup".
7946
7947 @node Importing DNS Zones into GNS
7948 @subsection Importing DNS Zones into GNS
7949
7950 This section discusses the challenges and problems faced when writing the
7951 Ascension tool. It also takes a look at possible improvements in the
7952 future.
7953
7954 Consider the following diagram that shows the workflow of Ascension:
7955
@image{images/ascension_ssd,6in,,Ascension's workflow}

The interaction between the components of GNUnet is shown in the diagram
below:
@center @image{images/ascension_interaction,,6in,Ascension's interaction with GNUnet components}
7961
7962 @menu
7963 * Conversions between DNS and GNS::
7964 * DNS Zone Size::
7965 * Performance::
7966 @end menu
7967
7968 @cindex DNS Conversion
7969 @node Conversions between DNS and GNS
7970 @subsubsection Conversions between DNS and GNS
7971
The differences between the two name systems lie in the details and are not
always transparent.  For instance, an SRV record is converted to a BOX record,
which is unique to GNS.

The following example shows an existing SRV record and the BOX record it is
converted to:
7977
7978 @example
7979 # SRV
7980 # _service._proto.name. TTL class SRV priority weight port target
7981 _sip._tcp.example.com. 14000 IN SRV     0 0 5060 www.example.com.
7982 # BOX
7983 # TTL BOX flags port protocol recordtype priority weight port target
7984 14000 BOX n 5060 6 33 0 0 5060 www.example.com
7985 @end example
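
The conversion in the example above could be sketched in Python as
follows. This is not Ascension's code: the function name and the protocol
table are assumptions made for illustration, and the output format simply
follows the BOX line shown above.

@example
```python
# Sketch of the SRV-to-BOX conversion shown in the example above.
# The function name and the protocol table are illustrative assumptions.

PROTOCOLS = dict(tcp=6, udp=17)  # IANA protocol numbers
SRV_TYPE = 33  # DNS record type number of SRV


def srv_to_box(line):
    """Convert one SRV zone-file line into a BOX record line."""
    owner, ttl, _cls, rtype, prio, weight, port, target = line.split()
    if rtype != "SRV":
        raise ValueError("not an SRV record")
    _service, proto = owner.split(".")[:2]
    proto_number = PROTOCOLS[proto.lstrip("_")]
    return "%s BOX n %s %d %d %s %s %s %s" % (
        ttl, port, proto_number, SRV_TYPE,
        prio, weight, port, target.rstrip("."),
    )
```
@end example

The port appears twice in the result: once as part of the BOX header and
once inside the boxed SRV data.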
7986
Other record types that need to undergo such a transformation are MX and
SOA.

The transformation of a SOA record into GNS works as described in the
following example. Note in particular the rname and mname keys.
7992
7993 @example
7994 # BIND syntax for a clean SOA record
@@   IN SOA master.example.com. hostmaster.example.com. (
7996     2017030300 ; serial
7997     3600       ; refresh
7998     1800       ; retry
7999     604800     ; expire
8000     600 )      ; ttl
8001 # Recordline for adding the record
$ gnunet-namestore -z example.com -a -n @@ -t SOA -V \
8003     rname=master.example.com mname=hostmaster.example.com  \
8004     2017030300,3600,1800,604800,600 -e 7200s
8005 @end example
8006
MX records are transformed as follows:
8008 @example
8009 # mail.example.com. 3600 IN MX 10 mail.example.com.
8010 $ gnunet-namestore -z example.com -n mail -R 3600 MX n 10,mail
8011 @end example
8012
Finally, one of the biggest difficulties was the handling of NS records that
are found in top-level domain zones. The intended behaviour is to add
GNS2DNS records for them so that gnunet-gns can resolve records for those
domains on its own. This requires the values from DNS GLUE records, provided
they are within the same zone.

The following two examples, both taken from the 'com' TLD, show one NS record
with a GLUE record and one without.
8021
8022 @example
8023 # ns1.example.com 86400 IN A 127.0.0.1
8024 # example.com 86400 IN NS ns1.example.com.
8025 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n \
8026     example.com@@127.0.0.1
8027
8028 # example.com 86400 IN NS ns1.example.org.
8029 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n \
8030     example.com@@ns1.example.org
8031 @end example
8032
8033 As you can see, one of the GNS2DNS records has an IP address listed and the
8034 other one a DNS name. For the first one there is a GLUE record to do the
8035 translation directly and the second one will issue another DNS query to figure
8036 out the IP of ns1.example.org.
8037
8038 A solution was found by creating a hierarchical zone structure in GNS and linking
8039 the zones using PKEY records to one another. This allows the resolution of the
8040 name servers to work within GNS while not taking control over unwanted zones.
8041
8042 Currently the following record types are supported:
8043 @itemize @bullet
8044 @item A
8045 @item AAAA
8046 @item CNAME
8047 @item MX
8048 @item NS
8049 @item SRV
8050 @item TXT
8051 @end itemize
8052
This is not due to technical limitations but rather to practical ones. The
problem occurs with DNSSEC-enabled DNS zones. As records within those zones are
signed periodically, and every new signature is an update to the zone, there are
many revisions of zones. This is a problem for bigger zones, as many records are
signed again without any major change.  Also, adding records of unknown types
that require a different format takes time, as each one causes a CLI call of the
namestore.  Furthermore, certain record types need transformation into a
GNS-compatible format which, depending on the record type, takes more time.
8062
Further, a blacklist was added to drop, for instance, DNSSEC-related records.
If a record type is neither in the whitelist nor in the blacklist, it is
considered a loss of data and a message is shown to the user. This helps with
transparency and also with contributing, as the unsupported record types can
8067 then be added accordingly.
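
The three-way decision described above might look like the following
Python sketch. It is not taken from Ascension: the function name and the
exact contents of the blacklist are assumptions (the text only states that
DNSSEC-related records are dropped), while the whitelist follows the list
of supported record types given above.

@example
```python
# Sketch of the three-way decision described above.
# The blacklist contents are an assumption made for illustration.

SUPPORTED = frozenset(["A", "AAAA", "CNAME", "MX", "NS", "SRV", "TXT"])
DROPPED = frozenset(["RRSIG", "NSEC", "NSEC3", "DNSKEY", "DS"])


def classify_record(rtype):
    """Return 'migrate', 'drop' or 'report' for a DNS record type."""
    if rtype in SUPPORTED:
        return "migrate"
    if rtype in DROPPED:
        return "drop"
    return "report"  # data would be lost, so tell the user
```
@end example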
8068
8069 @node DNS Zone Size
8070 @subsubsection DNS Zone Size
8071 Another very big problem exists with very large zones. When migrating a small
8072 zone the delay between adding of records and their expiry is negligible. However
8073 when working with big zones that easily have more than a few million records
8074 this delay becomes a problem.
8075
8076 Records will start to expire well before the zone has finished migrating. This
8077 is usually not a problem but can cause a high CPU load when a peer is restarted
8078 and the records have expired.
8079
A good solution has not been found yet. One idea that was floated was
that the records should be added with the s (shadow) flag to keep the records
resolvable even if they expired. However, this would introduce the problem of how
to detect if a record has been removed from the zone and would require deletion
of said record(s).
8085
Another problem that still persists is how to refresh records. Expired records
are still displayed when calling gnunet-namestore but do not resolve with
gnunet-gns. Zonemaster will sign the expired records again and make sure that
the records are still valid. This was fixed by a recent change to gnunet-gns
that improves the suffix lookup, which allows for a fast lookup even with
thousands of local egos.
8092
Currently, records are added at a rate of around 10 records per second.
Cryptographic operations set the upper limit for adding records. The performance
of your machine can be tested with the perf_crypto_* tools. There is still a big
discrepancy between the pace of Ascension and the theoretical limit.
8097
8098 A performance metric for measuring improvements has not yet been implemented in
8099 Ascension.
8100
8101 @node Performance
8102 @subsubsection Performance
The performance when migrating a zone using the Ascension tool is limited by a
handful of factors. First of all, Ascension is written in Python3 and calls the
CLI tools of GNUnet. Each call is comparable to a fork and exec call, which
costs a few CPU cycles. Furthermore, all the records that are added to the same
label are signed using the zone's private key. This signing operation is very
resource heavy and was optimized during development by adding the '-R'
(Recordline) option to gnunet-namestore, which allows specifying multiple records
using the CLI tool. Assuming that in a TLD zone every domain has at least two
8111 name servers this halves the amount of signatures needed.
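
The saving can be made concrete with a small counting model. The Python
sketch below is not Ascension's code; the function names and the record
tuples are hypothetical. It groups records by label, which is the unit
that is signed, and counts how many namestore calls would be needed with
and without batching.

@example
```python
# Sketch: group records by label so each label is signed once.
from collections import OrderedDict


def group_by_label(records):
    """Map each label to the list of (ttl, type, value) records under it."""
    groups = OrderedDict()
    for label, ttl, rtype, value in records:
        groups.setdefault(label, []).append((ttl, rtype, value))
    return groups


def signing_operations(records, batched):
    """Count namestore calls (and thus signatures) needed for records."""
    if batched:
        return len(group_by_label(records))
    return len(records)
```
@end example

With two NS records per domain, the batched count in this model is half
of the unbatched one.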
8112
8113 Another improvement that could be made is with the addition of multiple threads
8114 or using asynchronous subprocesses when opening the GNUnet CLI tools. This could
8115 be implemented by simply creating more workers in the program but performance
8116 improvements were not tested.
8117
Ascension was tested using different hardware and database backends. Performance
differences between SQLite and PostgreSQL are marginal.
What did make a huge impact on record adding performance was the storage medium.
On a traditional mechanical hard drive, adding records was slow compared to a
solid state disk.
8123
8124 In conclusion there are many bottlenecks still around in the program, namely the
8125 single threaded implementation and inefficient, sequential calls of
8126 gnunet-namestore. In the future a solution that uses the C API would be cleaner
8127 and better.
8128
8129 @cindex GNS Namecache
8130 @node GNS Namecache
8131 @section GNS Namecache
8132
8133 The NAMECACHE subsystem is responsible for caching (encrypted) resolution
8134 results of the GNU Name System (GNS). GNS makes zone information available
8135 to other users via the DHT. However, as accessing the DHT for every
8136 lookup is expensive (and as the DHT's local cache is lost whenever the
8137 peer is restarted), GNS uses the NAMECACHE as a more persistent cache for
8138 DHT lookups.
8139 Thus, instead of always looking up every name in the DHT, GNS first
8140 checks if the result is already available locally in the NAMECACHE.
Only if there is no result in the NAMECACHE does GNS query the DHT.
8142 The NAMECACHE stores data in the same (encrypted) format as the DHT.
8143 It thus makes no sense to iterate over all items in the
8144 NAMECACHE --- the NAMECACHE does not have a way to provide the keys
8145 required to decrypt the entries.
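
The lookup order described above can be modelled in a few lines. In the
Python sketch below, both the NAMECACHE and the DHT are replaced by plain
dictionaries, the class name is hypothetical, and block expiration is
ignored; the sketch only shows that the DHT is consulted on a cache miss
and that its answer is then kept locally.

@example
```python
# Toy model of the lookup order: NAMECACHE first, DHT only on a miss.
# Both stores are plain dictionaries here; blocks stay opaque bytes.


class Resolver:
    def __init__(self, dht):
        self.namecache = dict()
        self.dht = dht
        self.dht_queries = 0

    def lookup(self, query):
        """Return the block for query, caching DHT results locally."""
        block = self.namecache.get(query)
        if block is not None:
            return block
        self.dht_queries += 1
        block = self.dht.get(query)
        if block is not None:
            self.namecache[query] = block
        return block
```
@end example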
8146
8147 Blocks in the NAMECACHE share the same expiration mechanism as blocks in
the DHT --- the block expires whenever any of the records in
the (encrypted) block expires.
The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required
work to expire blocks; clients do not have to worry about this.
8153 Also, given that NAMECACHE stores only GNS blocks that local users
8154 requested, there is no configuration option to limit the size of the
8155 NAMECACHE. It is assumed to be always small enough (a few MB) to fit on
8156 the drive.
8157
8158 The NAMECACHE supports the use of different database backends via a
8159 plugin API.
8160
8161 @menu
8162 * libgnunetnamecache::
8163 * The NAMECACHE Client-Service Protocol::
8164 * The NAMECACHE Plugin API::
8165 @end menu
8166
8167 @node libgnunetnamecache
8168 @subsection libgnunetnamecache
8169
8170
8171
8172 The NAMECACHE API consists of five simple functions. First, there is
8173 @code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
8174 This returns the handle required for all other operations on the
8175 NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
8176 block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that
8178 were stored in the NAMECACHE. Both operations can be canceled using
8179 @code{GNUNET_NAMECACHE_cancel}. Note that canceling a
8180 @code{GNUNET_NAMECACHE_block_cache} operation can result in the block
8181 being stored in the NAMECACHE --- or not. Cancellation primarily ensures
8182 that the continuation function with the result of the operation will no
8183 longer be invoked.
8184 Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
8185 NAMECACHE.
8186
8187 The maximum size of a block that can be stored in the NAMECACHE is
8188 @code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
8189
8190 @node The NAMECACHE Client-Service Protocol
8191 @subsection The NAMECACHE Client-Service Protocol
8192
8193
8194
8195 All messages in the NAMECACHE IPC protocol start with the
8196 @code{struct GNUNET_NAMECACHE_Header} which adds a request
8197 ID (32-bit integer) to the standard message header.
8198 The request ID is used to match requests with the
8199 respective responses from the NAMECACHE, as they are allowed to happen
8200 out-of-order.
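
As an illustration of this layout, the following Python sketch packs and
unpacks such a header. It assumes the standard GNUnet message header of a
16-bit size followed by a 16-bit type, both in network byte order, and
places the 32-bit request ID directly after it. The helper names are
hypothetical, and the message type passed by a caller would have to be a
real protocol constant; none is defined here.

@example
```python
# Sketch of a header with a 32-bit request ID after the standard
# message header (16-bit size, 16-bit type, network byte order).
# The helper names are hypothetical.
import struct

HEADER = struct.Struct("!HHI")


def pack_header(msg_type, request_id, payload=b""):
    """Serialize header plus payload; size covers the whole message."""
    size = HEADER.size + len(payload)
    return HEADER.pack(size, msg_type, request_id) + payload


def unpack_header(message):
    """Return (size, type, request id) of a serialized message."""
    return HEADER.unpack(message[:HEADER.size])
```
@end example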
8201
8202
8203 @menu
8204 * Lookup::
8205 * Store::
8206 @end menu
8207
8208 @node Lookup
8209 @subsubsection Lookup
8210
8211
8212
The @code{struct LookupBlockMessage} is used to look up a block stored in
8214 the cache.
8215 It contains the query hash. The NAMECACHE always responds with a
8216 @code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
8217 response, it sets the expiration time in the response to zero.
8218 Otherwise, the response is expected to contain the expiration time, the
8219 ECDSA signature, the derived key and the (variable-size) encrypted data
8220 of the block.
8221
8222 @node Store
8223 @subsubsection Store
8224
8225
8226
8227 The @code{struct BlockCacheMessage} is used to cache a block in the
8228 NAMECACHE.
8229 It has the same structure as the @code{struct LookupBlockResponseMessage}.
8230 The service responds with a @code{struct BlockCacheResponseMessage} which
8231 contains the result of the operation (success or failure).
8232 In the future, we might want to make it possible to provide an error
8233 message as well.
8234
8235 @node The NAMECACHE Plugin API
8236 @subsection The NAMECACHE Plugin API
8237
8238
8239 The NAMECACHE plugin API consists of two functions, @code{cache_block} to
store a block in the database, and @code{lookup_block} to look up a block
8241 in the database.
8242
8243
8244 @menu
8245 * Lookup2::
8246 * Store2::
8247 @end menu
8248
8249 @node Lookup2
@subsubsection Lookup
8251
8252
8253
8254 The @code{lookup_block} function is expected to return at most one block
8255 to the iterator, and return @code{GNUNET_NO} if there were no non-expired
8256 results.
8257 If there are multiple non-expired results in the cache, the lookup is
8258 supposed to return the result with the largest expiration time.
8259
8260 @node Store2
@subsubsection Store
8262
8263
8264
8265 The @code{cache_block} function is expected to try to store the block in
8266 the database, and return @code{GNUNET_SYSERR} if this was not possible
8267 for any reason.
8268 Furthermore, @code{cache_block} is expected to implicitly perform cache
8269 maintenance and purge blocks from the cache that have expired. Note that
8270 @code{cache_block} might encounter the case where the database already has
8271 another block stored under the same key. In this case, the plugin must
8272 ensure that the block with the larger expiration time is preserved.
Obviously, this can be done either by simply adding new blocks and selecting
8274 for the most recent expiration time during lookup, or by checking which
8275 block is more recent during the store operation.
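
The contract for the two plugin functions can be illustrated with an
in-memory model. The Python sketch below is not a NAMECACHE plugin: the
class is hypothetical, time is passed in explicitly as a number, and plain
Python values stand in for the GNUnet return codes. It takes the second
of the two approaches mentioned above and resolves conflicts during the
store operation.

@example
```python
# In-memory sketch of the plugin contract described above.
# Plain Python values stand in for the GNUnet return codes.


class MemoryPlugin:
    def __init__(self):
        self.blocks = dict()  # key -> (expiration, block)

    def cache_block(self, key, expiration, block, now):
        """Store a block, purge expired ones, keep the later expiration."""
        for old_key in [k for k, v in self.blocks.items() if v[0] <= now]:
            del self.blocks[old_key]
        current = self.blocks.get(key)
        if current is None or current[0] < expiration:
            self.blocks[key] = (expiration, block)
        return True

    def lookup_block(self, key, now):
        """Return the non-expired block for key, or None."""
        entry = self.blocks.get(key)
        if entry is None or entry[0] <= now:
            return None
        return entry[1]
```
@end example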
8276
8277 @cindex REVOCATION Subsystem
8278 @node REVOCATION Subsystem
8279 @section REVOCATION Subsystem
8280
8281
8282 The REVOCATION subsystem is responsible for key revocation of Egos.
If a user learns that their private key has been compromised, or has lost
it, they can use the REVOCATION system to inform all of the other users
that their private key is no longer valid.
8286 The subsystem thus includes ways to query for the validity of keys and to
8287 propagate revocation messages.
8288
8289 @menu
8290 * Dissemination::
8291 * Revocation Message Design Requirements::
8292 * libgnunetrevocation::
8293 * The REVOCATION Client-Service Protocol::
8294 * The REVOCATION Peer-to-Peer Protocol::
8295 @end menu
8296
8297 @node Dissemination
8298 @subsection Dissemination
8299
8300
8301
8302 When a revocation is performed, the revocation is first of all
8303 disseminated by flooding the overlay network.
8304 The goal is to reach every peer, so that when a peer needs to check if a
8305 key has been revoked, this will be purely a local operation where the
8306 peer looks at its local revocation list. Flooding the network is also the
8307 most robust form of key revocation --- an adversary would have to control
8308 a separator of the overlay graph to restrict the propagation of the
8309 revocation message. Flooding is also very easy to implement --- peers that
8310 receive a revocation message for a key that they have never seen before
8311 simply pass the message to all of their neighbours.
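
The flooding rule can be tried out on a small graph. The Python sketch
below is a toy model, not the REVOCATION service: the overlay is given as
a dictionary of adjacency lists, all peers are assumed to be online, and
the function name is hypothetical. A peer forwards the message to all of
its neighbours the first time it sees it and ignores later copies.

@example
```python
# Toy flood over an overlay graph given as adjacency lists.
from collections import deque


def flood(graph, origin):
    """Return (peers reached, messages sent) for one revocation."""
    seen = set([origin])
    queue = deque([origin])
    messages = 0
    while queue:
        peer = queue.popleft()
        for neighbour in graph[peer]:
            messages += 1  # the message is passed to every neighbour
            if neighbour not in seen:
                seen.add(neighbour)  # first time: will forward in turn
                queue.append(neighbour)
    return seen, messages
```
@end example

In this model every peer sends one message per neighbour, so the total
number of messages grows with the number of edges in the overlay.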
8312
8313 Flooding can only distribute the revocation message to peers that are
8314 online.
8315 In order to notify peers that join the network later, the revocation
8316 service performs efficient set reconciliation over the sets of known
8317 revocation messages whenever two peers (that both support REVOCATION
8318 dissemination) connect.
8319 The SET service is used to perform this operation efficiently.
8320
8321 @node Revocation Message Design Requirements
8322 @subsection Revocation Message Design Requirements
8323
8324
8325
8326 However, flooding is also quite costly, creating O(|E|) messages on a
8327 network with |E| edges.
8328 Thus, revocation messages are required to contain a proof-of-work, the
8329 result of an expensive computation (which, however, is cheap to verify).
8330 Only peers that have expended the CPU time necessary to provide
8331 this proof will be able to flood the network with the revocation message.
8332 This ensures that an attacker cannot simply flood the network with
8333 millions of revocation messages. The proof-of-work required by GNUnet is
8334 set to take days on a typical PC to compute; if the ability to quickly
8335 revoke a key is needed, users have the option to pre-compute revocation
messages to store off-line and use instantly after their key has been
compromised.
8337
8338 Revocation messages must also be signed by the private key that is being
8339 revoked. Thus, they can only be created while the private key is in the
8340 possession of the respective user. This is another reason to create a
8341 revocation message ahead of time and store it in a secure location.
8342
8343 @node libgnunetrevocation
8344 @subsection libgnunetrevocation
8345
8346
8347
8348 The REVOCATION API consists of two parts, to query and to issue
8349 revocations.
8350
8351
8352 @menu
8353 * Querying for revoked keys::
8354 * Preparing revocations::
8355 * Issuing revocations::
8356 @end menu
8357
8358 @node Querying for revoked keys
8359 @subsubsection Querying for revoked keys
8360
8361
8362
8363 @code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
8364 key has been revoked.
8365 The given callback will be invoked with the result of the check.
8366 The query can be canceled using @code{GNUNET_REVOCATION_query_cancel} on
8367 the return value.
8368
8369 @node Preparing revocations
8370 @subsubsection Preparing revocations
8371
8372
8373
8374 It is often desirable to create a revocation record ahead-of-time and
8375 store it in an off-line location to be used later in an emergency.
8376 This is particularly true for GNUnet revocations, where performing the
8377 revocation operation itself is computationally expensive and thus is
8378 likely to take some time.
8379 Thus, if users want the ability to perform revocations quickly in an
8380 emergency, they must pre-compute the revocation message.
8381 The revocation API enables this with two functions that are used to
8382 compute the revocation message, but not trigger the actual revocation
8383 operation.
8384
8385 @code{GNUNET_REVOCATION_check_pow} should be used to calculate the
8386 proof-of-work required in the revocation message. This function takes the
8387 public key, the required number of bits for the proof of work (which in
8388 GNUnet is a network-wide constant) and finally a proof-of-work number as
8389 arguments.
8390 The function then checks if the given proof-of-work number is a valid
8391 proof of work for the given public key. Clients preparing a revocation
are expected to call this function repeatedly (typically with a
monotonically increasing sequence of proof-of-work numbers)
8394 until a given number satisfies the check.
8395 That number should then be saved for later use in the revocation
8396 operation.
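
The search loop described above can be sketched as follows. This is a
minimal Python sketch, not the library's implementation: the functions
@code{count_leading_zero_bits}, @code{check_pow} and @code{find_pow} are
hypothetical stand-ins, and SHA-512 over the number followed by the key
is an assumed hash construction chosen only for illustration. GNUnet's
actual proof-of-work function may differ.

@example
import hashlib


def count_leading_zero_bits(digest):
    # Count zero bits from the most significant end of the digest.
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte contributes between 0 and 7 zero bits.
        bits += 8 - byte.bit_length()
        break
    return bits


def check_pow(public_key, pow_number, matching_bits):
    # ASSUMPTION: SHA-512 over (number || key) stands in for the real
    # proof-of-work hash.
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()
    return count_leading_zero_bits(digest) >= matching_bits


def find_pow(public_key, matching_bits, start=0):
    # Try a monotonically increasing sequence of numbers until one
    # satisfies the check; that number is then saved for later use.
    pow_number = start
    while not check_pow(public_key, pow_number, matching_bits):
        pow_number += 1
    return pow_number
@end example

@noindent
Each additional required bit doubles the expected number of attempts,
which is why the result is worth computing ahead of time and storing.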
8397
8398 @code{GNUNET_REVOCATION_sign_revocation} is used to generate the
8399 signature that is required in a revocation message.
8400 It takes the private key that (possibly in the future) is to be revoked
8401 and returns the signature.
8402 The signature can again be saved to disk for later use, which will then
8403 allow performing a revocation even without access to the private key.
8404
8405 @node Issuing revocations
8406 @subsubsection Issuing revocations
8407
8408
Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
8411 @code{GNUNET_REVOCATION_revoke} can be used to perform the
8412 actual revocation. The given callback is called upon completion of the
8413 operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
8414 library from calling the continuation; however, in that case it is
8415 undefined whether or not the revocation operation will be executed.
8416
8417 @node The REVOCATION Client-Service Protocol
8418 @subsection The REVOCATION Client-Service Protocol
8419
8420
8421 The REVOCATION protocol consists of four simple messages.
8422
8423 A @code{QueryMessage} containing a public ECDSA key is used to check if a
8424 particular key has been revoked. The service responds with a
8425 @code{QueryResponseMessage} which simply contains a bit that says if the
8426 given public key is still valid, or if it has been revoked.
8427
8428 The second possible interaction is for a client to revoke a key by
8429 passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
8430 contains the ECDSA public key to be revoked, a signature by the
corresponding private key and the proof-of-work. The service responds
with a @code{RevocationResponseMessage} which can be used to indicate
that the @code{RevokeMessage} was invalid (e.g. proof of work incorrect),
8434 or otherwise indicates that the revocation has been processed
8435 successfully.
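
The two interactions can be modelled in a few lines. The following
Python sketch is only an in-memory illustration under stated
assumptions: the class @code{RevocationService} and its method names are
hypothetical, and validation of the signature and proof-of-work is
delegated to a caller-supplied function rather than implemented here.

@example
class RevocationService:
    # Hypothetical in-memory model of the four-message protocol.

    def __init__(self, check_message):
        # check_message(public_key, signature, proof_of_work) -> bool
        # ASSUMPTION: the caller supplies the validation logic.
        self.check_message = check_message
        self.revoked = set()

    def handle_query(self, public_key):
        # Models QueryMessage -> QueryResponseMessage.
        # True means the key is still valid, False means revoked.
        return public_key not in self.revoked

    def handle_revoke(self, public_key, signature, proof_of_work):
        # Models RevokeMessage -> RevocationResponseMessage.
        # False means the message was invalid, True means processed.
        if not self.check_message(public_key, signature, proof_of_work):
            return False
        self.revoked.add(public_key)
        return True
@end example

@noindent
A rejected @code{RevokeMessage} leaves the key valid, so a later query
for that key still reports it as not revoked.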
8436
8437 @node The REVOCATION Peer-to-Peer Protocol
8438 @subsection The REVOCATION Peer-to-Peer Protocol
8439
8440
8441
8442 Revocation uses two disjoint ways to spread revocation information among
8443 peers.
8444 First of all, P2P gossip exchanged via CORE-level neighbours is used to
8445 quickly spread revocations to all connected peers.
8446 Second, whenever two peers (that both support revocations) connect,
8447 the SET service is used to compute the union of the respective revocation
8448 sets.
8449
8450 In both cases, the exchanged messages are @code{RevokeMessage}s which
8451 contain the public key that is being revoked, a matching ECDSA signature,
8452 and a proof-of-work.
8453 Whenever a peer learns about a new revocation this way, it first
8454 validates the signature and the proof-of-work, then stores it to disk
(typically to a file @file{$GNUNET_DATA_HOME/revocation.dat}) and finally
8456 spreads the information to all directly connected neighbours.
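
The validate, store and forward steps can be sketched as below. This is
a simplified Python model, not the service code: @code{RevocationPeer}
is a hypothetical class, a message is represented as a
(public key, signature, proof-of-work) tuple, the validation function is
supplied by the caller, and a dictionary stands in for the file on disk.

@example
class RevocationPeer:
    # Hypothetical model of one peer taking part in the gossip.

    def __init__(self, verify):
        # verify(message) -> bool checks signature and proof-of-work.
        # ASSUMPTION: supplied by the caller, not implemented here.
        self.verify = verify
        self.revoked = dict()   # stands in for revocation.dat
        self.neighbours = []

    def receive(self, message):
        public_key = message[0]
        if public_key in self.revoked:
            # Not a new revocation: nothing to store or forward.
            return False
        if not self.verify(message):
            return False
        self.revoked[public_key] = message
        # Spread the revocation to all directly connected neighbours.
        for peer in self.neighbours:
            peer.receive(message)
        return True
@end example

@noindent
Because a peer only forwards revocations it has not seen before, the
flood terminates even when the neighbour graph contains cycles.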
8457
8458 For computing the union using the SET service, the peer with the smaller
8459 hashed peer identity will connect (as a "client" in the two-party set
8460 protocol) to the other peer after one second (to reduce traffic spikes
8461 on connect) and initiate the computation of the set union.
8462 All revocation services use a common hash to identify the SET operation
8463 over revocation sets.
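
The rule for choosing the initiating peer can be sketched as follows.
This Python sketch makes several assumptions that are not taken from the
implementation: SHA-512 is used as the hash of the peer identity,
hashes are compared bytewise, and all function names are hypothetical.

@example
import hashlib

# Delay before initiating, to reduce traffic spikes on connect.
CONNECT_DELAY_SECONDS = 1


def hashed_identity(peer_identity):
    # ASSUMPTION: SHA-512 stands in for the hash of the peer identity.
    return hashlib.sha512(peer_identity).digest()


def is_initiator(my_identity, other_identity):
    # The peer with the smaller hashed identity acts as the "client".
    mine = hashed_identity(my_identity)
    theirs = hashed_identity(other_identity)
    return mine < theirs


def union_of_revocations(mine, theirs):
    # Result both peers should hold after the set union operation.
    return set(mine) | set(theirs)
@end example

@noindent
For two distinct identities exactly one side evaluates
@code{is_initiator} to true, so a well-behaved pair starts only one
union operation per connection.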
8464
8465 The current implementation accepts revocation set union operations from
8466 all peers at any time; however, well-behaved peers should only initiate
8467 this operation once after establishing a connection to a peer with a
8468 larger hashed peer identity.
8469
8470 @cindex FS
8471 @cindex FS Subsystem
8472 @node File-sharing (FS) Subsystem
8473 @section File-sharing (FS) Subsystem
8474
8475
8476
8477 This chapter describes the details of how the file-sharing service works.
8478 As with all services, it is split into an API (libgnunetfs), the service
8479 process (gnunet-service-fs) and user interface(s).
8480 The file-sharing service uses the datastore service to store blocks and
8481 the DHT (and indirectly datacache) for lookups for non-anonymous
8482 file-sharing.
8483 Furthermore, the file-sharing service uses the block library (and the
8484 block fs plugin) for validation of DHT operations.
8485
8486 In contrast to many other services, libgnunetfs is rather complex since
8487 the client library includes a large number of high-level abstractions;
this is necessary since the FS service itself largely only operates on
8489 the block level.
8490 The FS library is responsible for providing a file-based abstraction to
8491 applications, including directories, meta data, keyword search,
8492 verification, and so on.
8493
8494 The method used by GNUnet to break large files into blocks and to use
8495 keyword search is called the
8496 "Encoding for Censorship Resistant Sharing" (ECRS).
ECRS is largely implemented in the FS library; block validation is also
8498 reflected in the block FS plugin and the FS service.
8499 ECRS on-demand encoding is implemented in the FS service.
8500
8501 NOTE: The documentation in this chapter is quite incomplete.
8502
8503 @menu
8504 * Encoding for Censorship-Resistant Sharing (ECRS)::
8505 * File-sharing persistence directory structure::
8506 @end menu
8507
8508 @cindex ECRS
8509 @cindex Encoding for Censorship-Resistant Sharing
8510 @node Encoding for Censorship-Resistant Sharing (ECRS)
8511 @subsection Encoding for Censorship-Resistant Sharing (ECRS)
8512
8513
8514
8515 When GNUnet shares files, it uses a content encoding that is called ECRS,
8516 the Encoding for Censorship-Resistant Sharing.
8517 Most of ECRS is described in the (so far unpublished) research paper
8518 attached to this page. ECRS obsoletes the previous ESED and ESED II
8519 encodings which were used in GNUnet before version 0.7.0.
8520 The rest of this page assumes that the reader is familiar with the
8521 attached paper. What follows is a description of some minor extensions
8522 that GNUnet makes over what is described in the paper.
8523 The reason why these extensions are not in the paper is that we felt
8524 that they were obvious or trivial extensions to the original scheme and
8525 thus did not warrant space in the research report.
8526
8527 @menu
8528 * Namespace Advertisements::
8529 * KSBlocks::
8530 @end menu
8531
8532 @node Namespace Advertisements
8533 @subsubsection Namespace Advertisements
8534
8535
8536 @c %**FIXME: all zeroses -> ?
8537
8538 An @code{SBlock} with identifier all zeros is a signed
8539 advertisement for a namespace. This special @code{SBlock} contains
8540 metadata describing the content of the namespace.
8541 Instead of the name of the identifier for a potential update, it contains
8542 the identifier for the root of the namespace.
8543 The URI should always be empty. The @code{SBlock} is signed with the
8544 content provider's RSA private key (just like any other SBlock). Peers
8545 can search for @code{SBlock}s in order to find out more about a namespace.
8546
8547 @node KSBlocks
8548 @subsubsection KSBlocks
8549
8550
8551
8552 GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead
8553 of encrypting a CHK and metadata, encrypt an @code{SBlock} instead.
8554 In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
8555 using the global keyword search.
8556 Usually the encrypted @code{SBlock} is a namespace advertisement.
8557 The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
peers to discover namespaces via keyword searches, and to associate
8559 useful information with namespaces. When GNUnet finds @code{KSBlocks}
8560 during a normal keyword search, it adds the information to an internal
8561 list of discovered namespaces. Users looking for interesting namespaces
8562 can then inspect this list, reducing the need for out-of-band discovery
8563 of namespaces.
8564 Naturally, namespaces (or more specifically, namespace advertisements) can
8565 also be referenced from directories, but @code{KSBlock}s should make it
8566 easier to advertise namespaces for the owner of the pseudonym since they
8567 eliminate the need to first create a directory.
8568
8569 Collections are also advertised using @code{KSBlock}s.
8570
8571 @c https://old.gnunet.org/sites/default/files/ecrs.pdf
8572
8573 @node File-sharing persistence directory structure
8574 @subsection File-sharing persistence directory structure
8575
8576
8577
8578 This section documents how the file-sharing library implements
8579 persistence of file-sharing operations and specifically the resulting
8580 directory structure.
8581 This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
8582 was set when calling @code{GNUNET_FS_start}.
8583 In this case, the file-sharing library will try hard to ensure that all
8584 major operations (searching, downloading, publishing, unindexing) are
8585 persistent, that is, can live longer than the process itself.
8586 More specifically, an operation is supposed to live until it is
8587 explicitly stopped.
8588
8589 If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
8590 @code{SUSPEND} event is generated and then when the process calls
8591 @code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
8592 Additionally, even if an application crashes (segfault, SIGKILL, system
8593 crash) and hence @code{GNUNET_FS_stop} is never called and no
8594 @code{SUSPEND} events are generated, operations are still resumed (with
8595 @code{RESUME} events).
8596 This is implemented by constantly writing the current state of the
8597 file-sharing operations to disk.
Specifically, the current state is always written to disk whenever
anything significant changes (the exception is block-wise progress in
publishing and unindexing, since those operations would be slowed down
significantly and can be resumed cheaply even without detailed
accounting).
8603 Note that if the process crashes (or is killed) during a serialization
8604 operation, FS does not guarantee that this specific operation is
8605 recoverable (no strict transactional semantics, again for performance
8606 reasons). However, all other unrelated operations should resume nicely.
8607
8608 Since we need to serialize the state continuously and want to recover as
8609 much as possible even after crashing during a serialization operation,
8610 we do not use one large file for serialization.
8611 Instead, several directories are used for the various operations.
8612 When @code{GNUNET_FS_start} executes, the master directories are scanned
8613 for files describing operations to resume.
8614 Sometimes, these operations can refer to related operations in child
8615 directories which may also be resumed at this point.
8616 Note that corrupted files are cleaned up automatically.
8617 However, dangling files in child directories (those that are not
8618 referenced by files from the master directories) are not automatically
8619 removed.
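
The idea of one file per operation, scanned on startup, can be
illustrated with the following Python sketch. It is not the FS library's
serialization code: the function names are hypothetical and JSON is an
assumed on-disk format chosen only to keep the example short.

@example
import json
import os


def save_operation(directory, name, state):
    # One file per operation, so a crash while writing one file
    # cannot damage the state of unrelated operations.
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, name), "w") as handle:
        json.dump(state, handle)


def resume_operations(directory):
    # Scan a master directory for operations to resume.
    resumed = dict()
    if not os.path.isdir(directory):
        return resumed
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as handle:
                resumed[name] = json.load(handle)
        except ValueError:
            # Corrupted file: clean it up, keep the other operations.
            os.remove(path)
    return resumed
@end example

@noindent
A file that was only partially written before a crash fails to parse
and is removed, while every other operation in the directory resumes.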
8620
8621 Persistence data is kept in a directory that begins with the "STATE_DIR"
8622 prefix from the configuration file
8623 (by default, "$SERVICEHOME/persistence/") followed by the name of the
8624 client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk")
8625 followed by the actual name of the master or child directory.
8626
8627 The names for the master directories follow the names of the operations:
8628
8629 @itemize @bullet
8630 @item "search"
8631 @item "download"
8632 @item "publish"
8633 @item "unindex"
8634 @end itemize
8635
8636 Each of the master directories contains names (chosen at random) for each
8637 active top-level (master) operation.
8638 Note that a download that is associated with a search result is not a
8639 top-level operation.
8640
8641 In contrast to the master directories, the child directories are only
8642 consulted when another operation refers to them.
8643 For each search, a subdirectory (named after the master search
8644 synchronization file) contains the search results.
8645 Search results can have an associated download, which is then stored in
8646 the general "download-child" directory.
8647 Downloads can be recursive, in which case children are stored in
8648 subdirectories mirroring the structure of the recursive download
8649 (either starting in the master "download" directory or in the
8650 "download-child" directory depending on how the download was initiated).
8651 For publishing operations, the "publish-file" directory contains
8652 information about the individual files and directories that are part of
8653 the publication.
8654 However, this directory structure is flat and does not mirror the
8655 structure of the publishing operation.
8656 Note that unindex operations cannot have associated child operations.
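
The resulting paths can be sketched as follows. This Python sketch only
restates the layout described above; the function names are
hypothetical, and the way the random file name is generated (and its
length) is an assumption made for illustration.

@example
import os
import secrets

MASTER_DIRECTORIES = ("search", "download", "publish", "unindex")
CHILD_DIRECTORIES = ("download-child", "publish-file")


def persistence_directory(state_dir, client_name, directory):
    # STATE_DIR prefix, then the client name, then the directory name.
    if directory not in MASTER_DIRECTORIES + CHILD_DIRECTORIES:
        raise ValueError("unknown directory: %s" % directory)
    return os.path.join(state_dir, client_name, directory)


def new_master_file(state_dir, client_name, operation):
    # Top-level operations get a randomly chosen file name.
    # ASSUMPTION: 16 hexadecimal characters; the real scheme may differ.
    if operation not in MASTER_DIRECTORIES:
        raise ValueError("not a top-level operation: %s" % operation)
    directory = persistence_directory(state_dir, client_name, operation)
    return os.path.join(directory, secrets.token_hex(8))
@end example

@noindent
For a client named ``gnunet-gtk'', a new top-level search would thus be
stored in a randomly named file below the ``search'' directory of that
client.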
8657
8658 @cindex REGEX subsystem
8659 @node REGEX Subsystem
8660 @section REGEX Subsystem
8661
8662
8663
8664 Using the REGEX subsystem, you can discover peers that offer a particular
8665 service using regular expressions.
The peers that offer a service specify it using a regular expression.
8667 Peers that want to patronize a service search using a string.
8668 The REGEX subsystem will then use the DHT to return a set of matching
8669 offerers to the patrons.
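
The matching semantics can be illustrated without a DHT. In the
following Python sketch the announcements are simply held in a list, the
function @code{matching_offerers} is hypothetical, and Python's
@code{re} syntax is used as a stand-in; the regular expression syntax
accepted by the REGEX subsystem may differ.

@example
import re


def matching_offerers(announcements, search_string):
    # announcements: list of (peer_id, regex) pairs announced by
    # offerers. A patron searches with a plain string; an offerer
    # matches if its regex accepts the whole string.
    return sorted(
        peer_id
        for peer_id, regex in announcements
        if re.fullmatch(regex, search_string)
    )
@end example

@noindent
For example, with announcements @code{("peer1", "ab*c")} and
@code{("peer2", "xyz")}, a search for the string @code{abbc} returns
only @code{peer1}.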
8670
For the technical details, see Max's defense talk and Max's Master's
thesis.
8673
8674 @c An additional publication is under preparation and available to
8675 @c team members (in Git).
8676 @c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
8677
8678 @menu
8679 * How to run the regex profiler::
8680 @end menu
8681
8682 @node How to run the regex profiler
8683 @subsection How to run the regex profiler
8684
8685
8686
8687 The gnunet-regex-profiler can be used to profile the usage of mesh/regex
8688 for a given set of regular expressions and strings.
8689 Mesh/regex allows you to announce your peer ID under a certain regex and
8690 search for peers matching a particular regex using a string.
8691 See @uref{https://bib.gnunet.org/full/date.html#2012_5f2, szengel2012ms} for a full
8692 introduction.
8693
8694 First of all, the regex profiler uses GNUnet testbed, thus all the
8695 implications for testbed also apply to the regex profiler
8696 (for example you need password-less ssh login to the machines listed in
8697 your hosts file).
8698
8699 @strong{Configuration}
8700
8701 Moreover, an appropriate configuration file is needed.
8702 Generally you can refer to the
@file{contrib/regex_profiler_infiniband.conf} file in the source code
8704 of GNUnet for an example configuration.
8705 In the following paragraph the important details are highlighted.
8706
8707 Announcing of the regular expressions is done by the
8708 gnunet-daemon-regexprofiler, therefore you have to make sure it is
8709 started, by adding it to the START_ON_DEMAND set of ARM:
8710
8711 @example
8712 [regexprofiler]
8713 START_ON_DEMAND = YES
8714 @end example
8715
8716 @noindent
8717 Furthermore you have to specify the location of the binary:
8718
8719 @example
8720 [regexprofiler]
8721 # Location of the gnunet-daemon-regexprofiler binary.
8722 BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
8723 # Regex prefix that will be applied to all regular expressions and
8724 # search string.
8725 REGEX_PREFIX = "GNVPN-0001-PAD"
8726 @end example
8727
8728 @noindent
8729 When running the profiler with a large scale deployment, you probably
8730 want to reduce the workload of each peer.
8731 Use the following options to do this.
8732
8733 @example
8734 [dht]
8735 # Force network size estimation
8736 FORCE_NSE = 1
8737
8738 [dhtcache]
8739 DATABASE = heap
8740 # Disable RC-file for Bloom filter? (for benchmarking with limited IO
8741 # availability)
8742 DISABLE_BF_RC = YES
8743 # Disable Bloom filter entirely
8744 DISABLE_BF = YES
8745
8746 [nse]
8747 # Minimize proof-of-work CPU consumption by NSE
8748 WORKBITS = 1
8749 @end example
8750
8751 @noindent
8752 @strong{Options}
8753
Finally, to run the profiler, some options and the input data need to be
specified on the command line.
8756
8757 @example
8758 gnunet-regex-profiler -c config-file -d log-file -n num-links \
8759 -p path-compression-length -s search-delay -t matching-timeout \
8760 -a num-search-strings hosts-file policy-dir search-strings-file
8761 @end example
8762
8763 @noindent
8764 Where...
8765
8766 @itemize @bullet
8767 @item ... @code{config-file} means the configuration file created earlier.
8768 @item ... @code{log-file} is the file where to write statistics output.
8769 @item ... @code{num-links} indicates the number of random links between
8770 started peers.
8771 @item ... @code{path-compression-length} is the maximum path compression
8772 length in the DFA.
@item ... @code{search-delay} is the time to wait between the peers
finishing linking and the start of string matching.
@item ... @code{matching-timeout} is the timeout after which the search
is canceled.
@item ... @code{num-search-strings} is the number of strings in the
search-strings-file.
8779 @item ... the @code{hosts-file} should contain a list of hosts for the
8780 testbed, one per line in the following format:
8781
8782 @itemize @bullet
8783 @item @code{user@@host_ip:port}
8784 @end itemize
8785 @item ... the @code{policy-dir} is a folder containing text files
8786 containing one or more regular expressions. A peer is started for each
8787 file in that folder and the regular expressions in the corresponding file
8788 are announced by this peer.
8789 @item ... the @code{search-strings-file} is a text file containing search
strings, one per line.
8791 @end itemize
8792
8793 @noindent
8794 You can create regular expressions and search strings for every AS in the
8795 Internet using the attached scripts. You need one of the
8796 @uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
8797 data files for this. Run
8798
8799 @example
8800 create_regex.py <filename> <output path>
8801 @end example
8802
8803 @noindent
8804 to create the regular expressions and
8805
8806 @example
8807 create_strings.py <input path> <outfile>
8808 @end example
8809
8810 @noindent
8811 to create a search strings file from the previously created
8812 regular expressions.
8813
8814 @cindex REST subsystem
8815 @node REST Subsystem
8816 @section REST Subsystem
8817
8818
8819
8820 Using the REST subsystem, you can expose REST-based APIs or services.
8821 The REST service is designed as a pluggable architecture.
8822 To create a new REST endpoint, simply add a library in the form
8823 ``plugin_rest_*''.
8824 The REST service will automatically load all REST plugins on startup.
8825
8826 @strong{Configuration}
8827
8828 The REST service can be configured in various ways.
8829 The reference config file can be found in
8830 @file{src/rest/rest.conf}:
8831 @example
8832 [rest]
8833 REST_PORT=7776
8834 REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
8835 REST_ALLOW_ORIGIN=*
8836 REST_ALLOW_CREDENTIALS=true
8837 @end example
8838
The port as well as the
@dfn{cross-origin resource sharing} (CORS)
headers that are supposed to be advertised by the REST service are
configurable.
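
How these options relate to the advertised headers can be sketched as
follows. This Python sketch is illustrative only: the function names are
hypothetical, and the mapping of each option to the standard
@code{Access-Control-Allow-*} header of the same meaning is an
assumption about the service, not taken from its source code.

@example
import configparser

REFERENCE_CONFIG = """
[rest]
REST_PORT=7776
REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
REST_ALLOW_ORIGIN=*
REST_ALLOW_CREDENTIALS=true
"""


def load_rest_section(config_text):
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser["rest"]


def rest_port(config_text):
    return int(load_rest_section(config_text)["REST_PORT"])


def cors_headers(config_text):
    # ASSUMPTION: each option maps to the standard CORS response
    # header of the same meaning.
    rest = load_rest_section(config_text)
    return [
        ("Access-Control-Allow-Origin", rest["REST_ALLOW_ORIGIN"]),
        ("Access-Control-Allow-Headers", rest["REST_ALLOW_HEADERS"]),
        ("Access-Control-Allow-Credentials",
         rest["REST_ALLOW_CREDENTIALS"]),
    ]
@end example

@noindent
Note that browsers do not accept a wildcard origin together with
credentials, so a deployment that needs credentials would typically set
@code{REST_ALLOW_ORIGIN} to a specific origin.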
8844
8845 @menu
8846 * Namespace considerations::
8847 * Endpoint documentation::
8848 @end menu
8849
8850 @node Namespace considerations
8851 @subsection Namespace considerations
8852
8853 The @command{gnunet-rest-service} will load all plugins that are installed.
8854 As such it is important that the endpoint namespaces do not clash.
8855
8856 For example, plugin X might expose the endpoint ``/xxx'' while plugin Y
8857 exposes endpoint ``/xxx/yyy''.
8858 This is a problem if plugin X is also supposed to handle a call
8859 to ``/xxx/yyy''.
8860 Currently the REST service will not complain or warn about such clashes,
8861 so please make sure that endpoints are unambiguous.
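
Since the service does not detect such clashes itself, a plugin author
may want to check for them separately. The following Python sketch shows
one possible check; it is not part of the REST service, and the function
names as well as the (plugin, endpoint) pair representation are
hypothetical.

@example
def is_path_prefix(shorter, longer):
    # Compare whole path segments, so "/xxx" is a prefix of
    # "/xxx/yyy" but not of "/xxxy".
    short_parts = shorter.strip("/").split("/")
    long_parts = longer.strip("/").split("/")
    return long_parts[:len(short_parts)] == short_parts


def find_clashes(registrations):
    # registrations: list of (plugin_name, endpoint) pairs.
    clashes = []
    for index, (plugin_a, path_a) in enumerate(registrations):
        for plugin_b, path_b in registrations[index + 1:]:
            if plugin_a == plugin_b:
                continue
            if (is_path_prefix(path_a, path_b)
                    or is_path_prefix(path_b, path_a)):
                clashes.append((plugin_a, path_a, plugin_b, path_b))
    return clashes
@end example

@noindent
Applied to the example above, the pair ``/xxx'' (plugin X) and
``/xxx/yyy'' (plugin Y) is reported as a clash.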
8862
8863 @node Endpoint documentation
8864 @subsection Endpoint documentation
8865
This is a work in progress. Endpoints should be documented appropriately,
preferably using annotations.
8868
8869
8870 @cindex RPS Subsystem
8871 @node RPS Subsystem
8872 @section RPS Subsystem
8873
8874 In literature, Random Peer Sampling (RPS) refers to the problem of
8875 reliably@footnote{"Reliable" in this context means having no bias,
8876 neither spatial, nor temporal, nor through malicious activity.} drawing
8877 random samples from an unstructured p2p network.
8878
8879 Doing so in a reliable manner is not only hard because of inherent
8880 problems but also because of possible malicious peers that could try to
8881 bias the selection.
8882
It is useful for all kinds of gossip protocols that require the selection
of random peers from the whole network, for example for gathering
statistics, spreading and aggregating information in the network, load
balancing and overlay topology management.
8887
8888 The approach chosen in the RPS service implementation in GNUnet follows
the @uref{https://bib.gnunet.org/full/date.html#2009_5f0, Brahms}
8890 design.
8891
8892 The current state is "work in progress". There are a lot of things that
8893 need to be done, primarily finishing the experimental evaluation and a
8894 re-design of the API.
8895
The abstract idea is to connect to (or start) the RPS service and to
request random peers, which are returned once they represent a random
selection from the whole network with high probability.
8899
8900 An additional feature to the original Brahms-design is the selection of
8901 sub-groups: The GNUnet implementation of RPS enables clients to ask for
8902 random peers from a group that is defined by a common shared secret.
8903 (The secret could of course also be public, depending on the use-case.)
8904
8905 Another addition to the original protocol was made: The sampler
mechanism that was introduced in Brahms was slightly adapted and is used
to actually sample the peers that are returned to the client.
This is necessary as the original design only keeps peers connected to
random other peers in the network. For the peers returned to client
requests to be independently random, they cannot be drawn from the
connected peers.
The adapted sampler makes sure that each request for random peers is
independent of the others.
8914
8915 @menu
8916 * Brahms::
8917 @end menu
8918
8919 @node Brahms
8920 @subsection Brahms
The high-level concept of Brahms is two-fold: combining push-pull gossip
with locally fixing an assumed bias using cryptographic min-wise
permutations.
The central data structure is the view, a peer's current local sample.
This view is used to select peers to push to and pull from.
This simple mechanism can be biased easily. For this reason Brahms
'fixes' the bias by using the so-called sampler, a data structure that
takes a list of elements as input and outputs a random one of them
independently of the frequency in the input set. Both an element that
was put into the sampler a single time and an element that was put into
it a million times have the same probability of being the output.
This is achieved by exploiting min-wise independent
permutations.
In the RPS service we use HMACs: On the initialisation of a sampler
element, a key is chosen at random. On each input the HMAC with the
random key is computed. The sampler element keeps the element with the
minimal HMAC.
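
A single sampler element as described above can be sketched in a few
lines. This is a minimal Python sketch, not the RPS service code: the
class @code{SamplerElement} is hypothetical, and HMAC-SHA-512 with a
32-byte random key is an assumed parameter choice.

@example
import hashlib
import hmac
import os


class SamplerElement:
    # Hypothetical sketch of one Brahms-style sampler element.

    def __init__(self, key=None):
        # The key is chosen at random on initialisation.
        # ASSUMPTION: 32 random bytes; passing a key is for testing.
        self.key = os.urandom(32) if key is None else key
        self.peer_id = None
        self.min_digest = None

    def update(self, peer_id):
        # Compute the HMAC of the input under the random key and
        # keep the element with the minimal HMAC seen so far.
        digest = hmac.new(self.key, peer_id, hashlib.sha512).digest()
        if self.min_digest is None or digest < self.min_digest:
            self.min_digest = digest
            self.peer_id = peer_id
@end example

@noindent
Feeding the same peer ID repeatedly never changes the outcome, because
its HMAC is identical each time; only the set of distinct inputs and the
random key determine which element is kept.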
8938
8939 In order to fix the bias in the view, a fraction of the elements in the
8940 view are sampled through the sampler from the random stream of peer IDs.
8941
According to the theoretical analysis of Bortnikov et al., this suffices
to keep the network connected and to keep random peers in the view.
8944