doc/handbook/chapters/developer.texi

   1 @c ***********************************************************************
   2 @node GNUnet Developer Handbook
   3 @chapter GNUnet Developer Handbook
   4
   5 This book is intended to be an introduction for programmers that want to
   6 extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
   7 application.
   8
   9 For developers, GNUnet is:
  10
  11 @itemize @bullet
  12 @item developed by a community that believes in the GNU philosophy
  13 @item Free Software (Free as in Freedom), licensed under the
  14 GNU Affero General Public License
  15 (@uref{https://www.gnu.org/licenses/licenses.html#AGPL})
  16 @item A set of standards, including coding conventions and
  17 architectural rules
  18 @item A set of layered protocols, both specifying the communication
  19 between peers as well as the communication between components
  20 of a single peer
  21 @item A set of libraries with well-defined APIs suitable for
  22 writing extensions
  23 @end itemize
  24
  25 In particular, the architecture specifies that a peer consists of many
  26 processes communicating via protocols. Processes can be written in almost
  27 any language.
  28 @code{C}, @code{Java} and @code{Guile} APIs exist for accessing existing
  29 services and for writing extensions.
  30 It is possible to write extensions in other languages by
  31 implementing the necessary IPC protocols.
  32
  33 GNUnet can be extended and improved along many possible dimensions, and
  34 anyone interested in Free Software and Freedom-enhancing Networking is
  35 welcome to join the effort. This Developer Handbook attempts to provide
  36 an initial introduction to some of the key design choices and central
  37 components of the system.
  38 This part of the GNUnet documentation is far from complete,
  39 and we welcome informed contributions, be it in the form of
  40 new chapters, sections or insightful comments.
  41
  42 @menu
  43 * Developer Introduction::
  44 * Internal dependencies::
  45 * Code overview::
  46 * System Architecture::
  47 * Subsystem stability::
  48 * Naming conventions and coding style guide::
  49 * Build-system::
  50 * Developing extensions for GNUnet using the gnunet-ext template::
  51 * Writing testcases::
  52 * Building GNUnet and its dependencies::
  53 * TESTING library::
  54 * Performance regression analysis with Gauger::
  55 * TESTBED Subsystem::
  56 * libgnunetutil::
  57 * Automatic Restart Manager (ARM)::
  58 * TRANSPORT Subsystem::
  59 * NAT library::
  60 * Distance-Vector plugin::
  61 * SMTP plugin::
  62 * Bluetooth plugin::
  63 * WLAN plugin::
  64 * ATS Subsystem::
  65 * CORE Subsystem::
  66 * CADET Subsystem::
  67 * NSE Subsystem::
  68 * HOSTLIST Subsystem::
  69 * IDENTITY Subsystem::
  70 * NAMESTORE Subsystem::
  71 * PEERINFO Subsystem::
  72 * PEERSTORE Subsystem::
  73 * SET Subsystem::
  74 * STATISTICS Subsystem::
  75 * Distributed Hash Table (DHT)::
  76 * GNU Name System (GNS)::
  77 * GNS Namecache::
  78 * REVOCATION Subsystem::
  79 * File-sharing (FS) Subsystem::
  80 * REGEX Subsystem::
  81 * REST Subsystem::
  82 @end menu
  83
  84 @node Developer Introduction
  85 @section Developer Introduction
  86
  87 This Developer Handbook is intended as a first introduction to GNUnet for
  88 new developers that want to extend the GNUnet framework. After the
  89 introduction, each of the GNUnet subsystems (directories in the
  90 @file{src/} tree) is (supposed to be) covered in its own chapter. In
  91 addition to this documentation, GNUnet developers should be aware of the
  92 services available on the GNUnet server to them.
  93
  94 New developers can have a look at the GNUnet tutorials for C and Java
  95 available in the @file{src/} directory of the repository or under the
  96 following links:
  97
  98 @c ** FIXME: Link to files in source, not online.
  99 @c ** FIXME: Where is the Java tutorial?
 100 @itemize @bullet
 101 @item @xref{Top, Introduction,, gnunet-c-tutorial, The GNUnet C Tutorial}.
 102 @c broken link
 103 @c @item @uref{https://git.gnunet.org/gnunet.git/plain/doc/gnunet-c-tutorial.pdf, GNUnet C tutorial}
 104 @item GNUnet Java tutorial
 105 @end itemize
 106
 107 In addition to the GNUnet Reference Documentation you are reading,
 108 the GNUnet server at @uref{https://gnunet.org} contains
 109 various resources for GNUnet developers and those
 110 who aspire to become regular contributors.
 111 They are all conveniently reachable via the "Developer"
 112 entry in the navigation menu. Some additional tools (such as static
 113 analysis reports) require special developer access to perform certain
 114 operations. If you want (or require) access, you should contact
 115 @uref{http://grothoff.org/christian/, Christian Grothoff},
 116 GNUnet's maintainer.
 117
 118 @c FIXME: A good part of this belongs on the website or should be
 119 @c extended in subsections explaining usage of this. A simple list
 120 @c is just taking space people have to read.
 121 The public subsystems on the GNUnet server that help developers are:
 122
 123 @itemize @bullet
 124
 125 @item The version control system (git) keeps our code and enables
 126 distributed development.
 127 It is publicly accessible at @uref{https://git.gnunet.org/}.
 128 Only developers with write access can commit code; everyone else is
 129 encouraged to submit patches to the GNUnet-developers mailing list:
 130 @uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, https://lists.gnu.org/mailman/listinfo/gnunet-developers}
 131
 132 @item The bugtracking system (Mantis).
 133 We use it to track feature requests, open bug reports and their
 134 resolutions.
 135 It can be accessed at
 136 @uref{https://bugs.gnunet.org/, https://bugs.gnunet.org/}.
 137 Anyone can report bugs.
 138
 139 @item Our site installation of the
 140 Continuous Integration (CI) system @code{Buildbot} is used
 141 to check GNUnet builds automatically on a range of platforms.
 142 The web interface of this CI is exposed at
 143 @uref{https://old.gnunet.org/buildbot/, https://old.gnunet.org/buildbot/}.
 144 Builds are triggered automatically 30 minutes after the last commit to
 145 our repository was made.
 146
 147 @item The current quality of our automated test suite is assessed using
 148 code coverage analysis. This analysis is run daily; however, the webpage
 149 is only updated if all automated tests pass at that time. Testcases that
 150 improve our code coverage are always welcome.
 151
 152 @item We try to automatically find bugs using a static analysis scan.
 153 This scan is run daily; however, the webpage is only updated if all
 154 automated tests pass at the time. Note that not everything that is
 155 flagged by the analysis is a bug; sometimes even good code can be marked
 156 as possibly problematic. Nevertheless, developers are encouraged to at
 157 least be aware of all issues in their code that are listed.
 158
 159 @item We use Gauger for automatic performance regression visualization.
 160 @c FIXME: LINK!
 161 Details on how to use Gauger are here.
 162
 163 @item We use @uref{http://junit.org/, junit} to automatically test
 164 @command{gnunet-java}.
 165 Automatically generated, current reports on the test suite are here.
 166 @c FIXME: Likewise.
 167
 168 @item We use Cobertura to generate test coverage reports for gnunet-java.
 169 Current reports on test coverage are here.
 170 @c FIXME: Likewise.
 171
 172 @end itemize
 173
 174
 175
 176 @c ***********************************************************************
 177 @menu
 178 * Project overview::
 179 @end menu
 180
 181 @node Project overview
 182 @subsection Project overview
 183
 184 The GNUnet project currently consists of several sub-projects. This
 185 section gives an initial overview of the various
 186 sub-projects. Note that this description also lists projects that are far
 187 from complete, including even those that do not yet contain a single line
 188 of code.
 189
 190 GNUnet sub-projects in order of likely relevance are currently:
 191
 192 @table @asis
 193
 194 @item @command{gnunet}
 195 Core of the P2P framework, including file-sharing, VPN and
 196 chat applications; this is what the Developer Handbook covers mostly
 197 @item @command{gnunet-gtk}
 198 Gtk+-based user interfaces, including:
 199
 200 @itemize @bullet
 201 @item @command{gnunet-fs-gtk} (file-sharing),
 202 @item @command{gnunet-statistics-gtk} (statistics over time),
 203 @item @command{gnunet-peerinfo-gtk}
 204 (information about current connections and known peers),
 205 @item @command{gnunet-namestore-gtk} (GNS record editor),
 206 @item @command{gnunet-conversation-gtk} (voice chat GUI) and
 207 @item @command{gnunet-setup} (setup tool for "everything")
 208 @end itemize
 209
 210 @item @command{gnunet-fuse}
 211 Mounting directories shared via GNUnet's file-sharing
 212 on GNU/Linux distributions
 213 @item @command{gnunet-update}
 214 Installation and update tool
 215 @item @command{gnunet-ext}
 216 Template for starting 'external' GNUnet projects
 217 @item @command{gnunet-java}
 218 Java APIs for writing GNUnet services and applications
 219 @item @command{gnunet-java-ext}
 220 @item @command{eclectic}
 221 Code to run GNUnet nodes on testbeds for research, development,
 222 testing and evaluation
 223 @c ** FIXME: Solve the status and location of gnunet-qt
 224 @item @command{gnunet-qt}
 225 Qt-based GNUnet GUI (is it deprecated?)
 226 @item @command{gnunet-cocoa}
 227 Cocoa-based GNUnet GUI (is it deprecated?)
 228 @item @command{gnunet-guile}
 229 Guile bindings for GNUnet
 230 @item @command{gnunet-python}
 231 Python bindings for GNUnet
 232
 233 @end table
 234
 235 We are also working on various supporting libraries and tools:
 236 @c ** FIXME: What about gauger, and what about libmwmodem?
 237
 238 @table @asis
 239 @item @command{libextractor}
 240 GNU libextractor (meta data extraction)
 241 @item @command{libmicrohttpd}
 242 GNU libmicrohttpd (embedded HTTP(S) server library)
 243 @item @command{gauger}
 244 Tool for performance regression analysis
 245 @item @command{monkey}
 246 Tool for automated debugging of distributed systems
 247 @item @command{libmwmodem}
 248 Library for accessing satellite connection quality reports
 249 @item @command{libgnurl}
 250 gnURL (feature-restricted variant of cURL/libcurl)
 251 @item @command{www}
 252 Work in progress on the new gnunet.org website (based on the Jinja2
 253 framework, to replace our current Drupal website)
 254 @item @command{bibliography}
 255 Our collected bibliography, papers, references, and so forth
 256 @item @command{gnunet-videos-}
 257 Videos about and around GNUnet activities
 258 @end table
 259
 260 Finally, there are various external projects (see links for a list of
 261 those that have a public website) which build on top of the GNUnet
 262 framework.
 263
 264 @c ***********************************************************************
 265 @node Internal dependencies
 266 @section Internal dependencies
 267
 268 This section tries to give an overview of what processes a typical GNUnet
 269 peer running a particular application would consist of. All of the
 270 processes listed here should be automatically started by
 271 @command{gnunet-arm -s}.
 272 The list is given as a rough first guide to users for failure diagnostics.
 273 Ideally, end-users should never have to worry about these internal
 274 dependencies.
 275
 276 In terms of internal dependencies, a minimum file-sharing system consists
 277 of the following GNUnet processes (in order of dependency):
 278
 279 @itemize @bullet
 280 @item gnunet-service-arm
 281 @item gnunet-service-resolver (required by all)
 282 @item gnunet-service-statistics (required by all)
 283 @item gnunet-service-peerinfo
 284 @item gnunet-service-transport (requires peerinfo)
 285 @item gnunet-service-core (requires transport)
 286 @item gnunet-daemon-hostlist (requires core)
 287 @item gnunet-daemon-topology (requires hostlist, peerinfo)
 288 @item gnunet-service-datastore
 289 @item gnunet-service-dht (requires core)
 290 @item gnunet-service-identity
 291 @item gnunet-service-fs (requires identity, mesh, dht, datastore, core)
 292 @end itemize
 293
 294 @noindent
 295 A minimum VPN system consists of the following GNUnet processes (in
 296 order of dependency):
 297
 298 @itemize @bullet
 299 @item gnunet-service-arm
 300 @item gnunet-service-resolver (required by all)
 301 @item gnunet-service-statistics (required by all)
 302 @item gnunet-service-peerinfo
 303 @item gnunet-service-transport (requires peerinfo)
 304 @item gnunet-service-core (requires transport)
 305 @item gnunet-daemon-hostlist (requires core)
 306 @item gnunet-service-dht (requires core)
 307 @item gnunet-service-mesh (requires dht, core)
 308 @item gnunet-service-dns (requires dht)
 309 @item gnunet-service-regex (requires dht)
 310 @item gnunet-service-vpn (requires regex, dns, mesh, dht)
 311 @end itemize
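
The phrase "in order of dependency" can be checked mechanically. The following is a minimal sketch in C, not GNUnet code: it copies the minimum VPN list above into a table and verifies that every process appears after the processes it requires. The names @code{struct proc}, @code{vpn_procs}, @code{find_proc} and @code{order_is_valid} are hypothetical and exist only for this illustration; the "required by all" notes for resolver and statistics are deliberately not expanded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NEEDS 5 /* up to four requirements plus a NULL terminator */

/* One entry of the start order: a process and the processes it requires. */
struct proc {
  const char *name;
  const char *needs[MAX_NEEDS]; /* NULL-terminated; unused slots are NULL */
};

/* The minimum VPN system, in the order given in the list above
   (the "gnunet-service-"/"gnunet-daemon-" prefixes are dropped). */
const struct proc vpn_procs[] = {
  { "arm",        { NULL } },
  { "resolver",   { NULL } },
  { "statistics", { NULL } },
  { "peerinfo",   { NULL } },
  { "transport",  { "peerinfo", NULL } },
  { "core",       { "transport", NULL } },
  { "hostlist",   { "core", NULL } },
  { "dht",        { "core", NULL } },
  { "mesh",       { "dht", "core", NULL } },
  { "dns",        { "dht", NULL } },
  { "regex",      { "dht", NULL } },
  { "vpn",        { "regex", "dns", "mesh", "dht", NULL } },
};

#define VPN_COUNT (sizeof (vpn_procs) / sizeof (vpn_procs[0]))

/* Return the index of `name` in procs[0..n), or -1 if it is absent. */
int find_proc (const struct proc *procs, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (0 == strcmp (procs[i].name, name))
      return (int) i;
  return -1;
}

/* Return 1 if every process is listed after everything it requires,
   0 if a requirement is missing or listed at or after the process. */
int order_is_valid (const struct proc *procs, size_t n)
{
  assert (NULL != procs);
  for (size_t i = 0; i < n; i++)
    for (size_t k = 0; k < MAX_NEEDS && NULL != procs[i].needs[k]; k++)
    {
      int j = find_proc (procs, n, procs[i].needs[k]);
      if ((j < 0) || ((size_t) j >= i))
        return 0;
    }
  return 1;
}
```

Calling @code{order_is_valid (vpn_procs, VPN_COUNT)} reports whether the listed order is consistent; the same table layout could be filled in for the other two lists.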
 312
 313 @noindent
 314 A minimum GNS system consists of the following GNUnet processes (in
 315 order of dependency):
 316
 317 @itemize @bullet
 318 @item gnunet-service-arm
 319 @item gnunet-service-resolver (required by all)
 320 @item gnunet-service-statistics (required by all)
 321 @item gnunet-service-peerinfo
 322 @item gnunet-service-transport (requires peerinfo)
 323 @item gnunet-service-core (requires transport)
 324 @item gnunet-daemon-hostlist (requires core)
 325 @item gnunet-service-dht (requires core)
 326 @item gnunet-service-mesh (requires dht, core)
 327 @item gnunet-service-dns (requires dht)
 328 @item gnunet-service-regex (requires dht)
 329 @item gnunet-service-vpn (requires regex, dns, mesh, dht)
 330 @item gnunet-service-identity
 331 @item gnunet-service-namestore (requires identity)
 332 @item gnunet-service-gns (requires vpn, dns, dht, namestore, identity)
 333 @end itemize
 334
 335 @c ***********************************************************************
 336 @node Code overview
 337 @section Code overview
 338
 339 This section gives a brief overview of the GNUnet source code.
 340 Specifically, we sketch the function of each of the subdirectories in
 341 the @file{gnunet/src/} directory. The order given is roughly bottom-up
 342 (in terms of the layers of the system).
 343
 344 @table @asis
 345 @item @file{util/} --- libgnunetutil
 346 Library with general utility functions; all
 347 GNUnet binaries link against this library. It covers anything from memory
 348 allocation and data structures to cryptography and inter-process
 349 communication. The goal is to provide an OS-independent interface and
 350 more 'secure' or convenient implementations of commonly used primitives.
 351 The API is spread over more than a dozen headers; developers should study
 352 those closely to avoid duplicating existing functions.
 353 @xref{libgnunetutil}.
 354 @item @file{hello/} --- libgnunethello
 355 HELLO messages are used to
 356 describe under which addresses a peer can be reached (for example,
 357 protocol, IP, port). This library manages parsing and generating of HELLO
 358 messages.
 359 @item @file{block/} --- libgnunetblock
 360 The DHT and other components of GNUnet
 361 store information in units called 'blocks'. Each block has a type and the
 362 type defines a particular format and how that binary format is to be
 363 linked to a hash code (the key for the DHT and for databases). The block
 364 library is a wrapper around block plugins which provide the necessary
 365 functions for each block type.
 366 @item @file{statistics/} --- statistics service
 367 The statistics service enables associating
 368 values (of type uint64_t) with a component name and a string. The main
 369 uses are debugging (counting events), performance tracking and user
 370 entertainment (what did my peer do today?).
 371 @item @file{arm/} --- Automatic Restart Manager (ARM)
 372 The automatic-restart-manager (ARM) service
 373 is the GNUnet master service. Its role is to start gnunet-services, to
 374 restart them when they crash and finally to shut down the system when
 375 requested.
 376 @item @file{peerinfo/} --- peerinfo service
 377 The peerinfo service keeps track of which peers are known
 378 to the local peer and also tracks the validated addresses of each of
 379 those peers (in the form of a HELLO message). The peer is not
 380 necessarily connected to all peers known to the peerinfo service.
 381 Peerinfo provides persistent storage for peer identities --- peers are
 382 not forgotten just because of a system restart.
 383 @item @file{datacache/} --- libgnunetdatacache
 384 The datacache library provides (temporary) block storage for the DHT.
 385 Existing plugins can store blocks in SQLite, Postgres or MySQL databases.
 386 All data stored in the cache is lost when the peer is stopped or
 387 restarted (datacache uses temporary tables).
 388 @item @file{datastore/} --- datastore service
 389 The datastore service stores file-sharing blocks in
 390 databases for extended periods of time. In contrast to the datacache, data
 391 is not lost when peers restart. However, quota restrictions may still
 392 cause old, expired or low-priority data to be eventually discarded.
 393 Existing plugins can store blocks in SQLite, Postgres or MySQL databases.
 394 @item @file{template/} --- service template
 395 Template for writing a new service. Does nothing.
 396 @item @file{ats/} --- Automatic Transport Selection
 397 The automatic transport selection (ATS) service
 398 is responsible for deciding which address (i.e.
 399 which transport plugin) should be used for communication with other peers,
 400 and at what bandwidth.
 401 @item @file{nat/} --- libgnunetnat
 402 Library that provides basic functions for NAT traversal.
 403 The library supports NAT traversal with
 404 manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
 405 traversal. The library also includes an API for testing if the current
 406 configuration works and the @code{gnunet-nat-server} which provides an
 407 external service to test the local configuration.
 408 @item @file{fragmentation/} --- libgnunetfragmentation
 409 Some transports (UDP and WLAN, mostly) have restrictions on the maximum
 410 transfer unit (MTU) for packets. The fragmentation library can be used to
 411 break larger packets into chunks of at most 1k and transmit the resulting
 412 fragments reliably (with acknowledgment, retransmission, timeouts,
 413 etc.).
 414 @item @file{transport/} --- transport service
 415 The transport service is responsible for managing the
 416 basic P2P communication. It uses plugins to support P2P communication
 417 over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
 418 validates peer addresses, enforces bandwidth restrictions, limits the
 419 total number of connections and enforces connectivity restrictions (i.e.
 420 friends-only).
 421 @item @file{peerinfo-tool/} --- gnunet-peerinfo
 422 This directory contains the gnunet-peerinfo binary which can be used to
 423 inspect the peers and HELLOs known to the peerinfo service.
 424 @item @file{core/}
 425 The core service is responsible for establishing encrypted, authenticated
 426 connections with other peers, encrypting and decrypting messages and
 427 forwarding messages to higher-level services that are interested in them.
 428 @item @file{testing/} --- libgnunettesting
 429 The testing library allows starting (and stopping) peers
 430 for writing testcases.
 431 It also supports automatic generation of configurations for peers
 432 ensuring that the ports and paths are disjoint. libgnunettesting is also
 433 the foundation for the testbed service.
 434 @item @file{testbed/} --- testbed service
 435 The testbed service is used for creating small or large scale deployments
 436 of GNUnet peers for evaluation of protocols.
 437 It facilitates peer deployments on multiple
 438 hosts (for example, in a cluster) and establishing various network
 439 topologies (both underlay and overlay).
 440 @item @file{nse/} --- Network Size Estimation
 441 The network size estimation (NSE) service
 442 implements a protocol for (securely) estimating the current size of the
 443 P2P network.
 444 @item @file{dht/} --- distributed hash table
 445 The distributed hash table (DHT) service provides a
 446 distributed implementation of a hash table to store blocks under hash
 447 keys in the P2P network.
 448 @item @file{hostlist/} --- hostlist service
 449 The hostlist service allows learning about
 450 other peers in the network by downloading HELLO messages from an HTTP
 451 server, can be configured to run such an HTTP server and also implements
 452 a P2P protocol to advertise and automatically learn about other peers
 453 that offer a public hostlist server.
 454 @item @file{topology/} --- topology service
 455 The topology service is responsible for
 456 maintaining the mesh topology. It tries to maintain connections to friends
 457 (depending on the configuration) and also tries to ensure that the peer
 458 has a decent number of active connections at all times. If necessary, new
 459 connections are added. All peers should run the topology service,
 460 otherwise they may end up not being connected to any other peer (unless
 461 some other service ensures that core establishes the required
 462 connections). The topology service also tells the transport service which
 463 connections are permitted (for friend-to-friend networking).
 464 @item @file{fs/} --- file-sharing
 465 The file-sharing (FS) service implements GNUnet's
 466 file-sharing application. Both anonymous file-sharing (using gap) and
 467 non-anonymous file-sharing (using dht) are supported.
 468 @item @file{cadet/} --- cadet service
 469 The CADET service provides a general-purpose routing abstraction to create
 470 end-to-end encrypted tunnels in mesh networks. We wrote a paper
 471 documenting key aspects of the design.
 472 @item @file{tun/} --- libgnunettun
 473 Library for building IPv4, IPv6 packets and creating
 474 checksums for UDP, TCP and ICMP packets. The header
 475 defines C structs for common Internet packet formats and in particular
 476 structs for interacting with TUN (virtual network) interfaces.
 477 @item @file{mysql/} --- libgnunetmysql
 478 Library for creating and executing prepared MySQL
 479 statements and to manage the connection to the MySQL database.
 480 Essentially a lightweight wrapper for the interaction between GNUnet
 481 components and libmysqlclient.
 482 @item @file{dns/}
 483 Service that allows intercepting and modifying DNS requests of
 484 the local machine. Currently used for IPv4-IPv6 protocol translation
 485 (DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
 486 service can also be configured to offer an exit service for DNS traffic.
 487 @item @file{vpn/} --- VPN service
 488 The virtual public network (VPN) service provides a virtual
 489 tunnel interface (VTUN) for IP routing over GNUnet.
 490 Needs some other peers to run an "exit" service to work.
 491 Can be activated using the "gnunet-vpn" tool or integrated with DNS using
 492 the "pt" daemon.
 493 @item @file{exit/}
 494 Daemon to allow traffic from the VPN to exit this
 495 peer to the Internet or to specific IP-based services of the local peer.
 496 Currently, an exit service can only be restricted to IPv4 or IPv6, not to
 497 specific ports and/or IP address ranges. If this is not acceptable,
 498 additional firewall rules must be added manually. The exit daemon currently only
 499 works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
 500 system via a DNS service.
 501 @item @file{pt/}
 502 Protocol translation daemon. This daemon enables 4-to-6,
 503 6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
 504 essentially uses "DNS" to intercept DNS replies and then maps results to
 505 those offered by the VPN, which then sends them using mesh to some daemon
 506 offering an appropriate exit service.
 507 @item @file{identity/}
 508 Management of egos (alter egos) of a user; identities are
 509 essentially named ECC private keys and used for zones in the GNU name
 510 system and for namespaces in file-sharing, but might find other uses later.
 511 @item @file{revocation/}
 512 Key revocation service; it can be used to revoke the
 513 private key of an identity if it has been compromised.
 514 @item @file{namecache/}
 515 Cache for resolution results for the GNU name system;
 516 data is encrypted and can be shared among users.
 517 Loss of the data should ideally only result in a
 518 performance degradation (persistence not required).
 519 @item @file{namestore/}
 520 Database for the GNU name system with per-user private information;
 521 persistence required.
 522 @item @file{gns/}
 523 GNU name system, a GNU approach to DNS and PKI.
 524 @item @file{dv/}
 525 A plugin for distance-vector (DV)-based routing.
 526 DV consists of a service and a transport plugin to provide peers
 527 with the illusion of a direct P2P connection for connections
 528 that use multiple (typically up to 3) hops in the actual underlay network.
 529 @item @file{regex/}
 530 Service for the (distributed) evaluation of regular expressions.
 531 @item @file{scalarproduct/}
 532 The scalar product service offers an API to perform a secure multiparty
 533 computation which calculates a scalar product between two peers
 534 without exposing the private input vectors of the peers to each other.
 535 @item @file{consensus/}
 536 The consensus service will allow a set of peers to agree
 537 on a set of values via a distributed set union computation.
 538 @item @file{rest/}
 539 The REST API allows access to GNUnet services using RESTful interaction.
 540 The services provide plugins that can be exposed by the REST server.
 541 @c FIXME: Where did this disappear to?
 542 @c @item @file{experimentation/}
 543 @c The experimentation daemon coordinates distributed
 544 @c experimentation to evaluate transport and ATS properties.
 545 @end table
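
As an illustration of the splitting step described for @file{fragmentation/} above, here is a minimal sketch in C. It is not the libgnunetfragmentation API: the names @code{FRAGMENT_MAX}, @code{struct fragment}, @code{fragment_count} and @code{fragment_at} are hypothetical, "1k" is assumed to mean 1024 bytes, and per-fragment headers, acknowledgments, retransmission and timeouts are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: "at most 1k" is read as 1024 bytes. */
#define FRAGMENT_MAX 1024

/* Position of one fragment within the original message. */
struct fragment {
  size_t offset; /* first byte of the fragment in the message */
  size_t length; /* number of bytes in the fragment, <= max_len */
};

/* Number of fragments needed for a message of msg_len bytes
   (rounded up; an empty message needs no fragments). */
size_t fragment_count (size_t msg_len, size_t max_len)
{
  assert (max_len > 0);
  return (msg_len + max_len - 1) / max_len;
}

/* Describe fragment number `index` (counting from 0) of the message.
   Every fragment but the last is exactly max_len bytes long. */
struct fragment fragment_at (size_t msg_len, size_t max_len, size_t index)
{
  struct fragment f;

  assert (index < fragment_count (msg_len, max_len));
  f.offset = index * max_len;
  f.length = (msg_len - f.offset < max_len) ? msg_len - f.offset : max_len;
  return f;
}
```

For a 2500-byte message and @code{FRAGMENT_MAX}, @code{fragment_count} yields three fragments, the last of which starts at offset 2048 and carries the remaining 452 bytes.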
 546
 547 @c ***********************************************************************
 548 @node System Architecture
 549 @section System Architecture
 550
 551 @c FIXME: For those irritated by the textflow, we are missing images here,
 552 @c in the short term we should add them back, in the long term this should
 553 @c work without images or have images with alt-text.
 554
 555 GNUnet developers like LEGOs. The blocks are indestructible, can be
 556 stacked together to construct complex buildings and it is generally easy
 557 to swap one block for a different one that has the same shape. GNUnet's
 558 architecture is based on LEGOs.
 559
 560 @c @image{images/service_lego_block,5in,,picture of a LEGO block stack - 3 APIs as connectors upon Network Protocol on top of a Service}
 561
 562 This chapter documents the GNUnet LEGO system, also known as GNUnet's
 563 system architecture.
 564
 565 The most common GNUnet component is a service. Services offer an API (or
 566 several, depending on what you count as "an API") which is implemented as
 567 a library. The library communicates with the main process of the service
 568 using a service-specific network protocol. The main process of the service
 569 typically doesn't fully provide everything that is needed --- it has holes
 570 to be filled by APIs to other services.
 571
 572 User interfaces and daemons are a special kind of component in GNUnet.
 573 Like services, they have holes to be filled by APIs of other services.
 574 Unlike services, daemons do not implement their own network protocol and
 575 they have no API.
 576
 577 The GNUnet system provides a range of services, daemons and user
 578 interfaces, which are then combined into a layered GNUnet instance (also
 579 known as a peer).
 580
 581 Note that while it is generally possible to swap one service for another
 582 compatible service, there is often only one implementation. However,
 583 during development we often have a "new" version of a service in parallel
 584 with an "old" version. While the "new" version is not yet working, developers
 585 working on other parts of the system can continue their development by
 586 simply using the "old" service. Alternative design ideas can also be
 587 easily investigated by swapping out individual components. This is
 588 typically achieved by simply changing the name of the "BINARY" in the
 589 respective configuration section.
 590
 591 Key properties of GNUnet services are that they must be separate
 592 processes and that they must protect themselves by applying tight error
 593 checking against the network protocol they implement (thereby achieving a
 594 certain degree of robustness).
 595
 596 On the other hand, the APIs are implemented to tolerate failures of the
 597 service, isolating their host process from errors by the service. If the
 598 service process crashes, other services and daemons around it should not
 599 also fail, but instead wait for the service process to be restarted by
 600 ARM.
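
One way a client library could "wait for the service process to be restarted" is to retry the connection with a growing delay. The sketch below shows that idea in C under stated assumptions; it is not how the GNUnet libraries are implemented. The names @code{connect_fn}, @code{struct reconnect_result}, @code{reconnect_with_backoff} and @code{service_down_n_times} are hypothetical, the service is simulated by a counter, and delays are added up rather than actually slept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connect callback: returns 1 on success, 0 if the
   service is currently not reachable. */
typedef int (*connect_fn) (void *cls);

/* Outcome of a reconnect loop. */
struct reconnect_result {
  int connected;      /* 1 if a connection attempt succeeded */
  unsigned attempts;  /* number of attempts made */
  uint64_t waited_ms; /* sum of all delays added after failed attempts */
};

/* Try to connect up to max_attempts times. After each failed attempt
   the current delay is added to waited_ms and then doubled, capped at
   max_delay_ms. A real client would schedule a retry instead of
   merely accounting for the delay. */
struct reconnect_result
reconnect_with_backoff (connect_fn try_connect, void *cls,
                        unsigned max_attempts,
                        uint64_t first_delay_ms, uint64_t max_delay_ms)
{
  struct reconnect_result r = { 0, 0, 0 };
  uint64_t delay = first_delay_ms;

  assert (NULL != try_connect);
  while (r.attempts < max_attempts)
  {
    r.attempts++;
    if (try_connect (cls))
    {
      r.connected = 1;
      break;
    }
    r.waited_ms += delay;
    delay = (delay * 2 > max_delay_ms) ? max_delay_ms : delay * 2;
  }
  return r;
}

/* Simulated service for the sketch: cls points to an unsigned counter;
   the service is "down" for that many calls and "up" afterwards. */
int service_down_n_times (void *cls)
{
  unsigned *remaining = cls;

  if (*remaining > 0)
  {
    (*remaining)--;
    return 0;
  }
  return 1;
}
```

The point of the design is that the host process of the API never fails just because the service is temporarily gone; it only observes a delay until the service is back.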
 601
 602
 603 @c ***********************************************************************
 604 @node Subsystem stability
 605 @section Subsystem stability
 606
 607 This section documents the current stability of the various GNUnet
 608 subsystems. Stability here describes the expected degree of compatibility
 609 with future versions of GNUnet. For each subsystem we distinguish between
 610 compatibility on the P2P network level (communication protocol between
 611 peers), the IPC level (communication between the service and the service
 612 library) and the API level (stability of the API). P2P compatibility is
 613 relevant in terms of which applications are likely going to be able to
 614 communicate with future versions of the network. IPC communication is
 615 relevant for the implementation of language bindings that re-implement the
 616 IPC messages. Finally, API compatibility is relevant to developers that
 617 hope to be able to avoid changes to applications built on top of the APIs
 618 of the framework.
 619
 620 The following table summarizes our current view of the stability of the
 621 respective protocols or APIs:
 622
 623 @multitable @columnfractions .20 .20 .20 .20
 624 @headitem Subsystem @tab P2P @tab IPC @tab C API
 625 @item util @tab n/a @tab n/a @tab stable
 626 @item arm @tab n/a @tab stable @tab stable
 627 @item ats @tab n/a @tab unstable @tab testing
 628 @item block @tab n/a @tab n/a @tab stable
 629 @item cadet @tab testing @tab testing @tab testing
 630 @item consensus @tab experimental @tab experimental @tab experimental
 631 @item core @tab stable @tab stable @tab stable
 632 @item datacache @tab n/a @tab n/a @tab stable
 633 @item datastore @tab n/a @tab stable @tab stable
 634 @item dht @tab stable @tab stable @tab stable
 635 @item dns @tab stable @tab stable @tab stable
 636 @item dv @tab testing @tab testing @tab n/a
 637 @item exit @tab testing @tab n/a @tab n/a
 638 @item fragmentation @tab stable @tab n/a @tab stable
 639 @item fs @tab stable @tab stable @tab stable
 640 @item gns @tab stable @tab stable @tab stable
 641 @item hello @tab n/a @tab n/a @tab testing
 642 @item hostlist @tab stable @tab stable @tab n/a
 643 @item identity @tab stable @tab stable @tab n/a
 644 @item multicast @tab experimental @tab experimental @tab experimental
 645 @item mysql @tab stable @tab n/a @tab stable
 646 @item namestore @tab n/a @tab stable @tab stable
 647 @item nat @tab n/a @tab n/a @tab stable
 648 @item nse @tab stable @tab stable @tab stable
 649 @item peerinfo @tab n/a @tab stable @tab stable
 650 @item psyc @tab experimental @tab experimental @tab experimental
 651 @item pt @tab n/a @tab n/a @tab n/a
 652 @item regex @tab stable @tab stable @tab stable
 653 @item revocation @tab stable @tab stable @tab stable
 654 @item social @tab experimental @tab experimental @tab experimental
 655 @item statistics @tab n/a @tab stable @tab stable
 656 @item testbed @tab n/a @tab testing @tab testing
 657 @item testing @tab n/a @tab n/a @tab testing
 658 @item topology @tab n/a @tab n/a @tab n/a
 659 @item transport @tab stable @tab stable @tab stable
 660 @item tun @tab n/a @tab n/a @tab stable
 661 @item vpn @tab testing @tab n/a @tab n/a
 662 @end multitable
 663
 664 Here is a rough explanation of the values:
 665
 666 @table @samp
 667 @item stable
 668 No incompatible changes are planned at this time; for IPC/APIs, if
 669 there are incompatible changes, they will be minor and might only require
 670 minimal changes to existing code; for P2P, changes will be avoided if at
 671 all possible for the 0.10.x-series
 672
 673 @item testing
 674 No incompatible changes are
 675 planned at this time, but the code is still known to be in flux; so while
 676 we have no concrete plans, our expectation is that there will still be
 677 minor modifications; for P2P, changes will likely be extensions that
 678 should not break existing code
 679
 680 @item unstable
 681 Changes are planned and will happen; however, they
 682 will not be totally radical and the result should still resemble what is
 683 there now; nevertheless, anticipated changes will break protocol/API
 684 compatibility
 685
 686 @item experimental
 687 Changes are planned and the result may look nothing like
 688 what the API/protocol looks like today
 689
 690 @item unknown
Someone should think about where this subsystem is headed
 692
 693 @item n/a
 694 This subsystem does not have an API/IPC-protocol/P2P-protocol
 695 @end table
 696
 697 @c ***********************************************************************
 698 @node Naming conventions and coding style guide
 699 @section Naming conventions and coding style guide
 700
 701 Here you can find some rules to help you write code for GNUnet.
 702
 703 @c ***********************************************************************
 704 @menu
 705 * Naming conventions::
 706 * Coding style::
 707 @end menu
 708
 709 @node Naming conventions
 710 @subsection Naming conventions
 711
 712
 713 @c ***********************************************************************
 714 @menu
 715 * include files::
 716 * binaries::
 717 * logging::
 718 * configuration::
 719 * exported symbols::
 720 * private (library-internal) symbols (including structs and macros)::
 721 * testcases::
 722 * performance tests::
 723 * src/ directories::
 724 @end menu
 725
 726 @node include files
 727 @subsubsection include files
 728
 729 @itemize @bullet
 730 @item _lib: library without need for a process
 731 @item _service: library that needs a service process
 732 @item _plugin: plugin definition
 733 @item _protocol: structs used in network protocol
 734 @item exceptions:
 735 @itemize @bullet
 736 @item gnunet_config.h --- generated
 737 @item platform.h --- first included
 738 @item plibc.h --- external library
 739 @item gnunet_common.h --- fundamental routines
 740 @item gnunet_directories.h --- generated
 741 @item gettext.h --- external library
 742 @end itemize
 743 @end itemize
 744
 745 @c ***********************************************************************
 746 @node binaries
 747 @subsubsection binaries
 748
 749 @itemize @bullet
 750 @item gnunet-service-xxx: service process (has listen socket)
 751 @item gnunet-daemon-xxx: daemon process (no listen socket)
 752 @item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
 753 @item gnunet-yyy: command-line tool for end-users
 754 @item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
 755 @item libgnunetxxx.so: library for API xxx
 756 @end itemize
 757
 758 @c ***********************************************************************
 759 @node logging
 760 @subsubsection logging
 761
 762 @itemize @bullet
@item services and daemons use their directory name in
@code{GNUNET_log_setup} (e.g. 'core') and log using
plain @code{GNUNET_log}.
@item command-line tools use their full name in
@code{GNUNET_log_setup} (e.g. 'gnunet-publish') and log using
plain @code{GNUNET_log}.
@item service access libraries log using
@code{GNUNET_log_from} and use @code{DIRNAME-api} for the
component (e.g. 'core-api')
@item pure libraries (without associated service) use
@code{GNUNET_log_from} with the component set to their
library name (without lib or @file{.so}),
which should also be their directory name (e.g. @file{nat})
@item plugins should use @code{GNUNET_log_from}
with the directory name and the plugin name combined to produce
the component name (e.g. 'transport-tcp').
 779 @item logging should be unified per-file by defining a
 780 @code{LOG} macro with the appropriate arguments,
 781 along these lines:
 782
 783 @example
#define LOG(kind,...) \
  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
 786 @end example
 787
 788 @end itemize
 789
 790 @c ***********************************************************************
 791 @node configuration
 792 @subsubsection configuration
 793
 794 @itemize @bullet
@item paths (that are substituted in all filenames) are in the
@code{[PATHS]} section (keep these as few as possible)
 797 @item all options for a particular module (@file{src/MODULE})
 798 are under @code{[MODULE]}
 799 @item options for a plugin of a module
 800 are under @code{[MODULE-PLUGINNAME]}
 801 @end itemize
 802
 803 @c ***********************************************************************
 804 @node exported symbols
 805 @subsubsection exported symbols
 806
 807 @itemize @bullet
 808 @item must start with @code{GNUNET_modulename_} and be defined in
 809 @file{modulename.c}
 810 @item exceptions: those defined in @file{gnunet_common.h}
 811 @end itemize
 812
 813 @c ***********************************************************************
 814 @node private (library-internal) symbols (including structs and macros)
 815 @subsubsection private (library-internal) symbols (including structs and macros)
 816
 817 @itemize @bullet
 818 @item must NOT start with any prefix
 819 @item must not be exported in a way that linkers could use them or@ other
 820 libraries might see them via headers; they must be either
 821 declared/defined in C source files or in headers that are in the
 822 respective directory under @file{src/modulename/} and NEVER be declared
 823 in @file{src/include/}.
 824 @end itemize
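
The rules for exported and private symbols can be sketched as follows.
This is only an illustration: the module name @code{example} and both
function names are hypothetical and do not exist in GNUnet. The upper-case
prefix follows the pattern of existing functions such as
@code{GNUNET_TESTING_system_create}. The private helper is @code{static}
and carries no prefix; the exported function carries the module prefix,
and in a real module its prototype would be declared in a header under
@file{src/include/}.

@example
#include <stddef.h>

/* Private helper (hypothetical): no prefix, and static so that
   the linker never exports it.  */
static unsigned int
saturating_add (unsigned int a, unsigned int b)
@{
  unsigned int sum = a + b;

  if (sum < a)
    return (unsigned int) -1;   /* wrapped around: clamp to the maximum */
  return sum;
@}


/* Exported symbol (hypothetical): starts with the module prefix.  */
unsigned int
GNUNET_EXAMPLE_sum (const unsigned int *values, size_t count)
@{
  unsigned int total = 0;
  size_t i;

  for (i = 0; i < count; i++)
    total = saturating_add (total, values[i]);
  return total;
@}
@end example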
 825
 826 @node testcases
 827 @subsubsection testcases
 828
 829 @itemize @bullet
@item must be called @file{test_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one test
 832 @end itemize
 833
 834 @c ***********************************************************************
 835 @node performance tests
 836 @subsubsection performance tests
 837
 838 @itemize @bullet
@item must be called @file{perf_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one performance
test
 842 @item Must only be run if @code{HAVE_BENCHMARKS} is satisfied
 843 @end itemize
 844
 845 @c ***********************************************************************
 846 @node src/ directories
 847 @subsubsection src/ directories
 848
 849 @itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
 859 @end itemize
 860
 861 @cindex Coding style
 862 @node Coding style
 863 @subsection Coding style
 864
 865 @c XXX: Adjust examples to GNU Standards!
 866 @itemize @bullet
 867 @item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
 868 @item Indentation is done with spaces, two per level, no tabs;
 869 @item C99 struct initialization is fine;
 870 @item declare only one variable per line, for example:
 871
 872 @noindent
 873 instead of
 874
 875 @example
 876 int i,j;
 877 @end example
 878
 879 @noindent
 880 write:
 881
 882 @example
 883 int i;
 884 int j;
 885 @end example
 886
 887 @c TODO: include actual example from a file in source
 888
 889 @noindent
 890 This helps keep diffs small and forces developers to think precisely about
 891 the type of every variable.
 892 Note that @code{char *} is different from @code{const char*} and
 893 @code{int} is different from @code{unsigned int} or @code{uint32_t}.
 894 Each variable type should be chosen with care.
 895
 896 @item While @code{goto} should generally be avoided, having a
 897 @code{goto} to the end of a function to a block of clean up
 898 statements (free, close, etc.) can be acceptable.
 899
 900 @item Conditions should be written with constants on the left (to avoid
 901 accidental assignment) and with the @code{true} target being either the
 902 @code{error} case or the significantly simpler continuation. For example:
 903
 904 @example
if (0 != stat ("filename", &sbuf)) @{
  error ();
@}
else @{
  /* handle normal case here */
@}
 911 @end example
 912
 913 @noindent
 914 instead of
 915
 916 @example
if (stat ("filename", &sbuf) == 0) @{
  /* handle normal case here */
@} else @{
  error ();
@}
 922 @end example
 923
 924 @noindent
 925 If possible, the error clause should be terminated with a @code{return} (or
 926 @code{goto} to some cleanup routine) and in this case, the @code{else} clause
 927 should be omitted:
 928
 929 @example
if (0 != stat ("filename", &sbuf)) @{
  error ();
  return;
@}
/* handle normal case here */
 935 @end example
 936
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
NULL, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
 941
 942 @item Combined assignments and tests are allowed if they do not hinder
 943 code clarity. For example, one can write:
 944
 945 @example
if (NULL == (value = lookup_function ())) @{
  error ();
  return;
@}
 950 @end example
 951
 952 @item Use @code{break} and @code{continue} wherever possible to avoid
 953 deep(er) nesting. Thus, we would write:
 954
 955 @example
 956 next = head;
 957 while (NULL != (pos = next)) @{
 958   next = pos->next;
 959   if (! should_free (pos))
 960     continue;
 961   GNUNET_CONTAINER_DLL_remove (head, tail, pos);
 962   GNUNET_free (pos);
 963  @}
 964 @end example
 965
 966 instead of
 967
 968 @example
next = head;
while (NULL != (pos = next)) @{
  next = pos->next;
  if (should_free (pos)) @{
    /* unnecessary nesting! */
    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
    GNUNET_free (pos);
  @}
@}
 977 @end example
 978
 979 @item We primarily use @code{for} and @code{while} loops.
 980 A @code{while} loop is used if the method for advancing in the loop is
 981 not a straightforward increment operation. In particular, we use:
 982
 983 @example
 984 next = head;
 985 while (NULL != (pos = next))
 986 @{
 987   next = pos->next;
 988   if (! should_free (pos))
 989     continue;
 990   GNUNET_CONTAINER_DLL_remove (head, tail, pos);
 991   GNUNET_free (pos);
 992 @}
 993 @end example
 994
to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
 998 However, for loops that do follow the simple @code{for} paradigm we do
 999 use @code{for}, even if it involves linked lists:
1000
1001 @example
1002 /* simple iteration over a linked list */
1003 for (pos = head;
1004      NULL != pos;
1005      pos = pos->next)
1006 @{
1007    use (pos);
1008 @}
1009 @end example
1010
1011
1012 @item The first argument to all higher-order functions in GNUnet must be
1013 declared to be of type @code{void *} and is reserved for a closure. We do
1014 not use inner functions, as trampolines would conflict with setups that
1015 use non-executable stacks.
1016 The first statement in a higher-order function, which unusually should
1017 be part of the variable declarations, should assign the
1018 @code{cls} argument to the precise expected type. For example:
1019
1020 @example
1021 int callback (void *cls, char *args) @{
1022   struct Foo *foo = cls;
1023   int other_variables;
1024
1025    /* rest of function */
1026 @}
1027 @end example
1028
1029
1030 @item It is good practice to write complex @code{if} expressions instead
1031 of using deeply nested @code{if} statements. However, except for addition
1032 and multiplication, all operators should use parens. This is fine:
1033
1034 @example
1035 if ( (1 == foo) || ((0 == bar) && (x != y)) )
1036   return x;
1037 @end example
1038
1039
1040 However, this is not:
1041
1042 @example
1043 if (1 == foo)
1044   return x;
1045 if (0 == bar && x != y)
1046   return x;
1047 @end example
1048
1049 @noindent
1050 Note that splitting the @code{if} statement above is debatable as the
1051 @code{return x} is a very trivial statement. However, once the logic after
1052 the branch becomes more complicated (and is still identical), the "or"
1053 formulation should be used for sure.
1054
1055 @item There should be two empty lines between the end of the function and
1056 the comments describing the following function. There should be a single
1057 empty line after the initial variable declarations of a function. If a
1058 function has no local variables, there should be no initial empty line. If
1059 a long function consists of several complex steps, those steps might be
1060 separated by an empty line (possibly followed by a comment describing the
1061 following step). The code should not contain empty lines in arbitrary
1062 places; if in doubt, it is likely better to NOT have an empty line (this
1063 way, more code will fit on the screen).
1064 @end itemize
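
As an illustration of the cleanup @code{goto} mentioned above, combined
with the 'constants on the left' and 'errors in the true branch' rules,
consider the following sketch. The function @code{count_bytes} is
hypothetical and uses only the standard C library; it is not taken from
the GNUnet sources.

@example
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the number of bytes in the file,
   or -1 on error.  */
static long
count_bytes (const char *filename)
@{
  FILE *f = NULL;
  char *buf = NULL;
  long total = -1;
  size_t got;

  if (NULL == (f = fopen (filename, "rb")))
    goto cleanup;
  if (NULL == (buf = malloc (4096)))
    goto cleanup;
  total = 0;
  while (0 != (got = fread (buf, 1, 4096, f)))
    total += (long) got;
  if (0 != ferror (f))
    total = -1;
cleanup:
  free (buf);   /* free (NULL) is a no-op */
  if (NULL != f)
    fclose (f);
  return total;
@}
@end example

Every error path jumps to the single label, so each resource is released
in exactly one place and the normal case is not nested inside the checks.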
1065
1066 @c ***********************************************************************
1067 @node Build-system
1068 @section Build-system
1069
1070 If you have code that is likely not to compile or build rules you might
1071 want to not trigger for most developers, use @code{if HAVE_EXPERIMENTAL}
1072 in your @file{Makefile.am}.
1073 Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
1074 code.
1075
1076 If you want to compile all testcases but NOT run them, run configure with
1077 the @code{--enable-test-suppression} option.
1078
1079 If you want to run all testcases, including those that take a while, run
1080 configure with the @code{--enable-expensive-testcases} option.
1081
1082 If you want to compile and run benchmarks, run configure with the
1083 @code{--enable-benchmarks} option.
1084
1085 If you want to obtain code coverage results, run configure with the
1086 @code{--enable-coverage} option and run the @file{coverage.sh} script in
1087 the @file{contrib/} directory.
1088
1089 @cindex gnunet-ext
1090 @node Developing extensions for GNUnet using the gnunet-ext template
1091 @section Developing extensions for GNUnet using the gnunet-ext template
1092
For developers who want to write extensions for GNUnet we provide the
gnunet-ext template as an easy-to-use skeleton.
1095
1096 gnunet-ext contains the build environment and template files for the
1097 development of GNUnet services, command line tools, APIs and tests.
1098
1099 First of all you have to obtain gnunet-ext from git:
1100
1101 @example
1102 git clone https://git.gnunet.org/gnunet-ext.git
1103 @end example
1104
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
1109
1110 @example
1111 ./bootstrap
1112 ./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
1113 @end example
1114
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
1119
1120 @example
1121 export LD_LIBRARY_PATH=/path/to/gnunet/lib
1122 @end example
1123
1124 @cindex writing testcases
1125 @node Writing testcases
1126 @section Writing testcases
1127
1128 Ideally, any non-trivial GNUnet code should be covered by automated
1129 testcases. Testcases should reside in the same place as the code that is
1130 being tested. The name of source files implementing tests should begin
1131 with @code{test_} followed by the name of the file that contains
1132 the code that is being tested.
1133
1134 Testcases in GNUnet should be integrated with the autotools build system.
1135 This way, developers and anyone building binary packages will be able to
1136 run all testcases simply by running @code{make check}. The final
1137 testcases shipped with the distribution should output at most some brief
1138 progress information and not display debug messages by default. The
1139 success or failure of a testcase must be indicated by returning zero
1140 (success) or non-zero (failure) from the main method of the testcase.
1141 The integration with the autotools is relatively straightforward and only
1142 requires modifications to the @file{Makefile.am} in the directory
1143 containing the testcase. For a testcase testing the code in @file{foo.c}
1144 the @file{Makefile.am} would contain the following lines:
1145
1146 @example
1147 check_PROGRAMS = test_foo
1148 TESTS = $(check_PROGRAMS)
1149 test_foo_SOURCES = test_foo.c
1150 test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
1151 @end example
1152
1153 Naturally, other libraries used by the testcase may be specified in the
1154 @code{LDADD} directive as necessary.
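
A minimal sketch of what @file{test_foo.c} might contain is shown below.
The function @code{foo_add} is a hypothetical stand-in for the code under
test (a real testcase would call into the code from @file{foo.c} instead),
and @code{run_checks} is likewise an invented name. The checks print a
brief message only on failure and report the result as zero or non-zero.

@example
#include <stdio.h>

/* Hypothetical stand-in for the code under test.  */
static int
foo_add (int a, int b)
@{
  return a + b;
@}


/* Runs all checks; returns 0 on success, 1 on the first failure.  */
static int
run_checks (void)
@{
  if (5 != foo_add (2, 3))
  @{
    fprintf (stderr, "foo_add (2, 3) returned the wrong sum\n");
    return 1;
  @}
  if (0 != foo_add (-4, 4))
  @{
    fprintf (stderr, "foo_add (-4, 4) returned the wrong sum\n");
    return 1;
  @}
  return 0;
@}
@end example

The @code{main} function of the testcase would then simply
@code{return run_checks ();} so that @code{make check} sees the result as
the exit status.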
1155
1156 Often testcases depend on additional input files, such as a configuration
1157 file. These support files have to be listed using the @code{EXTRA_DIST}
1158 directive in order to ensure that they are included in the distribution.
1159
1160 Example:
1161
1162 @example
1163 EXTRA_DIST = test_foo_data.conf
1164 @end example
1165
1166 Executing @code{make check} will run all testcases in the current
1167 directory and all subdirectories. Testcases can be compiled individually
1168 by running @code{make test_foo} and then invoked directly using
1169 @code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
1170 typically necessary to run @code{make install} before running any
1171 testcases. Thus the canonical command @code{make check install} has to be
1172 changed to @code{make install check} for GNUnet.
1173
1174 @c ***********************************************************************
1175 @cindex Building GNUnet
1176 @node Building GNUnet and its dependencies
1177 @section Building GNUnet and its dependencies
1178
1179 In the following section we will outline how to build GNUnet and
1180 some of its dependencies. We will assume a fair amount of knowledge
1181 for building applications under UNIX-like systems. Furthermore we
1182 assume that the build environment is sane and that you are aware of
1183 any implications actions in this process could have.
1184 Instructions here can be seen as notes for developers (an extension to
1185 the 'HACKING' section in README) as well as package maintainers.
1186 @b{Users should rely on the available binary packages.}
1187 We will use Debian as an example Operating System environment. Substitute
1188 accordingly with your own Operating System environment.
1189
1190 For the full list of dependencies, consult the appropriate, up-to-date
1191 section in the @file{README} file.
1192
1193 First, we need to build or install (depending on your OS) the following
1194 packages. If you build them from source, build them in this exact order:
1195
1196 @example
1197 libgpgerror, libgcrypt, libnettle, libunbound, GnuTLS (with libunbound
1198 support)
1199 @end example
1200
After we have built and installed those packages, we continue with
packages closer to GNUnet in this step: libgnurl (our libcurl fork),
GNU libmicrohttpd, and GNU libextractor. Again, if your package manager
provides one of these packages, use the package it provides
unless you have good reasons not to (package version too old, conflicts, etc).
We advise against compiling widely used packages such as GnuTLS
yourself if your OS already provides a variant, unless you are prepared
to maintain those packages yourself.
1209
1210 In the optimistic case, this command will give you all the dependencies:
1211
1212 @example
1213 sudo apt-get install libgnurl libmicrohttpd libextractor
1214 @end example
1215
From experience we know that at the very least libgnurl is not
available in some environments. You could substitute libgnurl
with libcurl, but we recommend installing libgnurl, as it gives
you a preconfigured libcurl with the small feature set GNUnet requires. In
the past, the namespaces of libcurl and libgnurl were shared, which
caused problems when you wanted to install both of them on one
operating system. This has been resolved, and they can now be installed
side by side.
1224
1225 @cindex libgnurl
1226 @cindex compiling libgnurl
GNUnet and some of its functions depend on a limited subset of cURL/libcurl.
1228 Rather than trying to enforce a certain configuration on the world, we
1229 opted to maintain a microfork of it that ensures we can link against the
1230 right set of features. We called this specialized set of libcurl
1231 ``libgnurl''. It is fully ABI compatible with libcurl and currently used
1232 by GNUnet and some of its dependencies.
1233
We download libgnurl and its digital signature from the GNU fileserver,
assuming the directory named by @env{TMPDIR} exists.

Note: your temporary directory might be @file{/tmp}, the value of
@env{TMPDIR} or @env{TMP}, or any other location. For consistency we
assume @env{TMPDIR} points to @file{/tmp} for the remainder of this section.
1240
1241 @example
cd $TMPDIR
1243 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.60.0.tar.Z
1244 wget https://ftp.gnu.org/gnu/gnunet/gnurl-7.60.0.tar.Z.sig
1245 @end example
1246
1247 Next, verify the digital signature of the file:
1248
1249 @example
1250 gpg --verify gnurl-7.60.0.tar.Z.sig
1251 @end example
1252
1253 If gpg fails, you might try with @command{gpg2} on your OS. If the error
1254 states that ``the key can not be found'' or it is unknown, you have to
1255 retrieve the key (A88C8ADD129828D7EAC02E52E22F9BBFEE348588) from a
1256 keyserver first:
1257
1258 @example
1259 gpg --keyserver pgp.mit.edu --recv-keys A88C8ADD129828D7EAC02E52E22F9BBFEE348588
1260 @end example
1261
1262 and rerun the verification command.
1263
1264 libgnurl will require the following packages to be present at runtime:
1265 gnutls (with DANE support / libunbound), libidn, zlib and at compile time:
1266 libtool, groff, perl, pkg-config, and python 2.7.
1267
1268 Once you have verified that all the required packages are present on your
1269 system, we can proceed to compile libgnurl:
1270
1271 @example
1272 tar -xvf gnurl-7.60.0.tar.Z
1273 cd gnurl-7.60.0
1274 sh configure --disable-ntlm-wb
1275 make
1276 make -C tests test
1277 sudo make install
1278 @end example
1279
1280 After you've compiled and installed libgnurl, we can proceed to building
1281 GNUnet.
1282
1283
1284
1285
1286 First, in addition to the GNUnet sources you might require downloading the
1287 latest version of various dependencies, depending on how recent the
1288 software versions in your distribution of GNU/Linux are.
1289 Most distributions do not include sufficiently recent versions of these
1290 dependencies.
Thus, a typical installation on a "modern" GNU/Linux distribution
requires you to install the following dependencies (ideally in this
order):
1294
1295 @itemize @bullet
1296 @item libgpgerror and libgcrypt
1297 @item libnettle and libunbound (possibly from distribution), GnuTLS
1298 @item libgnurl (read the README)
1299 @item GNU libmicrohttpd
1300 @item GNU libextractor
1301 @end itemize
1302
1303 Make sure to first install the various mandatory and optional
1304 dependencies including development headers from your distribution.
1305
Another dependency that you should strongly consider installing is a
database (MySQL, SQLite or PostgreSQL).
The following instructions will assume that you installed at least SQLite.
For most distributions you should be able to find pre-built packages for
the database. Again, make sure to install the client libraries @b{and} the
respective development headers (if they are packaged separately) as well.
1312
You can find specific, detailed instructions for installing the
dependencies (and possibly the rest of the GNUnet installation) in the
platform-specific descriptions, which can be found in the Index.
Please consult them now.
If your distribution is not listed, please study the build
instructions for Debian stable carefully as you try to install the
dependencies for your own distribution.
Contributing additional instructions for further platforms is always
appreciated.
Please keep in mind that operating system development tends to move at
a rather fast speed. Due to this you should be aware that some of
the instructions could be outdated by the time you are reading this.
1325 If you find a mistake, please tell us about it (or even better: send
1326 a patch to the documentation to fix it!).
1327
1328 Before proceeding further, please double-check the dependency list.
1329 Note that in addition to satisfying the dependencies, you might have to
1330 make sure that development headers for the various libraries are also
1331 installed.
There may be files for other distributions, or you might be able to find
1333 equivalent packages for your distribution.
1334
1335 While it is possible to build and install GNUnet without having root
1336 access, we will assume that you have full control over your system in
1337 these instructions.
1338 First, you should create a system user @emph{gnunet} and an additional
1339 group @emph{gnunetdns}. On the GNU/Linux distributions Debian and Ubuntu,
1340 type:
1341
1342 @example
1343 sudo adduser --system --home /var/lib/gnunet --group \
1344 --disabled-password gnunet
1345 sudo addgroup --system gnunetdns
1346 @end example
1347
1348 @noindent
1349 On other Unixes and GNU systems, this should have the same effect:
1350
1351 @example
sudo useradd --system --user-group --home-dir /var/lib/gnunet gnunet
sudo groupadd --system gnunetdns
1354 @end example
1355
1356 Now compile and install GNUnet using:
1357
1358 @example
1359 tar xvf gnunet-@value{VERSION}.tar.gz
1360 cd gnunet-@value{VERSION}
1361 ./configure --with-sudo=sudo --with-nssdir=/lib
1362 make
1363 sudo make install
1364 @end example
1365
1366 If you want to be able to enable DEBUG-level log messages, add
1367 @code{--enable-logging=verbose} to the end of the
1368 @command{./configure} command.
1369 @code{DEBUG}-level log messages are in English only and
1370 should only be useful for developers (or for filing
1371 really detailed bug reports).
1372
1373 @noindent
1374 Next, edit the file @file{/etc/gnunet.conf} to contain the following:
1375
1376 @example
1377 [arm]
1378 START_SYSTEM_SERVICES = YES
1379 START_USER_SERVICES = NO
1380 @end example
1381
1382 @noindent
1383 You may need to update your @code{ld.so} cache to include
1384 files installed in @file{/usr/local/lib}:
1385
1386 @example
1387 # ldconfig
1388 @end example
1389
1390 @noindent
1391 Then, switch from user @code{root} to user @code{gnunet} to start
1392 the peer:
1393
1394 @example
1395 # su -s /bin/sh - gnunet
1396 $ gnunet-arm -c /etc/gnunet.conf -s
1397 @end example
1398
You may also want to add the last command to the gnunet user's
@file{crontab}, prefixed with @code{@@reboot}, so that it is executed
whenever the system is booted:
1402
1403 @example
1404 @@reboot /usr/local/bin/gnunet-arm -c /etc/gnunet.conf -s
1405 @end example
1406
1407 @noindent
1408 This will only start the system-wide GNUnet services.
1409 Type @command{exit} to get back your root shell.
1410 Now, you need to configure the per-user part. For each
1411 user that should get access to GNUnet on the system, run
1412 (replace alice with your username):
1413
1414 @example
1415 sudo adduser alice gnunet
1416 @end example
1417
1418 @noindent
1419 to allow them to access the system-wide GNUnet services. Then, each
1420 user should create a configuration file @file{~/.config/gnunet.conf}
1421 with the lines:
1422
1423 @example
1424 [arm]
1425 START_SYSTEM_SERVICES = NO
1426 START_USER_SERVICES = YES
1427 DEFAULTSERVICES = gns
1428 @end example
1429
1430 @noindent
1431 and start the per-user services using
1432
1433 @example
1434 $ gnunet-arm -c ~/.config/gnunet.conf -s
1435 @end example
1436
1437 @noindent
1438 Again, adding a @code{crontab} entry to autostart the peer is advised:
1439
1440 @example
1441 @@reboot /usr/local/bin/gnunet-arm -c $HOME/.config/gnunet.conf -s
1442 @end example
1443
1444 @noindent
1445 Note that some GNUnet services (such as SOCKS5 proxies) may need a
1446 system-wide TCP port for each user.
1447 For those services, systems with more than one user may require each user
1448 to specify a different port number in their personal configuration file.
1449
1450 Finally, the user should perform the basic initial setup for the GNU Name
1451 System (GNS) certificate authority. This is done by running:
1452
1453 @example
1454 $ gnunet-gns-proxy-setup-ca
1455 @end example
1456
1457 @noindent
This command sets up the GNS
Certificate Authority with the user's browser. Now, to activate GNS in the
normal DNS resolution process, you need to edit your
@file{/etc/nsswitch.conf} where you should find a line like this:
1462
1463 @example
1464 hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
1465 @end example
1466
1467 @noindent
1468 The exact details may differ a bit, which is fine. Add the text
1469 @emph{"gns [NOTFOUND=return]"} after @emph{"files"}.
1470 Keep in mind that we included a backslash ("\") here just for
1471 markup reasons. You should write the text below on @b{one line}
1472 and @b{without} the "\":
1473
1474 @example
1475 hosts: files gns [NOTFOUND=return] mdns4_minimal \
1476 [NOTFOUND=return] dns mdns4
1477 @end example
1478
1479 @c FIXME: Document new behavior.
You might want to make sure that @file{/lib/libnss_gns.so.2} exists on
your system; it should have been created during the installation.
1482
1483
1484 @c **********************************************************************
1485 @cindex TESTING library
1486 @node TESTING library
1487 @section TESTING library
1488
The TESTING library is used for writing testcases which involve starting a
single or multiple peers. While peers can also be started by testcases
using the ARM subsystem, using the TESTING library provides an elegant way to
1492 do this. The configurations of the peers are auto-generated from a given
1493 template to have non-conflicting port numbers ensuring that peers'
1494 services do not run into bind errors. This is achieved by testing ports'
1495 availability by binding a listening socket to them before allocating them
1496 to services in the generated configurations.
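
The port check described above can be sketched in plain POSIX C. This is
a minimal illustration and not the code used by TESTING: the function
names @code{port_is_free()} and @code{find_free_port()} are hypothetical,
and only TCP on the loopback address is probed.

@example
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return 1 if a TCP listening socket can be bound to PORT on the
   loopback interface, 0 otherwise. */
int
port_is_free (uint16_t port)
@{
  struct sockaddr_in addr;
  int ok;
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return 0;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  ok = (0 == bind (fd, (struct sockaddr *) &addr, sizeof (addr)))
       && (0 == listen (fd, 1));
  close (fd);   /* release the port again; it was only probed */
  return ok;
@}

/* Return the first free port in [low, high], or 0 if none is free. */
uint16_t
find_free_port (uint16_t low, uint16_t high)
@{
  uint32_t port;   /* wider than uint16_t so the loop ends at 65535 */

  for (port = low; port <= high; port++)
    if (port_is_free ((uint16_t) port))
      return (uint16_t) port;
  return 0;
@}
@end example

A probe of this kind is inherently racy, since another process may take
the port between the check and the moment the service binds it. A test
harness therefore also needs to remember which ports it has already
handed out.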
1497
Another advantage of using TESTING is that it shortens the testcase
startup time: the hostkeys for peers are copied from a pre-computed set
of hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
1503
TESTING also allows certain services to be shared among peers. This
feature is invaluable when testing with multiple peers, as it reduces the
number of services run per peer and hence the total number of processes
run per testcase.
1508
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases, such as connecting peers in a topology, are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the local host, whereas
testcases using TESTBED can create peers across multiple hosts.
1515
1516 @menu
1517 * API::
1518 * Finer control over peer stop::
1519 * Helper functions::
1520 * Testing with multiple processes::
1521 @end menu
1522
1523 @cindex TESTING API
1524 @node API
1525 @subsection API
1526
TESTING abstracts a group of peers as a TESTING system. All peers in a
system share a common hostname, and no two services of these peers use the
same port or UNIX domain socket path.
1530
A TESTING system can be created with the function
1532 @code{GNUNET_TESTING_system_create()} which returns a handle to the
1533 system. This function takes a directory path which is used for generating
1534 the configurations of peers, an IP address from which connections to the
1535 peers' services should be allowed, the hostname to be used in peers'
1536 configuration, and an array of shared service specifications of type
1537 @code{struct GNUNET_TESTING_SharedService}.
1538
1539 The shared service specification must specify the name of the service to
1540 share, the configuration pertaining to that shared service and the
1541 maximum number of peers that are allowed to share a single instance of
1542 the shared service.
1543
A TESTING system created with @code{GNUNET_TESTING_system_create()}
chooses ports from the default range @code{12000} - @code{56000} while
auto-generating configurations for peers.
This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters: the start and end of the port range to use.
1551
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag to remove the files created in the directory used
to generate configurations.
1556
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated, and the index of the pre-computed hostkey to copy for
the peer. When successful, this function returns a handle to the peer
which can be used to start and stop it and to obtain the identity of the
peer. If unsuccessful, a NULL pointer is returned with an error message.
This function ensures that the generated configuration has
non-conflicting ports and paths.
1566
Peers can be started and stopped by calling the functions
@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for use by new
peers.
1573
1574 @c ***********************************************************************
1575 @node Finer control over peer stop
1576 @subsection Finer control over peer stop
1577
Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when shutting
down multiple peers, as it sends the termination signal to the given peer
process and then waits for it to terminate. It is faster in this case to
send the termination signals to all peers first and then wait on them.
This is accomplished with the function @code{GNUNET_TESTING_peer_kill()},
which sends a termination signal to the peer, and the function
@code{GNUNET_TESTING_peer_wait()}, which waits on the peer.
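
The difference between stopping peers one by one and signalling all of
them first can be illustrated without GNUnet. The sketch below uses plain
POSIX processes as stand-ins for peers; @code{spawn_idle_child()} and
@code{stop_all()} are hypothetical helpers. With TESTING, the two loops
would call @code{GNUNET_TESTING_peer_kill()} and
@code{GNUNET_TESTING_peer_wait()} instead of @code{kill()} and
@code{waitpid()}.

@example
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that idles until it is signalled.  Returns the PID of
   the child, or -1 if fork() failed. */
pid_t
spawn_idle_child (void)
@{
  pid_t pid = fork ();

  if (0 == pid)
  @{
    for (;;)
      pause ();   /* the default action of SIGTERM ends the child */
  @}
  return pid;
@}

/* Send SIGTERM to all N processes first, then reap them.  Returns the
   number of processes that were reaped successfully. */
size_t
stop_all (const pid_t *pids, size_t n)
@{
  size_t i;
  size_t reaped = 0;

  for (i = 0; i < n; i++)       /* phase 1: signal everyone */
    if (pids[i] > 0)
      (void) kill (pids[i], SIGTERM);
  for (i = 0; i < n; i++)       /* phase 2: wait; shutdowns overlap */
    if ((pids[i] > 0) && (pids[i] == waitpid (pids[i], NULL, 0)))
      reaped++;
  return reaped;
@}
@end example

Because every process has already received its signal before the first
@code{waitpid()} call, the total shutdown time is roughly that of the
slowest process rather than the sum over all processes.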
1587
1588 Further finer control can be achieved by choosing to stop a peer
1589 asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
1590 This function takes a callback parameter and a closure for it in addition
1591 to the handle to the peer to stop. The callback function is called with
1592 the given closure when the peer is stopped. Using this function
1593 eliminates blocking while waiting for the peer to terminate.
1594
An asynchronous peer stop can be canceled by calling the function
@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the
callback that would be called when the peer is stopped.
1600
1601 @c ***********************************************************************
1602 @node Helper functions
1603 @subsection Helper functions
1604
1605 Most of the testcases can benefit from an abstraction which configures a
1606 peer and starts it. This is provided by the function
1607 @code{GNUNET_TESTING_peer_run()}. This function takes the testing
1608 directory pathname, a configuration template, a callback and its closure.
1609 This function creates a peer in the given testing directory by using the
1610 configuration template, starts the peer and calls the given callback with
1611 the given closure.
1612
1613 The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
the peer which starts the rest of the configured services. A similar
function @code{GNUNET_TESTING_service_run()} can be used to just start a
1616 single service of a peer. In this case, the peer's ARM service is not
1617 started; instead, only the given service is run.
1618
1619 @c ***********************************************************************
1620 @node Testing with multiple processes
1621 @subsection Testing with multiple processes
1622
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1632
1633 The following code illustrates spawning and killing an ARM process from a
1634 testcase:
1635
1636 @example
static void
run (void *cls,
     char *const *args,
     const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  /* Spawn ARM, passing it the configuration file given to the test. */
  arm_pid = GNUNET_OS_start_process (NULL,
                                     NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c",
                                     cfgfile,
                                     NULL);
  /* do real test work here */
  /* Send TERM so that ARM can cleanly stop its child processes. */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}
1655
1656 GNUNET_PROGRAM_run (argc, argv,
1657                     "NAME-OF-TEST",
1658                     "nohelp",
1659                     options,
1660                     &run,
1661                     cls);
1662 @end example
1663
1664
1665 An alternative way that works well to test plugins is to implement a
1666 mock-version of the environment that the plugin expects and then to
1667 simply load the plugin directly.
1668
1669 @c ***********************************************************************
1670 @node Performance regression analysis with Gauger
1671 @section Performance regression analysis with Gauger
1672
1673 To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
1674 simple logging tool that allows remote hosts to send performance data to
1675 a central server, where this data can be analyzed and visualized. Gauger
1676 shows graphs of the repository revisions and the performance data recorded
1677 for each revision, so sudden performance peaks or drops can be identified
1678 and linked to a specific revision number.
1679
1680 In the case of GNUnet, the buildbots log the performance data obtained
1681 during the tests after each build. The data can be accessed on GNUnet's
1682 Gauger page.
1683
The menu on the left allows you to select either the results of just one
build bot (under "Hosts") or the data from all hosts for a given test
result (under "Metrics"). When the absolute values of the results differ
widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea of the performance
evolution across all hosts.
1690
Using Gauger in GNUnet to track the performance of a module over time is
easy. First, the testcase must generate a consistent metric that is worth
logging. Highly volatile metrics, or metrics that depend on randomness,
are probably not good candidates for meaningful regression detection.
1696
To start logging any value, just include @code{gauger.h} in your testcase
code. Then, use the macro @code{GAUGER()} to make the Buildbots log
whatever value is of interest to you to @code{gnunet.org}'s Gauger
server. No setup is necessary, as most Buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
need to contact a member of the GNUnet development team (a file will need
to be removed manually from the respective directory).
1704
1705 The code in the test should look like this:
1706
1707 @example
1708 [other includes]
1709 #include <gauger.h>
1710
int
main (int argc, char *argv[])
@{
  [run test, generate data]
  GAUGER ("YOUR_MODULE",
          "METRIC_NAME",
          (float) value,
          "UNIT");
@}
1718 @end example
1719
1720 Where:
1721
1722 @table @asis
1723
@item @strong{YOUR_MODULE}
is a category on the Gauger page and should be the name of the module or
subsystem, like "Core" or "DHT".
@item @strong{METRIC_NAME}
is the name of the metric being collected and should be concise and
descriptive, like "PUT operations in sqlite-datastore".
@item @strong{value}
is the value of the metric that is logged for this run.
@item @strong{UNIT}
is the unit in which the value is measured, for instance "kb/s" or
"kb of RAM/node".
1733 @end table
1734
1735 If you wish to use Gauger for your own project, you can grab a copy of the
1736 latest stable release or check out Gauger's Subversion repository.
1737
1738 @cindex TESTBED Subsystem
1739 @node TESTBED Subsystem
1740 @section TESTBED Subsystem
1741
1742 The TESTBED subsystem facilitates testing and measuring of multi-peer
1743 deployments on a single host or over multiple hosts.
1744
1745 The architecture of the testbed module is divided into the following:
1746 @itemize @bullet
1747
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting and stopping
peers, etc.

@item Testbed service (controller): A service which is started through the
Testbed API. This service handles operations to create, destroy, start
and stop peers, connect them, and modify their configurations.
1755
1756 @item Testbed helper: When a controller has to be started on a host, the
1757 testbed API starts the testbed helper on that host which in turn starts
1758 the controller. The testbed helper receives a configuration for the
1759 controller through its stdin and changes it to ensure the controller
1760 doesn't run into any port conflict on that host.
1761 @end itemize
1762
1763
The testbed service (controller) is different from the other GNUnet
services in that it is not started by ARM and is not supposed to be run
as a daemon. It is started by the testbed API through a testbed helper.
In a typical scenario involving multiple hosts, a controller is started
on each host. Controllers take up the actual task of creating peers, and
of starting and stopping them, on the hosts they run on.
1770
When running deployments on the local host only, the testbed API starts
the testbed helper directly as a child process. When running deployments
on remote hosts, the testbed API starts testbed helpers on each remote
host through a remote shell. By default, the testbed API uses SSH as the
remote shell. This can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1779
1780 @example
1781 export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
1782 -o NoHostAuthenticationForLocalhost=yes %h"
1783 @end example
1784
Substitutions are allowed in the command string above through placemarks
which begin with a `%'. At present the following substitutions are
supported:
1788
1789 @itemize @bullet
1790 @item %h: hostname
1791 @item %u: username
1792 @item %p: port
1793 @end itemize
1794
1795 Note that the substitution placemark is replaced only when the
1796 corresponding field is available and only once. Specifying
1797
1798 @example
1799 %u@@%h
1800 @end example
1801
does not work either. If you want to use username substitution for
@command{SSH}, pass the option @code{-l} before the username placemark.
1805
1806 For example:
1807 @example
1808 ssh -l %u -p %p %h
1809 @end example
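
The substitution rules can be modelled roughly as follows. This sketch is
a simplification built on an assumption rather than the testbed's
implementation: it replaces only arguments that consist entirely of a
placemark, which is consistent with the observation that
@code{%u@@%h} is not expanded. The function name
@code{substitute_arg()} is hypothetical.

@example
#include <stddef.h>
#include <string.h>

/* Return the replacement for ARG if it is exactly "%h", "%u" or "%p"
   and the corresponding field is available (non-NULL).  Otherwise
   return ARG unchanged. */
const char *
substitute_arg (const char *arg, const char *hostname,
                const char *username, const char *port)
@{
  const char *val = NULL;

  if (0 == strcmp (arg, "%h"))
    val = hostname;
  else if (0 == strcmp (arg, "%u"))
    val = username;
  else if (0 == strcmp (arg, "%p"))
    val = port;
  return (NULL != val) ? val : arg;
@}
@end example

Applied to each argument of @code{ssh -l %u -p %p %h} in turn, this
yields a command line in which the three placemarks are replaced by the
username, port and hostname of the target host.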
1810
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags that make the remote shell produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:
1818
1819 @example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
1821 @end example
1822
1823 Password-less logins should be ensured by using SSH keys.
1824
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like @file{.bashrc} and @file{.profile} may not be
executed. If this is the case, the testbed API can be forced to execute an
interactive shell by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
1830
1831 An example could be:
1832
1833 @example
1834 export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1835 @end example
1836
1837 The testbed API will then execute the remote shell program as:
1838
1839 @example
1840 $GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1841 gnunet-helper-testbed
1842 @end example
1843
1844 On some systems, problems may arise while starting testbed helpers if
1845 GNUnet is installed into a custom location since the helper may not be
1846 found in the standard path. This can be addressed by setting the variable
1847 `@code{HELPER_BINARY_PATH}' to the path of the testbed helper.
1848 Testbed API will then use this path to start helper binaries both
1849 locally and remotely.
1850
The testbed API can be accessed by including the
@file{gnunet_testbed_service.h} file and linking with
@code{-lgnunettestbed}.
1854
1855 @c ***********************************************************************
1856 @menu
1857 * Supported Topologies::
1858 * Hosts file format::
1859 * Topology file format::
1860 * Testbed Barriers::
1861 * Automatic large-scale deployment in the PlanetLab testbed::
1862 * TESTBED Caveats::
1863 @end menu
1864
1865 @node Supported Topologies
1866 @subsection Supported Topologies
1867
1868 While testing multi-peer deployments, it is often needed that the peers
1869 are connected in some topology. This requirement is addressed by the
1870 function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
1871 two peers in the testbed.
1872
1873 The API also provides a helper function
1874 @code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1875 of peers in any of the following supported topologies:
1876
1877 @itemize @bullet
1878
1879 @item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1880 each other
1881
1882 @item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1883 line
1884
1885 @item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1886 ring topology
1887
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2 dimensional torus topology. The number of peers need not be a
perfect square; in that case the resulting torus may not have uniform
poloidal and toroidal lengths
1892
@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
to form a random graph. The number of links to be present should be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D Torus with some random links among them. The number of random
links is to be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links is to be given
1903
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well connected peers.
1907 (See Emergence of Scaling in Random Networks. Science 286,
1908 509-512, 1999
1909 (@uref{https://git.gnunet.org/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf}))
1910
1911 @item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1912 is loaded from a file. The path to the file has to be given.
1913 @xref{Topology file format}, for the format of this file.
1914
1915 @item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1916 @end itemize
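
To make the simpler topologies concrete, the following sketch enumerates
the links of a line, a ring and a clique over peers numbered from 0. It is
an illustration only and does not reproduce the testbed's own generator;
all type and function names in it are hypothetical.

@example
#include <stddef.h>

enum SketchTopology @{ SKETCH_LINE, SKETCH_RING, SKETCH_CLIQUE @};

struct SketchLink @{ unsigned int a; unsigned int b; @};

/* Write the links for N peers into OUT (capacity MAX).  Returns the
   number of links, or -1 if OUT is too small. */
int
generate_links (enum SketchTopology t, unsigned int n,
                struct SketchLink *out, size_t max)
@{
  size_t cnt = 0;
  unsigned int i;
  unsigned int j;

  for (i = 0; i < n; i++)
  @{
    /* clique: every later peer; line and ring: only the next peer */
    unsigned int last = (SKETCH_CLIQUE == t) ? n - 1 : i + 1;

    for (j = i + 1; (j <= last) && (j < n); j++)
    @{
      if (cnt == max)
        return -1;
      out[cnt].a = i;
      out[cnt].b = j;
      cnt++;
    @}
  @}
  if ((SKETCH_RING == t) && (n > 2))
  @{
    /* close the line into a ring */
    if (cnt == max)
      return -1;
    out[cnt].a = n - 1;
    out[cnt].b = 0;
    cnt++;
  @}
  return (int) cnt;
@}
@end example

For four peers this yields 3 links for a line, 4 for a ring and 6 for a
clique, which shows how quickly the number of overlay connections grows
with the chosen topology.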
1917
1918
1919 The above supported topologies can be specified respectively by setting
1920 the variable @code{OVERLAY_TOPOLOGY} to the following values in the
1921 configuration passed to Testbed API functions
1922 @code{GNUNET_TESTBED_test_run()} and
1923 @code{GNUNET_TESTBED_run()}:
1924
1925 @itemize @bullet
1926 @item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1929 @item @code{2D_TORUS}
1930 @item @code{RANDOM}
1931 @item @code{SMALL_WORLD}
1932 @item @code{SMALL_WORLD_RING}
1933 @item @code{SCALE_FREE}
1934 @item @code{FROM_FILE}
1935 @item @code{NONE}
1936 @end itemize
1937
1938
1939 Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1940 require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1941 random links to be generated in the configuration. The option will be
1942 ignored for the rest of the topologies.
1943
1944 Topology @code{SCALE_FREE} requires the options
1945 @code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
1946 which can connect to a peer and @code{SCALE_FREE_TOPOLOGY_M} to be set to
1947 how many peers a peer should be at least connected to.
1948
1949 Similarly, the topology @code{FROM_FILE} requires the option
1950 @code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1951 the topology information. This option is ignored for the rest of the
1952 topologies. @xref{Topology file format}, for the format of this file.
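
As an illustration, a configuration requesting a small-world ring might
contain a fragment like the one below. The numeric value is arbitrary, and
placing the options in a @code{[testbed]} section is an assumption that
should be checked against the configuration template you start from.

@example
[testbed]
OVERLAY_TOPOLOGY = SMALL_WORLD_RING
OVERLAY_RANDOM_LINKS = 20
@end example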
1953
1954 @c ***********************************************************************
1955 @node Hosts file format
1956 @subsection Hosts file format
1957
1958 The testbed API offers the function
1959 @code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
1960 details about the hosts which testbed can use for deploying peers.
1961 This function is useful to keep the data about hosts
1962 separate instead of hard coding them in code.
1963
1964 Another helper function from testbed API, @code{GNUNET_TESTBED_run()}
1965 also takes a hosts file name as its parameter. It uses the above
1966 function to populate the hosts data structures and start controllers to
1967 deploy peers.
1968
1969 These functions require the hosts file to be of the following format:
1970 @itemize @bullet
1971 @item Each line is interpreted to have details about a host
1972 @item Host details should include the username to use for logging into the
1973 host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
1975 @item These details should be given in the following format:
1976 @example
1977 <username>@@<hostname>:<port>
1978 @end example
1979 @end itemize
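
A line in this format can be parsed with a few lines of C. The following
is a minimal sketch and not the parser used by the testbed API; the
structure @code{struct HostEntry}, its field sizes and the function
@code{parse_host_line()} are hypothetical, and hostnames containing a
colon (such as IPv6 literals) are not handled.

@example
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

struct HostEntry
@{
  char username[64];
  char hostname[256];
  uint16_t port;
@};

/* Parse one line of the form <username>@@<hostname>:<port>.
   Returns 0 on success and -1 if the line is malformed. */
int
parse_host_line (const char *line, struct HostEntry *out)
@{
  unsigned int port = 0;
  int used = 0;
  const char *p;

  if (3 != sscanf (line, "%63[^@@ \t\n]@@%255[^: \t\n]:%5u%n",
                   out->username, out->hostname, &port, &used))
    return -1;
  if ((0 == port) || (port > 65535))
    return -1;
  for (p = line + used; '\0' != *p; p++)   /* only whitespace may follow */
    if (! isspace ((unsigned char) *p))
      return -1;
  out->port = (uint16_t) port;
  return 0;
@}
@end example

Rejecting lines with a missing field mirrors the requirement that all
three values be given.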
1980
Note that using canonical hostnames may cause problems while resolving
the IP addresses. Hence it is advised to provide the hosts' numerical IP
addresses as hostnames whenever possible.
1984
1985 @c ***********************************************************************
1986 @node Topology file format
1987 @subsection Topology file format
1988
1989 A topology file describes how peers are to be connected. It should adhere
1990 to the following format for testbed to parse it correctly.
1991
1992 Each line should begin with the target peer id. This should be followed by
1993 a colon(`:') and origin peer ids separated by `|'. All spaces except for
1994 newline characters are ignored. The API will then try to connect each
1995 origin peer to the target peer.
1996
For example, the following file will result in 5 overlay connections:
[2->1], [3->1], [4->3], [0->3], [2->0]:

@example
1:2|3
3:4| 0
0: 2
@end example
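
The format described above can be read with a small hand-written parser.
The sketch below is not the testbed's parser; @code{struct OverlayLink},
@code{skip_blanks()} and @code{parse_topology()} are hypothetical names.
As a simplification, blanks are ignored between tokens but not inside
numbers.

@example
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

struct OverlayLink @{ unsigned long origin; unsigned long target; @};

/* Skip spaces, tabs and carriage returns, but not newlines. */
static const char *
skip_blanks (const char *p)
@{
  while ((' ' == *p) || ('\t' == *p) || ('\r' == *p))
    p++;
  return p;
@}

/* Parse TEXT into LINKS (capacity MAX_LINKS).  Returns the number of
   links found, or -1 on a syntax error or if LINKS is too small. */
int
parse_topology (const char *text, struct OverlayLink *links,
                size_t max_links)
@{
  size_t n = 0;
  const char *p = text;
  char *end;

  for (;;)
  @{
    unsigned long target;

    while (isspace ((unsigned char) *p))   /* also skips empty lines */
      p++;
    if ('\0' == *p)
      return (int) n;
    if (! isdigit ((unsigned char) *p))
      return -1;
    target = strtoul (p, &end, 10);
    p = skip_blanks (end);
    if (':' != *p)
      return -1;
    do
    @{
      p = skip_blanks (p + 1);             /* step over ':' or '|' */
      if ((! isdigit ((unsigned char) *p)) || (n == max_links))
        return -1;
      links[n].origin = strtoul (p, &end, 10);
      links[n].target = target;
      n++;
      p = skip_blanks (end);
    @}
    while ('|' == *p);
    if (('\n' != *p) && ('\0' != *p))
      return -1;
  @}
@}
@end example

Given the three-line file @code{1:2|3}, @code{3:4| 0}, @code{0: 2}, the
function reports 5 links, the first being from origin 2 to target 1.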
2000
2001 @c ***********************************************************************
2002 @node Testbed Barriers
2003 @subsection Testbed Barriers
2004
The testbed subsystem's barriers API facilitates coordination among the
peers run by the testbed and the experiment driver. The concept is
similar to the barrier synchronisation mechanism found in parallel
programming or multi-threading paradigms: a peer waits at a barrier upon
reaching it until the barrier is reached by a predefined number of peers.
This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for
the barrier to be crossed. Similarly, a barrier is said to be reached if
the required quorum of peers has reached the barrier. A barrier which is
reached is deemed crossed after all the peers waiting on it are notified.
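
The distinction between a barrier being reached and being crossed can be
shown with a toy, single-process model. The code below is not the testbed
barrier implementation and involves no networking; every name in it is
hypothetical, and the quorum is simply a count of waiting peers.

@example
#include <stddef.h>

#define MAX_WAITERS 16

typedef void (*WaiterCallback) (void *cls);

struct ToyBarrier
@{
  unsigned int quorum;        /* peers needed to reach the barrier */
  unsigned int num_waiting;   /* peers currently waiting */
  int crossed;                /* set once all waiters are notified */
  WaiterCallback cbs[MAX_WAITERS];
  void *clss[MAX_WAITERS];
@};

void
toy_barrier_init (struct ToyBarrier *b, unsigned int quorum)
@{
  b->quorum = quorum;
  b->num_waiting = 0;
  b->crossed = 0;
@}

/* Register a waiting peer.  Returns -1 if the barrier cannot accept
   the peer and 0 otherwise.  When the quorum is met, every waiter is
   notified and the barrier is marked as crossed. */
int
toy_barrier_wait (struct ToyBarrier *b, WaiterCallback cb, void *cls)
@{
  unsigned int i;

  if (b->crossed || (b->num_waiting >= MAX_WAITERS))
    return -1;
  b->cbs[b->num_waiting] = cb;
  b->clss[b->num_waiting] = cls;
  b->num_waiting++;
  if (b->num_waiting < b->quorum)
    return 0;                   /* barrier not reached yet */
  for (i = 0; i < b->num_waiting; i++)
    b->cbs[i] (b->clss[i]);     /* barrier reached: notify waiters */
  b->crossed = 1;               /* all notified: barrier crossed */
  return 0;
@}

/* Example callback: count notifications in the unsigned int at CLS. */
void
count_notification (void *cls)
@{
  (*(unsigned int *) cls)++;
@}
@end example

With a quorum of 3, the first two calls to @code{toy_barrier_wait()}
only register the caller; the third call notifies all three waiters and
marks the barrier as crossed.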
2015
2016 The barriers API provides the following functions:
2017 @itemize @bullet
2018 @item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
2019 initialize a barrier in the experiment
2020 @item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
2021 a barrier which has been initialized before
2022 @item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
2023 barrier service that the caller has reached a barrier and is waiting for
2024 it to be crossed
2025 @item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
2026 stop waiting for a barrier to be crossed
2027 @end itemize
2028
2029
2030 Among the above functions, the first two, namely
2031 @code{GNUNET_TESTBED_barrier_init()} and
2032 @code{GNUNET_TESTBED_barrier_cancel()} are used by experiment drivers. All
2033 barriers should be initialised by the experiment driver by calling
2034 @code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
2035 identify the barrier, the quorum required for the barrier to be crossed
2036 and a notification callback for notifying the experiment driver when the
2037 barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.
2040
2041 The remaining two functions @code{GNUNET_TESTBED_barrier_wait()} and
2042 @code{GNUNET_TESTBED_barrier_wait_cancel()} are used in the peer's
2043 processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
2044 barrier service running on the same host the peer is running on and
2045 registers that the caller has reached the barrier and is waiting for the
2046 barrier to be crossed. Note that this function can only be used by peers
2047 which are started by testbed as this function tries to access the local
2048 barrier service which is part of the testbed controller service. Calling
2049 @code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
2050 in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
2051 notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
2052
2053
2054 @c ***********************************************************************
2055 @menu
2056 * Implementation::
2057 @end menu
2058
2059 @node Implementation
2060 @subsubsection Implementation
2061
Since barriers involve coordination between the experiment driver and
peers, the barrier service in the testbed controller is split into two
components. The first component responds to the messages generated by the
barrier API used by the experiment driver (functions
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
messages generated by the barrier API used by peers (functions
@code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}).
2071
Calling @code{GNUNET_TESTBED_barrier_init()} sends a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
this way barrier initialisation is propagated through the controller
hierarchy. While propagating initialisation, any errors at a
subcontroller, such as a timeout during further propagation, are reported
up the hierarchy back to the experiment driver.
2080
Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.
2085
The second component is implemented as a separate service in the binary
@file{gnunet-service-testbed}, which already contains the testbed
controller service. Although this deviates from the GNUnet process
architecture of having one service per binary, it is needed in this case
as this component needs access to barrier data created by the first
component. This component responds to
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from local peers
when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon receiving a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the service
checks whether the requested barrier has been initialised. If it has not
been initialised, an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.
2101
When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
parent informing it that the barrier is crossed. If the controller has
started further subcontrollers, it delays this message until it receives
a similar notification from each of those subcontrollers. Finally, the
barriers API at the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the
barrier is reached at all the controllers.
2111
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller through the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between the downward
and the upward propagation of the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message: the upward
propagation ensures that the barrier is reached at all the controllers,
whereas the downward propagation signals to the waiting peers that the
barrier is crossed.
2124
2125 @cindex PlanetLab testbed
2126 @node Automatic large-scale deployment in the PlanetLab testbed
2127 @subsection Automatic large-scale deployment in the PlanetLab testbed
2128
2129 PlanetLab is a testbed for computer networking and distributed systems
2130 research. It was established in 2002 and as of June 2010 was composed of
2131 1090 nodes at 507 sites worldwide.
2132
To automate deployment, we created a set of tools that simplify
large-scale deployment of GNUnet. We provide a set of scripts which you
can use to deploy GNUnet on a set of nodes and manage your installation.

Please also check @uref{https://old.gnunet.org/installation-fedora8-svn} and
@uref{https://old.gnunet.org/installation-fedora12-svn} to find detailed
instructions on how to install GNUnet on a PlanetLab node.
2140
2141
2142 @c ***********************************************************************
2143 @menu
2144 * PlanetLab Automation for Fedora8 nodes::
2145 * Install buildslave on PlanetLab nodes running fedora core 8::
2146 * Setup a new PlanetLab testbed using GPLMT::
2147 * Why do i get an ssh error when using the regex profiler?::
2148 @end menu
2149
2150 @node PlanetLab Automation for Fedora8 nodes
2151 @subsubsection PlanetLab Automation for Fedora8 nodes
2152
2153 @c ***********************************************************************
2154 @node Install buildslave on PlanetLab nodes running fedora core 8
2155 @subsubsection Install buildslave on PlanetLab nodes running fedora core 8
2156 @c ** Actually this is a subsubsubsection, but must be fixed differently
2157 @c ** as subsubsection is the lowest.
2158
Since most of the PlanetLab nodes are running the very old Fedora core 8
image, installing the buildslave software is quite painful. The steps
below are what worked best for our PlanetLab testbed.
2163
2164 @c This is a very terrible way to suggest installing software.
2165 @c FIXME: Is there an official, safer way instead of blind-piping a
2166 @c script?
2167 @c FIXME: Use newer pypi URLs below.
2168 Install Distribute for Python:
2169
2170 @example
2171 curl http://python-distribute.org/distribute_setup.py | sudo python
2172 @end example
2173
Install zope.interface version 3.8.0 or older (4.0 and 4.0.1 will not
work):
2176
2177 @example
2178 export PYPI=@value{PYPI-URL}
2179 wget $PYPI/z/zope.interface/zope.interface-3.8.0.tar.gz
2180 tar xzvf zope.interface-3.8.0.tar.gz
2181 cd zope.interface-3.8.0
2182 sudo python setup.py install
2183 @end example
2184
2185 Install the buildslave software (0.8.6 was the latest version):
2186
2187 @example
2188 export GCODE="http://buildbot.googlecode.com/files"
2189 wget $GCODE/buildbot-slave-0.8.6p1.tar.gz
2190 tar xvfz buildbot-slave-0.8.6p1.tar.gz
cd buildbot-slave-0.8.6p1
2192 sudo python setup.py install
2193 @end example
2194
2195 The setup will download the matching twisted package and install it.
2196 It will also try to install the latest version of zope.interface which
2197 will fail to install. Buildslave will work anyway since version 3.8.0
2198 was installed before!
2199
2200 @c ***********************************************************************
2201 @node Setup a new PlanetLab testbed using GPLMT
2202 @subsubsection Setup a new PlanetLab testbed using GPLMT
2203
2204 @itemize @bullet
2205 @item Get a new slice and assign nodes
2206 Ask your PlanetLab PI to give you a new slice and assign the nodes you
2207 need
2208 @item Install a buildmaster
2209 You can stick to the buildbot documentation:@
2210 @uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
2211 @item Install the buildslave software on all nodes
2212 To install the buildslave on all nodes assigned to your slice you can use
2213 the tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:
2214
2215 @example
2216 ./gplmt.py -c contrib/tumple_gnunet.conf -t \
2217 contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
2218 @end example
2219
2220 @item Create the buildmaster configuration and the slave setup commands
2221
The master and the slaves need to have credentials, and the master has to
have all nodes configured. This can be done with the
@file{create_buildbot_configuration.py} script in the @file{scripts}
directory.

This script takes a list of nodes, either retrieved directly from
PlanetLab or read from a file, and a configuration template and creates:

@itemize @bullet
@item a tasklist which can be executed with gplmt to set up the slaves
@item a @file{master.cfg} file containing the PlanetLab nodes
@end itemize

A configuration template is included in the @file{contrib} directory. Most
importantly, the script replaces the following tags in the template:
2237
@itemize @bullet
@item @code{%GPLMT_BUILDER_DEFINITION}
@item @code{GPLMT_BUILDER_SUMMARY}
@item @code{GPLMT_SLAVES}
@item @code{%GPLMT_SCHEDULER_BUILDERS}
@end itemize
2240
2241 Create configuration for all nodes assigned to a slice:
2242
2243 @example
2244 ./create_buildbot_configuration.py -u <planetlab username> \
2245 -p <planetlab password> -s <slice> -m <buildmaster+port> \
2246 -t <template>
2247 @end example
2248
2249 Create configuration for some nodes in a file:
2250
2251 @example
./create_buildbot_configuration.py -f <node_file> \
2253 -m <buildmaster+port> -t <template>
2254 @end example
2255
2256 @item Copy the @file{master.cfg} to the buildmaster and start it
2257 Use @code{buildbot start <basedir>} to start the server
@item Set up the buildslaves
2259 @end itemize
2260
2261 @c ***********************************************************************
2262 @node Why do i get an ssh error when using the regex profiler?
2263 @subsubsection Why do i get an ssh error when using the regex profiler?
2264
Why do I get an ssh error "Permission denied (publickey,password)." when
using the regex profiler although passwordless ssh to localhost works
using publickey and ssh-agent?

You have to generate a public/private-key pair with no password:@
@code{ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost}@
and then add the following to your @file{~/.ssh/config} file:
2272
2273 @code{Host 127.0.0.1@ IdentityFile ~/.ssh/id_localhost}
2274
Now make sure your hosts file looks like this:
2276
2277 @example
[USERNAME]@@127.0.0.1:22
2279 [USERNAME]@@127.0.0.1:22
2280 @end example
2281
2282 You can test your setup by running @code{ssh 127.0.0.1} in a
2283 terminal and then in the opened session run it again.
2284 If you were not asked for a password on either login,
2285 then you should be good to go.
2286
2287 @cindex TESTBED Caveats
2288 @node TESTBED Caveats
2289 @subsection TESTBED Caveats
2290
2291 This section documents a few caveats when using the GNUnet testbed
2292 subsystem.
2293
2294 @c ***********************************************************************
2295 @menu
2296 * CORE must be started::
2297 * ATS must want the connections::
2298 @end menu
2299
2300 @node CORE must be started
2301 @subsubsection CORE must be started
2302
An uncomplicated issue is bug #3993
(@uref{https://bugs.gnunet.org/view.php?id=3993, https://bugs.gnunet.org/view.php?id=3993}):
Your configuration MUST somehow ensure that for each peer the
@code{CORE} service is started when the peer is set up. Otherwise
@code{TESTBED} may fail to connect peers when the topology is initialized,
as @code{TESTBED} will start some @code{CORE} services but not
necessarily all of them, even though it relies on all of them running.
The easiest way is to set
2311
2312 @example
2313 [core]
2314 IMMEDIATE_START = YES
2315 @end example
2316
2317 @noindent
2318 in the configuration file.
2319 Alternatively, having any service that directly or indirectly depends on
2320 @code{CORE} being started with @code{IMMEDIATE_START} will also do.
2321 This issue largely arises if users try to over-optimize by not
2322 starting any services with @code{IMMEDIATE_START}.
2323
2324 @c ***********************************************************************
2325 @node ATS must want the connections
2326 @subsubsection ATS must want the connections
2327
2328 When TESTBED sets up connections, it only offers the respective HELLO
2329 information to the TRANSPORT service. It is then up to the ATS service to
2330 @strong{decide} to use the connection. The ATS service will typically
2331 eagerly establish any connection if the number of total connections is
2332 low (relative to bandwidth). Details may further depend on the
2333 specific ATS backend that was configured. If ATS decides to NOT establish
2334 a connection (even though TESTBED provided the required information), then
2335 that connection will count as failed for TESTBED. Note that you can
2336 configure TESTBED to tolerate a certain number of connection failures
2337 (see '-e' option of gnunet-testbed-profiler). This issue largely arises
2338 for dense overlay topologies, especially if you try to create cliques
2339 with more than 20 peers.
2340
2341 @cindex libgnunetutil
2342 @node libgnunetutil
2343 @section libgnunetutil
2344
libgnunetutil is the fundamental library that all GNUnet code builds upon.
Ideally, this library should contain most of the platform-dependent code
(except for user interfaces and really special needs that only few
applications have). It is also supposed to offer basic services that most
if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util_lib.h} header. The functions provided by libgnunetutil
fall roughly into the following categories (in roughly the order of
importance for new developers):
2354
2355 @itemize @bullet
@item logging (common_logging.c)
@item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
@item internationalization (common_gettext.c)
@item string manipulation (string.c)
@item file access (disk.c)
@item buffered disk IO (bio.c)
@item time manipulation (time.c)
@item configuration parsing (configuration.c)
@item command-line handling (getopt*.c)
@item cryptography (crypto_*.c)
@item data structures (container_*.c)
@item CPS-style scheduling (scheduler.c)
@item program initialization (program.c)
@item networking (network.c, client.c, server*.c, service.c)
@item message queuing (mq.c)
@item bandwidth calculations (bandwidth.c)
@item other OS-related functions (os*.c, plugin.c, signal.c)
@item pseudonym management (pseudonym.c)
2375 @end itemize
2376
2377 It should be noted that only developers that fully understand this entire
2378 API will be able to write good GNUnet code.
2379
2380 Ideally, porting GNUnet should only require porting the gnunetutil
2381 library. More testcases for the gnunetutil APIs are therefore a great
2382 way to make porting of GNUnet easier.
2383
2384 @menu
2385 * Logging::
2386 * Interprocess communication API (IPC)::
2387 * Cryptography API::
2388 * Message Queue API::
2389 * Service API::
2390 * Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
2391 * CONTAINER_MDLL API::
2392 @end menu
2393
2394 @cindex Logging
2395 @cindex log levels
2396 @node Logging
2397 @subsection Logging
2398
2399 GNUnet is able to log its activity, mostly for the purposes of debugging
2400 the program at various levels.
2401
2402 @file{gnunet_common.h} defines several @strong{log levels}:
2403 @table @asis
2404
2405 @item ERROR for errors
2406 (really problematic situations, often leading to crashes)
2407 @item WARNING for warnings
2408 (troubling situations that might have negative consequences, although
2409 not fatal)
2410 @item INFO for various information.
2411 Used somewhat rarely, as GNUnet statistics is used to hold and display
2412 most of the information that users might find interesting.
2413 @item DEBUG for debugging.
2414 Does not produce much output on normal builds, but when extra logging is
2415 enabled at compile time, a staggering amount of data is outputted under
2416 this log level.
2417 @end table
2418
2419
2420 Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
2421 are supposed to log nothing under DEBUG level. The
2422 @code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
2424 amounts of log data, which is inconvenient when one tries to hunt down a
2425 specific problem.
2426
2427 To mitigate this problem, GNUnet provides facilities to apply a filter to
2428 reduce the logs:
2429 @table @asis
2430
@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or
"--log=loglevel" option. If used, it makes the process set a global log
level to "loglevel". Thus it is possible to run some processes
with -L DEBUG, for example, and others with -L ERROR to enable specific
settings to diagnose problems with a particular process.
@item Configuration files.
Because GNUnet
service and daemon processes are usually launched by gnunet-arm, it is not
possible to pass different custom command line options directly to every
one of them. The options passed to @code{gnunet-arm} only affect
gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service
or a daemon, and put a value of "-L loglevel" there. This will make the
respective service or daemon set its log level to "loglevel" (as the
value of OPTIONS will be passed as a command-line argument).
2451
2452 To specify the same log level for all services without creating separate
2453 "OPTIONS" entries in the configuration for each one, the user can specify
2454 a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
2455 file. The value of GLOBAL_POSTFIX will be appended to all command lines
2456 used by the ARM service to run other services. It can contain any option
2457 valid for all GNUnet commands, thus in particular the "-L loglevel"
2458 option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
2459 to set log level for it, one has to specify "OPTIONS" key in the [arm]
2460 section.
2461 @item Environment variables.
2462 Setting global per-process log levels with "-L loglevel" does not offer
2463 sufficient log filtering granularity, as one service will call interface
2464 libraries and supporting libraries of other GNUnet services, potentially
2465 producing lots of debug log messages from these libraries. Also, changing
2466 the config file is not always convenient (especially when running the
2467 GNUnet test suite).@ To fix that, and to allow GNUnet to use different
2468 log filtering at runtime without re-compiling the whole source tree, the
2469 log calls were changed to be configurable at run time. To configure them
2470 one has to define environment variables "GNUNET_FORCE_LOGFILE",
2471 "GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
2472 @itemize @bullet
2473
2474 @item "GNUNET_LOG" only affects the logging when no global log level is
2475 configured by any other means (that is, the process does not explicitly
2476 set its own log level, there are no "-L loglevel" options on command line
2477 or in configuration files), and can be used to override the default
2478 WARNING log level.
2479
2480 @item "GNUNET_FORCE_LOG" will completely override any other log
2481 configuration options given.
2482
2483 @item "GNUNET_FORCE_LOGFILE" will completely override the location of the
2484 file to log messages to. It should contain a relative or absolute file
2485 name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing
2486 "--log-file=logfile" or "-l logfile" option (see below). It supports "[]"
2487 format in file names, but not "@{@}" (see below).
2488 @end itemize
2489
2490
2491 Because environment variables are inherited by child processes when they
2492 are launched, starting or re-starting the ARM service with these
2493 variables will propagate them to all other services.
2494
2495 "GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
formatted @strong{logging definition} string, which looks like this:
2497
2498 @c FIXME: Can we close this with [/component] instead?
2499 @example
2500 [component];[file];[function];[from_line[-to_line]];loglevel[/component...]
2501 @end example
2502
That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a
slash to its end (although it is not forbidden either).

All definition fields (component, file, function, lines and loglevel) must
be present, but all except the loglevel can be empty. An empty field means
"match anything". Note that even if fields are empty, the semicolon (';')
separators must be present. The loglevel field must contain one of the
log level names (ERROR, WARNING, INFO or DEBUG).

The lines field may contain one non-negative number, in which case it
matches only that line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including
both start and end line).

GNUnet mostly defaults the component name to the name of the service that
is implemented in a process ('transport', 'core', 'peerinfo', etc), but
logging calls can specify custom component names using
@code{GNUNET_log_from}. File name and function name are provided by the
compiler (the @code{__FILE__} and @code{__FUNCTION__} built-ins).
2519
2520 Component, file and function fields are interpreted as non-extended
2521 regular expressions (GNU libc regex functions are used). Matching is
2522 case-sensitive, "^" and "$" will match the beginning and the end of the
2523 text. If a field is empty, its contents are automatically replaced with
2524 a ".*" regular expression, which matches anything. Matching is done in
2525 the default way, which means that the expression matches as long as it's
2526 contained anywhere in the string. Thus "GNUNET_" will match both
2527 "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
2528 the expression matches at the start and/or at the end of the string.
2529 The semicolon (';') can't be escaped, and GNUnet will not use it in
2530 component names (it can't be used in function names and file names
2531 anyway).
2532
2533 @end table
2534
2535
Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log
definition match the call arguments, then the call log level is compared
to the log level of that definition. If the call log level is less than
or equal to the definition log level, the call is allowed to proceed.
Otherwise the logging call is forbidden, and nothing is logged. If no
definitions matched at all, GNUnet will use the global log level or (if a
global log level is not specified) will default to WARNING (that is, it
will allow the call to proceed if its level is less than or equal to the
global log level or to WARNING).
2546
Definitions are evaluated from left to right, and the first matching
definition is used to allow or deny the logging call. Thus it is advised
to place narrow definitions at the beginning of the logdef string, and
generic definitions at the end.
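
The matching rules above can be summarized in a short, self-contained C
sketch. It is not the code from @file{common_logging.c}: the function
name @code{sketch_log_allowed}, the numeric levels (1 for ERROR up to 4
for DEBUG) and the return value of -1 for "no entry matched" are
assumptions made for illustration. The sketch uses the POSIX
@code{regcomp} and @code{regexec} functions with basic (non-extended)
syntax, evaluates entries from left to right, and stops at the first
matching entry or at the first malformed one.

@example
#define _POSIX_C_SOURCE 200809L
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Map a level name to 1 (ERROR) .. 4 (DEBUG); 0 means unknown. */
static int
sketch_parse_level (const char *name)
@{
  static const char *const names[] = @{ "ERROR", "WARNING", "INFO", "DEBUG" @};

  for (int i = 0; i < 4; i++)
    if (0 == strcmp (name, names[i]))
      return i + 1;
  return 0;
@}

/* An empty pattern matches anything; otherwise the basic regular
   expression may match anywhere in the text. */
static int
sketch_field_matches (const char *pattern, const char *text)
@{
  regex_t re;
  int hit;

  if ('\0' == pattern[0])
    return 1;
  if (0 != regcomp (&re, pattern, REG_NOSUB))
    return 0;
  hit = (0 == regexec (&re, text, 0, NULL, 0));
  regfree (&re);
  return hit;
@}

/* The lines field is empty, "N" or "N-M" (both ends inclusive). */
static int
sketch_line_matches (const char *range, int line)
@{
  int from = 0;
  int to = 0;
  int n;

  if ('\0' == range[0])
    return 1;
  n = sscanf (range, "%d-%d", &from, &to);
  if (1 == n)
    to = from;
  return (n >= 1) && (line >= from) && (line <= to);
@}

/* Returns 1 (allow), 0 (deny) or -1 (no entry matched, so the caller
   falls back to the global log level). */
int
sketch_log_allowed (const char *logdef, const char *component,
                    const char *file, const char *function,
                    int line, int call_level)
@{
  char *copy = strdup (logdef);
  char *entry = copy;
  int result = -1;

  while ((NULL != entry) && ('\0' != *entry) && (-1 == result))
  @{
    char *next = strchr (entry, '/');
    char *f[5];
    int i;

    if (NULL != next)
      *next++ = '\0';
    f[0] = entry;
    for (i = 1; i < 5; i++)
    @{
      f[i] = strchr (f[i - 1], ';');
      if (NULL == f[i])
        break;
      *f[i]++ = '\0';
    @}
    if ((i < 5) || (0 == sketch_parse_level (f[4])))
      break; /* malformed entry: stop processing the definition */
    if (sketch_field_matches (f[0], component) &&
        sketch_field_matches (f[1], file) &&
        sketch_field_matches (f[2], function) &&
        sketch_line_matches (f[3], line))
      result = (call_level <= sketch_parse_level (f[4]));
    entry = next;
  @}
  free (copy);
  return result;
@}
@end example

With the definition @code{transport.*;;.*send.*;;DEBUG/;;;;WARNING}, a
DEBUG call from a function named @code{do_send} in the "transport"
component is allowed by the first entry, while the same call from the
"core" component falls through to the second entry and is denied.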
2551
2552 Whether a call is allowed or not is only decided the first time this
2553 particular call is made. The evaluation result is then cached, so that
2554 any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect process performance.
2557 Log definition parsing is only done once, at the first call to
2558 @code{GNUNET_log_setup ()} made by the process (which is usually done soon after
2559 it starts).
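
The per-call caching can be pictured as one static flag per call site. The
following C sketch only illustrates that idea and is not the actual
@code{GNUNET_log} macro: @code{SKETCH_LOG_CACHED},
@code{sketch_evaluate} and the counter are hypothetical names, and the
stand-in check simply allows every call.

@example
#include <stdio.h>

static int sketch_evaluations; /* how often the slow check ran */

/* Stand-in for the expensive match against the logging definitions. */
static int
sketch_evaluate (const char *function)
@{
  (void) function; /* a real check would match this against the definitions */
  sketch_evaluations++;
  return 1;
@}

/* Every expansion of the macro owns one static flag:
   0 = undecided, 1 = allowed, -1 = denied. */
#define SKETCH_LOG_CACHED(...)                              \
  do @{                                                      \
    static int call_state;                                  \
    if (0 == call_state)                                    \
      call_state = sketch_evaluate (__func__) ? 1 : -1;     \
    if (1 == call_state)                                    \
      fprintf (stderr, __VA_ARGS__);                        \
  @} while (0)

/* The same call site runs three times but is evaluated only once. */
int
sketch_cached_demo (void)
@{
  for (int i = 0; i < 3; i++)
    SKETCH_LOG_CACHED ("iteration %d\n", i);
  return sketch_evaluations;
@}
@end example

Because the flag is static, the decision survives across calls of the
enclosing function, which is why later calls are cheap.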
2560
2561 At the moment of writing there is no way to specify logging definitions
2562 from configuration files, only via environment variables.
2563
2564 At the moment GNUnet will stop processing a log definition when it
2565 encounters an error in definition formatting or an error in regular
2566 expression syntax, and will not report the failure in any way.
2567
2568
2569 @c ***********************************************************************
2570 @menu
2571 * Examples::
2572 * Log files::
2573 * Updated behavior of GNUNET_log::
2574 @end menu
2575
2576 @node Examples
2577 @subsubsection Examples
2578
2579 @table @asis
2580
@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running all processes with DEBUG level (one
should be careful with it, as log files will grow at an alarming rate!)
@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running the core service under DEBUG level
(everything else will use configured or default level).
2587
2588 @item Start GNUnet process tree, allowing any logging calls from
2589 gnunet-service-transport_validation.c (everything else will use
2590 configured or default level).
2591
2592 @example
GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" \
2594 gnunet-arm -s
2595 @end example
2596
2597 @item Start GNUnet process tree, allowing any logging calls from
gnunet-service-fs_push.c (everything else will use configured or
2599 default level).
2600
2601 @example
2602 GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2603 @end example
2604
2605 @item Start GNUnet process tree, allowing any logging calls from the
2606 GNUNET_NETWORK_socket_select function (everything else will use
2607 configured or default level).
2608
2609 @example
2610 GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2611 @end example
2612
2613 @item Start GNUnet process tree, allowing any logging calls from the
2614 components that have "transport" in their names, and are made from
functions that have "send" in their names. Everything else will be allowed
2616 to be logged only if it has WARNING level.
2617
2618 @example
2619 GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2620 @end example
2621
2622 @end table
2623
2624
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:

@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
@end example

(Note the absence of double quotes in the environment variable definition,
as opposed to earlier examples, which use the shell.)
Another limitation on Windows is that GNUNET_FORCE_LOGFILE @strong{MUST}
be set in order for GNUNET_FORCE_LOG to work.
2637
2638
2639 @cindex Log files
2640 @node Log files
2641 @subsubsection Log files
2642
2643 GNUnet can be told to log everything into a file instead of stderr (which
2644 is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed on the command line, or via the "OPTIONS" and
"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
2647 with this option is subject to GNUnet filename expansion. If specified in
2648 "GLOBAL_POSTFIX", it is also subject to ARM service filename expansion,
2649 in particular, it may contain "@{@}" (left and right curly brace)
2650 sequence, which will be replaced by ARM with the name of the service.
2651 This is used to keep logs from more than one service separate, while only
2652 specifying one template containing "@{@}" in GLOBAL_POSTFIX.
2653
As part of a secondary file name expansion, the first occurrence of the
"[]" sequence ("left square brace" followed by "right square brace") in
the file name will be replaced with the process identifier of the process
when it initializes its logging subsystem. As a result, all processes will log
2658 into different files. This is convenient for isolating messages of a
2659 particular process, and prevents I/O races when multiple processes try to
2660 write into the file at the same time. This expansion is done
2661 independently of "@{@}" expansion that ARM service does (see above).
2662
2663 The log file name that is specified via "-l" can contain format characters
2664 from the 'strftime' function family. For example, "%Y" will be replaced
2665 with the current year. Using "basename-%Y-%m-%d.log" would include the
2666 current year, month and day in the log file. If a GNUnet process runs for
2667 long enough to need more than one log file, it will eventually clean up
2668 old log files. Currently, only the last three log files (plus the current
2669 log file) are preserved. So once the fifth log file goes into use (so
2670 after 4 days if you use "%Y-%m-%d" as above), the first log file will be
2671 automatically deleted. Note that if your log file name only contains "%Y",
2672 then log files would be kept for 4 years and the logs from the first year
2673 would be deleted once year 5 begins. If you do not use any date-related
2674 string format codes, logs would never be automatically deleted by GNUnet.
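
The two file name expansions can be tried out with a small C sketch. It
is a simplified illustration under stated assumptions, not the GNUnet
implementation: the function name @code{sketch_expand_log_name}, the fixed
512-byte buffer and the order of the two steps (first "[]", then
@code{strftime}) are choices made here. The "@{@}" expansion done by ARM
is not covered.

@example
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Replace the first "[]" in tmpl with pid, then expand strftime format
   characters using the time in when.  Returns 0 on success, -1 if a
   buffer is too small or the result would be empty. */
int
sketch_expand_log_name (const char *tmpl, long pid, const struct tm *when,
                        char *out, size_t out_size)
@{
  char with_pid[512];
  const char *brackets = strstr (tmpl, "[]");
  int n;

  if (NULL != brackets)
    n = snprintf (with_pid, sizeof (with_pid), "%.*s%ld%s",
                  (int) (brackets - tmpl), tmpl, pid, brackets + 2);
  else
    n = snprintf (with_pid, sizeof (with_pid), "%s", tmpl);
  if ((n < 0) || ((size_t) n >= sizeof (with_pid)))
    return -1;
  if (0 == strftime (out, out_size, with_pid, when))
    return -1;
  return 0;
@}
@end example

For the template @code{gnunet-[]-%Y-%m-%d.log}, process identifier 1234
and the date 5 January 2020, the sketch produces
@file{gnunet-1234-2020-01-05.log}, so each process and each day gets its
own file.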
2675
2676
2677 @c ***********************************************************************
2678
2679 @node Updated behavior of GNUNET_log
2680 @subsubsection Updated behavior of GNUNET_log
2681
2682 It's currently quite common to see constructions like this all over the
2683 code:
2684
2685 @example
2686 #if MESH_DEBUG
2687 GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2688 #endif
2689 @end example
2690
2691 The reason for the #if is not to avoid displaying the message when
2692 disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2693 compiler including it in the binary at all, when compiling GNUnet for
2694 platforms with restricted storage space / memory (MIPS routers,
2695 ARM plug computers / dev boards, etc).
2696
2697 This presents several problems: the code gets ugly, hard to write and it
2698 is very easy to forget to include the #if guards, creating non-consistent
2699 code. A new change in GNUNET_log aims to solve these problems.
2700
@strong{This change requires running @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2703
2704 Here is an example of code with dense debug statements:
2705
2706 @example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections = create_clique (pg, &remove_connections,
                                             BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections = create_small_world_ring (pg,
                                                       &remove_connections,
                                                       BLACKLIST);
  break;
@}
2716 @end example
2717
2718
2719 Pretty hard to follow, huh?
2720
2721 From now on, it is not necessary to include the #if / #endif statements to
2722 achieve the same behavior. The @code{GNUNET_log} and @code{GNUNET_log_from}
2723 macros take
2724 care of it for you, depending on the configure option:
2725
2726 @itemize @bullet
2727 @item If @code{--enable-logging} is set to @code{no}, the binary will
2728 contain no log messages at all.
2729 @item If @code{--enable-logging} is set to @code{yes}, the binary will
2730 contain no DEBUG messages, and therefore running with @command{-L DEBUG}
2731 will have
2732 no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
@item If @code{--enable-logging} is set to @code{verbose} or
@code{veryverbose}, the binary will contain DEBUG messages (still, it will
be necessary to run with @command{-L DEBUG} or set the DEBUG config option
to show them).
2738 @end itemize
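
How a macro can drop DEBUG calls at compile time is sketched below in
plain C. This is an illustration only and not the definition used in
GNUnet: @code{SKETCH_LOG}, @code{SKETCH_EXTRA_LOGGING} and the level
names are hypothetical, with @code{SKETCH_EXTRA_LOGGING} set to 0 playing
the role of @code{--enable-logging=yes} and 1 the role of
@code{--enable-logging=verbose}.

@example
#include <stdio.h>

#ifndef SKETCH_EXTRA_LOGGING
#define SKETCH_EXTRA_LOGGING 0 /* 0: no DEBUG messages, 1: keep them */
#endif

enum sketch_level
@{
  SKETCH_ERROR = 1,
  SKETCH_WARNING = 2,
  SKETCH_INFO = 3,
  SKETCH_DEBUG = 4
@};

static int sketch_emitted; /* counts messages actually written */

/* The condition is a compile-time constant for every call site, so an
   optimizing compiler can drop a disabled DEBUG call together with its
   format string.  No #if guard is needed around the call. */
#define SKETCH_LOG(level, ...)                                  \
  do @{                                                          \
    if ((SKETCH_DEBUG != (level)) || SKETCH_EXTRA_LOGGING)      \
    @{                                                           \
      fprintf (stderr, __VA_ARGS__);                            \
      sketch_emitted++;                                         \
    @}                                                           \
  @} while (0)

/* Returns the number of messages written by the two calls below. */
int
sketch_demo (void)
@{
  sketch_emitted = 0;
  SKETCH_LOG (SKETCH_WARNING, "disk almost full\n");
  SKETCH_LOG (SKETCH_DEBUG, "client %d disconnected\n", 42);
  return sketch_emitted;
@}
@end example

With the default of 0 only the warning is written. Compiling the same
file with @code{-DSKETCH_EXTRA_LOGGING=1} keeps both messages, without
touching the calling code.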
2739
2740
2741 If you are a developer:
2742 @itemize @bullet
2743 @item please make sure that you @code{./configure
2744 --enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
2745 @item please remove the @code{#if} statements around @code{GNUNET_log
2746 (GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
2747 code.
2748 @end itemize
2749
2750 Since now activating DEBUG automatically makes it VERBOSE and activates
2751 @strong{all} debug messages by default, you probably want to use the
2752 @uref{https://old.gnunet.org/logging, https://old.gnunet.org/logging}
2753 functionality to filter only relevant messages.
2754 A suitable configuration could be:
2755
2756 @example
2757 $ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2758 @end example
2759
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course you can adapt it to your particular needs; this is only
a quick example.
2763
2764 @cindex Interprocess communication API
@cindex IPC
2766 @node Interprocess communication API (IPC)
2767 @subsection Interprocess communication API (IPC)
2768
In GNUnet a variety of new message types might be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement the message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2777
2778
2779 @c ***********************************************************************
2780 @menu
2781 * Define new message types::
2782 * Define message struct::
2783 * Client - Establish connection::
2784 * Client - Initialize request message::
2785 * Client - Send request and receive response::
2786 * Server - Startup service::
2787 * Server - Add new handles for specified messages::
2788 * Server - Process request message::
2789 * Server - Response to client::
2790 * Server - Notification of clients::
2791 * Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2792 @end menu
2793
2794 @node Define new message types
2795 @subsubsection Define new message types
2796
2797 First of all, you should define the new message type in
2798 @file{gnunet_protocols.h}:
2799
2800 @example
/* Request to look up addresses of peers in the server. */
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
/* Response to the address lookup request. */
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2805 @end example
2806
2807 @c ***********************************************************************
2808 @node Define message struct
2809 @subsubsection Define message struct
2810
After the type definition, the specified message structure should also be
described in the header file, e.g. @file{transport.h} in our case.

@example
GNUNET_NETWORK_STRUCT_BEGIN

struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2823 @end example
2824
2825
2826 Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}
2827 which both ensure correct alignment when sending structs over the network.
2828
2831
2832 @c ***********************************************************************
2833 @node Client - Establish connection
2834 @subsubsection Client - Establish connection
2835 @c %**end of header
2836
2837
First, on the client side, the underlying API is used to create a new
connection to a service. In our example, we connect to the transport
service.
2841
2842 @example
2843 struct GNUNET_CLIENT_Connection *client;
2844 client = GNUNET_CLIENT_connect ("transport", cfg);
2845 @end example
2846
2847 @c ***********************************************************************
2848 @node Client - Initialize request message
2849 @subsubsection Client - Initialize request message
2850 @c %**end of header
2851
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as the timeout, the
address and the name of the transport.

@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans)
  + 1;
/* allocate header plus payload; GNUNET_malloc zeroes the memory */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
(GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
/* the payload starts right after the fixed-size struct */
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2872 @end example
2873
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert values from
host byte order into network byte order (big endian). For the use of
the two byte orders and the corresponding conversion functions, please
refer to the section on conversion between Network Byte Order
(Big Endian) and Host Byte Order.
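
The message layout built above can also be reproduced without any GNUnet
headers. The following is a minimal, self-contained sketch, not the
actual transport code: @code{struct ExampleHeader},
@code{struct ExampleLookup} and @code{build_lookup} are hypothetical
stand-ins, and the message type value 42 is arbitrary.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct GNUNET_MessageHeader. */
struct ExampleHeader @{
  uint16_t size; /* total message size, network byte order */
  uint16_t type; /* message type, network byte order */
@};

/* Hypothetical fixed part of a variable-size message. */
struct ExampleLookup @{
  struct ExampleHeader header;
  uint32_t addrlen; /* network byte order */
  /* followed by 'addrlen' address bytes, then a 0-terminated name */
@};

/* Allocate and fill a message; returns NULL if it would not fit
   into the 16-bit size field or if the allocation fails. */
static struct ExampleLookup *
build_lookup (const void *addr, uint32_t addrlen, const char *name)
@{
  size_t nlen = strlen (name) + 1;
  size_t len;
  struct ExampleLookup *msg;
  char *payload;

  if (addrlen > UINT16_MAX || nlen > UINT16_MAX)
    return NULL;
  len = sizeof (struct ExampleLookup) + addrlen + nlen;
  if (len > UINT16_MAX)
    return NULL;
  msg = malloc (len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (42);
  msg->addrlen = htonl (addrlen);
  payload = (char *) &msg[1]; /* first byte after the struct */
  memcpy (payload, addr, addrlen);
  memcpy (payload + addrlen, name, nlen);
  return msg;
@}
@end example

The explicit length checks are needed because the size field of the
header is only 16 bits wide, so a message can never exceed 65535 bytes.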
2879
2880 @c ***********************************************************************
2881 @node Client - Send request and receive response
2882 @subsubsection Client - Send request and receive response
2883 @c %**end of header
2884
2885 @b{FIXME: This is very outdated, see the tutorial for the current API!}
2886
2887 Next, the client would send the constructed message as a request to the
2888 service and wait for the response from the service. To accomplish this
2889 goal, there are a number of API calls that can be used. In this example,
2890 @code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2891 appropriate function to use.
2892
2893 @example
GNUNET_CLIENT_transmit_and_get_response
  (client, &msg->header, timeout, GNUNET_YES,
   &address_response_processor, arp_ctx);
2897 @end example
2898
The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
2902
2903 @node Server - Startup service
2904 @subsubsection Server - Startup service
2905
On the server side, before any request message can be received, we run a
standard GNUnet service startup sequence using @code{GNUNET_SERVICE_run},
as follows:
2908
2909 @example
int main (int argc, char **argv) @{
  return (GNUNET_OK ==
          GNUNET_SERVICE_run (argc, argv, "transport",
                              GNUNET_SERVICE_OPTION_NONE,
                              &run, NULL)) ? 0 : 1;
@}
2913 @end example
2914
2915 @c ***********************************************************************
2916 @node Server - Add new handles for specified messages
2917 @subsubsection Server - Add new handles for specified messages
2918 @c %**end of header
2919
In the function above, the argument @code{run} is the callback that
initializes the transport service. It is defined like this:
2922
2923 @example
static void
run (void *cls,
     struct GNUNET_SERVER_Handle *serv,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2928 @end example
2929
2930
2931 Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2932 function to add new handlers in the service. The parameter
2933 @code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2934 to tell the service which function should be called when a particular
2935 type of message is received, and should be defined in this way:
2936
2937 @example
2938 static struct GNUNET_SERVER_MessageHandler handlers[] = @{
2939   @{&handle_start,
2940    NULL,
2941    GNUNET_MESSAGE_TYPE_TRANSPORT_START,
2942    0@},
2943   @{&handle_send,
2944    NULL,
2945    GNUNET_MESSAGE_TYPE_TRANSPORT_SEND,
2946    0@},
2947   @{&handle_try_connect,
2948    NULL,
2949    GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
2950    sizeof (struct TryConnectMessage)
2951   @},
2952   @{&handle_address_lookup,
2953    NULL,
2954    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP,
2955    0@},
2956   @{NULL,
2957    NULL,
2958    0,
2959    0@}
2960 @};
2961 @end example
2962
2963
As shown, the first member of each entry is the callback function that
is called to process messages of the type given as the third member. The
second member is the closure for the callback function, which is set to
@code{NULL} in most cases. The last member is the expected size of
messages of this type. Usually we set it to 0 to accept messages of
variable size; where the size is fixed, the exact size can be given
instead. Finally, the array is terminated by the entry
@code{@{NULL, NULL, 0, 0@}}.
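
To illustrate how such a terminated handler table can be consumed, here
is a minimal, self-contained sketch of a dispatcher. It is not the
GNUnet server implementation: @code{struct ExampleHandler},
@code{dispatch} and @code{count_call} are hypothetical names, and the
way a size mismatch is reported (return value -1) is an assumption of
this sketch.

@example
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler entry, mirroring the four members
   described above: callback, closure, type, expected size. */
struct ExampleHandler @{
  void (*callback) (void *cls, const void *msg, uint16_t size);
  void *cls;
  uint16_t type;
  uint16_t expected_size; /* 0 accepts any size */
@};

/* Walk the table up to the all-zero terminator. Returns 1 if a
   handler ran, 0 if the type is unknown, -1 on a size mismatch. */
static int
dispatch (const struct ExampleHandler *handlers,
          uint16_t type, const void *msg, uint16_t size)
@{
  for (size_t i = 0; NULL != handlers[i].callback; i++)
  @{
    if (handlers[i].type != type)
      continue;
    if ( (0 != handlers[i].expected_size) &&
         (size != handlers[i].expected_size) )
      return -1;
    handlers[i].callback (handlers[i].cls, msg, size);
    return 1;
  @}
  return 0;
@}

/* Example callback: counts the calls in the closure. */
static void
count_call (void *cls, const void *msg, uint16_t size)
@{
  (void) msg;
  (void) size;
  (*(unsigned int *) cls)++;
@}
@end example

The terminator entry makes it unnecessary to pass the number of handlers
separately, at the price that a @code{NULL} callback cannot be used for
a real entry.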
2972
2973 @c ***********************************************************************
2974 @node Server - Process request message
2975 @subsubsection Server - Process request message
2976 @c %**end of header
2977
After the transport service has been initialized, it can process
request messages. Before handling the actual message data, the handler
should check that the message is valid, for example that its size is
correct.
2982
2983 @example
2984 size = ntohs (message->size);
2985 if (size < sizeof (struct AddressLookupMessage)) @{
2986   GNUNET_break_op (0);
2987   GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
2988   return; @}
2989 @end example
2990
2991
Note that, in contrast to the construction of the request message in
the client, the server should use the functions @code{ntohl} and
@code{ntohs} when extracting data from the message, so that data in
network byte order (big endian) is converted back into host byte order.
For more details, please refer to the section on conversion between
Network Byte Order (Big Endian) and Host Byte Order.
2998
2999 Moreover in this example, the name of the transport stored in the message
3000 is a 0-terminated string, so we should also check whether the name of the
3001 transport in the received message is 0-terminated:
3002
3003 @example
3004 nameTransport = (const char *) &address[addressLen];
3005 if (nameTransport[size - sizeof
3006                   (struct AddressLookupMessage)
3007                   - addressLen - 1] != '\0') @{
3008   GNUNET_break_op (0);
3009   GNUNET_SERVER_receive_done (client,
3010                               GNUNET_SYSERR);
3011   return; @}
3012 @end example
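
The two checks above can be combined into a single validation routine.
The following self-contained sketch shows one way to do this under
simplifying assumptions: @code{struct ExampleHeader},
@code{struct ExampleLookup} and @code{check_lookup} are hypothetical
stand-ins rather than GNUnet definitions. The sketch additionally
verifies that the address length fits into the message, a step that the
excerpts above do not show.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the header and the fixed part. */
struct ExampleHeader @{ uint16_t size; uint16_t type; @};
struct ExampleLookup @{
  struct ExampleHeader header;
  uint32_t addrlen;
@};

/* Returns a pointer to the 0-terminated name inside 'buf', or
   NULL if the message is malformed. 'buflen' is the number of
   bytes actually received. */
static const char *
check_lookup (const void *buf, size_t buflen)
@{
  struct ExampleLookup fixed;
  uint16_t size;
  uint32_t addrlen;
  size_t rest;
  const char *payload;

  if (buflen < sizeof (fixed))
    return NULL;
  memcpy (&fixed, buf, sizeof (fixed)); /* avoid unaligned access */
  size = ntohs (fixed.header.size);
  if ( (size < sizeof (fixed)) || (size > buflen) )
    return NULL;
  rest = size - sizeof (fixed);
  addrlen = ntohl (fixed.addrlen);
  if (addrlen >= rest) /* need at least one byte for the name */
    return NULL;
  payload = (const char *) buf + sizeof (fixed);
  if ('\0' != payload[rest - 1]) /* last byte must terminate name */
    return NULL;
  return &payload[addrlen];
@}
@end example

Every length taken from the message is checked before it is used as an
offset, since the sender of a message is not necessarily trustworthy.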
3013
3014 Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
3015 service that the request is done and can receive the next message. The
3016 argument @code{GNUNET_SYSERR} here indicates that the service didn't
3017 understand the request message, and the processing of this request would
3018 be terminated.
3019
3020 In comparison to the aforementioned situation, when the argument is equal
3021 to @code{GNUNET_OK}, the service would continue to process the request
3022 message.
3023
3024 @c ***********************************************************************
3025 @node Server - Response to client
3026 @subsubsection Server - Response to client
3027 @c %**end of header
3028
Once the processing of the current request is done, the server should
send a response to the client. The server produces a new
@code{struct AddressLookupMessage} in the same way as the client did and
sends it to the client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.

3035 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
3043
3044 // ...
3045
3046 struct GNUNET_SERVER_TransmitContext *tc;
3047 tc = GNUNET_SERVER_transmit_context_create (client);
3048 GNUNET_SERVER_transmit_context_append_data
3049 (tc,
3050  NULL,
3051  0,
3052  GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
3053 GNUNET_SERVER_transmit_context_run (tc, rtimeout);
3054 @end example
3055
3056
3057 Note that, there are also a number of other APIs provided to the service
3058 to send the message.
3059
3060 @c ***********************************************************************
3061 @node Server - Notification of clients
3062 @subsubsection Server - Notification of clients
3063 @c %**end of header
3064
3065 Often a service needs to (repeatedly) transmit notifications to a client
3066 or a group of clients. In these cases, the client typically has once
3067 registered for a set of events and then needs to receive a message
3068 whenever such an event happens (until the client disconnects). The use of
3069 a notification context can help manage message queues to clients and
3070 handle disconnects. Notification contexts can be used to send
3071 individualized messages to a particular client or to broadcast messages
3072 to a group of clients. An individualized notification might look like
3073 this:
3074
3075 @example
3076 GNUNET_SERVER_notification_context_unicast(nc,
3077                                            client,
3078                                            msg,
3079                                            GNUNET_YES);
3080 @end example
3081
3082
3083 Note that after processing the original registration message for
3084 notifications, the server code still typically needs to call
3085 @code{GNUNET_SERVER_receive_done} so that the client can transmit further
3086 messages to the server.
3087
3088 @c ***********************************************************************
3089 @node Conversion between Network Byte Order (Big Endian) and Host Byte Order
3090 @subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
3091 @c %** subsub? it's a referenced page on the ipc document.
3092 @c %**end of header
3093
Network byte order is always big endian, whereas host byte order depends
on the CPU of the host and is little endian on many common platforms.
What is the difference between the two?

Consider an integer stored in RAM that occupies 4 bytes. In big endian
order, the most significant byte is stored at the lowest address and
the least significant byte at the highest address. Little endian order
does exactly the opposite: it stores the least significant byte at the
lowest address and the most significant byte at the highest address.

Hosts that want to communicate with each other exchange data packets
over the network, and the two hosts may use different host byte orders.
To make sure that both sides interpret the data identically, multi-byte
values must be converted to network byte order before they are sent and
converted back to host byte order after they are received.

There are ten convenient functions for converting between byte orders
in GNUnet, as follows:
3115
3116 @table @asis
3117
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a short integer from host byte order to network byte order.
@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a long integer from host byte order to network byte order.
@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a short integer from network byte order to host byte order.
@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a long integer from network byte order to host byte order.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long integer from network byte order to host byte order.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long integer from host byte order to network byte order.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.
@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
3143 @end table
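
The effect of these conversions can be observed with a few lines of
standard C. The following sketch uses only the POSIX functions
@code{htonl} and @code{ntohl}; @code{example_htonll} and
@code{first_wire_byte} are hypothetical helpers written for this
illustration and are not the GNUnet implementations.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit host-to-network conversion that
   works regardless of the host's endianness. */
static uint64_t
example_htonll (uint64_t host)
@{
  unsigned char b[8];
  uint64_t net;

  for (int i = 0; i < 8; i++)
    b[i] = (unsigned char) (host >> (56 - 8 * i)); /* MSB first */
  memcpy (&net, b, sizeof (net));
  return net;
@}

/* Returns the byte at the lowest address of a 32-bit value
   after conversion to network byte order. */
static unsigned char
first_wire_byte (uint32_t host)
@{
  uint32_t net = htonl (host);
  unsigned char b[4];

  memcpy (b, &net, sizeof (b));
  return b[0];
@}
@end example

For example, @code{first_wire_byte (0x01020304)} yields @code{0x01} on
every host, because the most significant byte always comes first in
network byte order.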
3144
3145 @cindex Cryptography API
3146 @node Cryptography API
3147 @subsection Cryptography API
3148 @c %**end of header
3149
The gnunetutil API provides the cryptographic primitives used in GNUnet.
GNUnet uses 2048-bit RSA keys for the session key exchange, for signing
messages by peers and for most other public-key operations. Most
researchers in cryptography consider 2048-bit RSA keys secure and
practically unbreakable for a long time. The API provides functions to
create a fresh key pair, read a private key from a file (or create a new
file if the file does not exist), encrypt, decrypt, sign, verify and
extract the public key into a format suitable for network transmission.
3158
For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and decryption.
3165
3166 GNUnet uses SHA-512 for computing one-way hash codes. The API provides
3167 functions to compute a hash over a block in memory or over a file on disk.
3168
3169 The crypto API also provides functions for randomizing a block of memory,
3170 obtaining a single random number and for generating a permutation of the
3171 numbers 0 to n-1. Random number generation distinguishes between WEAK and
3172 STRONG random number quality; WEAK random numbers are pseudo-random
3173 whereas STRONG random numbers use entropy gathered from the operating
3174 system.
3175
Finally, the crypto API provides a means to deterministically generate a
1024-bit RSA key from a hash code. These functions should most likely not
be used by most applications; most importantly,
@code{GNUNET_CRYPTO_rsa_key_create_from_hash} does not create an RSA key
that should be considered secure for traditional applications of RSA.
3181
3182 @cindex Message Queue API
3183 @node Message Queue API
3184 @subsection Message Queue API
3185 @c %**end of header
3186
3187 @strong{ Introduction }@
3188 Often, applications need to queue messages that
3189 are to be sent to other GNUnet peers, clients or services. As all of
3190 GNUnet's message-based communication APIs, by design, do not allow
3191 messages to be queued, it is common to implement custom message queues
3192 manually when they are needed. However, writing very similar code in
3193 multiple places is tedious and leads to code duplication.
3194
3195 MQ (for Message Queue) is an API that provides the functionality to
3196 implement and use message queues. We intend to eventually replace all of
3197 the custom message queue implementations in GNUnet with MQ.
3198
3199 @strong{ Basic Concepts }@
3200 The two most important entities in MQ are queues and envelopes.
3201
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of
message handlers. The message queue will then also wait for incoming
messages and dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata
(Where is the envelope queued? What should happen after it has been
sent?). Any envelope can only be queued in one message queue.
3209 (Where is the envelope queued? What should happen after it has been
3210 sent?). Any envelope can only be queued in one message queue.
3211
3212 @strong{ Creating Queues }@
The following is a list of currently available message queues. Note that
to avoid layering issues, message queues for higher level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself provides
the queue implementation.
3217
3218 @table @asis
3219
3220 @item @code{GNUNET_MQ_queue_for_connection_client}
3221 Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
3222 Also supports receiving with message handlers.
3223
3224 @item @code{GNUNET_MQ_queue_for_server_client}
3225 Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
3226 not support incoming message handlers.
3227
3228 @item @code{GNUNET_MESH_mq_create} Transmits queued messages over a
3229 @code{GNUNET_MESH_Tunnel} handle. Does not support incoming message
3230 handlers.
3231
3232 @item @code{GNUNET_MQ_queue_for_callbacks} This is the most general
3233 implementation. Instead of delivering and receiving messages with one of
3234 GNUnet's communication APIs, implementation callbacks are called. Refer to
3235 "Implementing Queues" for a more detailed explanation.
3236 @end table
3237
3238
3239 @strong{ Allocating Envelopes }@
3240 A GNUnet message (as defined by the GNUNET_MessageHeader) has three
3241 parts: The size, the type, and the body.
3242
3243 MQ provides macros to allocate an envelope containing a message
3244 conveniently, automatically setting the size and type fields of the
3245 message.
3246
3247 Consider the following simple message, with the body consisting of a
3248 single number value.
3251
3252 @example
3253 struct NumberMessage @{
3254   /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
3255   struct GNUNET_MessageHeader header;
3256   uint32_t number GNUNET_PACKED;
3257 @};
3258 @end example
3259
3260 An envelope containing an instance of the NumberMessage can be
3261 constructed like this:
3262
3263 @example
3264 struct GNUNET_MQ_Envelope *ev;
3265 struct NumberMessage *msg;
3266 ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
3267 msg->number = htonl (42);
3268 @end example
3269
3270 In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
3271 the newly allocated envelope. The first argument must be a pointer to some
3272 @code{struct} containing a @code{struct GNUNET_MessageHeader header}
3273 field, while the second argument is the desired message type, in host
3274 byte order.
3275
3276 The @code{msg} pointer now points to an allocated message, where the
3277 message type and the message size are already set. The message's size is
3278 inferred from the type of the @code{msg} pointer: It will be set to
3279 'sizeof(*msg)', properly converted to network byte order.
3280
If the message body's size is dynamic, the macro
@code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
message has additional space allocated after the @code{msg} structure.
3284
3285 If no structure has been defined for the message,
3286 @code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
3287 after the message header. The first argument then must be a pointer to a
3288 @code{GNUNET_MessageHeader}.
3289
3290 @strong{Envelope Properties}@
A few functions in MQ allow setting additional properties on envelopes:
3292
3293 @table @asis
3294
@item @code{GNUNET_MQ_notify_sent}
Allows specifying a function that will be called once the envelope's
message has been sent irrevocably. An envelope can be canceled precisely
up to the point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
3303
3304 @end table
3305
3306
3307 @strong{Sending Envelopes}@
3308 Once an envelope has been constructed, it can be queued for sending with
3309 @code{GNUNET_MQ_send}.
3310
3311 Note that in order to avoid memory leaks, an envelope must either be sent
3312 (the queue will free it) or destroyed explicitly with
3313 @code{GNUNET_MQ_discard}.
3314
3315 @strong{Canceling Envelopes}@
An envelope queued with @code{GNUNET_MQ_send} can be canceled with
@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
been called, canceling a message results in undefined behavior.
Thus it is unsafe to cancel an envelope that does not have a notify sent
callback. When canceling an envelope, it is not necessary to call
@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
3322
3323 @strong{ Implementing Queues }@
3324 @code{TODO}
3325
3326 @cindex Service API
3327 @node Service API
3328 @subsection Service API
3329 @c %**end of header
3330
Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol. To this end, each service must listen on either a TCP port or
a UNIX domain socket, which the service implementation does using the
server API. This use of the server API is exposed directly to the users
of the service API. Thus, when using the service API, one usually also
uses large parts of the server API. The service API provides various
convenience functions, such as parsing command-line arguments and the
configuration file, which are not found in the server API.
The dual to the service/server API is the client API, which can be used to
access services.
3344
3345 The most common way to start a service is to use the
3346 @code{GNUNET_SERVICE_run} function from the program's main function.
3347 @code{GNUNET_SERVICE_run} will then parse the command line and
3348 configuration files and, based on the options found there,
3349 start the server. It will then give back control to the main
3350 program, passing the server and the configuration to the
3351 @code{GNUNET_SERVICE_Main} callback. @code{GNUNET_SERVICE_run}
3352 will also take care of starting the scheduler loop.
3353 If this is inappropriate (for example, because the scheduler loop
3354 is already running), @code{GNUNET_SERVICE_start} and
3355 related functions provide an alternative to @code{GNUNET_SERVICE_run}.
3356
When starting a service, the @code{service_name} option is used to
determine which sections in the configuration file should be used to
configure the service.
3359 service. A typical value here is the name of the @file{src/}
3360 sub-directory, for example @file{statistics}.
3361 The same string would also be given to
3362 @code{GNUNET_CLIENT_connect} to access the service.
3363
3364 Once a service has been initialized, the program should use the
3365 @code{GNUNET_SERVICE_Main} callback to register message handlers
3366 using @code{GNUNET_SERVER_add_handlers}.
3367 The service will already have registered a handler for the
3368 "TEST" message.
3369
3370 @findex GNUNET_SERVICE_Options
3371 The option bitfield (@code{enum GNUNET_SERVICE_Options})
3372 determines how a service should behave during shutdown.
3373 There are three key strategies:
3374
3375 @table @asis
3376
3377 @item instant (@code{GNUNET_SERVICE_OPTION_NONE})
3378 Upon receiving the shutdown
3379 signal from the scheduler, the service immediately terminates the server,
3380 closing all existing connections with clients.
3381 @item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself
during shutdown. The main program will need to take the appropriate
action by calling @code{GNUNET_SERVER_destroy} or
@code{GNUNET_SERVICE_stop} (depending on how the service was
initialized) to terminate the service. This method is used by
gnunet-service-arm and is rather uncommon.
3387 @item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
3388 Upon receiving the shutdown signal from the scheduler,
3389 the service immediately tells the server to stop
3390 listening for incoming clients. Requests from normal existing clients are
3391 still processed and the server/service terminates once all normal clients
3392 have disconnected. Clients that are not expected to ever disconnect (such
3393 as clients that monitor performance values) can be marked as 'monitor'
3394 clients using GNUNET_SERVER_client_mark_monitor. Those clients will
3395 continue to be processed until all 'normal' clients have disconnected.
3396 Then, the server will terminate, closing the monitor connections.
3397 This mode is for example used by 'statistics', allowing existing 'normal'
3398 clients to set (possibly persistent) statistic values before terminating.
3399
3400 @end table
3401
3402 @c ***********************************************************************
3403 @node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3404 @subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3405 @c %**end of header
3406
A commonly used data structure in GNUnet is a (multi-)hash map. It is most
often used to map a peer identity to some data structure, but also to map
arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). As it is commonly used, the hash map is
actually sometimes responsible for a large share of GNUnet's overall
memory consumption (for some processes, 30% is not uncommon). The
following text documents some API quirks (and their implications for
applications) that were recently introduced to minimize the footprint of
the hash map.
3416
3417
3418 @c ***********************************************************************
3419 @menu
3420 * Analysis::
3421 * Solution::
3422 * Migration::
3423 * Conclusion::
3424 * Availability::
3425 @end menu
3426
3427 @node Analysis
3428 @subsubsection Analysis
3429 @c %**end of header
3430
3431 The main reason for the "excessive" memory consumption by the hash map is
3432 that GNUnet uses 512-bit cryptographic hash codes --- and the
3433 (multi-)hash map also uses the same 512-bit 'struct GNUNET_HashCode'. As
3434 a result, storing just the keys requires 64 bytes of memory for each key.
3435 As some applications like to keep a large number of entries in the hash
3436 map (after all, that's what maps are good for), 64 bytes per hash is
3437 significant: keeping a pointer to the value and having a linked list for
3438 collisions consume between 8 and 16 bytes, and 'malloc' may add about the
3439 same overhead per allocation, putting us in the 16 to 32 byte per entry
3440 ballpark. Adding a 64-byte key then triples the overall memory
3441 requirement for the hash map.
3442
3443 To make things "worse", most of the time storing the key in the hash map
3444 is not required: it is typically already in memory elsewhere! In most
3445 cases, the values stored in the hash map are some application-specific
3446 struct that _also_ contains the hash. Here is a simplified example:
3447
3448 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};
3452
3453 // ...
3454 val = GNUNET_malloc (sizeof (struct MyValue));
3455 val->key = key;
3456 val->my_data = 42;
3457 GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
3458 @end example
3459
3460 This is a common pattern as later the entries might need to be removed,
3461 and at that time it is convenient to have the key immediately at hand:
3462
3463 @example
3464 GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
3465 @end example
3466
3467
3468 Note that here we end up with two times 64 bytes for the key, plus maybe
3469 64 bytes total for the rest of the 'struct MyValue' and the map entry in
3470 the hash map. The resulting redundant storage of the key increases
3471 overall memory consumption per entry from the "optimal" 128 bytes to 192
3472 bytes. This is not just an extreme example: overheads in practice are
3473 actually sometimes close to those highlighted in this example. This is
3474 especially true for maps with a significant number of entries, as there
3475 we tend to really try to keep the entries small.
3476
3477 @c ***********************************************************************
3478 @node Solution
3479 @subsubsection Solution
3480 @c %**end of header
3481
The solution that has now been implemented is to @strong{optionally}
allow the hash map to not make a (deep) copy of the hash but instead have
a pointer to the hash/key in the entry. This reduces the memory
consumption for the key from 64 bytes to 4 to 8 bytes. However, it can
only work if the key is actually stored in the entry (which is the
case most of the time) and if the entry does not modify the key (which in
all of the code I'm aware of has always been the case if the key is
stored in the entry). Finally, when the client stores an entry in the
hash map, it @strong{must} provide a pointer to the key within the entry,
not just a pointer to a transient location of the key. If
the client code does not meet these requirements, the result is a dangling
pointer and undefined behavior of the (multi-)hash map API.
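
The saving can be made concrete with a small, self-contained sketch.
The structures below are hypothetical and only approximate a map entry;
the actual internal layout of GNUnet's hash map entries is not shown in
this text. @code{struct ExampleHash} stands in for the 512-bit
@code{struct GNUNET_HashCode}.

@example
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 512-bit struct GNUNET_HashCode. */
struct ExampleHash @{ uint32_t bits[16]; @};

/* Entry that keeps its own deep copy of the key. */
struct EntryWithCopy @{
  struct ExampleHash key;
  void *value;
  struct EntryWithCopy *next;
@};

/* Entry that only points at the key stored inside the value. */
struct EntryWithPointer @{
  const struct ExampleHash *key;
  void *value;
  struct EntryWithPointer *next;
@};

struct MyValue @{
  struct ExampleHash key;
  unsigned int my_data;
@};

/* Link 'val' into an entry; the key pointer refers to val->key,
   so it stays valid for as long as 'val' itself is alive. */
static void
entry_init (struct EntryWithPointer *e, struct MyValue *val)
@{
  e->key = &val->key;
  e->value = val;
  e->next = NULL;
@}

/* Bytes saved per entry by not copying the key. */
static size_t
saved_bytes (void)
@{
  return sizeof (struct EntryWithCopy)
         - sizeof (struct EntryWithPointer);
@}
@end example

On common platforms @code{saved_bytes} returns 64 minus the size of a
pointer, which matches the reduction from 64 bytes to 4 to 8 bytes
described above.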
3494
3495 @c ***********************************************************************
3496 @node Migration
3497 @subsubsection Migration
3498 @c %**end of header
3499
3500 To use the new feature, first check that the values contain the respective
3501 key (and never modify it). Then, all calls to
3502 @code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
3503 audited and most likely changed to pass a pointer into the value's struct.
3504 For the initial example, the new code would look like this:
3505
3506 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
3515 @end example
3516
3517
Note that @code{&key} was changed to @code{&val->key} in the argument to
the @code{put} call. This is critical as often @code{key} is on the stack
or in some other transient data structure and thus having the hash map
keep a pointer to @code{key} would not work. Only the key inside of
@code{val} has the same lifetime as the entry in the map (this must of
course be checked as well). Naturally, @code{val->key} must be
initialized before the @code{put} call. Once all @code{put} calls have
been converted and double-checked, you can change the call to create the
hash map from
3527
3528 @example
3529 map =
3530 GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
3531 @end example
3532
3533 to
3534
3535 @example
3536 map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
3537 @end example
3538
3539 If everything was done correctly, you now use about 60 bytes less memory
3540 per entry in @code{map}. However, if now (or in the future) any call to
3541 @code{put} does not ensure that the given key is valid until the entry is
3542 removed from the map, undefined behavior is likely to be observed.
3543
3544 @c ***********************************************************************
3545 @node Conclusion
3546 @subsubsection Conclusion
3547 @c %**end of header
3548
The new optimization is often applicable and can result in a
reduction in memory consumption of up to 30% in practice. However, it
makes the code less robust as additional invariants are imposed on the
multi hash map client. Thus applications should refrain from enabling the
new mode unless the resulting performance increase is deemed significant
enough. In particular, it should generally not be used in new code (wait
at least until benchmarks exist).
3556
3557 @c ***********************************************************************
3558 @node Availability
3559 @subsubsection Availability
3560 @c %**end of header
3561
3562 The new multi hash map code was committed in SVN 24319 (which made its
3563 way into GNUnet version 0.9.4).
3564 Various subsystems (transport, core, dht, file-sharing) were
3565 previously audited and modified to take advantage of the new capability.
3566 In particular, memory consumption of the file-sharing service is expected
3567 to drop by 20-30% due to this change.
3568
3569
3570 @cindex CONTAINER_MDLL API
3571 @node CONTAINER_MDLL API
3572 @subsection CONTAINER_MDLL API
3573 @c %**end of header
3574
3575 This text documents the GNUNET_CONTAINER_MDLL API. The
3576 GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
3577 that it provides operations for the construction and manipulation of
3578 doubly-linked lists. The key difference to the (simpler) DLL-API is that
3579 the MDLL-version allows a single element (instance of a "struct") to be
3580 in multiple linked lists at the same time.
3581
3582 Like the DLL API, the MDLL API stores (most of) the data structures for
3583 the doubly-linked list with the respective elements; only the 'head' and
3584 'tail' pointers are stored "elsewhere" --- and the application needs to
3585 provide the locations of head and tail to each of the calls in the
3586 MDLL API. The key difference for the MDLL API is that the "next" and
3587 "previous" pointers in the struct can no longer be simply called "next"
3588 and "prev" --- after all, the element may be in multiple doubly-linked
3589 lists, so we cannot just have one "next" and one "prev" pointer!
3590
3591 The solution is to have multiple fields that must have a name of the
3592 format "next_XX" and "prev_XX" where "XX" is the name of one of the
3593 doubly-linked lists. Here is a simple example:
3594
3595 @example
struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
3604 @end example
3605
3606
3607 Note that by convention, we use all-uppercase letters for the list names.
3608 In addition, the program needs to have a location for the head and tail
3609 pointers for both lists, for example:
3610
3611 @example
3612 static struct MyMultiListElement *head_ALIST;
3613 static struct MyMultiListElement *tail_ALIST;
3614 static struct MyMultiListElement *head_BLIST;
3615 static struct MyMultiListElement *tail_BLIST;
3616 @end example
3617
3618
3619 Using the MDLL-macros, we can now insert an element into the ALIST:
3620
3621 @example
3622 GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
3623 @end example
3624
3625
Passing "ALIST" as the first argument to MDLL specifies which of the
next/prev fields in the 'struct MyMultiListElement' should be used. The
extra "ALIST" argument and the "_ALIST" in the names of the
next/prev-members are the only differences between the MDLL and DLL-API.
Like the DLL-API, the MDLL-API offers functions for inserting (at head,
at tail, after a given element) and removing elements from the list.
Iterating over the list should be done by directly accessing the
"next_XX" and/or "prev_XX" members.
3634
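As a rough illustration of why the list name is passed as the first
argument, the following self-contained sketch shows how a macro can use
token pasting to select the matching "next_XX" and "prev_XX" members.
@code{SKETCH_MDLL_insert} and @code{sketch_count_ALIST} are hypothetical
names written for this example; the macro is a simplified stand-in and
not the actual GNUnet implementation. Only the struct layout and the
head/tail variables are taken from the text above.

@example
#include <stddef.h>

struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};

static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;

/* Hypothetical stand-in: insert 'element' at the head of the list
   named by 'mdll'.  The '##' operator turns next_ plus ALIST into
   the member name next_ALIST. */
#define SKETCH_MDLL_insert(mdll, head, tail, element) do @{ \
    (element)->next_ ## mdll = (head); \
    (element)->prev_ ## mdll = NULL; \
    if (NULL == (tail)) \
      (tail) = (element); \
    else \
      (head)->prev_ ## mdll = (element); \
    (head) = (element); \
  @} while (0)

/* Iterate over ALIST by following the next_ALIST members directly. */
static unsigned int
sketch_count_ALIST (void)
@{
  unsigned int n = 0;
  struct MyMultiListElement *pos;

  for (pos = head_ALIST; NULL != pos; pos = pos->next_ALIST)
    n++;
  return n;
@}
@end example

With this sketch, the same element can be linked into both lists by
invoking the macro once with "ALIST" and once with "BLIST", each time
passing the head and tail variables of the respective list.
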
3635 @cindex Automatic Restart Manager
3636 @cindex ARM
3637 @node Automatic Restart Manager (ARM)
3638 @section Automatic Restart Manager (ARM)
3639 @c %**end of header
3640
GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts the services affected
by those changes as needed. It is also responsible for restarting services
that crash, and is planned to incorporate automatic debugging that
diagnoses service crashes and gives developers insight into their causes.
The purpose of this document is to give GNUnet developers an idea of how
ARM works and how to interact with it.
3649
3650 @menu
3651 * Basic functionality::
3652 * Key configuration options::
3653 * ARM - Availability::
3654 * Reliability::
3655 @end menu
3656
3657 @c ***********************************************************************
3658 @node Basic functionality
3659 @subsection Basic functionality
3660 @c %**end of header
3661
3662 @itemize @bullet
@item ARM source code can be found under "src/arm". Service processes are
managed by the functions in "gnunet-service-arm.c", which is controlled
with "gnunet-arm.c" (the main function in that file is ARM's entry point).

@item The functions responsible for communicating with ARM and for
starting and stopping services (including the ARM service itself) are
provided by the ARM API in "arm_api.c". The function GNUNET_ARM_connect()
returns an ARM handle to the caller after setting it to the caller's
context (the configuration and scheduler in use). The caller can use this
handle afterwards to communicate with ARM. The functions
GNUNET_ARM_start_service() and GNUNET_ARM_stop_service() are used for
starting and stopping services, respectively.

@item A typical example of using these basic ARM services can be found in
the file test_arm_api.c. The test case connects to ARM, starts it, uses
it to start the "resolver" service, stops "resolver" and then stops ARM.
3679 @end itemize
3680
3681 @c ***********************************************************************
3682 @node Key configuration options
3683 @subsection Key configuration options
3684 @c %**end of header
3685
Configurations for ARM and services should be available in a .conf file
(as an example, see test_arm_api_data.conf). When running ARM, the
configuration file to use should be passed to the command:
3689
3690 @example
3691 $ gnunet-arm -s -c configuration_to_use.conf
3692 @end example
3693
If no configuration is passed, the default configuration file will be used
(see GNUNET_PREFIX/share/gnunet/defaults.conf which is created from
contrib/defaults.conf). Each service has a section that starts with the
service name in square brackets, for example: "[arm]".
The following options configure how ARM configures or interacts with the
various services:
3700
3701 @table @asis
3702
@item PORT
Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME
Specifies on which host the service is deployed. Note
that ARM can only start services that are running on the local system
(but will not check that the hostname matches the local machine name).
This option is used by the @code{gnunet_client_lib.h} implementation to
determine which system to connect to. The default is "localhost".

@item BINARY
The name of the service binary file.

@item OPTIONS
Options to be passed to the service.

@item PREFIX
A command to prepend to the actual command, for example to run
a service with "valgrind" or "gdb".

@item DEBUG
Run in debug mode (much verbosity).

@item START_ON_DEMAND
ARM will listen on the UNIX domain socket and/or TCP port of
the service and start the service on demand.

@item IMMEDIATE_START
ARM will always start this service when the peer
is started.

@item ACCEPT_FROM
IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6
IPv6 addresses the service accepts connections from.
3731
3732 @end table
3733
3734
3735 Options that impact the operation of ARM overall are in the "[arm]"
3736 section. ARM is a normal service and has (except for START_ON_DEMAND) all of the
3737 options that other services do. In addition, ARM has the
3738 following options:
3739
3740 @table @asis
3741
@item GLOBAL_PREFIX
Command to be prepended to all services that are
going to run.

@item GLOBAL_POSTFIX
Global option that will be supplied to all the
services that are going to run.

3747
3748 @end table
3749
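For illustration, a section using the per-service options above might
look like the following fragment. The service name "myservice", its
binary name, the port number and all values shown are made up for this
example and are not taken from the GNUnet defaults; the exact value
syntax (such as the trailing semicolons) is an assumption that should be
checked against the defaults.conf of your installation.

@example
[myservice]
PORT = 2100
HOSTNAME = localhost
BINARY = gnunet-service-myservice
START_ON_DEMAND = YES
IMMEDIATE_START = NO
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
PREFIX = valgrind
@end example
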
3750 @c ***********************************************************************
3751 @node ARM - Availability
3752 @subsection ARM - Availability
3753 @c %**end of header
3754
As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, the "client",
that wants to connect to another service, the "server". The "client" will
ask ARM to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" will establish a
connection with the "server", and then they will start to communicate.
One problem with that scheme is that it is slow!
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before its request is served. One solution to that
problem would be for ARM to start all services as default services. That
would solve the problem, yet it is not quite practical, because some of
the services started this way may never be used or may only be used
after a relatively long time.
The approach followed by ARM to solve this problem is as follows:
3770
3771 @itemize @bullet
3772
@item For each service having a PORT field in the configuration file and
that is not one of the default services (a service that accepts incoming
connections from clients), ARM creates listening sockets for all addresses
associated with that service.

@item The "client" will immediately establish a connection with
the "server".

@item ARM, pretending to be the "server", will listen on the
respective port and notice the incoming connection from the "client"
(but not accept it).

@item Once there is an incoming connection, ARM will start the "server",
passing on the listen sockets (now, the service is started and can do its
work).

@item Other client services can now connect directly to the
"server".

3791
3792 @end itemize
3793
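The central trick in the steps above is noticing a pending connection
without accepting it. The following POSIX sketch shows one way this can
be done; it is not ARM's code. The function names
@code{sketch_listen_loopback} and @code{sketch_connection_pending} are
hypothetical, and the sketch only covers a single IPv4 loopback socket.

@example
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listen socket on the loopback interface.  Port 0 lets
   the kernel pick a free port, which is stored in *port. */
static int
sketch_listen_loopback (unsigned short *port)
@{
  struct sockaddr_in addr;
  socklen_t len = sizeof (addr);
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (0);
  if ( (0 != bind (fd, (struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 5)) ||
       (0 != getsockname (fd, (struct sockaddr *) &addr, &len)) )
  @{
    close (fd);
    return -1;
  @}
  *port = ntohs (addr.sin_port);
  return fd;
@}

/* Return 1 if a client connection is pending on listen_fd, 0 on
   timeout and -1 on error.  accept() is deliberately not called, so
   the connection stays queued for whoever takes over the socket. */
static int
sketch_connection_pending (int listen_fd, int timeout_ms)
@{
  struct pollfd pfd;
  int ret;

  pfd.fd = listen_fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  ret = poll (&pfd, 1, timeout_ms);
  if (ret < 0)
    return -1;
  return ( (ret > 0) && (0 != (pfd.revents & POLLIN)) ) ? 1 : 0;
@}
@end example

In such a design, the process that detected the pending connection would
next start the real service and hand the listen socket over to it, for
example by letting the child process inherit the descriptor. How ARM
passes the sockets on is not shown here.
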
3794 @c ***********************************************************************
3795 @node Reliability
3796 @subsection Reliability
3797
One of the features provided by ARM is the automatic restart of crashed
services. ARM needs to know which of the running services died. The
function "gnunet-service-arm.c/maint_child_death()" is responsible for
that. The function is scheduled to run upon receiving a SIGCHLD signal.
It then iterates over ARM's list of running services and determines
which services have died (crashed). ARM restarts every service that
crashed.
Now consider the case of a service with a serious problem that causes it
to crash each time it is started by ARM. If ARM kept blindly restarting
such a service, we would get the pattern
start-crash-restart-crash-restart-crash and so forth, which is of course
not practical.
For that reason, ARM schedules the service to be restarted after waiting
for a delay that grows exponentially with each crash/restart of that
service. To clarify the idea, consider the following example:
3813
3814 @itemize @bullet
3815
@item Service S crashes.

@item ARM receives the SIGCHLD and inspects its list of services to find
the dead one(s).

@item ARM finds S dead and schedules it for restarting after its "backoff"
time, which is initially set to 1ms. ARM then doubles the backoff time
of S (now backoff(S) = 2ms).

@item Because there is a severe problem with S, it crashes again.

@item Again ARM receives the SIGCHLD and detects that it is S that has
crashed. ARM schedules it for restarting, this time after its new backoff
time (2ms), and doubles its backoff time again (now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour).
After reaching it, backoff(S) remains at half an hour,
so ARM will not spend much time trying to restart a
problematic service.
3836 @end itemize
3837
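The backoff rule from the example can be written down in a few lines.
The following is a minimal sketch and not the code from
"gnunet-service-arm.c": the struct, the function name and the constants
prefixed with SKETCH are hypothetical, and the values (1ms initial
backoff, half an hour as the threshold) are simply those quoted above.

@example
#include <stdint.h>

#define SKETCH_INITIAL_BACKOFF_MS 1ULL
#define SKETCH_BACKOFF_THRESHOLD_MS (30ULL * 60ULL * 1000ULL)

/* Per-service state; only the backoff is modelled here. */
struct SketchService @{
  uint64_t backoff_ms;
@};

/* Called when the service is found dead.  Returns the delay to wait
   before restarting it and doubles the stored backoff for the next
   crash, capping it at the threshold. */
static uint64_t
sketch_schedule_restart (struct SketchService *s)
@{
  uint64_t delay = s->backoff_ms;

  if (s->backoff_ms >= SKETCH_BACKOFF_THRESHOLD_MS / 2)
    s->backoff_ms = SKETCH_BACKOFF_THRESHOLD_MS;
  else
    s->backoff_ms *= 2;
  return delay;
@}
@end example

Starting from @code{SKETCH_INITIAL_BACKOFF_MS}, successive crashes yield
delays of 1ms, 2ms, 4ms and so on, until the delay stays at the
threshold.
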
3838 @cindex TRANSPORT Subsystem
3839 @node TRANSPORT Subsystem
3840 @section TRANSPORT Subsystem
3841 @c %**end of header
3842
This chapter documents how the GNUnet transport subsystem works. The
GNUnet transport subsystem consists of three main components: the
transport API (the interface used by the rest of the system to access the
transport service), the transport service itself (most of the interesting
functions, such as choosing transports, happen here) and the transport
plugins. A transport plugin is a concrete implementation for how two
GNUnet peers communicate; many plugins exist, for example for
communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
transport subsystem uses supporting code, especially the NAT/UPnP
library, to help with tasks such as NAT traversal.
3853
3854 Key tasks of the transport service include:
3855
3856 @itemize @bullet
3857
3858 @item Create our HELLO message, notify clients and neighbours if our HELLO
3859 changes (using NAT library as necessary)
3860
3861 @item Validate HELLOs from other peers (send PING), allow other peers to
3862 validate our HELLO's addresses (send PONG)
3863
3864 @item Upon request, establish connections to other peers (using address
3865 selection from ATS subsystem) and maintain them (again using PINGs and
3866 PONGs) as long as desired
3867
3868 @item Accept incoming connections, give ATS service the opportunity to
3869 switch communication channels
3870
3871 @item Notify clients about peers that have connected to us or that have
3872 been disconnected from us
3873
3874 @item If a (stateful) connection goes down unexpectedly (without explicit
3875 DISCONNECT), quickly attempt to recover (without notifying clients) but do
3876 notify clients quickly if reconnecting fails
3877
3878 @item Send (payload) messages arriving from clients to other peers via
3879 transport plugins and receive messages from other peers, forwarding
3880 those to clients
3881
3882 @item Enforce inbound traffic limits (using flow-control if it is
3883 applicable); outbound traffic limits are enforced by CORE, not by us (!)
3884
3885 @item Enforce restrictions on P2P connection as specified by the blacklist
3886 configuration and blacklisting clients
3887 @end itemize
3888
3889 Note that the term "clients" in the list above really refers to the
3890 GNUnet-CORE service, as CORE is typically the only client of the
3891 transport service.
3892
3893 @menu
3894 * Address validation protocol::
3895 @end menu
3896
3897 @node Address validation protocol
3898 @subsection Address validation protocol
3899 @c %**end of header
3900
3901 This section documents how the GNUnet transport service validates
3902 connections with other peers. It is a high-level description of the
3903 protocol necessary to understand the details of the implementation. It
3904 should be noted that when we talk about PING and PONG messages in this
3905 section, we refer to transport-level PING and PONG messages, which are
3906 different from core-level PING and PONG messages (both in implementation
3907 and function).
3908
The goal of transport-level address validation is to minimize the chances
of a successful man-in-the-middle attack against GNUnet peers on the
transport level. Such an attack would not allow the adversary to decrypt
the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities to
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a @code{HELLO} to
Alice that claims to be for Bob, but contains Mallory's IP address
instead of Bob's (for some transport).
Mallory would then forward the traffic to Bob (by initiating a
connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
without traversal support and hence has no address of her own (and thus
Alice must always initiate the connection to Bob).
3923
3924 An additional constraint is that @code{HELLO} messages do not contain a
3925 cryptographic signature since other peers must be able to edit
3926 (i.e. remove) addresses from the @code{HELLO} at any time (this was
3927 not true in GNUnet 0.8.x). A basic @strong{assumption} is that each peer
3928 knows the set of possible network addresses that it @strong{might}
3929 be reachable under (so for example, the external IP address of the
3930 NAT plus the LAN address(es) with the respective ports).
3931
The solution is the following. If Alice wants to validate that a given
address for Bob is valid (i.e. that a connection to it is actually
established @strong{directly} with the intended target), she sends a
PING message over that connection
to Bob. Note that in this case, Alice initiated the connection so only
Alice knows which address was used for sure (Alice may be behind NAT, so
whatever address Bob sees may not be an address Alice knows she has).
Bob checks that the address given in the @code{PING} is actually one
of Bob's addresses (i.e. does not belong to Mallory), and if it is,
sends back a @code{PONG} (with a signature that says that Bob
owns/uses the address from the @code{PING}).
Alice checks the signature and is happy if it is valid and the address
in the @code{PONG} is the address Alice used.
This is similar to the 0.8.x protocol where the @code{HELLO} contained a
signature from Bob for each address used by Bob.
Here, the purpose code for the signature is
@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
remember Bob's address and consider the address valid for a while (12h in
the current implementation). Note that after this exchange, Alice only
considers Bob's address to be valid; the connection itself is not
considered 'established'. In particular, Alice may have many addresses
for Bob that Alice considers valid.
3953
3954 The @code{PONG} message is protected with a nonce/challenge against replay
3955 attacks (@uref{http://en.wikipedia.org/wiki/Replay_attack, replay})
3956 and uses an expiration time for the signature (but those are almost
3957 implementation details).
3958
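The checks Alice performs on a @code{PONG} can be summarized in a small
sketch. Everything in it is hypothetical and chosen for illustration:
the struct layouts, the field names and the function
@code{sketch_pong_acceptable} do not correspond to the real wire format,
and the signature check itself is assumed to have been done elsewhere
and is passed in as a flag.

@example
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* What Alice remembers about the PING she sent (hypothetical layout). */
struct SketchPing @{
  uint32_t challenge;   /* nonce chosen by Alice */
  char address[64];     /* address of Bob that Alice used */
@};

/* What Alice extracts from Bob's PONG (hypothetical layout). */
struct SketchPong @{
  uint32_t challenge;   /* nonce echoed by Bob */
  char address[64];     /* address covered by Bob's signature */
  uint64_t expiration;  /* time at which the signature expires */
@};

/* Return true if Alice may consider the address validated. */
static bool
sketch_pong_acceptable (const struct SketchPing *ping,
                        const struct SketchPong *pong,
                        bool signature_valid,
                        uint64_t now)
@{
  if (! signature_valid)
    return false;                 /* not signed by Bob */
  if (pong->challenge != ping->challenge)
    return false;                 /* replayed or unrelated PONG */
  if (pong->expiration <= now)
    return false;                 /* signature has expired */
  /* the signed address must be the one Alice actually used */
  return 0 == strncmp (ping->address, pong->address,
                       sizeof (ping->address));
@}
@end example
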
3959 @cindex NAT library
3960 @node NAT library
3961 @section NAT library
3962 @c %**end of header
3963
3964 The goal of the GNUnet NAT library is to provide a general-purpose API for
3965 NAT traversal @strong{without} third-party support. So protocols that
3966 involve contacting a third peer to help establish a connection between
3967 two peers are outside of the scope of this API. That does not mean that
3968 GNUnet doesn't support involving a third peer (we can do this with the
3969 distance-vector transport or using application-level protocols), it just
3970 means that the NAT API is not concerned with this possibility. The API is
3971 written so that it will work for IPv6-NAT in the future as well as
3972 current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3973 that are not behind NAT --- in that case, the mapping provided is simply
3974 the identity.
3975
3976 NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3977 set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3978 library will return (via callback) a (possibly longer) list of addresses
3979 the peer @strong{might} be reachable under. Internally, depending on the
3980 configuration, the NAT library will try to punch a hole (using UPnP) or
3981 just "know" that the NAT was manually punched and generate the respective
3982 external IP address (the one that should be globally visible) based on
3983 the given information.
3984
3985 The NAT library also supports ICMP-based NAT traversal. Here, the other
3986 peer can request connection-reversal by this peer (in this special case,
3987 the peer is even allowed to configure a port number of zero). If the NAT
3988 library detects a connection-reversal request, it returns the respective
3989 target address to the client as well. It should be noted that
3990 connection-reversal is currently only intended for TCP, so other plugins
3991 @strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3992 NAT library also supports requesting connection reversal from a remote
3993 peer (@code{GNUNET_NAT_run_client}).
3994
3995 Once initialized, the NAT handle can be used to test if a given address is
3996 possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3997 This is used for validating our addresses when generating PONGs.
3998
Finally, the NAT library contains an API to test if our NAT configuration
is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client, initializes
the NAT traversal and then contacts a @code{gnunet-nat-server} (running by
default on @code{gnunet.org}) and asks for a connection to be established.
This way, it is easy to test if the current NAT configuration is valid.
4006
4007 @node Distance-Vector plugin
4008 @section Distance-Vector plugin
4009 @c %**end of header
4010
4011 The Distance Vector (DV) transport is a transport mechanism that allows
4012 peers to act as relays for each other, thereby connecting peers that would
4013 otherwise be unable to connect. This gives a larger connection set to
4014 applications that may work better with more peers to choose from (for
4015 example, File Sharing and/or DHT).
4016
4017 The Distance Vector transport essentially has two functions. The first is
4018 "gossiping" connection information about more distant peers to directly
4019 connected peers. The second is taking messages intended for non-directly
4020 connected peers and encapsulating them in a DV wrapper that contains the
4021 required information for routing the message through forwarding peers. Via
4022 gossiping, optimal routes through the known DV neighborhood are discovered
4023 and utilized and the message encapsulation provides some benefits in
4024 addition to simply getting the message from the correct source to the
4025 proper destination.
4026
The gossiping function of DV provides an up-to-date routing table of
peers that are available up to some number of hops. We call this a
fisheye view of the network (as for a fish, nearby objects are known while
more distant ones are unknown). Gossip messages are sent only to directly
connected peers, but they are sent about other known peers within the
"fisheye distance". Whenever two peers connect, they immediately gossip
to each other about their appropriate other neighbors. They also gossip
about the newly connected peer to previously
connected neighbors. In order to keep the routing tables up to date,
disconnect notifications are propagated as gossip as well (because
disconnects may not be sent/received, timeouts are also used to remove
stagnant routing table entries).
4039
4040 Routing of messages via DV is straightforward. When the DV transport is
4041 notified of a message destined for a non-direct neighbor, the appropriate
4042 forwarding peer is selected, and the base message is encapsulated in a DV
4043 message which contains information about the initial peer and the intended
4044 recipient. At each forwarding hop, the initial peer is validated (the
4045 forwarding peer ensures that it has the initial peer in its neighborhood,
4046 otherwise the message is dropped). Next the base message is
4047 re-encapsulated in a new DV message for the next hop in the forwarding
4048 chain (or delivered to the current peer, if it has arrived at the
4049 destination).
4050
4051 Assume a three peer network with peers Alice, Bob and Carol. Assume that
4052
4053 @example
4054 Alice <-> Bob and Bob <-> Carol
4055 @end example
4056
4057 @noindent
are direct (e.g. over TCP or UDP transports) connections, but that
Alice cannot directly connect to Carol.
This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob for Carol. The message is encapsulated setting Alice as
the initiator and Carol as the destination and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol.
The DV transport at Carol receives this message, unwraps the original
message, and delivers it to Carol as though it came directly from Alice.
4074
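The per-hop decision in the Alice, Bob and Carol example can be sketched
as a routing table lookup. This is an illustrative model only: peers are
represented by a small enum instead of peer identities, and the names
@code{SketchRoute}, @code{sketch_next_hop} and @code{sketch_dv_forward}
are hypothetical and do not appear in the DV plugin.

@example
#include <stddef.h>

enum SketchPeer @{
  SKETCH_NONE = -1,
  SKETCH_ALICE,
  SKETCH_BOB,
  SKETCH_CAROL
@};

/* One routing table entry: reach 'target' by sending to 'next_hop'. */
struct SketchRoute @{
  enum SketchPeer target;
  enum SketchPeer next_hop;
@};

/* Look up the next hop for 'target'; SKETCH_NONE if it is unknown. */
static enum SketchPeer
sketch_next_hop (const struct SketchRoute *table, size_t n,
                 enum SketchPeer target)
@{
  size_t i;

  for (i = 0; i < n; i++)
    if (table[i].target == target)
      return table[i].next_hop;
  return SKETCH_NONE;
@}

/* Decide what peer 'self' does with a received DV message.  Returns
   'self' for local delivery, the next hop when the message must be
   re-wrapped and forwarded, or SKETCH_NONE when it is dropped. */
static enum SketchPeer
sketch_dv_forward (const struct SketchRoute *table, size_t n,
                   enum SketchPeer self,
                   enum SketchPeer initiator,
                   enum SketchPeer destination)
@{
  /* the initial peer must be in our neighborhood, else drop */
  if (SKETCH_NONE == sketch_next_hop (table, n, initiator))
    return SKETCH_NONE;
  if (destination == self)
    return self;
  return sketch_next_hop (table, n, destination);
@}
@end example

With Bob's table listing Alice and Carol as direct neighbors, a message
from Alice for Carol is forwarded to Carol; with Carol's table listing
Alice as reachable via Bob, the same message is delivered locally at
Carol.
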
4075 @cindex SMTP plugin
4076 @node SMTP plugin
4077 @section SMTP plugin
4078 @c %**end of header
4079 @c TODO: Update!
4080
This section describes the new SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branches. SMTP support is currently not
available in GNUnet 0.9.x. This page also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
Readers in the year 2019 will notice from the mention of
versions 0.7, 0.8 and 0.9 that this section has to be taken with the
usual grain of salt and needs to be updated eventually.
4090
4091 @itemize @bullet
@item Why use SMTP for a peer-to-peer transport?
@item How does it work?
@item How do I configure my peer?
@item How do I test if it works?
@item How fast is it?
@item Is there any additional documentation?
4098 @end itemize
4099
4100
4101 @menu
4102 * Why use SMTP for a peer-to-peer transport?::
4103 * How does it work?::
4104 * How do I configure my peer?::
4105 * How do I test if it works?::
4106 * How fast is it?::
4107 @end menu
4108
4109 @node Why use SMTP for a peer-to-peer transport?
4110 @subsection Why use SMTP for a peer-to-peer transport?
4111 @c %**end of header
4112
4113 There are many reasons why one would not want to use SMTP:
4114
4115 @itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to
non-participating third parties.
4123 @end itemize
4124
4125 So why would anybody want to use SMTP?
4126 @itemize @bullet
@item SMTP can be used to contact peers behind NAT boxes (in virtual
private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
@item SMTP uses E-mail addresses which are independent of a specific IP,
which can be useful to address peers that use dynamic IP addresses.
@item SMTP can be used to initiate a connection (e.g. initial address
exchange) and peers can then negotiate the use of a more efficient
protocol (e.g. TCP) for the actual communication.
4136 @end itemize
4137
4138 In summary, SMTP can for example be used to send a message to a peer
4139 behind a NAT box that has a dynamic IP to tell the peer to establish a
4140 TCP connection to a peer outside of the private network. Even an
4141 extraordinary overhead for this first message would be irrelevant in this
4142 type of situation.
4143
4144 @node How does it work?
4145 @subsection How does it work?
4146 @c %**end of header
4147
When a GNUnet peer needs to send a message to another GNUnet peer that has
advertised (only) an SMTP transport address, GNUnet base64-encodes the
message and sends it in an E-mail to the advertised address. The
advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward them
to the GNUnet peer process. The filter can be specified individually by
each peer and be changed over time. This makes it impossible to censor
GNUnet E-mail messages by searching for a generic filter.
4156
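To make the encoding step concrete, the following sketch base64-encodes
a message and places it behind a filter header line. It is an
illustration under stated assumptions and not the plugin's code: the
function names are hypothetical, the mail layout (one header line, an
empty line, then the payload) is an assumption made for this example,
and real mail would need further headers.

@example
#include <stddef.h>
#include <stdio.h>

static const char sketch_b64[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard base64 with '=' padding.  'out' must have room for
   4 * ((len + 2) / 3) + 1 bytes. */
static void
sketch_base64_encode (const unsigned char *in, size_t len, char *out)
@{
  size_t i;
  size_t o = 0;

  for (i = 0; i + 2 < len; i += 3)
  @{
    out[o++] = sketch_b64[in[i] >> 2];
    out[o++] = sketch_b64[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
    out[o++] = sketch_b64[((in[i + 1] & 0x0f) << 2) | (in[i + 2] >> 6)];
    out[o++] = sketch_b64[in[i + 2] & 0x3f];
  @}
  if (i < len)
  @{
    /* one or two input bytes are left over */
    out[o++] = sketch_b64[in[i] >> 2];
    if (i + 1 < len)
    @{
      out[o++] = sketch_b64[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
      out[o++] = sketch_b64[(in[i + 1] & 0x0f) << 2];
    @}
    else
    @{
      out[o++] = sketch_b64[(in[i] & 0x03) << 4];
      out[o++] = '=';
    @}
    out[o++] = '=';
  @}
  out[o] = '\0';
@}

/* Write the filter line, an empty line and the encoded message into
   'buf'.  Returns the snprintf result, or -1 if the message is too
   large for the fixed-size payload buffer of this sketch. */
static int
sketch_format_mail (char *buf, size_t size, const char *filter,
                    const unsigned char *msg, size_t len)
@{
  char payload[256];

  if (4 * ((len + 2) / 3) + 1 > sizeof (payload))
    return -1;
  sketch_base64_encode (msg, len, payload);
  return snprintf (buf, size, "%s\r\n\r\n%s\r\n", filter, payload);
@}
@end example
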
4157 @node How do I configure my peer?
4158 @subsection How do I configure my peer?
4159 @c %**end of header
4160
4161 First, you need to configure @code{procmail} to filter your inbound E-mail
4162 for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
4163 example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
4164 used by @command{procmail} to detect GNUnet messages. You are free to
4165 choose whichever filter you like, but you should make sure that it does
4166 not occur in your other E-mail. In our example, we will use
4167 @code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
4168 looks like this:
4169
4170 @example
4171 :0:
4172 * ^X-mailer: GNUnet
4173 /tmp/gnunet.smtp
4174 # where do you want your other e-mail delivered to
4175 # (default: /var/spool/mail/)
4176 :0: /var/spool/mail/
4177 @end example
4178
4179 After adding this file, first make sure that your regular E-mail still
4180 works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
4181 configuration. In the section @code{SMTP} you need to specify your E-mail
4182 address under @code{EMAIL}, your mail server (for outgoing mail) under
4183 @code{SERVER}, the filter (X-mailer: GNUnet in the example) under
4184 @code{FILTER} and the name of the pipe under @code{PIPE}.@ The completed
4185 section could then look like this:
4186
4187 @example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
4190 @end example
4191
4192 Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
4193 the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
4194 you specified to contact your peer until the advertisement times out.
4195 Thus, if you are not sure if everything works properly or if you are not
4196 planning to be online for a long time, you may want to configure this
4197 timeout to be short, e.g. just one hour. For this, set
4198 @code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
4199
This should be it, but you will probably want to test it first.
4201
4202 @node How do I test if it works?
4203 @subsection How do I test if it works?
4204 @c %**end of header
4205
Any transport can be subjected to some rudimentary tests using the
@code{gnunet-transport-check} tool. The tool sends a message to the local
node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, this is a great
testcase for the SMTP transport since it tests nearly all of
the functionality.
4213
@code{gnunet-transport-check} should only be used while @code{gnunetd} is
not running. By default, @code{gnunet-transport-check} tests all
transports that are specified in the configuration file. To test only
SMTP, pass the option @code{--transport=smtp}.
4219
4220 Note that this test always checks if a transport can receive and send.
4221 While you can configure most transports to only receive or only send
4222 messages, this test will only work if you have configured the transport
4223 to send and receive messages.
4224
4225 @node How fast is it?
4226 @subsection How fast is it?
4227 @c %**end of header
4228
We have measured the performance of the UDP, TCP and SMTP transport layers
both directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
overhead of the protocol, whereas evaluating the transport from the
application puts the overhead into perspective from a practical point of
view.
4235
The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration, which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering, whereas the sendmail benchmarks rely on
procmail to filter and deliver the mail.

We used the transport layer to send a message of b bytes (excluding
transport protocol headers) directly to the local machine. This way,
network latency and packet loss on the wire have no impact on the
timings. n messages were sent sequentially over the transport layer,
sending message i+1 after the i-th message was received. All messages
were sent over the same connection. The time to establish the connection
was not taken into account, since this overhead is minuscule in practice
as long as a connection is used for a significant number of messages.
4253
@multitable @columnfractions .20 .15 .15 .15 .15 .15
@headitem Message size @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
@tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
@end multitable
4261
The benchmarks show that UDP and TCP are, as expected, both significantly
faster than any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages, since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received within
the given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost. The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 Mbit Ethernet
connection.

Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be amortized by increasing the MTU: for example, an
MTU of 12,000 octets improves the throughput to 13 kbps, as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
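
As a rough cross-check of the throughput figures quoted above, the
following Python sketch converts a throughput and a message size into the
average time spent per message. It assumes that 1 kbps means 1,000 bits
per second and that the throughput counts only the message octets;
neither assumption is stated with the measurements, and the helper name
@code{seconds_per_message} is made up for this illustration.

@example
# Sketch: average seconds per message implied by a throughput figure.
# Assumption: 1 kbps = 1000 bits per second, message octets only.

def seconds_per_message(size_octets, throughput_kbps):
    bits = size_octets * 8
    return bits / (throughput_kbps * 1000.0)

# Figures quoted in the text above.
smtp_small = seconds_per_message(1200, 6)     # SMTP, 1,200 octets
smtp_large = seconds_per_message(12000, 13)   # SMTP, 12,000 octets
udp = seconds_per_message(1200, 2343)         # UDP, 1,200 octets
tcp = seconds_per_message(1200, 3310)         # TCP, 1,200 octets

print("SMTP  1200 octets: %.2f s per message" % smtp_small)
print("SMTP 12000 octets: %.2f s per message" % smtp_large)
print("UDP   1200 octets: %.2f ms per message" % (udp * 1000))
print("TCP   1200 octets: %.2f ms per message" % (tcp * 1000))
@end example

Under these assumptions, a message that is ten times larger takes only
about 4.6 times as long over SMTP (roughly 7.4 s instead of 1.6 s), which
is consistent with the larger MTU roughly doubling the throughput.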
4290
4291 @cindex Bluetooth plugin
4292 @node Bluetooth plugin
4293 @section Bluetooth plugin
4294 @c %**end of header
4295
This section describes the new Bluetooth transport plugin for GNUnet. The
plugin is still in the testing stage, so do not expect it to work
perfectly. If you have any questions or problems, ask on the IRC channel.
4300
4301 @itemize @bullet
4302 @item What do I need to use the Bluetooth plugin transport?
@item How does it work?
4304 @item What possible errors should I be aware of?
4305 @item How do I configure my peer?
4306 @item How can I test it?
4307 @end itemize
4308
4309 @menu
4310 * What do I need to use the Bluetooth plugin transport?::
4311 * How does it work2?::
4312 * What possible errors should I be aware of?::
4313 * How do I configure my peer2?::
4314 * How can I test it?::
4315 * The implementation of the Bluetooth transport plugin::
4316 @end menu
4317
4318 @node What do I need to use the Bluetooth plugin transport?
4319 @subsection What do I need to use the Bluetooth plugin transport?
4320 @c %**end of header
4321
If you are a GNU/Linux user and you want to use the Bluetooth
transport plugin, you should install the BlueZ development libraries
(if they aren't already installed).
For instructions on how to install the libraries, see the BlueZ site
(@uref{http://www.bluez.org/, http://www.bluez.org}). If you are not sure
whether you have the necessary libraries, run the GNUnet configure
script: it prints a notification at the end that warns you if they are
missing.
4332
If you are a Windows user, you should have
@emph{MinGW}/@emph{MSys2} installed with the latest updates (especially
the @emph{ws2bth} header). If this is your first build of GNUnet on
Windows, you should check out the SBuild repository. It
semi-automatically assembles a @emph{MinGW}/@emph{MSys2} installation
with many extra packages that are needed for the GNUnet build, which
will ease your work.

Finally, make sure that you have the correct drivers for your Bluetooth
device installed and that your device is on and in discoverable mode.
The Windows Bluetooth stack supports only the RFCOMM protocol, so we
cannot turn on your device programmatically.
4343
4344 @c FIXME: Change to unique title
4345 @node How does it work2?
4346 @subsection How does it work2?
4347 @c %**end of header
4348
4349 The Bluetooth transport plugin uses virtually the same code as the WLAN
4350 plugin and only the helper binary is different. The helper takes a single
4351 argument, which represents the interface name and is specified in the
4352 configuration file. Here are the basic steps that are followed by the
4353 helper binary used on GNU/Linux:
4354
4355 @itemize @bullet
4356 @item it verifies if the name corresponds to a Bluetooth interface name
4357 @item it verifies if the interface is up (if it is not, it tries to bring
4358 it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and to accept incoming connection requests.
@emph{The above operations require root access, so you should start the
transport plugin with root privileges.}
@item it finds an available port number, registers an SDP service (which
other devices use to find out which port the server is listening on) and
switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally, it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
4369 @end itemize
4370
From time to time, the device makes an inquiry scan to discover
nearby devices and sends them HELLO messages at random for peer
discovery.
4374
4375 @node What possible errors should I be aware of?
4376 @subsection What possible errors should I be aware of?
4377 @c %**end of header
4378
@emph{This section applies to GNU/Linux users.}

Many things can go wrong. This section presents some tools that you can
use for debugging, along with some common scenarios.
4383
4384 @itemize @bullet
4385
@item @code{bluetoothd -n -d}: use this command to enable logging in the
foreground and to print the log messages.

@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments, it prints information about the
state of the interfaces. If you receive an error that the device
could not be brought up, try to bring it up manually and see whether
that works (use @code{hciconfig -a hciX up}). If it does not and the
Bluetooth address has the form 00:00:00:00:00:00, something is
wrong with the D-Bus daemon or with the Bluetooth daemon. Use the
@code{bluetoothd} tool to see the logs.

@item @code{sdptool}: can be used to control and interrogate SDP servers.
If you encounter problems regarding the SDP server (for example, the SDP
server is down), check whether the D-Bus daemon is running correctly and
whether the Bluetooth daemon started correctly (use the
@code{bluetoothd} tool). Sometimes the SDP service works but the
device could not register its service. Use
@code{sdptool browse [dev-address]} to see if the service is registered.
There should be a service with the name of the interface and GNUnet as
provider.

@item @code{hcitool}: another useful tool, which can be used to configure
the device and to send particular commands to it.

@item @code{hcidump}: can be used for low-level debugging.
4411 @end itemize
4412
4413 @c FIXME: A more unique name
4414 @node How do I configure my peer2?
4415 @subsection How do I configure my peer2?
4416 @c %**end of header
4417
4418 On GNU/Linux, you just have to be sure that the interface name
4419 corresponds to the one that you want to use.
4420 Use the @code{hciconfig} tool to check that.
4421 By default it is set to hci0 but you can change it.
4422
4423 A basic configuration looks like this:
4424
4425 @example
4426 [transport-bluetooth]
4427 # Name of the interface (typically hciX)
4428 INTERFACE = hci0
4429 # Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
4431 @end example
4432
4433 In order to use the Bluetooth transport plugin when the transport service
4434 is started, you must add the plugin name to the default transport service
4435 plugins list. For example:
4436
4437 @example
[transport]
...
PLUGINS = dns bluetooth
...
4439 @end example
4440
If you want to use only the Bluetooth plugin, set
@code{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use. The only thing that
you need to do is to add @emph{bluetooth} to the plugins list of the
transport service.
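
For GNU/Linux setups, the two configuration fragments in this section can
be sanity-checked with a short Python sketch. It uses Python's standard
@code{configparser} module as an approximation of the configuration
syntax; this is not GNUnet's own parser, and the helper name
@code{bluetooth_settings} as well as the assumption that @code{PLUGINS}
is a whitespace-separated list are made up for this illustration.

@example
import configparser

# The fragments from this section, joined into one illustrative file.
CONFIG_TEXT = """
[transport-bluetooth]
# Name of the interface (typically hciX)
INTERFACE = hci0
# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;

[transport]
PLUGINS = dns bluetooth
"""

def bluetooth_settings(text):
    """Return (plugin enabled?, interface name) for a configuration."""
    parser = configparser.ConfigParser()
    parser.optionxform = str          # keep option names as written
    parser.read_string(text)
    plugins = parser.get("transport", "PLUGINS").split()
    # hci0 is the default interface named in the text above
    interface = parser.get("transport-bluetooth", "INTERFACE",
                           fallback="hci0")
    return ("bluetooth" in plugins, interface)

enabled, interface = bluetooth_settings(CONFIG_TEXT)
print("bluetooth enabled:", enabled, "interface:", interface)
@end example

If the first value is false, the transport service will not load the
Bluetooth plugin, which is the most common reason for the helper not
being started.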
4447
4448 @node How can I test it?
4449 @subsection How can I test it?
4450 @c %**end of header
4451
4452 If you have two Bluetooth devices on the same machine and you are using
4453 GNU/Linux you must:
4454
4455 @itemize @bullet
4456
@item create two different configuration files, one which uses the
first interface (@emph{hci0}) and another which uses the second
interface (@emph{hci1}). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @code{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.

@item run @code{gnunet-arm -c peerX.conf -s -i=transport} in order to
start the transport service. (If the Bluetooth transport service does not
start, make sure that "bluetooth" is in the transport plugins list.)

@item run @code{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
ID. If you already know your peer ID (you saved it from the first
command), this can be skipped.

@item run @code{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
sending data for benchmarking to the other peer.
4475
4476 @end itemize
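
The steps above can be collected into a small Python dry run that only
builds the command lines and prints them, without executing anything.
The commands and options are copied from the list above; the function
name @code{bluetooth_test_commands} is made up for this illustration, and
@code{PEER1_ID} remains a placeholder for the ID printed by
@code{gnunet-peerinfo}.

@example
# Dry run: build the command lines from the steps above without
# running them.

def bluetooth_test_commands(peer1_id="PEER1_ID"):
    commands = []
    for conf in ("peer1.conf", "peer2.conf"):
        # generate the private key of the peer
        commands.append(["gnunet-peerinfo", "-c", conf, "-s"])
    for conf in ("peer1.conf", "peer2.conf"):
        # start the transport service of the peer
        commands.append(["gnunet-arm", "-c", conf, "-s", "-i=transport"])
    # connect the second peer to the first one and send benchmark data
    commands.append(["gnunet-transport", "-c", "peer2.conf",
                     "-p=" + peer1_id, "-s"])
    return commands

for argv in bluetooth_test_commands():
    print(" ".join(argv))
@end example

Each entry is an argument list, so on a machine with the GNUnet tools
installed it could be passed to @code{subprocess.run} from the Python
standard library.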
4477
4478
4479 This scenario will try to connect the second peer to the first one and
4480 then start sending data for benchmarking.
4481
On Windows, you cannot test the plugin functionality using two Bluetooth
devices on the same machine, because after you install the drivers,
conflicts occur between the Bluetooth stacks. (At least that is
what happened on my machine: I wasn't able to use the BlueSoleil stack and
the WIDCOMM one at the same time.)

If you have two different machines and your configuration files are
correct, you can use the same scenario presented at the beginning of this
section.
4490
4491 Another way to test the plugin functionality is to create your own
4492 application which will use the GNUnet framework with the Bluetooth
4493 transport service.
4494
4495 @node The implementation of the Bluetooth transport plugin
4496 @subsection The implementation of the Bluetooth transport plugin
4497 @c %**end of header
4498
This section describes the implementation of the Bluetooth transport
plugin.

As noted earlier, the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. The purpose of the helper binary of the Bluetooth
transport plugin is also the same as that of the one used for the WLAN
transport plugin: it accesses the interface and then forwards traffic in
both directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both GNU/Linux and Windows
platforms.
4511
4512 @itemize @bullet
4513 @item Linux functionality
4514 @item Windows functionality
4515 @item Pending Features
4516 @end itemize
4517
4518
4519
4520 @menu
4521 * Linux functionality::
4522 * THE INITIALIZATION::
4523 * THE LOOP::
4524 * Details about the broadcast implementation::
4525 * Windows functionality::
4526 * Pending features::
4527 @end menu
4528
4529 @node Linux functionality
4530 @subsubsection Linux functionality
4531 @c %**end of header
4532
In order to implement the plugin functionality on GNU/Linux I
used the BlueZ stack.
For the communication with the other devices I used the RFCOMM
protocol. I also used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and operates in two stages, the initialization
and the loop:
4539
4540 @c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
4541 @c %** starting a new section?
4542 @node THE INITIALIZATION
4543 @subsubsection THE INITIALIZATION
4544
4545 @itemize @bullet
4546 @item first, it checks if we have root privileges
4547 (@emph{Remember that we need to have root privileges in order to be able
4548 to bring the interface up if it is down or to change its state.}).
4549
4550 @item second, it verifies if the interface with the given name exists.
4551
4552 @strong{If the interface with that name exists and it is a Bluetooth
4553 interface:}
4554
@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method

The @emph{open_device} method:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
@item searches for the device ID using the interface name
@item saves the device MAC address
@item checks if the interface is down and tries to bring it up
@item checks whether the interface is in discoverable mode and, if not,
tries to make it discoverable
@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
@end itemize
4571
4572 @item drops the root privileges
4573
4574 @strong{If the interface is not a Bluetooth interface the helper exits
4575 with a suitable error}
4576 @end itemize
4577
4578 @c %** Same as for @node entry above
4579 @node THE LOOP
4580 @subsubsection THE LOOP
4581
The helper binary uses a list where it saves all the connected neighbour
devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:

@itemize @bullet
@item Every time it receives something from STDIN, it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there is
no connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Every time it receives something on the listening socket, it
accepts the connection and saves the socket in a list of reading
sockets.
@item Every time it receives something from a reading
socket, it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
@end itemize
4602
In the main loop we use the @code{select} function to wait until one of
the file descriptors saved in one of the two file descriptor sets is
ready to use. The first set (@emph{rfds}) is the reading set and
can contain the reading sockets, the STDIN file
descriptor and the listening socket. The second set (@emph{wfds}) is the
writing set and can contain the sending socket and the STDOUT file
descriptor. After the @code{select} function returns, we check which file
descriptor is ready and handle that kind of event. @emph{For example:}
if it is the listening socket, we accept a new connection and save the
socket in the reading list; if it is the STDOUT file descriptor, we
write the message from the @emph{write_std} buffer to STDOUT.
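
The reading half of such a loop can be sketched in Python with the
standard @code{select} and @code{socket} modules. This is only an
illustration of the multiplexing pattern, not the helper's code: TCP on
the loopback interface stands in for RFCOMM, the writing set is left
out, and the function name @code{run_once} is made up.

@example
import select
import socket

def run_once(listener, readers, timeout=1.0):
    """One pass of a select-based loop: accept new connections on the
    listening socket and read pending data from connected sockets."""
    received = []
    rfds = [listener] + readers           # the reading set
    ready, _, _ = select.select(rfds, [], [], timeout)
    for sock in ready:
        if sock is listener:
            conn, _addr = listener.accept()
            readers.append(conn)          # remember the new reading socket
        else:
            data = sock.recv(4096)
            if data:
                received.append(data)     # would go to the output buffer
            else:
                readers.remove(sock)      # the other side closed
                sock.close()
    return received

# Demo on the loopback interface; TCP stands in for RFCOMM here.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
readers = []

client = socket.create_connection(listener.getsockname())
run_once(listener, readers)               # first pass accepts the client
client.sendall(b"HELLO")
messages = run_once(listener, readers)    # second pass reads the message
print(messages)

client.close()
for sock in readers:
    sock.close()
listener.close()
@end example

A complete loop would call @code{run_once} repeatedly and also pass a
writing set to @code{select}, so that buffered messages are only written
when the destination is ready.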
4615
To find out which port a device is listening on, we connect to the local
SDP server and search for the service registered for that device.

@emph{You should be aware that if the device fails to connect
to another one when trying to send a message, it will attempt one more
time. If it fails again, it skips the message.}
@emph{You should also know that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}
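
The retry rule described above (one additional connection attempt, then
skip the message) can be sketched as follows. The function name
@code{send_with_one_retry} and the stand-in @code{connect} and
@code{send} callables are made up for this illustration; the real helper
works on RFCOMM sockets.

@example
def send_with_one_retry(connect, send, address, message):
    """Try to connect at most twice; return False if the message
    is skipped."""
    for _attempt in range(2):
        try:
            sock = connect(address)
        except OSError:
            continue            # first failure: attempt one more time
        send(sock, message)
        return True
    return False                # second failure: skip the message

# Demo with stand-in callables: the first connection attempt fails.
attempts = []
delivered = []

def flaky_connect(address):
    attempts.append(address)
    if len(attempts) == 1:
        raise OSError("connection refused")
    return "socket-to-" + address

def record_send(sock, message):
    delivered.append((sock, message))

ok = send_with_one_retry(flaky_connect, record_send,
                         "00:11:22:AA:BB:CC", b"data")
print(ok, len(attempts), delivered)
@end example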
4624
4625 @node Details about the broadcast implementation
4626 @subsubsection Details about the broadcast implementation
4627 @c %**end of header
4628
First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since an inquiry scan
takes a long time, sending a message to all the discoverable devices
would be slow, so I decided to tackle the problem in a different way.
Here is how I did it:
4634
4635 @itemize @bullet
@item The first time I have to broadcast a message, I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends, I take the first address from the list
and try to connect to it. If that fails, I try to connect to the next one.
If it succeeds, I save the socket to a list and send the message to the
device.
@item When I have to broadcast another message, I first search the list
for a new device which I'm not connected to. If there is no new device on
the list, I go to the beginning of the list and send the message to the
old devices. After 5 cycles I make a new inquiry scan to check whether
there are new discoverable devices and save them to the list. If there
are no new discoverable devices, I reset the cycling counter and go again
through the old list and send messages to the devices saved in it.
4649 @end itemize
4650
4651 @strong{Therefore}:
4652
4653 @itemize @bullet
@item every time I have a broadcast message, I look in the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I'm connected to
all the devices in the list, I make a new inquiry scan.
@emph{The number of list cycles after an inquiry scan can be
increased by redefining the MAX_LOOPS variable}
@item when there are no new devices, I send messages to the old ones.
4661 @end itemize
4662
This way, the broadcast control messages reach the devices, but with
some delay.

@emph{NOTICE:} When I have to send a message to a certain device, I first
check the broadcast list to see if we are connected to that device. If
not, we try to connect to it and, in case of success, save the address and
the socket in the list. If we are already connected to that device, we
simply use the socket.
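
The broadcast strategy described in this section can be modelled with a
short Python sketch. It is a simplified interpretation of the prose, not
the helper's code: each broadcast goes to the next device in the list, a
pass is counted whenever the end of the list is reached, and a new
inquiry scan is made after @code{MAX_LOOPS} passes. The class name
@code{Broadcaster} and the three callables passed to it are made up for
this illustration.

@example
MAX_LOOPS = 5   # passes over the list before a new inquiry scan


class Broadcaster:
    def __init__(self, inquiry_scan, connect, send):
        self.inquiry_scan = inquiry_scan  # returns a list of addresses
        self.connect = connect            # returns a socket or raises OSError
        self.send = send                  # send(sock, message)
        self.devices = []                 # addresses discovered so far
        self.sockets = dict()             # address -> connected socket
        self.position = 0                 # index of the next device to try
        self.loops = 0                    # completed passes over the list
        self.scans = 0                    # number of inquiry scans made

    def scan(self):
        for address in self.inquiry_scan():
            if address not in self.devices:
                self.devices.append(address)
        self.scans += 1
        self.loops = 0

    def broadcast(self, message):
        """Send message to the next reachable device; return its address."""
        if self.scans == 0 or self.loops >= MAX_LOOPS:
            self.scan()
        for _ in range(len(self.devices)):
            if self.position >= len(self.devices):
                self.position = 0         # end of list: start a new pass
                self.loops += 1
            address = self.devices[self.position]
            self.position += 1
            sock = self.sockets.get(address)
            if sock is None:
                try:
                    sock = self.connect(address)
                except OSError:
                    continue              # failed: try the next address
                self.sockets[address] = sock
            self.send(sock, message)
            return address
        return None                       # no device could be reached


# Demo with stand-in callables and two made-up device names.
sent = []
caster = Broadcaster(
    inquiry_scan=lambda: ["dev-a", "dev-b"],
    connect=lambda address: "socket:" + address,
    send=lambda sock, message: sent.append(sock),
)
targets = [caster.broadcast(b"HELLO") for _ in range(12)]
print(targets)
print("inquiry scans:", caster.scans)
@end example

In this model every broadcast reaches exactly one device, which is why
the control messages arrive at all devices only with some delay.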
4671
4672 @node Windows functionality
4673 @subsubsection Windows functionality
4674 @c %**end of header
4675
For Windows I decided to use the Microsoft Bluetooth stack, which has the
advantage of being included as standard since Windows XP SP2. The main
disadvantage is that it only supports the RFCOMM protocol, so we do not
have low-level control over the Bluetooth device. Therefore it is the
user's responsibility to check that the device is up and in discoverable
mode. Also, there are no tools which could be used for debugging in order
to read the data coming from and going to a Bluetooth device, which
obviously hindered my work. Another thing that slowed down the
implementation of the plugin (besides my not being very familiar with
the win32 API) was that there were some Bluetooth-related bugs in MinGW.
They have since been fixed, but keep in mind that you should have the
latest updates (especially the @emph{ws2bth} header).
4688
4689 Besides the fact that it uses the Windows Sockets, the Windows
4690 implementation follows the same principles as the GNU/Linux one:
4691
4692 @itemize @bullet
@item It has an initialization part where it initializes
Windows Sockets, creates an RFCOMM socket which is bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with SDP:
4697 @itemize @bullet
4698 @item an easy way which works with very simple service records
4699 @item a hard way which is useful when you need to update or to delete the
4700 record
4701 @end itemize
4702 @end itemize
4703
Since I only needed the SDP service to find out which port the device
is listening on, and that does not change, I decided to use the easy way.
In order to register the service I used the @emph{WSASetService} function,
and I generated the @emph{Universally Unique Identifier} with the
Windows tool @emph{guidgen.exe}.
4709
4710 In the loop section the only difference from the GNU/Linux implementation
4711 is that I used the @code{GNUNET_NETWORK} library for
4712 functions like @emph{accept}, @emph{bind}, @emph{connect} or
4713 @emph{select}. I decided to use the
4714 @code{GNUNET_NETWORK} library because I also needed to interact
4715 with the STDIN and STDOUT handles and on Windows
4716 the select function is only defined for sockets,
4717 and it will not work for arbitrary file handles.
4718
Another difference between the GNU/Linux and Windows implementations is
that on GNU/Linux the Bluetooth address is represented in 48 bits,
while on Windows it is represented in 64 bits.
Therefore I had to make some changes to the @emph{plugin_transport_wlan}
header.
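
The size difference can be illustrated with a few lines of Python. The
sketch only shows the two widths; the byte order used here (most
significant octet first) is chosen for readability and is an assumption,
not a statement about the in-memory layout on either platform, and the
function name @code{parse_bluetooth_address} is made up.

@example
def parse_bluetooth_address(text):
    """Return a textual Bluetooth address as a 48-bit integer."""
    octets = text.split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    value = 0
    for octet in octets:
        value = (value << 8) | int(octet, 16)
    return value

address = parse_bluetooth_address("00:11:22:AA:BB:CC")
six_octets = address.to_bytes(6, "big")     # 48-bit representation
eight_octets = address.to_bytes(8, "big")   # 64-bit representation
print(six_octets.hex(), eight_octets.hex())
@end example

The same address therefore needs two more octets in any structure that
is shared between the two implementations.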
4723
4724 Also, currently on Windows the Bluetooth plugin doesn't have support for
4725 broadcast messages. When it receives a broadcast message it will skip it.
4726
4727 @node Pending features
4728 @subsubsection Pending features
4729 @c %**end of header
4730
4731 @itemize @bullet
4732 @item Implement the broadcast functionality on Windows @emph{(currently
4733 working on)}
4734 @item Implement a testcase for the helper :@ @emph{The testcase
4735 consists of a program which emulates the plugin and uses the helper. It
4736 will simulate connections, disconnections and data transfers.}
4737 @end itemize
4738
If you have an idea for a new feature of the plugin or suggestions about
how I could improve the implementation, you are welcome to contact me.
4742
4743 @node WLAN plugin
4744 @section WLAN plugin
4745 @c %**end of header
4746
4747 This section documents how the wlan transport plugin works. Parts which
4748 are not implemented yet or could be better implemented are described at
4749 the end.
4750
4751 @cindex ATS Subsystem
4752 @node ATS Subsystem
4753 @section ATS Subsystem
4754 @c %**end of header
4755
ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4765
The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
does not make the actual allocation decision. This last critical step is
left to an ATS plugin, as we have implemented (currently three) different
allocation strategies which differ significantly in their performance and
maturity, and it is still unclear if any particular plugin is generally
superior.
4773
4774 @cindex CORE Subsystem
4775 @node CORE Subsystem
4776 @section CORE Subsystem
4777 @c %**end of header
4778
4779 The CORE subsystem in GNUnet is responsible for securing link-layer
4780 communications between nodes in the GNUnet overlay network. CORE builds
4781 on the TRANSPORT subsystem which provides for the actual, insecure,
4782 unreliable link-layer communication (for example, via UDP or WLAN), and
4783 then adds fundamental security to the connections:
4784
4785 @itemize @bullet
4786 @item confidentiality with so-called perfect forward secrecy; we use
4787 ECDHE
(@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie-Hellman})
4789 powered by Curve25519
4790 (@uref{http://cr.yp.to/ecdh.html, Curve25519}) for the key
4791 exchange and then use symmetric encryption, encrypting with both AES-256
4792 (@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}) and
4793 Twofish (@uref{http://en.wikipedia.org/wiki/Twofish, Twofish})
4794 @item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4795 is achieved by signing the ephemeral keys using Ed25519
4796 (@uref{http://ed25519.cr.yp.to/, Ed25519}), a deterministic
4797 variant of ECDSA
4798 (@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA})
4799 @item integrity protection (using SHA-512
4800 (@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}) to do
4801 encrypt-then-MAC
4802 (@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}))
@item @uref{http://en.wikipedia.org/wiki/Replay_attack, replay}
protection (using nonces, timestamps, challenge-response,
message counters and ephemeral keys)
4807 @item liveness (keep-alive messages, timeout)
4808 @end itemize
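
The encrypt-then-MAC ordering mentioned in the list above can be
illustrated with the Python standard library. This is a toy sketch and
not CORE's wire format: a keystream derived from SHA-512 stands in for
the AES-256 and Twofish encryption, the use of HMAC for the SHA-512 based
MAC is an assumption, no nonce handling is shown, and all function names
are made up. It must not be used to protect real data.

@example
import hashlib
import hmac

def keystream(key, length):
    """Toy keystream: SHA-512 over the key and a running counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha512(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_crypt(key, data):
    """XOR with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(enc_key, mac_key, plaintext):
    ciphertext = toy_crypt(enc_key, plaintext)              # encrypt first
    tag = hmac.new(mac_key, ciphertext, hashlib.sha512).digest()
    return tag + ciphertext                                 # then MAC

def open_sealed(enc_key, mac_key, packet):
    tag, ciphertext = packet[:64], packet[64:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha512).digest()
    if not hmac.compare_digest(tag, expected):
        return None                   # reject before decrypting anything
    return toy_crypt(enc_key, ciphertext)

enc_key = b"k" * 32
mac_key = b"m" * 32
packet = seal(enc_key, mac_key, b"hello peer")
print(open_sealed(enc_key, mac_key, packet))

tampered = packet[:-1] + bytes([packet[-1] ^ 1])
print(open_sealed(enc_key, mac_key, tampered))
@end example

The point of the ordering is that the receiver checks the MAC over the
ciphertext first, so a modified packet is discarded without ever being
decrypted.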
4809
4810 @menu
4811 * Limitations::
4812 * When is a peer "connected"?::
4813 * libgnunetcore::
4814 * The CORE Client-Service Protocol::
4815 * The CORE Peer-to-Peer Protocol::
4816 @end menu
4817
4818 @cindex core subsystem limitations
4819 @node Limitations
4820 @subsection Limitations
4821 @c %**end of header
4822
4823 CORE does not perform
4824 @uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4825 only possible to communicate with peers that happen to already be
4826 "directly" connected with each other. CORE also does not have an
4827 API to allow applications to establish such "direct" connections --- for
4828 this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4829 establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4830 trying to keep a few "direct" connections open at all times. Applications
4831 that need to talk to particular peers should use the CADET subsystem, as
4832 it can establish arbitrary "indirect" connections.
4833
4834 Because CORE does not perform routing, CORE must only be used directly by
4835 applications that either perform their own routing logic (such as
4836 anonymous file-sharing) or that do not require routing, for example
4837 because they are based on flooding the network. CORE communication is
4838 unreliable and delivery is possibly out-of-order. Applications that
4839 require reliable communication should use the CADET service. Each
4840 application can only queue one message per target peer with the CORE
4841 service at any time; messages cannot be larger than approximately
4842 63 kilobytes. If messages are small, CORE may group multiple messages
4843 (possibly from different applications) prior to encryption. If permitted
4844 by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4845 option), CORE may delay transmissions to facilitate grouping of multiple
4846 small messages. If cork is not enabled, CORE will transmit the message as
4847 soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4848 bandwidth and congestion control). CORE does not allow flow control;
4849 applications are expected to process messages at line-speed. If flow
4850 control is needed, applications should use the CADET service.
4851
4852 @cindex when is a peer connected
4853 @node When is a peer "connected"?
4854 @subsection When is a peer "connected"?
4855 @c %**end of header
4856
4857 In addition to the security features mentioned above, CORE also provides
4858 one additional key feature to applications using it, and that is a
4859 limited form of protocol-compatibility checking. CORE distinguishes
4860 between TRANSPORT-level connections (which enable communication with other
4861 peers) and application-level connections. Applications using the CORE API
4862 will (typically) learn about application-level connections from CORE, and
4863 not about TRANSPORT-level connections. When a typical application uses
4864 CORE, it will specify a set of message types
4865 (from @code{gnunet_protocols.h}) that it understands. CORE will then
4866 notify the application about connections it has with other peers if and
4867 only if applications at those peers registered an intersecting set of message
4868 types with their CORE service. Thus, it is quite possible that CORE only
4869 exposes a subset of the established direct connections to a particular
4870 application --- and different applications running above CORE might see
4871 different sets of connections at the same time.
4872
4873 Applications that do not register a handler for any message type are a
4874 special case.
4875 CORE assumes that these applications merely want to monitor connections
4876 (or "all" messages via other callbacks) and will notify those applications
4877 about all connections. This is used, for example, by the
4878 @code{gnunet-core} command-line tool to display the active connections.
4879 Note that it is also possible that the TRANSPORT service has more active
4880 connections than the CORE service, as the CORE service first has to
4881 perform a key exchange with connecting peers before exchanging information
4882 about supported message types and notifying applications about the new
4883 connection.
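
The visibility rule described above can be sketched as follows. This is an
illustrative Python model and not GNUnet code: the function name
@code{visible_to_app} and the numeric message types are made up for the
example.

```python
def visible_to_app(app_types, peer_types):
    """Return True if CORE would report the peer to this application.

    app_types:  message types the local application registered handlers for
    peer_types: message types registered by applications at the remote peer
    """
    if not app_types:
        # Monitoring-only application: it is told about every connection.
        return True
    # Otherwise the two sets of message types must intersect.
    return bool(set(app_types) & set(peer_types))


# Hypothetical message type numbers, for illustration only.
remote = [138, 200]
print(visible_to_app([137, 138], remote))  # True: type 138 is shared
print(visible_to_app([142, 143], remote))  # False: nothing in common
print(visible_to_app([], remote))          # True: monitoring application
```

Two applications on the same peer can therefore see different sets of
connections at the same time, as the first two calls show.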
4884
4885 @cindex libgnunetcore
4886 @node libgnunetcore
4887 @subsection libgnunetcore
4888 @c %**end of header
4889
4890 The CORE API (defined in @file{gnunet_core_service.h}) is the basic
4891 messaging API used by P2P applications built using GNUnet. It provides
4892 applications the ability to send encrypted messages to, and receive them
4893 from, the peer's "directly" connected neighbours.
4894
4895 As CORE connections are generally "direct" connections, applications must
4896 not assume that they can connect to arbitrary peers this way, as "direct"
4897 connections may not always be possible. Applications using CORE are
4898 notified about which peers are connected. Creating new "direct"
4899 connections must be done using the TRANSPORT API.
4900
4901 The CORE API provides unreliable, out-of-order delivery. While the
4902 implementation tries to ensure timely, in-order delivery, neither message
4903 loss nor reordering is detected, and both must be tolerated by the
4904 application. Most importantly, CORE will NOT perform retransmission if
4905 messages could not be delivered.
4906
4907 Note that CORE allows applications to queue one message per connected
4908 peer. The rate at which each connection operates is influenced by the
4909 preferences expressed by local applications as well as restrictions
4910 imposed by the other peer. Local applications can express their
4911 preferences for particular connections using the "performance" API of the
4912 ATS service.
4913
4914 Applications that require more sophisticated transmission capabilities,
4915 such as TCP-like behavior, or that need to send messages to arbitrary
4916 remote peers, should use the CADET API.
4917
4918 The typical use of the CORE API is to connect to the CORE service using
4919 @code{GNUNET_CORE_connect}, process events from the CORE service (such as
4920 peers connecting, peers disconnecting and incoming messages) and send
4921 messages to connected peers using
4922 @code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4923 cancel pending transmission requests if they receive a disconnect event
4924 for a peer that had a transmission pending; furthermore, queuing more
4925 than one transmission request per peer per application using the
4926 service is not permitted.
4927
4928 The CORE API also allows applications to monitor all communications of the
4929 peer prior to encryption (for outgoing messages) or after decryption (for
4930 incoming messages). This can be useful for debugging, diagnostics or to
4931 establish the presence of cover traffic (for anonymity). As monitoring
4932 applications are often not interested in the payload, the monitoring
4933 callbacks can be configured to only provide the message headers (including
4934 the message type and size) instead of copying the full data stream to the
4935 monitoring client.
4936
4937 The init callback of the @code{GNUNET_CORE_connect} function is called
4938 with the hash of the public key of the peer. This public key is used to
4939 identify the peer globally in the GNUnet network. Applications are
4940 encouraged to check that the provided hash matches the hash that they are
4941 using (as theoretically the application may be using a different
4942 configuration file with a different private key, which would result in
4943 hard-to-find bugs).
4944
4945 As with most service APIs, the CORE API isolates applications from crashes
4946 of the CORE service. If the CORE service crashes, the application will see
4947 disconnect events for all existing connections. Once the connections are
4948 re-established, the applications will receive matching connect events.
4949
4950 @cindex core client-service protocol
4951 @node The CORE Client-Service Protocol
4952 @subsection The CORE Client-Service Protocol
4953 @c %**end of header
4954
4955 This section describes the protocol between an application using the CORE
4956 service (the client) and the CORE service process itself.
4957
4958
4959 @menu
4960 * Setup2::
4961 * Notifications::
4962 * Sending::
4963 @end menu
4964
4965 @node Setup2
4966 @subsubsection Setup2
4967 @c %**end of header
4968
4969 When a client connects to the CORE service, it first sends an
4970 @code{InitMessage} which specifies options for the connection and a set of
4971 message type values which are supported by the application. The options
4972 bitmask specifies which events the client would like to be notified about.
4973 The options include:
4974
4975 @table @asis
4976 @item GNUNET_CORE_OPTION_NOTHING No notifications
4977 @item GNUNET_CORE_OPTION_STATUS_CHANGE Peers connecting and disconnecting
4978 @item GNUNET_CORE_OPTION_FULL_INBOUND All inbound messages (after
4979 decryption) with full payload
4980 @item GNUNET_CORE_OPTION_HDR_INBOUND Just the @code{MessageHeader}
4981 of all inbound messages
4982 @item GNUNET_CORE_OPTION_FULL_OUTBOUND All outbound
4983 messages (prior to encryption) with full payload
4984 @item GNUNET_CORE_OPTION_HDR_OUTBOUND Just the @code{MessageHeader} of all
4985 outbound messages
4986 @end table
4987
4988 Typical applications will only monitor for connection status changes.
4989
4990 The CORE service responds to the @code{InitMessage} with an
4991 @code{InitReplyMessage} which contains the peer's identity. Afterwards,
4992 both CORE and the client can send messages.
4993
4994 @node Notifications
4995 @subsubsection Notifications
4996 @c %**end of header
4997
4998 The CORE will send @code{ConnectNotifyMessage}s and
4999 @code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
5000 the CORE (assuming their type maps overlap with the message types
5001 registered by the client). When the CORE receives a message that matches
5002 the set of message types specified during the @code{InitMessage} (or if
5003 monitoring is enabled for inbound messages in the options), it sends a
5004 @code{NotifyTrafficMessage} with the peer identity of the sender and the
5005 decrypted payload. The same message format (except with
5006 @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
5007 used to notify clients monitoring outbound messages; here, the peer
5008 identity given is that of the receiver.
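
As a rough sketch of how the options bitmask and the registered message
types combine for inbound traffic, consider the following Python model. The
bit values, the constant @code{HEADER_SIZE} and the function
@code{inbound_notification} are assumptions made for this example; the real
constants are defined in @file{gnunet_core_service.h} and may differ.

```python
# Placeholder bit values for illustration only.
OPTION_NOTHING = 0
OPTION_STATUS_CHANGE = 1 << 0
OPTION_FULL_INBOUND = 1 << 1
OPTION_HDR_INBOUND = 1 << 2
OPTION_FULL_OUTBOUND = 1 << 3
OPTION_HDR_OUTBOUND = 1 << 4

HEADER_SIZE = 4  # assumed size of a MessageHeader in bytes


def inbound_notification(options, registered_types, msg_type, message):
    """Decide what a client is told about one decrypted inbound message.

    Returns the bytes to place in the notification, or None if the
    client is not notified at all.
    """
    if msg_type in registered_types:
        return message                  # handler registered: full payload
    if options & OPTION_FULL_INBOUND:
        return message                  # monitoring with full payload
    if options & OPTION_HDR_INBOUND:
        return message[:HEADER_SIZE]    # monitoring, header only
    return None
```

A typical application would pass only the status-change option together with
its handled types, while a monitoring tool would pass one of the inbound or
outbound monitoring options and an empty list of types.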
5009
5010 @node Sending
5011 @subsubsection Sending
5012 @c %**end of header
5013
5014 When a client wants to transmit a message, it first requests a
5015 transmission slot by sending a @code{SendMessageRequest} which specifies
5016 the priority, deadline and size of the message. Note that these values
5017 may be ignored by CORE. When CORE is ready for the message, it answers
5018 with a @code{SendMessageReady} response. The client can then transmit the
5019 payload with a @code{SendMessage} message. Note that the actual message
5020 size in the @code{SendMessage} is allowed to be smaller than the size in
5021 the original request. A client may at any time send a fresh
5022 @code{SendMessageRequest}, which then supersedes the previous
5023 @code{SendMessageRequest}, which is then no longer valid. The client can
5024 tell which @code{SendMessageRequest} the CORE service's
5025 @code{SendMessageReady} message is for as all of these messages contain a
5026 "unique" request ID (based on a counter incremented by the client
5027 for each request).
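
The request/ready/send exchange can be modelled from the client's point of
view as shown below. This is a minimal sketch under stated assumptions: the
class @code{SendSlot} and the tuple-based message representation are
hypothetical and only mirror the message names used in this section.

```python
class SendSlot:
    """Client-side bookkeeping for one target peer (hypothetical helper)."""

    def __init__(self):
        self.counter = 0         # incremented for every request
        self.pending_id = None   # ID of the request that is still valid
        self.pending_size = 0

    def request(self, size):
        """Build a SendMessageRequest; it supersedes any earlier request."""
        self.counter += 1
        self.pending_id = self.counter
        self.pending_size = size
        return ("SendMessageRequest", self.pending_id, size)

    def on_ready(self, request_id, payload):
        """Handle a SendMessageReady; return a SendMessage or None if stale."""
        if request_id != self.pending_id:
            return None          # answer to a superseded request
        if len(payload) > self.pending_size:
            raise ValueError("payload larger than the requested size")
        self.pending_id = None
        return ("SendMessage", payload)
```

The payload may be shorter than the size announced in the request, and a
ready message for a superseded request is simply ignored.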
5028
5029 @cindex CORE Peer-to-Peer Protocol
5030 @node The CORE Peer-to-Peer Protocol
5031 @subsection The CORE Peer-to-Peer Protocol
5032 @c %**end of header
5033
5034
5035 @menu
5036 * Creating the EphemeralKeyMessage::
5037 * Establishing a connection::
5038 * Encryption and Decryption::
5039 * Type maps::
5040 @end menu
5041
5042 @cindex EphemeralKeyMessage creation
5043 @node Creating the EphemeralKeyMessage
5044 @subsubsection Creating the EphemeralKeyMessage
5045 @c %**end of header
5046
5047 When the CORE service starts, each peer creates a fresh ephemeral (ECC)
5048 public-private key pair and signs the corresponding
5049 @code{EphemeralKeyMessage} with its long-term key (which we usually call
5050 the peer's identity; the hash of the public long-term key is what results
5051 in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
5052 key is ONLY used for an ECDHE
5053 (@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie-Hellman})
5054 exchange by the CORE service to establish symmetric session keys. A peer
5055 will use the same @code{EphemeralKeyMessage} for all peers for
5056 @code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
5057 will create a fresh ephemeral key (forgetting the old one) and broadcast
5058 the new @code{EphemeralKeyMessage} to all connected peers, resulting in
5059 fresh symmetric session keys. Note that peers independently decide on
5060 when to discard ephemeral keys; it is not a protocol violation to discard
5061 keys more often. Ephemeral keys are also never stored to disk; restarting
5062 a peer will thus always create a fresh ephemeral key. The use of ephemeral
5063 keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
5064
5065 Just before transmission, the @code{EphemeralKeyMessage} is patched to
5066 reflect the current @code{sender_status}, which specifies the current state of
5067 the connection from the point of view of the sender. The possible values
5068 are:
5069
5070 @itemize @bullet
5071 @item @code{KX_STATE_DOWN} Initial value, never used on the network
5072 @item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
5073 key of the other peer
5074 @item @code{KX_STATE_KEY_RECEIVED} We have received a valid
5075 ephemeral key of the other peer, but we are waiting for the other peer to
5076 confirm its authenticity (ability to decode) via challenge-response.
5077 @item @code{KX_STATE_UP} The connection is fully up from the point of
5078 view of the sender (now performing keep-alives)
5079 @item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
5080 operation; the other peer has so far failed to confirm a working
5081 connection using the new ephemeral key
5082 @end itemize
5083
5084 @node Establishing a connection
5085 @subsubsection Establishing a connection
5086 @c %**end of header
5087
5088 Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
5089 the other peer once the TRANSPORT service notifies the CORE service about
5090 the connection.
5091 When a peer receives an @code{EphemeralKeyMessage} whose status
5092 indicates that the sender does not have the receiver's ephemeral key, it
5093 sends its own @code{EphemeralKeyMessage} in response.
5094 Additionally, if the receiver has not yet confirmed the authenticity of
5095 the sender, it also sends an (encrypted) @code{PingMessage} with a
5096 challenge (and the identity of the target) to the other peer. Peers
5097 receiving a @code{PingMessage} respond with an (encrypted)
5098 @code{PongMessage} which includes the challenge. Peers receiving a
5099 @code{PongMessage} check the challenge, and if it matches, set the
5100 connection to @code{KX_STATE_UP}.
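
The message flow above can be pictured with a small state machine. The
following Python model is a simplification based only on this description:
it omits encryption, signatures, timeouts and rekeying, and the class
@code{KxPeer} and helper @code{run_handshake} are hypothetical names.

```python
KX_STATE_DOWN = "DOWN"
KX_STATE_KEY_SENT = "KEY_SENT"
KX_STATE_KEY_RECEIVED = "KEY_RECEIVED"
KX_STATE_UP = "UP"


class KxPeer:
    """One side of the handshake; messages are (kind, value) tuples."""

    def __init__(self, challenge):
        self.state = KX_STATE_DOWN
        self.challenge = challenge

    def on_transport_connect(self):
        # TRANSPORT reported a new connection: announce our ephemeral key.
        self.state = KX_STATE_KEY_SENT
        return [("EphemeralKey", self.state)]

    def on_message(self, kind, value):
        replies = []
        if kind == "EphemeralKey":
            if self.state != KX_STATE_UP:
                self.state = KX_STATE_KEY_RECEIVED
            if value == KX_STATE_KEY_SENT:
                # The sender does not know our key yet: send it.
                replies.append(("EphemeralKey", self.state))
            if self.state != KX_STATE_UP:
                # Sender not confirmed yet: challenge it.
                replies.append(("Ping", self.challenge))
        elif kind == "Ping":
            replies.append(("Pong", value))
        elif kind == "Pong":
            if value == self.challenge:
                self.state = KX_STATE_UP
        return replies


def run_handshake(a, b):
    """Deliver messages between a and b until no message is in flight."""
    queue = [(b, a, m) for m in a.on_transport_connect()]
    queue += [(a, b, m) for m in b.on_transport_connect()]
    while queue:
        receiver, sender, (kind, value) = queue.pop(0)
        for reply in receiver.on_message(kind, value):
            queue.append((sender, receiver, reply))
```

Running @code{run_handshake} on two fresh peers leaves both in
@code{KX_STATE_UP}; a pong carrying the wrong challenge leaves the state
unchanged.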
5101
5102 @node Encryption and Decryption
5103 @subsubsection Encryption and Decryption
5104 @c %**end of header
5105
5106 All functions related to the key exchange and encryption/decryption of
5107 messages can be found in @file{gnunet-service-core_kx.c} (except for the
5108 cryptographic primitives, which are in @file{util/crypto*.c}).
5109 Given the key material from ECDHE, a
5110 @uref{https://en.wikipedia.org/wiki/Key_derivation_function, key derivation function}
5111 is used to derive two pairs of encryption and decryption keys for AES-256
5112 and Twofish, as well as initialization vectors and authentication keys
5113 (for
5114 @uref{https://en.wikipedia.org/wiki/HMAC, HMAC}).
5115 The HMAC is computed over the encrypted payload.
5116 Encrypted messages include an @code{iv_seed} and the HMAC in the header.
5117
5118 Each encrypted message in the CORE service includes a sequence number and
5119 a timestamp in the encrypted payload. The CORE service remembers the
5120 largest observed sequence number and a bit-mask which represents which of
5121 the previous 32 sequence numbers were already used.
5122 Messages with sequence numbers lower than the largest observed sequence
5123 number minus 32 are discarded. Messages with a timestamp that is more
5124 than @code{REKEY_TOLERANCE} (5 minutes) off are also discarded. This of
5125 course means that system clocks need to be reasonably synchronized for
5126 peers to be able to communicate. Additionally, as the ephemeral key
5127 changes every 12 hours, a peer would not even be able to decrypt messages
5128 older than 12 hours.
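
The sequence-number check can be sketched with a 32-bit sliding window. The
following Python class is an illustration of the technique described above,
not the code from @file{gnunet-service-core_kx.c}; the name
@code{ReplayWindow} and the bit layout of the mask are assumptions.

```python
WINDOW = 32
MASK_ALL = (1 << WINDOW) - 1


class ReplayWindow:
    def __init__(self):
        self.largest = None   # largest sequence number observed so far
        self.seen = 0         # bit i set: (largest - 1 - i) was already used

    def accept(self, seq):
        """Return True if seq is new and recent enough, else False."""
        if self.largest is None:
            self.largest = seq
            return True
        if seq > self.largest:
            shift = seq - self.largest
            if shift > WINDOW:
                self.seen = 0
            else:
                # Slide the window and record the old largest number.
                self.seen = ((self.seen << shift) | (1 << (shift - 1))) & MASK_ALL
            self.largest = seq
            return True
        if seq == self.largest:
            return False      # exact duplicate of the newest message
        offset = self.largest - seq - 1
        if offset >= WINDOW:
            return False      # lower than largest minus 32: too old
        bit = 1 << offset
        if self.seen & bit:
            return False      # sequence number was already used
        self.seen |= bit
        return True
```

Out-of-order messages within the window are accepted exactly once, which
matches the unreliable, possibly reordered delivery that CORE offers.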
5129
5130 @node Type maps
5131 @subsubsection Type maps
5132 @c %**end of header
5133
5134 Once an encrypted connection has been established, peers begin to exchange
5135 type maps. Type maps are used to allow the CORE service to determine which
5136 (encrypted) connections should be shown to which applications. A type map
5137 is an array of 65536 bits representing the different types of messages
5138 understood by applications using the CORE service. Each CORE service
5139 maintains this map, simply by setting the respective bit for each message
5140 type supported by any of the applications using the CORE service. Note
5141 that bits for message types embedded in higher-level protocols (such as
5142 CADET) will not be included in these type maps.
5143
5144 Typically, the type map of a peer will be sparse. Thus, the CORE service
5145 attempts to compress its type map using @code{gzip}-style compression
5146 ("deflate") prior to transmission. However, if the compression fails to
5147 compact the map, the map may also be transmitted without compression
5148 (resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
5149 @code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
5150 Upon receiving a type map, the respective CORE service notifies
5151 applications about the connection to the other peer if they support any
5152 message type indicated in the type map (or no message type at all).
5153 If the CORE service experiences a connect or disconnect event from an
5154 application, it updates its type map (setting or unsetting the respective
5155 bits) and notifies its neighbours about the change.
5156 The CORE services of the neighbours then in turn generate connect and
5157 disconnect events for the peer that sent the type map for their respective
5158 applications. As CORE messages may be lost, the CORE service confirms
5159 receiving a type map by sending back a
5160 @code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
5161 (with the correct hash of the type map) is not received, the sender will
5162 retransmit the type map (with exponential back-off).
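
To make the encoding concrete, here is a small Python sketch of building a
65536-bit type map and choosing between the compressed and the binary form.
The bit order within a byte, the use of Python's @code{zlib} module for the
"deflate" step and the function names are assumptions for this example; the
actual wire format is defined by the CORE implementation.

```python
import zlib

TYPE_MAP_BITS = 65536


def build_type_map(message_types):
    """Set one bit per supported message type."""
    bits = bytearray(TYPE_MAP_BITS // 8)
    for t in message_types:
        bits[t // 8] |= 1 << (t % 8)
    return bytes(bits)


def has_type(raw, t):
    return bool(raw[t // 8] & (1 << (t % 8)))


def encode_type_map(raw):
    """Use the compressed form only if it is actually smaller."""
    compressed = zlib.compress(raw)
    if len(compressed) < len(raw):
        return ("COMPRESSED_TYPE_MAP", compressed)
    return ("BINARY_TYPE_MAP", raw)


def decode_type_map(kind, body):
    if kind == "COMPRESSED_TYPE_MAP":
        return zlib.decompress(body)
    return body


raw = build_type_map([137, 138, 4242])   # made-up message types
kind, body = encode_type_map(raw)
print(kind, len(raw), len(body))
```

A sparse map like the one above shrinks from 8192 bytes to a few dozen, which
is why the compressed form is the common case.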
5163
5164 @cindex CADET Subsystem
5165 @node CADET Subsystem
5166 @section CADET Subsystem
5167
5168 The CADET subsystem in GNUnet is responsible for secure end-to-end
5169 communications between nodes in the GNUnet overlay network. CADET builds
5170 on the CORE subsystem which provides for the link-layer communication and
5171 then adds routing, forwarding and additional security to the connections.
5172 CADET offers the same cryptographic services as CORE, but on an
5173 end-to-end level. This is done so peers retransmitting traffic on behalf
5174 of other peers cannot access the payload data.
5175
5176 @itemize @bullet
5177 @item CADET provides confidentiality with so-called perfect forward
5178 secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
5179 use symmetric encryption, encrypting with both AES-256 and Twofish
5180 @item authentication is achieved by signing the ephemeral keys using
5181 Ed25519, an instance of the EdDSA signature scheme
5182 @item integrity protection (using SHA-512 to do encrypt-then-MAC, although
5183 only 256 bits are sent to reduce overhead)
5184 @item replay protection (using nonces, timestamps, challenge-response,
5185 message counters and ephemeral keys)
5186 @item liveness (keep-alive messages, timeout)
5187 @end itemize
5188
5189 In addition to the CORE-like security benefits, CADET offers other
5190 properties that make it a more universal service than CORE.
5191
5192 @itemize @bullet
5193 @item CADET can establish channels to arbitrary peers in GNUnet. If a
5194 peer is not immediately reachable, CADET will find a path through the
5195 network and ask other peers to retransmit the traffic on its behalf.
5196 @item CADET offers (optional) reliability mechanisms. In a reliable
5197 channel traffic is guaranteed to arrive complete, unchanged and in-order.
5198 @item CADET takes care of flow and congestion control mechanisms, not
5199 allowing the sender to send more traffic than the receiver or the network
5200 is able to process.
5201 @end itemize
5202
5203 @menu
5204 * libgnunetcadet::
5205 @end menu
5206
5207 @cindex libgnunetcadet
5208 @node libgnunetcadet
5209 @subsection libgnunetcadet
5210
5211
5212 The CADET API (defined in @file{gnunet_cadet_service.h}) is the
5213 messaging API used by P2P applications built using GNUnet.
5214 It provides applications the ability to send and receive encrypted
5215 messages to any peer participating in GNUnet.
5216 The API is heavily based on the CORE API.
5217
5218 CADET delivers messages to other peers in "channels".
5219 A channel is a permanent connection defined by a destination peer
5220 (identified by its public key) and a port number.
5221 Internally, CADET tunnels all channels towards a destination peer
5222 using one session key and relays the data on multiple "connections",
5223 independent from the channels.
5224
5225 Each channel has optional parameters, the most important being the
5226 reliability flag.
5227 If a channel was created as reliable and a message gets lost at the
5228 TRANSPORT/CORE level, CADET will retransmit the lost message and
5229 deliver it in order to the destination application.
5230
5231 To communicate with other peers using CADET, it is necessary to first
5232 connect to the service using @code{GNUNET_CADET_connect}.
5233 This function takes several parameters in form of callbacks, to allow the
5234 client to react to various events, like incoming channels or channels that
5235 terminate, as well as specify a list of ports the client wishes to listen
5236 to (at the moment it is not possible to start listening on further ports
5237 once connected, but nothing prevents a client from connecting to CADET
5238 several times, even once per listening port).
5239 The function returns a handle which has to be used for any further
5240 interaction with the service.
5241
5242 To connect to a remote peer a client has to call the
5243 @code{GNUNET_CADET_channel_create} function. The most important parameters
5244 given are the remote peer's identity (its public key) and a port, which
5245 specifies which application on the remote peer to connect to, similar to
5246 TCP/UDP ports. CADET will then find the peer in the GNUnet network and
5247 establish the proper low-level connections and do the necessary key
5248 exchanges to ensure an authenticated, secure and verified communication.
5249 Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
5250 returns a handle to interact with the created channel.
5251
5252 For every message the client wants to send to the remote application,
5253 @code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
5254 channel on which the message should be sent and the size of the message
5255 (but not the message itself!). Once CADET is ready to send the message,
5256 the provided callback will fire, and the message contents are provided to
5257 this callback.
5258
5259 Please note that CADET does not provide an explicit notification of when a
5260 channel is connected. In loosely connected networks, like big wireless
5261 mesh networks, this can take several seconds, even minutes in the worst
5262 case. To be alerted when a channel is online, a client can call
5263 @code{GNUNET_CADET_notify_transmit_ready} immediately after
5264 @code{GNUNET_CADET_channel_create}. When the callback is activated, it
5265 means that the channel is online. The callback can give 0 bytes to CADET
5266 if no message is to be sent; this is OK.
5267
5268 If a transmission was requested but before the callback fires it is no
5269 longer needed, it can be canceled with
5270 @code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
5271 given back by @code{GNUNET_CADET_notify_transmit_ready}.
5272 As in the case of CORE, only one message can be requested at a time: a
5273 client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
5274 the callback is called or the request is canceled.
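
The calling discipline described above (announce the size, wait for the
callback, at most one pending request, optional cancel) can be illustrated
with a toy model. The Python class below is not the CADET API: the class
@code{ToyChannel} and its method names are hypothetical and only mimic the
behavior described in this section.

```python
class ToyChannel:
    def __init__(self):
        self.pending = None   # the single outstanding transmit request
        self.sent = []        # messages handed over so far

    def notify_transmit_ready(self, size, callback):
        """Announce a message of at most size bytes; returns a handle."""
        if self.pending is not None:
            raise RuntimeError("a transmission request is already pending")
        self.pending = (size, callback)
        return self.pending

    def cancel(self, handle):
        """Withdraw a request before its callback has fired."""
        if self.pending is handle:
            self.pending = None

    def service_ready(self):
        """Simulate the service becoming ready: fire the callback once."""
        if self.pending is None:
            return
        size, callback = self.pending
        self.pending = None
        data = callback(size)
        if len(data) > size:
            raise ValueError("callback produced more than the announced size")
        if data:
            self.sent.append(data)
```

A callback that returns zero bytes sends nothing, so it can serve purely as a
signal that the channel is online.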
5275
5276 When a channel is no longer needed, a client can call
5277 @code{GNUNET_CADET_channel_destroy} to get rid of it.
5278 Note that CADET will try to transmit all pending traffic before notifying
5279 the remote peer of the destruction of the channel, including
5280 retransmitting lost messages if the channel was reliable.
5281
5282 Incoming channels, channels being closed by the remote peer, and traffic
5283 on any incoming or outgoing channels are given to the client when CADET
5284 executes the callbacks given to it at the time of
5285 @code{GNUNET_CADET_connect}.
5286
5287 Finally, when an application no longer wants to use CADET, it should call
5288 @code{GNUNET_CADET_disconnect}, but first all channels and pending
5289 transmissions must be closed (otherwise CADET will complain).
5290
5291 @cindex NSE Subsystem
5292 @node NSE Subsystem
5293 @section NSE Subsystem
5294
5295
5296 NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
5297 other subsystems and users with a rough estimate of the number of peers
5298 currently participating in the GNUnet overlay.
5299 The computed value is not a precise number as producing a precise number
5300 in a decentralized, efficient and secure way is impossible.
5301 While NSE's estimate is inherently imprecise, NSE also gives the expected
5302 range. For a peer that has been running in a stable network for a
5303 while, the real network size will typically (99.7% of the time) be in the
5304 range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
5305 algorithm used to calculate the estimate;
5306 all of the details can be found in this technical report.
5307
5308 @c FIXME: link to the report.
5309
5310 @menu
5311 * Motivation::
5312 * Principle::
5313 * libgnunetnse::
5314 * The NSE Client-Service Protocol::
5315 * The NSE Peer-to-Peer Protocol::
5316 @end menu
5317
5318 @node Motivation
5319 @subsection Motivation
5320
5321
5322 Some subsystems, like DHT, need to know the size of the GNUnet network to
5323 optimize some parameters of their own protocol. The decentralized nature
5324 of GNUnet makes efficiently and securely counting the exact number of peers
5325 infeasible. Although there are several decentralized algorithms to count
5326 the number of peers in a system, so far there is none to do so securely.
5327 Other protocols may allow any malicious peer to manipulate the final
5328 result or to take advantage of the system to perform
5329 @dfn{Denial of Service} (DoS) attacks against the network.
5330 GNUnet's NSE protocol avoids these drawbacks.
5331
5332
5333
5334 @menu
5335 * Security::
5336 @end menu
5337
5338 @cindex NSE security
5339 @cindex nse security
5340 @node Security
5341 @subsubsection Security
5342
5343
5344 The NSE subsystem is designed to be resilient against these attacks.
5345 It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
5346 to prevent one peer from impersonating a large number of participants,
5347 which would otherwise allow an adversary to artificially inflate the
5348 estimate.
5349 The DoS protection comes from the time-based nature of the protocol:
5350 the estimates are calculated periodically and out-of-time traffic is
5351 either ignored or stored for later retransmission by benign peers.
5352 In particular, peers cannot trigger global network communication at will.
5353
5354 @cindex NSE principle
5355 @cindex nse principle
5356 @node Principle
5357 @subsection Principle
5358
5359
5360 The algorithm calculates the estimate by finding the globally closest
5361 peer ID to a random, time-based value.
5362
5363 The idea is that the closer the ID is to the random value, the more
5364 "densely packed" the ID space is, and therefore, more peers are in the
5365 network.
5366
5367
5368
5369 @menu
5370 * Example::
5371 * Algorithm::
5372 * Target value::
5373 * Timing::
5374 * Controlled Flooding::
5375 * Calculating the estimate::
5376 @end menu
5377
5378 @node Example
5379 @subsubsection Example
5380
5381
5382 Suppose all peers have IDs between 0 and 100 (our ID space), and the
5383 random value is 42.
5384 If the closest peer has the ID 70 we can imagine that the average
5385 "distance" between peers is around 30 and therefore there are around 3
5386 peers in the whole ID space. On the other hand, if the closest peer has
5387 the ID 44, we can imagine that the space is rather packed with peers,
5388 maybe as much as 50 of them.
5389 Naturally, we could have been rather unlucky, and there is only one peer
5390 and it happens to have the ID 44. Thus, the current estimate is calculated
5391 as the average over multiple rounds, and not just a single sample.
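
The arithmetic of this toy example can be written down directly. The Python
function below only reproduces the back-of-the-envelope reasoning of the
example (ID space divided by the distance to the closest peer); it is not
the formula used by NSE, and the function names are made up.

```python
def toy_estimate(id_space, target, closest_id):
    """Estimate the number of peers from the distance to the closest ID."""
    distance = abs(closest_id - target)
    if distance == 0:
        distance = 1   # avoid dividing by zero in this toy model
    return id_space / distance


def average_estimate(samples):
    """Average several single-round estimates."""
    return sum(samples) / len(samples)


print(toy_estimate(100, 42, 70))   # between 3 and 4 peers
print(toy_estimate(100, 42, 44))   # 50.0 peers
```

A single round can be badly off, as the second call shows when there is in
fact only one peer, which is why several rounds are averaged.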
5392
5393 @node Algorithm
5394 @subsubsection Algorithm
5395
5396
5397 Given that example, one can imagine that the job of the subsystem is to
5398 efficiently communicate the ID of the closest peer to the target value
5399 to all the other peers, who will calculate the estimate from it.
5400
5401 @node Target value
5402 @subsubsection Target value
5403
5404 @c %**end of header
5405
5406 The target value itself is generated by hashing the current time, rounded
5407 down to an agreed value. If the rounding amount is 1h (default) and the
5408 time is 12:34:56, the time to hash would be 12:00:00. The process is
5409 repeated each rounding amount (in this example, every hour).
5410 Every repetition is called a round.
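
A minimal sketch of deriving the per-round target is shown below. The choice
of SHA-512 and the way the timestamp is serialized before hashing are
assumptions for this example; only the rounding step follows directly from
the text.

```python
import hashlib

ROUND_SECONDS = 3600   # rounding amount of one hour


def round_start(timestamp, interval=ROUND_SECONDS):
    """Round a timestamp (in seconds) down to the start of its round."""
    return timestamp - (timestamp % interval)


def target_value(timestamp, interval=ROUND_SECONDS):
    """Hash the rounded time; every peer in the round gets the same value."""
    start = round_start(timestamp, interval)
    return hashlib.sha512(str(start).encode("ascii")).digest()


noon_ish = 12 * 3600 + 34 * 60 + 56    # 12:34:56 as seconds since midnight
print(round_start(noon_ish))           # 43200, i.e. 12:00:00
```

Because all peers round to the same boundary, they agree on the target
without exchanging any messages, provided their clocks are roughly in sync.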
5411
5412 @node Timing
5413 @subsubsection Timing
5414 @c %**end of header
5415
5416 The NSE subsystem has some timing control to avoid everybody broadcasting
5417 its ID all at once. Once each peer has the target random value, it
5418 compares its own ID to the target and calculates the hypothetical size of
5419 the network if that peer were to be the closest.
5420 Then it compares the hypothetical size with the estimate from the previous
5421 rounds. For each value there is an associated point in the period,
5422 let's call it "broadcast time". If its own hypothetical estimate
5423 is the same as the previous global estimate, its "broadcast time" will be
5424 in the middle of the round. If it is bigger it will be earlier and if it is
5425 smaller (the most likely case) it will be later. This ensures that the
5426 peers closest to the target value start broadcasting their ID first.
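
One way to obtain a "broadcast time" with these properties is a smooth,
decreasing function that is centred on the previous estimate. The Python
sketch below uses an arctangent curve for this purpose; that choice, and the
function name @code{broadcast_delay}, are assumptions for illustration, and
the technical report should be consulted for the formula NSE actually uses.

```python
import math


def broadcast_delay(own_estimate, previous_estimate, round_seconds=3600.0):
    """Seconds after the start of the round at which to broadcast.

    Equal estimates give the middle of the round, a larger own estimate
    gives an earlier time and a smaller one a later time.
    """
    diff = own_estimate - previous_estimate
    return round_seconds / 2.0 - (round_seconds / math.pi) * math.atan(diff)


print(broadcast_delay(10, 10))   # 1800.0, the middle of the round
```

Since the arctangent is bounded, peers with a much larger hypothetical
estimate broadcast close to the start of the round and peers with a much
smaller one close to its end.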
5427
5428 @node Controlled Flooding
5429 @subsubsection Controlled Flooding
5430
5431 @c %**end of header
5432
5433 When a peer receives a value, first it verifies that it is closer than the
5434 closest value it had so far, otherwise it answers the incoming message
5435 with a message containing the better value. Then it checks a proof of
5436 work that must be included in the incoming message, to ensure that the
5437 other peer's ID is not made up (otherwise a malicious peer could claim to
5438 have an ID of exactly the target value every round). Once validated, it
5439 compares the broadcast time of the received value with the current time
5440 and if it's not too early, sends the received value to its neighbors.
5441 Otherwise it stores the value until the correct broadcast time comes.
5442 This prevents unnecessary traffic of sub-optimal values, since a better
5443 value can come before the broadcast time, rendering the previous one
5444 obsolete and saving the traffic that would have been used to broadcast it
5445 to the neighbors.
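
The decision sequence for an incoming value can be summarized in code. The
following Python model follows the order of the checks given above; the
names @code{FloodState}, @code{on_flood_message} and @code{on_timer} are
hypothetical, and the proof-of-work check is reduced to a boolean argument.

```python
class FloodState:
    def __init__(self):
        self.best = None      # (distance, peer_id) of the closest value so far
        self.delayed = None   # (broadcast_time, value) waiting to be forwarded


def on_flood_message(state, distance, peer_id, pow_ok, broadcast_time, now):
    """Process one received value and return the action as a tuple."""
    candidate = (distance, peer_id)
    if state.best is not None and distance >= state.best[0]:
        # Not closer than what we have: answer with our better value.
        return ("reply_with_better", state.best)
    if not pow_ok:
        return ("drop", candidate)          # invalid proof of work
    state.best = candidate
    if now >= broadcast_time:
        state.delayed = None
        return ("forward", candidate)       # not too early: tell neighbours
    state.delayed = (broadcast_time, candidate)
    return ("store", candidate)             # too early: keep for later


def on_timer(state, now):
    """Forward a stored value once its broadcast time has come."""
    if state.delayed is not None and now >= state.delayed[0]:
        value = state.delayed[1]
        state.delayed = None
        return ("forward", value)
    return None
```

If a better value arrives while another one is stored, the stored value is
replaced and never forwarded, which is the traffic saving described above.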
5446
5447 @node Calculating the estimate
5448 @subsubsection Calculating the estimate
5449
5450 @c %**end of header
5451
5452 Once the closest ID has been spread across the network each peer gets the
5453 exact distance between this ID and the target value of the round and
5454 calculates the estimate with a mathematical formula described in the tech
5455 report. The estimate generated with this method for a single round is not
5456 very precise. Remember the case of the example, where the only peer has the
5457 ID 44 and we happen to generate the target value 42, thinking there are
5458 50 peers in the network. Therefore, the NSE subsystem remembers the last
5459 64 estimates and calculates an average over them, giving a result which
5460 usually has one bit of uncertainty (the real size could be half of the
5461 estimate or twice as much). Note that the actual network size is
5462 calculated in powers of two of the raw input, thus one bit of uncertainty
5463 means a factor of two in the size estimate.
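
The averaging step can be sketched as a fixed-size history of per-round
values. In this Python model the per-round values are assumed to be base-2
logarithms of the size, and the class name @code{EstimateHistory} is made up
for the example.

```python
from collections import deque

HISTORY_SIZE = 64


class EstimateHistory:
    def __init__(self):
        # Only the most recent 64 rounds are kept.
        self.rounds = deque(maxlen=HISTORY_SIZE)

    def add(self, log2_estimate):
        self.rounds.append(log2_estimate)

    def mean_log2(self):
        return sum(self.rounds) / len(self.rounds)

    def network_size(self):
        # The average is logarithmic, so convert back with a power of two.
        return 2.0 ** self.mean_log2()
```

Because the average is taken over logarithms, an error of one in the
averaged value corresponds to a factor of two in the reported size.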
5464
5465 @cindex libgnunetnse
5466 @node libgnunetnse
5467 @subsection libgnunetnse
5468
5469 @c %**end of header
5470
5471 The NSE subsystem has the simplest API of all services, with only two
5472 calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
5473
5474 The connect call gets a callback function as a parameter and this function
5475 is called each time the network agrees on an estimate. This usually is
5476 once per round, with some exceptions: if the closest peer has a late
5477 local clock and starts spreading its ID after everyone else agreed on a
5478 value, the callback might be activated twice in a round, the second value
5479 being always bigger than the first. The default round time is set to
5480 1 hour.
5481
5482 The disconnect call disconnects from the NSE subsystem and the callback
5483 is no longer called with new estimates.
5484
5485
5486
5487 @menu
5488 * Results::
5489 * libgnunetnse - Examples::
5490 @end menu
5491
5492 @node Results
5493 @subsubsection Results
5494
5495 @c %**end of header
5496
5497 The callback provides two values: the average and the
5498 @uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
5499 of the last 64 rounds. The values provided by the callback function are
5500 logarithmic, this means that the real estimate numbers can be obtained by
5501 calculating 2 to the power of the given value (2^average). From a
5502 statistics point of view this means that:
5503
5504 @itemize @bullet
5505 @item 68% of the time the real size is included in the interval
5506 [2^(average-stddev), 2^(average+stddev)]
5507 @item 95% of the time the real size is included in the interval
5508 [2^(average-2*stddev), 2^(average+2*stddev)]
5509 @item 99.7% of the time the real size is included in the interval
5510 [2^(average-3*stddev), 2^(average+3*stddev)]
5511 @end itemize
5512
The expected standard deviation for 64 rounds in a network of stable size
5514 is 0.2. Thus, we can say that normally:
5515
5516 @itemize @bullet
5517 @item 68% of the time the real size is in the range [-13%, +15%]
5518 @item 95% of the time the real size is in the range [-24%, +32%]
5519 @item 99.7% of the time the real size is in the range [-34%, +52%]
5520 @end itemize
5521
5522 As said in the introduction, we can be quite sure that usually the real
5523 size is between one third and three times the estimate. This can of
5524 course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size
is changing rapidly).
5529
5530 @node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples
5532
5533 @c %**end of header
5534
Let's close with a couple of examples.
5536
5537 @table @asis
5538
@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. (The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.)
5544
@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 million peers. (The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.4M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.)
5549
5550 @end table
5551
To put this in perspective, recall that the LHC Higgs boson results were
announced with "5 sigma" and "6 sigma" certainties. In the second example
above, a 5 sigma minimum would be 2 million and a 6 sigma minimum would be
about 1.8 million.
5556
5557 @node The NSE Client-Service Protocol
5558 @subsection The NSE Client-Service Protocol
5559
5560 @c %**end of header
5561
As with the API, the client-service protocol is very simple; it has only
two different messages, defined in @code{src/nse/nse.h}:
5564
5565 @itemize @bullet
5566 @item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
5567 and is sent from the client to the service upon connection.
5568 @item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
5569 the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
5571 deviation for the respective round.
5572 @end itemize
5573
5574 When the @code{GNUNET_NSE_disconnect} API call is executed, the client
5575 simply disconnects from the service, with no message involved.
5576
5577 @cindex NSE Peer-to-Peer Protocol
5578 @node The NSE Peer-to-Peer Protocol
5579 @subsection The NSE Peer-to-Peer Protocol
5580
5581 @c %**end of header
5582
5583 The NSE subsystem only has one message in the P2P protocol, the
5584 @code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
5585
The key contents of this message are the timestamp to identify the round
5587 (differences in system clocks may cause some peers to send messages way
5588 too early or way too late, so the timestamp allows other peers to
5589 identify such messages easily), the
5590 @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
5591 used to make it difficult to mount a
5592 @uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
5593 public key, which is used to verify the signature on the message.
5594
5595 Every peer stores a message for the previous, current and next round. The
5596 messages for the previous and current round are given to peers that
5597 connect to us. The message for the next round is simply stored until our
5598 system clock advances to the next round. The message for the current round
5599 is what we are flooding the network with right now.
5600 At the beginning of each round the peer does the following:
5601
5602 @itemize @bullet
5603 @item calculates its own distance to the target value
5604 @item creates, signs and stores the message for the current round (unless
5605 it has a better message in the "next round" slot which came early in the
5606 previous round)
5607 @item calculates, based on the stored round message (own or received) when
5608 to start flooding it to its neighbors
5609 @end itemize
5610
5611 Upon receiving a message the peer checks the validity of the message
5612 (round, proof of work, signature). The next action depends on the
5613 contents of the incoming message:
5614
5615 @itemize @bullet
5616 @item if the message is worse than the current stored message, the peer
5617 sends the current message back immediately, to stop the other peer from
5618 spreading suboptimal results
5619 @item if the message is better than the current stored message, the peer
5620 stores the new message and calculates the new target time to start
5621 spreading it to its neighbors (excluding the one the message came from)
5622 @item if the message is for the previous round, it is compared to the
5623 message stored in the "previous round slot", which may then be updated
5624 @item if the message is for the next round, it is compared to the message
5625 stored in the "next round slot", which again may then be updated
5626 @end itemize
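
The decision logic for an incoming message can be sketched as follows.
This is a simplified model and not the code of the NSE service: all names
are hypothetical, a message is reduced to its round and a
@code{matching_bits} value (assumed here to express how close the sender
is to the target, higher being better), the validity checks are assumed
to have been done by the caller, and the handling of an equally good
message is an assumption as the text does not specify it.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the peer does after processing a valid flood message. */
enum flood_action
{
  FLOOD_IGNORE,        /* nothing to do (assumption: message not better) */
  FLOOD_REPLY_CURRENT, /* send our better current-round message back */
  FLOOD_RESCHEDULE,    /* stored a better message, schedule new flooding */
  FLOOD_SLOT_UPDATED   /* updated the previous- or next-round slot */
};

/* Simplified stand-in for a flood message. */
struct flood_msg
{
  uint64_t round;         /* round identified by the timestamp */
  uint32_t matching_bits; /* closeness to the target; higher is better */
};

/* Messages kept for the previous, current and next round. */
struct flood_state
{
  uint64_t current_round;
  struct flood_msg prev;
  struct flood_msg cur;
  struct flood_msg next;
};

enum flood_action
handle_flood (struct flood_state *st, const struct flood_msg *m)
{
  struct flood_msg *slot;

  assert (NULL != st && NULL != m);
  if (m->round == st->current_round)
  {
    if (m->matching_bits < st->cur.matching_bits)
      return FLOOD_REPLY_CURRENT; /* stop the sender spreading it */
    if (m->matching_bits > st->cur.matching_bits)
    {
      st->cur = *m; /* caller computes the new flooding time */
      return FLOOD_RESCHEDULE;
    }
    return FLOOD_IGNORE;
  }
  if (m->round + 1 == st->current_round)
    slot = &st->prev;
  else if (m->round == st->current_round + 1)
    slot = &st->next;
  else
    return FLOOD_IGNORE; /* other rounds should fail validation */
  if (m->matching_bits > slot->matching_bits)
  {
    *slot = *m;
    return FLOOD_SLOT_UPDATED;
  }
  return FLOOD_IGNORE;
}
```
@end verbatim

Returning an action instead of sending messages directly keeps the sketch
free of networking code; a caller would map @code{FLOOD_REPLY_CURRENT}
and @code{FLOOD_RESCHEDULE} to the corresponding transmissions.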
5627
Finally, when it comes to sending the stored message for the current round
to the neighbors, a random delay is added for each neighbor to avoid
traffic spikes and minimize cross-messages.
5631
5632 @cindex HOSTLIST Subsystem
5633 @node HOSTLIST Subsystem
5634 @section HOSTLIST Subsystem
5635
5636 @c %**end of header
5637
5638 Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
5640 store and exchange peer addresses.
5641 GNUnet provides several methods for peers to obtain this information:
5642
5643 @itemize @bullet
5644 @item out-of-band exchange of HELLO messages (manually, using for example
5645 gnunet-peerinfo)
5646 @item HELLO messages shipped with GNUnet (automatic with distribution)
5647 @item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
5648 @item topology gossiping (learning from other peers we already connected
5649 to), and
5650 @item the HOSTLIST daemon covered in this section, which is particularly
5651 relevant for bootstrapping new peers.
5652 @end itemize
5653
5654 New peers have no existing connections (and thus cannot learn from gossip
5655 among peers), may not have other peers in their LAN and might be started
5656 with an outdated set of HELLO messages from the distribution.
5657 In this case, getting new peers to connect to the network requires either
5658 manual effort or the use of a HOSTLIST to obtain HELLOs.
5659
5660 @menu
5661 * HELLOs::
5662 * Overview for the HOSTLIST subsystem::
5663 * Interacting with the HOSTLIST daemon::
5664 * Hostlist security address validation::
5665 * The HOSTLIST daemon::
5666 * The HOSTLIST server::
5667 * The HOSTLIST client::
5668 * Usage::
5669 @end menu
5670
5671 @node HELLOs
5672 @subsection HELLOs
5673
5674 @c %**end of header
5675
The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards.
5678 Besides the identity of the peer (based on the cryptographic public key) a
5679 HELLO message may contain address information that specifies ways to
5680 contact a peer. By obtaining HELLO messages, a peer can learn how to
5681 contact other peers.
5682
5683 @node Overview for the HOSTLIST subsystem
5684 @subsection Overview for the HOSTLIST subsystem
5685
5686 @c %**end of header
5687
5688 The HOSTLIST subsystem provides a way to distribute and obtain contact
5689 information to connect to other peers using a simple HTTP GET request.
Its implementation is split into three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.
5702
5703
5704 @menu
5705 * Features::
5706 * HOSTLIST - Limitations::
5707 @end menu
5708
5709 @node Features
5710 @subsubsection Features
5711
5712 @c %**end of header
5713
5714 The HOSTLIST daemon can:
5715
5716 @itemize @bullet
@item provide HELLO messages with validated addresses obtained from
PEERINFO for other peers to download
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist to other peers
via gossip
@item automatically learn about hostlist servers from the gossip of other
peers
5725 @end itemize
5726
5727 @node HOSTLIST - Limitations
5728 @subsubsection HOSTLIST - Limitations
5729
5730 @c %**end of header
5731
5732 The HOSTLIST daemon does not:
5733
5734 @itemize @bullet
5735 @item verify the cryptographic information in the HELLO messages
5736 @item verify the address information in the HELLO messages
5737 @end itemize
5738
5739 @node Interacting with the HOSTLIST daemon
5740 @subsection Interacting with the HOSTLIST daemon
5741
5742 @c %**end of header
5743
5744 The HOSTLIST subsystem is currently implemented as a daemon, so there is
5745 no need for the user to interact with it and therefore there is no
5746 command line tool and no API to communicate with the daemon. In the
5747 future, we can envision changing this to allow users to manually trigger
5748 the download of a hostlist.
5749
5750 Since there is no command line interface to interact with HOSTLIST, the
5751 only way to interact with the hostlist is to use STATISTICS to obtain or
5752 modify information about the status of HOSTLIST:
5753
5754 @example
5755 $ gnunet-statistics -s hostlist
5756 @end example
5757
5758 @noindent
5759 In particular, HOSTLIST includes a @strong{persistent} value in statistics
5760 that specifies when the hostlist server might be queried next. As this
5761 value is exponentially increasing during runtime, developers may want to
reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
5764 daemon (as HOSTLIST does not monitor STATISTICS for changes to the
5765 download frequency).
5766
5767 @node Hostlist security address validation
5768 @subsection Hostlist security address validation
5769
5770 @c %**end of header
5771
5772 Since information obtained from other parties cannot be trusted without
5773 validation, we have to distinguish between @emph{validated} and
5774 @emph{not validated} addresses. Before using (and so trusting)
5775 information from other parties, this information has to be double-checked
5776 (validated). Address validation is not done by HOSTLIST but by the
5777 TRANSPORT service.
5778
5779 The HOSTLIST component is functionally located between the PEERINFO and
5780 the TRANSPORT subsystem. When acting as a server, the daemon obtains valid
5781 (@emph{validated}) peer information (HELLO messages) from the PEERINFO
5782 service and provides it to other peers. When acting as a client, it
5783 contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.
5786
5787 @cindex HOSTLIST daemon
5788 @node The HOSTLIST daemon
5789 @subsection The HOSTLIST daemon
5790
5791 @c %**end of header
5792
5793 The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5794 started by the ARM service and (if configured) starts the HOSTLIST client
5795 and server components.
5796
If the daemon provides a hostlist itself, it can advertise its own
hostlist to other peers. To do so, it sends a
@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service, and then forward this kind of message to
the HOSTLIST client. The client then uses all available URLs to download
peer information when necessary.
5806
When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
this kind of message. Next, it starts (if configured) the client and
server. It passes pointers to CORE connect, disconnect and receive
handlers where the client and server store their functions, so the daemon
can notify them about CORE events.
5813
5814 To clean up on shutdown, the daemon has a cleaning task, shutting down all
5815 subsystems and disconnecting from CORE.
5816
5817 @cindex HOSTLIST server
5818 @node The HOSTLIST server
5819 @subsection The HOSTLIST server
5820
5821 @c %**end of header
5822
The server provides a way for other peers to obtain HELLOs. Basically, it
is a small web server that other peers can connect to in order to download
a list of HELLOs using standard HTTP; it may also advertise the URL of the
hostlist to other peers connecting on CORE level.
5827
5828
5829 @menu
5830 * The HTTP Server::
5831 * Advertising the URL::
5832 @end menu
5833
5834 @node The HTTP Server
5835 @subsubsection The HTTP Server
5836
5837 @c %**end of header
5838
5839 During startup, the server starts a web server listening on the port
5840 specified with the HTTPPORT value (default 8080). In addition it connects
to the PEERINFO service to obtain peer information. The HOSTLIST server
uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for all peers and adds their information to a new hostlist if
they are suitable (expired addresses and HELLOs without addresses are both
not suitable) and the maximum size for a hostlist is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5847 When PEERINFO finishes (with a last NULL callback), the server destroys
5848 the previous hostlist response available for download on the web server
5849 and replaces it with the updated hostlist. The hostlist format is
5850 basically a sequence of HELLO messages (as obtained from PEERINFO) without
5851 any special tokenization. Since each HELLO message contains a size field,
5852 the response can easily be split into separate HELLO messages by the
5853 client.
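
Splitting such a response can be sketched as follows. The sketch assumes
the usual GNUnet message header, a 16-bit size followed by a 16-bit type,
both in network byte order, with the size covering the whole message
including the header. The names @code{split_hostlist}, @code{hello_cb}
and @code{count_message} are hypothetical and do not come from
@file{hostlist-client.c}.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout: 16-bit size, then 16-bit type, both in
   network byte order. */
#define HEADER_SIZE 4

/* Callback invoked for each complete message found in the buffer. */
typedef void (*hello_cb) (void *cls, const uint8_t *msg, uint16_t size);

/* Example callback: count the messages seen; 'cls' points to the counter. */
void
count_message (void *cls, const uint8_t *msg, uint16_t size)
{
  (void) msg;
  (void) size;
  (*(unsigned int *) cls)++;
}

/* Walk over 'buf' and call 'cb' for every complete message.  Returns the
   number of bytes consumed (a trailing partial message is left for the
   next call), or -1 if a size field is smaller than the header. */
long
split_hostlist (const uint8_t *buf, size_t len, hello_cb cb, void *cls)
{
  size_t off = 0;

  assert (NULL != buf || 0 == len);
  while (len - off >= HEADER_SIZE)
  {
    uint16_t size = (uint16_t) ((buf[off] << 8) | buf[off + 1]);

    if (size < HEADER_SIZE)
      return -1; /* malformed: smaller than its own header */
    if ((size_t) size > len - off)
      break; /* message not yet complete */
    if (NULL != cb)
      cb (cls, &buf[off], size);
    off += size;
  }
  return (long) off;
}
```
@end verbatim

Returning the number of consumed bytes lets a caller keep the unconsumed
tail in its buffer until more data arrives, which matches the incremental
way in which an HTTP download delivers data.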
5854
5855 A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response and the server will terminate the
5857 connection with the result code @code{HTTP 200 OK}.
5858 The connection will be closed immediately if no hostlist is available.
5859
5860 @node Advertising the URL
5861 @subsubsection Advertising the URL
5862
5863 @c %**end of header
5864
5865 The server also advertises the URL to download the hostlist to other peers
5866 if hostlist advertisement is enabled.
5867 When a new peer connects and has hostlist learning enabled, the server
5868 sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5869 peer using the CORE service.
5870
5871 @cindex HOSTLIST client
5872 @node The HOSTLIST client
5873 @subsection The HOSTLIST client
5874
5875 @c %**end of header
5876
5877 The client provides the functionality to download the list of HELLOs from
5878 a set of URLs.
5879 It performs a standard HTTP request to the URLs configured and learned
5880 from advertisement messages received from other peers. When a HELLO is
5881 downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5882 service for validation.
5883
5884 The client supports two modes of operation:
5885
5886 @itemize @bullet
5887 @item download of HELLOs (bootstrapping)
5888 @item learning of URLs
5889 @end itemize
5890
5891 @menu
5892 * Bootstrapping::
5893 * Learning::
5894 @end menu
5895
5896 @node Bootstrapping
5897 @subsubsection Bootstrapping
5898
5899 @c %**end of header
5900
5901 For bootstrapping, it schedules a task to download the hostlist from the
5902 set of known URLs.
5903 The downloads are only performed if the number of current
5904 connections is smaller than a minimum number of connections
5905 (at the moment 4).
The interval between downloads increases exponentially; however, once the
interval becomes longer than an hour, the growth is capped at
(number of connections * 1h).
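
One possible reading of these rules is sketched below. It is not the code
from @file{hostlist-client.c}: the function names are hypothetical, the
delay is assumed to double on each step, and the cap is assumed never to
drop below one hour (the text does not say what happens with zero
connections).

@verbatim
```c
#include <assert.h>
#include <stdint.h>

#define MIN_CONNECTIONS 4                   /* threshold named in the text */
#define ONE_HOUR_US (3600ULL * 1000 * 1000) /* one hour in microseconds */

/* Download only while the peer has fewer connections than the minimum. */
int
should_download (unsigned int connections)
{
  return connections < MIN_CONNECTIONS;
}

/* Compute the delay before the next download: double the previous
   delay, but never exceed (connections * 1h).  Assumption: the cap is
   at least one hour, even without any connections. */
uint64_t
next_download_delay (uint64_t delay_us, unsigned int connections)
{
  uint64_t next;
  uint64_t cap = (uint64_t) connections * ONE_HOUR_US;

  assert (delay_us <= UINT64_MAX / 2); /* doubling must not overflow */
  next = delay_us * 2;
  if (cap < ONE_HOUR_US)
    cap = ONE_HOUR_US;
  if (next > cap)
    next = cap;
  return next;
}
```
@end verbatim

With two connections, for instance, a delay of two hours stays at two
hours instead of doubling to four.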
5910
5911 Once the decision has been taken to download HELLOs, the daemon chooses a
5912 random URL from the list of known URLs. URLs can be configured in the
5913 configuration or be learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5919 When a full HELLO was downloaded, the HOSTLIST client offers this
5920 HELLO message to the TRANSPORT service for validation.
5921 When the download is finished or failed, statistical information about the
5922 quality of this URL is updated.
5923
5924 @cindex HOSTLIST learning
5925 @node Learning
5926 @subsubsection Learning
5927
5928 @c %**end of header
5929
5930 The client also manages hostlist advertisements from other peers. The
5931 HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5932 messages to the client subsystem, which extracts the URL from the message.
5933 Next, a test of the newly obtained URL is performed by triggering a
5934 download from the new URL. If the URL works correctly, it is added to the
5935 list of working URLs.
5936
The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, by the number of successful downloads and the
number of HELLOs) is discarded. During shutdown, the list of URLs is saved
to a file for persistence and loaded on startup. URLs from the
configuration file are never discarded.
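
The replacement policy can be illustrated with a small bounded table.
This is a sketch under stated assumptions, not the data structure used by
the HOSTLIST client: the limit of four entries, the single numeric
@code{quality} field and all names are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_URLS 4      /* hypothetical limit, chosen for illustration */
#define MAX_URL_LEN 256

struct hostlist_entry
{
  char url[MAX_URL_LEN];
  unsigned int quality; /* higher is better */
  int from_config;      /* non-zero: configured URL, never discarded */
};

struct hostlist_table
{
  struct hostlist_entry entries[MAX_URLS];
  size_t count;
};

/* Add a URL.  If the table is full, the learned entry with the worst
   quality is replaced.  Returns 1 on success, 0 if the URL is too long
   or the table only holds configured entries. */
int
hostlist_table_add (struct hostlist_table *t, const char *url,
                    unsigned int quality, int from_config)
{
  size_t slot;
  size_t i;

  assert (NULL != t && NULL != url);
  if (strlen (url) >= MAX_URL_LEN)
    return 0;
  if (t->count < MAX_URLS)
  {
    slot = t->count++;
  }
  else
  {
    slot = MAX_URLS; /* marker: no victim found yet */
    for (i = 0; i < MAX_URLS; i++)
    {
      if (t->entries[i].from_config)
        continue;
      if ((MAX_URLS == slot) ||
          (t->entries[i].quality < t->entries[slot].quality))
        slot = i;
    }
    if (MAX_URLS == slot)
      return 0;
  }
  strcpy (t->entries[slot].url, url);
  t->entries[slot].quality = quality;
  t->entries[slot].from_config = from_config;
  return 1;
}
```
@end verbatim

A table must be zero-initialized before the first call, for example with
@code{memset}.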
5943
5944 @node Usage
5945 @subsection Usage
5946
5947 @c %**end of header
5948
5949 To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
5950 section for the ARM services. This is done in the default configuration.
5951
5952 For more information on how to configure the HOSTLIST subsystem see the
5953 installation handbook:@
5954 Configuring the hostlist to bootstrap@
5955 Configuring your peer to provide a hostlist
5956
5957 @cindex IDENTITY Subsystem
5958 @node IDENTITY Subsystem
5959 @section IDENTITY Subsystem
5960
5961 @c %**end of header
5962
5963 Identities of "users" in GNUnet are called egos.
5964 Egos can be used as pseudonyms ("fake names") or be tied to an
5965 organization (for example, "GNU") or even the actual identity of a human.
5966 GNUnet users are expected to have many egos. They might have one tied to
5967 their real identity, some for organizations they manage, and more for
5968 different domains where they want to operate under a pseudonym.
5969
5970 The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does not
5972 manage identities of other users (public keys). Public keys for other
5973 users need names to become manageable. GNUnet uses the
5974 @dfn{GNU Name System} (GNS) to give names to other users and manage their
5975 public keys securely. This chapter is about the IDENTITY service,
5976 which is about the management of private keys.
5977
5978 On the network, an ego corresponds to an ECDSA key (over Curve25519,
5979 using RFC 6979, as required by GNS). Thus, users can perform actions
5980 under a particular ego by using (signing with) a particular private key.
5981 Other users can then confirm that the action was really performed by that
5982 ego by checking the signature against the respective public key.
5983
5984 The IDENTITY service allows users to associate a human-readable name with
5985 each ego. This way, users can use names that will remind them of the
5986 purpose of a particular ego.
5987 The IDENTITY service will store the respective private keys and
5988 allows applications to access key information by name.
5989 Users can change the name that is locally (!) associated with an ego.
5990 Egos can also be deleted, which means that the private key will be removed
5991 and it thus will not be possible to perform actions with that ego in the
5992 future.
5993
5994 Additionally, the IDENTITY subsystem can associate service functions with
5995 egos.
5996 For example, GNS requires the ego that should be used for the shorten
5997 zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5998 The IDENTITY service has a mapping of such service strings to the name of
5999 the ego that the user wants to use for this service, for example
6000 "my-short-zone-ego".
6001
6002 Finally, the IDENTITY API provides access to a special ego, the
6003 anonymous ego. The anonymous ego is special in that its private key is not
6004 really private, but fixed and known to everyone.
Thus, anyone can perform actions as anonymous. This is useful because,
with this trick, code does not have to contain a special case to
distinguish between anonymous and pseudonymous egos.
6008
6009 @menu
6010 * libgnunetidentity::
6011 * The IDENTITY Client-Service Protocol::
6012 @end menu
6013
6014 @cindex libgnunetidentity
6015 @node libgnunetidentity
6016 @subsection libgnunetidentity
6017 @c %**end of header
6018
6019
6020 @menu
6021 * Connecting to the service::
6022 * Operations on Egos::
6023 * The anonymous Ego::
6024 * Convenience API to lookup a single ego::
6025 * Associating egos with service functions::
6026 @end menu
6027
6028 @node Connecting to the service
6029 @subsubsection Connecting to the service
6030
6031 @c %**end of header
6032
6033 First, typical clients connect to the identity service using
6034 @code{GNUNET_IDENTITY_connect}. This function takes a callback as a
6035 parameter.
6036 If the given callback parameter is non-null, it will be invoked to notify
6037 the application about the current state of the identities in the system.
6038
6039 @itemize @bullet
6040 @item First, it will be invoked on all known egos at the time of the
6041 connection. For each ego, a handle to the ego and the user's name for the
6042 ego will be passed to the callback. Furthermore, a @code{void **} context
6043 argument will be provided which gives the client the opportunity to
6044 associate some state with the ego.
6045 @item Second, the callback will be invoked with NULL for the ego, the name
6046 and the context. This signals that the (initial) iteration over all egos
6047 has completed.
6048 @item Then, the callback will be invoked whenever something changes about
6049 an ego.
6050 If an ego is renamed, the callback is invoked with the ego handle of the
6051 ego that was renamed, and the new name. If an ego is deleted, the callback
6052 is invoked with the ego handle and a name of NULL. In the deletion case,
6053 the application should also release resources stored in the context.
6054 @item When the application destroys the connection to the identity service
6055 using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
6056 with the ego and a name of NULL (equivalent to deletion of the egos).
6057 This should again be used to clean up the per-ego context.
6058 @end itemize
6059
6060 The ego handle passed to the callback remains valid until the callback is
6061 invoked with a name of NULL, so it is safe to store a reference to the
6062 ego's handle.
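
The different invocations of the callback can be told apart as sketched
below. The sketch is self-contained and does not use the real library
types: @code{struct ego} is an opaque stand-in for the ego handle, the
function only takes the three arguments discussed above, and
@code{on_ego}, @code{enum ego_event} and @code{struct ego_state} are
hypothetical names.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct ego; /* opaque stand-in for the library's ego handle */

enum ego_event
{
  EGO_LIST_END, /* initial iteration over all egos has completed */
  EGO_UPDATED,  /* ego is known, was created or was renamed */
  EGO_GONE      /* ego was deleted, or we are disconnecting */
};

/* Per-ego state which the application attaches to the context. */
struct ego_state
{
  char name[64];
};

enum ego_event
on_ego (struct ego *ego, void **ctx, const char *name)
{
  struct ego_state *s;
  size_t n;

  if (NULL == ego)
    return EGO_LIST_END;
  assert (NULL != ctx);
  if (NULL == name)
  {
    free (*ctx); /* release the per-ego state */
    *ctx = NULL;
    return EGO_GONE;
  }
  s = *ctx;
  if (NULL == s)
  {
    s = calloc (1, sizeof (struct ego_state));
    if (NULL == s)
      return EGO_UPDATED; /* out of memory: no state is kept */
    *ctx = s;
  }
  n = strlen (name);
  if (n >= sizeof (s->name))
    n = sizeof (s->name) - 1; /* truncate overly long names */
  memcpy (s->name, name, n);
  s->name[n] = '\0';
  return EGO_UPDATED;
}
```
@end verbatim

Keeping the state behind the context pointer means that the cleanup for
deletion and for disconnecting is the same code path.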
6063
6064 @node Operations on Egos
6065 @subsubsection Operations on Egos
6066
6067 @c %**end of header
6068
6069 Given an ego handle, the main operations are to get its associated private
6070 key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
6071 public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
6072
6073 The other operations on egos are pretty straightforward.
6074 Using @code{GNUNET_IDENTITY_create}, an application can request the
6075 creation of an ego by specifying the desired name.
6076 The operation will fail if that name is
6077 already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
6078 existing ego can be changed. Finally, egos can be deleted using
6079 @code{GNUNET_IDENTITY_delete}. All of these operations will trigger
6080 updates to the callback given to the @code{GNUNET_IDENTITY_connect}
6081 function of all applications that are connected with the identity service
6082 at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
6083 operations before the respective continuations would be called.
6084 It is not guaranteed that the operation will not be completed anyway,
6085 only the continuation will no longer be called.
6086
6087 @node The anonymous Ego
6088 @subsubsection The anonymous Ego
6089
6090 @c %**end of header
6091
6092 A special way to obtain an ego handle is to call
6093 @code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
6094 "anonymous" user --- anyone knows and can get the private key for this
6095 user, so it is suitable for operations that are supposed to be anonymous
6096 but require signatures (for example, to avoid a special path in the code).
6097 The anonymous ego is always valid and accessing it does not require a
6098 connection to the identity service.
6099
6100 @node Convenience API to lookup a single ego
6101 @subsubsection Convenience API to lookup a single ego
6102
6103
As applications commonly only need to look up a single ego, there is a
convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
look up a single ego by name. Note that this is the user's name for the
6107 ego, not the service function. The resulting ego will be returned via a
6108 callback and will only be valid during that callback. The operation can
6109 be canceled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
6110 (cancellation is only legal before the callback is invoked).
6111
6112 @node Associating egos with service functions
6113 @subsubsection Associating egos with service functions
6114
6115
6116 The @code{GNUNET_IDENTITY_set} function is used to associate a particular
6117 ego with a service function. The name used by the service and the ego are
6118 given as arguments.
Afterwards, the service can use its name to look up the associated ego
6120 using @code{GNUNET_IDENTITY_get}.
6121
6122 @node The IDENTITY Client-Service Protocol
6123 @subsection The IDENTITY Client-Service Protocol
6124
6125 @c %**end of header
6126
6127 A client connecting to the identity service first sends a message with
6128 type
6129 @code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
6130 client will receive information about changes to the egos by receiving
6131 messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
6132 Those messages contain the private key of the ego and the user's name of
6133 the ego (or zero bytes for the name to indicate that the ego was deleted).
6134 A special bit @code{end_of_list} is used to indicate the end of the
6135 initial iteration over the identity service's egos.
6136
6137 The client can trigger changes to the egos by sending @code{CREATE},
6138 @code{RENAME} or @code{DELETE} messages.
6139 The CREATE message contains the private key and the desired name.@
6140 The RENAME message contains the old name and the new name.@
6141 The DELETE message only needs to include the name of the ego to delete.@
6142 The service responds to each of these messages with a @code{RESULT_CODE}
6143 message which indicates success or error of the operation, and possibly
6144 a human-readable error message.
6145
6146 Finally, the client can bind the name of a service function to an ego by
6147 sending a @code{SET_DEFAULT} message with the name of the service function
6148 and the private key of the ego.
6149 Such bindings can then be resolved using a @code{GET_DEFAULT} message,
6150 which includes the name of the service function. The identity service
6151 will respond to a GET_DEFAULT request with a SET_DEFAULT message
6152 containing the respective information, or with a RESULT_CODE to
6153 indicate an error.
6154
6155 @cindex NAMESTORE Subsystem
6156 @node NAMESTORE Subsystem
6157 @section NAMESTORE Subsystem
6158
The NAMESTORE subsystem provides persistent storage for local GNS zone
information. All local GNS zone information is managed by NAMESTORE. It
provides both the functionality to administer local GNS information (e.g.
delete and add records) as well as to retrieve GNS information (e.g. to
list name information in a client).
NAMESTORE only manages the persistent storage of zone information
belonging to the user running the service: GNS information from other
users obtained from the DHT is stored by the NAMECACHE subsystem.
6167
NAMESTORE uses a plugin-based database backend to store GNS information
with good performance. SQLite, MySQL and PostgreSQL are the supported
database backends.
NAMESTORE clients interact with the IDENTITY subsystem to obtain
cryptographic information about zones based on egos as described with the
IDENTITY subsystem, but internally NAMESTORE refers to zones using the
ECDSA private key.
In addition, it collaborates with the NAMECACHE subsystem and
stores zone information in the GNS cache when local information is
modified, to increase lookup performance for local information.
6178
NAMESTORE provides functionality to look up and store records, to iterate
over a specific or all zones and to monitor zones for changes. NAMESTORE
functionality can be accessed using the NAMESTORE API or the NAMESTORE
command line tool.
6183
6184 @menu
6185 * libgnunetnamestore::
6186 @end menu
6187
6188 @cindex libgnunetnamestore
6189 @node libgnunetnamestore
6190 @subsection libgnunetnamestore
6191
To interact with NAMESTORE, clients first connect to the NAMESTORE service
using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
As a result, they obtain a NAMESTORE handle which they can use for
operations, or NULL if the connection failed.
6196
6197 To disconnect from NAMESTORE, clients use
6198 @code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.
6199
NAMESTORE internally uses the ECDSA private key to refer to zones. These
private keys can be obtained from the IDENTITY subsystem.
Here, @emph{egos} can be used to refer to zones, or the default ego
assigned to the GNS subsystem can be used to obtain the master zone's
private key.
6205
6206
6207 @menu
6208 * Editing Zone Information::
6209 * Iterating Zone Information::
6210 * Monitoring Zone Information::
6211 @end menu
6212
6213 @node Editing Zone Information
6214 @subsubsection Editing Zone Information
6215
6216 @c %**end of header
6217
NAMESTORE provides functions to look up records stored under a label in a
zone and to store records under a label in a zone.

To store (and delete) records, the client uses the
@code{GNUNET_NAMESTORE_records_store} function and has to provide the
namestore handle to use, the private key of the zone, the label to store
the records under, the records and the number of records, plus a callback
function.
After the operation is performed, NAMESTORE will call the provided
callback function with the result: @code{GNUNET_SYSERR} on failure
(including timeout, queue drop or failure to validate), @code{GNUNET_NO}
if content was already there or not found, and @code{GNUNET_YES} (or
another positive value) on success, plus an additional error message.
6231
Records are deleted by calling the store function with 0 records to store.
Note that records are not merged when records already exist under the
label: the stored set replaces the existing one.
A client that wants to add a record therefore first has to retrieve the
existing records, merge the new record into them and then store the result.
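
The read-merge-store pattern described above can be sketched as follows.
This is only a self-contained illustration and does not use the NAMESTORE
API: @code{struct record}, @code{struct label_entry}, @code{entry_store}
and @code{entry_add_record} are hypothetical names, and an in-memory array
stands in for the database.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 8

/* Hypothetical stand-in for a resource record: a type and a value. */
struct record
{
  int type;
  char value[32];
};

/* Hypothetical stand-in for the record set stored under one label. */
struct label_entry
{
  struct record records[MAX_RECORDS];
  size_t count;
};

/* Mimics the store semantics described above: the new set REPLACES
   whatever was stored before; storing zero records deletes the entry. */
void
entry_store (struct label_entry *e, const struct record *rd, size_t rd_count)
{
  assert (rd_count <= MAX_RECORDS);
  if (rd_count > 0)
    memmove (e->records, rd, rd_count * sizeof (struct record));
  e->count = rd_count;
}

/* Read-merge-store: copy the existing records, append the new one and
   store the combined set.  Returns 0 on success, -1 if the set is full. */
int
entry_add_record (struct label_entry *e, const struct record *new_rd)
{
  struct record merged[MAX_RECORDS];

  if (e->count >= MAX_RECORDS)
    return -1;
  memcpy (merged, e->records, e->count * sizeof (struct record));
  merged[e->count] = *new_rd;
  entry_store (e, merged, e->count + 1);
  return 0;
}
```
@end verbatim

Calling @code{entry_store} twice with one record each leaves only the
second record behind, whereas @code{entry_add_record} keeps both.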
6237
To perform a lookup operation, the client uses the
@code{GNUNET_NAMESTORE_records_lookup} function. Here it has to pass the
namestore handle, the private key of the zone and the label. It also has
to provide a callback function which will be called with the result of
the lookup operation:
the zone for the records, the label, and the records together with their
number.
6245
A special operation is used to set the preferred nickname for a zone.
This nickname is stored with the zone and is automatically merged with
all labels and records stored in a zone. Here the client uses the
@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
the zone, the nickname as a string, plus a callback to be called with the
result of the operation.
6252
6253 @node Iterating Zone Information
6254 @subsubsection Iterating Zone Information
6255
6256 @c %**end of header
6257
A client can iterate over all information in a zone or in all zones
managed by NAMESTORE.
Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
function and passes the namestore handle, the zone to iterate over and a
callback function to call with the result.
If the client wants to iterate over all zones, it passes NULL for the zone.
A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
continue iteration.
6266
NAMESTORE calls the callback for every result and expects the client to
call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
When NAMESTORE has reached the last item, it calls the callback with a
NULL value to indicate the end of the iteration.
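
The callback-driven iteration described above can be illustrated with a
small synchronous sketch. It does not use the NAMESTORE API: the iterator
structure, @code{iterator_start}, @code{iterator_next},
@code{iterator_stop} and the counting callback are hypothetical, and a
plain array of labels stands in for the zone database. The real service
delivers results asynchronously.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Called once per result; label == NULL signals the end of the iteration. */
typedef void (*result_cb) (void *cls, const char *label);

/* Hypothetical iterator over an in-memory list of labels. */
struct zone_iterator
{
  const char **labels;
  size_t count;
  size_t pos;
  int stopped;
  result_cb cb;
  void *cb_cls;
};

/* Deliver the next result, or the NULL terminator after the last one. */
void
iterator_next (struct zone_iterator *it)
{
  if (it->stopped)
    return;
  if (it->pos < it->count)
    {
      it->cb (it->cb_cls, it->labels[it->pos++]);
      return;
    }
  it->stopped = 1;
  it->cb (it->cb_cls, NULL);
}

/* Abort the iteration; no further callbacks are made. */
void
iterator_stop (struct zone_iterator *it)
{
  it->stopped = 1;
}

/* Set up the iterator and deliver the first result. */
void
iterator_start (struct zone_iterator *it, const char **labels, size_t count,
                result_cb cb, void *cb_cls)
{
  it->labels = labels;
  it->count = count;
  it->pos = 0;
  it->stopped = 0;
  it->cb = cb;
  it->cb_cls = cb_cls;
  iterator_next (it);
}

/* Example client state: counts results and asks for the next one. */
struct count_ctx
{
  struct zone_iterator *it;
  size_t seen;
  int finished;
};

void
count_cb (void *cls, const char *label)
{
  struct count_ctx *ctx = cls;

  if (NULL == label)
    {
      ctx->finished = 1;  /* end of iteration */
      return;
    }
  ctx->seen++;
  iterator_next (ctx->it);  /* the client drives the iteration */
}
```
@end verbatim

The important point is that the client, not the service, decides when the
next result is delivered: a callback that never calls the "next" function
simply receives no further results.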
6272
6273 @node Monitoring Zone Information
6274 @subsubsection Monitoring Zone Information
6275
6276 @c %**end of header
6277
Clients can also monitor zones to be notified about changes. Here the
client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
passes the private key of the zone and a callback function to call
with updates for a zone.
The client can request to first obtain the existing zone information by
iterating over the zone, and can specify a synchronization callback to be
called when the client and the namestore are synced.
6285
6286 On an update, NAMESTORE will call the callback with the private key of the
6287 zone, the label and the records and their number.
6288
6289 To stop monitoring, the client calls
6290 @code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
6291 from the function to start the monitoring.
6292
6293 @cindex PEERINFO Subsystem
6294 @node PEERINFO Subsystem
6295 @section PEERINFO Subsystem
6296
6297 @c %**end of header
6298
The PEERINFO subsystem is used to store verified (validated) information
about known peers in a persistent way. It obtains these addresses, for
example, from the TRANSPORT service, which is in charge of address
validation.
Validation means that the information in the HELLO message is checked by
connecting to the addresses and performing a cryptographic handshake to
authenticate the peer instance claiming to be reachable at these
addresses.
PEERINFO does not validate the HELLO messages itself but only stores them
and gives them to interested clients.
6308
As future work, we are considering moving from storing just HELLO messages
to providing a generic persistent per-peer information store.
More and more subsystems need to store per-peer information in a
persistent way.
To avoid duplicating this functionality, we plan to provide a PEERSTORE
service for this purpose.
6315
6316 @menu
6317 * PEERINFO - Features::
6318 * PEERINFO - Limitations::
6319 * DeveloperPeer Information::
6320 * Startup::
6321 * Managing Information::
6322 * Obtaining Information::
6323 * The PEERINFO Client-Service Protocol::
6324 * libgnunetpeerinfo::
6325 @end menu
6326
6327 @node PEERINFO - Features
6328 @subsection PEERINFO - Features
6329
6330 @c %**end of header
6331
6332 @itemize @bullet
6333 @item Persistent storage
6334 @item Client notification mechanism on update
6335 @item Periodic clean up for expired information
6336 @item Differentiation between public and friend-only HELLO
6337 @end itemize
6338
6339 @node PEERINFO - Limitations
6340 @subsection PEERINFO - Limitations
6341
6342
6343 @itemize @bullet
6344 @item Does not perform HELLO validation
6345 @end itemize
6346
6347 @node DeveloperPeer Information
6348 @subsection DeveloperPeer Information
6349
6350 @c %**end of header
6351
The PEERINFO subsystem stores this information in the form of HELLO
messages, which you can think of as business cards.
These HELLO messages contain the public key of a peer and the addresses
a peer can be reached under.
The addresses include an expiration date describing how long they are
valid. This information is updated regularly by the TRANSPORT service by
revalidating the address.
If an address is expired and not renewed, it can be removed from the
HELLO message.
6359 If an address is expired and not renewed, it can be removed from the
6360 HELLO message.
6361
Some peers do not want to have their HELLO messages distributed to other
peers, especially when GNUnet's friend-to-friend mode is enabled.
To prevent this undesired distribution, PEERINFO distinguishes between
@emph{public} and @emph{friend-only} HELLO messages.
Public HELLO messages can be freely distributed to other (possibly
unknown) peers (for example using the hostlist, gossiping, broadcasting),
whereas friend-only HELLO messages may not be distributed to other peers.
Friend-only HELLO messages have an additional flag @code{friend_only} set
internally. For public HELLO messages this flag is not set.
PEERINFO does not and cannot check if a client is allowed to obtain a
specific HELLO type.
6373
The HELLO messages can be managed using the GNUnet HELLO library.
Other GNUnet systems can obtain this information from PEERINFO and use
it for their purposes.
Clients are, for example, the HOSTLIST component, which provides this
information to other peers in the form of a hostlist, or the TRANSPORT
subsystem, which uses this information to maintain connections to other
peers.
6380
6381 @node Startup
6382 @subsection Startup
6383
6384 @c %**end of header
6385
During startup the PEERINFO service loads persistent HELLOs from disk.
First PEERINFO parses the directory in which it stores PEERINFO
information, as configured in the @code{HOSTS} value of the
@code{PEERINFO} configuration section.
For all files found in this directory valid HELLO messages are extracted.
In addition it loads HELLO messages shipped with the GNUnet distribution.
These HELLOs are used to simplify network bootstrapping by providing
valid peer information with the distribution.
The use of these HELLOs can be prevented by setting
@code{USE_INCLUDED_HELLOS} in the @code{PEERINFO} configuration section to
@code{NO}. Files containing invalid information are removed.
6396
6397 @node Managing Information
6398 @subsection Managing Information
6399
6400 @c %**end of header
6401
The PEERINFO service stores information about known peers and a single
HELLO message for every peer.
A peer does not need to have a HELLO if no information is available.
HELLO information from different sources, for example a HELLO obtained
from a remote HOSTLIST and a second HELLO stored on disk, is combined
and merged into one single HELLO message per peer which will be given to
clients. During this merge process the HELLO is immediately written to
disk to ensure persistence.
6410
In addition, PEERINFO periodically scans the directory where information
is stored for empty HELLO messages and HELLO messages with expired
TRANSPORT addresses.
This periodic task scans all files in the directory and recreates the
HELLO messages it finds.
Expired TRANSPORT addresses are removed from the HELLO, and if the
HELLO does not contain any valid addresses, it is discarded and removed
from the disk.
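
The merge and expiration steps described above can be sketched as follows.
This is a simplified illustration that does not use the GNUnet HELLO
library: @code{struct addr}, @code{struct hello}, @code{hello_merge} and
@code{hello_expire} are hypothetical, addresses are plain strings, and the
rule that the later expiration time wins for an address present in both
HELLOs is an assumption made for the sketch.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ADDRS 8

/* Hypothetical, simplified address entry: a printable address plus an
   absolute expiration time in seconds. */
struct addr
{
  char uri[48];
  uint64_t expires;
};

/* Hypothetical, simplified HELLO: just a list of addresses. */
struct hello
{
  struct addr addrs[MAX_ADDRS];
  size_t count;
};

/* Merge `src` into `dst`: an address present in both keeps the later
   expiration (assumed policy); addresses only in `src` are appended
   while space remains. */
void
hello_merge (struct hello *dst, const struct hello *src)
{
  for (size_t i = 0; i < src->count; i++)
    {
      size_t j;

      for (j = 0; j < dst->count; j++)
        if (0 == strcmp (dst->addrs[j].uri, src->addrs[i].uri))
          break;
      if (j < dst->count)
        {
          if (src->addrs[i].expires > dst->addrs[j].expires)
            dst->addrs[j].expires = src->addrs[i].expires;
        }
      else if (dst->count < MAX_ADDRS)
        dst->addrs[dst->count++] = src->addrs[i];
    }
}

/* Drop addresses that expired at or before `now` and return the number
   of addresses left.  A result of 0 means the HELLO can be discarded. */
size_t
hello_expire (struct hello *h, uint64_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < h->count; i++)
    if (h->addrs[i].expires > now)
      h->addrs[kept++] = h->addrs[i];
  h->count = kept;
  return kept;
}
```
@end verbatim

A periodic clean-up task in this model would call @code{hello_expire} for
every stored HELLO and delete the corresponding file whenever the function
returns 0.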
6418
6419 @node Obtaining Information
6420 @subsection Obtaining Information
6421
6422 @c %**end of header
6423
6424 When a client requests information from PEERINFO, PEERINFO performs a
6425 lookup for the respective peer or all peers if desired and transmits this
6426 information to the client.
6427 The client can specify if friend-only HELLOs have to be included or not
6428 and PEERINFO filters the respective HELLO messages before transmitting
6429 information.
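
The lookup and filtering step described above can be sketched as a single
selection function. The names @code{struct hello_entry} and
@code{hello_select} are hypothetical, peer identities are plain strings
here, and the sketch is not taken from the PEERINFO implementation.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a stored HELLO. */
struct hello_entry
{
  const char *peer;  /* printable peer identity */
  int friend_only;   /* non-zero for friend-only HELLOs */
};

/* Copy pointers to the matching entries into `out` (at most `out_cap`)
   and return how many were selected.  `peer == NULL` selects all peers;
   friend-only HELLOs are skipped unless `include_friend_only` is set. */
size_t
hello_select (const struct hello_entry *all, size_t n, const char *peer,
              int include_friend_only,
              const struct hello_entry **out, size_t out_cap)
{
  size_t selected = 0;

  for (size_t i = 0; (i < n) && (selected < out_cap); i++)
    {
      if ((NULL != peer) && (0 != strcmp (all[i].peer, peer)))
        continue;
      if (all[i].friend_only && ! include_friend_only)
        continue;
      out[selected++] = &all[i];
    }
  return selected;
}
```
@end verbatim

Note that the filter is driven entirely by the flag supplied by the
client, which matches the statement that PEERINFO cannot check whether a
client is entitled to a specific HELLO type.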
6430
To notify clients about changes to PEERINFO information, PEERINFO
maintains a list of clients interested in these notifications.
Such a notification occurs if a HELLO for a peer was updated (due to a
merge for example) or a new peer was added.
6435
6436 @node The PEERINFO Client-Service Protocol
6437 @subsection The PEERINFO Client-Service Protocol
6438
6439 @c %**end of header
6440
To connect to and disconnect from the PEERINFO service, PEERINFO
utilizes the util client/server infrastructure, so no special message
types are used here.
6444
6445 To add information for a peer, the plain HELLO message is transmitted to
6446 the service without any wrapping. All pieces of information required are
6447 stored within the HELLO message.
6448 The PEERINFO service provides a message handler accepting and processing
6449 these HELLO messages.
6450
When obtaining PEERINFO information using the iterate functionality,
specific messages are used. To obtain information for all peers, a
@code{struct ListAllPeersMessage} with message type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
@code{include_friend_only} to indicate if friend-only HELLO messages
should be included is transmitted. If information for a specific peer is
required, a @code{struct ListPeerMessage} with
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
used.
6460
For both variants the PEERINFO service replies for each HELLO message it
wants to transmit with a @code{struct InfoMessage} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
The final message is a @code{struct GNUNET_MessageHeader} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
message, it can proceed with the next request if any is pending.
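
The reply sequence described above, a series of HELLO-carrying messages
closed by a header-only terminator, can be processed with a loop like the
one below. The sketch does not use the real wire format: the tags
@code{REPLY_INFO} and @code{REPLY_END}, @code{struct reply} and
@code{collect_replies} are hypothetical stand-ins, and the HELLO payload is
represented by a string.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message tags standing in for the real message types. */
enum reply_type
{
  REPLY_INFO,  /* carries one HELLO */
  REPLY_END    /* header-only terminator */
};

struct reply
{
  enum reply_type type;
  const char *hello;  /* placeholder payload; NULL for REPLY_END */
};

/* Consume the replies for one request.  Stores up to `cap` HELLOs in
   `out`, sets `*consumed` to the number of messages read (terminator
   included) and returns the number of HELLOs stored, or -1 if the stream
   ended before the terminator arrived. */
int
collect_replies (const struct reply *stream, size_t n,
                 const char **out, size_t cap, size_t *consumed)
{
  size_t hellos = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (REPLY_END == stream[i].type)
        {
          *consumed = i + 1;
          return (int) hellos;
        }
      if (hellos < cap)
        out[hellos++] = stream[i].hello;
    }
  *consumed = n;
  return -1;
}
```
@end verbatim

Because the terminator delimits the answer to one request, a client can
queue several requests on the same connection and attribute each batch of
replies to the request that is currently at the head of its queue.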
6467
6468 @node libgnunetpeerinfo
6469 @subsection libgnunetpeerinfo
6470
6471 @c %**end of header
6472
6473 The PEERINFO API consists mainly of three different functionalities:
6474
6475 @itemize @bullet
6476 @item maintaining a connection to the service
6477 @item adding new information to the PEERINFO service
6478 @item retrieving information from the PEERINFO service
6479 @end itemize
6480
6481 @menu
6482 * Connecting to the PEERINFO Service::
6483 * Adding Information to the PEERINFO Service::
6484 * Obtaining Information from the PEERINFO Service::
6485 @end menu
6486
6487 @node Connecting to the PEERINFO Service
6488 @subsubsection Connecting to the PEERINFO Service
6489
6490 @c %**end of header
6491
To connect to the PEERINFO service, the function
@code{GNUNET_PEERINFO_connect} is used, taking a configuration handle as
an argument. To disconnect from PEERINFO, the function
@code{GNUNET_PEERINFO_disconnect} has to be called with the PEERINFO
handle returned from the connect function.
6497
6498 @node Adding Information to the PEERINFO Service
6499 @subsubsection Adding Information to the PEERINFO Service
6500
6501 @c %**end of header
6502
@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
storage. This function takes the PEERINFO handle, the HELLO
message to store and a continuation with a closure to be called with the
result of the operation.
@code{GNUNET_PEERINFO_add_peer} returns a handle to this operation, which
allows the operation to be cancelled with the respective cancel function
@code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from
PEERINFO you can iterate over all information stored with PEERINFO or you
can tell PEERINFO to notify you when new peer information is available.
6512
6513 @node Obtaining Information from the PEERINFO Service
6514 @subsubsection Obtaining Information from the PEERINFO Service
6515
6516 @c %**end of header
6517
To iterate over information in PEERINFO you use
@code{GNUNET_PEERINFO_iterate}.
This function expects the PEERINFO handle, a flag indicating whether
HELLO messages intended for friend-only mode should be included, a timeout
specifying how long the operation may take, and a callback with a callback
closure to be called for the results.
If you want to obtain information for a specific peer, you can specify
the peer identity; if this identity is NULL, information for all peers is
returned. The function returns a handle that allows the operation to be
cancelled using @code{GNUNET_PEERINFO_iterate_cancel}.
6528
To get notified when peer information changes, you can use
@code{GNUNET_PEERINFO_notify}.
This function expects a configuration handle and a flag indicating
whether friend-only HELLO messages should be included. The PEERINFO
service will call the callback function for every change. The function
returns a handle to cancel notifications
with @code{GNUNET_PEERINFO_notify_cancel}.
6536
6537 @cindex PEERSTORE Subsystem
6538 @node PEERSTORE Subsystem
6539 @section PEERSTORE Subsystem
6540
6541 @c %**end of header
6542
6543 GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
6544 GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
6545 store and retrieve arbitrary data.
6546 Each data record stored with PEERSTORE contains the following fields:
6547
6548 @itemize @bullet
6549 @item subsystem: Name of the subsystem responsible for the record.
6550 @item peerid: Identity of the peer this record is related to.
6551 @item key: a key string identifying the record.
6552 @item value: binary record value.
6553 @item expiry: record expiry date.
6554 @end itemize
6555
6556 @menu
6557 * Functionality::
6558 * Architecture::
6559 * libgnunetpeerstore::
6560 @end menu
6561
6562 @node Functionality
6563 @subsection Functionality
6564
6565 @c %**end of header
6566
Subsystems can store any type of value under a (subsystem, peerid, key)
combination. A "replace" flag set during store operations forces the
PEERSTORE to replace any old values stored under the same
(subsystem, peerid, key) combination with the new value.
Additionally, an expiry date is set after which the record is
@emph{possibly} deleted by PEERSTORE.
6573
Subsystems can iterate over all values stored under any of the following
combinations of fields:
6576
6577 @itemize @bullet
6578 @item (subsystem)
6579 @item (subsystem, peerid)
6580 @item (subsystem, key)
6581 @item (subsystem, peerid, key)
6582 @end itemize
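
The four field combinations listed above amount to a match in which the
subsystem is mandatory and the peer identity and key are optional. A
minimal sketch of such a match follows; it is not the PEERSTORE
implementation, and @code{struct ps_record}, @code{ps_match} and
@code{ps_count_matches} are hypothetical names that use strings in place
of the real peer identity and binary value.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: the value and expiry are omitted. */
struct ps_record
{
  const char *subsystem;
  const char *peerid;
  const char *key;
};

/* Returns 1 if `r` matches the query.  `subsystem` is mandatory; a NULL
   `peerid` or `key` acts as a wildcard, which yields exactly the four
   combinations listed above. */
int
ps_match (const struct ps_record *r, const char *subsystem,
          const char *peerid, const char *key)
{
  assert (NULL != subsystem);
  if (0 != strcmp (r->subsystem, subsystem))
    return 0;
  if ((NULL != peerid) && (0 != strcmp (r->peerid, peerid)))
    return 0;
  if ((NULL != key) && (0 != strcmp (r->key, key)))
    return 0;
  return 1;
}

/* Count the records an iteration with the given query would visit. */
size_t
ps_count_matches (const struct ps_record *rs, size_t n,
                  const char *subsystem, const char *peerid,
                  const char *key)
{
  size_t matches = 0;

  for (size_t i = 0; i < n; i++)
    if (ps_match (&rs[i], subsystem, peerid, key))
      matches++;
  return matches;
}
```
@end verbatim

An iteration in this model visits every record for which @code{ps_match}
returns 1 and then signals the end of the result set.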
6583
6584 Subsystems can also request to be notified about any new values stored
6585 under a (subsystem, peerid, key) combination by sending a "watch"
6586 request to PEERSTORE.
6587
6588 @node Architecture
6589 @subsection Architecture
6590
6591 @c %**end of header
6592
6593 PEERSTORE implements the following components:
6594
6595 @itemize @bullet
6596 @item PEERSTORE service: Handles store, iterate and watch operations.
6597 @item PEERSTORE API: API to be used by other subsystems to communicate and
6598 issue commands to the PEERSTORE service.
@item PEERSTORE plugins: Handle the persistent storage. At the moment,
only an "sqlite" plugin is implemented.
6601 @end itemize
6602
6603 @cindex libgnunetpeerstore
6604 @node libgnunetpeerstore
6605 @subsection libgnunetpeerstore
6606
6607 @c %**end of header
6608
6609 libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
6610 wishing to communicate with the PEERSTORE service use this API to open a
6611 connection to PEERSTORE. This is done by calling
6612 @code{GNUNET_PEERSTORE_connect} which returns a handle to the newly
6613 created connection.
6614 This handle has to be used with any further calls to the API.
6615
6616 To store a new record, the function @code{GNUNET_PEERSTORE_store} is to
6617 be used which requires the record fields and a continuation function that
6618 will be called by the API after the STORE request is sent to the
6619 PEERSTORE service.
6620 Note that calling the continuation function does not mean that the record
6621 is successfully stored, only that the STORE request has been successfully
6622 sent to the PEERSTORE service.
6623 @code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
6624 request only before the continuation function has been called.
6625
6626 To iterate over stored records, the function
6627 @code{GNUNET_PEERSTORE_iterate} is
6628 to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator
6629 callback function will be called with each matching record found and a
6630 NULL record at the end to signal the end of result set.
6631 @code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
6632 request before the iterator callback is called with a NULL record.
6633
To be notified of new values stored under a (subsystem, peerid, key)
combination, the function @code{GNUNET_PEERSTORE_watch} is to be used.
This will register the watcher with the PEERSTORE service. Any new
records matching the given combination will trigger the callback
function passed to @code{GNUNET_PEERSTORE_watch}. This continues until
@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
service is destroyed.
6641
6642 After the connection is no longer needed, the function
6643 @code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
6644 PEERSTORE service.
6645 Any pending ITERATE or WATCH requests will be destroyed.
6646 If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
6647 delay the disconnection until all pending STORE requests are sent to
6648 the PEERSTORE service, otherwise, the pending STORE requests will be
6649 destroyed as well.
6650
6651 @cindex SET Subsystem
6652 @node SET Subsystem
6653 @section SET Subsystem
6654
6655 @c %**end of header
6656
6657 The SET service implements efficient set operations between two peers
6658 over a mesh tunnel.
6659 Currently, set union and set intersection are the only supported
6660 operations. Elements of a set consist of an @emph{element type} and
6661 arbitrary binary @emph{data}.
6662 The size of an element's data is limited to around 62 KB.
6663
6664 @menu
6665 * Local Sets::
6666 * Set Modifications::
6667 * Set Operations::
6668 * Result Elements::
6669 * libgnunetset::
6670 * The SET Client-Service Protocol::
6671 * The SET Intersection Peer-to-Peer Protocol::
6672 * The SET Union Peer-to-Peer Protocol::
6673 @end menu
6674
6675 @node Local Sets
6676 @subsection Local Sets
6677
6678 @c %**end of header
6679
6680 Sets created by a local client can be modified and reused for multiple
6681 operations. As each set operation requires potentially expensive special
6682 auxiliary data to be computed for each element of a set, a set can only
6683 participate in one type of set operation (i.e. union or intersection).
6684 The type of a set is determined upon its creation.
If the elements of a set are needed for an operation of a different
type, all of the set's elements must be copied to a new set of the
appropriate type.
6688
6689 @node Set Modifications
6690 @subsection Set Modifications
6691
6692 @c %**end of header
6693
Even when set operations are active, one can add elements to and remove
elements from a set.
6696 However, these changes will only be visible to operations that have been
6697 created after the changes have taken place. That is, every set operation
6698 only sees a snapshot of the set from the time the operation was started.
6699 This mechanism is @emph{not} implemented by copying the whole set, but by
6700 attaching @emph{generation information} to each element and operation.
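
One way to realise such snapshots without copying is sketched below. The
sketch is an assumption about how generation information can work, not a
description of the SET service's internal data structures: every
modification bumps a counter, every element remembers the generation in
which it was added and removed, and an operation only records the counter
value at its start. All names are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GEN_MAX_ELEMS 16
#define GEN_NOT_REMOVED UINT32_MAX

struct gen_element
{
  int value;
  uint32_t added;    /* generation in which the element was added */
  uint32_t removed;  /* generation of removal, or GEN_NOT_REMOVED */
};

struct gen_set
{
  struct gen_element elems[GEN_MAX_ELEMS];
  size_t count;
  uint32_t generation;  /* bumped by every modification */
};

void
gen_set_add (struct gen_set *s, int value)
{
  assert (s->count < GEN_MAX_ELEMS);
  s->generation++;
  s->elems[s->count].value = value;
  s->elems[s->count].added = s->generation;
  s->elems[s->count].removed = GEN_NOT_REMOVED;
  s->count++;
}

/* Removal only marks the element; older operations can still see it. */
void
gen_set_remove (struct gen_set *s, int value)
{
  s->generation++;
  for (size_t i = 0; i < s->count; i++)
    if ((s->elems[i].value == value)
        && (GEN_NOT_REMOVED == s->elems[i].removed))
      s->elems[i].removed = s->generation;
}

/* Starting an operation only records the current generation. */
uint32_t
gen_operation_start (const struct gen_set *s)
{
  return s->generation;
}

/* Number of elements visible to an operation started at `op_gen`. */
size_t
gen_visible_count (const struct gen_set *s, uint32_t op_gen)
{
  size_t n = 0;

  for (size_t i = 0; i < s->count; i++)
    if ((s->elems[i].added <= op_gen) && (s->elems[i].removed > op_gen))
      n++;
  return n;
}
```
@end verbatim

In this model an element stays in memory after removal for as long as an
older operation may still need it, which is the price paid for not
copying the set.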
6701
6702 @node Set Operations
6703 @subsection Set Operations
6704
6705 @c %**end of header
6706
6707 Set operations can be started in two ways: Either by accepting an
6708 operation request from a remote peer, or by requesting a set operation
6709 from a remote peer.
6710 Set operations are uniquely identified by the involved @emph{peers}, an
6711 @emph{application id} and the @emph{operation type}.
6712
6713 The client is notified of incoming set operations by @emph{set listeners}.
6714 A set listener listens for incoming operations of a specific operation
6715 type and application id.
6716 Once notified of an incoming set request, the client can accept the set
6717 request (providing a local set for the operation) or reject it.
6718
6719 @node Result Elements
6720 @subsection Result Elements
6721
6722 @c %**end of header
6723
6724 The SET service has three @emph{result modes} that determine how an
6725 operation's result set is delivered to the client:
6726
6727 @itemize @bullet
@item @strong{Full Result Set.} All elements of the set resulting from the
set operation are returned to the client.
@item @strong{Added Elements.} Only elements that result from the
operation and are not already in the local peer's set are returned.
Note that for some operations (like set intersection) this result mode
will never return any elements.
This can be useful if only the remote peer is actually interested in
the result of the set operation.
@item @strong{Removed Elements.} Only elements that are in the local
peer's initial set but not in the operation's result set are returned.
Note that for some operations (like set union) this result mode will
never return any elements. This can be useful if only the remote peer is
actually interested in the result of the set operation.
6741 @end itemize
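
The relationship between the three result modes can be made concrete with
a small sketch. To keep it short, sets are modelled as bitmasks over the
elements 0 to 31, which is an assumption of the sketch only; the
enumerations and the functions @code{op_result} and @code{delivered} are
hypothetical and not part of the SET API.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

enum op_type { OP_UNION, OP_INTERSECTION };
enum result_mode { RESULT_FULL, RESULT_ADDED, RESULT_REMOVED };

/* Result set of the operation; bit i set means element i is present. */
uint32_t
op_result (enum op_type op, uint32_t local, uint32_t remote)
{
  return (OP_UNION == op) ? (local | remote) : (local & remote);
}

/* Elements handed to the client for the given result mode. */
uint32_t
delivered (enum result_mode mode, uint32_t local, uint32_t result)
{
  switch (mode)
    {
    case RESULT_FULL:
      return result;           /* everything in the result set */
    case RESULT_ADDED:
      return result & ~local;  /* in the result, not in the local set */
    case RESULT_REMOVED:
      return local & ~result;  /* in the local set, not in the result */
    }
  return 0;
}
```
@end verbatim

Since a union always contains the local set and an intersection is always
contained in it, the "removed" mode is empty for unions and the "added"
mode is empty for intersections, as noted in the list above.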
6742
6743 @cindex libgnunetset
6744 @node libgnunetset
6745 @subsection libgnunetset
6746
6747 @c %**end of header
6748
6749 @menu
6750 * Sets::
6751 * Listeners::
6752 * Operations::
6753 * Supplying a Set::
6754 * The Result Callback::
6755 @end menu
6756
6757 @node Sets
6758 @subsubsection Sets
6759
6760 @c %**end of header
6761
6762 New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6763 configuration (as each set has its own client connection) and the
6764 operation type must be specified.
6765 The set exists until either the client calls @code{GNUNET_SET_destroy} or
6766 the client's connection to the service is disrupted.
6767 In the latter case, the client is notified by the return value of
6768 functions dealing with sets. This return value must always be checked.
6769
6770 Elements are added and removed with @code{GNUNET_SET_add_element} and
6771 @code{GNUNET_SET_remove_element}.
6772
6773 @node Listeners
6774 @subsubsection Listeners
6775
6776 @c %**end of header
6777
Listeners are created with @code{GNUNET_SET_listen}. Each time a
6779 remote peer suggests a set operation with an application id and operation
6780 type matching a listener, the listener's callback is invoked.
6781 The client then must synchronously call either @code{GNUNET_SET_accept}
6782 or @code{GNUNET_SET_reject}. Note that the operation will not be started
6783 until the client calls @code{GNUNET_SET_commit}
6784 (see Section "Supplying a Set").
6785
6786 @node Operations
6787 @subsubsection Operations
6788
6789 @c %**end of header
6790
6791 Operations to be initiated by the local peer are created with
6792 @code{GNUNET_SET_prepare}. Note that the operation will not be started
6793 until the client calls @code{GNUNET_SET_commit}
6794 (see Section "Supplying a Set").
6795
6796 @node Supplying a Set
6797 @subsubsection Supplying a Set
6798
6799 @c %**end of header
6800
6801 To create symmetry between the two ways of starting a set operation
6802 (accepting and initiating it), the operation handles returned by
6803 @code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have a
6804 set to operate on, thus they can not do any work yet.
6805
6806 The client must call @code{GNUNET_SET_commit} to specify a set to use for
6807 an operation. @code{GNUNET_SET_commit} may only be called once per set
6808 operation.
6809
6810 @node The Result Callback
6811 @subsubsection The Result Callback
6812
6813 @c %**end of header
6814
Clients must specify both a result mode and a result callback with
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result
callback is called with a status indicating either that an element was
received, or that the operation failed or succeeded.
6819 The interpretation of the received element depends on the result mode.
6820 The callback needs to know which result mode it is used in, as the
6821 arguments do not indicate if an element is part of the full result set,
6822 or if it is in the difference between the original set and the final set.
6823
6824 @node The SET Client-Service Protocol
6825 @subsection The SET Client-Service Protocol
6826
6827 @c %**end of header
6828
6829 @menu
6830 * Creating Sets::
6831 * Listeners2::
6832 * Initiating Operations::
6833 * Modifying Sets::
6834 * Results and Operation Status::
6835 * Iterating Sets::
6836 @end menu
6837
6838 @node Creating Sets
6839 @subsubsection Creating Sets
6840
6841 @c %**end of header
6842
6843 For each set of a client, there exists a client connection to the service.
6844 Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message
6845 over a new client connection. Multiple operations for one set are
6846 multiplexed over one client connection, using a request id supplied by
6847 the client.
6848
6849 @node Listeners2
6850 @subsubsection Listeners2
6851
6852 @c %**end of header
6853
Each listener also requires a separate client connection. By sending the
@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service
of the application id and operation type it is interested in. A client
rejects an incoming request by sending @code{GNUNET_SERVICE_SET_REJECT}
on the listener's client connection.
In contrast, when accepting an incoming request, a
@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
connection of the set that is supplied for the set operation.
6862
6863 @node Initiating Operations
6864 @subsubsection Initiating Operations
6865
6866 @c %**end of header
6867
Operations with remote peers are initiated by sending a
@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
connection over which this message is sent determines the set to use.
6871
6872 @node Modifying Sets
6873 @subsubsection Modifying Sets
6874
6875 @c %**end of header
6876
6877 Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6878 @code{GNUNET_SERVICE_SET_REMOVE} messages.
6879
6880
6881 @c %@menu
6882 @c %* Results and Operation Status::
6883 @c %* Iterating Sets::
6884 @c %@end menu
6885
6886 @node Results and Operation Status
6887 @subsubsection Results and Operation Status
6888 @c %**end of header
6889
6890 The service notifies the client of result elements and success/failure of
6891 a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6892
6893 @node Iterating Sets
6894 @subsubsection Iterating Sets
6895
6896 @c %**end of header
6897
6898 All elements of a set can be requested by sending
6899 @code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
6900 @code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
6901 iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
6902 After each received element, the client
6903 must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
6904 iteration may be active for a set at any given time.
6905
6906 @node The SET Intersection Peer-to-Peer Protocol
6907 @subsection The SET Intersection Peer-to-Peer Protocol
6908
6909 @c %**end of header
6910
The intersection protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
initiating the operation to the peer listening for inbound requests.
It includes the number of elements of the initiating peer, which is used
to decide which side will send a Bloom filter first.

The listening peer checks if the operation type and application
identifier are acceptable for its current state.
If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a
status of @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET
channel).
6921
If the application accepts the request, the listener sends back a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
more elements in the set than the initiating peer.
Otherwise, it immediately starts with the Bloom filter exchange.
If the initiator receives a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
it begins the Bloom filter exchange, unless the set size is indicated to
be zero, in which case the intersection is considered finished after
just the initial handshake.
6931
6932
6933 @menu
6934 * The Bloom filter exchange::
6935 * Salt::
6936 @end menu
6937
6938 @node The Bloom filter exchange
6939 @subsubsection The Bloom filter exchange
6940
6941 @c %**end of header
6942
6943 In this phase, each peer transmits a Bloom filter over the remaining
6944 keys of the local set to the other peer using a
6945 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
6946 message additionally includes the number of elements left in the sender's
6947 set, as well as the XOR over all of the keys in that set.
6948
6949 The number of bits 'k' set per element in the Bloom filter is calculated
6950 based on the relative size of the two sets.
6951 Furthermore, the size of the Bloom filter is calculated based on 'k' and
6952 the number of elements in the set to maximize the amount of data filtered
6953 per byte transmitted on the wire (while avoiding an excessively high
6954 number of iterations).
6955
6956 The receiver of the message removes all elements from its local set that
6957 do not pass the Bloom filter test.
6958 It then checks if the set size of the sender and the XOR over the keys
6959 match what is left of its own set. If they do, it sends a
6960 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
6961 that the latest set is the final result.
6962 Otherwise, the receiver starts another Bloom filter exchange, except
6963 this time as the sender.
6964
6965 @node Salt
6966 @subsubsection Salt
6967
6968 @c %**end of header
6969
Bloom filter operations are probabilistic: with some non-zero probability
6971 the test may incorrectly say an element is in the set, even though it is
6972 not.
6973
6974 To mitigate this problem, the intersection protocol iterates exchanging
6975 Bloom filters using a different random 32-bit salt in each iteration (the
6976 salt is also included in the message).
With different salts, the Bloom filter test may fail for different
elements. Combining the results of several iterations, the probability
that a false positive survives approaches zero.
6980
6981 The iterations terminate once both peers have established that they have
6982 sets of the same size, and where the XOR over all keys computes the same
512-bit value (leaving a failure probability of @math{2^{-511}}).
6984
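The exchange and the role of the salt can be illustrated with a small,
self-contained C model. This is only a sketch under simplifying
assumptions, not the GNUnet implementation: keys are 64-bit integers
instead of 512-bit hash codes, the filter size (@code{BF_BITS}) and the
number of bits per element (@code{BF_K}) are fixed rather than computed
from the set sizes, the round number stands in for the random 32-bit
salt, and all names (@code{BfMessage}, @code{build_message},
@code{process_message}, @code{run_intersection}) are hypothetical.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 1024u /* hypothetical fixed filter size, in bits */
#define BF_K 4u       /* hypothetical fixed number of bits per element */

/* Toy stand-in for a P2P_BF message. */
struct BfMessage
{
  uint8_t bits[BF_BITS / 8]; /* the Bloom filter itself */
  uint32_t salt;             /* salt used for this iteration */
  uint32_t element_count;    /* elements left in the sender's set */
  uint64_t xor_keys;         /* XOR over all keys in the sender's set */
};

/* 64-bit mixer (splitmix64 finalizer) used to spread the toy keys. */
static uint64_t
mix (uint64_t x)
{
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Position of the i-th bit for a key under the given salt. */
static unsigned int
bf_index (uint32_t salt, uint64_t key, unsigned int i)
{
  return (unsigned int) (mix (mix (key + salt) + i) % BF_BITS);
}

/* Sender: summarize the remaining set as filter, count and XOR. */
static void
build_message (const uint64_t *set, size_t n, uint32_t salt,
               struct BfMessage *msg)
{
  memset (msg, 0, sizeof (*msg));
  msg->salt = salt;
  msg->element_count = (uint32_t) n;
  for (size_t e = 0; e < n; e++)
  {
    msg->xor_keys ^= set[e];
    for (unsigned int i = 0; i < BF_K; i++)
    {
      unsigned int bit = bf_index (salt, set[e], i);
      msg->bits[bit / 8] |= (uint8_t) (1u << (bit % 8));
    }
  }
}

/* Returns 1 if all BF_K bits for the key are set in the filter. */
static int
bf_test (const struct BfMessage *msg, uint64_t key)
{
  for (unsigned int i = 0; i < BF_K; i++)
  {
    unsigned int bit = bf_index (msg->salt, key, i);
    if (0 == (msg->bits[bit / 8] & (1u << (bit % 8))))
      return 0;
  }
  return 1;
}

/* Receiver: drop elements that fail the filter, then compare the size
   and XOR of what is left with the sender's values.  Returns 1 if they
   match (the point where a DONE message would be sent). */
static int
process_message (uint64_t *set, size_t *n, const struct BfMessage *msg)
{
  size_t kept = 0;
  uint64_t x = 0;
  for (size_t e = 0; e < *n; e++)
    if (bf_test (msg, set[e]))
    {
      set[kept++] = set[e];
      x ^= set[e];
    }
  *n = kept;
  return (kept == msg->element_count) && (x == msg->xor_keys);
}

/* Peers alternate as sender until the receiver is done.  Returns the
   number of filters exchanged, or -1 if max_rounds was not enough. */
static int
run_intersection (uint64_t *a, size_t *na, uint64_t *b, size_t *nb,
                  int max_rounds)
{
  struct BfMessage msg;
  for (int round = 1; round <= max_rounds; round++)
  {
    int a_sends = (1 == round % 2);
    /* Assumption: the round number replaces the random salt. */
    build_message (a_sends ? a : b, a_sends ? *na : *nb,
                   (uint32_t) round, &msg);
    if (process_message (a_sends ? b : a, a_sends ? nb : na, &msg))
      return round;
  }
  return -1;
}
```
@end verbatim
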
6985 @node The SET Union Peer-to-Peer Protocol
6986 @subsection The SET Union Peer-to-Peer Protocol
6987
6988 @c %**end of header
6989
6990 The SET union protocol is based on Eppstein's efficient set reconciliation
6991 without prior context. You should read this paper first if you want to
6992 understand the protocol.
6993
The union protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the
peer initiating the operation to the peer listening for inbound requests.
6997 It includes the number of elements of the initiating peer, which is
6998 currently not used.
6999
7000 The listening peer checks if the operation type and application
7001 identifier are acceptable for its current state. If not, it responds with
7002 a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
7003 @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
7004
If the application accepts the request, the listener sends back a strata
estimator using a message of type
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The initiator evaluates the
strata estimator and initiates the exchange of invertible Bloom filters,
sending a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
7009
7010 During the IBF exchange, if the receiver cannot invert the Bloom filter or
7011 detects a cycle, it sends a larger IBF in response (up to a defined
7012 maximum limit; if that limit is reached, the operation fails).
Elements decoded while processing the IBF are transmitted to the other
peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from
the other peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
messages, depending on the sign observed during decoding of the IBF.
Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
message with the respective element in a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS} message. If the IBF fully
decodes, the peer responds with a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of another
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
7022
7023 All Bloom filter operations use a salt to mingle keys before hashing them
7024 into buckets, such that future iterations have a fresh chance of
7025 succeeding if they failed due to collisions before.
7026
7027 @cindex STATISTICS Subsystem
7028 @node STATISTICS Subsystem
7029 @section STATISTICS Subsystem
7030
7031 @c %**end of header
7032
7033 In GNUnet, the STATISTICS subsystem offers a central place for all
7034 subsystems to publish unsigned 64-bit integer run-time statistics.
7035 Keeping this information centrally means that there is a unified way for
7036 the user to obtain data on all subsystems, and individual subsystems do
7037 not have to always include a custom data export method for performance
7038 metrics and other statistics. For example, the TRANSPORT system uses
7039 STATISTICS to update information about the number of directly connected
7040 peers and the bandwidth that has been consumed by the various plugins.
7041 This information is valuable for diagnosing connectivity and performance
7042 issues.
7043
7044 Following the GNUnet service architecture, the STATISTICS subsystem is
7045 divided into an API which is exposed through the header
7046 @strong{gnunet_statistics_service.h} and the STATISTICS service
7047 @strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
7048 command-line tool can be used to obtain (and change) information about
7049 the values stored by the STATISTICS service. The STATISTICS service does
7050 not communicate with other peers.
7051
7052 Data is stored in the STATISTICS service in the form of tuples
@strong{(subsystem, name, value, persistence)}. The subsystem identifies
the GNUnet subsystem to which the data belongs. The name is the label
under which the value is stored; it uniquely identifies the record among
the records belonging to the same subsystem.
In some parts of the code, the pair @strong{(subsystem, name)} is called
a @strong{statistic} as it identifies the values stored in the STATISTICS
service. The persistence flag determines if the record has to be preserved
across service restarts. A record is said to be persistent if this flag
is set for it; if not, the record is treated as a non-persistent record
and it is lost after service restart. Persistent records are written to
the file @strong{statistics.data} before shutdown and read from it
upon startup. The file is located in the HOME directory of the peer.
7065
7066 An anomaly of the STATISTICS service is that it does not terminate
7067 immediately upon receiving a shutdown signal if it has any clients
7068 connected to it. It waits for all the clients that are not monitors to
7069 close their connections before terminating itself.
7070 This is to prevent the loss of data during peer shutdown --- delaying the
7071 STATISTICS service shutdown helps other services to store important data
7072 to STATISTICS during shutdown.
7073
7074 @menu
7075 * libgnunetstatistics::
7076 * The STATISTICS Client-Service Protocol::
7077 @end menu
7078
7079 @cindex libgnunetstatistics
7080 @node libgnunetstatistics
7081 @subsection libgnunetstatistics
7082
7083 @c %**end of header
7084
@strong{libgnunetstatistics} is the library containing the API for the
STATISTICS subsystem. Any process that needs to use STATISTICS should use
this API to open a connection to the STATISTICS service.
This is done by calling the function @code{GNUNET_STATISTICS_create()}.
This function takes the name of the subsystem that wants to use
STATISTICS and a configuration.
7091 All values written to STATISTICS with this connection will be placed in
7092 the section corresponding to the given subsystem's name.
7093 The connection to STATISTICS can be destroyed with the function
7094 @code{GNUNET_STATISTICS_destroy()}. This function allows for the
7095 connection to be destroyed immediately or upon transferring all
7096 pending write requests to the service.
7097
Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
7099 under the @code{[STATISTICS]} section in the configuration. With such a
7100 configuration all calls to @code{GNUNET_STATISTICS_create()} return
7101 @code{NULL} as the STATISTICS subsystem is unavailable and no other
7102 functions from the API can be used.
7103
7104
7105 @menu
7106 * Statistics retrieval::
7107 * Setting statistics and updating them::
7108 * Watches::
7109 @end menu
7110
7111 @node Statistics retrieval
7112 @subsubsection Statistics retrieval
7113
7114 @c %**end of header
7115
Once a connection to the STATISTICS service is obtained, information
about any other subsystem which uses STATISTICS can be retrieved with the
function @code{GNUNET_STATISTICS_get()}.
This function takes the connection handle, the name of the subsystem
whose information we are interested in (a @code{NULL} value will
retrieve information of all available subsystems using STATISTICS), the
name of the statistic we are interested in (a @code{NULL} value will
retrieve all available statistics), a continuation callback which is
called when all of the requested information has been retrieved, an
iterator callback which is called for each value in the retrieved
information, and a closure for the aforementioned callbacks. The library
then invokes the iterator callback for each value matching the request.
7128
A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
This is helpful when retrieving statistics takes too long and especially
when we want to shut down and clean up everything.
7133
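The @code{NULL} wildcard matching and the order of the two callbacks can
be pictured with the following self-contained C model. It is a sketch
only: it runs synchronously over an in-memory array, whereas the real
call is asynchronous and talks to the service, and the names
@code{StatRecord}, @code{StatIterator}, @code{StatContinuation},
@code{stat_get}, @code{tally_value} and @code{tally_done} are hypothetical
and do not mirror the signatures in @file{gnunet_statistics_service.h}.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One (subsystem, name, value, persistence) tuple. */
struct StatRecord
{
  const char *subsystem;
  const char *name;
  uint64_t value;
  int persistent;
};

/* Hypothetical callback types for this model. */
typedef void (*StatIterator) (void *cls, const char *subsystem,
                              const char *name, uint64_t value,
                              int persistent);
typedef void (*StatContinuation) (void *cls, int success);

/* A NULL pattern acts as a wildcard. */
static int
matches (const char *pattern, const char *actual)
{
  return (NULL == pattern) || (0 == strcmp (pattern, actual));
}

/* Call proc for every matching record, then cont exactly once. */
static void
stat_get (const struct StatRecord *records, size_t n,
          const char *subsystem, const char *name,
          StatContinuation cont, StatIterator proc, void *cls)
{
  for (size_t i = 0; i < n; i++)
    if (matches (subsystem, records[i].subsystem) &&
        matches (name, records[i].name))
      proc (cls, records[i].subsystem, records[i].name,
            records[i].value, records[i].persistent);
  cont (cls, 1);
}

/* Example closure and callbacks: count and sum the reported values. */
struct Tally
{
  unsigned int seen;
  uint64_t sum;
  int finished;
};

static void
tally_value (void *cls, const char *subsystem, const char *name,
             uint64_t value, int persistent)
{
  struct Tally *t = cls;
  (void) subsystem;
  (void) name;
  (void) persistent;
  t->seen++;
  t->sum += value;
}

static void
tally_done (void *cls, int success)
{
  struct Tally *t = cls;
  t->finished = success;
}
```
@end verbatim
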
7134 @node Setting statistics and updating them
7135 @subsubsection Setting statistics and updating them
7136
7137 @c %**end of header
7138
So far we have seen how to retrieve statistics. Here we will learn how to
set statistics and update them so that other subsystems can retrieve
them.
7142
7143 A new statistic can be set using the function
7144 @code{GNUNET_STATISTICS_set()}.
7145 This function takes the name of the statistic and its value and a flag to
7146 make the statistic persistent.
7147 The value of the statistic should be of the type @code{uint64_t}.
7148 The function does not take the name of the subsystem; it is determined
7149 from the previous @code{GNUNET_STATISTICS_create()} invocation. If
7150 the given statistic is already present, its value is overwritten.
7151
An existing statistic can be updated, i.e., its value can be increased or
decreased by an amount, with the function
@code{GNUNET_STATISTICS_update()}.
The parameters to this function are similar to those of
@code{GNUNET_STATISTICS_set()}, except that it takes the amount of the
change as an @code{int64_t} instead of the new value.
7158
7159 The library will combine multiple set or update operations into one
7160 message if the client performs requests at a rate that is faster than the
7161 available IPC with the STATISTICS service. Thus, the client does not have
7162 to worry about sending requests too quickly.
7163
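The difference between setting an absolute value and applying a signed
change can be sketched with a small in-memory model. The names
(@code{StatTable}, @code{stat_set}, @code{stat_update}) are hypothetical
and this is not the library code; in particular, clamping at zero and at
@code{UINT64_MAX} when a change would leave the @code{uint64_t} range is
an assumption of this sketch.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STATS 16 /* hypothetical table size for this sketch */

struct Stat
{
  char name[64];
  uint64_t value;
  int persistent;
};

struct StatTable
{
  struct Stat stats[MAX_STATS];
  size_t count;
};

/* Look up a statistic by name; create it with value 0 if missing.
   Returns NULL if the table is full or the name is too long. */
static struct Stat *
find_or_create (struct StatTable *t, const char *name)
{
  for (size_t i = 0; i < t->count; i++)
    if (0 == strcmp (t->stats[i].name, name))
      return &t->stats[i];
  if ((MAX_STATS == t->count) ||
      (strlen (name) >= sizeof (t->stats[0].name)))
    return NULL;
  struct Stat *st = &t->stats[t->count++];
  strcpy (st->name, name);
  st->value = 0;
  st->persistent = 0;
  return st;
}

/* Absolute set: an existing value is overwritten. */
static struct Stat *
stat_set (struct StatTable *t, const char *name, uint64_t value,
          int make_persistent)
{
  struct Stat *st = find_or_create (t, name);
  if (NULL == st)
    return NULL;
  st->value = value;
  st->persistent = make_persistent;
  return st;
}

/* Relative update by a signed delta (assumption: results clamp to
   the uint64_t range instead of wrapping around). */
static struct Stat *
stat_update (struct StatTable *t, const char *name, int64_t delta,
             int make_persistent)
{
  struct Stat *st = find_or_create (t, name);
  if (NULL == st)
    return NULL;
  if (delta < 0)
  {
    /* |delta| computed without overflowing for INT64_MIN */
    uint64_t dec = (uint64_t) (-(delta + 1)) + 1;
    st->value = (st->value < dec) ? 0 : st->value - dec;
  }
  else
  {
    uint64_t inc = (uint64_t) delta;
    st->value = (st->value > UINT64_MAX - inc) ? UINT64_MAX
                                                : st->value + inc;
  }
  st->persistent = make_persistent;
  return st;
}
```
@end verbatim
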
7164 @node Watches
7165 @subsubsection Watches
7166
7167 @c %**end of header
7168
An interesting feature of STATISTICS is that it can serve notifications
whenever a statistic we are interested in is modified.
7171 This is achieved by registering a watch through the function
7172 @code{GNUNET_STATISTICS_watch()}.
7173 The parameters of this function are similar to those of
7174 @code{GNUNET_STATISTICS_get()}.
7175 Changes to the respective statistic's value will then cause the given
7176 iterator callback to be called.
7177 Note: A watch can only be registered for a specific statistic. Hence
7178 the subsystem name and the parameter name cannot be @code{NULL} in a
7179 call to @code{GNUNET_STATISTICS_watch()}.
7180
A registered watch keeps reporting value changes until
@code{GNUNET_STATISTICS_watch_cancel()} is called with the same
parameters that were used for registering the watch.
7184
7185 @node The STATISTICS Client-Service Protocol
7186 @subsection The STATISTICS Client-Service Protocol
7187 @c %**end of header
7188
7189
7190 @menu
7191 * Statistics retrieval2::
7192 * Setting and updating statistics::
7193 * Watching for updates::
7194 @end menu
7195
7196 @node Statistics retrieval2
@subsubsection Statistics retrieval
7198
7199 @c %**end of header
7200
To retrieve statistics, the client transmits a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem
name and statistic name to the STATISTICS service.
The service responds with a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each statistic that
matches the client's request. The end of the retrieved information is
signaled by the service by sending a message of
type @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
7209
7210 @node Setting and updating statistics
7211 @subsubsection Setting and updating statistics
7212
7213 @c %**end of header
7214
7215 The subsystem name, parameter name, its value and the persistence flag are
7216 communicated to the service through the message
7217 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
7218
When the service receives a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
name and checks for a statistic parameter matching the name given in
the message.
7223 If a statistic parameter is found, the value is overwritten by the new
7224 value from the message; if not found then a new statistic parameter is
7225 created with the given name and value.
7226
7227 In addition to just setting an absolute value, it is possible to perform a
7228 relative update by sending a message of type
7229 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
7230 (@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in
7231 the message should be treated as an update value.
7232
7233 @node Watching for updates
7234 @subsubsection Watching for updates
7235
7236 @c %**end of header
7237
The @code{GNUNET_STATISTICS_watch()} function registers the watch at the
service by sending a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
notifications through messages of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
parameter's value is changed.
7243
7244 @cindex DHT
7245 @cindex Distributed Hash Table
7246 @node Distributed Hash Table (DHT)
7247 @section Distributed Hash Table (DHT)
7248
7249 @c %**end of header
7250
7251 GNUnet includes a generic distributed hash table that can be used by
7252 developers building P2P applications in the framework.
7253 This section documents high-level features and how developers are
7254 expected to use the DHT.
7255 We have a research paper detailing how the DHT works.
7256 Also, Nate's thesis includes a detailed description and performance
7257 analysis (in chapter 6).
7258
7259 Key features of GNUnet's DHT include:
7260
7261 @itemize @bullet
7262 @item stores key-value pairs with values up to (approximately) 63k in size
7263 @item works with many underlay network topologies (small-world, random
7264 graph), underlay does not need to be a full mesh / clique
@item support for extended queries (more than just a simple 'key'),
filtering duplicate replies within the network (Bloom filter) and content
validation (for details, please read the subsection on the block library)
7268 @item can (optionally) return paths taken by the PUT and GET operations
7269 to the application
7270 @item provides content replication to handle churn
7271 @end itemize
7272
7273 GNUnet's DHT is randomized and unreliable. Unreliable means that there is
7274 no strict guarantee that a value stored in the DHT is always
7275 found --- values are only found with high probability.
7276 While this is somewhat true in all P2P DHTs, GNUnet developers should be
7277 particularly wary of this fact (this will help you write secure,
7278 fault-tolerant code). Thus, when writing any application using the DHT,
7279 you should always consider the possibility that a value stored in the
7280 DHT by you or some other peer might simply not be returned, or returned
7281 with a significant delay.
7282 Your application logic must be written to tolerate this (naturally, some
7283 loss of performance or quality of service is expected in this case).
7284
7285 @menu
7286 * Block library and plugins::
7287 * libgnunetdht::
7288 * The DHT Client-Service Protocol::
7289 * The DHT Peer-to-Peer Protocol::
7290 @end menu
7291
7292 @node Block library and plugins
7293 @subsection Block library and plugins
7294
7295 @c %**end of header
7296
7297 @menu
7298 * What is a Block?::
7299 * The API of libgnunetblock::
7300 * Queries::
7301 * Sample Code::
7302 * Conclusion2::
7303 @end menu
7304
7305 @node What is a Block?
7306 @subsubsection What is a Block?
7307
7308 @c %**end of header
7309
Blocks are small (< 63k) pieces of data stored under a key
(@code{struct GNUNET_HashCode}). Blocks have a type
(@code{enum GNUNET_BlockType}) which defines
7312 their data format. Blocks are used in GNUnet as units of static data
7313 exchanged between peers and stored (or cached) locally.
7314 Uses of blocks include file-sharing (the files are broken up into blocks),
7315 the VPN (DNS information is stored in blocks) and the DHT (all
7316 information in the DHT and meta-information for the maintenance of the
7317 DHT are both stored using blocks).
7318 The block subsystem provides a few common functions that must be
7319 available for any type of block.
7320
7321 @cindex libgnunetblock API
7322 @node The API of libgnunetblock
7323 @subsubsection The API of libgnunetblock
7324
7325 @c %**end of header
7326
7327 The block library requires for each (family of) block type(s) a block
7328 plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
7329 functions that are needed by the DHT (and possibly other subsystems) to
7330 manage the block.
7331 These block plugins are typically implemented within their respective
7332 subsystems.
7333 The main block library is then used to locate, load and query the
7334 appropriate block plugin.
7335 Which plugin is appropriate is determined by the block type (which is
7336 just a 32-bit integer). Block plugins contain code that specifies which
7337 block types are supported by a given plugin. The block library loads all
7338 block plugins that are installed at the local peer and forwards the
7339 application request to the respective plugin.
7340
7341 The central functions of the block APIs (plugin and main library) are to
7342 allow the mapping of blocks to their respective key (if possible) and the
7343 ability to check that a block is well-formed and matches a given
7344 request (again, if possible).
7345 This way, GNUnet can avoid storing invalid blocks, storing blocks under
7346 the wrong key and forwarding blocks in response to a query that they do
7347 not answer.
7348
One key function of block plugins is that they allow GNUnet to detect
7350 duplicate replies (via the Bloom filter). All plugins MUST support
7351 detecting duplicate replies (by adding the current response to the
7352 Bloom filter and rejecting it if it is encountered again).
7353 If a plugin fails to do this, responses may loop in the network.
7354
7355 @node Queries
7356 @subsubsection Queries
7357 @c %**end of header
7358
7359 The query format for any block in GNUnet consists of four main components.
7360 First, the type of the desired block must be specified. Second, the query
7361 must contain a hash code. The hash code is used for lookups in hash
tables and databases and need not be unique for the block (however, if
7363 possible a unique hash should be used as this would be best for
7364 performance).
7365 Third, an optional Bloom filter can be specified to exclude known results;
7366 replies that hash to the bits set in the Bloom filter are considered
7367 invalid. False-positives can be eliminated by sending the same query
7368 again with a different Bloom filter mutator value, which parameterizes
7369 the hash function that is used.
7370 Finally, an optional application-specific "eXtended query" (xquery) can
7371 be specified to further constrain the results. It is entirely up to
7372 the type-specific plugin to determine whether or not a given block
7373 matches a query (type, hash, Bloom filter, and xquery).
Naturally, not all xqueries are valid and some types of blocks may not
7375 support Bloom filters either, so the plugin also needs to check if the
7376 query is valid in the first place.
7377
7378 Depending on the results from the plugin, the DHT will then discard the
7379 (invalid) query, forward the query, discard the (invalid) reply, cache the
7380 (valid) reply, and/or forward the (valid and non-duplicate) reply.
7381
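The decisions a block plugin has to make for a query can be sketched
with a toy evaluation function. Everything here is hypothetical and only
meant to illustrate the flow: the result names (@code{EVAL_...}) are not
the constants of the block library, the toy block type treats the key as
an FNV-1a hash of the payload and accepts no xquery, and a single 64-bit
word stands in for the Bloom filter, with the mutator seeding the hash.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical evaluation results for this sketch. */
enum Eval
{
  EVAL_REQUEST_INVALID, /* discard the query */
  EVAL_REQUEST_VALID,   /* query is fine, no reply given */
  EVAL_RESULT_INVALID,  /* discard the reply */
  EVAL_OK_DUPLICATE,    /* valid reply, but already seen */
  EVAL_OK_MORE          /* valid, new reply: cache and forward */
};

/* FNV-1a over a buffer, with a seed mixed into the initial state. */
static uint64_t
fnv1a (const void *data, size_t len, uint64_t seed)
{
  const unsigned char *p = data;
  uint64_t h = 14695981039346656037ULL ^ seed;
  for (size_t i = 0; i < len; i++)
  {
    h ^= p[i];
    h *= 1099511628211ULL;
  }
  return h;
}

struct Query
{
  uint64_t key;         /* toy hash code of the desired block */
  uint64_t seen_filter; /* one 64-bit word instead of a Bloom filter */
  uint32_t mutator;     /* parameterizes the filter's hash function */
  size_t xquery_len;    /* the toy block type supports no xquery */
};

/* Evaluate a query, optionally together with a candidate reply. */
static enum Eval
evaluate (struct Query *q, const void *reply, size_t reply_len)
{
  if (0 != q->xquery_len)
    return EVAL_REQUEST_INVALID;
  if (NULL == reply)
    return EVAL_REQUEST_VALID;
  if (fnv1a (reply, reply_len, 0) != q->key)
    return EVAL_RESULT_INVALID;
  /* Two filter bits derived from the mutated hash of the reply. */
  uint64_t h = fnv1a (reply, reply_len, q->mutator);
  uint64_t mask = (1ULL << (h & 63)) | (1ULL << ((h >> 6) & 63));
  if ((q->seen_filter & mask) == mask)
    return EVAL_OK_DUPLICATE;
  q->seen_filter |= mask; /* remember the reply for next time */
  return EVAL_OK_MORE;
}
```
@end verbatim
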
7382 @node Sample Code
7383 @subsubsection Sample Code
7384
7385 @c %**end of header
7386
7387 The source code in @strong{plugin_block_test.c} is a good starting point
7388 for new block plugins --- it does the minimal work by implementing a
7389 plugin that performs no validation at all.
7390 The respective @strong{Makefile.am} shows how to build and install a
7391 block plugin.
7392
7393 @node Conclusion2
@subsubsection Conclusion
7395
7396 @c %**end of header
7397
7398 In conclusion, GNUnet subsystems that want to use the DHT need to define a
7399 block format and write a plugin to match queries and replies. For testing,
7400 the @code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts
7401 any query as valid and any reply as matching any query.
7402 This type is also used for the DHT command line tools.
7403 However, it should NOT be used for normal applications due to the lack
7404 of error checking that results from this primitive implementation.
7405
7406 @cindex libgnunetdht
7407 @node libgnunetdht
7408 @subsection libgnunetdht
7409
7410 @c %**end of header
7411
7412 The DHT API itself is pretty simple and offers the usual GET and PUT
7413 functions that work as expected. The specified block type refers to the
7414 block library which allows the DHT to run application-specific logic for
7415 data stored in the network.
7416
7417
7418 @menu
7419 * GET::
7420 * PUT::
7421 * MONITOR::
7422 * DHT Routing Options::
7423 @end menu
7424
7425 @node GET
7426 @subsubsection GET
7427
7428 @c %**end of header
7429
7430 When using GET, the main consideration for developers (other than the
7431 block library) should be that after issuing a GET, the DHT will
7432 continuously cause (small amounts of) network traffic until the operation
7433 is explicitly canceled.
7434 So GET does not simply send out a single network request once; instead,
7435 the DHT will continue to search for data. This is needed to achieve good
7436 success rates and also handles the case where the respective PUT
7437 operation happens after the GET operation was started.
7438 Developers should not cancel an existing GET operation and then
7439 explicitly re-start it to trigger a new round of network requests;
7440 this is simply inefficient, especially as the internal automated version
7441 can be more efficient, for example by filtering results in the network
7442 that have already been returned.
7443
7444 If an application that performs a GET request has a set of replies that it
already knows and would like to filter, it can call
7446 @code{GNUNET_DHT_get_filter_known_results} with an array of hashes over
7447 the respective blocks to tell the DHT that these results are not
7448 desired (any more).
7449 This way, the DHT will filter the respective blocks using the block
7450 library in the network, which may result in a significant reduction in
7451 bandwidth consumption.
7452
7453 @node PUT
7454 @subsubsection PUT
7455
7456 @c %**end of header
7457
7458 @c inconsistent use of ``must'' above it's written ``MUST''
7459 In contrast to GET operations, developers @strong{must} manually re-run
7460 PUT operations periodically (if they intend the content to continue to be
7461 available). Content stored in the DHT expires or might be lost due to
7462 churn.
7463 Furthermore, GNUnet's DHT typically requires multiple rounds of PUT
7464 operations before a key-value pair is consistently available to all
7465 peers (the DHT randomizes paths and thus storage locations, and only
after multiple rounds of PUTs will there be a sufficient number of
7467 replicas in large DHTs). An explicit PUT operation using the DHT API will
7468 only cause network traffic once, so in order to ensure basic availability
7469 and resistance to churn (and adversaries), PUTs must be repeated.
7470 While the exact frequency depends on the application, a rule of thumb is
7471 that there should be at least a dozen PUT operations within the content
7472 lifetime. Content in the DHT typically expires after one day, so
7473 DHT PUT operations should be repeated at least every 1-2 hours.
7474
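The rule of thumb above is simple arithmetic: a one-day lifetime divided
by a dozen PUTs gives an interval of two hours. A minimal helper,
assuming a hypothetical function name and that the application knows its
content lifetime in seconds, might look as follows.

@verbatim
```c
#include <stdint.h>

/* Interval between PUTs so that `rounds` PUTs fit into the content
   lifetime.  Returns 0 if rounds is 0 (nothing to schedule). */
static uint64_t
put_interval_seconds (uint64_t lifetime_seconds, unsigned int rounds)
{
  if (0 == rounds)
    return 0;
  return lifetime_seconds / rounds;
}
```
@end verbatim
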
7475 @node MONITOR
7476 @subsubsection MONITOR
7477
7478 @c %**end of header
7479
7480 The DHT API also allows applications to monitor messages crossing the
7481 local DHT service.
7482 The types of messages used by the DHT are GET, PUT and RESULT messages.
7483 Using the monitoring API, applications can choose to monitor these
7484 requests, possibly limiting themselves to requests for a particular block
7485 type.
7486
The monitoring API is not only useful for diagnostics; it can also be
7488 used to trigger application operations based on PUT operations.
7489 For example, an application may use PUTs to distribute work requests to
7490 other peers.
7491 The workers would then monitor for PUTs that give them work, instead of
7492 looking for work using GET operations.
7493 This can be beneficial, especially if the workers have no good way to
7494 guess the keys under which work would be stored.
7495 Naturally, additional protocols might be needed to ensure that the desired
7496 number of workers will process the distributed workload.
7497
7498 @node DHT Routing Options
7499 @subsubsection DHT Routing Options
7500
7501 @c %**end of header
7502
The following routing options can be set for GET and PUT requests; the
first two are the most relevant for applications:
7504
7505 @table @asis
@item @code{GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE}
This option means that all peers should process the request, even if
their peer ID is not closest to the key. For a PUT request, this means
that all peers that a request traverses may make a copy of the data.
Similarly for a GET request, all peers will check their local database
for a result. Setting this option can thus significantly improve caching
and reduce bandwidth consumption, at the expense of a larger DHT
database. If in doubt, we recommend that this option should be used.
@item @code{GNUNET_DHT_RO_RECORD_ROUTE}
This option instructs the DHT to record the path that a GET or a PUT
request is taking through the overlay network. The resulting paths are
then returned to the application with the respective result. This allows
the receiver of a result to construct a path to the originator of the
data, which might then be used for routing. Naturally, setting this
option requires additional bandwidth and disk space, so applications
should only set this if the paths are needed by the application logic.
@item @code{GNUNET_DHT_RO_FIND_PEER}
This option is an internal option used by the DHT's peer discovery
mechanism and should not be used by applications.
@item @code{GNUNET_DHT_RO_BART}
This option is currently not implemented. It may in the future offer
performance improvements for clique topologies.
7526 @end table
7527
7528 @node The DHT Client-Service Protocol
7529 @subsection The DHT Client-Service Protocol
7530
7531 @c %**end of header
7532
7533 @menu
7534 * PUTting data into the DHT::
7535 * GETting data from the DHT::
7536 * Monitoring the DHT::
7537 @end menu
7538
7539 @node PUTting data into the DHT
7540 @subsubsection PUTting data into the DHT
7541
7542 @c %**end of header
7543
7544 To store (PUT) data into the DHT, the client sends a
7545 @code{struct GNUNET_DHT_ClientPutMessage} to the service.
7546 This message specifies the block type, routing options, the desired
7547 replication level, the expiration time, key,
7548 value and a 64-bit unique ID for the operation. The service responds with
7549 a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same
7550 64-bit unique ID. Note that the service sends the confirmation as soon as
7551 it has locally processed the PUT request. The PUT may still be
7552 propagating through the network at this time.
7553
7554 In the future, we may want to change this to provide (limited) feedback
7555 to the client, for example if we detect that the PUT operation had no
7556 effect because the same key-value pair was already stored in the DHT.
7557 However, changing this would also require additional state and messages
7558 in the P2P interaction.
7559
7560 @node GETting data from the DHT
7561 @subsubsection GETting data from the DHT
7562
7563 @c %**end of header
7564
7565 To retrieve (GET) data from the DHT, the client sends a
7566 @code{struct GNUNET_DHT_ClientGetMessage} to the service. The message
7567 specifies routing options, a replication level (for replicating the GET,
7568 not the content), the desired block type, the key, the (optional)
extended query and a unique 64-bit request ID.
7570
7571 Additionally, the client may send any number of
7572 @code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the
7573 service about results that the client is already aware of.
7574 These messages consist of the key, the unique 64-bit ID of the request,
7575 and an arbitrary number of hash codes over the blocks that the client is
7576 already aware of. As messages are restricted to 64k, a client that
7577 already knows more than about a thousand blocks may need to send
7578 several of these messages. Naturally, the client should transmit these
7579 messages as quickly as possible after the original GET request such that
the DHT can filter those results in the network early on. However, as
these messages are sent after the original request, it is conceivable
that the DHT service may still return blocks that match those already
known to the client.
7584
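The "about a thousand blocks" figure follows from the message size limit
and the 64-byte (512-bit) hash codes. The following sketch does the
arithmetic; the fixed per-message overhead of 80 bytes
(@code{SEEN_HEADER_SIZE}) and the function names are assumptions made
for illustration, not values taken from the message definition.

@verbatim
```c
#include <stddef.h>

#define MAX_MESSAGE_SIZE 65535u /* messages are restricted to 64k */
#define HASH_SIZE 64u           /* one 512-bit hash code */
#define SEEN_HEADER_SIZE 80u    /* assumed fixed part of the message */

/* Number of hash codes that fit into a single message. */
static size_t
hashes_per_message (void)
{
  return (MAX_MESSAGE_SIZE - SEEN_HEADER_SIZE) / HASH_SIZE;
}

/* Number of messages needed to report `known` results (rounded up). */
static size_t
seen_messages_needed (size_t known)
{
  size_t per = hashes_per_message ();
  return (known + per - 1) / per;
}
```
@end verbatim
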
7585 In response to a GET request, the service will send @code{struct
7586 GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
7587 block type, expiration, key, unique ID of the request and of course the
7588 value (a block). Depending on the options set for the respective
7589 operations, the replies may also contain the path the GET and/or the PUT
7590 took through the network.
7591
7592 A client can stop receiving replies either by disconnecting or by sending
7593 a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the
7594 key and the 64-bit unique ID of the original request. Using an
7595 explicit "stop" message is more common as this allows a client to run
7596 many concurrent GET operations over the same connection with the DHT
7597 service --- and to stop them individually.
7598
7599 @node Monitoring the DHT
7600 @subsubsection Monitoring the DHT
7601
7602 @c %**end of header
7603
7604 To begin monitoring, the client sends a
7605 @code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service.
7606 In this message, flags can be set to enable (or disable) monitoring of
7607 GET, PUT and RESULT messages that pass through a peer. The message can
7608 also restrict monitoring to a particular block type or a particular key.
7609 Once monitoring is enabled, the DHT service will notify the client about
7610 any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for
GET events, @code{struct GNUNET_DHT_MonitorPutMessage}s for PUT events
and @code{struct GNUNET_DHT_MonitorGetRespMessage}s for RESULTs. Each of
7613 these messages contains all of the information about the event.
7614
7615 @node The DHT Peer-to-Peer Protocol
7616 @subsection The DHT Peer-to-Peer Protocol
7617 @c %**end of header
7618
7619
7620 @menu
7621 * Routing GETs or PUTs::
7622 * PUTting data into the DHT2::
7623 * GETting data from the DHT2::
7624 @end menu
7625
7626 @node Routing GETs or PUTs
7627 @subsubsection Routing GETs or PUTs
7628
7629 @c %**end of header
7630
7631 When routing GETs or PUTs, the DHT service selects a suitable subset of
7632 neighbours for forwarding. The exact number of neighbours can be zero or
7633 more and depends on the hop counter of the query (initially zero) in
7634 relation to the (log of) the network size estimate, the desired
7635 replication level and the peer's connectivity.
Depending on the hop counter and our network size estimate, the selection
of the peers may be randomized or by proximity to the key.
7638 Furthermore, requests include a set of peers that a request has already
7639 traversed; those peers are also excluded from the selection.
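
The selection rule described above can be modelled in a few lines.
The following Python sketch is not taken from the DHT service: it assumes
that peer identities and keys are plain integers, that proximity means
XOR distance, and that selection is random while the hop counter is below
the base-2 logarithm of the network size estimate. The function name and
all of its parameters are hypothetical.

@example
```python
import math
import random


def select_next_hops(key, neighbours, traversed, hop_count,
                     network_size, replication, rng=None):
    """Pick up to `replication` neighbours for forwarding (toy model)."""
    # Peers the request has already traversed are never selected again.
    candidates = [p for p in neighbours if p not in traversed]
    count = min(replication, len(candidates))
    if count <= 0:
        return []
    # Assumed threshold: early hops spread randomly, later hops converge.
    if hop_count < math.log2(max(network_size, 2)):
        rng = rng or random.Random()
        return rng.sample(candidates, count)
    # Proximity: smallest XOR distance between peer identity and key.
    return sorted(candidates, key=lambda p: p ^ key)[:count]
```
@end example

With four neighbours, one of which was already traversed, a request that
is past the random phase is forwarded to the remaining peers that are
closest to the key.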
7640
7641 @node PUTting data into the DHT2
7642 @subsubsection PUTting data into the DHT2
7643
7644 @c %**end of header
7645
7646 To PUT data into the DHT, the service sends a @code{struct PeerPutMessage}
7647 of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective
7648 neighbour.
7649 In addition to the usual information about the content (type, routing
7650 options, desired replication level for the content, expiration time, key
7651 and value), the message contains a fixed-size Bloom filter with
7652 information about which peers (may) have already seen this request.
7653 This Bloom filter is used to ensure that DHT messages never loop back to
7654 a peer that has already processed the request.
7655 Additionally, the message includes the current hop counter and, depending
7656 on the routing options, the message may include the full path that the
7657 message has taken so far.
7658 The Bloom filter should already contain the identity of the previous hop;
7659 however, the path should not include the identity of the previous hop and
7660 the receiver should append the identity of the sender to the path, not
7661 its own identity (this is done to reduce bandwidth).
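
The interplay between the Bloom filter and the recorded path can be
illustrated with a small model. The Python sketch below is only an
illustration under stated assumptions: the class @code{PeerBloomFilter},
the function @code{handle_put}, the filter size, the number of hash
functions and the use of SHA-256 are all chosen for this example and do
not describe the actual wire format or implementation.

@example
```python
import hashlib


class PeerBloomFilter:
    """Fixed-size Bloom filter over peer identities (toy parameters)."""

    def __init__(self, bits=1024, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.field = 0  # Python integer used as a bit array

    def _positions(self, peer_id):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + peer_id).digest()
            yield int.from_bytes(digest[:4], "big") % self.bits

    def add(self, peer_id):
        for pos in self._positions(peer_id):
            self.field |= 1 << pos

    def may_contain(self, peer_id):
        return all(self.field >> pos & 1 for pos in self._positions(peer_id))


def handle_put(my_id, sender_id, bloom, path):
    """Return the path to forward, or None if the PUT must be dropped."""
    if bloom.may_contain(my_id):
        return None  # we (probably) processed this request before
    bloom.add(my_id)  # so that the next hop finds us in the filter
    # The sender did not put itself on the path; the receiver does it.
    return path + [sender_id]
```
@end example

Because a Bloom filter can report false positives, a peer may
occasionally drop a request it has never seen; it will never forward a
request it has already processed.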
7662
7663 @node GETting data from the DHT2
7664 @subsubsection GETting data from the DHT2
7665
7666 @c %**end of header
7667
7668 A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
7669 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the
7670 usual information about the request (type, routing options, desired
7671 replication level for the request, the key and the extended query), a GET
7672 request also contains a hop counter, a Bloom filter over the peers
7673 that have processed the request already and depending on the routing
7674 options the full path traversed by the GET.
7675 Finally, a GET request includes a variable-size second Bloom filter and a
7676 so-called Bloom filter mutator value which together indicate which
7677 replies the sender has already seen. During the lookup, each block that
matches the block type, key and extended query is additionally subjected
7679 to a test against this Bloom filter.
7680 The block plugin is expected to take the hash of the block and combine it
7681 with the mutator value and check if the result is not yet in the Bloom
7682 filter. The originator of the query will from time to time modify the
7683 mutator to (eventually) allow false-positives filtered by the Bloom filter
7684 to be returned.
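
The role of the mutator can be shown with a simplified reply filter.
The following Python sketch is a model under assumptions, not the code of
any block plugin: the helper names, the 64-bit filter, the three bit
positions per reply and the way the mutator is mixed into the hash are
all hypothetical.

@example
```python
import hashlib


def _bit_positions(block, mutator, bits, hashes=3):
    """Hash the block, mix in the mutator, derive filter bit positions."""
    block_hash = hashlib.sha512(block).digest()
    mixed = hashlib.sha512(mutator.to_bytes(4, "big") + block_hash).digest()
    return [int.from_bytes(mixed[4 * i:4 * i + 4], "big") % bits
            for i in range(hashes)]


def build_reply_filter(seen_blocks, mutator, bits=64):
    """Bloom filter (as an int) over replies the requester already has."""
    field = 0
    for block in seen_blocks:
        for pos in _bit_positions(block, mutator, bits):
            field |= 1 << pos
    return field


def is_filtered(block, field, mutator, bits=64):
    """True if the reply is (probably) already known to the requester."""
    return all(field >> pos & 1
               for pos in _bit_positions(block, mutator, bits))
```
@end example

Replies that were added to the filter are suppressed under every mutator
value. A new reply that happens to collide under one mutator maps to
different bits once the originator rebuilds the filter with another
mutator, so it will usually pass the test in a later round.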
7685
7686 Peers that receive a GET request perform a local lookup (depending on
7687 their proximity to the key and the query options) and forward the request
7688 to other peers.
7689 They then remember the request (including the Bloom filter for blocking
7690 duplicate results) and when they obtain a matching, non-filtered response
7691 a @code{struct PeerResultMessage} of type
7692 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous
7693 hop.
7694 Whenever a result is forwarded, the block plugin is used to update the
7695 Bloom filter accordingly, to ensure that the same result is never
7696 forwarded more than once.
7697 The DHT service may also cache forwarded results locally if the
7698 "CACHE_RESULTS" option is set to "YES" in the configuration.
7699
7700 @cindex GNS
7701 @cindex GNU Name System
7702 @node GNU Name System (GNS)
7703 @section GNU Name System (GNS)
7704
7705 @c %**end of header
7706
7707 The GNU Name System (GNS) is a decentralized database that enables users
7708 to securely resolve names to values.
7709 Names can be used to identify other users (for example, in social
7710 networking), or network services (for example, VPN services running at a
7711 peer in GNUnet, or purely IP-based services on the Internet).
7712 Users interact with GNS by typing in a hostname that ends in a
7713 top-level domain that is configured in the ``GNS'' section, matches
7714 an identity of the user or ends in a Base32-encoded public key.
7715
Videos giving an overview of most of GNS and the motivations behind
it are available here and here.
7718 The remainder of this chapter targets developers that are familiar with
7719 high level concepts of GNS as presented in these talks.
7720 @c TODO: Add links to here and here and to these.
7721
7722 GNS-aware applications should use the GNS resolver to obtain the
7723 respective records that are stored under that name in GNS.
7724 Each record consists of a type, value, expiration time and flags.
7725
The type specifies the format of the value. Types below 65536 correspond
to DNS record types; larger values are used for GNS-specific records.
7728 Applications can define new GNS record types by reserving a number and
7729 implementing a plugin (which mostly needs to convert the binary value
7730 representation to a human-readable text format and vice-versa).
7731 The expiration time specifies how long the record is to be valid.
7732 The GNS API ensures that applications are only given non-expired values.
7733 The flags are typically irrelevant for applications, as GNS uses them
7734 internally to control visibility and validity of records.
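
As an illustration of the record model described above, the following
Python sketch represents a record as a tuple of type, value, expiration
time and flags. It is a simplified model: the names @code{Record},
@code{is_dns_type} and @code{non_expired} are hypothetical, and the
expiration time is assumed to be an absolute timestamp in seconds.

@example
```python
import time
from collections import namedtuple

Record = namedtuple("Record", "record_type value expiration flags")

DNS_TYPE_LIMIT = 65536  # types below this number are DNS record types


def is_dns_type(record_type):
    """True for DNS record types, False for GNS-specific ones."""
    return 0 <= record_type < DNS_TYPE_LIMIT


def non_expired(records, now=None):
    """Drop expired records, mirroring what applications get to see."""
    now = time.time() if now is None else now
    return [r for r in records if r.expiration > now]
```
@end example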
7735
7736 Records are stored along with a signature.
7737 The signature is generated using the private key of the authoritative
7738 zone. This allows any GNS resolver to verify the correctness of a
7739 name-value mapping.
7740
7741 Internally, GNS uses the NAMECACHE to cache information obtained from
7742 other users, the NAMESTORE to store information specific to the local
7743 users, and the DHT to exchange data between users.
7744 A plugin API is used to enable applications to define new GNS
7745 record types.
7746
7747 @menu
7748 * libgnunetgns::
7749 * libgnunetgnsrecord::
7750 * GNS plugins::
7751 * The GNS Client-Service Protocol::
7752 * Hijacking the DNS-Traffic using gnunet-service-dns::
7753 * Serving DNS lookups via GNS on W32::
7754 * Importing DNS Zones into GNS::
7755 @end menu
7756
7757 @node libgnunetgns
7758 @subsection libgnunetgns
7759
7760 @c %**end of header
7761
7762 The GNS API itself is extremely simple. Clients first connect to the
7763 GNS service using @code{GNUNET_GNS_connect}.
7764 They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel
7765 pending lookups using @code{GNUNET_GNS_lookup_cancel}.
7766 Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}.
7767
7768 @menu
7769 * Looking up records::
7770 * Accessing the records::
7771 * Creating records::
7772 * Future work::
7773 @end menu
7774
7775 @node Looking up records
7776 @subsubsection Looking up records
7777
7778 @c %**end of header
7779
7780 @code{GNUNET_GNS_lookup} takes a number of arguments:
7781
7782 @table @asis
@item handle
This is simply the GNS connection handle from
@code{GNUNET_GNS_connect}.
@item name
The client needs to specify the name to
be resolved. This can be any valid DNS or GNS hostname.
@item zone
The client
needs to specify the public key of the GNS zone against which the
resolution should be done.
Note that a key must be provided; the client should
look up plausible values using its configuration,
the identity service and by attempting to interpret the
TLD as a base32-encoded public key.
@item type
This is the desired GNS or DNS record type
to look for. While all records for the given name will be returned, this
can be important if the client wants to resolve record types that
themselves delegate resolution, such as CNAME, PKEY or GNS2DNS.
Resolving a record of any of these types will only work if the respective
record type is specified in the request, as the GNS resolver will
otherwise follow the delegation and return the records from the
respective destination, instead of the delegating record.
@item only_cached
This argument should typically be set to
@code{GNUNET_NO}. Setting it to @code{GNUNET_YES} disables resolution via
the overlay network.
@item shorten_zone_key
If GNS encounters new names during resolution,
their respective zones can automatically be learned and added to the
"shorten zone". If this is desired, clients must pass the private key of
the shorten zone. If NULL is passed, shortening is disabled.
@item proc
This argument identifies
the function to call with the result. It is given @code{proc_cls}, the
number of records found (possibly zero) and the array of the records as
arguments.
@code{proc} will only be called once. After @code{proc} has been called,
the lookup must no longer be canceled.
@item proc_cls
The closure for @code{proc}.
7815 @end table
7816
7817 @node Accessing the records
7818 @subsubsection Accessing the records
7819
7820 @c %**end of header
7821
7822 The @code{libgnunetgnsrecord} library provides an API to manipulate the
7823 GNS record array that is given to proc. In particular, it offers
7824 functions such as converting record values to human-readable
7825 strings (and back). However, most @code{libgnunetgnsrecord} functions are
7826 not interesting to GNS client applications.
7827
7828 For DNS records, the @code{libgnunetdnsparser} library provides
7829 functions for parsing (and serializing) common types of DNS records.
7830
7831 @node Creating records
7832 @subsubsection Creating records
7833
7834 @c %**end of header
7835
7836 Creating GNS records is typically done by building the respective record
7837 information (possibly with the help of @code{libgnunetgnsrecord} and
7838 @code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
7839 publish the information. The GNS API is not involved in this
7840 operation.
7841
7842 @node Future work
7843 @subsubsection Future work
7844
7845 @c %**end of header
7846
7847 In the future, we want to expand @code{libgnunetgns} to allow
7848 applications to observe shortening operations performed during GNS
7849 resolution, for example so that users can receive visual feedback when
7850 this happens.
7851
7852 @node libgnunetgnsrecord
7853 @subsection libgnunetgnsrecord
7854
7855 @c %**end of header
7856
7857 The @code{libgnunetgnsrecord} library is used to manipulate GNS
7858 records (in plaintext or in their encrypted format).
7859 Applications mostly interact with @code{libgnunetgnsrecord} by using the
7860 functions to convert GNS record values to strings or vice-versa, or to
7861 lookup a GNS record type number by name (or vice-versa).
7862 The library also provides various other functions that are mostly
7863 used internally within GNS, such as converting keys to names, checking for
7864 expiration, encrypting GNS records to GNS blocks, verifying GNS block
7865 signatures and decrypting GNS records from GNS blocks.
7866
7867 We will now discuss the four commonly used functions of the API.@
7868 @code{libgnunetgnsrecord} does not perform these operations itself,
7869 but instead uses plugins to perform the operation.
7870 GNUnet includes plugins to support common DNS record types as well as
7871 standard GNS record types.
7872
7873 @menu
7874 * Value handling::
7875 * Type handling::
7876 @end menu
7877
7878 @node Value handling
7879 @subsubsection Value handling
7880
7881 @c %**end of header
7882
7883 @code{GNUNET_GNSRECORD_value_to_string} can be used to convert
7884 the (binary) representation of a GNS record value to a human readable,
7885 0-terminated UTF-8 string.
7886 NULL is returned if the specified record type is not supported by any
7887 available plugin.
7888
7889 @code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a
7890 human readable string to the respective (binary) representation of
7891 a GNS record value.
7892
7893 @node Type handling
7894 @subsubsection Type handling
7895
7896 @c %**end of header
7897
7898 @code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the
7899 numeric value associated with a given typename. For example, given the
typename "A" (for DNS A records), the function will return the number 1.
7901 A list of common DNS record types is
7902 @uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}.
7903 Note that not all DNS record types are supported by GNUnet GNSRECORD
7904 plugins at this time.
7905
7906 @code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the
7907 typename associated with a given numeric value.
7908 For example, given the type number 1, the function will return the
7909 typename "A".
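
The two conversions are inverse table lookups. The Python sketch below
models them for a handful of DNS record types, using the type numbers
assigned to those types in DNS. It is not the GNSRECORD implementation:
the function names are chosen for this example, and returning
@code{None} for unknown input is a choice of this sketch, not a statement
about the return values of the real functions.

@example
```python
# Numbers assigned to a few DNS record types.
DNS_TYPES = [("A", 1), ("NS", 2), ("CNAME", 5), ("SOA", 6), ("MX", 15),
             ("TXT", 16), ("AAAA", 28), ("SRV", 33)]


def typename_to_number(typename):
    """Return the type number, or None if the table lacks the name."""
    for name, number in DNS_TYPES:
        if name == typename:
            return number
    return None


def number_to_typename(number):
    """Return the typename, or None if the table lacks the number."""
    for name, candidate in DNS_TYPES:
        if candidate == number:
            return name
    return None
```
@end example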
7910
7911 @node GNS plugins
7912 @subsection GNS plugins
7913
7914 @c %**end of header
7915
7916 Adding a new GNS record type typically involves writing (or extending) a
7917 GNSRECORD plugin. The plugin needs to implement the
7918 @code{gnunet_gnsrecord_plugin.h} API which provides basic functions that
7919 are needed by GNSRECORD to convert typenames and values of the respective
7920 record type to strings (and back).
7921 These gnsrecord plugins are typically implemented within their respective
7922 subsystems.
7923 Examples for such plugins can be found in the GNSRECORD, GNS and
7924 CONVERSATION subsystems.
7925
7926 The @code{libgnunetgnsrecord} library is then used to locate, load and
7927 query the appropriate gnsrecord plugin.
7928 Which plugin is appropriate is determined by the record type (which is
just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all
gnsrecord plugins that are installed at the local peer and forwards the
application request to the plugins. If the record type is not
supported by the plugin, it should simply return an error code.

The central functions of the GNSRECORD APIs (plugin and main library) are
the same four functions for converting between values and strings, and
typenames and numbers documented in the previous subsection.
7937
7938 @node The GNS Client-Service Protocol
7939 @subsection The GNS Client-Service Protocol
7940 @c %**end of header
7941
7942 The GNS client-service protocol consists of two simple messages, the
7943 @code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP}
7944 message contains a unique 32-bit identifier, which will be included in the
7945 corresponding response. Thus, clients can send many lookup requests in
7946 parallel and receive responses out-of-order.
7947 A @code{LOOKUP} request also includes the public key of the GNS zone,
7948 the desired record type and fields specifying whether shortening is
7949 enabled or networking is disabled. Finally, the @code{LOOKUP} message
7950 includes the name to be resolved.
7951
7952 The response includes the number of records and the records themselves
7953 in the format created by @code{GNUNET_GNSRECORD_records_serialize}.
7954 They can thus be deserialized using
7955 @code{GNUNET_GNSRECORD_records_deserialize}.
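
The effect of the request identifier can be sketched as follows. This
Python model only illustrates how responses that arrive out of order are
matched to pending lookups; the class @code{LookupClient} and its methods
are hypothetical and no messages are actually sent.

@example
```python
import itertools


class LookupClient:
    """Toy model of matching LOOKUP_RESULT messages to pending LOOKUPs."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = dict()  # request ID -> (name, callback)

    def lookup(self, name, callback):
        request_id = next(self._ids) % 2**32  # the ID field has 32 bits
        self._pending[request_id] = (name, callback)
        return request_id  # a real client would now send the LOOKUP

    def cancel(self, request_id):
        self._pending.pop(request_id, None)

    def on_result(self, request_id, records):
        entry = self._pending.pop(request_id, None)
        if entry is not None:  # unknown or canceled IDs are ignored
            entry[1](records)
```
@end example

Since each entry is removed when its result arrives, the callback of a
lookup is invoked at most once, regardless of the order in which the
service answers.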
7956
7957 @node Hijacking the DNS-Traffic using gnunet-service-dns
7958 @subsection Hijacking the DNS-Traffic using gnunet-service-dns
7959
7960 @c %**end of header
7961
This section documents how gnunet-service-dns (and
gnunet-helper-dns) intercept DNS queries from the local system.
7964 This is merely one method for how we can obtain GNS queries.
7965 It is also possible to change @code{resolv.conf} to point to a machine
7966 running @code{gnunet-dns2gns} or to modify libc's name system switch
7967 (NSS) configuration to include a GNS resolution plugin.
7968 The method described in this chapter is more of a last-ditch catch-all
7969 approach.
7970
7971 @code{gnunet-service-dns} enables intercepting DNS traffic using policy
7972 based routing.
7973 We MARK every outgoing DNS-packet if it was not sent by our application.
7974 Using a second routing table in the Linux kernel these marked packets are
7975 then routed through our virtual network interface and can thus be
7976 captured unchanged.
7977
7978 Our application then reads the query and decides how to handle it.
7979 If the query can be addressed via GNS, it is passed to
7980 @code{gnunet-service-gns} and resolved internally using GNS.
7981 In the future, a reverse query for an address of the configured virtual
7982 network could be answered with records kept about previous forward
7983 queries.
7984 Queries that are not hijacked by some application using the DNS service
7985 will be sent to the original recipient.
7986 The answer to the query will always be sent back through the virtual
7987 interface with the original nameserver as source address.
7988
7989
7990 @menu
7991 * Network Setup Details::
7992 @end menu
7993
7994 @node Network Setup Details
7995 @subsubsection Network Setup Details
7996
7997 @c %**end of header
7998
7999 The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 \
  -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
ip rule add fwmark 3 table2
ip route add default via $VIRTUALDNS table2
@end example

The first rule makes sure that all packets coming from a port our
application opened beforehand (@code{$LOCALPORT}) will be routed normally.
The second rule marks every other packet to a DNS server with mark 3
(chosen arbitrarily). The third command adds a routing policy based on
this mark, which selects the second routing table, and the fourth command
adds a default route via @code{$VIRTUALDNS} to that table.
8014
8015 @node Serving DNS lookups via GNS on W32
8016 @subsection Serving DNS lookups via GNS on W32
8017
8018 @c %**end of header
8019
8020 This section documents how the libw32nsp (and
8021 gnunet-gns-helper-service-w32) do DNS resolutions of DNS queries on the
8022 local system. This only applies to GNUnet running on W32.
8023
8024 W32 has a concept of "Namespaces" and "Namespace providers".
8025 These are used to present various name systems to applications in a
8026 generic way.
8027 Namespaces include DNS, mDNS, NLA and others. For each namespace any
8028 number of providers could be registered, and they are queried in an order
8029 of priority (which is adjustable).
8030
Applications can resolve names by using the WSALookupService*() family of
8032 functions.
8033
8034 However, these are WSA-only facilities. Common BSD socket functions for
8035 namespace resolutions are gethostbyname and getaddrinfo (among others).
8036 These functions are implemented internally (by default - by mswsock,
8037 which also implements the default DNS provider) as wrappers around
8038 WSALookupService*() functions (see "Sample Code for a Service Provider"
8039 on MSDN).
8040
8041 On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be
8042 installed into the system by using w32nsp-install (and uninstalled by
8043 w32nsp-uninstall), as described in "Installation Handbook".
8044
8045 libw32nsp is very simple and has almost no dependencies. As a response to
8046 NSPLookupServiceBegin(), it only checks that the provider GUID passed to
8047 it by the caller matches GNUnet DNS Provider GUID,
8048 then connects to
8049 gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the
8050 name resolution request there, returning the connected socket to the
8051 caller.
8052
8053 When the caller invokes NSPLookupServiceNext(), libw32nsp reads a
8054 completely formed reply from that socket, unmarshalls it, then gives
8055 it back to the caller.
8056
At the moment gnunet-gns-helper-service-w32 only ever gives
one reply, and subsequent calls to NSPLookupServiceNext() will fail
with WSA_NODATA (the first call to NSPLookupServiceNext() might also fail
if GNS failed to find the name, or there was an error connecting to it).
8061
8062 gnunet-gns-helper-service-w32 does most of the processing:
8063
8064 @itemize @bullet
8065 @item Maintains a connection to GNS.
8066 @item Reads GNS config and loads appropriate keys.
8067 @item Checks service GUID and decides on the type of record to look up,
8068 refusing to make a lookup outright when unsupported service GUID is
8069 passed.
8070 @item Launches the lookup
8071 @end itemize
8072
When the lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
8074 reply (including filling a WSAQUERYSETW structure and, possibly, a binary
8075 blob with a hostent structure for gethostbyname() client), marshalls it,
8076 and sends it back to libw32nsp. If no records were found, it sends an
8077 empty header.
8078
8079 This works for most normal applications that use gethostbyname() or
8080 getaddrinfo() to resolve names, but fails to do anything with
8081 applications that use alternative means of resolving names (such as
8082 sending queries to a DNS server directly by themselves).
This includes some well-known utilities, like "ping" and "nslookup".
8084
8085 @node Importing DNS Zones into GNS
8086 @subsection Importing DNS Zones into GNS
8087
8088 @c %**end of header
8089
8090 This section discusses the challenges and problems faced when writing the
8091 Ascension tool. It also takes a look at possible improvements in the future.
8092
8093 @menu
8094 * Conversions between DNS and GNS::
8095 * DNS Zone Size::
8096 * Performance::
8097 @end menu
8098
8099 @node Conversions between DNS and GNS
8100 @subsubsection Conversions between DNS and GNS
8101
The differences between the two name systems lie in the details
and are not always transparent. For instance, an SRV record is converted
to a GNS-only BOX record.

The following example shows an existing SRV record and the BOX record it
is converted to:
8107
8108 @example
8109 # SRV
8110 # _service._proto.name. TTL class SRV priority weight port target
8111 _sip._tcp.example.com. 14000 IN SRV     0 0 5060 www.example.com.
8112 # BOX
8113 # TTL BOX flags port protocol recordtype priority weight port target
8114 14000 BOX n 5060 6 33 0 0 5060 www.example.com
8115 @end example
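
The conversion shown in this example can be written down as a short
function. The following Python sketch is not Ascension's code; the
function name @code{srv_to_box} and the small protocol table are
assumptions made for illustration. It reproduces the BOX record line
from the example for the given SRV line.

@example
```python
PROTOCOL_NUMBERS = dict(tcp=6, udp=17)  # IP protocol numbers
SRV_TYPE = 33  # DNS record type number of SRV


def srv_to_box(srv_line):
    """Turn one SRV zone file line into (name, BOX record line)."""
    owner, ttl, _cls, rtype, priority, weight, port, target = srv_line.split()
    if rtype != "SRV":
        raise ValueError("not an SRV record: " + srv_line)
    # The owner name has the form _service._proto.name.
    _service, proto, name = owner.split(".", 2)
    proto_number = PROTOCOL_NUMBERS[proto.lstrip("_")]
    value = "%s %d %d %s %s %s %s" % (port, proto_number, SRV_TYPE, priority,
                                      weight, port, target.rstrip("."))
    return name, "%s BOX n %s" % (ttl, value)
```
@end example

The port appears twice in the result: once as part of the BOX header
(port, protocol, record type) and once inside the boxed SRV value.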
8116
Other record types that require such a transformation are the MX record
type and the SOA record type.
8119
8120 Transformation of a SOA record into GNS works as described in the following
8121 example. Very important to note are the rname and mname keys.
8122 @example
8123 # BIND syntax for a clean SOA record
@@   IN SOA master.example.com. hostmaster.example.com. (
8125     2017030300 ; serial
8126     3600       ; refresh
8127     1800       ; retry
8128     604800     ; expire
8129     600 )      ; ttl
8130 # Recordline for adding the record
$ gnunet-namestore -z example.com -a -n @@ -t SOA -V rname=master.example.com \
8132   mname=hostmaster.example.com 2017030300,3600,1800,604800,600 -e 7200s
8133 @end example
8134
8135 The transformation of MX records is done in a simple way.
8136 @example
8137 # mail.example.com. 3600 IN MX 10 mail.example.com.
8138 $ gnunet-namestore -z example.com -n mail -R 3600 MX n 10,mail
8139 @end example
8140
Finally, one of the biggest difficulties was handling the NS records that
are found in top-level domain zones. The intended behaviour is to add
GNS2DNS records for them so that gnunet-gns can resolve records for those
domains on its own. This requires migration of the DNS GLUE records as
well, provided that they are within the same zone.

The following two examples show one NS record with a GLUE record and one
without a GLUE record. Both take place in the 'com' TLD.
8149
8150 @example
8151 # ns1.example.com 86400 IN A 127.0.0.1
8152 # example.com 86400 IN NS ns1.example.com.
8153 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n example.com@@127.0.0.1
8154
8155 # example.com 86400 IN NS ns1.example.org.
8156 $ gnunet-namestore -z com -n example -R 86400 GNS2DNS n example.com@@ns1.example.org
8157 @end example
8158
8159 As you can see, one of the GNS2DNS records has an IP address listed and the
8160 other one a DNS name. For the first one there is a GLUE record to do the
8161 translation directly and the second one will issue another DNS query to figure
8162 out the IP of ns1.example.org.
8163
8164 A solution was found by creating a hierarchical zone structure in GNS and linking
8165 the zones using PKEY records to one another. This allows the resolution of the
8166 name servers to work within GNS while not taking control over unwanted zones.
8167
8168 Currently the following record types are supported:
8169 @itemize @bullet
8170 @item A
8171 @item AAAA
8172 @item CNAME
8173 @item MX
8174 @item NS
8175 @item SRV
8176 @item TXT
8177 @end itemize
8178
8179 This is not due to a technical limitation but rather a practical one. The
8180 problem occurs with DNSSEC enabled DNS zones. As records within those zones are
8181 signed periodically, and every new signature is an update to the zone, there are
many revisions of zones. This results in a problem with bigger zones, as
there are lots of records that have been signed again without any major
changes. Also, trying to add records of unknown types that require a
different format takes time, as each of them causes a CLI call of the
namestore. Furthermore, certain record types need transformation into a
GNS-compatible format which, depending on the record type, takes more time.
8188
8189 @node DNS Zone Size
8190 @subsubsection DNS Zone Size
8191
Another very big problem exists with very large zones. When migrating a
small zone, the delay between adding records and their expiry is
negligible. However, when working with a TLD zone that has more than
1 million records, this delay becomes a problem.
8196
8197 Records will start to expire well before the zone has finished migrating. This
8198 causes unwanted anomalies when trying to resolve records.
8199
A good solution has not been found yet. One idea that floated around was
that the records should be added with the s (shadow) flag to keep the
records resolvable even if they expired. However, this would introduce the
problem of how to detect that a record has been removed from the zone,
which would require deletion of said record(s).
8205
8206 Another problem that still persists is how to refresh records. Expired records
8207 are still displayed when calling gnunet-namestore but do not resolve with
8208 gnunet-gns. When doing incremental zone transfers this becomes especially
8209 apparent.
8210
I estimate that the limit lies at about 200'000 records in a zone, as this
is the number of records that my machine is capable of adding within one
hour. This was calculated by running cProfile on the application with a
zone of 5000 records and extrapolating how long a much bigger zone with
8 million records would take. This results in a useful metric of records
migrated per hour.
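
The extrapolation can be reproduced with simple arithmetic. In the
following Python sketch, the rate of 200'000 records per hour and the
zone sizes come from the text above; the run time of 90 seconds for 5000
records is a hypothetical value chosen because it yields exactly that
rate, and the TTL of 86400 seconds is the one used in the NS examples
earlier. The function names are made up for this example.

@example
```python
def records_per_hour(records, seconds):
    """Migration rate extrapolated from one profiled run."""
    return records * 3600.0 / seconds


def migration_hours(zone_size, rate_per_hour):
    """Hours needed to add every record of a zone once."""
    return zone_size / rate_per_hour


def expires_during_migration(zone_size, rate_per_hour, ttl_seconds):
    """True if the first records expire before the last ones are added."""
    return migration_hours(zone_size, rate_per_hour) * 3600 > ttl_seconds
```
@end example

At 200'000 records per hour, a zone with 8 million records takes 40 hours
to migrate, which is longer than a TTL of 86400 seconds (24 hours). This
is the situation in which records start to expire before the migration
has finished.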
8216
8217 @node Performance
8218 @subsubsection Performance
The performance when migrating a zone using the Ascension tool is limited
by a handful of factors. First of all, Ascension is written in Python3 and
calls the CLI tools of GNUnet. Furthermore, all the records that are added
to the same label are signed using the zone's private key. This signing
operation is very resource heavy and was optimized during development by
adding the '-R' (Recordline) option to gnunet-namestore. This allows
adding multiple records at once using the CLI.
8226
8227 The result of this was a much faster migration of TLD zones, as most records
8228 with the same label have two name servers.
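
The gain from adding all records of a label at once can be illustrated by
grouping the records before calling the CLI. The following Python sketch
is not Ascension's code; the function @code{group_by_label} is
hypothetical, and the record line layout (TTL, type, flags, value) merely
follows the examples earlier in this section. How the grouped lines are
handed to gnunet-namestore is deliberately left out.

@example
```python
from collections import OrderedDict


def group_by_label(records):
    """Group (label, ttl, rtype, value) tuples so each label is added once."""
    groups = OrderedDict()
    for label, ttl, rtype, value in records:
        line = "%d %s n %s" % (ttl, rtype, value)
        groups.setdefault(label, []).append(line)
    return groups
```
@end example

With one CLI call per group instead of one per record, a label that has
two name server records needs one call instead of two.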
8229
Another possible improvement is to use multiple threads when calling the
GNUnet CLI tools. This could be implemented by simply creating more
workers in the program, but the performance improvements were not tested.
8233
8234 During the entire development of Ascension sqlite was used as a database
8235 backend for GNUnet. Other backends have not been tested yet.
8236
In conclusion, there are still bottlenecks in the program, namely the
signing process and the single-threaded implementation. In the future, a
solution that uses the C API would be cleaner and better.
8240
8241 @cindex GNS Namecache
8242 @node GNS Namecache
8243 @section GNS Namecache
8244
8245 @c %**end of header
8246
8247 The NAMECACHE subsystem is responsible for caching (encrypted) resolution
8248 results of the GNU Name System (GNS). GNS makes zone information available
8249 to other users via the DHT. However, as accessing the DHT for every
8250 lookup is expensive (and as the DHT's local cache is lost whenever the
8251 peer is restarted), GNS uses the NAMECACHE as a more persistent cache for
8252 DHT lookups.
8253 Thus, instead of always looking up every name in the DHT, GNS first
8254 checks if the result is already available locally in the NAMECACHE.
GNS queries the DHT only if there is no result in the NAMECACHE.
8256 The NAMECACHE stores data in the same (encrypted) format as the DHT.
8257 It thus makes no sense to iterate over all items in the
8258 NAMECACHE --- the NAMECACHE does not have a way to provide the keys
8259 required to decrypt the entries.
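
The lookup order described above can be sketched as follows. This Python
model is an illustration only: the class @code{NameCacheModel}, the
function @code{resolve_block} and the @code{dht_get} callback are
hypothetical, blocks are opaque byte strings, and expiration times are
assumed to be absolute timestamps in seconds.

@example
```python
import time


class NameCacheModel:
    """Toy stand-in for the NAMECACHE: query hash -> (expiration, block)."""

    def __init__(self):
        self._blocks = dict()

    def store(self, query, expiration, block):
        self._blocks[query] = (expiration, block)

    def lookup(self, query, now):
        entry = self._blocks.get(query)
        if entry is None or entry[0] <= now:
            return None  # missing or expired
        return entry


def resolve_block(query, cache, dht_get, now=None):
    """Return the encrypted block, asking the DHT only on a cache miss."""
    now = time.time() if now is None else now
    entry = cache.lookup(query, now)
    if entry is None:
        entry = dht_get(query)  # expensive network lookup
        if entry is None:
            return None
        cache.store(query, entry[0], entry[1])
    return entry[1]
```
@end example

The cache never looks inside a block. It only compares the plaintext
expiration time, which is why it can operate without any decryption keys.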
8260
Blocks in the NAMECACHE share the same expiration mechanism as blocks in
the DHT: a block expires whenever any of the records in
the (encrypted) block expires.
The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required
work to expire blocks; clients do not have to worry about this.
8267 Also, given that NAMECACHE stores only GNS blocks that local users
8268 requested, there is no configuration option to limit the size of the
8269 NAMECACHE. It is assumed to be always small enough (a few MB) to fit on
8270 the drive.
8271
8272 The NAMECACHE supports the use of different database backends via a
8273 plugin API.
8274
8275 @menu
8276 * libgnunetnamecache::
8277 * The NAMECACHE Client-Service Protocol::
8278 * The NAMECACHE Plugin API::
8279 @end menu
8280
8281 @node libgnunetnamecache
8282 @subsection libgnunetnamecache
8283
8284 @c %**end of header
8285
8286 The NAMECACHE API consists of five simple functions. First, there is
8287 @code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
8288 This returns the handle required for all other operations on the
8289 NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
8290 block into the cache.
8291 @code{GNUNET_NAMECACHE_lookup_block} can be used to lookup blocks that
8292 were stored in the NAMECACHE. Both operations can be canceled using
8293 @code{GNUNET_NAMECACHE_cancel}. Note that canceling a
8294 @code{GNUNET_NAMECACHE_block_cache} operation can result in the block
8295 being stored in the NAMECACHE --- or not. Cancellation primarily ensures
8296 that the continuation function with the result of the operation will no
8297 longer be invoked.
8298 Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
8299 NAMECACHE.
8300
8301 The maximum size of a block that can be stored in the NAMECACHE is
8302 @code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
8303
8304 @node The NAMECACHE Client-Service Protocol
8305 @subsection The NAMECACHE Client-Service Protocol
8306
8307 @c %**end of header
8308
8309 All messages in the NAMECACHE IPC protocol start with the
8310 @code{struct GNUNET_NAMECACHE_Header} which adds a request
8311 ID (32-bit integer) to the standard message header.
8312 The request ID is used to match requests with the
8313 respective responses from the NAMECACHE, as they are allowed to happen
8314 out-of-order.
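
To make the header layout concrete, the following Python sketch packs and
unpacks such a message. It rests on assumptions that are not spelled out
in this section: the standard message header is taken to be a 16-bit size
followed by a 16-bit type, all fields are taken to be in network byte
order, and the message type number used in the usage note is made up.
The function names are hypothetical as well.

@example
```python
import struct

# Assumed layout: 16-bit size, 16-bit type, 32-bit request ID, big-endian.
HEADER = struct.Struct("!HHI")


def pack_message(msg_type, request_id, payload=b""):
    """Prefix the payload with size, type and request ID."""
    size = HEADER.size + len(payload)
    return HEADER.pack(size, msg_type, request_id) + payload


def unpack_message(data):
    """Return (msg_type, request_id, payload) for one complete message."""
    size, msg_type, request_id = HEADER.unpack_from(data)
    if size != len(data):
        raise ValueError("size field does not match message length")
    return msg_type, request_id, data[HEADER.size:]
```
@end example

For instance, @code{pack_message(431, 7, b"abc")} yields an 11-byte
message, and a client that receives it can use the request ID 7 to find
the matching pending operation even if other responses arrived first.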
8315
8316
8317 @menu
8318 * Lookup::
8319 * Store::
8320 @end menu
8321
8322 @node Lookup
8323 @subsubsection Lookup
8324
8325 @c %**end of header
8326
The @code{struct LookupBlockMessage} is used to look up a block stored in
8328 the cache.
8329 It contains the query hash. The NAMECACHE always responds with a
8330 @code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
8331 response, it sets the expiration time in the response to zero.
8332 Otherwise, the response is expected to contain the expiration time, the
8333 ECDSA signature, the derived key and the (variable-size) encrypted data
8334 of the block.
8335
8336 @node Store
8337 @subsubsection Store
8338
8339 @c %**end of header
8340
8341 The @code{struct BlockCacheMessage} is used to cache a block in the
8342 NAMECACHE.
8343 It has the same structure as the @code{struct LookupBlockResponseMessage}.
8344 The service responds with a @code{struct BlockCacheResponseMessage} which
8345 contains the result of the operation (success or failure).
8346 In the future, we might want to make it possible to provide an error
8347 message as well.
8348
8349 @node The NAMECACHE Plugin API
8350 @subsection The NAMECACHE Plugin API
8351 @c %**end of header
8352
8353 The NAMECACHE plugin API consists of two functions, @code{cache_block} to
store a block in the database, and @code{lookup_block} to look up a block
8355 in the database.
8356
8357
8358 @menu
8359 * Lookup2::
8360 * Store2::
8361 @end menu
8362
8363 @node Lookup2
8364 @subsubsection Lookup2
8365
8366 @c %**end of header
8367
8368 The @code{lookup_block} function is expected to return at most one block
8369 to the iterator, and return @code{GNUNET_NO} if there were no non-expired
8370 results.
8371 If there are multiple non-expired results in the cache, the lookup is
8372 supposed to return the result with the largest expiration time.
8373
8374 @node Store2
8375 @subsubsection Store2
8376
8377 @c %**end of header
8378
8379 The @code{cache_block} function is expected to try to store the block in
8380 the database, and return @code{GNUNET_SYSERR} if this was not possible
8381 for any reason.
8382 Furthermore, @code{cache_block} is expected to implicitly perform cache
8383 maintenance and purge blocks from the cache that have expired. Note that
8384 @code{cache_block} might encounter the case where the database already has
8385 another block stored under the same key. In this case, the plugin must
8386 ensure that the block with the larger expiration time is preserved.
This can be done either by simply adding new blocks and selecting
8388 for the most recent expiration time during lookup, or by checking which
8389 block is more recent during the store operation.
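
The second strategy (deciding during the store operation) can be sketched
as follows. This is a toy in-memory model written in Python, not a real
plugin: the class and its method signatures are hypothetical, expiration
times are plain numbers, and the @code{GNUNET_SYSERR} and
@code{GNUNET_NO} results are represented by @code{True} and @code{None}.

@example
class MemoryCache:
    """Toy stand-in for a NAMECACHE database plugin (not the C API)."""

    def __init__(self):
        self.blocks = dict()  # query hash -> (expiration, data)

    def cache_block(self, query, expiration, data, now):
        # Implicit maintenance: purge everything that has expired.
        expired = [q for q, (exp, _) in self.blocks.items() if exp <= now]
        for q in expired:
            del self.blocks[q]
        # Keep whichever block under this key expires later.
        old = self.blocks.get(query)
        if old is None or old[0] < expiration:
            self.blocks[query] = (expiration, data)
        return True

    def lookup_block(self, query, now):
        # At most one non-expired block is returned, otherwise None.
        entry = self.blocks.get(query)
        if entry is None or entry[0] <= now:
            return None
        return entry
@end example

Because the comparison happens at store time, the lookup only ever finds
a single candidate per key and does not need to sort by expiration time.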
8390
8391 @cindex REVOCATION Subsystem
8392 @node REVOCATION Subsystem
8393 @section REVOCATION Subsystem
8394 @c %**end of header
8395
8396 The REVOCATION subsystem is responsible for key revocation of Egos.
If users learn that their private key has been compromised, or if they
have lost it, they can use the REVOCATION system to inform all other
users that the key is no longer valid.
8400 The subsystem thus includes ways to query for the validity of keys and to
8401 propagate revocation messages.
8402
8403 @menu
8404 * Dissemination::
8405 * Revocation Message Design Requirements::
8406 * libgnunetrevocation::
8407 * The REVOCATION Client-Service Protocol::
8408 * The REVOCATION Peer-to-Peer Protocol::
8409 @end menu
8410
8411 @node Dissemination
8412 @subsection Dissemination
8413
8414 @c %**end of header
8415
8416 When a revocation is performed, the revocation is first of all
8417 disseminated by flooding the overlay network.
8418 The goal is to reach every peer, so that when a peer needs to check if a
8419 key has been revoked, this will be purely a local operation where the
8420 peer looks at its local revocation list. Flooding the network is also the
8421 most robust form of key revocation --- an adversary would have to control
8422 a separator of the overlay graph to restrict the propagation of the
8423 revocation message. Flooding is also very easy to implement --- peers that
8424 receive a revocation message for a key that they have never seen before
8425 simply pass the message to all of their neighbours.
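
The flooding rule described above can be sketched as follows. The code is
a simulation in Python over a hypothetical in-memory overlay graph (a
mapping from each peer to the list of its neighbours); it does not use
any GNUnet API and the function name is made up for the example.

@example
from collections import deque

def flood(neighbours, origin):
    """Flood one revocation from origin; return (reached, messages sent)."""
    seen = set([origin])
    queue = deque([origin])
    sent = 0
    while queue:
        peer = queue.popleft()
        # A peer forwards a revocation only the first time it sees it,
        # and then passes it to all of its neighbours.
        for other in neighbours[peer]:
            sent += 1
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen, sent
@end example

In this sketch every reached peer sends the message once over each of its
links, so the number of messages grows in proportion to the number of
edges of the overlay graph. Peers without any path to the origin are
never reached.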
8426
8427 Flooding can only distribute the revocation message to peers that are
8428 online.
8429 In order to notify peers that join the network later, the revocation
8430 service performs efficient set reconciliation over the sets of known
8431 revocation messages whenever two peers (that both support REVOCATION
8432 dissemination) connect.
8433 The SET service is used to perform this operation efficiently.
8434
8435 @node Revocation Message Design Requirements
8436 @subsection Revocation Message Design Requirements
8437
8438 @c %**end of header
8439
8440 However, flooding is also quite costly, creating O(|E|) messages on a
8441 network with |E| edges.
8442 Thus, revocation messages are required to contain a proof-of-work, the
8443 result of an expensive computation (which, however, is cheap to verify).
8444 Only peers that have expended the CPU time necessary to provide
8445 this proof will be able to flood the network with the revocation message.
8446 This ensures that an attacker cannot simply flood the network with
8447 millions of revocation messages. The proof-of-work required by GNUnet is
set to take days on a typical PC to compute; if the ability to quickly
revoke a key is needed, users have the option to pre-compute revocation
messages, store them off-line and use them as soon as the key must be
revoked.
8451
8452 Revocation messages must also be signed by the private key that is being
8453 revoked. Thus, they can only be created while the private key is in the
8454 possession of the respective user. This is another reason to create a
8455 revocation message ahead of time and store it in a secure location.
8456
8457 @node libgnunetrevocation
8458 @subsection libgnunetrevocation
8459
8460 @c %**end of header
8461
8462 The REVOCATION API consists of two parts, to query and to issue
8463 revocations.
8464
8465
8466 @menu
8467 * Querying for revoked keys::
8468 * Preparing revocations::
8469 * Issuing revocations::
8470 @end menu
8471
8472 @node Querying for revoked keys
8473 @subsubsection Querying for revoked keys
8474
8475 @c %**end of header
8476
8477 @code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
8478 key has been revoked.
8479 The given callback will be invoked with the result of the check.
8480 The query can be canceled using @code{GNUNET_REVOCATION_query_cancel} on
8481 the return value.
8482
8483 @node Preparing revocations
8484 @subsubsection Preparing revocations
8485
8486 @c %**end of header
8487
8488 It is often desirable to create a revocation record ahead-of-time and
8489 store it in an off-line location to be used later in an emergency.
8490 This is particularly true for GNUnet revocations, where performing the
8491 revocation operation itself is computationally expensive and thus is
8492 likely to take some time.
8493 Thus, if users want the ability to perform revocations quickly in an
8494 emergency, they must pre-compute the revocation message.
8495 The revocation API enables this with two functions that are used to
8496 compute the revocation message, but not trigger the actual revocation
8497 operation.
8498
8499 @code{GNUNET_REVOCATION_check_pow} should be used to calculate the
8500 proof-of-work required in the revocation message. This function takes the
8501 public key, the required number of bits for the proof of work (which in
8502 GNUnet is a network-wide constant) and finally a proof-of-work number as
8503 arguments.
8504 The function then checks if the given proof-of-work number is a valid
proof of work for the given public key. Clients preparing a revocation
are expected to call this function repeatedly (typically with a
monotonically increasing sequence of candidate proof-of-work numbers)
until one of them satisfies the check.
8509 That number should then be saved for later use in the revocation
8510 operation.
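
The search loop described above can be sketched as follows. This is an
illustration in Python only: it uses SHA-256 over the number and the key
as a stand-in for the real proof-of-work function, so the numbers it
finds are not valid GNUnet proofs, and the names @code{check_pow} and
@code{find_pow} as well as their argument order are assumptions of the
example, not the signature of @code{GNUNET_REVOCATION_check_pow}.

@example
import hashlib

def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def check_pow(public_key, nonce, required_bits):
    """Stand-in check: hash number and key, require leading zero bits."""
    data = nonce.to_bytes(8, "big") + public_key
    digest = hashlib.sha256(data).digest()
    return leading_zero_bits(digest) >= required_bits

def find_pow(public_key, required_bits):
    """Try increasing numbers until one satisfies the check."""
    nonce = 0
    while not check_pow(public_key, nonce, required_bits):
        nonce += 1
    return nonce
@end example

Each additional required bit roughly doubles the expected number of
attempts, while verifying a found number still takes a single call to the
check function.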
8511
8512 @code{GNUNET_REVOCATION_sign_revocation} is used to generate the
8513 signature that is required in a revocation message.
8514 It takes the private key that (possibly in the future) is to be revoked
8515 and returns the signature.
8516 The signature can again be saved to disk for later use, which will then
8517 allow performing a revocation even without access to the private key.
8518
8519 @node Issuing revocations
8520 @subsubsection Issuing revocations
8521
8522
Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation}
8524 and the proof-of-work,
8525 @code{GNUNET_REVOCATION_revoke} can be used to perform the
8526 actual revocation. The given callback is called upon completion of the
8527 operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
8528 library from calling the continuation; however, in that case it is
8529 undefined whether or not the revocation operation will be executed.
8530
8531 @node The REVOCATION Client-Service Protocol
8532 @subsection The REVOCATION Client-Service Protocol
8533
8534
8535 The REVOCATION protocol consists of four simple messages.
8536
8537 A @code{QueryMessage} containing a public ECDSA key is used to check if a
8538 particular key has been revoked. The service responds with a
8539 @code{QueryResponseMessage} which simply contains a bit that says if the
8540 given public key is still valid, or if it has been revoked.
8541
8542 The second possible interaction is for a client to revoke a key by
8543 passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
8544 contains the ECDSA public key to be revoked, a signature by the
corresponding private key and the proof-of-work. The service responds
with a @code{RevocationResponseMessage} which can be used to indicate
that the @code{RevokeMessage} was invalid (for example, because the
proof-of-work is incorrect),
8548 or otherwise indicates that the revocation has been processed
8549 successfully.
8550
8551 @node The REVOCATION Peer-to-Peer Protocol
8552 @subsection The REVOCATION Peer-to-Peer Protocol
8553
8554 @c %**end of header
8555
8556 Revocation uses two disjoint ways to spread revocation information among
8557 peers.
8558 First of all, P2P gossip exchanged via CORE-level neighbours is used to
8559 quickly spread revocations to all connected peers.
8560 Second, whenever two peers (that both support revocations) connect,
8561 the SET service is used to compute the union of the respective revocation
8562 sets.
8563
8564 In both cases, the exchanged messages are @code{RevokeMessage}s which
8565 contain the public key that is being revoked, a matching ECDSA signature,
8566 and a proof-of-work.
8567 Whenever a peer learns about a new revocation this way, it first
8568 validates the signature and the proof-of-work, then stores it to disk
(typically to a file @file{$GNUNET_DATA_HOME/revocation.dat}) and finally
8570 spreads the information to all directly connected neighbours.
8571
8572 For computing the union using the SET service, the peer with the smaller
8573 hashed peer identity will connect (as a "client" in the two-party set
8574 protocol) to the other peer after one second (to reduce traffic spikes
8575 on connect) and initiate the computation of the set union.
8576 All revocation services use a common hash to identify the SET operation
8577 over revocation sets.
8578
8579 The current implementation accepts revocation set union operations from
8580 all peers at any time; however, well-behaved peers should only initiate
8581 this operation once after establishing a connection to a peer with a
8582 larger hashed peer identity.
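
The rule for deciding which peer initiates the set union can be sketched
as follows. The sketch is written in Python and makes several
assumptions: peer identities are plain byte strings, SHA-512 is used as
the hash, and the union is computed directly on in-memory sets instead of
through the SET service. The one-second delay is left out and all
function names are hypothetical.

@example
import hashlib

def hashed_identity(peer_id):
    """Hash of a peer identity; SHA-512 is an assumption of this sketch."""
    return hashlib.sha512(peer_id).digest()

def initiates_union(my_id, other_id):
    """True if this peer should act as the client of the set union."""
    return hashed_identity(my_id) < hashed_identity(other_id)

def reconcile(mine, theirs):
    """Result of the union: afterwards both peers know every revocation."""
    return mine | theirs
@end example

Since both peers evaluate the same comparison on the same two hashes,
exactly one of them concludes that it has to initiate the operation.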
8583
8584 @cindex FS
8585 @cindex FS Subsystem
8586 @node File-sharing (FS) Subsystem
8587 @section File-sharing (FS) Subsystem
8588
8589 @c %**end of header
8590
8591 This chapter describes the details of how the file-sharing service works.
8592 As with all services, it is split into an API (libgnunetfs), the service
8593 process (gnunet-service-fs) and user interface(s).
8594 The file-sharing service uses the datastore service to store blocks and
8595 the DHT (and indirectly datacache) for lookups for non-anonymous
8596 file-sharing.
8597 Furthermore, the file-sharing service uses the block library (and the
8598 block fs plugin) for validation of DHT operations.
8599
8600 In contrast to many other services, libgnunetfs is rather complex since
8601 the client library includes a large number of high-level abstractions;
this is necessary since the FS service itself largely only operates on
8603 the block level.
8604 The FS library is responsible for providing a file-based abstraction to
8605 applications, including directories, meta data, keyword search,
8606 verification, and so on.
8607
8608 The method used by GNUnet to break large files into blocks and to use
8609 keyword search is called the
8610 "Encoding for Censorship Resistant Sharing" (ECRS).
ECRS is largely implemented in the FS library; block validation is also
8612 reflected in the block FS plugin and the FS service.
8613 ECRS on-demand encoding is implemented in the FS service.
8614
8615 NOTE: The documentation in this chapter is quite incomplete.
8616
8617 @menu
8618 * Encoding for Censorship-Resistant Sharing (ECRS)::
8619 * File-sharing persistence directory structure::
8620 @end menu
8621
8622 @cindex ECRS
8623 @cindex Encoding for Censorship-Resistant Sharing
8624 @node Encoding for Censorship-Resistant Sharing (ECRS)
8625 @subsection Encoding for Censorship-Resistant Sharing (ECRS)
8626
8627 @c %**end of header
8628
8629 When GNUnet shares files, it uses a content encoding that is called ECRS,
8630 the Encoding for Censorship-Resistant Sharing.
Most of ECRS is described in the (so far unpublished) ECRS research
paper. ECRS obsoletes the previous ESED and ESED II
encodings which were used in GNUnet before version 0.7.0.
The rest of this section assumes that the reader is familiar with that
paper. What follows is a description of some minor extensions
that GNUnet makes over what is described in the paper.
8637 The reason why these extensions are not in the paper is that we felt
8638 that they were obvious or trivial extensions to the original scheme and
8639 thus did not warrant space in the research report.
8640
8641 @menu
8642 * Namespace Advertisements::
8643 * KSBlocks::
8644 @end menu
8645
8646 @node Namespace Advertisements
8647 @subsubsection Namespace Advertisements
8648
8649 @c %**end of header
8650 @c %**FIXME: all zeroses -> ?
8651
8652 An @code{SBlock} with identifier all zeros is a signed
8653 advertisement for a namespace. This special @code{SBlock} contains
8654 metadata describing the content of the namespace.
8655 Instead of the name of the identifier for a potential update, it contains
8656 the identifier for the root of the namespace.
8657 The URI should always be empty. The @code{SBlock} is signed with the
8658 content provider's RSA private key (just like any other SBlock). Peers
8659 can search for @code{SBlock}s in order to find out more about a namespace.
8660
8661 @node KSBlocks
8662 @subsubsection KSBlocks
8663
8664 @c %**end of header
8665
GNUnet implements @code{KSBlocks}, which are @code{KBlocks} that encrypt
an @code{SBlock} instead of a CHK and metadata.
8668 In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
8669 using the global keyword search.
8670 Usually the encrypted @code{SBlock} is a namespace advertisement.
8671 The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
peers to discover namespaces via keyword searches, and to associate
8673 useful information with namespaces. When GNUnet finds @code{KSBlocks}
8674 during a normal keyword search, it adds the information to an internal
8675 list of discovered namespaces. Users looking for interesting namespaces
8676 can then inspect this list, reducing the need for out-of-band discovery
8677 of namespaces.
8678 Naturally, namespaces (or more specifically, namespace advertisements) can
8679 also be referenced from directories, but @code{KSBlock}s should make it
8680 easier to advertise namespaces for the owner of the pseudonym since they
8681 eliminate the need to first create a directory.
8682
8683 Collections are also advertised using @code{KSBlock}s.
8684
8685 @c https://old.gnunet.org/sites/default/files/ecrs.pdf
8686
8687 @node File-sharing persistence directory structure
8688 @subsection File-sharing persistence directory structure
8689
8690 @c %**end of header
8691
8692 This section documents how the file-sharing library implements
8693 persistence of file-sharing operations and specifically the resulting
8694 directory structure.
8695 This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
8696 was set when calling @code{GNUNET_FS_start}.
8697 In this case, the file-sharing library will try hard to ensure that all
8698 major operations (searching, downloading, publishing, unindexing) are
8699 persistent, that is, can live longer than the process itself.
8700 More specifically, an operation is supposed to live until it is
8701 explicitly stopped.
8702
8703 If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
8704 @code{SUSPEND} event is generated and then when the process calls
8705 @code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
8706 Additionally, even if an application crashes (segfault, SIGKILL, system
8707 crash) and hence @code{GNUNET_FS_stop} is never called and no
8708 @code{SUSPEND} events are generated, operations are still resumed (with
8709 @code{RESUME} events).
8710 This is implemented by constantly writing the current state of the
8711 file-sharing operations to disk.
8712 Specifically, the current state is always written to disk whenever
anything significant changes (the exception is block-wise progress in
8714 publishing and unindexing, since those operations would be slowed down
8715 significantly and can be resumed cheaply even without detailed
8716 accounting).
8717 Note that if the process crashes (or is killed) during a serialization
8718 operation, FS does not guarantee that this specific operation is
8719 recoverable (no strict transactional semantics, again for performance
8720 reasons). However, all other unrelated operations should resume nicely.
8721
8722 Since we need to serialize the state continuously and want to recover as
8723 much as possible even after crashing during a serialization operation,
8724 we do not use one large file for serialization.
8725 Instead, several directories are used for the various operations.
8726 When @code{GNUNET_FS_start} executes, the master directories are scanned
8727 for files describing operations to resume.
8728 Sometimes, these operations can refer to related operations in child
8729 directories which may also be resumed at this point.
8730 Note that corrupted files are cleaned up automatically.
8731 However, dangling files in child directories (those that are not
8732 referenced by files from the master directories) are not automatically
8733 removed.
8734
8735 Persistence data is kept in a directory that begins with the "STATE_DIR"
8736 prefix from the configuration file
8737 (by default, "$SERVICEHOME/persistence/") followed by the name of the
8738 client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk")
8739 followed by the actual name of the master or child directory.
8740
8741 The names for the master directories follow the names of the operations:
8742
8743 @itemize @bullet
8744 @item "search"
8745 @item "download"
8746 @item "publish"
8747 @item "unindex"
8748 @end itemize
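
How these path components fit together can be sketched as follows. The
Python code below is an illustration under the assumptions stated in this
section (state directory, then client name, then directory name); the
function names are hypothetical and the code does not parse the files
written by the FS library.

@example
import os

MASTER_DIRS = ("search", "download", "publish", "unindex")

def operation_dir(state_dir, client_name, directory):
    """Directory holding the files of one kind of operation."""
    return os.path.join(state_dir, client_name, directory)

def resumable_operations(state_dir, client_name):
    """List (directory, file name) pairs found in the master directories."""
    found = []
    for master in MASTER_DIRS:
        path = operation_dir(state_dir, client_name, master)
        if not os.path.isdir(path):
            continue
        for name in sorted(os.listdir(path)):
            if os.path.isfile(os.path.join(path, name)):
                found.append((master, name))
    return found
@end example

Only the four master directories are scanned here; any other directory
below the client directory is ignored by this sketch unless a master
operation refers to it.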
8749
8750 Each of the master directories contains names (chosen at random) for each
8751 active top-level (master) operation.
8752 Note that a download that is associated with a search result is not a
8753 top-level operation.
8754
8755 In contrast to the master directories, the child directories are only
8756 consulted when another operation refers to them.
8757 For each search, a subdirectory (named after the master search
8758 synchronization file) contains the search results.
8759 Search results can have an associated download, which is then stored in
8760 the general "download-child" directory.
8761 Downloads can be recursive, in which case children are stored in
8762 subdirectories mirroring the structure of the recursive download
8763 (either starting in the master "download" directory or in the
8764 "download-child" directory depending on how the download was initiated).
8765 For publishing operations, the "publish-file" directory contains
8766 information about the individual files and directories that are part of
8767 the publication.
8768 However, this directory structure is flat and does not mirror the
8769 structure of the publishing operation.
8770 Note that unindex operations cannot have associated child operations.
8771
8772 @cindex REGEX subsystem
8773 @node REGEX Subsystem
8774 @section REGEX Subsystem
8775
8776 @c %**end of header
8777
8778 Using the REGEX subsystem, you can discover peers that offer a particular
8779 service using regular expressions.
The peers that offer a service specify it using a regular expression.
8781 Peers that want to patronize a service search using a string.
8782 The REGEX subsystem will then use the DHT to return a set of matching
8783 offerers to the patrons.
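
The relation between announced regular expressions and search strings can
be sketched as follows. This is a purely local illustration in Python:
the announcements are kept in a hypothetical in-memory mapping instead of
the DHT, Python's @code{re} module stands in for the matching, and its
syntax may differ from the regular expressions accepted by GNUnet.

@example
import re

def matching_offerers(announcements, search_string):
    """Return peers whose announced regex matches the whole string."""
    return sorted(
        peer
        for peer, regex in announcements.items()
        if re.fullmatch(regex, search_string)
    )
@end example

The patron never supplies a regular expression itself; it only provides a
string, and every offerer whose expression accepts that string is
returned.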
8784
8785 For the technical details, we have Max's defense talk and Max's Master's
8786 thesis.
8787
8788 @c An additional publication is under preparation and available to
8789 @c team members (in Git).
8790 @c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
8791
8792 @menu
8793 * How to run the regex profiler::
8794 @end menu
8795
8796 @node How to run the regex profiler
8797 @subsection How to run the regex profiler
8798
8799 @c %**end of header
8800
8801 The gnunet-regex-profiler can be used to profile the usage of mesh/regex
8802 for a given set of regular expressions and strings.
8803 Mesh/regex allows you to announce your peer ID under a certain regex and
8804 search for peers matching a particular regex using a string.
8805 See @uref{https://old.gnunet.org/szengel2012ms, szengel2012ms} for a full
8806 introduction.
8807
8808 First of all, the regex profiler uses GNUnet testbed, thus all the
8809 implications for testbed also apply to the regex profiler
8810 (for example you need password-less ssh login to the machines listed in
8811 your hosts file).
8812
8813 @strong{Configuration}
8814
8815 Moreover, an appropriate configuration file is needed.
8816 Generally you can refer to the
@file{contrib/regex_profiler_infiniband.conf} file in the source code
8818 of GNUnet for an example configuration.
8819 In the following paragraph the important details are highlighted.
8820
8821 Announcing of the regular expressions is done by the
8822 gnunet-daemon-regexprofiler, therefore you have to make sure it is
8823 started, by adding it to the START_ON_DEMAND set of ARM:
8824
8825 @example
8826 [regexprofiler]
8827 START_ON_DEMAND = YES
8828 @end example
8829
8830 @noindent
8831 Furthermore you have to specify the location of the binary:
8832
8833 @example
8834 [regexprofiler]
8835 # Location of the gnunet-daemon-regexprofiler binary.
8836 BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
8837 # Regex prefix that will be applied to all regular expressions and
8838 # search string.
8839 REGEX_PREFIX = "GNVPN-0001-PAD"
8840 @end example
8841
8842 @noindent
8843 When running the profiler with a large scale deployment, you probably
8844 want to reduce the workload of each peer.
8845 Use the following options to do this.
8846
8847 @example
8848 [dht]
8849 # Force network size estimation
8850 FORCE_NSE = 1
8851
8852 [dhtcache]
8853 DATABASE = heap
8854 # Disable RC-file for Bloom filter? (for benchmarking with limited IO
8855 # availability)
8856 DISABLE_BF_RC = YES
8857 # Disable Bloom filter entirely
8858 DISABLE_BF = YES
8859
8860 [nse]
8861 # Minimize proof-of-work CPU consumption by NSE
8862 WORKBITS = 1
8863 @end example
8864
8865 @noindent
8866 @strong{Options}
8867
To run the profiler, some options and the input data need to be
8869 specified on the command line.
8870
8871 @example
8872 gnunet-regex-profiler -c config-file -d log-file -n num-links \
8873 -p path-compression-length -s search-delay -t matching-timeout \
8874 -a num-search-strings hosts-file policy-dir search-strings-file
8875 @end example
8876
8877 @noindent
8878 Where...
8879
8880 @itemize @bullet
8881 @item ... @code{config-file} means the configuration file created earlier.
8882 @item ... @code{log-file} is the file where to write statistics output.
8883 @item ... @code{num-links} indicates the number of random links between
8884 started peers.
8885 @item ... @code{path-compression-length} is the maximum path compression
8886 length in the DFA.
@item ... @code{search-delay} is the time to wait between the peers
finishing linking and the start of string matching.
@item ... @code{matching-timeout} is the timeout after which the search
is canceled.
@item ... @code{num-search-strings} is the number of strings in the
search-strings-file.
8893 @item ... the @code{hosts-file} should contain a list of hosts for the
8894 testbed, one per line in the following format:
8895
8896 @itemize @bullet
8897 @item @code{user@@host_ip:port}
8898 @end itemize
8899 @item ... the @code{policy-dir} is a folder containing text files
8900 containing one or more regular expressions. A peer is started for each
8901 file in that folder and the regular expressions in the corresponding file
8902 are announced by this peer.
8903 @item ... the @code{search-strings-file} is a text file containing search
8904 strings, one in each line.
8905 @end itemize
8906
8907 @noindent
8908 You can create regular expressions and search strings for every AS in the
8909 Internet using the attached scripts. You need one of the
8910 @uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
8911 data files for this. Run
8912
8913 @example
8914 create_regex.py <filename> <output path>
8915 @end example
8916
8917 @noindent
8918 to create the regular expressions and
8919
8920 @example
8921 create_strings.py <input path> <outfile>
8922 @end example
8923
8924 @noindent
8925 to create a search strings file from the previously created
8926 regular expressions.
8927
8928 @cindex REST subsystem
8929 @node REST Subsystem
8930 @section REST Subsystem
8931
8932 @c %**end of header
8933
8934 Using the REST subsystem, you can expose REST-based APIs or services.
8935 The REST service is designed as a pluggable architecture.
8936 To create a new REST endpoint, simply add a library in the form
8937 ``plugin_rest_*''.
8938 The REST service will automatically load all REST plugins on startup.
8939
8940 @strong{Configuration}
8941
8942 The REST service can be configured in various ways.
8943 The reference config file can be found in
8944 @file{src/rest/rest.conf}:
8945 @example
8946 [rest]
8947 REST_PORT=7776
8948 REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
8949 REST_ALLOW_ORIGIN=*
8950 REST_ALLOW_CREDENTIALS=true
8951 @end example
8952
The port as well as the
@dfn{cross-origin resource sharing} (CORS)
headers that are supposed to be advertised by the REST service are
configurable.
8958
8959 @menu
8960 * Namespace considerations::
8961 * Endpoint documentation::
8962 @end menu
8963
8964 @node Namespace considerations
8965 @subsection Namespace considerations
8966
8967 The @command{gnunet-rest-service} will load all plugins that are installed.
8968 As such it is important that the endpoint namespaces do not clash.
8969
8970 For example, plugin X might expose the endpoint ``/xxx'' while plugin Y
8971 exposes endpoint ``/xxx/yyy''.
8972 This is a problem if plugin X is also supposed to handle a call
8973 to ``/xxx/yyy''.
8974 Currently the REST service will not complain or warn about such clashes,
8975 so please make sure that endpoints are unambiguous.
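
Since the service does not detect such clashes, a plugin author may want
to check a list of endpoints by hand or with a small script. The
following Python sketch shows one possible check; the function name is
hypothetical, the list of endpoints has to be collected manually, and the
code is not part of the REST service.

@example
def clashing_endpoints(endpoints):
    """Return pairs (shorter, longer) where one path lies below another."""
    clashes = []
    paths = sorted(set(path.rstrip("/") for path in endpoints))
    for i, shorter in enumerate(paths):
        for longer in paths[i + 1:]:
            if longer.startswith(shorter + "/"):
                clashes.append((shorter, longer))
    return clashes
@end example

The check compares whole path components, so ``/xxx'' and ``/xxxy'' are
not reported, while ``/xxx'' and ``/xxx/yyy'' are. Two plugins that
register exactly the same path are not reported either, because
duplicates are merged before the comparison.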
8976
8977 @node Endpoint documentation
8978 @subsection Endpoint documentation
8979
This is work in progress. Endpoints should be documented appropriately,
preferably using annotations.
8982