doc/documentation/chapters/developer.texi

   1 @c ***********************************************************************
   2 @node GNUnet Developer Handbook
   3 @chapter GNUnet Developer Handbook
   4
   5 This book is intended to be an introduction for programmers that want to
   6 extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
   7 application.
   8
   9 For developers, GNUnet is:
  10
  11 @itemize @bullet
  12 @item Developed by a community that believes in the GNU philosophy
  13 @item Free Software (Free as in Freedom), licensed under the
  14 GNU General Public License@footnote{@uref{https://www.gnu.org/licenses/licenses.html#GPL, https://www.gnu.org/licenses/licenses.html#GPL}}
  15 @item A set of standards, including coding conventions and
  16 architectural rules
  17 @item A set of layered protocols, specifying both the communication
  18 between peers and the communication between components
  19 of a single peer
  20 @item A set of libraries with well-defined APIs suitable for
  21 writing extensions
  22 @end itemize
  23
  24 In particular, the architecture specifies that a peer consists of many
  25 processes communicating via protocols. Processes can be written in almost
  26 any language.
  27 @code{C}, @code{Java} and @code{Guile} APIs exist for accessing existing
  28 services and for writing extensions.
  29 It is possible to write extensions in other languages by
  30 implementing the necessary IPC protocols.
  31
  32 GNUnet can be extended and improved along many possible dimensions, and
  33 anyone interested in Free Software and Freedom-enhancing Networking is
  34 welcome to join the effort. This Developer Handbook attempts to provide
  35 an initial introduction to some of the key design choices and central
  36 components of the system.
  37 This part of the GNUnet documentation is far from complete,
  38 and we welcome informed contributions, be it in the form of
  39 new chapters, sections or insightful comments.
  40
  41 @menu
  42 * Developer Introduction::
  43 * Code overview::
  44 * System Architecture::
  45 * Subsystem stability::
  46 * Naming conventions and coding style guide::
  47 * Build-system::
  48 * Developing extensions for GNUnet using the gnunet-ext template::
  49 * Writing testcases::
  50 * TESTING library::
  51 * Performance regression analysis with Gauger::
  52 * TESTBED Subsystem::
  53 * libgnunetutil::
  54 * Automatic Restart Manager (ARM)::
  55 * TRANSPORT Subsystem::
  56 * NAT library::
  57 * Distance-Vector plugin::
  58 * SMTP plugin::
  59 * Bluetooth plugin::
  60 * WLAN plugin::
  61 * ATS Subsystem::
  62 * CORE Subsystem::
  63 * CADET Subsystem::
  64 * NSE Subsystem::
  65 * HOSTLIST Subsystem::
  66 * IDENTITY Subsystem::
  67 * NAMESTORE Subsystem::
  68 * PEERINFO Subsystem::
  69 * PEERSTORE Subsystem::
  70 * SET Subsystem::
  71 * STATISTICS Subsystem::
  72 * Distributed Hash Table (DHT)::
  73 * GNU Name System (GNS)::
  74 * GNS Namecache::
  75 * REVOCATION Subsystem::
  76 * File-sharing (FS) Subsystem::
  77 * REGEX Subsystem::
  78 * REST Subsystem::
  79 @end menu
  80
  81 @node Developer Introduction
  82 @section Developer Introduction
  83
  84 This Developer Handbook is intended as a first introduction to GNUnet for
  85 new developers who want to extend the GNUnet framework. After the
  86 introduction, each of the GNUnet subsystems (directories in the
  87 @file{src/} tree) is (supposed to be) covered in its own chapter. In
  88 addition to this documentation, GNUnet developers should be aware of the
  89 services available on the GNUnet server to them.
  90
  91 New developers can have a look at the GNUnet tutorials for C and Java
  92 available in the @file{src/} directory of the repository or under the
  93 following links:
  94
  95 @c ** FIXME: Link to files in source, not online.
  96 @c ** FIXME: Where is the Java tutorial?
  97 @itemize @bullet
  98 @item @xref{Top, Introduction,, gnunet-c-tutorial, The GNUnet C Tutorial}.
  99 @c broken link
 100 @c @item @uref{https://gnunet.org/git/gnunet.git/plain/doc/gnunet-c-tutorial.pdf, GNUnet C tutorial}
 101 @item GNUnet Java tutorial
 102 @end itemize
 103
 104 In addition to the GNUnet Reference Documentation you are reading,
 105 the GNUnet server at @uref{https://gnunet.org} contains
 106 various resources for GNUnet developers and those
 107 who aspire to become regular contributors.
 108 They are all conveniently reachable via the "Developer"
 109 entry in the navigation menu. Some additional tools (such as static
  110 analysis reports) require special developer access to perform certain
 111 operations. If you want (or require) access, you should contact
 112 @uref{http://grothoff.org/christian/, Christian Grothoff},
 113 GNUnet's maintainer.
 114
 115 @c FIXME: A good part of this belongs on the website or should be
 116 @c extended in subsections explaining usage of this. A simple list
 117 @c is just taking space people have to read.
 118 The public subsystems on the GNUnet server that help developers are:
 119
 120 @itemize @bullet
 121
 122 @item The version control system (git) keeps our code and enables
 123 distributed development.
 124 It is publicly accessible at @uref{https://gnunet.org/git/}.
  125 Only developers with write access can commit code; everyone else is
  126 encouraged to submit patches to the GNUnet-developers mailing list:
 127 @uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, https://lists.gnu.org/mailman/listinfo/gnunet-developers}
 128
 129 @item The bugtracking system (Mantis).
 130 We use it to track feature requests, open bug reports and their
 131 resolutions.
 132 It can be accessed at
 133 @uref{https://gnunet.org/bugs/, https://gnunet.org/bugs/}.
 134 Anyone can report bugs.
 135
 136 @item Our site installation of the
 137 CI@footnote{Continuous Integration} system @code{Buildbot} is used
 138 to check GNUnet builds automatically on a range of platforms.
 139 The web interface of this CI is exposed at
 140 @uref{https://gnunet.org/buildbot/, https://gnunet.org/buildbot/}.
 141 Builds are triggered automatically 30 minutes after the last commit to
 142 our repository was made.
 143
  144 @item The current quality of our automated test suite is assessed using
  145 code coverage analysis. This analysis is run daily; however, the webpage
 146 is only updated if all automated tests pass at that time. Testcases that
 147 improve our code coverage are always welcome.
 148
 149 @item We try to automatically find bugs using a static analysis scan.
  150 This scan is run daily; however, the webpage is only updated if all
  151 automated tests pass at the time. Note that not everything that is
  152 flagged by the analysis is a bug; sometimes even good code can be marked
 153 as possibly problematic. Nevertheless, developers are encouraged to at
 154 least be aware of all issues in their code that are listed.
 155
 156 @item We use Gauger for automatic performance regression visualization.
 157 @c FIXME: LINK!
 158 Details on how to use Gauger are here.
 159
 160 @item We use @uref{http://junit.org/, junit} to automatically test
 161 @command{gnunet-java}.
 162 Automatically generated, current reports on the test suite are here.
 163 @c FIXME: Likewise.
 164
 165 @item We use Cobertura to generate test coverage reports for gnunet-java.
 166 Current reports on test coverage are here.
 167 @c FIXME: Likewise.
 168
 169 @end itemize
 170
 171
 172
 173 @c ***********************************************************************
 174 @menu
 175 * Project overview::
 176 @end menu
 177
 178 @node Project overview
 179 @subsection Project overview
 180
  181 The GNUnet project currently consists of several sub-projects. This
  182 section gives an initial overview of the various
  183 sub-projects. Note that this description also lists projects that are far
 184 from complete, including even those that have literally not a single line
 185 of code in them yet.
 186
 187 GNUnet sub-projects in order of likely relevance are currently:
 188
 189 @table @asis
 190
 191 @item @command{gnunet}
 192 Core of the P2P framework, including file-sharing, VPN and
 193 chat applications; this is what the Developer Handbook covers mostly
 194 @item @command{gnunet-gtk}
 195 Gtk+-based user interfaces, including:
 196
 197 @itemize @bullet
 198 @item @command{gnunet-fs-gtk} (file-sharing),
 199 @item @command{gnunet-statistics-gtk} (statistics over time),
 200 @item @command{gnunet-peerinfo-gtk}
 201 (information about current connections and known peers),
 202 @item @command{gnunet-chat-gtk} (chat GUI) and
 203 @item @command{gnunet-setup} (setup tool for "everything")
 204 @end itemize
 205
 206 @item @command{gnunet-fuse}
 207 Mounting directories shared via GNUnet's file-sharing
 208 on GNU/Linux distributions
 209 @item @command{gnunet-update}
 210 Installation and update tool
 211 @item @command{gnunet-ext}
 212 Template for starting 'external' GNUnet projects
 213 @item @command{gnunet-java}
 214 Java APIs for writing GNUnet services and applications
 215 @c ** FIXME: Point to new website repository once we have it:
 216 @c ** @item svn/gnunet-www/ Code and media helping drive the GNUnet
 217 @c website
 218 @item @command{eclectic}
 219 Code to run GNUnet nodes on testbeds for research, development,
 220 testing and evaluation
 221 @c ** FIXME: Solve the status and location of gnunet-qt
 222 @item @command{gnunet-qt}
 223 Qt-based GNUnet GUI (is it deprecated?)
 224 @item @command{gnunet-cocoa}
  225 Cocoa-based GNUnet GUI (is it deprecated?)
 226 @item @command{gnunet-guile}
 227 Guile bindings for GNUnet
 228
 229 @end table
 230
 231 We are also working on various supporting libraries and tools:
 232 @c ** FIXME: What about gauger, and what about libmwmodem?
 233
 234 @table @asis
 235 @item @command{libextractor}
 236 GNU libextractor (meta data extraction)
 237 @item @command{libmicrohttpd}
 238 GNU libmicrohttpd (embedded HTTP(S) server library)
 239 @item @command{gauger}
 240 Tool for performance regression analysis
 241 @item @command{monkey}
 242 Tool for automated debugging of distributed systems
 243 @item @command{libmwmodem}
 244 Library for accessing satellite connection quality reports
 245 @item @command{libgnurl}
 246 gnURL (feature-restricted variant of cURL/libcurl)
 247 @end table
 248
 249 Finally, there are various external projects (see links for a list of
 250 those that have a public website) which build on top of the GNUnet
 251 framework.
 252
 253 @c ***********************************************************************
 254 @node Code overview
 255 @section Code overview
 256
 257 This section gives a brief overview of the GNUnet source code.
 258 Specifically, we sketch the function of each of the subdirectories in
 259 the @file{gnunet/src/} directory. The order given is roughly bottom-up
 260 (in terms of the layers of the system).
 261
 262 @table @asis
 263 @item @file{util/} --- libgnunetutil
  264 Library with general utility functions; all
  265 GNUnet binaries link against this library. It covers anything from memory
  266 allocation and data structures to cryptography and inter-process
  267 communication. The goal is to provide an OS-independent interface and
  268 more 'secure' or convenient implementations of commonly used primitives.
  269 The API is spread over more than a dozen headers; developers should study
  270 those closely to avoid duplicating existing functions.
  271 @xref{libgnunetutil}.
 272 @item @file{hello/} --- libgnunethello
 273 HELLO messages are used to
 274 describe under which addresses a peer can be reached (for example,
 275 protocol, IP, port). This library manages parsing and generating of HELLO
 276 messages.
 277 @item @file{block/} --- libgnunetblock
 278 The DHT and other components of GNUnet
 279 store information in units called 'blocks'. Each block has a type and the
 280 type defines a particular format and how that binary format is to be
 281 linked to a hash code (the key for the DHT and for databases). The block
  282 library is a wrapper around block plugins which provide the necessary
 283 functions for each block type.
 284 @item @file{statistics/} --- statistics service
 285 The statistics service enables associating
  286 values (of type uint64_t) with a component name and a string. The main
  287 uses are debugging (counting events), performance tracking and user
 288 entertainment (what did my peer do today?).
 289 @item @file{arm/} --- Automatic Restart Manager (ARM)
 290 The automatic-restart-manager (ARM) service
  291 is the GNUnet master service. Its role is to start gnunet-services, to
  292 restart them when they crash and finally to shut down the system when
 293 requested.
 294 @item @file{peerinfo/} --- peerinfo service
  295 The peerinfo service keeps track of which peers are known
  296 to the local peer and also tracks the validated addresses
  297 (in the form of a HELLO message) for each of those peers. The peer is not
 298 necessarily connected to all peers known to the peerinfo service.
 299 Peerinfo provides persistent storage for peer identities --- peers are
 300 not forgotten just because of a system restart.
 301 @item @file{datacache/} --- libgnunetdatacache
 302 The datacache library provides (temporary) block storage for the DHT.
  303 Existing plugins can store blocks in SQLite, Postgres or MySQL databases.
 304 All data stored in the cache is lost when the peer is stopped or
 305 restarted (datacache uses temporary tables).
 306 @item @file{datastore/} --- datastore service
 307 The datastore service stores file-sharing blocks in
 308 databases for extended periods of time. In contrast to the datacache, data
 309 is not lost when peers restart. However, quota restrictions may still
 310 cause old, expired or low-priority data to be eventually discarded.
  311 Existing plugins can store blocks in SQLite, Postgres or MySQL databases.
 312 @item @file{template/} --- service template
 313 Template for writing a new service. Does nothing.
 314 @item @file{ats/} --- Automatic Transport Selection
 315 The automatic transport selection (ATS) service
 316 is responsible for deciding which address (i.e.
 317 which transport plugin) should be used for communication with other peers,
 318 and at what bandwidth.
 319 @item @file{nat/} --- libgnunetnat
 320 Library that provides basic functions for NAT traversal.
 321 The library supports NAT traversal with
 322 manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
 323 traversal. The library also includes an API for testing if the current
 324 configuration works and the @code{gnunet-nat-server} which provides an
 325 external service to test the local configuration.
 326 @item @file{fragmentation/} --- libgnunetfragmentation
 327 Some transports (UDP and WLAN, mostly) have restrictions on the maximum
 328 transfer unit (MTU) for packets. The fragmentation library can be used to
 329 break larger packets into chunks of at most 1k and transmit the resulting
  330 fragments reliably (with acknowledgement, retransmission, timeouts,
 331 etc.).
 332 @item @file{transport/} --- transport service
 333 The transport service is responsible for managing the
 334 basic P2P communication. It uses plugins to support P2P communication
  335 over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
 336 validates peer addresses, enforces bandwidth restrictions, limits the
 337 total number of connections and enforces connectivity restrictions (i.e.
 338 friends-only).
 339 @item @file{peerinfo-tool/} --- gnunet-peerinfo
 340 This directory contains the gnunet-peerinfo binary which can be used to
 341 inspect the peers and HELLOs known to the peerinfo service.
 342 @item @file{core/}
 343 The core service is responsible for establishing encrypted, authenticated
 344 connections with other peers, encrypting and decrypting messages and
 345 forwarding messages to higher-level services that are interested in them.
 346 @item @file{testing/} --- libgnunettesting
 347 The testing library allows starting (and stopping) peers
 348 for writing testcases.
 349 It also supports automatic generation of configurations for peers
 350 ensuring that the ports and paths are disjoint. libgnunettesting is also
  351 the foundation for the testbed service.
 352 @item @file{testbed/} --- testbed service
 353 The testbed service is used for creating small or large scale deployments
 354 of GNUnet peers for evaluation of protocols.
  355 It facilitates peer deployments on multiple
  356 hosts (for example, in a cluster) and establishing various network
 357 topologies (both underlay and overlay).
 358 @item @file{nse/} --- Network Size Estimation
 359 The network size estimation (NSE) service
 360 implements a protocol for (securely) estimating the current size of the
 361 P2P network.
 362 @item @file{dht/} --- distributed hash table
 363 The distributed hash table (DHT) service provides a
 364 distributed implementation of a hash table to store blocks under hash
 365 keys in the P2P network.
 366 @item @file{hostlist/} --- hostlist service
 367 The hostlist service allows learning about
 368 other peers in the network by downloading HELLO messages from an HTTP
 369 server, can be configured to run such an HTTP server and also implements
 370 a P2P protocol to advertise and automatically learn about other peers
 371 that offer a public hostlist server.
 372 @item @file{topology/} --- topology service
 373 The topology service is responsible for
 374 maintaining the mesh topology. It tries to maintain connections to friends
 375 (depending on the configuration) and also tries to ensure that the peer
 376 has a decent number of active connections at all times. If necessary, new
 377 connections are added. All peers should run the topology service,
 378 otherwise they may end up not being connected to any other peer (unless
 379 some other service ensures that core establishes the required
 380 connections). The topology service also tells the transport service which
 381 connections are permitted (for friend-to-friend networking)
 382 @item @file{fs/} --- file-sharing
 383 The file-sharing (FS) service implements GNUnet's
 384 file-sharing application. Both anonymous file-sharing (using gap) and
 385 non-anonymous file-sharing (using dht) are supported.
 386 @item @file{cadet/} --- cadet service
 387 The CADET service provides a general-purpose routing abstraction to create
 388 end-to-end encrypted tunnels in mesh networks. We wrote a paper
 389 documenting key aspects of the design.
 390 @item @file{tun/} --- libgnunettun
 391 Library for building IPv4, IPv6 packets and creating
 392 checksums for UDP, TCP and ICMP packets. The header
 393 defines C structs for common Internet packet formats and in particular
 394 structs for interacting with TUN (virtual network) interfaces.
 395 @item @file{mysql/} --- libgnunetmysql
  396 Library for creating and executing prepared MySQL
  397 statements and for managing the connection to the MySQL database.
 398 Essentially a lightweight wrapper for the interaction between GNUnet
 399 components and libmysqlclient.
 400 @item @file{dns/}
 401 Service that allows intercepting and modifying DNS requests of
 402 the local machine. Currently used for IPv4-IPv6 protocol translation
 403 (DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
 404 service can also be configured to offer an exit service for DNS traffic.
 405 @item @file{vpn/} --- VPN service
 406 The virtual public network (VPN) service provides a virtual
 407 tunnel interface (VTUN) for IP routing over GNUnet.
 408 Needs some other peers to run an "exit" service to work.
 409 Can be activated using the "gnunet-vpn" tool or integrated with DNS using
 410 the "pt" daemon.
 411 @item @file{exit/}
 412 Daemon to allow traffic from the VPN to exit this
 413 peer to the Internet or to specific IP-based services of the local peer.
 414 Currently, an exit service can only be restricted to IPv4 or IPv6, not to
  415 specific ports and/or IP address ranges. If this is not acceptable,
 416 additional firewall rules must be added manually. exit currently only
 417 works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
 418 system via a DNS service.
 419 @item @file{pt/}
  420 Protocol translation daemon. This daemon enables 4-to-6,
 421 6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
 422 essentially uses "DNS" to intercept DNS replies and then maps results to
 423 those offered by the VPN, which then sends them using mesh to some daemon
 424 offering an appropriate exit service.
 425 @item @file{identity/}
 426 Management of egos (alter egos) of a user; identities are
 427 essentially named ECC private keys and used for zones in the GNU name
 428 system and for namespaces in file-sharing, but might find other uses later
 429 @item @file{revocation/}
 430 Key revocation service, can be used to revoke the
 431 private key of an identity if it has been compromised
 432 @item @file{namecache/}
 433 Cache for resolution results for the GNU name system;
 434 data is encrypted and can be shared among users,
 435 loss of the data should ideally only result in a
 436 performance degradation (persistence not required)
 437 @item @file{namestore/}
 438 Database for the GNU name system with per-user private information,
 439 persistence required
 440 @item @file{gns/}
 441 GNU name system, a GNU approach to DNS and PKI.
 442 @item @file{dv/}
 443 A plugin for distance-vector (DV)-based routing.
 444 DV consists of a service and a transport plugin to provide peers
 445 with the illusion of a direct P2P connection for connections
 446 that use multiple (typically up to 3) hops in the actual underlay network.
 447 @item @file{regex/}
 448 Service for the (distributed) evaluation of regular expressions.
 449 @item @file{scalarproduct/}
 450 The scalar product service offers an API to perform a secure multiparty
 451 computation which calculates a scalar product between two peers
 452 without exposing the private input vectors of the peers to each other.
 453 @item @file{consensus/}
 454 The consensus service will allow a set of peers to agree
 455 on a set of values via a distributed set union computation.
 456 @item @file{rest/}
  457 The REST API allows access to GNUnet services using RESTful interaction.
  458 The services provide plugins that can be exposed by the REST server.
 459 @c FIXME: Where did this disappear to?
 460 @c @item @file{experimentation/}
 461 @c The experimentation daemon coordinates distributed
 462 @c experimentation to evaluate transport and ATS properties.
 463 @end table
 464
 465 @c ***********************************************************************
 466 @node System Architecture
 467 @section System Architecture
 468
 469 @c FIXME: For those irritated by the textflow, we are missing images here,
 470 @c in the short term we should add them back, in the long term this should
 471 @c work without images or have images with alt-text.
 472
 473 GNUnet developers like LEGOs. The blocks are indestructible, can be
 474 stacked together to construct complex buildings and it is generally easy
 475 to swap one block for a different one that has the same shape. GNUnet's
  476 architecture is based on LEGOs.
 477
 478 @c @image{images/service_lego_block,5in,,picture of a LEGO block stack - 3 APIs as connectors upon Network Protocol on top of a Service}
 479
 480 This chapter documents the GNUnet LEGO system, also known as GNUnet's
 481 system architecture.
 482
 483 The most common GNUnet component is a service. Services offer an API (or
 484 several, depending on what you count as "an API") which is implemented as
 485 a library. The library communicates with the main process of the service
 486 using a service-specific network protocol. The main process of the service
 487 typically doesn't fully provide everything that is needed --- it has holes
 488 to be filled by APIs to other services.
 489
  490 User interfaces and daemons are a special kind of component in GNUnet.
  491 Like services, they have holes to be filled by APIs of other services.
  492 Unlike services, daemons do not implement their own network protocol and
  493 they have no API.
 494
 495 The GNUnet system provides a range of services, daemons and user
 496 interfaces, which are then combined into a layered GNUnet instance (also
 497 known as a peer).
 498
 499 Note that while it is generally possible to swap one service for another
 500 compatible service, there is often only one implementation. However,
 501 during development we often have a "new" version of a service in parallel
 502 with an "old" version. While the "new" version is not working, developers
 503 working on other parts of the service can continue their development by
 504 simply using the "old" service. Alternative design ideas can also be
 505 easily investigated by swapping out individual components. This is
 506 typically achieved by simply changing the name of the "BINARY" in the
 507 respective configuration section.
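
A minimal sketch of such a swap is shown below. The section name
@code{[dht]} and both binary names are placeholders chosen purely for
illustration (in particular, @code{gnunet-service-dht-new} is
hypothetical); consult the shipped configuration defaults for the
actual section and option values:

@example
[dht]
# Hypothetical: point the section at an alternative implementation.
# BINARY = gnunet-service-dht
BINARY = gnunet-service-dht-new
@end example

Under this assumption, the rest of the peer is left untouched; only the
binary started for that one service changes.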
 508
 509 Key properties of GNUnet services are that they must be separate
 510 processes and that they must protect themselves by applying tight error
 511 checking against the network protocol they implement (thereby achieving a
 512 certain degree of robustness).
 513
 514 On the other hand, the APIs are implemented to tolerate failures of the
 515 service, isolating their host process from errors by the service. If the
 516 service process crashes, other services and daemons around it should not
 517 also fail, but instead wait for the service process to be restarted by
 518 ARM.
 519
 520
 521 @c ***********************************************************************
 522 @node Subsystem stability
 523 @section Subsystem stability
 524
 525 This section documents the current stability of the various GNUnet
 526 subsystems. Stability here describes the expected degree of compatibility
 527 with future versions of GNUnet. For each subsystem we distinguish between
 528 compatibility on the P2P network level (communication protocol between
 529 peers), the IPC level (communication between the service and the service
 530 library) and the API level (stability of the API). P2P compatibility is
 531 relevant in terms of which applications are likely going to be able to
  532 communicate with future versions of the network. IPC compatibility is
  533 relevant for the implementation of language bindings that re-implement the
  534 IPC messages. Finally, API compatibility is relevant to developers that
  535 hope to be able to avoid changes to applications built on top of the APIs
 536 of the framework.
 537
 538 The following table summarizes our current view of the stability of the
 539 respective protocols or APIs:
 540
 541 @multitable @columnfractions .20 .20 .20 .20
 542 @headitem Subsystem @tab P2P @tab IPC @tab C API
 543 @item util @tab n/a @tab n/a @tab stable
 544 @item arm @tab n/a @tab stable @tab stable
 545 @item ats @tab n/a @tab unstable @tab testing
 546 @item block @tab n/a @tab n/a @tab stable
 547 @item cadet @tab testing @tab testing @tab testing
 548 @item consensus @tab experimental @tab experimental @tab experimental
 549 @item core @tab stable @tab stable @tab stable
 550 @item datacache @tab n/a @tab n/a @tab stable
 551 @item datastore @tab n/a @tab stable @tab stable
 552 @item dht @tab stable @tab stable @tab stable
 553 @item dns @tab stable @tab stable @tab stable
 554 @item dv @tab testing @tab testing @tab n/a
 555 @item exit @tab testing @tab n/a @tab n/a
 556 @item fragmentation @tab stable @tab n/a @tab stable
 557 @item fs @tab stable @tab stable @tab stable
 558 @item gns @tab stable @tab stable @tab stable
 559 @item hello @tab n/a @tab n/a @tab testing
 560 @item hostlist @tab stable @tab stable @tab n/a
 561 @item identity @tab stable @tab stable @tab n/a
 562 @item multicast @tab experimental @tab experimental @tab experimental
 563 @item mysql @tab stable @tab n/a @tab stable
 564 @item namestore @tab n/a @tab stable @tab stable
 565 @item nat @tab n/a @tab n/a @tab stable
 566 @item nse @tab stable @tab stable @tab stable
 567 @item peerinfo @tab n/a @tab stable @tab stable
 568 @item psyc @tab experimental @tab experimental @tab experimental
 569 @item pt @tab n/a @tab n/a @tab n/a
 570 @item regex @tab stable @tab stable @tab stable
 571 @item revocation @tab stable @tab stable @tab stable
 572 @item social @tab experimental @tab experimental @tab experimental
 573 @item statistics @tab n/a @tab stable @tab stable
 574 @item testbed @tab n/a @tab testing @tab testing
 575 @item testing @tab n/a @tab n/a @tab testing
 576 @item topology @tab n/a @tab n/a @tab n/a
 577 @item transport @tab stable @tab stable @tab stable
 578 @item tun @tab n/a @tab n/a @tab stable
 579 @item vpn @tab testing @tab n/a @tab n/a
 580 @end multitable
 581
 582 Here is a rough explanation of the values:
 583
 584 @table @samp
 585 @item stable
 586 No incompatible changes are planned at this time; for IPC/APIs, if
 587 there are incompatible changes, they will be minor and might only require
 588 minimal changes to existing code; for P2P, changes will be avoided if at
 589 all possible for the 0.10.x-series
 590
 591 @item testing
 592 No incompatible changes are
 593 planned at this time, but the code is still known to be in flux; so while
 594 we have no concrete plans, our expectation is that there will still be
 595 minor modifications; for P2P, changes will likely be extensions that
 596 should not break existing code
 597
 598 @item unstable
 599 Changes are planned and will happen; however, they
 600 will not be totally radical and the result should still resemble what is
 601 there now; nevertheless, anticipated changes will break protocol/API
 602 compatibility
 603
 604 @item experimental
 605 Changes are planned and the result may look nothing like
 606 what the API/protocol looks like today
 607
 608 @item unknown
  609 Someone should think about where this subsystem is headed
 610
 611 @item n/a
 612 This subsystem does not have an API/IPC-protocol/P2P-protocol
 613 @end table
 614
 615 @c ***********************************************************************
 616 @node Naming conventions and coding style guide
 617 @section Naming conventions and coding style guide
 618
 619 Here you can find some rules to help you write code for GNUnet.
 620
 621 @c ***********************************************************************
 622 @menu
 623 * Naming conventions::
 624 * Coding style::
 625 @end menu
 626
 627 @node Naming conventions
 628 @subsection Naming conventions
 629
 630
 631 @c ***********************************************************************
 632 @menu
 633 * include files::
 634 * binaries::
 635 * logging::
 636 * configuration::
 637 * exported symbols::
 638 * private (library-internal) symbols (including structs and macros)::
 639 * testcases::
 640 * performance tests::
 641 * src/ directories::
 642 @end menu
 643
 644 @node include files
 645 @subsubsection include files
 646
 647 @itemize @bullet
 648 @item _lib: library without need for a process
 649 @item _service: library that needs a service process
 650 @item _plugin: plugin definition
 651 @item _protocol: structs used in network protocol
 652 @item exceptions:
 653 @itemize @bullet
 654 @item gnunet_config.h --- generated
 655 @item platform.h --- first included
 656 @item plibc.h --- external library
 657 @item gnunet_common.h --- fundamental routines
 658 @item gnunet_directories.h --- generated
 659 @item gettext.h --- external library
 660 @end itemize
 661 @end itemize
 662
 663 @c ***********************************************************************
 664 @node binaries
 665 @subsubsection binaries
 666
 667 @itemize @bullet
 668 @item gnunet-service-xxx: service process (has listen socket)
 669 @item gnunet-daemon-xxx: daemon process (no listen socket)
 670 @item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
 671 @item gnunet-yyy: command-line tool for end-users
 672 @item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
 673 @item libgnunetxxx.so: library for API xxx
 674 @end itemize
 675
 676 @c ***********************************************************************
 677 @node logging
 678 @subsubsection logging
 679
 680 @itemize @bullet
@item services and daemons use their directory name in
@code{GNUNET_log_setup} (e.g. 'core') and log using
plain '@code{GNUNET_log}'.
@item command-line tools use their full name in
@code{GNUNET_log_setup} (e.g. 'gnunet-publish') and log using
plain '@code{GNUNET_log}'.
@item service access libraries log using
'@code{GNUNET_log_from}' and use '@code{DIRNAME-api}' for the
component (e.g. 'core-api')
@item pure libraries (without associated service) use
'@code{GNUNET_log_from}' with the component set to their
library name (without lib or '@file{.so}'),
which should also be their directory name (e.g. '@file{nat}')
@item plugins should use '@code{GNUNET_log_from}'
with the directory name and the plugin name combined to produce
the component name (e.g. 'transport-tcp').
 697 @item logging should be unified per-file by defining a
 698 @code{LOG} macro with the appropriate arguments,
 699 along these lines:
 700
 701 @example
#define LOG(kind,...) \
  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
 704 @end example
 705
 706 @end itemize
 707
 708 @c ***********************************************************************
 709 @node configuration
 710 @subsubsection configuration
 711
 712 @itemize @bullet
@item paths (that are substituted in all filenames) are in
@code{[PATHS]} (have as few as possible)
 715 @item all options for a particular module (@file{src/MODULE})
 716 are under @code{[MODULE]}
 717 @item options for a plugin of a module
 718 are under @code{[MODULE-PLUGINNAME]}
 719 @end itemize
 720
 721 @c ***********************************************************************
 722 @node exported symbols
 723 @subsubsection exported symbols
 724
 725 @itemize @bullet
 726 @item must start with @code{GNUNET_modulename_} and be defined in
 727 @file{modulename.c}
 728 @item exceptions: those defined in @file{gnunet_common.h}
 729 @end itemize
 730
 731 @c ***********************************************************************
 732 @node private (library-internal) symbols (including structs and macros)
 733 @subsubsection private (library-internal) symbols (including structs and macros)
 734
 735 @itemize @bullet
 736 @item must NOT start with any prefix
@item must not be exported in a way that linkers could use them or other
 738 libraries might see them via headers; they must be either
 739 declared/defined in C source files or in headers that are in the
 740 respective directory under @file{src/modulename/} and NEVER be declared
 741 in @file{src/include/}.
 742 @end itemize
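
The rules for exported and private symbols can be illustrated with a
small sketch. The module name @code{example} and both function names are
hypothetical and chosen only for illustration; they do not exist in
GNUnet. The helper is @code{static}, so the linker never exports it,
while the public entry point carries the @code{GNUNET_modulename_}
prefix:

```c
#include <stddef.h>

/* Private helper: no prefix, and 'static' keeps it out of the
   library's exported symbol table. */
static int
is_set (unsigned char byte)
{
  return 0 != byte;
}


/* Exported symbol of the hypothetical module 'example': the name
   starts with GNUNET_example_ and the declaration would live in a
   header under src/include/. */
size_t
GNUNET_example_count_set_bytes (const unsigned char *buf,
                                size_t len)
{
  size_t count;
  size_t i;

  count = 0;
  for (i = 0; i < len; i++)
    if (is_set (buf[i]))
      count++;
  return count;
}
```

Following the rule above, the exported function would be defined in
@file{example.c}.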
 743
 744 @node testcases
 745 @subsubsection testcases
 746
 747 @itemize @bullet
 748 @item must be called @file{test_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one test
 750 @end itemize
 751
 752 @c ***********************************************************************
 753 @node performance tests
 754 @subsubsection performance tests
 755
 756 @itemize @bullet
 757 @item must be called @file{perf_module-under-test_case-description.c}
@item "case-description" may be omitted if there is only one performance
test
@item must only be run if @code{HAVE_BENCHMARKS} is satisfied
 761 @end itemize
 762
 763 @c ***********************************************************************
 764 @node src/ directories
 765 @subsubsection src/ directories
 766
 767 @itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
 777 @end itemize
 778
 779 @cindex Coding style
 780 @node Coding style
 781 @subsection Coding style
 782
 783 @c XXX: Adjust examples to GNU Standards!
 784 @itemize @bullet
 785 @item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
 786 @item Indentation is done with spaces, two per level, no tabs;
 787 @item C99 struct initialization is fine;
@item declare only one variable per line. For example,
instead of
 792
 793 @example
 794 int i,j;
 795 @end example
 796
 797 @noindent
 798 write:
 799
 800 @example
 801 int i;
 802 int j;
 803 @end example
 804
 805 @c TODO: include actual example from a file in source
 806
 807 @noindent
 808 This helps keep diffs small and forces developers to think precisely about
 809 the type of every variable.
Note that @code{char *} is different from @code{const char *} and
 811 @code{int} is different from @code{unsigned int} or @code{uint32_t}.
 812 Each variable type should be chosen with care.
 813
 814 @item While @code{goto} should generally be avoided, having a
 815 @code{goto} to the end of a function to a block of clean up
 816 statements (free, close, etc.) can be acceptable.
 817
 818 @item Conditions should be written with constants on the left (to avoid
 819 accidental assignment) and with the @code{true} target being either the
 820 @code{error} case or the significantly simpler continuation. For example:
 821
 822 @example
if (0 != stat ("filename", &sbuf))
@{
  error ();
@}
else
@{
  /* handle normal case here */
@}
 829 @end example
 830
 831 @noindent
 832 instead of
 833
 834 @example
if (stat ("filename", &sbuf) == 0)
@{
  /* handle normal case here */
@}
else
@{
  error ();
@}
 840 @end example
 841
 842 @noindent
 843 If possible, the error clause should be terminated with a @code{return} (or
 844 @code{goto} to some cleanup routine) and in this case, the @code{else} clause
 845 should be omitted:
 846
 847 @example
if (0 != stat ("filename", &sbuf))
@{
  error ();
  return;
@}
 852 /* handle normal case here */
 853 @end example
 854
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
@code{NULL}, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
 859
 860 @item Combined assignments and tests are allowed if they do not hinder
 861 code clarity. For example, one can write:
 862
 863 @example
if (NULL == (value = lookup_function ()))
@{
  error ();
  return;
@}
 868 @end example
 869
 870 @item Use @code{break} and @code{continue} wherever possible to avoid
 871 deep(er) nesting. Thus, we would write:
 872
 873 @example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
 882 @end example
 883
 884 instead of
 885
 886 @example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (should_free (pos))
  @{
    /* unnecessary nesting! */
    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
    GNUNET_free (pos);
  @}
@}
 895 @end example
 896
 897 @item We primarily use @code{for} and @code{while} loops.
 898 A @code{while} loop is used if the method for advancing in the loop is
 899 not a straightforward increment operation. In particular, we use:
 900
 901 @example
 902 next = head;
 903 while (NULL != (pos = next))
 904 @{
 905   next = pos->next;
 906   if (! should_free (pos))
 907     continue;
 908   GNUNET_CONTAINER_DLL_remove (head, tail, pos);
 909   GNUNET_free (pos);
 910 @}
 911 @end example
 912
to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
However, for loops that do follow the simple @code{for} paradigm we do
use @code{for}, even if it involves linked lists:
 918
 919 @example
 920 /* simple iteration over a linked list */
 921 for (pos = head;
 922      NULL != pos;
 923      pos = pos->next)
 924 @{
 925    use (pos);
 926 @}
 927 @end example
 928
 929
 930 @item The first argument to all higher-order functions in GNUnet must be
 931 declared to be of type @code{void *} and is reserved for a closure. We do
 932 not use inner functions, as trampolines would conflict with setups that
 933 use non-executable stacks.
 934 The first statement in a higher-order function, which unusually should
 935 be part of the variable declarations, should assign the
 936 @code{cls} argument to the precise expected type. For example:
 937
 938 @example
int
callback (void *cls, char *args)
@{
  struct Foo *foo = cls;
  int other_variables;

  /* rest of function */
@}
 945 @end example
 946
 947
 948 @item It is good practice to write complex @code{if} expressions instead
 949 of using deeply nested @code{if} statements. However, except for addition
 950 and multiplication, all operators should use parens. This is fine:
 951
 952 @example
 953 if ( (1 == foo) || ((0 == bar) && (x != y)) )
 954   return x;
 955 @end example
 956
 957
 958 However, this is not:
 959
 960 @example
 961 if (1 == foo)
 962   return x;
 963 if (0 == bar && x != y)
 964   return x;
 965 @end example
 966
 967 @noindent
Note that splitting the @code{if} statement above is debatable as the
@code{return x} is a very trivial statement. However, once the logic after
the branch becomes more complicated (and is still identical), the "or"
formulation should definitely be used.
 972
 973 @item There should be two empty lines between the end of the function and
 974 the comments describing the following function. There should be a single
 975 empty line after the initial variable declarations of a function. If a
 976 function has no local variables, there should be no initial empty line. If
 977 a long function consists of several complex steps, those steps might be
 978 separated by an empty line (possibly followed by a comment describing the
 979 following step). The code should not contain empty lines in arbitrary
 980 places; if in doubt, it is likely better to NOT have an empty line (this
 981 way, more code will fit on the screen).
 982 @end itemize
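
Several of the rules above can be combined in one small sketch. It is
not taken from the GNUnet sources; the names @code{TokenCallback},
@code{count_token} and @code{for_each_token} are hypothetical. The sketch
shows one declaration per line, constants on the left, a callback whose
first argument is the closure (assigned to its precise type in the first
declaration), and a single @code{goto} to a cleanup label:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type; the closure is the first argument. */
typedef int
(*TokenCallback) (void *cls,
                  const char *token);

struct Counter
{
  unsigned int tokens;
};


/* Callback: counts every token it is given and never asks to stop. */
static int
count_token (void *cls,
             const char *token)
{
  struct Counter *counter = cls;

  (void) token;
  counter->tokens++;
  return 0;
}


/* Calls CB for each space-separated token in TEXT.  Returns 0 on
   success and -1 on allocation failure or if CB returns non-zero. */
static int
for_each_token (const char *text,
                TokenCallback cb,
                void *cb_cls)
{
  char *copy;
  char *token;
  int ret;

  copy = malloc (strlen (text) + 1);
  if (NULL == copy)
    return -1;
  strcpy (copy, text);
  ret = -1;
  for (token = strtok (copy, " ");
       NULL != token;
       token = strtok (NULL, " "))
  {
    if (0 != cb (cb_cls, token))
      goto cleanup;
  }
  ret = 0;
cleanup:
  free (copy);
  return ret;
}
```

Because the early exit and the normal path share the @code{cleanup}
label, the copy of the input is freed exactly once on every path after
the allocation succeeded.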
 983
 984 @c ***********************************************************************
 985 @node Build-system
 986 @section Build-system
 987
 988 If you have code that is likely not to compile or build rules you might
 989 want to not trigger for most developers, use @code{if HAVE_EXPERIMENTAL}
 990 in your @file{Makefile.am}.
 991 Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
 992 code.
 993
 994 If you want to compile all testcases but NOT run them, run configure with
 995 the @code{--enable-test-suppression} option.
 996
 997 If you want to run all testcases, including those that take a while, run
 998 configure with the @code{--enable-expensive-testcases} option.
 999
1000 If you want to compile and run benchmarks, run configure with the
1001 @code{--enable-benchmarks} option.
1002
1003 If you want to obtain code coverage results, run configure with the
1004 @code{--enable-coverage} option and run the @file{coverage.sh} script in
1005 the @file{contrib/} directory.
1006
1007 @cindex gnunet-ext
1008 @node Developing extensions for GNUnet using the gnunet-ext template
1009 @section Developing extensions for GNUnet using the gnunet-ext template
1010
For developers who want to write extensions for GNUnet, the
gnunet-ext template offers an easy-to-use skeleton.
1013
1014 gnunet-ext contains the build environment and template files for the
1015 development of GNUnet services, command line tools, APIs and tests.
1016
1017 First of all you have to obtain gnunet-ext from git:
1018
1019 @example
1020 git clone https://gnunet.org/git/gnunet-ext.git
1021 @end example
1022
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
1027
1028 @example
1029 ./bootstrap
1030 ./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
1031 @end example
1032
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
1037
1038 @example
1039 export LD_LIBRARY_PATH=/path/to/gnunet/lib
1040 @end example
1041
1042 @cindex writing testcases
1043 @node Writing testcases
1044 @section Writing testcases
1045
1046 Ideally, any non-trivial GNUnet code should be covered by automated
1047 testcases. Testcases should reside in the same place as the code that is
1048 being tested. The name of source files implementing tests should begin
1049 with @code{test_} followed by the name of the file that contains
1050 the code that is being tested.
1051
1052 Testcases in GNUnet should be integrated with the autotools build system.
1053 This way, developers and anyone building binary packages will be able to
1054 run all testcases simply by running @code{make check}. The final
1055 testcases shipped with the distribution should output at most some brief
1056 progress information and not display debug messages by default. The
1057 success or failure of a testcase must be indicated by returning zero
1058 (success) or non-zero (failure) from the main method of the testcase.
1059 The integration with the autotools is relatively straightforward and only
1060 requires modifications to the @file{Makefile.am} in the directory
1061 containing the testcase. For a testcase testing the code in @file{foo.c}
1062 the @file{Makefile.am} would contain the following lines:
1063
1064 @example
1065 check_PROGRAMS = test_foo
1066 TESTS = $(check_PROGRAMS)
1067 test_foo_SOURCES = test_foo.c
1068 test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
1069 @end example
1070
1071 Naturally, other libraries used by the testcase may be specified in the
1072 @code{LDADD} directive as necessary.
1073
1074 Often testcases depend on additional input files, such as a configuration
1075 file. These support files have to be listed using the @code{EXTRA_DIST}
1076 directive in order to ensure that they are included in the distribution.
1077
1078 Example:
1079
1080 @example
1081 EXTRA_DIST = test_foo_data.conf
1082 @end example
1083
1084 Executing @code{make check} will run all testcases in the current
1085 directory and all subdirectories. Testcases can be compiled individually
1086 by running @code{make test_foo} and then invoked directly using
1087 @code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
1088 typically necessary to run @code{make install} before running any
1089 testcases. Thus the canonical command @code{make check install} has to be
1090 changed to @code{make install check} for GNUnet.
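
The exit-code convention described above can be sketched as follows.
The function @code{foo_length} is a hypothetical stand-in for the code
under test (in a real tree it would live in @file{foo.c}), and
@code{run_checks} is likewise an illustrative name. The sketch prints
nothing on success and a brief message on failure:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical function under test; in a real tree it would live in
   foo.c and be linked into the testcase. */
static size_t
foo_length (const char *text)
{
  return strlen (text);
}


/* Runs all checks; returns 0 on success and 1 on the first failure. */
static int
run_checks (void)
{
  if (3 != foo_length ("abc"))
  {
    fprintf (stderr, "foo_length (\"abc\") returned a wrong value\n");
    return 1;
  }
  if (0 != foo_length (""))
  {
    fprintf (stderr, "foo_length (\"\") returned a wrong value\n");
    return 1;
  }
  return 0;
}
```

In @file{test_foo.c}, @code{main} would simply return the value of
@code{run_checks ()}, so that @code{make check} sees zero on success and
non-zero on failure.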
1091
1092 @cindex TESTING library
1093 @node TESTING library
1094 @section TESTING library
1095
The TESTING library is used for writing testcases which involve starting a
single or multiple peers. While peers can also be started by testcases
using the ARM subsystem, using the TESTING library provides an elegant way
to do this. The configurations of the peers are auto-generated from a given
template to have non-conflicting port numbers, ensuring that peers'
services do not run into bind errors. This is achieved by testing each
port's availability by binding a listening socket to it before allocating
it to a service in the generated configurations.
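
The idea of probing a port by binding a socket to it can be sketched
with plain POSIX calls. This is an illustration of the technique only,
not the code used by TESTING; the function names @code{port_is_free} and
@code{find_free_port} are hypothetical, and the sketch only probes TCP on
the loopback interface:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a TCP socket can be bound to PORT on the loopback
   interface, 0 otherwise.  The probe socket is closed again, so the
   port is merely tested, not reserved. */
static int
port_is_free (uint16_t port)
{
  struct sockaddr_in addr;
  int fd;
  int ret;

  fd = socket (AF_INET, SOCK_STREAM, 0);
  if (-1 == fd)
    return 0;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  ret = (0 == bind (fd, (const struct sockaddr *) &addr, sizeof (addr)));
  close (fd);
  return ret;
}


/* Returns the first free port in [LOW, HIGH], or 0 if none is free. */
static uint16_t
find_free_port (uint16_t low,
                uint16_t high)
{
  uint32_t port;

  for (port = low; port <= high; port++)
    if (port_is_free ((uint16_t) port))
      return (uint16_t) port;
  return 0;
}
```

A probe like this is inherently racy: another process may take the port
between the probe and its later use, so a result of "free" is only a
strong hint.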
1104
Another advantage of using TESTING is that it shortens the testcase
startup time, as the hostkeys for peers are copied from a pre-computed set
of hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
1110
TESTING also allows for certain services to be shared among peers. This
feature is invaluable when testing with multiple peers as it helps to
reduce the number of services run per peer and hence the total
number of processes run per testcase.
1115
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases such as connecting peers in a topology are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the localhost. By using
TESTBED, testcases can benefit from creating peers across multiple
hosts.
1122
1123 @menu
1124 * API::
1125 * Finer control over peer stop::
1126 * Helper functions::
1127 * Testing with multiple processes::
1128 @end menu
1129
1130 @cindex TESTING API
1131 @node API
1132 @subsection API
1133
TESTING abstracts a group of peers as a TESTING system. All peers in a
system have a common hostname, and no two services of these peers share
a port or a UNIX domain socket path.

A TESTING system can be created with the function
1139 @code{GNUNET_TESTING_system_create()} which returns a handle to the
1140 system. This function takes a directory path which is used for generating
1141 the configurations of peers, an IP address from which connections to the
1142 peers' services should be allowed, the hostname to be used in peers'
1143 configuration, and an array of shared service specifications of type
1144 @code{struct GNUNET_TESTING_SharedService}.
1145
1146 The shared service specification must specify the name of the service to
1147 share, the configuration pertaining to that shared service and the
1148 maximum number of peers that are allowed to share a single instance of
1149 the shared service.
1150
A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
ports from the default range @code{12000} - @code{56000} while
auto-generating configurations for peers.
This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters --- the start and end of the port range to use.
1158
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag to remove the files created in the directory used
to generate configurations.
1163
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated and the index of the hostkey that is to be copied for
the peer. When successful, this function returns a handle to the
peer which can be used to start and stop it and to obtain the identity of
the peer. If unsuccessful, a NULL pointer is returned with an error
message. This function ensures that the generated configuration has
non-conflicting ports and paths.
1173
Peers can be started and stopped by calling the functions
@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for usage in new
peers.
1180
1181 @c ***********************************************************************
1182 @node Finer control over peer stop
1183 @subsection Finer control over peer stop
1184
Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when trying to
shut down multiple peers, as this function sends the termination signal to
the given peer process and waits for it to terminate. It would be faster
in this case to send the termination signals to the peers first and then
wait on them. This is accomplished by the function
@code{GNUNET_TESTING_peer_kill()}, which sends a termination signal to the
peer, and the function @code{GNUNET_TESTING_peer_wait()}, which waits on
the peer.
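
The signal-first, wait-later pattern is not specific to GNUnet. The
following sketch shows it with plain POSIX calls on process IDs. The
function @code{stop_all} is hypothetical; in a testcase, the two loops
would call @code{GNUNET_TESTING_peer_kill()} and
@code{GNUNET_TESTING_peer_wait()} on peer handles instead, with the exact
signatures taken from the TESTING header:

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Sends SIGTERM to every process in PIDS first and only then reaps
   them, so that the shutdown times overlap instead of adding up.
   Returns the number of processes that were reaped successfully. */
static size_t
stop_all (const pid_t *pids,
          size_t count)
{
  size_t reaped;
  size_t i;
  int status;

  for (i = 0; i < count; i++)
    (void) kill (pids[i], SIGTERM);
  reaped = 0;
  for (i = 0; i < count; i++)
    if (pids[i] == waitpid (pids[i], &status, 0))
      reaped++;
  return reaped;
}
```

With a signal-and-wait call per process, the total shutdown time is
roughly the sum of the individual shutdown times; with the two loops
above it is closer to the slowest single shutdown.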
1194
1195 Further finer control can be achieved by choosing to stop a peer
1196 asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
1197 This function takes a callback parameter and a closure for it in addition
1198 to the handle to the peer to stop. The callback function is called with
1199 the given closure when the peer is stopped. Using this function
1200 eliminates blocking while waiting for the peer to terminate.
1201
1202 An asynchronous peer stop can be cancelled by calling the function
1203 @code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
1204 function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the
callback to be called when the peer is stopped.
1207
1208 @c ***********************************************************************
1209 @node Helper functions
1210 @subsection Helper functions
1211
1212 Most of the testcases can benefit from an abstraction which configures a
1213 peer and starts it. This is provided by the function
1214 @code{GNUNET_TESTING_peer_run()}. This function takes the testing
1215 directory pathname, a configuration template, a callback and its closure.
1216 This function creates a peer in the given testing directory by using the
1217 configuration template, starts the peer and calls the given callback with
1218 the given closure.
1219
1220 The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
1221 the peer which starts the rest of the configured services. A similar
1222 function @code{GNUNET_TESTING_service_run} can be used to just start a
1223 single service of a peer. In this case, the peer's ARM service is not
1224 started; instead, only the given service is run.
1225
1226 @c ***********************************************************************
1227 @node Testing with multiple processes
1228 @subsection Testing with multiple processes
1229
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1239
1240 The following code illustrates spawning and killing an ARM process from a
1241 testcase:
1242
1243 @example
static void
run (void *cls,
     char *const *args,
     const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  /* start ARM with the configuration file passed to the testcase */
  arm_pid = GNUNET_OS_start_process (NULL,
                                     NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c",
                                     cfgfile,
                                     NULL);
  /* do real test work here */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}
1262
1263 GNUNET_PROGRAM_run (argc, argv,
1264                     "NAME-OF-TEST",
1265                     "nohelp",
1266                     options,
1267                     &run,
1268                     cls);
1269 @end example
1270
1271
1272 An alternative way that works well to test plugins is to implement a
1273 mock-version of the environment that the plugin expects and then to
1274 simply load the plugin directly.
1275
1276 @c ***********************************************************************
1277 @node Performance regression analysis with Gauger
1278 @section Performance regression analysis with Gauger
1279
To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
simple logging tool that allows remote hosts to send performance data to
a central server, where this data can be analyzed and visualized. Gauger
shows graphs of the repository revisions and the performance data recorded
for each revision, so sudden performance peaks or drops can be identified
and linked to a specific revision number.

In the case of GNUnet, the buildbots log the performance data obtained
during the tests after each build. The data can be accessed on GNUnet's
Gauger page.
1290
The menu on the left lets you select either the results of just one
build bot (under "Hosts") or review the data from all hosts for a given
test result (under "Metrics"). When the absolute values of the results
differ widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea about the
performance evolution across all hosts.

Using Gauger in GNUnet and having the performance of a module tracked over
time is very easy. First of course, the testcase must generate some
consistent metric which makes sense to have logged. Highly volatile or
randomness-dependent metrics are probably not ideal candidates for
meaningful regression detection.
1303
To start logging any value, just include @code{gauger.h} in your testcase
code. Then, use the macro @code{GAUGER()} to make the Buildbots log
whatever value is of interest to you to @code{gnunet.org}'s Gauger
server. No setup is necessary, as most Buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
need to contact a member of the GNUnet development team (a file will need
to be removed manually from the respective directory).
1311
1312 The code in the test should look like this:
1313
1314 @example
[other includes]
#include <gauger.h>

int
main (int argc, char *argv[])
@{
  [run test, generate data]
  GAUGER ("YOUR_MODULE",
          "METRIC_NAME",
          (float) value,
          "UNIT");
@}
1325 @end example
1326
1327 Where:
1328
1329 @table @asis
1330
@item @strong{YOUR_MODULE}
is a category on the Gauger page and should be the name of the module or
subsystem, like "Core" or "DHT".
@item @strong{METRIC_NAME}
is the name of the metric being collected and should be concise and
descriptive, like "PUT operations in sqlite-datastore".
@item @strong{value}
is the value of the metric that is logged for this run.
@item @strong{UNIT}
is the unit in which the value is measured, for instance "kb/s" or
"kb of RAM/node".
1340 @end table
1341
1342 If you wish to use Gauger for your own project, you can grab a copy of the
1343 latest stable release or check out Gauger's Subversion repository.
1344
1345 @cindex TESTBED Subsystem
1346 @node TESTBED Subsystem
1347 @section TESTBED Subsystem
1348
1349 The TESTBED subsystem facilitates testing and measuring of multi-peer
1350 deployments on a single host or over multiple hosts.
1351
1352 The architecture of the testbed module is divided into the following:
1353 @itemize @bullet
1354
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting, stopping
peers, etc.
1358
1359 @item Testbed service (controller): A service which is started through the
1360 Testbed API. This service handles operations to create, destroy, start,
1361 stop peers, connect them, modify their configurations.
1362
1363 @item Testbed helper: When a controller has to be started on a host, the
1364 testbed API starts the testbed helper on that host which in turn starts
1365 the controller. The testbed helper receives a configuration for the
1366 controller through its stdin and changes it to ensure the controller
1367 doesn't run into any port conflict on that host.
1368 @end itemize
1369
1370
The testbed service (controller) is different from the other GNUnet
services in that it is not started by ARM and is not supposed to be run
as a daemon. It is started by the testbed API through a testbed helper.
In a typical scenario involving multiple hosts, a controller is started
on each host. Controllers take up the actual task of creating peers,
starting and stopping them on the hosts they run on.
1377
When running deployments on a single localhost, the testbed API starts the
testbed helper directly as a child process. When running deployments on
remote hosts, the testbed API starts testbed helpers on each remote host
through a remote shell. By default, the testbed API uses SSH as the remote
shell. This can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1386
1387 @example
1388 export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
1389 -o NoHostAuthenticationForLocalhost=yes %h"
1390 @end example
1391
The command string above may contain placemarks, which begin with a `%'
and are substituted before the command is run.
At present the following substitutions are supported:
1395
1396 @itemize @bullet
1397 @item %h: hostname
1398 @item %u: username
1399 @item %p: port
1400 @end itemize
1401
1402 Note that the substitution placemark is replaced only when the
1403 corresponding field is available and only once. Specifying
1404
1405 @example
1406 %u@@%h
1407 @end example
1408
does not work. If you want to use username substitutions for
@command{SSH}, use the argument @code{-l} before the
username substitution.
1412
1413 For example:
1414 @example
1415 ssh -l %u -p %p %h
1416 @end example
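
The "only when available and only once" behaviour of the placemarks can
be sketched as follows. This is an illustration under assumptions and not
the code used by the testbed API; the function name
@code{substitute_once} is hypothetical, and an unavailable field is
modelled as a @code{NULL} value:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of TMPL in which the first occurrence
   of MARK (for example "%h") is replaced by VALUE.  If VALUE is NULL
   (field not available) or MARK does not occur, a plain copy of TMPL
   is returned.  Returns NULL on allocation failure; the caller frees
   the result. */
static char *
substitute_once (const char *tmpl,
                 const char *mark,
                 const char *value)
{
  const char *hit;
  size_t prefix_len;
  char *out;

  hit = (NULL == value) ? NULL : strstr (tmpl, mark);
  if (NULL == hit)
  {
    out = malloc (strlen (tmpl) + 1);
    if (NULL == out)
      return NULL;
    strcpy (out, tmpl);
    return out;
  }
  prefix_len = (size_t) (hit - tmpl);
  out = malloc (strlen (tmpl) - strlen (mark) + strlen (value) + 1);
  if (NULL == out)
    return NULL;
  memcpy (out, tmpl, prefix_len);          /* text before the mark */
  strcpy (out + prefix_len, value);        /* the substituted value */
  strcat (out, hit + strlen (mark));       /* text after the mark */
  return out;
}
```

Applying the function once per placemark leaves a second occurrence of
the same placemark, and any placemark whose field is unavailable,
untouched in the resulting command string.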
1417
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags that make the remote shell produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:
1425
1426 @example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
1428 @end example
1429
1430 Password-less logins should be ensured by using SSH keys.
1431
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like @file{.bashrc} and @file{.profile} may not be
executed. If this is the case, the testbed API can be forced to execute an
interactive shell by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
1437
1438 An example could be:
1439
1440 @example
1441 export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1442 @end example
1443
1444 The testbed API will then execute the remote shell program as:
1445
1446 @example
1447 $GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1448 gnunet-helper-testbed
1449 @end example
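
As a rough illustration of how the pieces of this command line fit
together, the following Python sketch assembles the argument list from
its parts. The function @code{helper_command} and its parameters are
hypothetical names chosen for this example; the sketch only mirrors the
order shown above and is not taken from the testbed sources.

@example
import shlex


def helper_command(rsh_cmd, port, dest, suffix=None,
                   helper="gnunet-helper-testbed"):
    # Start with the remote shell program and its own parameters.
    argv = shlex.split(rsh_cmd)
    # Port and destination follow, as in the command line shown above.
    argv += ["-p", str(port), dest]
    # The optional suffix (for example "sh -lc") precedes the helper.
    if suffix:
        argv += shlex.split(suffix)
    argv.append(helper)
    return argv


print(helper_command("ssh -o BatchMode=yes", 22, "192.0.2.7", "sh -lc"))
@end example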
1450
On some systems, problems may arise while starting testbed helpers if
GNUnet is installed into a custom location, since the helper may not be
found in the standard path. This can be addressed by setting the variable
@code{HELPER_BINARY_PATH} to the path of the testbed helper.
The testbed API will then use this path to start helper binaries both
locally and remotely.
1457
The testbed API can be accessed by including the
@file{gnunet_testbed_service.h} file and linking with
@code{-lgnunettestbed}.
1461
1462 @c ***********************************************************************
1463 @menu
1464 * Supported Topologies::
1465 * Hosts file format::
1466 * Topology file format::
1467 * Testbed Barriers::
1468 * Automatic large-scale deployment in the PlanetLab testbed::
1469 * TESTBED Caveats::
1470 @end menu
1471
1472 @node Supported Topologies
1473 @subsection Supported Topologies
1474
1475 While testing multi-peer deployments, it is often needed that the peers
1476 are connected in some topology. This requirement is addressed by the
1477 function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
1478 two peers in the testbed.
1479
1480 The API also provides a helper function
1481 @code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1482 of peers in any of the following supported topologies:
1483
1484 @itemize @bullet
1485
1486 @item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1487 each other
1488
1489 @item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1490 line
1491
1492 @item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1493 ring topology
1494
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2-dimensional torus topology. If the number of peers is not a
perfect square, the resulting torus may not have uniform
poloidal and toroidal lengths
1499
1500 @item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
1501 to form a random graph. The number of links to be present should be given
1502
@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D torus with some random links among them. The number of random
links is to be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links is to be given
1510
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well-connected peers.
1514 @footnote{See Emergence of Scaling in Random Networks. Science 286,
1515 509-512, 1999
1516 (@uref{https://gnunet.org/git/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf})}
1517
1518 @item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1519 is loaded from a file. The path to the file has to be given.
1520 @xref{Topology file format}, for the format of this file.
1521
1522 @item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1523 @end itemize
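
To make the simplest of these topologies concrete, the following Python
sketch lists the links that a clique, a line and a ring contain. It is an
illustration only and does not use the testbed API; numbering the peers
from 0 and the function names are assumptions of this example, and the
ring sketch assumes at least three peers.

@example
def line_links(n):
    # Peer i is linked to peer i + 1.
    return [(i, i + 1) for i in range(n - 1)]


def ring_links(n):
    # A line whose last peer is linked back to the first one.
    return line_links(n) + [(n - 1, 0)]


def clique_links(n):
    # Every unordered pair of distinct peers is linked.
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


print(line_links(4))
print(ring_links(4))
print(len(clique_links(5)))
@end example

A clique of n peers has n * (n - 1) / 2 links, so the number of overlay
connections grows quadratically with the number of peers, while lines and
rings grow linearly.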
1524
1525
1526 The above supported topologies can be specified respectively by setting
1527 the variable @code{OVERLAY_TOPOLOGY} to the following values in the
1528 configuration passed to Testbed API functions
1529 @code{GNUNET_TESTBED_test_run()} and
1530 @code{GNUNET_TESTBED_run()}:
1531
1532 @itemize @bullet
1533 @item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1536 @item @code{2D_TORUS}
1537 @item @code{RANDOM}
1538 @item @code{SMALL_WORLD}
1539 @item @code{SMALL_WORLD_RING}
1540 @item @code{SCALE_FREE}
1541 @item @code{FROM_FILE}
1542 @item @code{NONE}
1543 @end itemize
1544
1545
1546 Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1547 require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1548 random links to be generated in the configuration. The option will be
1549 ignored for the rest of the topologies.
1550
Topology @code{SCALE_FREE} requires the option
@code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
which can connect to a peer and @code{SCALE_FREE_TOPOLOGY_M} to be set to
the minimum number of peers a peer should be connected to.
1555
1556 Similarly, the topology @code{FROM_FILE} requires the option
1557 @code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1558 the topology information. This option is ignored for the rest of the
1559 topologies. @xref{Topology file format}, for the format of this file.
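
The dependencies between @code{OVERLAY_TOPOLOGY} and the other options
described above can be summarised in a small Python sketch. It only
restates the rules from this section; the function names are hypothetical,
the options are passed as a plain dictionary rather than a GNUnet
configuration, and treating a missing @code{OVERLAY_TOPOLOGY} as
@code{NONE} is an assumption of the example.

@example
RANDOM_LINK_TOPOLOGIES = ["RANDOM", "SMALL_WORLD", "SMALL_WORLD_RING"]


def required_options(topology):
    # Options that must accompany the given OVERLAY_TOPOLOGY value.
    if topology in RANDOM_LINK_TOPOLOGIES:
        return ["OVERLAY_RANDOM_LINKS"]
    if topology == "SCALE_FREE":
        return ["SCALE_FREE_TOPOLOGY_CAP", "SCALE_FREE_TOPOLOGY_M"]
    if topology == "FROM_FILE":
        return ["OVERLAY_TOPOLOGY_FILE"]
    return []


def missing_options(options):
    # options maps option names to values, e.g. read from a config file.
    topology = options.get("OVERLAY_TOPOLOGY", "NONE")
    return [name for name in required_options(topology)
            if name not in options]


print(missing_options(dict(OVERLAY_TOPOLOGY="SCALE_FREE",
                           SCALE_FREE_TOPOLOGY_M="2")))
@end example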
1560
1561 @c ***********************************************************************
1562 @node Hosts file format
1563 @subsection Hosts file format
1564
1565 The testbed API offers the function
1566 @code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
1567 details about the hosts which testbed can use for deploying peers.
1568 This function is useful to keep the data about hosts
1569 separate instead of hard coding them in code.
1570
1571 Another helper function from testbed API, @code{GNUNET_TESTBED_run()}
1572 also takes a hosts file name as its parameter. It uses the above
1573 function to populate the hosts data structures and start controllers to
1574 deploy peers.
1575
1576 These functions require the hosts file to be of the following format:
1577 @itemize @bullet
1578 @item Each line is interpreted to have details about a host
@item Host details should include the username to use for logging into the
host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
1582 @item These details should be given in the following format:
1583 @example
1584 <username>@@<hostname>:<port>
1585 @end example
1586 @end itemize
1587
Note that using canonical hostnames may cause problems while resolving
the IP addresses. Hence it is advised to provide the hosts'
numerical IP addresses as hostnames whenever possible.
1591
1592 @c ***********************************************************************
1593 @node Topology file format
1594 @subsection Topology file format
1595
1596 A topology file describes how peers are to be connected. It should adhere
1597 to the following format for testbed to parse it correctly.
1598
Each line should begin with the target peer id. This should be followed by
a colon (`:') and origin peer ids separated by `|'. All spaces except for
newline characters are ignored. The API will then try to connect each
origin peer to the target peer.
1603
1604 For example, the following file will result in 5 overlay connections:
[2->1], [3->1], [4->3], [0->3], [2->0]@
1606 @code{@ 1:2|3@ 3:4| 0@ 0: 2@ }
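
The format can be made concrete with a short Python sketch that reads
such a file and returns the resulting (origin, target) pairs. This is an
illustration of the format as described here, not the parser used by
testbed; the function name is hypothetical and the sketch performs no
error handling for malformed lines.

@example
def parse_topology(text):
    links = []
    for raw in text.splitlines():
        # All spaces are ignored; skip lines that are empty afterwards.
        line = "".join(raw.split())
        if not line:
            continue
        target, _, origins = line.partition(":")
        for origin in origins.split("|"):
            # Each origin peer is connected to the target peer.
            links.append((int(origin), int(target)))
    return links


print(parse_topology("1:2|3\n3:4| 0\n0: 2\n"))
@end example

Applied to the example file above, the sketch yields the five pairs
(2, 1), (3, 1), (4, 3), (0, 3) and (2, 0).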
1607
1608 @c ***********************************************************************
1609 @node Testbed Barriers
1610 @subsection Testbed Barriers
1611
The testbed subsystem's barriers API facilitates coordination among the
peers run by the testbed and the experiment driver. The concept is
similar to the barrier synchronisation mechanism found in parallel
programming or multi-threading paradigms: a peer waits at a barrier upon
reaching it until the barrier is reached by a predefined number of peers.
This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for
the barrier to be crossed. Similarly, a barrier is said to be reached if
the required quorum of peers has reached the barrier. A barrier which is
reached is deemed crossed after all the peers waiting on it are notified.
1622
1623 The barriers API provides the following functions:
1624 @itemize @bullet
@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
initialise a barrier in the experiment
1627 @item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
1628 a barrier which has been initialised before
1629 @item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
1630 barrier service that the caller has reached a barrier and is waiting for
1631 it to be crossed
1632 @item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
1633 stop waiting for a barrier to be crossed
1634 @end itemize
1635
1636
1637 Among the above functions, the first two, namely
1638 @code{GNUNET_TESTBED_barrier_init()} and
1639 @code{GNUNET_TESTBED_barrier_cancel()} are used by experiment drivers. All
1640 barriers should be initialised by the experiment driver by calling
1641 @code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
1642 identify the barrier, the quorum required for the barrier to be crossed
1643 and a notification callback for notifying the experiment driver when the
1644 barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.
1647
1648 The remaining two functions @code{GNUNET_TESTBED_barrier_wait()} and
1649 @code{GNUNET_TESTBED_barrier_wait_cancel()} are used in the peer's
1650 processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
1651 barrier service running on the same host the peer is running on and
1652 registers that the caller has reached the barrier and is waiting for the
1653 barrier to be crossed. Note that this function can only be used by peers
1654 which are started by testbed as this function tries to access the local
1655 barrier service which is part of the testbed controller service. Calling
1656 @code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
1657 in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
1658 notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
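
The life cycle of a barrier described above (initialised, reached,
crossed) can be modelled in a few lines of Python. This is a conceptual
sketch, not the testbed API: the class and method names are hypothetical,
everything runs in a single process, and the quorum is treated as an
absolute number of peers as in the description above.

@example
class Barrier:
    def __init__(self, name, quorum):
        self.name = name
        self.quorum = quorum   # number of peers needed (assumed a count)
        self.waiting = []      # peers that have reached the barrier
        self.crossed = False

    def wait(self, peer):
        # A peer reaches the barrier and waits for it to be crossed.
        self.waiting.append(peer)
        if len(self.waiting) >= self.quorum and not self.crossed:
            # Quorum reached: the waiting peers are notified and the
            # barrier then counts as crossed.
            self.crossed = True
            return list(self.waiting)
        return []

    def wait_cancel(self, peer):
        # A peer stops waiting before the barrier is crossed.
        if not self.crossed and peer in self.waiting:
            self.waiting.remove(peer)


barrier = Barrier("start", 2)
print(barrier.wait("peer-1"))   # quorum not yet reached
print(barrier.wait("peer-2"))   # quorum reached, both peers notified
@end example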
1659
1660
1661 @c ***********************************************************************
1662 @menu
1663 * Implementation::
1664 @end menu
1665
1666 @node Implementation
1667 @subsubsection Implementation
1668
1669 Since barriers involve coordination between experiment driver and peers,
1670 the barrier service in the testbed controller is split into two
1671 components. The first component responds to the message generated by the
1672 barrier API used by the experiment driver (functions
1673 @code{GNUNET_TESTBED_barrier_init()} and
1674 @code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
1675 messages generated by barrier API used by peers (functions
1676 @code{GNUNET_TESTBED_barrier_wait()} and
1677 @code{GNUNET_TESTBED_barrier_wait_cancel()}).
1678
1679 Calling @code{GNUNET_TESTBED_barrier_init()} sends a
1680 @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
1681 controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
this way barrier initialisation is propagated through the controller
hierarchy.
1684 While propagating initialisation, any errors at a subcontroller such as
1685 timeout during further propagation are reported up the hierarchy back to
1686 the experiment driver.
1687
Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.
1692
The second component is implemented as a separate service in the binary
@file{gnunet-service-testbed}, which already contains the testbed
controller service.
Although this deviates from the GNUnet process architecture of having one
service per binary, it is needed in this case as this component needs
access to barrier data created by the first component. This component
responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from
local peers when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon
receiving a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the
service checks whether the requested barrier has been initialised. If it
has not, an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.
1708
1709 When enough peers required to attain the quorum send
1710 @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
1711 sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
1712 parent informing that the barrier is crossed. If the controller has
1713 started further subcontrollers, it delays this message until it receives
1714 a similar notification from each of those subcontrollers. Finally, the
barriers API at the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the barrier
is reached at all the controllers.
1718
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller through the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between this downward
propagation of the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
message and its upward propagation: the upward propagation ensures that
the barrier is reached by all the controllers, while the downward
propagation signals that the barrier is crossed.
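
The two directions of propagation can be pictured with a small Python
model of a controller hierarchy. It is a conceptual sketch only: no
messages are exchanged, all names are hypothetical, and giving every
controller its own @code{local_quorum} is a simplifying assumption of
this example rather than a statement about the implementation.

@example
class Controller:
    def __init__(self, name, local_quorum, children=()):
        self.name = name
        self.local_quorum = local_quorum
        self.children = list(children)
        self.waiting = 0
        self.notified = False

    def peer_waits(self):
        # A local peer sent a BARRIER_WAIT message.
        self.waiting += 1

    def reached(self):
        # Upward direction: report only once the local quorum is met
        # and every subcontroller has reported as well.
        return (self.waiting >= self.local_quorum
                and all(c.reached() for c in self.children))

    def cross(self):
        # Downward direction: the echoed status message triggers the
        # notifications at this controller and at all controllers below.
        self.notified = True
        for child in self.children:
            child.cross()


def drive(master):
    # The experiment driver echoes the status message once the master
    # controller reports that the barrier is reached everywhere.
    if master.reached():
        master.cross()
    return master.notified


sub = Controller("sub", 1)
master = Controller("master", 1, [sub])
master.peer_waits()
print(drive(master))   # the subcontroller has not reported yet
sub.peer_waits()
print(drive(master))   # reached everywhere, so the barrier is crossed
@end example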
1731
1732 @cindex PlanetLab testbed
1733 @node Automatic large-scale deployment in the PlanetLab testbed
1734 @subsection Automatic large-scale deployment in the PlanetLab testbed
1735
1736 PlanetLab is a testbed for computer networking and distributed systems
1737 research. It was established in 2002 and as of June 2010 was composed of
1738 1090 nodes at 507 sites worldwide.
1739
To simplify the large-scale deployment of GNUnet, we created a set of
automation tools. We provide a set of scripts you can use
to deploy GNUnet on a set of nodes and manage your installation.

Please also check @uref{https://gnunet.org/installation-fedora8-svn} and
@uref{https://gnunet.org/installation-fedora12-svn} to find detailed
instructions on how to install GNUnet on a PlanetLab node.
1747
1748
1749 @c ***********************************************************************
1750 @menu
1751 * PlanetLab Automation for Fedora8 nodes::
1752 * Install buildslave on PlanetLab nodes running fedora core 8::
1753 * Setup a new PlanetLab testbed using GPLMT::
1754 * Why do i get an ssh error when using the regex profiler?::
1755 @end menu
1756
1757 @node PlanetLab Automation for Fedora8 nodes
1758 @subsubsection PlanetLab Automation for Fedora8 nodes
1759
1760 @c ***********************************************************************
1761 @node Install buildslave on PlanetLab nodes running fedora core 8
1762 @subsubsection Install buildslave on PlanetLab nodes running fedora core 8
1763 @c ** Actually this is a subsubsubsection, but must be fixed differently
1764 @c ** as subsubsection is the lowest.
1765
Since most of the PlanetLab nodes are running the very old Fedora Core 8
image, installing the buildslave software is rather painful. For our
PlanetLab testbed we figured out how best to install the buildslave
software.
1770
1771 @c This is a vvery terrible way to suggest installing software.
1772 @c FIXME: Is there an official, safer way instead of blind-piping a
1773 @c script?
1774 @c FIXME: Use newer pypi URLs below.
1775 Install Distribute for Python:
1776
1777 @example
1778 curl http://python-distribute.org/distribute_setup.py | sudo python
1779 @end example
1780
Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not
work):
1783
1784 @example
1785 export PYPI=@value{PYPI-URL}
1786 wget $PYPI/z/zope.interface/zope.interface-3.8.0.tar.gz
tar xvfz zope.interface-3.8.0.tar.gz
1788 cd zope.interface-3.8.0
1789 sudo python setup.py install
1790 @end example
1791
1792 Install the buildslave software (0.8.6 was the latest version):
1793
1794 @example
1795 export GCODE="http://buildbot.googlecode.com/files"
1796 wget $GCODE/buildbot-slave-0.8.6p1.tar.gz
1797 tar xvfz buildbot-slave-0.8.6p1.tar.gz
1798 cd buildslave-0.8.6p1
1799 sudo python setup.py install
1800 @end example
1801
1802 The setup will download the matching twisted package and install it.
1803 It will also try to install the latest version of zope.interface which
1804 will fail to install. Buildslave will work anyway since version 3.8.0
1805 was installed before!
1806
1807 @c ***********************************************************************
1808 @node Setup a new PlanetLab testbed using GPLMT
1809 @subsubsection Setup a new PlanetLab testbed using GPLMT
1810
1811 @itemize @bullet
@item Get a new slice and assign nodes.
Ask your PlanetLab PI to give you a new slice and assign the nodes you
need.
@item Install a buildmaster.
You can follow the buildbot documentation:@
@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
@item Install the buildslave software on all nodes.
1819 To install the buildslave on all nodes assigned to your slice you can use
1820 the tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:
1821
1822 @example
1823 ./gplmt.py -c contrib/tumple_gnunet.conf -t \
1824 contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
1825 @end example
1826
1827 @item Create the buildmaster configuration and the slave setup commands
1828
The master and the slaves need to have credentials, and the
master has to have all nodes configured. This can be done with the
@file{create_buildbot_configuration.py} script in the @file{scripts}
directory.

This script takes a list of nodes retrieved directly from PlanetLab or
read from a file and a configuration template and creates:
1836
1837 @itemize @bullet
1838 @item a tasklist which can be executed with gplmt to setup the slaves
@item a @file{master.cfg} file containing the PlanetLab nodes
1840 @end itemize
1841
A configuration template is included in the @file{contrib} directory.
Most importantly, the script replaces the following tags in the template:
1844
1845 %GPLMT_BUILDER_DEFINITION :@ GPLMT_BUILDER_SUMMARY@ GPLMT_SLAVES@
1846 %GPLMT_SCHEDULER_BUILDERS
1847
1848 Create configuration for all nodes assigned to a slice:
1849
1850 @example
1851 ./create_buildbot_configuration.py -u <planetlab username> \
1852 -p <planetlab password> -s <slice> -m <buildmaster+port> \
1853 -t <template>
1854 @end example
1855
1856 Create configuration for some nodes in a file:
1857
1858 @example
./create_buildbot_configuration.py -f <node_file> \
1860 -m <buildmaster+port> -t <template>
1861 @end example
1862
@item Copy the @file{master.cfg} to the buildmaster and start it.
Use @code{buildbot start <basedir>} to start the server.
@item Set up the buildslaves.
1866 @end itemize
1867
1868 @c ***********************************************************************
1869 @node Why do i get an ssh error when using the regex profiler?
1870 @subsubsection Why do i get an ssh error when using the regex profiler?
1871
Why do I get an ssh error "Permission denied (publickey,password)." when
using the regex profiler although passwordless ssh to localhost works
using publickey and ssh-agent?
1875
1876 You have to generate a public/private-key pair with no password:@
1877 @code{ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost}@
1878 and then add the following to your ~/.ssh/config file:
1879
1880 @code{Host 127.0.0.1@ IdentityFile ~/.ssh/id_localhost}
1881
Now make sure your hosts file looks like this:
1883
1884 @example
[USERNAME]@@127.0.0.1:22
1886 [USERNAME]@@127.0.0.1:22
1887 @end example
1888
1889 You can test your setup by running @code{ssh 127.0.0.1} in a
1890 terminal and then in the opened session run it again.
1891 If you were not asked for a password on either login,
1892 then you should be good to go.
1893
1894 @cindex TESTBED Caveats
1895 @node TESTBED Caveats
1896 @subsection TESTBED Caveats
1897
1898 This section documents a few caveats when using the GNUnet testbed
1899 subsystem.
1900
1901 @c ***********************************************************************
1902 @menu
1903 * CORE must be started::
1904 * ATS must want the connections::
1905 @end menu
1906
1907 @node CORE must be started
1908 @subsubsection CORE must be started
1909
An uncomplicated issue is bug #3993@footnote{@uref{https://gnunet.org/bugs/view.php?id=3993, https://gnunet.org/bugs/view.php?id=3993}}:
Your configuration MUST somehow ensure that for each peer the
@code{CORE} service is started when the peer is set up, otherwise
@code{TESTBED} may fail to connect peers when the topology is initialized,
as @code{TESTBED} will start some @code{CORE} services but not
necessarily all (but it relies on all of them running). The easiest way
is to set
1917
1918 @example
1919 [core]
1920 FORCESTART = YES
1921 @end example
1922
1923 @noindent
1924 in the configuration file.
1925 Alternatively, having any service that directly or indirectly depends on
1926 @code{CORE} being started with @code{FORCESTART} will also do.
1927 This issue largely arises if users try to over-optimize by not
1928 starting any services with @code{FORCESTART}.
1929
1930 @c ***********************************************************************
1931 @node ATS must want the connections
1932 @subsubsection ATS must want the connections
1933
1934 When TESTBED sets up connections, it only offers the respective HELLO
1935 information to the TRANSPORT service. It is then up to the ATS service to
1936 @strong{decide} to use the connection. The ATS service will typically
1937 eagerly establish any connection if the number of total connections is
1938 low (relative to bandwidth). Details may further depend on the
1939 specific ATS backend that was configured. If ATS decides to NOT establish
1940 a connection (even though TESTBED provided the required information), then
1941 that connection will count as failed for TESTBED. Note that you can
1942 configure TESTBED to tolerate a certain number of connection failures
1943 (see '-e' option of gnunet-testbed-profiler). This issue largely arises
1944 for dense overlay topologies, especially if you try to create cliques
1945 with more than 20 peers.
1946
1947 @cindex libgnunetutil
1948 @node libgnunetutil
1949 @section libgnunetutil
1950
1951 libgnunetutil is the fundamental library that all GNUnet code builds upon.
1952 Ideally, this library should contain most of the platform dependent code
1953 (except for user interfaces and really special needs that only few
1954 applications have). It is also supposed to offer basic services that most
1955 if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util_lib.h} header. The functions provided by libgnunetutil fall
1958 roughly into the following categories (in roughly the order of importance
1959 for new developers):
1960
1961 @itemize @bullet
1962 @item logging (common_logging.c)
1963 @item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
1965 @item internationalization (common_gettext.c)
1966 @item String manipulation (string.c)
1967 @item file access (disk.c)
1968 @item buffered disk IO (bio.c)
1969 @item time manipulation (time.c)
1970 @item configuration parsing (configuration.c)
1971 @item command-line handling (getopt*.c)
1972 @item cryptography (crypto_*.c)
1973 @item data structures (container_*.c)
1974 @item CPS-style scheduling (scheduler.c)
1975 @item Program initialization (program.c)
1976 @item Networking (network.c, client.c, server*.c, service.c)
1977 @item message queueing (mq.c)
1978 @item bandwidth calculations (bandwidth.c)
1979 @item Other OS-related (os*.c, plugin.c, signal.c)
1980 @item Pseudonym management (pseudonym.c)
1981 @end itemize
1982
1983 It should be noted that only developers that fully understand this entire
1984 API will be able to write good GNUnet code.
1985
1986 Ideally, porting GNUnet should only require porting the gnunetutil
1987 library. More testcases for the gnunetutil APIs are therefore a great
1988 way to make porting of GNUnet easier.
1989
1990 @menu
1991 * Logging::
1992 * Interprocess communication API (IPC)::
1993 * Cryptography API::
1994 * Message Queue API::
1995 * Service API::
1996 * Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
1997 * CONTAINER_MDLL API::
1998 @end menu
1999
2000 @cindex Logging
2001 @cindex log levels
2002 @node Logging
2003 @subsection Logging
2004
2005 GNUnet is able to log its activity, mostly for the purposes of debugging
2006 the program at various levels.
2007
2008 @file{gnunet_common.h} defines several @strong{log levels}:
2009 @table @asis
2010
2011 @item ERROR for errors (really problematic situations, often leading to
2012 crashes)
2013 @item WARNING for warnings (troubling situations that might have
2014 negative consequences, although not fatal)
2015 @item INFO for various information.
2016 Used somewhat rarely, as GNUnet statistics is used to hold and display
2017 most of the information that users might find interesting.
2018 @item DEBUG for debugging.
2019 Does not produce much output on normal builds, but when extra logging is
2020 enabled at compile time, a staggering amount of data is outputted under
2021 this log level.
2022 @end table
2023
2024
2025 Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
2026 are supposed to log nothing under DEBUG level. The
2027 @code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
2029 amounts of log data, which is inconvenient when one tries to hunt down a
2030 specific problem.
2031
2032 To mitigate this problem, GNUnet provides facilities to apply a filter to
2033 reduce the logs:
2034 @table @asis
2035
@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an @code{-L loglevel} or
@code{--log=loglevel} option. If used, it makes the process set a global
log level to "loglevel". Thus it is possible to run some processes
with @code{-L DEBUG}, for example, and others with @code{-L ERROR} to
enable specific settings to diagnose problems with a particular process.
@item Configuration files.
Because GNUnet service and daemon processes are usually launched by
gnunet-arm, it is not
possible to pass different custom command line options directly to every
one of them. The options passed to @code{gnunet-arm} only affect
gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service
or a daemon, and put a value of "-L loglevel" there. This will make the
respective service or daemon set its log level to "loglevel" (as the
value of OPTIONS will be passed as a command-line argument).
2056
2057 To specify the same log level for all services without creating separate
2058 "OPTIONS" entries in the configuration for each one, the user can specify
2059 a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
2060 file. The value of GLOBAL_POSTFIX will be appended to all command lines
2061 used by the ARM service to run other services. It can contain any option
2062 valid for all GNUnet commands, thus in particular the "-L loglevel"
2063 option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
2064 to set log level for it, one has to specify "OPTIONS" key in the [arm]
2065 section.
2066 @item Environment variables.
2067 Setting global per-process log levels with "-L loglevel" does not offer
2068 sufficient log filtering granularity, as one service will call interface
2069 libraries and supporting libraries of other GNUnet services, potentially
2070 producing lots of debug log messages from these libraries. Also, changing
2071 the config file is not always convenient (especially when running the
2072 GNUnet test suite).@ To fix that, and to allow GNUnet to use different
2073 log filtering at runtime without re-compiling the whole source tree, the
2074 log calls were changed to be configurable at run time. To configure them
2075 one has to define environment variables "GNUNET_FORCE_LOGFILE",
2076 "GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
2077 @itemize @bullet
2078
2079 @item "GNUNET_LOG" only affects the logging when no global log level is
2080 configured by any other means (that is, the process does not explicitly
2081 set its own log level, there are no "-L loglevel" options on command line
2082 or in configuration files), and can be used to override the default
2083 WARNING log level.
2084
2085 @item "GNUNET_FORCE_LOG" will completely override any other log
2086 configuration options given.
2087
2088 @item "GNUNET_FORCE_LOGFILE" will completely override the location of the
2089 file to log messages to. It should contain a relative or absolute file
name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing the
@code{--log-file=logfile} or @code{-l logfile} option (see below). It
supports "[]" format in file names, but not "@{@}" (see below).
2093 @end itemize
2094
2095
2096 Because environment variables are inherited by child processes when they
2097 are launched, starting or re-starting the ARM service with these
2098 variables will propagate them to all other services.
2099
2100 "GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
2101 formatted @strong{logging definition} string, which looks like this:@
2102
2103 @c FIXME: Can we close this with [/component] instead?
2104 @example
2105 [component];[file];[function];[from_line[-to_line]];loglevel[/component...]
2106 @end example
2107
2108 That is, a logging definition consists of definition entries, separated by
2109 slashes ('/'). If only one entry is present, there is no need to add a
2110 slash to its end (although it is not forbidden either).@ All definition
2111 fields (component, file, function, lines and loglevel) are mandatory, but
2112 (except for the loglevel) they can be empty. An empty field means
2113 "match anything". Note that even if fields are empty, the semicolon (';')
2114 separators must be present.@ The loglevel field is mandatory, and must
2115 contain one of the log level names (ERROR, WARNING, INFO or DEBUG).@
2116 The lines field might contain one non-negative number, in which case it
2117 matches only one line, or a range "from_line-to_line", in which case it
2118 matches any line in the interval [from_line;to_line] (that is, including
2119 both start and end line).@ GNUnet mostly defaults component name to the
2120 name of the service that is implemented in a process ('transport',
2121 'core', 'peerinfo', etc), but logging calls can specify custom component
2122 names using @code{GNUNET_log_from}.@ File name and function name are
2123 provided by the compiler (__FILE__ and __FUNCTION__ built-ins).
2124
2125 Component, file and function fields are interpreted as non-extended
2126 regular expressions (GNU libc regex functions are used). Matching is
2127 case-sensitive, "^" and "$" will match the beginning and the end of the
2128 text. If a field is empty, its contents are automatically replaced with
2129 a ".*" regular expression, which matches anything. Matching is done in
2130 the default way, which means that the expression matches as long as it's
2131 contained anywhere in the string. Thus "GNUNET_" will match both
2132 "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
2133 the expression matches at the start and/or at the end of the string.
2134 The semicolon (';') can't be escaped, and GNUnet will not use it in
2135 component names (it can't be used in function names and file names
2136 anyway).
2137
2138 @end table
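
The definition format described above can be illustrated with a small
parser. The following is a minimal sketch in C, not GNUnet's actual
implementation: the names @code{struct sketch_logdef},
@code{sketch_take_field} and @code{sketch_parse_entry} are hypothetical,
and the sketch only handles a single entry (no slashes). It does not
validate the log level name or compile the regular expressions.

@verbatim
```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed definition entry. */
struct sketch_logdef
{
  char component[64];
  char file[64];
  char function[64];
  int from_line;     /* first matching line; 0 if the field was empty */
  int to_line;       /* last matching line; INT_MAX if the field was empty */
  char loglevel[16];
};

/* Copy the text up to the next ';' into dst. Returns a pointer just past
   the ';', or NULL if there is no ';' or the field does not fit. */
static const char *
sketch_take_field (const char *src, char *dst, size_t dst_size)
{
  const char *semi = strchr (src, ';');
  size_t len;

  if (NULL == semi)
    return NULL;
  len = (size_t) (semi - src);
  if (len >= dst_size)
    return NULL;
  memcpy (dst, src, len);
  dst[len] = '\0';
  return semi + 1;
}

/* Parse one entry "component;file;function;lines;loglevel".
   Returns 0 on success and -1 on a formatting error. */
int
sketch_parse_entry (const char *entry, struct sketch_logdef *def)
{
  char lines[32];
  const char *p = entry;

  p = sketch_take_field (p, def->component, sizeof (def->component));
  if (NULL != p)
    p = sketch_take_field (p, def->file, sizeof (def->file));
  if (NULL != p)
    p = sketch_take_field (p, def->function, sizeof (def->function));
  if (NULL != p)
    p = sketch_take_field (p, lines, sizeof (lines));
  /* The loglevel is the only field that must not be empty. */
  if (NULL == p || '\0' == p[0] || strlen (p) >= sizeof (def->loglevel))
    return -1;
  strcpy (def->loglevel, p);
  def->from_line = 0;
  def->to_line = INT_MAX;
  if ('\0' != lines[0])
  {
    /* Either a single line number or a "from-to" range. */
    int n = sscanf (lines, "%d-%d", &def->from_line, &def->to_line);

    if (n < 1)
      return -1;
    if (1 == n)
      def->to_line = def->from_line;
    if (def->from_line < 0 || def->to_line < def->from_line)
      return -1;
  }
  return 0;
}
```
@end verbatim

Note how an entry with too few semicolons is rejected, which mirrors the
rule that the separators must be present even when fields are empty.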
2139
2140
Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log
definition match the call arguments, then the call log level is compared
to the log level of that definition. If the call log level is less than
or equal to the definition log level, the call is allowed to proceed.
Otherwise the logging call is forbidden, and nothing is logged. If no
definitions matched at all, GNUnet will use the global log level or (if a
global log level is not specified) will default to WARNING (that is, it
will allow the call to proceed if its level is less than or equal to the
global log level or to WARNING).
2151
Definitions are evaluated from left to right, and the first matching
definition is used to allow or deny the logging call. Thus it is advised
to place narrow definitions at the beginning of the logdef string, and
generic definitions at the end.
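
The first-match rule can be sketched as follows. This is an illustration
under assumptions, not GNUnet's code: @code{struct sketch_rule},
@code{sketch_matches} and @code{sketch_allowed} are hypothetical names,
the numeric levels are made up (a lower value means more severe), and only
the component and function fields are modelled. Matching uses the POSIX
@code{regcomp} and @code{regexec} functions with basic (non-extended)
regular expressions.

@verbatim
```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical numeric levels: lower value = more severe. */
enum sketch_level
{
  SKETCH_ERROR = 1,
  SKETCH_WARNING = 2,
  SKETCH_INFO = 3,
  SKETCH_DEBUG = 4
};

struct sketch_rule
{
  const char *component;   /* basic regex; "" means "match anything" */
  const char *function;    /* basic regex; "" means "match anything" */
  enum sketch_level level;
};

/* Return 1 if the pattern matches anywhere in text, 0 otherwise. */
static int
sketch_matches (const char *pattern, const char *text)
{
  regex_t re;
  int hit;

  if ('\0' == pattern[0])
    pattern = ".*";
  if (0 != regcomp (&re, pattern, REG_NOSUB))
    return 0;
  hit = (0 == regexec (&re, text, 0, NULL, 0));
  regfree (&re);
  return hit;
}

/* Decide whether a logging call may proceed. The first rule whose
   fields match decides; otherwise the global level is used. */
int
sketch_allowed (const struct sketch_rule *rules, size_t n_rules,
                const char *component, const char *function,
                enum sketch_level call_level,
                enum sketch_level global_level)
{
  for (size_t i = 0; i < n_rules; i++)
    if (sketch_matches (rules[i].component, component) &&
        sketch_matches (rules[i].function, function))
      return call_level <= rules[i].level;
  return call_level <= global_level;
}
```
@end verbatim

With the rules in the order "narrow first, generic last", a DEBUG call from
a matching function is allowed; with the order reversed, the generic rule
matches first and the same call is denied.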
2156
Whether a call is allowed or not is only decided the first time this
particular call is made. The evaluation result is then cached, so that
any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect the process performance.
Log definition parsing is only done once, at the first call to
@code{GNUNET_log_setup ()} made by the process (which is usually done soon
after it starts).
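
One way to cache the decision per call site is a static variable inside
the logging macro. The following is a sketch of that idea only; the names
@code{SKETCH_LOG}, @code{sketch_evaluate} and @code{sketch_log_repeatedly}
are hypothetical, and the level comparison stands in for the full matching
against log definitions.

@verbatim
```c
#include <stdio.h>

/* Counts how often the (expensive) evaluation ran; illustration only. */
static int sketch_evaluations = 0;

/* Stand-in for matching a call against all log definitions. */
static int
sketch_evaluate (int call_level, int allowed_level)
{
  sketch_evaluations++;
  return call_level <= allowed_level;
}

/* Each expansion of this macro gets its own static cache variable:
   -1 = not evaluated yet, 0 = denied, 1 = allowed. */
#define SKETCH_LOG(call_level, allowed_level, msg)                       \
  do {                                                                   \
    static int sketch_cached = -1;                                       \
    if (-1 == sketch_cached)                                             \
      sketch_cached = sketch_evaluate ((call_level), (allowed_level));   \
    if (1 == sketch_cached)                                              \
      fputs ((msg), stderr);                                             \
  } while (0)

/* Make the same logging call n times and return how many evaluations
   were needed during this invocation. */
int
sketch_log_repeatedly (int n)
{
  sketch_evaluations = 0;
  for (int i = 0; i < n; i++)
    SKETCH_LOG (2, 1, "never printed\n");
  return sketch_evaluations;
}
```
@end verbatim

The first invocation evaluates the call once; later invocations reuse the
cached result and evaluate nothing.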
2165
2166 At the moment of writing there is no way to specify logging definitions
2167 from configuration files, only via environment variables.
2168
2169 At the moment GNUnet will stop processing a log definition when it
2170 encounters an error in definition formatting or an error in regular
2171 expression syntax, and will not report the failure in any way.
2172
2173
2174 @c ***********************************************************************
2175 @menu
2176 * Examples::
2177 * Log files::
2178 * Updated behavior of GNUNET_log::
2179 @end menu
2180
2181 @node Examples
2182 @subsubsection Examples
2183
2184 @table @asis
2185
@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s}
Start the GNUnet process tree, running all processes at DEBUG level (be
careful with this, as log files will grow at an alarming rate!).
@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s}
Start the GNUnet process tree, running the core service at DEBUG level
(everything else will use the configured or default level).
2192
2193 @item Start GNUnet process tree, allowing any logging calls from
2194 gnunet-service-transport_validation.c (everything else will use
2195 configured or default level).
2196
2197 @example
GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" \
2199 gnunet-arm -s
2200 @end example
2201
@item Start GNUnet process tree, allowing any logging calls made by the
fs component from gnunet-service-fs_push.c (everything else will use
configured or default level).
2205
2206 @example
2207 GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2208 @end example
2209
2210 @item Start GNUnet process tree, allowing any logging calls from the
2211 GNUNET_NETWORK_socket_select function (everything else will use
2212 configured or default level).
2213
2214 @example
2215 GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2216 @end example
2217
@item Start GNUnet process tree, allowing any logging calls from the
components that have "transport" in their names, and are made from
functions that have "send" in their names. Everything else will be allowed
to be logged only if it has WARNING level.
2222
2223 @example
2224 GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2225 @end example
2226
2227 @end table
2228
2229
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:

@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
@end example

(Note the absence of double quotes in the environment variable definition,
as opposed to earlier examples, which use the shell.)
Another limitation on Windows is that GNUNET_FORCE_LOGFILE @strong{MUST}
be set in order for GNUNET_FORCE_LOG to work.
2242
2243
2244 @cindex Log files
2245 @node Log files
2246 @subsubsection Log files
2247
GNUnet can be told to log everything into a file instead of stderr (which
is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed on the command line, or via the "OPTION" and
"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
2252 with this option is subject to GNUnet filename expansion. If specified in
2253 "GLOBAL_POSTFIX", it is also subject to ARM service filename expansion,
2254 in particular, it may contain "@{@}" (left and right curly brace)
2255 sequence, which will be replaced by ARM with the name of the service.
2256 This is used to keep logs from more than one service separate, while only
2257 specifying one template containing "@{@}" in GLOBAL_POSTFIX.
2258
As part of a secondary file name expansion, the first occurrence of the
"[]" sequence ("left square brace" followed by "right square brace") in
the file name will be replaced with the process identifier of the process
when it initializes its logging subsystem. As a result, all processes will log
2263 into different files. This is convenient for isolating messages of a
2264 particular process, and prevents I/O races when multiple processes try to
2265 write into the file at the same time. This expansion is done
2266 independently of "@{@}" expansion that ARM service does (see above).
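
The "[]" expansion can be sketched in a few lines of C. This is an
illustration, not the code GNUnet uses: @code{sketch_expand_pid} is a
hypothetical helper that takes the process identifier as a parameter so
that it can be tried out in isolation.

@verbatim
```c
#include <stdio.h>
#include <string.h>

/* Replace the first "[]" in name with pid and store the result in out.
   A name without "[]" is copied unchanged. Returns 0 on success and -1
   if the result does not fit into out. */
int
sketch_expand_pid (const char *name, long pid, char *out, size_t out_size)
{
  const char *mark = strstr (name, "[]");
  int n;

  if (NULL == mark)
    n = snprintf (out, out_size, "%s", name);
  else
    n = snprintf (out, out_size, "%.*s%ld%s",
                  (int) (mark - name), name, pid, mark + 2);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```
@end verbatim

A real caller would pass the result of @code{getpid ()}. With the name
"gnunet-[].log" and the identifier 1234, the sketch produces
"gnunet-1234.log"; only the first "[]" is replaced.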
2267
2268 The log file name that is specified via "-l" can contain format characters
2269 from the 'strftime' function family. For example, "%Y" will be replaced
2270 with the current year. Using "basename-%Y-%m-%d.log" would include the
2271 current year, month and day in the log file. If a GNUnet process runs for
2272 long enough to need more than one log file, it will eventually clean up
2273 old log files. Currently, only the last three log files (plus the current
2274 log file) are preserved. So once the fifth log file goes into use (so
2275 after 4 days if you use "%Y-%m-%d" as above), the first log file will be
2276 automatically deleted. Note that if your log file name only contains "%Y",
2277 then log files would be kept for 4 years and the logs from the first year
2278 would be deleted once year 5 begins. If you do not use any date-related
2279 string format codes, logs would never be automatically deleted by GNUnet.
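
The rotation rule (three old files plus the current one) can be modelled
as a small window of file names. The following sketch only tracks names
and reports which file would be deleted; @code{struct sketch_rotation} and
@code{sketch_rotate} are hypothetical, and how GNUnet actually tracks and
removes files is not shown here.

@verbatim
```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define SKETCH_KEEP 3   /* old log files kept besides the current one */

struct sketch_rotation
{
  char names[SKETCH_KEEP + 1][128];   /* oldest first */
  int count;
};

/* Note that the log file for 'when' is now in use. Returns 1 and copies
   the name of the file to delete into expired when an old file falls out
   of the window, 0 if nothing expires, and -1 if the name is too long. */
int
sketch_rotate (struct sketch_rotation *rot, const char *template_name,
               const struct tm *when, char *expired, size_t expired_size)
{
  char name[128];

  if (0 == strftime (name, sizeof (name), template_name, when))
    return -1;
  if (rot->count > 0 && 0 == strcmp (rot->names[rot->count - 1], name))
    return 0;   /* still writing to the current file */
  if (rot->count <= SKETCH_KEEP)
  {
    strcpy (rot->names[rot->count++], name);
    return 0;
  }
  /* Window is full: the oldest file expires, the others move up. */
  snprintf (expired, expired_size, "%s", rot->names[0]);
  memmove (rot->names[0], rot->names[1],
           sizeof (rot->names[0]) * SKETCH_KEEP);
  strcpy (rot->names[SKETCH_KEEP], name);
  return 1;
}
```
@end verbatim

With a template such as "log-%Y-%m-%d.log", nothing expires during the
first four days; when the fifth file goes into use, the first one is
reported for deletion. A template without date codes never changes its
name, so nothing ever expires.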
2280
2281
2282 @c ***********************************************************************
2283
2284 @node Updated behavior of GNUNET_log
2285 @subsubsection Updated behavior of GNUNET_log
2286
2287 It's currently quite common to see constructions like this all over the
2288 code:
2289
2290 @example
2291 #if MESH_DEBUG
2292 GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2293 #endif
2294 @end example
2295
2296 The reason for the #if is not to avoid displaying the message when
2297 disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2298 compiler including it in the binary at all, when compiling GNUnet for
2299 platforms with restricted storage space / memory (MIPS routers,
2300 ARM plug computers / dev boards, etc).
2301
This presents several problems: the code gets ugly and hard to write, and
it is very easy to forget to include the #if guards, creating inconsistent
code. A new change in GNUNET_log aims to solve these problems.

@strong{This change requires running @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2308
2309 Here is an example of code with dense debug statements:
2310
2311 @example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
2321 @end example
2322
2323
2324 Pretty hard to follow, huh?
2325
From now on, it is not necessary to include the #if / #endif statements to
achieve the same behavior. The @code{GNUNET_log} and @code{GNUNET_log_from}
macros take care of it for you, depending on the configure option:
2330
2331 @itemize @bullet
@item If @code{--enable-logging} is set to @code{no}, the binary will
contain no log messages at all.
@item If @code{--enable-logging} is set to @code{yes}, the binary will
contain no DEBUG messages, and therefore running with @command{-L DEBUG}
will have no effect. Other messages (ERROR, WARNING, INFO, etc) will be
included.
@item If @code{--enable-logging} is set to @code{verbose} or
@code{veryverbose}, the binary will contain DEBUG messages (still, it will
be necessary to run with @command{-L DEBUG} or set the DEBUG config option
to show them).
2343 @end itemize
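
The general technique behind this, a logging macro guarded by a
compile-time constant, can be sketched as follows. This is not the
definition of @code{GNUNET_log}: the macro @code{SKETCH_BUILD_LOG}, the
switch @code{SKETCH_COMPILED_LEVEL} and the level values are assumptions
made for illustration. Because the condition is a constant expression, an
optimizing compiler can drop the disabled calls from the binary.

@verbatim
```c
#include <stdio.h>

/* Hypothetical levels: lower value = more severe. */
#define SKETCH_LEVEL_ERROR 1
#define SKETCH_LEVEL_WARNING 2
#define SKETCH_LEVEL_INFO 3
#define SKETCH_LEVEL_DEBUG 4

/* Hypothetical build-time switch: the most verbose level compiled in.
   A build system could override it, e.g. -DSKETCH_COMPILED_LEVEL=4. */
#ifndef SKETCH_COMPILED_LEVEL
#define SKETCH_COMPILED_LEVEL SKETCH_LEVEL_INFO
#endif

/* Number of messages that were actually emitted; illustration only. */
static int sketch_emitted = 0;

/* No #if / #endif is needed at the call site: the constant comparison
   decides whether the body survives compilation. */
#define SKETCH_BUILD_LOG(level, msg)              \
  do {                                            \
    if ((level) <= SKETCH_COMPILED_LEVEL)         \
    {                                             \
      sketch_emitted++;                           \
      fputs ((msg), stderr);                      \
    }                                             \
  } while (0)

/* Issue one DEBUG and one WARNING call; return how many were emitted. */
int
sketch_build_demo (void)
{
  sketch_emitted = 0;
  SKETCH_BUILD_LOG (SKETCH_LEVEL_DEBUG, "debug detail\n");
  SKETCH_BUILD_LOG (SKETCH_LEVEL_WARNING, "something looks odd\n");
  return sketch_emitted;
}
```
@end verbatim

With the default switch value used in this sketch, only the WARNING call
produces output.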
2344
2345
2346 If you are a developer:
2347 @itemize @bullet
2348 @item please make sure that you @code{./configure
2349 --enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
@item please remove the @code{#if} statements around @code{GNUNET_log
(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
code.
2353 @end itemize
2354
Since activating DEBUG now automatically makes it VERBOSE and activates
@strong{all} debug messages by default, you probably want to use the
logging functionality described at @uref{https://gnunet.org/logging} to
filter only relevant messages. A suitable configuration could be:
2359
2360 @example
2361 $ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2362 @end example
2363
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course you can adapt it to your particular needs; this is only
a quick example.
2367
2368 @cindex Interprocess communication API
@cindex IPC
2370 @node Interprocess communication API (IPC)
2371 @subsection Interprocess communication API (IPC)
2372
In GNUnet a variety of new message types might be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement the message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2381
2382
2383 @c ***********************************************************************
2384 @menu
2385 * Define new message types::
2386 * Define message struct::
2387 * Client - Establish connection::
2388 * Client - Initialize request message::
2389 * Client - Send request and receive response::
2390 * Server - Startup service::
2391 * Server - Add new handles for specified messages::
2392 * Server - Process request message::
2393 * Server - Response to client::
2394 * Server - Notification of clients::
2395 * Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2396 @end menu
2397
2398 @node Define new message types
2399 @subsubsection Define new message types
2400
2401 First of all, you should define the new message type in
2402 @file{gnunet_protocols.h}:
2403
2404 @example
// Request to look up addresses of peers in the server.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
// Response to the address lookup request.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2409 @end example
2410
2411 @c ***********************************************************************
2412 @node Define message struct
2413 @subsubsection Define message struct
2414
After the type definition, the specified message structure should also be
described in the header file, e.g. @file{transport.h} in our case.
2417
2418 @example
GNUNET_NETWORK_STRUCT_BEGIN

struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2427 @end example
2428
2429
2430 Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}
2431 which both ensure correct alignment when sending structs over the network.
2432
2435
2436 @c ***********************************************************************
2437 @node Client - Establish connection
2438 @subsubsection Client - Establish connection
2439 @c %**end of header
2440
2441
First, on the client side, the underlying API is used to create a new
connection to a service; in our example, we connect to the transport
service.
2445
2446 @example
2447 struct GNUNET_CLIENT_Connection *client;
2448 client = GNUNET_CLIENT_connect ("transport", cfg);
2449 @end example
2450
2451 @c ***********************************************************************
2452 @node Client - Initialize request message
2453 @subsubsection Client - Initialize request message
2454 @c %**end of header
2455
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as timeout, address
and name of transport.
2460
2461 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans)
  + 1;
/* allocate memory for the fixed part plus the trailing data */
msg = GNUNET_malloc (len);
/* 'header' is an embedded struct, so its fields are accessed with '.' */
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
/* the variable-size data starts right after the fixed part */
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2476 @end example
2477
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert from host
byte order into network byte order (big endian). For the usage of the
byte orders and the corresponding conversion functions, please refer to
the section on conversion between Network Byte Order and Host Byte Order.
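
To try the construction steps without the GNUnet headers, the following
self-contained sketch uses simplified stand-ins. The types
@code{struct sketch_header} and @code{struct sketch_lookup} and the
function @code{sketch_build_lookup} are hypothetical; the timeout field is
omitted, and plain @code{calloc} replaces GNUnet's allocator.

@verbatim
```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a message header (both fields in network
   byte order). */
struct sketch_header
{
  uint16_t size;
  uint16_t type;
};

/* Simplified stand-in for the lookup message (no timeout field). */
struct sketch_lookup
{
  struct sketch_header header;
  int32_t numeric_only;
  uint32_t addrlen;
  /* followed by addrlen bytes of address, then the 0-terminated name */
};

#define SKETCH_TYPE_LOOKUP 29

/* Build a message with trailing address and transport name. Returns a
   heap-allocated message (free with free ()), or NULL if the message
   would not fit the 16-bit size field or allocation fails. */
struct sketch_lookup *
sketch_build_lookup (const void *address, uint32_t address_len,
                     const char *name)
{
  size_t name_len = strlen (name) + 1;
  size_t len;
  struct sketch_lookup *msg;
  char *tail;

  if (address_len > UINT16_MAX || name_len > UINT16_MAX)
    return NULL;
  len = sizeof (struct sketch_lookup) + address_len + name_len;
  if (len > UINT16_MAX)
    return NULL;
  msg = calloc (1, len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (SKETCH_TYPE_LOOKUP);
  msg->addrlen = htonl (address_len);
  tail = (char *) &msg[1];            /* first byte after the fixed part */
  memcpy (tail, address, address_len);
  memcpy (tail + address_len, name, name_len);
  return msg;
}
```
@end verbatim

The size check matters because the size field of the header is only 16
bits wide, so a message cannot be larger than 65535 bytes in this sketch.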
2483
2484 @c ***********************************************************************
2485 @node Client - Send request and receive response
2486 @subsubsection Client - Send request and receive response
2487 @c %**end of header
2488
2489 @b{FIXME: This is very outdated, see the tutorial for the current API!}
2490
2491 Next, the client would send the constructed message as a request to the
2492 service and wait for the response from the service. To accomplish this
2493 goal, there are a number of API calls that can be used. In this example,
2494 @code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2495 appropriate function to use.
2496
2497 @example
GNUNET_CLIENT_transmit_and_get_response
(client, &msg->header, timeout, GNUNET_YES, &address_response_processor,
arp_ctx);
2501 @end example
2502
The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
2506
2507 @node Server - Startup service
2508 @subsubsection Server - Startup service
2509
On the server side, we run a standard GNUnet service startup sequence
using @code{GNUNET_SERVICE_run}, as follows:
2512
2513 @example
int main (int argc, char **argv) @{
  return (GNUNET_OK ==
          GNUNET_SERVICE_run (argc, argv, "transport",
                              GNUNET_SERVICE_OPTION_NONE,
                              &run, NULL)) ? 0 : 1;
@}
2517 @end example
2518
2519 @c ***********************************************************************
2520 @node Server - Add new handles for specified messages
2521 @subsubsection Server - Add new handles for specified messages
2522 @c %**end of header
2523
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
2526
2527 @example
static void run (void *cls,
                 struct GNUNET_SERVER_Handle *serv,
                 const struct GNUNET_CONFIGURATION_Handle *cfg) @{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2532 @end example
2533
2534
2535 Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2536 function to add new handlers in the service. The parameter
2537 @code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2538 to tell the service which function should be called when a particular
2539 type of message is received, and should be defined in this way:
2540
2541 @example
2542 static struct GNUNET_SERVER_MessageHandler handlers[] = @{
2543   @{&handle_start,
2544    NULL,
2545    GNUNET_MESSAGE_TYPE_TRANSPORT_START,
2546    0@},
2547   @{&handle_send,
2548    NULL,
2549    GNUNET_MESSAGE_TYPE_TRANSPORT_SEND,
2550    0@},
2551   @{&handle_try_connect,
2552    NULL,
2553    GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
2554    sizeof (struct TryConnectMessage)
2555   @},
2556   @{&handle_address_lookup,
2557    NULL,
2558    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP,
2559    0@},
2560   @{NULL,
2561    NULL,
2562    0,
2563    0@}
2564 @};
2565 @end example
2566
2567
As shown, the first member of each entry is a callback function, which is
called to process messages of the type given as the third member. The
second member is the closure for the callback function, which is set to
@code{NULL} in most cases. The last member is the expected size of
messages of this type; usually we set it to 0 to accept variable sizes,
but for special cases the exact size of the message can also be set. In
addition, the terminator entry @code{@{NULL, NULL, 0, 0@}} must be the
last element of the array.
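
How such a table might be consulted when a message arrives can be
sketched as follows. This is an illustration of the table layout described
above, not the dispatch code of the server library;
@code{struct sketch_handler}, @code{sketch_dispatch} and
@code{sketch_count_call} are hypothetical names.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>

typedef void (*sketch_callback) (void *cls, uint16_t type, uint16_t size);

/* Same member order as described above: callback, closure, type, size. */
struct sketch_handler
{
  sketch_callback callback;
  void *callback_cls;
  uint16_t type;
  uint16_t expected_size;   /* 0 = accept any size */
};

/* Walk the table up to the terminator entry (NULL callback).
   Returns 1 if a handler was called, 0 if no handler is registered for
   the type, and -1 if the size differs from the expected size. */
int
sketch_dispatch (const struct sketch_handler *handlers,
                 uint16_t type, uint16_t size)
{
  for (const struct sketch_handler *h = handlers; NULL != h->callback; h++)
  {
    if (h->type != type)
      continue;
    if (0 != h->expected_size && h->expected_size != size)
      return -1;
    h->callback (h->callback_cls, type, size);
    return 1;
  }
  return 0;
}

/* Example callback: counts invocations through its closure. */
void
sketch_count_call (void *cls, uint16_t type, uint16_t size)
{
  (void) type;
  (void) size;
  (*(int *) cls)++;
}
```
@end verbatim

The terminator entry is what lets the loop stop without knowing the length
of the array in advance.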
2576
2577 @c ***********************************************************************
2578 @node Server - Process request message
2579 @subsubsection Server - Process request message
2580 @c %**end of header
2581
After the initialization of the transport service, request messages can
be processed. Before handling the main message data, the validity of the
message should be checked, e.g., whether the size of the message is
correct.
2586
2587 @example
2588 size = ntohs (message->size);
2589 if (size < sizeof (struct AddressLookupMessage)) @{
2590   GNUNET_break_op (0);
2591   GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2593 @end example
2594
2595
Note that, in contrast to the construction of the request message in the
client, the server should use the functions @code{ntohl} and @code{ntohs}
when extracting data from the message, so that data in network byte order
(big endian) is converted back into host byte order. For more detail,
please refer to the section on conversion between Network Byte Order and
Host Byte Order.
2602
2603 Moreover in this example, the name of the transport stored in the message
2604 is a 0-terminated string, so we should also check whether the name of the
2605 transport in the received message is 0-terminated:
2606
2607 @example
2608 nameTransport = (const char *) &address[addressLen];
2609 if (nameTransport[size - sizeof
2610                   (struct AddressLookupMessage)
2611                   - addressLen - 1] != '\0') @{
2612   GNUNET_break_op (0);
2613   GNUNET_SERVER_receive_done (client,
2614                               GNUNET_SYSERR);
  return;
@}
2616 @end example
2617
2618 Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
2619 service that the request is done and can receive the next message. The
2620 argument @code{GNUNET_SYSERR} here indicates that the service didn't
2621 understand the request message, and the processing of this request would
2622 be terminated.
2623
In comparison to the aforementioned situation, when the argument is equal
to @code{GNUNET_OK}, the service would continue to process the request
message.
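
The two checks (message size and 0-termination of the transport name) can
be combined into one self-contained validation routine. The sketch below
uses simplified stand-in types without the timeout field;
@code{struct sketch_rx_header}, @code{struct sketch_rx_lookup} and
@code{sketch_check_lookup} are hypothetical names, and a @code{NULL}
return value stands for the error path shown above.

@verbatim
```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

struct sketch_rx_header
{
  uint16_t size;   /* network byte order */
  uint16_t type;   /* network byte order */
};

struct sketch_rx_lookup
{
  struct sketch_rx_header header;
  int32_t numeric_only;
  uint32_t addrlen;   /* network byte order */
  /* followed by addrlen bytes of address, then the 0-terminated name */
};

/* Validate a buffer of 'received' bytes. Returns a pointer to the
   0-terminated transport name inside buf, or NULL if the message is
   malformed. */
const char *
sketch_check_lookup (const void *buf, size_t received)
{
  struct sketch_rx_lookup fixed;
  size_t size;
  size_t addrlen;
  const char *tail;

  if (received < sizeof (fixed))
    return NULL;
  memcpy (&fixed, buf, sizeof (fixed));   /* avoids unaligned access */
  size = ntohs (fixed.header.size);
  if (size != received)
    return NULL;
  addrlen = ntohl (fixed.addrlen);
  /* At least one byte must remain for the name's terminator. */
  if (addrlen >= size - sizeof (fixed))
    return NULL;
  tail = (const char *) buf + sizeof (fixed);
  if ('\0' != tail[size - sizeof (fixed) - 1])
    return NULL;
  return tail + addrlen;
}
```
@end verbatim

Checking @code{addrlen} against the remaining size before using it keeps
the final index inside the received buffer even for hostile input.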
2627
2628 @c ***********************************************************************
2629 @node Server - Response to client
2630 @subsubsection Server - Response to client
2631 @c %**end of header
2632
Once the processing of the current request is done, the server should
send the response to the client. A new @code{struct AddressLookupMessage}
would be produced by the server in a similar way as the client did and
sent to the client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.

2639 @example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
/* allocate memory for the fixed part plus the trailing data */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2647
2648 // ...
2649
2650 struct GNUNET_SERVER_TransmitContext *tc;
2651 tc = GNUNET_SERVER_transmit_context_create (client);
2652 GNUNET_SERVER_transmit_context_append_data
2653 (tc,
2654  NULL,
2655  0,
2656  GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2657 GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2658 @end example
2659
2660
2661 Note that, there are also a number of other APIs provided to the service
2662 to send the message.
2663
2664 @c ***********************************************************************
2665 @node Server - Notification of clients
2666 @subsubsection Server - Notification of clients
2667 @c %**end of header
2668
2669 Often a service needs to (repeatedly) transmit notifications to a client
2670 or a group of clients. In these cases, the client typically has once
2671 registered for a set of events and then needs to receive a message
2672 whenever such an event happens (until the client disconnects). The use of
2673 a notification context can help manage message queues to clients and
2674 handle disconnects. Notification contexts can be used to send
2675 individualized messages to a particular client or to broadcast messages
2676 to a group of clients. An individualized notification might look like
2677 this:
2678
2679 @example
2680 GNUNET_SERVER_notification_context_unicast(nc,
2681                                            client,
2682                                            msg,
2683                                            GNUNET_YES);
2684 @end example
2685
2686
2687 Note that after processing the original registration message for
2688 notifications, the server code still typically needs to call
2689 @code{GNUNET_SERVER_receive_done} so that the client can transmit further
2690 messages to the server.
2691
2692 @c ***********************************************************************
2693 @node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2694 @subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2695 @c %** subsub? it's a referenced page on the ipc document.
2696 @c %**end of header
2697
Network Byte Order is big endian, while Host Byte Order is whatever byte
order the CPU of the local machine uses, which on many common platforms
(for example x86) is little endian. What is the difference between the
two?

Suppose we store an integer that occupies 4 bytes in RAM. In big endian
(Network Byte Order), the most significant byte is stored at the lowest
address and the least significant byte at the highest address. Little
endian takes the opposite approach: the least significant byte is stored
at the lowest address and the most significant byte at the highest
address. For example, the value 0x0A0B0C0D is laid out as the bytes
0A 0B 0C 0D in big endian and as 0D 0C 0B 0A in little endian.

Hosts that want to communicate with each other must send and receive data
packets through the network, and the two hosts may use different byte
orders internally. In order to preserve the meaning of the data during
transmission, multi-byte values must be converted to Network Byte Order
before sending and back to Host Byte Order after receiving.

There are ten convenient functions for converting the byte order in GNUnet
code:
2719
2720 @table @asis
2721
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a short int from host byte order to network byte order.
@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a long int from host byte order to network byte order.
@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a short int from network byte order to host byte order.
@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long int from host byte order to network byte order.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.
@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
2747 @end table
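
The effect of these conversions can be observed directly by looking at
the bytes in memory. The sketch below uses the standard @code{htonl} and
adds a portable 64-bit conversion written with shifts; the names
@code{sketch_htonll}, @code{sketch_ntohll} and
@code{sketch_first_wire_byte} are hypothetical and are not the GNUnet
functions listed above.

@verbatim
```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Convert a 64-bit value to network byte order by writing the most
   significant byte first. Works on any host byte order. */
uint64_t
sketch_htonll (uint64_t host)
{
  unsigned char bytes[8];
  uint64_t net;

  for (int i = 0; i < 8; i++)
    bytes[i] = (unsigned char) (host >> (56 - 8 * i));
  memcpy (&net, bytes, sizeof (net));
  return net;
}

/* Convert a 64-bit value from network byte order back to host order. */
uint64_t
sketch_ntohll (uint64_t net)
{
  unsigned char bytes[8];
  uint64_t host = 0;

  memcpy (bytes, &net, sizeof (bytes));
  for (int i = 0; i < 8; i++)
    host = (host << 8) | bytes[i];
  return host;
}

/* Return the byte stored at the lowest address after htonl (), i.e. the
   first byte that would go onto the wire. */
unsigned char
sketch_first_wire_byte (uint32_t value)
{
  uint32_t net = htonl (value);
  unsigned char first;

  memcpy (&first, &net, 1);
  return first;
}
```
@end verbatim

For the value 0x0A0B0C0D, the first byte on the wire is 0x0A regardless
of the byte order of the host, which is exactly what Network Byte Order
guarantees.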
2748
2749 @cindex Cryptography API
2750 @node Cryptography API
2751 @subsection Cryptography API
2752 @c %**end of header
2753
The gnunetutil APIs provide the cryptographic primitives used in GNUnet.
GNUnet uses 2048 bit RSA keys for the session key exchange and for signing
messages by peers and most other public-key operations. Most researchers
in cryptography consider 2048 bit RSA keys as secure and practically
unbreakable for a long time. The API provides functions to create a fresh
key pair, read a private key from a file (or create a new file if the
file does not exist), encrypt, decrypt, sign, verify and extract
the public key into a format suitable for network transmission.

For the encryption of files and the actual data exchanged between peers
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and decryption.
2769
2770 GNUnet uses SHA-512 for computing one-way hash codes. The API provides
2771 functions to compute a hash over a block in memory or over a file on disk.
2772
The crypto API also provides functions for randomizing a block of memory,
obtaining a single random number and for generating a permutation of the
numbers 0 to n-1. Random number generation distinguishes between WEAK and
2776 STRONG random number quality; WEAK random numbers are pseudo-random
2777 whereas STRONG random numbers use entropy gathered from the operating
2778 system.
2779
2780 Finally, the crypto API provides a means to deterministically generate a
2781 1024-bit RSA key from a hash code. These functions should most likely not
2782 be used by most applications; most importantly,
2783 GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that
2784 should be considered secure for traditional applications of RSA.
2785
2786 @cindex Message Queue API
2787 @node Message Queue API
2788 @subsection Message Queue API
2789 @c %**end of header
2790
2791 @strong{ Introduction }@
2792 Often, applications need to queue messages that
2793 are to be sent to other GNUnet peers, clients or services. As all of
2794 GNUnet's message-based communication APIs, by design, do not allow
2795 messages to be queued, it is common to implement custom message queues
2796 manually when they are needed. However, writing very similar code in
2797 multiple places is tedious and leads to code duplication.
2798
2799 MQ (for Message Queue) is an API that provides the functionality to
2800 implement and use message queues. We intend to eventually replace all of
2801 the custom message queue implementations in GNUnet with MQ.
2802
2803 @strong{ Basic Concepts }@
2804 The two most important entities in MQ are queues and envelopes.
2805
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of
message handlers. The message queue will then also wait for incoming
messages and dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata
(Where is the envelope queued? What should happen after it has been
sent?). Any envelope can only be queued in one message queue.
2815
2816 @strong{ Creating Queues }@
2817 The following is a list of currently available message queues. Note that
2818 to avoid layering issues, message queues for higher level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself provides the
2820 queue implementation.
2821
2822 @table @asis
2823
2824 @item @code{GNUNET_MQ_queue_for_connection_client}
2825 Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
2826 Also supports receiving with message handlers.
2827
2828 @item @code{GNUNET_MQ_queue_for_server_client}
2829 Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
2830 not support incoming message handlers.
2831
@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does
not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and
receiving messages with one of GNUnet's communication APIs,
implementation callbacks are called. Refer to "Implementing Queues" for
a more detailed explanation.
2840 @end table
2841
2842
2843 @strong{ Allocating Envelopes }@
2844 A GNUnet message (as defined by the GNUNET_MessageHeader) has three
2845 parts: The size, the type, and the body.
2846
2847 MQ provides macros to allocate an envelope containing a message
2848 conveniently, automatically setting the size and type fields of the
2849 message.
2850
2851 Consider the following simple message, with the body consisting of a
2852 single number value.
2855
2856 @example
2857 struct NumberMessage @{
2858   /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
2859   struct GNUNET_MessageHeader header;
2860   uint32_t number GNUNET_PACKED;
2861 @};
2862 @end example
2863
2864 An envelope containing an instance of the NumberMessage can be
2865 constructed like this:
2866
2867 @example
2868 struct GNUNET_MQ_Envelope *ev;
2869 struct NumberMessage *msg;
2870 ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
2871 msg->number = htonl (42);
2872 @end example
2873
2874 In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
2875 the newly allocated envelope. The first argument must be a pointer to some
2876 @code{struct} containing a @code{struct GNUNET_MessageHeader header}
2877 field, while the second argument is the desired message type, in host
2878 byte order.
2879
The @code{msg} pointer now points to an allocated message, where the
message type and the message size are already set. The message's size is
inferred from the type of the @code{msg} pointer: It will be set to
@code{sizeof (*msg)}, properly converted to network byte order.
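
The effect described above can be sketched without MQ. The following is
not the implementation of @code{GNUNET_MQ_msg}; @code{SketchHeader},
@code{SketchNumberMessage} and @code{sketch_msg_alloc} are hypothetical
stand-ins that only show how the size and type fields end up in network
byte order.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct GNUNET_MessageHeader: both fields are kept in
   network byte order. */
struct SketchHeader
{
  uint16_t size;
  uint16_t type;
};

/* Stand-in for the NumberMessage from the text. */
struct SketchNumberMessage
{
  struct SketchHeader header;
  uint32_t number;
};

/* Hypothetical helper: allocate a zeroed message of 'size' bytes and
   fill in the header. 'size' and 'type' are given in host byte order
   and are converted here. Returns NULL on failure. */
void *
sketch_msg_alloc (uint16_t size, uint16_t type)
{
  struct SketchHeader *hdr;

  if (size < sizeof (struct SketchHeader))
    return NULL; /* a message must at least hold its header */
  hdr = calloc (1, size);
  if (NULL == hdr)
    return NULL;
  hdr->size = htons (size);
  hdr->type = htons (type);
  return hdr;
}
```

A caller would write @code{msg = sketch_msg_alloc (sizeof (*msg), type)}
and then only fill in the body, for example
@code{msg->number = htonl (42)}.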
2884
If the message body's size is dynamic, the macro
2886 @code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
2887 message has additional space allocated after the @code{msg} structure.
2888
2889 If no structure has been defined for the message,
2890 @code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
2891 after the message header. The first argument then must be a pointer to a
2892 @code{GNUNET_MessageHeader}.
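
What "additional space after the message header" means for the memory
layout can be sketched as follows. This is an illustration only, not the
code behind @code{GNUNET_MQ_msg_header_extra}; the names
@code{SketchHeader} and @code{sketch_msg_header_extra} are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct GNUNET_MessageHeader (network byte order). */
struct SketchHeader
{
  uint16_t size;
  uint16_t type;
};

/* Hypothetical helper: allocate a header followed by a copy of
   'payload'. Returns NULL if the message would not fit into the
   16-bit size field or if the allocation fails. */
struct SketchHeader *
sketch_msg_header_extra (const void *payload,
                         size_t payload_len,
                         uint16_t type)
{
  struct SketchHeader *hdr;
  size_t total = sizeof (struct SketchHeader) + payload_len;

  if (total > UINT16_MAX)
    return NULL;
  hdr = malloc (total);
  if (NULL == hdr)
    return NULL;
  hdr->size = htons ((uint16_t) total);
  hdr->type = htons (type);
  /* the extra space starts right behind the header */
  if (payload_len > 0)
    memcpy (&hdr[1], payload, payload_len);
  return hdr;
}
```

The expression @code{&hdr[1]} points at the first byte behind the header,
which is where the variable-size body lives.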
2893
2894 @strong{Envelope Properties}@
A few functions in MQ allow setting additional properties on envelopes:
2896
2897 @table @asis
2898
@item @code{GNUNET_MQ_notify_sent}
Specifies a function that will be called once the envelope's message has
been sent irrevocably. An envelope can be canceled precisely up to the
point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
2907
2908 @end table
2909
2910
2911 @strong{Sending Envelopes}@
2912 Once an envelope has been constructed, it can be queued for sending with
2913 @code{GNUNET_MQ_send}.
2914
2915 Note that in order to avoid memory leaks, an envelope must either be sent
2916 (the queue will free it) or destroyed explicitly with
2917 @code{GNUNET_MQ_discard}.
2918
2919 @strong{Canceling Envelopes}@
An envelope queued with @code{GNUNET_MQ_send} can be canceled with
@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
been called, canceling a message results in undefined behavior.
Thus it is unsafe to cancel an envelope that does not have a notify sent
callback. When canceling an envelope, it is not necessary to call
@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
2926
2927 @strong{ Implementing Queues }@
2928 @code{TODO}
2929
2930 @cindex Service API
2931 @node Service API
2932 @subsection Service API
2933 @c %**end of header
2934
Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol. To do so, each service must listen on either a TCP port or a
UNIX domain socket, and the service implementation uses the server
API for this purpose. This use of the server API is exposed directly to
the users of the service API. Thus, when using the service API, one
usually also uses large parts of the server API. The service API provides
various convenience functions, such as parsing command-line arguments and
the configuration file, which are not found in the server API.
The dual to the service/server API is the client API, which can be used to
access services.
2948
2949 The most common way to start a service is to use the
2950 @code{GNUNET_SERVICE_run} function from the program's main function.
2951 @code{GNUNET_SERVICE_run} will then parse the command line and
2952 configuration files and, based on the options found there,
2953 start the server. It will then give back control to the main
2954 program, passing the server and the configuration to the
2955 @code{GNUNET_SERVICE_Main} callback. @code{GNUNET_SERVICE_run}
2956 will also take care of starting the scheduler loop.
2957 If this is inappropriate (for example, because the scheduler loop
2958 is already running), @code{GNUNET_SERVICE_start} and
2959 related functions provide an alternative to @code{GNUNET_SERVICE_run}.
2960
2961 When starting a service, the service_name option is used to determine
2962 which sections in the configuration file should be used to configure the
2963 service. A typical value here is the name of the @file{src/}
2964 sub-directory, for example @file{statistics}.
2965 The same string would also be given to
2966 @code{GNUNET_CLIENT_connect} to access the service.
2967
2968 Once a service has been initialized, the program should use the
2969 @code{GNUNET_SERVICE_Main} callback to register message handlers
2970 using @code{GNUNET_SERVER_add_handlers}.
2971 The service will already have registered a handler for the
2972 "TEST" message.
2973
2974 @fnindex GNUNET_SERVICE_Options
2975 The option bitfield (@code{enum GNUNET_SERVICE_Options})
2976 determines how a service should behave during shutdown.
2977 There are three key strategies:
2978
2979 @table @asis
2980
2981 @item instant (@code{GNUNET_SERVICE_OPTION_NONE})
2982 Upon receiving the shutdown
2983 signal from the scheduler, the service immediately terminates the server,
2984 closing all existing connections with clients.
2985 @item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself
during shutdown. The main program will need to take the appropriate
action by calling @code{GNUNET_SERVER_destroy} or
@code{GNUNET_SERVICE_stop} (depending on how the service was initialized)
to terminate the service. This method is used by gnunet-service-arm and
is rather uncommon.
2991 @item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
2992 Upon receiving the shutdown signal from the scheduler,
2993 the service immediately tells the server to stop
2994 listening for incoming clients. Requests from normal existing clients are
2995 still processed and the server/service terminates once all normal clients
2996 have disconnected. Clients that are not expected to ever disconnect (such
2997 as clients that monitor performance values) can be marked as 'monitor'
2998 clients using GNUNET_SERVER_client_mark_monitor. Those clients will
2999 continue to be processed until all 'normal' clients have disconnected.
3000 Then, the server will terminate, closing the monitor connections.
3001 This mode is for example used by 'statistics', allowing existing 'normal'
3002 clients to set (possibly persistent) statistic values before terminating.
3003
3004 @end table
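
The three strategies can be summarized as a small decision function. The
following is only a model of the behavior described in the table above;
the enum, the struct and @code{sketch_on_shutdown} are hypothetical and do
not correspond to actual GNUnet types or functions.

```c
#include <stdbool.h>

/* Hypothetical model of the three strategies; not GNUnet's enum. */
enum SketchShutdown
{
  SKETCH_INSTANT,
  SKETCH_MANUAL,
  SKETCH_SOFT
};

struct SketchServer
{
  unsigned int normal_clients;  /* clients not marked as 'monitor' */
  unsigned int monitor_clients; /* never expected to disconnect */
  bool listening;               /* still accepting new clients? */
};

/* Called when the shutdown signal arrives, and again whenever a
   client disconnects afterwards. Returns true if the server should
   terminate now (closing all remaining connections). */
bool
sketch_on_shutdown (struct SketchServer *srv, enum SketchShutdown mode)
{
  switch (mode)
  {
  case SKETCH_INSTANT:
    srv->listening = false;
    return true;
  case SKETCH_MANUAL:
    return false; /* the main program must stop the service itself */
  case SKETCH_SOFT:
    srv->listening = false;
    /* monitor clients do not keep the server alive */
    return 0 == srv->normal_clients;
  }
  return false;
}
```

Note that in the soft case the number of monitor clients plays no role in
the decision: only 'normal' clients delay termination.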
3005
3006 @c ***********************************************************************
3007 @node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3008 @subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
3009 @c %**end of header
3010
3011 A commonly used data structure in GNUnet is a (multi-)hash map. It is most
3012 often used to map a peer identity to some data structure, but also to map
3013 arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). As it is commonly used, the hash map is
3015 actually sometimes responsible for a large share of GNUnet's overall
3016 memory consumption (for some processes, 30% is not uncommon). The
3017 following text documents some API quirks (and their implications for
3018 applications) that were recently introduced to minimize the footprint of
3019 the hash map.
3020
3021
3022 @c ***********************************************************************
3023 @menu
3024 * Analysis::
3025 * Solution::
3026 * Migration::
3027 * Conclusion::
3028 * Availability::
3029 @end menu
3030
3031 @node Analysis
3032 @subsubsection Analysis
3033 @c %**end of header
3034
3035 The main reason for the "excessive" memory consumption by the hash map is
3036 that GNUnet uses 512-bit cryptographic hash codes --- and the
3037 (multi-)hash map also uses the same 512-bit 'struct GNUNET_HashCode'. As
3038 a result, storing just the keys requires 64 bytes of memory for each key.
3039 As some applications like to keep a large number of entries in the hash
3040 map (after all, that's what maps are good for), 64 bytes per hash is
3041 significant: keeping a pointer to the value and having a linked list for
collisions consume between 8 and 16 bytes, and @code{malloc} may add about
the same overhead per allocation, putting us in the 16 to 32 byte per entry
ballpark. Adding a 64-byte key then at least triples the overall memory
requirement for the hash map (from 32 to 96 bytes at the upper end of
that range).
3046
3047 To make things "worse", most of the time storing the key in the hash map
3048 is not required: it is typically already in memory elsewhere! In most
3049 cases, the values stored in the hash map are some application-specific
struct that @emph{also} contains the hash. Here is a simplified example:
3051
3052 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};
3056
3057 // ...
3058 val = GNUNET_malloc (sizeof (struct MyValue));
3059 val->key = key;
3060 val->my_data = 42;
3061 GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
3062 @end example
3063
3064 This is a common pattern as later the entries might need to be removed,
3065 and at that time it is convenient to have the key immediately at hand:
3066
3067 @example
3068 GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
3069 @end example
3070
3071
3072 Note that here we end up with two times 64 bytes for the key, plus maybe
3073 64 bytes total for the rest of the 'struct MyValue' and the map entry in
3074 the hash map. The resulting redundant storage of the key increases
3075 overall memory consumption per entry from the "optimal" 128 bytes to 192
3076 bytes. This is not just an extreme example: overheads in practice are
3077 actually sometimes close to those highlighted in this example. This is
3078 especially true for maps with a significant number of entries, as there
3079 we tend to really try to keep the entries small.
3080
3081 @c ***********************************************************************
3082 @node Solution
3083 @subsubsection Solution
3084 @c %**end of header
3085
3086 The solution that has now been implemented is to @strong{optionally}
3087 allow the hash map to not make a (deep) copy of the hash but instead have
3088 a pointer to the hash/key in the entry. This reduces the memory
3089 consumption for the key from 64 bytes to 4 to 8 bytes. However, it can
3090 also only work if the key is actually stored in the entry (which is the
case most of the time) and if the entry does not modify the key (which has
always been the case in all of the code we are aware of when the key is
stored in the entry). Finally, when the client stores an entry in the
3094 hash map, it @strong{must} provide a pointer to the key within the entry,
3095 not just a pointer to a transient location of the key. If
3096 the client code does not meet these requirements, the result is a dangling
3097 pointer and undefined behavior of the (multi-)hash map API.
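
The saving can be made concrete by comparing two possible layouts for a
map entry. The structs below are hypothetical and only approximate what a
hash map entry might contain; they are not the actual definitions used by
the (multi-)hash map.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct GNUNET_HashCode: 512 bits = 64 bytes. */
struct SketchHashCode
{
  uint32_t bits[16];
};

/* Entry that keeps its own (deep) copy of the key. */
struct SketchEntryCopy
{
  struct SketchHashCode key;
  void *value;
  struct SketchEntryCopy *next; /* collision chain */
};

/* Entry that only points at a key stored inside the value. */
struct SketchEntryPtr
{
  const struct SketchHashCode *key;
  void *value;
  struct SketchEntryPtr *next; /* collision chain */
};

/* Bytes saved per entry by storing a pointer instead of a copy. */
size_t
sketch_savings_per_entry (void)
{
  return sizeof (struct SketchEntryCopy)
         - sizeof (struct SketchEntryPtr);
}
```

On typical platforms the difference is 64 bytes minus the size of one
pointer, which matches the reduction "from 64 bytes to 4 to 8 bytes"
given above.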
3098
3099 @c ***********************************************************************
3100 @node Migration
3101 @subsubsection Migration
3102 @c %**end of header
3103
3104 To use the new feature, first check that the values contain the respective
3105 key (and never modify it). Then, all calls to
3106 @code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
3107 audited and most likely changed to pass a pointer into the value's struct.
3108 For the initial example, the new code would look like this:
3109
3110 @example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
3118 GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
3119 @end example
3120
3121
Note that @code{&key} was changed to @code{&val->key} in the argument to
3123 the @code{put} call. This is critical as often @code{key} is on the stack
3124 or in some other transient data structure and thus having the hash map
3125 keep a pointer to @code{key} would not work. Only the key inside of
3126 @code{val} has the same lifetime as the entry in the map (this must of
3127 course be checked as well). Naturally, @code{val->key} must be
initialized before the @code{put} call. Once all @code{put} calls have
3129 been converted and double-checked, you can change the call to create the
3130 hash map from
3131
3132 @example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
3135 @end example
3136
3137 to
3138
3139 @example
3140 map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
3141 @end example
3142
If everything was done correctly, you now use roughly 56 to 60 bytes less
memory per entry in @code{map} (the 64-byte key copy is replaced by a 4 to
8 byte pointer). However, if now (or in the future) any call to
3145 @code{put} does not ensure that the given key is valid until the entry is
3146 removed from the map, undefined behavior is likely to be observed.
3147
3148 @c ***********************************************************************
3149 @node Conclusion
3150 @subsubsection Conclusion
3151 @c %**end of header
3152
The new optimization is often applicable and can result in a
3154 reduction in memory consumption of up to 30% in practice. However, it
3155 makes the code less robust as additional invariants are imposed on the
3156 multi hash map client. Thus applications should refrain from enabling the
3157 new mode unless the resulting performance increase is deemed significant
3158 enough. In particular, it should generally not be used in new code (wait
3159 at least until benchmarks exist).
3160
3161 @c ***********************************************************************
3162 @node Availability
3163 @subsubsection Availability
3164 @c %**end of header
3165
3166 The new multi hash map code was committed in SVN 24319 (will be in GNUnet
3167 0.9.4). Various subsystems (transport, core, dht, file-sharing) were
3168 previously audited and modified to take advantage of the new capability.
3169 In particular, memory consumption of the file-sharing service is expected
3170 to drop by 20-30% due to this change.
3171
3172
3173 @cindex CONTAINER_MDLL API
3174 @node CONTAINER_MDLL API
3175 @subsection CONTAINER_MDLL API
3176 @c %**end of header
3177
3178 This text documents the GNUNET_CONTAINER_MDLL API. The
3179 GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
3180 that it provides operations for the construction and manipulation of
3181 doubly-linked lists. The key difference to the (simpler) DLL-API is that
3182 the MDLL-version allows a single element (instance of a "struct") to be
3183 in multiple linked lists at the same time.
3184
3185 Like the DLL API, the MDLL API stores (most of) the data structures for
3186 the doubly-linked list with the respective elements; only the 'head' and
3187 'tail' pointers are stored "elsewhere" --- and the application needs to
3188 provide the locations of head and tail to each of the calls in the
3189 MDLL API. The key difference for the MDLL API is that the "next" and
3190 "previous" pointers in the struct can no longer be simply called "next"
3191 and "prev" --- after all, the element may be in multiple doubly-linked
3192 lists, so we cannot just have one "next" and one "prev" pointer!
3193
3194 The solution is to have multiple fields that must have a name of the
3195 format "next_XX" and "prev_XX" where "XX" is the name of one of the
3196 doubly-linked lists. Here is a simple example:
3197
3198 @example
3199 struct MyMultiListElement @{
3200   struct MyMultiListElement *next_ALIST;
3201   struct MyMultiListElement *prev_ALIST;
3202   struct MyMultiListElement *next_BLIST;
3203   struct MyMultiListElement *prev_BLIST;
  void *data;
3206 @};
3207 @end example
3208
3209
3210 Note that by convention, we use all-uppercase letters for the list names.
3211 In addition, the program needs to have a location for the head and tail
3212 pointers for both lists, for example:
3213
3214 @example
3215 static struct MyMultiListElement *head_ALIST;
3216 static struct MyMultiListElement *tail_ALIST;
3217 static struct MyMultiListElement *head_BLIST;
3218 static struct MyMultiListElement *tail_BLIST;
3219 @end example
3220
3221
3222 Using the MDLL-macros, we can now insert an element into the ALIST:
3223
3224 @example
3225 GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
3226 @end example
3227
3228
3229 Passing "ALIST" as the first argument to MDLL specifies which of the
3230 next/prev fields in the 'struct MyMultiListElement' should be used. The
3231 extra "ALIST" argument and the "_ALIST" in the names of the
next/prev-members are the only differences between the MDLL and DLL-API.
3233 Like the DLL-API, the MDLL-API offers functions for inserting (at head,
3234 at tail, after a given element) and removing elements from the list.
3235 Iterating over the list should be done by directly accessing the
3236 "next_XX" and/or "prev_XX" members.
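
How the list name selects the right pair of pointers can be seen in a
simplified re-implementation. @code{SKETCH_MDLL_insert} below is a
hypothetical macro written for this text, not the definition of
@code{GNUNET_CONTAINER_MDLL_insert}; it inserts at the head and uses the
preprocessor's token pasting to build the field names. The helper
@code{sketch_count_alist} shows iteration by direct member access.

```c
#include <stddef.h>

struct MyMultiListElement
{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
};

/* Hypothetical sketch: insert 'element' at the head of list 'mdll'.
   'next_ ## mdll' is pasted into e.g. 'next_ALIST'. */
#define SKETCH_MDLL_insert(mdll, head, tail, element) \
  do                                                  \
  {                                                   \
    (element)->next_ ## mdll = (head);                \
    (element)->prev_ ## mdll = NULL;                  \
    if (NULL == (tail))                               \
      (tail) = (element);                             \
    else                                              \
      (head)->prev_ ## mdll = (element);              \
    (head) = (element);                               \
  } while (0)

/* Iterate over ALIST by following the "next_ALIST" member. */
unsigned int
sketch_count_alist (const struct MyMultiListElement *head)
{
  const struct MyMultiListElement *pos;
  unsigned int n = 0;

  for (pos = head; NULL != pos; pos = pos->next_ALIST)
    n++;
  return n;
}
```

Because the ALIST and BLIST pointers are independent, the same element
can be inserted into both lists without the lists affecting each other.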
3237
3238 @cindex Automatic Restart Manager
3239 @cindex ARM
3240 @node Automatic Restart Manager (ARM)
3241 @section Automatic Restart Manager (ARM)
3242 @c %**end of header
3243
GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts services impacted by
the changes as needed. It's also responsible for restarting services in
case of crashes. It is planned to incorporate automatic debugging for
diagnosing service crashes, giving developers insight into the reasons
for a crash. The purpose of this document is to give GNUnet developers an
idea of how ARM works and how to interact with it.
3252
3253 @menu
3254 * Basic functionality::
3255 * Key configuration options::
3256 * ARM - Availability::
3257 * Reliability::
3258 @end menu
3259
3260 @c ***********************************************************************
3261 @node Basic functionality
3262 @subsection Basic functionality
3263 @c %**end of header
3264
3265 @itemize @bullet
@item ARM source code can be found under @file{src/arm}. Service processes
are managed by the functions in @file{gnunet-service-arm.c}, which is
controlled with @file{gnunet-arm.c} (the main function in that file is
ARM's entry point).

@item The functions responsible for communicating with ARM, and for
starting and stopping services (including the ARM service itself), are
provided by the ARM API in @file{arm_api.c}. @code{GNUNET_ARM_connect}
returns to the caller an ARM handle after setting it to the caller's
context (configuration and scheduler in use). This handle can be used
afterwards by the caller to communicate with ARM. The functions
@code{GNUNET_ARM_start_service} and @code{GNUNET_ARM_stop_service} are
used for starting and stopping services, respectively.

@item A typical example of using these basic ARM services can be found in
the file @file{test_arm_api.c}. The test case connects to ARM, starts it,
then uses it to start a service "resolver", stops the "resolver" and then
stops ARM.
3282 @end itemize
3283
3284 @c ***********************************************************************
3285 @node Key configuration options
3286 @subsection Key configuration options
3287 @c %**end of header
3288
Configurations for ARM and services should be available in a .conf file
(as an example, see @file{test_arm_api_data.conf}). When running ARM, the
configuration file to use should be passed to the command:
3292
3293 @example
3294 $ gnunet-arm -s -c configuration_to_use.conf
3295 @end example
3296
If no configuration is passed, the default configuration file will be used
(see @file{GNUNET_PREFIX/share/gnunet/defaults.conf}, which is created from
@file{contrib/defaults.conf}). Each service has a section that starts
with the service name between square brackets, for example: "[arm]".
The following options configure how ARM configures or interacts with the
various services:
3303
3304 @table @asis
3305
@item PORT
Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME
Specifies on which host the service is deployed. Note
that ARM can only start services that are running on the local system
(but will not check that the hostname matches the local machine name).
This option is used by the @code{gnunet_client_lib.h} implementation to
determine which system to connect to. The default is "localhost".

@item BINARY
The name of the service binary file.

@item OPTIONS
Options to be passed to the service.

@item PREFIX
A command to prepend to the actual command, for example to run
a service with "valgrind" or "gdb".

@item DEBUG
Run in debug mode (very verbose).

@item AUTOSTART
ARM will listen on the UNIX domain socket and/or TCP port of
the service and start the service on demand.

@item FORCESTART
ARM will always start this service when the peer
is started.

@item ACCEPT_FROM
IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6
IPv6 addresses the service accepts connections from.
3334
3335 @end table
3336
3337
3338 Options that impact the operation of ARM overall are in the "[arm]"
3339 section. ARM is a normal service and has (except for AUTOSTART) all of the
3340 options that other services do. In addition, ARM has the
3341 following options:
3342
3343 @table @asis
3344
@item GLOBAL_PREFIX
Command to be prepended to all services that are
going to run.

@item GLOBAL_POSTFIX
Global option that will be supplied to all the
services that are going to run.
3350
3351 @end table
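
To show how these options fit together, here is what a configuration
fragment might look like. The service name "example", its binary name
and all values are made up for illustration; consult an existing file
such as @file{test_arm_api_data.conf} for the exact syntax accepted by a
given version.

```ini
# Hypothetical fragment; section name and all values are illustrative.
[arm]
# prepended to the command line of every service ARM starts
GLOBAL_PREFIX = valgrind

[example]
PORT = 2099
HOSTNAME = localhost
BINARY = gnunet-service-example
# let ARM listen on the service's port and start it on demand
AUTOSTART = YES
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
```

With a fragment like this, ARM would listen on the given port on behalf
of the service and only launch the binary once a client connects.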
3352
3353 @c ***********************************************************************
3354 @node ARM - Availability
3355 @subsection ARM - Availability
3356 @c %**end of header
3357
As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, the "client",
that wants to connect to another service, the "server". The "client" will
ask ARM to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" will establish a
connection with the "server", and then they will start to communicate.

One problem with that scheme is that it's slow!
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before serving its request.

One solution would be for ARM to start all services as default services.
That would solve the problem, yet it's not quite practical, as some of
the services started this way might never be used or might only be used
after a relatively long time.

The approach followed by ARM to solve this problem is as follows:
3373
3374 @itemize @bullet
3375
@item For each service that has a PORT field in the configuration file
(that is, a service that accepts incoming connections from clients) and
that is not one of the default services, ARM creates listening sockets
for all addresses associated with that service.

@item The "client" will immediately establish a connection with
the "server".

@item ARM, pretending to be the "server", will listen on the
respective port and notice the incoming connection from the "client",
but will not accept it.

@item Instead, once there is an incoming connection, ARM will start the
"server", passing on the listen sockets (now, the service is started and
can do its work).

@item Other client services can now connect directly to the
"server".
3394
3395 @end itemize
3396
3397 @c ***********************************************************************
3398 @node Reliability
3399 @subsection Reliability
3400
One of the features provided by ARM is the automatic restart of crashed
services. ARM needs to know which of the running services died. The
function @code{maint_child_death} in @file{gnunet-service-arm.c} is
responsible for that. The function is scheduled to run upon receiving a
SIGCHLD signal. It then iterates over ARM's list of running services and
determines which services have died (crashed). ARM restarts every crashed
service.

Now consider the case of a service that has a serious problem causing it
to crash each time it's started by ARM. If ARM keeps blindly restarting
such a service, we are going to have the pattern
start-crash-restart-crash-restart-crash and so forth, which is of course
not practical.

For that reason, ARM schedules the service to be restarted after waiting
for some delay that grows exponentially with each crash/restart of that
service. To clarify the idea, consider the following example:
3416
3417 @itemize @bullet
3418
3419 @item Service S crashed.
3420
3421 @item ARM receives the SIGCHLD and inspects its list of services to find
3422 the dead one(s).
3423
@item ARM finds S dead and schedules it for restarting after the "backoff"
time, which is initially set to 1ms. ARM will double the backoff time
corresponding to S (now backoff(S) = 2ms).

@item Because there is a severe problem with S, it crashes again.

@item Again ARM receives the SIGCHLD and detects that it's S again that
has crashed. ARM schedules it for restarting, but after its new backoff
time (which became 2ms), and doubles its backoff time
(now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour).
After reaching it, backoff(S) will remain at half an hour,
so ARM won't spend a lot of time trying to restart a
problematic service.
3439 @end itemize
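
The doubling with a cap from the example above can be sketched in a few
lines. This is not the code from @file{gnunet-service-arm.c}; the names
and the choice of milliseconds as the unit are assumptions made for this
illustration, with the initial value and the threshold taken from the
text.

```c
#include <stdint.h>

/* Values from the text: 1ms initial backoff, capped at half an hour. */
#define SKETCH_BACKOFF_INITIAL_MS 1ULL
#define SKETCH_BACKOFF_THRESHOLD_MS (30ULL * 60ULL * 1000ULL)

/* Hypothetical helper: given the backoff used for the current
   restart, return the backoff to use after the next crash. */
uint64_t
sketch_next_backoff (uint64_t current_ms)
{
  /* comparing against half the threshold avoids overflow and caps
     the result at the threshold */
  if (current_ms >= SKETCH_BACKOFF_THRESHOLD_MS / 2)
    return SKETCH_BACKOFF_THRESHOLD_MS;
  return current_ms * 2;
}

/* Number of consecutive crashes after which the cap is reached. */
unsigned int
sketch_crashes_until_cap (void)
{
  uint64_t backoff = SKETCH_BACKOFF_INITIAL_MS;
  unsigned int crashes = 0;

  while (backoff < SKETCH_BACKOFF_THRESHOLD_MS)
  {
    backoff = sketch_next_backoff (backoff);
    crashes++;
  }
  return crashes;
}
```

Because the delay doubles each time, a persistently crashing service
reaches the half-hour cap after only a couple dozen restarts, most of
which happen within the first few minutes.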
3440
3441 @cindex TRANSPORT Subsystem
3442 @node TRANSPORT Subsystem
3443 @section TRANSPORT Subsystem
3444 @c %**end of header
3445
3446 This chapter documents how the GNUnet transport subsystem works. The
3447 GNUnet transport subsystem consists of three main components: the
3448 transport API (the interface used by the rest of the system to access the
3449 transport service), the transport service itself (most of the interesting
functions, such as choosing transports, happen here) and the transport
3451 plugins. A transport plugin is a concrete implementation for how two
3452 GNUnet peers communicate; many plugins exist, for example for
3453 communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
3454 transport subsystem uses supporting code, especially the NAT/UPnP
3455 library to help with tasks such as NAT traversal.
3456
3457 Key tasks of the transport service include:
3458
3459 @itemize @bullet
3460
3461 @item Create our HELLO message, notify clients and neighbours if our HELLO
3462 changes (using NAT library as necessary)
3463
3464 @item Validate HELLOs from other peers (send PING), allow other peers to
3465 validate our HELLO's addresses (send PONG)
3466
3467 @item Upon request, establish connections to other peers (using address
3468 selection from ATS subsystem) and maintain them (again using PINGs and
3469 PONGs) as long as desired
3470
3471 @item Accept incoming connections, give ATS service the opportunity to
3472 switch communication channels
3473
3474 @item Notify clients about peers that have connected to us or that have
3475 been disconnected from us
3476
3477 @item If a (stateful) connection goes down unexpectedly (without explicit
3478 DISCONNECT), quickly attempt to recover (without notifying clients) but do
3479 notify clients quickly if reconnecting fails
3480
3481 @item Send (payload) messages arriving from clients to other peers via
3482 transport plugins and receive messages from other peers, forwarding
3483 those to clients
3484
3485 @item Enforce inbound traffic limits (using flow-control if it is
3486 applicable); outbound traffic limits are enforced by CORE, not by us (!)
3487
3488 @item Enforce restrictions on P2P connection as specified by the blacklist
3489 configuration and blacklisting clients
3490 @end itemize
3491
3492 Note that the term "clients" in the list above really refers to the
3493 GNUnet-CORE service, as CORE is typically the only client of the
3494 transport service.
3495
3496 @menu
3497 * Address validation protocol::
3498 @end menu
3499
3500 @node Address validation protocol
3501 @subsection Address validation protocol
3502 @c %**end of header
3503
3504 This section documents how the GNUnet transport service validates
3505 connections with other peers. It is a high-level description of the
3506 protocol necessary to understand the details of the implementation. It
3507 should be noted that when we talk about PING and PONG messages in this
3508 section, we refer to transport-level PING and PONG messages, which are
3509 different from core-level PING and PONG messages (both in implementation
3510 and function).
3511
The goal of transport-level address validation is to minimize the chances
of a successful man-in-the-middle attack against GNUnet peers on the
transport level. Such an attack would not allow the adversary to decrypt
the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities to
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a @code{HELLO} to
Alice that claims to be for Bob, but contains Mallory's IP address
instead of Bob's (for some transport).
Mallory would then forward the traffic to Bob (by initiating a
connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
without traversal support and hence has no address of her own (and thus
Alice must always initiate the connection to Bob).
3526
3527 An additional constraint is that @code{HELLO} messages do not contain a
3528 cryptographic signature since other peers must be able to edit
3529 (i.e. remove) addresses from the @code{HELLO} at any time (this was
3530 not true in GNUnet 0.8.x). A basic @strong{assumption} is that each peer
3531 knows the set of possible network addresses that it @strong{might}
3532 be reachable under (so for example, the external IP address of the
3533 NAT plus the LAN address(es) with the respective ports).
3534
The solution is the following. If Alice wants to validate a given
address for Bob (i.e. check that a connection to it is actually
established @strong{directly} with the intended target), she sends a
@code{PING} message over that connection to Bob. Note that in this case,
Alice initiated the connection, so only Alice knows for sure which
address was used (Alice may be behind NAT, so whatever address Bob sees
may not be an address Alice knows she has).
Bob checks that the address given in the @code{PING} is actually one
of Bob's addresses (i.e. does not belong to Mallory), and if it is,
sends back a @code{PONG} (with a signature that says that Bob
owns/uses the address from the @code{PING}).
Alice checks the signature and is happy if it is valid and the address
in the @code{PONG} is the address Alice used.
This is similar to the 0.8.x protocol where the @code{HELLO} contained a
signature from Bob for each address used by Bob.
Here, the purpose code for the signature is
@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
remember Bob's address and consider the address valid for a while (12h in
the current implementation). Note that after this exchange, Alice only
considers Bob's address to be valid; the connection itself is not
considered 'established'. In particular, Alice may have many addresses
for Bob that Alice considers valid.
3556
3557 @c TODO: reference Footnotes so that I don't have to duplicate the
3558 @c footnotes or add them to an index at the end. Is this possible at
3559 @c all in Texinfo?
3560 The @code{PONG} message is protected with a nonce/challenge against replay
3561 attacks@footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
3562 and uses an expiration time for the signature (but those are almost
3563 implementation details).
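
The checks that Alice performs on a @code{PONG} can be sketched as
follows. This is only an illustration of the logic described above: the
structures, their field names and the function @code{pong_is_acceptable}
are hypothetical and do not reflect the actual wire format or the
transport service code. Signature verification is represented by a flag
that the caller is assumed to have computed beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a PONG after parsing. */
struct pong_view
{
  const char *address;   /* address Bob claims to own */
  uint32_t nonce;        /* challenge echoed from the PING */
  uint64_t expiration;   /* signature expiration (seconds) */
  bool signature_ok;     /* result of verifying Bob's signature */
};

/* Hypothetical record of a PING that Alice has sent. */
struct pending_ping
{
  const char *address;   /* address Alice actually used */
  uint32_t nonce;        /* challenge Alice chose */
};

/* Returns true if Alice may mark the address as validated. */
static bool
pong_is_acceptable (const struct pending_ping *ping,
                    const struct pong_view *pong,
                    uint64_t now)
{
  assert (NULL != ping);
  assert (NULL != pong);
  if (! pong->signature_ok)
    return false;        /* not signed by Bob */
  if (pong->nonce != ping->nonce)
    return false;        /* replayed or unrelated PONG */
  if (pong->expiration <= now)
    return false;        /* signature has expired */
  /* The signed address must be the one Alice used. */
  return 0 == strcmp (ping->address, pong->address);
}
```

A caller would invoke @code{pong_is_acceptable} once per received
@code{PONG} and, on success, remember the address as valid for the
validity period mentioned above.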
3564
3565 @cindex NAT library
3566 @node NAT library
3567 @section NAT library
3568 @c %**end of header
3569
3570 The goal of the GNUnet NAT library is to provide a general-purpose API for
3571 NAT traversal @strong{without} third-party support. So protocols that
3572 involve contacting a third peer to help establish a connection between
3573 two peers are outside of the scope of this API. That does not mean that
3574 GNUnet doesn't support involving a third peer (we can do this with the
3575 distance-vector transport or using application-level protocols), it just
3576 means that the NAT API is not concerned with this possibility. The API is
3577 written so that it will work for IPv6-NAT in the future as well as
3578 current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3579 that are not behind NAT --- in that case, the mapping provided is simply
3580 the identity.
3581
3582 NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3583 set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3584 library will return (via callback) a (possibly longer) list of addresses
3585 the peer @strong{might} be reachable under. Internally, depending on the
3586 configuration, the NAT library will try to punch a hole (using UPnP) or
3587 just "know" that the NAT was manually punched and generate the respective
3588 external IP address (the one that should be globally visible) based on
3589 the given information.
3590
3591 The NAT library also supports ICMP-based NAT traversal. Here, the other
3592 peer can request connection-reversal by this peer (in this special case,
3593 the peer is even allowed to configure a port number of zero). If the NAT
3594 library detects a connection-reversal request, it returns the respective
3595 target address to the client as well. It should be noted that
3596 connection-reversal is currently only intended for TCP, so other plugins
3597 @strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3598 NAT library also supports requesting connection reversal from a remote
3599 peer (@code{GNUNET_NAT_run_client}).
3600
3601 Once initialized, the NAT handle can be used to test if a given address is
3602 possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3603 This is used for validating our addresses when generating PONGs.
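
The idea of expanding the locally bound addresses into a list of
addresses the peer @strong{might} be reachable under, and then testing a
given address against that list, can be sketched as follows. This sketch
does not use the GNUnet NAT API: the type @code{struct endpoint} and the
functions @code{candidate_addresses} and @code{is_candidate} are
hypothetical, and the only mapping modelled is a manually configured
external IPv4 address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPv4 endpoint (host byte order). */
struct endpoint
{
  uint32_t ip;
  uint16_t port;
};

/* Write to `out` every address the peer might be reachable under:
   each bound address itself (the identity mapping) and, if
   `external_ip` is non-zero, the external IP with each bound port.
   Returns the number of entries written (at most `out_len`). */
static size_t
candidate_addresses (const struct endpoint *bound, size_t n_bound,
                     uint32_t external_ip,
                     struct endpoint *out, size_t out_len)
{
  size_t n = 0;

  assert (NULL != bound);
  assert (NULL != out);
  for (size_t i = 0; i < n_bound && n < out_len; i++)
    out[n++] = bound[i];
  if (0 == external_ip)
    return n;            /* not behind NAT: identity only */
  for (size_t i = 0; i < n_bound && n < out_len; i++)
  {
    out[n].ip = external_ip;
    out[n].port = bound[i].port;
    n++;
  }
  return n;
}

/* Returns true if `addr` is one of the `n` candidates in `set`. */
static bool
is_candidate (const struct endpoint *set, size_t n, struct endpoint addr)
{
  for (size_t i = 0; i < n; i++)
    if (set[i].ip == addr.ip && set[i].port == addr.port)
      return true;
  return false;
}
```

In this model, a peer answering a @code{PING} would call
@code{is_candidate} with the address from the @code{PING} and only
generate a @code{PONG} if the result is true.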
3604
3605 Finally, the NAT library contains an API to test if our NAT configuration
3606 is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
3607 the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client, initializes
the NAT traversal and then contacts a @code{gnunet-nat-server} (running by
default on @code{gnunet.org}) and asks for a connection to be established.
3611 This way, it is easy to test if the current NAT configuration is valid.
3612
3613 @node Distance-Vector plugin
3614 @section Distance-Vector plugin
3615 @c %**end of header
3616
3617 The Distance Vector (DV) transport is a transport mechanism that allows
3618 peers to act as relays for each other, thereby connecting peers that would
3619 otherwise be unable to connect. This gives a larger connection set to
3620 applications that may work better with more peers to choose from (for
3621 example, File Sharing and/or DHT).
3622
3623 The Distance Vector transport essentially has two functions. The first is
3624 "gossiping" connection information about more distant peers to directly
3625 connected peers. The second is taking messages intended for non-directly
3626 connected peers and encapsulating them in a DV wrapper that contains the
3627 required information for routing the message through forwarding peers. Via
3628 gossiping, optimal routes through the known DV neighborhood are discovered
3629 and utilized and the message encapsulation provides some benefits in
3630 addition to simply getting the message from the correct source to the
3631 proper destination.
3632
The gossiping function of DV provides an up-to-date routing table of
peers that are available up to some number of hops. We call this a
fisheye view of the network (like a fish, nearby objects are known while
more distant ones are unknown). Gossip messages are sent only to directly
connected peers, but they are sent about other known peers within the
"fisheye distance". Whenever two peers connect, they immediately gossip
to each other about their appropriate other neighbors. They also gossip
about the newly connected peer to previously
connected neighbors. In order to keep the routing tables up to date,
disconnect notifications are propagated as gossip as well (because
disconnects may not be sent/received, timeouts are also used to remove
stale routing table entries).
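
A routing table that is maintained from gossip in this way could look
roughly like the following sketch. It is not taken from the DV plugin:
the peer identifiers are plain integers, and
@code{FISHEYE_DISTANCE}, @code{MAX_ROUTES}, @code{handle_gossip} and
@code{handle_disconnect} are hypothetical names and values chosen for
illustration. Timeouts are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FISHEYE_DISTANCE 3   /* hypothetical maximum hop count */
#define MAX_ROUTES 64        /* hypothetical table capacity */

struct route
{
  uint32_t target;     /* peer reachable via this route */
  uint32_t next_hop;   /* direct neighbour to forward to */
  unsigned int hops;   /* distance from us to the target */
};

struct routing_table
{
  struct route routes[MAX_ROUTES];
  size_t count;
};

static struct route *
find_route (struct routing_table *rt, uint32_t target)
{
  for (size_t i = 0; i < rt->count; i++)
    if (rt->routes[i].target == target)
      return &rt->routes[i];
  return NULL;
}

/* Direct neighbour `from` gossips that `target` is `hops` hops away
   from it.  Returns true if our routing table changed. */
static bool
handle_gossip (struct routing_table *rt, uint32_t from,
               uint32_t target, unsigned int hops)
{
  unsigned int distance = hops + 1;   /* one more hop to reach `from` */
  struct route *r;

  assert (NULL != rt);
  if (distance > FISHEYE_DISTANCE)
    return false;                     /* outside our fisheye view */
  r = find_route (rt, target);
  if (NULL == r)
  {
    if (MAX_ROUTES == rt->count)
      return false;                   /* table full */
    r = &rt->routes[rt->count++];
    r->target = target;
  }
  else if (r->hops <= distance)
  {
    return false;                     /* known route is no worse */
  }
  r->next_hop = from;
  r->hops = distance;
  return true;
}

/* A direct neighbour disconnected: drop every route through it.
   Returns the number of removed routes. */
static size_t
handle_disconnect (struct routing_table *rt, uint32_t neighbour)
{
  size_t kept = 0;
  size_t removed = 0;

  for (size_t i = 0; i < rt->count; i++)
  {
    if (rt->routes[i].next_hop == neighbour)
      removed++;
    else
      rt->routes[kept++] = rt->routes[i];
  }
  rt->count = kept;
  return removed;
}
```

Whenever @code{handle_gossip} returns true, the peer would in turn
gossip the new or improved route to its other direct neighbours.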
3645
3646 Routing of messages via DV is straightforward. When the DV transport is
3647 notified of a message destined for a non-direct neighbor, the appropriate
3648 forwarding peer is selected, and the base message is encapsulated in a DV
3649 message which contains information about the initial peer and the intended
3650 recipient. At each forwarding hop, the initial peer is validated (the
3651 forwarding peer ensures that it has the initial peer in its neighborhood,
3652 otherwise the message is dropped). Next the base message is
3653 re-encapsulated in a new DV message for the next hop in the forwarding
3654 chain (or delivered to the current peer, if it has arrived at the
3655 destination).
3656
Assume a three-peer network with peers Alice, Bob and Carol. Assume that
3658
3659 @example
3660 Alice <-> Bob and Bob <-> Carol
3661 @end example
3662
3663 @noindent
3664 are direct (e.g. over TCP or UDP transports) connections, but that
3665 Alice cannot directly connect to Carol.
This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob for Carol. The message is encapsulated setting Alice as
the initiator and Carol as the destination and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol.
3678 The DV transport at Carol receives this message, unwraps the original
3679 message, and delivers it to Carol as though it came directly from Alice.
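
The decision taken at each hop in this example can be sketched as
follows. The message layout, the integer peer identifiers and the
function @code{dv_handle} are hypothetical and only model the
validate, then deliver or forward logic described above. They are not the
DV plugin's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dv_action
{
  DV_DELIVER,   /* we are the destination: unwrap and deliver */
  DV_FORWARD,   /* re-wrap and send to *next_hop */
  DV_DROP       /* unknown initiator or destination */
};

/* Hypothetical DV wrapper around a base message. */
struct dv_message
{
  uint32_t sender;       /* initial peer */
  uint32_t target;       /* intended recipient */
  const void *payload;   /* encapsulated base message */
  size_t payload_len;
};

/* Hypothetical routing table entry. */
struct dv_route
{
  uint32_t target;
  uint32_t next_hop;
};

static const struct dv_route *
dv_lookup (const struct dv_route *routes, size_t n, uint32_t peer)
{
  for (size_t i = 0; i < n; i++)
    if (routes[i].target == peer)
      return &routes[i];
  return NULL;
}

/* Decide what peer `self` does with `msg`.  On DV_FORWARD the next
   hop is stored in `*next_hop`. */
static enum dv_action
dv_handle (uint32_t self, const struct dv_route *routes, size_t n_routes,
           const struct dv_message *msg, uint32_t *next_hop)
{
  const struct dv_route *r;

  assert (NULL != msg);
  assert (NULL != next_hop);
  /* The initial peer must be in our DV neighbourhood. */
  if (NULL == dv_lookup (routes, n_routes, msg->sender))
    return DV_DROP;
  if (msg->target == self)
    return DV_DELIVER;
  r = dv_lookup (routes, n_routes, msg->target);
  if (NULL == r)
    return DV_DROP;
  *next_hop = r->next_hop;
  return DV_FORWARD;
}
```

With Alice, Bob and Carol numbered 1, 2 and 3, Bob's call returns
@code{DV_FORWARD} with Carol as the next hop, and Carol's call returns
@code{DV_DELIVER}.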
3680
3681 @cindex SMTP plugin
3682 @node SMTP plugin
3683 @section SMTP plugin
3684 @c %**end of header
3685
This section describes the SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branches. SMTP support is currently not
available in GNUnet 0.9.x. This section also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
3692
3693 @itemize @bullet
3694 @item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3696 @item How do I configure my peer?
3697 @item How do I test if it works?
3698 @item How fast is it?
3699 @item Is there any additional documentation?
3700 @end itemize
3701
3702
3703 @menu
3704 * Why use SMTP for a peer-to-peer transport?::
3705 * How does it work?::
3706 * How do I configure my peer?::
3707 * How do I test if it works?::
3708 * How fast is it?::
3709 @end menu
3710
3711 @node Why use SMTP for a peer-to-peer transport?
3712 @subsection Why use SMTP for a peer-to-peer transport?
3713 @c %**end of header
3714
3715 There are many reasons why one would not want to use SMTP:
3716
3717 @itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to
non-participating third parties.
3725 @end itemize
3726
3727 So why would anybody want to use SMTP?
3728 @itemize @bullet
3729 @item SMTP can be used to contact peers behind NAT boxes (in virtual
3730 private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3733 @item SMTP uses E-mail addresses which are independent of a specific IP,
3734 which can be useful to address peers that use dynamic IP addresses.
3735 @item SMTP can be used to initiate a connection (e.g. initial address
3736 exchange) and peers can then negotiate the use of a more efficient
3737 protocol (e.g. TCP) for the actual communication.
3738 @end itemize
3739
3740 In summary, SMTP can for example be used to send a message to a peer
3741 behind a NAT box that has a dynamic IP to tell the peer to establish a
3742 TCP connection to a peer outside of the private network. Even an
3743 extraordinary overhead for this first message would be irrelevant in this
3744 type of situation.
3745
3746 @node How does it work?
3747 @subsection How does it work?
3748 @c %**end of header
3749
3750 When a GNUnet peer needs to send a message to another GNUnet peer that has
3751 advertised (only) an SMTP transport address, GNUnet base64-encodes the
3752 message and sends it in an E-mail to the advertised address. The
advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward
them to the GNUnet peer process. The filter can be specified individually
by each peer and be changed over time. This makes it impossible to censor
GNUnet E-mail messages by searching for a generic filter.
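
The encoding step can be illustrated with a minimal sketch. The base64
encoder below implements the standard alphabet with padding. The
function @code{compose_mail}, its parameters and the exact header layout
are assumptions made for illustration and are not taken from the SMTP
plugin. A real mail body would also need to be wrapped into lines of
bounded length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode `len` bytes.  Returns a malloc'ed, NUL-terminated
   string that the caller must free, or NULL if out of memory. */
static char *
base64_encode (const unsigned char *data, size_t len)
{
  size_t out_len = 4 * ((len + 2) / 3);
  char *out = malloc (out_len + 1);
  size_t i;
  size_t o = 0;

  if (NULL == out)
    return NULL;
  /* Full 3-byte groups become 4 characters each. */
  for (i = 0; i + 2 < len; i += 3)
  {
    unsigned long v = ((unsigned long) data[i] << 16)
                      | ((unsigned long) data[i + 1] << 8)
                      | data[i + 2];
    out[o++] = b64_alphabet[(v >> 18) & 63];
    out[o++] = b64_alphabet[(v >> 12) & 63];
    out[o++] = b64_alphabet[(v >> 6) & 63];
    out[o++] = b64_alphabet[v & 63];
  }
  /* One or two trailing bytes are padded with '='. */
  if (i < len)
  {
    unsigned long v = (unsigned long) data[i] << 16;
    if (i + 1 < len)
      v |= (unsigned long) data[i + 1] << 8;
    out[o++] = b64_alphabet[(v >> 18) & 63];
    out[o++] = b64_alphabet[(v >> 12) & 63];
    out[o++] = (i + 1 < len) ? b64_alphabet[(v >> 6) & 63] : '=';
    out[o++] = '=';
  }
  out[o] = '\0';
  return out;
}

/* Hypothetical helper: build a mail with the receiver's filter line
   in the header and the encoded message as body.  Returns the length
   written, or -1 if the buffer is too small or memory ran out. */
static int
compose_mail (char *buf, size_t buf_len, const char *to,
              const char *filter_header,
              const unsigned char *msg, size_t msg_len)
{
  char *body = base64_encode (msg, msg_len);
  int n;

  assert (NULL != buf);
  if (NULL == body)
    return -1;
  n = snprintf (buf, buf_len, "To: %s\r\n%s\r\n\r\n%s\r\n",
                to, filter_header, body);
  free (body);
  return (n < 0 || (size_t) n >= buf_len) ? -1 : n;
}
```

The receiving side would match on the filter line, strip the header and
decode the body before handing the message to the peer process.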
3758
3759 @node How do I configure my peer?
3760 @subsection How do I configure my peer?
3761 @c %**end of header
3762
3763 First, you need to configure @code{procmail} to filter your inbound E-mail
3764 for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
3765 example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
3766 used by @command{procmail} to detect GNUnet messages. You are free to
3767 choose whichever filter you like, but you should make sure that it does
3768 not occur in your other E-mail. In our example, we will use
3769 @code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
3770 looks like this:
3771
3772 @example
3773 :0:
3774 * ^X-mailer: GNUnet
3775 /tmp/gnunet.smtp
3776 # where do you want your other e-mail delivered to
3777 # (default: /var/spool/mail/)
:0:
/var/spool/mail/
3779 @end example
3780
3781 After adding this file, first make sure that your regular E-mail still
3782 works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
3783 configuration. In the section @code{SMTP} you need to specify your E-mail
3784 address under @code{EMAIL}, your mail server (for outgoing mail) under
@code{SERVER}, the filter (@code{X-mailer: GNUnet} in the example) under
@code{FILTER} and the name of the pipe under @code{PIPE}. The completed
section could then look like this:
3788
3789 @example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
3792 @end example
3793
3794 Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
3795 the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
3796 you specified to contact your peer until the advertisement times out.
3797 Thus, if you are not sure if everything works properly or if you are not
3798 planning to be online for a long time, you may want to configure this
3799 timeout to be short, e.g. just one hour. For this, set
3800 @code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
3801
That should be it, but you will probably want to test it first.
3803
3804 @node How do I test if it works?
3805 @subsection How do I test if it works?
3806 @c %**end of header
3807
3808 Any transport can be subjected to some rudimentary tests using the
3809 @code{gnunet-transport-check} tool. The tool sends a message to the local
3810 node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, this is a great
test case for the SMTP transport since it tests nearly all of
the functionality.
3815
3816 @code{gnunet-transport-check} should only be used without running
3817 @code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
3818 tests all transports that are specified in the configuration file. But
3819 you can specifically test SMTP by giving the option
3820 @code{--transport=smtp}.
3821
3822 Note that this test always checks if a transport can receive and send.
3823 While you can configure most transports to only receive or only send
3824 messages, this test will only work if you have configured the transport
3825 to send and receive messages.
3826
3827 @node How fast is it?
3828 @subsection How fast is it?
3829 @c %**end of header
3830
We have measured the performance of the UDP, TCP and SMTP transport layers
directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
overhead of the protocol, whereas evaluating the transport from the
application puts the overhead into perspective from a practical point of
view.
3837
The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering whereas the sendmail benchmarks rely on
procmail to filter and deliver the mail.

We used the transport layer to
send a message of b bytes (excluding transport protocol headers) directly
to the local machine. This way, network latency and packet loss on the
wire have no impact on the timings. A total of n messages were sent
sequentially over the transport layer, sending message i+1 after the i-th
message was received. All messages were sent over the same connection and
the time to establish the connection was not taken into account since this
overhead is minuscule in practice, as long as a connection is used for a
significant number of messages.
3855
3856 @multitable @columnfractions .20 .15 .15 .15 .15 .15
@headitem Message size @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
@tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
3862 @end multitable
3863
The benchmarks show that UDP and TCP are, as expected, both significantly
faster than any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost. The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection. Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be improved by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
3892
3893 @cindex Bluetooth plugin
3894 @node Bluetooth plugin
3895 @section Bluetooth plugin
3896 @c %**end of header
3897
This section describes the new Bluetooth transport plugin for GNUnet. The
plugin is still in the testing stage, so don't expect it to work
perfectly. If you have any questions or problems, ask on the IRC channel.
3902
3903 @itemize @bullet
3904 @item What do I need to use the Bluetooth plugin transport?
@item How does it work?
3906 @item What possible errors should I be aware of?
3907 @item How do I configure my peer?
3908 @item How can I test it?
3909 @end itemize
3910
3911 @menu
3912 * What do I need to use the Bluetooth plugin transport?::
3913 * How does it work2?::
3914 * What possible errors should I be aware of?::
3915 * How do I configure my peer2?::
3916 * How can I test it?::
3917 * The implementation of the Bluetooth transport plugin::
3918 @end menu
3919
3920 @node What do I need to use the Bluetooth plugin transport?
3921 @subsection What do I need to use the Bluetooth plugin transport?
3922 @c %**end of header
3923
3924 If you are a GNU/Linux user and you want to use the Bluetooth
3925 transport plugin you should install the
3926 @command{BlueZ development libraries} (if they aren't already
3927 installed).
3928 For instructions about how to install the libraries you should
3929 check out the BlueZ site
3930 (@uref{http://www.bluez.org/, http://www.bluez.org}). If you don't know if
you have the necessary libraries, don't worry, just run the GNUnet
3932 configure script and you will be able to see a notification at the end
3933 which will warn you if you don't have the necessary libraries.
3934
If you are a Windows user you should have installed
@emph{MinGW}/@emph{MSys2} with the latest updates (especially the
@emph{ws2bth} header). If this is your first build of GNUnet on Windows
you should check out the SBuild repository. It will semi-automatically
assemble a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
packages which are needed for the GNUnet build, which will ease your
work. Finally, you just have to be sure that you have the correct drivers
for your Bluetooth device installed and that your device is on and in a
discoverable mode. The Windows Bluetooth Stack supports only the RFCOMM
protocol, so we cannot turn on your device programmatically.
3945
3946 @c FIXME: Change to unique title
3947 @node How does it work2?
3948 @subsection How does it work2?
3949 @c %**end of header
3950
3951 The Bluetooth transport plugin uses virtually the same code as the WLAN
3952 plugin and only the helper binary is different. The helper takes a single
3953 argument, which represents the interface name and is specified in the
3954 configuration file. Here are the basic steps that are followed by the
3955 helper binary used on GNU/Linux:
3956
3957 @itemize @bullet
3958 @item it verifies if the name corresponds to a Bluetooth interface name
@item it verifies if the interface is up (if it is not, it tries to bring
it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and to accept incoming connection requests
@emph{The above operations require root access so you should start the
transport plugin with root privileges.}
@item it finds an available port number and registers an SDP service which
other devices will use to find out which port number the server is
listening on, and switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
3971 @end itemize
3972
Once in a while the device will make an inquiry scan to discover
nearby devices and it will randomly send them HELLO messages for peer
discovery.
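
The final forwarding step of the helper amounts to copying bytes
between file descriptors. The following sketch shows one such copy
operation using only POSIX @code{read} and @code{write}. The function
name @code{forward_once} and the buffer size are hypothetical, and the
real helper additionally has to frame messages and handle several
sockets at once.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy one chunk of data from `from_fd` to `to_fd`.
   Returns the number of bytes forwarded, 0 on end of file and
   -1 on error. */
static ssize_t
forward_once (int from_fd, int to_fd)
{
  char buf[4096];
  ssize_t got;
  ssize_t off = 0;

  assert (from_fd >= 0);
  assert (to_fd >= 0);
  do
  {
    got = read (from_fd, buf, sizeof (buf));
  }
  while (got < 0 && EINTR == errno);   /* retry if interrupted */
  if (got <= 0)
    return got;
  /* write() may accept fewer bytes than requested, so loop. */
  while (off < got)
  {
    ssize_t put = write (to_fd, buf + off, (size_t) (got - off));

    if (put < 0)
    {
      if (EINTR == errno)
        continue;
      return -1;
    }
    off += put;
  }
  return got;
}
```

A helper built on this would call @code{forward_once} with the socket
and STDOUT for incoming traffic, and with STDIN and the socket for
outgoing traffic, whenever the respective descriptor becomes readable.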
3976
3977 @node What possible errors should I be aware of?
3978 @subsection What possible errors should I be aware of?
3979 @c %**end of header
3980
@emph{This section is intended for GNU/Linux users.}

Well, there are many ways in which things could go wrong, but I will try to
present some tools that you could use to debug and some scenarios.
3985
3986 @itemize @bullet
3987
@item @code{bluetoothd -n -d}: use this command to enable logging in the
foreground and to print the logging messages

@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments it will print information about the
state of the interfaces. So if you receive an error that the device
couldn't be brought up you should try to bring it up manually to see if
it works (use @code{hciconfig -a hciX up}). If you can't and the
Bluetooth address has the form 00:00:00:00:00:00 it means that there is
something wrong with the D-Bus daemon or with the Bluetooth daemon. Use
the @code{bluetoothd} tool to see the logs.

@item @code{sdptool}: can be used to control and interrogate SDP servers.
If you encounter problems regarding the SDP server (like the SDP server is
down) you should check whether the D-Bus daemon is running correctly and
whether the Bluetooth daemon started correctly (use the @code{bluetoothd}
tool).
Also, sometimes the SDP service could work but somehow the device couldn't
register its service. Use @code{sdptool browse [dev-address]} to see if
the service is registered. There should be a service with the name of the
interface and GNUnet as provider.

@item @code{hcitool}: another useful tool which can be used to configure
the device and to send some particular commands to it.

@item @code{hcidump}: can be used for low-level debugging
4013 @end itemize
4014
4015 @c FIXME: A more unique name
4016 @node How do I configure my peer2?
4017 @subsection How do I configure my peer2?
4018 @c %**end of header
4019
4020 On GNU/Linux, you just have to be sure that the interface name
4021 corresponds to the one that you want to use.
4022 Use the @code{hciconfig} tool to check that.
4023 By default it is set to hci0 but you can change it.
4024
4025 A basic configuration looks like this:
4026
4027 @example
4028 [transport-bluetooth]
4029 # Name of the interface (typically hciX)
4030 INTERFACE = hci0
4031 # Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
4033 @end example
4034
4035 In order to use the Bluetooth transport plugin when the transport service
4036 is started, you must add the plugin name to the default transport service
4037 plugins list. For example:
4038
4039 @example
[transport]
...
PLUGINS = dns bluetooth
...
4041 @end example
4042
If you want to use only the Bluetooth plugin, set
@emph{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use. The only thing that
you should do is to add @emph{bluetooth} to the plugins list of the
transport service.
4049
4050 @node How can I test it?
4051 @subsection How can I test it?
4052 @c %**end of header
4053
4054 If you have two Bluetooth devices on the same machine and you are using
4055 GNU/Linux you must:
4056
4057 @itemize @bullet
4058
@item create two different configuration files (one which will use the
first interface (@emph{hci0}) and the other which will use the second
interface (@emph{hci1})). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.
4066
4067 @item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
4068 start the transport service. (Make sure that you have "bluetooth" on the
4069 transport plugins list if the Bluetooth transport service doesn't start.)
4070
4071 @item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
4072 ID. If you already know your peer ID (you saved it from the first
4073 command), this can be skipped.
4074
4075 @item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
4076 sending data for benchmarking to the other peer.
4077
4078 @end itemize
4079
4080
4081 This scenario will try to connect the second peer to the first one and
4082 then start sending data for benchmarking.
4083
On Windows you cannot test the plugin functionality using two Bluetooth
devices on the same machine because after you install the drivers there
will be conflicts between the Bluetooth stacks. (At least that is
what happened on my machine: I wasn't able to use the Bluesoleil stack and
the WIDCOMM one at the same time.)

If you have two different machines and your configuration files are good
you can use the same scenario presented at the beginning of this section.
4092
4093 Another way to test the plugin functionality is to create your own
4094 application which will use the GNUnet framework with the Bluetooth
4095 transport service.
4096
4097 @node The implementation of the Bluetooth transport plugin
4098 @subsection The implementation of the Bluetooth transport plugin
4099 @c %**end of header
4100
This section describes the implementation of the Bluetooth transport plugin.

First I want to remind you that the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. Also the scope of the helper binary from the Bluetooth
transport plugin is the same as the one used for the WLAN transport
plugin: it accesses the interface and then it forwards traffic in both
directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both GNU/Linux and Windows
platforms.
4113
4114 @itemize @bullet
4115 @item Linux functionality
4116 @item Windows functionality
4117 @item Pending Features
4118 @end itemize
4119
4120
4121
4122 @menu
4123 * Linux functionality::
4124 * THE INITIALIZATION::
4125 * THE LOOP::
4126 * Details about the broadcast implementation::
4127 * Windows functionality::
4128 * Pending features::
4129 @end menu
4130
4131 @node Linux functionality
4132 @subsubsection Linux functionality
4133 @c %**end of header
4134
4135 In order to implement the plugin functionality on GNU/Linux I
4136 used the BlueZ stack.
4137 For the communication with the other devices I used the RFCOMM
4138 protocol. Also I used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and is divided into two stages:
4141
4142 @c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
4143 @c %** starting a new section?
4144 @node THE INITIALIZATION
4145 @subsubsection THE INITIALIZATION
4146
4147 @itemize @bullet
@item first, it checks whether we have root privileges
(@emph{Remember that we need root privileges in order to be able
to bring the interface up if it is down or to change its state.}).
4151
4152 @item second, it verifies if the interface with the given name exists.
4153
4154 @strong{If the interface with that name exists and it is a Bluetooth
4155 interface:}
4156
@item it creates an RFCOMM socket which will be used for listening and calls
the @emph{open_device} method

The @emph{open_device} method:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
4163 @item searches for the device ID using the interface name
4164 @item saves the device MAC address
4165 @item checks if the interface is down and tries to bring it UP
4166 @item checks if the interface is in discoverable mode and tries to make it
4167 discoverable
4168 @item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
4172 @end itemize
4173
@item drops the root privileges
4175
4176 @strong{If the interface is not a Bluetooth interface the helper exits
4177 with a suitable error}
4178 @end itemize
4179
4180 @c %** Same as for @node entry above
4181 @node THE LOOP
4182 @subsubsection THE LOOP
4183
The helper binary uses a list where it saves all the connected neighbour
devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:
4190
4191 @itemize @bullet
@item Whenever it receives something from STDIN, it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When there is something in the buffer, it gets the destination address from
the buffer, looks up the destination address in the list (if there is no
connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Whenever it receives something on the listening socket, it
accepts the connection and saves the socket in a list with the reading
sockets.
@item Whenever it receives something from a reading
socket, it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
4203 @end itemize
4204
In the main loop we use the @code{select} function to wait until one of the
file descriptors saved in one of the two file descriptor sets is
ready for use. The first set (@emph{rfds}) is the reading set and
can contain the reading sockets, the STDIN file
descriptor and the listening socket. The second set (@emph{wfds}) is the
writing set and can contain the sending socket and the STDOUT file
descriptor. After @code{select} returns, we check which file
descriptor is ready and handle that kind
of event. @emph{For example:} if it is the listening socket, then we
accept a new connection and save the socket in the reading list; if it is
the STDOUT file descriptor, then we write the message from the
@emph{write_std} buffer to STDOUT.
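
The way the two descriptor sets might be filled before each call to
@code{select} can be sketched as follows. This is a minimal illustration
and not the helper's actual code: @code{struct loop_state}, its fields,
@code{watch_fd} and @code{prepare_fd_sets} are hypothetical names, and the
rule that STDIN is only watched while @emph{write_pout} is empty is an
assumption made for the example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical snapshot of the helper's state. The field names are
   illustrative and are not taken from the helper's source. */
struct loop_state
{
  int listen_fd;        /* listening RFCOMM socket */
  const int *read_fds;  /* sockets of accepted connections */
  size_t read_count;    /* number of entries in read_fds */
  int send_fd;          /* socket to the current destination, or -1 */
  size_t pout_pending;  /* bytes waiting in write_pout */
  size_t std_pending;   /* bytes waiting in write_std */
};

/* Add fd to set and keep track of the highest descriptor seen. */
static void
watch_fd (int fd, fd_set *set, int *maxfd)
{
  assert (fd >= 0 && fd < FD_SETSIZE);
  FD_SET (fd, set);
  if (fd > *maxfd)
    *maxfd = fd;
}

/* Fill rfds and wfds for one loop iteration. Returns the highest
   descriptor added; the caller passes (result + 1) to select(). */
static int
prepare_fd_sets (const struct loop_state *st, fd_set *rfds, fd_set *wfds)
{
  int maxfd = -1;

  FD_ZERO (rfds);
  FD_ZERO (wfds);
  watch_fd (st->listen_fd, rfds, &maxfd);
  for (size_t i = 0; i < st->read_count; i++)
    watch_fd (st->read_fds[i], rfds, &maxfd);
  /* Assumption: read STDIN only while write_pout is empty. */
  if (0 == st->pout_pending)
    watch_fd (STDIN_FILENO, rfds, &maxfd);
  /* Ask for writability only when there is something to write. */
  if (st->pout_pending > 0 && st->send_fd >= 0)
    watch_fd (st->send_fd, wfds, &maxfd);
  if (st->std_pending > 0)
    watch_fd (STDOUT_FILENO, wfds, &maxfd);
  return maxfd;
}
```
@end verbatim

A caller would pass the returned value plus one as the first argument of
@code{select} and then test each descriptor with @code{FD_ISSET} to decide
which of the events described above has occurred.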
4217
To find out on which port a device is listening, we connect to the local
SDP server and search for the service registered for that device.
4220
@emph{Be aware that if the device fails to connect
to another one when trying to send a message, it will attempt one more
time. If it fails again, it skips the message.}
@emph{Also note that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}
4226
4227 @node Details about the broadcast implementation
4228 @subsubsection Details about the broadcast implementation
4229 @c %**end of header
4230
First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since an inquiry scan
takes a long time, and sending a message to all the discoverable devices
would take even longer, I decided to tackle the problem in a different way.
Here is how I did it:
4236
4237 @itemize @bullet
@item The first time I have to broadcast a message, I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends, I take the first address from the list
and try to connect to it. If that fails, I try to connect to the next one.
If it succeeds, I save the socket to a list and send the message to the
device.
@item When I have to broadcast another message, I first search the list
for a new device which I'm not connected to. If there is no new device on
the list, I go to the beginning of the list and send the message to the
old devices. After 5 cycles I make a new inquiry scan to check whether
there are new discoverable devices and save them to the list. If there
are no new discoverable devices, I reset the cycling counter and go again
through the old list and send messages to the devices saved in it.
4251 @end itemize
4252
4253 @strong{Therefore}:
4254
4255 @itemize @bullet
@item every time I have a broadcast message, I look in the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I'm connected to
all the devices from the list, I make a new inquiry scan.
@emph{The number of list cycles after an inquiry scan can be
increased by redefining @code{MAX_LOOPS}}
@item when there are no new devices, I send messages to the old ones.
4263 @end itemize
4264
This way, the broadcast control messages reach the devices, but with some
delay.
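
The cycling strategy described above can be sketched as a small decision
function. This is an illustration under stated assumptions, not the
helper's code: apart from @code{MAX_LOOPS}, every name
(@code{bcast_state}, @code{bcast_action}, @code{next_broadcast_action}) is
hypothetical, connection failures are left to the caller, and the
handling of an empty device list is an assumption.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOOPS 5  /* list cycles between two inquiry scans */

enum bcast_action
{
  BCAST_INQUIRY_SCAN,  /* discover devices and append them to the list */
  BCAST_CONNECT_NEW,   /* connect to list entry *index, then send */
  BCAST_SEND_OLD,      /* reuse the connection of list entry *index */
  BCAST_IDLE           /* the list is empty, nothing to send to */
};

struct bcast_state
{
  size_t known;    /* addresses collected by inquiry scans */
  size_t tried;    /* entries [0, tried) were already contacted */
  size_t pos;      /* next old entry to reuse */
  unsigned loops;  /* completed passes over the old entries */
  int scanned;     /* nonzero once an inquiry scan has run */
};

/* Decide what to do for the next broadcast message. */
static enum bcast_action
next_broadcast_action (struct bcast_state *s, size_t *index)
{
  assert (NULL != s && NULL != index);
  assert (s->tried <= s->known);
  if (! s->scanned || (s->tried == s->known && s->loops >= MAX_LOOPS))
  {
    s->scanned = 1;
    s->loops = 0;
    s->pos = 0;
    return BCAST_INQUIRY_SCAN;  /* caller raises s->known afterwards */
  }
  if (s->tried < s->known)
  {
    *index = s->tried++;        /* a device we have not contacted yet */
    return BCAST_CONNECT_NEW;
  }
  if (0 == s->known)
  {
    s->loops++;                 /* assumption: count idle rounds too */
    return BCAST_IDLE;
  }
  *index = s->pos++;            /* cycle through the old devices */
  if (s->pos == s->known)
  {
    s->pos = 0;
    s->loops++;
  }
  return BCAST_SEND_OLD;
}
```
@end verbatim

With two known devices, the first call requests an inquiry scan, the next
two calls contact the new devices, and the following ten calls cycle
through the old ones before another scan is requested.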
4267
@emph{NOTICE:} When I have to send a message to a certain device, I first
check the broadcast list to see whether we are connected to that device. If
not, we try to connect to it and, in case of success, save the address and
the socket in the list. If we are already connected to that device, we
simply use the socket.
4273
4274 @node Windows functionality
4275 @subsubsection Windows functionality
4276 @c %**end of header
4277
For Windows I decided to use the Microsoft Bluetooth stack, which has the
advantage of being included by default since Windows XP SP2. The main
disadvantage is that it only supports the RFCOMM protocol, so we do not
have low-level control over the Bluetooth device. Therefore it is the user's
responsibility to check that the device is up and in discoverable mode.
Also, there are no tools which could be used for debugging in order to read
the data coming from and going to a Bluetooth device, which obviously
hindered my work. Another thing that slowed down the implementation of the
plugin (besides the fact that I wasn't very familiar with the win32 API)
was that there were some bugs in MinGW regarding Bluetooth. They are now
fixed, but keep in mind that you should have the latest updates
(especially the @emph{ws2bth} header).
4290
Besides the fact that it uses Windows Sockets, the Windows
implementation follows the same principles as the GNU/Linux one:

@itemize @bullet
@item It has an initialization part where it initializes
Windows Sockets, creates an RFCOMM socket which is bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with SDP:
4299 @itemize @bullet
4300 @item an easy way which works with very simple service records
4301 @item a hard way which is useful when you need to update or to delete the
4302 record
4303 @end itemize
4304 @end itemize
4305
Since I only needed the SDP service to find out on which port the device
is listening, and that did not change, I decided to use the easy way.
In order to register the service I used the @emph{WSASetService} function
and I generated the @emph{Universally Unique Identifier} with the
@emph{guidgen.exe} Windows tool.
4311
4312 In the loop section the only difference from the GNU/Linux implementation
4313 is that I used the @code{GNUNET_NETWORK} library for
4314 functions like @emph{accept}, @emph{bind}, @emph{connect} or
4315 @emph{select}. I decided to use the
4316 @code{GNUNET_NETWORK} library because I also needed to interact
4317 with the STDIN and STDOUT handles and on Windows
4318 the select function is only defined for sockets,
4319 and it will not work for arbitrary file handles.
4320
Another difference between the GNU/Linux and Windows implementations is
that on GNU/Linux the Bluetooth address is represented in 48 bits,
while on Windows it is represented in 64 bits.
Therefore I had to make some changes to the @emph{plugin_transport_wlan}
header.
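
To illustrate the difference in representation, the following sketch
converts between a 6-byte address and a 64-bit integer whose upper 16 bits
are zero. The function names and the byte order (first byte is the most
significant) are assumptions made for this example; they are not taken
from the plugin.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BT_ADDR_LEN 6  /* 48 bits */

/* Pack a 6-byte address into the lower 48 bits of a 64-bit integer.
   Assumption: addr[0] is the most significant byte. */
static uint64_t
bt_addr_to_u64 (const uint8_t addr[BT_ADDR_LEN])
{
  uint64_t value = 0;

  assert (NULL != addr);
  for (int i = 0; i < BT_ADDR_LEN; i++)
    value = (value << 8) | addr[i];
  return value;
}

/* Unpack the lower 48 bits of value into a 6-byte address. */
static void
u64_to_bt_addr (uint64_t value, uint8_t addr[BT_ADDR_LEN])
{
  assert (NULL != addr);
  assert (0 == (value >> 48));  /* the upper 16 bits must be unused */
  for (int i = BT_ADDR_LEN - 1; i >= 0; i--)
  {
    addr[i] = (uint8_t) (value & 0xff);
    value >>= 8;
  }
}
```
@end verbatim

Converting an address to the 64-bit form and back yields the original six
bytes, so either form can be used as the key when looking up a device.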
4325
4326 Also, currently on Windows the Bluetooth plugin doesn't have support for
4327 broadcast messages. When it receives a broadcast message it will skip it.
4328
4329 @node Pending features
4330 @subsubsection Pending features
4331 @c %**end of header
4332
4333 @itemize @bullet
4334 @item Implement the broadcast functionality on Windows @emph{(currently
4335 working on)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
4339 @end itemize
4340
4341 If you have a new idea about a feature of the plugin or suggestions about
4342 how I could improve the implementation you are welcome to comment or to
4343 contact me.
4344
4345 @node WLAN plugin
4346 @section WLAN plugin
4347 @c %**end of header
4348
4349 This section documents how the wlan transport plugin works. Parts which
4350 are not implemented yet or could be better implemented are described at
4351 the end.
4352
4353 @cindex ATS Subsystem
4354 @node ATS Subsystem
4355 @section ATS Subsystem
4356 @c %**end of header
4357
ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4367
The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
does not make the actual allocation decision. This last critical step is
4371 left to an ATS plugin, as we have implemented (currently three) different
4372 allocation strategies which differ significantly in their performance and
4373 maturity, and it is still unclear if any particular plugin is generally
4374 superior.
4375
4376 @cindex CORE Subsystem
4377 @node CORE Subsystem
4378 @section CORE Subsystem
4379 @c %**end of header
4380
4381 The CORE subsystem in GNUnet is responsible for securing link-layer
4382 communications between nodes in the GNUnet overlay network. CORE builds
4383 on the TRANSPORT subsystem which provides for the actual, insecure,
4384 unreliable link-layer communication (for example, via UDP or WLAN), and
4385 then adds fundamental security to the connections:
4386
4387 @itemize @bullet
4388 @item confidentiality with so-called perfect forward secrecy; we use
4389 ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie---Hellman}}
4390 powered by Curve25519
4391 @footnote{@uref{http://cr.yp.to/ecdh.html, Curve25519}} for the key
4392 exchange and then use symmetric encryption, encrypting with both AES-256
4393 @footnote{@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}} and
4394 Twofish @footnote{@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}}
4395 @item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4396 is achieved by signing the ephemeral keys using Ed25519
4397 @footnote{@uref{http://ed25519.cr.yp.to/, Ed25519}}, a deterministic
4398 variant of ECDSA
4399 @footnote{@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA}}
4400 @item integrity protection (using SHA-512
4401 @footnote{@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}} to do
4402 encrypt-then-MAC
4403 @footnote{@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}})
4404 @item Replay
4405 @footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
4406 protection (using nonces, timestamps, challenge-response,
4407 message counters and ephemeral keys)
4408 @item liveness (keep-alive messages, timeout)
4409 @end itemize
4410
4411 @menu
4412 * Limitations::
4413 * When is a peer "connected"?::
4414 * libgnunetcore::
4415 * The CORE Client-Service Protocol::
4416 * The CORE Peer-to-Peer Protocol::
4417 @end menu
4418
4419 @cindex core subsystem limitations
4420 @node Limitations
4421 @subsection Limitations
4422 @c %**end of header
4423
4424 CORE does not perform
4425 @uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4426 only possible to communicate with peers that happen to already be
4427 "directly" connected with each other. CORE also does not have an
4428 API to allow applications to establish such "direct" connections --- for
4429 this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4430 establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4431 trying to keep a few "direct" connections open at all times. Applications
4432 that need to talk to particular peers should use the CADET subsystem, as
4433 it can establish arbitrary "indirect" connections.
4434
4435 Because CORE does not perform routing, CORE must only be used directly by
4436 applications that either perform their own routing logic (such as
4437 anonymous file-sharing) or that do not require routing, for example
4438 because they are based on flooding the network. CORE communication is
4439 unreliable and delivery is possibly out-of-order. Applications that
4440 require reliable communication should use the CADET service. Each
4441 application can only queue one message per target peer with the CORE
4442 service at any time; messages cannot be larger than approximately
4443 63 kilobytes. If messages are small, CORE may group multiple messages
4444 (possibly from different applications) prior to encryption. If permitted
4445 by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4446 option), CORE may delay transmissions to facilitate grouping of multiple
4447 small messages. If cork is not enabled, CORE will transmit the message as
4448 soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4449 bandwidth and congestion control). CORE does not allow flow control;
4450 applications are expected to process messages at line-speed. If flow
4451 control is needed, applications should use the CADET service.
4452
4453 @cindex when is a peer connected
4454 @node When is a peer "connected"?
4455 @subsection When is a peer "connected"?
4456 @c %**end of header
4457
4458 In addition to the security features mentioned above, CORE also provides
4459 one additional key feature to applications using it, and that is a
4460 limited form of protocol-compatibility checking. CORE distinguishes
4461 between TRANSPORT-level connections (which enable communication with other
4462 peers) and application-level connections. Applications using the CORE API
4463 will (typically) learn about application-level connections from CORE, and
4464 not about TRANSPORT-level connections. When a typical application uses
4465 CORE, it will specify a set of message types
4466 (from @code{gnunet_protocols.h}) that it understands. CORE will then
4467 notify the application about connections it has with other peers if and
4468 only if those applications registered an intersecting set of message
4469 types with their CORE service. Thus, it is quite possible that CORE only
4470 exposes a subset of the established direct connections to a particular
4471 application --- and different applications running above CORE might see
4472 different sets of connections at the same time.
4473
4474 A special case are applications that do not register a handler for any
4475 message type.
4476 CORE assumes that these applications merely want to monitor connections
4477 (or "all" messages via other callbacks) and will notify those applications
4478 about all connections. This is used, for example, by the
4479 @code{gnunet-core} command-line tool to display the active connections.
4480 Note that it is also possible that the TRANSPORT service has more active
4481 connections than the CORE service, as the CORE service first has to
4482 perform a key exchange with connecting peers before exchanging information
4483 about supported message types and notifying applications about the new
4484 connection.
4485
4486 @cindex libgnunetcore
4487 @node libgnunetcore
4488 @subsection libgnunetcore
4489 @c %**end of header
4490
4491 The CORE API (defined in @file{gnunet_core_service.h}) is the basic
4492 messaging API used by P2P applications built using GNUnet. It provides
4493 applications the ability to send and receive encrypted messages to the
4494 peer's "directly" connected neighbours.
4495
As CORE connections are generally "direct" connections, applications must
4497 not assume that they can connect to arbitrary peers this way, as "direct"
4498 connections may not always be possible. Applications using CORE are
4499 notified about which peers are connected. Creating new "direct"
4500 connections must be done using the TRANSPORT API.
4501
The CORE API provides unreliable, out-of-order delivery. While the
implementation tries to ensure timely, in-order delivery, neither message
loss nor reordering is detected, and both must be tolerated by the
application. Most importantly, CORE will NOT perform retransmission if
messages could not be delivered.
4507
4508 Note that CORE allows applications to queue one message per connected
4509 peer. The rate at which each connection operates is influenced by the
preferences expressed by local applications as well as restrictions
4511 imposed by the other peer. Local applications can express their
4512 preferences for particular connections using the "performance" API of the
4513 ATS service.
4514
Applications that require more sophisticated transmission capabilities,
such as TCP-like behavior, or that need to send messages to arbitrary
remote peers, should use the CADET API.
4518
4519 The typical use of the CORE API is to connect to the CORE service using
4520 @code{GNUNET_CORE_connect}, process events from the CORE service (such as
4521 peers connecting, peers disconnecting and incoming messages) and send
4522 messages to connected peers using
4523 @code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4524 cancel pending transmission requests if they receive a disconnect event
4525 for a peer that had a transmission pending; furthermore, queueing more
4526 than one transmission request per peer per application using the
4527 service is not permitted.
4528
4529 The CORE API also allows applications to monitor all communications of the
4530 peer prior to encryption (for outgoing messages) or after decryption (for
4531 incoming messages). This can be useful for debugging, diagnostics or to
4532 establish the presence of cover traffic (for anonymity). As monitoring
4533 applications are often not interested in the payload, the monitoring
4534 callbacks can be configured to only provide the message headers (including
4535 the message type and size) instead of copying the full data stream to the
4536 monitoring client.
4537
4538 The init callback of the @code{GNUNET_CORE_connect} function is called
4539 with the hash of the public key of the peer. This public key is used to
4540 identify the peer globally in the GNUnet network. Applications are
4541 encouraged to check that the provided hash matches the hash that they are
4542 using (as theoretically the application may be using a different
4543 configuration file with a different private key, which would result in
4544 hard to find bugs).
4545
4546 As with most service APIs, the CORE API isolates applications from crashes
4547 of the CORE service. If the CORE service crashes, the application will see
4548 disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.
4550
@cindex core client-service protocol
4552 @node The CORE Client-Service Protocol
4553 @subsection The CORE Client-Service Protocol
4554 @c %**end of header
4555
4556 This section describes the protocol between an application using the CORE
4557 service (the client) and the CORE service process itself.
4558
4559
4560 @menu
4561 * Setup2::
4562 * Notifications::
4563 * Sending::
4564 @end menu
4565
4566 @node Setup2
4567 @subsubsection Setup2
4568 @c %**end of header
4569
When a client connects to the CORE service, it first sends an
@code{InitMessage} which specifies options for the connection and a set of
4572 message type values which are supported by the application. The options
4573 bitmask specifies which events the client would like to be notified about.
4574 The options include:
4575
4576 @table @asis
@item @code{GNUNET_CORE_OPTION_NOTHING}
No notifications
@item @code{GNUNET_CORE_OPTION_STATUS_CHANGE}
Peers connecting and disconnecting
@item @code{GNUNET_CORE_OPTION_FULL_INBOUND}
All inbound messages (after decryption) with full payload
@item @code{GNUNET_CORE_OPTION_HDR_INBOUND}
Just the @code{MessageHeader} of all inbound messages
@item @code{GNUNET_CORE_OPTION_FULL_OUTBOUND}
All outbound messages (prior to encryption) with full payload
@item @code{GNUNET_CORE_OPTION_HDR_OUTBOUND}
Just the @code{MessageHeader} of all outbound messages
4587 @end table
4588
4589 Typical applications will only monitor for connection status changes.
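
As an illustration of how such an options bitmask can be interpreted, the
sketch below uses stand-in flags. The @code{DEMO_OPTION_*} names and their
numeric values are hypothetical; the real constants are defined in the
GNUnet headers and may differ. The rule that a full-payload request takes
precedence over a header-only request is likewise an assumption.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flags for illustration only; the values are hypothetical
   and do not claim to match the real GNUNET_CORE_OPTION_* constants. */
enum demo_core_option
{
  DEMO_OPTION_NOTHING = 0,
  DEMO_OPTION_STATUS_CHANGE = 1 << 0,
  DEMO_OPTION_FULL_INBOUND = 1 << 1,
  DEMO_OPTION_HDR_INBOUND = 1 << 2,
  DEMO_OPTION_FULL_OUTBOUND = 1 << 3,
  DEMO_OPTION_HDR_OUTBOUND = 1 << 4
};

/* How much of each inbound message does a monitoring client want?
   Returns 0 for nothing, 1 for headers only, 2 for the full payload.
   Assumption: the full payload wins if both flags are set. */
static int
inbound_monitor_level (uint32_t options)
{
  if (0 != (options & DEMO_OPTION_FULL_INBOUND))
    return 2;
  if (0 != (options & DEMO_OPTION_HDR_INBOUND))
    return 1;
  return 0;
}
```
@end verbatim

A client that only sets the status-change flag gets level 0 here, which
matches the typical application that only monitors connection status
changes.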
4590
4591 The CORE service responds to the @code{InitMessage} with an
4592 @code{InitReplyMessage} which contains the peer's identity. Afterwards,
4593 both CORE and the client can send messages.
4594
4595 @node Notifications
4596 @subsubsection Notifications
4597 @c %**end of header
4598
4599 The CORE will send @code{ConnectNotifyMessage}s and
4600 @code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
4601 the CORE (assuming their type maps overlap with the message types
registered by the client). When the CORE receives a message that matches
the set of message types specified during the @code{InitMessage} (or if
monitoring is enabled for inbound messages in the options), it sends a
4605 @code{NotifyTrafficMessage} with the peer identity of the sender and the
4606 decrypted payload. The same message format (except with
4607 @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
4608 used to notify clients monitoring outbound messages; here, the peer
4609 identity given is that of the receiver.
4610
4611 @node Sending
4612 @subsubsection Sending
4613 @c %**end of header
4614
4615 When a client wants to transmit a message, it first requests a
4616 transmission slot by sending a @code{SendMessageRequest} which specifies
4617 the priority, deadline and size of the message. Note that these values
4618 may be ignored by CORE. When CORE is ready for the message, it answers
4619 with a @code{SendMessageReady} response. The client can then transmit the
4620 payload with a @code{SendMessage} message. Note that the actual message
4621 size in the @code{SendMessage} is allowed to be smaller than the size in
the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which then supersedes the previous
@code{SendMessageRequest}, which is then no longer valid. The client can
4625 tell which @code{SendMessageRequest} the CORE service's
4626 @code{SendMessageReady} message is for as all of these messages contain a
4627 "unique" request ID (based on a counter incremented by the client
4628 for each request).
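
The client-side bookkeeping for this exchange can be sketched as follows.
All names (@code{send_slot}, @code{request_slot}, @code{on_ready}) are
hypothetical, and the 16-bit width of the request ID is an assumption;
the sketch only shows how a counter-based ID lets the client ignore a
@code{SendMessageReady} that answers a superseded request.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-peer state kept by the client. */
struct send_slot
{
  uint16_t next_id;     /* counter incremented for each request */
  uint16_t pending_id;  /* ID of the most recent request */
  size_t requested;     /* size announced in that request */
  int pending;          /* nonzero while the request is unanswered */
};

/* Record a new request; it supersedes any earlier one.
   Returns the ID to place in the request message. */
static uint16_t
request_slot (struct send_slot *s, size_t size)
{
  assert (NULL != s);
  s->pending_id = s->next_id++;
  s->requested = size;
  s->pending = 1;
  return s->pending_id;
}

/* Handle a ready message carrying the given ID. Returns 1 if a payload
   of `actual` bytes may be sent now, 0 if the message is stale or the
   payload is larger than what was requested. */
static int
on_ready (struct send_slot *s, uint16_t id, size_t actual)
{
  assert (NULL != s);
  if (! s->pending || id != s->pending_id)
    return 0;  /* answers a superseded or already used request */
  if (actual > s->requested)
    return 0;  /* the payload may be smaller, but not larger */
  s->pending = 0;
  return 1;
}
```
@end verbatim

If the client issues two requests in a row, a ready message for the first
one is ignored, and only the ready message with the second ID allows the
payload to be sent.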
4629
4630 @cindex CORE Peer-to-Peer Protocol
4631 @node The CORE Peer-to-Peer Protocol
4632 @subsection The CORE Peer-to-Peer Protocol
4633 @c %**end of header
4634
4635
4636 @menu
4637 * Creating the EphemeralKeyMessage::
4638 * Establishing a connection::
4639 * Encryption and Decryption::
4640 * Type maps::
4641 @end menu
4642
4643 @cindex EphemeralKeyMessage creation
4644 @node Creating the EphemeralKeyMessage
4645 @subsubsection Creating the EphemeralKeyMessage
4646 @c %**end of header
4647
When the CORE service starts, each peer creates a fresh ephemeral (ECC)
public-private key pair and signs the corresponding
@code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity; the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
4653 key is ONLY used for an ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie---Hellman}}
4654 exchange by the CORE service to establish symmetric session keys. A peer
4655 will use the same @code{EphemeralKeyMessage} for all peers for
4656 @code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
4657 will create a fresh ephemeral key (forgetting the old one) and broadcast
4658 the new @code{EphemeralKeyMessage} to all connected peers, resulting in
4659 fresh symmetric session keys. Note that peers independently decide on
4660 when to discard ephemeral keys; it is not a protocol violation to discard
4661 keys more often. Ephemeral keys are also never stored to disk; restarting
4662 a peer will thus always create a fresh ephemeral key. The use of ephemeral
4663 keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
4664
4665 Just before transmission, the @code{EphemeralKeyMessage} is patched to
4666 reflect the current sender_status, which specifies the current state of
4667 the connection from the point of view of the sender. The possible values
4668 are:
4669
4670 @itemize @bullet
4671 @item @code{KX_STATE_DOWN} Initial value, never used on the network
4672 @item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
4673 key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} This peer has received a valid
ephemeral key of the other peer, but we are waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response.
4677 @item @code{KX_STATE_UP} The connection is fully up from the point of
4678 view of the sender (now performing keep-alives)
4679 @item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
4680 operation; the other peer has so far failed to confirm a working
4681 connection using the new ephemeral key
4682 @end itemize
4683
4684 @node Establishing a connection
4685 @subsubsection Establishing a connection
4686 @c %**end of header
4687
Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
When a peer receives an @code{EphemeralKeyMessage} with a status
indicating that the sender does not have the receiver's ephemeral key, the
receiver sends its own @code{EphemeralKeyMessage} in response.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge and, if it matches, set the
connection to @code{KX_STATE_UP}.
4701
4702 @node Encryption and Decryption
4703 @subsubsection Encryption and Decryption
4704 @c %**end of header
4705
4706 All functions related to the key exchange and encryption/decryption of
4707 messages can be found in @file{gnunet-service-core_kx.c} (except for the
4708 cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a key derivation
function@footnote{@uref{https://en.wikipedia.org/wiki/Key_derivation_function, Key derivation function}}
is used to derive two pairs of encryption and decryption keys for AES-256
and Twofish, as well as initialization vectors and authentication keys
4713 (for HMAC@footnote{@uref{https://en.wikipedia.org/wiki/HMAC, HMAC}}).
4714 The HMAC is computed over the encrypted payload.
4715 Encrypted messages include an iv_seed and the HMAC in the header.
4716
4717 Each encrypted message in the CORE service includes a sequence number and
4718 a timestamp in the encrypted payload. The CORE service remembers the
4719 largest observed sequence number and a bit-mask which represents which of
4720 the previous 32 sequence numbers were already used.
4721 Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} off (5 minutes) are also discarded. This of
4724 course means that system clocks need to be reasonably synchronized for
4725 peers to be able to communicate. Additionally, as the ephemeral key
4726 changes every 12 hours, a peer would not even be able to decrypt messages
4727 older than 12 hours.
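
The sequence-number part of this check can be sketched as a sliding
window. This is a generic illustration of the technique and not the code
from @file{gnunet-service-core_kx.c}: the names @code{replay_window} and
@code{accept_sequence} are hypothetical, the sketch assumes that sequence
numbers start at 1, and it leaves out the timestamp check.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct replay_window
{
  uint32_t last_seq;  /* largest sequence number accepted so far */
  uint32_t seen;      /* bit i set: (last_seq - 1 - i) was used */
};

/* Returns 1 and records seq if it is fresh; returns 0 if the message
   is a duplicate or too old and must be discarded. */
static int
accept_sequence (struct replay_window *w, uint32_t seq)
{
  assert (NULL != w);
  if (seq == w->last_seq)
    return 0;  /* duplicate of the newest message */
  if (seq > w->last_seq)
  {
    uint32_t shift = seq - w->last_seq;

    /* Slide the window; the old maximum becomes bit (shift - 1). */
    if (shift > 32)
      w->seen = 0;
    else if (32 == shift)
      w->seen = UINT32_C (1) << 31;
    else
      w->seen = (w->seen << shift) | (UINT32_C (1) << (shift - 1));
    w->last_seq = seq;
    return 1;
  }
  uint32_t age = w->last_seq - seq;  /* at least 1 here */
  if (age > 32)
    return 0;  /* older than the window */
  uint32_t bit = UINT32_C (1) << (age - 1);
  if (0 != (w->seen & bit))
    return 0;  /* already used */
  w->seen |= bit;
  return 1;
}
```
@end verbatim

Starting from a zeroed window, messages may arrive out of order within the
last 32 sequence numbers, but each number is accepted at most once.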
4728
4729 @node Type maps
4730 @subsubsection Type maps
4731 @c %**end of header
4732
4733 Once an encrypted connection has been established, peers begin to exchange
4734 type maps. Type maps are used to allow the CORE service to determine which
4735 (encrypted) connections should be shown to which applications. A type map
4736 is an array of 65536 bits representing the different types of messages
4737 understood by applications using the CORE service. Each CORE service
4738 maintains this map, simply by setting the respective bit for each message
4739 type supported by any of the applications using the CORE service. Note
4740 that bits for message types embedded in higher-level protocols (such as
4741 MESH) will not be included in these type maps.
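
A type map of this kind can be sketched as a plain bit array. The names
below (@code{type_map}, @code{type_map_set}, @code{type_map_test},
@code{type_map_overlaps}) are hypothetical and the layout is an assumption;
the sketch only illustrates the idea of one bit per 16-bit message type
and the rule that a client without registered types sees all connections.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_MAP_BITS 65536  /* one bit per 16-bit message type */

struct type_map
{
  uint32_t bits[TYPE_MAP_BITS / 32];
};

/* Mark a message type as supported. */
static void
type_map_set (struct type_map *m, uint16_t type)
{
  assert (NULL != m);
  m->bits[type / 32] |= UINT32_C (1) << (type % 32);
}

/* Is the given message type marked as supported? */
static int
type_map_test (const struct type_map *m, uint16_t type)
{
  assert (NULL != m);
  return 0 != (m->bits[type / 32] & (UINT32_C (1) << (type % 32)));
}

/* Should a client that registered `count` message types be told about
   a peer with the given remote type map? A client without any
   registered type is treated as a monitor and sees every peer. */
static int
type_map_overlaps (const struct type_map *remote,
                   const uint16_t *types, size_t count)
{
  if (0 == count)
    return 1;
  for (size_t i = 0; i < count; i++)
    if (type_map_test (remote, types[i]))
      return 1;
  return 0;
}
```
@end verbatim

The full map occupies 8192 bytes, which is why a sparse map benefits from
compression before it is sent to a neighbour.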
4742
4743 Typically, the type map of a peer will be sparse. Thus, the CORE service
4744 attempts to compress its type map using @code{gzip}-style compression
4745 ("deflate") prior to transmission. However, if the compression fails to
4746 compact the map, the map may also be transmitted without compression
4747 (resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
4748 @code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
4749 Upon receiving a type map, the respective CORE service notifies
4750 applications about the connection to the other peer if they support any
4751 message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
4753 application, it updates its type map (setting or unsetting the respective
4754 bits) and notifies its neighbours about the change.
4755 The CORE services of the neighbours then in turn generate connect and
4756 disconnect events for the peer that sent the type map for their respective
4757 applications. As CORE messages may be lost, the CORE service confirms
4758 receiving a type map by sending back a
4759 @code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
4760 (with the correct hash of the type map) is not received, the sender will
4761 retransmit the type map (with exponential back-off).
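
The choice between the compressed and the binary form can be sketched
with Python's @code{zlib} module. This is an illustration only: the
function names are hypothetical, and the use of SHA-512 for the
confirmation hash is an assumption made here, not something the text
above specifies.

```python
import hashlib
import zlib


def encode_type_map(type_map):
    """Return (kind, payload): deflate if it helps, raw bytes otherwise."""
    compressed = zlib.compress(bytes(type_map), 9)
    if len(compressed) < len(type_map):
        return "compressed", compressed
    return "binary", bytes(type_map)


def decode_type_map(kind, payload):
    return zlib.decompress(payload) if kind == "compressed" else bytes(payload)


def type_map_digest(type_map):
    """Digest a receiver could echo back (hash function is an assumption)."""
    return hashlib.sha512(bytes(type_map)).digest()
```

A sparse map shrinks to a few dozen bytes, while a map that looks like
random data would be sent uncompressed.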
4762
4763 @cindex CADET Subsystem
4764 @node CADET Subsystem
4765 @section CADET Subsystem
4766
4767 The CADET subsystem in GNUnet is responsible for secure end-to-end
4768 communications between nodes in the GNUnet overlay network. CADET builds
4769 on the CORE subsystem which provides for the link-layer communication and
4770 then adds routing, forwarding and additional security to the connections.
4771 CADET offers the same cryptographic services as CORE, but on an
4772 end-to-end level. This is done so peers retransmitting traffic on behalf
4773 of other peers cannot access the payload data.
4774
4775 @itemize @bullet
4776 @item CADET provides confidentiality with so-called perfect forward
4777 secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
4778 use symmetric encryption, encrypting with both AES-256 and Twofish
@item authentication is achieved by signing the ephemeral keys using
Ed25519, an EdDSA signature scheme with deterministic signatures
4781 @item integrity protection (using SHA-512 to do encrypt-then-MAC, although
4782 only 256 bits are sent to reduce overhead)
4783 @item replay protection (using nonces, timestamps, challenge-response,
4784 message counters and ephemeral keys)
4785 @item liveness (keep-alive messages, timeout)
4786 @end itemize
4787
In addition to the CORE-like security benefits, CADET offers other
4789 properties that make it a more universal service than CORE.
4790
4791 @itemize @bullet
4792 @item CADET can establish channels to arbitrary peers in GNUnet. If a
4793 peer is not immediately reachable, CADET will find a path through the
4794 network and ask other peers to retransmit the traffic on its behalf.
4795 @item CADET offers (optional) reliability mechanisms. In a reliable
4796 channel traffic is guaranteed to arrive complete, unchanged and in-order.
4797 @item CADET takes care of flow and congestion control mechanisms, not
4798 allowing the sender to send more traffic than the receiver or the network
is able to process.
4800 @end itemize
4801
4802 @menu
4803 * libgnunetcadet::
4804 @end menu
4805
4806 @cindex libgnunetcadet
4807 @node libgnunetcadet
4808 @subsection libgnunetcadet
4809
4810
4811 The CADET API (defined in @file{gnunet_cadet_service.h}) is the
4812 messaging API used by P2P applications built using GNUnet.
4813 It provides applications the ability to send and receive encrypted
4814 messages to any peer participating in GNUnet.
The API is heavily based on the CORE API.
4816
4817 CADET delivers messages to other peers in "channels".
4818 A channel is a permanent connection defined by a destination peer
4819 (identified by its public key) and a port number.
Internally, CADET tunnels all channels towards a destination peer
4821 using one session key and relays the data on multiple "connections",
4822 independent from the channels.
4823
Each channel has optional parameters, the most important being the
reliability flag.
If a channel is created as reliable and a message gets lost at the
TRANSPORT/CORE level, CADET will retransmit the lost message and
deliver it in order to the destination application.
4829
4830 To communicate with other peers using CADET, it is necessary to first
4831 connect to the service using @code{GNUNET_CADET_connect}.
This function takes several parameters in the form of callbacks, to allow
the client to react to various events, like incoming channels or channels
that terminate, as well as a list of ports the client wishes to listen
to (at the moment it is not possible to start listening on further ports
once connected, but nothing prevents a client from connecting several
times to CADET, even with one connection per listening port).
4838 The function returns a handle which has to be used for any further
4839 interaction with the service.
4840
4841 To connect to a remote peer a client has to call the
4842 @code{GNUNET_CADET_channel_create} function. The most important parameters
given are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network,
establish the proper low-level connections and do the necessary key
exchanges to ensure authenticated, secure and verified communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
returns a handle to interact with the created channel.
4850
4851 For every message the client wants to send to the remote application,
4852 @code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
4853 channel on which the message should be sent and the size of the message
4854 (but not the message itself!). Once CADET is ready to send the message,
4855 the provided callback will fire, and the message contents are provided to
4856 this callback.
4857
Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback may give 0 bytes to CADET
if no message is to be sent.
4866
If a transmission was requested but is no longer needed before the
callback fires, it can be cancelled with
4869 @code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
4870 given back by @code{GNUNET_CADET_notify_transmit_ready}.
4871 As in the case of CORE, only one message can be requested at a time: a
4872 client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
4873 the callback is called or the request is cancelled.
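
The request, callback and cancel cycle described above can be modelled
with a toy class. This is a sketch of the calling pattern only: the
@code{ToyChannel} class and its method @code{service_ready} are
hypothetical, the method names merely echo the C functions named in the
text, and the callback here returns the payload instead of filling a
buffer as a C callback would.

```python
class ToyChannel:
    """Hypothetical stand-in for a channel; not the GNUnet C API."""

    def __init__(self):
        self.pending = None   # at most one (size, callback) request
        self.sent = []        # payloads handed over by callbacks

    def notify_transmit_ready(self, size, callback):
        if self.pending is not None:
            raise RuntimeError("a transmission request is already pending")
        self.pending = (size, callback)
        return self.pending   # handle usable for cancellation

    def notify_transmit_ready_cancel(self, handle):
        if self.pending is handle:
            self.pending = None

    def service_ready(self):
        """Simulate the service becoming ready: fire the pending callback."""
        if self.pending is None:
            return
        size, callback = self.pending
        self.pending = None
        payload = callback(size)
        if len(payload) > size:
            raise ValueError("callback returned more than it announced")
        if payload:
            self.sent.append(payload)
```

Requesting a transmission of size 0 right after creating the channel and
returning an empty payload from the callback models the "is the channel
online yet" trick mentioned above.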
4874
4875 When a channel is no longer needed, a client can call
4876 @code{GNUNET_CADET_channel_destroy} to get rid of it.
4877 Note that CADET will try to transmit all pending traffic before notifying
4878 the remote peer of the destruction of the channel, including
4879 retransmitting lost messages if the channel was reliable.
4880
4881 Incoming channels, channels being closed by the remote peer, and traffic
4882 on any incoming or outgoing channels are given to the client when CADET
4883 executes the callbacks given to it at the time of
4884 @code{GNUNET_CADET_connect}.
4885
4886 Finally, when an application no longer wants to use CADET, it should call
4887 @code{GNUNET_CADET_disconnect}, but first all channels and pending
4888 transmissions must be closed (otherwise CADET will complain).
4889
4890 @cindex NSE Subsystem
4891 @node NSE Subsystem
4892 @section NSE Subsystem
4893
4894
4895 NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
4896 other subsystems and users with a rough estimate of the number of peers
4897 currently participating in the GNUnet overlay.
4898 The computed value is not a precise number as producing a precise number
4899 in a decentralized, efficient and secure way is impossible.
4900 While NSE's estimate is inherently imprecise, NSE also gives the expected
4901 range. For a peer that has been running in a stable network for a
4902 while, the real network size will typically (99.7% of the time) be in the
4903 range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
4904 algorithm used to calculate the estimate;
4905 all of the details can be found in this technical report.
4906
4907 @c FIXME: link to the report.
4908
4909 @menu
4910 * Motivation::
4911 * Principle::
4912 * libgnunetnse::
4913 * The NSE Client-Service Protocol::
4914 * The NSE Peer-to-Peer Protocol::
4915 @end menu
4916
4917 @node Motivation
4918 @subsection Motivation
4919
4920
Some subsystems, like DHT, need to know the size of the GNUnet network to
optimize some parameters of their own protocol. The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of peers
infeasible. Although there are several decentralized algorithms to count
the number of peers in a system, so far there is none to do so securely.
4926 Other protocols may allow any malicious peer to manipulate the final
4927 result or to take advantage of the system to perform
4928 @dfn{Denial of Service} (DoS) attacks against the network.
4929 GNUnet's NSE protocol avoids these drawbacks.
4930
4931
4932
4933 @menu
4934 * Security::
4935 @end menu
4936
4937 @cindex NSE security
4938 @cindex nse security
4939 @node Security
4940 @subsubsection Security
4941
4942
4943 The NSE subsystem is designed to be resilient against these attacks.
4944 It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
4945 to prevent one peer from impersonating a large number of participants,
which would otherwise allow an adversary to artificially inflate the
4947 estimate.
4948 The DoS protection comes from the time-based nature of the protocol:
4949 the estimates are calculated periodically and out-of-time traffic is
4950 either ignored or stored for later retransmission by benign peers.
4951 In particular, peers cannot trigger global network communication at will.
4952
4953 @cindex NSE principle
4954 @cindex nse principle
4955 @node Principle
4956 @subsection Principle
4957
4958
4959 The algorithm calculates the estimate by finding the globally closest
4960 peer ID to a random, time-based value.
4961
4962 The idea is that the closer the ID is to the random value, the more
4963 "densely packed" the ID space is, and therefore, more peers are in the
4964 network.
4965
4966
4967
4968 @menu
4969 * Example::
4970 * Algorithm::
4971 * Target value::
4972 * Timing::
4973 * Controlled Flooding::
4974 * Calculating the estimate::
4975 @end menu
4976
4977 @node Example
4978 @subsubsection Example
4979
4980
4981 Suppose all peers have IDs between 0 and 100 (our ID space), and the
4982 random value is 42.
If the closest peer has the ID 70 we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space. On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky, and there is only one peer
and it happens to have the ID 44. Thus, the current estimate is calculated
as the average over multiple rounds, and not just a single sample.
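
The arithmetic of this toy example can be written down directly. The
sketch below only mirrors the example (ID space divided by the distance
to the target); it is not the formula NSE actually uses, and both
function names are hypothetical.

```python
def naive_size_estimate(id_space, target, closest_id):
    """Guess the peer count from one round's closest ID (toy model)."""
    distance = abs(closest_id - target)
    if distance == 0:
        return float(id_space)   # cap chosen for this sketch
    return id_space / distance


def averaged_estimate(samples):
    """Average several single-round guesses to smooth out unlucky rounds."""
    return sum(samples) / len(samples)
```

With a target of 42, a closest ID of 44 gives a guess of 50 peers, while
a closest ID of 70 gives a guess between 3 and 4.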
4991
4992 @node Algorithm
4993 @subsubsection Algorithm
4994
4995
4996 Given that example, one can imagine that the job of the subsystem is to
4997 efficiently communicate the ID of the closest peer to the target value
4998 to all the other peers, who will calculate the estimate from it.
4999
5000 @node Target value
5001 @subsubsection Target value
5002
5003 @c %**end of header
5004
5005 The target value itself is generated by hashing the current time, rounded
5006 down to an agreed value. If the rounding amount is 1h (default) and the
5007 time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated each rounding amount (in this example, every hour).
5009 Every repetition is called a round.
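
Rounding the time down and hashing it can be sketched as follows. The
hash function (SHA-512) and the 8-byte big-endian encoding of the
timestamp are assumptions made for this illustration, and the function
names are hypothetical.

```python
import hashlib

ROUND_SECONDS = 3600  # default rounding amount of one hour, from the text


def round_start(now_s, period_s=ROUND_SECONDS):
    """Round a timestamp (in seconds) down to the start of its round."""
    return now_s - (now_s % period_s)


def target_value(now_s, period_s=ROUND_SECONDS):
    """Hash the rounded time; peers in the same round get the same value."""
    start = round_start(now_s, period_s)
    return hashlib.sha512(start.to_bytes(8, "big")).digest()
```

Expressed in seconds since midnight, 12:34:56 is 45296, which rounds
down to 43200 (12:00:00), so any two peers whose clocks fall within the
same hour derive the same target.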
5010
5011 @node Timing
5012 @subsubsection Timing
5013 @c %**end of header
5014
The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period,
let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger it will be earlier and if it is
smaller (the most likely case) it will be later. This ensures that the
peers closest to the target value start broadcasting their ID first.
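
One way to obtain a broadcast time with these properties is a smooth,
monotone mapping centred on the middle of the round. The arctangent
shape used below is an assumption made for this sketch (the exact
function is described in the technical report), and the function name
is hypothetical. Both estimates are taken to be on the same logarithmic
scale.

```python
import math


def broadcast_offset(own_estimate, previous_estimate, period_s=3600.0):
    """Seconds into the round at which a peer would start broadcasting."""
    # atan maps any difference into (-pi/2, pi/2), so the offset always
    # stays strictly inside the round.
    shift = math.atan(own_estimate - previous_estimate) / math.pi
    return period_s * (0.5 - shift)
```

Equal estimates yield the middle of the round, a larger hypothetical
estimate yields an earlier time and a smaller one a later time.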
5026
5027 @node Controlled Flooding
5028 @subsubsection Controlled Flooding
5029
5030 @c %**end of header
5031
5032 When a peer receives a value, first it verifies that it is closer than the
5033 closest value it had so far, otherwise it answers the incoming message
5034 with a message containing the better value. Then it checks a proof of
5035 work that must be included in the incoming message, to ensure that the
5036 other peer's ID is not made up (otherwise a malicious peer could claim to
5037 have an ID of exactly the target value every round). Once validated, it
compares the broadcast time of the received value with the current time
5039 and if it's not too early, sends the received value to its neighbors.
5040 Otherwise it stores the value until the correct broadcast time comes.
5041 This prevents unnecessary traffic of sub-optimal values, since a better
5042 value can come before the broadcast time, rendering the previous one
5043 obsolete and saving the traffic that would have been used to broadcast it
5044 to the neighbors.
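
The decision sequence described above can be summarized in a small
function. This is a sketch under stated assumptions: the function name,
its arguments and the action labels are hypothetical, and the proof of
work check is reduced to a boolean that the caller supplies.

```python
def handle_flood(best_distance, incoming_distance, proof_ok,
                 broadcast_time_s, now_s):
    """Return (action, new_best_distance) for one incoming value."""
    if incoming_distance >= best_distance:
        # Not closer than what we already know: answer with our value.
        return "reply_with_better", best_distance
    if not proof_ok:
        # A closer value without a valid proof of work is ignored.
        return "drop", best_distance
    if now_s >= broadcast_time_s:
        return "forward_now", incoming_distance
    # Too early: keep the value, a better one may still arrive.
    return "hold_until_broadcast_time", incoming_distance
```

Holding a value until its broadcast time is what saves traffic, since a
later and better value simply replaces the stored one.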
5045
5046 @node Calculating the estimate
5047 @subsubsection Calculating the estimate
5048
5049 @c %**end of header
5050
Once the closest ID has been spread across the network each peer gets the
exact distance between this ID and the target value of the round and
calculates the estimate with a mathematical formula described in the
technical report. The estimate generated with this method for a single
round is not very precise. Remember the case of the example, where the only
peer has the ID 44 and we happen to generate the target value 42, thinking
there are 50 peers in the network. Therefore, the NSE subsystem remembers
the last 64 estimates and calculates an average over them, giving a result
which usually has one bit of uncertainty (the real size could be half of
the estimate or twice as much). Note that the actual network size is
calculated in powers of two of the raw input, thus one bit of uncertainty
means a factor of two in the size estimate.
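
Keeping the last 64 per-round estimates and averaging them can be
sketched with a bounded queue. The class below is hypothetical; the use
of a plain unweighted mean and of the population standard deviation are
assumptions made for this illustration.

```python
from collections import deque
import statistics

HISTORY_ROUNDS = 64  # number of remembered estimates, from the text


class EstimateHistory:
    """Rolling window over per-round estimates on a log2 scale."""

    def __init__(self):
        self.rounds = deque(maxlen=HISTORY_ROUNDS)

    def add_round(self, log2_estimate):
        self.rounds.append(log2_estimate)  # the oldest round drops out

    def summary(self):
        """Return (average, standard deviation) of the stored rounds."""
        values = list(self.rounds)
        return statistics.fmean(values), statistics.pstdev(values)
```

Because the values are logarithmic, a standard deviation of 1 in this
summary corresponds to a factor of two in the network size.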
5063
5064 @cindex libgnunetnse
5065 @node libgnunetnse
5066 @subsection libgnunetnse
5067
5068 @c %**end of header
5069
5070 The NSE subsystem has the simplest API of all services, with only two
5071 calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
5072
5073 The connect call gets a callback function as a parameter and this function
5074 is called each time the network agrees on an estimate. This usually is
5075 once per round, with some exceptions: if the closest peer has a late
local clock and starts spreading its ID after everyone else agreed on a
5077 value, the callback might be activated twice in a round, the second value
5078 being always bigger than the first. The default round time is set to
5079 1 hour.
5080
5081 The disconnect call disconnects from the NSE subsystem and the callback
5082 is no longer called with new estimates.
5083
5084
5085
5086 @menu
5087 * Results::
5088 * libgnunetnse - Examples::
5089 @end menu
5090
5091 @node Results
5092 @subsubsection Results
5093
5094 @c %**end of header
5095
The callback provides two values: the average and the
@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
of the last 64 rounds. The values provided by the callback function are
logarithmic: the real estimate can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:
5102
5103 @itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
5110 @end itemize
5111
The expected standard deviation for 64 rounds in a network of stable size
5113 is 0.2. Thus, we can say that normally:
5114
5115 @itemize @bullet
5116 @item 68% of the time the real size is in the range [-13%, +15%]
5117 @item 95% of the time the real size is in the range [-24%, +32%]
5118 @item 99.7% of the time the real size is in the range [-34%, +52%]
5119 @end itemize
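
The percentages above follow directly from the interval formulas. As a
small check, the sketch below converts a logarithmic average and
standard deviation into interval bounds; both function names are
hypothetical.

```python
def size_interval(average, stddev, sigmas):
    """Bounds on the real size for a log2 average and standard deviation."""
    low = 2.0 ** (average - sigmas * stddev)
    high = 2.0 ** (average + sigmas * stddev)
    return low, high


def relative_range_percent(stddev, sigmas):
    """Interval bounds relative to the point estimate, in whole percent."""
    low, high = size_interval(0.0, stddev, sigmas)
    return round((low - 1.0) * 100), round((high - 1.0) * 100)
```

With a standard deviation of 0.2, one, two and three sigmas give
(-13, 15), (-24, 32) and (-34, 52) percent, matching the list above.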
5120
As said in the introduction, we can be quite sure that usually the real
size is between two thirds and three halves of the estimate. This can of
5123 course vary with network conditions.
5124 Thus, applications may want to also consider the provided standard
5125 deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size is
5127 changing rapidly).
5128
5129 @node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples
5131
5132 @c %**end of header
5133
Let's close with a couple of examples.
5135
5136 @table @asis
5137
@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. @footnote{The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.}

@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 million peers. @footnote{The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.4M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.}
5148
5149 @end table
5150
To put this in perspective, the LHC Higgs boson results were announced
with "5 sigma" and "6 sigma" certainties. In this
case a 5 sigma minimum would be 2 million and a 6 sigma minimum,
1.8 million.
5155
5156 @node The NSE Client-Service Protocol
5157 @subsection The NSE Client-Service Protocol
5158
5159 @c %**end of header
5160
As with the API, the client-service protocol is very simple: it has only
two different messages, defined in @file{src/nse/nse.h}:
5163
5164 @itemize @bullet
5165 @item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
5166 and is sent from the client to the service upon connection.
5167 @item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
5168 the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
5170 deviation for the respective round.
5171 @end itemize
5172
5173 When the @code{GNUNET_NSE_disconnect} API call is executed, the client
5174 simply disconnects from the service, with no message involved.
5175
5176 @cindex NSE Peer-to-Peer Protocol
5177 @node The NSE Peer-to-Peer Protocol
5178 @subsection The NSE Peer-to-Peer Protocol
5179
5180 @c %**end of header
5181
5182 The NSE subsystem only has one message in the P2P protocol, the
5183 @code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
5184
The key contents of this message are the timestamp to identify the round
5186 (differences in system clocks may cause some peers to send messages way
5187 too early or way too late, so the timestamp allows other peers to
5188 identify such messages easily), the
5189 @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
5190 used to make it difficult to mount a
5191 @uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
5192 public key, which is used to verify the signature on the message.
5193
5194 Every peer stores a message for the previous, current and next round. The
5195 messages for the previous and current round are given to peers that
5196 connect to us. The message for the next round is simply stored until our
5197 system clock advances to the next round. The message for the current round
5198 is what we are flooding the network with right now.
5199 At the beginning of each round the peer does the following:
5200
5201 @itemize @bullet
@item calculates its own distance to the target value
5203 @item creates, signs and stores the message for the current round (unless
5204 it has a better message in the "next round" slot which came early in the
5205 previous round)
5206 @item calculates, based on the stored round message (own or received) when
to start flooding it to its neighbors
5208 @end itemize
5209
5210 Upon receiving a message the peer checks the validity of the message
5211 (round, proof of work, signature). The next action depends on the
5212 contents of the incoming message:
5213
5214 @itemize @bullet
5215 @item if the message is worse than the current stored message, the peer
5216 sends the current message back immediately, to stop the other peer from
5217 spreading suboptimal results
5218 @item if the message is better than the current stored message, the peer
5219 stores the new message and calculates the new target time to start
5220 spreading it to its neighbors (excluding the one the message came from)
5221 @item if the message is for the previous round, it is compared to the
5222 message stored in the "previous round slot", which may then be updated
5223 @item if the message is for the next round, it is compared to the message
5224 stored in the "next round slot", which again may then be updated
5225 @end itemize
5226
Finally, when it is time to send the stored message for the current round
to the neighbors, a random delay is added for each neighbor, to avoid
traffic spikes and minimize cross-messages.
5230
5231 @cindex HOSTLIST Subsystem
5232 @node HOSTLIST Subsystem
5233 @section HOSTLIST Subsystem
5234
5235 @c %**end of header
5236
5237 Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
5239 store and exchange peer addresses.
5240 GNUnet provides several methods for peers to obtain this information:
5241
5242 @itemize @bullet
5243 @item out-of-band exchange of HELLO messages (manually, using for example
5244 gnunet-peerinfo)
5245 @item HELLO messages shipped with GNUnet (automatic with distribution)
5246 @item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
5247 @item topology gossiping (learning from other peers we already connected
5248 to), and
5249 @item the HOSTLIST daemon covered in this section, which is particularly
5250 relevant for bootstrapping new peers.
5251 @end itemize
5252
5253 New peers have no existing connections (and thus cannot learn from gossip
5254 among peers), may not have other peers in their LAN and might be started
5255 with an outdated set of HELLO messages from the distribution.
5256 In this case, getting new peers to connect to the network requires either
5257 manual effort or the use of a HOSTLIST to obtain HELLOs.
5258
5259 @menu
5260 * HELLOs::
5261 * Overview for the HOSTLIST subsystem::
5262 * Interacting with the HOSTLIST daemon::
5263 * Hostlist security address validation::
5264 * The HOSTLIST daemon::
5265 * The HOSTLIST server::
5266 * The HOSTLIST client::
5267 * Usage::
5268 @end menu
5269
5270 @node HELLOs
5271 @subsection HELLOs
5272
5273 @c %**end of header
5274
The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards.
5277 Besides the identity of the peer (based on the cryptographic public key) a
5278 HELLO message may contain address information that specifies ways to
5279 contact a peer. By obtaining HELLO messages, a peer can learn how to
5280 contact other peers.
5281
5282 @node Overview for the HOSTLIST subsystem
5283 @subsection Overview for the HOSTLIST subsystem
5284
5285 @c %**end of header
5286
5287 The HOSTLIST subsystem provides a way to distribute and obtain contact
5288 information to connect to other peers using a simple HTTP GET request.
Its implementation is split in three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.
5301
5302
5303 @menu
5304 * Features::
5305 * HOSTLIST - Limitations::
5306 @end menu
5307
5308 @node Features
5309 @subsubsection Features
5310
5311 @c %**end of header
5312
5313 The HOSTLIST daemon can:
5314
5315 @itemize @bullet
@item provide HELLO messages with validated addresses obtained from
PEERINFO for other peers to download
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist to other peers
via gossip
5322 @item automatically learn about hostlist servers from the gossip of other
5323 peers
5324 @end itemize
5325
5326 @node HOSTLIST - Limitations
5327 @subsubsection HOSTLIST - Limitations
5328
5329 @c %**end of header
5330
5331 The HOSTLIST daemon does not:
5332
5333 @itemize @bullet
5334 @item verify the cryptographic information in the HELLO messages
5335 @item verify the address information in the HELLO messages
5336 @end itemize
5337
5338 @node Interacting with the HOSTLIST daemon
5339 @subsection Interacting with the HOSTLIST daemon
5340
5341 @c %**end of header
5342
5343 The HOSTLIST subsystem is currently implemented as a daemon, so there is
5344 no need for the user to interact with it and therefore there is no
5345 command line tool and no API to communicate with the daemon. In the
5346 future, we can envision changing this to allow users to manually trigger
5347 the download of a hostlist.
5348
5349 Since there is no command line interface to interact with HOSTLIST, the
5350 only way to interact with the hostlist is to use STATISTICS to obtain or
5351 modify information about the status of HOSTLIST:
5352
5353 @example
5354 $ gnunet-statistics -s hostlist
5355 @end example
5356
5357 @noindent
5358 In particular, HOSTLIST includes a @strong{persistent} value in statistics
5359 that specifies when the hostlist server might be queried next. As this
5360 value is exponentially increasing during runtime, developers may want to
5361 reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
5363 daemon (as HOSTLIST does not monitor STATISTICS for changes to the
5364 download frequency).
5365
5366 @node Hostlist security address validation
5367 @subsection Hostlist security address validation
5368
5369 @c %**end of header
5370
5371 Since information obtained from other parties cannot be trusted without
5372 validation, we have to distinguish between @emph{validated} and
5373 @emph{not validated} addresses. Before using (and so trusting)
5374 information from other parties, this information has to be double-checked
5375 (validated). Address validation is not done by HOSTLIST but by the
5376 TRANSPORT service.
5377
5378 The HOSTLIST component is functionally located between the PEERINFO and
5379 the TRANSPORT subsystem. When acting as a server, the daemon obtains valid
5380 (@emph{validated}) peer information (HELLO messages) from the PEERINFO
5381 service and provides it to other peers. When acting as a client, it
5382 contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.
5385
5386 @cindex HOSTLIST daemon
5387 @node The HOSTLIST daemon
5388 @subsection The HOSTLIST daemon
5389
5390 @c %**end of header
5391
5392 The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5393 started by the ARM service and (if configured) starts the HOSTLIST client
5394 and server components.
5395
If the daemon provides a hostlist itself it can advertise its own
hostlist to other peers. To do so it sends a
@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service, and then forward this kind of message to
the HOSTLIST client. The client then uses all available URLs to download
peer information when necessary.
5405
When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
this kind of message. Next it starts (if configured) the client and
server. It passes pointers to the CORE connect, disconnect and receive
handlers where the client and server store their functions, so the daemon
can notify them about CORE events.
5412
5413 To clean up on shutdown, the daemon has a cleaning task, shutting down all
5414 subsystems and disconnecting from CORE.
5415
5416 @cindex HOSTLIST server
5417 @node The HOSTLIST server
5418 @subsection The HOSTLIST server
5419
5420 @c %**end of header
5421
5422 The server provides a way for other peers to obtain HELLOs. Basically it
5423 is a small web server other peers can connect to and download a list of
5424 HELLOs using standard HTTP; it may also advertise the URL of the hostlist
5425 to other peers connecting on CORE level.
5426
5427
5428 @menu
5429 * The HTTP Server::
5430 * Advertising the URL::
5431 @end menu
5432
5433 @node The HTTP Server
5434 @subsubsection The HTTP Server
5435
5436 @c %**end of header
5437
5438 During startup, the server starts a web server listening on the port
5439 specified with the HTTPPORT value (default 8080). In addition it connects
5440 to the PEERINFO service to obtain peer information. The HOSTLIST server
uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for all peers and adds their information to a new hostlist if
they are suitable (expired addresses and HELLOs without addresses are both
not suitable) and the maximum size for a hostlist is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5446 When PEERINFO finishes (with a last NULL callback), the server destroys
5447 the previous hostlist response available for download on the web server
5448 and replaces it with the updated hostlist. The hostlist format is
5449 basically a sequence of HELLO messages (as obtained from PEERINFO) without
5450 any special tokenization. Since each HELLO message contains a size field,
5451 the response can easily be split into separate HELLO messages by the
5452 client.
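
Splitting such a response relies only on the size field at the start of
each message. The sketch below assumes the usual GNUnet message header
layout of a 16-bit size followed by a 16-bit type, both in network byte
order, with the size covering the header itself. The function name is
hypothetical and no HELLO contents are interpreted.

```python
import struct

HEADER = struct.Struct("!HH")  # 16-bit size, 16-bit type (assumed layout)


def split_hostlist(blob):
    """Split a downloaded hostlist into (type, body) pairs."""
    messages = []
    offset = 0
    while offset < len(blob):
        if len(blob) - offset < HEADER.size:
            raise ValueError("truncated message header")
        size, msg_type = HEADER.unpack_from(blob, offset)
        if size < HEADER.size or offset + size > len(blob):
            raise ValueError("message size field does not fit the blob")
        body = blob[offset + HEADER.size:offset + size]
        messages.append((msg_type, body))
        offset += size
    return messages
```

A client would hand each recovered message on for validation rather
than trusting it, as described in the section on address validation.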
5453
A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response, and the server will terminate the
connection with the result code @code{HTTP 200 OK}.
The connection will be closed immediately if no hostlist is available.
5458
5459 @node Advertising the URL
5460 @subsubsection Advertising the URL
5461
5462 @c %**end of header
5463
5464 The server also advertises the URL to download the hostlist to other peers
5465 if hostlist advertisement is enabled.
5466 When a new peer connects and has hostlist learning enabled, the server
5467 sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5468 peer using the CORE service.
5469
5470 @cindex HOSTLIST client
5471 @node The HOSTLIST client
5472 @subsection The HOSTLIST client
5473
5474 @c %**end of header
5475
5476 The client provides the functionality to download the list of HELLOs from
5477 a set of URLs.
5478 It performs a standard HTTP request to the URLs configured and learned
5479 from advertisement messages received from other peers. When a HELLO is
5480 downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5481 service for validation.
5482
5483 The client supports two modes of operation:
5484
5485 @itemize @bullet
5486 @item download of HELLOs (bootstrapping)
5487 @item learning of URLs
5488 @end itemize
5489
5490 @menu
5491 * Bootstrapping::
5492 * Learning::
5493 @end menu
5494
5495 @node Bootstrapping
5496 @subsubsection Bootstrapping
5497
5498 @c %**end of header
5499
For bootstrapping, the client schedules a task to download the hostlist
from the set of known URLs.
The downloads are only performed if the number of current
connections is smaller than a minimum number of connections
(at the moment 4).
The interval between downloads increases exponentially; however, the
exponential growth is limited once the interval becomes longer than an
hour. At that point, the interval is capped at
(number of connections * 1h).
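
The schedule described above can be sketched as follows. This is a
simplified model and not the code of the HOSTLIST daemon: delays are plain
seconds, the function name @code{next_download_delay} is hypothetical, and
treating zero connections like one connection (so that the cap never drops
below one hour) is an assumption made for this example.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

#define ONE_HOUR_S 3600u

/**
 * Compute the delay (in seconds) before the next hostlist download.
 * The delay doubles on every call until it reaches the cap of
 * (number of connections * 1h).
 */
uint64_t
next_download_delay (uint64_t current_delay_s, unsigned int connections)
{
  uint64_t next = current_delay_s * 2;
  /* assumption: count at least one connection so the cap is >= 1h */
  uint64_t cap = (uint64_t) ONE_HOUR_S * (connections > 0 ? connections : 1);

  assert (current_delay_s > 0);
  if (next > cap)
    next = cap;
  return next;
}
```
@end verbatim

With three connections, a delay of 3000 seconds still doubles to 6000
seconds, whereas a delay of 8000 seconds is limited to the cap of 10800
seconds.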
5509
Once the decision has been taken to download HELLOs, the daemon chooses a
random URL from the list of known URLs. URLs can be configured in the
configuration or be learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function, which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
When a full HELLO has been downloaded, the HOSTLIST client offers this
HELLO message to the TRANSPORT service for validation.
When the download has finished or failed, statistical information about
the quality of this URL is updated.
5522
5523 @cindex HOSTLIST learning
5524 @node Learning
5525 @subsubsection Learning
5526
5527 @c %**end of header
5528
5529 The client also manages hostlist advertisements from other peers. The
5530 HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5531 messages to the client subsystem, which extracts the URL from the message.
5532 Next, a test of the newly obtained URL is performed by triggering a
5533 download from the new URL. If the URL works correctly, it is added to the
5534 list of working URLs.
5535
The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, by the number of successful downloads and the
number of HELLOs obtained) is discarded. During shutdown, the list of
URLs is saved to a file for persistence and loaded again on startup. URLs
from the configuration file are never discarded.
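
The replacement policy can be illustrated with a small, self-contained
sketch. All names (@code{HostlistEntry}, @code{HostlistTable},
@code{table_add_learned}), the table size and the single numeric quality
score are assumptions made for this example; the actual client keeps its
own data structures and ranking.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

#define MAX_URLS 3 /* illustrative limit, not the value used by GNUnet */

struct HostlistEntry
{
  const char *url;
  unsigned int quality; /* higher is better; the scoring is an assumption */
  int from_config;      /* non-zero: configured by the user, never evicted */
};

struct HostlistTable
{
  struct HostlistEntry entries[MAX_URLS];
  size_t used;
};

/**
 * Insert a learned URL.  If the table is full, the learned entry with the
 * lowest quality is replaced.  Returns 1 if the URL was stored, 0 if the
 * table only contains configured URLs and is full.
 */
int
table_add_learned (struct HostlistTable *t, const char *url,
                   unsigned int quality)
{
  size_t worst = MAX_URLS; /* MAX_URLS means "no candidate found" */

  assert (NULL != t && NULL != url);
  if (t->used < MAX_URLS)
  {
    t->entries[t->used++] = (struct HostlistEntry) { url, quality, 0 };
    return 1;
  }
  for (size_t i = 0; i < MAX_URLS; i++)
  {
    if (t->entries[i].from_config)
      continue; /* configured URLs are never discarded */
    if ((MAX_URLS == worst) ||
        (t->entries[i].quality < t->entries[worst].quality))
      worst = i;
  }
  if (MAX_URLS == worst)
    return 0;
  t->entries[worst] = (struct HostlistEntry) { url, quality, 0 };
  return 1;
}
```
@end verbatim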
5542
5543 @node Usage
5544 @subsection Usage
5545
5546 @c %**end of header
5547
5548 To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
5549 section for the ARM services. This is done in the default configuration.
5550
For more information on how to configure the HOSTLIST subsystem, see the
following sections of the installation handbook:

@itemize @bullet
@item Configuring the hostlist to bootstrap
@item Configuring your peer to provide a hostlist
@end itemize
5555
5556 @cindex IDENTITY Subsystem
5557 @node IDENTITY Subsystem
5558 @section IDENTITY Subsystem
5559
5560 @c %**end of header
5561
5562 Identities of "users" in GNUnet are called egos.
5563 Egos can be used as pseudonyms ("fake names") or be tied to an
5564 organization (for example, "GNU") or even the actual identity of a human.
5565 GNUnet users are expected to have many egos. They might have one tied to
5566 their real identity, some for organizations they manage, and more for
5567 different domains where they want to operate under a pseudonym.
5568
The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does
not manage identities of other users (public keys). Public keys for other
5572 users need names to become manageable. GNUnet uses the
5573 @dfn{GNU Name System} (GNS) to give names to other users and manage their
5574 public keys securely. This chapter is about the IDENTITY service,
5575 which is about the management of private keys.
5576
5577 On the network, an ego corresponds to an ECDSA key (over Curve25519,
5578 using RFC 6979, as required by GNS). Thus, users can perform actions
5579 under a particular ego by using (signing with) a particular private key.
5580 Other users can then confirm that the action was really performed by that
5581 ego by checking the signature against the respective public key.
5582
5583 The IDENTITY service allows users to associate a human-readable name with
5584 each ego. This way, users can use names that will remind them of the
5585 purpose of a particular ego.
5586 The IDENTITY service will store the respective private keys and
5587 allows applications to access key information by name.
5588 Users can change the name that is locally (!) associated with an ego.
5589 Egos can also be deleted, which means that the private key will be removed
5590 and it thus will not be possible to perform actions with that ego in the
5591 future.
5592
5593 Additionally, the IDENTITY subsystem can associate service functions with
5594 egos.
5595 For example, GNS requires the ego that should be used for the shorten
5596 zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5597 The IDENTITY service has a mapping of such service strings to the name of
5598 the ego that the user wants to use for this service, for example
5599 "my-short-zone-ego".
5600
5601 Finally, the IDENTITY API provides access to a special ego, the
5602 anonymous ego. The anonymous ego is special in that its private key is not
5603 really private, but fixed and known to everyone.
Thus, anyone can perform actions as anonymous. This is useful because
code then does not need a special case to distinguish between anonymous
and pseudonymous egos.
5607
5608 @menu
5609 * libgnunetidentity::
5610 * The IDENTITY Client-Service Protocol::
5611 @end menu
5612
5613 @cindex libgnunetidentity
5614 @node libgnunetidentity
5615 @subsection libgnunetidentity
5616 @c %**end of header
5617
5618
5619 @menu
5620 * Connecting to the service::
5621 * Operations on Egos::
5622 * The anonymous Ego::
5623 * Convenience API to lookup a single ego::
5624 * Associating egos with service functions::
5625 @end menu
5626
5627 @node Connecting to the service
5628 @subsubsection Connecting to the service
5629
5630 @c %**end of header
5631
5632 First, typical clients connect to the identity service using
5633 @code{GNUNET_IDENTITY_connect}. This function takes a callback as a
5634 parameter.
5635 If the given callback parameter is non-null, it will be invoked to notify
5636 the application about the current state of the identities in the system.
5637
5638 @itemize @bullet
5639 @item First, it will be invoked on all known egos at the time of the
5640 connection. For each ego, a handle to the ego and the user's name for the
5641 ego will be passed to the callback. Furthermore, a @code{void **} context
5642 argument will be provided which gives the client the opportunity to
5643 associate some state with the ego.
5644 @item Second, the callback will be invoked with NULL for the ego, the name
5645 and the context. This signals that the (initial) iteration over all egos
5646 has completed.
5647 @item Then, the callback will be invoked whenever something changes about
5648 an ego.
5649 If an ego is renamed, the callback is invoked with the ego handle of the
5650 ego that was renamed, and the new name. If an ego is deleted, the callback
5651 is invoked with the ego handle and a name of NULL. In the deletion case,
5652 the application should also release resources stored in the context.
5653 @item When the application destroys the connection to the identity service
5654 using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
5655 with the ego and a name of NULL (equivalent to deletion of the egos).
5656 This should again be used to clean up the per-ego context.
5657 @end itemize
5658
5659 The ego handle passed to the callback remains valid until the callback is
5660 invoked with a name of NULL, so it is safe to store a reference to the
5661 ego's handle.
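
The callback cases listed above can be modelled in a self-contained
sketch. It does not use the GNUnet headers: @code{struct Ego} is an opaque
stand-in for the library's ego handle, the callback signature merely
follows the description in this section, and @code{struct Stats},
@code{copy_name} and @code{ego_cb} are names invented for this example.

@verbatim
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Ego; /* opaque stand-in for the library's ego handle */

struct Stats
{
  unsigned int known; /* egos currently known */
  int initial_done;   /* set once the initial iteration has completed */
};

/* Copy a string into freshly allocated memory. */
static char *
copy_name (const char *name)
{
  char *copy = malloc (strlen (name) + 1);

  assert (NULL != copy);
  strcpy (copy, name);
  return copy;
}

/**
 * Callback in the style described above.  The per-ego context (*ctx)
 * holds our private copy of the ego's current name.
 */
void
ego_cb (void *cls, struct Ego *ego, void **ctx, const char *name)
{
  struct Stats *stats = cls;

  if (NULL == ego)
  {
    stats->initial_done = 1; /* end of the initial iteration */
    return;
  }
  if (NULL == name)
  {
    /* ego deleted (or we disconnected): release the per-ego state */
    free (*ctx);
    *ctx = NULL;
    stats->known--;
    return;
  }
  if (NULL == *ctx)
    stats->known++; /* first time we see this ego */
  else
    free (*ctx);    /* rename: drop the old name */
  *ctx = copy_name (name);
}
```
@end verbatim

Keeping the name in the per-ego context means that the deletion and
disconnect cases share a single cleanup path.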
5662
5663 @node Operations on Egos
5664 @subsubsection Operations on Egos
5665
5666 @c %**end of header
5667
5668 Given an ego handle, the main operations are to get its associated private
5669 key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
5670 public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
5671
5672 The other operations on egos are pretty straightforward.
5673 Using @code{GNUNET_IDENTITY_create}, an application can request the
5674 creation of an ego by specifying the desired name.
5675 The operation will fail if that name is
5676 already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
5677 existing ego can be changed. Finally, egos can be deleted using
5678 @code{GNUNET_IDENTITY_delete}. All of these operations will trigger
5679 updates to the callback given to the @code{GNUNET_IDENTITY_connect}
5680 function of all applications that are connected with the identity service
at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
operations before the respective continuations would be called.
Cancellation does not guarantee that the operation is not completed
anyway; it only guarantees that the continuation will no longer be called.
5685
5686 @node The anonymous Ego
5687 @subsubsection The anonymous Ego
5688
5689 @c %**end of header
5690
5691 A special way to obtain an ego handle is to call
5692 @code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5693 "anonymous" user --- anyone knows and can get the private key for this
5694 user, so it is suitable for operations that are supposed to be anonymous
5695 but require signatures (for example, to avoid a special path in the code).
5696 The anonymous ego is always valid and accessing it does not require a
5697 connection to the identity service.
5698
5699 @node Convenience API to lookup a single ego
5700 @subsubsection Convenience API to lookup a single ego
5701
5702
As applications commonly only need to look up a single ego, there is a
convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
look up a single ego by name. Note that this is the user's name for the
ego, not the service function. The resulting ego will be returned via a
callback and will only be valid during that callback. The operation can
be cancelled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
(cancellation is only legal before the callback is invoked).
5710
5711 @node Associating egos with service functions
5712 @subsubsection Associating egos with service functions
5713
5714
5715 The @code{GNUNET_IDENTITY_set} function is used to associate a particular
5716 ego with a service function. The name used by the service and the ego are
5717 given as arguments.
Afterwards, the service can use its name to look up the associated ego
using @code{GNUNET_IDENTITY_get}.
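
Conceptually, the association is a small map from service function names
to egos. The following sketch models such a map in plain C; it is not the
implementation of the IDENTITY service. The types and functions
(@code{Binding}, @code{BindingTable}, @code{binding_set},
@code{binding_get}) are hypothetical, and identifying an ego by the user's
name for it is a simplification chosen for this example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 8 /* illustrative limit */

struct Binding
{
  const char *service_name; /* for example "gns-short" */
  const char *ego_name;     /* the user's name for the ego */
};

struct BindingTable
{
  struct Binding slots[MAX_BINDINGS];
  size_t used;
};

/* Bind (or re-bind) a service function; 0 on success, -1 if full. */
int
binding_set (struct BindingTable *t, const char *service, const char *ego)
{
  assert (NULL != t && NULL != service && NULL != ego);
  for (size_t i = 0; i < t->used; i++)
    if (0 == strcmp (t->slots[i].service_name, service))
    {
      t->slots[i].ego_name = ego; /* overwrite the previous binding */
      return 0;
    }
  if (MAX_BINDINGS == t->used)
    return -1;
  t->slots[t->used++] = (struct Binding) { service, ego };
  return 0;
}

/* Resolve a service function; returns NULL if nothing is bound. */
const char *
binding_get (const struct BindingTable *t, const char *service)
{
  assert (NULL != t && NULL != service);
  for (size_t i = 0; i < t->used; i++)
    if (0 == strcmp (t->slots[i].service_name, service))
      return t->slots[i].ego_name;
  return NULL;
}
```
@end verbatim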
5720
5721 @node The IDENTITY Client-Service Protocol
5722 @subsection The IDENTITY Client-Service Protocol
5723
5724 @c %**end of header
5725
A client connecting to the identity service first sends a message with
type @code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that,
the client will receive information about changes to the egos by receiving
messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
5731 Those messages contain the private key of the ego and the user's name of
5732 the ego (or zero bytes for the name to indicate that the ego was deleted).
5733 A special bit @code{end_of_list} is used to indicate the end of the
5734 initial iteration over the identity service's egos.
5735
5736 The client can trigger changes to the egos by sending @code{CREATE},
5737 @code{RENAME} or @code{DELETE} messages.
5738 The CREATE message contains the private key and the desired name.@
5739 The RENAME message contains the old name and the new name.@
5740 The DELETE message only needs to include the name of the ego to delete.@
5741 The service responds to each of these messages with a @code{RESULT_CODE}
5742 message which indicates success or error of the operation, and possibly
5743 a human-readable error message.
5744
5745 Finally, the client can bind the name of a service function to an ego by
5746 sending a @code{SET_DEFAULT} message with the name of the service function
5747 and the private key of the ego.
5748 Such bindings can then be resolved using a @code{GET_DEFAULT} message,
5749 which includes the name of the service function. The identity service
5750 will respond to a GET_DEFAULT request with a SET_DEFAULT message
5751 containing the respective information, or with a RESULT_CODE to
5752 indicate an error.
5753
5754 @cindex NAMESTORE Subsystem
5755 @node NAMESTORE Subsystem
5756 @section NAMESTORE Subsystem
5757
The NAMESTORE subsystem provides persistent storage for local GNS zone
information. All local GNS zone information is managed by NAMESTORE. It
provides both the functionality to administer local GNS information (e.g.
delete and add records) as well as to retrieve GNS information (e.g. to
list name information in a client).
NAMESTORE only manages the persistent storage of zone information
belonging to the user running the service: GNS information from other
users obtained from the DHT is stored by the NAMECACHE subsystem.
5766
NAMESTORE uses a plugin-based database backend to store GNS information
with good performance. The supported database backends are sqlite, MySQL
and PostgreSQL.
NAMESTORE clients interact with the IDENTITY subsystem to obtain
cryptographic information about zones based on egos as described with the
IDENTITY subsystem, but internally NAMESTORE refers to zones using the
ECDSA private key.
In addition, it collaborates with the NAMECACHE subsystem and
stores zone information in the GNS cache when local information is
modified, to increase lookup performance for local information.
5777
NAMESTORE provides functionality to look up and store records, to iterate
over a specific or all zones and to monitor zones for changes. NAMESTORE
functionality can be accessed using the NAMESTORE API or the NAMESTORE
command line tool.
5782
5783 @menu
5784 * libgnunetnamestore::
5785 @end menu
5786
5787 @cindex libgnunetnamestore
5788 @node libgnunetnamestore
5789 @subsection libgnunetnamestore
5790
To interact with NAMESTORE, clients first connect to the NAMESTORE service
using the @code{GNUNET_NAMESTORE_connect} function, passing a
configuration handle. As a result, they obtain a NAMESTORE handle that
they can use for operations, or NULL is returned if the connection failed.
5795
5796 To disconnect from NAMESTORE, clients use
5797 @code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.
5798
NAMESTORE internally uses the ECDSA private key to refer to zones. These
private keys can be obtained from the IDENTITY subsystem.
Here, @emph{egos} can be used to refer to zones, or the default ego
assigned to the GNS subsystem can be used to obtain the master zone's
private key.
5804
5805
5806 @menu
5807 * Editing Zone Information::
5808 * Iterating Zone Information::
5809 * Monitoring Zone Information::
5810 @end menu
5811
5812 @node Editing Zone Information
5813 @subsubsection Editing Zone Information
5814
5815 @c %**end of header
5816
5817 NAMESTORE provides functions to lookup records stored under a label in a
5818 zone and to store records under a label in a zone.
5819
To store (and delete) records, the client uses the
@code{GNUNET_NAMESTORE_records_store} function and has to provide the
namestore handle to use, the private key of the zone, the label to store
the records under, the records and the number of records, plus a callback
function.
After the operation is performed, NAMESTORE will call the provided
callback function with the result: @code{GNUNET_SYSERR} on failure
(including timeout/queue drop/failure to validate), @code{GNUNET_NO} if
content was already there or not found, and @code{GNUNET_YES} (or another
positive value) on success. The callback also receives an additional
error message.
5830
Records are deleted by using the store command with 0 records to store.
It is important to note that records are not merged when records already
exist under the label.
So a client first has to retrieve the existing records, merge them with
the new records and then store the result.
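
The merge step that the client has to perform itself can be sketched as
follows. The @code{Record} type is a heavily simplified stand-in for a GNS
record, and @code{merge_records} is a hypothetical helper, not a NAMESTORE
function; treating records with equal type and value as duplicates is an
assumption of this example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a GNS record; real records have more fields. */
struct Record
{
  uint32_t type;
  const char *value;
};

/**
 * Build the record set to store under a label: all existing records plus
 * the new one, unless an identical record is already present.
 * Returns the number of records written to out.
 */
size_t
merge_records (const struct Record *existing, size_t n_existing,
               const struct Record *extra,
               struct Record *out, size_t out_cap)
{
  size_t n = 0;
  int duplicate = 0;

  assert (NULL != extra && NULL != out);
  assert (out_cap > n_existing); /* room for all records plus one */
  for (size_t i = 0; i < n_existing; i++)
  {
    if ((existing[i].type == extra->type) &&
        (0 == strcmp (existing[i].value, extra->value)))
      duplicate = 1;
    out[n++] = existing[i];
  }
  if (! duplicate)
    out[n++] = *extra;
  return n;
}
```
@end verbatim

The resulting array and its length are what a client would then pass to
the store operation for that label.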
5836
To perform a lookup operation, the client uses the
@code{GNUNET_NAMESTORE_records_lookup} function. Here the client has to
pass the namestore handle, the private key of the zone and the label. It
also has to provide a callback function which will be called with the
result of the lookup operation:
the zone for the records, the label, and the records including the
number of records included.
5844
A special operation is used to set the preferred nickname for a zone.
This nickname is stored with the zone and is automatically merged with
all labels and records stored in a zone. Here the client uses the
@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
the zone, the nickname as a string, plus a callback to be called with the
result of the operation.
5851
5852 @node Iterating Zone Information
5853 @subsubsection Iterating Zone Information
5854
5855 @c %**end of header
5856
5857 A client can iterate over all information in a zone or all zones managed
5858 by NAMESTORE.
5859 Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
5860 function and passes the namestore handle, the zone to iterate over and a
5861 callback function to call with the result.
If the client wants to iterate over all zones, it passes NULL for the zone.
5863 A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
5864 continue iteration.
5865
NAMESTORE calls the callback for every result and expects the client to
call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
When NAMESTORE has reached the last item, it will call the callback with a
NULL value to indicate the end of the iteration.
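
The control flow of such a client-driven iteration can be modelled
without the NAMESTORE library. In the sketch below, the iterator walks
over a plain array of labels; @code{struct ZoneIterator},
@code{iterator_next} and @code{count_labels} are illustrative names, and
the synchronous call chain is a simplification, since the real service
delivers results asynchronously.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

struct ZoneIterator;

/* Called once per result; label is NULL when the iteration is finished. */
typedef void (*ResultCallback) (void *cls, struct ZoneIterator *it,
                                const char *label);

struct ZoneIterator
{
  const char *const *labels; /* stand-in for the stored zone data */
  size_t count;
  size_t pos;
  ResultCallback cb;
  void *cb_cls;
};

/* Deliver the next result, or NULL once all results were delivered. */
void
iterator_next (struct ZoneIterator *it)
{
  assert (NULL != it && NULL != it->cb);
  if (it->pos < it->count)
    it->cb (it->cb_cls, it, it->labels[it->pos++]);
  else
    it->cb (it->cb_cls, it, NULL);
}

/* Client side: count the results and keep asking for more. */
void
count_labels (void *cls, struct ZoneIterator *it, const char *label)
{
  unsigned int *counter = cls;

  if (NULL == label)
    return;           /* end of iteration, nothing more to request */
  (*counter)++;
  iterator_next (it); /* explicitly request the next result */
}
```
@end verbatim

A client that wants to stop early would simply not request the next
result and release the iterator instead.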
5871
5872 @node Monitoring Zone Information
5873 @subsubsection Monitoring Zone Information
5874
5875 @c %**end of header
5876
Clients can also monitor zones to be notified about changes. Here the
client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
passes the private key of the zone and a callback function to call
with updates for a zone.
5881 The client can specify to obtain zone information first by iterating over
5882 the zone and specify a synchronization callback to be called when the
5883 client and the namestore are synced.
5884
5885 On an update, NAMESTORE will call the callback with the private key of the
5886 zone, the label and the records and their number.
5887
5888 To stop monitoring, the client calls
5889 @code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
5890 from the function to start the monitoring.
5891
5892 @cindex PEERINFO Subsystem
5893 @node PEERINFO Subsystem
5894 @section PEERINFO Subsystem
5895
5896 @c %**end of header
5897
The PEERINFO subsystem is used to store verified (validated) information
about known peers in a persistent way. It obtains these addresses, for
example, from the TRANSPORT service, which is in charge of address
validation.
Validation means that the information in the HELLO message is checked by
connecting to the addresses and performing a cryptographic handshake to
authenticate the peer instance claiming to be reachable at these
addresses.
PEERINFO does not validate the HELLO messages itself but only stores them
and gives them to interested clients.
5907
As future work, we are considering moving from storing just HELLO
messages to providing a generic persistent per-peer information store.
More and more subsystems tend to need to store per-peer information in
a persistent way.
To avoid duplicating this functionality, we plan to provide a PEERSTORE
service providing this functionality.
5914
5915 @menu
5916 * PEERINFO - Features::
5917 * PEERINFO - Limitations::
5918 * DeveloperPeer Information::
5919 * Startup::
5920 * Managing Information::
5921 * Obtaining Information::
5922 * The PEERINFO Client-Service Protocol::
5923 * libgnunetpeerinfo::
5924 @end menu
5925
5926 @node PEERINFO - Features
5927 @subsection PEERINFO - Features
5928
5929 @c %**end of header
5930
5931 @itemize @bullet
5932 @item Persistent storage
5933 @item Client notification mechanism on update
5934 @item Periodic clean up for expired information
5935 @item Differentiation between public and friend-only HELLO
5936 @end itemize
5937
5938 @node PEERINFO - Limitations
5939 @subsection PEERINFO - Limitations
5940
5941
5942 @itemize @bullet
5943 @item Does not perform HELLO validation
5944 @end itemize
5945
5946 @node DeveloperPeer Information
@subsection Peer Information
5948
5949 @c %**end of header
5950
The PEERINFO subsystem stores this information in the form of HELLO
messages, which you can think of as business cards.
These HELLO messages contain the public key of a peer and the addresses
under which a peer can be reached.
The addresses include an expiration date describing how long they are
valid. This information is updated regularly by the TRANSPORT service by
revalidating the address.
If an address is expired and not renewed, it can be removed from the
HELLO message.
5960
Some peers do not want to have their HELLO messages distributed to other
peers, especially when GNUnet's friend-to-friend mode is enabled.
To prevent this undesired distribution, PEERINFO distinguishes between
@emph{public} and @emph{friend-only} HELLO messages.
Public HELLO messages can be freely distributed to other (possibly
unknown) peers (for example using the hostlist, gossiping, broadcasting),
whereas friend-only HELLO messages may not be distributed to other peers.
Friend-only HELLO messages have an additional flag @code{friend_only} set
internally. For public HELLO messages this flag is not set.
PEERINFO does not and cannot check if a client is allowed to obtain a
specific HELLO type.
5972
The HELLO messages can be managed using the GNUnet HELLO library.
Other GNUnet systems can obtain this information from PEERINFO and use
it for their purposes.
Clients are, for example, the HOSTLIST component, which provides this
information to other peers in the form of a hostlist, or the TRANSPORT
subsystem, which uses this information to maintain connections to other
peers.
5979
5980 @node Startup
5981 @subsection Startup
5982
5983 @c %**end of header
5984
During startup, the PEERINFO service loads persistent HELLOs from disk.
First, PEERINFO parses the directory configured in the HOSTS value of the
@code{PEERINFO} configuration section to store PEERINFO information.
For all files found in this directory, valid HELLO messages are extracted.
In addition, it loads HELLO messages shipped with the GNUnet distribution.
These HELLOs are used to simplify network bootstrapping by providing
valid peer information with the distribution.
The use of these HELLOs can be prevented by setting the
@code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration
section to @code{NO}. Files containing invalid information are removed.
5995
5996 @node Managing Information
5997 @subsection Managing Information
5998
5999 @c %**end of header
6000
The PEERINFO service stores information about known peers and a single
HELLO message for every peer.
A peer does not need to have a HELLO if no information is available.
HELLO information from different sources, for example a HELLO obtained
from a remote HOSTLIST and a second HELLO stored on disk, is combined
and merged into one single HELLO message per peer which will be given to
clients. During this merge process, the HELLO is immediately written to
disk to ensure persistence.
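
The merge of two HELLOs for the same peer can be pictured as a union of
their address lists. The sketch below is a model under stated assumptions
and not the HELLO library: addresses are plain strings with an absolute
expiration time in seconds, the names @code{HelloAddress} and
@code{merge_addresses} are invented, and keeping the later expiration for
an address present in both inputs is an assumption of this example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct HelloAddress
{
  const char *uri;     /* illustrative textual address */
  uint64_t expires_at; /* absolute expiration time in seconds */
};

/**
 * Merge the address lists a and b into out.  Addresses present in both
 * lists appear once, with the later of the two expiration times.
 * Returns the number of addresses written to out.
 */
size_t
merge_addresses (const struct HelloAddress *a, size_t na,
                 const struct HelloAddress *b, size_t nb,
                 struct HelloAddress *out, size_t out_cap)
{
  size_t n = 0;

  assert (NULL != out && out_cap >= na + nb);
  for (size_t i = 0; i < na; i++)
    out[n++] = a[i];
  for (size_t j = 0; j < nb; j++)
  {
    size_t k;

    for (k = 0; k < n; k++)
      if (0 == strcmp (out[k].uri, b[j].uri))
        break;
    if (k == n)
      out[n++] = b[j];                     /* address only known to b */
    else if (b[j].expires_at > out[k].expires_at)
      out[k].expires_at = b[j].expires_at; /* keep the later expiration */
  }
  return n;
}
```
@end verbatim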
6009
In addition, PEERINFO periodically scans the directory where information
is stored for empty HELLO messages with expired TRANSPORT addresses.
This periodic task scans all files in the directory and recreates the
HELLO messages it finds.
Expired TRANSPORT addresses are removed from the HELLO and if the
HELLO does not contain any valid addresses, it is discarded and removed
from the disk.
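
The clean-up rule can be expressed compactly. The following sketch uses a
simplified address type with an absolute expiration time in seconds; the
names @code{Address} and @code{drop_expired} are hypothetical and only
serve to illustrate the rule described above.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Address
{
  const char *uri;     /* illustrative textual address */
  uint64_t expires_at; /* absolute expiration time in seconds */
};

/**
 * Remove expired addresses in place, keeping the order of the rest.
 * Returns the number of addresses left; 0 means the HELLO carries no
 * valid address any more and can be discarded.
 */
size_t
drop_expired (struct Address *addrs, size_t n, uint64_t now)
{
  size_t kept = 0;

  assert (NULL != addrs || 0 == n);
  for (size_t i = 0; i < n; i++)
    if (addrs[i].expires_at > now)
      addrs[kept++] = addrs[i];
  return kept;
}
```
@end verbatim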
6017
6018 @node Obtaining Information
6019 @subsection Obtaining Information
6020
6021 @c %**end of header
6022
6023 When a client requests information from PEERINFO, PEERINFO performs a
6024 lookup for the respective peer or all peers if desired and transmits this
6025 information to the client.
6026 The client can specify if friend-only HELLOs have to be included or not
6027 and PEERINFO filters the respective HELLO messages before transmitting
6028 information.
6029
To notify clients about changes to PEERINFO information, PEERINFO
maintains a list of clients interested in these notifications.
6032 Such a notification occurs if a HELLO for a peer was updated (due to a
6033 merge for example) or a new peer was added.
6034
6035 @node The PEERINFO Client-Service Protocol
6036 @subsection The PEERINFO Client-Service Protocol
6037
6038 @c %**end of header
6039
To connect to and disconnect from the PEERINFO service, PEERINFO
utilizes the util client/server infrastructure, so no special message
types are used here.
6043
6044 To add information for a peer, the plain HELLO message is transmitted to
6045 the service without any wrapping. All pieces of information required are
6046 stored within the HELLO message.
6047 The PEERINFO service provides a message handler accepting and processing
6048 these HELLO messages.
6049
When obtaining PEERINFO information using the iterate functionality,
specific messages are used. To obtain information for all peers, a
@code{struct ListAllPeersMessage} with message type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
@code{include_friend_only} to indicate if friend-only HELLO messages
should be included is transmitted. If information for a specific peer is
required, a @code{struct ListPeerMessage} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
used.
6059
For both variants, the PEERINFO service replies for each HELLO message it
wants to transmit with a @code{struct InfoMessage} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
The final message is a @code{struct GNUNET_MessageHeader} with type
@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
message, it can proceed with the next request if any is pending.
6066
6067 @node libgnunetpeerinfo
6068 @subsection libgnunetpeerinfo
6069
6070 @c %**end of header
6071
6072 The PEERINFO API consists mainly of three different functionalities:
6073
6074 @itemize @bullet
6075 @item maintaining a connection to the service
6076 @item adding new information to the PEERINFO service
6077 @item retrieving information from the PEERINFO service
6078 @end itemize
6079
6080 @menu
6081 * Connecting to the PEERINFO Service::
6082 * Adding Information to the PEERINFO Service::
6083 * Obtaining Information from the PEERINFO Service::
6084 @end menu
6085
6086 @node Connecting to the PEERINFO Service
6087 @subsubsection Connecting to the PEERINFO Service
6088
6089 @c %**end of header
6090
To connect to the PEERINFO service, the function
@code{GNUNET_PEERINFO_connect} is used, taking a configuration handle as
an argument. To disconnect from PEERINFO, the function
@code{GNUNET_PEERINFO_disconnect} has to be called, taking the PEERINFO
handle returned from the connect function.
6096
6097 @node Adding Information to the PEERINFO Service
6098 @subsubsection Adding Information to the PEERINFO Service
6099
6100 @c %**end of header
6101
@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
storage. This function takes the PEERINFO handle as an argument, the HELLO
message to store and a continuation with a closure to be called with the
result of the operation.
The @code{GNUNET_PEERINFO_add_peer} function returns a handle to this
operation, which allows cancelling the operation with the respective
cancel function @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve
information from PEERINFO, you can iterate over all information stored
with PEERINFO or you can tell PEERINFO to notify you when new peer
information is available.
6111
6112 @node Obtaining Information from the PEERINFO Service
6113 @subsubsection Obtaining Information from the PEERINFO Service
6114
6115 @c %**end of header
6116
To iterate over information in PEERINFO, you use
@code{GNUNET_PEERINFO_iterate}.
This function expects the PEERINFO handle, a flag indicating whether HELLO
messages intended for friend-only mode should be included, a timeout
specifying how long the operation may take and a callback with a callback
closure to be called for the results.
If you want to obtain information for a specific peer, you can specify
the peer identity; if this identity is NULL, information for all peers is
returned. The function returns a handle that can be used to cancel the
operation using @code{GNUNET_PEERINFO_iterate_cancel}.
6127
To get notified when peer information changes, you can use
@code{GNUNET_PEERINFO_notify}.
This function expects a configuration handle and a flag indicating
whether friend-only HELLO messages should be included. The PEERINFO
service will call the callback function to notify you about every change.
The function returns a handle to cancel notifications
with @code{GNUNET_PEERINFO_notify_cancel}.
6135
6136 @cindex PEERSTORE Subsystem
6137 @node PEERSTORE Subsystem
6138 @section PEERSTORE Subsystem
6139
6140 @c %**end of header
6141
6142 GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
6143 GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
6144 store and retrieve arbitrary data.
6145 Each data record stored with PEERSTORE contains the following fields:
6146
6147 @itemize @bullet
6148 @item subsystem: Name of the subsystem responsible for the record.
6149 @item peerid: Identity of the peer this record is related to.
6150 @item key: a key string identifying the record.
6151 @item value: binary record value.
6152 @item expiry: record expiry date.
6153 @end itemize
6154
6155 @menu
6156 * Functionality::
6157 * Architecture::
6158 * libgnunetpeerstore::
6159 @end menu
6160
6161 @node Functionality
6162 @subsection Functionality
6163
6164 @c %**end of header
6165
6166 Subsystems can store any type of value under a (subsystem, peerid, key)
6167 combination. A "replace" flag set during store operations forces the
6168 PEERSTORE to replace any old values stored under the same
6169 (subsystem, peerid, key) combination with the new value.
Additionally, an expiry date is set after which the record is
@emph{possibly} deleted by PEERSTORE.
6172
Subsystems can iterate over all values stored under any of the following
combinations of fields:
6175
6176 @itemize @bullet
6177 @item (subsystem)
6178 @item (subsystem, peerid)
6179 @item (subsystem, key)
6180 @item (subsystem, peerid, key)
6181 @end itemize
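
The wildcard behaviour of these combinations can be sketched in C as
follows. This is only an illustration: @code{struct Record} and
@code{record_matches} are hypothetical names that are not part of
libgnunetpeerstore, and the peer identity is simplified to a string.
The sketch assumes that the subsystem is always given, while a NULL
@emph{peerid} or @emph{key} matches any value.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of a PEERSTORE record; the types are
   simplified for illustration. */
struct Record
{
  const char *subsystem; /* owning subsystem, never NULL */
  const char *peerid;    /* textual stand-in for the peer identity */
  const char *key;       /* key string identifying the record */
};

/* Return 1 if `rec` matches the query, 0 otherwise.  `subsystem` is
   mandatory; a NULL `peerid` or `key` acts as a wildcard. */
int
record_matches (const struct Record *rec,
                const char *subsystem,
                const char *peerid,
                const char *key)
{
  assert (NULL != rec);
  assert (NULL != subsystem);
  if (0 != strcmp (rec->subsystem, subsystem))
    return 0;
  if ( (NULL != peerid) && (0 != strcmp (rec->peerid, peerid)) )
    return 0;
  if ( (NULL != key) && (0 != strcmp (rec->key, key)) )
    return 0;
  return 1;
}
```
@end verbatim

Passing NULL for both @emph{peerid} and @emph{key} corresponds to the
(subsystem) combination; passing all three values corresponds to the
(subsystem, peerid, key) combination.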
6182
6183 Subsystems can also request to be notified about any new values stored
6184 under a (subsystem, peerid, key) combination by sending a "watch"
6185 request to PEERSTORE.
6186
6187 @node Architecture
6188 @subsection Architecture
6189
6190 @c %**end of header
6191
6192 PEERSTORE implements the following components:
6193
6194 @itemize @bullet
6195 @item PEERSTORE service: Handles store, iterate and watch operations.
@item PEERSTORE API: API to be used by other subsystems to communicate
with and issue commands to the PEERSTORE service.
@item PEERSTORE plugins: Handle the persistent storage. At the moment,
only an "sqlite" plugin is implemented.
6200 @end itemize
6201
6202 @cindex libgnunetpeerstore
6203 @node libgnunetpeerstore
6204 @subsection libgnunetpeerstore
6205
6206 @c %**end of header
6207
6208 libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
6209 wishing to communicate with the PEERSTORE service use this API to open a
6210 connection to PEERSTORE. This is done by calling
6211 @code{GNUNET_PEERSTORE_connect} which returns a handle to the newly
6212 created connection.
6213 This handle has to be used with any further calls to the API.
6214
To store a new record, use the function @code{GNUNET_PEERSTORE_store}.
It requires the record fields and a continuation function that
will be called by the API after the STORE request is sent to the
PEERSTORE service.
Note that calling the continuation function does not mean that the record
is successfully stored, only that the STORE request has been successfully
sent to the PEERSTORE service.
@code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
request, but only before the continuation function has been called.
6224
To iterate over stored records, use the function
@code{GNUNET_PEERSTORE_iterate}.
@emph{peerid} and @emph{key} can be set to NULL. An iterator
callback function will be called with each matching record found and with
a NULL record at the end to signal the end of the result set.
@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
request before the iterator callback is called with a NULL record.
6232
To be notified of new values stored under a (subsystem, peerid, key)
combination, use the function @code{GNUNET_PEERSTORE_watch}.
This registers the watcher with the PEERSTORE service; any new
records matching the given combination will trigger the callback
function passed to @code{GNUNET_PEERSTORE_watch}. This continues until
@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
service is destroyed.
6240
6241 After the connection is no longer needed, the function
6242 @code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
6243 PEERSTORE service.
6244 Any pending ITERATE or WATCH requests will be destroyed.
If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
delay the disconnection until all pending STORE requests are sent to
the PEERSTORE service; otherwise, the pending STORE requests will be
destroyed as well.
6249
6250 @cindex SET Subsystem
6251 @node SET Subsystem
6252 @section SET Subsystem
6253
6254 @c %**end of header
6255
The SET service implements efficient set operations between two peers
over a CADET channel.
6258 Currently, set union and set intersection are the only supported
6259 operations. Elements of a set consist of an @emph{element type} and
6260 arbitrary binary @emph{data}.
6261 The size of an element's data is limited to around 62 KB.
6262
6263 @menu
6264 * Local Sets::
6265 * Set Modifications::
6266 * Set Operations::
6267 * Result Elements::
6268 * libgnunetset::
6269 * The SET Client-Service Protocol::
6270 * The SET Intersection Peer-to-Peer Protocol::
6271 * The SET Union Peer-to-Peer Protocol::
6272 @end menu
6273
6274 @node Local Sets
6275 @subsection Local Sets
6276
6277 @c %**end of header
6278
Sets created by a local client can be modified and reused for multiple
operations. As each set operation requires potentially expensive special
auxiliary data to be computed for each element of a set, a set can only
participate in one type of set operation (either union or intersection).
The type of a set is determined upon its creation.
If the elements of a set are needed for an operation of a different
type, all of the set's elements must be copied to a new set of the
appropriate type.
6287
6288 @node Set Modifications
6289 @subsection Set Modifications
6290
6291 @c %**end of header
6292
Even when set operations are active, one can add elements to and remove
elements from a set.
6295 However, these changes will only be visible to operations that have been
6296 created after the changes have taken place. That is, every set operation
6297 only sees a snapshot of the set from the time the operation was started.
6298 This mechanism is @emph{not} implemented by copying the whole set, but by
6299 attaching @emph{generation information} to each element and operation.
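
One way to picture generation information is the following C sketch.
It is an illustration under stated assumptions, not the data structure
used by the SET service: the names @code{struct Element} and
@code{element_visible} are hypothetical, and the sketch assumes that the
set keeps a generation counter which is recorded by an operation when it
starts and incremented afterwards.

@verbatim
```c
#include <assert.h>
#include <limits.h>

/* Hypothetical bookkeeping: each element remembers the generation in
   which it was added and, if applicable, removed. */
struct Element
{
  unsigned int added_gen;   /* generation in which it was added */
  unsigned int removed_gen; /* generation of removal, UINT_MAX if none */
};

/* An operation that recorded generation `op_gen` at its start sees an
   element only if the element was added no later than `op_gen` and had
   not been removed by then. */
int
element_visible (const struct Element *e, unsigned int op_gen)
{
  assert (e->added_gen <= e->removed_gen);
  return (e->added_gen <= op_gen) && (op_gen < e->removed_gen);
}
```
@end verbatim

With this scheme, adding or removing an element only updates one field
of that element, and operations that are already running keep seeing
their snapshot without any copy of the set being made.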
6300
6301 @node Set Operations
6302 @subsection Set Operations
6303
6304 @c %**end of header
6305
6306 Set operations can be started in two ways: Either by accepting an
6307 operation request from a remote peer, or by requesting a set operation
6308 from a remote peer.
6309 Set operations are uniquely identified by the involved @emph{peers}, an
6310 @emph{application id} and the @emph{operation type}.
6311
6312 The client is notified of incoming set operations by @emph{set listeners}.
6313 A set listener listens for incoming operations of a specific operation
6314 type and application id.
6315 Once notified of an incoming set request, the client can accept the set
6316 request (providing a local set for the operation) or reject it.
6317
6318 @node Result Elements
6319 @subsection Result Elements
6320
6321 @c %**end of header
6322
6323 The SET service has three @emph{result modes} that determine how an
6324 operation's result set is delivered to the client:
6325
6326 @itemize @bullet
@item @strong{Full Result Set.} All elements of the set resulting from the
set operation are returned to the client.
@item @strong{Added Elements.} Only elements that result from the
operation and are not already in the local peer's set are returned.
Note that for some operations (like set intersection) this result mode
will never return any elements.
This can be useful if only the remote peer is actually interested in
the result of the set operation.
@item @strong{Removed Elements.} Only elements that are in the local
peer's initial set but not in the operation's result set are returned.
Note that for some operations (like set union) this result mode will
never return any elements. This can be useful if only the remote peer is
actually interested in the result of the set operation.
6340 @end itemize
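
The three result modes can be summarised as a predicate over two facts
about an element: whether it was in the local peer's initial set and
whether it is in the operation's result set. The following C sketch
uses hypothetical names (@code{enum ResultMode},
@code{element_reported}) that are not taken from libgnunetset.

@verbatim
```c
#include <assert.h>

/* Hypothetical names for the three result modes described above. */
enum ResultMode
{
  RESULT_FULL,
  RESULT_ADDED,
  RESULT_REMOVED
};

/* Decide whether an element is reported to the client, given whether
   it was in the local peer's initial set (`in_local`) and whether it
   is in the operation's result set (`in_result`). */
int
element_reported (enum ResultMode mode, int in_local, int in_result)
{
  switch (mode)
  {
  case RESULT_FULL:
    return in_result;
  case RESULT_ADDED:
    return in_result && ! in_local;
  case RESULT_REMOVED:
    return in_local && ! in_result;
  }
  assert (0); /* unknown result mode */
  return 0;
}
```
@end verbatim

For an intersection, every element of the result set is also in the
local set, so the ``added elements'' predicate is never true; for a
union, every local element is in the result set, so the ``removed
elements'' predicate is never true.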
6341
6342 @cindex libgnunetset
6343 @node libgnunetset
6344 @subsection libgnunetset
6345
6346 @c %**end of header
6347
6348 @menu
6349 * Sets::
6350 * Listeners::
6351 * Operations::
6352 * Supplying a Set::
6353 * The Result Callback::
6354 @end menu
6355
6356 @node Sets
6357 @subsubsection Sets
6358
6359 @c %**end of header
6360
6361 New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6362 configuration (as each set has its own client connection) and the
6363 operation type must be specified.
6364 The set exists until either the client calls @code{GNUNET_SET_destroy} or
6365 the client's connection to the service is disrupted.
6366 In the latter case, the client is notified by the return value of
6367 functions dealing with sets. This return value must always be checked.
6368
6369 Elements are added and removed with @code{GNUNET_SET_add_element} and
6370 @code{GNUNET_SET_remove_element}.
6371
6372 @node Listeners
6373 @subsubsection Listeners
6374
6375 @c %**end of header
6376
Listeners are created with @code{GNUNET_SET_listen}. Each time a
6378 remote peer suggests a set operation with an application id and operation
6379 type matching a listener, the listener's callback is invoked.
6380 The client then must synchronously call either @code{GNUNET_SET_accept}
6381 or @code{GNUNET_SET_reject}. Note that the operation will not be started
6382 until the client calls @code{GNUNET_SET_commit}
6383 (see Section "Supplying a Set").
6384
6385 @node Operations
6386 @subsubsection Operations
6387
6388 @c %**end of header
6389
6390 Operations to be initiated by the local peer are created with
6391 @code{GNUNET_SET_prepare}. Note that the operation will not be started
6392 until the client calls @code{GNUNET_SET_commit}
6393 (see Section "Supplying a Set").
6394
6395 @node Supplying a Set
6396 @subsubsection Supplying a Set
6397
6398 @c %**end of header
6399
To create symmetry between the two ways of starting a set operation
(accepting and initiating it), the operation handles returned by
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have a
set to operate on; thus they cannot do any work yet.
6404
6405 The client must call @code{GNUNET_SET_commit} to specify a set to use for
6406 an operation. @code{GNUNET_SET_commit} may only be called once per set
6407 operation.
6408
6409 @node The Result Callback
6410 @subsubsection The Result Callback
6411
6412 @c %**end of header
6413
Clients must specify both a result mode and a result callback with
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result
callback is invoked with a status indicating either that an element was
received, or that the operation failed or succeeded.
6418 The interpretation of the received element depends on the result mode.
6419 The callback needs to know which result mode it is used in, as the
6420 arguments do not indicate if an element is part of the full result set,
6421 or if it is in the difference between the original set and the final set.
6422
6423 @node The SET Client-Service Protocol
6424 @subsection The SET Client-Service Protocol
6425
6426 @c %**end of header
6427
6428 @menu
6429 * Creating Sets::
6430 * Listeners2::
6431 * Initiating Operations::
6432 * Modifying Sets::
6433 * Results and Operation Status::
6434 * Iterating Sets::
6435 @end menu
6436
6437 @node Creating Sets
6438 @subsubsection Creating Sets
6439
6440 @c %**end of header
6441
6442 For each set of a client, there exists a client connection to the service.
6443 Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message
6444 over a new client connection. Multiple operations for one set are
6445 multiplexed over one client connection, using a request id supplied by
6446 the client.
6447
6448 @node Listeners2
@subsubsection Listeners
6450
6451 @c %**end of header
6452
Each listener also requires a separate client connection. By sending the
@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service
of the application id and operation type it is interested in. A client
rejects an incoming request by sending @code{GNUNET_SERVICE_SET_REJECT}
on the listener's client connection.
In contrast, when accepting an incoming request, a
@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
connection of the set that is supplied for the set operation.
6461
6462 @node Initiating Operations
6463 @subsubsection Initiating Operations
6464
6465 @c %**end of header
6466
Operations with remote peers are initiated by sending a
@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
connection that this message is sent by determines the set to use.
6470
6471 @node Modifying Sets
6472 @subsubsection Modifying Sets
6473
6474 @c %**end of header
6475
6476 Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6477 @code{GNUNET_SERVICE_SET_REMOVE} messages.
6478
6479
6480 @c %@menu
6481 @c %* Results and Operation Status::
6482 @c %* Iterating Sets::
6483 @c %@end menu
6484
6485 @node Results and Operation Status
6486 @subsubsection Results and Operation Status
6487 @c %**end of header
6488
6489 The service notifies the client of result elements and success/failure of
6490 a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6491
6492 @node Iterating Sets
6493 @subsubsection Iterating Sets
6494
6495 @c %**end of header
6496
6497 All elements of a set can be requested by sending
6498 @code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
6499 @code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
6500 iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
6501 After each received element, the client
6502 must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
6503 iteration may be active for a set at any given time.
6504
6505 @node The SET Intersection Peer-to-Peer Protocol
6506 @subsection The SET Intersection Peer-to-Peer Protocol
6507
6508 @c %**end of header
6509
The intersection protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the
peer initiating the operation to the peer listening for inbound requests.
It includes the number of elements of the initiating peer, which is used
to decide which side will send a Bloom filter first.

The listening peer checks if the operation type and application
identifier are acceptable for its current state.
If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a
status of @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET
channel).
6520
If the application accepts the request, the listener sends back a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
more elements in its set than the initiator.
Otherwise, it immediately starts with the Bloom filter exchange.
If the initiator receives a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
it begins the Bloom filter exchange, unless the set size is indicated to
be zero, in which case the intersection is considered finished after
just the initial handshake.
6530
6531
6532 @menu
6533 * The Bloom filter exchange::
6534 * Salt::
6535 @end menu
6536
6537 @node The Bloom filter exchange
6538 @subsubsection The Bloom filter exchange
6539
6540 @c %**end of header
6541
6542 In this phase, each peer transmits a Bloom filter over the remaining
6543 keys of the local set to the other peer using a
6544 @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
6545 message additionally includes the number of elements left in the sender's
6546 set, as well as the XOR over all of the keys in that set.
6547
6548 The number of bits 'k' set per element in the Bloom filter is calculated
6549 based on the relative size of the two sets.
6550 Furthermore, the size of the Bloom filter is calculated based on 'k' and
6551 the number of elements in the set to maximize the amount of data filtered
6552 per byte transmitted on the wire (while avoiding an excessively high
6553 number of iterations).
6554
6555 The receiver of the message removes all elements from its local set that
6556 do not pass the Bloom filter test.
It then checks if the set size of the sender and the XOR over the keys
match what is left of its own set. If they do, it sends a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
that the latest set is the final result.
Otherwise, the receiver starts another Bloom filter exchange, except
6562 this time as the sender.
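
A single round of this exchange can be sketched in C as follows. The
sketch makes several simplifying assumptions that are not part of the
protocol: keys are 64-bit integers instead of 512-bit hash codes, the
filter size and the number of bits per element are fixed constants
instead of being computed from the set sizes, and the mixing function
@code{bf_mix} is a toy stand-in for the hash actually used. All names
are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 256 /* toy filter size; computed per round in practice */
#define BF_K 3      /* bits set per element; also computed in practice */

struct BloomFilter
{
  uint8_t bits[BF_BITS / 8];
};

/* Toy 64-bit mixer (splitmix64 finalizer); an assumption of this sketch. */
uint64_t
bf_mix (uint64_t x)
{
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Bit position selected by hash function `i` for `key` under `salt`. */
unsigned int
bf_index (uint64_t key, uint32_t salt, unsigned int i)
{
  uint64_t h = bf_mix (key ^ bf_mix (((uint64_t) salt << 8) | i));
  return (unsigned int) (h % BF_BITS);
}

/* Return 1 if `key` passes the filter (it may be a false positive). */
int
bf_test (const struct BloomFilter *bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < BF_K; i++)
  {
    unsigned int b = bf_index (key, salt, i);
    if (0 == (bf->bits[b / 8] & (1u << (b % 8))))
      return 0;
  }
  return 1;
}

/* Sender: build the filter over all remaining keys and return the XOR
   over those keys, which is transmitted along with the key count. */
uint64_t
bf_build (struct BloomFilter *bf, const uint64_t *keys, size_t n,
          uint32_t salt)
{
  uint64_t x = 0;
  memset (bf, 0, sizeof (*bf));
  for (size_t j = 0; j < n; j++)
  {
    for (unsigned int i = 0; i < BF_K; i++)
    {
      unsigned int b = bf_index (keys[j], salt, i);
      bf->bits[b / 8] |= (uint8_t) (1u << (b % 8));
    }
    x ^= keys[j];
  }
  return x;
}

/* Receiver: drop keys that fail the test (compacting `keys` in place),
   store the XOR over the survivors in `*xor_out`, return their count. */
size_t
bf_filter (const struct BloomFilter *bf, uint64_t *keys, size_t n,
           uint32_t salt, uint64_t *xor_out)
{
  size_t kept = 0;
  uint64_t x = 0;
  assert (NULL != xor_out);
  for (size_t j = 0; j < n; j++)
  {
    uint64_t k = keys[j];
    if (! bf_test (bf, k, salt))
      continue;
    keys[kept++] = k;
    x ^= k;
  }
  *xor_out = x;
  return kept;
}
```
@end verbatim

After @code{bf_filter} returns, the receiver compares the surviving
count and XOR with the values announced by the sender. If both match,
the round corresponds to the ``done'' case above; otherwise the receiver
calls @code{bf_build} on its remaining keys with a fresh salt and the
roles are swapped.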
6563
6564 @node Salt
6565 @subsubsection Salt
6566
6567 @c %**end of header
6568
Bloom filter operations are probabilistic: with some non-zero probability
the test may incorrectly say an element is in the set, even though it is
not.

To mitigate this problem, the intersection protocol iterates exchanging
Bloom filters using a different random 32-bit salt in each iteration (the
salt is also included in the message).
With different salts, the test may fail for different elements.
Merging the results from the executions, the probability of failure
approaches zero.

The iterations terminate once both peers have established that they have
sets of the same size, and where the XOR over all keys computes the same
512-bit value (leaving a failure probability of @math{2^{-511}}).
6583
6584 @node The SET Union Peer-to-Peer Protocol
6585 @subsection The SET Union Peer-to-Peer Protocol
6586
6587 @c %**end of header
6588
The SET union protocol is based on the paper by Eppstein et al. on
efficient set reconciliation without prior context. You should read this
paper first if you want to understand the protocol.
6592
The union protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the
peer initiating the operation to the peer listening for inbound requests.
It includes the number of elements of the initiating peer, which is
currently not used.
6598
6599 The listening peer checks if the operation type and application
6600 identifier are acceptable for its current state. If not, it responds with
6601 a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
6602 @code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6603
If the application accepts the request, it sends back a strata estimator
using a message of type @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The
initiator evaluates the strata estimator and initiates the exchange of
invertible Bloom filters, sending a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.

During the IBF exchange, if the receiver cannot invert the Bloom filter or
detects a cycle, it sends a larger IBF in response (up to a defined
maximum limit; if that limit is reached, the operation fails).
Elements decoded while processing the IBF are transmitted to the other
peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from
the other peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
messages, depending on the sign observed during decoding of the IBF.
Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS}
message with the respective element in a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS} message. If the IBF fully
decodes, the peer responds with a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of another
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6621
All Bloom filter operations use a salt to mingle keys before hashing them
6623 into buckets, such that future iterations have a fresh chance of
6624 succeeding if they failed due to collisions before.
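
To make the role of the sign and of the salt more concrete, the
following C sketch shows a minimal invertible Bloom filter. It is a toy
model and not the implementation used by the SET service: keys are
64-bit integers, the table has a fixed size, the mixer
@code{ibf_mix} is an arbitrary choice, and all names are hypothetical.
Each hash function maps into its own partition of the table so that a
key never hits the same bucket twice.

@verbatim
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IBF_HASHES 3 /* hash functions, one per partition */
#define IBF_PART 8   /* buckets per partition (toy size) */
#define IBF_CELLS (IBF_HASHES * IBF_PART)

struct IbfCell {
  int32_t count;     /* signed number of keys in this bucket */
  uint64_t key_sum;  /* XOR of those keys */
  uint64_t hash_sum; /* XOR of their checksums */
};

struct Ibf {
  struct IbfCell cells[IBF_CELLS];
};

/* Toy 64-bit mixer (splitmix64 finalizer); an assumption of this sketch. */
uint64_t ibf_mix (uint64_t x)
{
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Checksum used to recognise buckets holding exactly one key. */
uint64_t ibf_check (uint64_t key)
{
  return ibf_mix (key ^ 0x9e3779b97f4a7c15ULL);
}

/* Bucket of `key` for hash function `i`; the salt changes the mapping. */
unsigned int ibf_bucket (uint64_t key, uint32_t salt, unsigned int i)
{
  uint64_t h = ibf_mix (key ^ ibf_mix (((uint64_t) salt << 8) | i));
  return i * IBF_PART + (unsigned int) (h % IBF_PART);
}

/* Insert (delta = 1) or remove (delta = -1) `key`. */
void ibf_update (struct Ibf *ibf, uint64_t key, uint32_t salt, int delta)
{
  for (unsigned int i = 0; i < IBF_HASHES; i++) {
    struct IbfCell *c = &ibf->cells[ibf_bucket (key, salt, i)];
    c->count += delta;
    c->key_sum ^= key;
    c->hash_sum ^= ibf_check (key);
  }
}

/* Compute a := a - b, bucket by bucket. */
void ibf_subtract (struct Ibf *a, const struct Ibf *b)
{
  for (unsigned int c = 0; c < IBF_CELLS; c++) {
    a->cells[c].count -= b->cells[c].count;
    a->cells[c].key_sum ^= b->cells[c].key_sum;
    a->cells[c].hash_sum ^= b->cells[c].hash_sum;
  }
}

/* Peel pure buckets off the difference (local minus remote).
   sides[i] is +1 if keys[i] is only in the local set and -1 if it is
   only in the remote set.  Returns the number of decoded keys, or -1
   if decoding got stuck or would exceed `max` keys. */
int ibf_decode (struct Ibf *diff, uint32_t salt,
                uint64_t *keys, int *sides, int max)
{
  int n = 0;
  int progress = 1;

  assert ( (NULL != keys) && (NULL != sides) );
  while (progress) {
    progress = 0;
    for (unsigned int c = 0; c < IBF_CELLS; c++) {
      struct IbfCell *cell = &diff->cells[c];
      uint64_t key = cell->key_sum;
      int side = cell->count;

      if ( (1 != side) && (-1 != side) )
        continue;
      if (cell->hash_sum != ibf_check (key))
        continue;
      if (n == max)
        return -1;
      keys[n] = key;
      sides[n] = side;
      n++;
      ibf_update (diff, key, salt, -side);
      progress = 1;
    }
  }
  for (unsigned int c = 0; c < IBF_CELLS; c++)
    if ( (0 != diff->cells[c].count) || (0 != diff->cells[c].key_sum) )
      return -1;
  return n;
}
```
@end verbatim

In terms of the protocol above, a decoded key with sign +1 would be sent
to the other peer, and a key with sign -1 would be requested from it. A
return value of -1 corresponds to the case where a larger IBF (or a new
salt) is needed.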
6625
6626 @cindex STATISTICS Subsystem
6627 @node STATISTICS Subsystem
6628 @section STATISTICS Subsystem
6629
6630 @c %**end of header
6631
6632 In GNUnet, the STATISTICS subsystem offers a central place for all
6633 subsystems to publish unsigned 64-bit integer run-time statistics.
6634 Keeping this information centrally means that there is a unified way for
6635 the user to obtain data on all subsystems, and individual subsystems do
6636 not have to always include a custom data export method for performance
6637 metrics and other statistics. For example, the TRANSPORT system uses
6638 STATISTICS to update information about the number of directly connected
6639 peers and the bandwidth that has been consumed by the various plugins.
6640 This information is valuable for diagnosing connectivity and performance
6641 issues.
6642
6643 Following the GNUnet service architecture, the STATISTICS subsystem is
6644 divided into an API which is exposed through the header
6645 @strong{gnunet_statistics_service.h} and the STATISTICS service
6646 @strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
6647 command-line tool can be used to obtain (and change) information about
6648 the values stored by the STATISTICS service. The STATISTICS service does
6649 not communicate with other peers.
6650
Data is stored in the STATISTICS service in the form of tuples
@strong{(subsystem, name, value, persistence)}. The subsystem determines
to which GNUnet subsystem the data belongs. The name identifies the
value and uniquely distinguishes the record from other records belonging
to the same subsystem.
In some parts of the code, the pair @strong{(subsystem, name)} is called
a @strong{statistic} as it identifies the values stored in the STATISTICS
service. The persistence flag determines whether the record has to be
preserved across service restarts. A record is said to be persistent if
this flag is set for it; if not, the record is treated as a non-persistent
record and is lost when the service restarts. Persistent records are
written to the file @strong{statistics.data} before shutdown and read
from it upon startup. The file is located in the HOME directory of the peer.
6664
6665 An anomaly of the STATISTICS service is that it does not terminate
6666 immediately upon receiving a shutdown signal if it has any clients
6667 connected to it. It waits for all the clients that are not monitors to
6668 close their connections before terminating itself.
6669 This is to prevent the loss of data during peer shutdown --- delaying the
6670 STATISTICS service shutdown helps other services to store important data
6671 to STATISTICS during shutdown.
6672
6673 @menu
6674 * libgnunetstatistics::
6675 * The STATISTICS Client-Service Protocol::
6676 @end menu
6677
6678 @cindex libgnunetstatistics
6679 @node libgnunetstatistics
6680 @subsection libgnunetstatistics
6681
6682 @c %**end of header
6683
@strong{libgnunetstatistics} is the library containing the API for the
STATISTICS subsystem. Any process that needs to use STATISTICS should use
this API to open a connection to the STATISTICS service.
This is done by calling the function @code{GNUNET_STATISTICS_create()}.
This function takes the name of the subsystem that is trying to use
STATISTICS and a configuration.
6690 All values written to STATISTICS with this connection will be placed in
6691 the section corresponding to the given subsystem's name.
6692 The connection to STATISTICS can be destroyed with the function
6693 @code{GNUNET_STATISTICS_destroy()}. This function allows for the
6694 connection to be destroyed immediately or upon transferring all
6695 pending write requests to the service.
6696
Note: The STATISTICS subsystem can be disabled by setting
@code{DISABLE = YES} under the @code{[STATISTICS]} section in the
configuration. With such a configuration, all calls to
@code{GNUNET_STATISTICS_create()} return @code{NULL} as the STATISTICS
subsystem is unavailable, and no other functions from the API can be used.
6702
6703
6704 @menu
6705 * Statistics retrieval::
6706 * Setting statistics and updating them::
6707 * Watches::
6708 @end menu
6709
6710 @node Statistics retrieval
6711 @subsubsection Statistics retrieval
6712
6713 @c %**end of header
6714
Once a connection to the statistics service is obtained, information
about any other system which uses statistics can be retrieved with the
function @code{GNUNET_STATISTICS_get()}.
This function takes the connection handle, the name of the subsystem
whose information we are interested in (a @code{NULL} value will
retrieve information of all available subsystems using STATISTICS), the
name of the statistic we are interested in (a @code{NULL} value will
retrieve all available statistics), a continuation callback which is
called when all of the requested information is retrieved, an iterator
callback which is called for each parameter in the retrieved information
and a closure for the aforementioned callbacks. The library then invokes
the iterator callback for each value matching the request.
6727
A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
This is helpful when retrieving statistics takes too long and especially
when we want to shut down and clean up everything.
6732
6733 @node Setting statistics and updating them
6734 @subsubsection Setting statistics and updating them
6735
6736 @c %**end of header
6737
So far we have seen how to retrieve statistics; here we will learn how to
set statistics and update them so that other subsystems can retrieve
them.
6741
6742 A new statistic can be set using the function
6743 @code{GNUNET_STATISTICS_set()}.
6744 This function takes the name of the statistic and its value and a flag to
6745 make the statistic persistent.
6746 The value of the statistic should be of the type @code{uint64_t}.
6747 The function does not take the name of the subsystem; it is determined
6748 from the previous @code{GNUNET_STATISTICS_create()} invocation. If
6749 the given statistic is already present, its value is overwritten.
6750
An existing statistic can be updated, i.e., its value can be increased or
decreased by an amount, with the function
@code{GNUNET_STATISTICS_update()}.
The parameters to this function are similar to
@code{GNUNET_STATISTICS_set()}, except that it takes the amount to be
changed as a type @code{int64_t} instead of the value.
6757
6758 The library will combine multiple set or update operations into one
6759 message if the client performs requests at a rate that is faster than the
6760 available IPC with the STATISTICS service. Thus, the client does not have
6761 to worry about sending requests too quickly.
6762
6763 @node Watches
6764 @subsubsection Watches
6765
6766 @c %**end of header
6767
An interesting feature of STATISTICS is that it can send notifications
whenever a statistic of interest is modified.
6770 This is achieved by registering a watch through the function
6771 @code{GNUNET_STATISTICS_watch()}.
6772 The parameters of this function are similar to those of
6773 @code{GNUNET_STATISTICS_get()}.
6774 Changes to the respective statistic's value will then cause the given
6775 iterator callback to be called.
6776 Note: A watch can only be registered for a specific statistic. Hence
6777 the subsystem name and the parameter name cannot be @code{NULL} in a
6778 call to @code{GNUNET_STATISTICS_watch()}.
6779
6780 A registered watch will keep notifying any value changes until
6781 @code{GNUNET_STATISTICS_watch_cancel()} is called with the same
6782 parameters that are used for registering the watch.
6783
6784 @node The STATISTICS Client-Service Protocol
6785 @subsection The STATISTICS Client-Service Protocol
6786 @c %**end of header
6787
6788
6789 @menu
6790 * Statistics retrieval2::
6791 * Setting and updating statistics::
6792 * Watching for updates::
6793 @end menu
6794
6795 @node Statistics retrieval2
@subsubsection Statistics retrieval
6797
6798 @c %**end of header
6799
6800 To retrieve statistics, the client transmits a message of type
6801 @code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem
6802 name and statistic parameter to the STATISTICS service.
The service responds with a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the statistics
parameters that match the client request. The end of
information retrieved is signaled by the service by sending a message of
type @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
6808
6809 @node Setting and updating statistics
6810 @subsubsection Setting and updating statistics
6811
6812 @c %**end of header
6813
6814 The subsystem name, parameter name, its value and the persistence flag are
6815 communicated to the service through the message
6816 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
6817
When the service receives a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
name and checks for a statistic parameter matching the name given in
the message.
6822 If a statistic parameter is found, the value is overwritten by the new
6823 value from the message; if not found then a new statistic parameter is
6824 created with the given name and value.
6825
6826 In addition to just setting an absolute value, it is possible to perform a
6827 relative update by sending a message of type
6828 @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
6829 (@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in
6830 the message should be treated as an update value.
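
The handling of absolute and relative sets described above can be
sketched in C as follows. This is a simplified model, not the code of
the STATISTICS service: the store is a small fixed-size array, the
flag constants @code{SKETCH_FLAG_PERSISTENT} and
@code{SKETCH_FLAG_RELATIVE} are hypothetical stand-ins for the real
flags, and the sketch assumes that a relative update below zero is
clamped to zero and that the relative amount is carried as a
two's-complement signed value in the unsigned 64-bit field.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STATS 16

/* Hypothetical flag values used only in this sketch. */
#define SKETCH_FLAG_PERSISTENT 1u
#define SKETCH_FLAG_RELATIVE 2u

struct Stat
{
  const char *subsystem;
  const char *name;
  uint64_t value;
  int persistent;
};

struct StatStore
{
  struct Stat entries[MAX_STATS];
  size_t used;
};

/* Process one SET request.  Returns 0 on success and -1 if the toy
   table is full. */
int
stat_handle_set (struct StatStore *st, const char *subsystem,
                 const char *name, uint64_t value, unsigned int flags)
{
  struct Stat *s = NULL;

  assert ( (NULL != st) && (NULL != subsystem) && (NULL != name) );
  for (size_t i = 0; i < st->used; i++)
    if ( (0 == strcmp (st->entries[i].subsystem, subsystem)) &&
         (0 == strcmp (st->entries[i].name, name)) )
    {
      s = &st->entries[i];
      break;
    }
  if (NULL == s)
  {
    /* Not found: create a new statistic, starting at zero. */
    if (MAX_STATS == st->used)
      return -1;
    s = &st->entries[st->used++];
    s->subsystem = subsystem;
    s->name = name;
    s->value = 0;
  }
  s->persistent = (0 != (flags & SKETCH_FLAG_PERSISTENT));
  if (0 == (flags & SKETCH_FLAG_RELATIVE))
  {
    s->value = value; /* absolute set: overwrite */
    return 0;
  }
  if (value > (uint64_t) INT64_MAX)
  {
    /* Negative delta: compute its magnitude in unsigned arithmetic. */
    uint64_t mag = (uint64_t) 0 - value;
    s->value = (mag > s->value) ? 0 : s->value - mag;
  }
  else
  {
    s->value += value;
  }
  return 0;
}
```
@end verbatim

For example, setting a statistic to 10 and then sending a relative
update of -3 leaves the value at 7 under the assumptions of this sketch.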
6831
6832 @node Watching for updates
6833 @subsubsection Watching for updates
6834
6835 @c %**end of header
6836
@code{GNUNET_STATISTICS_watch()} registers the watch at the service by
sending a message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}.
The service then sends notifications through messages of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
parameter's value is changed.
6842
6843 @cindex DHT
6844 @cindex Distributed Hash Table
6845 @node Distributed Hash Table (DHT)
6846 @section Distributed Hash Table (DHT)
6847
6848 @c %**end of header
6849
6850 GNUnet includes a generic distributed hash table that can be used by
6851 developers building P2P applications in the framework.
6852 This section documents high-level features and how developers are
6853 expected to use the DHT.
6854 We have a research paper detailing how the DHT works.
6855 Also, Nate's thesis includes a detailed description and performance
6856 analysis (in chapter 6).
6857
6858 Key features of GNUnet's DHT include:
6859
6860 @itemize @bullet
6861 @item stores key-value pairs with values up to (approximately) 63k in size
6862 @item works with many underlay network topologies (small-world, random
6863 graph), underlay does not need to be a full mesh / clique
@item support for extended queries (more than just a simple 'key'),
filtering duplicate replies within the network (Bloom filter) and content
validation (for details, please read the subsection on the block library)
6867 @item can (optionally) return paths taken by the PUT and GET operations
6868 to the application
6869 @item provides content replication to handle churn
6870 @end itemize
6871
6872 GNUnet's DHT is randomized and unreliable. Unreliable means that there is
6873 no strict guarantee that a value stored in the DHT is always
6874 found --- values are only found with high probability.
6875 While this is somewhat true in all P2P DHTs, GNUnet developers should be
6876 particularly wary of this fact (this will help you write secure,
6877 fault-tolerant code). Thus, when writing any application using the DHT,
6878 you should always consider the possibility that a value stored in the
6879 DHT by you or some other peer might simply not be returned, or returned
6880 with a significant delay.
6881 Your application logic must be written to tolerate this (naturally, some
6882 loss of performance or quality of service is expected in this case).
6883
6884 @menu
6885 * Block library and plugins::
6886 * libgnunetdht::
6887 * The DHT Client-Service Protocol::
6888 * The DHT Peer-to-Peer Protocol::
6889 @end menu
6890
6891 @node Block library and plugins
6892 @subsection Block library and plugins
6893
6894 @c %**end of header
6895
6896 @menu
6897 * What is a Block?::
6898 * The API of libgnunetblock::
6899 * Queries::
6900 * Sample Code::
6901 * Conclusion2::
6902 @end menu
6903
6904 @node What is a Block?
6905 @subsubsection What is a Block?
6906
6907 @c %**end of header
6908
6909 Blocks are small (< 63k) pieces of data stored under a key (struct
6910 GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which defines
6911 their data format. Blocks are used in GNUnet as units of static data
6912 exchanged between peers and stored (or cached) locally.
6913 Uses of blocks include file-sharing (the files are broken up into blocks),
6914 the VPN (DNS information is stored in blocks) and the DHT (all
6915 information in the DHT and meta-information for the maintenance of the
6916 DHT are both stored using blocks).
6917 The block subsystem provides a few common functions that must be
6918 available for any type of block.
6919
6920 @cindex libgnunetblock API
6921 @node The API of libgnunetblock
6922 @subsubsection The API of libgnunetblock
6923
6924 @c %**end of header
6925
6926 The block library requires for each (family of) block type(s) a block
6927 plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
6928 functions that are needed by the DHT (and possibly other subsystems) to
6929 manage the block.
6930 These block plugins are typically implemented within their respective
6931 subsystems.
6932 The main block library is then used to locate, load and query the
6933 appropriate block plugin.
6934 Which plugin is appropriate is determined by the block type (which is
6935 just a 32-bit integer). Block plugins contain code that specifies which
6936 block types are supported by a given plugin. The block library loads all
6937 block plugins that are installed at the local peer and forwards the
6938 application request to the respective plugin.
6939
6940 The central functions of the block APIs (plugin and main library) are to
6941 allow the mapping of blocks to their respective key (if possible) and the
6942 ability to check that a block is well-formed and matches a given
6943 request (again, if possible).
6944 This way, GNUnet can avoid storing invalid blocks, storing blocks under
6945 the wrong key and forwarding blocks in response to a query that they do
6946 not answer.
6947
One key function of block plugins is that they allow GNUnet to detect
duplicate replies (via the Bloom filter). All plugins MUST support
6950 detecting duplicate replies (by adding the current response to the
6951 Bloom filter and rejecting it if it is encountered again).
6952 If a plugin fails to do this, responses may loop in the network.
6953
6954 @node Queries
6955 @subsubsection Queries
6956 @c %**end of header
6957
6958 The query format for any block in GNUnet consists of four main components.
6959 First, the type of the desired block must be specified. Second, the query
6960 must contain a hash code. The hash code is used for lookups in hash
tables and databases and need not be unique for the block (however, if
possible, a unique hash should be used, as this is best for
performance).
6964 Third, an optional Bloom filter can be specified to exclude known results;
6965 replies that hash to the bits set in the Bloom filter are considered
6966 invalid. False-positives can be eliminated by sending the same query
6967 again with a different Bloom filter mutator value, which parameterizes
6968 the hash function that is used.
6969 Finally, an optional application-specific "eXtended query" (xquery) can
6970 be specified to further constrain the results. It is entirely up to
6971 the type-specific plugin to determine whether or not a given block
6972 matches a query (type, hash, Bloom filter, and xquery).
Naturally, not all xqueries are valid and some types of blocks may not
6974 support Bloom filters either, so the plugin also needs to check if the
6975 query is valid in the first place.
6976
6977 Depending on the results from the plugin, the DHT will then discard the
6978 (invalid) query, forward the query, discard the (invalid) reply, cache the
6979 (valid) reply, and/or forward the (valid and non-duplicate) reply.
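
The duplicate filtering described above can be illustrated with a small,
self-contained sketch. It is not the GNUnet implementation: all
@code{sketch_} names, the filter size, the number of probes and the
64-bit stand-in for a block hash are assumptions made for illustration.
The point it shows is that the mutator parameterizes the bit positions,
so a reply that is a false positive under one mutator is unlikely to be
one under another.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_BF_BITS 1024u /* hypothetical filter size in bits */
#define SKETCH_BF_K 4u       /* hypothetical number of probes */

struct sketch_bloom
{
  uint8_t bits[SKETCH_BF_BITS / 8];
};

/* Reset the filter so that no reply is considered known. */
void
sketch_bloom_clear (struct sketch_bloom *bf)
{
  memset (bf, 0, sizeof (*bf));
}

/* 64-bit mixer (splitmix64-style finalizer); it merely stands in
   for a real hash function in this sketch. */
static uint64_t
sketch_mix (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Bit index of probe i for a block hash under a given mutator. */
static uint32_t
sketch_probe (uint64_t block_hash, uint32_t mutator, uint32_t i)
{
  uint64_t h = sketch_mix (block_hash ^ ((uint64_t) mutator << 32) ^ i);

  return (uint32_t) (h % SKETCH_BF_BITS);
}

/* Returns 1 if the reply is (probably) known, 0 if it is new. */
int
sketch_bloom_test (const struct sketch_bloom *bf, uint64_t block_hash,
                   uint32_t mutator)
{
  for (uint32_t i = 0; i < SKETCH_BF_K; i++)
  {
    uint32_t b = sketch_probe (block_hash, mutator, i);

    if (0 == (bf->bits[b / 8] & (1u << (b % 8))))
      return 0;
  }
  return 1;
}

/* Returns 1 if the reply should be forwarded (and marks it as seen),
   0 if the filter classifies it as a duplicate. */
int
sketch_filter_reply (struct sketch_bloom *bf, uint64_t block_hash,
                     uint32_t mutator)
{
  if (sketch_bloom_test (bf, block_hash, mutator))
    return 0;
  for (uint32_t i = 0; i < SKETCH_BF_K; i++)
  {
    uint32_t b = sketch_probe (block_hash, mutator, i);

    bf->bits[b / 8] |= (uint8_t) (1u << (b % 8));
  }
  return 1;
}
```

In this sketch, calling @code{sketch_filter_reply} twice with the same
hash and mutator forwards the reply the first time and rejects it the
second time.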
6980
6981 @node Sample Code
6982 @subsubsection Sample Code
6983
6984 @c %**end of header
6985
6986 The source code in @strong{plugin_block_test.c} is a good starting point
6987 for new block plugins --- it does the minimal work by implementing a
6988 plugin that performs no validation at all.
6989 The respective @strong{Makefile.am} shows how to build and install a
6990 block plugin.
6991
6992 @node Conclusion2
@subsubsection Conclusion
6994
6995 @c %**end of header
6996
6997 In conclusion, GNUnet subsystems that want to use the DHT need to define a
6998 block format and write a plugin to match queries and replies. For testing,
6999 the @code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts
7000 any query as valid and any reply as matching any query.
7001 This type is also used for the DHT command line tools.
7002 However, it should NOT be used for normal applications due to the lack
7003 of error checking that results from this primitive implementation.
7004
7005 @cindex libgnunetdht
7006 @node libgnunetdht
7007 @subsection libgnunetdht
7008
7009 @c %**end of header
7010
7011 The DHT API itself is pretty simple and offers the usual GET and PUT
7012 functions that work as expected. The specified block type refers to the
7013 block library which allows the DHT to run application-specific logic for
7014 data stored in the network.
7015
7016
7017 @menu
7018 * GET::
7019 * PUT::
7020 * MONITOR::
7021 * DHT Routing Options::
7022 @end menu
7023
7024 @node GET
7025 @subsubsection GET
7026
7027 @c %**end of header
7028
7029 When using GET, the main consideration for developers (other than the
7030 block library) should be that after issuing a GET, the DHT will
7031 continuously cause (small amounts of) network traffic until the operation
7032 is explicitly canceled.
7033 So GET does not simply send out a single network request once; instead,
7034 the DHT will continue to search for data. This is needed to achieve good
7035 success rates and also handles the case where the respective PUT
7036 operation happens after the GET operation was started.
7037 Developers should not cancel an existing GET operation and then
7038 explicitly re-start it to trigger a new round of network requests;
7039 this is simply inefficient, especially as the internal automated version
7040 can be more efficient, for example by filtering results in the network
7041 that have already been returned.
7042
7043 If an application that performs a GET request has a set of replies that it
7044 already knows and would like to filter, it can call@
7045 @code{GNUNET_DHT_get_filter_known_results} with an array of hashes over
7046 the respective blocks to tell the DHT that these results are not
7047 desired (any more).
7048 This way, the DHT will filter the respective blocks using the block
7049 library in the network, which may result in a significant reduction in
7050 bandwidth consumption.
7051
7052 @node PUT
7053 @subsubsection PUT
7054
7055 @c %**end of header
7056
7057 @c inconsistent use of ``must'' above it's written ``MUST''
7058 In contrast to GET operations, developers @strong{must} manually re-run
7059 PUT operations periodically (if they intend the content to continue to be
7060 available). Content stored in the DHT expires or might be lost due to
7061 churn.
7062 Furthermore, GNUnet's DHT typically requires multiple rounds of PUT
7063 operations before a key-value pair is consistently available to all
peers (the DHT randomizes paths and thus storage locations, and only
after multiple rounds of PUTs will there be a sufficient number of
7066 replicas in large DHTs). An explicit PUT operation using the DHT API will
7067 only cause network traffic once, so in order to ensure basic availability
7068 and resistance to churn (and adversaries), PUTs must be repeated.
7069 While the exact frequency depends on the application, a rule of thumb is
7070 that there should be at least a dozen PUT operations within the content
7071 lifetime. Content in the DHT typically expires after one day, so
7072 DHT PUT operations should be repeated at least every 1-2 hours.
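
The rule of thumb above is simple arithmetic, sketched here in C. The
function names are hypothetical and not part of the DHT API; the sketch
only shows how a lifetime of one day and a dozen rounds lead to an
interval of two hours.

```c
#include <stdint.h>

/* Seconds between PUTs so that at least min_rounds PUTs fall within
   one content lifetime; returns 0 if min_rounds is 0, as no sensible
   schedule exists in that case. */
uint64_t
sketch_reput_interval (uint64_t lifetime_s, uint32_t min_rounds)
{
  if (0 == min_rounds)
    return 0;
  return lifetime_s / min_rounds;
}

/* Returns 1 if the next PUT is due at time now_s, given the time of
   the previous PUT and the chosen interval (all in seconds). */
int
sketch_reput_due (uint64_t last_put_s, uint64_t now_s,
                  uint64_t interval_s)
{
  return (now_s - last_put_s) >= interval_s;
}
```

An application would typically run such a check from a periodic task
and issue the PUT again once it is due.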
7073
7074 @node MONITOR
7075 @subsubsection MONITOR
7076
7077 @c %**end of header
7078
7079 The DHT API also allows applications to monitor messages crossing the
7080 local DHT service.
7081 The types of messages used by the DHT are GET, PUT and RESULT messages.
7082 Using the monitoring API, applications can choose to monitor these
7083 requests, possibly limiting themselves to requests for a particular block
7084 type.
7085
The monitoring API is useful not only for diagnostics; it can also be
7087 used to trigger application operations based on PUT operations.
7088 For example, an application may use PUTs to distribute work requests to
7089 other peers.
7090 The workers would then monitor for PUTs that give them work, instead of
7091 looking for work using GET operations.
7092 This can be beneficial, especially if the workers have no good way to
7093 guess the keys under which work would be stored.
7094 Naturally, additional protocols might be needed to ensure that the desired
7095 number of workers will process the distributed workload.
7096
7097 @node DHT Routing Options
7098 @subsubsection DHT Routing Options
7099
7100 @c %**end of header
7101
There are two important options for GET and PUT requests; two further
options are listed for completeness:
7103
7104 @table @asis
@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE
This option means that all peers should process the request, even if
their peer ID is not closest to the key. For a PUT request, this means
that all peers that a request traverses may make a copy of the data.
Similarly for a GET request, all peers will check their local database
for a result. Setting this option can thus significantly improve caching
and reduce bandwidth consumption, at the expense of a larger DHT
database. If in doubt, we recommend using this option.
@item GNUNET_DHT_RO_RECORD_ROUTE
This option instructs the DHT to record the path that a GET or a PUT
request is taking through the overlay network. The resulting paths are
then returned to the application with the respective result. This
allows the receiver of a result to construct a path to the originator
of the data, which might then be used for routing. Naturally, setting
this option requires additional bandwidth and disk space, so
applications should only set this if the paths are needed by the
application logic.
@item GNUNET_DHT_RO_FIND_PEER
This option is an internal option used by the DHT's peer discovery
mechanism and should not be used by applications.
@item GNUNET_DHT_RO_BART
This option is currently not implemented. It may in the future offer
performance improvements for clique topologies.
7125 @end table
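
The following sketch shows how a peer might act on the two
application-relevant options. It assumes that the options are
independent bit flags that can be combined; the @code{SKETCH_} names and
their numeric values are illustrative and should not be taken as the
values defined by the DHT API.

```c
#include <stdint.h>

/* Illustrative flag values, chosen for this sketch only. */
enum sketch_route_option
{
  SKETCH_RO_NONE = 0,
  SKETCH_RO_DEMULTIPLEX_EVERYWHERE = 1,
  SKETCH_RO_RECORD_ROUTE = 2
};

/* A peer handles the request locally if it is closest to the key or
   if the request asks every peer on the path to process it. */
int
sketch_process_locally (uint32_t options, int is_closest_peer)
{
  return is_closest_peer
         || (0 != (options & SKETCH_RO_DEMULTIPLEX_EVERYWHERE));
}

/* The path is only tracked if the request asks for it. */
int
sketch_track_path (uint32_t options)
{
  return 0 != (options & SKETCH_RO_RECORD_ROUTE);
}
```

A request carrying both flags would be processed by every peer it
traverses and would accumulate the path as it travels.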
7126
7127 @node The DHT Client-Service Protocol
7128 @subsection The DHT Client-Service Protocol
7129
7130 @c %**end of header
7131
7132 @menu
7133 * PUTting data into the DHT::
7134 * GETting data from the DHT::
7135 * Monitoring the DHT::
7136 @end menu
7137
7138 @node PUTting data into the DHT
7139 @subsubsection PUTting data into the DHT
7140
7141 @c %**end of header
7142
7143 To store (PUT) data into the DHT, the client sends a
7144 @code{struct GNUNET_DHT_ClientPutMessage} to the service.
7145 This message specifies the block type, routing options, the desired
7146 replication level, the expiration time, key,
7147 value and a 64-bit unique ID for the operation. The service responds with
7148 a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same
7149 64-bit unique ID. Note that the service sends the confirmation as soon as
7150 it has locally processed the PUT request. The PUT may still be
7151 propagating through the network at this time.
7152
7153 In the future, we may want to change this to provide (limited) feedback
7154 to the client, for example if we detect that the PUT operation had no
7155 effect because the same key-value pair was already stored in the DHT.
7156 However, changing this would also require additional state and messages
7157 in the P2P interaction.
7158
7159 @node GETting data from the DHT
7160 @subsubsection GETting data from the DHT
7161
7162 @c %**end of header
7163
7164 To retrieve (GET) data from the DHT, the client sends a
7165 @code{struct GNUNET_DHT_ClientGetMessage} to the service. The message
7166 specifies routing options, a replication level (for replicating the GET,
7167 not the content), the desired block type, the key, the (optional)
extended query and a unique 64-bit request ID.
7169
7170 Additionally, the client may send any number of
7171 @code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the
7172 service about results that the client is already aware of.
7173 These messages consist of the key, the unique 64-bit ID of the request,
7174 and an arbitrary number of hash codes over the blocks that the client is
7175 already aware of. As messages are restricted to 64k, a client that
7176 already knows more than about a thousand blocks may need to send
several of these messages. The client should transmit these
messages as quickly as possible after the original GET request so that
the DHT can filter those results in the network early on. Since
these messages are sent after the original request, it is conceivable
that the DHT service may nevertheless return blocks that match those
already known to the client.
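
The figure of about a thousand blocks per message follows from the
message size limit and the size of a hash code. The sketch below makes
the arithmetic explicit. The constants are assumptions: a limit of
65535 bytes per message, 64 bytes per (512-bit) hash code, and a guessed
80 bytes for the fixed part (header, key and request ID); the function
names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAX_MSG 65535u   /* assumed message size limit */
#define SKETCH_FIXED_PART 80u   /* assumed header + key + request ID */
#define SKETCH_HASH_SIZE 64u    /* one 512-bit hash code */

/* Number of hash codes that fit into a single message. */
size_t
sketch_hashes_per_message (void)
{
  return (SKETCH_MAX_MSG - SKETCH_FIXED_PART) / SKETCH_HASH_SIZE;
}

/* Number of messages needed to report known_results hash codes
   (rounded up; zero results need zero messages). */
size_t
sketch_messages_needed (size_t known_results)
{
  size_t per = sketch_hashes_per_message ();

  return (known_results + per - 1) / per;
}
```

Under these assumptions a client that already knows 5000 blocks would
have to send five such messages.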
7183
7184 In response to a GET request, the service will send @code{struct
7185 GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
7186 block type, expiration, key, unique ID of the request and of course the
7187 value (a block). Depending on the options set for the respective
7188 operations, the replies may also contain the path the GET and/or the PUT
7189 took through the network.
7190
7191 A client can stop receiving replies either by disconnecting or by sending
7192 a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the
7193 key and the 64-bit unique ID of the original request. Using an
7194 explicit "stop" message is more common as this allows a client to run
7195 many concurrent GET operations over the same connection with the DHT
7196 service --- and to stop them individually.
7197
7198 @node Monitoring the DHT
7199 @subsubsection Monitoring the DHT
7200
7201 @c %**end of header
7202
7203 To begin monitoring, the client sends a
7204 @code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service.
7205 In this message, flags can be set to enable (or disable) monitoring of
7206 GET, PUT and RESULT messages that pass through a peer. The message can
7207 also restrict monitoring to a particular block type or a particular key.
7208 Once monitoring is enabled, the DHT service will notify the client about
7209 any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for
GET events, @code{struct GNUNET_DHT_MonitorPutMessage}s for PUT events
and @code{struct GNUNET_DHT_MonitorGetRespMessage}s for RESULTs. Each of
7212 these messages contains all of the information about the event.
7213
7214 @node The DHT Peer-to-Peer Protocol
7215 @subsection The DHT Peer-to-Peer Protocol
7216 @c %**end of header
7217
7218
7219 @menu
7220 * Routing GETs or PUTs::
7221 * PUTting data into the DHT2::
7222 * GETting data from the DHT2::
7223 @end menu
7224
7225 @node Routing GETs or PUTs
7226 @subsubsection Routing GETs or PUTs
7227
7228 @c %**end of header
7229
7230 When routing GETs or PUTs, the DHT service selects a suitable subset of
7231 neighbours for forwarding. The exact number of neighbours can be zero or
7232 more and depends on the hop counter of the query (initially zero) in
7233 relation to the (log of) the network size estimate, the desired
7234 replication level and the peer's connectivity.
Depending on the hop counter and our network size estimate, the selection
of the peers may be randomized or based on proximity to the key.
7237 Furthermore, requests include a set of peers that a request has already
7238 traversed; those peers are also excluded from the selection.
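
A minimal sketch of such a selection is given below. It is not the
algorithm used by the DHT service: it picks a single neighbour instead
of a subset, uses 64-bit integers in place of peer identities and keys,
measures proximity by XOR distance, and assumes that hops below the log
of the network size estimate are random while later hops are greedy.
All names are hypothetical, and the random value is supplied by the
caller to keep the example deterministic.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the non-excluded peer closest to key under XOR distance,
   or -1 if every peer is excluded. */
int
sketch_closest (const uint64_t *peers, const int *excluded, size_t n,
                uint64_t key)
{
  int best = -1;

  for (size_t i = 0; i < n; i++)
  {
    if (excluded[i])
      continue;
    if ((best < 0) || ((peers[i] ^ key) < (peers[best] ^ key)))
      best = (int) i;
  }
  return best;
}

/* Early hops pick a random non-excluded peer (using rnd); later hops
   pick the closest one. Returns the index or -1 if none is left. */
int
sketch_select (const uint64_t *peers, const int *excluded, size_t n,
               uint64_t key, uint32_t hop_count,
               uint32_t log2_net_size, uint32_t rnd)
{
  size_t candidates = 0;
  size_t pick;

  if (hop_count >= log2_net_size)
    return sketch_closest (peers, excluded, n, key);
  for (size_t i = 0; i < n; i++)
    if (! excluded[i])
      candidates++;
  if (0 == candidates)
    return -1;
  pick = rnd % candidates;
  for (size_t i = 0; i < n; i++)
  {
    if (excluded[i])
      continue;
    if (0 == pick)
      return (int) i;
    pick--;
  }
  return -1;
}
```

The @code{excluded} array plays the role of the set of peers that the
request has already traversed.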
7239
7240 @node PUTting data into the DHT2
@subsubsection PUTting data into the DHT
7242
7243 @c %**end of header
7244
7245 To PUT data into the DHT, the service sends a @code{struct PeerPutMessage}
7246 of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective
7247 neighbour.
7248 In addition to the usual information about the content (type, routing
7249 options, desired replication level for the content, expiration time, key
7250 and value), the message contains a fixed-size Bloom filter with
7251 information about which peers (may) have already seen this request.
7252 This Bloom filter is used to ensure that DHT messages never loop back to
7253 a peer that has already processed the request.
7254 Additionally, the message includes the current hop counter and, depending
7255 on the routing options, the message may include the full path that the
7256 message has taken so far.
7257 The Bloom filter should already contain the identity of the previous hop;
7258 however, the path should not include the identity of the previous hop and
7259 the receiver should append the identity of the sender to the path, not
7260 its own identity (this is done to reduce bandwidth).
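
The path convention described above can be sketched as follows. The
structure, its fixed capacity and the 64-bit peer identities are
assumptions made for this example; the real message format differs.
What the sketch shows is only that the receiver records the sender, so
a peer never has to transmit its own identity as part of the path.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PATH 8u /* hypothetical capacity for the sketch */

struct sketch_path
{
  size_t len;
  uint64_t hops[SKETCH_MAX_PATH];
};

/* Called by the receiver of a message: append the identity of the
   sender (the previous hop) to the path, if route recording is on.
   Returns 0 on success and -1 if the path is full. */
int
sketch_record_hop (struct sketch_path *p, uint64_t sender_id,
                   int record_route)
{
  if (! record_route)
    return 0;
  if (p->len >= SKETCH_MAX_PATH)
    return -1;
  p->hops[p->len++] = sender_id;
  return 0;
}
```

If peer A sends to B and B forwards to C, then C ends up with the path
A, B, although neither A nor B added itself.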
7261
7262 @node GETting data from the DHT2
@subsubsection GETting data from the DHT
7264
7265 @c %**end of header
7266
7267 A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
7268 @code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the
7269 usual information about the request (type, routing options, desired
7270 replication level for the request, the key and the extended query), a GET
7271 request also contains a hop counter, a Bloom filter over the peers
that have processed the request already and, depending on the routing
options, the full path traversed by the GET.
7274 Finally, a GET request includes a variable-size second Bloom filter and a
7275 so-called Bloom filter mutator value which together indicate which
7276 replies the sender has already seen. During the lookup, each block that
matches the block type, key and extended query is additionally subjected
7278 to a test against this Bloom filter.
7279 The block plugin is expected to take the hash of the block and combine it
7280 with the mutator value and check if the result is not yet in the Bloom
7281 filter. The originator of the query will from time to time modify the
7282 mutator to (eventually) allow false-positives filtered by the Bloom filter
7283 to be returned.
7284
7285 Peers that receive a GET request perform a local lookup (depending on
7286 their proximity to the key and the query options) and forward the request
7287 to other peers.
7288 They then remember the request (including the Bloom filter for blocking
duplicate results) and, when they obtain a matching, non-filtered response,
a @code{struct PeerResultMessage} of type
@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous
hop.
Whenever a result is forwarded, the block plugin is used to update the
7294 Bloom filter accordingly, to ensure that the same result is never
7295 forwarded more than once.
7296 The DHT service may also cache forwarded results locally if the
7297 "CACHE_RESULTS" option is set to "YES" in the configuration.
7298
7299 @cindex GNS
7300 @cindex GNU Name System
7301 @node GNU Name System (GNS)
7302 @section GNU Name System (GNS)
7303
7304 @c %**end of header
7305
7306 The GNU Name System (GNS) is a decentralized database that enables users
7307 to securely resolve names to values.
7308 Names can be used to identify other users (for example, in social
7309 networking), or network services (for example, VPN services running at a
7310 peer in GNUnet, or purely IP-based services on the Internet).
7311 Users interact with GNS by typing in a hostname that ends in a
7312 top-level domain that is configured in the ``GNS'' section, matches
7313 an identity of the user or ends in a Base32-encoded public key.
7314
Videos giving an overview of most of the GNS and the motivations behind
it are available here and here.
7317 The remainder of this chapter targets developers that are familiar with
7318 high level concepts of GNS as presented in these talks.
7319 @c TODO: Add links to here and here and to these.
7320
7321 GNS-aware applications should use the GNS resolver to obtain the
7322 respective records that are stored under that name in GNS.
7323 Each record consists of a type, value, expiration time and flags.
7324
7325 The type specifies the format of the value. Types below 65536 correspond
to DNS record types; larger values are used for GNS-specific records.
7327 Applications can define new GNS record types by reserving a number and
7328 implementing a plugin (which mostly needs to convert the binary value
7329 representation to a human-readable text format and vice-versa).
7330 The expiration time specifies how long the record is to be valid.
7331 The GNS API ensures that applications are only given non-expired values.
7332 The flags are typically irrelevant for applications, as GNS uses them
7333 internally to control visibility and validity of records.
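
As a rough illustration of the record properties above, the following
sketch models a record with hypothetical field names (the actual GNS
record structure is not reproduced here) and shows the two checks an
application-level view boils down to: whether a type number lies in the
DNS range and whether a record has expired.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of a record, for this sketch only. */
struct sketch_record
{
  uint32_t type;          /* format of the value */
  uint64_t expiration_s;  /* absolute expiration time in seconds */
  uint32_t flags;         /* used internally by GNS */
  const void *data;       /* binary value */
  size_t data_size;
};

/* Types below 65536 correspond to DNS record types. */
int
sketch_is_dns_type (uint32_t type)
{
  return type < 65536u;
}

/* A record is treated as expired once its expiration time is reached. */
int
sketch_is_expired (const struct sketch_record *rd, uint64_t now_s)
{
  return rd->expiration_s <= now_s;
}
```

Applications normally do not need the expiration check themselves,
since the GNS API already withholds expired values.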
7334
7335 Records are stored along with a signature.
7336 The signature is generated using the private key of the authoritative
7337 zone. This allows any GNS resolver to verify the correctness of a
7338 name-value mapping.
7339
7340 Internally, GNS uses the NAMECACHE to cache information obtained from
7341 other users, the NAMESTORE to store information specific to the local
7342 users, and the DHT to exchange data between users.
7343 A plugin API is used to enable applications to define new GNS
7344 record types.
7345
7346 @menu
7347 * libgnunetgns::
7348 * libgnunetgnsrecord::
7349 * GNS plugins::
7350 * The GNS Client-Service Protocol::
7351 * Hijacking the DNS-Traffic using gnunet-service-dns::
7352 * Serving DNS lookups via GNS on W32::
7353 @end menu
7354
7355 @node libgnunetgns
7356 @subsection libgnunetgns
7357
7358 @c %**end of header
7359
The GNS API itself is extremely simple. Clients first connect to the
7361 GNS service using @code{GNUNET_GNS_connect}.
7362 They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel
7363 pending lookups using @code{GNUNET_GNS_lookup_cancel}.
7364 Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}.
7365
7366 @menu
7367 * Looking up records::
7368 * Accessing the records::
7369 * Creating records::
7370 * Future work::
7371 @end menu
7372
7373 @node Looking up records
7374 @subsubsection Looking up records
7375
7376 @c %**end of header
7377
7378 @code{GNUNET_GNS_lookup} takes a number of arguments:
7379
7380 @table @asis
@item handle
This is simply the GNS connection handle from
@code{GNUNET_GNS_connect}.
@item name
The client needs to specify the name to
be resolved. This can be any valid DNS or GNS hostname.
@item zone
The client needs to specify the public key of the GNS zone against
which the resolution should be done.
Note that a key must be provided; the client should
look up plausible values using its configuration,
the identity service and by attempting to interpret the
TLD as a base32-encoded public key.
@item type
This is the desired GNS or DNS record type
to look for. While all records for the given name will be returned, this
can be important if the client wants to resolve record types that
themselves delegate resolution, such as CNAME, PKEY or GNS2DNS.
Resolving a record of any of these types will only work if the respective
record type is specified in the request, as the GNS resolver will
otherwise follow the delegation and return the records from the
respective destination, instead of the delegating record.
@item only_cached
This argument should typically be set to
@code{GNUNET_NO}. Setting it to @code{GNUNET_YES} disables resolution via
the overlay network.
@item shorten_zone_key
If GNS encounters new names during resolution,
their respective zones can automatically be learned and added to the
"shorten zone". If this is desired, clients must pass the private key of
the shorten zone. If NULL is passed, shortening is disabled.
@item proc
This argument identifies the function to call with the result. It is
given @code{proc_cls}, the number of records found (possibly zero) and
the array of the records as arguments.
@code{proc} will only be called once. After @code{proc} has been called,
the lookup must no longer be cancelled.
@item proc_cls
The closure for @code{proc}.
7413 @end table
7414
7415 @node Accessing the records
7416 @subsubsection Accessing the records
7417
7418 @c %**end of header
7419
7420 The @code{libgnunetgnsrecord} library provides an API to manipulate the
7421 GNS record array that is given to proc. In particular, it offers
7422 functions such as converting record values to human-readable
7423 strings (and back). However, most @code{libgnunetgnsrecord} functions are
7424 not interesting to GNS client applications.
7425
7426 For DNS records, the @code{libgnunetdnsparser} library provides
7427 functions for parsing (and serializing) common types of DNS records.
7428
7429 @node Creating records
7430 @subsubsection Creating records
7431
7432 @c %**end of header
7433
7434 Creating GNS records is typically done by building the respective record
7435 information (possibly with the help of @code{libgnunetgnsrecord} and
7436 @code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
7437 publish the information. The GNS API is not involved in this
7438 operation.
7439
7440 @node Future work
7441 @subsubsection Future work
7442
7443 @c %**end of header
7444
7445 In the future, we want to expand @code{libgnunetgns} to allow
7446 applications to observe shortening operations performed during GNS
7447 resolution, for example so that users can receive visual feedback when
7448 this happens.
7449
7450 @node libgnunetgnsrecord
7451 @subsection libgnunetgnsrecord
7452
7453 @c %**end of header
7454
7455 The @code{libgnunetgnsrecord} library is used to manipulate GNS
7456 records (in plaintext or in their encrypted format).
7457 Applications mostly interact with @code{libgnunetgnsrecord} by using the
7458 functions to convert GNS record values to strings or vice-versa, or to
look up a GNS record type number by name (or vice-versa).
7460 The library also provides various other functions that are mostly
7461 used internally within GNS, such as converting keys to names, checking for
7462 expiration, encrypting GNS records to GNS blocks, verifying GNS block
7463 signatures and decrypting GNS records from GNS blocks.
7464
7465 We will now discuss the four commonly used functions of the API.@
7466 @code{libgnunetgnsrecord} does not perform these operations itself,
7467 but instead uses plugins to perform the operation.
7468 GNUnet includes plugins to support common DNS record types as well as
7469 standard GNS record types.
7470
7471 @menu
7472 * Value handling::
7473 * Type handling::
7474 @end menu
7475
7476 @node Value handling
7477 @subsubsection Value handling
7478
7479 @c %**end of header
7480
7481 @code{GNUNET_GNSRECORD_value_to_string} can be used to convert
the (binary) representation of a GNS record value to a human-readable,
7483 0-terminated UTF-8 string.
7484 NULL is returned if the specified record type is not supported by any
7485 available plugin.
7486
7487 @code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a
human-readable string to the respective (binary) representation of
7489 a GNS record value.
7490
7491 @node Type handling
7492 @subsubsection Type handling
7493
7494 @c %**end of header
7495
7496 @code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the
7497 numeric value associated with a given typename. For example, given the
typename "A" (for DNS A records), the function will return the number 1.
7499 A list of common DNS record types is
7500 @uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}.
7501 Note that not all DNS record types are supported by GNUnet GNSRECORD
7502 plugins at this time.
7503
7504 @code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the
7505 typename associated with a given numeric value.
7506 For example, given the type number 1, the function will return the
7507 typename "A".
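
The two conversions can be pictured as lookups in a table of names and
numbers. The sketch below is a stand-alone illustration with a handful
of well-known DNS type numbers. It does not use the plugin mechanism;
the @code{sketch_} function names and the error values
(@code{UINT32_MAX} and @code{NULL}) are choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_type_entry
{
  const char *name;
  uint32_t number;
};

/* A few standard DNS record types and their numbers. */
static const struct sketch_type_entry sketch_types[] = {
  { "A", 1 }, { "NS", 2 }, { "CNAME", 5 }, { "SOA", 6 },
  { "PTR", 12 }, { "MX", 15 }, { "TXT", 16 }, { "AAAA", 28 }
};

#define SKETCH_NUM_TYPES (sizeof (sketch_types) / sizeof (sketch_types[0]))

/* Returns the type number for name, or UINT32_MAX if unknown. */
uint32_t
sketch_typename_to_number (const char *name)
{
  for (size_t i = 0; i < SKETCH_NUM_TYPES; i++)
    if (0 == strcmp (name, sketch_types[i].name))
      return sketch_types[i].number;
  return UINT32_MAX;
}

/* Returns the typename for number, or NULL if unknown. */
const char *
sketch_number_to_typename (uint32_t number)
{
  for (size_t i = 0; i < SKETCH_NUM_TYPES; i++)
    if (number == sketch_types[i].number)
      return sketch_types[i].name;
  return NULL;
}
```

In GNUnet itself the table is effectively spread over the installed
plugins, each of which answers for the types it supports.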
7508
7509 @node GNS plugins
7510 @subsection GNS plugins
7511
7512 @c %**end of header
7513
7514 Adding a new GNS record type typically involves writing (or extending) a
7515 GNSRECORD plugin. The plugin needs to implement the
7516 @code{gnunet_gnsrecord_plugin.h} API which provides basic functions that
7517 are needed by GNSRECORD to convert typenames and values of the respective
7518 record type to strings (and back).
7519 These gnsrecord plugins are typically implemented within their respective
7520 subsystems.
7521 Examples for such plugins can be found in the GNSRECORD, GNS and
7522 CONVERSATION subsystems.
7523
7524 The @code{libgnunetgnsrecord} library is then used to locate, load and
7525 query the appropriate gnsrecord plugin.
7526 Which plugin is appropriate is determined by the record type (which is
7527 just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all
gnsrecord plugins that are installed at the local peer and forwards the
application request to the plugins. If the record type is not
supported by the plugin, it should simply return an error code.

The central functions of the GNSRECORD APIs (plugin and main library) are
the same four functions for converting between values and strings, and
typenames and numbers documented in the previous subsection.
7535
7536 @node The GNS Client-Service Protocol
7537 @subsection The GNS Client-Service Protocol
7538 @c %**end of header
7539
7540 The GNS client-service protocol consists of two simple messages, the
7541 @code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP}
7542 message contains a unique 32-bit identifier, which will be included in the
7543 corresponding response. Thus, clients can send many lookup requests in
7544 parallel and receive responses out-of-order.
7545 A @code{LOOKUP} request also includes the public key of the GNS zone,
7546 the desired record type and fields specifying whether shortening is
7547 enabled or networking is disabled. Finally, the @code{LOOKUP} message
7548 includes the name to be resolved.
7549
7550 The response includes the number of records and the records themselves
7551 in the format created by @code{GNUNET_GNSRECORD_records_serialize}.
7552 They can thus be deserialized using
7553 @code{GNUNET_GNSRECORD_records_deserialize}.
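
The role of the 32-bit identifier can be sketched as a small table of
pending requests on the client side. The data structures and function
names below are hypothetical and much simpler than the real client
library; identifier wrap-around while old requests are still pending is
ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PENDING 16

struct sketch_pending
{
  uint32_t id;       /* identifier sent in the LOOKUP message */
  const char *name;  /* name being resolved; NULL marks a free slot */
};

struct sketch_client
{
  uint32_t next_id;
  struct sketch_pending slots[SKETCH_MAX_PENDING];
};

/* Register a lookup and return its identifier (never 0), or 0 if
   too many lookups are pending. */
uint32_t
sketch_send_lookup (struct sketch_client *c, const char *name)
{
  for (size_t i = 0; i < SKETCH_MAX_PENDING; i++)
  {
    if (NULL != c->slots[i].name)
      continue;
    if (0 == ++c->next_id)
      c->next_id = 1;
    c->slots[i].id = c->next_id;
    c->slots[i].name = name;
    return c->next_id;
  }
  return 0;
}

/* Match a LOOKUP_RESULT to its request by identifier. Returns the
   name that was looked up and frees the slot, or NULL if no pending
   request has this identifier. */
const char *
sketch_on_result (struct sketch_client *c, uint32_t id)
{
  for (size_t i = 0; i < SKETCH_MAX_PENDING; i++)
  {
    const char *name = c->slots[i].name;

    if ((NULL == name) || (c->slots[i].id != id))
      continue;
    c->slots[i].name = NULL;
    return name;
  }
  return NULL;
}
```

Because results are matched by identifier rather than by arrival order,
the second lookup in this sketch may be answered before the first.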
7554
7555 @node Hijacking the DNS-Traffic using gnunet-service-dns
7556 @subsection Hijacking the DNS-Traffic using gnunet-service-dns
7557
7558 @c %**end of header
7559
This section documents how @code{gnunet-service-dns} (together with
@code{gnunet-helper-dns}) intercepts DNS queries from the local system.
This is merely one of several ways to obtain GNS queries.
It is also possible to change @code{resolv.conf} to point to a machine
running @code{gnunet-dns2gns} or to modify libc's name service switch
(NSS) configuration to include a GNS resolution plugin.
The method described in this section is more of a last-ditch catch-all
approach.
7568
@code{gnunet-service-dns} intercepts DNS traffic using policy-based
routing.
We MARK every outgoing DNS packet that was not sent by our application.
Using a second routing table in the Linux kernel, these marked packets
are then routed through our virtual network interface and can thus be
captured unchanged.
7575
7576 Our application then reads the query and decides how to handle it.
7577 If the query can be addressed via GNS, it is passed to
7578 @code{gnunet-service-gns} and resolved internally using GNS.
7579 In the future, a reverse query for an address of the configured virtual
7580 network could be answered with records kept about previous forward
7581 queries.
7582 Queries that are not hijacked by some application using the DNS service
7583 will be sent to the original recipient.
7584 The answer to the query will always be sent back through the virtual
7585 interface with the original nameserver as source address.
7586
7587
7588 @menu
7589 * Network Setup Details::
7590 @end menu
7591
7592 @node Network Setup Details
7593 @subsubsection Network Setup Details
7594
7595 @c %**end of header
7596
The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 \
  -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
ip rule add fwmark 3 table2
ip route add default via $VIRTUALDNS table2
@end example

The first command makes sure that all packets coming from a port our
application opened beforehand (@code{$LOCALPORT}) are routed normally.
The second command marks every other packet sent to a DNS server with
mark 3 (chosen arbitrarily). The third command adds a routing policy
that sends packets carrying mark 3 to a second routing table, and the
fourth command gives that table a default route via
@code{$VIRTUALDNS}, so that marked packets leave through the virtual
network interface.
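
As an illustration of how the two placeholders enter the rules, the
following Python sketch assembles the commands shown above as argument
lists. It is only a sketch and not how the DNS helper is implemented:
the function name @code{dns_intercept_commands} is hypothetical, and
nothing is executed.

@example
```python
def dns_intercept_commands(local_port, virtual_dns):
    """Return the four rule commands as argument lists (not executed)."""
    port = str(local_port)
    return [
        # Packets from our own port are accepted and routed normally.
        ["iptables", "-t", "mangle", "-I", "OUTPUT", "1", "-p", "udp",
         "--sport", port, "--dport", "53", "-j", "ACCEPT"],
        # Every other DNS packet is marked with mark 3.
        ["iptables", "-t", "mangle", "-I", "OUTPUT", "2", "-p", "udp",
         "--dport", "53", "-j", "MARK", "--set-mark", "3"],
        # Marked packets are looked up in the second routing table ...
        ["ip", "rule", "add", "fwmark", "3", "table2"],
        # ... whose default route points at the virtual interface.
        ["ip", "route", "add", "default", "via", virtual_dns, "table2"],
    ]
```
@end example

Keeping each command as a list avoids shell quoting problems should the
values ever be passed to a process-spawning function.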
7612
7613 @node Serving DNS lookups via GNS on W32
7614 @subsection Serving DNS lookups via GNS on W32
7615
7616 @c %**end of header
7617
This section documents how libw32nsp (together with
gnunet-gns-helper-service-w32) resolves DNS queries on the
local system. This only applies to GNUnet running on W32.
7621
7622 W32 has a concept of "Namespaces" and "Namespace providers".
7623 These are used to present various name systems to applications in a
7624 generic way.
7625 Namespaces include DNS, mDNS, NLA and others. For each namespace any
7626 number of providers could be registered, and they are queried in an order
7627 of priority (which is adjustable).
7628
Applications can resolve names by using the WSALookupService*() family
of functions.

However, these are WSA-only facilities. Common BSD socket functions for
name resolution are gethostbyname() and getaddrinfo() (among others).
By default, these functions are implemented internally by mswsock
(which also implements the default DNS provider) as wrappers around
the WSALookupService*() functions (see "Sample Code for a Service
Provider" on MSDN).
7638
On W32, GNUnet builds libw32nsp, a namespace provider which can then be
installed into the system by using w32nsp-install (and uninstalled by
w32nsp-uninstall), as described in the "Installation Handbook".

libw32nsp is very simple and has almost no dependencies. In response to
NSPLookupServiceBegin(), it only checks that the provider GUID passed to
it by the caller matches the GNUnet DNS Provider GUID, then connects to
gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the
name resolution request there, returning the connected socket to the
caller.
7650
7651 When the caller invokes NSPLookupServiceNext(), libw32nsp reads a
7652 completely formed reply from that socket, unmarshalls it, then gives
7653 it back to the caller.
7654
At the moment, gnunet-gns-helper-service-w32 only ever gives one reply,
and subsequent calls to NSPLookupServiceNext() will fail with WSA_NODATA
(the first call to NSPLookupServiceNext() might also fail if GNS failed
to find the name, or if there was an error connecting to it).
7659
7660 gnunet-gns-helper-service-w32 does most of the processing:
7661
7662 @itemize @bullet
7663 @item Maintains a connection to GNS.
7664 @item Reads GNS config and loads appropriate keys.
7665 @item Checks service GUID and decides on the type of record to look up,
7666 refusing to make a lookup outright when unsupported service GUID is
7667 passed.
7668 @item Launches the lookup
7669 @end itemize
7670
When the lookup result arrives, gnunet-gns-helper-service-w32 forms a
complete reply (including filling a WSAQUERYSETW structure and, possibly,
a binary blob with a hostent structure for gethostbyname() clients),
marshalls it, and sends it back to libw32nsp. If no records were found,
it sends an empty header.

This works for most normal applications that use gethostbyname() or
getaddrinfo() to resolve names, but fails to do anything with
applications that use alternative means of resolving names (such as
sending queries to a DNS server directly by themselves).
This includes some well-known utilities, like "ping" and "nslookup".
7682
7683 @cindex GNS Namecache
7684 @node GNS Namecache
7685 @section GNS Namecache
7686
7687 @c %**end of header
7688
7689 The NAMECACHE subsystem is responsible for caching (encrypted) resolution
7690 results of the GNU Name System (GNS). GNS makes zone information available
7691 to other users via the DHT. However, as accessing the DHT for every
7692 lookup is expensive (and as the DHT's local cache is lost whenever the
7693 peer is restarted), GNS uses the NAMECACHE as a more persistent cache for
7694 DHT lookups.
7695 Thus, instead of always looking up every name in the DHT, GNS first
7696 checks if the result is already available locally in the NAMECACHE.
Only if there is no result in the NAMECACHE does GNS query the DHT.
7698 The NAMECACHE stores data in the same (encrypted) format as the DHT.
7699 It thus makes no sense to iterate over all items in the
7700 NAMECACHE --- the NAMECACHE does not have a way to provide the keys
7701 required to decrypt the entries.
7702
Blocks in the NAMECACHE share the same expiration mechanism as blocks in
the DHT: a block expires whenever any of the records in the (encrypted)
block expires.
The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required
work to expire blocks; clients do not have to worry about this.
7709 Also, given that NAMECACHE stores only GNS blocks that local users
7710 requested, there is no configuration option to limit the size of the
7711 NAMECACHE. It is assumed to be always small enough (a few MB) to fit on
7712 the drive.
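
The expiration rule can be stated compactly: the expiration time of a
block is the smallest expiration time among its records. The following
Python sketch illustrates this under the assumption that expiration
times are plain numbers (for example seconds since the epoch). The
function names are hypothetical, and treating a block as expired at
exactly its expiration time is also an assumption.

@example
```python
def block_expiration(record_expirations):
    """A block expires as soon as any one of its records expires."""
    if not record_expirations:
        raise ValueError("a block needs at least one record")
    return min(record_expirations)


def is_expired(record_expirations, now):
    # Assumption: the block counts as expired once "now" reaches the
    # earliest record expiration time.
    return block_expiration(record_expirations) <= now
```
@end example

For records expiring at times 300, 120 and 900, the block as a whole
expires at time 120.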
7713
7714 The NAMECACHE supports the use of different database backends via a
7715 plugin API.
7716
7717 @menu
7718 * libgnunetnamecache::
7719 * The NAMECACHE Client-Service Protocol::
7720 * The NAMECACHE Plugin API::
7721 @end menu
7722
7723 @node libgnunetnamecache
7724 @subsection libgnunetnamecache
7725
7726 @c %**end of header
7727
7728 The NAMECACHE API consists of five simple functions. First, there is
7729 @code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
7730 This returns the handle required for all other operations on the
7731 NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
7732 block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that
7734 were stored in the NAMECACHE. Both operations can be cancelled using
7735 @code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
7736 @code{GNUNET_NAMECACHE_block_cache} operation can result in the block
7737 being stored in the NAMECACHE --- or not. Cancellation primarily ensures
7738 that the continuation function with the result of the operation will no
7739 longer be invoked.
7740 Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
7741 NAMECACHE.
7742
7743 The maximum size of a block that can be stored in the NAMECACHE is
7744 @code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
7745
7746 @node The NAMECACHE Client-Service Protocol
7747 @subsection The NAMECACHE Client-Service Protocol
7748
7749 @c %**end of header
7750
7751 All messages in the NAMECACHE IPC protocol start with the
7752 @code{struct GNUNET_NAMECACHE_Header} which adds a request
7753 ID (32-bit integer) to the standard message header.
7754 The request ID is used to match requests with the
7755 respective responses from the NAMECACHE, as they are allowed to happen
7756 out-of-order.
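
To make the header layout concrete, here is a small Python sketch that
packs and unpacks such a header. It assumes that the standard message
header consists of a 16-bit size followed by a 16-bit type, both in
network byte order, and that the request ID follows directly. The
helper names and the message type number used with them are
placeholders, not values taken from the GNUnet sources.

@example
```python
import struct

# Assumed layout: 16-bit size, 16-bit type, 32-bit request id,
# all in network byte order.
NAMECACHE_HEADER = struct.Struct("!HHI")


def pack_message(msg_type, request_id, payload=b""):
    # The size field covers the header and the payload.
    size = NAMECACHE_HEADER.size + len(payload)
    return NAMECACHE_HEADER.pack(size, msg_type, request_id) + payload


def unpack_message(message):
    size, msg_type, request_id = NAMECACHE_HEADER.unpack_from(message)
    payload = message[NAMECACHE_HEADER.size:size]
    return size, msg_type, request_id, payload
```
@end example

A client would remember the request ID it packed and compare it with
the ID found in each response to find the matching request.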
7757
7758
7759 @menu
7760 * Lookup::
7761 * Store::
7762 @end menu
7763
7764 @node Lookup
7765 @subsubsection Lookup
7766
7767 @c %**end of header
7768
The @code{struct LookupBlockMessage} is used to look up a block stored in
7770 the cache.
7771 It contains the query hash. The NAMECACHE always responds with a
7772 @code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
7773 response, it sets the expiration time in the response to zero.
7774 Otherwise, the response is expected to contain the expiration time, the
7775 ECDSA signature, the derived key and the (variable-size) encrypted data
7776 of the block.
7777
7778 @node Store
7779 @subsubsection Store
7780
7781 @c %**end of header
7782
7783 The @code{struct BlockCacheMessage} is used to cache a block in the
7784 NAMECACHE.
7785 It has the same structure as the @code{struct LookupBlockResponseMessage}.
7786 The service responds with a @code{struct BlockCacheResponseMessage} which
7787 contains the result of the operation (success or failure).
7788 In the future, we might want to make it possible to provide an error
7789 message as well.
7790
7791 @node The NAMECACHE Plugin API
7792 @subsection The NAMECACHE Plugin API
7793 @c %**end of header
7794
The NAMECACHE plugin API consists of two functions, @code{cache_block} to
store a block in the database, and @code{lookup_block} to look up a block
in the database.
7798
7799
7800 @menu
7801 * Lookup2::
7802 * Store2::
7803 @end menu
7804
7805 @node Lookup2
7806 @subsubsection Lookup2
7807
7808 @c %**end of header
7809
7810 The @code{lookup_block} function is expected to return at most one block
7811 to the iterator, and return @code{GNUNET_NO} if there were no non-expired
7812 results.
7813 If there are multiple non-expired results in the cache, the lookup is
7814 supposed to return the result with the largest expiration time.
7815
7816 @node Store2
7817 @subsubsection Store2
7818
7819 @c %**end of header
7820
7821 The @code{cache_block} function is expected to try to store the block in
7822 the database, and return @code{GNUNET_SYSERR} if this was not possible
7823 for any reason.
7824 Furthermore, @code{cache_block} is expected to implicitly perform cache
7825 maintenance and purge blocks from the cache that have expired. Note that
7826 @code{cache_block} might encounter the case where the database already has
7827 another block stored under the same key. In this case, the plugin must
7828 ensure that the block with the larger expiration time is preserved.
This can be done either by simply adding new blocks and selecting the
block with the largest expiration time during lookup, or by checking
which block expires later during the store operation.
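
The contract of the two plugin functions can be illustrated with an
in-memory model. The Python sketch below is not one of the GNUnet
database plugins: the class @code{MemoryNamecache} and the simplified
signatures are hypothetical, and it uses the second strategy (comparing
expiration times during the store operation). The three constants
mirror the usual values of the corresponding GNUnet C constants.

@example
```python
GNUNET_OK, GNUNET_NO, GNUNET_SYSERR = 1, 0, -1


class MemoryNamecache:
    def __init__(self):
        self._blocks = dict()  # query hash -> (expiration, data)

    def cache_block(self, query, expiration, data, now):
        try:
            # Implicit maintenance: purge everything that has expired.
            stale = [k for k, v in self._blocks.items() if v[0] <= now]
            for key in stale:
                del self._blocks[key]
            old = self._blocks.get(query)
            # Keep whichever block has the larger expiration time.
            if old is None or expiration > old[0]:
                self._blocks[query] = (expiration, data)
            return GNUNET_OK
        except Exception:
            return GNUNET_SYSERR

    def lookup_block(self, query, now, iterator):
        block = self._blocks.get(query)
        if block is None or block[0] <= now:
            return GNUNET_NO  # no non-expired result
        iterator(block[0], block[1])  # at most one block is returned
        return GNUNET_OK
```
@end example

Because only one block per query is retained, the lookup trivially
returns the result with the largest expiration time.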
7832
7833 @cindex REVOCATION Subsystem
7834 @node REVOCATION Subsystem
7835 @section REVOCATION Subsystem
7836 @c %**end of header
7837
7838 The REVOCATION subsystem is responsible for key revocation of Egos.
If a user learns that their private key has been compromised, or has
lost it, they can use the REVOCATION system to inform all of the other
users that their private key is no longer valid.
7842 The subsystem thus includes ways to query for the validity of keys and to
7843 propagate revocation messages.
7844
7845 @menu
7846 * Dissemination::
7847 * Revocation Message Design Requirements::
7848 * libgnunetrevocation::
7849 * The REVOCATION Client-Service Protocol::
7850 * The REVOCATION Peer-to-Peer Protocol::
7851 @end menu
7852
7853 @node Dissemination
7854 @subsection Dissemination
7855
7856 @c %**end of header
7857
7858 When a revocation is performed, the revocation is first of all
7859 disseminated by flooding the overlay network.
7860 The goal is to reach every peer, so that when a peer needs to check if a
7861 key has been revoked, this will be purely a local operation where the
peer looks at its local revocation list. Flooding the network is also the
7863 most robust form of key revocation --- an adversary would have to control
7864 a separator of the overlay graph to restrict the propagation of the
7865 revocation message. Flooding is also very easy to implement --- peers that
7866 receive a revocation message for a key that they have never seen before
7867 simply pass the message to all of their neighbours.
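
The flooding rule is simple enough to simulate. The Python sketch below
models the overlay as a mapping from each peer to its list of
neighbours and counts the messages sent. It is an illustration only:
the function name @code{flood} and the graph representation are
hypothetical, and for simplicity a peer also sends the message back to
the neighbour it came from.

@example
```python
from collections import deque


def flood(neighbours, origin):
    """Flood one revocation; return (reached peers, messages sent)."""
    seen = set([origin])
    queue = deque([origin])
    messages = 0
    while queue:
        peer = queue.popleft()
        for other in neighbours[peer]:
            messages += 1  # every forward costs one message
            if other not in seen:
                # First time this peer sees the revocation: remember
                # it and let the peer forward it in turn.
                seen.add(other)
                queue.append(other)
    return seen, messages
```
@end example

In this model every reached peer sends once over each of its links, so
a connected overlay with |E| edges sees 2|E| messages, and peers without
any connection are not reached at all.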
7868
7869 Flooding can only distribute the revocation message to peers that are
7870 online.
7871 In order to notify peers that join the network later, the revocation
7872 service performs efficient set reconciliation over the sets of known
7873 revocation messages whenever two peers (that both support REVOCATION
7874 dissemination) connect.
7875 The SET service is used to perform this operation efficiently.
7876
7877 @node Revocation Message Design Requirements
7878 @subsection Revocation Message Design Requirements
7879
7880 @c %**end of header
7881
7882 However, flooding is also quite costly, creating O(|E|) messages on a
7883 network with |E| edges.
7884 Thus, revocation messages are required to contain a proof-of-work, the
7885 result of an expensive computation (which, however, is cheap to verify).
7886 Only peers that have expended the CPU time necessary to provide
7887 this proof will be able to flood the network with the revocation message.
7888 This ensures that an attacker cannot simply flood the network with
7889 millions of revocation messages. The proof-of-work required by GNUnet is
set to take days on a typical PC to compute; if the ability to quickly
revoke a key is needed, users have the option to pre-compute revocation
messages, store them off-line and use them instantly after their key
has been compromised.
7893
7894 Revocation messages must also be signed by the private key that is being
7895 revoked. Thus, they can only be created while the private key is in the
7896 possession of the respective user. This is another reason to create a
7897 revocation message ahead of time and store it in a secure location.
7898
7899 @node libgnunetrevocation
7900 @subsection libgnunetrevocation
7901
7902 @c %**end of header
7903
7904 The REVOCATION API consists of two parts, to query and to issue
7905 revocations.
7906
7907
7908 @menu
7909 * Querying for revoked keys::
7910 * Preparing revocations::
7911 * Issuing revocations::
7912 @end menu
7913
7914 @node Querying for revoked keys
7915 @subsubsection Querying for revoked keys
7916
7917 @c %**end of header
7918
7919 @code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
7920 key has been revoked.
7921 The given callback will be invoked with the result of the check.
7922 The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on
7923 the return value.
7924
7925 @node Preparing revocations
7926 @subsubsection Preparing revocations
7927
7928 @c %**end of header
7929
7930 It is often desirable to create a revocation record ahead-of-time and
7931 store it in an off-line location to be used later in an emergency.
7932 This is particularly true for GNUnet revocations, where performing the
7933 revocation operation itself is computationally expensive and thus is
7934 likely to take some time.
7935 Thus, if users want the ability to perform revocations quickly in an
7936 emergency, they must pre-compute the revocation message.
7937 The revocation API enables this with two functions that are used to
7938 compute the revocation message, but not trigger the actual revocation
7939 operation.
7940
7941 @code{GNUNET_REVOCATION_check_pow} should be used to calculate the
7942 proof-of-work required in the revocation message. This function takes the
7943 public key, the required number of bits for the proof of work (which in
7944 GNUnet is a network-wide constant) and finally a proof-of-work number as
7945 arguments.
7946 The function then checks if the given proof-of-work number is a valid
7947 proof of work for the given public key. Clients preparing a revocation
are expected to call this function repeatedly (typically with a
monotonically increasing sequence of candidate proof-of-work numbers)
until a number satisfies the check.
7951 That number should then be saved for later use in the revocation
7952 operation.
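
The search loop described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not GNUnet's
proof-of-work: SHA-512 over the candidate number and the public key
stands in for the real (deliberately more expensive) hash function,
validity is modelled as a minimum number of leading zero bits, and the
function names are hypothetical.

@example
```python
import hashlib


def leading_zero_bits(digest):
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check_pow(public_key, pow_number, required_bits):
    # Stand-in hash; the function used by GNUnet differs.
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()
    return leading_zero_bits(digest) >= required_bits


def find_pow(public_key, required_bits):
    pow_number = 0
    while not check_pow(public_key, pow_number, required_bits):
        pow_number += 1  # monotonically increasing candidates
    return pow_number
```
@end example

Finding a valid number takes about two to the power of
@code{required_bits} attempts on average, while verifying a given
number takes a single hash computation. This asymmetry is what makes
the proof expensive to produce but cheap to check.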
7953
7954 @code{GNUNET_REVOCATION_sign_revocation} is used to generate the
7955 signature that is required in a revocation message.
7956 It takes the private key that (possibly in the future) is to be revoked
7957 and returns the signature.
7958 The signature can again be saved to disk for later use, which will then
7959 allow performing a revocation even without access to the private key.
7960
7961 @node Issuing revocations
7962 @subsubsection Issuing revocations
7963
7964
Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
7967 @code{GNUNET_REVOCATION_revoke} can be used to perform the
7968 actual revocation. The given callback is called upon completion of the
7969 operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
7970 library from calling the continuation; however, in that case it is
7971 undefined whether or not the revocation operation will be executed.
7972
7973 @node The REVOCATION Client-Service Protocol
7974 @subsection The REVOCATION Client-Service Protocol
7975
7976
7977 The REVOCATION protocol consists of four simple messages.
7978
7979 A @code{QueryMessage} containing a public ECDSA key is used to check if a
7980 particular key has been revoked. The service responds with a
7981 @code{QueryResponseMessage} which simply contains a bit that says if the
7982 given public key is still valid, or if it has been revoked.
7983
7984 The second possible interaction is for a client to revoke a key by
7985 passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
7986 contains the ECDSA public key to be revoked, a signature by the
corresponding private key and the proof-of-work. The service responds
with a @code{RevocationResponseMessage} which can be used to indicate
that the @code{RevokeMessage} was invalid (e.g. proof of work incorrect),
7990 or otherwise indicates that the revocation has been processed
7991 successfully.
7992
7993 @node The REVOCATION Peer-to-Peer Protocol
7994 @subsection The REVOCATION Peer-to-Peer Protocol
7995
7996 @c %**end of header
7997
7998 Revocation uses two disjoint ways to spread revocation information among
7999 peers.
8000 First of all, P2P gossip exchanged via CORE-level neighbours is used to
8001 quickly spread revocations to all connected peers.
8002 Second, whenever two peers (that both support revocations) connect,
8003 the SET service is used to compute the union of the respective revocation
8004 sets.
8005
8006 In both cases, the exchanged messages are @code{RevokeMessage}s which
8007 contain the public key that is being revoked, a matching ECDSA signature,
8008 and a proof-of-work.
8009 Whenever a peer learns about a new revocation this way, it first
8010 validates the signature and the proof-of-work, then stores it to disk
8011 (typically to a file $GNUNET_DATA_HOME/revocation.dat) and finally
8012 spreads the information to all directly connected neighbours.
8013
8014 For computing the union using the SET service, the peer with the smaller
8015 hashed peer identity will connect (as a "client" in the two-party set
8016 protocol) to the other peer after one second (to reduce traffic spikes
8017 on connect) and initiate the computation of the set union.
8018 All revocation services use a common hash to identify the SET operation
8019 over revocation sets.
8020
8021 The current implementation accepts revocation set union operations from
8022 all peers at any time; however, well-behaved peers should only initiate
8023 this operation once after establishing a connection to a peer with a
8024 larger hashed peer identity.
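
The rule for deciding which side initiates the set union can be
sketched as follows. The Python code assumes that the hashed peer
identity is the SHA-512 hash of the raw identity bytes and that hashes
are compared byte-wise; both are assumptions made for illustration, and
the function name is hypothetical.

@example
```python
import hashlib


def set_union_initiator(peer_a, peer_b):
    """Return the identity of the peer that should start the union."""
    hash_a = hashlib.sha512(peer_a).digest()
    hash_b = hashlib.sha512(peer_b).digest()
    # The peer with the smaller hashed identity acts as the "client".
    return peer_a if hash_a < hash_b else peer_b
```
@end example

Since both peers evaluate the same comparison on the same two
identities, they agree on the initiator without exchanging any
additional messages.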
8025
8026 @cindex FS
8027 @cindex FS Subsystem
8028 @node File-sharing (FS) Subsystem
8029 @section File-sharing (FS) Subsystem
8030
8031 @c %**end of header
8032
This section describes the details of how the file-sharing service works.
8034 As with all services, it is split into an API (libgnunetfs), the service
8035 process (gnunet-service-fs) and user interface(s).
8036 The file-sharing service uses the datastore service to store blocks and
8037 the DHT (and indirectly datacache) for lookups for non-anonymous
8038 file-sharing.
8039 Furthermore, the file-sharing service uses the block library (and the
8040 block fs plugin) for validation of DHT operations.
8041
8042 In contrast to many other services, libgnunetfs is rather complex since
8043 the client library includes a large number of high-level abstractions;
this is necessary since the FS service itself largely only operates on
8045 the block level.
8046 The FS library is responsible for providing a file-based abstraction to
8047 applications, including directories, meta data, keyword search,
8048 verification, and so on.
8049
8050 The method used by GNUnet to break large files into blocks and to use
8051 keyword search is called the
8052 "Encoding for Censorship Resistant Sharing" (ECRS).
ECRS is largely implemented in the FS library; block validation is also
8054 reflected in the block FS plugin and the FS service.
8055 ECRS on-demand encoding is implemented in the FS service.
8056
8057 NOTE: The documentation in this chapter is quite incomplete.
8058
8059 @menu
8060 * Encoding for Censorship-Resistant Sharing (ECRS)::
8061 * File-sharing persistence directory structure::
8062 @end menu
8063
8064 @cindex ECRS
8065 @cindex Encoding for Censorship-Resistant Sharing
8066 @node Encoding for Censorship-Resistant Sharing (ECRS)
8067 @subsection Encoding for Censorship-Resistant Sharing (ECRS)
8068
8069 @c %**end of header
8070
8071 When GNUnet shares files, it uses a content encoding that is called ECRS,
8072 the Encoding for Censorship-Resistant Sharing.
Most of ECRS is described in the (so far unpublished) ECRS research
paper. ECRS obsoletes the previous ESED and ESED II
encodings which were used in GNUnet before version 0.7.0.
The rest of this section assumes that the reader is familiar with that
paper. What follows is a description of some minor extensions
that GNUnet makes over what is described in the paper.
8079 The reason why these extensions are not in the paper is that we felt
8080 that they were obvious or trivial extensions to the original scheme and
8081 thus did not warrant space in the research report.
8082
8083 @menu
8084 * Namespace Advertisements::
8085 * KSBlocks::
8086 @end menu
8087
8088 @node Namespace Advertisements
8089 @subsubsection Namespace Advertisements
8090
8091 @c %**end of header
8092 @c %**FIXME: all zeroses -> ?
8093
8094 An @code{SBlock} with identifier all zeros is a signed
8095 advertisement for a namespace. This special @code{SBlock} contains
8096 metadata describing the content of the namespace.
8097 Instead of the name of the identifier for a potential update, it contains
8098 the identifier for the root of the namespace.
The URI should always be empty. The @code{SBlock} is signed with the
content provider's RSA private key (just like any other @code{SBlock}).
Peers can search for @code{SBlock}s in order to find out more about a
namespace.
8102
8103 @node KSBlocks
8104 @subsubsection KSBlocks
8105
8106 @c %**end of header
8107
GNUnet implements @code{KSBlocks}, which are @code{KBlocks} that encrypt
an @code{SBlock} instead of a CHK and metadata.
8110 In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
8111 using the global keyword search.
8112 Usually the encrypted @code{SBlock} is a namespace advertisement.
8113 The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
peers to discover namespaces via keyword searches, and to associate
8115 useful information with namespaces. When GNUnet finds @code{KSBlocks}
8116 during a normal keyword search, it adds the information to an internal
8117 list of discovered namespaces. Users looking for interesting namespaces
8118 can then inspect this list, reducing the need for out-of-band discovery
8119 of namespaces.
8120 Naturally, namespaces (or more specifically, namespace advertisements) can
8121 also be referenced from directories, but @code{KSBlock}s should make it
8122 easier to advertise namespaces for the owner of the pseudonym since they
8123 eliminate the need to first create a directory.
8124
8125 Collections are also advertised using @code{KSBlock}s.
8126
8127 @c https://gnunet.org/sites/default/files/ecrs.pdf
8128
8129 @node File-sharing persistence directory structure
8130 @subsection File-sharing persistence directory structure
8131
8132 @c %**end of header
8133
8134 This section documents how the file-sharing library implements
8135 persistence of file-sharing operations and specifically the resulting
8136 directory structure.
8137 This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
8138 was set when calling @code{GNUNET_FS_start}.
8139 In this case, the file-sharing library will try hard to ensure that all
8140 major operations (searching, downloading, publishing, unindexing) are
8141 persistent, that is, can live longer than the process itself.
8142 More specifically, an operation is supposed to live until it is
8143 explicitly stopped.
8144
8145 If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
8146 @code{SUSPEND} event is generated and then when the process calls
8147 @code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
8148 Additionally, even if an application crashes (segfault, SIGKILL, system
8149 crash) and hence @code{GNUNET_FS_stop} is never called and no
8150 @code{SUSPEND} events are generated, operations are still resumed (with
8151 @code{RESUME} events).
8152 This is implemented by constantly writing the current state of the
8153 file-sharing operations to disk.
8154 Specifically, the current state is always written to disk whenever
anything significant changes (the exception is block-wise progress in
8156 publishing and unindexing, since those operations would be slowed down
8157 significantly and can be resumed cheaply even without detailed
8158 accounting).
8159 Note that if the process crashes (or is killed) during a serialization
8160 operation, FS does not guarantee that this specific operation is
8161 recoverable (no strict transactional semantics, again for performance
8162 reasons). However, all other unrelated operations should resume nicely.
8163
8164 Since we need to serialize the state continuously and want to recover as
8165 much as possible even after crashing during a serialization operation,
8166 we do not use one large file for serialization.
8167 Instead, several directories are used for the various operations.
8168 When @code{GNUNET_FS_start} executes, the master directories are scanned
8169 for files describing operations to resume.
8170 Sometimes, these operations can refer to related operations in child
8171 directories which may also be resumed at this point.
8172 Note that corrupted files are cleaned up automatically.
8173 However, dangling files in child directories (those that are not
8174 referenced by files from the master directories) are not automatically
8175 removed.
8176
8177 Persistence data is kept in a directory that begins with the "STATE_DIR"
8178 prefix from the configuration file
8179 (by default, "$SERVICEHOME/persistence/") followed by the name of the
8180 client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk")
8181 followed by the actual name of the master or child directory.
8182
8183 The names for the master directories follow the names of the operations:
8184
8185 @itemize @bullet
8186 @item "search"
8187 @item "download"
8188 @item "publish"
8189 @item "unindex"
8190 @end itemize
8191
8192 Each of the master directories contains names (chosen at random) for each
8193 active top-level (master) operation.
8194 Note that a download that is associated with a search result is not a
8195 top-level operation.
8196
8197 In contrast to the master directories, the child directories are only
8198 consulted when another operation refers to them.
8199 For each search, a subdirectory (named after the master search
8200 synchronization file) contains the search results.
8201 Search results can have an associated download, which is then stored in
8202 the general "download-child" directory.
8203 Downloads can be recursive, in which case children are stored in
8204 subdirectories mirroring the structure of the recursive download
8205 (either starting in the master "download" directory or in the
8206 "download-child" directory depending on how the download was initiated).
8207 For publishing operations, the "publish-file" directory contains
8208 information about the individual files and directories that are part of
8209 the publication.
8210 However, this directory structure is flat and does not mirror the
8211 structure of the publishing operation.
8212 Note that unindex operations cannot have associated child operations.
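
The layout described above can be summarized with a small path-building
sketch in Python. Only the directory names mentioned in this section
are used. The function names are hypothetical, and the file name passed
in stands for one of the randomly chosen names of a top-level
operation.

@example
```python
import os

MASTER_DIRS = ("search", "download", "publish", "unindex")
CHILD_DIRS = ("download-child", "publish-file")


def persistence_dir(state_dir, client, directory):
    # STATE_DIR prefix, then the client name, then the directory name.
    if directory not in MASTER_DIRS + CHILD_DIRS:
        raise ValueError("unknown directory: " + directory)
    return os.path.join(state_dir, client, directory)


def master_file(state_dir, client, operation, name):
    # Only the four master directories hold top-level operations.
    if operation not in MASTER_DIRS:
        raise ValueError("not a master directory: " + operation)
    return os.path.join(
        persistence_dir(state_dir, client, operation), name)
```
@end example

For the client "gnunet-gtk", a top-level download would thus be stored
in a file below the "gnunet-gtk/download" directory inside the
configured persistence directory.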
8213
8214 @cindex REGEX subsystem
8215 @node REGEX Subsystem
8216 @section REGEX Subsystem
8217
8218 @c %**end of header
8219
8220 Using the REGEX subsystem, you can discover peers that offer a particular
8221 service using regular expressions.
The peers that offer a service specify it using a regular expression.
8223 Peers that want to patronize a service search using a string.
8224 The REGEX subsystem will then use the DHT to return a set of matching
8225 offerers to the patrons.
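
The matching semantics can be illustrated locally, without any DHT
involved. The following Python sketch uses Python's own @code{re}
module, whose syntax is not necessarily identical to the one accepted
by the REGEX subsystem. The function name and the peer names in the
usage note are hypothetical.

@example
```python
import re


def matching_offerers(announcements, search_string):
    """Return peers whose regex matches the whole search string.

    announcements is a list of (peer, regex) pairs.
    """
    return sorted(peer for peer, regex in announcements
                  if re.fullmatch(regex, search_string))
```
@end example

If "peer1" announces a regular expression that accepts a given prefix
followed by arbitrary binary digits, a patron searching for that
prefix followed by "0101" would be given "peer1" as a matching offerer.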
8226
For the technical details, see Max's defense talk and Max's Master's
thesis.
8229
8230 @c An additional publication is under preparation and available to
8231 @c team members (in Git).
8232 @c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
8233
8234 @menu
8235 * How to run the regex profiler::
8236 @end menu
8237
8238 @node How to run the regex profiler
8239 @subsection How to run the regex profiler
8240
8241 @c %**end of header
8242
8243 The gnunet-regex-profiler can be used to profile the usage of mesh/regex
8244 for a given set of regular expressions and strings.
8245 Mesh/regex allows you to announce your peer ID under a certain regex and
8246 search for peers matching a particular regex using a string.
8247 See @uref{https://gnunet.org/szengel2012ms, szengel2012ms} for a full
8248 introduction.
8249
8250 First of all, the regex profiler uses GNUnet testbed, thus all the
8251 implications for testbed also apply to the regex profiler
8252 (for example you need password-less ssh login to the machines listed in
8253 your hosts file).
8254
8255 @strong{Configuration}
8256
8257 Moreover, an appropriate configuration file is needed.
8258 Generally you can refer to the
@file{contrib/regex_profiler_infiniband.conf} file in the source code
8260 of GNUnet for an example configuration.
8261 In the following paragraph the important details are highlighted.
8262
The regular expressions are announced by gnunet-daemon-regexprofiler,
so you have to make sure it is started by adding it to the AUTOSTART
set of ARM:
8266
8267 @example
8268 [regexprofiler]
8269 AUTOSTART = YES
8270 @end example
8271
8272 @noindent
Furthermore, you have to specify the location of the binary and the
regex prefix:
8274
8275 @example
8276 [regexprofiler]
8277 # Location of the gnunet-daemon-regexprofiler binary.
8278 BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
# Regex prefix that will be applied to all regular expressions and
# search strings.
8281 REGEX_PREFIX = "GNVPN-0001-PAD"
8282 @end example
8283
8284 @noindent
When running the profiler in a large-scale deployment, you probably
want to reduce the workload of each peer.
Use the following options to do this:
8288
8289 @example
8290 [dht]
8291 # Force network size estimation
8292 FORCE_NSE = 1
8293
8294 [dhtcache]
8295 DATABASE = heap
8296 # Disable RC-file for Bloom filter? (for benchmarking with limited IO
8297 # availability)
8298 DISABLE_BF_RC = YES
8299 # Disable Bloom filter entirely
8300 DISABLE_BF = YES
8301
8302 [nse]
8303 # Minimize proof-of-work CPU consumption by NSE
8304 WORKBITS = 1
8305 @end example
8306
8307 @noindent
8308 @strong{Options}
8309
To run the profiler, specify the options and the input data on the
command line:
8312
8313 @example
8314 gnunet-regex-profiler -c config-file -d log-file -n num-links \
8315 -p path-compression-length -s search-delay -t matching-timeout \
8316 -a num-search-strings hosts-file policy-dir search-strings-file
8317 @end example
8318
8319 @noindent
where:

@itemize @bullet
@item @code{config-file} is the configuration file created earlier.
@item @code{log-file} is the file to which the statistics output is
written.
@item @code{num-links} is the number of random links between the
started peers.
@item @code{path-compression-length} is the maximum path compression
length in the DFA.
@item @code{search-delay} is the time to wait between the peers
finishing linking and the start of string matching.
@item @code{matching-timeout} is the timeout after which the search is
cancelled.
@item @code{num-search-strings} is the number of strings in the
@code{search-strings-file}.
@item @code{hosts-file} contains a list of hosts for the testbed, one
per line, in the format @code{user@@host_ip:port}.
@item @code{policy-dir} is a directory of text files, each containing
one or more regular expressions. A peer is started for each file in that
directory and announces the regular expressions in its file.
@item @code{search-strings-file} is a text file containing search
strings, one per line.
@end itemize
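
As a hedged illustration, the following Python sketch assembles such a
command line from named parameters. Only the flags and the argument order
are taken from the synopsis above. The helper
@code{build_profiler_command}, all file names and all numeric values are
hypothetical placeholders, and the units accepted for the delay and the
timeout are not covered here.

@example
def build_profiler_command(config_file, log_file, num_links,
                           path_compression_length, search_delay,
                           matching_timeout, num_search_strings,
                           hosts_file, policy_dir, search_strings_file):
    """Assemble the gnunet-regex-profiler argument list."""
    return [
        "gnunet-regex-profiler",
        "-c", config_file,
        "-d", log_file,
        "-n", str(num_links),
        "-p", str(path_compression_length),
        "-s", str(search_delay),
        "-t", str(matching_timeout),
        "-a", str(num_search_strings),
        hosts_file, policy_dir, search_strings_file,
    ]


# Placeholder values only; adjust them to your own deployment.
command = build_profiler_command(
    config_file="regex_profiler.conf",
    log_file="profiler-stats.log",
    num_links=5,
    path_compression_length=4,
    search_delay=10,
    matching_timeout=60,
    num_search_strings=100,
    hosts_file="hosts.txt",
    policy_dir="policies",
    search_strings_file="search_strings.txt",
)
print(" ".join(command))
@end example

@noindent
The resulting list can be handed to a process launcher as is, which
avoids quoting problems with paths that contain spaces.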
8348
8349 @noindent
You can create regular expressions and search strings for every
autonomous system (AS) in the Internet using the @file{create_regex.py}
and @file{create_strings.py} scripts. You need one of the
@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
data files for this. Run
8354
8355 @example
8356 create_regex.py <filename> <output path>
8357 @end example
8358
8359 @noindent
8360 to create the regular expressions and
8361
8362 @example
8363 create_strings.py <input path> <outfile>
8364 @end example
8365
8366 @noindent
8367 to create a search strings file from the previously created
8368 regular expressions.
8369
8370 @cindex REST subsystem
8371 @node REST Subsystem
8372 @section REST Subsystem
8373
8374 @c %**end of header
8375
Using the REST subsystem, you can expose REST-based APIs or services.
The REST service is designed as a pluggable architecture.
To create a new REST endpoint, add a library whose name has the form
``plugin_rest_*''.
The REST service automatically loads all REST plugins on startup.
8381
8382 @strong{Configuration}
8383
The REST service can be configured in various ways.
The reference configuration file can be found in
@file{src/rest/rest.conf}:
8387 @example
8388 [rest]
8389 REST_PORT=7776
8390 REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
8391 REST_ALLOW_ORIGIN=*
8392 REST_ALLOW_CREDENTIALS=true
8393 @end example
8394
Both the port and the cross-origin resource sharing (CORS) headers
advertised by the REST service are configurable.
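
To make the role of these options concrete, the following Python sketch
parses a @code{[rest]} section and derives the port and a list of CORS
response headers from it. The mapping of each option to a standard
@code{Access-Control-*} header is an assumption made for illustration,
not a description of the service's actual code, and the helper
@code{cors_headers} is hypothetical.

@example
import configparser

# Copy of the reference [rest] section shown above.
REST_CONF = """
[rest]
REST_PORT=7776
REST_ALLOW_HEADERS=Authorization,Accept,Content-Type
REST_ALLOW_ORIGIN=*
REST_ALLOW_CREDENTIALS=true
"""


def cors_headers(conf_text):
    """Return (port, header list) derived from a [rest] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    rest = parser["rest"]
    # Assumed mapping from options to standard CORS response headers.
    headers = [
        ("Access-Control-Allow-Origin",
         rest.get("REST_ALLOW_ORIGIN")),
        ("Access-Control-Allow-Headers",
         rest.get("REST_ALLOW_HEADERS")),
        ("Access-Control-Allow-Credentials",
         rest.get("REST_ALLOW_CREDENTIALS")),
    ]
    return rest.getint("REST_PORT"), headers


port, headers = cors_headers(REST_CONF)
print(port)
for name, value in headers:
    print(name + ": " + value)
@end example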
8397
8398 @menu
8399 * Namespace considerations::
8400 * Endpoint documentation::
8401 @end menu
8402
8403 @node Namespace considerations
8404 @subsection Namespace considerations
8405
The gnunet-rest-service loads all plugins that are installed.
It is therefore important that the endpoint namespaces do not clash.
For example, plugin X might expose the endpoint ``/xxx'' while plugin Y
exposes the endpoint ``/xxx/yyy''.
This is a problem if plugin X is also supposed to handle calls to
``/xxx/yyy''.
Currently, the REST service does not complain or warn about such
clashes, so please make sure that endpoints are unambiguous.
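
Since the service itself does not warn about clashes, a plugin author
could check a list of endpoints by hand or with a small script. The
following Python sketch flags pairs of endpoints from different plugins
where one path is a prefix of the other, compared segment by segment.
The helpers @code{segments} and @code{find_clashes} and the plugin names
are hypothetical and not part of GNUnet.

@example
def segments(path):
    """Split an endpoint path such as /xxx/yyy into its components."""
    return [part for part in path.split("/") if part]


def find_clashes(endpoints):
    """Return pairs of entries where one path is a prefix of the other.

    endpoints is a list of (plugin name, endpoint path) tuples.
    """
    clashes = []
    for i, (plugin_a, path_a) in enumerate(endpoints):
        for plugin_b, path_b in endpoints[i + 1:]:
            if plugin_a == plugin_b:
                continue
            seg_a, seg_b = segments(path_a), segments(path_b)
            shorter = min(len(seg_a), len(seg_b))
            if seg_a[:shorter] == seg_b[:shorter]:
                clashes.append(((plugin_a, path_a), (plugin_b, path_b)))
    return clashes


endpoints = [("X", "/xxx"), ("Y", "/xxx/yyy"), ("Z", "/zzz")]
for first, second in find_clashes(endpoints):
    print("clash:", first, second)
@end example

@noindent
Comparing whole path segments rather than raw string prefixes means that
``/xxx'' and ``/xxxy'' are not reported as a clash.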
8414
8415 @node Endpoint documentation
8416 @subsection Endpoint documentation
8417
This is a work in progress. Endpoints should be documented
appropriately, preferably using annotations.
8420
8421