GNUnet
GNU's decentralized, anonymous and censorship-resistant P2P framework.

Questions

General

Features

Configuration and Installation

Error messages and bugs

Common problems

Using GNUnet

Answers

General

What do I do if my question is not answered here?

There are many other sources of information. You can read the additional documentation or ask your question on one of the mailing lists.

How does GNUnet compare to other file-sharing applications?

Unlike Napster and Gnutella, GNUnet was designed with security as the highest priority. We intend to produce a network with strong security guarantees. Napster and Gnutella are open to a wide variety of attacks, and their users have little privacy. GNUnet is also free software, so the source code is available and you do not have to worry about being spied upon by the software. The following table summarizes the main differences between GNUnet and other systems. The information is accurate to the best of our knowledge, but especially for the commercial systems that does not mean much. The comparison is also difficult because various implementations of (almost) the same protocol sometimes differ. In general, we pick a free implementation as the reference implementation, since free code is easy to inspect. All of these systems also change over time, so the data below may not be up to date. If you find any flaws, please let us know. Finally, the table cannot say much (it is hard to compare these systems this briefly), so if you want the real differences, read the research papers (and probably the code).

| Network | GNUnet | Napster | Direct Connect | FastTrack | eDonkey | Gnutella | Freenet |
|---|---|---|---|---|---|---|---|
| Distributed Queries | yes | no | hubs | super-peers | DHT (eMule) | yes | yes |
| Multisource Download | yes | no | no | yes | yes | yes | no |
| Economics | yes | no | no | no | yes | no | no |
| Anonymity | yes | no | no | no | no | no | yes |
| Language | C | often C | C++ | C | C++ | often C | Java |
| Transport Protocol | UDP, TCP, SMTP, HTTP | TCP | TCP? | UDP, TCP | UDP, TCP | TCP | TCP |
| Query Format (UI) | keywords / ECRS URIs | keywords | filename, THEX | filename, SHA | filename, MD4? | filename, SHA | secret key, CHK |
| Routing | dynamic (indirect, direct) | always direct | always direct | always direct | always direct | always direct | always indirect |
| License | GPL | GPL (knapster) | GPL (Valknut) | GPL (giFT) | GPL (eMule) | GPL (gtk-gnutella) | GPL |

Another important point of reference is the set of anonymous peer-to-peer networks. Here, the systems differ in application domain and in how specifically anonymity is achieved. Anonymous routing is a hard research topic, so for a superficial comparison like this one we focus on latency. Another important factor is the programming language. Type-safe languages may offer certain security benefits; however, this may come at the cost of significantly higher resource consumption, which in turn may reduce anonymity.

| Network | GNUnet | Tor | IIP | I2P | Mute | Freenet | Mixminion |
|---|---|---|---|---|---|---|---|
| Latency | medium | low | low | low | low | low | high |
| Application | file-sharing | TCP tunnel / HTTP | IRC | TCP tunnel / IRC | file-sharing | file-sharing | E-mail |
| Language | C | C | C | Java | C++ | Java | Python/C |

What do you mean by “anonymity”?

Anonymity is the lack of distinction of an individual from a (large) group. A central goal for anonymous file-sharing in GNUnet is to make all users (peers) form a group and to make communications in that group anonymous; that is, nobody (but the initiator) should be able to tell which of the peers in the group originated a message. In other words, it should be difficult or impossible for an adversary to distinguish between the originating peer and all other peers. In particular, even peers should not be able to recognize from which node a message originated; after all, the adversary could control one or more of the peers.

Of course, in practice, it may be possible for a powerful adversary to do some analysis and potentially assign higher probabilities for being the originator of a message to a subset of the peers. The GNUnet anonymity protocol tries to make this as hard as possible (see our paper on anonymity). The degree of anonymity (how hard it would be to distinguish an individual from the group) in GNUnet depends on the resources (mostly bandwidth) that the individual has available to achieve anonymity.

In the case that an extremely powerful adversary were to break the anonymity of a peer, GNUnet provides deniability. Deniability means that the communication is secret in the sense that only the final recipient knows the key to decrypt the message. The sender and the intermediaries are unable to determine the actual contents. Since content migrates in the network, the originator of the content can often plausibly deny knowledge of the contents: the content could have migrated to the peer, which makes the originator indistinguishable from an intermediary. Intermediaries have no means of decrypting the content and are thus (in all sane legal systems) not legally responsible for it. For comparison, if you use the Internet to send an encrypted E-mail, your Internet Service Provider (ISP) will typically not be held responsible for the content that its servers transmit; in GNUnet, every peer plays the role of an ISP, providing Internet services to other peers.

How does “accounting” work?

GNUnet is based on a trust-based economic model. Each node forms an opinion of all the other nodes it is in contact with. Depending on that opinion, the node decides which requests it will honor.

As long as a node is not busy, it will typically serve all requests, using excess resources to gain popularity. If it gets busy, it drops requests from the nodes that it trusts least. How busy a node can get (in terms of bandwidth and CPU) is up to the user to configure. The node increases its trust in nodes that send replies to queries and reduces its trust in nodes that ask for content. The GNUnet encoding ensures that replies are always correct and cannot be made up to earn trust without really contributing (see also the ECRS paper and the encoding page for details).
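The behaviour described above can be sketched roughly as follows. This is only a toy illustration and not GNUnet's actual implementation: the class name TrustTable, its method names, the unit-sized trust changes, the floor at zero and the capacity parameter are all assumptions made for the example.

```python
from collections import defaultdict


class TrustTable:
    """Toy model of per-peer trust. All names and numbers are hypothetical."""

    def __init__(self):
        # peer id -> trust score; unknown peers start at 0 (assumption)
        self.trust = defaultdict(int)

    def on_reply_received(self, peer, amount=1):
        # A peer that sent us a reply to a query earns trust.
        self.trust[peer] += amount

    def on_request_served(self, peer, amount=1):
        # A peer that asked us for content loses trust.
        # Clamping at zero is an assumption of this sketch.
        self.trust[peer] = max(0, self.trust[peer] - amount)

    def select_requests(self, requests, capacity):
        """Pick which (peer, query) requests to serve.

        If the node is not busy (everything fits into `capacity`),
        all requests are served. Otherwise the requests from the
        least-trusted peers are dropped first.
        """
        if len(requests) <= capacity:
            return list(requests)
        ranked = sorted(requests, key=lambda r: self.trust[r[0]], reverse=True)
        return ranked[:capacity]
```

With this model, a peer that has only ever asked for content is the first to be dropped once the node becomes busy, while an idle node still serves it.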

The economic model is designed in a way that the damage that a malicious node can do is bounded by the formula

damage - contribution < capacity + epsilon

where contribution is the amount of resources the node has given to GNUnet and capacity is the network capacity of the malicious node (it is impossible to keep a node from causing as much traffic as its own connection can support; yet, unlike in other networks, that traffic is not multiplied by other nodes). Epsilon is a number smaller than the excess capacity of the network, where excess capacity means resources that would otherwise be wasted (idle CPUs, idle network connections).
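As a small worked example, the bound can be rearranged to damage < contribution + capacity + epsilon. The function name and the numbers below are purely illustrative and use arbitrary resource units; they are not measurements of any real network.

```python
def damage_bound(contribution, capacity, epsilon):
    """Return the value that the damage must stay below.

    Rearranges  damage - contribution < capacity + epsilon
    into        damage < contribution + capacity + epsilon.
    """
    return contribution + capacity + epsilon


# Hypothetical numbers: a node that has contributed 40 units,
# has a connection worth 100 units, and epsilon is 5 units.
bound = damage_bound(contribution=40, capacity=100, epsilon=5)
print(bound)  # 145
```

In other words, under these made-up numbers a malicious node could cause strictly less than 145 units of damage, and 40 of those units are offset by what it contributed.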

Is the code free?

GNUnet is free software, available under the GNU General Public License (GPL). You are free to run, distribute or modify the code under the terms stated in that license. We are a part of the GNU project.

Isn’t all this encryption going to make things totally slow?

Encryption itself is incredibly fast. GNUnet mostly uses AES-256, a very fast and secure cipher. Anonymous file-sharing is usually slow because of artificial delays that are introduced to make timing analysis hard and to group messages into larger packets. These delays make it harder to correlate actions: GNUnet must wait for enough traffic from other peers to make it plausible that the traffic did not originate from the local peer. Larger delays also allow the individual peer to reorder more messages, and allowing peers to delay messages makes it easier for them to build more efficient messages.

The primary cause of CPU consumption in the current implementation is the algorithms for message scheduling. GNUnet peers try to maximize bandwidth utility by reordering messages. Performing downloads in parallel can also cause significant accounting overhead. Many data structures currently in use are simple lists that take time linear in their size to operate on. For local indexing operations, the current release typically pushes the limits of both the CPU and the hard drive.

We expect to use smarter, faster data structures in the future to reduce CPU consumption. The GNUnet developers are always trying to improve performance; yet, there is not much hope that performance will ever get close to typical response times of other applications like the WWW. Theoretically, it is possible that a download via GNUnet is even faster than a download from a crowded web server or a single dial-up user, but how likely this is depends in practice on how the content is spread throughout the network -- and we neither promise nor really expect to achieve this level of performance. While peer-to-peer networks can theoretically provide better performance than dedicated servers, their true strength lies in the possibility of being anarchistic: low administrative overhead, anonymity, no single point of failure. Complete decentralization is very costly and we should thus not expect to outperform the centralized solution, especially not if we also want anonymity.

Are there any known attacks?

Generally, there is the possibility of a known-plaintext attack on keywords, but since users control the keywords that are associated with the content they insert, they can defend against such an attack with the same techniques used to generate reasonable passwords. In any event, we are not trying to hide content; thus, unless the user is trying to insert information into the network that can only be shared with a small group of people, there is no real reason to try to obfuscate the content by choosing a difficult keyword anyway. Note that it is not necessary to use keywords (or even intelligible keywords) at all. The file identifiers (two hash codes and the file size) can also be shared out-of-band.

Most attacks on anonymity involve a resource battle between the attacker and the victim. If the attacker has significantly more resources (bandwidth, control over Internet routers, many peers), anonymity can theoretically always be broken. In fact, this applies to all other systems that provide anonymity. Unlike other designs, the degree of anonymity that can be achieved in GNUnet depends mostly on which fraction of its resources each peer spends on its own requests.

Since this is a project in development, you can find a list of problems or report them using the Mantis system.

Features

When are you going to release the next version?

The general answer is: when it is ready. A better answer may be: earlier if you contribute (test, debug, code, document). Every release will be announced on the Announcements mailing list and on freshmeat. You can subscribe to the mailing list or to the project on freshmeat to automatically receive a notification.

Is there a graphical user interface?

There are actually two graphical user interfaces, gnunet-gtk and gnunet-qt. Note that both of these need to be downloaded separately. The GUIs support searching, downloading and inserting files.

How can I use GNUnet from the command line?

Except for image previews, pretty much all features can be accessed with the various command line tools. Use gnunet-search to search for content:

$ ~/bin/gnunet-search GPL
gnunet://ecrs/chk/9E4MDN4VULE8KJG6U1C8FKH5HA8C5CHSJTILRTTPGK8MJ6VHORERHE68JU8Q0FDTOH1DGLUJ3NLE99N0ML0N9PIBAGKG7MNPBTT6UKG.1I823C58O3LKS24LLI9KB384LH82LGF9GUQRJHACCUINSCQH36SI4NF88CMAET3T3BHI93D4S0M5CC6MVDL1K8GFKVBN69Q6T307U6O.17992:
gnunet-download -o "COPYING" gnunet://ecrs/chk/9E4MDN4VULE8KJG6U1C8FKH5HA8C5CHSJTILRTTPGK8MJ6VHORERHE68JU8Q0FDTOH1DGLUJ3NLE99N0ML0N9PIBAGKG7MNPBTT6UKG.1I823C58O3LKS24LLI9KB384LH82LGF9GUQRJHACCUINSCQH36SI4NF88CMAET3T3BHI93D4S0M5CC6MVDL1K8GFKVBN69Q6T307U6O.17992
                    filename: COPYING
                 description: The GNU Public License
                      author: RMS
            publication date: Sat Jun 25 08:29:13 2005

The output above is the result of searching for the keyword “GPL”. gnunet-search will immediately start searching GNUnet and print new results (no duplicates) to the screen. The first line is the information that is required to retrieve the file (query-hash, key-hash, and the size of the file, here 17992 bytes).
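For scripting around gnunet-search output, a URI of the form shown above can be split into its parts. The following is a minimal sketch that assumes the gnunet://ecrs/chk/HASH.HASH.SIZE layout visible in the example output; the function and field names are hypothetical, and the sketch deliberately does not assume which of the two hash fields is the query hash and which is the key hash.

```python
from typing import NamedTuple

# Prefix taken from the example output above.
PREFIX = "gnunet://ecrs/chk/"


class ChkUri(NamedTuple):
    first_hash: str   # first encoded hash field, as printed
    second_hash: str  # second encoded hash field, as printed
    size: int         # file size in bytes


def parse_chk_uri(uri: str) -> ChkUri:
    """Split a gnunet://ecrs/chk/ URI into two hash fields and a size."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a gnunet://ecrs/chk/ URI")
    parts = uri[len(PREFIX):].split(".")
    if len(parts) != 3:
        raise ValueError("expected two hash fields and a size")
    first, second, size = parts
    if not first or not second or not size.isdecimal():
        raise ValueError("malformed hash or size field")
    return ChkUri(first, second, int(size))


# Placeholder hashes, not real content identifiers.
print(parse_chk_uri("gnunet://ecrs/chk/AAAA.BBBB.17992").size)  # 17992
```

Note that gnunet-search prints a trailing colon after the URI on the first line of a result; that colon would need to be stripped before calling a parser like this one.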

This is followed by additional information about the file. In order to download the file, use

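$ gnunet-download -o "COPYING" -- gnunet://ecrs/chk/9E4MDN4VULE8KJG6U1C8FKH5HA8C5CHSJTILRTTPGK8MJ6VHORERHE68JU8Q0FDTOH1DGLUJ3NLE99N0ML0N9PIBAGKG7MNPBTT6UKG.1I823C58O3LKS24LLI9KB384LH82LGF9GUQRJHACCUINSCQH36SI4NF88CMAET3T3BHI93D4S0M5CC6MVDL1K8GFKVBN69Q6T307U6O.17992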

where COPYING is the suggested filename.

If you want to add content to GNUnet, use

$ gnunet-insert -m "description:The GNU Public License" -k GPL -k GNU -m mimetype:text/plain -m author:RMS COPYING

where COPYING is the filename, the -m options supply the description of the file (here a description, a mimetype and an author), and the -k options specify additional keywords.

Is it possible to surf the WWW anonymously with GNUnet?

It is not possible to use GNUnet for anonymous browsing at this point. We recommend that you use Tor for anonymous surfing.

Is it possible to access GNUnet via a browser as an anonymous WWW?

There is currently no proxy (like fproxy in Freenet) for GNUnet that would make it accessible with a browser. It is possible to build such a proxy and all one needs to know is the protocol used between browser and proxy and a swift look at the sources in src/applications/fs/tools/.

The real question is whether or not this is a good idea. In order to achieve anonymity, the file-sharing service implemented on top of GNUnet has a much higher latency than the WWW. Thus, the experience of browsing the web would usually be hindered significantly by these delays (potentially several minutes per page!).

If you still want to write a proxy, you are welcome to send us code and join the developer team.

I have some great idea for a new feature, what should I do?

Sadly, we have many more feature requests than we can possibly implement. The best way to actually get a new feature implemented is to do it yourself -- and send us a patch. If it is a larger effort, you might want to ask on the mailing lists for some feedback first. Also, check on Mantis to see if the feature is already being worked on. A list of planned long-term features is in the todo file. If you cannot code, you can submit a feature request to Mantis. Please double-check that such a request does not already exist.

Configuration and Installation

On which platforms does GNUnet run?

GNUnet is being developed and tested under Debian GNU/Linux for i386. We have reports of working versions on FreeBSD, NetBSD, OpenBSD, Solaris and OS X. However, those reports are not recent; whether or not you can get GNUnet to work on those systems, please let us know. GNUnet should work on big-endian architectures, including Linux/PPC. GNUnet has been ported to Win32. Patches to make it work on other platforms are always appreciated. If you have had success running GNUnet on any other platform, please report it!

What is the right database for me?

If you are not experienced with databases or GNUnet, you should stick to the default, which is sqlite. The mysql module requires manual setup, which is described here. MySQL has good performance and the database can be repaired after internal failures, but it is more difficult to install than any of the alternatives.

How do I have to configure my firewall?

GNUnet uses ports 2086 and 1080 by default. Configure your firewall to accept packets to ports 2086 and 1080 (TCP and UDP) for the machine running the GNUnet daemon gnunetd. If your firewall is a NAT box, forward packets to your GNUnet machine's ports 2086 and 1080 and tweak the configuration file gnunetd.conf (sections NETWORK, LOAD, UDP, TCP and NAT) to use the external IP of the NAT box. Port 2087 is used for communication between gnunetd and client tools such as gnunet-gtk and gnunet-search. There is no need to open port 2087 to the rest of the Internet.

Port 2086 is used for GNUnet's own transmission protocol; HTTP-encapsulated GNUnet packets ("HTTP transport") are transmitted through port 1080 by default. The HTTP transport is not strictly required and can be disabled in GNUnet's configuration file. Disabling it on firewalled systems is important: available transports are advertised to other peers, so a transport that is activated but broken results in decreased reachability.
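To check whether the TCP side of this setup is reachable, a small probe can be run from a machine outside the firewall. The sketch below is not part of GNUnet; the function names are hypothetical, it only tests TCP (UDP reachability cannot be verified with a simple connect), and the host name you pass in is whatever external address your peer uses.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(2086, 1080)):
    """Probe the default GNUnet ports named above; returns {port: bool}."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Running check_ports with the external address of your peer from a different network gives a first indication of whether the firewall or NAT forwarding is configured as described. Running it on the GNUnet machine itself only shows that gnunetd is listening, not that the firewall lets traffic through.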

Why should I not use an external traffic shaper?

GNUnet accounting decides whom to serve when the system is loaded. Packets are sent and dropped based on their priority and the current load. External shapers (like a token bucket filter) can’t make this distinction and treat all GNUnet traffic as equal. You should set GNUnet's internal bandwidth limits to reflect your true configuration and what you can afford, and not use any external shaping for GNUnet. It’s much better to have the limits enforced by gnunetd than by an external mechanism.
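The difference can be illustrated with a deliberately simplified model. Here shape_externally stands in for an external shaper that only sees a budget and the arrival order (real token bucket filters are more elaborate), while shape_internally stands in for a limit that knows each packet's priority. The function names, the packet representation and the priority values are all hypothetical.

```python
def shape_externally(packets, budget):
    """Priority-blind limit: keep the first `budget` packets that arrive."""
    return packets[:budget]


def shape_internally(packets, budget):
    """Priority-aware limit: keep the `budget` highest-priority packets.

    Packets are (name, priority) tuples; arrival order is preserved
    among the packets that are kept.
    """
    by_priority = sorted(
        range(len(packets)), key=lambda i: packets[i][1], reverse=True
    )
    keep = sorted(by_priority[:budget])
    return [packets[i] for i in keep]


queue = [("low", 1), ("high", 9), ("medium", 5)]
print(shape_externally(queue, 2))  # [('low', 1), ('high', 9)]
print(shape_internally(queue, 2))  # [('high', 9), ('medium', 5)]
```

In this toy run the priority-blind limit lets the low-priority packet through and drops the medium-priority one, which is the kind of decision the text above argues should be left to gnunetd.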

Why do you require GNU libextractor?

GNUnet needs keywords so that other users can find the files. Typing in lots of keywords is of course a major pain. Other systems like Gnutella typically just use the filenames. Using filenames is not a good solution, since they are not always very descriptive and can be a pain for the content provider to produce in the first place.

GNUnet uses a better approach, which is keyword extraction. The library libextractor was developed for the purpose of extracting keywords from arbitrary files. If keywords can easily be extracted from your files, you don’t have to supply keywords by hand. libextractor can also use the filename as a source for keywords.

If you have keywords in a file that should be extracted but the file format is not supported by libextractor, the API of the library is simple enough that any C hacker who knows the file format should be able to code a plugin that will allow you to extract the keywords. If you just want filenames, libextractor can do those, too.

What are all of the dependencies for building GNUnet?

The short answer is that we cannot really tell you, because this depends a lot on your distribution. For example, we use libgcrypt, which in turn requires libgpg-error. However, most distributions would put these two libraries into one package. Similarly, dependencies for GTK and MySQL are not always identical. Finally, where does the list end? Should we list libc6, zlib, bzip2, xlib, glib? Also, many dependencies are optional. You can use GNUnet without a graphical user interface. However, even if you do not use GNUnet with a GUI, you might be using a libextractor binary that is linked against GTK-pixbuf to compute thumbnails.

What we have done instead is list all of the top-level dependencies for Debian GNU/Linux in the README.debian file. This list is for the specific Debian version that most GNUnet developers are using. The file only details the top-level packages necessary to compile GNUnet. Those packages in turn depend on other packages, which are not listed. For example, the list will include libextractor-dev but not libextractor1c2a, which may be required by libextractor-dev. In other words, apt-get is your friend, and if your distribution does not support automatic download of transitive dependencies you might want to consider switching -- at least we cannot really help you with a complete list in that case.

Note that the Debian package list should still be useful for you even if you are not running Debian. Other distributions are likely to have similar packages.

Finally, please note that configure will succeed even if a suitable version of MySQL or SQLite is not detected. The reason is that (theoretically) you might be compiling for a client-only system, or you might not care about anonymous file-sharing. If you do want to use file-sharing, please read the final lines printed by configure to make sure that a suitable database was found.

Error messages and bugs

I get error messages of the form "Failure at FILE.c:LINE". What is going on?

We use a generic error message in GNUnet to indicate that something went wrong. The cause is usually a bug or some data corruption on the network. Note that the bug does not necessarily have to be in the current version -- the problem could be caused by another peer running a different version of GNUnet. Similarly, the problem might be anything from completely harmless to rendering your peer useless. In a stable, production release we would disable these messages, but for now we want to know about those problems. Consequently, please report them to Mantis (after checking that they have not already been reported). Of course, if a corresponding report already exists, feel free to add a note saying that you are also experiencing the problem.

We do not provide detailed information about what exactly went wrong in the error message for a simple reason -- there are at least 800 different potential problems that are reported in this way. If we gave 800 specific error messages, this would not only increase the binary size significantly, it would also drive people translating GNUnet into other languages crazy. Finally, it takes much less time to write BREAK() in the code to indicate that something went wrong than to write a detailed explanation of the cause that anyone unfamiliar with the code can understand. If you want to investigate what went wrong yourself, use the source.

Note that we do provide detailed error messages and warnings for problems that are likely not bugs in the code and that the user can address. Also note that GNUnet is generally quite verbose in its log messages. This is mostly useful for diagnosing problems that users report. As long as everything seems to work, it is probably safe to ignore WARNING messages.

Checksum error: the deleted hostkey problem.

Under certain circumstances, gnunetd will print warnings indicating checksum errors in messages that were received from other nodes. Most of the time this is neither a bug nor a problem, and everything is working fine. What has usually happened is the following. Each node on GNUnet has a key pair (a secret key and a public key). When a host starts, it looks in the data/hosts/ directory for keys and addresses of other nodes on the network. It then cryptographically signs its current network address (say IP and port) together with a timestamp and sends this, together with the public key of the node, to other nodes on the network.

Later, nodes will use this binding of key to address to communicate. The binding of a public key to an address would ideally be a one-to-one relationship. Due to dial-up, DHCP and other dynamic assignments, this may not always be the case. Even if the same host is used, a different user may by now be running gnunetd with a different hostkey. A more common scenario is that the ~/.gnunet/.hostkey file was deleted. Other nodes on the network may still know the old hostkey and have it bound to that host. Do not delete the hostkey if you want to avoid this problem!

The reason why we can’t avoid this (ok, we could just not print the error message, but that’s not the point) is that a malicious host could always claim to have any address on the Internet. If we have two public keys for the same host, the best we can do is try out both.

Checking both is very cheap, and after a while (depending on the timeout configured in gnunetd.conf), hostkeys will eventually expire.

You may also receive messages that will result in checksum errors from clients that run versions of GNUnet before 0.7.0 (protocol mismatch).

Are there any known bugs?

The list of currently known bugs is available in the Mantis system.

Some bugs are occasionally reported directly to developers or the developer mailing list. This is discouraged since developers often do not have the time to feed these bugs back into the Mantis database. Please report bugs directly to the bug tracking system. If you believe a bug is sensitive, you can set its view status to private (this should be the exception).

How do I report a bug?

Good bug reports enable developers to find and hopefully fix problems faster. Nobody can or will fix a “GNUnet does not work for me.” bug. Please follow the guidelines below as far as they are applicable to the bug at hand.

Use our bug-tracking system

You should use the Mantis system for any bug reporting. Also, please check first if a bug has already been reported. If the bug has been reported, you may want to add comments to the report, even if it is just a statement that tells the developers that you encountered the same problem.

The following status codes are used in Mantis:

New: A new bug; developers have not looked into it yet.
Feedback: Developers require feedback from the users reporting the bug in order to resolve it. Also used if a general discussion among the researchers is needed on how to address a problem.
Acknowledged: Developers have seen the bug.
Confirmed: Developers are convinced that the bug is a problem that needs to be fixed.
Assigned: Some developer has started working on the problem. Note that developers may give up on problems, putting the bug back to confirmed or feedback.
Resolved: The bug has been fixed in some version in Subversion or in a patch attached to the bug report.
Closed: Resolved bugs are closed after the bugfix has made it into a full release of GNUnet.

Report your platform

Please report platform information in your bug report. The script contrib/report.sh in the GNUnet distribution scans your system and reports version numbers of relevant installed packages. Please include this information in the bug report. A simple way to do this is to add a platform to your user profile and select that platform when reporting bugs. You can also post the report.sh output in a comment, especially if you reproduced the bug on a different system or are not the original reporter. Just reporting that you reproduced a bug can be helpful, since it may narrow down the list of possible causes and will give us an idea of how frequent a particular problem is.

Include log messages

Using the -d -L DEBUG options, all GNUnet applications can be set to print (lots of) debugging output to the console. You may want to include the last 10-20 lines in the bug report.

Describe what you were doing

If you did something specific when the problem occurred, please report what you were doing. If possible, try to reproduce the bug and reduce the number of steps needed to do so. It may also be a good idea to move the ~/.gnunet/ directory out of the way for testing and then reproduce the bug starting from scratch.

Report network and file status information

If the bug occurs and does not crash the GNUnet application (deadlock, not responsive, etc.), please use netstat -tnl to gather information about open TCP sockets and include this in the bug report. Furthermore, use lsof | grep gnunet (may require root privileges) to get information about open files.

Reporting a segfault

If any GNUnet application reproducibly segfaults, please try getting a stack trace. Use gdb bin/gnunet-application-name and then at the (gdb) prompt use run APPLICATION-ARGUMENTS to start the application with the GNU debugger. Once the segfault occurs, you will get the (gdb) prompt again. Now type ba to get the stack trace. Attach the output to the bug report.

Deadlocks, or what to do if GNUnet just stops working

If any of your GNUnet applications just stops working, you can directly obtain diagnostic information using gdb. Use ps ax to obtain the process ID of your GNUnet applications. Sometimes problems can be caused by the interplay of gnunetd and a client application. In that case, follow the instructions for both applications.

Once you have the process ID (2-5 digit number), start gdb. At the (gdb) prompt, enter attach PID where PID must be replaced by the process ID. Then use info threads to obtain a list of all running threads of the application. Use thread NUMBER to select a thread. For each thread, obtain the backtrace using the ba command. Using this information, GNUnet developers can hopefully see where the application hangs.

Please also report the lsof and netstat output as described before.

Statistics

If gnunetd is still running, always try to also run the gnunet-stats tool. If the tool still works, report the output. If it does not work anymore, this can also be an important hint towards where the problem is.

Reporting problems with Mantis

If you have problems with Mantis, please contact christian@grothoff.org via E-mail.

Common problems

I cannot find anything. How can I test if it works?

Searches can return no results if no matching content is found. For a simple test, search for GPL: the GNU General Public License was inserted under that keyword on the permanent node on gnunet.org. This test may of course fail if gnunet.org is temporarily not available.

For a test of slightly larger scale, you can try to download another piece of “official” test content by searching for the keyword alien, or fetch the content directly using:

$ gnunet-download -o "Aliensong.mpeg" -- gnunet://ecrs/chk/MN8P2LS383SRU0N68OPRBU28J0MIOPFS1BTA7K76SJUFONHHGE6LJ33PU45ASNUTGT4AP70LQUOSN79C2IODFA7D4IU0HR9K3ASIHE8.E561C1GJ1SR99AMBM7L87RF2HKGE8L7D6JLIUGT5G7UBDPCT1FNDCMV15T00LD0U92C6JE3M93JE23PJKVF2AJRHIB3VCIC41952DOO.3201028

Still not satisfied? Use your imagination for guessing keywords, or try common MIME types as keywords (such as application/pdf, application/x-zip, image/jpeg or audio/mp3).

Why is downloading the last few blocks so slow?

Sometimes when downloading large files from GNUnet, it may take a long time to get the last remaining blocks of the file. This is often not an error, and if it happens, it does not automatically mean that the blocks must have disappeared from the network (though that is possible). The explanation is as follows (it is a bit technical).

To summarize, there are plenty of reasons why the download MUST go slower at the end. However, the GNUnet developers are still investigating ways to make it faster.

gnunet-unindex behaves in unexpected ways.

First of all, many things can seemingly go wrong with gnunet-unindex, and one has to understand what exactly gnunet-unindex does to avoid pitfalls. The first thing to recall is that gnunet-unindex only unindexes blocks from the local database. Blocks that have been replicated by other peers are not removed. This is why a file can still be available after running gnunet-unindex. Also, gnunet-unindex does not unindex the search-blocks associated with keywords. Thus, searching for the file will still list the file as if it were there. Part of the reason for not removing the search-blocks is that the keywords used when indexing are not known to gnunet-unindex. A more elaborate mechanism that uses libextractor to guess which keywords could have been used still needs to be implemented. In the future we also plan to time out search-blocks to avoid the search-space pollution.

Another important aspect of gnunet-unindex is that it may unindex blocks shared with other files or within the same file. The reason is that blocks of identical content hash to the same identifier and can thus not be distinguished by GNUnet. For highly structured content it is possible in practice that two blocks are identical. GNUnet will then share the storage space for these two blocks. When unindexing a file that contains such shared blocks, GNUnet can currently not recognize the sharing and will remove the block even if it is still used in another context. The resulting inconsistencies can result in warnings from gnunet-unindex, as well as in downloads that do not complete. In general, gnunet-unindex should be used with caution.
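The shared-block problem can be reproduced with a toy content-addressed store. This sketch is not GNUnet's datastore: the class name, the tiny block size and the use of SHA-512 from Python's hashlib as the block identifier are assumptions chosen only to make the effect visible.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store without reference counting."""

    def __init__(self):
        self.blocks = {}  # block identifier -> block content

    def index(self, data, block_size):
        """Split data into blocks, store them, return their identifiers."""
        ids = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            block_id = hashlib.sha512(block).hexdigest()
            # Identical blocks hash to the same identifier and
            # therefore share a single storage slot.
            self.blocks[block_id] = block
            ids.append(block_id)
        return ids

    def unindex(self, ids):
        """Remove blocks by identifier, unaware of any sharing."""
        for block_id in ids:
            self.blocks.pop(block_id, None)

    def can_download(self, ids):
        """A file is retrievable only if all of its blocks are present."""
        return all(block_id in self.blocks for block_id in ids)


store = BlockStore()
file_a = store.index(b"HEADERaaaa", block_size=6)
file_b = store.index(b"HEADERbbbb", block_size=6)
store.unindex(file_a)
print(store.can_download(file_b))  # False
```

Both files start with the same six-byte block, so unindexing the first file also removes a block that the second file still needs. Avoiding this would require the store to track how many files reference each block, which is exactly the information the text says GNUnet currently does not have.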

How can I see which files I have indexed/inserted (names, descriptions, keywords)?

For building directories, GNUnet keeps track of all file identifiers that it has so far encountered, including search results, inserted or indexed files and files mentioned in downloaded directories. This information is stored in plaintext to allow building of directories. Users should run gnunet-directory -t to start tracking this information. Note that the data is kept locally in the GNUnet directory and never sent out into the network. You can inspect the information with gnunet-directory -l. It is probably a good idea to clean this database of your activities from time to time. You can run gnunet-directory -k to remove the information collected so far (and to stop tracking). Once that database has been cleaned, GNUnet can no longer tell which files you inserted, but it can tell you which files are indexed.

The reason why GNUnet cannot tell you which files were inserted is the same reason why we distinguish between indexing and insertion: deniability. The primary use of insertion is to give an adversary no easy way to figure out what files are stored on your computer, even under the assumption that the adversary takes full control of your machine. Thus, GNUnet was designed to not require any information that would allow it to reconstruct the inserted file without the appropriate keyword (read: password). If the adversary already knows the exact content, it is still possible for an adversary that has control of your machine to verify that the content is present. The best defence against that is to insert the content with a low priority and to turn on ACTIVEMIGRATION. Then you can plausibly claim that the content migrated to your node from another peer, and that you had no way of knowing that it was there. In either case, how well deniability serves you will depend on your local court. Since there are countries where breathing can get you into jail, saying that you were not able to tell what your computer was storing may not be sufficient. Note that breaking your anonymity and taking control of your computer are steps that the adversary needs to take first, before you need to resort to deniability.

Indexed content is a slightly different story. For indexed content, the goal for GNUnet is still to make it difficult for the adversary to establish from which machine the content originates (anonymity). For indexed content GNUnet keeps links to the indexed files, typically in /var/lib/gnunet/data/shared/. GNUnet uses the list to locate the block corresponding to a request. Do NOT edit the directory by hand. Use gnunet-unindex to remove files from the directory.

Also, do not move or change indexed files, since GNUnet relies on the paths of indexed files staying constant. If you must move an indexed file, first use gnunet-unindex, then move the file, and then use gnunet-insert to re-insert the file.

Using GNUnet

Why should I insert directories instead of individual files?

GNUnet's ECRS encoding/query strategy doesn't allow peers to benefit from false replies. Even small blocks of incorrect response data can be detected instantly, resulting in no trust gain for the malicious node. If you know the correct ECRS URI for the file you want, no intermediate node can cheat with false replies. However, this leaves the problem of obtaining the URIs in the first place, and unfortunately if anyone can insert files under common keywords, false data can be inserted as well. There doesn't seem to be any easy solution to this problem. Ranking search results by trust could be one answer in the future. Meanwhile, namespaces and directories are a step in the nonspammable direction.
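Why a false reply is detected instantly can be sketched in a few lines of Python. This is a simplification under stated assumptions, not the actual ECRS encoding: it assumes a block is requested by a hash of its content and leaves out other details of ECRS such as block encryption. The function names and the choice of SHA-512 are hypothetical.

```python
import hashlib


def query_for(block: bytes) -> str:
    """Identifier a downloader asks for: here, simply a hash of the block."""
    return hashlib.sha512(block).hexdigest()


def accept_reply(query: str, reply: bytes) -> bool:
    """Accept a reply only if it hashes to the identifier that was requested."""
    return query_for(reply) == query


genuine = b"one block of the published file"
query = query_for(genuine)  # assumed to come from a URI the downloader trusts

honest_ok = accept_reply(query, genuine)
forged_ok = accept_reply(query, b"one block of something else entirely")
```

Because the check needs only the reply and the identifier from the URI, any peer along the path can run it, and a forged block is discarded before it earns the sender anything.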

Inserting into a namespace requires the user to create a pseudonym first, which corresponds to a public/private key pair that identifies the namespace. (One user can create any number of pseudonyms.) Then, pointers to files or directories can be inserted into the pseudonym's namespace, signed by the private key of the pseudonym. The signed blocks will be verified by each peer before the blocks are accepted or passed along. The verification works by checking the validity of the cryptographic signature against the public key included in the namespace block, and by checking that hashing the public key results in the correct namespace identifier. Thus, only the user with the private key to the namespace can publish into it, making it a nonspammable, secure publishing channel that other users can limit their searches to. It's worth noting that the pointers found in a namespace can naturally point to any files chosen by the pseudonym, even if the actual files were inserted by someone else.
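The two checks described above (signature valid under the included public key, and public key hashing to the namespace identifier) can be sketched as follows. Everything here is an assumption made for illustration: the signature scheme is toy textbook RSA built from two small Mersenne primes, which is insecure and is not what GNUnet uses, and the names KeyPair, NamespaceBlock, publish and verify are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPair:
    n: int  # public modulus
    e: int  # public exponent
    d: int  # private exponent, known only to the pseudonym


def make_key(p: int, q: int, e: int = 65537) -> KeyPair:
    """Toy textbook RSA key from two primes (ASSUMPTION: illustration only)."""
    return KeyPair(p * q, e, pow(e, -1, (p - 1) * (q - 1)))


def digest_int(data: bytes, modulus: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def namespace_id(n: int, e: int) -> str:
    """The namespace identifier is a hash of the public key."""
    return hashlib.sha256(f"{n}:{e}".encode()).hexdigest()


@dataclass(frozen=True)
class NamespaceBlock:
    namespace: str
    n: int
    e: int
    payload: bytes
    signature: int


def publish(key: KeyPair, namespace: str, payload: bytes) -> NamespaceBlock:
    signature = pow(digest_int(payload, key.n), key.d, key.n)
    return NamespaceBlock(namespace, key.n, key.e, payload, signature)


def verify(block: NamespaceBlock) -> bool:
    expected = digest_int(block.payload, block.n)
    signature_ok = pow(block.signature, block.e, block.n) == expected
    key_matches = namespace_id(block.n, block.e) == block.namespace
    return signature_ok and key_matches


owner = make_key(2**61 - 1, 2**89 - 1)
attacker = make_key(2**31 - 1, 2**89 - 1)
owner_ns = namespace_id(owner.n, owner.e)

good = publish(owner, owner_ns, b"pointer to a directory")
tampered = NamespaceBlock(good.namespace, good.n, good.e,
                          b"pointer to spam", good.signature)
impostor = publish(attacker, owner_ns, b"pointer to spam")

results = (verify(good), verify(tampered), verify(impostor))
```

The tampered block fails the signature check, while the impostor block carries a valid signature but fails the second check, because the attacker's public key does not hash to the owner's namespace identifier.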

Directories are a good way to group files into meaningful collections in GNUnet. A directory can contain an arbitrary number of pointers to namespaces (SBlocks), pointers to other directories and pointers to files. With directories, users can build networks of content, where not only inserted files, but also other interesting content or namespaces can be pointed to, just as on the WWW. Additionally, directories have two nice properties. First, they are immutable, meaning that they can't be tampered with, but contain exactly those pointers the publisher intended. The second property is that identical files pointed to by two directories waste no additional space, even if the directories were built by separate users. This contrasts strongly with the case where similar files are archived with e.g. zip or tar, which could double the space usage over the network without any speedup in retrieval time. Using directories to group content enables GNUnet to spread the identical file blocks more efficiently.
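The space argument can be made concrete with a rough Python sketch. It assumes, purely for illustration, that a file is stored once under a hash of its content and that a directory is just an immutable tuple of such identifiers; the Network class is hypothetical, and the size of the directories themselves and any archive compression are ignored.

```python
import hashlib


def file_id(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


class Network:
    """Stores each distinct file once, keyed by a hash of its content."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def insert_file(self, content: bytes) -> str:
        ident = file_id(content)
        self.files[ident] = content  # inserting the same file again adds nothing
        return ident

    def stored_bytes(self) -> int:
        return sum(len(content) for content in self.files.values())


song = b"x" * 1000
notes = b"y" * 200
photo = b"z" * 300

net = Network()
# Two users build directories independently; both include the same song.
alice_dir = (net.insert_file(song), net.insert_file(notes))
bob_dir = (net.insert_file(song), net.insert_file(photo))

shared_storage = net.stored_bytes()
# Two separate archives would each carry their own copy of the song.
archive_storage = len(song + notes) + len(song + photo)
```

With these sizes the directory approach stores 1500 bytes of file data, while the two archives together hold 2500 bytes, since the song appears in both.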

For more information on using directories and namespaces, see the GNUnet online documentation, or the man page of gnunet-insert for examples.



Copyright (C) 2002, 2003, 2004, 2005, 2006 Christian Grothoff.
Verbatim copying and distribution of this entire article
is permitted in any medium, provided this notice is preserved.

Translation engine based on i18nHTML (C) 2003, 2004, 2005, 2006, 2007 Christian Grothoff.
