Week 4 discussion questions

The rationale for keeping inbound and outbound connections in two distinct buckets is that we assume an attacker can gain full control of our inbound slots, for example by rapidly opening many connections to us at startup, but cannot control which peers we choose for our outbound connections. It follows that as long as we make some outbound connections ourselves, it is much more difficult for an attacker to eclipse us.

In terms of locating the most-work chain tip, we only need a single honest peer among our total connections (8 out and 117 in by default) to have a new most-work header or block relayed to us, on which we can base our chain decision. A single honest peer can also help to repopulate our "address book", AddrMan (stored in the file peers.dat on shutdown), with honest peers. The first thing we do upon connecting (outbound) to a new peer is to request up to 3 ADDR messages, each containing up to 1000 peers. If our AddrMan has previously been "poisoned" (filled with malicious peers), making an outbound connection to a single honest node will help to repopulate AddrMan with honest peers, and the rest of the node eviction logic will get rid of malicious peers for us.
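To get a feel for why a handful of self-chosen outbound connections helps, here is a rough sketch of the odds. It assumes, purely for illustration, that each outbound peer is drawn independently and uniformly from a poisoned address book; real peer selection uses bucketing and is not this simple. The function names are hypothetical.

```python
def prob_all_outbound_malicious(poisoned_fraction: float, outbound: int = 8) -> float:
    """Chance that every outbound peer is attacker-controlled.

    Assumption: each peer is picked independently and uniformly from an
    address book in which `poisoned_fraction` of the entries are malicious.
    """
    if not 0.0 <= poisoned_fraction <= 1.0:
        raise ValueError("poisoned_fraction must be between 0 and 1")
    return poisoned_fraction ** outbound


def prob_at_least_one_honest(poisoned_fraction: float, outbound: int = 8) -> float:
    """Chance that at least one outbound peer is honest (same assumption)."""
    return 1.0 - prob_all_outbound_malicious(poisoned_fraction, outbound)


# Print the odds for a few levels of address book poisoning.
for fraction in (0.5, 0.9, 0.99):
    chance = prob_at_least_one_honest(fraction)
    print(f"{fraction:.0%} poisoned: P(at least one honest outbound) = {chance:.4f}")
```

Under this simplified model, even an address book that is 90% poisoned still gives a better than even chance of reaching one honest peer with 8 outbound connections, and one honest peer is all that is needed to learn about the most-work chain.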

The rationale behind the absolute numbers selected for these two types of connections does not seem to be particularly empirical… It appears that Satoshi reduced outbound connections from 15 to 8 in July 2010 according to this BitcoinTalk post. The OP in that thread stated the following:

I’d like to see an option (with an RPC interface as well) to limit the number of connections that a Bitcoin client accepts. My home client is connected to 70 different nodes and my poor wimpy router just can’t keep up. It’s starting to slow my network down, to the point where I’ll need to force Bitcoin to ignore connection requests (with -connect=<a node>) if this keeps up.

— Iachesis
BitcoinTalk.org

At the time the number of outbound connections was 15, so OP appears to have been servicing ~55 inbounds too. This was likely a side-effect of each peer trying to connect to 15 outbound peers.

Two quotes from Satoshi in reply:

One thing we could do is lower the outbound connections from 15 to 10 or maybe even 5. The choice of 15 was arbitrary. It just needs to be enough for redundancy and fast exponential propagation of messages. 10 would still be plenty. 5 should be fine. 10 is good as a nice round number so users can see that it stopped intentionally.

It would help to implement UPnP so there would be more inbound accepting nodes. Your number of connections is the ratio of inbound accepting nodes to out-only times 15. We need to encourage more people to accept inbound connections.

I will implement a feature to stop accepting inbound connections once you hit a certain number.

— Satoshi
BitcoinTalk.org

and

I reduced max outbound connections from 15 to 8 in RC4.

15 was way more than we needed for redundancy. 8 is still plenty of redundancy.

As the nodes upgrade to this version, this will cut in half the number of connections that inbound accepting nodes get.

If anyone wants more than 8 connections, they can open port 8333 on their firewall.

— Satoshi
BitcoinTalk.org

In PR 72, "Denial of service flood prevention" by Andresen, the maxconnections parameter was set to 125, the default we still use today in 2021. It is thought, but apparently not measured empirically, that 8 outbounds strikes a good balance: enough to avoid being eclipsed, but not so many that the network comes under high strain as usage increases (particularly at moments when new blocks are found and propagated). Whilst this PR relates to DoS protection, many users around this time were struggling with their OS and routers handling large numbers of concurrent TCP connections, so setting maxconnections to 125 leaves some processing headroom for the average user.

Whilst inbound connections are treated as relatively untrustworthy, their purpose is to allow p2p networking to take place; if everybody only makes outbound connections and nobody accepts inbound connections, then nobody can connect to anybody else! The discrepancy between 8 and 117 might be explained by the need to let "non-listening" peers (intentionally so, or behind NAT) and "SPV" clients connect to the network, as neither of these client modes accepts incoming connections. In this way, the network can accommodate roughly 117 / 8 ≈ 14 times as many nodes as there are listening nodes.
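The 117 / 8 ratio can be read as a slot budget: listening nodes supply inbound slots, and every node consumes 8 of them with its outbound connections. The sketch below works that out under idealised assumptions (every inbound slot is usable and connections spread perfectly evenly), which will not hold on the real network. The function name and parameters are hypothetical.

```python
def max_network_size(listening_nodes: int,
                     inbound_slots: int = 117,
                     outbound_per_node: int = 8) -> int:
    """Upper bound on the total number of nodes the network can carry.

    Assumptions: only listening nodes offer inbound slots, every node
    (listening or not) wants `outbound_per_node` outbound connections,
    and slots are filled perfectly evenly.
    """
    total_inbound_slots = listening_nodes * inbound_slots
    return total_inbound_slots // outbound_per_node


# With 1000 listening nodes, how many nodes can be served in total?
print(max_network_size(1000))
```

Dividing the result by the number of listening nodes gives back the ratio of about 14 quoted above.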

If we assume the average user downloads Bitcoin Core GUI on a mid-range Windows laptop and connects to the P2P network via their home access point (possibly behind NAT), then these parameters feel close to optimal. Those users will maintain their 8 outbounds and 2 blocksonly connections, and possibly gain a few inbounds after a while. The likelihood is that their inbounds (someone else’s outbounds) will get dropped by the remote nodes’ eviction policies, as they are unlikely to meet the criteria for being kept (search src/net_processing.cpp for "eviction" to see how this works), and so high numbers of inbounds are unlikely.

These defaults might be optimised further for nodes on low-bandwidth connections, although such nodes can run bitcoind with the --blocksonly option or follow the other recommendations in /doc/reduce-traffic.md if they are severely constrained.

Additional question: Why were blocksonly connections chosen to be 2 of the 8 outbounds, rather than an additional two connections?

A: With, say, 100_000 nodes on the network, adding two connections per node might mean 100_000 * 2 = 200_000 more outbound connections.

What is the rationale behind the "new"/"tried" table design? Were there any prior inspirations within the field of distributed computing?⌗

How does a fixed set of 4 outbound peers get chosen? In what circumstances would you evict or change them?⌗

What are feeler connections, and when are they used?⌗

Eclipse Attacks on Bitcoin’s Peer-to-Peer Network ⌗

What can an attacker do if they are able to eclipse a mining pool?⌗

What is the difficulty of successfully achieving an eclipse attack? What resources and skills would be required to achieve such an attack?⌗

An eclipse attack attempts to prevent one or more nodes on the network from communicating with any honest peers, so that they receive information only from the attacker’s malicious nodes.

If the eclipsed node were a miner, the most-work chain tip could be delayed or completely withheld from them, resulting in them wasting work mining on a stale tip. If a large enough share of hash power can be eclipsed simultaneously, say ~30%, then those miners would start mining a minority fork. A minority fork could be presented to a merchant and used to trick them into accepting a transaction which was double spent (back to the attacker) on the main chain.
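As a rough guide to how slowly such a minority fork would grow, the sketch below estimates the expected time for an eclipsed share of hash power to mine a given number of blocks. It assumes the 10-minute target block interval and that difficulty does not adjust during the attack; block discovery is random, so actual times vary widely. The function name is hypothetical.

```python
TARGET_BLOCK_INTERVAL_MIN = 10.0  # Bitcoin's target time between blocks


def expected_minutes_for_blocks(hash_fraction: float, blocks: int = 1) -> float:
    """Expected minutes for `hash_fraction` of total hash power to mine `blocks`.

    Assumption: difficulty stays at the level set for the whole network.
    """
    if not 0.0 < hash_fraction <= 1.0:
        raise ValueError("hash_fraction must be in (0, 1]")
    return blocks * TARGET_BLOCK_INTERVAL_MIN / hash_fraction


# Expected time for 30% of hash power to mine 1, 3 and 6 blocks.
for confirmations in (1, 3, 6):
    minutes = expected_minutes_for_blocks(0.30, confirmations)
    print(f"{confirmations} block(s) at 30% hash power: ~{minutes:.0f} minutes")
```

With ~30% of the hash power, each block on the minority fork takes about 33 minutes on average, so a merchant waiting for several confirmations would see them arrive far more slowly than usual.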

A majority 51% attack is extremely expensive to pull off, as it requires significant CAPEX to purchase, set up and operate the required mining hardware, whereas an eclipse attack would require fewer resources, but perhaps more skill, to execute.

Mitigation techniques ⌗

It is cheaper and easier for an attacker to control new IP addresses that are contiguous and from the same IP address block.
- IP addresses in the new nodes and tried nodes tables in Bitcoin Core are bucketed by IP address block. This means that an attacker has to control many IP addresses in many distinct IP address blocks, which in the real world map to different geographic locations.
  
  When a node needs a new peer to replace an existing peer that goes offline, it randomly selects one of these two tables, randomly selects a bucket from that table, and then selects a random IP address from the selected bucket. A node from the ‘New Nodes’ table is only promoted to the more persistent ‘Tried Nodes’ table once it has been successfully connected to. A verified new IP address only evicts an existing address from its corresponding bucket in the ‘Tried Nodes’ table if the previous node at that position in the table is offline; this technique is called “test before evict” (https://github.com/bitcoin/bitcoin/pull/9037). This rule favors previously used IP addresses, which are assumed to be more trustworthy than newly tried IP addresses, and rate limits how fast an attacker can replace previously tried nodes with new ones they control.
It is more difficult for an attacker to maintain legitimate full-relay nodes at multiple locations than nodes that merely exchange version messages.
- Feeler connections test IP addresses in the ‘New Nodes’ table, and an address is only promoted to the ‘Tried Nodes’ table if the feeler reaches a valid Bitcoin node. Addresses in the ‘New Nodes’ table that do not lead to a Bitcoin node when tested by a feeler connection are not promoted. This also helps ensure that, before an attack begins, the ‘Tried Nodes’ table is well populated with a good set of active honest nodes.
  
  This system of filtering IP addresses ensures that an attacker can only slowly and probabilistically introduce new IP addresses into the ‘Tried Nodes’ table of their target. Because both the ‘New Nodes’ and ‘Tried Nodes’ tables are randomly selected from, an attacker will need to occupy both tables with running nodes to increase their chances of being selected to replace an offline peer. All of these measures increase the costs to an attacker by requiring them to acquire more IP addresses and by forcing them to operate these IP addresses for a longer period of time to succeed with an attack.
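The bucketing and "test before evict" ideas above can be sketched in a few lines. This is a toy model, not Bitcoin Core's AddrMan: the table sizes, the choice of the first two IPv4 octets as the address block, the one-bucket-per-group mapping, and all names (`BucketedTable`, `address_group`, `select_peer`) are assumptions made for illustration.

```python
import hashlib
import random
from typing import Callable, List, Optional, Tuple

NUM_BUCKETS = 16       # illustrative size, not Bitcoin Core's value
SLOTS_PER_BUCKET = 4   # illustrative size, not Bitcoin Core's value


def address_group(ip: str) -> str:
    """Assumed grouping: the first two octets of an IPv4 address."""
    return ".".join(ip.split(".")[:2])


def hash_mod(data: str, modulus: int) -> int:
    """Map a string to a number in [0, modulus) using SHA-256."""
    digest = hashlib.sha256(data.encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class BucketedTable:
    """Toy address table: the bucket is chosen by address group, the slot by address."""

    def __init__(self, secret: str = "node-local-secret") -> None:
        self.secret = secret  # per-node value so an attacker cannot predict positions
        self.buckets: List[List[Optional[str]]] = [
            [None] * SLOTS_PER_BUCKET for _ in range(NUM_BUCKETS)
        ]

    def position(self, ip: str) -> Tuple[int, int]:
        bucket = hash_mod(f"{self.secret}|{address_group(ip)}", NUM_BUCKETS)
        slot = hash_mod(f"{self.secret}|{ip}", SLOTS_PER_BUCKET)
        return bucket, slot

    def add(self, ip: str, is_online: Callable[[str], bool]) -> bool:
        """Test before evict: keep the current occupant if it still responds."""
        bucket, slot = self.position(ip)
        occupant = self.buckets[bucket][slot]
        if occupant is not None and occupant != ip and is_online(occupant):
            return False
        self.buckets[bucket][slot] = ip
        return True

    def select(self, rng: random.Random) -> Optional[str]:
        """Pick a random non-empty bucket, then a random address inside it."""
        non_empty = [b for b in self.buckets if any(b)]
        if not non_empty:
            return None
        return rng.choice([ip for ip in rng.choice(non_empty) if ip is not None])


def select_peer(new: BucketedTable, tried: BucketedTable,
                rng: random.Random) -> Optional[str]:
    """Pick one of the two tables at random, then select an address from it."""
    return rng.choice([new, tried]).select(rng)


# An attacker with 100 contiguous addresses ends up confined to a single bucket.
tried = BucketedTable()
attacker_ips = [f"10.1.0.{i}" for i in range(100)]
for ip in attacker_ips:
    tried.add(ip, is_online=lambda _occupant: True)
stored = sum(entry is not None for bucket in tried.buckets for entry in bucket)
print(f"attacker addresses stored: {stored} of {len(attacker_ips)}")
```

Because the bucket depends only on the address group, 100 addresses from one block compete for the same few slots, while addresses spread across many blocks would reach many buckets. That is the cost the bucketing is meant to impose on an attacker.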

Week 4 discussion questions

Attacking P2P ⌗

Given the attack surface in P2P, is running a full node worth it?⌗

How many honest nodes do you need to be connected to be sure you are connected to the right network/blockchain?⌗

P2P Connections ⌗

What is the reasoning behind the max inbound and max outbound defaults? For which type of user would they be considered ideal, and when might they be optimized?⌗

What is the rationale behind the "new"/"tried" table design? Were there any prior inspirations within the field of distributed computing?⌗

How does a fixed set of 4 outbound peers get chosen? In what circumstances would you evict or change them?⌗

What are feeler connections, and when are they used?⌗

Eclipse Attacks on Bitcoin’s Peer-to-Peer Network ⌗

What can an attacker do if they are able to eclipse a mining pool?⌗

What is the difficulty of successfully achieving an eclipse attack? What resources and skills would be required to achieve such an attack?⌗

Mitigation techniques ⌗

Transaction Trivia ⌗

Why must transaction unlocking scripts only push numbers to be relayed?⌗

What output scripts are 'IsStandard'?⌗

Why must transactions be no less than 82 bytes to be relayed?⌗

Why is the blockheight now encoded in the coinbase transaction?⌗

[OPTIONAL] Researching P2P privacy attacks ⌗

How does "diffusion" message spreading work and why is it ineffective against de-anonymization?⌗