Overview
A normal Bitcoin transaction spends at least one output from a previous transaction by referencing that transaction's ID (TXID). Each unspent output can only be spent once; if outputs could be spent twice, you could double-spend your bitcoins, rendering them worthless. However, Bitcoin actually contains two pairs of transactions that are exactly identical. This is possible because coinbase transactions do not have any transaction inputs and instead create newly generated coins. Two different coinbase transactions can therefore send the same amount to the same address and be constructed in exactly the same way, making them identical. Since the transactions are identical, their TXIDs also match, because the TXID is a hash digest of the transaction data. The only other way a TXID could be duplicated is through a hash collision, which is considered infeasible with cryptographically secure hash functions. No collision in a hash function like SHA256 has ever been found, in Bitcoin or anywhere else.
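To make the "identical data gives an identical TXID" point concrete, here is a minimal sketch. It assumes the TXID is the double SHA256 of the serialized transaction, displayed in reversed byte order; the byte strings are made-up placeholders, not real serialized coinbase transactions.

```python
import hashlib

def txid(raw_tx: bytes) -> str:
    """Double SHA256 of the serialized transaction, shown in reversed hex byte order."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()

# Placeholder bytes standing in for serialized coinbase transactions (not real data).
coinbase_a = b"same outputs, same scriptSig"
coinbase_b = b"same outputs, same scriptSig"
coinbase_c = b"same outputs, different scriptSig"

print(txid(coinbase_a) == txid(coinbase_b))  # True: identical bytes, identical TXID
print(txid(coinbase_a) == txid(coinbase_c))  # False: any difference changes the TXID
```

A normal transaction cannot be repeated this way because its inputs reference distinct earlier outputs, so its serialized bytes always differ.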
Both sets of duplicate transactions occurred close together in time, between 08:37 UTC on November 14, 2010 and 00:38 UTC on November 15, 2010, a span of about 16 hours. The first set of duplicate transactions is sandwiched between the second set. We classify d5d2…8599 as the first duplicate transaction because it became a duplicate first, although, oddly, it first appeared on the blockchain after the other duplicate transaction, e3bf…b468.
Duplicate transaction details
In the images below, you can see two screenshots from the mempool.space block explorer, showing the first duplicate transaction being repeated in two different blocks.
Interestingly, when the relevant URLs are entered in a web browser, the mempool.space block explorer defaults to showing the earlier block in the case of d5d2…8599 and the later block in the case of e3bf…b468. Blockstream.info and Btcscan.org behave the same way as mempool.space. On the other hand, according to our basic testing, Blockchain.com and Blockchair.com behave differently and always show the latest version of a duplicate transaction when the URL is entered in the browser.
Of the four blocks in question, only one (block 91,812) contained an additional transaction, which combined the 1 BTC and 19 BTC outputs into a single 20 BTC output.
Can these outputs be spent?
Since each duplicated TXID appears twice, subsequent transactions face a referencing problem. Each duplicate transaction is worth 50 BTC, so the duplicates involve a total of 4 x 50 BTC = 200 BTC or, depending on how you interpret it, 2 x 50 BTC = 100 BTC. In a way, there are 100 BTC that don't actually exist. As of today, all 200 BTC are unspent. As far as we know (and we could be wrong here), anyone who has the private keys associated with these outputs can spend these bitcoins. However, once an output is spent, its entry is deleted from the UTXO database, so the duplicate 50 BTC becomes unspendable and is lost; only 100 BTC can ever be recovered. Whether the spent coins would come from the earlier or the more recent block may be undefined or impossible to determine.
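The accounting above can be illustrated with a toy model. This sketch assumes the unspent-output database behaves like a map keyed by (TXID, output index), so a duplicate simply overwrites the earlier entry; the dictionary, function names, and TXID label are hypothetical and do not reflect how any real node stores its data.

```python
# Hypothetical UTXO database: (txid, output index) -> amount in BTC.
utxo_set = {}

def add_outputs(txid, amounts):
    """Record the outputs of a confirmed transaction; an existing key is overwritten."""
    for index, amount in enumerate(amounts):
        utxo_set[(txid, index)] = amount

def spend(txid, index):
    """Spend an output: its entry is deleted, so a second spend raises KeyError."""
    return utxo_set.pop((txid, index))

DUP_TXID = "duplicate-coinbase-txid"  # placeholder label, not a real TXID

add_outputs(DUP_TXID, [50])  # coinbase in the earlier block
add_outputs(DUP_TXID, [50])  # identical coinbase in the later block
created = 2 * 50             # BTC issued on paper across both blocks
spendable = sum(utxo_set.values())
print(created, spendable)    # 100 50

recovered = spend(DUP_TXID, 0)
print(recovered, len(utxo_set))  # 50 0
```

With two duplicated pairs, the same effect applies twice: 200 BTC issued on paper, 100 BTC actually spendable.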
Someone could have spent all the bitcoins before the duplicate transaction was created, in which case the duplicate outputs would have created new entries in the database of unspent outputs. This would mean not only duplicate transactions, but also duplicate outputs that had already been spent once. If this had happened, spending those outputs again would have created more duplicate transactions, forming a kind of duplicate chain. The order of events matters: one would always have to spend before creating a duplicate, otherwise the bitcoins could be lost forever. These new duplicate transactions would not be coinbase transactions, but "normal" transactions. Fortunately, this has never happened.
The problem of duplicate transactions
Duplicate transactions are obviously bad. They cause confusion for wallets and block explorers, and make it unclear where bitcoins came from. They also open up many attacks and vulnerabilities. For example, you could pay someone twice with two duplicate transactions. Then, when the recipient tries to use the funds, they may find that only half of them can be recovered. This could be used to attack an exchange, for example, in an attempt to bankrupt it, while the attacker loses nothing because they can withdraw the funds immediately after depositing them.
Disallow transactions with duplicate TXIDs
To mitigate the duplicate transaction problem, in February 2012 Bitcoin developer Pieter Wuille proposed the BIP30 soft fork, which prohibits a transaction with a duplicate TXID unless the earlier transaction with that TXID has been spent. This soft fork applied to all blocks after March 15, 2012.
In September 2012, Bitcoin developer Greg Maxwell modified this rule so that the BIP30 check applies to all blocks, not just those after March 15, 2012. The exceptions are the two duplicate transactions mentioned earlier in this article. This fixed some DoS vulnerabilities. Technically, this was another soft fork, although the rule change only affected blocks that were more than 6 months old, so it did not carry any of the risks associated with normal protocol rule changes.
This BIP30 check is computationally expensive. The node needs to go through all transaction outputs in the new block and check whether any of them already exists in the UTXO set. This is probably why Wuille's rule only checks against unspent outputs. If all outputs, spent and unspent, had to be checked, the computational cost would be higher and pruning would not be possible.
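The check described above can be sketched as follows. This is only an illustration under simplifying assumptions: the UTXO set is modeled as a dictionary keyed by (TXID, output index), a block is a list of (TXID, output count) pairs, and `violates_bip30` is a hypothetical name rather than a function from any real node software.

```python
def violates_bip30(block_txs, unspent):
    """Return True if any output in the new block would overwrite an existing unspent output.

    block_txs: list of (txid, output_count) pairs for the transactions in the block.
    unspent:   mapping of (txid, output_index) -> amount, standing in for the UTXO set.
    """
    for txid, output_count in block_txs:
        for index in range(output_count):
            # One lookup per output is what makes the check costly on every block.
            if (txid, index) in unspent:
                return True
    return False

unspent = {("aa", 0): 50}
print(violates_bip30([("aa", 1)], unspent))  # True: "aa" still has an unspent output
print(violates_bip30([("bb", 2)], unspent))  # False: "bb" has never been seen

del unspent[("aa", 0)]                       # the earlier output gets spent
print(violates_bip30([("aa", 1)], unspent))  # False: a fully spent TXID may be reused
```

Because only the unspent set is consulted, a node does not need to keep spent outputs around, which is what keeps the rule compatible with pruning.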
BIP34
In July 2012, Bitcoin developer Gavin Andresen proposed the BIP34 soft fork, which activated in March 2013. This protocol change requires the coinbase transaction to include the block height, which also makes block versioning possible. The block height is added as the first item in the coinbase transaction's scriptSig. The first byte of the coinbase scriptSig is the number of bytes used for the block height, and the following bytes are the block height itself. For the first c. 160 years (2^23 / (144 blocks per day * 365 days per year)), the first byte should be 0x03. This is why a coinbase scriptSig (in hex) today always starts with 03. This soft fork seemed to have completely solved the duplicate transaction problem, so that all transactions should now be unique.
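A minimal sketch of this encoding, assuming the height is pushed as a minimally encoded little-endian script number preceded by its length byte. The function name `encode_height` is hypothetical, and very small heights (1 to 16), which scripts can encode differently, are deliberately left out of scope.

```python
def encode_height(height: int) -> bytes:
    """Length byte followed by the height as a little-endian script number."""
    if height < 17:
        raise ValueError("small heights are out of scope for this sketch")
    body = bytearray()
    n = height
    while n:
        body.append(n & 0xFF)
        n >>= 8
    if body[-1] & 0x80:
        body.append(0x00)  # keep the number positive: the top bit is the sign bit
    return bytes([len(body)]) + bytes(body)

print(encode_height(1_983_702).hex())  # 03d6441e
print(encode_height(2**23 - 1).hex())  # 03ffff7f (last height that fits in 3 bytes)
print(encode_height(2**23).hex())      # 0400008000 (first height needing 4 bytes)

years = 2**23 / (144 * 365)
print(round(years, 1))                 # 159.6
```

The sign bit is why the 3-byte range ends at 2^23 rather than 2^24, which gives the roughly 160-year figure.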
Since BIP34 had already activated, in November 2015 Bitcoin developer Alex Morcos submitted a pull request to the Bitcoin Core software repository, a change that meant nodes would stop doing the BIP30 check. After all, since BIP34 had fixed the problem, this expensive check no longer seemed necessary. Although it was not known at the time, this was technically a hard fork for some very rare blocks in the future. The potential hard fork now seems unimportant, because almost no one runs node software from before November 2015. At forkmonitor.info, we run Bitcoin Core 0.10.3, which was released in October 2015. This client therefore follows the pre-hard-fork rules and still performs the expensive BIP30 check.
The block 1,983,702 problem
It turns out that some coinbase transactions in blocks from before BIP34 activated used scriptSigs whose first bytes happen to match a valid future block height. So while BIP34 does fix the problem in almost all cases, it is not a complete 100% fix. In 2018, Bitcoin developer John Newbery published a full list of these potential duplicates, as shown in the table below.
*Note: These blocks were already produced in 2012 and 2017, and their coinbase transactions were not duplicates. For block 209,921 (only 79 blocks away from the first halving), a duplicate was not possible because BIP30 was already being enforced by then.
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
Number of potentially duplicate Coinbase transactions by year
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
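The kind of match described in this section can be sketched by reading an old scriptSig the way BIP34 would. This is a simplified illustration: `implied_bip34_height` is a hypothetical helper, it only handles a direct push of 1 to 4 bytes, and the example bytes are constructed for the demonstration rather than copied from any real block.

```python
def implied_bip34_height(script_sig: bytes):
    """Height a BIP34-style reading of the scriptSig would give, or None if it does not parse."""
    if not script_sig:
        return None
    push_len = script_sig[0]
    if not 1 <= push_len <= 4 or len(script_sig) < 1 + push_len:
        return None
    body = script_sig[1:1 + push_len]
    if body[-1] & 0x80:
        return None  # sign bit set: would decode to a negative number
    return int.from_bytes(body, "little")

# Constructed example: a 3-byte push followed by arbitrary miner data.
old_script_sig = bytes.fromhex("03d6441e") + b"arbitrary miner data"
print(implied_bip34_height(old_script_sig))  # 1983702
print(implied_bip34_height(b""))             # None
```

A pre-BIP34 coinbase is a candidate for duplication only if the implied height is higher than the height of the block it actually appeared in, so that a future block could legally reuse the same scriptSig.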
Therefore, the next block that could contain a duplicate transaction is 1,983,702, which will be produced around January 2046. The coinbase transaction in block 164,384, which was produced in January 2012, sent 170 BTC to seven different output addresses. A miner wanting to carry out this attack in 2046 would therefore not only need to be lucky enough to find this block, but would also need to burn just under 170 BTC in fees, for a total cost of slightly more than 170 BTC once the opportunity cost of the 0.09765625 BTC block subsidy is included. At the current Bitcoin price of $88,500, this would cost more than $15 million. It is still unknown who owns the seven addresses of the 2012 coinbase transaction, and it is very likely that the keys have been lost. All seven output addresses of that coinbase transaction have since been used, three of them in the same transaction. We believe that these funds may be related to the Pirate40 Ponzi scheme, but this is just our speculation. The attack therefore looks not only costly, but also almost useless to the attacker: it would be a great deal of money to spend merely to hard-fork nodes running software from before November 2015, some 31 years old by then, off the network.
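The figures in this paragraph can be reproduced with a short calculation. The sketch assumes the standard issuance schedule (a 50 BTC initial subsidy that halves every 210,000 blocks) and uses the rounded 170 BTC payout and the $88,500 price quoted above; the constant and function names are illustrative only.

```python
SATS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000

def block_subsidy_sats(height: int) -> int:
    """Block subsidy in satoshis: 50 BTC halved once per 210,000 blocks."""
    halvings = height // HALVING_INTERVAL
    return 0 if halvings >= 64 else (50 * SATS_PER_BTC) >> halvings

TARGET_HEIGHT = 1_983_702         # block that could reuse the old coinbase
PAYOUT_SATS = 170 * SATS_PER_BTC  # rounded payout of the 2012 coinbase (from the text)
PRICE_USD = 88_500                # price quoted in the text

subsidy = block_subsidy_sats(TARGET_HEIGHT)
fees_to_burn = PAYOUT_SATS - subsidy  # the rest of the payout must come from fees

print(subsidy / SATS_PER_BTC)                   # 0.09765625
print(fees_to_burn / SATS_PER_BTC)              # 169.90234375
print(PAYOUT_SATS // SATS_PER_BTC * PRICE_USD)  # 15045000
```

With the payout rounded to exactly 170 BTC, the fees burned plus the forgone subsidy add up to 170 BTC; the "slightly more than 170 BTC" in the text presumably reflects the unrounded payout.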
The next vulnerable block that could be duplicated is 169,985, from March 2012. Its coinbase paid out only just over 50 BTC, which is far less than 170 BTC. Of course, 50 BTC was the subsidy at the time, and when this coinbase transaction becomes duplicable in 2078, the subsidy will be much lower. So to exploit this, miners would need to burn about 50 BTC in fees, which they could not get back because the funds would have to be sent to the old outputs from 2012. No one knows what the price of Bitcoin will be in 2078, but the cost of such an attack could also be prohibitively high. So this issue may not be a major risk for Bitcoin, but it is still a concern.
Since the 2017 SegWit upgrade, coinbase transactions can also contain a commitment to all transactions in a block. The pre-BIP34 blocks did not contain witness commitments. Therefore, to produce a duplicate coinbase transaction, miners would need to exclude any transactions that redeem SegWit outputs from the block. This further increases the opportunity cost of the attack, because the block may not be able to contain many other fee-paying transactions.
Conclusion
Given the difficulty and cost of duplicating transactions, and the rarity of the opportunities to do so, the duplicate transaction vulnerability doesn't feel like a major security issue for Bitcoin. Still, it's interesting to think about, given the timescales involved and the novelty of duplicate transactions. Developers have nonetheless spent a lot of time on this issue over the years, and some of them have the year 2046 in mind as a possible deadline for fixing it. There are a number of ways to fix this bug, which may require a soft fork. One possible fix is to make the SegWit commitment mandatory.