Author: Nickqiao & Faust & Shew Wang, Geekweb3
Abstract: Recently, Delphi Digital released a research report on Bitcoin Layer 2 technology titled "The Dawn of Bitcoin Programmability: Paving the Way for Rollups". It systematically lays out the core concepts around Bitcoin rollups, including the BitVM family, OP_CAT and covenants, data availability (DA) layers in the Bitcoin ecosystem, bridges, and four major Bitcoin Layer 2 projects that use BitVM: Bitlayer, Citrea, Yona, and BOB.
Although the report gives a good overview of Bitcoin Layer 2 technology, it stays at a high level and lacks detailed explanations, which can leave readers confused. Building on the Delphi report, Geekweb3 has dug deeper, aiming to help more people understand BitVM and related technologies in a systematic way.
We will work with the Bitlayer research team and the BitVM Chinese community to launch a column series called "Approaching BTC", with a long-term focus on key topics such as BitVM, OP_CAT, and Bitcoin cross-chain bridges. We are committed to demystifying Bitcoin Layer 2 technologies for more people and paving the way for more enthusiasts.
Text: A few months ago, Robin Linus, head of ZeroSync, published an article titled "BitVM: Compute Anything on Bitcoin", which formally proposed the concept of BitVM and advanced Bitcoin Layer 2 technology. It is arguably one of the most revolutionary innovations in the Bitcoin ecosystem: it ignited the entire Bitcoin Layer 2 ecosystem, attracted star projects such as Bitlayer, Citrea, and BOB, and brought new vitality to the market.
After that, more researchers joined the effort to improve BitVM, successively releasing iterations such as BitVM1, BitVM2, BitVMX, and BitSNARK. In outline:
The BitVM white paper first published by Robin Linus last year describes an implementation scheme based on virtual logic gate circuits, referred to as BitVM0;
In several subsequent talks and interviews, Robin Linus informally introduced a BitVM scheme based on a virtual CPU (called BitVM1). It is similar to Cannon, Optimism's fraud proof system, and can use Bitcoin script to emulate the behavior of a general-purpose CPU (off-chain).
Robin Linus also proposed BitVM2, a permissionless, single-step, non-interactive fraud proof protocol.
Members of Rootstock Labs and Fairgate Labs released the BitVMX white paper. Like BitVM1, it aims to emulate the behavior of a general-purpose CPU (off-chain) through Bitcoin script.
Currently, the BitVM developer ecosystem is taking clearer shape, and the surrounding tooling is visibly iterating and improving. Compared with last year, the BitVM ecosystem has gone from a "castle in the air" to something "vaguely visible", which has drawn more and more developers and VCs into the Bitcoin ecosystem.
But for most people, the technical terms around BitVM and Bitcoin Layer 2 are not easy to understand, because they require a systematic grasp of the underlying basics, especially background knowledge such as Bitcoin script and Taproot. The reference materials currently available online are either long-winded and padded, or not thorough enough to leave readers with a clear picture. We aim to solve these problems: to help more people understand the knowledge surrounding Bitcoin Layer 2 in the clearest language possible, and to build a systematic understanding of the BitVM family.
MATT and Commitment: The Basic Idea of BitVM
First of all, we must emphasize that the basic idea behind BitVM is MATT, which stands for Merkleize All The Things. It refers to using tree-shaped data structures such as Merkle trees to represent the execution of a complex program, so that fraud proofs can be verified natively on Bitcoin.
Although MATT can express a complex program and its execution trace, it does not publish this data directly on the BTC chain, because the data is very large overall. A MATT-based scheme stores the data in an off-chain Merkle tree and publishes only the digest at the top of the tree (the Merkle root) on-chain. The Merkle tree mainly covers three things:
· Smart contract script code
· Data required for the contract
· Traces left during contract execution (records of changes to memory and CPU registers when smart contracts are executed in virtual machines such as EVM)
(A simple Merkle Tree diagram, whose Merkle Root is obtained by multi-layer hash calculation of the 8 data fragments at the bottom of the diagram)
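To make the diagram concrete, here is a minimal sketch of how a Merkle root can be computed from 8 data fragments. It assumes SHA-256 as the hash function and duplicates the last node when a level has an odd number of nodes; both choices, and the names `merkle_root` and `fragments`, are illustrative assumptions rather than the exact construction any BitVM variant uses.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash every leaf, then hash adjacent pairs level by level until one node is left."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Eight placeholder fragments, matching the 8 leaves in the diagram.
fragments = [f"fragment-{i}".encode() for i in range(8)]
root = merkle_root(fragments)
print(len(root), "byte root:", root.hex())
```

However large the fragments are, the root stays 32 bytes, and changing any single fragment produces a different root.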
Under the MATT scheme, only the extremely small Merkle root is stored on-chain, while the complete data set contained in the tree is stored off-chain. This relies on an idea called "commitment", explained below.
A commitment is like a condensed statement; it can be understood as a "fingerprint" obtained by compressing a large amount of data. Generally, whoever publishes a commitment on-chain is claiming that certain data stored off-chain is accurate, and that off-chain data must correspond to the condensed statement, which is the commitment.
In the simplest case, the hash of the data can serve as a commitment to the data itself. Other commitment schemes include KZG commitments and Merkle trees. In the fraud proof protocols commonly used by Layer 2s, the data publisher publishes the complete data set off-chain and publishes a commitment to the data set on-chain. If someone finds invalid data in the off-chain data set, they can challenge the commitment on-chain.
Through commitments, a Layer 2 can compress a large amount of data and publish only its commitment on the Bitcoin chain. Of course, it must also ensure that the complete data set published off-chain can be observed by outsiders.
Currently, several major BitVM solutions such as BitVM0, BitVM1, BitVM2 and BitVMX basically use similar abstract structures:
1. Program decomposition and commitment: First, decompose a complex program into a large number of relatively basic opcodes (compilation), then record the traces these opcodes produce during a concrete execution (in plain terms, a record of all state changes in the CPU and memory while the program runs, called the Trace). After that, organize all the data, including traces and opcodes, into a data set and generate a commitment to the data set.
Commitment schemes can take many forms, such as Merkle trees, PIOPs (various ZK algorithms), or hash functions.
2. Asset staking and pre-signing: The data publisher and the verifiers lock a certain amount of assets on-chain through pre-signed transactions, subject to spending conditions. These conditions are designed to be triggered by specific situations that may arise later. If the data publisher misbehaves, a verifier can submit a proof and take the publisher's assets.
3. Data and commitment release: The data publisher publishes the commitment on-chain and the complete data set off-chain. Verifiers retrieve the data set and check it for errors. Every part of the off-chain data set is bound to the on-chain commitment.
4. Challenge and punishment: Once a verifier finds that the data provided by the publisher is wrong, it brings that part of the data on-chain for direct verification (the data must first be cut into very fine pieces), which is the logic of a fraud proof. If verification shows that the publisher did provide invalid data off-chain, its assets are taken by the verifier who challenged it.
To sum up, the data publisher Alice discloses off-chain all traces generated while executing Layer 2 transactions and publishes the corresponding commitment on-chain. To prove that some part of the data is wrong, a challenger first proves to Bitcoin nodes that this part of the data is bound to the on-chain commitment, that is, that it was published by Alice herself, and then has Bitcoin nodes confirm that this part of the data is wrong.
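The first half of that argument, showing that a data fragment really belongs to Alice's commitment, can be sketched as a Merkle inclusion proof. This is a toy model under stated assumptions: SHA-256 hashing, an 8-step placeholder trace, and hypothetical helper names (`build_levels`, `make_proof`, `verify_proof`). Real BitVM designs perform the check inside Bitcoin script rather than in Python.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from the hashed leaves up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # assumption: duplicate the last node on odd levels
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at each level on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify_proof(root, leaf, proof):
    """Recompute the root from one leaf and its sibling path."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Alice's off-chain trace (placeholder data) and the commitment she publishes on-chain.
trace = [f"step-{i}: state after opcode {i}".encode() for i in range(8)]
levels = build_levels(trace)
commitment = levels[-1][0]

# A challenger shows that step 5 is covered by the commitment using only 3 sibling hashes.
proof = make_proof(levels, 5)
print(verify_proof(commitment, trace[5], proof))       # True
print(verify_proof(commitment, b"forged step", proof))  # False
```

The proof grows with the logarithm of the data set size, which is why only a finely cut fragment plus a short sibling path has to reach the chain.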
Now we have a general understanding of the overall idea of BitVM, and all BitVM variants are basically inseparable from the above paradigm. Next, let's start learning and understanding some of the important technologies used in the above process, starting with the most basic Bitcoin scripts, Taproot, and pre-signatures.
What is Bitcoin Script
Bitcoin's fundamentals are harder to grasp than Ethereum's. Even the most basic transfer involves a series of concepts, including the UTXO (unspent transaction output), the locking script (also known as ScriptPubKey), and the unlocking script (also known as ScriptSig). Let's first explain these main concepts.
(An example of a Bitcoin script code, which consists of operation codes at a lower level than high-level languages)
Ethereum represents assets more like Alipay or WeChat Pay. Each transfer is just addition and subtraction on the balances of different accounts. This approach is account-centric, and an asset balance is just a number under the account name. Bitcoin represents assets more like gold. Each piece of gold (UTXO) is marked with its owner. A transfer actually destroys the old UTXO and generates a new UTXO (with a different owner).
Bitcoin UTXO contains two key fields:
Amount, in satoshi (100 million satoshis equal one BTC);
Locking script, also known as "ScriptPubKey", which defines the conditions for unlocking the UTXO.
Note that ownership of a Bitcoin UTXO is expressed through the locking script. If you want to transfer your UTXO to Sam, you can initiate a transaction that destroys one of your UTXOs and writes the unlocking condition of the newly generated UTXO as "only Sam can unlock".
Afterwards, if Sam wants to use these bitcoins, he needs to submit an unlocking script (ScriptSig), in which Sam must present his own digital signature to prove that he is Sam himself.
If the unlocking script matches the aforementioned locking script, Sam can unlock and transfer these bitcoins to others.
(The unlocking script must match the locking script)
From the perspective of expression, each transaction on the Bitcoin chain corresponds to multiple Inputs and Outputs, and each Input must declare a certain UTXO that you want to unlock, and submit an unlocking script to unlock and destroy the UTXO; the newly generated UTXO information will be displayed in the Output, and the content of the locking script will be made public.
For example, in the Input of a transaction, you prove that you are Sam, unlock several UTXOs that others gave you, and destroy them together; you then generate several new UTXOs and declare who will be able to unlock them in the future.
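The input/output bookkeeping described above can be modelled in a few lines. This is a simplified sketch, not Bitcoin's actual data structures: the `owner` string stands in for the locking script, `signer` stands in for a valid unlocking script, and the transaction IDs are made-up labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    amount: int  # in satoshis
    owner: str   # stand-in for the locking script


# The set of unspent outputs, keyed by (transaction ID, output index).
utxo_set = {
    ("tx_a", 0): Output(30_000, "Sam"),
    ("tx_b", 1): Output(50_000, "Sam"),
}


def apply_transaction(utxo_set, txid, inputs, outputs, signer):
    """Destroy the referenced UTXOs, create the new ones, and return the fee."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("the same UTXO is referenced twice")
    for ref in inputs:
        if ref not in utxo_set:
            raise ValueError(f"{ref} does not exist or is already spent")
        if utxo_set[ref].owner != signer:
            raise PermissionError(f"{signer} cannot unlock {ref}")
    total_in = sum(utxo_set[ref].amount for ref in inputs)
    total_out = sum(out.amount for out in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    for ref in inputs:
        del utxo_set[ref]                 # old UTXOs are destroyed
    for i, out in enumerate(outputs):
        utxo_set[(txid, i)] = out         # new UTXOs are created
    return total_in - total_out


# Sam merges two UTXOs, pays Alice 60,000 satoshis, and keeps 19,000 as change.
fee = apply_transaction(
    utxo_set,
    "tx_c",
    inputs=[("tx_a", 0), ("tx_b", 1)],
    outputs=[Output(60_000, "Alice"), Output(19_000, "Sam")],
    signer="Sam",
)
print(fee)  # 1000
```

Whatever the inputs hold beyond the declared outputs is left as the fee, which is why the sender has to compute every output, including change, before broadcasting.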
Specifically, in a transaction's Input data you must declare which UTXOs you want to unlock and indicate where the data for those UTXOs is stored. Note that Bitcoin and Ethereum differ completely here. Ethereum offers two account types, contract accounts and EOAs, to store data. An asset balance is recorded as a number under a contract account or EOA, and all accounts live in a single database called the "world state". A transfer directly modifies the relevant accounts in the world state, so the data is easy to locate.
Bitcoin has no world state. Asset data is scattered across past blocks: each unspent UTXO is stored in the Output of the transaction that created it.
To unlock a UTXO, you must indicate which past transaction's Output contains it and give that transaction's ID (its hash), so that Bitcoin nodes can look it up in the historical record. To query the Bitcoin balance of an address, you would have to traverse all blocks from the beginning and find the unspent UTXOs associated with that address.
When using a Bitcoin wallet, you can quickly check the Bitcoin balance of a certain address. In many cases, this is because the wallet service itself has established an index for all addresses by scanning blocks, which makes it easy for us to query quickly.
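A rough sketch of the kind of index a wallet service might build by scanning blocks is shown below. The block and transaction layout here is a made-up toy format (dicts with `txid`, `inputs`, and `outputs`), chosen only to illustrate the scan; it is not how any particular wallet stores data.

```python
from collections import defaultdict

# Toy chain: each block is a list of transactions. An input is a reference
# (txid, output index) to an earlier output; an output is (address, amount).
blocks = [
    [{"txid": "t1", "inputs": [], "outputs": [("addr_sam", 50)]}],
    [{"txid": "t2", "inputs": [("t1", 0)], "outputs": [("addr_alice", 20), ("addr_sam", 30)]}],
    [{"txid": "t3", "inputs": [("t2", 0)], "outputs": [("addr_bob", 20)]}],
]


def build_index(blocks):
    """Replay every block from the start and sum the unspent outputs per address."""
    unspent = {}
    for block in blocks:
        for tx in block:
            for ref in tx["inputs"]:
                unspent.pop(ref)  # a referenced output is now spent
            for i, (address, amount) in enumerate(tx["outputs"]):
                unspent[(tx["txid"], i)] = (address, amount)
    balances = defaultdict(int)
    for address, amount in unspent.values():
        balances[address] += amount
    return dict(balances)


balances = build_index(blocks)
print(balances)  # {'addr_sam': 30, 'addr_bob': 20}
```

The chain itself never stores a "balance" for `addr_sam`; the number only exists in the index that the scanner derives.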
(When you generate a transaction statement to send your UTXO to others, you need to mark the position of the UTXO in the Bitcoin history record according to the transaction hash/ID to which these UTXOs belong)
Interestingly, the results of Bitcoin transactions are calculated off-chain. When users generate transactions on local devices, they must directly create all the Input and Output, which is equivalent to calculating the output results of the transaction. Transactions are broadcast to the Bitcoin network and verified by nodes before they are put on the chain. This mode of "off-chain calculation-on-chain verification" is completely different from Ethereum. On Ethereum, you only need to provide transaction input parameters, and the transaction results are calculated and output by Ethereum nodes.
In addition, the locking script of a UTXO is customizable. You can make a UTXO "unlockable by the owner of a certain Bitcoin address", in which case the owner of the address must provide a digital signature and public key (P2PKH). With the Pay-to-Script-Hash (P2SH) transaction type, you can put a script hash in the UTXO's locking script. Whoever can submit the original script (the preimage) matching this hash, and satisfy the conditions defined in that script, can unlock the UTXO. The Taproot scripts that BitVM relies on use a mechanism similar to P2SH.
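The P2SH idea (commit to a script hash now, reveal and run the script later) can be sketched as follows. Several things here are simplifying assumptions: plain SHA-256 stands in for Bitcoin's HASH160, and the `M:name1,name2,...` multisig format and the function names are invented for illustration.

```python
import hashlib


def script_hash(script: bytes) -> bytes:
    # Assumption: plain SHA-256 stands in for HASH160 = RIPEMD160(SHA256(script)).
    return hashlib.sha256(script).digest()


def run_toy_multisig(script: bytes, signers: list[str]) -> bool:
    """Toy script format 'M:name1,name2,...': at least M of the listed names must sign."""
    threshold, names = script.decode().split(":")
    allowed = set(names.split(","))
    return len(allowed & set(signers)) >= int(threshold)


def can_unlock(committed_hash: bytes, revealed_script: bytes, signers: list[str]) -> bool:
    # Step 1: the revealed script must be the preimage of the hash in the locking script.
    if script_hash(revealed_script) != committed_hash:
        return False
    # Step 2: the conditions defined inside the revealed script must be satisfied.
    return run_toy_multisig(revealed_script, signers)


redeem_script = b"2:alice,bob,carol"       # a 2-of-3 condition
locked_hash = script_hash(redeem_script)   # only this hash goes into the UTXO

print(can_unlock(locked_hash, redeem_script, ["alice", "carol"]))  # True
print(can_unlock(locked_hash, redeem_script, ["alice"]))           # False
print(can_unlock(locked_hash, b"1:mallory", ["mallory"]))          # False
```

The UTXO creator only pays for storing the hash; the full script appears on-chain only when someone spends.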
How to trigger Bitcoin script
Here we first use P2PKH as an example to introduce the triggering method of Bitcoin script. Only by understanding its triggering method can we understand the more complex Taproot and BitVM. P2PKH stands for "Pay to Public Key Hash". Under this scheme, a public key hash will be set in the locking script of UTXO. When unlocking, the public key corresponding to the hash needs to be submitted, which is basically the same as the conventional Bitcoin transfer idea.
At this time, the Bitcoin node must make sure that the public key in the unlocking script matches the public key hash specified in the locking script, that is, it is necessary to make sure that the "key" submitted by the unlocker matches the "lock" preset by UTXO.
Furthermore, under the P2PKH scheme, after receiving the transaction, the Bitcoin node concatenates the unlocking script ScriptSig supplied by the user with the locking script ScriptPubKey of the UTXO to be unlocked, and runs them in the BTC script execution environment. The following figure shows the concatenated result before execution:
Readers may not be familiar with the BTC script execution environment, so here is a brief introduction. A BTC script contains two kinds of elements: data and opcodes. They are processed in order from left to right: data items are pushed onto a stack, and opcodes operate on the stack according to their defined logic, producing the final result (we will not explain what a stack is here; readers can look it up themselves).
Take the above picture as an example, the unlocking script ScriptSig uploaded by someone on the left contains his digital signature and public key, while the locking script ScriptPubkey on the right contains a section of opcodes and data set by the UTXO creator when generating the UTXO (here we don't need to understand the meaning of each opcode, just understand the general idea).
The opcodes such as DUP, HASH160, and EQUALVERIFY in the locking script on the right side of the above picture are responsible for taking the hash of the Public key carried in the unlocking script on the left and comparing it with the Public key hash preset in the locking script. If the two are equal, it means that the public key uploaded in the unlocking script matches the public key hash preset in the locking script, which passes the first verification.
However, there is a problem. The content of the UTXO locking script is public on-chain. Anyone can observe the public key hash it contains, and anyone can upload the corresponding public key and falsely claim to be the "appointed" person. Therefore, after checking the public key against the public key hash, it is also necessary to verify that the transaction initiator actually controls the public key, which requires verifying the digital signature. The CHECKSIG opcode in the locking script is responsible for verifying the digital signature.
To summarize, under the P2PKH scheme, the unlocking script submitted by the transaction initiator contains a public key and a digital signature. The public key must match the public key hash specified in the locking script, and the digital signature of the transaction must be correct. Only when these conditions are met can the UTXO be successfully unlocked.
(Animated figure: schematic of script unlocking under the P2PKH scheme. Source: https://learnmeabitcoin.com/technical/script )
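The P2PKH flow can be traced with a toy stack machine. This is a sketch under loud assumptions: SHA-256 truncated to 20 bytes stands in for HASH160, and `checksig` is a lookup table that stands in for real ECDSA verification (it simply lists which signature and public key pairs count as valid). Only the four opcodes named in the text are supported.

```python
import hashlib


def hash160_standin(data: bytes) -> bytes:
    # Assumption: truncated SHA-256 stands in for RIPEMD160(SHA256(data)).
    return hashlib.sha256(data).digest()[:20]


def run_script(items, checksig):
    """Process items left to right: bytes are pushed, opcode strings act on the stack."""
    stack = []
    for item in items:
        if isinstance(item, bytes):
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(hash160_standin(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False  # public key does not match the committed hash
        elif item == "OP_CHECKSIG":
            pubkey, signature = stack.pop(), stack.pop()
            stack.append(b"\x01" if checksig(signature, pubkey) else b"")
        else:
            raise ValueError(f"unsupported opcode {item}")
    return bool(stack) and stack[-1] != b""


sam_pubkey = b"sam-public-key"
sam_signature = b"sam-signature-over-this-tx"
valid_pairs = {(sam_signature, sam_pubkey)}  # placeholder for real signature checking


def checksig(signature, pubkey):
    return (signature, pubkey) in valid_pairs


# Locking script set by the UTXO creator; unlocking script supplied by the spender.
script_pubkey = ["OP_DUP", "OP_HASH160", hash160_standin(sam_pubkey),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]

ok = run_script([sam_signature, sam_pubkey] + script_pubkey, checksig)
wrong_key = run_script([b"sig", b"mallory-key"] + script_pubkey, checksig)
forged_sig = run_script([b"forged", sam_pubkey] + script_pubkey, checksig)
print(ok, wrong_key, forged_sig)  # True False False
```

The second case fails at EQUALVERIFY and the third at CHECKSIG, matching the two verification steps described above.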
Of course, the Bitcoin network supports multiple transaction types, not only Pay to public key/public key hash, but also P2SH (Pay to Script hash), etc. Everything depends on how the custom locking script is set when UTXO is created.
Note that under the P2SH scheme, a script hash can be preset in the locking script, and the unlocking script must submit the full script content corresponding to that hash. Bitcoin nodes can then execute this script. If the script defines multi-signature verification logic, the effect of a multi-signature wallet can be achieved on the Bitcoin chain.
Of course, under the P2SH scheme, the UTXO creator must let the person who unlocks the UTXO in the future know the script content corresponding to the Script Hash in advance. As long as both parties know the content of this Script, we can implement more complex business logic than multi-signature.
One thing to note is that the Bitcoin chain (its blocks) does not directly record which UTXO is associated with which address. It only records which public key hash/script hash can unlock the UTXO. The corresponding address (the string that looks like gibberish in a wallet interface) can be computed quickly from the public key hash/script hash.
The reason why we can see xx amount of Bitcoin under xx address in the block browser and wallet interface is because the block browser and wallet project help you parse this data, scan all blocks and calculate the corresponding "address" based on the public key hash/script hash declared in the locking script, and then display how many Bitcoins are under the xx address.
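For legacy address formats, that calculation is Base58Check encoding: prepend a version byte to the 20-byte hash, append a 4-byte double-SHA-256 checksum, and encode in base 58. The sketch below follows that recipe for mainnet P2PKH (version 0x00) and P2SH (version 0x05). The sample `pubkey_hash` is arbitrary illustrative input, and SegWit (bech32) addresses use a different encoding that is not covered here.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: bytes, payload: bytes) -> str:
    raw = version + payload + checksum(version + payload)
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded  # each leading zero byte becomes a '1'


def base58check_decode(address: str) -> tuple[bytes, bytes]:
    num = 0
    for ch in address:
        num = num * 58 + ALPHABET.index(ch)
    leading = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading + num.to_bytes((num.bit_length() + 7) // 8, "big")
    data, check = raw[:-4], raw[-4:]
    if checksum(data) != check:
        raise ValueError("bad checksum")
    return data[:1], data[1:]  # (version byte, 20-byte hash)


# Arbitrary 20-byte hash used as sample input.
pubkey_hash = bytes.fromhex("010966776006953d5567439e5e39f86a0d273bee")
address = base58check_encode(b"\x00", pubkey_hash)
print(address)
print(base58check_decode(address) == (b"\x00", pubkey_hash))  # True
```

Because the mapping is reversible, a block explorer can go from the hash in a locking script to a displayable address, and a wallet can go from an address back to the hash it needs to put in a locking script.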
Segregated Witness and Witness
When we understand the idea of P2SH, we are one step closer to Taproot, which BitVM relies on. But before that, we need to understand an important concept: Witness and Segregated Witness.
Looking back at the unlocking and locking scripts and the UTXO unlocking process described above, we find a problem: the transaction's digital signature sits inside the unlocking script, and the signature cannot cover the unlocking script itself (the data being signed cannot include the signature). The signature therefore covers only the parts outside the unlocking script; it is bound to the main body of the transaction data but does not cover the transaction data completely.
As a result, even if a middleman slightly tampers with a transaction's unlocking script, signature verification still passes. For example, a Bitcoin node or mining pool can insert other data into the unlocking script. This changes the transaction data slightly without affecting signature verification or the transaction's outcome, but the final computed transaction hash/transaction ID changes. This is known as the transaction malleability problem.
The downside shows up when you plan to issue several transactions in a row that depend on each other in sequence (for example, transaction 3 references the output of transaction 2, and transaction 2 references the output of transaction 1): each later transaction must reference the ID (hash) of the one before it. Any middleman, such as a mining pool or Bitcoin node, can tweak the unlocking script so that the transaction's hash once on-chain differs from what you expected, and the chain of sequentially linked transactions you built in advance becomes invalid.
In fact, in the DLC bridge and BitVM2 solutions, transactions with sequential associations are constructed in batches, so the aforementioned scenarios are not uncommon.
In short, transaction malleability arises because the unlocking script data is included when computing the transaction ID/hash, and middlemen such as Bitcoin nodes can tweak the unlocking script so that the transaction ID no longer matches what the user expected. This is a historical burden left over from Bitcoin's early design.
The later Segregated Witness (SegWit) upgrade fully decouples the transaction ID from the unlocking script: the unlocking script data is no longer included when computing the transaction hash. By default, a UTXO locking script that follows the SegWit rules places the opcode "OP_0" in the first position as a marker, and the corresponding unlocking script is renamed from ScriptSig to Witness.
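The malleability (ductility) issue and the SegWit fix can be illustrated with a toy model of transaction ID calculation. The byte strings below are placeholders, not real serialized transactions, and appending the unlocking script to the end of the core data is a simplification (in a real legacy transaction it sits inside each input). The one faithful detail is that a transaction ID is a double SHA-256 hash.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def legacy_txid(core: bytes, script_sig: bytes) -> str:
    # Legacy rule: the unlocking script is part of the data that gets hashed.
    return double_sha256(core + script_sig).hex()


def segwit_txid(core: bytes, witness: bytes) -> str:
    # SegWit rule: the witness is left out of the transaction ID.
    return double_sha256(core).hex()


core = b"version|input:prev_txid:0|output:50000->sam|locktime"  # placeholder fields
original_unlock = b"<signature><pubkey>"
tweaked_unlock = b"<same signature, re-encoded><pubkey>"  # a middleman's harmless-looking tweak

# You pre-build a child transaction that references the parent by its expected ID.
expected_parent_id = legacy_txid(core, original_unlock)
confirmed_parent_id = legacy_txid(core, tweaked_unlock)
print(expected_parent_id == confirmed_parent_id)  # False: the pre-built child is orphaned

# With the witness excluded, the same tweak leaves the ID untouched.
print(segwit_txid(core, original_unlock) == segwit_txid(core, tweaked_unlock))  # True
```

This is why schemes that pre-sign long chains of dependent transactions need a stable transaction ID for every link in the chain.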
Under the Segregated Witness rules, the transaction malleability problem is properly solved, and you no longer need to worry about the transaction data you send to Bitcoin nodes being tweaked. There is no need to overthink it, though: P2WSH (Pay-to-Witness-Script-Hash, the SegWit counterpart of P2SH) works essentially the same way as the P2SH described earlier. You preset a script hash in the UTXO locking script, and whoever submits the unlocking script later provides the script content matching that hash on-chain, where it is executed.
But if the script you want to implement is particularly large and contains a lot of code, you cannot submit the complete script to the Bitcoin chain through conventional means (each block has a size limit). What then? This is where Taproot comes in, to reduce the script content that goes on-chain, and BitVM is a complex scheme built on top of Taproot.