A Bitcoin address is an encoded string that is used to receive bitcoin. Unlike the account model of EVM-compatible chains, Bitcoin uses the UTXO model, which shapes both the transaction format and the address format. In this article, I will introduce the Bitcoin transaction format and explain how a Bitcoin address is derived from the data in a transaction output.
UTXO Model
UTXO (Unspent Transaction Output) is the core concept of a Bitcoin transaction. To understand what a UTXO is, we first need to look at the Bitcoin transaction format. A Bitcoin transaction consumes existing UTXOs and creates new ones. It is a data structure with input and output fields: each input references a UTXO that is going to be consumed, and each output creates a new UTXO.
A simplified Bitcoin transaction looks like this:
Transaction {
  inputs: [
    {
      txid: "a1b2c3...def0",  // reference to a previous tx
      vout: 0,                // index of the output in that tx
      scriptSig: "<signature> <pubkey>"
    }
  ],
  outputs: [
    {
      value: 0.5,             // amount in BTC
      scriptPubKey: "OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG"
    },
    {
      value: 0.3,             // change back to sender
      scriptPubKey: "OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG"
    }
  ]
}
Each output contains two things: an amount (value) and a locking script (scriptPubKey). The locking script defines the conditions that must be met to spend this output. Once an output is created and confirmed on the blockchain, it becomes a UTXO — it sits there, waiting to be consumed by a future transaction.
Each input references a previous UTXO (by txid and vout) and provides an unlocking script (scriptSig) that satisfies the conditions of the referenced output’s locking script.
The key insight is: there are no “accounts” or “balances” in Bitcoin. Your balance is simply the sum of all UTXOs whose locking scripts you can unlock with your private key.
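To make the "balance is a sum of UTXOs" idea concrete, here is a minimal Python sketch. The `UTXO` class, the `balance` helper, and the string labels such as `"alice_hash"` are hypothetical names I made up for illustration; a real node stores a full locking script per output, not a label.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UTXO:
    txid: str          # transaction that created this output
    vout: int          # index of the output within that transaction
    value_sats: int    # amount in satoshis (1 BTC = 100_000_000 sats)
    pubkey_hash: str   # simplified stand-in for the locking script

def balance(utxo_set, my_pubkey_hashes):
    """Sum the outputs whose locking condition this wallet can satisfy."""
    return sum(u.value_sats for u in utxo_set if u.pubkey_hash in my_pubkey_hashes)

# Illustrative data only: three unspent outputs, two of them locked to Alice.
utxo_set = [
    UTXO("tx_a", 0, 50_000_000, "alice_hash"),
    UTXO("tx_b", 1, 30_000_000, "alice_hash"),
    UTXO("tx_c", 0, 20_000_000, "bob_hash"),
]

alice_sats = balance(utxo_set, {"alice_hash"})
print(alice_sats / 100_000_000)  # 0.8
```

Nothing in the data says "Alice has 0.8 BTC"; the number only appears when the wallet scans the set and adds up what it can unlock.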
Transaction Process
When Alice wants to send 0.5 BTC to Bob, the following happens:
Alice's wallet scans UTXOs
|
v
Finds a UTXO worth 0.8 BTC locked to Alice's public key hash
|
v
Constructs a transaction:
Input: reference to the 0.8 BTC UTXO + Alice's signature
Output 1: 0.5 BTC locked to Bob's public key hash
Output 2: 0.3 BTC locked to Alice's public key hash (change)
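(fees are ignored in this simplified example; in practice the change is slightly smaller, and the difference between inputs and outputs goes to miners as the fee)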
|
v
Broadcast to the Bitcoin network
|
v
Miners validate the transaction and include it in a block
|
v
The 0.8 BTC UTXO is now spent (removed from UTXO set)
Two new UTXOs are created: 0.5 BTC for Bob, 0.3 BTC for Alice
Notice that a UTXO must be consumed entirely — you cannot spend “part of” a UTXO. This is why change outputs exist. If Alice has a 0.8 BTC UTXO and wants to send 0.5 BTC, she must consume the entire 0.8 BTC UTXO and create a change output of 0.3 BTC back to herself.
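The change calculation can be sketched in a few lines of Python. This is a simplified illustration, not how any particular wallet works: `build_payment` is a hypothetical helper, the largest-first selection strategy is an assumption chosen for brevity, and the fee is passed in as a fixed number instead of being derived from the transaction size.

```python
def build_payment(available_sats, amount, fee):
    """Select whole UTXOs until amount + fee is covered, then compute change.

    available_sats: values (in satoshis) of the UTXOs the sender can spend.
    Returns (inputs, outputs), where outputs is a list of (label, value) pairs.
    """
    selected, total = [], 0
    for value in sorted(available_sats, reverse=True):  # largest first (assumption)
        selected.append(value)
        total += value
        if total >= amount + fee:
            break
    else:
        raise ValueError("insufficient funds")

    outputs = [("recipient", amount)]
    change = total - amount - fee
    if change > 0:
        outputs.append(("change", change))  # sent back to the sender
    return selected, outputs

# Alice holds a single 0.8 BTC UTXO and pays Bob 0.5 BTC with a 10,000 sat fee.
inputs, outputs = build_payment([80_000_000], 50_000_000, 10_000)
print(inputs)   # [80000000]
print(outputs)  # [('recipient', 50000000), ('change', 29990000)]
```

The fee never appears as an output: it is simply whatever is left over when the output values are subtracted from the input values.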
Script Execution
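Bitcoin uses a stack-based scripting language called Bitcoin Script to define and verify spending conditions. Conceptually, validation runs the unlocking script (scriptSig) followed by the locking script (scriptPubKey) on a stack machine. Modern nodes execute the two scripts separately and carry the stack from the first over to the second, but treating them as one concatenated script is a useful mental model.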
For a standard P2PKH transaction, the combined script looks like this:
scriptSig: <sig> <pubKey>
scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
The execution proceeds step by step:
Step 1: Push <sig> onto the stack
Stack: [ sig ]
Step 2: Push <pubKey> onto the stack
Stack: [ sig, pubKey ]
Step 3: OP_DUP — duplicate the top element
Stack: [ sig, pubKey, pubKey ]
Step 4: OP_HASH160 — pop top, push RIPEMD160(SHA256(top))
Stack: [ sig, pubKey, hash(pubKey) ]
Step 5: Push <pubKeyHash> (from the locking script)
Stack: [ sig, pubKey, hash(pubKey), pubKeyHash ]
Step 6: OP_EQUALVERIFY — pop two, check equality, fail if not equal
Stack: [ sig, pubKey ]
(This verifies: "is this public key the one that was locked to?")
Step 7: OP_CHECKSIG — pop sig and pubKey, verify signature
Stack: [ true ]
(This verifies: "does this signature match this public key?")
Result: if the stack top is `true`, the transaction is valid.
The beauty of this design is that the locking script does not contain the public key itself, only a hash of it. The spender must provide the actual public key in the unlocking script, and the script engine verifies that it hashes to the expected value. This provides an extra layer of security for unspent outputs: until the output is spent, the public key is not revealed on-chain, so even if ECDSA were broken in the future, an attacker would first need to find a public key that hashes to the same value (a preimage attack on RIPEMD160(SHA256(x))). Once the owner spends the output and reveals the public key, this protection no longer applies to that key.
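The step-by-step execution above can be mimicked with a tiny stack machine in Python. This is only a sketch of the control flow, not a real script engine: `toy_hash160` and `toy_sign` are stand-ins I invented so the example runs with the standard library alone (truncated SHA-256 in place of RIPEMD160(SHA256(x)), and a hash-based fake "signature" in place of ECDSA), and opcodes are represented as strings instead of bytes.

```python
import hashlib

def toy_hash160(data: bytes) -> bytes:
    # ASSUMPTION: stand-in for RIPEMD160(SHA256(data)); truncated SHA-256 is
    # used because RIPEMD-160 is not available in every hashlib build.
    return hashlib.sha256(data).digest()[:20]

def toy_sign(pubkey: bytes) -> bytes:
    # ASSUMPTION: hypothetical "signature" so the sketch needs no ECDSA library.
    return hashlib.sha256(b"signed-by:" + pubkey).digest()

def run_script(script) -> bool:
    """Execute a list of data pushes (bytes) and opcodes (str) on a stack."""
    stack = []
    for item in script:
        if isinstance(item, bytes):            # data push
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(toy_hash160(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():     # fail immediately if not equal
                return False
        elif item == "OP_CHECKSIG":
            pubkey, sig = stack.pop(), stack.pop()
            stack.append(toy_sign(pubkey) == sig)
        else:
            raise ValueError(f"unsupported opcode: {item}")
    return bool(stack) and stack[-1] is True

pubkey = b"\x02" + b"\x11" * 32                # dummy 33-byte value, not a real key
script_sig = [toy_sign(pubkey), pubkey]
script_pubkey = ["OP_DUP", "OP_HASH160", toy_hash160(pubkey),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]

print(run_script(script_sig + script_pubkey))  # True
```

Swapping in a different public key makes `OP_EQUALVERIFY` fail, and keeping the right key with a wrong signature makes `OP_CHECKSIG` push `False`, which mirrors the two questions asked in steps 6 and 7.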
P2PKH Address
P2PKH (Pay-to-Public-Key-Hash) is the most classic Bitcoin address format, starting with 1. It encodes the hash of a public key into a human-readable string.
Derivation Process
Private Key (256-bit random number)
|
| Elliptic Curve Multiplication (secp256k1)
v
Public Key (65 bytes uncompressed, or 33 bytes compressed)
|
| SHA-256
v
SHA-256 Hash (32 bytes)
|
| RIPEMD-160
v
Public Key Hash (20 bytes)
|
| Add version prefix: 0x00 (mainnet) or 0x6f (testnet)
v
Versioned Payload (21 bytes)
|
| Double SHA-256, take first 4 bytes as checksum
v
Payload + Checksum (25 bytes)
|
| Base58 Encoding
v
P2PKH Address (e.g., "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
The double hashing (SHA-256 then RIPEMD-160) is commonly referred to as HASH160 in Bitcoin. It reduces the public key from 33/65 bytes down to 20 bytes, making addresses shorter while maintaining sufficient collision resistance.
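The last three steps of the derivation (version prefix, checksum, Base58) fit in a short Python function. This sketch starts from an already computed 20-byte public key hash, because the earlier steps need secp256k1 and RIPEMD-160, which are not reliably available in the standard library. `base58check_encode` is my own helper name; the hash used below is the HASH160 behind the example address shown above.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_encode(version: int, payload: bytes) -> str:
    """Prefix a version byte, append a 4-byte checksum, and Base58-encode."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Treat the bytes as one big integer and repeatedly divide by 58.
    n = int.from_bytes(raw, "big")
    encoded = ""
    while n > 0:
        n, remainder = divmod(n, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Leading zero bytes vanish in the integer conversion, so each one is
    # written explicitly as '1' (the Base58 digit for zero).
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

pubkey_hash = bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18")
print(base58check_encode(0x00, pubkey_hash))  # 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
```

The leading `1` of a mainnet P2PKH address is not decoration: it is the `0x00` version byte showing through the encoding.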
Why Base58?
Bitcoin uses Base58Check encoding instead of the more common Base64 or hexadecimal. Compared with Base64, Base58 removes characters that are visually ambiguous or awkward to use in URLs and file names:
| Removed | Reason |
|---|---|
| 0 (zero) | Confused with O |
| O (uppercase o) | Confused with 0 |
| I (uppercase i) | Confused with l |
| l (lowercase L) | Confused with I |
| +, / | Not URL-safe, problematic in file systems |
The 4-byte checksum appended before encoding allows wallets to detect typos. If you accidentally change a character in a Bitcoin address, the checksum verification will almost certainly fail, preventing you from sending coins to a wrong address.
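Here is a small Python sketch of the check a wallet might run before sending. `base58check_decode` is a hypothetical helper name; it reverses the encoding, recomputes the double SHA-256 checksum, and raises an error if the address contains an excluded character or the checksum does not match.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_decode(address: str):
    """Return (version, payload) or raise ValueError for an invalid address."""
    n = 0
    for ch in address:
        # str.index raises ValueError for characters outside the alphabet,
        # such as '0', 'O', 'I' or 'l'.
        n = n * 58 + BASE58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")

    # Each leading '1' stands for one leading zero byte.
    leading_ones = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading_ones + body

    data, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("checksum mismatch")
    return data[0], data[1:]

version, payload = base58check_decode("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
print(version, len(payload))  # 0 20

try:
    base58check_decode("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNb")  # last character mistyped
except ValueError as err:
    print("rejected:", err)
```

With a 4-byte checksum, a random corruption slips through with a probability of roughly 1 in 2^32, which is where the "almost certainly" above comes from.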
Locking Script
When someone sends BTC to a P2PKH address, the transaction output contains:
scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
The <pubKeyHash> is the 20-byte HASH160 of the recipient’s public key — exactly the data encoded inside the Base58 address (minus the version byte and checksum).
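On the wire, this locking script is just 25 bytes. The sketch below assembles it from a 20-byte hash using the standard opcode values; `p2pkh_script_pubkey` is my own helper name, and the hash is an arbitrary placeholder, not a real key hash.

```python
# Opcode byte values used by the P2PKH locking script.
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC

def p2pkh_script_pubkey(pubkey_hash: bytes) -> bytes:
    """Serialize OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(pubkey_hash) != 20:
        raise ValueError("HASH160 output must be exactly 20 bytes")
    # The byte 0x14 (decimal 20) is a direct push of the next 20 bytes.
    return (bytes([OP_DUP, OP_HASH160, len(pubkey_hash)])
            + pubkey_hash
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))

placeholder_hash = bytes(range(20))  # 00 01 02 ... 13, for illustration only
script = p2pkh_script_pubkey(placeholder_hash)
print(script.hex())  # 76a914000102030405060708090a0b0c0d0e0f1011121388ac
print(len(script))   # 25
```

A wallet that receives a P2PKH address decodes it, extracts the 20-byte payload, and wraps it in exactly these five bytes of opcodes.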
P2SH Address
P2SH (Pay-to-Script-Hash) is a more flexible address format, starting with 3. Instead of locking funds to a public key hash, it locks funds to the hash of an arbitrary script. This enables complex spending conditions like multi-signature wallets.
The Problem P2SH Solves
Before P2SH, if Alice wanted to set up a 2-of-3 multisig, the full multisig script would need to appear in the sender’s transaction output:
scriptPubKey: OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG
This has several problems:
- Burden on the sender: the sender needs to know and include the full multisig script
- Larger transaction size: the full script is stored in the output, increasing fees
- No standard address format: there is no way to encode a multisig condition as a simple address string
P2SH (BIP-16) solves all of these by locking the output to a hash of the script rather than to the script itself.
How P2SH Works
Step 1: Alice creates a redeem script (the actual spending condition)
redeemScript: OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG
Step 2: Hash the redeem script
scriptHash = HASH160(redeemScript) // 20 bytes
Step 3: The locking script simply checks the hash
scriptPubKey: OP_HASH160 <scriptHash> OP_EQUAL
Step 4: Encode as a P2SH address (version prefix 0x05)
Address: "3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy"
Now the sender only needs to know the P2SH address — they don’t need to know the underlying multisig script at all. The complexity is hidden from the sender.
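To see how much the sender is spared, the sketch below serializes a 2-of-3 redeem script and the matching P2SH locking script and compares their sizes. The helper names and the dummy public keys are mine, and `placeholder_hash160` is a truncated SHA-256 used only so the example runs with the standard library; real P2SH uses RIPEMD160(SHA256(redeemScript)).

```python
import hashlib

OP_CHECKMULTISIG = 0xAE
OP_HASH160 = 0xA9
OP_EQUAL = 0x87

def multisig_redeem_script(m: int, pubkeys) -> bytes:
    """Serialize OP_m <pubKey1> ... <pubKeyN> OP_n OP_CHECKMULTISIG."""
    script = bytes([0x50 + m])                 # OP_1..OP_16 are 0x51..0x60
    for pk in pubkeys:
        script += bytes([len(pk)]) + pk        # direct push: length byte, then data
    return script + bytes([0x50 + len(pubkeys), OP_CHECKMULTISIG])

def p2sh_script_pubkey(script_hash: bytes) -> bytes:
    """Serialize OP_HASH160 <scriptHash> OP_EQUAL."""
    return bytes([OP_HASH160, len(script_hash)]) + script_hash + bytes([OP_EQUAL])

def placeholder_hash160(data: bytes) -> bytes:
    # ASSUMPTION: stand-in for RIPEMD160(SHA256(data)), not the real HASH160.
    return hashlib.sha256(data).digest()[:20]

# Dummy 33-byte values shaped like compressed public keys (not real keys).
pubkeys = [b"\x02" + bytes([i]) * 32 for i in (1, 2, 3)]

redeem_script = multisig_redeem_script(2, pubkeys)
locking_script = p2sh_script_pubkey(placeholder_hash160(redeem_script))
print(len(redeem_script), len(locking_script))  # 105 23
```

The 105-byte script still has to be published eventually, but only by the spender at redemption time; the sender's output stays at 23 bytes no matter how complex the condition is.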
Spending from P2SH
When Alice (or rather, 2 of the 3 key holders) wants to spend the funds:
scriptSig: OP_0 <sig1> <sig2> <serialized redeemScript>
The Bitcoin script engine validates this in two phases:
Phase 1: Verify the script hash matches
Hash the provided redeemScript
Compare with <scriptHash> in the locking script
If equal, proceed to Phase 2
Phase 2: Execute the actual redeemScript
Deserialize the redeemScript
Execute it with the provided signatures
OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG
Verify that the 2 provided signatures are valid for 2 of the 3 public keys (in the same order as the keys)
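The two phases can be sketched as a single Python function. As before, this is a toy model under loud assumptions: `toy_hash160` replaces RIPEMD160(SHA256(x)), `toy_sign` replaces real ECDSA signatures, and `parse_multisig` only understands redeem scripts made of 33-byte keys. All names are hypothetical.

```python
import hashlib

def toy_hash160(data: bytes) -> bytes:
    # ASSUMPTION: truncated SHA-256 as a stand-in for RIPEMD160(SHA256(data)).
    return hashlib.sha256(data).digest()[:20]

def toy_sign(pubkey: bytes) -> bytes:
    # ASSUMPTION: fake "signature" so no ECDSA library is needed.
    return hashlib.sha256(b"signed-by:" + pubkey).digest()

def parse_multisig(redeem: bytes):
    """Parse OP_m <33-byte key>... OP_n OP_CHECKMULTISIG into (m, pubkeys)."""
    m, n = redeem[0] - 0x50, redeem[-2] - 0x50
    body = redeem[1:-2]
    pubkeys = [body[i + 1:i + 34] for i in range(0, len(body), 34)]
    if redeem[-1] != 0xAE or len(pubkeys) != n:
        raise ValueError("not a recognized multisig script")
    return m, pubkeys

def validate_p2sh_spend(script_sig, script_hash: bytes) -> bool:
    *items, redeem = script_sig                # last push is the redeem script
    # Phase 1: the provided script must hash to the value in the locking script.
    if toy_hash160(redeem) != script_hash:
        return False
    # Phase 2: execute the redeem script against the remaining stack items.
    m, pubkeys = parse_multisig(redeem)
    sigs = items[1:]                           # items[0] is the OP_0 dummy element
    if len(sigs) != m:
        return False
    remaining = iter(pubkeys)                  # signatures must follow key order
    return all(any(toy_sign(pk) == sig for pk in remaining) for sig in sigs)

pubkeys = [b"\x02" + bytes([i]) * 32 for i in (1, 2, 3)]   # dummy keys
redeem = bytes([0x52]) + b"".join(bytes([33]) + pk for pk in pubkeys) + bytes([0x53, 0xAE])
script_hash = toy_hash160(redeem)

script_sig = [b"", toy_sign(pubkeys[0]), toy_sign(pubkeys[2]), redeem]
print(validate_p2sh_spend(script_sig, script_hash))  # True
```

The shared iterator in phase 2 enforces that signatures appear in the same order as their public keys, which is also how `OP_CHECKMULTISIG` walks through them.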
P2PKH vs P2SH Comparison
| Aspect | P2PKH | P2SH |
|---|---|---|
| Prefix | 1 | 3 |
| Version byte | 0x00 (mainnet) | 0x05 (mainnet) |
| Locks to | Hash of a public key | Hash of a script |
| Spending requires | Signature + Public key | Signatures + Serialized redeem script |
| Use case | Simple single-key payments | Multi-sig, time-locks, custom conditions |
| Complexity visible to sender? | N/A (always simple) | No — hidden behind the script hash |
| Output size | 25 bytes | 23 bytes |
| Introduced | Original release (2009) | BIP-16 (2012) |
Beyond P2PKH and P2SH
Bitcoin has since introduced newer address formats:
- P2WPKH (Pay-to-Witness-Public-Key-Hash): SegWit v0 equivalent of P2PKH, starting with bc1q. Moves the signature data to the witness field, reducing transaction weight and fees.
- P2WSH (Pay-to-Witness-Script-Hash): SegWit v0 equivalent of P2SH, also starting with bc1q. Uses SHA-256 (32 bytes) instead of HASH160 (20 bytes) for the script hash, providing stronger collision resistance.
- P2TR (Pay-to-Taproot): SegWit v1 (BIP-341), starting with bc1p. Uses Schnorr signatures and Merkle trees to enable even more complex spending conditions while maintaining the appearance of a simple single-key spend on-chain. This is the current state-of-the-art.
These newer formats use Bech32/Bech32m encoding instead of Base58Check, which provides better error detection and is case-insensitive.
The evolution from P2PKH to P2TR tells the story of Bitcoin’s ongoing effort to balance simplicity, privacy, flexibility, and efficiency — all while maintaining backward compatibility with the UTXO model that Satoshi designed in 2008.