Bitcoin Address

Mar 2, 2024

A Bitcoin address is an encoded string used to receive bitcoin. Unlike the account model of EVM-compatible chains, Bitcoin uses the UTXO model, which shapes both the transaction format and the address format. In this article, I introduce the Bitcoin transaction format and explain how a Bitcoin address is derived from the locking script of a transaction output.

UTXO Model

UTXO (Unspent Transaction Output) is the core concept behind Bitcoin transactions. To understand what a UTXO is, we first need to understand the Bitcoin transaction format. A Bitcoin transaction consumes existing UTXOs and creates new ones. It is a data structure with input and output fields: each input references a UTXO to be consumed, and each output creates a new UTXO.

A simplified Bitcoin transaction looks like this:

Transaction {
    inputs: [
        {
            txid:      "a1b2c3...def0",   // reference to a previous tx
            vout:      0,                  // index of the output in that tx
            scriptSig: "<signature> <pubkey>"
        }
    ],
    outputs: [
        {
            value:        0.5,             // amount in BTC
            scriptPubKey: "OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG"
        },
        {
            value:        0.3,             // change back to sender
            scriptPubKey: "OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG"
        }
    ]
}

Each output contains two things: an amount (value) and a locking script (scriptPubKey). The locking script defines the conditions that must be met to spend this output. Once an output is created and confirmed on the blockchain, it becomes a UTXO — it sits there, waiting to be consumed by a future transaction.

Each input references a previous UTXO (by txid and vout) and provides an unlocking script (scriptSig) that satisfies the conditions of the referenced output’s locking script.

The key insight is: there are no “accounts” or “balances” in Bitcoin. Your balance is simply the sum of all UTXOs whose locking scripts you can unlock with your private key.
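To make the "balance is a sum of UTXOs" idea concrete, here is a minimal Python sketch. The `UTXO` class, the `balance` helper, and the placeholder txids and hash labels are all hypothetical names invented for illustration; a real wallet would match outputs by their actual locking scripts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UTXO:
    txid: str         # transaction that created the output (placeholder values below)
    vout: int         # index of the output within that transaction
    value_sat: int    # amount in satoshis (1 BTC = 100_000_000 sat)
    pubkey_hash: str  # simplified stand-in for the locking script

def balance(utxo_set, my_pubkey_hashes):
    """Sum the value of every unspent output locked to one of our key hashes."""
    return sum(u.value_sat for u in utxo_set if u.pubkey_hash in my_pubkey_hashes)

# Hypothetical UTXO set: two outputs for Alice, one for Bob.
utxo_set = [
    UTXO("aa" * 32, 0, 50_000_000, "alice-hash"),
    UTXO("bb" * 32, 1, 30_000_000, "alice-hash"),
    UTXO("cc" * 32, 0, 20_000_000, "bob-hash"),
]
alice_balance = balance(utxo_set, {"alice-hash"})
```

Nothing in the set stores "Alice has 0.8 BTC"; the figure only exists after scanning the outputs she can unlock.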

Transaction Process

When Alice wants to send 0.5 BTC to Bob, the following happens:

Alice's wallet scans UTXOs
    |
    v
Finds a UTXO worth 0.8 BTC locked to Alice's public key hash
    |
    v
Constructs a transaction:
    Input:  reference to the 0.8 BTC UTXO + Alice's signature
    Output 1: 0.5 BTC locked to Bob's public key hash
    Output 2: 0.3 BTC locked to Alice's public key hash (change)
    (inputs minus outputs goes to miners as the fee; the fee is omitted here to keep the numbers round)
    |
    v
Broadcast to the Bitcoin network
    |
    v
Miners validate the transaction and include it in a block
    |
    v
The 0.8 BTC UTXO is now spent (removed from UTXO set)
Two new UTXOs are created: 0.5 BTC for Bob, 0.3 BTC for Alice

Notice that a UTXO must be consumed entirely: you cannot spend “part of” a UTXO. This is why change outputs exist. If Alice has a 0.8 BTC UTXO and wants to send 0.5 BTC, she must consume the entire 0.8 BTC UTXO and create a change output of 0.3 BTC back to herself (slightly less in practice, since whatever the outputs leave unclaimed becomes the miner fee).
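The arithmetic behind change outputs can be sketched as follows. This is an illustrative helper, not how any particular wallet works: `build_payment` is a hypothetical name, amounts are in satoshis, the fee is passed in as a fixed number, and largest-first selection is just one naive strategy.

```python
def build_payment(utxo_values_sat, amount_sat, fee_sat):
    """Pick UTXOs (largest first) until amount + fee is covered.

    Returns (selected_values, change). Every selected UTXO is consumed whole;
    whatever is not paid out or returned as change is left to the miner.
    """
    selected, total = [], 0
    for value in sorted(utxo_values_sat, reverse=True):
        selected.append(value)
        total += value
        if total >= amount_sat + fee_sat:
            return selected, total - amount_sat - fee_sat
    raise ValueError("insufficient funds")

# Alice spends her single 0.8 BTC UTXO to pay Bob 0.5 BTC with an assumed 10,000 sat fee.
selected, change = build_payment([80_000_000], 50_000_000, 10_000)
implied_fee = sum(selected) - 50_000_000 - change
```

The fee never appears as an explicit field: it is simply the difference between the total input value and the total output value.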

Script Execution

Bitcoin uses a stack-based scripting language called Bitcoin Script to define and verify spending conditions. Conceptually, validation runs the unlocking script (scriptSig) first and then the locking script (scriptPubKey) on the same stack. Modern nodes execute the two scripts separately and carry the stack across, but the result is the same as running them back to back.

For a standard P2PKH transaction, the combined script looks like this:

scriptSig:    <sig> <pubKey>
scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

The execution proceeds step by step:

Step 1: Push <sig> onto the stack
Stack: [ sig ]

Step 2: Push <pubKey> onto the stack
Stack: [ sig, pubKey ]

Step 3: OP_DUP — duplicate the top element
Stack: [ sig, pubKey, pubKey ]

Step 4: OP_HASH160 — pop top, push RIPEMD160(SHA256(top))
Stack: [ sig, pubKey, hash(pubKey) ]

Step 5: Push <pubKeyHash> (from the locking script)
Stack: [ sig, pubKey, hash(pubKey), pubKeyHash ]

Step 6: OP_EQUALVERIFY — pop two, check equality, fail if not equal
Stack: [ sig, pubKey ]
(This verifies: "is this public key the one that was locked to?")

Step 7: OP_CHECKSIG — pop sig and pubKey, verify signature
Stack: [ true ]
(This verifies: "does this signature match this public key?")

Result: if the stack top is `true`, the transaction is valid.
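The seven steps above can be mirrored in a few lines of Python. This is a sketch under stated assumptions: `run_p2pkh` is a hypothetical function, signature verification is passed in as a `check_sig` callable because real ECDSA checking needs a dedicated library, and `hash160` relies on `hashlib` exposing RIPEMD-160, which depends on the local OpenSSL build.

```python
import hashlib

def hash160(data: bytes) -> bytes:
    """RIPEMD160(SHA256(data)); needs an OpenSSL build that still provides RIPEMD-160."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()

def run_p2pkh(sig, pubkey, pubkey_hash, check_sig, hash_fn=hash160) -> bool:
    """Walk through scriptSig + scriptPubKey for P2PKH on an explicit stack."""
    stack = []
    stack.append(sig)                    # Step 1: push <sig>
    stack.append(pubkey)                 # Step 2: push <pubKey>
    stack.append(stack[-1])              # Step 3: OP_DUP
    stack.append(hash_fn(stack.pop()))   # Step 4: OP_HASH160
    stack.append(pubkey_hash)            # Step 5: push <pubKeyHash>
    if stack.pop() != stack.pop():       # Step 6: OP_EQUALVERIFY
        return False
    key, signature = stack.pop(), stack.pop()
    stack.append(check_sig(signature, key))  # Step 7: OP_CHECKSIG
    return bool(stack[-1])
```

Injecting `check_sig` and `hash_fn` keeps the stack mechanics visible without pulling in a cryptography dependency; swapping in real implementations does not change the control flow.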

A notable property of this design is that the locking script does not contain the public key itself, only a hash of it. The spender must provide the actual public key in the unlocking script, and the script engine verifies that it hashes to the expected value. This adds a layer of protection: as long as the output is unspent and the address has not been reused, the public key is not on-chain, so even if ECDSA were broken in the future, an attacker would first need to recover a matching public key from its hash (a preimage attack on RIPEMD160(SHA256(x))). Once the output is spent, the public key is revealed and this protection no longer applies.

P2PKH Address

P2PKH (Pay-to-Public-Key-Hash) is the classic Bitcoin address format. On mainnet its addresses start with 1. It encodes the hash of a public key into a human-readable string.

Derivation Process

Private Key (256-bit random number)
    |
    | Elliptic Curve Multiplication (secp256k1)
    v
Public Key (65 bytes uncompressed, or 33 bytes compressed)
    |
    | SHA-256
    v
SHA-256 Hash (32 bytes)
    |
    | RIPEMD-160
    v
Public Key Hash (20 bytes)
    |
    | Add version prefix: 0x00 (mainnet) or 0x6f (testnet)
    v
Versioned Payload (21 bytes)
    |
    | Double SHA-256, take first 4 bytes as checksum
    v
Payload + Checksum (25 bytes)
    |
    | Base58 Encoding
    v
P2PKH Address (e.g., "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")

The double hashing (SHA-256 then RIPEMD-160) is commonly referred to as HASH160 in Bitcoin. It reduces the public key from 33/65 bytes down to 20 bytes, making addresses shorter while maintaining sufficient collision resistance.
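The derivation pipeline can be sketched in Python using only the standard library. The function names are hypothetical, the elliptic-curve step from private key to public key is left out, and `hash160` assumes the local OpenSSL build still provides RIPEMD-160.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, used for the checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def hash160(data: bytes) -> bytes:
    """RIPEMD160(SHA256(data)); availability depends on the OpenSSL build."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()

def base58check_encode(version: bytes, payload: bytes) -> str:
    raw = version + payload
    raw += sha256d(raw)[:4]                  # append the 4-byte checksum
    n = int.from_bytes(raw, "big")
    encoded = ""
    while n > 0:                             # repeated division by 58
        n, rem = divmod(n, 58)
        encoded = ALPHABET[rem] + encoded
    # Each leading zero byte is written as the character '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

def p2pkh_address(pubkey: bytes, version: bytes = b"\x00") -> str:
    """Public key (33 or 65 bytes) -> HASH160 -> Base58Check address."""
    return base58check_encode(version, hash160(pubkey))
```

The leading-zero rule explains why mainnet P2PKH addresses start with 1: the 0x00 version byte is encoded as that character.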

Why Base58?

Bitcoin uses Base58Check encoding instead of the more common Base64 or hexadecimal. Compared with Base64, Base58 drops characters that are visually ambiguous or awkward to handle:

Removed	Reason
`0` (zero)	Confused with `O`
`O` (uppercase o)	Confused with `0`
`I` (uppercase i)	Confused with `l`
`l` (lowercase L)	Confused with `I`
`+`, `/`	Not URL-safe, problematic in file systems

The 4-byte checksum appended before encoding allows wallets to detect typos. If you accidentally change a character in a Bitcoin address, the checksum verification will almost certainly fail, preventing you from sending coins to a wrong address.
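A wallet-side check along these lines might look like the sketch below. `base58check_decode` is a hypothetical helper name; it reverses the encoding and raises `ValueError` when a character is outside the alphabet or the checksum does not match.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def base58check_decode(address: str):
    """Return (version_byte, payload) or raise ValueError for a malformed address."""
    n = 0
    for ch in address:
        n = n * 58 + ALPHABET.index(ch)      # str.index raises ValueError on '0', 'O', 'I', 'l', ...
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    leading_ones = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading_ones + body      # each leading '1' is a zero byte
    if len(raw) < 5:
        raise ValueError("address too short")
    data, checksum = raw[:-4], raw[-4:]
    if sha256d(data)[:4] != checksum:
        raise ValueError("checksum mismatch")
    return data[0], data[1:]

version, payload = base58check_decode("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
```

Changing any single character of the address makes the recomputed checksum disagree with the embedded one, so the function rejects the input instead of returning a different key hash.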

Locking Script

When someone sends BTC to a P2PKH address, the transaction output contains:

scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

The <pubKeyHash> is the 20-byte HASH160 of the recipient’s public key — exactly the data encoded inside the Base58 address (minus the version byte and checksum).
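At the byte level, the locking script is just opcodes wrapped around the 20-byte hash. A minimal sketch, with `p2pkh_script_pubkey` as a hypothetical helper name and the key hash of the example address above used as input:

```python
# Opcode byte values from Bitcoin Script.
OP_DUP, OP_HASH160, OP_EQUALVERIFY, OP_CHECKSIG = 0x76, 0xA9, 0x88, 0xAC

def p2pkh_script_pubkey(pubkey_hash: bytes) -> bytes:
    """Serialize OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(pubkey_hash) != 20:
        raise ValueError("pubKeyHash must be 20 bytes")
    push_20 = len(pubkey_hash)  # a single byte 0x14 means "push the next 20 bytes"
    return (bytes([OP_DUP, OP_HASH160, push_20])
            + pubkey_hash
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))

script = p2pkh_script_pubkey(bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18"))
```

The result is 25 bytes long: three bytes of prefix, the 20-byte hash, and two trailing opcodes.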

P2SH Address

P2SH (Pay-to-Script-Hash) is a more flexible address format, starting with 3. Instead of locking funds to a public key hash, it locks funds to the hash of an arbitrary script. This enables complex spending conditions like multi-signature wallets.

The Problem P2SH Solves

Before P2SH, if Alice wanted to set up a 2-of-3 multisig, the full multisig script would need to appear in the sender’s transaction output:

scriptPubKey: OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG

This has several problems:

Burden on the sender: the sender needs to know and include the full multisig script
Larger transaction size: the full script is stored in the output, increasing fees
No standard address format: there is no way to encode a multisig condition as a simple address string

P2SH (BIP-16) solves all of these by placing only a hash of the script in the output and deferring the script itself to spend time.

How P2SH Works

Step 1: Alice creates a redeem script (the actual spending condition)

    redeemScript: OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG

Step 2: Hash the redeem script

    scriptHash = HASH160(redeemScript)    // 20 bytes

Step 3: The locking script simply checks the hash

    scriptPubKey: OP_HASH160 <scriptHash> OP_EQUAL

Step 4: Encode as a P2SH address (version prefix 0x05)

    Address: "3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy"

Now the sender only needs to know the P2SH address — they don’t need to know the underlying multisig script at all. The complexity is hidden from the sender.
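Steps 1 and 3 can be sketched at the byte level as follows. The helper names and the placeholder public keys are hypothetical, and the script hash passed to `p2sh_script_pubkey` is assumed to come from HASH160 of the redeem script, which is not computed here.

```python
OP_HASH160, OP_EQUAL, OP_CHECKMULTISIG = 0xA9, 0x87, 0xAE

def small_int_opcode(n: int) -> int:
    """OP_1 .. OP_16 are encoded as 0x51 .. 0x60."""
    if not 1 <= n <= 16:
        raise ValueError("only 1..16 have a dedicated opcode")
    return 0x50 + n

def multisig_redeem_script(m: int, pubkeys) -> bytes:
    """Serialize OP_m <pubKey1> ... <pubKeyN> OP_n OP_CHECKMULTISIG."""
    script = bytes([small_int_opcode(m)])
    for pk in pubkeys:
        script += bytes([len(pk)]) + pk      # direct push, valid for data up to 75 bytes
    return script + bytes([small_int_opcode(len(pubkeys)), OP_CHECKMULTISIG])

def p2sh_script_pubkey(script_hash: bytes) -> bytes:
    """Serialize OP_HASH160 <scriptHash> OP_EQUAL."""
    if len(script_hash) != 20:
        raise ValueError("scriptHash must be 20 bytes")
    return bytes([OP_HASH160, 20]) + script_hash + bytes([OP_EQUAL])

# Placeholder 33-byte "compressed public keys" (not real curve points).
keys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
redeem_script = multisig_redeem_script(2, keys)
```

With three compressed keys the redeem script is 105 bytes, while the P2SH output that commits to it stays at 23 bytes regardless of how large the script is.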

Spending from P2SH

When Alice (or rather, 2 of the 3 key holders) wants to spend the funds:

scriptSig: OP_0 <sig1> <sig2> <serialized redeemScript>

The Bitcoin script engine validates this in two phases:

Phase 1: Verify the script hash matches

    Hash the provided redeemScript
    Compare with <scriptHash> in the locking script
    If equal, proceed to Phase 2

Phase 2: Execute the actual redeemScript

    Deserialize the redeemScript
    Execute it with the provided signatures
    OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG
    Verify that the 2 provided signatures are valid for 2 of the 3 public keys
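The two phases can be sketched as below. This is a simplified model, not the node implementation: `validate_p2sh` and `check_multisig` are hypothetical names, the public keys are passed in already parsed from the redeem script, and both the hash function and the signature check are injected as callables. The matching loop reflects the rule that signatures must appear in the same order as the public keys they correspond to.

```python
def check_multisig(sigs, pubkeys, verify) -> bool:
    """Each signature must match a key that comes after the previously matched key."""
    key_idx = 0
    for sig in sigs:
        # Skip keys that this signature does not match.
        while key_idx < len(pubkeys) and not verify(sig, pubkeys[key_idx]):
            key_idx += 1
        if key_idx == len(pubkeys):
            return False          # ran out of keys before matching this signature
        key_idx += 1              # a key can be used by at most one signature
    return True

def validate_p2sh(script_hash, redeem_script, m, pubkeys, sigs, hash_fn, verify) -> bool:
    # Phase 1: the provided redeem script must hash to the committed value.
    if hash_fn(redeem_script) != script_hash:
        return False
    # Phase 2: run the redeem script's own condition (m-of-n multisig here).
    return len(sigs) == m and check_multisig(sigs, pubkeys, verify)
```

Because the keys are scanned in one pass, supplying valid signatures in the wrong order fails even though each signature is individually correct.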

P2PKH vs P2SH Comparison

Aspect	P2PKH	P2SH
Prefix	`1`	`3`
Version byte	`0x00` (mainnet)	`0x05` (mainnet)
Locks to	Hash of a public key	Hash of a script
Spending requires	Signature + Public key	Signatures + Serialized redeem script
Use case	Simple single-key payments	Multi-sig, time-locks, custom conditions
Complexity visible to sender?	N/A (always simple)	No — hidden behind the script hash
scriptPubKey size	25 bytes	23 bytes
Introduced	First release (2009)	BIP-16 (2012)

Beyond P2PKH and P2SH

Bitcoin has since introduced newer address formats:

P2WPKH (Pay-to-Witness-Public-Key-Hash): SegWit v0 equivalent of P2PKH, starting with bc1q. Moves the signature data to the witness field, reducing transaction weight and fees.
P2WSH (Pay-to-Witness-Script-Hash): SegWit v0 equivalent of P2SH, also starting with bc1q. Uses SHA-256 (32 bytes) instead of HASH160 (20 bytes) for the script hash, providing stronger collision resistance.
P2TR (Pay-to-Taproot): SegWit v1 (BIP-341), starting with bc1p. Uses Schnorr signatures and a Merkle tree of alternative scripts to enable more complex spending conditions, while a key-path spend still looks like a simple single-key spend on-chain. It is the most recent address type, activated on mainnet in November 2021.

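These newer formats use Bech32 (SegWit v0) or Bech32m (SegWit v1 and later) encoding instead of Base58Check. Both provide stronger error detection and use a single-case alphabet, so an address can be written entirely in lowercase or entirely in uppercase.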
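For comparison with Base58Check, here is a compact sketch of Bech32 encoding for a P2WPKH address, following the structure of the BIP-173 reference code. The function names are my own, only SegWit v0 with Bech32 is covered, and Bech32m (which uses a different checksum constant) is left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]

def polymod(values) -> int:
    """BCH checksum over 5-bit values, as specified in BIP-173."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk

def hrp_expand(hrp: str):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def bech32_checksum(hrp: str, data):
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    return [(pm >> 5 * (5 - i)) & 31 for i in range(6)]

def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, padding the tail with zeros."""
    acc = bits = 0
    out = []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out

def p2wpkh_address(pubkey_hash: bytes, hrp: str = "bc") -> str:
    data = [0] + to_5bit(pubkey_hash)        # leading 0 is the witness version
    return hrp + "1" + "".join(CHARSET[d] for d in data + bech32_checksum(hrp, data))
```

The witness version 0 maps to the character q, which is why both P2WPKH and P2WSH addresses begin with bc1q.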

The evolution from P2PKH to P2TR tells the story of Bitcoin’s ongoing effort to balance simplicity, privacy, flexibility, and efficiency — all while maintaining backward compatibility with the UTXO model that Satoshi designed in 2008.