About the EVM Part I

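I. How the EVM Works: An Overview

The EVM is a stack-based virtual machine that is quasi-Turing-complete: programs can express arbitrary computation, but every execution is bounded by gas. Each transaction or contract call creates an execution context in the EVM in which instructions (opcodes) are processed.

Basic flow:

Receive the transaction: a user submits a transaction or contract call, and the EVM receives an execution request carrying input data and a gas limit.
Load the bytecode: contract source is compiled to bytecode ahead of time, and the EVM interprets that bytecode one instruction at a time.
Execute the bytecode: instructions operate on the stack, memory, storage, and the program counter (PC) to carry out the contract logic.
Consume gas: every instruction costs gas. If gas runs out, execution halts and all state changes made by the call are reverted; the gas already spent is not returned.
Update state: when execution finishes, the result is either written to the state trie or rolled back.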

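II. The Three Key Storage Areas in the EVM

1. Stack

Bounded size: the maximum depth is 1024 items, and each item is a 256-bit word.
Access pattern: last in, first out (LIFO). It plays the role that registers play in a conventional assembly language.
All arithmetic and logic instructions take their operands from the stack and leave their results on it, for example ADD, MUL, and CALLDATALOAD.
Only items near the top are reachable: PUSH and POP act on the top, DUP can copy at most the 16th item from the top, and SWAP can reach at most the 17th.

Example:

PUSH1 0x02
PUSH1 0x03
ADD

After execution, the stack holds 0x05.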
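
To make the example concrete, here is a minimal sketch of a stack machine that understands only PUSH1 and ADD. The function name run and the use of uint64 words are simplifying assumptions for illustration; the real EVM uses 256-bit words and many more opcodes.

```go
package main

import "fmt"

// Real EVM opcode values for the two instructions this sketch supports.
const (
	opADD   = 0x01
	opPUSH1 = 0x60
)

// run interprets PUSH1 and ADD only and returns the final stack, bottom
// first. Words are uint64 for brevity; the real EVM uses 256-bit words.
func run(code []byte) ([]uint64, error) {
	var stack []uint64
	for pc := 0; pc < len(code); pc++ {
		switch code[pc] {
		case opPUSH1:
			var imm byte // a missing immediate reads as zero
			if pc+1 < len(code) {
				imm = code[pc+1]
			}
			stack = append(stack, uint64(imm))
			pc++ // skip the immediate byte
		case opADD:
			if len(stack) < 2 {
				return nil, fmt.Errorf("stack underflow at pc=%d", pc)
			}
			n := len(stack)
			sum := stack[n-1] + stack[n-2]
			stack = append(stack[:n-2], sum)
		default:
			return nil, fmt.Errorf("unsupported opcode 0x%02x at pc=%d", code[pc], pc)
		}
	}
	return stack, nil
}

func main() {
	// PUSH1 0x02, PUSH1 0x03, ADD
	stack, err := run([]byte{0x60, 0x02, 0x60, 0x03, 0x01})
	fmt.Println(stack, err) // [5] <nil>
}
```

The bytes 0x60 0x02 0x60 0x03 0x01 are the encoded form of the three instructions in the example.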

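2. Memory

Temporary space: it lives only for the duration of the current contract call and is released when the call ends.
It is a linear byte array that grows on demand and is zero-initialized.
It is readable and writable, which suits intermediate data such as function arguments, return values, strings, and arrays.
Instructions: MLOAD, MSTORE, MSTORE8.

Cost: gas is charged when memory grows, so touching a higher address than any used so far costs more; the expansion cost grows quadratically with size.

3. Storage

Persistent space: it is part of the blockchain state, and the final values of a contract's state variables live here.
It behaves like a huge key-value map from 256-bit keys to 256-bit values.
It is very expensive (reads cost far more than memory reads, and writes cost more still), so use it sparingly.
Instructions: SLOAD, SSTORE.

Example:
In Solidity, a contract's first state variable uint a; is assigned to storage slot 0, so a = 5 amounts to pushing 5 and then 0 onto the stack and executing SSTORE.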

III. EVM Execution Model Diagram

[Figure: EVM execution model diagram]

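IV. Execution Context

Program Counter (PC): the position of the bytecode instruction currently being executed.
Gas: bounds the resources an execution may use, which prevents infinite loops.
Call Data: the input data supplied by the caller; read-only.
Code: the bytecode of the contract being executed.
Logs: records emitted by the LOG instructions; this is how Solidity events are implemented.
Return data: the bytes handed back by the RETURN instruction.

V. Mapping to Solidity

Solidity construct	EVM location
Local variables	Stack / Memory
State variables	Storage
Function parameters	Stack / Memory / Calldata
Arrays	Storage / Memory / Calldata (depending on the declared data location)
Mappings	Storage only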

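VI. Opcodes

1. Categories

Opcodes can be grouped by function:

Stack instructions: operate directly on the EVM stack, for example pushing an item (PUSH1) or popping one (POP).
Arithmetic instructions: perform basic math, such as addition (ADD), subtraction (SUB), multiplication (MUL), and division (DIV).
Comparison instructions: compare the two items at the top of the stack, for example greater than (GT) and less than (LT).
Bitwise instructions: manipulate data at the bit level, for example bitwise AND (AND) and bitwise OR (OR).
Memory instructions: operate on EVM memory, for example loading from memory onto the stack (MLOAD) and storing from the stack into memory (MSTORE).
Storage instructions: operate on the account's storage, for example loading a slot onto the stack (SLOAD) and saving a stack value to a slot (SSTORE). These cost much more gas than memory instructions.
Control flow instructions: change the order of execution, such as the jump JUMP and the jump-target marker JUMPDEST.
Context instructions: expose transaction and block context, for example msg.sender (CALLER) and the remaining gas (GAS).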

2. Stack Instructions

The stack is defined in go-ethereum/core/vm/stack.go:

// Stack is backed by a slice of uint256 values.
type Stack struct {
    data []uint256.Int
}

push itself does not enforce the 1024-item limit; as the comment in the source notes, the limit is checked before the instruction executes.

func (st *Stack) push(d *uint256.Int) {
    // NOTE push limit (1024) is checked in baseCheck
    st.data = append(st.data, *d)
}

func (st *Stack) pop() (ret uint256.Int) {
    ret = st.data[len(st.data)-1]
    st.data = st.data[:len(st.data)-1]
    return
}
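
Because push and pop do no bounds checking, the depth has to be validated before an instruction runs. The following is a minimal sketch of such a check, not geth's code: checkStack, its parameters, and the error values are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // maximum EVM stack depth

var (
	errUnderflow = errors.New("stack underflow")
	errOverflow  = errors.New("stack limit reached")
)

// checkStack validates the stack depth before an instruction runs.
// pops and pushes are the number of items the instruction removes and adds.
func checkStack(depth, pops, pushes int) error {
	if depth < pops {
		return errUnderflow
	}
	if depth-pops+pushes > stackLimit {
		return errOverflow
	}
	return nil
}

func main() {
	fmt.Println(checkStack(1024, 0, 1)) // PUSH on a full stack: stack limit reached
	fmt.Println(checkStack(1024, 2, 1)) // ADD on a full stack: <nil>
	fmt.Println(checkStack(1, 2, 1))    // ADD with one item: stack underflow
}
```

Doing the check once, up front, keeps the per-instruction push and pop paths free of branches.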

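PUSH family: `PUSH1` ~ `PUSH32`

Purpose:

Push the 1 to 32 bytes of constant data that immediately follow the opcode onto the top of the stack.

✅ Example:

PUSH1 0x60   // push the one-byte value 0x60
PUSH2 0x1234 // push the two-byte value 0x1234

Where it is implemented:

makePush is the function go-ethereum uses to handle the PUSH family generically; it generates the concrete execution function for a given PUSHn.

size: how far the program counter advances past the immediate data. In geth's jump table it is passed the same value as pushByteSize (for example makePush(3, 3) for PUSH3); the opcode byte itself is skipped by the interpreter's main loop.
pushByteSize: the number of bytes to read from the bytecode (the n in PUSHn)

It returns an executionFunc, which the EVM calls when it executes the instruction.

🔍 Function body in detail: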

func makePush(size uint64, pushByteSize int) executionFunc {
    return func(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
        var (
            codeLen = len(scope.Contract.Code)
            start   = min(codeLen, int(*pc+1))
            end     = min(codeLen, start+pushByteSize)
        )
        a := new(uint256.Int).SetBytes(scope.Contract.Code[start:end])

        // Missing bytes: pushByteSize - len(pushData)
        if missing := pushByteSize - (end - start); missing > 0 {
            a.Lsh(a, uint(8*missing))
        }
        scope.Stack.push(a)
        *pc += size
        return nil, nil
    }
}

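Step 1️⃣: Locate the PUSH data

codeLen := len(scope.Contract.Code)
start := min(codeLen, int(*pc+1))          // index of the first immediate byte
end := min(codeLen, start+pushByteSize)    // one past the last immediate byte (exclusive)

*pc currently points at the PUSHn opcode (1 byte),
so the data starts at *pc+1.
The immediate may run past the end of the code, so min clamps both indices to avoid an out-of-range slice.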

🧠 Example:

Suppose the bytecode is:

0x60 0x0A   // PUSH1 0x0A

Then:

*pc = 0, opcode = 0x60 (PUSH1)
start = 1
end = 2
scope.Contract.Code[1:2] = [0x0A]

Step 2️⃣: Convert to uint256.Int

a := new(uint256.Int).SetBytes(scope.Contract.Code[start:end])

The bytes in start:end are interpreted as a big-endian 256-bit integer; the high-order bits are zero-filled.

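Step 3️⃣: Pad missing bytes (important)

if missing := pushByteSize - (end - start); missing > 0 {
    a.Lsh(a, uint(8*missing))  // shift left, filling the low bits with zeros
}

Why is this needed?

If the bytecode is malformed and the immediate is cut short, for example 0x61 0xFF (a PUSH2 that should be followed by two bytes, but only one is present), then:

pushByteSize = 2
only 1 byte was actually read
so the value is shifted left by 8 bits (1 byte = 8 bits), which turns 0xFF into 0xFF00

This matches the EVM specification, which treats bytes past the end of the code as zeros.

Step 4️⃣: Push onto the stack

scope.Stack.push(a)

Step 5️⃣: Advance the program counter

*pc += size

size is the number of immediate bytes to skip. In geth's jump table it equals pushByteSize (for example makePush(3, 3) for PUSH3); the interpreter's main loop then advances pc by one more to step past the opcode itself.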
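
The clamping and zero-padding steps can be reproduced without the uint256 library. This is a minimal sketch under stated assumptions: pushData is a hypothetical helper, it returns the immediate as a byte slice rather than a 256-bit integer, and it returns the index of the next instruction directly instead of relying on an outer loop to step past the opcode.

```go
package main

import "fmt"

// pushData returns the n-byte immediate of a PUSHn located at code[pc],
// right-padded with zero bytes if the code ends early, together with the
// index of the next instruction.
func pushData(code []byte, pc, n int) ([]byte, int) {
	start := pc + 1
	if start > len(code) {
		start = len(code)
	}
	end := start + n
	if end > len(code) {
		end = len(code)
	}
	data := make([]byte, n)     // zero-filled, so missing bytes stay 0
	copy(data, code[start:end]) // copies only the bytes that exist
	return data, pc + 1 + n
}

func main() {
	// 0x61 is PUSH2; only one of its two immediate bytes is present.
	data, next := pushData([]byte{0x61, 0xFF}, 0, 2)
	fmt.Printf("%x %d\n", data, next) // ff00 3
}
```

Copying into a zero-filled buffer has the same effect as the left shift in makePush: the bytes that exist land in the high-order positions and the missing ones read as zero.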

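DUP family: `DUP1` ~ `DUP16`

Purpose:

Copy the N-th item from the top of the stack (N = 1 is the top itself) and push the copy onto the top.

For example:

stack: [0x01 0x02 0x03]  // the top of the stack is 0x03
after DUP2
stack: [0x01 0x02 0x03 0x02] // the second item from the top, 0x02, is copied to the top

Implementation: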

func makeDup(size int64) executionFunc {
    return func(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
        scope.Stack.dup(int(size))
        return nil, nil
    }
}

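SWAP family: `SWAP1` ~ `SWAP16`

Purpose:

Swap the top item with the item N positions below it, that is, the (N+1)-th item from the top.

For example:

stack: [0x01 0x02 0x03]  // the top of the stack is 0x03
after SWAP2
stack: [0x03 0x02 0x01]

Implementation: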

// core\vm\instructions.go
func opSwap1(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    scope.Stack.swap1()
    return nil, nil
}

// core\vm\stack.go
func (st *Stack) swap1() {
    st.data[st.len()-2], st.data[st.len()-1] = st.data[st.len()-1], st.data[st.len()-2]
}
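
The indexing rules for DUPn and SWAPn are easy to get off by one, so here is a minimal sketch on a plain slice. The helpers dup and swap are hypothetical and use uint64 items; they are not geth's Stack methods.

```go
package main

import "fmt"

// dup copies the n-th item from the top (n = 1 is the top itself) and
// pushes the copy, returning the grown stack.
func dup(stack []uint64, n int) []uint64 {
	return append(stack, stack[len(stack)-n])
}

// swap exchanges the top item with the (n+1)-th item from the top.
func swap(stack []uint64, n int) {
	top := len(stack) - 1
	stack[top], stack[top-n] = stack[top-n], stack[top]
}

func main() {
	s := []uint64{0x01, 0x02, 0x03} // the last element is the top of the stack
	s = dup(s, 2)
	fmt.Println(s) // [1 2 3 2]

	t := []uint64{0x01, 0x02, 0x03}
	swap(t, 2)
	fmt.Println(t) // [3 2 1]
}
```

Both calls reproduce the DUP2 and SWAP2 examples given above.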

Gas cost

Instruction	Gas cost
POP	2
PUSHn	3
DUPn	3
SWAPn	3

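3. Arithmetic Instructions

ADD, MUL, SUB, and DIV are the four most basic arithmetic instructions in the EVM. They all work on the stack: take the operands, compute, and leave the result on top. Results wrap modulo 2^256; the EVM has no overflow-checked arithmetic. Beyond these four, the arithmetic group also includes modular operations (MOD, ADDMOD, MULMOD), signed operations (SDIV, SMOD, SIGNEXTEND), and exponentiation (EXP).

The geth implementation includes a small optimization that saves one push and one pop:

x := scope.Stack.pop()
- pops the top of the stack (operand 1)
y := scope.Stack.peek()
- returns a pointer to the new top (operand 2) without popping it
y.Add(&x, y)
- computes x + y and writes the result in place into y's stack slot

✅ In other words, it performs y = x + y, and the result overwrites the slot that held y.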

func opAdd(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Add(&x, y)
    return nil, nil
}

func opSub(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Sub(&x, y)
    return nil, nil
}

func opMul(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Mul(&x, y)
    return nil, nil
}

func opDiv(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Div(&x, y) // x / y; the result is 0 when the divisor is 0
    return nil, nil
}

⛽ Gas cost comparison (static execution cost):

Instruction	Opcode	Gas
ADD	0x01	3
MUL	0x02	5
SUB	0x03	3
DIV	0x04	5

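4. Comparison Instructions

The comparison instructions are also simple: each takes two operands x and y from the stack, compares them, and leaves the result (1 or 0) on top. The same optimization applies: x is popped, y is only peeked, and the result overwrites y in place.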

func opLt(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    if x.Lt(y) {
        y.SetOne()
    } else {
        y.Clear()
    }
    return nil, nil
}

func opGt(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    if x.Gt(y) {
        y.SetOne()
    } else {
        y.Clear()
    }
    return nil, nil
}

func opSlt(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    if x.Slt(y) {
        y.SetOne()
    } else {
        y.Clear()
    }
    return nil, nil
}

func opSgt(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    if x.Sgt(y) {
        y.SetOne()
    } else {
        y.Clear()
    }
    return nil, nil
}

func opEq(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    if x.Eq(y) {
        y.SetOne()
    } else {
        y.Clear()
    }
    return nil, nil
}

func opIszero(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x := scope.Stack.peek()
    if x.IsZero() {
        x.SetOne()
    } else {
        x.Clear()
    }
    return nil, nil
}

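🔍 Comparison instructions at a glance

Instruction	Opcode	Description	Operands (a is the top of the stack)	Result	Gas
`LT`	0x10	Unsigned less-than	a, b	a < b	3
`GT`	0x11	Unsigned greater-than	a, b	a > b	3
`SLT`	0x12	Signed less-than	a, b	a < b	3
`SGT`	0x13	Signed greater-than	a, b	a > b	3
`EQ`	0x14	Equality	a, b	a == b	3
`ISZERO`	0x15	Test for zero	a	a == 0	3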

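5. Bitwise Instructions

🔧 The 8 bitwise instructions and their meanings

Instruction	Opcode	Meaning
`AND`	0x16	Bitwise AND
`OR`	0x17	Bitwise OR
`XOR`	0x18	Bitwise XOR
`NOT`	0x19	Bitwise NOT
`BYTE`	0x1A	Extract one byte (index 0 is the most significant byte)
`SHL`	0x1B	Shift left
`SHR`	0x1C	Logical shift right (zero fill)
`SAR`	0x1D	Arithmetic shift right (sign extension)

📁 Source implementation (go-ethereum)

In go-ethereum these functions are all defined in the file below. The excerpt also includes opAddmod and opMulmod, which implement the modular-arithmetic instructions ADDMOD and MULMOD from the same file, and it omits opNot.

core/vm/instructions.go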

func opAnd(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.And(&x, y)
    return nil, nil
}

func opOr(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Or(&x, y)
    return nil, nil
}

func opXor(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y := scope.Stack.pop(), scope.Stack.peek()
    y.Xor(&x, y)
    return nil, nil
}

func opByte(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    th, val := scope.Stack.pop(), scope.Stack.peek()
    val.Byte(&th)
    return nil, nil
}

func opAddmod(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y, z := scope.Stack.pop(), scope.Stack.pop(), scope.Stack.peek()
    z.AddMod(&x, &y, z)
    return nil, nil
}

func opMulmod(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    x, y, z := scope.Stack.pop(), scope.Stack.pop(), scope.Stack.peek()
    z.MulMod(&x, &y, z)
    return nil, nil
}

// opSHL implements Shift Left
// The SHL instruction (shift left) pops 2 values from the stack, first arg1 and then arg2,
// and pushes on the stack arg2 shifted to the left by arg1 number of bits.
func opSHL(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    // Note, second operand is left in the stack; accumulate result into it, and no need to push it afterwards
    shift, value := scope.Stack.pop(), scope.Stack.peek()
    if shift.LtUint64(256) {
        value.Lsh(value, uint(shift.Uint64()))
    } else {
        value.Clear()
    }
    return nil, nil
}

// opSHR implements Logical Shift Right
// The SHR instruction (logical shift right) pops 2 values from the stack, first arg1 and then arg2,
// and pushes on the stack arg2 shifted to the right by arg1 number of bits with zero fill.
func opSHR(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    // Note, second operand is left in the stack; accumulate result into it, and no need to push it afterwards
    shift, value := scope.Stack.pop(), scope.Stack.peek()
    if shift.LtUint64(256) {
        value.Rsh(value, uint(shift.Uint64()))
    } else {
        value.Clear()
    }
    return nil, nil
}

// opSAR implements Arithmetic Shift Right
// The SAR instruction (arithmetic shift right) pops 2 values from the stack, first arg1 and then arg2,
// and pushes on the stack arg2 shifted to the right by arg1 number of bits with sign extension.
func opSAR(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    shift, value := scope.Stack.pop(), scope.Stack.peek()
    if shift.GtUint64(256) {
        if value.Sign() >= 0 {
            value.Clear()
        } else {
            // Max negative shift: all bits set
            value.SetAllOne()
        }
        return nil, nil
    }
    n := uint(shift.Uint64())
    value.SRsh(value, n)
    return nil, nil
}

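⛽ Gas cost

Most bitwise operations have a fixed, low cost of 3 gas; they are very lightweight computations. In geth the static cost of each opcode is set in core/vm/jump_table.go.

Instruction	Gas cost	Notes
`AND`	3	Yellow Paper tier "very low"; `GasFastestStep` in geth
`OR`	3
`XOR`	3
`NOT`	3
`BYTE`	3	Extracts a single byte
`SHL`	3	Added by EIP-145 (Constantinople), together with `SHR` and `SAR`
`SHR`	3
`SAR`	3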

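6. Memory Instructions

The EVM has four main memory instructions:

Instruction	Meaning	Opcode
`MSTORE`	Write 32 bytes	0x52
`MSTORE8`	Write 1 byte	0x53
`MLOAD`	Read 32 bytes	0x51
`MSIZE`	Query the current memory size	0x59

🔧 Source implementation (from Geth)

➡️ core/vm/memory.go

How the Memory object is implemented: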

type Memory struct {
    store       []byte
    lastGasCost uint64
}

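EVM memory is a temporary, linearly addressed space used for data operations while a contract executes (MSTORE, MLOAD, and so on).
It is a []byte that is grown on demand:

store []byte: holds all memory contents.
lastGasCost: the total memory gas already charged for the current size, so that a later expansion is billed only for the difference.

♻️ Pooling Memory objects (reuse)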

var memoryPool = sync.Pool{
    New: func() any {
        return &Memory{}
    },
}

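Why use `sync.Pool`?

Memory objects are created and destroyed constantly (every contract execution needs one), and allocating and freeing them each time adds GC pressure.

sync.Pool lets them be reused instead of reallocated, which improves performance.

func NewMemory() *Memory {
    return memoryPool.Get().(*Memory)
}

func (m *Memory) Free() {
    const maxBufferSize = 16 << 10 // 16 KiB
    if cap(m.store) <= maxBufferSize {
        m.store = m.store[:0]
        m.lastGasCost = 0
        memoryPool.Put(m)
    }
}

Only buffers that are not too large are returned to the pool, so the pool does not pin large allocations.
The object is reset before it goes back into the pool.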
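
The same pooling pattern can be tried in isolation. This is a minimal sketch, assuming a hypothetical buffer type and a release function that reports whether the object was pooled; geth's Free returns nothing.

```go
package main

import (
	"fmt"
	"sync"
)

const maxPooledCap = 16 << 10 // buffers larger than 16 KiB are not pooled

type buffer struct{ data []byte }

var pool = sync.Pool{New: func() any { return &buffer{} }}

// getBuffer takes a buffer from the pool, allocating one if it is empty.
func getBuffer() *buffer { return pool.Get().(*buffer) }

// release resets the buffer and returns it to the pool, unless it has
// grown too large. It reports whether the buffer was pooled.
func release(b *buffer) bool {
	if cap(b.data) > maxPooledCap {
		return false // leave oversized buffers to the garbage collector
	}
	b.data = b.data[:0] // keep the capacity, drop the contents
	pool.Put(b)
	return true
}

func main() {
	b := getBuffer()
	b.data = append(b.data, 1, 2, 3)
	fmt.Println(release(b), len(b.data)) // true 0

	large := &buffer{data: make([]byte, 0, 32<<10)}
	fmt.Println(release(large)) // false
}
```

Truncating with data[:0] keeps the allocated capacity, which is the whole point of reuse, and the size cap keeps a single unusually large execution from bloating the pool.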

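🧱 Writing memory: `Set`, `Set32`

`Set(offset, size, value)`

func (m *Memory) Set(offset, size uint64, value []byte)

Writes size bytes of value starting at offset.
Resize must be called first; otherwise Set panics.
Used by instructions that copy byte ranges into memory, such as CALLDATACOPY and CODECOPY.

`Set32(offset, val)`

func (m *Memory) Set32(offset uint64, val *uint256.Int)

Writes a 32-byte integer (a uint256) at offset.
It calls val.PutUint256(m.store[offset:]) to write the 32 bytes.
Used by the MSTORE instruction.

📐 Reading memory: `GetCopy`, `GetPtr`

`GetCopy(offset, size) []byte`

Returns a copy of memory[offset, offset+size).

cpy := make([]byte, size)
copy(cpy, m.store[offset:offset+size])

Safe: callers cannot modify the underlying memory through the result.

`GetPtr(offset, size) []byte`

Returns a slice that aliases the underlying store (no copy, so it is faster, but writes through it change memory).

return m.store[offset : offset+size]

🪜 Resize: growing memory

func (m *Memory) Resize(size uint64)

If the store is shorter than size, it is extended.
This is done with m.store = append(m.store, make([]byte, size-uint64(m.Len()))...).
- Go's append can grow a slice on its own, but by an amount the runtime chooses. The EVM needs the caller to state the exact target size.
- Every memory expansion costs additional gas, which stops an attacker from exhausting node resources through memory.
- Bytes that have never been written read as 0.

⚠️ The read and write methods do not grow memory themselves. Resize must be called before any access to a range beyond the current length; otherwise the access panics.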

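🧪 Copy

func (m *Memory) Copy(dst, src, len uint64)

Copies len bytes within memory from src to dst.
The ranges may overlap (memmove semantics).
⚠️ It does not grow memory; Resize must already have been called.

📊 Helper methods

func (m *Memory) Len() int
func (m *Memory) Data() []byte

Len(): the current memory length (not its capacity).
Data(): returns the whole store.

⚙️ Usage example (MSTORE)

What the EVM does for MSTORE:

// suppose offset = 0x40, val = 0xdeadbeef...
memory.Resize(offset + 32)            // grow first
memory.Set32(offset, val)             // then write the 32 bytes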
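
The resize-then-write sequence can be exercised with a stripped-down memory type. This is a sketch under stated assumptions: memory, resize, set32, and roundUp32 are hypothetical stand-ins, and a [32]byte array replaces uint256.Int so that the example needs only the standard library.

```go
package main

import "fmt"

type memory struct{ store []byte }

// resize grows the store to size bytes (it never shrinks); new bytes are zero.
func (m *memory) resize(size uint64) {
	if uint64(len(m.store)) < size {
		m.store = append(m.store, make([]byte, size-uint64(len(m.store)))...)
	}
}

// set32 writes a 32-byte big-endian word at offset. The caller must have
// resized the memory first.
func (m *memory) set32(offset uint64, word [32]byte) {
	if offset+32 > uint64(len(m.store)) {
		panic("memory not resized")
	}
	copy(m.store[offset:offset+32], word[:])
}

// roundUp32 rounds a byte count up to a multiple of 32, because the EVM
// sizes memory in 32-byte words.
func roundUp32(n uint64) uint64 { return (n + 31) / 32 * 32 }

func main() {
	var m memory
	offset := uint64(0x40)

	var word [32]byte
	word[31] = 0x2a // big-endian encoding of 42

	m.resize(roundUp32(offset + 32)) // grow first
	m.set32(offset, word)            // then write
	fmt.Println(len(m.store), m.store[offset+31]) // 96 42
}
```

Writing at offset 0x40 extends memory to 96 bytes, and the value's least significant byte ends up at the highest address of the word.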

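⛽ Gas accounting: triggered by memory growth

Memory is not free. Its size is counted in 32-byte words, and growing it costs gas. The total cost of a memory of w words is 3*w + w^2/512 (integer division), and an expansion is charged the difference between the new and old totals:

memoryGasCost = (3*newWords + newWords^2/512) - (3*oldWords + oldWords^2/512)

This is why the lastGasCost field exists: it remembers the total already charged for the current size, so an expansion only has to compute the new total and subtract.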
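
The memory cost rule is short enough to compute by hand. The sketch below uses the standard form of the formula, 3 gas per word plus words^2/512; memoryCost and expansionCost are hypothetical names, and overflow checks for very large sizes are deliberately left out.

```go
package main

import "fmt"

// memoryCost returns the total gas charged for a memory of sizeBytes bytes:
// 3 gas per 32-byte word plus a quadratic term words^2/512.
func memoryCost(sizeBytes uint64) uint64 {
	words := (sizeBytes + 31) / 32
	return 3*words + words*words/512
}

// expansionCost is the extra gas needed to grow memory from oldSize to
// newSize bytes. Accesses inside the current size cost nothing extra.
func expansionCost(oldSize, newSize uint64) uint64 {
	if newSize <= oldSize {
		return 0
	}
	return memoryCost(newSize) - memoryCost(oldSize)
}

func main() {
	fmt.Println(expansionCost(0, 32))       // 3
	fmt.Println(expansionCost(0, 1024))     // 98
	fmt.Println(expansionCost(1024, 32768)) // 5022
}
```

The first kilobyte costs 98 gas while growing from 1 KiB to 32 KiB costs 5022, which shows the quadratic term taking over as memory gets larger.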

➡️ core/vm/instructions.go

✅ 1. `MSTORE`: write 32 bytes (256 bits)

func opMstore(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    offset, val := scope.Stack.pop(), scope.Stack.pop()
    scope.Memory.Set32(offset.Uint64(), &val)
    return nil, nil
}

Pops offset and val from the stack and writes val to memory[offset : offset+32].
Internally it calls Memory.Set32, the dedicated 32-byte write helper.

✅ 2. `MSTORE8`: write 1 byte (the lowest byte)

func opMstore8(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    off, val := scope.Stack.pop(), scope.Stack.pop()
    scope.Memory.store[off.Uint64()] = byte(val.Uint64())
    return nil, nil
}

Writes the least significant byte of val to memory at off, which suits byte-level operations.
byte(val.Uint64()) keeps only the low 8 bits, that is, byte 31 of the big-endian 32-byte word.

✅ 3. `MLOAD`: read 32 bytes from memory

func opMload(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    v := scope.Stack.peek()
    offset := v.Uint64()
    v.SetBytes(scope.Memory.GetPtr(offset, 32))
    return nil, nil
}

Reads 32 bytes from memory[offset : offset+32] and overwrites the top stack slot, which held the offset, with the loaded value.

✅ 4. `MSIZE`: current memory size (in bytes)

func opMsize(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    scope.Stack.push(new(uint256.Int).SetUint64(uint64(scope.Memory.Len())))
    return nil, nil
}

Pushes the current memory length, in bytes, onto the stack.

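⛽ Gas cost

🧮 Gas cost rules:

Instruction	Base gas	Notes
`MLOAD`	3	Yellow Paper tier "very low" (excludes memory expansion)
`MSTORE`	3	Same as above
`MSTORE8`	3	Same as above
`MSIZE`	2	Yellow Paper tier "base"
⏫ Memory expansion	varies	Additional gas that depends on how far memory grows; see the `memoryGasCost` logic

Illustrative snippet (simplified pseudocode; geth's actual jump table uses different field and constant names):

VeryLowGas = 3
BaseGas = 2
...
{Op: MLOAD, Gas: VeryLowGas, Exec: opMload},
{Op: MSTORE, Gas: VeryLowGas, Exec: opMstore},
{Op: MSTORE8, Gas: VeryLowGas, Exec: opMstore8},
{Op: MSIZE, Gas: BaseGas, Exec: opMsize},

In addition, EVM memory grows dynamically: any access beyond the current size extends it, and the extension is charged extra gas (computed by the memoryGasCost function in core/vm/gas_table.go).

📘 Summary

Instruction	Stack effect	Memory effect	Gas
`MSTORE`	pop(offset), pop(value)	memory[offset : offset+32] ← value	3 + expansion
`MSTORE8`	pop(offset), pop(value)	memory[offset] ← value[31]	3 + expansion
`MLOAD`	pop(offset) → push(value)	value ← memory[offset : offset+32]	3 + expansion
`MSIZE`	push(memory size)	none	2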

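7. Storage Instructions

Storage is a contract's persistent data area: its contents remain on chain after the transaction ends. Two instructions operate on it:

SLOAD: load the value of a storage slot
SSTORE: write a value to a storage slot

🔧 How they are implemented in the source

📥 `SLOAD` implementation

Reads the value at a given slot from the contract's state:

func opSload(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    loc := scope.Stack.peek() // slot index, left on the stack
    hash := common.Hash(loc.Bytes32()) // convert to a 32-byte hash used as the slot key
    val := interpreter.evm.StateDB.GetState(scope.Contract.Address(), hash)
    loc.SetBytes(val.Bytes()) // overwrite the slot index on the stack with the loaded value
    return nil, nil
}

📌 Key points:

The top of the stack is the slot index to read (slot 0x00, 0x01, ...)
The contract address plus the slot form the unique key used to query StateDB
The slot index on the stack is overwritten in place with the result (avoiding a pop followed by a push)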

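📤 `SSTORE` implementation

Writes a value into the contract's state:

func opSstore(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
    addr := scope.Contract.Address()
    loc := scope.Stack.pop()        // slot
    val := scope.Stack.pop()        // value
    key := common.Hash(loc.Bytes32())

    // record the new storage value (the write to the trie is deferred)
    interpreter.evm.StateDB.SetState(addr, key, val.Bytes32())
    return nil, nil
}

📌 Key points:

Two operands, slot and value: the slot is on top of the stack and the value is directly below it, as the order of the two pops shows
StateDB.SetState records the change to the slot (which makes changes traceable and revertible)
The write reaches the state trie later, when the accumulated changes are committed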

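💰 Gas cost logic (important)

✅ 1. `SLOAD` gas cost

Since the Berlin upgrade (EIP-2929) the cost depends on whether the slot has already been accessed in the current transaction: 100 gas for a warm slot, 2100 gas for a cold one.
It does not depend on the old or new value; the slot is simply read from state.

WarmStorageReadCostEIP2929 = 100
ColdSloadCostEIP2929 = 2100

✅ 2. `SSTORE` gas cost (complex)

The charge depends on three values: the original value, the current value, and the new value.

Rules in detail (after the London upgrade, EIP-3529). Costs are for a warm slot; the first access to a cold slot adds 2100 gas, which is where the commonly quoted 5000 for an update comes from (2900 + 2100). The first three rows assume the slot has not yet been modified in this transaction:

Case	Gas cost	Notes
Non-zero to 0	`2900`, with a `4800` gas refund	"Clears" storage, which the protocol rewards
0 to non-zero	`20000`	Grows the state; the most expensive case
Non-zero to a different non-zero value	`2900`	An ordinary update
Writing the value the slot already holds, or writing again to a slot already modified in this transaction	`100`	No first-time change to pay for

The original value is the slot's value at the start of the transaction; it is used to decide whether a write is a first-time change.

💡 Tracking state changes: `StateDB`

StateDB records, for each slot:

originalValue: the value at the start of the transaction
dirtyValue: the current value, modified but not yet committed

This lets SSTORE decide (here old is the current value and new is the value being written):

if new == old {
    // no change
} else if original == old {
    // first real change in this transaction: take the expensive path
} else {
    // slot already modified in this transaction: not charged the full price again
}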
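
The three-way branch above can be turned into a small gas calculator. This is a simplified sketch of the EIP-2200 rules with the EIP-2929 cold-access surcharge, not geth's implementation: sstoreGas and its constants are hypothetical names, slot values are uint64 instead of 256-bit words, and refund accounting and the gas-stipend check are omitted.

```go
package main

import "fmt"

const (
	coldSloadCost  = 2100  // surcharge for the first access to a slot
	warmReadCost   = 100   // no-op or repeated write
	sstoreSetGas   = 20000 // 0 -> non-zero on an untouched slot
	sstoreResetGas = 2900  // first change to an existing non-zero value
)

// sstoreGas returns the gas charged for one SSTORE. original is the slot
// value at the start of the transaction, current is its value right now,
// and next is the value being written.
func sstoreGas(original, current, next uint64, cold bool) uint64 {
	var gas uint64
	if cold {
		gas += coldSloadCost
	}
	switch {
	case current == next: // writing the value already stored
		gas += warmReadCost
	case original == current && original == 0: // fresh slot
		gas += sstoreSetGas
	case original == current: // first modification in this transaction
		gas += sstoreResetGas
	default: // slot is already dirty in this transaction
		gas += warmReadCost
	}
	return gas
}

func main() {
	fmt.Println(sstoreGas(0, 0, 5, true))  // 22100
	fmt.Println(sstoreGas(1, 1, 2, false)) // 2900
	fmt.Println(sstoreGas(1, 2, 3, false)) // 100
}
```

Only the first change to a slot within a transaction pays the high price; later writes to the same slot cost 100 gas each.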

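StateDB is the core component that manages accounts, contract code, and storage for the EVM; all reads and writes of global state go through it. The rest of this section looks at its interface and implementation.

In Ethereum, every account has:

Balance
Nonce (a transaction counter)
Contract code (for contract accounts)
Storage (key-value pairs: key => value)

StateDB provides a uniform interface for reading and writing this data, and the accumulated changes are either committed or discarded as a unit when the transaction ends.

Key interfaces and structs of StateDB

`StateDB` interface definition

// Simplified; method names and types differ between geth versions.
type StateDB interface {
    GetBalance(addr common.Address) *big.Int
    GetNonce(addr common.Address) uint64
    GetCode(addr common.Address) []byte
    GetCodeSize(addr common.Address) int
    GetCodeHash(addr common.Address) common.Hash
    GetState(addr common.Address, key common.Hash) common.Hash
    SetState(addr common.Address, key, value common.Hash)

    AddBalance(addr common.Address, amount *big.Int)
    SubBalance(addr common.Address, amount *big.Int)
    SetNonce(addr common.Address, nonce uint64)
    SetCode(addr common.Address, code []byte)

    Suicide(addr common.Address) bool
    HasSuicided(addr common.Address) bool

    Snapshot() int
    RevertToSnapshot(int)
    Commit(deleteEmptyObjects bool) error
}

It covers reading and writing basic account data, managing contract state, and taking snapshots, reverting, and committing.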
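
Snapshot and RevertToSnapshot are what allow a failed call to undo its writes. One common way to build this is a journal of previous values; the sketch below illustrates that idea only. The state type, its string keys, and the method names are hypothetical and are not geth's data structures.

```go
package main

import "fmt"

// change records what a slot held before it was overwritten.
type change struct {
	key     string
	prev    uint64
	existed bool
}

type state struct {
	storage map[string]uint64
	journal []change
}

// set writes a value and journals the previous one so it can be undone.
func (s *state) set(key string, v uint64) {
	prev, ok := s.storage[key]
	s.journal = append(s.journal, change{key, prev, ok})
	s.storage[key] = v
}

// snapshot returns an identifier for the current point in the journal.
func (s *state) snapshot() int { return len(s.journal) }

// revertTo undoes every change made after the given snapshot, newest first.
func (s *state) revertTo(id int) {
	for i := len(s.journal) - 1; i >= id; i-- {
		c := s.journal[i]
		if c.existed {
			s.storage[c.key] = c.prev
		} else {
			delete(s.storage, c.key)
		}
	}
	s.journal = s.journal[:id]
}

func main() {
	s := &state{storage: map[string]uint64{}}
	s.set("a", 1)
	id := s.snapshot()
	s.set("a", 2)
	s.set("b", 3)
	s.revertTo(id)
	fmt.Println(s.storage) // map[a:1]
}
```

Undoing in reverse order matters when the same key is written more than once after the snapshot: the oldest journal entry for that key is applied last, restoring the value from before the snapshot.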

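Implementation struct `stateObject`

Each account is represented in StateDB by a stateObject:

type stateObject struct {
    address  common.Address
    balance  *big.Int
    nonce    uint64
    code     []byte
    storage  Trie
    dirtyStorage map[common.Hash]common.Hash
    ...
}

It holds one account's state, including its storage trie (a Merkle Patricia Trie), and tracks dirty entries so that writes can be deferred.

The underlying structure of storage: Merkle Patricia Trie (MPT)

Each contract's storage is an MPT. Its properties:

Verifiable (Merkle structure)
Path compression (Patricia)
Keys are the 32-byte Keccak-256 hashes of slot indices; values are the 32-byte slot words, stored RLP-encoded

When storage is updated, the EVM records the dirty entries; at commit time they are written into the MPT together, producing a new storage root for the account and, in turn, a new stateRoot.

🚀 Summary

Instruction	Type	Function	Gas cost	Persistent
`SLOAD`	Read	Read a slot value	100 gas (warm) / 2100 gas (cold)	No (read-only)
`SSTORE`	Write	Set a slot value	100 to 20,000 gas depending on the original and current values, plus 2100 if the slot is cold	Yes (written to chain state)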
