diff --git a/EIPS/eip-8120.md b/EIPS/eip-8120.md
new file mode 100644
index 00000000000000..7d52ce408a2fb0
--- /dev/null
+++ b/EIPS/eip-8120.md
@@ -0,0 +1,118 @@
+---
+eip: 8120
+title: MLOAD8 and CALLDATALOAD8 Opcodes
+description: Adds EVM opcodes for efficient single-byte memory and calldata loads.
+author: Helkomine (@Helkomine)
+discussions-to: https://ethereum-magicians.org/t/eip-8120-mload8-and-calldataload8-opcodes/27396
+status: Draft
+type: Standards Track
+category: Core
+created: 2026-01-07
+---
+
+## Abstract
+
+This EIP introduces two new EVM opcodes that load a single byte from memory or calldata in one operation, reducing gas cost and bytecode size compared to existing patterns based on `MLOAD (0x51)` or `CALLDATALOAD (0x35)` followed by bit shifting.
+
+## Motivation
+
+Currently, the only way to read a single byte from calldata or memory is to use `CALLDATALOAD` or `MLOAD` and then shift the loaded 32-byte word.
+For example, reading the byte at offset `x` from calldata requires:
+
+```
+PUSH x
+CALLDATALOAD
+PUSH1 248
+SHR
+```
+
+This pattern increases runtime gas cost and adds three extra bytes to the deployed bytecode for each single-byte access. Contracts that frequently parse byte-oriented calldata or instruction streams incur unnecessary overhead.
+This EIP proposes two new opcodes that load a single byte directly in one operation.
+
+## Specification
+
+### MLOAD8 (TBD)
+
+
+
+- **Stack input**: `offset`
+- **Stack output**: `value`
+
+Reads one byte from memory at position `offset` and pushes it onto the stack as a 32-byte word, with the byte placed in the least significant position.
+Memory is expanded before the byte is read.
+If the accessed byte lies beyond the previously allocated memory, the returned value is 0 due to zero-initialization.
+Memory expansion rules apply in the same way as for `MSTORE8` (extending memory to at least `offset + 1` bytes).
+
+### CALLDATALOAD8 (TBD)
+
+
+
+- **Stack input**: `offset`
+- **Stack output**: `value`
+
+Reads one byte from calldata at position `offset` and pushes it onto the stack as a 32-byte word, with the byte placed in the least significant position.
+If `offset` is greater than or equal to `CALLDATASIZE (0x36)`, the returned value is 0.
+
+### Gas Cost
+
+- Base cost: 3 gas
+- `MLOAD8` additionally incurs memory expansion cost as defined by existing memory access rules.
+The base gas cost matches `MLOAD`, `MSTORE8`, and `CALLDATALOAD`, ensuring consistency with existing EVM pricing.
+
+### Exceptional Conditions
+
+Execution results in an exceptional halt if:
+
+- There is insufficient gas to execute the instruction
+- There are insufficient stack items (stack underflow)
+
+In both cases, execution halts and the current call frame is reverted, consistent with existing EVM behavior.
+
+## Rationale
+
+### Opcode Symmetry
+
+`MLOAD8` serves as a natural counterpart to `MSTORE8 (0x53)`: one stores exactly one byte from the stack to memory, while the other loads exactly one byte from memory to the stack. This symmetry improves conceptual clarity and developer ergonomics.
+
+### Efficiency for Byte-Oriented Contracts
+
+Instruction-based architectures that interpret calldata as a sequence of byte-level commands benefit from reduced gas usage and smaller bytecode size when parsing instruction streams.
+A common pattern for reading a single byte from calldata today consists of the following instruction sequence:
+
+- `CALLDATALOAD` (3 gas)
+- `PUSH1` (3 gas)
+- `SHR` (3 gas)
+
+This results in a total cost of 9 gas per byte read, excluding additional stack manipulation overhead, and increases deployed bytecode size due to the extra instructions.
+Replacing this sequence with a single `CALLDATALOAD8` instruction priced at 3 gas saves 6 gas per byte read and reduces deployed bytecode size by 3 bytes per occurrence. These savings compound in contracts that repeatedly parse byte-oriented calldata or instruction streams.
+
+### Opcode Assignment
+
+While the final opcode values are subject to allocation during review, this proposal suggests placing `MLOAD8` and `CALLDATALOAD8` in the `0x4X` opcode range. The `0x5X` range, which primarily contains stack, memory, storage, and control flow operations, is largely exhausted.
+Tentative values of `0x4e` for `MLOAD8` and `0x4f` for `CALLDATALOAD8` are suggested to group these instructions near existing data access operations while minimizing the risk of opcode collisions. These assignments are intended to facilitate early client prototyping and collision checking and may be adjusted during the standardization process.
+
+## Backwards Compatibility
+
+This EIP introduces new opcodes and does not modify the semantics of existing instructions. No backwards compatibility issues are introduced beyond those inherent to any opcode-adding hard fork.
+
+## Test Cases
+
+Assume:
+
+- `calldata = 0x0123456789abcdef`
+- `memory = 0xfedcba9876543210`
+
+| Bytecode | Description | Result |
+|----------|-------------|--------|
+| `5f ` | `PUSH0; CALLDATALOAD8` | pushes `0x01` |
+| `6002 ` | `PUSH1 0x02; MLOAD8` | pushes `0xba` |
+| `` | missing stack operand | exceptional halt |
+| `` | missing stack operand | exceptional halt |
+
+## Security Considerations
+
+No new security considerations are introduced beyond those already known for memory and calldata access.
+
+## Copyright
+
+Copyright and related rights waived via [CC0](../LICENSE.md).
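
Reviewer's sketch (not part of the diff): the semantics described in the Specification section can be modelled in a few lines of Python, which may help when checking the test vectors. This is a minimal illustration under stated assumptions, not client code. The function names `mload8`, `calldataload8`, and `legacy_calldata_byte` are hypothetical, the memory expansion formula is the standard quadratic EVM rule rather than anything defined by this EIP, and the model does not enforce a gas limit or stack checks.

```python
from typing import Tuple

G_VERYLOW = 3  # base cost the EIP proposes for both opcodes


def memory_expansion_cost(old_size: int, new_size: int) -> int:
    """Gas charged for growing memory from old_size to new_size bytes."""
    def total(size_bytes: int) -> int:
        words = (size_bytes + 31) // 32
        return 3 * words + words * words // 512
    return max(0, total(new_size) - total(old_size))


def mload8(memory: bytearray, offset: int) -> Tuple[int, int]:
    """Model of MLOAD8: returns (value, gas); grows memory in 32-byte words."""
    old_size = len(memory)
    needed = offset + 1
    if needed > old_size:
        new_size = (needed + 31) // 32 * 32
        memory.extend(bytes(new_size - old_size))  # zero-initialized
    gas = G_VERYLOW + memory_expansion_cost(old_size, len(memory))
    return memory[offset], gas


def calldataload8(calldata: bytes, offset: int) -> Tuple[int, int]:
    """Model of CALLDATALOAD8: returns (value, gas); out-of-range reads give 0."""
    value = calldata[offset] if offset < len(calldata) else 0
    return value, G_VERYLOW


def legacy_calldata_byte(calldata: bytes, offset: int) -> int:
    """Model of the existing pattern: CALLDATALOAD, then SHR by 248 bits."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big") >> 248


if __name__ == "__main__":
    # Test vectors from the EIP's Test Cases section.
    calldata = bytes.fromhex("0123456789abcdef")
    # Assumption: memory is padded to one full 32-byte word, as EVM memory is.
    memory = bytearray.fromhex("fedcba9876543210").ljust(32, b"\x00")
    assert calldataload8(calldata, 0)[0] == 0x01
    assert mload8(memory, 2)[0] == 0xBA
```

In this model `calldataload8` and `legacy_calldata_byte` agree for every offset, which matches the EIP's claim that the new opcode replaces the load-and-shift pattern without changing the result.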