Contract bytecode, function logic and state variable storage

Hi there, I would like to get a better understanding of the separation between contract bytecode, function logic data and state variables at the storage level.

Although they seem to be separated diagrammatically, is there any possibility that contract bytecode or function logic data could collide with state variables in any way? If function logic data can never collide with state variables, there will be a lot more flexibility in modifying functions on the logic contract in a proxy-logic upgradeable pattern, at least more than what I previously thought.

Any in-depth explanation will be highly appreciated.

The contract bytecode is set at contract creation time. The account creating the contract sends a transaction with the recipient (to) field left empty, rather than addressed to 0x00, and with the data field set to the bytecode to be executed (this is the creation code, which includes the constructor). That bytecode must return a chunk of bytes, which becomes the contract's runtime code.
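To make the creation code / runtime code distinction concrete, here is a minimal sketch. The contract names Stored and CodeInspector are made up for illustration, and the pragma is just a version I am confident supports every member used here.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical example contract with a constructor and one state variable.
contract Stored {
    uint256 public value;

    constructor(uint256 v) {
        value = v;
    }
}

// Hypothetical helper that compares the different kinds of code.
contract CodeInspector {
    function compare()
        external
        returns (uint256 creationLen, uint256 runtimeLen, uint256 deployedLen)
    {
        // Deploying runs the creation code; what it returns is stored as the contract's code.
        Stored s = new Stored(7);

        // Creation (init) code: constructor logic plus the runtime code it will return.
        creationLen = type(Stored).creationCode.length;
        // Runtime code as known to the compiler.
        runtimeLen = type(Stored).runtimeCode.length;
        // Code actually stored for the freshly deployed account.
        deployedLen = address(s).code.length;
    }
}
```

For a simple contract like this one (no immutables, no linked libraries), I would expect creationLen to be larger than runtimeLen, and runtimeLen to match deployedLen.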

State variables are stored in a key-value store (think of it as a mapping(uint256 => uint256) or mapping(bytes32 => bytes32)), and the layout is described in the Solidity documentation under "Layout of State Variables in Storage".
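As a rough illustration of that key-value view, the sketch below reads raw slots with sload and recomputes where a mapping entry lives. SlotDemo and its variable names are hypothetical; the slot numbers in the comments assume the declaration order shown.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract used only to illustrate slot numbering.
contract SlotDemo {
    uint256 public a = 1;                        // slot 0
    uint256 public b = 2;                        // slot 1
    mapping(address => uint256) public balances; // slot 2 (the slot itself stays empty)

    constructor() {
        balances[msg.sender] = 99;
    }

    // Read a raw 32-byte storage slot by its key.
    function readSlot(uint256 slot) public view returns (uint256 v) {
        assembly {
            v := sload(slot)
        }
    }

    // A mapping value lives at keccak256(abi.encode(key, mappingSlot)).
    function mappingSlot(address key) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(key, uint256(2))));
    }

    // Returns true when the raw slot reads agree with the high-level getters.
    function check() external view returns (bool) {
        return
            readSlot(0) == a &&
            readSlot(1) == b &&
            readSlot(mappingSlot(msg.sender)) == balances[msg.sender];
    }
}
```

Nothing in this key space is derived from the contract's code, which is stored separately from the slots.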

The variables declared inside a function are stored either in memory or on the stack.

The EVM is mostly stack based, and the stack holds 256-bit words.

Solidity keeps a scratch area in memory from 0x00 to 0x3f and the free memory pointer at 0x40. This is described in the Solidity documentation under "Layout in Memory".
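If you want to see the free memory pointer move, something like the following sketch should do it. MemoryDemo is a made-up name, and the concrete values mentioned afterwards are what I would typically expect rather than something guaranteed for every compiler setting.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract for observing the free memory pointer.
contract MemoryDemo {
    function freePointer()
        external
        pure
        returns (uint256 beforeAlloc, uint256 afterAlloc, uint256 len)
    {
        // The word at 0x40 holds the address of the next free byte in memory.
        assembly {
            beforeAlloc := mload(0x40)
        }

        // Allocating in memory advances the pointer: 32 bytes for the length plus 64 bytes of data.
        bytes memory buf = new bytes(64);
        len = buf.length;

        assembly {
            afterAlloc := mload(0x40)
        }
    }
}
```

The pointer usually starts at 0x80 and ends up 96 bytes further along after the allocation. All of this is memory, which is wiped after each call, so storage is not involved.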

Thanks for the explanations. Where does the contract bytecode get stored? What about the function logic, such as for loops and if-else statements? Are the contract bytecode and function logic stored in the same way as the state variables?

The Ethereum client keeps track of account-related data: the nonce, the balance, the contract code (if any) and the account's storage. EOAs normally have empty code and storage fields.
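Some of those account fields can be inspected from Solidity itself. The sketch below uses a hypothetical AccountProbe contract; note that the nonce and the storage of another account are not directly readable this way.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical helper that reads the account fields exposed to contracts.
contract AccountProbe {
    function describe(address account)
        external
        view
        returns (uint256 balance, uint256 codeSize, bytes32 codeHash)
    {
        balance = account.balance;      // balance field of the account
        codeSize = account.code.length; // 0 when the account has no code
        codeHash = account.codehash;    // hash of the account's code
    }
}
```

Calling describe on a contract address should give a non-zero codeSize, while an address with no deployed code should give 0.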

Any variables scoped inside a function never touch the contract's storage area. Those values either get pushed to the stack and used, or copied to memory and then used.

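// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract A {
    // v and r only ever live on the stack
    function abc(uint v) external pure returns (uint) {
        uint r = v + 42;
        return r;
    }

    // the encoded arguments are written to memory before hashing
    function xyz(uint v) external pure returns (bytes32) {
        return keccak256(abi.encodePacked(v, uint(42)));
    }
}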

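The function abc pushes the values v and 42 onto the stack, executes the ADD opcode and returns the result (with Solidity 0.8 or later, the compiler also emits an overflow check).

The function xyz needs to copy the values to memory first, because the KECCAK256 opcode (formerly named SHA3) reads the data to hash from memory.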
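One way to convince yourself that locals stay out of storage is to lean on the pure modifier, which the compiler only accepts when a function neither reads nor writes state. A small sketch, with LocalsDemo and its members being hypothetical names:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract contrasting local variables with a state variable.
contract LocalsDemo {
    uint256 public counter; // state variable, lives in storage slot 0

    // Compiles as pure: the compiler accepts this only because no storage is read or written.
    function localsOnly(uint256 v) external pure returns (uint256) {
        uint256 r = v + 42;                      // kept on the stack
        uint256[] memory tmp = new uint256[](2); // allocated in memory
        tmp[0] = r;
        tmp[1] = v;
        return tmp[0] + tmp[1];
    }

    // Cannot be pure or view: assigning to counter is a storage write.
    function touchStorage(uint256 v) external {
        counter = v + 42;
    }
}
```

Calling localsOnly any number of times leaves counter untouched; only touchStorage changes slot 0.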

What I was trying to find out is whether the function logic is part of the contract bytecode. For example, is the contract code for r = v + 42 the same as for r = v - 42? I just ran some tests and found that they are different, so the contract bytecode contains all the function logic as well. That gives my question about whether function logic data can collide with storage a clear answer.

Yes, bytecode is the result of compiling your contract. It's the set of instructions that the EVM is going to execute. Bytecode is the Ethereum equivalent of the .exe file that you get after compiling a C program, for example.
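The test described above can also be written on-chain. This is only a sketch with made-up contract names (AddVersion, SubVersion, BytecodeDiff); it compares the compiled runtime code of two contracts that differ in a single operator.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Two hypothetical contracts that differ only in the arithmetic operator.
contract AddVersion {
    function f(uint256 v) external pure returns (uint256) {
        return v + 42;
    }
}

contract SubVersion {
    function f(uint256 v) external pure returns (uint256) {
        return v - 42;
    }
}

contract BytecodeDiff {
    // Returns true when the two runtime bytecodes are not identical.
    function differs() external pure returns (bool) {
        return
            keccak256(type(AddVersion).runtimeCode) !=
            keccak256(type(SubVersion).runtimeCode);
    }
}
```

Keep in mind that the compiler also appends a metadata hash to the bytecode by default, so the two would differ even apart from the ADD versus SUB instruction. Comparing the disassembled opcodes is a cleaner check of the logic itself.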

The term collision doesn't really apply here. Code and storage are kept in separate places, so there is nothing to "collide"; collision to me means "hashing collision". This is why you can use a delegatecall to build a proxy. All storage is kept in the proxy and the code in the implementation. As long as the storage layout stays the same, it works.
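Here is a stripped-down sketch of that idea. LogicV1 and MiniProxy are hypothetical names, and this is not how production proxies are written (those typically use a fallback function and keep the implementation address in a dedicated slot); it only shows where the data ends up.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical implementation contract: supplies code only.
contract LogicV1 {
    uint256 public value; // slot 0

    function set(uint256 v) external {
        value = v + 42;
    }
}

// Hypothetical minimal proxy: holds the storage and borrows the code.
contract MiniProxy {
    uint256 public value; // slot 0, same position as in LogicV1
    address public logic; // slot 1, never written by LogicV1

    constructor(address logic_) {
        logic = logic_;
    }

    function set(uint256 v) external {
        // delegatecall runs LogicV1's code against this contract's storage.
        (bool ok, ) = logic.delegatecall(
            abi.encodeWithSignature("set(uint256)", v)
        );
        require(ok, "delegatecall failed");
    }
}
```

After calling set(1) on MiniProxy, value on the proxy should read 43 while value on LogicV1 itself stays 0. Swapping in a logic contract that computes v - 42 instead changes the code that runs, not where the data lives.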

You can find a list of EVM opcodes here.