Why does "cloneDeterministic" have to shift left by 96 bits?

I was looking at the Clones.sol contract, and I noticed that you are using the EVM instruction shl to shift the implementation address by 96 bits:

mstore(ptr, 0x3d602d80600a3d3981f3363d3d373d3d3d363d73000000000000000000000000)
mstore(add(ptr, 0x14), shl(0x60, implementation))

I understand the first part of this code. You are adding 0x14 (20) bytes to the pointer because the first part of the EIP-1167 bytecode has that length. But why do you have to shift left by 0x60 (96 bits)?

Optionality's original implementation of EIP-1167 doesn't need to do that.

Hmm, I think that I figured it out. I wrote this contract:

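// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Shl {
    function foo(address implementation) external pure returns (bytes memory result) {
        uint256 intermediary_result;
        assembly {
            // Shift the address left by 0x60 (96) bits within its 256-bit word.
            intermediary_result := shl(0x60, implementation)
        }
        // ABI-encode the shifted word so its 32 bytes can be inspected.
        result = abi.encode(intermediary_result);
    }
}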

I noticed that the result is the implementation address followed by 24 zero hex digits (12 zero bytes). Thus the reason you are shifting left by 96 bits is to left-align the 20-byte implementation address within a 256-bit EVM word.
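
To make the observation above concrete, here is a minimal sketch that returns the same address word before and after the shift. The contract name ShlLayout and the function layout are hypothetical helpers of mine and are not part of Clones.sol.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper for illustration only; not part of Clones.sol.
contract ShlLayout {
    function layout(address implementation)
        external
        pure
        returns (bytes32 unshifted, bytes32 shifted)
    {
        assembly {
            // An address occupies the low 20 bytes of a 32-byte word:
            // 12 zero bytes first, then the 20 address bytes.
            unshifted := implementation
            // shl(0x60, ...) moves the address into the high 20 bytes:
            // the 20 address bytes first, then 12 zero bytes.
            shifted := shl(0x60, implementation)
        }
    }
}
```

For the address 0x1111111111111111111111111111111111111111, unshifted is 0x0000000000000000000000001111111111111111111111111111111111111111 and shifted is 0x1111111111111111111111111111111111111111000000000000000000000000. Since mstore always writes a full 32-byte word, only the shifted form places the address bytes at the very start of the location being written.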

It should follow, then, that Optionality didn't need to shift the address left because bytes20 is padded differently: it is padded with trailing zeroes by default. Is this explanation correct?
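
One way to check this hypothesis is to compare the two words directly. The sketch below assumes the original implementation converts the address to bytes20 before storing it, as supposed above; the contract name Bytes20Padding and the function compare are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper for comparison only.
contract Bytes20Padding {
    function compare(address implementation) external pure returns (bool same) {
        // Widening bytes20 to bytes32 appends 12 zero bytes on the right,
        // because fixed-size byte types are left-aligned.
        bytes32 fromBytes20 = bytes32(bytes20(implementation));

        bytes32 fromShl;
        assembly {
            // The explicit shift applied to a plain address.
            fromShl := shl(0x60, implementation)
        }

        // Both words hold the 20 address bytes followed by 12 zero bytes.
        same = (fromBytes20 == fromShl);
    }
}
```

If compare returns true for any address you pass in, the bytes20 conversion and the explicit shl(0x60, ...) produce the same 32-byte word, which would be consistent with the explanation above.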