We use the Relay service on Arbitrum and sometimes see slowness: transactions remain in the pending transaction queue for several minutes. We see this on both Arbitrum testnet and mainnet.
Since the Arbitrum Sequencer should confirm a transaction and return the receipt almost immediately, I would like to know what conditions the Relay service uses to decide that a transaction is confirmed and to remove it from the pending transaction queue.
We use Surge pricing currently and thus have no access to logs to understand internals ourselves.
Environment
Relayers on Arbitrum Goerli and Arbitrum One
The core logic is we send the transaction immediately and then check the status once per minute. During that status check, we would resubmit the transaction, typically increasing gas price, if it has not yet been mined (in these cases, you would see more than one hash in the transaction's hashes).
During that same status check once per minute, if the transaction has been mined and more than 12 blocks have passed, we consider the transaction "Confirmed".
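The once-per-minute check described above can be sketched roughly as follows. This is only an illustration of the stated rules, not the Relay service's actual code: the names TrackedTx and status_check, the 10% gas price bump, and the placeholder hash strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

CONFIRMATION_BLOCKS = 12  # "more than 12 blocks have passed", per the description above


@dataclass
class TrackedTx:
    """Hypothetical record of one relayed transaction."""
    hashes: List[str]                  # one hash per (re)submission
    gas_price: int                     # current gas price in wei
    mined_block: Optional[int] = None  # block number once mined, else None
    status: str = "pending"


def status_check(tx: TrackedTx, latest_block: int, bump_percent: int = 10) -> str:
    """One pass of the periodic status check (assumed to run once per minute)."""
    if tx.mined_block is None:
        # Not mined yet: resubmit, typically with a higher gas price.
        # The bump size is an assumption; the thread does not state it.
        tx.gas_price = tx.gas_price * (100 + bump_percent) // 100
        tx.hashes.append(f"resubmission-{len(tx.hashes)}")  # placeholder for a new hash
        return tx.status
    if latest_block - tx.mined_block > CONFIRMATION_BLOCKS:
        tx.status = "confirmed"
    return tx.status
```

Under this model a transaction that is mined just after a check still shows as pending until the next check, so about a minute of delay in the queue is expected even when the sequencer accepts the transaction immediately.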
What might be a reason for a transaction not getting confirmed by Arbitrum Sequencer for more than 60 seconds? I believe in our case, it was not confirmed for this long because Relayer did resubmission after 60 sec and we use the following settings to send transactions:
{ speed: 'fastest', validForSeconds: 60 }
If I give you transactionID would it be possible for you to get the details of what happened?
What might be a reason for a transaction not getting confirmed by Arbitrum Sequencer for more than 60 seconds? I believe in our case, it was not confirmed for this long because Relayer did resubmission after 60 sec and we use the following settings to send transactions:
{ speed: 'fastest', validForSeconds: 60 }
If the transaction was sent right after the update function has run, even if it was mined immediately it would take ~1 minute to confirm
If I give you transactionID would it be possible for you to get the details of what happened?
I think that still should be fine - while it will be in a 'pending' state in the Relay queue, we can see an actual status given by Arbitrum Sequencer via eth_getTransactionReceipt RPC call. Our application waits for at least one network confirmation but didn't receive any and Relay service canceled those transactions after a minute. Hence, I assume something went wrong with the submission to the Arbitrum network itself. I hope my understanding is right, please let me know if it is not.
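For reference, the receipt check mentioned above can be sketched like this. It is a minimal example that only builds the eth_getTransactionReceipt JSON-RPC request and interprets a response body; the function names are made up for illustration, and sending the request to an Arbitrum RPC endpoint is left out.

```python
import json


def build_receipt_request(tx_hash: str, request_id: int = 1) -> str:
    """Build the JSON-RPC body for eth_getTransactionReceipt."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getTransactionReceipt",
        "params": [tx_hash],
    })


def classify_receipt(response_body: str) -> str:
    """Interpret the JSON-RPC response for a receipt lookup."""
    result = json.loads(response_body).get("result")
    if result is None:
        # A null result means the node has no receipt: the transaction
        # is still pending, was dropped, or was never accepted.
        return "no receipt"
    block = int(result["blockNumber"], 16)
    ok = int(result["status"], 16) == 1
    return f"mined in block {block}: {'success' if ok else 'reverted'}"
```

A "no receipt" answer for every hash of a relayed transaction would be consistent with the transaction never having been accepted by the sequencer.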
Here is where it all started with the transaction nonce 28 and till 32:
and then again 3 transactions were not sent/got canceled:
nonce 35: bca4062c-96c0-4954-bced-79dbd41f5222
nonce 36: e0215d87-3ed8-4f6c-9297-c52f6edf9004
nonce 37: fe520e0f-becb-4c62-a322-ec8f3666a534
Thank you Dan. That explains it, and we have already taken measures to prevent it. Can you please provide the block number when this error occurred?
One last question: if we had had access to the Logging Dashboard feature of the Defender app, would we have been able to find out about the "gas limit" error from it?
Can you please provide the block number when this error occurred?
I'm not sure how we would assess this - we don't check the block number before sending eth_sendRawTransaction - you could check the createdAt on a transaction and then check against Arbitrum blocks to get an approximate block number.
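One way to do that lookup is a binary search over block timestamps, sketched below. The assumptions are that createdAt is an ISO 8601 UTC string and that get_timestamp is a caller-supplied function returning a block's Unix timestamp (for example via eth_getBlockByNumber); both names are hypothetical. Since several Arbitrum blocks can share the same timestamp, the result is only approximate.

```python
from datetime import datetime
from typing import Callable, Optional


def iso_to_unix(created_at: str) -> int:
    """Convert an ISO 8601 UTC string (assumed createdAt format) to Unix seconds."""
    return int(datetime.fromisoformat(created_at.replace("Z", "+00:00")).timestamp())


def first_block_at_or_after(
    target_ts: int,
    lo: int,
    hi: int,
    get_timestamp: Callable[[int], int],
) -> Optional[int]:
    """Smallest block number in [lo, hi] whose timestamp is >= target_ts.

    Relies on block timestamps being non-decreasing. Returns None if even
    block `hi` is older than target_ts.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if get_timestamp(mid) < target_ts:
            lo = mid + 1
        else:
            hi = mid
    return lo if get_timestamp(lo) >= target_ts else None
```

The search needs on the order of log2(hi - lo) timestamp lookups, so even a range of tens of millions of blocks takes only a few dozen RPC calls.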
if we had had access to the Logging Dashboard feature of the Defender app, would we have been able to find out about the "gas limit" error from it?
Unfortunately, it does not appear so, because the Relayer Send Tx invocation and the actual sending of the transaction are decoupled. For most chains we virtually never receive errors on an eth_sendRawTransaction call because we catch any such errors with validation in our codebase. L2s with centralized sequencers are an exception because they have additional error handling logic that is opaque to us (like a block's gas already being full).
Yes, I did research based on "createdAt" field value to see the original block. However, all of them have less than 1% of the available "gasLimit" used. One of the potential blocks is this, for example, https://arbiscan.io/block/54626254
So I am wondering now, what does "gas limit reached" really mean? There is no way our tx could exceed a gas limit of 1,125,899,906,842,624. I searched for "ErrGasLimitReached" on GitHub too; it is used only in GasPool https://github.com/OffchainLabs/go-ethereum/blob/56be2729da9dfca0d222a5afddca502306d4ae0f/core/gaspool.go#L41 and it looks like the gasPool is initialized with the block.gasLimit value. Is that the same gasLimit value as shown on arbiscan (1.1 quadrillion)? That looks weird, and the "gas limit reached" error doesn't make sense in that case.
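To make the question concrete, here is a simplified Python model of how a go-ethereum style GasPool behaves: it starts with some amount of gas, each transaction subtracts its gas limit, and the error is raised when the remaining amount is smaller than the requested one. This is a sketch for reasoning only. What value the Arbitrum sequencer actually uses to initialize its pool is not established in this thread, and the smaller pool size in the usage note is purely hypothetical.

```python
class GasLimitReached(Exception):
    """Stands in for go-ethereum's ErrGasLimitReached."""


class GasPool:
    """Simplified model of a per-block gas pool."""

    def __init__(self, gas: int) -> None:
        self.gas = gas  # gas still available in the block being built

    def sub_gas(self, amount: int) -> None:
        """Reserve `amount` gas for a transaction, or fail if not enough is left."""
        if self.gas < amount:
            raise GasLimitReached("gas limit reached")
        self.gas -= amount


def can_fit(pool_size: int, tx_gas_limits: list) -> bool:
    """Return True if all transactions fit into a pool of the given size."""
    pool = GasPool(pool_size)
    try:
        for gas_limit in tx_gas_limits:
            pool.sub_gas(gas_limit)
    except GasLimitReached:
        return False
    return True
```

With a pool of 2**50 (the 1,125,899,906,842,624 figure shown on arbiscan), can_fit(2**50, [1_000_000] * 1000) is True, so the error would be practically unreachable. With a much smaller pool, say a hypothetical 20,000,000, can_fit(20_000_000, [15_000_000, 15_000_000]) is False. The error therefore only makes sense if the pool is sized from something other than the displayed gasLimit.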