How do I make a deep copy of an array in Yul?

Hello,

I am trying to solve a problem.

Problem statement: Implement a simple function that copies a bytes memory array in the most gas-efficient way using inline assembly. (The copied array must be a deep copy. In other words, the copy cannot merely be a reference to the original array.)

Here is my code. It would be great if someone could help me.

// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.19;

abstract contract Challenge {

    /**
     * @notice Returns a copy of the given array in a gas efficient way.
     * @dev This contract will be called internally.
     * @param array The array to copy.
     * @return copy The copied array.
     */
    function copyArray(bytes memory array) 
        internal 
        pure 
        returns (bytes memory copy) 
    {
        assembly {
            let length := mload(array)
            copy := mload(0x40)
            mstore(0x40, add(copy, add(32, mul(length, 32))))
            mstore(copy, length)
            if gt(length, 0) {
                let src := add(array, 32)
                let dst := add(copy, 32)
                let end := add(src, mul(length, 32))

                for { } lt(src, end) { } {
                    mstore(dst, mload(src))
                    src := add(src, 32)
                    dst := add(dst, 32)
                }
            }
        }
        //return copy;
    }
}

The issue is that it does not meet the gas targets in the tests:

Public Test 1
:heavy_check_mark: Should correctly allocate and copy the array
:heavy_check_mark: Should be more efficient than the reference implementation
1) Should be below the 250 gas consumption target
Public Test 2
:heavy_check_mark: Should correctly allocate and copy the array
:heavy_check_mark: Should be more efficient than the reference implementation
2) Should be below the 450 gas consumption target
Public Test 3
:heavy_check_mark: Should correctly allocate and copy the array
:heavy_check_mark: Should be more efficient than the reference implementation (46ms)
3) Should be below the 1000 gas consumption target

Hey! You are copying too many memory slots.

In Solidity, string and bytes behave like arrays but are stored differently: each element occupies only 1 byte (instead of 32 bytes).

So a bytes array of length X occupies X + 32 bytes in memory: the first 32 bytes hold the array length and the rest hold its content.

Since each memory slot is 32 bytes, you need one slot for the array length and then the ceiling of X/32 slots for the content.
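To make the slot arithmetic concrete, here is a minimal sketch of how the number of content slots could be computed in inline assembly. The library name `BytesWords` and the function `contentWords` are hypothetical names chosen for illustration; they are not part of the challenge.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.19;

// Hypothetical helper, for illustration only.
library BytesWords {
    /// Returns how many 32-byte slots the content of `data` occupies,
    /// not counting the slot that stores the length.
    function contentWords(bytes memory data)
        internal
        pure
        returns (uint256 words)
    {
        assembly {
            // ceil(length / 32) written with integer division:
            // (length + 31) / 32
            words := div(add(mload(data), 31), 32)
        }
    }
}
```

For example, a 33-byte array needs 2 content slots plus 1 length slot, so 3 slots (96 bytes) in total. The loop in the question treats the length as a slot count, so for the same array it would copy 33 slots (1056 bytes).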

I hope this helps!

Or, in technical terms, replace mul(length, 32) with length in both places where it appears (the free memory pointer update and the computation of end).
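Putting that together, a sketch of the adjusted function could look like the following. Two things here go beyond the replies above and are my own assumptions: the allocation is rounded up to a multiple of 32 so the free memory pointer stays word-aligned, and the `if gt(length, 0)` guard is dropped because the loop condition already skips empty arrays. The contract name `ChallengeSketch` is hypothetical, and I have not measured this against the gas targets in the tests.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.19;

// Hypothetical name; a sketch of the fix, not a verified solution.
abstract contract ChallengeSketch {
    function copyArray(bytes memory array)
        internal
        pure
        returns (bytes memory copy)
    {
        assembly {
            // For bytes, the length word counts bytes, not 32-byte slots.
            let length := mload(array)
            copy := mload(0x40)

            // Reserve 32 bytes for the length plus `length` content bytes,
            // rounded up to the next multiple of 32 (assumption: we want
            // the free memory pointer to stay word-aligned).
            mstore(0x40, add(copy, and(add(length, 63), not(31))))
            mstore(copy, length)

            let src := add(array, 32)
            let dst := add(copy, 32)
            // `end` is now src + length, so the loop runs ceil(length / 32) times.
            let end := add(src, length)

            for { } lt(src, end) { } {
                mstore(dst, mload(src))
                src := add(src, 32)
                dst := add(dst, 32)
            }
        }
    }
}
```

The last iteration copies a full 32-byte word even when the content ends partway through it, so the padding bytes after the copied content may hold whatever followed the source array in memory. They fall inside the rounded-up allocation and outside the array's length, so reads through `copy` should not see them.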