Efficient Usage of BitMaps to Store Predefined Set of Randomly Selected Numbers

0xLostArchitect · October 23, 2021, 10:45pm

Hi everyone!

I'm trying to work through the most efficient way to solve the following problem:

Imagine we have an ERC721 with 1000 tokens, ids 0 - 999. On every mint, the caller gets one of the IDs that is remaining, at random. How should we store which ID's are still available?

The following is a POC that I think should work, but would love to hear some other options that I might be missing.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.9;

// External
import "@openzeppelin/contracts/token/ERC721/ERC721.sol";
import "@openzeppelin/contracts/utils/math/Math.sol";

contract RandomMint is ERC721 {
    using Math for *;

    /**
     * @dev Represents up to 256 id's
     *
     * @dev A `maxSupply` T will have T / 256 bitmappings, representing all of the different id's possible.
     *
     * @dev This structure allows us to randomize and keep track of which id's have been minted
     *
     * {uint256} originalPosition - the original position in the index of the array of bitmaps, providing the offset for the id
     * {uint256} bitmap - the actual bitmap holding up to 256 unique ids
     * {uint256} maxBitPosition - the maximum bit position of this bitmap, which allows us to determine when the bitmap is full
     */
    struct BitMapping {
        uint256 originalPosition;
        uint256 bitmap;
        uint256 maxBitPosition;
    }

    /**
     * @dev Map indexes to BitMaps
     *
     * @dev Used to track the remaining id's
     */
    BitMapping[] public bitMappings;

    constructor(uint256 maxSupply) ERC721("RandomMint", "RM") {
        /*
         * Determine the number of bit mappings to make based off of the `maxSupply`
         *
         * Each BitMapping can hold 256 ids, with each id represented by a bit.
         * Therefore, he amount will of BitMappings for this location is calculated as:
         *    `ceiling(maxSupply / 256)_`
         *
         * Our `originalPosition` is created by an incremental `index`,
         * used to indicate what address space the bitmapping has. We can indicate the offset for a given BitMapping as:
         *      `bitMapping.originalPosition * (bitMapping.maxBitPosition + 1)`
         *
         * The `maxBitPosition` indicates how many bits of the bitmap are actually used, which
         * is the minimum `255` or the supply
         */

        uint256 remainingSupply = maxSupply;
        uint256 index;
        while (remainingSupply > 0) {
            bitMappings.push(BitMapping(index++, 0, remainingSupply.min(255)));
           // SEE BELOW FOR EDIT #1
            remainingSupply -= remainingSupply.min(256);
        }
    }

    /**
     * @notice Mint
     * @param count - The number of tokens to mint
     *
     * Requirements:
     * - The `count` is within the range of the current price range
     * - The `signer` has the `SIGNER_ROLE`
     * - The proper amount was sent
     *
    *  * On every call to `mint`, make sure to:
            - generate a new random number
        * On every loop, make sure to:
            - update the bitmap
     */
    function mint(uint256 count) external payable {
        // NOTE: DO NOT USE THIS. NOT A REAL RANDOM NUMBER.
        uint256 randomNumber = uint256(keccak256(abi.encode(block.timestamp)));

        for (uint256 i; i < count; i++) {
            // All us to reuse a random number
            randomNumber = uint256(keccak256(abi.encode(randomNumber, i)));
            // Get the id
            (
                uint256 bitmapIndex,
                uint256 positionIndex
            ) = _getBitMapInfoForNewMint(randomNumber);

            uint256 tokenId = _getTokenIdAndUpdateBitMap(
                bitmapIndex,
                positionIndex
            );

            // Actually mint the token
            _mint(msg.sender, tokenId);
        }
    }

    /**
     * @dev Given a seed return a valid bitmap index and the position within it
     */
    function _getBitMapInfoForNewMint(uint256 randomNumber)
        internal
        view
        returns (uint256, uint256)
    {
        // Select one of the buckets that contains all available token ids
        uint256 bitmapMappingIndex = randomNumber % bitMappings.length;

        // Extract that bitMapping
        BitMapping memory bitMapping = bitMappings[bitmapMappingIndex];

        uint256 potentialTokenId;

        // This represents the first index we check
        uint256 originalBitMappingIndex = randomNumber %
            bitMapping.maxBitPosition;

        // We keep shifting the index ("walking the bitmap") until we iterate over every potential index
           // SEE BELOW FOR EDIT #1
           // SEE BELOW FOR EDIT #1
       // SEE BELOW FOR EDIT 2
        for (uint256 i; i <= bitMapping.maxBitPosition; i++) {
            /*
             * Offset the original selection by whatever index we're on.
             * Use modulo so we can wrap around to make the range [0, bitMapping.maxBitPosition)
             */
            potentialTokenId =
                (originalBitMappingIndex + i) %
                bitMapping.maxBitPosition;

            /*
             * If the result of the `bitwise AND` is zero, we know that that bit
             * has not been set yet in the bitmap, indicating an available index.
             */
            if (bitMapping.bitmap & (1 << potentialTokenId) == 0) {
                break;
            }
        }

        return (bitmapMappingIndex, potentialTokenId);
    }

    function _getTokenIdAndUpdateBitMap(
        uint256 mintingIndex,
        uint256 tokenPosition
    ) internal returns (uint256) {
        BitMapping storage bitMapping = bitMappings[mintingIndex];

        /*
         * Update the bitmap
         *
         * `bit = 1 << tokenPosition`
         */
        bitMapping.bitmap |= (1 << tokenPosition);

        /*
         * Get the actual index by taking the original offset into the bitmap array,
         * multiplying it by the maxPosition of that bitmap + 1, and adding the tokenId.
         *
         * NOTE: Add + 1 because we WANT overflow here:
         * Take: originalPosition = 1, maxBitPosition = 255, tokenPosition = 0
         *
         * By adding one, we ensure we end up in the range [256, X], which ensures we don't overwrite
         * any index from the first bitmap.
         */
        uint256 mintIndex = bitMapping.originalPosition *
            (bitMapping.maxBitPosition + 1) +
            tokenPosition;

        /*
         * If a bitmap is full, delete it from the array.
         * This ensures that every `bitmap` has _at least_ one available index
         *
         * NOTE: Solidity "pop + swap model"
         */
        if (bitMapping.bitmap == ((1 << bitMapping.maxBitPosition) - 1)) {
            bitMappings[mintingIndex] = bitMappings[bitMappings.length - 1];
            bitMappings.pop();
        }

        return mintIndex;
    }
}

A few notes:

My general arithmetic + usage of the bitmap seems sketchy at best. Am I looking at any potential one off errors? Specifically, the check at the end of _getTokenIdAndUpdateBitMap feels like it may not be correct.
Regarding the usage of those structs - are there advantages/disadvantages of using memory vs struck vs reading from the array itself?
I believe I could pack the BitMapping struct itself. maxBitPosition will allows be uint8 at most, and for the use case I think we could safely assume that originalPosition will be < MAX_UINT248, so I could probably even mask those in some which way to combine into one variable (does this have pros + cons vs a tightly packed struct?).. But down the line is reading and operating on that going to then cost significant gas?
The core loop is O(256) - is there a better data structure/arrangment of BitMapping that better uses EVM-specific data structuring in regards to both memory + time complexity?

EDIT #1:

I previously had the following line:

            remainingSupply -= remainingSupply.min(255);

However, this proves to not be inaccurate. Take the trivial case of needing 256 tokenids. If we assume we can use tokenId == 0, then 0 - 255 already provides with 256 valid tokenIds, meaning we should actually decrease supply by 256.

However, the max maxBitPosition still should be < 256, as it also includes 0 as a valid index, meaning it represents maxBitPosition + 1 slots.

EDIT #2:
I previous had the following line:

        for (uint256 i; i < bitMapping.maxBitPosition; i++) {

The need for the change is partly explaing in EDIT #1 - given that maxBitPosition represents maxBitPosition + 1 valid indices, we actually need to iterate through that index as well, hence the <=.

0xLostArchitect · October 24, 2021, 1:24am

Random thought on the potential struct packing.

TL;DR - we don't even need the struct. We could actually just make a uint256[] through some fancy masking.

We could decide that index and maxBitPosition will both be uint8 , leaving the bitmap to be uint240. We now calculate the length of the array to be maxSupply / 240 , which, yes, will add a few more unit256 elements, but that will pay off swimmingly considering that we now have zero structs (which was previously represented by 3 uint256).

Furthermore, all writes + updates will only need to operate on uint256 , as opposed to updating multiple fields of a struct/moving them around.

Worth it?

frangio · October 28, 2021, 7:20pm

You will probably find our BitMaps library useful.

I would try to isolate the logic of drawing an available token id into a single internal function.

0xLostArchitect · November 3, 2021, 5:01pm

BitMaps uses all 256 bits though in a non-configurable manner, and I'm trying to ~~hack~~ design a "packed" bitmap.

It's essentially as though I have a struct like:

struct TightStruct {
    uint8 originalPosition;
    uint8 maxBitPosition;
    uint240 bitmap;
}

and then folding that into a uint256.

@fragio The only reason I don't have them as one function is because those functions already do so much that doubling the size and making it change state just seems hectic, but point well-taken.

Here is an initial implementation. NOTE: Not tested. Do not use in production.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.7;

// External
import "@openzeppelin/contracts/utils/math/Math.sol";

contract BitmapHolder {
    /* Libraries */
    using Math for *;

    /**
     * @dev A collection of bitmaps that builds on masking and bitwise operations
     *
     * @dev top 8 bits ("originalPosition") - the original position in `bitMappings`, providing the offset for an id down the line
     * @dev next 8 bits ("maxBitPosition") - the maximum bit position of this bitmap, which allows us to determine when the bitmap is full
     * @dev bottom 240 bits ("bitmap") - the actual bitmap holding up to `maxBitPosition + 1` id's
     */
    uint256[] private _bitMappings;

    /**
     * @notice Constructor to initialize the bitmap with a given supply
     */
    constructor(uint256 supply) {
        _makeBitMapping(supply);
    }

    /**
     * @dev Initialization function for the bitmap holder
     *
     * @dev Determine the number of bit mappings to make based off of the `maxSupply`
     *
     * @dev An entity looking to hold quantity T will need `ceiling(T / 240)` uint256 to represent all of the different id's possible
     */
    function _makeBitMapping(uint256 maxSupply) internal {
        // Track how many units are still needed
        uint256 remaining = maxSupply;

        // Track the original position inside the array for offsetting reasons downstream
        uint256 index;

        // Iterate until no more units are needed
        while (remaining > 0) {
            /*
             * Base entry has its first 8 bits occupied by the `index`,
             * the next 8 bits describe how many bits are used by the packed `bitmap`,
             * and the last 240 bits are the actual bitmap.
             *
             * Increase the index, and do a bitwise OR with the `maxbits` needed offset to the proper position
             *
             * NOTE: Use `239` because `maxBitPosition` INCLUDES `0` as a valid bit! 
            */
            _bitMappings.push(index++ | (remaining.min(239) << 8));

            // Ensure no overflow
            remaining -= remaining.min(240);
        }
    }

    /**
     * @notice Public mint function to test the functionality
     */
    function mint() public returns (uint256) {
        // NOTE: DO NOT USE THIS FUNCTION TO GET A RANDOM NUMBER.
        uint256 seed = uint256(
            keccak256(abi.encode(block.timestamp, block.difficulty))
        );
        (uint256 index, uint256 position) = _getBitMapInfoForNewId(seed);
        return _getIdAndUpdateBitMap(index, position);
    }

    /**
     * @dev Gets the base information needed for a new id
     *
     * @return uint256 - The index into `_bitmappings`
     * @return uint256 - The position with the nested bitmap stored at the `index` returned
     */
    function _getBitMapInfoForNewId(uint256 seed)
        internal
        view
        returns (uint256, uint256)
    {
        // Select one of the buckets that contains all available token ids
        uint256 bitmapMappingIndex = seed % _bitMappings.length;

        //Shift and grab 8 bits to get mbp, then convert back to 256 for easier solidity handling
        uint256 _maxBitPosition = uint256(
            uint8(_bitMappings[bitmapMappingIndex] >> 8)
        );

        // Move the bitMapping up 16 bits to get just the `bitmap`
        uint256 _bitmap = _bitMappings[bitmapMappingIndex] >> 16;

        // Store the id that is still available
        uint256 position;

        // This represents the first index we check
        uint256 originalBitMappingIndex = seed % _maxBitPosition;

        // We keep shifting the index ("walking the bitmap") until we iterate over every potential index
        for (uint256 i; i < _maxBitPosition; i++) {
            /*
             * Offset the original selection by whatever index we're on.
             * Use modulo so we can wrap around to make the range [0, _maxBitPosition)
             */
            position = (originalBitMappingIndex + i) % _maxBitPosition;

            /*
             * If the result of the `bitwise AND` is zero, we know that that bit
             * has not been set yet in the bitmap, indicating an available index.
             */
            if (_bitmap & (1 << position) == 0) break;
        }

        return (bitmapMappingIndex, position);
    }

    function _getIdAndUpdateBitMap(uint256 bitmapIndex, uint256 position)
        internal
        returns (uint256)
    {
        /*
         * Update the `bitMapping`
         *
         * `bit = 1 << position` gets the position inside the bitmap itself
         * shifting 16 properly places it inside the packed uint256
         */
        uint256 newBitMapping = _bitMappings[bitmapIndex] |
            ((1 << position) << 16);


        // Store the maxBitPosition
        uint256 maxBitPosition = uint256(uint8(newBitMapping >> 8));

        /*
         * Get the actual index by taking the original position in `_bitMappings` (calculated by getting the first 8 bits from the bitmapping)
         * multiplying it by the `maxBitPosition` of that bitmap + 1 (See below),
         * adding the `position` inside the bitmap,
         *
         * NOTE: We use `maxBitPosition + 1` because we want overflow into the next range.
         *
         * Example: uint8(newBitMapping) = 1, maxBitPosition = 239, position = 0
         *
         * By adding one, we ensure we end up in the range [240, ...), gauranteeing we don't overwrite
         * any index from the first range from `_bitMappings`
         */
        uint256 index = uint8(newBitMapping) * (maxBitPosition + 1) + position;

        /*
         * At this point, the new `bitmap` can potentially be full or not.
         * To check, we get just the `bitmap` from the `bitMapping` (by shifting 16),
         * we capture what the highest possible value for this `bitmap` is (see below).
         * and if those are equal, we delete it from the array (using  Solidity "pop + swap model")
         * This ensures that every `bitmap` has _at least_ one available index.
         * 
         * In the case where this is no overflow, simply rewrite to storage.
         *
         * NOTE: We use this `- 1` construct because we want one less than the value of the maximimum represents a full bitmap
         *
         * Example: newBitMapping >> 16 = 1, maxBitPosition = 1
         *
         * Since `maxBitPosition` represents one bit higher than what this bitMapping can hold (given that it uses the 0 slot!),
         * the max value of this bitmap should be `1`. Therefore, we end up with `(1 << 1) - 1 == 1`, which is exactly what was desired.
         *
        */
        if ((newBitMapping >> 16) == ((1 << maxBitPosition) - 1)) {
            _bitMappings[bitmapIndex] = _bitMappings[_bitMappings.length - 1];
            _bitMappings.pop();
        } else {
            _bitMappings[bitmapIndex] = newBitMapping;
        }

        return index;
    }

    /**
     * @notice Public getter function
     */
    function bitMappings() external view returns (uint256[] memory) {
        return _bitMappings;
    }
}

Would love to hear any thoughts!

frangio · November 3, 2021, 6:16pm

The issue with iterating over an array is that it's going to become more and more expensive as the bitmap is more full.

What would you like to configure? Note that if you're concerned about numbers 0 to 1000 they will be packed in the most efficient way possible, as far as I can tell. Can you describe in words what further optimization you're looking for?

(Sorry haven't had time to read the code in depth.)

0xLostArchitect · November 3, 2021, 10:26pm

For more context: Imagine we have 1000 id's that will be selected at random. What is the most efficient way to select one that has not been selected yet?

Naive implementation is O(n) - iterate over every single id and check if it has been selected (e.g. using _exists for a 721, but I argue we can do better.

In the original implementation I had, we lowered our bound to O(256) by creating this fancy struct - each one would essentially track an address space of 256 available id's, and by all structs in an array with the covariant that every member of the array must have at least one available id, we achieved constant time.

But then the memory constraint bothered me - using 3 uint256 to track something so simple seemed overkill. So the above proposed solution instead packs the first struct into one uint256, laid out so that:

The first 8 bits track the original position in the array - so that later on we can find the correct token space
The next 8 bits track the max bit position inside this uint256 id's, to ensure that when we look for id's we only check valid bits
The final 240 bits are the bit map portion.

This is addressed by the solution above, wherein we have O(240) to search for an available id, which, of course, isn't great, but is the best I can find given the constraints of the issue.

sven.meyer · December 7, 2022, 9:46am

" For more context : Imagine we have 1000 id's that will be selected at random. What is the most efficient way to select one that has not been selected yet?"

If that's what you are (mainly) looking for, then just use a hashmap.
(Surprisingly) in Solidity / EVM looking up an address in a hashmap is just O(1) , and I think even a tiny bit less gas than accessing one (!) array element.

Topic		Replies	Views
Using `address` as keys for BitMaps Contracts	3	874	June 29, 2023
Bitmaps Library example Contracts	2	1358	August 11, 2022
Mint ERC721 tokens in pseudorandom order General erc721	9	4550	May 3, 2022
Two qustions about the latest BitMaps library Support	1	355	June 27, 2021
ERC721Enumerable gas optimization Contracts erc721 , openzeppelin-contracts	8	4367	March 8, 2022

Efficient Usage of BitMaps to Store Predefined Set of Randomly Selected Numbers

Related topics