Address Obfuscation

Here’s an odd question for the community: given a public address (of any type [ETH, BTC, etc.]), what is the minimum number of characters one could/should display so that the information is unique enough to be recognizable, but the full address is not easily discoverable via a block explorer?

For example, if I have the address 0xf440635601f3d3085a1d82d874667c9b2623b842 (don’t get excited, I just pulled a random one from Etherscan), is 0xf44...842 good enough to use in a “here’s some consistent information regarding an address, but I don’t want someone to know the complete address” context?

I’m being vague with my use-case on purpose.

Edit: I guess what I’m really asking is: how hard would it be to search for an address across any given chain if you only knew the first and last three characters? Does that difficulty increase exponentially by making it 2 and 2, and if so, how long would that take?

So, after some investigation, I have some further observations:

I went with ETH as the best example, since it was easiest to quickly find out how many total addresses there are.

There are roughly 120 million ETH addresses. Collecting those addresses isn’t the easiest, but it’s possible in several ways (which I won’t mention here). Once you have a flat file (or something similar) of 120 million lines, you should be able to find matching addresses relatively quickly with standard tools.
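For illustration, the "standard tools" step could look something like the sketch below. It assumes a flat file with one hex address per line; the function name `find_matches` and the file name `addresses.txt` are hypothetical, not anything from a real tool.

```python
from typing import Iterable, Iterator


def find_matches(lines: Iterable[str], prefix: str, suffix: str) -> Iterator[str]:
    """Yield every address whose hex body starts with `prefix` and ends with `suffix`.

    `prefix` and `suffix` are the visible hex characters, without the "0x".
    The comparison is case-insensitive, since mixed case in an ETH address
    is only a checksum and does not change the address itself.
    """
    prefix = prefix.lower()
    suffix = suffix.lower()
    for line in lines:
        addr = line.strip().lower()
        body = addr[2:] if addr.startswith("0x") else addr
        # Skip blank or too-short lines so prefix and suffix cannot overlap.
        if len(body) < len(prefix) + len(suffix):
            continue
        if body.startswith(prefix) and body.endswith(suffix):
            yield "0x" + body


# Hypothetical usage against a flat file (one address per line):
# with open("addresses.txt") as fh:
#     for match in find_matches(fh, "f44", "842"):
#         print(match)
```

This is a single linear pass over the file, so even at 120 million lines the search itself is not the hard part; collecting the list is.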

So, my question really becomes: what’s the likelihood that a 3…3 or 2…2 obfuscated address has more than one match within those 120 million, and how many matches are there?
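A rough way to put numbers on this, as a sketch only: assume the hex characters of an address are uniformly distributed (reasonable for hash-derived addresses, less so for vanity addresses), and count only the visible hex characters, not the "0x". The function names below are made up for this example.

```python
import math


def expected_other_matches(total_addresses: int, visible_hex_chars: int) -> float:
    """Expected number of OTHER addresses sharing the same visible characters.

    Assumption: every hex character is independent and uniform, so a given
    address matches the visible pattern with probability 16 ** -k.
    """
    return (total_addresses - 1) / 16 ** visible_hex_chars


def prob_at_least_one_other(total_addresses: int, visible_hex_chars: int) -> float:
    """Probability that at least one other address matches the pattern."""
    p = 16.0 ** -visible_hex_chars
    # 1 - (1 - p) ** (n - 1), written with log1p/expm1 for numerical stability.
    return -math.expm1((total_addresses - 1) * math.log1p(-p))


for label, k in (("3...3", 6), ("2...2", 4)):
    n = 120_000_000
    print(label, expected_other_matches(n, k), prob_at_least_one_other(n, k))
```

Under those assumptions, 3…3 leaves roughly 7 other candidates among 120 million addresses and 2…2 leaves roughly 1,800, so in both cases a second match is all but certain. Each hex character removed multiplies the candidate count by 16.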

Number wizards plz halp :pleading_face:

Hi @EvilJordan,

Etherscan shows over 100 million unique addresses, which isn’t a lot of addresses.

I assume that you wouldn’t need many characters of an address to track it down, especially if there were other contextual information, e.g. a non-zero balance of Ether or a specific token, or having interacted with a particular contract. I haven’t played with any analytics tools, but I assume you could write queries to find an address matching specific criteria (thinking back to my days doing database queries).

Doing some back-of-the-envelope maths (feel free to correct): if you showed 6 characters, that would narrow the address space by a factor of 1/(16^6), or 1/16,777,216, which with an even distribution of addresses would leave about 6 addresses out of 100 million.

Doing a quick play with Etherscan, a unique address was being found after entering the first 7 characters of an address which suggests my maths is in the right ballpark.

With 4 characters, the address space narrows by a factor of 1/(16^4), or 1/65,536, which with an even distribution leaves about 1,526 addresses out of 100 million.
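The Etherscan observation (unique after about 7 characters) can be sanity-checked the same way. The sketch below finds the smallest number of hex characters for which the expected number of matching addresses drops to one or fewer, again assuming an even distribution; the function name is hypothetical.

```python
def chars_until_expected_unique(total_addresses: int) -> int:
    """Smallest k such that total_addresses / 16**k <= 1.

    Uses integer arithmetic only, so there are no floating-point
    rounding surprises near exact powers of 16.
    """
    k = 0
    while 16 ** k < total_addresses:
        k += 1
    return k


print(chars_until_expected_unique(100_000_000))  # 16**6 < 100 million <= 16**7
```

For 100 million addresses this gives 7, which lines up with what Etherscan's search showed.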

The flip side is that the fewer characters you show, the higher the chance of a collision, where you show a user an obfuscated address that isn’t the one they want. It depends on which is worse for your use case.

Security through obscurity isn’t security.

MetaMask shows first four and last four characters.
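A display helper along these lines is easy to write. This is only a sketch of the general idea, not MetaMask's actual code; the name `shorten` and its parameters are invented here, and whether the leading "0x" should count toward the visible characters is a design choice (it does not count in this version).

```python
def shorten(address: str, lead: int = 4, tail: int = 4, sep: str = "...") -> str:
    """Return "0x" + first `lead` hex chars + sep + last `tail` hex chars.

    If lead + tail would cover the whole address, it is returned unshortened.
    """
    body = address[2:] if address.lower().startswith("0x") else address
    if lead + tail >= len(body):
        return "0x" + body
    # Slice from an explicit index so that tail == 0 yields an empty suffix.
    return "0x" + body[:lead] + sep + body[len(body) - tail:]


addr = "0xf440635601f3d3085a1d82d874667c9b2623b842"
print(shorten(addr))        # first four and last four
print(shorten(addr, 3, 3))  # the 3...3 form from the original question
```

Making `lead` and `tail` parameters keeps the privacy versus recognizability tradeoff a one-line change.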

Interested to hear other thoughts from the community.

Your math is in line with mine.

I’m thinking in terms of some sort of social system, where an ETH address is used as the account identifier. At what point, given some characters of an address, does it become trivial to identify that user’s assets on the chain? It seems like the answer is probably > 5.

Where’s the tradeoff between user convenience (quickly skimming and noticing an address you recognize) and security by… plausible deniability? Is 0xab more “secure” than 0xabcd, which is more “secure” than 0xab..82? Probably…

MetaMask’s decision to show the user truncated data is largely a visual UI choice, I think. No one else is looking at my MetaMask data.

When zkProofs that are cheap, easy, and have no upper-bound?

Hi @EvilJordan,

Using ENS, an address book (even an auto-generated one, with labels such as “recently used 0xab” or “never used 0xab”), or blockies could help the user identify addresses.
