Snowflake ID:
Snowflakes are 64 bits in binary. (Only 63 are used to fit in a signed integer.) The first 41 bits are a timestamp, representing milliseconds since the chosen epoch. The next 10 bits represent a machine ID, preventing clashes. Twelve more bits represent a per-machine sequence number, to allow creation of multiple snowflakes in the same millisecond. The final number is generally serialized in decimal.
Instead of
1 | select count(*) from payment.transaction |
Do
1 | select count(*) from payment.transaction |
- We extract the Unix Epoch (in seconds) from the time, turn it to a bigint in milliseconds.
- Then we remove 1288834974657 ms (the Twitter snowflake Epoch started at that Unix Epoch), which gives us the Epoch time in ms relative to the Twitter Epoch.
- Finally we just offset that by 22 bits and turn it to a text for comparison with Redash’s PGSQL id which is a text (note in MySQL, those are not texts but varbinaries).
In May 2018, the Twitter epoch went to 1000000000000000000. If you did not add the check on the string length, then you’d also be counting transactions that happened before May the 5th, 2018, because their Twitter epoch starts by 9 and this is a text comparison
I have a Snowflake id, how do I turn it into an SQL timestamp ?
1 | select to_timestamp((((<SNOWFLAKE_ID>::bigint)>>22) + 1288834974657) / 1000) as flakeTime |
I have an SQL timestamp, how do I turn it into a Snowflake id ?
1 | select ((extract(epoch from timestamp '2022-04-01 02:00')::bigint * 1000 - 1288834974657) << 22)::text |
Note this is the minimum value for a snowflake at that time (with the 22 bits on the right filled with 0), while the maximum value is with those 22 bits filled with 1
To cover all possible snowflakes, compare with >= and < on a small time range (within one minute, one second, whatever you like down to a millisecond if you have a use-case for that).