全局唯一标识(UID)-多种实现方案

论坛 期权论坛 脚本     
匿名技术用户   2020-12-27 00:30   145   0

原文转自:http://www.tanjp.com/archives/132 (即时修正和更新)

目录

64位全局唯一标识

62进制字符串唯一ID

32位唯一ID(一年(388天)内的每秒内有128个唯一值)

32位唯一ID(一个月(48天)内的每秒内有1024个唯一值)

32位唯一ID(一周(12天)内的每秒内有4096个唯一值)


64位全局唯一标识

snowflake是用64位整型来表示一个序列号的,看上去和我们的时间戳很像,但是它的起始元年不是1970年,而是自定义的。然后时间戳里面只是记录下当前毫秒与自定义的元年时间差,这样起始就用不到64位的bit来记录整个时间戳,多出来的几位就可以做其他的事情,来看下面的结构:

00000.......000 00000 00000 000000000000
|___________| |___| |___| |__________|
| | | |
42bit 5bit 5bit 12bit

参考snowflake算法改造:

00000.......000 0000000000 000000000000
|___________| |________| |__________|
| | |
42bit 10bit 12bit

64位的全局唯一ID,保证在138年内的每1毫秒内有4095个唯一值。

第一部分 长度为42位,精确到毫秒级的自定义时间戳。当前时间戳减去指定时间开始的时间戳,最大可记录138年。

第二部分 长度为10位,根据不同业务ID做区分,取值范围[0,1023]。

第三部分 长度为12位,自增ID,取值范围[0,4095]。

C++代码实现:

class Uid64
{
public:
     explicit Uid64(uint32 pn_1st_from_year = 2000, uint32 pn_2nd_business_id = 0);
     uint64 generate(uint32 pn_2nd_business_id);
private:
     uint32 mn_1st_from_year; //毫秒级时间
     uint32 mn_2nd_business_id; //业务ID
     uint32 mn_3rd_index; //自增
};
Uid64::Uid64(uint32 pn_1st_from_year, uint32 pn_2nd_business_id)
 : mn_1st_from_year(pn_1st_from_year)
 , mn_2nd_business_id(pn_2nd_business_id)
 , mn_3rd_index(0)
{
    namespace pt = boost::posix_time;
 if ( (mn_1st_from_year > pt::microsec_clock::universal_time().date().year())
  || (mn_1st_from_year < 1970) )
 {
  mn_1st_from_year = 1970;
 }
 if (pn_2nd_business_id > 1023)
 {
  pn_2nd_business_id = 0;
 }
}
uint64 Uid64::generate(uint32 pn_2nd_business_id)
{
 namespace pt = boost::posix_time;
 mn_2nd_business_id = pn_2nd_business_id;
 if (mn_2nd_business_id > 1023)
 {
  mn_2nd_business_id = 0;
 }
 ++mn_3rd_index;
 if (mn_3rd_index > 4095)
 {
  mn_3rd_index = 0;
 }
 pt::ptime now = pt::microsec_clock::universal_time();
 pt::ptime from_time(boost::gregorian::date(mn_1st_from_year, 1, 1));
 pt::time_duration time_span = now - from_time;
 uint64 zn_1st = time_span.total_milliseconds();
 uint64 zn_2nd = mn_2nd_business_id;
 uint64 zn_3rd = mn_3rd_index;
 uint64 zn_result = (zn_1st << 22) | (zn_2nd << 12) | zn_3rd;
 return zn_result;
}

测试例子:

void main()
{ 
    const uint32 kLen = 100000;
 Uid64 zo_u64(2019, 1);
 uint64 zc_vec[kLen];
 for (uint32 i = 0; i < kLen; ++i)
 {
  zc_vec[i] = zo_u64.generate(5);
 }
    uint32 zn_repeat_count = 0;
    std::unordered_set<uint64> zc_check_repeat;
    for (uint32 i = 0; i < kLen; ++i)
    {
        auto it = zc_check_repeat.find(zc_vec[i]);
        if (it == zc_check_repeat.end())
        {
            zc_check_repeat.insert(zc_vec[i]);
        }
        else
        {
            ++zn_repeat_count;
            std::cout << "repeat:" << zc_vec[i] << std::endl;
        }
    }
        std::cout << std::endl 
            << "repeat count : " << zn_repeat_count << std::endl;
}

62进制字符串唯一ID

全局唯一ID在程序很多时候不能包含标点符号,于是设计一种不包含标点符号的可见字符转换方案。非标点有效字符包括 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,分别对应数值 0~61,唯一ID的生成由多个数值转成字符串后拼接起来,为了正确解析和保证唯一性,N个数值拼接,需要N-1个以上带字符串的长度信息。

拼接公式: 自增+线程ID+线程ID字符长度+进程ID+进程ID字符长度+相对时间戳+相对时间戳字符长度

例如:

Uid62 zo_u62;
std::cout << zo_u62.new_uid(135, 61) << std::endl;

结果: 1z1B22RlGj25

解析: 1(自增) z(61相应的字符) 1(61相应的字符的长度) B2(135相应的字符) 2(135相应的字符的长度) RlGj2(相对时间戳相应的字符) 5(相对时间戳相应的字符长度)

代码实现:

class Uid62
{
public:
    explicit Uid62(uint32 pn_basetime = 1514736000U);
    std::string new_uid(uint32 pn_serverid, uint32 pn_actorid);
    std::string split_uid(const std::string & ps_uid);
private:
    Uid62(const Uid62&) = delete;
    Uid62(Uid62&&) = delete;
    Uid62& operator=(const Uid62&) = delete;
 Uid62& operator=(Uid62&&) = delete;
 void convert_to_str(std::stringstream & po_result, uint32 pn_val, bool pb_append_len = true);
 bool convert_to_uint32(uint32 & pn_result, const std::string & ps_str);
private:
 uint32 mn_basetime;
 uint32 mn_calc;
 uint32 mn_lasttime;
 std::string ms_code;
 std::unordered_map<uint8, uint8> mc_codemap;
};
Uid62::Uid62(uint32 pn_basetime)
 : mn_basetime(pn_basetime)
 , mn_calc(0)
 , mn_lasttime(0)
 , ms_code("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
{
 for (uint8 i = 0; i < ms_code.size(); ++i)
 {
  mc_codemap[ms_code[i]] = i;
 }
}
std::string Uid62::new_uid(uint32 pn_serverid, uint32 pn_actorid)
{
 uint32 zn_now = (uint32)time(NULL);
 if ( (pn_serverid > 999999999U) || (pn_actorid > 999999999U) || (mn_basetime > zn_now) )
 {
  return std::string();
 }
 if (zn_now == mn_lasttime)
 {
  // 同1s内, 累加
  mn_calc++;
 }
 else
 {
  mn_lasttime = zn_now;
  mn_calc = 1;
 }
 std::stringstream ss;
 convert_to_str(ss, mn_calc, false);
 convert_to_str(ss, pn_actorid, true);
 convert_to_str(ss, pn_serverid, true);
 convert_to_str(ss, zn_now - mn_basetime, true); 
 return std::move(ss.str());
}

std::string Uid62::split_uid(const std::string & ps_uid)
{
 for (uint8 i = 0; i < ps_uid.size(); ++i)
 {
  if (mc_codemap.find(ps_uid[i]) == mc_codemap.end())
  {
   return "error='invalid uid:" + ps_uid + "'";
  }
 }
 //time
 uint8 len_char_index = (uint8)ps_uid.size() - 1;
 char len_char = ps_uid[len_char_index];
 uint8 len = mc_codemap[len_char];
 if (len > len_char_index)
 {
  //长度不匹配
  return "error='invalid time len char : " + ps_uid + "'";
 }
 uint8 start_index = len_char_index - len;
 std::string time_str = ps_uid.substr(start_index, len + 1);
 if (start_index < 2)
 {
  //长度不匹配
  return "error='invalid formation no serverid : " + ps_uid + "'";
 }

 //serverid
 len_char_index = start_index - 1;
 len_char = ps_uid[len_char_index];
 len = mc_codemap[len_char];
 if (len > len_char_index)
 {
  //长度不匹配
  return "error='invalid serverid len char : " + ps_uid + "'";
 }
 start_index = len_char_index - len;
 std::string serverid_str = ps_uid.substr(start_index, len + 1);
 if (start_index < 2)
 {
  //长度不匹配
  return "error='invalid formation no actorid : " + ps_uid + "'";
 }

 //actorid
 len_char_index = start_index - 1;
 len_char = ps_uid[len_char_index];
 len = mc_codemap[len_char];
 if (len > len_char_index)
 {
  //长度不匹配
  return "error='invalid actorid len char : " + ps_uid + "'";
 }
 start_index = len_char_index - len;
 std::string actorid_str = ps_uid.substr(start_index, len + 1);
 if ((start_index < 1) || (start_index >= ms_code.size()))
 {
  //长度不匹配
  return "error='invalid formation no calc : " + ps_uid + "'";
 }

 //calc
 len = start_index;
 std::string calc_str = ps_uid.substr(0, len);
 calc_str.append(1, ms_code[len]);
 uint32 zn_serverid = 0;
 uint32 zn_actorid = 0;
 uint32 zn_calc = 0;
 uint32 zn_time = 0;

 if (!convert_to_uint32(zn_serverid, serverid_str))
 {
  return "error='convert serverid failed : " + serverid_str + "'";
 }
 if (!convert_to_uint32(zn_actorid, actorid_str))
 {
  return "error='convert actorid failed : " + actorid_str + "'";
 }
 if (!convert_to_uint32(zn_calc, calc_str))
 {
  return "error='convert calc failed : " + calc_str + "'";
 }
 if (!convert_to_uint32(zn_time, time_str))
 {
  return "error='convert time failed : " + time_str + "'";
 }
 std::stringstream ss;
 ss << "serverid=" << zn_serverid << ",actorid=" << zn_actorid
  << ",tps=" << zn_calc << ",timespan=" << zn_time;
 return ss.str();
}

void Uid62::convert_to_str(std::stringstream & po_ss, uint32 pn_val, bool pb_append_len)
{
 uint32 len = 0;
 while (pn_val >= ms_code.length())
 {
  po_ss << ms_code[pn_val % ms_code.length()];
  pn_val /= (uint32)ms_code.length();
  ++len;
 }
 if (pn_val >= 0)
 {
  //0 ~ 61
  po_ss << ms_code[pn_val];
  ++len;  
 }
 if (pb_append_len)
 { 
  po_ss << ms_code[len];
 }
}

bool Uid62::convert_to_uint32(uint32 & pn_result, const std::string & ps_str)
{
 if (ps_str.size() < 2)
  return false; //非法结构

 char len_char = ps_str[ps_str.size() - 1];
 if (mc_codemap.find(len_char) == mc_codemap.end())
  return false; //非法长度

 uint8 len = mc_codemap[len_char];
 if (len > (ps_str.size() - 1) )
  return false; //长度不匹配

 char c = 0;
 uint32 zn_result = 0;
 for (uint8 i = 0; i < len; ++i)
 {
  c = ps_str[ps_str.size() - 1 - (len - i)];
  if (mc_codemap.find(c) == mc_codemap.end())
   return false; //非法字符
   
  uint32 val = mc_codemap[c];
  zn_result += val * (uint32)std::pow(ms_code.length(), i);
 }
 pn_result = zn_result;
 return true;
}

32位唯一ID(一年(388天)内的每秒内有128个唯一值)

将32位的整型划分为两部分,存储结构如下:
00000........000 0000000
|___________| |_____|
| |
25bit 7bit

第一部分 长度为25位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1年(388天)。
第二部分 长度为7位, 自增ID,取值范围[0,127]。

代码实现:

class Uid32YearPS128
{
public:
     Uid32YearPS128();
     uint32 generate();
private:
     uint32 mn_1st_from_time;
     uint32 mn_2nd_index;
     uint32 mn_limit_counter;
 uint32 mn_reset_time;
};
Uid32YearPS128::Uid32YearPS128()
 : mn_1st_from_time(std::time(0)-1) //从上一秒开始
 , mn_2nd_index(0)
 , mn_limit_counter(0)
 , mn_reset_time((uint32)std::time(0)){}
uint32 Uid32YearPS128::generate()
{
 uint32 now = (uint32)std::time(0);
 if (mn_limit_counter > 127)
 {
  if (mn_reset_time == now)
  {
   return 0; //同一秒内超过数量上限, 生成失败
  }  
  mn_reset_time = now;
  mn_limit_counter = 0;
 }
 ++mn_limit_counter;
 uint32 zn_1st = now - mn_1st_from_time;
 // 长度为22位数值大小: 2^25 - 1 = 33554431, 33523200=388天
 if (zn_1st > 33523200)
 {
  //超过周期大小,回归
  mn_1st_from_time = now - 1;
  zn_1st = now - mn_1st_from_time;
 }
 ++mn_2nd_index;
 if (mn_2nd_index > 127)
 {
  mn_2nd_index = 0;
 }
 return ((zn_1st << 7) | mn_2nd_index);
}

32位唯一ID(一个月(48天)内的每秒内有1024个唯一值)

将32位的整型划分为两部分,存储结构如下:

00000........000 0000000000
|___________| |________|
| |
22bit 10bit

第一部分 长度为22位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1个月(48天)。
第二部分 长度为10位, 自增ID,取值范围[0,1023]。

32位唯一ID(一周(12天)内的每秒内有4096个唯一值)

将32位的整型划分为两部分,存储结构如下:

00000........000 000000000000
|___________| |__________|
| |
20bit 12bit

第一部分 长度为20位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1周(12天)。
第二部分 长度为20位, 自增ID,取值范围[0,4095]。

分享到 :
0 人收藏
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

积分:7942463
帖子:1588486
精华:0
期权论坛 期权论坛
发布
内容

下载期权论坛手机APP