技术控

    今日:118| 主题:49225
收藏本版 (1)
最新软件应用技术尽在掌握

[其他] Thread-Safe Lock Free Priority Queues in Golang

[复制链接]
国产姑娘不矫情 发表于 2016-10-1 00:45:10
412 15

立即注册CoLaBug.com会员,免费获得投稿人的专业资料,享用更多功能,玩转个人品牌!

您需要 登录 才可以下载或查看,没有帐号?立即注册

x

Thread-Safe Lock Free Priority Queues in Golang-1 (anything,database,hundreds,windows,writing)

  True Story Follows

  So I’m on the white board with Dmitriy “Kermit” Bryndin, and we’re writing all over the windows like it’s a scene out of A Beautiful Mind. The application we work on is one in which we have hundreds of asynchronous workers that process events in the Uber ecosystem, and the output of our application may or may not have a near real-time requirement (maybe like 10 seconds). Another large chunk of the output has no real-time requirement (and anything within a few hours is acceptable). In order to satisfy both, we must ensure that we have enough workers to keep up with incoming events at all times. This can become fairly inefficient though since traffic patterns vary wildly by time of day and day of the week, so the result is that our system is overprovisioned during non-peak hours. Moreover, at the time of this erudite discussion, we had not yet eruditely mastered our database query patterns, and so there was a point at which adding too many workers would exceed the max connection limits to the database (I just want to make clear though, this problem has since been solved).

Thread-Safe Lock Free Priority Queues in Golang-2 (anything,database,hundreds,windows,writing)
    Kermit

  Hence, some of the issues above can be mitigated by processing incoming tasks in different queues if you can figure out which were high priority and which were low priority.
  My above statement though is a conclusion that was reached based off of a number of constraints. We are using celery, and so the standard queueing mechanism is just a standard first in, first out queue. The general strategy for separating tasks into important and not as important involves creating multiple queues and then provisioning a number of workers dedicated to respective queues in proportion to how high of priority that queue is. So for example, we might have 100 low priority workers that consume from “high” and “low” and then an extra 25 workers dedicated to consuming from “high”. Now the “high” queue is permanently overprovisioned and if we fall behind on incoming events, the dedicated high priority workers will need to become completely saturated in order for high priority items to fall behind.
  The other key word here is “mitigate”. The setup of multiple queues is beneficial only to the extent that they help mitigate disasters, but it certainly doesn’t help with resource utilization as long as the low priority queues are at least intended to be processed within a reasonable timeframe. Ideally, for some input task, I could use some heuristic to be able to say “ideally this task gets processed before X timestamp”. If I have workers that could then process events in the order of their priority, now it doesn’t actually matter if we’re not keeping up with real time events as long as the current timestamp is less than the “process by” timestamp of an input task. Now the system could store tasks in a queue that are just sitting and waiting to be executed by a smaller pool of worker machines. Only when the “process by timestamps” start to exceed the current timestamp do we need to start provisioning more workers, but as an added benefit, you would be able to anticipate when this threshold is about to be reached. With only one (priority) queue, there’s also no complicated management of worker allocation to different queues and a probable over provisioning of every single queue.
  Now for celery, we could of course use a redis sorted set to create a priority queue, which uses skip lists under the hood, but that’s totally lame. Only losers write priority queues that have logarithmic enqueue and constant time dequeues. Winners write constant time enqueues and logarithmic dequeues. So we should write our own.
  The Use Case

   The use case for the resultant code I’ll elaborate on below stems from the use case above. Asynchronous workers executing tasks is a concrete use case, but any priority queue that involves a heuristic to assign a priority to an item that later gets processed is valid. Clarifying that a heuristic correlates with the use case is an important distinction: in essence, I can create a datastructure that is fast with no locks and scales to the number of cores in the processor at the expense of not being able to guarantee that a dequeued item is the absolute highest priority item. For use cases that fit, this should not matter in practice.
  If I care about a lock-free datastructure, that must mean that I have multiple workers. If I’m not able to get the absolute highest priority item from a dequeue operation, this is effectively a result of contention, meaning that dequeues (which are writes to the datastructure) are saturated. Therefore, if the highest priority item isn’t returned to the caller, it just means that another caller will get the highest priority item shortly after the contention is resolved.
  So while I might not get a result that is pure and correct from a dequeue operation, as an individual caller I would never know that and I would never care because in the context of the larger system, everything is healthy.
  Moreover, if the queue is indeed scored by a heuristic, then we already know that a heuristic cannot be 100% accurate. So an out of order result could be reasoned about as added variance to whatever scoring mechanism, and the performance benefits of allowing this leeway clearly outweigh the negligible cost of an absence of purity (I say clearly, but if your use case requires dequeues in perfect order, then your use case doesn’t fit anyway).
  The Code

   All of the code referenced in this discussion can be found On Github . Also, I was at Github’s headquarters last night, so let’s just throw in a picture of that as well:
友荐云推荐




上一篇:The Ultimate Guide to What’s New in VMware End-User Computing
下一篇:What to do when hackers break into your cloud
酷辣虫提示酷辣虫禁止发表任何与中华人民共和国法律有抵触的内容!所有内容由用户发布,并不代表酷辣虫的观点,酷辣虫无法对用户发布内容真实性提供任何的保证,请自行验证并承担风险与后果。如您有版权、违规等问题,请通过"联系我们"或"违规举报"告知我们处理。

つ命犯小桃花 发表于 2016-10-1 02:27:14
一个人快活,两个人生活,三个人就是你死我活。
回复 支持 反对

使用道具 举报

602231022 发表于 2016-10-1 03:41:18
做为一只禽兽,我深感压力很大…
回复 支持 反对

使用道具 举报

橼§纷 发表于 2016-10-1 05:16:02
1v1飘过
回复 支持 反对

使用道具 举报

qhcvp 发表于 2016-10-1 08:42:34
敢进男厕所的女汉子别隐身
回复 支持 反对

使用道具 举报

好好好 发表于 2016-10-1 11:46:17
为失联儿童祈福!2016-10-01
回复 支持 反对

使用道具 举报

ghkty 发表于 2016-10-1 23:44:08
我就算是一只癞蛤蟆,我也决不娶母癞蛤蟆.
回复 支持 反对

使用道具 举报

路过看看 发表于 2016-10-2 00:29:13
我觉得挺好的,大家的看法呢
回复 支持 反对

使用道具 举报

1990212 发表于 2016-10-2 03:21:41
为楼主点个赞!
回复 支持 反对

使用道具 举报

royalrl 发表于 2016-10-2 03:45:35
美女未抱身先走,常使色狼泪满襟。。。。。。  
回复 支持 反对

使用道具 举报

*滑动验证:
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

我要投稿

推荐阅读

扫码访问 @iTTTTT瑞翔 的微博
回页顶回复上一篇下一篇回列表手机版
手机版/CoLaBug.com ( 粤ICP备05003221号 | 文网文[2010]257号 )|网站地图 酷辣虫

© 2001-2016 Comsenz Inc. Design: Dean. DiscuzFans.

返回顶部 返回列表