Zooming-in on Group Replication performance

A previous blog post exposed the main factors affecting Group Replication performance, which was followed by another that showed the scalability of both single-master and multi-master throughput. In this post we return with more “inside information” that may be useful for optimizing the performance of Group Replication deployments.
   1. End-to-end throughput in Group Replication

  Group Replication’s basic guarantee is that a transaction will commit only after the majority of the members in a group have received it and agreed on the relative order between all transactions that were sent concurrently. After that, it uses a best-effort approach that intrinsically makes no assumptions regarding when transactions will be applied or even touch storage on the group members.
  The above approach works well if the total number of writes to the group does not exceed the write capacity of any node in the group. If it does, and some of the members have less write throughput than others, particularly than the writer members, those members will start lagging behind the writers and the lag will grow over time.
   1.1. Proper infrastructure sizing

   It probably goes without saying, but we'll stress it anyway: in a distributed system such as GR, having proper sizing of the components is very important, and the throughput of a single component is not representative of the output of the replication topology as a whole. In fact, sometimes improving the throughput of some servers can even be counter-productive: if the remaining servers cannot keep up with the workload, the bottlenecks just move away from the client-facing server to the rest of the infrastructure, possibly at a higher cost.
   With that said, to avoid slave lag – well, in GR all members are slaves of the other members' workload – one must consider the throughput demanded from each component and the fact that slave lag is composed of two parts: the time it takes for a transaction to cross the replication pipeline and the time it spends waiting, somewhere in the pipeline, for previous transactions to be applied.
  So the main goal when sizing such a system is ensuring that the slave lag is not significantly larger than what is strictly necessary for the transactions to cross the replication pipeline and, particularly, that it does not grow over time.
   1.2. Peak and sustained throughput

  When the client-facing members in the replication system do, in fact, experience a workload that exceeds what the remaining members in the network topology can handle, there are two choices:
  
       
  • the user lets the servers accept the extra workload, because it is known that it will eventually return to manageable levels before exhausting the buffering resources, or
       
  • the user enables the flow-control mechanism so that, if needed, the servers delay the clients to keep the workload within bounds that are acceptable for all servers in the group.
      
  A big drawback of letting the queues grow is that stale data is served by the members that still have pending updates to install. The system also becomes less efficient because of the buffering, and in multi-master scenarios where conflicting workloads are sent to multiple members, having a queue with many transactions waiting to be applied increases the likelihood of transactions being rejected. It is nevertheless well supported, just like in asynchronous replication. In the end, the advantage of one option or the other depends on how stable or bursty the workloads are, on how long the burst periods last, and on what the fault-tolerance requirements are.
   To reflect this duality, we focus on both the peak throughput – the throughput of the client-facing server unimpaired by slower members having to keep up with it – and the sustained throughput – the rate the whole group can withstand without having members lag behind. Using both metrics allows a better view of the system capabilities and of the effect of optimizations targeted at improving one or the other.
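
   For reference, the flow-control behaviour discussed above is exposed through a few Group Replication system variables. The sketch below assumes MySQL 5.7 variable names; the threshold values are illustrative, not recommendations:

    -- Quota-based flow control: writers are throttled when the certifier or
    -- applier queue on any member grows beyond the configured thresholds.
    SET GLOBAL group_replication_flow_control_mode = 'QUOTA';
    SET GLOBAL group_replication_flow_control_certifier_threshold = 25000;
    SET GLOBAL group_replication_flow_control_applier_threshold = 25000;
    -- Alternatively, let the queues grow and rely on the workload bursts subsiding:
    -- SET GLOBAL group_replication_flow_control_mode = 'DISABLED';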
   2. Multi-master implications

   2.1. Dealing with conflicting workloads

  Group Replication introduces the ability to accept updates on any member in the group. Incompatible changes are rolled back when the certification process detects an overlap by checking which rows are changed by each transaction; the remaining updates are committed on the local member or sent to the relay log to be applied on the remote members.
  Updating the same rows very often from multiple servers will bring many roll-backs with it. In fact, if several servers update the same row consecutively, one server will succeed first and the other members will get that transaction in their relay logs. While the transaction sits in the relay log of some members, waiting to be applied after being checked for conflicts, any transaction that uses those rows on those members will eventually be rolled back. The first server is not bound by the same problem, so it may continue to get its transactions through the conflict check, and the other members will continue to receive transactions to store in their relay logs. So issuing the same transaction repeatedly may not be advantageous or enough to get transactions through, especially if the first member changes that hot data so frequently that the others have no time to update their copy of it, effectively widening the conflict window for them.
   So, let's make a rule: to prevent conflicts in Group Replication, the next update to a given row can't be done before the slowest member applies the current update. Otherwise, the next update should be done on the same member on which the current one was done.
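
   As an illustration of that rule, consider two members updating the same row while the first change is still sitting in the other member's relay log. The table and values below are hypothetical and the exact error text depends on the MySQL version, but the outcome is roughly the following:

    -- On member A (certified first, wins):
    UPDATE accounts SET balance = balance - 10 WHERE id = 1;
    COMMIT;  -- succeeds; the change is queued in the other members' relay logs

    -- On member B, before the change above has been applied there:
    UPDATE accounts SET balance = balance - 20 WHERE id = 1;
    COMMIT;  -- fails certification and is rolled back, e.g.:
    -- ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.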
   2.2. Multi-master write scalability

  In some circumstances, sending transactions to multiple members of a Group Replication group may provide higher throughput than sending them all to a single server, which provides a small degree of write scalability. While this depends on how conflicting the workloads are, the improvement comes mostly from the following:
  
       
  • Using more members provides higher read capacity, which helps when transactions both read and write to the database;
       
  • Applying transactions in row-based replication consumes fewer resources than executing the SQL code directly.
      
  The benefit may be smaller when the number of transaction aborts is high, as explained in 2.1. Also, the write scalability is more significant on peak throughput; the gains in sustained throughput are less pronounced.
   
Illustration 1: Peak and sustained throughput for single-master and multi-master configurations, for groups of 2 to 9 members

   The chart above (repeated from here) presents precisely this conundrum. The dashed lines represent peak throughput, both from single-master and from multi-master configurations, going from groups of 2 to 9 member servers. The value presented is the highest throughput reached (still, the median of runs) in any client-count configuration, so in one case the best configuration may be with 100 threads and in another with 70 threads. It simultaneously shows two interesting things:
  
       
  • writing to more than one server is beneficial in the tested configuration;
       
  • the peak throughput grows with the group size in multi-master configurations, but the same does not hold for the sustained throughput.
      
  In this case the workloads use different databases, but that is not necessary as long as the number of transaction aborts is not high. So, while there is some potential for write scalability with multi-master, evaluate how effective it is considering the expected number of transaction aborts and the rule in 2.1.
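
   One way to gauge how conflicting a multi-master workload actually is, and how far the remote-transaction queue has grown, is to look at the Group Replication statistics in performance_schema. A small sketch, assuming the MySQL 5.7 table and column names:

    -- Certification conflicts detected and transactions still queued for this member:
    SELECT MEMBER_ID,
           COUNT_CONFLICTS_DETECTED,
           COUNT_TRANSACTIONS_IN_QUEUE
      FROM performance_schema.replication_group_member_stats;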
   3. Group Replication slave applier implications

  Remote transactions are sent to the relay log of each node once a transaction becomes positively certified, and are then applied using the infrastructure from asynchronous and semi-synchronous replication. This implies that the slave applier is instrumental in effectively converting the peak into sustained throughput, which mostly means using the MTS applier with enough threads to handle the received workload.
   3.1. Writeset-based transaction dependency tracking

  With asynchronous replication the LOGICAL_CLOCK scheduler uses the binary log group commit mechanism to figure out which transactions can be executed in parallel on the slave. Large group commits can lead to high concurrency on the slave, while smaller groups make it much less effective. Unfortunately, both very fast storage and a reduced number of clients make commit groups smaller – hence less effective. The storage-speed implication can be circumvented with the option binlog-group-commit-sync-delay, but there is no way to circumvent low client counts, because a group commit cannot include two transactions from the same client. This is an issue mainly in multi-level replication topologies, as we typically have fewer applier threads than user threads, and that means that less parallelism reaches the lower-level slaves.
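
   As a sketch of that workaround (the values are illustrative; the delay adds latency to every commit on the server where it is set):

    -- Wait up to 10 ms before syncing the binary log so that more transactions
    -- join the same group commit, giving LOGICAL_CLOCK more parallelism to expose.
    SET GLOBAL binlog_group_commit_sync_delay = 10000;        -- microseconds
    SET GLOBAL binlog_group_commit_sync_no_delay_count = 20;  -- stop waiting once 20 transactions are queued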
  Group Replication takes a very different approach for marking which transactions can be applied in parallel in the slave, one that does not depend on the group commit mechanism. It takes into account the writesets used to certify transactions in each member and builds dependencies according to the order of certification and the database rows the transactions change. If two transactions share rows in their writesets, they cannot run in parallel and the last one to be certified becomes dependent on the previous one to commit.
   Writeset-based dependency tracking allows a significantly higher degree of parallelism, particularly on low-concurrency workloads. In fact, it becomes possible to apply transactions with more threads than were used to execute them on the master, even on single-threaded workloads. In situations where a member would start lagging behind, the multi-threaded applier is now able to compensate with higher parallelism once the workload starts to accumulate. This “secret weapon” allows the multi-threaded applier to achieve high throughput even when the asynchronous replication applier would not be very effective, making slave lag much less of an issue. Note, however, that if the number of clients becomes large enough, the performance becomes close to what is already possible with MTS in asynchronous replication.
  Writeset-based dependency tracking can also bring reordering of transactions from the same session, which means that two transactions from the same session could appear to have been committed on the slaves in the reverse order. At the cost of a small performance penalty, we opted to make the use of the slave-preserve-commit-order option mandatory, to keep the session history, and in fact the full master history, consistent on the slaves.
  The chart that follows shows the potential of the writeset-based approach.


  Illustration 2: Comparison of master and slave throughput on Sysbench RW, using both commit order and writeset dependency tracking
  Main observations:
  
       
  • with commit-order the slave throughput increases as the number of server threads increases, but at low client counts the slaves cannot keep up with the master;
       
  • with writeset-based dependencies the applier throughput is almost constant as the number of server threads increases and it is always above what is achieved when using commit-order-based dependencies.
      
   3.2. Applier threads' effect on peak and sustained throughput

  As mentioned before, the sustained throughput depends on all components of the system, and very particularly on the number and effectiveness of the slave applier threads. To provide some insight into its consequences, the following chart shows the peak and sustained throughput when using combinations of writer members and MTS applier threads (the system configuration is the one described here).
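
   To make that concrete, these are the applier-side options typically involved. A minimal sketch assuming MySQL 5.7 option names; the worker count must be tuned per workload, and the variables are normally set in my.cnf or with the replication applier stopped:

    -- Schedule the applier using the parallelism information in the binary log
    -- and spread the work over several worker threads.
    SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
    SET GLOBAL slave_parallel_workers = 8;
    -- Keep the commit order seen by each session intact on the slaves.
    SET GLOBAL slave_preserve_commit_order = ON;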