Grant Report : Robust Perl 6 Unicode Support – June 2017

Samantha McVey has made progress on her grant to improve the robustness of Unicode support in Rakudo. She is working in the following repos:

Here are a few highlights from her complete blog post:

The script tests the contents of each grapheme individually, using the GraphemeClusterBreak.txt file from the Unicode 9.0 test suite.

Previously we only checked the total number of ‘.chars’ for the string as a whole. We want something more precise than that, since the test specifies the location of each break between codepoints. The new code checks that codepoints are put in the correct graphemes in the proper order. We also check the string length.

This new test uses a grammar to parse the file and is generally much more robust than the previous script.
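To make the difference between the old and new checks concrete, here is a minimal Python sketch. It is not the Rakudo test script, which uses a Perl 6 grammar; it only assumes the break-test line format published by Unicode, where ÷ marks a grapheme boundary and × marks a position with no boundary. The names parse_test_line and check_segmentation are hypothetical, chosen for illustration.

```python
# Sketch only: parses one line in the Unicode break-test format and compares
# a segmentation grapheme by grapheme. Helper names are hypothetical.

BREAK = "\u00F7"     # ÷ marks a grapheme boundary
NO_BREAK = "\u00D7"  # × marks "no boundary here"


def parse_test_line(line):
    """Return the expected graphemes as a list of codepoint lists."""
    tokens = line.split("#", 1)[0].split()  # drop the trailing comment
    graphemes, current = [], []
    for token in tokens:
        if token == BREAK:
            if current:                      # close the grapheme in progress
                graphemes.append(current)
                current = []
        elif token != NO_BREAK:
            current.append(int(token, 16))   # hex codepoint
    if current:
        graphemes.append(current)
    return graphemes


def check_segmentation(expected, actual):
    """Compare per-grapheme contents first, then the overall length."""
    problems = []
    for i, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            problems.append(f"grapheme {i}: expected {want}, got {got}")
    if len(expected) != len(actual):
        problems.append(f"length: expected {len(expected)}, got {len(actual)}")
    return problems


# SPACE + COMBINING DIAERESIS form one grapheme, the second SPACE another.
line = f"{BREAK} 0020 {NO_BREAK} 0308 {BREAK} 0020 {BREAK} # example line"
expected = parse_test_line(line)

# Two graphemes, but with the boundary in the wrong place: a count-only
# check would accept this, the per-grapheme check does not.
same_count = [[0x20], [0x308, 0x20]]
for problem in check_segmentation(expected, same_count):
    print(problem)
```

The sample segmentation has the right number of graphemes but the wrong contents, which is exactly the kind of failure a total ‘.chars’ comparison cannot see.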

  • I have some currently unmerged tests that are waiting to be merged, although sections of them are complete and are being incorporated into the larger Unicode Database Retrofit, which reuses this code.

  • I have written grammars and modules to process and provide data on the PropertyValueAliases and PropertyAliases. They will be used for testing that all of the canonical property names and all the property values themselves properly resolve to separate property codes, and that they are usable in regexes.

  • As part of my grant work I am working on making Unicode property values distinct per property, and also on allowing all canonical Unicode property values to work.

  • I’ve also started adding some documentation to my Unicode-Grant wiki with information about what is contained in each of the Unicode data files; there are a few other pages as well. This wiki is planned to be expanded to have many more sections than it does currently.
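
As an illustration of why property values need to be distinct per property, the following Python sketch loads a few lines in the PropertyValueAliases format and resolves a value within the namespace of its property. It is an assumption-laden sketch, not the grant's Perl 6 grammars or modules: the helper names are hypothetical, the loose matching is a simplified version of the Unicode rules, and it treats the third field as the long value name (which does not hold for every property, for example ccc).

```python
# Sketch only: hypothetical helper names, simplified file handling.

# A small excerpt in the PropertyValueAliases format:
# property ; short value ; long value
SAMPLE = """\
bc ; L    ; Left_To_Right
gc ; L    ; Letter          # Ll | Lm | Lo | Lt | Lu
gc ; Lu   ; Uppercase_Letter
sc ; Latn ; Latin
"""


def loose(name):
    """Simplified loose matching: ignore case, '_', '-' and spaces."""
    return "".join(ch for ch in name.lower() if ch not in "_- ")


def load_value_aliases(text):
    """Map (property, alias) -> long value name, one namespace per property."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 3:
            continue
        prop, long_name = fields[0], fields[2]
        for alias in fields[1:]:             # every alias resolves to the long name
            table[(loose(prop), loose(alias))] = long_name
    return table


def resolve(table, prop, value):
    """Return the long value name, or None if the pair is unknown."""
    return table.get((loose(prop), loose(value)))


table = load_value_aliases(SAMPLE)
# The same short value "L" means different things under gc and bc.
print(resolve(table, "gc", "L"), resolve(table, "bc", "L"))
```

Keying the table on the (property, value) pair keeps the value "L" of one property from shadowing the "L" of another, which a single flat table of value names could not do.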


Source: Perlsphere
