Commit | Line | Data |
---|---|---|
374ca955 A |
1 | Unicode 4.0.1 update |
2 | ||
3 | *** related Jitterbugs | |
4 | ||
5 | 3170 RFE: Update to Unicode 4.0.1 | |
6 | 3171 Add new Unicode 4.0.1 properties | |
7 | 3520 use Unicode 4.0.1 updates for break iteration | |
8 | ||
9 | *** data files & enums & parser code | |
10 | ||
11 | * file preparation | |
12 | - ucdstrip: DerivedNormalizationProps.txt, NormalizationTest.txt, DerivedCoreProperties.txt | |
13 | - ucdstrip and ucdmerge: EastAsianWidth.txt, LineBreak.txt | |
14 | ||
15 | * file fixes | |
16 | - fix UnicodeData.txt general categories of Ethiopic digits Nd->No | |
17 | according to PRI #26 | |
18 | http://www.unicode.org/review/resolved-pri.html#pri26 | |
19 | - reverted again because no corrigendum is in sight; | |
20 | instead modified the tests to skip this consistency check for Unicode 4.0.1 | |
21 | ||
22 | * ucdterms.txt | |
23 | - update from http://www.unicode.org/copyright.html | |
24 | formatted for plain text | |
25 | ||
26 | * uchar.h & uprops.h & uprops.c & genprops | |
27 | - add UBLOCK_CYRILLIC_SUPPLEMENT because the block is renamed | |
28 | - add U_LB_INSEPARABLE due to a spelling fix | |
29 | + put short name comment only on line with new constant | |
30 | for genpname perl script parser | |
31 | - new binary properties | |
32 | + STerm | |
33 | + Variation_Selector | |
34 | ||
35 | * genpname | |
36 | - fix genpname perl script so that it doesn't choke on more than 2 names per property value | |
37 | - perl script: correctly calculate the maximum number of fields per row | |
38 | ||
39 | * uscript.h | |
40 | - new script code Hrkt=Katakana_Or_Hiragana | |
41 | ||
42 | * gennorm.c track changes in DerivedNormalizationProps.txt | |
43 | - "FNC" -> "FC_NFKC" | |
44 | - single field "NFD_NO" -> two fields "NFD_QC; N" etc. | |
45 | ||
46 | * genprops/props2.c track changes in DerivedNumericValues.txt | |
47 | - changed from 3 columns to 2, dropping the numeric type | |
48 | + assume that the type is always numeric for Han characters, | |
49 | and that only those are added in addition to what UnicodeData.txt lists | |
50 | ||
51 | *** Unicode version numbers | |
52 | - makedata.mak | |
53 | - uchar.h | |
54 | - configure.in | |
55 | ||
56 | *** tests | |
57 | - update test of default bidi classes according to PRI #28 | |
58 | /tsutil/cucdtst/TestUnicodeData | |
59 | http://www.unicode.org/review/resolved-pri.html#pri28 | |
60 | - bidi tests: change exemplar character for ES depending on Unicode version | |
61 | - change hardcoded expected property values where they change | |
62 | ||
63 | *** other code | |
64 | ||
65 | * name matching | |
66 | - read UCD.html | |
67 | ||
68 | * scripts | |
69 | - use new Hrkt=Katakana_Or_Hiragana | |
70 | ||
71 | * ZWJ & ZWNJ | |
72 | - are now part of combining character sequences | |
73 | - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ |
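
The gennorm.c item above notes that DerivedNormalizationProps.txt moved from a single field such as "NFD_NO" to two fields such as "NFD_QC; N". The following is a minimal sketch of a parser that accepts both layouts. It is not the gennorm.c code: the names `QuickCheckEntry`, `trimField` and `parseQuickCheck` are hypothetical, and the mapping of the old suffixes `_NO` and `_MAYBE` to the values `N` and `M` is an assumption based on the "etc." in the note.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for illustration; not taken from gennorm.c. */
typedef struct {
    char property[16]; /* normalized to the new-style name, e.g. "NFD_QC" */
    char value;        /* 'N' (no) or 'M' (maybe) */
} QuickCheckEntry;

/* Trim leading and trailing white space in place; returns the new start. */
static char *trimField(char *s) {
    char *end;
    while (*s != 0 && isspace((unsigned char)*s)) {
        ++s;
    }
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1])) {
        --end;
    }
    *end = 0;
    return s;
}

/*
 * Parse one data line in either layout:
 *   old: "00C0..00C5 ; NFD_NO"
 *   new: "00C0..00C5 ; NFD_QC; N # comment"
 * Returns 1 and fills *out for a quick-check line, 0 for anything else.
 * The code point range in the first field is skipped, not parsed.
 */
int parseQuickCheck(const char *line, QuickCheckEntry *out) {
    char buf[256];
    char *hash, *name, *value, *suffix;
    size_t len;
    char v;

    if (strlen(line) >= sizeof(buf)) {
        return 0;
    }
    strcpy(buf, line);
    hash = strchr(buf, '#');           /* strip the trailing comment */
    if (hash != NULL) {
        *hash = 0;
    }
    name = strchr(buf, ';');           /* end of the code point range */
    if (name == NULL) {
        return 0;
    }
    *name++ = 0;
    value = strchr(name, ';');
    if (value != NULL) {
        /* new layout: property name and value are separate fields */
        *value++ = 0;
        name = trimField(name);
        value = trimField(value);
        len = strlen(name);
        if (len < 4 || len >= sizeof(out->property) ||
            strcmp(name + len - 3, "_QC") != 0) {
            return 0;
        }
        if (strlen(value) != 1 || (value[0] != 'N' && value[0] != 'M')) {
            return 0;
        }
        strcpy(out->property, name);
        out->value = value[0];
        return 1;
    }
    /* old layout: the value is encoded in the name suffix (assumed mapping) */
    name = trimField(name);
    suffix = strrchr(name, '_');
    if (suffix == NULL || suffix == name) {
        return 0;
    }
    if (strcmp(suffix, "_NO") == 0) {
        v = 'N';
    } else if (strcmp(suffix, "_MAYBE") == 0) {
        v = 'M';
    } else {
        return 0;
    }
    *suffix = 0;
    if (strlen(name) + 3 >= sizeof(out->property)) {
        return 0;
    }
    strcpy(out->property, name);
    strcat(out->property, "_QC");
    out->value = v;
    return 1;
}
```

A caller would pass each line of the file to `parseQuickCheck` and ignore lines for which it returns 0, such as the "FC_NFKC" mapping lines, which need separate handling.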