Commit | Line | Data |
---|---|---|
46f4442e | 1 | * Copyright (C) 2004-2008, International Business Machines |
73c04bcf A | 2 | * Corporation and others. All Rights Reserved. |
3 | * | |
4 | * file name: changes.txt | |
5 | * encoding: US-ASCII | |
6 | * tab size: 8 (not used) | |
7 | * indentation:4 | |
8 | * | |
9 | * created on: 2004may06 | |
10 | * created by: Markus W. Scherer | |
11 | * | |
12 | * change log for Unicode updates | |
13 | ||
14 | ---------------------------------------------------------------------------- *** | |
15 | ||
46f4442e A | 16 | Unicode 5.1 update |
17 | ||
18 | *** related ICU Trac tickets | |
19 | ||
20 | 5696 Update to Unicode 5.1 | |
21 | ||
22 | *** Unicode version numbers | |
23 | - makedata.mak | |
24 | - uchar.h | |
25 | - configure.in & configure | |
26 | - update ucdVersion in gennames.c if an algorithmic range changes | |
27 | ||
28 | *** data files & enums & parser code | |
29 | ||
30 | * file preparation | |
31 | - ucdstrip: | |
32 | DerivedCoreProperties.txt | |
33 | DerivedNormalizationProps.txt | |
34 | NormalizationTest.txt | |
35 | PropList.txt | |
36 | Scripts.txt | |
37 | GraphemeBreakProperty.txt | |
38 | SentenceBreakProperty.txt | |
39 | WordBreakProperty.txt | |
40 | - ucdstrip and ucdmerge: | |
41 | EastAsianWidth.txt | |
42 | LineBreak.txt | |
43 | ||
44 | * my ucd2unidata.bat (needs to be updated each time with UCD and file version numbers) | |
45 | copy 5.1.0\ucd\BidiMirroring.txt ..\unidata\ | |
46 | copy 5.1.0\ucd\Blocks.txt ..\unidata\ | |
47 | copy 5.1.0\ucd\CaseFolding.txt ..\unidata\ | |
48 | copy 5.1.0\ucd\DerivedAge.txt ..\unidata\ | |
49 | copy 5.1.0\ucd\extracted\DerivedBidiClass.txt ..\unidata\ | |
50 | copy 5.1.0\ucd\extracted\DerivedJoiningGroup.txt ..\unidata\ | |
51 | copy 5.1.0\ucd\extracted\DerivedJoiningType.txt ..\unidata\ | |
52 | copy 5.1.0\ucd\extracted\DerivedNumericValues.txt ..\unidata\ | |
53 | copy 5.1.0\ucd\NormalizationCorrections.txt ..\unidata\ | |
54 | copy 5.1.0\ucd\PropertyAliases.txt ..\unidata\ | |
55 | copy 5.1.0\ucd\PropertyValueAliases.txt ..\unidata\ | |
56 | copy 5.1.0\ucd\SpecialCasing.txt ..\unidata\ | |
57 | copy 5.1.0\ucd\UnicodeData.txt ..\unidata\ | |
58 | ||
59 | ucdstrip < 5.1.0\ucd\DerivedCoreProperties.txt > ..\unidata\DerivedCoreProperties.txt | |
60 | ucdstrip < 5.1.0\ucd\DerivedNormalizationProps.txt > ..\unidata\DerivedNormalizationProps.txt | |
61 | ucdstrip < 5.1.0\ucd\NormalizationTest.txt > ..\unidata\NormalizationTest.txt | |
62 | ucdstrip < 5.1.0\ucd\PropList.txt > ..\unidata\PropList.txt | |
63 | ucdstrip < 5.1.0\ucd\Scripts.txt > ..\unidata\Scripts.txt | |
64 | ucdstrip < 5.1.0\ucd\auxiliary\GraphemeBreakProperty.txt > ..\unidata\GraphemeBreakProperty.txt | |
65 | ucdstrip < 5.1.0\ucd\auxiliary\SentenceBreakProperty.txt > ..\unidata\SentenceBreakProperty.txt | |
66 | ucdstrip < 5.1.0\ucd\auxiliary\WordBreakProperty.txt > ..\unidata\WordBreakProperty.txt | |
67 | ucdstrip < 5.1.0\ucd\EastAsianWidth.txt | ucdmerge > ..\unidata\EastAsianWidth.txt | |
68 | ucdstrip < 5.1.0\ucd\LineBreak.txt | ucdmerge > ..\unidata\LineBreak.txt | |
69 | ||
70 | * genpname | |
71 | - run preparse.pl | |
72 | + cd \svn\icuproj\icu\uni51\source\tools\genpname | |
73 | + make sure that data.h is writable | |
74 | + perl preparse.pl \svn\icuproj\icu\uni51 > out.txt | |
75 | + preparse.pl complains with errors like the following: | |
76 | Error: sc:Cari already set to Carian, cannot set to Cari at preparse.pl line 1308, <GEN6> line 30. | |
77 | This is because ICU 3.8 already had scripts from ISO 15924 which are now | |
78 | added to Unicode 5.1, so preparse.pl reports a conflict between SyntheticPropertyValueAliases.txt | |
79 | and PropertyValueAliases.txt. | |
80 | -> Removed duplicate script entries from SyntheticPropertyValueAliases.txt: | |
81 | Cari, Cham, Kali, Lepc, Lyci, Lydi, Olck, Rjng, Saur, Sund, Vaii | |
82 | + PropertyValueAliases.txt now explicitly contains values for boolean properties: | |
83 | N/Y, No/Yes, F/T, False/True | |
84 | -> Added N/No and Y/Yes to preparse.pl function read_PropertyValueAliases. | |
85 | It will use further values from the file if present. | |
86 | ||
87 | * uchar.h & uscript.h & uprops.h & uprops.c & genprops | |
88 | - new block & script values | |
89 | + 17 new blocks | |
90 | + 11 new script values already added in ICU 3.8 for ISO 15924 coverage | |
91 | (removed from SyntheticPropertyValueAliases.txt) | |
92 | + 14 new script values added for ISO 15924 coverage (not in Unicode 5.1) | |
93 | (added to SyntheticPropertyValueAliases.txt) | |
94 | - uprops.icu (uprops.h) only provides 7 bits for script codes. | |
95 | ICU 4.0 now has USCRIPT_CODE_LIMIT=130 script codes. | |
96 | No script code above 127 is yet the script code of an | |
97 | assigned Unicode character, so ICU 4.0 uprops.icu does not store any | |
98 | script code values greater than 127. | |
99 | However, it does need to store the maximum script value=USCRIPT_CODE_LIMIT-1=129 | |
100 | in a parallel bit field, and that now overflows. | |
101 | Also, future values >=128 would be incompatible with the 7-bit field anyway. | |
102 | uprops.h is modified to move around several of the bit fields | |
103 | in the properties vector words, and now uses 8 bits for the script code. | |
104 | Two other bit fields also grow to accommodate future growth: | |
105 | Block (current count: 172) grows from 8 to 9 bits, | |
106 | and Word_Break grows from 4 to 5 bits. | |
107 | - renamed property Simple_Case_Folding (sfc->scf) | |
108 | + nothing to be done: handled as normal alias | |
109 | - new property JSN Jamo_Short_Name | |
110 | + no new API: only contributes to the Name property | |
111 | - new Grapheme_Cluster_Break (GCB) value: SM=SpacingMark | |
112 | - new Joining Group (JG) value: Burushashki_Yeh_Barree | |
113 | - new Sentence_Break (SB) values: | |
114 | SB ; CR ; CR | |
115 | SB ; EX ; Extend | |
116 | SB ; LF ; LF | |
117 | SB ; SC ; SContinue | |
118 | - new Word_Break (WB) values: | |
119 | WB ; CR ; CR | |
120 | WB ; Extend ; Extend | |
121 | WB ; LF ; LF | |
122 | WB ; MB ; MidNumLet | |
123 | ||
124 | * Further changes in the 2008-02-29 update: | |
125 | - Default_Ignorable_Code_Point: The new file removes Cc, Cs, noncharacters from DICP | |
126 | because they should not normally be invisible. | |
127 | - new Joining Group (JG) value Burushashki_Yeh_Barree was renamed to Burushaski_Yeh_Barree (one 'h' removed) | |
128 | - new Grapheme_Cluster_Break (GCB) value: PP=Prepend | |
129 | - new Word_Break (WB) value: NL=Newline | |
130 | ||
131 | * hardcoded Unihan range end/limit (see Unicode 4.1 update for comparison) | |
132 | - Unihan range end moves from 9FBB to 9FC3 | |
133 | search for both 9FBB (end) and 9FBC (limit) (regex 9FB[BC], case-insensitive) | |
134 | + do change gennames.c | |
135 | ||
136 | * build Unicode data source code for hardcoding core data | |
137 | C:\svn\icuproj\icu\uni51\source\data>NMAKE /f makedata.mak ICUMAKE=\svn\icuproj\icu\uni51\source\data\ CFG=debug uni-core-data | |
138 | ||
139 | ICU data make path is \svn\icuproj\icu\uni51\source\data\ | |
140 | ICU root path is \svn\icuproj\icu\uni51 | |
141 | Information: cannot find "ucmlocal.mk". Not building user-additional converter files. | |
142 | Information: cannot find "brklocal.mk". Not building user-additional break iterator files. | |
143 | Information: cannot find "reslocal.mk". Not building user-additional resource bundle files. | |
144 | Information: cannot find "collocal.mk". Not building user-additional resource bundle files. | |
145 | Information: cannot find "rbnflocal.mk". Not building user-additional resource bundle files. | |
146 | Information: cannot find "trnslocal.mk". Not building user-additional transliterator files. | |
147 | Information: cannot find "misclocal.mk". Not building user-additional miscellaenous files. | |
148 | Creating data file for Unicode Character Properties | |
149 | Creating data file for Unicode Case Mapping Properties | |
150 | Creating data file for Unicode BiDi/Shaping Properties | |
151 | Creating data file for Unicode Normalization | |
152 | Unicode .icu files built to "\svn\icuproj\icu\uni51\source\data\out\build\icudt39l" | |
153 | Unicode .c source files built to "\svn\icuproj\icu\uni51\source\data\out\tmp" | |
154 | ||
155 | - copy the .c source files to C:\svn\icuproj\icu\uni51\source\common | |
156 | and rebuild the common library | |
157 | ||
158 | *** Break iterators | |
159 | ||
160 | * Update break iterator rules to new UAX versions and new property values | |
161 | ||
162 | *** UCA | |
163 | ||
164 | * update FractionalUCA.txt and UCARules.txt with new canonical closure | |
165 | ||
166 | *** Test suites | |
167 | - Test that APIs using Unicode property value aliases (like UnicodeSet) | |
168 | support all of the boolean values N/Y, No/Yes, F/T, False/True | |
169 | -> TestBinaryValues() tests in both cintltst and intltest | |
170 | ||
171 | *** LayoutEngine script information | |
172 | * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguage.h, | |
173 | ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (It also generates | |
174 | ScriptRunData.cpp, which is no longer needed.) | |
175 | ||
176 | The generated files have a current copyright date and "@draft" statement. | |
177 | ||
178 | * copy the above files into <icu>/source/layout, replacing the old files. | |
179 | ||
180 | Add new default entries to the indicClassTables array in <icu>/source/layout/IndicClassTables.cpp | |
181 | and the complexTable array in <icu>/source/layoutex/ParagraphLayout.cpp. (This step should be automated...) | |
182 | ||
183 | * rebuild the layout and layoutex libraries. | |
184 | ||
185 | *** Documentation | |
186 | - Update User Guide | |
187 | + Jamo_Short_Name, sfc->scf, binary property value aliases | |
188 | ||
189 | ---------------------------------------------------------------------------- *** | |
190 | ||
73c04bcf A | 191 | Unicode 5.0 update |
192 | ||
193 | *** related Jitterbugs | |
194 | ||
195 | 5084 RFE: Update to Unicode 5.0 | |
196 | ||
197 | *** data files & enums & parser code | |
198 | ||
199 | * file preparation | |
200 | - ucdstrip: | |
201 | DerivedCoreProperties.txt | |
202 | DerivedNormalizationProps.txt | |
203 | NormalizationTest.txt | |
204 | PropList.txt | |
205 | Scripts.txt | |
206 | GraphemeBreakProperty.txt | |
207 | SentenceBreakProperty.txt | |
208 | WordBreakProperty.txt | |
209 | - ucdstrip and ucdmerge: | |
210 | EastAsianWidth.txt | |
211 | LineBreak.txt | |
212 | ||
46f4442e | 213 | * my ucd2unidata.bat (needs to be updated each time with UCD and file version numbers) |
73c04bcf A | 214 | copy 5.0.0\ucd\BidiMirroring.txt ..\unidata\ |
215 | copy 5.0.0\ucd\Blocks.txt ..\unidata\ | |
216 | copy 5.0.0\ucd\CaseFolding.txt ..\unidata\ | |
217 | copy 5.0.0\ucd\DerivedAge.txt ..\unidata\ | |
218 | copy 5.0.0\ucd\extracted\DerivedBidiClass.txt ..\unidata\ | |
219 | copy 5.0.0\ucd\extracted\DerivedJoiningGroup.txt ..\unidata\ | |
220 | copy 5.0.0\ucd\extracted\DerivedJoiningType.txt ..\unidata\ | |
221 | copy 5.0.0\ucd\extracted\DerivedNumericValues.txt ..\unidata\ | |
222 | copy 5.0.0\ucd\NormalizationCorrections.txt ..\unidata\ | |
223 | copy 5.0.0\ucd\PropertyAliases.txt ..\unidata\ | |
224 | copy 5.0.0\ucd\PropertyValueAliases.txt ..\unidata\ | |
225 | copy 5.0.0\ucd\SpecialCasing.txt ..\unidata\ | |
226 | copy 5.0.0\ucd\UnicodeData.txt ..\unidata\ | |
227 | ||
228 | ucdstrip < 5.0.0\ucd\DerivedCoreProperties.txt > ..\unidata\DerivedCoreProperties.txt | |
229 | ucdstrip < 5.0.0\ucd\DerivedNormalizationProps.txt > ..\unidata\DerivedNormalizationProps.txt | |
230 | ucdstrip < 5.0.0\ucd\NormalizationTest.txt > ..\unidata\NormalizationTest.txt | |
231 | ucdstrip < 5.0.0\ucd\PropList.txt > ..\unidata\PropList.txt | |
232 | ucdstrip < 5.0.0\ucd\Scripts.txt > ..\unidata\Scripts.txt | |
233 | ucdstrip < 5.0.0\ucd\auxiliary\GraphemeBreakProperty.txt > ..\unidata\GraphemeBreakProperty.txt | |
234 | ucdstrip < 5.0.0\ucd\auxiliary\SentenceBreakProperty.txt > ..\unidata\SentenceBreakProperty.txt | |
235 | ucdstrip < 5.0.0\ucd\auxiliary\WordBreakProperty.txt > ..\unidata\WordBreakProperty.txt | |
236 | ucdstrip < 5.0.0\ucd\EastAsianWidth.txt | ucdmerge > ..\unidata\EastAsianWidth.txt | |
237 | ucdstrip < 5.0.0\ucd\LineBreak.txt | ucdmerge > ..\unidata\LineBreak.txt | |
238 | ||
239 | * update FractionalUCA.txt and UCARules.txt with new canonical closure | |
240 | ||
241 | * genpname | |
242 | - run preparse.pl | |
243 | + make sure that data.h is writable | |
244 | + perl preparse.pl \cvs\oss\icu > out.txt | |
245 | ||
246 | * uchar.h & uscript.h & uprops.h & uprops.c & genprops | |
247 | - new block & script values | |
248 | + script values already added in ICU 3.6 because all of ISO 15924 is now covered | |
249 | ||
250 | * build Unicode data source code for hardcoding core data | |
251 | C:\cvs\oss\icu\source\data>NMAKE /f makedata.mak ICUMAKE=\cvs\oss\icu\source\data\ CFG=debug uni-core-data | |
252 | ||
253 | ICU data make path is \cvs\oss\icu\source\data\ | |
254 | ICU root path is \cvs\oss\icu | |
255 | Information: cannot find "ucmlocal.mk". Not building user-additional converter files. | |
256 | [etc.] | |
257 | Creating data file for Unicode Character Properties | |
258 | Creating data file for Unicode Case Mapping Properties | |
259 | Creating data file for Unicode BiDi/Shaping Properties | |
260 | Creating data file for Unicode Normalization | |
261 | Unicode .icu files built to "\cvs\oss\icu\source\data\out\build\icudt35l" | |
262 | Unicode .c source files built to "\cvs\oss\icu\source\data\out\tmp" | |
263 | ||
264 | - copy the .c source files to C:\cvs\oss\icu\source\common | |
265 | and rebuild the common library | |
266 | ||
267 | *** Unicode version numbers | |
268 | - makedata.mak | |
269 | - uchar.h | |
270 | - configure.in | |
271 | ||
272 | *** LayoutEngine script information | |
273 | * Run ICU4J com.ibm.icu.dev.tool.layout.ScriptNameBuilder. This generates LEScripts.h, LELanguage.h, | |
274 | ScriptAndLanguageTags.h and ScriptAndLanguageTags.cpp in the working directory. (It also generates | |
275 | ScriptRunData.cpp, which is no longer needed.) | |
276 | ||
277 | The generated files have a current copyright date and "@draft" statement. | |
278 | ||
279 | * copy the above files into <icu>/source/layout, replacing the old files. | |
280 | ||
281 | Add new default entries to the indicClassTables array in <icu>/source/layout/IndicClassTables.cpp | |
282 | and the complexTable array in <icu>/source/layoutex/ParagraphLayout.cpp. (This step should be automated...) | |
283 | ||
284 | * rebuild the layout and layoutex libraries. | |
285 | ||
286 | ---------------------------------------------------------------------------- *** | |
287 | ||
288 | Unicode 4.1 update | |
289 | ||
290 | *** related Jitterbugs | |
291 | ||
292 | 4332 RFE: Update to Unicode 4.1 | |
293 | 4157 RBBI, TR29 4.1 updates | |
294 | ||
295 | *** data files & enums & parser code | |
296 | ||
297 | * file preparation | |
298 | - ucdstrip: | |
299 | DerivedCoreProperties.txt | |
300 | DerivedNormalizationProps.txt | |
301 | NormalizationTest.txt | |
302 | GraphemeBreakProperty.txt | |
303 | SentenceBreakProperty.txt | |
304 | WordBreakProperty.txt | |
305 | - ucdstrip and ucdmerge: | |
306 | EastAsianWidth.txt | |
307 | LineBreak.txt | |
308 | ||
309 | * add new files to the repository | |
310 | GraphemeBreakProperty.txt | |
311 | SentenceBreakProperty.txt | |
312 | WordBreakProperty.txt | |
313 | ||
314 | * update FractionalUCA.txt and UCARules.txt with new canonical closure | |
315 | ||
316 | * genpname | |
317 | - handle new enumerated properties in sub read_uchar | |
318 | - run preparse.pl | |
319 | ||
320 | * uchar.h & uscript.h & uprops.h & uprops.c & genprops | |
321 | - new binary properties | |
322 | + Pattern_Syntax | |
323 | + Pattern_White_Space | |
324 | - new enumerated properties | |
325 | + Grapheme_Cluster_Break | |
326 | + Sentence_Break | |
327 | + Word_Break | |
328 | - new block & script & line break values | |
329 | ||
330 | * gencase | |
331 | - case-ignorable changes | |
332 | see http://www.unicode.org/versions/Unicode4.1.0/#CaseMods | |
333 | now: (D47a) Word_Break=MidLetter or Mn, Me, Cf, Lm, Sk | |
334 | ||
335 | *** Unicode version numbers | |
336 | - makedata.mak | |
337 | - uchar.h | |
338 | - configure.in | |
339 | ||
340 | *** tests | |
341 | - verify that u_charMirror() round-trips | |
342 | - test all new properties and some new values of old properties | |
343 | ||
344 | *** other code | |
345 | ||
346 | * hardcoded Unihan range end/limit | |
347 | - Unihan range end moves from 9FA5 to 9FBB | |
348 | search for both 9FA5 (end) and 9FA6 (limit) (regex 9FA[56], case-insensitive) | |
349 | + do not modify BOCU/BOCSU code because that would change the encoding | |
350 | and break binary compatibility! | |
351 | + similarly, do not change the GB 18030 range data (ucnvmbcs.c), | |
352 | NamePrepProfile.txt | |
353 | + ignore trietest.c: test data is arbitrary | |
354 | + ignore tstnorm.cpp: test optimization, not important | |
355 | + ignore collation: 9FA[56] only appears in comments; swapCJK() uses the whole block up to 9FFF | |
356 | + do change line_th.txt and word_th.txt | |
357 | by replacing hardcoded ranges with the new property values | |
358 | + do change gennames.c | |
359 | ||
360 | source\data\brkitr\line_th.txt(229): \u33E0-\u33FE \u3400-\u4DB5 \u4E00-\u9FA5 \uA000-\uA48C \uA490-\uA4C6 | |
361 | source\data\brkitr\word_th.txt(23): \u33E0-\u33FE \u3400-\u4DB5 \u4E00-\u9FA5 \uA000-\uA48C \uA490-\uA4C6 | |
362 | source\tools\gennames\gennames.c(971): 0x4e00, 0x9fa5, | |
363 | ||
364 | * case mappings | |
365 | - compare new special casing context conditions with previous ones | |
366 | see http://www.unicode.org/versions/Unicode4.1.0/#CaseMods | |
367 | ||
368 | * genpname | |
369 | - consider storing only the short name if it is the same as the long name | |
370 | ||
371 | *** other reviews | |
372 | - UAX #29 changes (grapheme/word/sentence breaks) | |
373 | - UAX #14 changes (line breaks) | |
374 | - Pattern_Syntax & Pattern_White_Space | |
375 | ||
376 | ---------------------------------------------------------------------------- *** | |
377 | ||
374ca955 A | 378 | Unicode 4.0.1 update |
379 | ||
380 | *** related Jitterbugs | |
381 | ||
382 | 3170 RFE: Update to Unicode 4.0.1 | |
383 | 3171 Add new Unicode 4.0.1 properties | |
384 | 3520 use Unicode 4.0.1 updates for break iteration | |
385 | ||
386 | *** data files & enums & parser code | |
387 | ||
388 | * file preparation | |
389 | - ucdstrip: DerivedNormalizationProps.txt, NormalizationTest.txt, DerivedCoreProperties.txt | |
390 | - ucdstrip and ucdmerge: EastAsianWidth.txt, LineBreak.txt | |
391 | ||
392 | * file fixes | |
393 | - fix UnicodeData.txt general categories of Ethiopic digits Nd->No | |
394 | according to PRI #26 | |
395 | http://www.unicode.org/review/resolved-pri.html#pri26 | |
396 | - reverted again because no corrigendum is in sight; | |
397 | instead, modified the tests to not check consistency on this for Unicode 4.0.1 | |
398 | ||
399 | * ucdterms.txt | |
400 | - update from http://www.unicode.org/copyright.html | |
401 | formatted for plain text | |
402 | ||
403 | * uchar.h & uprops.h & uprops.c & genprops | |
404 | - add UBLOCK_CYRILLIC_SUPPLEMENT because the block is renamed | |
405 | - add U_LB_INSEPARABLE due to a spelling fix | |
406 | + put short name comment only on line with new constant | |
407 | for genpname perl script parser | |
408 | - new binary properties | |
409 | + STerm | |
410 | + Variation_Selector | |
411 | ||
412 | * genpname | |
413 | - fix genpname perl script so that it doesn't choke on more than 2 names per property value | |
414 | - perl script: correctly calculate the maximum number of fields per row | |
415 | ||
416 | * uscript.h | |
417 | - new script code Hrkt=Katakana_Or_Hiragana | |
418 | ||
419 | * gennorm.c track changes in DerivedNormalizationProps.txt | |
420 | - "FNC" -> "FC_NFKC" | |
421 | - single field "NFD_NO" -> two fields "NFD_QC; N" etc. | |
422 | ||
423 | * genprops/props2.c track changes in DerivedNumericValues.txt | |
424 | - changed from 3 columns to 2, dropping the numeric type | |
425 | + assume that the type is always numeric for Han characters, | |
426 | and that only those are added in addition to what UnicodeData.txt lists | |
427 | ||
428 | *** Unicode version numbers | |
429 | - makedata.mak | |
430 | - uchar.h | |
431 | - configure.in | |
432 | ||
433 | *** tests | |
434 | - update test of default bidi classes according to PRI #28 | |
435 | /tsutil/cucdtst/TestUnicodeData | |
436 | http://www.unicode.org/review/resolved-pri.html#pri28 | |
437 | - bidi tests: change exemplar character for ES depending on Unicode version | |
438 | - change hardcoded expected property values where they change | |
439 | ||
440 | *** other code | |
441 | ||
442 | * name matching | |
443 | - read UCD.html | |
444 | ||
445 | * scripts | |
446 | - use new Hrkt=Katakana_Or_Hiragana | |
447 | ||
448 | * ZWJ & ZWNJ | |
449 | - are now part of combining character sequences | |
450 | - break iteration used to assume that LB classes did not overlap; now they do for ZWJ & ZWNJ |
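
The bit-field reasoning in the Unicode 5.1 notes (script codes outgrowing a 7-bit field, Block growing from 8 to 9 bits) can be checked with a little arithmetic. The following is only an illustrative sketch in Python, not ICU code: the helper names `fits` and `bits_needed` are hypothetical, and the only figures taken from the notes are USCRIPT_CODE_LIMIT=130 and the block count of 172.

```python
# Illustrative sketch only; this is NOT the actual uprops.h layout.
# fits() and bits_needed() are hypothetical helpers for this example.

def fits(value: int, bits: int) -> bool:
    """Return True if a non-negative value fits in an unsigned field of `bits` bits."""
    return 0 <= value < (1 << bits)


def bits_needed(max_value: int) -> int:
    """Minimum number of bits for an unsigned field that must hold max_value."""
    return max(1, max_value.bit_length())


USCRIPT_CODE_LIMIT = 130                   # figure quoted in the notes
MAX_SCRIPT_VALUE = USCRIPT_CODE_LIMIT - 1  # 129, stored in a parallel bit field
BLOCK_COUNT = 172                          # current block count quoted in the notes

# 129 does not fit into 7 bits (maximum 127) but does fit into 8 bits.
print("129 fits in 7 bits:", fits(MAX_SCRIPT_VALUE, 7))
print("129 fits in 8 bits:", fits(MAX_SCRIPT_VALUE, 8))

# Masking to 7 bits silently drops the high bit, which is the overflow hazard.
print("129 masked to 7 bits:", MAX_SCRIPT_VALUE & 0x7F)

# 172 still fits in 8 bits; the 9th bit is headroom for future growth.
print("bits needed for 172:", bits_needed(BLOCK_COUNT))
```

Under these assumptions the maximum script value needs 8 bits, which matches the widened script field described in the notes, while the extra Block bit is headroom rather than an immediate necessity.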