# Copyright (C) 2017 and later: Unicode, Inc. and others.
# License & terms of use: http://www.unicode.org/copyright.html
# Name: GSM 03.38 to Unicode
# Authors: Ken Whistler
# Source: http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT
# See there for the license and for a description of the charset.
# Formatted into ICU .ucm format by Markus Scherer on 2006-nov-02.
# Updated to table version 2.0 by Fredrik Roubert on 2017-feb-08.
# Mappings that are commented out in the Unicode file are turned into
# fallbacks (|1); all others are turned into round-trips (|0).
# Multi-byte mappings are preserved as multi-single-byte character mappings,
# using ICU's m:n conversion capability.
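#
# For illustration only, a sketch of the .ucm line syntax these rules produce
# (the authoritative mappings are the ones in the table itself; <UXXXX> and
# \xNN are placeholders, not actual entries):
#   <U0040> \x00 |0       round-trip: COMMERCIAL AT <-> \x00
#   <U20AC> \x1B\x65 |0   m:n round-trip: EURO SIGN <-> ESC followed by \x65
#   <UXXXX> \xNN |1       fallback: converted from Unicode to GSM only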
# The substitution character is not documented in the Unicode file.
# \x3F (question mark) is chosen here because \x1A, the usual ASCII SUB
# control, is a graphic character in GSM 03.38.
# Other deviations from the Unicode file:
# The GSM standard specifies that one or two ESC bytes (\x1B), if not followed
# by a recognized final byte, be mapped to spaces (that is, reverse fallbacks
# to U+0020).
# The Unicode file round-trips a single \x1B to U+00A0 (NBSP) and has no mapping
# for two ESC bytes.
# (Reverse fallbacks to U+00A0 would result in Unicode text that cannot be
# converted back to GSM 03.38. A round-trip for U+00A0 adds a character that is
# not mappable in the standard.)
# See the ietf-charsets list email "Re: GSM 03.38 substitution character?"
# at http://mail.apps.ietf.org/ietf/charsets/msg01696.html
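#
# In .ucm syntax a reverse fallback (converted from GSM to Unicode only) is
# marked with precision |3. A sketch of how the ESC rule from the GSM standard
# could be written, assuming plain mapping lines are used for it:
#   <U0020> \x1B |3       lone ESC -> SPACE
#   <U0020> \x1B\x1B |3   ESC ESC  -> SPACE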
# The GSM standard maps U+00C7 capital C-cedilla to \x09 but the Unicode file
# contains and documents a "fix" to map U+00E7 small c-cedilla instead, based on
# an interpretation of the intent of the standard. Prevailing implementations
# in mobile phones follow the standard.
# This file follows the GSM standard.
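#
# Sketch of the mapping line this choice implies, assuming the usual .ucm
# round-trip syntax (shown for illustration; see the table for the real entry):
#   <U00C7> \x09 |0       LATIN CAPITAL LETTER C WITH CEDILLA <-> \x09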
# See the GSM standard at
# http://www.3gpp.org/ftp/Specs/archive/03_series/03.38/0338-720.zip
# For problems with the table format please submit a bug
# at http://www.icu-project.org/ .
# For issues with the mappings please contact Unicode
# at http://www.unicode.org/reporting.html
<code_set_name> "gsm-03.38-2009"
<char_name_mask> "AXXXX"
<icu:charsetFamily> "ASCII"