|
See also: Translation tables available on the World Wide Web.
A B C D E F G H I J-K L M N O P-Q R S T U V W-Z
A
Accelerator An Alt+character combination used to activate menus, menu items, and dialog items in Windows. The character that activates the menu or dialog item
is underlined. It is also called a "hot key."
Accented character A character that has a diacritic attached to it. See Extended characters.
AltGr The Alt key on the right on some non-US Windows keyboard layouts. The AltGr key is equivalent to the Ctrl+Alt key combination and is used to create an
alternative shift state for accessing additional characters on some keys.
Alt+Numpad A method of entering characters, usually accented characters, by typing in the character's decimal code with the numeric keypad keys (Num Lock turned
on). In Windows, pressing Alt+<Num> generates an ASCII character. Pressing Alt+0<Num> generates an ANSI character.
Array input method An input method that builds characters using radicals. This method defines 10 basic keystrokes,
numbered 0 through 9, that represent basic radicals. The columns of keys beneath each number (for example, on a U.S. keyboard, the letters QAZ beneath the 1 key and the letters WSX
beneath the 2 key) are used to select specific characters.
ASCII Acronym for American Standard Code for Information Interchange, a 7-bit code that is the U.S. national variant of ISO 646.
B
Banja Double-byte Latin letters.
Base character (1) A character that has meaning independent of other characters. (2) Any graphical character that is not a diacritic.
Bidirectional (BiDi) text A mixture of characters that are read from left to right and from right to left. Most Arabic and Hebrew
characters, for example, are read from right to left, but numbers and quoted Western terms within Arabic or Hebrew are read from left to right.
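As a rough illustration of the directional classes behind bidirectional text, the following Python sketch reports each character's Unicode bidirectional class using the standard `unicodedata` module. The helper name `bidi_classes` is an assumption made for this example, not something defined by Windows or VFP.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hebrew letter alef, a space, a Latin letter, and a European digit.
sample = "\u05d0 a1"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} -> {cls}")
```

In this sample, `R` marks a right-to-left character, `L` a left-to-right character, `WS` whitespace (a neutral character), and `EN` a European number.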
Big-5 The multi-byte encoding standardized by Taiwan.
Big font A single font file that contains glyphs representing characters from multiple charsets.
Bitmap font A font whose characters are represented by bitmaps or by a pattern of dots, as opposed to a TrueType font, whose characters are represented by lines
and curves. A bitmap font is generally less scalable and more jagged than a TrueType font.
Bopomofo A standard Chinese phonetic script developed in 1913.
Byte Order Mark (BOM) The Unicode character U+FEFF, or its non-character mirror image, U+FFFE, used to indicate the
byte order of a text stream. The presence of a BOM can be a strong clue that a file is encoded in Unicode.
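A minimal Python sketch of using a byte order mark as a clue, assuming the data is 16-bit Unicode (UTF-16). The function name `sniff_utf16_byte_order` is hypothetical; the BOM constants come from the standard `codecs` module.

```python
import codecs

def sniff_utf16_byte_order(data):
    """Guess the byte order of UTF-16 data from a leading BOM, if present."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF: U+FEFF, big-endian
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE: U+FEFF, little-endian
        return "little-endian"
    return None  # no BOM; the byte order has to be determined some other way

print(sniff_utf16_byte_order("\ufeffhello".encode("utf-16-le")))
```

A reader that assumes the wrong byte order sees the mark as U+FFFE, which is not a valid character, and can swap bytes accordingly.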
C
Candidate window The window of an Input Method Editor that lists characters that the user can choose to replace the text
highlighted in the composition window.
Case The capitalized (uppercase) or non-capitalized (lowercase) form of an alphabetic character, usually in Latin script.
Chang Jei An input method that uses radicals to build Chinese characters. See Radical. Twenty-five radicals are
assigned to the letters A through Y. The letter X is used to generate more complex radicals.
Character (1) The smallest abstract element of a writing system or script. A character
refers to an abstract idea rather than to a specific shape. (2) A code element. See Glyph.
Charset A set of characters used in Windows. Charsets refer to the same collections of characters as those defined by
Windows code pages, but their ID numbers can be expressed in a single byte.
Code page An ordered set of characters in which a numeric index (code-point value) is associated with each character. This term
is generally used in the context of code pages defined by Windows 3.1 or MS-DOS, and may also be called a character set or charset.
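To make the idea of a code-point index concrete, this small Python sketch decodes the same byte value under three code pages. The code page names are Python codec names (`cp1252`, `cp1251`, `cp437`), and the helper `decode_byte` is an assumption made for illustration.

```python
def decode_byte(value, code_page):
    """Return the character that a single byte value maps to in a code page."""
    return bytes([value]).decode(code_page)

# The same index, 0xE9, selects a different character in each code page.
for code_page in ("cp1252", "cp1251", "cp437"):
    print(code_page, decode_byte(0xE9, code_page))
```

The byte 0xE9 is a Latin e with acute accent in code page 1252, a Cyrillic letter in code page 1251, and a Greek capital theta in the MS-DOS code page 437.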
Compatibility Zone The area in Unicode from U+F900 through U+FEFF that is assigned to characters from other
standards. These characters are variants of other Unicode characters.
Compile-time The time when raw source is transformed into a more machine-readable object format. Compiling can be initiated from the menu bar or with the VFP
COMPILE command. Object files are created with .FXP, .MPX, or .QPX extensions. If you use the project manager, the object code is placed in memo fields in the project .PJX
records.
Contextual analysis A process for determining how to handle text based on surrounding characters, as in Arabic, in which a letter changes shape depending on its
position in a word.
Conversion window, or composition window The window of an Input Method Editor that displays text typed by the
user, either as entered or as converted to ideographic form.
Country setting The set of preferences in the Windows NT 3.x Control Panel that determine the user default date, time, currency, and number formats.
Cultural convention Data or data formats that are specific to a language, local dialect, or geographic location. Examples are currency symbols, date formats,
calendars, numerical separators, and sort orders.
Cyrillic script The script used to represent characters in Slavic languages.
D
Date picture string/time picture string A string used to represent a date or time format, for example, "MMMM dddd yyyy".
Da Yi input method An input method that builds characters using radicals. This
method defines 40 base radicals, arranged on a standard 101-key keyboard, and corresponds to the stroke order in which characters are handwritten.
Dead key A key that, by itself, does not generate a character. Pressing a dead key followed by another key is one way to generate accented characters.
Decomposition The breakdown of an accented character or a pre-composed
character into an ordered set of components. For ã, the components are a followed by the combining character ~.
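Decomposition can be sketched in Python with the standard `unicodedata` module, using Unicode normalization form NFD to decompose and NFC to recompose. The helper names `decompose` and `compose` are assumptions for this example, and the sample is the letter a with a tilde.

```python
import unicodedata

def decompose(text):
    """Return the canonical decomposition (NFD) as a list of code points."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]

def compose(text):
    """Recombine base characters and combining marks into pre-composed form (NFC)."""
    return unicodedata.normalize("NFC", text)

print(decompose("\u00e3"))           # a with tilde, as one pre-composed character
print(compose("a\u0303") == "\u00e3")
```

The decomposed form is the base character U+0061 followed by the combining tilde U+0303.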
Determined string A string that has been converted from a phonetic representation into ideographs.
Diacritic (1) Any mark placed over, under, or through a Latin-based character, usually to indicate a change in phonetic
value from the unmarked state. (2) A character that is attached to or overlays a preceding base character. Most diacritics are non-spacing characters that do not increase the width of the base character.
Diaeresis Two dots placed over a vowel to indicate that the vowel is pronounced as a separate syllable. Typically used
when two vowels are adjacent but should be pronounced separately rather than as a diphthong, as in coöperation. See Umlaut.
Digraph A combination of characters that is written separately but forms a single lexical unit - for example, the Danish aa and the Spanish ch and
ll.
Double-byte character set (DBCS) Any 2-byte form of character encoding. See Multi-byte character set.
E
Enabling Altering program code to handle input, display, and editing of bidirectional languages, such as Arabic, and
double-byte languages, such as Japanese.
Encoding A system of assigning numeric values to characters.
End-User Defined Character (EUDC) A special character, such as a rare ideograph, that
the user creates with an editor and assigns to a code point within a reserved range.
Extended character (1) A character above the ASCII range (32 through 127) in
Windows-based single-byte character sets. (2) Accented characters.
F
Floating accent See Diacritic and Floating diacritic.
Floating diacritic A non-spacing diacritic that overlays the preceding base
character and might change position or shape according to the shape of the base character.
Following characters Characters, such as closing quotation marks, closing parentheses, and punctuation marks, that
shouldn't be separated from preceding characters.
Font Any of numerous sets of graphical representations of characters that can be installed on a computer or printer.
Font association The automatic pairing of a font that contains ideographs with a font that does not contain
ideographs. This allows the user to enter ideographic characters regardless of which font is selected.
Front-end processor See Input Method Editor (IME).
Full-width character In a double-byte character set, a character that is represented by 2 bytes and typically has a
half-width variant.
G
GB 2312-80 The multi-byte encoding standardized by the People's Republic of China.
Generate-time The time when power-tool metadata is converted to source code. VFP uses a template program, called GENMENU.PRG, to interpret the .MNX metadata and
produce the .MPR source. Note that the other power tools—the report writer and the label designer—do not go through an external generate process; VFP interprets report and
label metadata at run-time.
Globalization See Internationalization.
Glyph The actual shape (bit pattern, outline, and so forth) of a character image. For example, an italic a and a roman a
are two different glyphs representing the same underlying character.
H
Half-width character In a double-byte character set, a character that is represented by
one byte and typically has a full-width variant.
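As an illustration of the half-width and full-width distinction, the sketch below counts the bytes used for the two forms of the Katakana letter A. It assumes Python's `shift_jis` codec as the double-byte character set; the helper `byte_width` is a hypothetical name.

```python
def byte_width(ch, encoding="shift_jis"):
    """Number of bytes used to store one character in the given encoding."""
    return len(ch.encode(encoding))

half = "\uff71"  # HALFWIDTH KATAKANA LETTER A
full = "\u30a2"  # KATAKANA LETTER A (the full-width variant)

print(byte_width(half), byte_width(full))
```

The half-width form occupies one byte and the full-width form occupies two, matching the definitions of the two terms.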
Han unification The process of assigning the same code point to characters historically perceived as being the same character but represented as unique in more
than one East Asian ideographic character standard. This results in a group of ideographs shared by several cultures and significantly
reduces the number of code points needed to encode them.
Hangeul The native name for the Korean script.
Hanja The Korean name for ideographic characters of Chinese origin.
Hanzi (hantsu) The Chinese name for ideographic characters of Chinese origin.
Hard-coding (1) Putting string or character literals in the main body of code instead of resource files. (2) Basing numeric
constants on the assumed length of a string.
Hiragana The Japanese cursive script. Each Hiragana character represents a phonetic syllable.
HKL See Input language handle (HKL) and Language/layout pair.
I
Ideographic character A character of Chinese origin representing a word or a syllable that is generally used in more than
one Asian language. Sometimes referred to as a Chinese character.
Input context An internal structure that stores Input Method Editor (IME) related status information. Windows 95 supports
multiple IME contexts, automatically creating an input context for each active thread.
Input language handle (HKL) A type of variable that Windows 95 uses to track language/layout pairs.
Input method Any method used to enter text that doesn't involve typing each character directly. Input methods are widely used for entering ideographs and other characters phonetically or component by component.
Input Method Editor (IME) A program that performs the conversion between keystrokes and ideographs or other characters, usually by user-guided dictionary lookup.
Input Method Manager (IMM) The module in Windows that handles communication between Input Method Editors (IMEs) and applications.
Internal code input method An input method that allows the user to select a character by typing in its Big-5 code-point
index.
Internationalization, or globalization The process of developing a program core with
features and code designs that don't make assumptions based on a single language or locale, and whose source code base simplifies the creation
of different language editions of a program.
ISO 8859 The International Standards Organization's 8-bit encoding that served as the basis for the Windows ANSI
code page (also called code page 1252, Western European, Latin 1).
ISO 10646 The International Standards Organization's encoding that is code-for-code equivalent to Unicode.
Isolate, initial, medial, and final character forms The different shapes of an Arabic character that correspond to its position in a word.
J-K
Jamos The 24 basic elements of the Korean script.
Johab The Korean standard character set (KS C-5601-1992), which corresponds to Windows code page 1361. This character set includes all possible Hangeul
character combinations.
Kana The set of Japanese Hiragana and Katakana characters.
Kanji The Japanese name for ideographic characters of Chinese origin.
Katakana A Japanese script of phonetic syllables, chiefly used to spell words borrowed from other languages. Each Katakana character represents a phonetic
syllable.
Keyboard layout A standard arrangement of characters on a keyboard that defines which keys produce particular characters or
scan codes.
KS C-5601-1987 The multi-byte Wansung encoding standardized by Korea.
KS C-5601-1992 The multi-byte Johab encoding standardized by Korea.
L
Language ID (LANGID) A 16-bit value defined by Windows, consisting of a primary language ID and a secondary language ID. Used as a parameter to several Win32
functions and messages.
Language/layout pair (1) A language installed on the system and the keyboard layout
associated with it. (2) The input language.
Latin script The set of 26 characters (A-Z) inherited from the Roman Empire that, together with later additions, is used to write
languages throughout Africa, the Americas, parts of Asia, Europe, and Oceania. The Windows 3.1 Latin 1 character set covers Western European languages and languages that use the same
alphabet, while the Latin 2 character set covers Central and Eastern European languages.
Lead Byte The byte value that is the first half of a double-byte character. See Double-byte character set
(DBCS).
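The role of the lead byte can be sketched as a scanner that walks a byte string and decides, byte by byte, whether a character occupies one byte or two. This is a minimal sketch assuming Shift-JIS lead-byte ranges of 0x81 through 0x9F and 0xE0 through 0xFC; the function names are hypothetical, and other double-byte code pages use different ranges.

```python
def is_lead_byte(b):
    """True if b falls in the lead-byte ranges assumed here for Shift-JIS."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def split_dbcs(data):
    """Split a byte string into one-byte and two-byte character units."""
    units, i = [], 0
    while i < len(data):
        # A lead byte claims the following byte as its trail byte.
        step = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        units.append(data[i:i + step])
        i += step
    return units

encoded = "A\u3042B".encode("shift_jis")  # "A", HIRAGANA LETTER A, "B"
print(split_dbcs(encoded))
```

Scanning from the start of the string matters because a trail byte can have the same value as an ordinary single-byte character, so a byte cannot be classified reliably in isolation.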
Leading characters Characters, such as opening quotation marks, opening parentheses, and currency signs, that shouldn't be separated from succeeding
characters.
Letter (1) The basic element of a script as understood by the end user. (2) A higher level of abstraction than character. For example, both the Spanish ch and the Danish aa can be considered as single letters for some purposes (both sort as a single
character). See Text element.
Levels of localization The amount of translation and customization necessary to create different language editions. The levels, which are determined by
balancing risk and return, range from translating nothing to shipping a completely translated product with customized features.
Ligature Two or more characters combined to represent a single typographical character. The modern Latin script uses
only a few. Other scripts use many ligatures that depend on font and style. Some languages, such as Arabic, have mandatory ligatures; other languages have characters that were derived
from ligatures, such as the German ligature of long and short "s" (ß) and the ampersand (&), which is the contracted form of the Latin word et.
Literal In program code, a string.
Locale The features of the user's environment that are dependent on language, country, and cultural conventions. The locale
determines conventions such as sort order; keyboard layout; and date, time, number, and currency formats. In Windows, locales usually provide more
information about cultural conventions than about languages.
Locale ID (LCID) A 32-bit value defined by Windows that consists of a language ID, a sort ID, and reserved bits.
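The structure of a locale ID and of the language ID inside it (see Language ID) can be sketched with a few bit operations. The layout assumed here follows the Win32 MAKELCID and MAKELANGID conventions: the low 16 bits of the LCID hold the language ID, the next 4 bits hold the sort ID, and within the language ID the low 10 bits are the primary language and the high 6 bits the secondary language. The function name `split_lcid` is hypothetical.

```python
def split_lcid(lcid):
    """Break a 32-bit locale ID into its assumed component fields."""
    lang_id = lcid & 0xFFFF              # low 16 bits: language ID
    return {
        "sort_id": (lcid >> 16) & 0xF,   # next 4 bits: sort ID
        "lang_id": lang_id,
        "primary": lang_id & 0x3FF,      # low 10 bits of the language ID
        "secondary": lang_id >> 10,      # high 6 bits of the language ID
    }

# 0x0409 is commonly cited as English (United States).
print(split_lcid(0x0409))
```

For 0x0409 the sketch yields a primary language of 9 and a secondary language of 1, with a sort ID of 0.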
Locale-sensitive Exhibiting different behavior or returning different data, depending on the locale. For example, the
Win32 sort functions return different results depending on the locale parameter sent to each function.
Localization Models The localization process can occur in a variety of ways. These can be categorized by the timing of the localization
process. The process itself can be represented as follows:
Locale Block + Application Block = Localized Application
| Locale Block | Application Block |
| Graphics; Lookup tables; Validation; Help; Documentation | Application code; Processes graphics; Invokes lookups; Hooks for validation; Manages languages; Spans platforms |
The idea is that, to create a new version, you only tinker with the locale data. Merging this concept with the sense of process timing conveyed in the previous
section yields the following:
| | Locale Block | Application Block |
| Program-Time | Cultural content added at program time by programmers who are presumably also linguists. | BIG. Essentially no Locale block other than maybe CONFIG.FP/w, and setup metadata. One app/exe for all locales. |
| Generate-Time | Cultural content added by automated means when generating source. Swapping from string / object / screen / menu libraries. | One app/exe for each locale. |
| Link-Time | Cultural content added by automated means when building the application. Swapping project records from compiled obj libraries. Not inherently easy with native VFP tools. | One app/exe for each locale. |
| Run-Time | Cultural content added by automated means (from phrasebook tables) when executing the application. Translation is essentially independent of the development team. | One app/exe for all locales, with one locale resource for each locale. App/exe independent of locale block. Swapping from string / graphic resources. |
Enabling techniques segmented by timing.
Localizable resource Any element of a program's user interface that requires translation or modification for different languages.
Localization The process of adapting a program for a specific international market, which includes translating the user interface, resizing dialog boxes,
customizing features (if necessary), and testing results to ensure that the program works as expected.
Localization kit A subset of tools, source files, and binary files that can be used to create a localized edition of a program. Generally given to translators
or third-party vendors.
Logical order The order in which something is typed. Generally refers to text that can be displayed in a different order, such as Arabic, Hebrew, or bidirectional text.
Logograph, or logographic From the Greek word logos, meaning word: a letter, symbol, or sign used to represent an entire word. Chinese
characters are more properly termed logographic than ideographic because they represent words or parts of words rather than abstract
concepts.
M
Morpheme The smallest meaningful unit of a word. The word dog is one morpheme. The word dogs is two morphemes: dog + the plural marker
s. Many ideographic characters are based on morphemes.
Multi-byte character set (MBCS) A mixed-width character set, in which some characters consist of more than one byte. A
double-byte character set (DBCS), which is a specific type of multi-byte character set, includes some characters that consist of two
bytes.
Multilingual Supporting more than one language simultaneously. Often implies the ability to handle more than one script or character set.
N
National standard A linguistic rule, measurement, educational guideline, or technology-related convention as defined by a government or an industry standards
organization. Examples include character sets, keyboard layouts, and some cultural conventions, such
as punctuation. Windows incorporates many International Standards Organization (ISO) naming conventions.
Neutral character A character that can be considered as either right-to-left or left-to-right, depending on the direction of the surrounding context.
NLSAPI Abbreviation for National Language Support API. The set of system functions in 32-bit Windows that contain national language support (information that is
based on language and cultural convention).
No compile Refers to source code that doesn't require recompiling when you create international editions of a program.
Non-spacing character A character, such as a diacritic, that has no meaning
by itself but overlaps an adjacent character to form a third character.
O
Overflow characters Punctuation characters that are allowed to extend beyond the right margin for horizontal text or below the bottom margin for vertical
text.
P-Q
Phoneme A unique individual sound used in a language.
Phrasebook A concordance table containing individual text strings, in two or more languages, for the purpose of automated translation and substitution of
translated text into a program.
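A phrasebook lookup can be sketched as follows. This is only an illustration in Python, not VFP code: the table contents, the language codes, and the `translate` helper are all hypothetical, and a real phrasebook would normally live in a table rather than in source code.

```python
# Hypothetical phrasebook: one entry per original string, one translation per language.
PHRASEBOOK = {
    "Save": {"fr": "Enregistrer", "de": "Speichern"},
    "Cancel": {"fr": "Annuler", "de": "Abbrechen"},
}

def translate(text, lang, phrasebook=PHRASEBOOK):
    """Return the translation of text for lang, or text itself if none exists."""
    return phrasebook.get(text, {}).get(lang, text)

print(translate("Save", "fr"))
print(translate("Help", "fr"))  # not in the phrasebook, so the original is kept
```

Falling back to the original string keeps the program usable while the phrasebook is still incomplete.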
Plain text Computer-encoded text that contains only code elements and no other formatting or structural information, such as font
size, font type, or other layout information. Plain text exchange is commonly used between computer systems that might have no other way to exchange information.
Points The vowel signs in written Hebrew, which are sets of dots and/or short lines written below consonants.
Pre-composed character A single Unicode character that represents a sequence
of characters, usually a combination of a base character and one or more diacritics.
Private-use zone The area in Unicode from U+E000 through U+F8FF that is set aside for vendor-specific or user-designated characters.
Program-time The time when the developer is working with a text editor or a power tool, from which raw source code is generated.
R
Radical A group of strokes in a Chinese character that are treated as a unit for the purposes of sorting, indexing, and
classification. A character can contain more than one element that is recognized as a radical, but each character contains only one element, called the main radical, that is used as the
indexing character. Other radicals in the character might indicate how the character is pronounced.
RCDATA resource A custom Windows resource element.
Release delta The time between the release of the domestic product and the release of the localized edition.
Rendering The way in which a character is graphically displayed.
Resource (1) An element, such as a string, icon, bitmap, cursor, dialog, accelerator, or menu, that is included in a
Microsoft Windows resource (.RC) file. (2) Any item that needs to be translated.
Rich text Text saved with formatting instructions that multiple applications, including compatible Microsoft applications, can read and interpret.
Romaji A writing system based on the Latin alphabet that is used to represent Japanese text.
Round-trip conversion Mapping a character from one character encoding to another and back. Of particular interest is how well information is preserved during
round-trip conversion.
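One way to check how well information is preserved is to convert text to a target encoding and back, then compare the result with the original. The sketch below does this in Python, going from Unicode text to a code page and back; the helper name `round_trips` is an assumption, and the encodings are named by their Python codec names.

```python
def round_trips(text, encoding):
    """True if text survives conversion to the encoding and back unchanged."""
    # Characters the encoding cannot represent are replaced with "?",
    # so the comparison below fails when information was lost.
    encoded = text.encode(encoding, errors="replace")
    return encoded.decode(encoding) == text

print(round_trips("caf\u00e9", "cp1252"))        # accented Latin text
print(round_trips("\u65e5\u672c", "cp1252"))     # two ideographs
print(round_trips("\u65e5\u672c", "shift_jis"))  # same ideographs, Japanese encoding
```

The ideographs survive the trip through Shift-JIS but not through code page 1252, which has no code points for them.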
Run-time The time when the finished program is actually executing.
S
Screen dump A bitmap of an element in a program's graphical user interface, such as a dialog box or menu.
Script A system of characters used to write one or several languages. Characters denote isolated sounds, syllables, or word
elements and are governed by a general set of rules for creating text, such as default writing direction.
Separators Symbols used to separate items in a list, mark the thousands place in numbers, or represent the decimal point. Different locales follow different conventions for separators.
Shift-JIS The Japanese Industrial Standard multi-byte encoding. The codes are numerically shifted from the codes used by the JIS standard X-0208; hence the
name.
Shortcut key A keyboard combination that activates a program command directly, as an alternative to activating the command through the program menus.
Simplified Chinese The Chinese script used in the People's Republic of China. It consists of several thousand ideographic characters that are simplified versions of traditional Chinese characters.
Simultaneous ship, or sim-ship The release of localized editions of a product at the same time as, or soon after, the domestic edition is released, usually
within 30 days.
Single-byte character set (SBCS) A character encoding in which each character is represented by one byte. Single-byte character sets are
mathematically limited to 256 characters.
Sort key A numeric representation of a sort element based on locale-specific sort rules. A sort key consists of several weighted components that represent a
character's script, diacritics, case, and so on.
Spacing character A character with a non-zero width.
Specification, or spec A detailed plan of a program's user interface design and the expected functionality of program features.
Status window The window of an Input Method Editor (IME) in which the user can change the IME's conversion mode or input
mode.
Syllabary A set of written characters in which each character represents a syllable, for example, a consonant sound followed by a vowel sound.
T
Text element The smallest unit of text that can be displayed or edited.
Traditional Chinese The set of Chinese characters, used in such countries as Hong Kong, Singapore, and Taiwan, that is consistent with the original form of
Chinese ideographic characters that are several thousand years old.
Trail byte The byte value that is the second half of a double-byte character.
U
Umlaut Two dots placed above a vowel, such as ä, ö, and ü, which are used in German and other European languages to indicate a
change in the pronunciation of a vowel. See Diaeresis.
Unicode A fixed-width, 16-bit worldwide character encoding that was developed and
is maintained and promoted by the Unicode Consortium, a nonprofit computer industry organization.
User-defined character See End-User Defined Character (EUDC).
W-Z
Wansung The Korean standard character set (KS C-5601-1987), which corresponds to Windows code page 949. It covers the most common Hangeul character combinations. Extended Wansung covers all possible Hangeul combinations.
Wide character A 16-bit or 32-bit character. Often used to refer to Unicode encoded
characters.
Windows Intelligent Font Environment (WIFE) An operating system layer, introduced with the Far East editions of Windows 3, that manages multiple font technologies and font drivers that can be installed.
|