Oracle provides linguistic sort capabilities that handle the complex
sorting requirements of different languages and cultures. Different
languages have different sort orders. What's more, different cultures
or countries using the same alphabets may sort words differently.
For example, in Danish, the letter Æ is after Z, while Y and
Ü are considered to be variants of the same letter.
Sort order can be case sensitive or insensitive, and can ignore accents
or not. It can also be either phonetic or based on the appearance of
the character, such as ordering by the number of strokes or by radicals
for East Asian ideographs. Another common sorting issue arises when letters
are combined.
For example, in traditional Spanish, "ch" is a distinct character,
which means that the correct order would be: cerveza, Colorado, cheremoya,
and so on. This means that the letter "c" cannot be sorted until
the next letter has been checked to see whether it is an "h".
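The look-ahead described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation: the alphabet table, the `RANK` mapping, and the `spanish_key` function are hypothetical names, and other traditional Spanish letters such as "ll" and "ñ" are left out for brevity.

```python
import string

# Hypothetical alphabet: the basic Latin letters with "ch" inserted
# as a separate letter between "c" and "d".
letters = list(string.ascii_lowercase)
letters.insert(letters.index("c") + 1, "ch")
RANK = {letter: position for position, letter in enumerate(letters)}


def spanish_key(word):
    """Turn a word into a list of letter ranks, treating "ch" as one letter."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        # Look ahead: a "c" followed by an "h" forms a single sort element.
        if word[i:i + 2] == "ch":
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the table sort after all known letters.
            key.append(RANK.get(word[i], len(letters)))
            i += 1
    return key


words = ["cheremoya", "Colorado", "cerveza"]
traditional = sorted(words, key=spanish_key)
plain = sorted(words, key=str.lower)

print(traditional)  # ['cerveza', 'Colorado', 'cheremoya']
print(plain)        # ['cerveza', 'cheremoya', 'Colorado']
```

With the digraph rule, cheremoya sorts after every word that begins with a plain "c"; a simple letter-by-letter comparison places it between cerveza and Colorado.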
Oracle provides several different types of sort. It can produce a linguistically
correct sort as well as a sort based on the new multilingual ISO standard (14651), designed
to handle many languages at the same time.
Using Binary Sorts
Conventionally, when character data is stored, the sort sequence is based
on the numeric values of the characters defined by the character encoding
scheme. This is called a binary sort.
Binary sorts are the fastest type of sort, and produce reasonable results
for the English alphabet because the ASCII and EBCDIC standards define
the letters A to Z in ascending numeric value.
Note, however, that in the ASCII standard, all uppercase letters appear
before any lowercase letters. In the EBCDIC standard, the opposite is
true: all lowercase letters appear before any uppercase letters. When
characters used in other languages are present, a binary sort generally
does not produce reasonable results.
For example, an ascending ORDER BY query
would return the character strings ABC, ABZ, BCD, ÄBC in that sequence
when Ä has a higher numeric value than B in the character encoding
scheme.
For languages using Chinese characters, a binary sort is not linguistically
meaningful.
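The behavior of a binary sort can be reproduced outside the database. The following Python sketch is an illustration only: it relies on the fact that Python compares strings by Unicode code point, which plays the role of the numeric character values discussed above. The variable names are arbitrary.

```python
# "\u00c4" is the letter Ä (code point 196), which is numerically
# higher than "B" (code point 66).
words = ["\u00c4BC", "BCD", "ABZ", "ABC"]

# sorted() on strings compares code points, so this is a binary sort.
binary_order = sorted(words)
print(binary_order)  # ['ABC', 'ABZ', 'BCD', 'ÄBC']

# In ASCII, every uppercase letter has a lower value than any lowercase letter.
mixed_case = sorted(["apple", "Zebra", "Apple", "zebra"])
print(mixed_case)  # ['Apple', 'Zebra', 'apple', 'zebra']
```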
Using Linguistic Sorts
To produce a sort sequence that matches the alphabetic sequence of characters,
another sort technique must be used that sorts characters independently
of their numeric values in the character encoding scheme. This technique
is called a linguistic sort.
A linguistic sort operates by replacing characters with numeric values
that reflect each character's proper linguistic order. These numeric
values are found in a table containing major and minor values. Oracle makes
two passes when comparing strings. The first pass compares the major values
of the entire string, taken from the major table. The second pass compares
the minor values, taken from the minor table.
Each major table entry contains the Unicode codepoint and major value.
Usually, letters with the same appearance will have the same major value.
Oracle assigns letters that differ only in diacritics or case the same
major value but different minor values.
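A minimal Python sketch of the two-pass comparison follows. The `MAJOR` and `MINOR` tables and the `linguistic_key` function are hypothetical, and the minor values chosen here (lowercase before uppercase before the accented forms) are an assumption made for the example, not Oracle's actual table contents.

```python
import string

# Hypothetical major and minor tables. Letters that differ only in case
# or diacritics share a major value and differ in their minor value.
MAJOR, MINOR = {}, {}
for rank, letter in enumerate(string.ascii_lowercase):
    MAJOR[letter] = MAJOR[letter.upper()] = rank
    MINOR[letter] = 0
    MINOR[letter.upper()] = 1

# "\u00e4" is ä and "\u00c4" is Ä: both are variants of "a".
for minor, variant in enumerate("\u00e4\u00c4", start=2):
    MAJOR[variant] = MAJOR["a"]
    MINOR[variant] = minor


def linguistic_key(text):
    """Build a key that compares all major values before any minor value."""
    majors = tuple(MAJOR[ch] for ch in text)
    minors = tuple(MINOR[ch] for ch in text)
    # Tuple comparison checks `majors` first (first pass) and falls back
    # to `minors` only when the majors are equal (second pass).
    return (majors, minors)


words = ["ABC", "ABZ", "BCD", "\u00c4BC", "abc"]
result = sorted(words, key=linguistic_key)
print(result)  # ['abc', 'ABC', 'ÄBC', 'ABZ', 'BCD']
```

Unlike the binary sort, ÄBC now sorts next to ABC, because the two strings tie on the first pass and are separated only by their minor values.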
Oracle offers two kinds of linguistic sort:
Monolingual, commonly used for European languages
Multilingual, commonly used for Asian languages
Using Monolingual Linguistic Sorts
Oracle offers monolingual linguistic sorts that contain culture-specific
sorting orders for almost all European languages.
Using Multilingual Linguistic Sorts
Oracle9i extends monolingual linguistic
sorts so that you can now sort additional languages as part of one sort.
This is useful for regions or languages that have complex sorting
rules and for global multilingual databases. Additionally, Oracle9i still supports
all the sort orders defined by previous releases.
For example, in Oracle9i, a French sort is supported, but the new multilingual
linguistic sort for French can also be applied by changing the sort order
from French to French_M. The sort order is then based
on the GENERIC_M sort order, with the added capability of sorting the secondary
level from right to left.
Oracle recommends using a multilingual linguistic sort if the tables contain
multilingual data. If the tables contain only French data, a monolingual
French sort may perform better because it uses less memory. There is a trade-off
between extensibility and performance.
For Asian language data or multilingual data, Oracle provides a sorting
mechanism based on an ISO standard (ISO 14651) and the Unicode 3.0 standard.
Multilingual linguistic sorting for Asian languages is implemented in
three passes and is based on the number of strokes, PinYin, or radicals.
In addition, canonical equivalence and surrogate codepoint
pairs are handled, with a capacity to define up to 1.1 million
codepoints in one sort.
Using Linguistic Indexes
Using linguistic indexes, you can provide the sophisticated sorting capabilities
of a multilingual sort while achieving sorting performance nearly as good
as that of a binary sort (which offers the best performance). You can create
function-based indexes that use languages other than English. The index itself
does not change the linguistic
sort order determined by NLS_SORT. The index simply improves the performance.
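The reason such an index helps can be illustrated with a small Python sketch: the sort key is computed once per row, when the row is inserted, and the entries are kept in key order, so reading rows in sorted order needs no further key computation. The `KeyedIndex` class and the `sort_key` function are hypothetical, `str.casefold` is only a stand-in for a real linguistic key function, and none of this reflects how Oracle stores its indexes internally.

```python
import bisect


def sort_key(value):
    """Hypothetical stand-in for a linguistic sort key function."""
    return value.casefold()


class KeyedIndex:
    """Keeps (key, row_id) pairs in key order, like a function-based index."""

    def __init__(self, key_func):
        self._key_func = key_func
        self._entries = []  # always sorted by (key, row_id)

    def insert(self, row_id, value):
        # The key is computed once, at insert time, not at query time.
        bisect.insort(self._entries, (self._key_func(value), row_id))

    def ordered_row_ids(self):
        # Reading the index yields row ids already in sort order.
        return [row_id for _, row_id in self._entries]


rows = {1: "delta", 2: "Alpha", 3: "charlie", 4: "Bravo"}
index = KeyedIndex(sort_key)
for row_id, name in rows.items():
    index.insert(row_id, name)

ordered = [rows[row_id] for row_id in index.ordered_row_ids()]
print(ordered)  # ['Alpha', 'Bravo', 'charlie', 'delta']
```

The index does not decide what the order is; that comes from the key function. It only avoids recomputing and re-sorting the keys for every query.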
Multiple Linguistic Indexes
If users wish to store character data in multiple languages in one database,
they should create multiple linguistic indexes for one column. This approach
improves the performance of the linguistic sort on that column
for each of the languages and is a powerful feature for multilingual databases.
Date and Time Zones
Applications that support multiple geographic locales will find comprehensive,
precision-oriented support for time zones, removing the complexity
of doing manual calculations. The new datetime data types can store time
data with sub-second precision. The datetime data types TIMESTAMP WITH LOCAL
TIME ZONE (TSLTZ) and TIMESTAMP WITH TIME ZONE (TSTZ) are time-zone-aware.
Datetime values can be specified as local time in a particular region,
rather than as a particular offset. Using the time zone rules tables for
a given region, the time zone offset for a local time is calculated, taking
daylight saving time adjustments into consideration, and is used in further
operations.
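The difference between a region and a fixed offset can be seen in the following Python sketch. It uses Python's standard `zoneinfo` module, which reads the time zone database available on the system, as a stand-in for Oracle's time zone rules tables. The region name and the dates are chosen only for the example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# A region name, not a fixed offset. The offset is looked up from the
# region's rules for each individual local time.
pacific = ZoneInfo("America/Los_Angeles")

winter = datetime(2001, 1, 15, 12, 0, tzinfo=pacific)
summer = datetime(2001, 7, 15, 12, 0, tzinfo=pacific)

winter_offset = winter.utcoffset()  # standard time: 8 hours behind UTC
summer_offset = summer.utcoffset()  # daylight saving time: 7 hours behind UTC

# The calculated offset is then used in further operations,
# such as converting the local time to UTC.
summer_utc = summer.astimezone(timezone.utc)

print(winter_offset == timedelta(hours=-8))  # True
print(summer_offset == timedelta(hours=-7))  # True
print(summer_utc.isoformat())  # 2001-07-15T19:00:00+00:00
```

The same local clock time of 12:00 maps to two different offsets, which is why storing the region rather than an offset removes the need for manual adjustments.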
One of the key focus areas of Oracle9i has been to enhance Oracle Server
manageability by automating routine DBA tasks, reducing the complexity of
administration, and making the server more self-tuning. A number of new features
have been added to streamline space, memory, and resource management as
well as other day-to-day database administrative tasks. Management of
Oracle networking (Net8) configuration has also been significantly improved
with a standard LDAP-compliant directory server.
Dynamic Memory Management
The Oracle System Global Area (SGA) is a shared memory region, accessible
to all threads of execution. Oracle9i makes it simple to add memory to
or remove memory from an Oracle instance by allowing administrators to change
the SGA configuration without shutting the instance down.