Understanding and Troubleshooting NLS Issues in Web Deployed Oracle Forms

Posted by Patrick Hamou on 2017:09:26 22:18:04

APPLIES TO: Oracle Forms

Oracle Forms - Version 6.0.8.8.0 and later
Information in this document applies to any platform.
Checked for relevance on 01-JUN-2010
 

PURPOSE : Oracle Forms, NLS settings

To explain how web deployed Oracle Forms handles NLS settings, and to provide troubleshooting hints and tips for unexpected behaviour, for example non-English characters not being saved to, or queried back from, the database correctly.

SCOPE

The article is specific to:

  • Oracle 9iAS Rel 1 Forms Services (Forms 6i (6.0.8))
  • Oracle 9iAS Rel 2 Forms Services (Forms 9i (9.0.2))
  • Oracle 9iDS Rel 2 (Forms 9i (9.0.2))
  • Oracle Application Server 10g (9.0.4) Forms Services (Forms 10g (9.0.4))
  • Oracle Developer Suite 10g (9.0.4) (Forms 10g (9.0.4))
  • Oracle Application Server 10g (10.1.2) Forms Services (Forms 10g (10.1.2))
  • Oracle Developer Suite 10g (10.1.2) (Forms 10g (10.1.2)) 
  • Oracle Fusion Middleware 11gR1 (11.1.1) Forms Services (Forms 11g (11.1.1))

Target Audience: Support Engineers, Consultants, System Administrators, DBAs and Developers

 

DETAILS : NLS Character Handling

Understanding the NLS Basics

Terminology and Concepts of NLS Character Handling

  • Letter: part of an alphabet. A character is a broader notion: it may be a letter, a number, a punctuation mark, or any other printable symbol.
  • Script: a family of characters used in a group of similar languages or alphabets, for example the Latin script or the Arabic script.
  • Character set: a list of characters, typically the list required to write in a particular language. Each character has a description and a numeric identifier, the code point. A code point is unique within a character set. Character encoding is the way code points are encoded and sent in a binary stream. Strictly speaking the character set is an abstract entity, but in normal usage, including this article, “character set” implies the encoding used.
  • Glyph: the visual representation of a character. Several characters may sometimes be shown in a single glyph, for example accented letters or composite letters.
  • Font: a collection of glyphs, together with additional display information such as size, italic, bold, and style (serif, with curly ends, for example Times; sans-serif, without them, for example Arial).

Computers and Character Encoding

Computer systems deal only with binary numbers (1s and 0s) and use an encoding scheme that enables alphabetic characters to be handled as binary numbers. Each of the characters used in human communication (letters of the alphabet, numerals and punctuation marks) is represented by a binary value. This encoding scheme is then used by the keyboard, the monitor, and the printer that interact with the computer. When a character key is pressed, the keyboard generates the binary value specified by the encoding scheme. When a monitor (or printer) receives a binary value representing a character, it displays (or prints) the character specified by the encoding scheme. Encoding schemes also include control codes, for example newline, that are used to control data formatting.

Example: An encoding scheme for a particular keyboard, computer and screen could use 01000001 (ASCII, written as 8 bits) to represent the “Capital Latin letter A”. For another keyboard, computer and screen the encoding could be 00011 (Baudot; note 5 bits only) or 11000001 (EBCDIC). Note that a character set is not the same as a character encoding. In the Baudot code, the binary value 00011 can mean either A or - (minus sign), depending on a shift state. In the relatively simple ASCII code, the character set and the encoding have a one-to-one relation, so the distinction between character set and character encoding is not evident. Unicode has one character set, but several different encodings (UTF-8, UTF-16LE, UTF-16BE).
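The distinction between a character set and its encodings can be checked with a short Python sketch. The codec names used here (`ascii`, `cp037` for one EBCDIC variant, `utf-8`, `utf-16-le`, `utf-16-be`) are Python's own and serve only as an illustration; Baudot is left out because Python ships no codec for it.

```python
# Illustration only: Python codec names stand in for the schemes named above.

def bits(data: bytes) -> str:
    """Render each byte as an 8-digit binary string."""
    return " ".join(format(b, "08b") for b in data)

# The same character, 'A', under two different encoding schemes.
ascii_a = "A".encode("ascii")    # ASCII code 65
ebcdic_a = "A".encode("cp037")   # EBCDIC (code page 037) code 193

print("ASCII  A:", bits(ascii_a))
print("EBCDIC A:", bits(ebcdic_a))

# One Unicode code point (U+00E9, e with acute accent), three encodings.
e_acute = "\u00e9"
for codec in ("utf-8", "utf-16-le", "utf-16-be"):
    print(codec, e_acute.encode(codec).hex())
```

The character stays the same in every case; only the bytes that represent it change.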

Fonts and Charactersets

A font is used by the OS to convert a numeric value into a graphical 'print' on screen. The WingDings font on Windows is an example of a font where an 'A' is NOT shown as an 'A' on screen. For the OS the numeric value still represents an 'A'. With WingDings the character is NOT seen as an 'A', but behind the scenes it is an 'A' and will be saved (or used) as an 'A'. A font may not display ALL the symbols defined in a characterset / code page, because it may not include a graphical representation for every symbol. That is why black squares sometimes appear on the screen when changing fonts.
On Windows the 'Character Map' tool can be used to see all the symbols defined in a font.

Types of Characterset

7 bit, e.g. ASCII – can define 128 symbols / characters
8 bit, e.g. ROMAN8 – can define 256 symbols / characters
Unicode, e.g. UTF8 – multibyte characterset with the capacity to define over a million characters

Why Different Charactersets?

Historically, vendors defined different 'sets' for their hardware and software, mainly because there were no official standards. New character sets were also defined to support new languages. For example, an 8 bit characterset can define only a limited number of symbols, so there are different sets for different written languages.

Oracle NLS Features

Language support
Territory support 
Character set support
Linguistic sorting
Message support
Date and time formats
Numeric formats 
Monetary formats

What is a database characterset?

The database characterset is specified when the database is created and is stored in the data dictionary table, sys.props$. It is used to determine which characters can be stored in the database. It is also used to determine the character set to be used for object identifiers and PL/SQL variables and for structures.

What is a ‘Client’ Characterset?

This concept only applies to traditional client-server applications, as discussed in:

Note:158577.1 NLS_LANG Explained (How does Client-Server Character Conversion Work?)

Important Note: The web deployed architecture handles NLS character conversion in a way such that the character set or encoding on either the client machine hosting the Forms applet or the middle-tier machine hosting Forms Runtime no longer needs to be taken into account. Refer to the sections "Oracle Forms and the NLS Conversion Process: ..." for more detail.

What is NLS_LANG?

NLS_LANG tells Oracle which characterset the client operating system is using, so that Oracle can convert (if needed) from the client's characterset to the database characterset. NLS_LANG consists of three parts:

[Language]_[Territory].[Characterset] e.g. POLISH_POLAND.EE8MSWIN1250
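As a minimal sketch of this format, the value can be split into its three parts with a regular expression. The function name `parse_nls_lang` and the `NlsLang` tuple are hypothetical helpers, not part of any Oracle tool, and the sketch assumes the language part is always present while territory and characterset are optional.

```python
import re
from typing import NamedTuple, Optional


class NlsLang(NamedTuple):
    """Hypothetical container for the three parts of NLS_LANG."""
    language: str
    territory: Optional[str]
    charset: Optional[str]


# language, then optional "_territory", then optional ".charset"
_PATTERN = re.compile(
    r"^(?P<language>[^_.]+)(?:_(?P<territory>[^.]+))?(?:\.(?P<charset>.+))?$"
)


def parse_nls_lang(value: str) -> NlsLang:
    """Split an NLS_LANG string into language, territory and charset."""
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a recognisable NLS_LANG value: {value!r}")
    return NlsLang(match["language"], match["territory"], match["charset"])


print(parse_nls_lang("POLISH_POLAND.EE8MSWIN1250"))
print(parse_nls_lang("JAPANESE"))
```

Parts that are left out come back as `None`; the sketch does not try to reproduce the defaults Oracle would apply.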

What does the language part of NLS_LANG control?

  • the language of messages, and day and month names
  • the default sort sequence
  • the default values for territory and character set, so if the language is specified the other two arguments can be omitted

What does the territory part of NLS_LANG control?

  • the default date format
  • the decimal character and group separator
  • the local currency symbol
  • the week start day

Note: LANGUAGE and TERRITORY have nothing to do with the ability to *store* characters in a database. JAPANESE_JAPAN.WE8MSWIN1252 will *not* store Japanese, as WE8MSWIN1252 does not know Japanese characters. AMERICAN_AMERICA.JA16SJIS *will* store Japanese, because JA16SJIS does support / know Japanese characters.
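The point of the note can be reproduced outside Oracle with a small Python sketch. It assumes that Python's `cp1252` and `shift_jis` codecs are close enough stand-ins for WE8MSWIN1252 and JA16SJIS; `can_store` is a hypothetical helper name.

```python
# Assumption: Python's 'cp1252' and 'shift_jis' codecs approximate the
# Oracle charactersets WE8MSWIN1252 and JA16SJIS.

def can_store(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


japanese = "\u65e5\u672c\u8a9e"  # the three kanji for "Japanese language"

print("cp1252   :", can_store(japanese, "cp1252"))
print("shift_jis:", can_store(japanese, "shift_jis"))
```

Only the characterset decides the outcome; no language or territory setting is involved anywhere in the check.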

What is the Oracle NLS Runtime Library (NLSRTL)?

  • A library which is a common component of Forms and the RDBMS
  • Processes strings in various character sets
  • Converts between character sets
  • Compares strings according to linguistic rules
  • Provides Oracle software with information about date, number and monetary formatting conventions for supported locales
  • Implements calendars other than the Gregorian
  • Reads translated UI strings from message files

Oracle Forms and the NLS Conversion Process: Inbound to the Database

Client Machine, e.g. MS Windows + browser + Oracle JInitiator / JDK plug-in + Forms Applet

  • User types character into lightweight component e.g. a text item
  • The text item receives the (ANSI) code point from the client operating system
  • The JVM (Java Virtual Machine) internally converts the code point to UTF-16 (using the MultiByteToWideChar Win32 API). This code point is stored in the memory of the client.
  • If using Sun JDK plug-in 1.4.2, the lightweight components receive the UTF-16 code point directly from the operating system, so no internal conversion to UTF-16 takes place

Client Machine  ---> Middle Tier ( e.g. OracleAS 10g Forms Services)

  • The Forms applet converts the UTF-16 stream to UTF-8 (CESU-8) using a CharToByte converter routine
  • It then sends the converted stream to the Middle Tier
  • Data on the network between Client and Middle Tier is always in UTF-8 (not UTF-16)
  • On the Middle Tier, the UTF-8 stream is converted to the character set portion of the Forms NLS_LANG setting (using NLSRTL)

Client Machine ---> Middle Tier ---> Database (e.g Oracle Server 10g)

  • Forms Runtime on the Middle Tier makes OCI calls to store data in database
  • If the Forms NLS character set and the database character set declarations are not the same then character conversion is done
  • The conversion to the database character set is performed in the OCI layer of the Oracle Server before the data is stored.
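The inbound chain described above can be sketched in Python to make each hop visible. This is a simulation with Python codecs, not the actual JVM, NLSRTL or OCI code: `inbound` is a hypothetical function, `cp1250` stands in for EE8MSWIN1250, and plain UTF-8 stands in for CESU-8 (the two coincide for characters in the Basic Multilingual Plane, which is all this example uses).

```python
# Simulation only: Python codecs stand in for the JVM, NLSRTL and OCI steps.

def inbound(ansi_bytes: bytes, client_codepage: str,
            forms_charset: str, db_charset: str) -> dict:
    """Trace one keystroke from the client OS to the database."""
    # Client: the ANSI code point is converted to UTF-16 inside the JVM.
    text = ansi_bytes.decode(client_codepage)
    utf16 = text.encode("utf-16-le")

    # Applet: UTF-16 is converted to UTF-8 before going on the network.
    wire = utf16.decode("utf-16-le").encode("utf-8")

    # Middle tier: UTF-8 is converted to the Forms NLS_LANG characterset.
    forms_bytes = wire.decode("utf-8").encode(forms_charset)

    # OCI: conversion happens only if the two charactersets differ.
    if forms_charset.lower() != db_charset.lower():
        db_bytes = forms_bytes.decode(forms_charset).encode(db_charset)
    else:
        db_bytes = forms_bytes

    return {"utf16": utf16, "wire": wire, "forms": forms_bytes, "db": db_bytes}


# Byte 0xB3 in code page 1250 is the Polish letter l with stroke (U+0142).
stages = inbound(b"\xb3", "cp1250", "cp1250", "utf-8")
for name, data in stages.items():
    print(f"{name:6} {data.hex()}")
```

Printing the hex at each stage shows that the same character takes a different byte form at every hop, which is where a wrong setting can corrupt it.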

Oracle Forms and the NLS Conversion Process: Outbound from the Database

Database (e.g Oracle Server 10g)

  • Data is sent back to the Forms Runtime on Middle Tier via OCI layer

Database -----> Middle Tier ( e.g. OracleAS 10g Forms Services)

  • The data encoded in the character set portion of the NLS_LANG is converted to UTF-8 by the Forms Runtime (using NLSRTL)
  • It is then sent to the Forms Applet on the client.

Database -----> Middle Tier -----> Client Machine, e.g. MS Windows + browser + Oracle JInitiator / JDK plug-in + Forms Applet

  • The UTF-8 stream is then converted to UTF-16 (using a ByteToChar converter routine) and stored within the memory.
  • To display, the JVM (Java Virtual Machine) accesses the font information using the UTF-16 code point
  • The JVM itself renders the characters without using any MS Windows API. So, no further conversion takes place
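The outbound path can be sketched in the same simulated way. `outbound` is a hypothetical function; the first step (database characterset to Forms characterset) is an assumption mirrored from the inbound description, and plain UTF-8 again stands in for the UTF-8 stream sent to the applet.

```python
# Simulation only: Python codecs stand in for the OCI, NLSRTL and JVM steps.

def outbound(db_bytes: bytes, db_charset: str, forms_charset: str) -> str:
    """Trace stored data from the database back to the applet's memory."""
    # Assumed OCI step: database characterset -> Forms NLS_LANG characterset.
    forms_bytes = db_bytes.decode(db_charset).encode(forms_charset)

    # Forms Runtime: NLS_LANG characterset -> UTF-8 for the network.
    wire = forms_bytes.decode(forms_charset).encode("utf-8")

    # Applet: UTF-8 -> UTF-16, held in client memory and rendered by the JVM.
    utf16 = wire.decode("utf-8").encode("utf-16-le")
    return utf16.decode("utf-16-le")


# U+0142 stored as UTF-8 in the database, Forms running with cp1250.
shown = outbound(b"\xc5\x82", "utf-8", "cp1250")
print(shown, hex(ord(shown)))
```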

Setting Up a Basic Oracle Forms NLS Test 

The Database

On MS Windows, the Oracle RDBMS 10g ‘Database Configuration Assistant’ can be used to quickly create a simple database configured with the target characterset. The database characterset must contain and support the characters which are intended to be saved and displayed via the Oracle Forms application (and indeed any other application, such as SQL*Plus), e.g.

Start – Programs - Oracle – [Oracle Server Name] – Configuration and Migration Tools – Database Configuration Assistant.

Character sets are configurable in step 10 of 12 by clicking on the Character Sets tab page. For example, choose characterset UTF8.

Forms Builder, Creating and Generating a Form

If using Forms Builder on MS Windows set NLS_LANG in the registry under:

(6.0.8 - 9.0.4) HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\Homex

(10.1.2 and higher) HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_x

This is necessary for characters to be displayed at Forms design time and also so that boilerplate characters are correctly incorporated in the forms module at generate time.

Important to Note: If the form is developed using Forms Builder on a MS Windows machine and then ported to a Unix platform, NLS_LANG must be set correctly when generating the fmx on the Unix platform. For example, edit $ORACLE_HOME/bin/f90genm.sh and insert the appropriate NLS_LANG setting.

When developing a form using Forms Builder, choose a font which supports the characters required. The MS Windows Character Map utility (Start - Programs – Accessories – System Tools) can be used to find the right font. For example, pick a Chinese character and copy/paste it into the Forms Builder layout editor, i.e. as part of a boilerplate or a label. Note that if the font chosen in the Forms Builder layout editor is different from the one selected in the Character Map utility, copy and paste from the Character Map utility to the layout editor will not work.

Running the Form

For characters to be displayed correctly at Forms Runtime, NLS_LANG also needs to be set in the .env file, e.g. %ORACLE_HOME%\forms90\server\default.env. If an inappropriate NLS_LANG setting is specified, boilerplate may appear as symbols / garbage and characters returned from the database may appear as question marks.
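The question-mark symptom can be imitated in Python as a rough sketch. `as_seen_by_forms` is a hypothetical helper, `cp1250` and `ascii` stand in for EE8MSWIN1250 and US7ASCII, and Python's `errors="replace"` substitutes "?" for characters the target characterset lacks; the replacement character Oracle actually uses may differ.

```python
# Assumption: errors="replace" imitates Oracle's replacement of characters
# that the Forms NLS_LANG characterset cannot represent.

def as_seen_by_forms(db_text: str, forms_charset: str) -> str:
    """Push database text through the Forms characterset and back."""
    raw = db_text.encode(forms_charset, errors="replace")
    return raw.decode(forms_charset)


city = "\u0141\u00f3d\u017a"  # the Polish city name Lodz, with diacritics

print("cp1250:", as_seen_by_forms(city, "cp1250"))  # suitable characterset
print("ascii :", as_seen_by_forms(city, "ascii"))   # unsuitable characterset
```

With the suitable characterset the text survives unchanged; with the unsuitable one every character outside it is lost, and the loss cannot be undone on the client.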

What happens if NLS_LANG is not set correctly in the registry or at Forms generate time?

  • In Forms Builder and at Forms Runtime, characters entered as boilerplate in the layout editor may appear as upside down question marks.
  • Characters returned as data from the database should continue to display fine (assuming NLS_LANG is set correctly in the default.env file)

What happens if only the Language and Territory aspects of the NLS_LANG are changed?

  • The display of characters will be unchanged
  • Only the format of date, decimal and currency values will change

Does the NLS characterset set for Forms have to match the characterset set for the Oracle Server database?

The character set portion of the Forms NLS_LANG environment variable does not have to match the database character set, but it should not be a superset of the database character set, because users could then enter characters that the database cannot store. For example, when the database character set is WE8MSWIN1252, the Forms character set should not be UTF8.
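One rough way to reason about the superset relationship is to test whether every character of a single-byte characterset can be encoded in another. The sketch below uses Python codecs (`cp1252` for WE8MSWIN1252, `cp1250` for EE8MSWIN1250) as approximations; `covers` is a hypothetical helper and is only meaningful when its second argument is a single-byte codec.

```python
# Assumption: Python codecs approximate the Oracle charactersets named above.

def covers(candidate: str, single_byte: str) -> bool:
    """True if candidate can encode every character defined in single_byte."""
    for value in range(256):
        try:
            char = bytes([value]).decode(single_byte)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the single-byte codec
        try:
            char.encode(candidate)
        except UnicodeEncodeError:
            return False
    return True


print("utf-8 covers cp1252 :", covers("utf-8", "cp1252"))
print("cp1250 covers cp1252:", covers("cp1250", "cp1252"))
```

A result of True for the Forms characterset against the database characterset does not by itself prove a strict superset, but it flags the combination the paragraph above warns about.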

When the Forms character set is different from the database character set and NLS_LENGTH_SEMANTICS is not set to CHAR, there can be a data expansion issue (multi-byte character sets only). Example: when the Forms characterset is JA16SJIS (Japanese) and the database character set is UTF8, it is possible to enter 5 Japanese characters in a Forms text field with the maximum length set to 10. The 5 Japanese characters occupy 10 bytes in JA16SJIS but expand to 15 bytes in UTF8, so they will not fit into a database column declared as VARCHAR2(10). For more information about character semantics, refer to:

Note:305259.1 Executing a Query Causes FRM-40831 in Forms
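The byte arithmetic behind the data expansion example can be verified with a few lines of Python. The codecs `shift_jis` and `utf-8` are assumed stand-ins for JA16SJIS and UTF8, the sample text is an arbitrary five-character hiragana greeting, and `fits` is a hypothetical helper modelling a byte-semantics column limit.

```python
# Assumption: 'shift_jis' and 'utf-8' approximate JA16SJIS and UTF8.

def fits(text: str, charset: str, byte_limit: int) -> bool:
    """Check a byte-semantics limit such as VARCHAR2(10)."""
    return len(text.encode(charset)) <= byte_limit


sample = "\u3053\u3093\u306b\u3061\u306f"  # five hiragana characters

sjis_len = len(sample.encode("shift_jis"))
utf8_len = len(sample.encode("utf-8"))

print("characters:", len(sample))
print("bytes in shift_jis:", sjis_len, "fits in 10:", fits(sample, "shift_jis", 10))
print("bytes in utf-8    :", utf8_len, "fits in 10:", fits(sample, "utf-8", 10))
```

The same five characters pass the length check on the Forms side and fail it on the database side, which is the mismatch character length semantics is meant to remove.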

Viewlet / Demo

To download a 5 minute viewlet demonstrating a basic Oracle Forms NLS test scenario, click here. To play the viewlet, extract the contents of the zip file and double click on the file 'forms_and_nls_viewlet_swf.html'
