Alex Taylor, 20091218
On Sun, 1 Nov 2009 08:54:18 +0200 Steven Levine wrote:
>> > >Pretty much. The standard library string functions operate on a
>> > >per-byte level; they neither know nor care how many displayable
>> > >characters those bytes represent.
> >
> > Most runtimes also have wide-character string functions.
True, but all of OS/2's NLV DBCS codepages are byte-based variable-width
encodings, so the wide-character functions are actually pretty useless.
(They're more or less useable when dealing with UCS-2 UniChar strings,
which are USHORT sequences; but since the OS/2 ULS API provides its own,
more capable, family of functions for dealing with same, they're still not
terribly useful for the most part.)
>> > >The main worry is in dealing with double-byte characters that contain a
>> > >backslash \ as the second byte. C compilers (and runtimes) all seem to
>> > >be smart enough to recognize them, so you don't have to escape them...
>> > >NORMALLY.
> >
> > In C you are going to have to escape them if they are followed by a
> > character that looks like a valid backslash escape. For example "\F" will
> > typically pass through unsullied while "\B" will not.
I could be mistaken, but I'm pretty sure that if a DBCS codepage is used
(or specified), the C compiler generally knows enough to ignore any \ that
occurs after a DBCS lead byte.
>> > >from memory, I think STRINGTABLE resources have to escape their
>> > >secondary-byte backslashes but all other items (e.g. MENUITEMS, dialog
>> > >strings, etc) must not. (Unless I have that backwards... like I said,
>> > >this is from memory.)
> >
> > If you find an example of this, I'd like to see it. You've been a lot
> > more involved in the gory details of DBCS than I have.
I don't remember very clearly now, unfortunately. I mostly just arrived at
my conclusions through repeated experimentation in trying to build the
XWorkplace Japanese DLL. I could probably work it out if I dug into the
sources for a while, but it may be a while before I can get around to doing
that.
I really should, though... frankly, this is the sort of thing that needs to
be quantified and written down.
>> > >properly. Some backslashes had to be escaped, others had to be left
>> > >unescaped. And it didn't really compile reliably unless I was actually
>> > >running under codepage 932 (Japanese), regardless of the command-line
>> > >switches I used.
> >
> > Was this with or without the DBCS variable set in the environment?
Which DBCS variable? I did try setting all the relevant command-line
switches I could find in the compiler(s), but that didn't help much.