ASCII Code of Currency Symbol unexpectedly changed

robertb_NZ

MANASYS Jazz is a COBOL-generating programming system. Sometimes I want to generate a line like this, with a currency symbol for Euro, Pounds, or Yen:
VB.NET:
000090            CURRENCY SIGN IS '€' WITH PICTURE SYMBOL '$'.
The line is written out with
VB.NET:
CobolOut.WriteLine(Lineout)
CobolOut is defined in the WriteCOBOL class as
VB.NET:
Property CobolOut As StreamWriter

When CobolOut.WriteLine(Lineout) is executed, Lineout carries the Euro symbol as x80, which is correct. However, it is written to the file as xE282AC, which I'm told is the UTF-8 encoding for a Euro. This causes the COBOL compile to fail. I get correct results if I edit the COBOL line to ensure that x80 is used. With both x80 and xE282AC the symbol is displayed correctly in an editor.

How do I prevent this code change?
Thank you, Robert Barnes.
 
The StreamWriter is opened with
VB.NET:
    Sub Open(COBOLFile As String)
        CobolOut = New StreamWriter(COBOLFile)
so it opens it with the default encoding, UTF-8. COBOLFile is of course the path to the file that is to be written. If I type a comma after COBOLFile in the statement CobolOut = New ..., the first option suggested (identified as the 3rd of 7 possible overloads) is "encoding As System.Text.Encoding". So I wrote
VB.NET:
        CobolOut = New StreamWriter(COBOLFile, System.Text.Encoding.UTF7)
Now the statement is flagged as invalid, with message code BC30518 which has description
Error BC30518 Overload resolution failed because no accessible 'New' can be called with these arguments:
'Public Overloads Sub New(stream As Stream, encoding As Encoding)': Value of type 'String' cannot be converted to 'Stream'.
'Public Overloads Sub New(path As String, append As Boolean)': Value of type 'Encoding' cannot be converted to 'Boolean'.
I was able to write
VB.NET:
CobolOut = New StreamWriter(COBOLFile, False, System.Text.Encoding.UTF8)

I experimented with various encoding values. MANASYS Jazz has generated a line which is seen in Notepad and Visual Studio as
000090 CURRENCY SIGN IS '€' WITH PICTURE SYMBOL '$'.
ASCII: line 000090 is written with ? in place of the Euro symbol. The COBOL program compiles and runs, printing data with ? where the PICTURE clause specified $.
UTF7: this screws up the COBOL program.
UTF8: this is the default, so I expected the same results as before any change, i.e. the COBOL program would look exactly as I wanted but wouldn't compile. However, although the COBOL program looked identical, it did compile. Line 000090 was exactly as above, and the Euro symbol was printed correctly.
Unicode: the COBOL looked perfect, but there were a number of errors when I tried to compile it.

There is a mystery about why I get different results for the default encoding and for UTF8, as I thought that UTF8 was the default, but I'm not motivated enough to find out why. I have a solution that works, so I will go with this.

Thank you JohnH for your help.
 
When you don't specify an encoding, StreamWriter uses UTF-8 without a BOM at the start of the file; when you specify Encoding.UTF8, it adds that BOM. This is explained in the constructor remarks I linked to.
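As a minimal sketch of that BOM difference (the module name and the file name sample.cbl are made up for illustration, not taken from the poster's project), the following shows the preamble bytes of Encoding.UTF8 and how a UTF8Encoding created with False writes UTF-8 without one:

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomDemo
    Sub Main()
        ' Encoding.UTF8 carries a three-byte preamble (the BOM): EF-BB-BF.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble()))

        ' UTF8Encoding(False) is UTF-8 with no preamble, so this prints 0.
        Dim utf8NoBom As New UTF8Encoding(False)
        Console.WriteLine(utf8NoBom.GetPreamble().Length)

        ' Hypothetical output file, for illustration only.
        Dim outputPath As String = "sample.cbl"

        ' ChrW(&H20AC) is the Euro sign; built this way so the example does
        ' not depend on the encoding of the VB source file itself.
        Dim lineOut As String = "000090            CURRENCY SIGN IS '" &
                                ChrW(&H20AC) & "' WITH PICTURE SYMBOL '$'."

        ' Same overload as in the thread: path, append flag, encoding.
        Using writer As New StreamWriter(outputPath, False, utf8NoBom)
            writer.WriteLine(lineOut)
        End Using
    End Sub
End Module
```

Either way the Euro sign itself is still written as the three bytes E2 82 AC; only the presence of the BOM at the start of the file changes.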

I don't know what encoding COBOL source code is expected to be in. From some web reading I find that Windows code page 1252 does translate the euro sign to hex 80 / dec 128. You can get that encoding with Encoding.GetEncoding(1252), but it would be speculative to use it without knowing. You will not get x80 for that character with UTF-8 encoding.

ASCII only goes up to dec 127 and substitutes ? for all other characters.
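To make the byte values concrete, here is a small sketch that encodes the Euro sign three ways and then writes a file with code page 1252. It assumes the COBOL compiler really does expect Windows-1252, which nothing in this thread confirms; the module name and the file name sample1252.cbl are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module EuroBytesDemo
    Sub Main()
        ' On .NET Core / .NET 5+, code page 1252 is only available after
        ' registering the code pages provider. On .NET Framework this step
        ' is not needed.
        ' Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

        Dim euro As String = ChrW(&H20AC)   ' the Euro sign
        Dim cp1252 As Encoding = Encoding.GetEncoding(1252)

        Console.WriteLine(BitConverter.ToString(cp1252.GetBytes(euro)))         ' 80
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(euro)))  ' E2-82-AC
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes(euro))) ' 3F, i.e. "?"

        ' Assumption: the compiler reads the source as Windows-1252.
        ' Hypothetical output file, for illustration only.
        Dim outputPath As String = "sample1252.cbl"
        Dim lineOut As String = "000090            CURRENCY SIGN IS '" &
                                euro & "' WITH PICTURE SYMBOL '$'."

        ' Code page 1252 has no preamble, so no BOM is written.
        Using writer As New StreamWriter(outputPath, False, cp1252)
            writer.WriteLine(lineOut)
        End Using
    End Sub
End Module
```

If the compiler turns out to expect a different code page, only the argument to GetEncoding would change.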
 
Thank you JohnH. After Googling "What is a Byte Order Mark" I understand a little more about what's going on. I expect I'll have to remain alert to this issue, as users will have different culture settings from mine. I wonder what happens when the COBOL program is sent to a mainframe. Somewhere in the FTP software the UTF characters get converted to EBCDIC. I'll ask my user in Copenhagen to find out for me.