Question Reading and writing UTF8 file?

littlebigman

Well-known member
Joined
Jan 5, 2010
Messages
75
Programming Experience
Beginner
Hello

This is probably an easy issue for experts but Google didn't help.

I need to read a text file that contains addresses formatted thusly:
VB.NET:
Address1
Address1
Address1

Address2
Address2
Address2
etc.

... and write them into this:
VB.NET:
Address1, Address1, Address1
Address2, Address2, Address2
etc.

The data contains East European characters.

The following code mostly works, but if I don't specify an encoding for the output, it's saved as ASCII (DOC charset), and if I add "Encoding.UTF8" I get a compile error ("Error 1 Overload resolution failed because no accessible 'New' can be called with these arguments").

VB.NET:
Dim objReader As New StreamReader("c:\input.utf8.txt", Encoding.Default)

'BAD : ASCII
'Dim objWriter As New StreamWriter("c:\output.utf8.txt")
'Compile error
Dim objWriter As New StreamWriter("c:\output.utf8.txt",Encoding.UTF8)

Dim Line As String
Dim WholeFile As String = Nothing

Do While objReader.Peek() <> -1
    Line = objReader.ReadLine
    If Line.Length = 0 Then
        WholeFile = WholeFile.Substring(0, WholeFile.Length - 2)
        WholeFile = WholeFile + vbCrLf
        RichTextBox1.Text = WholeFile
    Else
        WholeFile = WholeFile + Line + ", "
    End If
Loop

objWriter.Write(WholeFile)

objWriter.Close()
objReader.Close()

Does someone know what must be done for VB.Net to handle UTF8 as expected?

Thank you.
 

JohnH

VB.NET Forum Moderator
Staff member
Joined
Dec 17, 2005
Messages
15,557
Location
Norway
Programming Experience
10+
'BAD : ASCII Dim objWriter As New StreamWriter("c:\output.utf8.txt")
No, UTF8 is the default. There is no byte-order-mark, but the string content is still UTF8.
'Compile error
Dim objWriter As New StreamWriter("c:\output.utf8.txt",Encoding.UTF8)
Have you even looked at the available constructors? StreamWriter Constructor (System.IO)
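
For anyone landing here later, a minimal sketch of what the constructor list implies: there is no StreamWriter overload taking just (path, encoding), but there is one taking (path, append, encoding). The sketch also reads the input with Encoding.UTF8, since on .NET Framework Encoding.Default decodes with the system ANSI code page. The module name and the JoinAddressBlocks helper are made up for illustration, and the file paths are the ones from the question.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module AddressJoinSketch
    ' Hypothetical helper: joins each blank-line-separated block into one
    ' comma-separated line.
    Function JoinAddressBlocks(ByVal lines As IEnumerable(Of String)) As List(Of String)
        Dim rows As New List(Of String)
        Dim current As New List(Of String)
        For Each line As String In lines
            If line.Trim().Length = 0 Then
                ' A blank line closes the current block, if there is one.
                If current.Count > 0 Then
                    rows.Add(String.Join(", ", current.ToArray()))
                    current.Clear()
                End If
            Else
                current.Add(line)
            End If
        Next
        ' Flush the last block when the file does not end with a blank line.
        If current.Count > 0 Then rows.Add(String.Join(", ", current.ToArray()))
        Return rows
    End Function

    Sub Main()
        ' Read the input as UTF-8 rather than Encoding.Default.
        Dim lines As String() = File.ReadAllLines("c:\input.utf8.txt", Encoding.UTF8)

        ' (path, append, encoding) is an existing overload; Encoding.UTF8 emits a BOM.
        Using writer As New StreamWriter("c:\output.utf8.txt", False, Encoding.UTF8)
            For Each row As String In JoinAddressBlocks(lines)
                writer.WriteLine(row)
            Next
        End Using
    End Sub
End Module
```

Collecting each block in a list and joining it avoids trimming a trailing ", " with Substring, which in the original loop fails if the file starts with a blank line and leaves a dangling separator if the file does not end with one.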
 

littlebigman

Thank you. It turns out the output file itself is indeed UTF8, and the display errors I had were due to an editor that doesn't really support it.
 

JohnH

If that editor relies on a BOM to detect UTF8, you have to specify Encoding.UTF8 explicitly when creating the StreamWriter.
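
A rough way to see the difference is to write the same text both ways and check the first three bytes of each file for the UTF-8 byte-order mark (EF BB BF). This is only a sketch: the module name and the HasUtf8Bom helper are made up for illustration, and it writes to temporary files rather than the paths from the question.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomCheckSketch
    ' Hypothetical helper: True when the file starts with the UTF-8 BOM (EF BB BF).
    Function HasUtf8Bom(ByVal fileName As String) As Boolean
        Dim bytes As Byte() = File.ReadAllBytes(fileName)
        Return bytes.Length >= 3 AndAlso bytes(0) = &HEF AndAlso bytes(1) = &HBB AndAlso bytes(2) = &HBF
    End Function

    Sub Main()
        Dim defaultFile As String = Path.GetTempFileName()
        Dim explicitFile As String = Path.GetTempFileName()

        ' Default constructor: UTF-8 content, no BOM.
        Using writer As New StreamWriter(defaultFile)
            writer.Write("Łódź")
        End Using

        ' Explicit Encoding.UTF8: same content, preceded by a BOM.
        Using writer As New StreamWriter(explicitFile, False, Encoding.UTF8)
            writer.Write("Łódź")
        End Using

        Console.WriteLine(HasUtf8Bom(defaultFile))  ' False
        Console.WriteLine(HasUtf8Bom(explicitFile)) ' True

        File.Delete(defaultFile)
        File.Delete(explicitFile)
    End Sub
End Module
```

If the BOM ever needs to be suppressed explicitly, passing New UTF8Encoding(False) as the encoding should give UTF-8 output without it.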
 