Question: Decode Cyrillic string (RFC 1522)

daniel.dfg

Member
Joined
Sep 10, 2012
Messages
5
Programming Experience
3-5
Hi all,

I hope you can help me solve this in VB.NET.

From a third-party tool I receive a string that is originally in Cyrillic (KOI8-R).

The string is always written in this format:

"=?" charset "?" encoding "?" encoded-text "?="

This is the encoded-word format defined in RFC 1522:
http://www.ietf.org/rfc/rfc1522.txt


One string I have to decode can look like this:

=?koi8-r?Q?56552813_=CE=C5_=C4=CC=D1_DPC?=

The "Q" encoding mixes readable text with hex-encoded bytes such as "=CE".
How can I retrieve the correctly decoded string using VB.NET?

Thank you so much in advance,
Daniel
 

JohnH

VB.NET Forum Moderator
Staff member
Joined
Dec 17, 2005
Messages
15,431
Location
Norway
Programming Experience
10+
I recommend using Regex to find the encoded-word pattern and extract the charset part and the text part, plus the Q/B part if you need it. With the charset name you can get an Encoding object. Then I would use Regex again to replace every '=XX' entity with the character it stands for. The hexadecimal value, like "CE", has to be parsed into a Byte with the Byte.Parse method (using NumberStyles.HexNumber), and that byte is then passed, as a Byte array, to the encoding's GetString method.
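
The first step described above (splitting the encoded word and looking up the charset) could be sketched roughly as follows. The pattern, the group names and the module name are only illustrative assumptions, not taken from the RFC text. It also assumes the .NET Framework, where "koi8-r" resolves directly; on .NET Core and later the code-pages encoding provider would have to be registered first.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module EncodedWordParts
    ' Illustrative pattern for "=?charset?encoding?encoded-text?=".
    ' The group names (charset, enc, text) are made up for this sketch.
    Private ReadOnly EncodedWord As New Regex(
        "=\?(?<charset>[^?]+)\?(?<enc>[QqBb])\?(?<text>[^?]*)\?=")

    Sub Main()
        Dim source As String = "=?koi8-r?Q?56552813_=CE=C5_=C4=CC=D1_DPC?="
        Dim m As Match = EncodedWord.Match(source)

        If m.Success Then
            ' Assumption: .NET Framework, where "koi8-r" is a known encoding name.
            Dim charsetEncoding As Encoding =
                Encoding.GetEncoding(m.Groups("charset").Value)

            Console.WriteLine(charsetEncoding.WebName)   ' the resolved charset
            Console.WriteLine(m.Groups("enc").Value)     ' Q or B
            Console.WriteLine(m.Groups("text").Value)    ' still-encoded text
        End If
    End Sub
End Module
```

GetEncoding throws an ArgumentException for a charset name it does not know, so real input from a third-party tool probably needs a Try/Catch around that call.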
 

daniel.dfg

Hi John,

Thank you, this sounds good. Extracting the byte values is fine for me, but could you show me how you would do the parsing in VB.NET code?
 

JohnH

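A minimal sketch of the parsing step, assuming the charset and the encoded text have already been extracted from the encoded word. DecodeQ and QDecoder are hypothetical names for this example. Instead of replacing each '=XX' on its own, it collects all bytes first and calls GetString once; that makes no difference for a single-byte charset like koi8-r, but it should also hold up for multi-byte charsets. In the Q encoding an underscore stands for a space. As above, this assumes the .NET Framework, where "koi8-r" is available without registering a provider.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text

Module QDecoder
    ' Hypothetical helper: decodes the text part of a Q-encoded word.
    Function DecodeQ(text As String, charset As String) As String
        Dim bytes As New List(Of Byte)()
        Dim i As Integer = 0

        While i < text.Length
            Dim value As Byte
            If text(i) = "="c AndAlso i + 2 < text.Length AndAlso
               Byte.TryParse(text.Substring(i + 1, 2),
                             NumberStyles.AllowHexSpecifier,
                             CultureInfo.InvariantCulture, value) Then
                ' "=XX": two hex digits give one raw byte.
                bytes.Add(value)
                i += 3
            ElseIf text(i) = "_"c Then
                ' Underscore represents a space in Q encoding.
                bytes.Add(CByte(32))
                i += 1
            Else
                ' Any other character is plain ASCII and is copied as-is.
                bytes.AddRange(Encoding.ASCII.GetBytes(text.Substring(i, 1)))
                i += 1
            End If
        End While

        ' Turn the collected bytes into text using the declared charset.
        Return Encoding.GetEncoding(charset).GetString(bytes.ToArray())
    End Function

    Sub Main()
        Console.WriteLine(DecodeQ("56552813_=CE=C5_=C4=CC=D1_DPC", "koi8-r"))
    End Sub
End Module
```

For the sample string the bytes CE C5 and C4 CC D1 map to "не" and "для" in KOI8-R, so the result should be "56552813 не для DPC". Whether the console displays the Cyrillic letters correctly depends on its output encoding. A "B" encoded word would need Convert.FromBase64String in place of the loop.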