Reading all characters from a font

Robert_Zenz

Well-known member
Joined
Jun 3, 2008
Messages
503
Location
Vienna, Austria
Programming Experience
3-5
Hello.

I'm trying to recreate the Windows Character Map. Creating the list of installed fonts is rather easy, but I can't figure out how to read the characters from the font.

My first guess was to just display all character codes (Char.ConvertFromUtf32), but there are a lot of 'empty' characters in between (most likely not supported or included by the font). How can I read all the characters from a font without having to go through every character code?

Best Regards and Thanks in advance,
Bobby
 

JohnH

VB.NET Forum Moderator
Staff member
Joined
Dec 17, 2005
Messages
15,334
Location
Norway
Programming Experience
10+
After some research I found and tested the Win32 function GetFontUnicodeRanges. I'll share the code for using it, because the few code samples I've found on the web have not been functional for VB.NET.
Code:
'Requires Imports System.Drawing and Imports System.Runtime.InteropServices at the top of the file
Declare Function GetFontUnicodeRanges Lib "gdi32.dll" (ByVal hdc As IntPtr, ByVal lpGlyphset As IntPtr) As UInt32

Declare Function SelectObject Lib "gdi32.dll" (ByVal hdc As IntPtr, ByVal hObject As IntPtr) As IntPtr

Declare Function DeleteObject Lib "gdi32.dll" (ByVal hObject As IntPtr) As Int32

Public Structure FontRange
    Public Low, High, Count As UShort
End Structure

Public Structure Glyphset
    Public cbThis, flAccel, cGlyphsSupported, cRanges As UInteger, ranges() As FontRange
End Structure

Public Function GetUnicodeRangesForFont(ByVal font As Font) As Glyphset
    'Win32 GetFontUnicodeRanges
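    Dim hdc, hFont, old, lpGlyphSet As IntPtr
    Dim bytes() As Byte = Nothing
    Dim g As Graphics = Graphics.FromHwnd(IntPtr.Zero)
    hdc = g.GetHdc()
    hFont = font.ToHfont()
    old = SelectObject(hdc, hFont)
    Try
        'first call with a null buffer returns the required size in bytes (0 on failure)
        Dim size As UInteger = GetFontUnicodeRanges(hdc, IntPtr.Zero)
        If size = 0 Then Throw New InvalidOperationException("GetFontUnicodeRanges failed for this font.")
        lpGlyphSet = Marshal.AllocHGlobal(CInt(size))
        'second call fills the buffer and returns the number of bytes written (0 on failure)
        Dim read As UInteger = GetFontUnicodeRanges(hdc, lpGlyphSet)
        If read = 0 Then Throw New InvalidOperationException("GetFontUnicodeRanges failed for this font.")
        ReDim bytes(CInt(read) - 1)
        Marshal.Copy(lpGlyphSet, bytes, 0, bytes.Length)
    Finally
        'cleanup, also runs when one of the calls above fails
        SelectObject(hdc, old)
        If lpGlyphSet <> IntPtr.Zero Then Marshal.FreeHGlobal(lpGlyphSet)
        g.ReleaseHdc(hdc)
        g.Dispose()
        DeleteObject(hFont)
    End Try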

    'get glyph data
    Dim gs As New Glyphset
    gs.cbThis = BitConverter.ToUInt32(bytes, 0)
    gs.flAccel = BitConverter.ToUInt32(bytes, 4)
    gs.cGlyphsSupported = BitConverter.ToUInt32(bytes, 8)
    gs.cRanges = BitConverter.ToUInt32(bytes, 12)
    Array.Resize(gs.ranges, CInt(gs.cRanges))
    For i As Integer = 0 To gs.ranges.Length - 1
        gs.ranges(i).Low = BitConverter.ToUInt16(bytes, 16 + (i * 4))
        gs.ranges(i).Count = BitConverter.ToUInt16(bytes, 18 + (i * 4))
        'compute in Integer so a range that ends at U+FFFF cannot overflow UShort
        gs.ranges(i).High = CUShort(CInt(gs.ranges(i).Low) + gs.ranges(i).Count - 1)
    Next
    '
    Return gs
End Function
Call GetUnicodeRangesForFont with a Font object, then loop over the FontRange array in the returned Glyphset and, for each range, loop from Low to High with Char.ConvertFromUtf32.
Code:
Dim gs As Glyphset = GetUnicodeRangesForFont(Me.Font)
For Each range As FontRange In gs.ranges
    For i As Integer = range.Low To range.High
        Dim s As String = Char.ConvertFromUtf32(i)
        's now holds one character the font supports, e.g. add it to your list or grid here

    Next
Next
Since Low and Count are 16-bit values, these ranges can only describe code points up to U+FFFF. I haven't figured out if or how characters that need Unicode surrogate pairs (two UTF-16 code units) work for these ranges. Note also that some fonts use font substitution, for example if you create a "Courier" font it really returns a "Microsoft Sans Serif" font. I've also seen the Uniscribe APIs mentioned in relation to this, but haven't looked into them yet.
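To illustrate the surrogate pair point, here is a minimal sketch (the module name SurrogateDemo is just a placeholder, not part of the code above). It only shows how Char.ConvertFromUtf32 behaves for a code point above U+FFFF, it doesn't say anything about what a given font supports.

```vbnet
Imports System

Module SurrogateDemo
    Sub Main()
        'U+0041 fits in a single UTF-16 code unit
        Dim bmp As String = Char.ConvertFromUtf32(&H41)
        'U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF
        Dim astral As String = Char.ConvertFromUtf32(&H1D11E)

        Console.WriteLine(bmp.Length)                      'prints 1
        Console.WriteLine(astral.Length)                   'prints 2
        Console.WriteLine(Char.IsSurrogatePair(astral, 0)) 'prints True
    End Sub
End Module
```

So a string returned by Char.ConvertFromUtf32 can have two Chars for one character, which a UShort Low/High pair cannot express.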

The names of the characters are useful for a map, so check out the Unicode Character Database, especially UnicodeData.txt. You can use Integer.Parse with NumberStyles.HexNumber on the first field to get the code point, and take the name string from the second field.
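A rough sketch of reading the names could look like this, assuming you have downloaded a copy of UnicodeData.txt. The module name UnicodeNames, the function name LoadCharacterNames and the path parameter are made up for the example.

```vbnet
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO

Module UnicodeNames
    'Loads code point -> name pairs from a local copy of UnicodeData.txt.
    'Names and the path parameter are hypothetical, adjust them to your project.
    Public Function LoadCharacterNames(ByVal path As String) As Dictionary(Of Integer, String)
        Dim names As New Dictionary(Of Integer, String)
        For Each line As String In File.ReadAllLines(path)
            'fields are separated by semicolons: code point;name;...
            Dim fields() As String = line.Split(";"c)
            If fields.Length < 2 Then Continue For
            'first field is the code point in hex, second field is the name
            Dim codePoint As Integer = Integer.Parse(fields(0), NumberStyles.HexNumber, CultureInfo.InvariantCulture)
            names(codePoint) = fields(1)
        Next
        Return names
    End Function
End Module
```

With the dictionary loaded, names(&H41) should give "LATIN CAPITAL LETTER A". Be aware that some entries carry placeholder names in angle brackets, such as "<control>" or the First/Last markers of large ranges like the CJK ideographs, so a real character map would need to handle those separately.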
 

Robert_Zenz

Thanks for this piece of code!
It's working great for most fonts...most. I'll have a look into it when I have the time again and post what I can find out on this.

Thanks again!
Bobby

Edit: Whoops, seems like it works on all fonts...I just have to implement it right. :p
 