Sorting GetFiles results into groups based on multiple extensions.

Johnson

Just need a few ideas. Say I wanted to sort a directory's files into groups based on file extension. Each group currently calls GetFiles and uses Where to pick out the different types. I have two concerns. The first is that I have 6 or 7 queries using GetFiles, and each query could have between 1 and 4 different extensions per group. This means I am doing the same search 6 or 7 times based on file type, so performance may be an issue. The other is that the last query is an "Other" type, where any file type not listed is grouped. I don't really fancy searching this group and removing each type already obtained above.

Ideas? Hope I have explained this well.
 
Hi,

Here is one example of grouping files by extension from a particular directory using LINQ.

VB.NET:
Imports System.IO

Public Class Form1
 
  Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim DI As New DirectoryInfo("d:\temp")
    ListBox1.Items.AddRange(GetFilesByType(DI, "*.txt").ToArray)
    ListBox2.Items.AddRange(GetFilesByType(DI, "*.bmp").ToArray)
    ListBox3.Items.AddRange(GetFilesByType(DI, "*.xml").ToArray)
    ListBox4.Items.AddRange(GetOtherFiles(DI).ToArray)
  End Sub
 
  Private Function GetFilesByType(ByVal DirInfo As DirectoryInfo, ByVal strExtension As String) As IEnumerable(Of FileInfo)
    Dim myFiles As IEnumerable(Of FileInfo) = (From myFile As FileInfo In DirInfo.GetFiles(strExtension) Select myFile)
    Return myFiles
  End Function
 
  Private Function GetOtherFiles(ByVal DirInfo As DirectoryInfo) As IEnumerable(Of FileInfo)
    ' Compare in lower case so that ".TXT" is not treated as an "other" type.
    Dim myFiles As IEnumerable(Of FileInfo) = (From myFile As FileInfo In DirInfo.GetFiles Let ext = myFile.Extension.ToLower Where ext <> ".txt" AndAlso ext <> ".bmp" AndAlso ext <> ".xml" Select myFile)
    Return myFiles
  End Function
End Class

Hope that helps.

Cheers,

Ian
 
I would start by defining the groups as an Enum, then map the extensions into a Dictionary. With a single pass over GetFiles you can then group the files by their extension into respective groups.

IanRyder, that's exactly what Johnson tries to avoid, multiple calls to GetFiles.
 
Hi Johnson/JohnH,

I enjoy a challenge, so here is a revised version based on JohnH's comments and your initial requirements:-

VB.NET:
Imports System.IO
 
Public Class Form2
  Private Enum myExtensions
    BitmapFiles
    XMLFiles
    TextFiles
  End Enum
 
  Private myDict As New Dictionary(Of Integer, String)
 
  Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    GetFilesByType()
  End Sub
 
  Private Sub GetFilesByType()
    Dim DirInfo As New DirectoryInfo("d:\temp")
    ' Read the directory once, then index the files by lower-case extension.
    Dim allFiles = DirInfo.GetFiles
    Dim myGroupedFiles = allFiles.ToLookup(Function(x As FileInfo) x.Extension.ToLower)
    Dim myOtherFiles = allFiles.Where(Function(x As FileInfo) Not myDict.ContainsValue(x.Extension.ToLower))
 
    ' Look each group up by its extension rather than by position, because the
    ' order of the groups depends on the order of the files. A lookup returns
    ' an empty sequence when no file has the requested extension.
    ListBox1.Items.AddRange(myGroupedFiles(myDict(myExtensions.BitmapFiles)).ToArray)
    ListBox2.Items.AddRange(myGroupedFiles(myDict(myExtensions.XMLFiles)).ToArray)
    ListBox3.Items.AddRange(myGroupedFiles(myDict(myExtensions.TextFiles)).ToArray)
    ListBox4.Items.AddRange(myOtherFiles.ToArray)
  End Sub
 
  Private Sub Form2_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
    myDict.Add(myExtensions.BitmapFiles, ".bmp")
    myDict.Add(myExtensions.XMLFiles, ".xml")
    myDict.Add(myExtensions.TextFiles, ".txt")
  End Sub
End Class

JohnH, I would love to see how you would have solved this since you specified using a Dictionary so I am guessing I am still missing something from an efficiency point of view.

Hope it helps,

Cheers,

Ian
 
Johnson said:
Each query could have between 1 and 4 different extensions per group.
I understand this to be typical extension grouping where, for example, an 'images' group contains the files that have the extensions .jpg, .png, .bmp and so on, and where there are other defined groups plus an 'undefined' group.
    Public Enum FileType
        Other
        Image
        Document
    End Enum

Here a few extensions are mapped to a group, and a lambda function is used to look up the group for each file's extension.
        Dim extGroups As New Dictionary(Of String, FileType)
        extGroups(".jpg") = FileType.Image
        extGroups(".png") = FileType.Image
        extGroups(".bmp") = FileType.Image
        extGroups(".doc") = FileType.Document
        extGroups(".txt") = FileType.Document
        extGroups(".xls") = FileType.Document

        Dim selector = Function(pt As String) As FileType
                           Dim ext = IO.Path.GetExtension(pt).ToLower
                           Return If(extGroups.ContainsKey(ext), extGroups(ext), FileType.Other)
                       End Function

        Dim filegroups = IO.Directory.GetFiles("I:\").GroupBy(selector).ToDictionary(Function(item) item.Key, Function(item) item.ToArray)

The result here is a dictionary where each key is one of the Enum values and each value is an array of the file paths whose extensions belong to that group. Depending on usage, a different approach may be more appropriate.
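As a rough, self-contained variation on the same idea, the sketch below groups a handful of file names in one pass. The file names are hypothetical stand-ins for the result of Directory.GetFiles, and the case-insensitive comparer, TryGetValue and ToLookup are substitutions made for illustration, not part of the example above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module GroupingSketch
    Public Enum FileType
        Other
        Image
        Document
    End Enum

    Sub Main()
        ' Case-insensitive keys, so ".JPG" and ".jpg" map to the same group.
        Dim extGroups As New Dictionary(Of String, FileType)(StringComparer.OrdinalIgnoreCase) From {
            {".jpg", FileType.Image},
            {".png", FileType.Image},
            {".doc", FileType.Document},
            {".txt", FileType.Document}
        }

        ' Hypothetical file names standing in for Directory.GetFiles output.
        Dim files As String() = {"a.JPG", "b.txt", "c.zip", "d.png"}

        ' TryGetValue leaves kind at its default (FileType.Other) for unknown extensions.
        Dim groups = files.ToLookup(Function(f)
                                        Dim kind As FileType
                                        extGroups.TryGetValue(Path.GetExtension(f), kind)
                                        Return kind
                                    End Function)

        ' A lookup returns an empty sequence for a group that received no files.
        For Each ft As FileType In [Enum].GetValues(GetType(FileType))
            Console.WriteLine("{0}: {1}", ft, String.Join(", ", groups(ft)))
        Next
    End Sub
End Module
```

A lookup is used here instead of a dictionary only because it tolerates empty groups. If you need arrays keyed by group, the ToDictionary form works equally well.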

Maybe you could also turn around the extension table for a different kind of lookup like this:
        Dim extGroups As New Dictionary(Of FileType, String())
        extGroups(FileType.Document) = {".doc", ".txt", ".xls"}
        extGroups(FileType.Image) = {".jpg", ".png", ".bmp"}

        For Each file In IO.Directory.EnumerateFiles("I:\")
            Dim lfile = file
            Dim group = (From g In extGroups Where g.Value.Contains(IO.Path.GetExtension(lfile).ToLower) Select g.Key).FirstOrDefault

        Next

Note that FirstOrDefault in this last example returns either the group that was found (for example FileType.Image) or the default FileType value, which is FileType.Other because it is the first member of the Enum.
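The fallback behaviour of FirstOrDefault can be checked with a small console sketch like the one below. The two test extensions are made up for illustration, and the sketch relies on FileType.Other being declared first so that it has the value 0.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DefaultGroupSketch
    Public Enum FileType
        Other      ' Value 0, so this is what FirstOrDefault yields when nothing matches.
        Image
        Document
    End Enum

    Sub Main()
        Dim extGroups As New Dictionary(Of FileType, String()) From {
            {FileType.Document, New String() {".doc", ".txt", ".xls"}},
            {FileType.Image, New String() {".jpg", ".png", ".bmp"}}
        }

        ' Hypothetical extensions: one that is mapped and one that is not.
        For Each ext In {".png", ".zip"}
            Dim current = ext
            Dim group = (From g In extGroups Where g.Value.Contains(current) Select g.Key).FirstOrDefault()
            Console.WriteLine("{0} -> {1}", current, group)
        Next
    End Sub
End Module
```

If Other were moved further down the Enum, unmatched files would silently fall into whichever member has the value 0, so the declaration order matters with this approach.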
 
Two very neat, tidy and informative examples JohnH. I would never have thought to create a string array within a Dictionary with the key being assigned to the File Group.

Thanks very much.

I hope that also helps to get you where you need to be Johnson?

Cheers,

Ian
 
Some good stuff guys. The only thing I don't understand is how to loop through "group" and print each group's new exe name out.

Debug.WriteLine(group(0).ToString), for example, I know doesn't work lol.
 
Johnson said:
I don't understand how to loop through "group"
If by "group" you are referring to the 'group' variable in the last example in my post, then that is the FileType value for the current file in the loop.
 
Hi John. What I mean is I wish to display each group. Each group has its own ListView. I'm a bit confused about how to split the groups now :S
 
If you look at the last example I posted, you can see that the group is identified for each file. You can then put the file in the ListView for that group. This is probably the simplest approach and uses the least memory, since you have a single loop that decides 'put here or put there'.

In the first example I posted, you can loop through the array of files in the Dictionary by FileType key and put each file in the ListView for that group. Here all files are grouped in memory first, and you then use one loop for each group. For example:
For Each file in filegroups(FileType.Image)

Next
 
I'm really having a bad day. I have read and reread what you just posted regarding the last snippet, but it's not sinking in.

VB.NET:
        For Each file In IO.Directory.EnumerateFiles(Me.DirectorySearchTextBox.Text)
            Dim lfile = file
            Me.ListBox1.Items.Add(From g In extGroups Where g.Value.Contains(IO.Path.GetExtension(lfile).ToLower) Select g.Key)
        Next

hmm
 
Hi Johnson,

I think you have misunderstood JohnH's examples. In the snippet you are trying to get working, you are adding the LINQ query object itself to the ListBox, so what gets displayed is the name of the query's type rather than a file or a group.

What you should be doing is interrogating the FileType that the LINQ statement returns. So the statement:-

VB.NET:
Dim Group As FileType = (From g In extGroups Where g.Value.Contains(IO.Path.GetExtension(lfile).ToLower) Select g.Key).FirstOrDefault

assigns a FileType to the variable Group, which will be one of the following:-

VB.NET:
Group = FileType.Other
Group = FileType.Image
Group = FileType.Document

You can then easily say something like:-

VB.NET:
Select Case Group
  Case FileType.Other
    listboxOther.Items.Add(lfile)
  Case FileType.Image
    listboxImage.Items.Add(lfile)
  Case FileType.Document
    listboxDocument.Items.Add(lfile)
End Select

In JohnH's first example, populating the ListBoxes is even easier, since JohnH ended up creating a Dictionary with its key set to the FileType and its value set to an array of strings representing the files within that file group. So to populate a ListBox with a particular group of files you would just say:-

VB.NET:
listboxDocument.Items.AddRange(filegroups(FileType.Document))
listboxImage.Items.AddRange(filegroups(FileType.Image))
listboxOther.Items.AddRange(filegroups(FileType.Other))
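One thing to watch for: a dictionary built with GroupBy and ToDictionary only contains keys for groups that received at least one file, so indexing a group with no files would throw a KeyNotFoundException. A minimal guard could look like the sketch below, where FilesFor is a hypothetical helper name and the dictionary contents are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module MissingGroupSketch
    Public Enum FileType
        Other
        Image
        Document
    End Enum

    ' Hypothetical helper: returns the files for a group, or an empty array
    ' when the dictionary has no entry for that group.
    Function FilesFor(filegroups As Dictionary(Of FileType, String()), group As FileType) As String()
        Dim files As String() = Nothing
        Return If(filegroups.TryGetValue(group, files), files, New String() {})
    End Function

    Sub Main()
        ' Made-up grouping result in which no Document files were found.
        Dim filegroups As New Dictionary(Of FileType, String()) From {
            {FileType.Image, New String() {"a.jpg"}}
        }

        Console.WriteLine(FilesFor(filegroups, FileType.Image).Length)
        Console.WriteLine(FilesFor(filegroups, FileType.Document).Length)
    End Sub
End Module
```

With a helper like this, a call such as listboxDocument.Items.AddRange(FilesFor(filegroups, FileType.Document)) should simply add nothing when the directory holds no documents.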

Hope that helps and Merry Christmas.

Cheers,

Ian
 