Windows Form that reads multiple XML files

daveofgv

I hope someone can help me.

For work, I need a simple Windows Form that can search multiple XML files in subfolders (there will be 500+ of them) and look for two specific tags (<name> and <number>).

All the XML files should have the same values in those tags. If one doesn't match, the program should flag it (list it in a text file or move the XML file).

Can anyone help me with this?

Thanks in advance

daveofgv
 
You can try something like this:

VB.NET:
            Imports System.IO

VB.NET:
            Dim dir As DirectoryInfo = New DirectoryInfo("C:\XMLFiles")
            'only pick up xml files, including those in subfolders
            Dim files() As FileInfo = dir.GetFiles("*.xml", SearchOption.AllDirectories)

            'go through all files found
            For element As Integer = 0 To files.Length - 1
                'read xml file here, e.g. files(element).FullName

            Next

Check out these links for parsing XML files:

XML File Parsing in VB.NET - CodeProject

How to: Parse XML with XmlReader


Hope that helps.
 
I understand how to read XML into a DataGrid and how to list the XML files in a directory. Maybe I am missing it, but in the links you provided I did not see how to choose a directory and search all the XML files in it (maybe 500 of them) for a certain element value.

Am I not getting something?

Thanks
daveofgv
 
Here is some sample code I wrote; hope this helps.


VB.NET:
Imports System.IO
Imports System.Xml

'I didn't take much time writing this, but I did test it and it works. Please let me know if this helps. All I have on the form is two listboxes. The
'content of the xml files and the results this code gives are shown in comments below
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim xm As New XMLfiles_Flagged()

        Dim flaggeds() As String = xm.getFlaggedFiles
        Dim files() As String = xm.getAllFiles

        ListBox2.DataSource = files
        ListBox1.DataSource = flaggeds

    End Sub


    Class XMLfiles_Flagged
        Private rootLocation As String
        Private xmlFileLocations As New ArrayList
        Private xmlFileLocations_Flagged As New ArrayList
        ReadOnly Property getAllFiles As String()
            Get
                Return CType(xmlFileLocations.ToArray(GetType(String)), String())
            End Get
        End Property
        ReadOnly Property getFlaggedFiles() As String()
            Get
                Return CType(xmlFileLocations_Flagged.ToArray(GetType(String)), String())
            End Get
        End Property

        Sub New(Optional ByVal location As String = "C:\XMLfiles")
            rootLocation = location

            Dim dirRootLoc As New DirectoryInfo(rootLocation)
            recursiveGetXMLFiles(dirRootLoc)

            flagXMLFiles()
        End Sub

        Private Sub flagXMLFiles()
            For Each xmlLoc As String In xmlFileLocations
                If Not xmlContainsValue(xmlLoc, "firstname", "Tom") Then
                    xmlFileLocations_Flagged.Add(xmlLoc)
                End If
            Next
        End Sub

        Private Function xmlContainsValue(ByVal xmlFileLocation As String, ByVal element As String, ByVal matchValue As String) As Boolean
            Dim matches As Boolean = False

            'The code below enclosed in ---------vvv------
            'is based on http://www.codeproject.com/KB/cpp/parsefilecode.aspx 'I TAKE NO CREDIT FOR IT 'modified a bit of it though
            'Using makes sure the reader is closed even if reading throws
            Using m_xmlr As New XmlTextReader(xmlFileLocation)
                'Disable whitespace so that you don't have to read over whitespaces
                m_xmlr.WhitespaceHandling = WhitespaceHandling.None

                'Skip ahead to the first element with the requested name;
                'only that first occurrence is compared
                If m_xmlr.ReadToFollowing(element) Then
                    'Read the element's text and compare it to the expected value
                    matches = (m_xmlr.ReadElementString() = matchValue)
                End If
            End Using
            '------^^^---------'

            Return matches
        End Function
        Private Sub recursiveGetXMLFiles(ByVal rootDir As DirectoryInfo)
            Dim files() As FileInfo = rootDir.GetFiles

            For Each xmlFile As FileInfo In files
                'compare the extension case-insensitively so ".XML" is picked up too
                If String.Equals(xmlFile.Extension, ".xml", StringComparison.OrdinalIgnoreCase) Then
                    xmlFileLocations.Add(xmlFile.FullName)
                End If
            Next

            'recurse into each subfolder
            For Each subDir As DirectoryInfo In rootDir.GetDirectories
                recursiveGetXMLFiles(subDir)
            Next

        End Sub
    End Class


    'These are a list of files in the directory
    'C:\XMLfiles\file1.xml
    'C:\XMLfiles\file2.xml
    'C:\XMLfiles\Some\New folder\file1 (2).xml
    'C:\XMLfiles\Some\New folder\file1.xml
    'C:\XMLfiles\Some\New folder\asdf\file1.xml
    'C:\XMLfiles\Some - Copy\New folder\asdf\file1 (2).xml
    'C:\XMLfiles\Some - Copy\New folder\asdf\file1.xml
    'C:\XMLfiles\Some - Copy\Some\file1 (2).xml
    'C:\XMLfiles\Some - Copy\Some\file1.xml
    'C:\XMLfiles\Some - Copy\Some\New folder\asdf\file1.xml


    'These two files are returned as flagged
    'C:\XMLfiles\file2.xml
    'C:\XMLfiles\Some - Copy\Some\New folder\asdf\file1.xml


    'this is xml content of file1.xml that was flagged
    '    <?xml version="1.0" encoding="UTF-8"?>
    '<family>
    '  <name gender="Male">
    '    <firstname>ToNNm</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '  <name gender="Female">
    '    <firstname>Dale</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '</family>

    'this is the file2.xml that was flagged
    '    <?xml version="1.0" encoding="UTF-8"?>
    '<family>
    '  <name gender="Male">
    '    <firstname>Tommy</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '  <name gender="Female">
    '    <firstname>Dale</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '</family>


    'ALL of the rest are this exact xml file
    '    <?xml version="1.0" encoding="UTF-8"?>
    '<family>
    '  <name gender="Male">
    '    <firstname>Tom</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '  <name gender="Female">
    '    <firstname>Dale</firstname>
    '    <lastname>Smith</lastname>
    '  </name>
    '</family>
End Class




Also, if you didn't know how to move the files, you can use the File.Move() function. I'm sure you already know this, but you can use System.IO.StreamWriter to write the flagged file locations to a text file.
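
A minimal sketch of that last step, assuming the flagged paths are already available as a String array (which is what getFlaggedFiles returns above). The module name, the ReportAndMove name and its parameters are made up for illustration and are not part of the code above.

```vbnet
Imports System.IO

Module FlaggedFileActions
    ' Hypothetical helper: lists every flagged path in a text report, then
    ' moves each flagged file into quarantineDir.
    Sub ReportAndMove(ByVal flagged As String(), ByVal reportPath As String, ByVal quarantineDir As String)
        ' Does nothing if the folder already exists
        Directory.CreateDirectory(quarantineDir)

        ' False = overwrite any report left over from an earlier run
        Using writer As New StreamWriter(reportPath, False)
            For index As Integer = 0 To flagged.Length - 1
                Dim source As String = flagged(index)
                writer.WriteLine(source)

                ' File.Move throws if the target already exists, and the same
                ' file name can occur in several subfolders, so a running
                ' index is prefixed to keep the target names unique.
                Dim targetName As String = index.ToString() & "_" & Path.GetFileName(source)
                File.Move(source, Path.Combine(quarantineDir, targetName))
            Next
        End Using
    End Sub
End Module
```

With the class above this could be called as ReportAndMove(xm.getFlaggedFiles, "C:\flagged.txt", "C:\XMLflagged"), where both paths are placeholders. If you only want the report, leave out the File.Move line.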
 
daveofgv said:
search multiple XML's (in sub folders (there will be about 500 + XML's)) and look for 2 specific tags (<name> and <number>).

All the XML's will have the same values. If one doesn't match then it flags it

VB.NET:
Dim sameName = "name"
Dim sameNumber = "99"
For Each file In IO.Directory.GetFiles("c:\folder\", "*.xml", IO.SearchOption.AllDirectories)
    Dim doc = XDocument.Load(file)
    If doc...<name>.Value <> sameName OrElse doc...<number>.Value <> sameNumber Then
        Debug.WriteLine(file)
    End If
Next
 
Wow, thanks JohnH. Talk about me reinventing the wheel; I hate when I do that. I wish I had known about that function earlier.
 
Thanks JohnH.

I do have a question though.

All our XML files will look like this:

VB.NET:
<?xml version="1.0" encoding="iso-8859-1" ?>
<Documents xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="ourxsdfile.xsd">
  <Document Version="1.0" Name="XML Invoice" Type="Invoice" DocumentReference="" ImageReference="" CertificateReference="">
    <Invoice>
      <Supplier>
        <name>Made up Company</name>
        <Identifier />
        <Number>455325434</Number>
        <Description />
        <CorporateGroupNumber />
        <POBox />
        <Street />
        <StreetSupplement />
        <PostalCode />
        <City />
        <CountryCode />
        <CountryName />
        <TelephoneNumber />
        <FaxNumber />
        <VATRegistrationNumber />
      </Supplier>

There will be other <name> and <number> tags elsewhere in the XML, so I have to make sure it only checks the ones under the <Supplier> tag.

Debug.WriteLine(file) will not write the flagged file's name to a text file, will it?

I will use one My.Settings value to choose the folder where the text file will be created, and another to choose which folder (including subfolders) to check for XML files.

Does the schema of the XML make a difference to how this should be written?

Thanks

daveofgv
 
There will be other <name> and <number> tags within the XML - so I have to make sure it checks for them under the <Supplier> tag.
...<> is the descendants axis: it matches elements with that name at any depth below the node, so where the element sits in the tree doesn't matter and full qualification is not necessary. So if you want the <Supplier> node you simply ask for that first. You'll be looking at doc...<Supplier>.<name> then. The query mirrors the logical tree structure of the XML. If you look at the XML in a browser or in VS you may see it displayed as an indented tree where you can also expand/collapse any level:
HTML:
<Documents>
  <Document>
    <Invoice>
      <Supplier>
        <name>Made up Company</name>
        <Identifier />
        <Number>455325434</Number>
Debug.WriteLine(File) will not write the file to this location?
Debug.WriteLine is just for debug output in VS, like MessageBox; in code samples it is typically a statement that shows the result of the operation. How you arrange the results once you have figured out how to get them is really an unrelated next step.
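
To tie the two points together, here is a rough sketch of the <Supplier> query collecting its results in a list instead of sending them to Debug.WriteLine. The module name, the FindMismatches name and its parameters are assumptions for illustration only. Note that element names in XML are case-sensitive: the posted sample has <name> but <Number>, so the query has to use that exact casing. If a file contains several <Supplier> elements, .Value only looks at the first one.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Linq

Module SupplierCheck
    ' Hypothetical helper: returns the paths of all *.xml files under rootDir
    ' (subfolders included) whose first <Supplier> has a different name or Number.
    Function FindMismatches(ByVal rootDir As String, ByVal expectedName As String, ByVal expectedNumber As String) As List(Of String)
        Dim mismatches As New List(Of String)

        For Each xmlPath As String In Directory.GetFiles(rootDir, "*.xml", SearchOption.AllDirectories)
            Dim doc As XDocument = XDocument.Load(xmlPath)

            ' Descendants axis first, then the child axis under <Supplier>
            Dim supplier As IEnumerable(Of XElement) = doc...<Supplier>

            ' .Value is Nothing when the element is missing, which also
            ' compares as "not equal" to a non-empty expected value
            If supplier.<name>.Value <> expectedName OrElse supplier.<Number>.Value <> expectedNumber Then
                mismatches.Add(xmlPath)
            End If
        Next

        Return mismatches
    End Function
End Module
```

The returned list can then be arranged however you like, for example written out with File.WriteAllLines(reportPath, mismatches.ToArray()), where reportPath is whatever location you read from My.Settings.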
 
