Deleting portions of a string - wildcards in replace statements

ralphie81

Hi all,

I am parsing an XML file (long story) and was wondering if there is a way to easily delete the content between certain tags. For example, let's say I don't want to display a "speed" column. I can use replace statements such as:

strData = replace(strData, "<speed>", "") and strData = replace(strData, "</speed>", "")

but that leaves everything in the middle. Or I could replace them with <div style='visibility:hidden'> tags (which is actually what I'm doing at the moment). But what I really want is to get rid of them completely: since this string is over 2 million characters, I'd really like to trim it down some. I also tried looping through with a counter like:

For i=1 to iDataLength
iSpeed = InStr(i, strData, "<speed", 1)
strTempData = Mid(strData,iSpeed,43) '43 will almost always be the length of what I want pulled here, but this could be variable other places
strWeatherData = Replace(strData, strTempData, "")
i += 43
Next

But this is not only lethargic, it's often wrong. What would be best is if there were some way to specify a wildcard for the replace statement, such as strData = replace(strData, "<speed>*</speed>", ""). If someone knows of a good way to do this, please do tell! Thanks!
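
For the wildcard idea above, the closest built-in equivalent in .NET is a regular expression. The following is only a minimal sketch: `StripElement` is a hypothetical helper name, the sample string is made up, and it assumes the element is never nested inside itself and never written in the self-closing form `<speed/>`.

```vb
Imports System
Imports System.Text.RegularExpressions

Module StripTagDemo
    ' Hypothetical helper: removes every <tagName ...>...</tagName> element,
    ' including whatever sits between the opening and closing tags.
    ' Assumes no nested or self-closing occurrences of the element.
    Function StripElement(ByVal data As String, ByVal tagName As String) As String
        Dim name As String = Regex.Escape(tagName)
        ' ".*?" is a lazy wildcard, so each match stops at the first closing tag.
        ' RegexOptions.Singleline lets "." match line breaks as well.
        Dim pattern As String = "<" & name & "\b[^>]*>.*?</" & name & ">"
        Return Regex.Replace(data, pattern, "", RegexOptions.Singleline)
    End Function

    Sub Main()
        ' Made-up sample data, not taken from the real file.
        Dim strData As String = "<item><name>A</name><speed>12.5</speed></item>"
        Console.WriteLine(StripElement(strData, "speed"))
    End Sub
End Module
```

A regular expression treats the XML as plain text, so it can misfire on comments, CDATA sections, or unusual formatting. For anything beyond a quick trim, a real XML parser is usually the safer choice.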
 
Couldn't you just search through the elements and, when it hits your "speed" element, set the current value for that element to "" or something, and THEN delete the element itself?

There is probably a better way to do it, though; at the moment I'm not sure.
 
How would it know what to delete though? I'm displaying it as html, not xml, so the tags won't show up anyway, only what's in between them. And I don't know how to tell it to delete something that I don't know the value of (since it's variable).
 
Load the Xml into a XmlDocument, find the 'speed' node and just remove it.
 
JohnH,

Thanks for the reply. I had kind of tried that but had been having troubles there too. In fact, see the other thread I started here: http://www.vbdotnetforums.com/showthread.php?p=44581 for more on that. In case you couldn't tell, I'm kind of new to the xml scene, so I'm probably just doing it wrong. If you could be more detailed about how you would go about loading it into an xmlDocument, that would be very helpful (you can test it with the actual file if you wish, I linked to it in the other thread).

Thanks for your help.
 
XmlDocument is a class in System.Xml namespace. Use its Load method to load from URL/stream/reader or its LoadXml to load from string. Then you use the System.Xml library on that xmldocument instance to find and handle nodes. Learning the XPath query language is very useful (and very easy!). Example here does some select and remove:
VB.NET:
Dim url As String = "http://brianpaulsmith.net/WF031606.xml"
Dim x As New XmlDocument
x.Load(url)
Dim xnl As XmlNodeList = x.SelectNodes("/data/item")
Dim sp As XmlNode
For Each xn As XmlNode In xnl
  sp = xn.SelectSingleNode("speed")
  If Not sp Is Nothing Then xn.RemoveChild(sp)
Next
x.Save("modified.xml")
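
Since the original question starts from a string (strData) rather than a URL, the same select-and-remove idea can be sketched with LoadXml. This is an illustrative variant, not the code from the post above: `RemoveElements` is a hypothetical helper name, the sample XML is made up, and the XPath assumes the document uses no XML namespaces.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module RemoveNodesDemo
    ' Hypothetical helper: loads XML from a string, removes every element with
    ' the given name at any depth, and returns the resulting XML as a string.
    Function RemoveElements(ByVal xml As String, ByVal elementName As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)

        ' Copy the matches into a list first, so nodes are not removed
        ' from the document while the XPath result is still being iterated.
        Dim matches As New List(Of XmlNode)
        For Each n As XmlNode In doc.SelectNodes("//" & elementName)
            matches.Add(n)
        Next
        For Each n As XmlNode In matches
            n.ParentNode.RemoveChild(n)
        Next

        Return doc.OuterXml
    End Function

    Sub Main()
        ' Made-up sample data, not taken from the real file.
        Dim strData As String = "<data><item><name>A</name><speed>12</speed></item></data>"
        Console.WriteLine(RemoveElements(strData, "speed"))
    End Sub
End Module
```

Returning OuterXml keeps everything in memory, which suits the case where the result is needed as a string again rather than as a saved file.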
 
Very cool, thanks! However, I am still a little confused about how to get it from the xmlDocument to a DataSet out to a GridView. Rather than

x.Save("modified.xml")

I need to just throw it in a dataset. I don't really see a property of xmlDocument with which I can do this, so I'm guessing I need a converter? I tried playing around with the xmlwriter and xmlreader, but the reader needs an xml string and the writer...well, I didn't even know what I was writing to, haha. I think I need one more little push in the right direction. Thanks again for all your help!
 
I'm pretty sure all you need to do is this...

yourdataset.readxmlschema("path and name of your xml file")

You will need to save the modified XML file, though. There is probably a way to skip the saving part and just read the modified XML directly, but I'm not sure what it is.

hope it helps


regards
adam
 
I need to just throw it in a dataset.
If you have to do that operation in memory, you need a stream that can be both written and read, that is, an IO.MemoryStream. Save the XmlDocument to a MemoryStream with XmlDocument.Save, seek the stream back to the start, then pass it to DataSet.ReadXml. Otherwise you can just write the XML file and read it with the DataSet.ReadXml method.

Anti-Rich, ReadXmlSchema only reads the schema, not the data.
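
The in-memory route described above (save to a MemoryStream, rewind, ReadXml) can be sketched roughly as follows. `ToDataSet` is a hypothetical helper name and the sample XML is made up for illustration.

```vb
Imports System
Imports System.Data
Imports System.IO
Imports System.Xml

Module XmlToDataSetDemo
    ' Hypothetical helper: copies an XmlDocument into a DataSet without a temp file.
    Function ToDataSet(ByVal doc As XmlDocument) As DataSet
        Dim ds As New DataSet()
        Using ms As New MemoryStream()
            doc.Save(ms)      ' write the XML into the stream
            ms.Position = 0   ' rewind; otherwise ReadXml would start at the end
            ds.ReadXml(ms)    ' infer the tables from the XML and load the rows
        End Using
        Return ds
    End Function

    Sub Main()
        ' Made-up sample data, not taken from the real file.
        Dim x As New XmlDocument()
        x.LoadXml("<data><item><name>A</name></item><item><name>B</name></item></data>")
        Dim ds As DataSet = ToDataSet(x)
        Console.WriteLine(ds.Tables("item").Rows.Count)
    End Sub
End Module
```

DataSet.ReadXml also has an overload that takes an XmlReader, so passing New XmlNodeReader(doc) is another way to skip the intermediate stream.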
 
Alright, well I'm not using the memorystream now (I decided just to write the file) but now I'm getting "Exception of type 'System.OutOfMemoryException' was thrown. " Does this mean my xml file is just too big for it, or what? It seems to be writing the temp file ok (removing the proper nodes) but it's only getting halfway through the data. Here's the code...

VB.NET:
x.Save("C:\.....\webroot\WebServices\temp.xml")
Dim objDS As New DataSet
objDS.ReadXml("C:\.....\webroot\WebServices\temp.xml")
Try
GridView1.DataSource = objDS
GridView1.DataBind()
Catch ex As Exception
Response.Write(ex.ToString)
End Try
 
The XML document from that link is about 1 MB, which is fairly large for an XML file, but it's not a problematic size for either the framework or the operating system. Now I see that you are working in ASP.NET, so I am moving this thread from the VB.NET section (basically WinForms) to the ASP.NET section. In that regard, perhaps the web server gets into trouble if many requests each ask it to process this 1 MB? You could set the 'x' instance to Nothing after it is saved so the memory it holds can be released sooner; as it currently is, you have this data loaded twice in each server session, once for the XmlDocument and once for the DataSet.
 
Ok, I added x = Nothing, but it doesn't seem to have made much of a difference (which makes sense, since it's not writing the whole file on the Save line right before it). This site is hosted by Brinkster, so they have control of the server.

What about using the XPath in the Gridview dynamically on the client side? Something like this:

VB.NET:
<asp:GridView ID="GridView1" runat="server" 
     AutoGenerateColumns="False" 
     DataSourceID="XmlDataSource1">
      <Columns>
        <asp:BoundField DataField="vesselname" 
         HeaderText="Vessel Name" 
         SortExpression="VesselName" /> 
        <asp:BoundField DataField="timestamp" 
         HeaderText="Timestamp" 
         SortExpression="timestamp" /> 
      </Columns>
    </asp:GridView>
    <asp:XmlDataSource ID="XmlDataSource1" 
     runat="server" DataFile="http://brianpaulsmith.net/WF031606.xml"
     XPath="Data/Item">
    </asp:XmlDataSource>
I actually tried that, but the page didn't do anything (I assume I need some code-behind or something)...What are your thoughts?
 