battlescar
New member
- Joined
- Nov 10, 2010
- Messages
- 1
- Programming Experience
- Beginner
I have a mass of files named 1.txt, 2.txt, and so on up to 100.txt, all in the same directory (let's say C:\Example1). They came from a website, and each one is in one of the following two formats:
"blah blah blah, 1234, blahblah" (always 4 digits; it's a year)
or, alternatively:
"blah blah blah, 1234, blah blah"
" , , "
I'd like to convert them to a CSV format, using the commas inside the text files at the moment as the delimiters in the CSV file.
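One possible reading of this step, sketched in VB.NET: strip the wrapping quotation marks from each line, trim the spaces around each comma-separated field, and skip placeholder lines such as " , , " where every field is empty. The output folder `C:\Example1\csv` and the names `TextToCsv` and `CleanLine` are assumptions made for illustration, and the sketch assumes no field contains a comma of its own.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module TextToCsv
    ' Strips the wrapping quotes from one line and trims each comma-separated field.
    ' Returns Nothing for lines such as " , , " where every field is empty.
    Function CleanLine(ByVal textLine As String) As String
        Dim unquoted As String = textLine.Trim().Trim(""""c)
        Dim fields As String() = unquoted.Split(","c).Select(Function(f) f.Trim()).ToArray()
        If fields.All(Function(f) f = "") Then Return Nothing
        Return String.Join(",", fields)
    End Function

    Sub Main()
        ' C:\Example1 comes from the question; the csv subfolder is an assumption.
        Dim sourceDir As String = "C:\Example1"
        Dim targetDir As String = "C:\Example1\csv"
        Directory.CreateDirectory(targetDir)

        For i As Integer = 1 To 100
            Dim inputPath As String = Path.Combine(sourceDir, i.ToString() & ".txt")
            If Not File.Exists(inputPath) Then Continue For

            ' Keep only the lines that still hold data after cleaning.
            Dim rows As New List(Of String)
            For Each textLine As String In File.ReadAllLines(inputPath)
                Dim cleaned As String = CleanLine(textLine)
                If cleaned IsNot Nothing Then rows.Add(cleaned)
            Next
            File.WriteAllLines(Path.Combine(targetDir, i.ToString() & ".csv"), rows.ToArray())
        Next
    End Sub
End Module
```

If the free-text fields can contain commas themselves, splitting on every comma will produce extra columns, so those fields would need to be quoted in the output instead.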
After this, I need to link them to another load of data (also named 1.txt through to 100.txt). Not only is that data in HTML, but because it was saved from a website, each file begins with a ", ends with a ", and every " that appears in the HTML code appears twice.
Here is an example of one of the HTML files:
"<table style=""border: 2px solid rgb(0, 0, 255);"" align=""center"" bgcolor=""#e5e5e5"" border=""0"" cellpadding=""0"" cellspacing=""0"" width=""550""> <input name=""index"" value="""" type=""hidden""> <tbody><tr> <td colspan=""9""> <div id=""ppDiv""> <table align=""center"" bgcolor=""#e5e5e5"" border=""0"" cellpadding=""2"" cellspacing=""1"" width=""550""> <tbody><tr class=""table-nav""> <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""85""><a href=""javascript:sortOnOption('AT.TRANSACTION_DATE')"" class=""tab-nav"">Date</a></td> <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""55""><a href=""javascript:sortOnOption('AT.AUCTION_PRICE')"" class=""tab-nav""> GBP </a></td> <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""35""><a href=""javascript:sortOnOption('AT.UNITS')"" class=""tab-nav"">Qty</a></td> <td class=""tab-nav"" align=""center"" width=""85"">House</td> <td class=""tab-nav"" align=""center"" width=""85"">Location</td> </tr> <tr class=""table-content""> <td align=""center"">19/05/2010</td> <td align=""center"">414</td> <td align=""center"">1</td> <td align=""center"">Sothebys</td> <td align=""center"">London</td> </tr> <tr class=""table-content1""> <td align=""center"">11/05/2010</td> <td align=""center"">384</td> <td align=""center"">1</td> <td align=""center"">Christies</td> <td align=""center"">Geneva</td> </tr> <tr class=""table-content""> <td align=""center"">01/12/2008</td> <td align=""center"">322</td> <td align=""center"">1</td> <td align=""center"">Christies</td> <td align=""center"">London</td> </tr> <tr class=""table-content1""> <td align=""center"">01/12/2008</td> <td align=""center"">299</td> <td align=""center"">1</td> <td align=""center"">Christies</td> <td align=""center"">London</td> </tr> <tr class=""table-content""> <td align=""center"">10/12/2007</td> <td align=""center"">214</td> <td align=""center"">1</td> <td 
align=""center"">Christies</td> <td align=""center"">London</td> </tr> <tr class=""table-content1""> <td align=""center"">10/12/2007</td> <td align=""center"">214</td> <td align=""center"">1</td> <td align=""center"">Christies</td> <td align=""center"">London</td> </tr> <tr class=""table-content""> <td align=""center"">19/02/2007</td> <td align=""center"">214</td> <td align=""center"">2</td> <td align=""center"">Christies</td> <td align=""center"">London</td> </tr> <tr class=""table-content1""> <td align=""center"">01/12/2005</td> <td align=""center"">242</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/03/2005</td> <td align=""center"">198</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/09/2004</td> <td align=""center"">286</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/06/2004</td> <td align=""center"">190</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/04/2004</td> <td align=""center"">225</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/03/2004</td> <td align=""center"">255</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/01/2004</td> <td align=""center"">170</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/11/2003</td> <td align=""center"">203</td> <td 
align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/09/2003</td> <td align=""center"">253</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/12/2002</td> <td align=""center"">193</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/11/2002</td> <td align=""center"">241</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/08/2002</td> <td align=""center"">275</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">01/11/2000</td> <td align=""center"">264</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content""> <td align=""center"">01/07/2000</td> <td align=""center"">214</td> <td align=""center"">1</td> <td align=""center"">Historic Archive</td> <td align=""center"">UK</td> </tr> <tr class=""table-content1""> <td align=""center"">15/02/1999</td> <td align=""center"">182</td> <td align=""center"">1</td> <td align=""center"">Christies</td> <td align=""center"">London</td> </tr> </tbody></table> </div> </td> </tr> </tbody></table>"
How would I write a script to strip out the extra quotation marks?
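A minimal sketch of one way to do this in VB.NET: read each file as a single string, drop one quotation mark from each end, then replace every doubled quotation mark with a single one. The folder `C:\Example2` and the names `QuoteCleaner` and `UnescapeQuotes` are assumptions made for illustration; the cleaned text is written to a separate .html file so the originals are left untouched.

```vbnet
Imports System.IO

Module QuoteCleaner
    ' Removes one wrapping quote from each end, then collapses "" into ".
    Function UnescapeQuotes(ByVal raw As String) As String
        Dim body As String = raw.Trim()
        If body.Length >= 2 AndAlso body.StartsWith("""") AndAlso body.EndsWith("""") Then
            body = body.Substring(1, body.Length - 2)
        End If
        ' In VB source, """""" is a string holding two quotes and """" holds one.
        Return body.Replace("""""", """")
    End Function

    Sub Main()
        ' The folder name is an assumption; the question only says the HTML
        ' files are also named 1.txt through 100.txt.
        Dim htmlDir As String = "C:\Example2"

        For i As Integer = 1 To 100
            Dim inputPath As String = Path.Combine(htmlDir, i.ToString() & ".txt")
            If Not File.Exists(inputPath) Then Continue For

            Dim cleaned As String = UnescapeQuotes(File.ReadAllText(inputPath))
            File.WriteAllText(Path.Combine(htmlDir, i.ToString() & ".html"), cleaned)
        Next
    End Sub
End Module
```

The wrapping quotes are removed before the doubled ones are collapsed, so an empty attribute such as value="""" at the start or end of a file is not mistaken for a wrapper.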
After this, how would I then convert this HTML to a CSV format, before joining it up with the original data mentioned above?
I'm completely new to Visual Basic Express, and have some experience with Visual Basic 6 (why they ever changed it is beyond me...)
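For markup as regular as the sample table, a regular expression may be enough. The sketch below assumes the wrapping and doubled quotation marks have already been removed. The names `HtmlTableToCsv` and `TableToCsvLines` are hypothetical. It picks out the innermost table rows, takes the text of each cell, and joins the cells with commas. It also assumes no cell contains a comma. For messier HTML, a real parser such as the HTML Agility Pack would be a safer choice.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HtmlTableToCsv
    ' Returns one comma-separated line per innermost <tr> that contains <td> cells.
    Function TableToCsvLines(ByVal html As String) As List(Of String)
        Dim csvLines As New List(Of String)
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

        ' (?!<tr) stops a match from swallowing a nested row, so the outer
        ' wrapper table in the sample is skipped and only inner rows match.
        Dim rowPattern As String = "<tr[^>]*>((?:(?!<tr).)*?)</tr>"
        Dim cellPattern As String = "<td[^>]*>((?:(?!<td).)*?)</td>"

        For Each row As Match In Regex.Matches(html, rowPattern, options)
            Dim cells As New List(Of String)
            For Each cell As Match In Regex.Matches(row.Groups(1).Value, cellPattern, options)
                ' Drop nested tags such as <a ...> and keep only the visible text.
                cells.Add(Regex.Replace(cell.Groups(1).Value, "<[^>]+>", "").Trim())
            Next
            If cells.Count > 0 Then csvLines.Add(String.Join(",", cells.ToArray()))
        Next
        Return csvLines
    End Function
End Module
```

On the sample above, the first line returned would be the header row (Date, GBP, Qty, House, Location), followed by one line per sale. How to join these rows to the original data depends on how the two sets relate. If file N in one set describes the same item as file N in the other (an assumption, since the question does not say), one option is to prefix every sale row with the fields from the matching text file.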