Converting TXT to CSV, HTML to CSV, before concatenating these two CSV files

battlescar

New member
Joined
Nov 10, 2010
Messages
1
Programming Experience
Beginner
I have a batch of files named 1.txt, 2.txt, and so on up to 100.txt, all in the same directory (let's say C:\Example1). They came from a website, and each one is in one of the following two formats:

"blah blah blah, 1234, blahblah" (the middle field is always four digits; it's a year)

or, alternatively:

"blah blah blah, 1234, blah blah"
" , , "

I'd like to convert them to CSV, using the commas already in the text files as the delimiters.
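As a rough sketch of what this step might look like (not a tested solution): the code below assumes the quotation marks shown above are literally in the files, that each record sits on one line, and that the output should be written next to the input as 1.csv, 2.csv, and so on. The names TxtToCsv and CleanLine are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module TxtToCsv
    ' Strip one pair of wrapping quotes (if present) and trim each comma-separated field.
    Function CleanLine(ByVal line As String) As String
        Dim content As String = line.Trim()
        If content.Length >= 2 AndAlso content.StartsWith("""") AndAlso content.EndsWith("""") Then
            content = content.Substring(1, content.Length - 2)
        End If
        Dim fields As String() = content.Split(","c)
        For i As Integer = 0 To fields.Length - 1
            fields(i) = fields(i).Trim()
        Next
        Return String.Join(",", fields)
    End Function

    Sub Main()
        ' Folder taken from the post; the .csv output names are an assumption.
        Dim folder As String = "C:\Example1"
        For n As Integer = 1 To 100
            Dim source As String = Path.Combine(folder, n.ToString() & ".txt")
            If Not File.Exists(source) Then Continue For
            Dim rows As New List(Of String)
            For Each line As String In File.ReadAllLines(source)
                ' Skip completely blank lines; a line such as " , , " is kept as ",,".
                If line.Trim().Length > 0 Then rows.Add(CleanLine(line))
            Next
            File.WriteAllLines(Path.Combine(folder, n.ToString() & ".csv"), rows.ToArray())
        Next
    End Sub
End Module
```

Because the fields are produced by splitting on commas, none of them can contain a comma, so no extra CSV quoting is applied here.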

After this, I need to link them to a second set of data (also named 1.txt through 100.txt). This set is HTML that was saved from a website, so each file begins with a ", ends with a ", and every " that appears in the HTML code appears twice.

VB.NET:
"<table style=""border: 2px solid rgb(0, 0, 255);"" align=""center"" bgcolor=""#e5e5e5"" border=""0"" cellpadding=""0"" cellspacing=""0"" width=""550"">     <input name=""index"" value="""" type=""hidden"">      <tbody><tr>     <td colspan=""9"">       <div id=""ppDiv"">                   <table align=""center"" bgcolor=""#e5e5e5"" border=""0"" cellpadding=""2"" cellspacing=""1"" width=""550"">   <tbody><tr class=""table-nav"">     <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""85""><a href=""javascript:sortOnOption('AT.TRANSACTION_DATE')"" class=""tab-nav"">Date</a></td>     <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""55""><a href=""javascript:sortOnOption('AT.AUCTION_PRICE')"" class=""tab-nav"">        GBP              </a></td>     <td onmouseover=""hOn(this);"" onmouseout=""hOut(this);"" align=""center"" width=""35""><a href=""javascript:sortOnOption('AT.UNITS')"" class=""tab-nav"">Qty</a></td>     <td class=""tab-nav"" align=""center"" width=""85"">House</td>     <td class=""tab-nav"" align=""center"" width=""85"">Location</td>        </tr>      <tr class=""table-content"">     <td align=""center"">19/05/2010</td>     <td align=""center"">414</td>     <td align=""center"">1</td>     <td align=""center"">Sothebys</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">11/05/2010</td>     <td align=""center"">384</td>     <td align=""center"">1</td>     <td align=""center"">Christies</td>     <td align=""center"">Geneva</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/12/2008</td>     <td align=""center"">322</td>     <td align=""center"">1</td>     <td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/12/2008</td>     <td align=""center"">299</td>     <td align=""center"">1</td>     
<td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content"">     <td align=""center"">10/12/2007</td>     <td align=""center"">214</td>     <td align=""center"">1</td>     <td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">10/12/2007</td>     <td align=""center"">214</td>     <td align=""center"">1</td>     <td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content"">     <td align=""center"">19/02/2007</td>     <td align=""center"">214</td>     <td align=""center"">2</td>     <td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/12/2005</td>     <td align=""center"">242</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/03/2005</td>     <td align=""center"">198</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/09/2004</td>     <td align=""center"">286</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/06/2004</td>     <td align=""center"">190</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/04/2004</td>     <td align=""center"">225</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td 
align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/03/2004</td>     <td align=""center"">255</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/01/2004</td>     <td align=""center"">170</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/11/2003</td>     <td align=""center"">203</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/09/2003</td>     <td align=""center"">253</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/12/2002</td>     <td align=""center"">193</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/11/2002</td>     <td align=""center"">241</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content"">     <td align=""center"">01/08/2002</td>     <td align=""center"">275</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">01/11/2000</td>     <td align=""center"">264</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr 
class=""table-content"">     <td align=""center"">01/07/2000</td>     <td align=""center"">214</td>     <td align=""center"">1</td>     <td align=""center"">Historic Archive</td>     <td align=""center"">UK</td>             </tr>         <tr class=""table-content1"">     <td align=""center"">15/02/1999</td>     <td align=""center"">182</td>     <td align=""center"">1</td>     <td align=""center"">Christies</td>     <td align=""center"">London</td>             </tr>          </tbody></table>  </div>     </td>     </tr>    </tbody></table>"

How would I write a script to strip out the extra quotation marks?
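One possible way to do the quote stripping, as a minimal sketch: remove the single wrapping quote at each end, then collapse every doubled quote into one. UnwrapQuotedHtml and the module name are hypothetical, and the sample string in Main is just one cell from the table above, written with VB's own doubling of quotes inside string literals.

```vbnet
Imports System

Module QuoteCleaner
    ' Remove one wrapping quote from each end, then turn every "" into ".
    Function UnwrapQuotedHtml(ByVal raw As String) As String
        Dim html As String = raw.Trim()
        If html.Length >= 2 AndAlso html.StartsWith("""") AndAlso html.EndsWith("""") Then
            html = html.Substring(1, html.Length - 2)
        End If
        ' The first literal is two quote characters, the second is one.
        Return html.Replace("""""", """")
    End Function

    Sub Main()
        ' The saved text is: "<td align=""center"">414</td>"
        ' (every quote is doubled once more here because this is a VB string literal).
        Dim saved As String = """<td align=""""center"""">414</td>"""
        Console.WriteLine(UnwrapQuotedHtml(saved))
        ' Prints: <td align="center">414</td>
    End Sub
End Module
```

For a whole file, the same function could be applied to the result of File.ReadAllText.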

After this, how would I then convert this HTML to a CSV format, before joining them up with the original data mentioned above?
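For the HTML-to-CSV step, a sketch along these lines might work for a table as regular as the sample above. It assumes the doubled quotes have already been removed, that data rows are exactly the ones whose class starts with table-content, and that no cell contains a comma or nested tags. Regular expressions are a fragile way to read HTML in general, so treat this as an illustration rather than a robust parser; TableRowsToCsv is a made-up name.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HtmlTableToCsv
    ' Return one CSV line per <tr class="table-content..."> row in the cleaned HTML.
    Function TableRowsToCsv(ByVal html As String) As List(Of String)
        Dim lines As New List(Of String)
        Dim rowPattern As String = "<tr\s+class=""table-content\d*""[^>]*>(.*?)</tr>"
        Dim cellPattern As String = "<td[^>]*>(.*?)</td>"
        ' Singleline lets "." match the line breaks that occur inside the saved HTML.
        Dim flags As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        For Each row As Match In Regex.Matches(html, rowPattern, flags)
            Dim cells As New List(Of String)
            For Each cell As Match In Regex.Matches(row.Groups(1).Value, cellPattern, flags)
                cells.Add(cell.Groups(1).Value.Trim())
            Next
            lines.Add(String.Join(",", cells.ToArray()))
        Next
        Return lines
    End Function

    Sub Main()
        ' First data row of the sample table, with quotes already un-doubled.
        Dim sample As String = "<tr class=""table-content""> <td align=""center"">19/05/2010</td> " & _
            "<td align=""center"">414</td> <td align=""center"">1</td> " & _
            "<td align=""center"">Sothebys</td> <td align=""center"">London</td> </tr>"
        For Each line As String In TableRowsToCsv(sample)
            Console.WriteLine(line)
        Next
        ' Prints: 19/05/2010,414,1,Sothebys,London
    End Sub
End Module
```

The header row (class table-nav) is skipped by the row pattern, so the column names Date, GBP, Qty, House and Location would have to be written separately if they are wanted in the CSV.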

I'm completely new to Visual Basic Express, but I have some experience with Visual Basic 6 (why they ever changed it is beyond me...).
 