Alternative to dataset for large data imports

mikethecoder

New member
Joined
Aug 23, 2008
Messages
3
Programming Experience
Beginner
I have plenty of programming experience, but unfortunately not as much with .NET, and a project has been dropped on me at the last minute. I can code in the language, but this issue deals with performance and choosing the best solution, and without knowing the language inside and out I don't know which approach is best.

We have a program that runs 24/7 running imports. It creates a new importer object, which takes a tab-separated text file from a UNC path on our server and uses it to build a DataSet representing the data in the text file. After the text file is processed into one DataSet per table (4 tables), we run the data adapter's .Update method for each one to insert the data into an Access .mdb file. So the process is essentially to update an existing Access database file based on changes in a tab-separated .txt file.
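In rough terms, each importer does something like the following. This is a minimal sketch, not our actual code; the table, column, and variable names are made up:

Imports System.Data
Imports System.Data.OleDb
Imports System.IO

Module CurrentImport
    Sub ImportTable(ByVal txtPath As String, ByVal mdbPath As String)
        ' Load the whole tab-separated file and build a DataTable from it;
        ' this is where the memory goes, since every row is held at once.
        Dim table As New DataTable("tbl_dest")
        table.Columns.Add("destcolumn1")
        table.Columns.Add("destcolumn2") ' ...one column per field in the file
        For Each line As String In File.ReadAllLines(txtPath)
            table.Rows.Add(line.Split(ControlChars.Tab))
        Next

        ' Push every pending row to the .mdb in one Update call.
        Using conn As New OleDbConnection( _
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & mdbPath)
            Dim adapter As New OleDbDataAdapter("SELECT * FROM tbl_dest", conn)
            Dim builder As New OleDbCommandBuilder(adapter)
            adapter.Update(table)
        End Using
    End Sub
End Module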

The problem is that these imports run very frequently, and recently we got a customer with a much larger database than ever before. The DataSets have been identified as the cause when running the update: they use too much server memory and time out other imports. I was told to come up with an alternative to using DataSets.

Any ideas what the best method would be, for server performance, to avoid holding the data from the .txt file in DataSets and pushing the DataSets back to the .mdb file? Since we need to keep server memory use to a minimum, should the solution just be to run an SQL INSERT command against the database for each record?

The average import, across all 4 tables together, processes about 60,000 records. Thanks!
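To make the question concrete, here is the record-at-a-time alternative I have in mind. Again a minimal sketch with made-up names, trimmed to two columns:

Imports System.Data.OleDb
Imports System.IO

Module StreamingImport
    Sub ImportFile(ByVal txtPath As String, ByVal mdbPath As String)
        Using conn As New OleDbConnection( _
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & mdbPath)
            conn.Open()
            Using tran As OleDbTransaction = conn.BeginTransaction()
                ' One parameterized INSERT, reused for every line of the file.
                Dim cmd As New OleDbCommand( _
                    "INSERT INTO tbl_dest (destcolumn1, destcolumn2) VALUES (?, ?)", _
                    conn, tran)
                cmd.Parameters.Add("@p1", OleDbType.VarChar)
                cmd.Parameters.Add("@p2", OleDbType.VarChar)

                ' Stream the file so only the current line is ever in memory.
                Using reader As New StreamReader(txtPath)
                    Dim line As String = reader.ReadLine()
                    Do While line IsNot Nothing
                        Dim fields() As String = line.Split(ControlChars.Tab)
                        cmd.Parameters(0).Value = fields(0)
                        cmd.Parameters(1).Value = fields(1)
                        cmd.ExecuteNonQuery()
                        line = reader.ReadLine()
                    Loop
                End Using

                tran.Commit() ' one transaction per file keeps Jet from flushing per row
            End Using
        End Using
    End Sub
End Module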
 
Minimal. Read several lines if you care, but somewhere along the pipe from drive to memory you'll find a cache. Reading a file line by line from disk is a fast, trivial operation; try it for yourself. Line by line also necessarily uses less memory, because you're only reading and keeping one line at a time.

Be careful, when presenting arguments, to use wording that is sensible and understandable. Line by line is a disk hit; whole file is a memory hit. Both use system resources. Reading a 1 GB file into memory when you only have 512 MB of RAM is clearly a terrible idea, with massively higher overall resource usage due to the paging.


You guess? So you don't actually know what you're talking about?



I "guess" ;) you should investigate how DataTable/DataAdapter works.. You may be surprised at how ill informed your comment is.


If you're after the fastest way to load data into or out of a DB, look at the relevant documentation. In Oracle I would use bulked array transfer mode, though this wasn't available on a server I recently worked on, so I implemented an improvement that mimicked the bulk transfer by sending 80 records of 50-character width to the server at a time, in a single VARCHAR2(4000). DataSet performance was 600 lines/second. The new method, using a direct stored procedure (which does nothing other than break apart the 4000-character string and call INSERT 80 times), gets around 8,000 lines per second.

While that's not as fast as Oracle's properly designed bulk transfer (doing 500 records at a time, where I'm limited to 80) at 10,000 lines/second (an improvement over the previous 3,000, since the source PC is a proper dual-CPU server on gigabit Ethernet rather than a Centrino laptop on 100-megabit), it's close enough not to warrant installing ODP.NET on a server that otherwise has no need for it.
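For the curious, the client side of that amounts to something like the sketch below. It assumes the Microsoft System.Data.OracleClient provider (since ODP.NET wasn't installed); the procedure name INSERT_BLOCK and the helper are invented for illustration, and only the 80-records-of-50-characters layout comes from the description above:

Imports System.Collections.Generic
Imports System.Data
Imports System.Data.OracleClient
Imports System.Text

Module BulkMimic
    ' Pack up to 80 records, each padded to exactly 50 characters, into one
    ' VARCHAR2(4000) and ship them to the server in a single round trip.
    Sub SendBlock(ByVal conn As OracleConnection, ByVal records As IList(Of String))
        Dim block As New StringBuilder(4000)
        For Each rec As String In records ' at most 80 per call
            block.Append(rec.PadRight(50))
        Next

        Dim cmd As New OracleCommand("INSERT_BLOCK", conn)
        cmd.CommandType = CommandType.StoredProcedure
        cmd.Parameters.Add("p_block", OracleType.VarChar).Value = block.ToString()
        cmd.ExecuteNonQuery() ' the procedure slices the string and INSERTs 80 rows
    End Sub
End Module

The caller just reads lines, groups them into blocks of 80, and calls SendBlock once per block; the stored procedure does a SUBSTR in a loop and INSERTs each 50-character slice.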

Well, I am sure that reading a large file using a data adapter took a long time for me: 15 minutes for 500,000 records in the text file.

So I used the following SQL query instead, which took 2.5 minutes to pull all the records into the database:


' List the remaining columns in place of the "..." carried over from the original post.
Dim cmd As New OleDbCommand( _
    "INSERT INTO tbl_dest " & _
    "(destcolumn1, destcolumn2, destcolumn3, destcolumn4, ...) " & _
    "SELECT column1, column2, column3, column4, ... " & _
    "FROM [Text;DATABASE=" & strFilePath & ";" & _
    "HDR=Yes;FMT=Delimited].[" & strFileName & "]", conn, tran)
cmd.ExecuteNonQuery()

Yes, I was ill-informed before, but I am well informed now :rolleyes:
 