Returning Very Large Datasets?

Administrator

I have been contracted to produce an application (a smart client) that will return and analyze data from very large DataSets. These DataSets contain at least one DataTable with over 100,000 records, and several other DataTables that can contain a substantial number of records as well.

Does anyone have experience using web services to return a DataSet of this size? Any warnings, advice, or pointers on optimizing the return of data and preventing session or authentication timeouts?

Thanks in advance.
 
Moving large amounts of data over a network, the internet, or an intranet has been an issue in most organizations. Some big organizations regularly improve their network infrastructure, but not all businesses are willing to do so.

Efforts to improve internet speed, on the other hand, are common and widely appreciated at the corporate level. Over time this should also improve communication via web services.

Technically, large data transfers via web services are possible. Try increasing the timeouts just before the data transfer starts, and reset them as soon as the transfer is complete. If this is a nightly operation, you are well placed. If the application frequently processes large amounts of data from the service, you may need to change your processing strategy: try processing the data inside the web service or on the database side.
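
As a rough illustration of the "raise the timeout, then reset it" idea, here is a minimal, language-agnostic sketch in Python. `ServiceClient` and its `timeout_seconds` attribute are hypothetical stand-ins for whatever proxy object and timeout setting your client actually exposes; nothing here is taken from a real web service API.

```python
from contextlib import contextmanager


class ServiceClient:
    """Hypothetical stand-in for a web service proxy with a timeout setting."""

    def __init__(self, timeout_seconds=100):
        self.timeout_seconds = timeout_seconds


@contextmanager
def extended_timeout(client, seconds):
    """Raise the client's timeout for the duration of one large transfer."""
    original = client.timeout_seconds
    client.timeout_seconds = seconds
    try:
        yield client
    finally:
        # Restore the normal timeout even if the transfer fails.
        client.timeout_seconds = original


client = ServiceClient()
with extended_timeout(client, 600):
    pass  # the large transfer would run here
```

The `finally` block is the important part: the normal timeout comes back even when the transfer throws, so ordinary short calls are not left running with a ten-minute limit.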
 
Jugo,

Please re-read my post: it's about a "smart client" (a WinForms app) that retrieves serialized data via web services. There is no mention of, or intent to, display this data in a web page. I know you are passionate about DataSets vs. DataReaders because of the article I posted, and I believe the same, but we don't have that luxury with smart clients and web services, as DataReaders are not serializable (in 1.1). So the concern is returning a lot of data via a web service and running into timeouts, etc. Naturally, smart SQL and schemas are critical, but they may not alleviate the situation, i.e. needing to bring down a large amount of data when initializing a new application, with subsequent calls retrieving only new information.
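
To make the "big initial load, then only new information" pattern concrete, here is a minimal sketch in Python. It assumes every record has an ever-increasing `id` that the client can use as a high-water mark; the function names and the tuple layout are made up for illustration, and the in-memory list stands in for the server-side table.

```python
def fetch_changes(server_rows, last_seen_id):
    """Return only the rows newer than the client's high-water mark.

    Each row is an (id, payload) tuple; server_rows stands in for the
    table the web service would query.
    """
    return [row for row in server_rows if row[0] > last_seen_id]


def sync(local_rows, server_rows, last_seen_id):
    """Append new rows to the local cache and return the updated mark."""
    new_rows = fetch_changes(server_rows, last_seen_id)
    local_rows.extend(new_rows)
    if new_rows:
        last_seen_id = max(row[0] for row in new_rows)
    return last_seen_id


server = [(1, "a"), (2, "b")]
local = []
mark = sync(local, server, 0)     # initial load brings everything down
server.append((3, "c"))
mark = sync(local, server, mark)  # later calls bring down only row 3
```

This only covers newly inserted rows. Updates and deletes would need something extra, such as a last-modified timestamp or a change log, which is outside this sketch.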
 
Wow, this is an interesting issue. Your main concern is timeouts. It sounds like you are going to need a T3 connection or something more powerful, and small companies can't afford that type of technology.

I would say paging is the way to go. I was speaking to a colleague, and he suggested handling it in the stored procedure, or the SELECT statement, that returns this information. I would get the data in chunks. I know this is not a web application, but imagine filling a DataSet on the server with several tables of 100,000 records or more and then sending it to the WinForms client. That would be insane. Do it maybe 1,000 records at a time. You will have to keep track of the page you are on and the pages you already have on the client. This is just a general idea. Hope this helps.
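
Here is a minimal sketch of that chunked approach, written in Python against an in-memory SQLite table so it can be run as-is. The `records` table, its columns, and the function names are assumptions for illustration only. It pages by "last id seen" rather than by page number, which is one common way to keep track of where you are, and it assumes ids are positive.

```python
import sqlite3


def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows whose id is greater than after_id."""
    cur = conn.execute(
        "SELECT id, value FROM records WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    )
    return cur.fetchall()


def fetch_all_in_pages(conn, page_size=1000):
    """Yield one page at a time until nothing is left to send."""
    after_id = 0
    while True:
        page = fetch_page(conn, after_id, page_size)
        if not page:
            break
        yield page
        after_id = page[-1][0]  # remember where this page ended


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO records (id, value) VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(1, 2501)],
)
page_sizes = [len(page) for page in fetch_all_in_pages(conn, 1000)]
```

Each call is small enough to finish well inside a normal timeout, and if one call fails the client only has to ask again from the last id it received.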
 
Data paging is a solution, but I think this is something that can't be tackled unless some infrastructure changes are made. In my short experience, I once saw a company solve this by installing a compression ISAPI filter (if you're using IIS, that is) that heavily compressed all outgoing data and so reduced the load on the network. I don't think it was a very affordable solution, however, because it had to be tweaked a lot...
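
To show why compression helps with this kind of payload, here is a small sketch in Python. It is not the ISAPI filter described above; it just applies the same idea at the application level, gzipping a serialized result set before it goes over the wire. The `pack` and `unpack` names and the sample rows are made up for illustration.

```python
import gzip
import json


def pack(rows):
    """Serialize rows to JSON and gzip the result for transfer."""
    raw = json.dumps(rows).encode("utf-8")
    return gzip.compress(raw)


def unpack(blob):
    """Reverse pack(): decompress, then parse the JSON back into rows."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


rows = [{"id": i, "status": "open", "region": "north"} for i in range(1000)]
blob = pack(rows)
raw_size = len(json.dumps(rows).encode("utf-8"))
packed_size = len(blob)
```

Serialized tabular data repeats the same column names and values on every row, so it tends to compress well. The trade-off is extra CPU time on both ends.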
 
What is the compelling reason to use a web service (designed primarily to handle small recurring data requests - like the most current stock price for Target or the conversion rate for yen to dollars) versus a more robust Windows service (or even a 3-tiered application)?

I have to retrieve data from an AS/400, 3 separate SQL databases, and 2 separate Oracle databases, mush it together, and finally load a different Oracle database. I have one Windows service that runs every 30 minutes, and it works fine. It handles over 100,000 records each run, and it never returns records the final destination doesn’t need. Oh, and it inserts, deletes, and updates where required.

My user interface even has a clock that counts down to when the next “data load” will happen so that users understand why their data may change between one button push and the next.
 