Access Websites and manipulate data

damon88

Basically, I want to know how to access a particular website from my application.
I have used the System.Net namespace to request HTML from a website. It returns the complete HTML code that browsers use to render the content on screen, but what I want is to interact with the website from my own code, without using a browser. I currently don't know how to do that.

To make clear what I mean, here is a sample scenario:

I have an application that displays an image when a user enters a keyword. Say the user enters the word "Envelope": the application goes to Google Images and searches for that keyword. Google returns images as results, and the application then needs to download the first result (which is an image) and display it on a form. Everything here is done without a browser, through my application.

I hope this makes it clear. I would really appreciate it if someone could point me in the right direction.
:rolleyes:
 
The web is the web. It's made up of web pages. It's made for browsers and if you're not using a browser then you have to parse the HTML code yourself. Unless you can access a web service that will return you the exact information you need, your only choice is to download the HTML code and then extract out the parts you need. It's so common a practice there's a name for it: screen scraping.
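
As a rough sketch of the download step only, something like the following could fetch the raw HTML as a string. It assumes the WebClient class from System.Net; the URL, the User-Agent text and the DownloadHtml name are placeholders made up for illustration, not anything from your application.

```vbnet
Imports System
Imports System.Net

Module DownloadHtmlDemo
    ' Hypothetical helper: returns the raw HTML of the page at the given address.
    Function DownloadHtml(url As String) As String
        Using client As New WebClient()
            ' Assumption: some sites refuse requests that carry no User-Agent header,
            ' so a made-up one is sent here.
            client.Headers.Add("User-Agent", "Mozilla/5.0 (compatible; MyImageApp/1.0)")
            Return client.DownloadString(url)
        End Using
    End Function

    Sub Main()
        ' Placeholder address; substitute the page you actually want to read.
        Dim html As String = DownloadHtml("http://example.com/")
        Console.WriteLine("Downloaded " & html.Length & " characters")
    End Sub
End Module
```

The string that comes back is exactly what a browser would receive, so it is the input for whatever extraction you do next.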

Generally speaking you will use regular expressions, implemented in the .NET Framework by the Regex class. I'm far from an expert so I can't give you specifics for your case, but the first step is to read the HTML code you're getting with your own eyes and analyse the patterns it contains. Work out the general form of the text you want to get, e.g. the part inside the src attribute of an img tag:
HTML:
<img src="this part here">
Once you know that pattern, you can look at creating a regular expression to describe it, or get someone else to do it for you.
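
To make that concrete, here is a minimal sketch of one such expression for the img/src case. The pattern and the ExtractImageSources name are my own illustration, not a tested recipe for any particular site: it assumes the src value is quoted, and it will also pick up attributes such as data-src, so treat it as a starting point.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ImgSrcDemo
    ' Hypothetical helper: returns the src value of every img tag found in the HTML.
    Function ExtractImageSources(html As String) As List(Of String)
        ' <img, then any attributes, then src = "value" or src = 'value'.
        ' Group 1 captures the text between the quotes.
        Dim pattern As String = "<img\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']"
        Dim results As New List(Of String)
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            results.Add(m.Groups(1).Value)
        Next
        Return results
    End Function

    Sub Main()
        ' Made-up sample markup standing in for a downloaded page.
        Dim sample As String = "<p><img src=""http://example.com/a.png"" alt=""a""></p>"
        For Each src As String In ExtractImageSources(sample)
            Console.WriteLine(src)   ' prints http://example.com/a.png
        Next
    End Sub
End Module
```

The first item in the returned list corresponds to the first img tag in the page, which may or may not be the first search result, so check it against the real HTML.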
 
Hey, thanks for the tip. This is just what I've been doing so far; I thought there might be a more systematic way to do it. As for web services, I had the same thought yesterday, so I searched and found many APIs provided by Google and Yahoo for searching the web.

I think I can build this application by combining both: web services and parsing of HTML code.

Lastly, thanks for helping me out. I'll look for some neat ways of parsing, if I can find any.

Any further suggestions are always welcome. :)
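
One further suggestion, as a sketch only: whichever route gives you the image address (a web service or parsed HTML), the last step of the scenario is the same, namely downloading the image and showing it on the form. The code below assumes a Windows Forms project with a reference to System.Drawing; ResolveImageUri, DownloadImage and PictureBox1 are hypothetical names chosen for the example.

```vbnet
Imports System
Imports System.Drawing
Imports System.IO
Imports System.Net

Module ImageDownloadDemo
    ' Turns a src value, which may be relative, into an absolute address
    ' by resolving it against the address of the page it came from.
    Function ResolveImageUri(pageUrl As String, src As String) As Uri
        Return New Uri(New Uri(pageUrl), src)
    End Function

    ' Downloads the image bytes and returns them as an Image.
    Function DownloadImage(imageUri As Uri) As Image
        Using client As New WebClient()
            Dim bytes As Byte() = client.DownloadData(imageUri)
            Using ms As New MemoryStream(bytes)
                Using img As Image = Image.FromStream(ms)
                    ' Copy into a new Bitmap so the result does not depend on
                    ' the stream, which is closed when this block ends.
                    Return New Bitmap(img)
                End Using
            End Using
        End Using
    End Function
End Module
```

On the form, usage might look like `PictureBox1.Image = DownloadImage(ResolveImageUri(pageUrl, firstSrc))`, where pageUrl and firstSrc come from the earlier steps.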
 