Question Webbrowser - strip out javascript?

J. Scott Elblein

Hi all

I have a slightly complicated(?) problem.

I have a website that requires a login and password, which then sends you to the next page; nothing unusual there. On that next page, some JavaScript periodically shows a popup dialog that the user must click.

The timer code for that popup is right there in the page source, written in JavaScript. I was wondering whether it is possible, using the WebBrowser control, to edit that JavaScript (maybe vastly increase the timer value) or strip it out completely (it is quite a few lines)?

I think, but am not sure, that the DocumentCompleted event only fires after the page has finished loading, which seems too late. I would like to do the editing or stripping after the page source is sent to my program but before it is displayed in the browser control, so that the code running in the WebBrowser is actually my altered code.

There is a streaming media player on the same page, so I don't want my editing to affect that.

Is this possible, if so, how? Sorry for the long description.

Thanks! Hope everyone is having happy holidays so far!
 
The script can be replaced; how depends on whether it is on the page or referenced as an external .js source file. It is also possible to change the inline event call that triggers the timer code.

If the script is on the page, get the script element and modify its text attribute. If it is external, I see no way to modify the loaded source, but you can either change the src attribute or add a new on-page script tag that replaces the function. Replacing either the function that sets the timeout or the callback function can work.

Below are some examples so you can see what this means. The VB.NET code examples here all go in the DocumentCompleted event after ReadyState=Complete, or at a later occasion. You can get a reference to the script HtmlElement through the usual means, such as GetElementsByTagName, GetElementById and similar.

Take this sample HTML page with a script: the body onload attribute calls the loader() function through inline JavaScript, and loader() sets a timeout that calls the timeout() function:
HTML:
<html>
<head>
<script type="text/javascript">
function loader() {
  setTimeout("timeout()",1000);
}
function timeout(){
  alert("hi");
}
</script>
</head>
<body onload="javascript:loader()">
wait and see.
</body>
</html>

Here the body element's onload attribute can be reset like this:
VB.NET:
Me.WebBrowser1.Document.Body.SetAttribute("onload", String.Empty)
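Or the script element's text attribute value can be replaced:
VB.NET:
Dim script As HtmlElement = Me.WebBrowser1.Document.GetElementsByTagName("script")(0) 'first script element on the page
script.SetAttribute("text", "function timeout(){}")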
Both loader() and timeout() can be replaced here for the same effect. Notice that if you add some code to the new timeout() function, for example alert("replaced"), it is actually called from the original setTimeout call, even though the code above replaced the whole script body. This happens because the original script was already loaded into memory, and the new script just adds to the existing one or, in this case, overrides it.
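The late binding described above can be illustrated outside the browser. The following is only a simulation, not browser code: pageGlobals, setTimeoutByName and fireTimers are hypothetical stand-ins for the page's global scope and for setTimeout called with a string, which looks the function name up only when the timer fires.

```javascript
// Simulation of why overriding timeout() after the page has loaded still works.
// pageGlobals, setTimeoutByName and fireTimers are hypothetical stand-ins for
// the browser's global scope and its timer queue; they are not browser APIs.
const pageGlobals = {};
const timers = [];
const log = [];

// Like setTimeout("timeout()", ...): only the function NAME is stored.
function setTimeoutByName(name) {
  timers.push(name);
}

// When a timer fires, the name is resolved against the current globals.
function fireTimers() {
  while (timers.length > 0) {
    const name = timers.shift();
    pageGlobals[name]();
  }
}

// Original page script.
pageGlobals.timeout = function () { log.push("hi"); };
pageGlobals.loader = function () { setTimeoutByName("timeout"); };

// Body onload runs loader(), which schedules the call.
pageGlobals.loader();

// Later (for example from DocumentCompleted) a new script redefines timeout().
pageGlobals.timeout = function () { log.push("replaced"); };

// The pending timer now runs the replacement, not the original.
fireTimers();
console.log(log);
```

Because the timer only remembers the name, the replacement runs even though the call was scheduled by the original loader().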

Let's say the page source is this instead, where the script file contains the same functions:
HTML:
<script type="text/javascript" src="timeout.js"></script>
Here you can change the src attribute value to point to a different source file. This file must be reachable from the loaded page, but it can be cross-domain. You can also reset the src attribute and instead add a replacement function to the text attribute, like this:
VB.NET:
script.SetAttribute("src", "")
script.SetAttribute("text", "function timeout(){}") 'or loader()
Another possibility for external scripts is to add a new script to the page that overrides the original function, example:
VB.NET:
Dim repl As HtmlElement = Me.WebBrowser1.Document.CreateElement("script")
repl.SetAttribute("type", "text/javascript")
repl.SetAttribute("text", "function timeout(){}") 'or loader()
Me.WebBrowser1.Document.Body.AppendChild(repl)
 
Excellent help John, thanks. I'm still having trouble getting it working, though, and I've become frustrated, lol. No matter what I try, and although I don't get any errors, once the page is loaded and I click View Source, nothing has changed. Here is the (inline) JS that I'm dealing with; what are your thoughts on how to attack this issue?

Here is the first part of the code, ripped from the page:

HTML:
<script>

var noLongerListeningDiv = 'NoLongerListening';
var stillListeningDiv = 'StillListening';
var transparentDiv = 'Transparent';
var noLongerListeningTimeout = 5 * 60 * 1000;
var noLongerListeningTimer;
var stillListeningSoundAlertTimeout1 = noLongerListeningTimeout / 2;
var stillListeningSoundAlertTimeout2 = noLongerListeningTimeout - 30000;
var stillListeningSoundAlertTimer1;
var stillListeningSoundAlertTimer2;
var stillListeningTimeout = 90 * 60 * 1000;
var stillListeningTimer;

window.setTimeout("isUserStillListening()", stillListeningTimeout);

function isUserStillListening() {
	trackLink("StillListening_sent");
    clearTimeout(stillListeningTimer);
    clearTimeout(noLongerListeningTimer);
    document.getElementById(transparentDiv).style.display = 'inline';
    document.getElementById(stillListeningDiv).style.display = 'inline';
    this.focus();
    playStillListeningAlert();
    noLongerListeningTimer = setTimeout("userIsNoLongerListening()", noLongerListeningTimeout);
    stillListeningSoundAlertTimer1 = setTimeout("playStillListeningAlert()", stillListeningSoundAlertTimeout1);
    stillListeningSoundAlertTimer2 = setTimeout("playStillListeningAlert()", stillListeningSoundAlertTimeout2);
}

function userIsListening() {
	trackLink("StillListening_Yes");
    clearTimeout(stillListeningTimer);
    clearTimeout(noLongerListeningTimer);
    clearTimeout(stillListeningSoundAlertTimer1);
    clearTimeout(stillListeningSoundAlertTimer2);
    document.getElementById(stillListeningDiv).style.display = 'none';
    document.getElementById(transparentDiv).style.display = 'none';
    stillListeningTimer = setTimeout("isUserStillListening()", stillListeningTimeout);
}

function userIsNoLongerListening() {
	trackLink("StillListening_No");
    clearTimeout(stillListeningTimer);
    clearTimeout(noLongerListeningTimer);
    clearTimeout(stillListeningSoundAlertTimer1);
    clearTimeout(stillListeningSoundAlertTimer2);
    document.getElementById(stillListeningDiv).style.display = 'none';
	location.href = "/sirius/servlet/MediaPlayer?activity=selectTab" +
	    "&tab=music&stream=howardstern100&genre=" +
	    "&category=&status=stop&token=e8a8218218f210616b57aa97b894c6" +
		"&timeout=true";
}

function playStillListeningAlert() {
	var soundURL = "/mp/i/still_listening_alert.mp3";
	var embedCode = 
		'<OBJECT id="stillListeningAlertPlayer" width="0" height="0" ' +
				'CLASSID="CLSID:6BF52A52-394A-11d3-B153-00C04F79FAA6" ' +
				'type="audio/mpeg">' +
			'<PARAM NAME="URL" VALUE="' + soundURL + '"> ' +
			'<PARAM NAME="AutoStart" VALUE="True"> ' +
			'<PARAM name="uiMode" value="none"> ' +
			'<EMBED HEIGHT="0" WIDTH="0" TYPE="audio/mpeg"' + 
				'SRC="' + soundURL + '" HIDDEN="true" AUTOSTART="true" ' +
		'</OBJECT>'
	gGetElementById("playStillListeningAlert").innerHTML = "";
	gGetElementById("playStillListeningAlert").innerHTML = embedCode;
}

function gGetElementById(s) {
	var obj = (document.getElementById ? document.getElementById(s): document.all[s]);
	return ((obj == null) ? false : obj);
}

</script>

And then towards the bottom of the source is this:

HTML:
<div id="StillListening" style="position: absolute; left: 164px; top: 239px; 
        height: 97px; width: 321px; z-index: 3; display: none">
    <img src="/mp/i/still_listening.gif" name="sl" 
        onclick="userIsListening()" align="center" 
        alt="Are you still listening? If so, click here to continue.">
</div>

<div id="NoLongerListening" style="position: absolute; left: 164px; top: 229px; 
        height: 116px; width: 321px; z-index: 3; display: none">
    <img src="/mp/i/no_longer_listening.gif"
         name="nl" onclick=location.href="/foo/servlet/MediaPlayer?activity=selectTab&tab=music&stream=100&genre=&category=&token=e8a8218218f210616b57aa97b897c6" 
         align="center"
         alt="Click here to continue.">
</div>

<div id="playStillListeningAlert"></div>

<script>
  	var timeout = "null";
  	var transparentDiv = 'Transparent';
  	var noLongerListeningDiv = 'NoLongerListening';
  	var stillListeningDiv = 'StillListening';
    if (timeout == 'true') {
        document.getElementById(stillListeningDiv).style.display = 'none';
        document.getElementById(transparentDiv).style.display = 'inline';
        document.getElementById(noLongerListeningDiv).style.display = 'inline';
        this.focus();
    }
</script>
 
"once the page is loaded and I click View Source, nothing changed"
You can't "view source" to see the script objects that are loaded into memory. View Source shows the markup as it was downloaded, not the live document.
"I was wondering what your thoughts were on how to attack this issue?"
I don't know what the page and the scripts are doing regarding the workflow, the event chain and so on, so I can't say for sure. Most likely you are experiencing something related to one of the Listening() functions, so adding a dummy function of your own to replace it would probably do the trick. I showed code examples and explained this in my previous post.
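As a rough sketch of the dummy-function idea for this particular page, the text of a replacement script could be generated as below. buildOverrideScript is a hypothetical helper, and the assumption that stubbing these two functions is enough to suppress the popup and the redirect has not been tested against the real page.

```javascript
// Build the text for a replacement <script> that turns the named page
// functions into no-ops. buildOverrideScript is a hypothetical helper.
function buildOverrideScript(functionNames) {
  return functionNames
    .map(function (name) { return "function " + name + "(){}"; })
    .join("\n");
}

// Function names taken from the page script quoted above.
const overrideText = buildOverrideScript([
  "isUserStillListening",
  "userIsNoLongerListening"
]);

console.log(overrideText);
```

The resulting string would then be used as the "text" attribute value of a new script element, in the same way as "function timeout(){}" in the CreateElement example earlier in the thread.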
 
Thanks again for your help John. :)

The problem I was having, though I'm certainly no expert and you might not have had the same experience, was that when I tried to loop through the script tags, only the first block of vars listed above would show. And though I still added the code that "should" work even though I couldn't see the functions in the Locals window, it simply didn't work (no errors either).

Then I tried to do a .Replace on any code I wanted to edit or delete, but that also had no effect, which is why I was wondering why View Source wasn't showing the changes. I figured that if I replaced the code, it should have shown up in the source afterward.

I did manage to solve my problem, though. I know there is more than one way to skin a cat when it comes to coding, and my way is most likely not the most efficient or elegant, but I ended up just loading the page in the browser control, doing a .Replace on whatever code needed changing in the InnerHtml, saving the result to a local .htm file, then immediately loading that file in the control.

Seems to work great so far. :)
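The text-replacement step of that approach might look roughly like the sketch below, shown here on a shortened copy of the page script. extendListeningTimeout is a hypothetical helper, the sample markup is abbreviated from the source quoted earlier, and the new value of 1440 minutes (24 hours) is an arbitrary choice.

```javascript
// Rewrite the "still listening" timeout in the downloaded page source.
// extendListeningTimeout is a hypothetical helper; the variable name and the
// original expression are taken from the script quoted earlier in the thread.
function extendListeningTimeout(html, minutes) {
  const pattern = /var stillListeningTimeout\s*=\s*[^;]+;/;
  return html.replace(
    pattern,
    "var stillListeningTimeout = " + minutes + " * 60 * 1000;"
  );
}

// Abbreviated sample of the page source.
const original =
  "<script>\n" +
  "var stillListeningTimeout = 90 * 60 * 1000;\n" +
  "window.setTimeout(\"isUserStillListening()\", stillListeningTimeout);\n" +
  "</script>";

const edited = extendListeningTimeout(original, 1440);
console.log(edited);
```

Only the one assignment is rewritten; the rest of the markup, including anything the media player depends on, passes through unchanged.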
 