Website Scraping Using jQuery and Ajax
Solution 1:
I would like to point out that there are situations where it is perfectly acceptable to use jQuery to scrape screens across domains. Windows Sidebar gadgets run in a 'Local Machine Zone' that allows cross-domain scripting.
And jQuery does have the ability to apply selectors to the retrieved HTML content: you just need to append the selector to the load() method's url parameter, after a space.
The example gadget code below checks this page every hour and reports the total number of page views.
<html>
<head>
    <script type="text/javascript" src="jquery.min.js"></script>
    <style>
        body { height: 120px; width: 130px; background-color: white; }
    </style>
</head>
<body>
    Question Viewed:
    <div id="data"></div>
    <script type="text/javascript">
        // page to scrape; the selector appended after the space limits what load() injects
        var url = "http://stackoverflow.com/questions/1936495/website-scraping-using-jquery-and-ajax";

        updateGadget();
        // re-check every hour
        var intervalID = setInterval("updateGadget();", 60 * 60 * 1000);

        function updateGadget() {
            $(document).ready(function() {
                // pull only the element that holds the view count
                $("#data").load(url + " .label-value:contains('times')");
            });
        }
    </script>
</body>
</html>
Solution 2:
You cannot make an Ajax request to a different domain name than the one your website is on, because of the Same Origin Policy; which means you will not be able to do what you want, at least not directly.
A solution would be to:
- have some kind of "proxy" on your own server,
- send your Ajax request to that proxy,
- which, in turn, will fetch the page on the other domain name and return it to your JS code as the response to the Ajax request.
This can be done in a couple of lines with almost any server-side language (like PHP, using curl, for instance)... Or you might be able to use some functionality of your web server (see mod_proxy and mod_proxy_http, for instance, for Apache).
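To make that idea concrete, here is a minimal sketch of such a proxy written in Node.js rather than PHP (my assumption; any server-side language works the same way). The /proxy path, the 8080 port, and the missing URL whitelist are illustrative choices, not something from the original answer:

// proxy.js - minimal same-origin proxy sketch (Node.js core modules only)
var http = require('http');
var https = require('https');
var urlModule = require('url');

http.createServer(function (req, res) {
    var parsed = urlModule.parse(req.url, true);
    if (parsed.pathname !== '/proxy' || !parsed.query.url) {
        res.writeHead(404);
        return res.end();
    }
    // a real proxy should whitelist which targets it is willing to fetch
    var target = parsed.query.url;
    var client = target.indexOf('https:') === 0 ? https : http;
    client.get(target, function (upstream) {
        res.writeHead(upstream.statusCode, { 'Content-Type': 'text/html' });
        upstream.pipe(res); // stream the remote page back as the Ajax response
    }).on('error', function () {
        res.writeHead(502);
        res.end();
    });
}).listen(8080);

The page can then request the remote site through its own origin, for example (the .some-selector part is just a placeholder):

$("#data").load("/proxy?url=" + encodeURIComponent("http://www.somedomain.com/") + " .some-selector");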
Solution 3:
It's not that difficult:
$(document).ready(function() {
    var baseUrl = "http://www.somedomain.com/";
    $.ajax({
        url: baseUrl,
        type: "GET",
        dataType: "html", // we expect an HTML page back
        success: function(data) {
            // do something with the returned markup
        }
    });
});
I think this can give you a good clue - http://jsfiddle.net/skelly/m4QCt/
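Once the HTML comes back, you can scrape it with ordinary jQuery selectors by wrapping the response in a jQuery object, much as Solution 1 does with load(). A rough sketch, where the .page-title selector and the #data target are placeholders of my own:

$.get("http://www.somedomain.com/", function(data) {
    // wrap the returned markup and query it like any other DOM fragment
    var title = $(data).find(".page-title").text();
    $("#data").text(title);
}, "html");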
Solution 4:
http://www.nathanm.com/ajax-bypassing-xmlhttprequest-cross-domain-restriction/
The only problem is that, due to security restrictions in both Internet Explorer and Firefox, the XMLHttpRequest object is not allowed to make cross-domain, cross-protocol, or cross-port requests.
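To see the restriction in action, a plain cross-domain $.ajax call simply drops into the error handler instead of the success handler (the URL below is only an example):

$.ajax({
    url: "http://www.somedomain.com/", // a different origin than the current page
    type: "GET",
    success: function(data) {
        // never reached when the browser blocks the cross-domain request
    },
    error: function(xhr, status) {
        // the same-origin policy stops the request before any data arrives
        console.log("Request blocked: " + status);
    }
});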
Solution 5:
Instead of curl, you could use a tool like Selenium, which automates loading the page in a real browser; you can also run JavaScript against the loaded page.
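As a rough sketch of that approach, here is what it might look like with the selenium-webdriver package for Node.js; the package choice, the Chrome browser, and the .label-value selector are my own assumptions, not part of the original answer:

// scrape.js - load the page in a real browser, run JavaScript in it, read an element's text
var webdriver = require('selenium-webdriver');
var driver = new webdriver.Builder().forBrowser('chrome').build();

driver.get('http://stackoverflow.com/questions/1936495/website-scraping-using-jquery-and-ajax')
    .then(function () {
        // run arbitrary JavaScript inside the loaded page
        return driver.executeScript('return document.title;');
    })
    .then(function (title) {
        console.log('Page title: ' + title);
        // read the text of an element selected with CSS (hypothetical selector)
        return driver.findElement(webdriver.By.css('.label-value')).getText();
    })
    .then(function (text) {
        console.log('Scraped value: ' + text);
        return driver.quit();
    });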