1
Jat001 2014-12-22 23:18:32 +08:00
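Send these headers along:
X-Requested-With: XMLHttpRequest
Referer: http://fm.xinli001.com/
Writing a crawler is about imitating a browser. Look at which headers the browser sends, then remove them one at a time until the request fails; that tells you which headers are required. |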
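The elimination approach described above can be sketched roughly as follows. This is only an illustration: `findRequiredHeaders` and the `send` callback are hypothetical names, and `send` is assumed to resolve to `true` when the request succeeds (for example, HTTP 200 rather than 403). It also assumes the request succeeds with the full header set to begin with.

```javascript
// Hypothetical helper: find which headers a server actually requires by
// dropping them one at a time.
// `send` is an assumed callback: it takes a headers object and resolves to
// true when the request succeeds.
async function findRequiredHeaders(allHeaders, send) {
  const required = { ...allHeaders };
  for (const name of Object.keys(allHeaders)) {
    const trial = { ...required };
    delete trial[name];
    if (await send(trial)) {
      // Still works without this header, so leave it out for good.
      delete required[name];
    }
    // Otherwise the request broke, so the header stays in `required`.
  }
  return required;
}
```

In practice `send` would wrap a real HTTP call made from a script or server, not from a web page, since a page cannot freely change headers such as Referer.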
2
fising 2014-12-22 23:38:37 +08:00 via iPad
The AJAX request is cross-origin, so the browser blocked it.
|
4
Jat001 2014-12-23 00:37:07 +08:00
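@bosshida Either their server sets the Access-Control-Allow-Origin header, which you certainly don't have the access to do, or you use something like userscripts.
Honestly, I think this kind of request is best handled on the server side. |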
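A minimal sketch of the server-side route, assuming Node.js 18+ (for the global `fetch`) and assuming the runtime passes the `Referer` header through unchanged. `fetchBroadcast` and the injectable `fetchImpl` parameter are hypothetical names introduced here; the URL and the two headers are the ones quoted in this thread. It has not been run against the live site.

```javascript
// Sketch of doing the request on the server side, where Referer and
// X-Requested-With can be set freely. `fetchImpl` is injectable so the
// function can be exercised offline with a fake.
async function fetchBroadcast(pk, fetchImpl = fetch) {
  const url = new URL('http://fm.xinli001.com/broadcast/');
  url.searchParams.set('pk', String(pk));
  // The thread's URLs carry a millisecond timestamp in `t`.
  url.searchParams.set('t', String(Date.now()));
  const response = await fetchImpl(url.toString(), {
    headers: {
      'Referer': 'http://fm.xinli001.com/',
      'X-Requested-With': 'XMLHttpRequest',
    },
  });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

A page on your own site could then call your server, which relays the data, so the browser never makes the cross-origin request itself.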
5
esile 2014-12-23 01:52:08 +08:00
Setting Referer and X-Requested-With is enough to fetch it successfully.
Here is the response from my test: {"code": 0, "data": {"favnum": 398, "commentnum": 120, "speaker_id": 108, "is_home": true, "background": "http://image.xinli001.com/20141220/18083879570a3ec9b9a360.jpg", "speak_url": "http://www.xinli001.com/user/742450/", "duration": 1283, "tags": [], "weight": 397, "title1": "", "_cache_key": "data_fm_broadcast_4916186", "article": null, "specials": [], "_id": "54954aea4f670ade3e8b4a1b", "range": 20535196, "word": "\u6625\u6653", "speakers_id": [], "lizhi_url": "", "created": "2014-12-20 18:01", "word_url": "http://www.xinli001.com/user/article/3866918/", "speak": "\u5cf0_\u5c0f\u5cf0", "id": 4916186, "is_teacher": false, "message_url": "", "cover": "http://image.xinli001.com/20141220/18094254011b53336c1227.jpg", "title": "\u6211\u548c\u90b5\u6bdb\u6bdb\u7684\u65e5\u4e0e\u591c", "url": "http://image.kaolafm.net/mz/audios/201412/a59b5e60-e515-4804-88f5-64f167aa957e.mp3", "absolute_url": "http://fm.xinli001.com/4916186/", "content": "\u4e0d\u8bba\u751f\u6d3b\u5728\u54ea\u91cc\uff0c\u53ea\u8981\u5728\u4e00\u8d77\u5c31\u597d\u4e86\u3002\u6211\u4eec\u5728\u83dc\u5e02\u573a\u4e70\u83dc\uff0c\u5728\u623f\u95f4\u91cc\u505a\u996d\uff0c\u996d\u540e\u6cbf\u7740\u8857\u8fb9\u6563\u6b65\uff0c\u4e00\u8d77\u770b\u592a\u9633\u5347\u8d77\uff0c\u592a\u9633\u843d\u4e0b\uff0c\u8fd9\u6837\u5c31\u8db3\u591f\u4e86\u3002", "url1": ""}} |
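For anyone consuming this response, here is a rough sketch of pulling out the useful fields. The field names (`code`, `data.id`, `data.title`, `data.speak`, `data.url`, `data.duration`) come from the sample above; `summarizeBroadcast` is a hypothetical name, and treating `code === 0` as success is an assumption based on this one sample.

```javascript
// Pull out the fields a player would likely need from the response shape
// shown in the post above. Assumes code === 0 means success.
function summarizeBroadcast(payload) {
  if (payload.code !== 0 || !payload.data) {
    throw new Error(`Unexpected response code: ${payload.code}`);
  }
  const { id, title, speak, url, duration } = payload.data;
  return { id, title, speaker: speak, audioUrl: url, duration };
}

// Trimmed copy of the sample response; JSON.parse decodes the \uXXXX escapes
// into readable characters.
const sample = JSON.parse(
  '{"code": 0, "data": {"id": 4916186, "speak": "\\u5cf0_\\u5c0f\\u5cf0", ' +
  '"title": "\\u6211\\u548c\\u90b5\\u6bdb\\u6bdb\\u7684\\u65e5\\u4e0e\\u591c", ' +
  '"duration": 1283, ' +
  '"url": "http://image.kaolafm.net/mz/audios/201412/a59b5e60-e515-4804-88f5-64f167aa957e.mp3"}}'
);
console.log(summarizeBroadcast(sample));
```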
6
bosshida OP @Jat001 I've added every header I could, but nothing works. I matched Firefox's headers and added them one at a time, and it still returns 403 FORBIDDEN.
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<script src="./jquery-2.0.0.min.js"></script>
<script type="text/javascript">
function test(){
  $.ajax({
    type : "get",
    url : "http://fm.xinli001.com/broadcast/",
    datatype:"json",
    data: "pk=97701348&t=1419296643104",
    headers:{
      "Referer":"http://fm.xinli001.com/",
      "X-Requested-With":"XMLHttpRequest",
      "Accept":"*/*",
      "Accept-Encoding":"gzip, deflate",
      "Accept-Language":" zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
      "Connection":"keep-alive",
      "Host":"fm.xinli001.com",
      "User-Agent":"Mozilla/5.0 (Windows NT 5.1; rv:34.0) Gecko/20100101 Firefox/34.0",
    },
    success : function(json){
      alert('ok');
    },
    error:function(){
      alert('fail');
    }
  });
}
</script>
<title>parseFm</title>
</head>
<body>
<input type="button" value="test" onclick="test();">
</body>
</html> |
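One likely reason the headers in the page above have no effect: browsers reserve certain request headers for themselves and do not let page scripts set them. The sketch below is only an illustration; `BROWSER_CONTROLLED` and `splitHeaders` are hypothetical names, and the set covers just the names from the snippet that I am sure are browser-controlled, not the full list in the specification. As far as I know, User-Agent was also restricted for XMLHttpRequest at the time, but browser behavior for it has varied, so it is left out here.

```javascript
// Header names that browsers set themselves and refuse to let page scripts
// override. Only the subset relevant to the snippet above is listed.
const BROWSER_CONTROLLED = new Set([
  'accept-encoding', 'connection', 'host', 'referer',
]);

// Split a headers object into those a page script may set and those the
// browser will refuse or ignore.
function splitHeaders(headers) {
  const allowed = {};
  const blocked = [];
  for (const [name, value] of Object.entries(headers)) {
    if (BROWSER_CONTROLLED.has(name.toLowerCase())) {
      blocked.push(name);
    } else {
      allowed[name] = value;
    }
  }
  return { allowed, blocked };
}

console.log(splitHeaders({
  'Referer': 'http://fm.xinli001.com/',
  'X-Requested-With': 'XMLHttpRequest',
  'Host': 'fm.xinli001.com',
}).blocked);
```

So the Referer never reaches the server with the value set in the script, which matches the advice elsewhere in the thread to make the request outside the browser.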
7
yrdr 2014-12-23 10:18:23 +08:00
First, your request is cross-origin, so use JSONP.
Second, you probably didn't set the HTTP headers, so the server blocked the request. |
9
zhangwei727 2014-12-23 12:10:54 +08:00
@esile I'd like the test source code too, [email protected]. Thanks!
|
11
bosshida OP @yrdr I tried JSONP and it still doesn't work. Both the jQuery version and the plain JavaScript version of JSONP return 403 Forbidden.
jQuery:
<script type="text/javascript">
function haha(){
  $.ajax({
    type : "get",
    async:false,
    url : "http://fm.xinli001.com/broadcast/",
    data: "pk=97701348&t=1419336731430",
    dataType: "jsonp",
    jsonpCallback:"fmHandler",
    headers:{
      "Referer":"http://fm.xinli001.com/",
      "X-Requested-With":"XMLHttpRequest",
      "Accept":"*/*",
      "Accept-Encoding":"gzip, deflate",
      "Accept-Language":" zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
      "Connection":"keep-alive",
      "Host":"fm.xinli001.com",
      "User-Agent":"Mozilla/5.0 (Windows NT 5.1; rv:34.0) Gecko/20100101 Firefox/34.0",
    },
    success : function(json){
      console.log(json);
      alert('ok');
    },
    error:function(){
      alert('fail');
    }
  });
}
</script>
Plain JavaScript:
<script type="text/javascript">
var myFmHandler = function(data){
  alert('ok');
};
var url = "http://fm.xinli001.com/broadcast/?pk=97701348&t=1419336731430&callback=myFmHandler";
var script = document.createElement('script');
script.setAttribute('src', url);
document.getElementsByTagName('head')[0].appendChild(script);
</script>
I've never used the Node.js mentioned above, so I'll learn it now and try it out... |
12
esile 2014-12-24 11:01:38 +08:00
@bosshida @zhangwei727 Does it really need to be that 负责 ("responsible")?
<?php
// Fetch a URL with the two headers the server checks: Referer and X-Requested-With.
function fetchpage($url, $referer) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array ('X-Requested-With: XMLHttpRequest') );
    curl_setopt($ch, CURLOPT_HEADER, false);      // leave response headers out of the result
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6; .NET CLR 2.0.50727; CIBA)");
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    $temp = curl_exec($ch);
    curl_close($ch);
    return $temp;
}
var_dump(fetchpage('http://fm.xinli001.com/broadcast/?pk=4916186&t=1419258885474', 'http://fm.xinli001.com/')); |
13
esile 2014-12-24 11:02:21 +08:00
负责 ("responsible") should have been 复杂 ("complicated"). o(︶︿︶)o Sigh, the pinyin input method tripped me up.
|