
1 xiandao7997 2014-08-01 via Android Wget |
3 zzetao 2014-08-01 Actually, some browser extensions can do this. |
4 androidBrant OP @faceair How did you grab these URLs so quickly? |
5 nealv2ex 2014-08-01 list = $('.pic img').map(function(o,item){ var a = document.createElement('a'); a.href = $(item).attr('original'); return a.href; }) |
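For readers without a browser console, the same idea can be sketched in Python with only the standard library. This is a rough equivalent, not nealv2ex's code: the class name `LazyImageCollector` and the sample markup are made up for illustration, and unlike the jQuery selector it does not restrict itself to elements under `.pic`. It prefers the `original` attribute that the snippet above reads and falls back to `src`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LazyImageCollector(HTMLParser):
    """Collect absolute image URLs, preferring the `original` attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # `original` is the attribute the jQuery snippet reads; fall back to `src`.
        raw = attr_map.get("original") or attr_map.get("src")
        if raw:
            # urljoin plays the role of the temporary <a> element in the browser.
            self.urls.append(urljoin(self.base_url, raw))


# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<div class="pic">
  <img src="/skins/loading.gif" original="/UpFile/editor/a.jpg">
  <img src="/UpFile/editor/b.jpg">
</div>
"""

collector = LazyImageCollector("http://www.szeros-wedding.com/html/service/804.html")
collector.feed(SAMPLE_HTML)
print(collector.urls)
```

On a real page you would feed the downloaded HTML text to `collector.feed()` instead of `SAMPLE_HTML`.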
6 androidBrant OP @xiandao7997 jiaqiqunaerdeiMac:pic jiaqiqunaer$ wget -r http://www.szeros-wedding.com/UpFile/editor/ --2014-08-01 14:20:58-- http://www.szeros-wedding.com/UpFile/editor/ Resolving www.szeros-wedding.com... 211.154.142.215 Connecting to www.szeros-wedding.com|211.154.142.215|:80... connected. HTTP request sent, awaiting response... 403 Forbidden 2014-08-01 14:20:59 ERROR 403: Forbidden. |
7 NemoAlex 2014-08-01 |
8 faceair 2014-08-01 |
9 imn1 2014-08-01 save as... complete html |
10 Roboo 2014-08-01 idm |
11 xiandao7997 2014-08-01 via Android wget -r --level=2 --accept=jpg [the URL in the title] When it finishes, look in the upfile/editor subdirectory. |
12 xiandao7997 2014-08-01 via Android @imn1 Feels like I watched "The Social Network" for nothing. |
13 wesley 2014-08-01 Clear the browser cache first, then open that page, then look in the browser's cache folder. |
14 androidBrant OP @faceair How do I find these URLs with XPath? What would the expression be? Thanks. |
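The `--accept=jpg` filter and the host/path directory layout that recursive wget produces can be loosely imitated in Python. This is only a sketch under assumptions: the helper names `accept_jpg`, `local_path` and `download_all` and the `downloads` root are invented here, and it expects a list of image URLs you have already collected rather than crawling the page itself.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def accept_jpg(urls):
    """Keep only URLs whose path ends in .jpg (case-insensitive)."""
    return [u for u in urls if urlparse(u).path.lower().endswith(".jpg")]


def local_path(url, root="downloads"):
    """Map a URL to root/host/path, similar to wget's recursive layout."""
    parts = urlparse(url)
    return os.path.join(root, parts.netloc, *parts.path.strip("/").split("/"))


def download_all(urls, root="downloads"):
    """Download every accepted URL into its mirrored local path."""
    for url in accept_jpg(urls):
        target = local_path(url, root)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        urlretrieve(url, target)  # network call; raises on HTTP errors such as 403
```

Calling `download_all(urls)` with the resolved image URLs would leave the photos under `downloads/www.szeros-wedding.com/UpFile/editor/`, matching where the reply says to look.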
15 mengzhuo 2014-08-01 Here's a Python version too:
import requests
from lxml import html
URL = 'http://www.szeros-wedding.com/html/service/804.html#1'
# Fetch the page, parse it, and list the src of every <img>
[x.attrib['src'] for x in html.fromstring(requests.get(URL).text).xpath('//img')]
-------
['/skins/20140425/images/bg74.gif', '/skins/20140425/images/t0.gif', '/skins/20140425/images/t3.gif', '/skins/20140425/images/t01.gif', '/skins/20140425/images/t01.gif', '/skins/20140425/images/bg75.gif', '/skins/20140425/images/logo.jpg', '/skins/20140425/images/bg6.jpg', '/skins/20140425/images/bg7.jpg', '/skins/20140425/images/bg8.jpg', '/skins/20140425/images/bg9.jpg', '/skins/20140425/images/bg10.jpg', '/skins/20140425/images/f.jpg', '/ueditor/asp/../../UpFile/editor/2014032002455418.jpg', '/ueditor/asp/../../UpFile/editor/2014032002455652.jpg', '/ueditor/asp/../../UpFile/editor/2014032002456340.jpg', '/ueditor/asp/../../UpFile/editor/2014032002456480.jpg', '/ueditor/asp/../../UpFile/editor/2014032002457027.jpg', '/ueditor/asp/../../UpFile/editor/2014032002457496.jpg', '/ueditor/asp/../../UpFile/editor/2014032002457996.jpg', '/ueditor/asp/../../UpFile/editor/2014032002458527.jpg', '/ueditor/asp/../../UpFile/editor/2014032002458652.jpg', '/ueditor/asp/../../UpFile/editor/2014032002459152.jpg', '/ueditor/asp/../../UpFile/editor/2014032002460184.jpg', '/ueditor/asp/../../UpFile/editor/2014032002460340.jpg', '/ueditor/asp/../../UpFile/editor/2014032002460512.jpg', '/ueditor/asp/../../UpFile/editor/2014032002461262.jpg', '/ueditor/asp/../../UpFile/editor/2014032002461902.jpg', '/ueditor/asp/../../UpFile/editor/2014032002462480.jpg', '/ueditor/asp/../../UpFile/editor/2014032002463027.jpg', '/ueditor/asp/../../UpFile/editor/2014032002463746.jpg', '/ueditor/asp/../../UpFile/editor/2014032002464809.jpg', '/ueditor/asp/../../UpFile/editor/2014032002464934.jpg', '/ueditor/asp/../../UpFile/editor/2014032002465652.jpg', '/ueditor/asp/../../UpFile/editor/2014032002466230.jpg', 
'/ueditor/asp/../../UpFile/editor/2014032002466730.jpg', '/ueditor/asp/../../UpFile/editor/2014032002466918.jpg', '/ueditor/asp/../../UpFile/editor/2014032002467590.jpg', '/ueditor/asp/../../UpFile/editor/2014032002467746.jpg', '/ueditor/asp/../../UpFile/editor/2014032002468449.jpg', '/ueditor/asp/../../UpFile/editor/2014032002469090.jpg', '/ueditor/asp/../../UpFile/editor/2014032002469230.jpg', '/ueditor/asp/../../UpFile/editor/2014032002469902.jpg', '/ueditor/asp/../../UpFile/editor/2014032002470699.jpg', '/ueditor/asp/../../UpFile/editor/2014032002470840.jpg', '/skins/20140425/images/f.jpg', '/skins/20140425/images/jd.jpg', '/skins/20140425/hzjd/1.jpg', '/skins/20140425/hzjd/2.jpg', '/skins/20140425/hzjd/3.jpg', '/skins/20140425/hzjd/4.jpg', '/skins/20140425/hzjd/5.jpg', '/skins/20140425/hzjd/6.jpg', '/skins/20140425/hzjd/7.jpg', '/skins/20140425/hzjd/8.jpg', '/skins/20140425/hzjd/9.jpg', '/skins/20140425/hzjd/10.jpg', '/skins/20140425/images/link.jpg', '/skins/20140425/images/logo1.jpg'] |
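The paths in that output are site-relative and still contain `../` segments, so they are not directly downloadable. A small sketch of one way to turn them into absolute URLs and keep only the uploaded photos is shown below; the function name `absolute_photo_urls` and the `/UpFile/editor/` marker are choices made for this illustration, based on the directory the thread is trying to fetch.

```python
from urllib.parse import urljoin

PAGE_URL = "http://www.szeros-wedding.com/html/service/804.html#1"


def absolute_photo_urls(srcs, page_url=PAGE_URL, marker="/UpFile/editor/"):
    """Resolve each src against the page URL and keep those under `marker`."""
    # urljoin also collapses the "../" segments while resolving.
    resolved = (urljoin(page_url, s) for s in srcs)
    return [u for u in resolved if marker in u]


# Two entries copied from the output above.
sample = [
    "/skins/20140425/images/logo.jpg",
    "/ueditor/asp/../../UpFile/editor/2014032002455418.jpg",
]
print(absolute_photo_urls(sample))
```

Filtering after resolution matters here: the skin images (`logo.jpg`, `bg6.jpg`, and so on) also end in `.jpg`, so an extension check alone would not separate them from the photos.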
16 zoudm 2014-08-01 @androidBrant XPath: /html/body/table/tbody/tr[1]/td/table/tbody/tr[5]/td/div/p[1]/img ... ... /html/body/table/tbody/tr[1]/td/table/tbody/tr[5]/td/div/p[5]/img |
17 muziyue 2014-08-01 If there aren't too many pages, I usually just press Ctrl+S and then look in the saved folder. |
18 decken 2014-08-01 pyquery is really handy https://gist.github.com/28dea5a2553190223ca6.git |
19 decken 2014-08-01 |
20 mopvhs 2014-08-02 |
21 mopvhs 2014-08-02 How can the gist not get rendered! <script src="https://gist.github.com/mopvhs/4b93757c88b5fe558846.js"></script> https://gist.github.com/mopvhs/4b93757c88b5fe558846 |
23 BGLL 2014-08-02 Chrome extension: Fatkun |
24 androidBrant OP |
25 xiandao7997 2014-08-02 via Android Saving the page and then looking in the folder is the simplest; the method in reply #21 is pretty cool too. |
26 run2 2014-08-03 firefox + downthemall |
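An absolute path like the one above is usually copied from the browser's inspector. Browsers insert `tbody` elements that are often absent from the raw HTML, so such a path may match nothing when run against the downloaded source. A relative expression tends to be less brittle. The sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and only well-formed markup; the sample document is a made-up stand-in for the page's nested-table layout. With lxml on the real page, something like `//div/p/img/@src` would be the likely counterpart.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page's nested-table layout.
SAMPLE = """
<html><body>
  <table><tr><td>
    <img src="/skins/20140425/images/logo.jpg"/>
    <div>
      <p><img src="/UpFile/editor/2014032002455418.jpg"/></p>
      <p><img src="/UpFile/editor/2014032002455652.jpg"/></p>
    </div>
  </td></tr></table>
</body></html>
"""

root = ET.fromstring(SAMPLE)
# Relative path: any <img> with a src, directly inside a <p> inside a <div>,
# at any depth. No table/tbody/tr steps are needed.
photo_srcs = [img.get("src") for img in root.findall(".//div/p/img[@src]")]
print(photo_srcs)
```

The logo image sits outside the `div/p` structure, so it is skipped without having to name its position in the table.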