my8100's recent timeline updates
my8100's repos on GitHub
Python · 2551 stars
scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI. DEMO :point_right:
405 stars
files
Docs and files for ScrapydWeb, Scrapyd, Scrapy, and other projects
Python · 116 stars
scrapyd-cluster-on-heroku
Set up free and scalable Scrapyd cluster for distributed web-crawling with just a few clicks. DEMO :point_right:
Python · 70 stars
logparser
A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.
Python · 9 stars
scrapyd
[PR #326] Native support for basic auth :lock: `pip install -U git+https://github.com/scrapy/scrapyd.git`, then add `username = yourusername` and `password = yourpassword` in the scrapyd.conf file. DOCS :point_right:
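As a quick illustration of the basic-auth setup described above: once the `username`/`password` pair is in scrapyd.conf, requests to Scrapyd's HTTP JSON API must carry those credentials. A minimal sketch, assuming Scrapyd listens on the default localhost:6800 and uses the example credentials from the description:

```
import requests

# daemonstatus.json is one of Scrapyd's HTTP JSON API endpoints.
resp = requests.get(
    "http://localhost:6800/daemonstatus.json",
    # HTTP basic auth, matching username/password in scrapyd.conf
    auth=("yourusername", "yourpassword"),
)
print(resp.json())  # e.g. {"status": "ok", ...}
```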
Python · 9 stars
scrapyd-cluster-on-heroku-scrapyd-app
How to set up Scrapyd cluster on Heroku
6 stars
awesome-scrapy
A curated list of awesome packages, articles, and other cool resources from the Scrapy community.
Makefile · 5 stars
awesome-python-cn
A comprehensive list of Python resources in Chinese, covering web frameworks, web crawlers, template engines, databases, data visualization, image processing, and more; continuously updated by 伯乐在线.
Python · 4 stars
notes
Keep on reading
3 stars
awesome-crawler
A collection of awesome web crawlers and spiders in different languages
Makefile · 2 stars
awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
1 star
awesome-flask
A curated list of awesome Flask resources and plugins
Python · 1 star
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
Python · 1 star
awesome-python-applications
💿 Free software that works great, and also happens to be open-source Python.
Python · 1 star
public-test
public-test
Python · 1 star
scrapy-redis
Redis-based components for Scrapy.
Python · 1 star
scrapyd-CircleCI
Python · 1 star
scrapyd-cluster-on-heroku-scrapydweb-app-git
How to set up Scrapyd cluster on Heroku
Python · 0 stars
Python-Algorithms
All Algorithms implemented in Python
Python · 0 stars
queuelib
Collection of persistent (disk-based) queues
Python · 0 stars
scrapyd-cluster-on-heroku-scrapyd-app-basic-auth
scrapyd-cluster-on-heroku-scrapyd-app-basic-auth
Python · 0 stars
scrapyd-cluster-on-heroku-scrapyd-app-git
How to set up Scrapyd cluster on Heroku
Python · 0 stars
scrapyd-cluster-on-heroku-scrapydweb-app
How to set up Scrapyd cluster on Heroku
HTML · 0 stars
temp
temp
my8100
V2EX member No. 353967, joined on 2018-10-05 14:40:26 +08:00
my8100 recently replied
2021-01-16 23:46:43 +08:00
Replied to the topic created by yixiugegegege in Python: Really can't untangle the logic, asking for Python help
from collections import defaultdict

# Group each entry under data["child"] by its pinyin first letter.
child_dict = defaultdict(list)
for d in data["child"]:
    child_dict[d["f_pyfirstletter"]].append(d)

assert {"child": child_dict} == target_data
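For illustration, a hypothetical input/output pair that the snippet above would satisfy; the structure of `data` and `target_data` comes from the original thread, but the field values here are made up:

```
data = {
    "child": [
        {"f_pyfirstletter": "B", "name": "北京"},
        {"f_pyfirstletter": "S", "name": "上海"},
        {"f_pyfirstletter": "S", "name": "深圳"},
    ]
}
# After the loop, child_dict holds the entries grouped by first letter,
# so the assert against this target passes.
target_data = {
    "child": {
        "B": [{"f_pyfirstletter": "B", "name": "北京"}],
        "S": [
            {"f_pyfirstletter": "S", "name": "上海"},
            {"f_pyfirstletter": "S", "name": "深圳"},
        ],
    }
}
```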
2019-11-07 07:55:04 +08:00
Replied to the topic created by Livid in Python: On organizing code files in a Flask project
Visitors and git clone insights traffic stats on all repos shows as zero since the 21st August 2019 #1650
https://github.com/isaacs/github/issues/1650
2019-08-23 09:58:35 +08:00
Replied to the topic created by aaronhua in Python: What's the difference between scrapydweb and spiderkeeper?
"Please try to make your replies helpful to others"
2019-08-22 23:50:03 +08:00
Replied to the topic created by aaronhua in Python: What's the difference between scrapydweb and spiderkeeper?
1. Reliability: continuous integration, with code coverage currently above 89%.
2. Practicality: integrates LogParser for visualizing crawl progress, plus monitoring and alerts based on Scrapy log analysis.
3. Scalability: one-click operations across any number of nodes in the crawler cluster, including deploying, running, stopping, and deleting projects, and aggregating log-analysis reports from distributed crawlers.
4. Authority: maintained by one of the Scrapyd developers, so new Scrapyd versions and features are supported promptly.

Just try the live demo: https://scrapydweb.herokuapp.com/
2019-06-30 22:03:27 +08:00
Replied to the topic created by kikaoki in Q&A: Is there a way to tell which of these two web pages came first?
## Chrome F12 Developer Tools
http://www.pudong.gov.cn/shpd/department/20190315/019020004004_3377cd83-5f78-4809-ad60-f5eef65ad1c2.htm
Last-Modified: Mon, 25 Mar 2019 08:47:12 GMT

http://www.pudong.gov.cn/shpd/department/20190315/019020004004_988dd3b7-77ec-4ba8-bd3d-b6badaf470ca.htm
Last-Modified: Fri, 15 Mar 2019 09:18:50 GMT

## Download the xls file
Right-click > Properties > Details > "Date last saved" also shows the difference.

Double-check this yourself.
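The same check can be scripted. A small sketch using the requests library to read the Last-Modified header of both pages (assuming they are still reachable):

```
import requests

urls = [
    "http://www.pudong.gov.cn/shpd/department/20190315/019020004004_3377cd83-5f78-4809-ad60-f5eef65ad1c2.htm",
    "http://www.pudong.gov.cn/shpd/department/20190315/019020004004_988dd3b7-77ec-4ba8-bd3d-b6badaf470ca.htm",
]
for url in urls:
    # A HEAD request fetches headers only, which is enough for Last-Modified.
    resp = requests.head(url, timeout=10)
    print(resp.headers.get("Last-Modified"), url)
```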
Using the approach from #1:
```
In [229]: sel.xpath("//tbody[tr/th/text()='跑步机']/tr[@align='center']/td/text()").extract()
Out[229]:
['\n ',
'\n ',
'\n ',
'\n ',
'38Min.',
'14:29',
'15:07']

In [230]:
```
<tr><th colspan="5" class="pit" align="center">跑步机</th></tr>
<td>
The <td> on the second line here is probably redundant.

```
In [215]: from scrapy import Selector

In [216]: sel = Selector(text=doc)

In [217]: sel.xpath("//th[contains(text(), '跑步机')]/parent::tr/following-sibling::tr/td/text()").extract()
Out[217]:
['\n ',
'\n ',
'\n ',
'\n ',
'38Min.',
'14:29',
'15:07']

In [218]: sel.xpath("//th[text()='跑步机']/parent::tr/following-sibling::tr/td/text()").extract()
Out[218]:
['\n ',
'\n ',
'\n ',
'\n ',
'38Min.',
'14:29',
'15:07']

In [219]:
```
@itskingname see the link in #3:
1. File and reply to issues
2. Submit PRs
3. Keep at it and wait for an invitation