Effective Communication Q&A - [Official] 百战程序员, IT Online Education and Training

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.by import By
from time import sleep
from lxml import etree
'''
Example: Baidu Images.
Note: this example does not manage to scroll all the way to the bottom of the page.
'''
driver = webdriver.Chrome()
driver.get('https://image.baidu.com/')
# find_element_by_* was removed in recent Selenium 4 releases; use find_element(By..., ...)
driver.find_element(By.ID, 'kw').send_keys('成吉思汗')
driver.find_element(By.CLASS_NAME, 's_search').click()
# Jump far down the page to trigger lazy loading of more images
js = 'document.documentElement.scrollTop=1000000'
# js = 'window.scrollTo(0,1000000)'
driver.execute_script(js)
sleep(5)  # wait for the newly loaded images to appear
html = driver.page_source
e = etree.HTML(html)
imgurl_list = e.xpath('//li[@class="imgitem"]/div/a/img/@data-imgurl')
linkurl_list = ['https://image.baidu.com{}'.format(url) for url in e.xpath('//li[@class="imgitem"]/div/a/@href')]
for imgurl, linkurl in zip(imgurl_list, linkurl_list):
    print(imgurl, '|', linkurl)
driver.quit()
print(len(linkurl_list))
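
The note in the script says it does not manage to pull the scrollbar all the way down. One possible approach, sketched here under assumptions, is to scroll repeatedly until the page height stops growing. `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical names; the only driver method it relies on is `execute_script`, which the script already uses.

```python
from time import sleep


def scroll_to_bottom(driver, pause=1.0, max_rounds=30):
    """Scroll until the document height stops growing or max_rounds is hit.

    Returns the number of scroll steps performed.
    """
    last_height = driver.execute_script(
        "return document.documentElement.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);")
        rounds += 1
        sleep(pause)  # give lazily loaded images time to arrive
        new_height = driver.execute_script(
            "return document.documentElement.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return rounds
```

In the script, a call such as `scroll_to_bottom(driver)` could take the place of the single `driver.execute_script(js)` call before `driver.page_source` is read. Whether the page keeps loading results this way depends on how the site implements lazy loading.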

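Python Full Series / Stage 16: Python Crawler Development / Anti-Anti-Crawling - Post #918

Teacher, can Scrapy keep a session alive the way requests.session does?

Python Full Series / Stage 16: Python Crawler Development / Advanced Scrapy Framework - Post #919

It keeps doing this. I have tried other versions and none of them work.

Python Full Series / Stage 16: Python Crawler Development / Mobile Crawler Development - Post #920

The name in the save path comes out empty; could you take a look, teacher? Following the instructor's approach from the video raises TypeError: unsupported operand type(s) for +: 'set' and 'str', so I wrote it this way myself.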
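
The error text itself narrows things down: Python raises this `TypeError` when the left operand of `+` is a `set` and the right one is a `str`. A common way to get there, which may or may not be what happened in the code from the video, is writing `{name}` (a set literal) where `'{}'.format(name)` was intended. The variable names and the value below are hypothetical.

```python
name = "photo_01"  # hypothetical value; in a pipeline this would come from the item

# {name} without a format call is a set literal, so + cannot join it to a str.
try:
    path = {name} + ".jpg"
except TypeError as exc:
    message = str(exc)

# Building the string explicitly avoids the problem.
path = "{}.jpg".format(name)
print(message)
print(path)
```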

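Python Full Series / Stage 16: Python Crawler Development / Advanced Scrapy Framework - Post #921

How do I handle this kind of deep, multi-level crawl?

Python Full Series / Stage 16: Python Crawler Development / Anti-Anti-Crawling - Post #922

Why do some class names have parentheses after them and others do not? When should they be added, and when not?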

class Dataoutput():
class Scheduler:
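
In Python 3 the two spellings define the same kind of class: empty parentheses name no base classes, so `object` is the implicit base either way. The parentheses only matter when a base class is listed inside them. A small check, using the two class names from the question plus a hypothetical subclass `RedisScheduler`:

```python
class Dataoutput():   # empty parentheses: no explicit base class
    pass


class Scheduler:      # no parentheses: same result
    pass


class RedisScheduler(Scheduler):  # hypothetical subclass; here the parentheses matter
    pass


print(Dataoutput.__bases__)
print(Scheduler.__bases__)
print(RedisScheduler.__bases__)
```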

Python Full Series / Stage 16: Python Crawler Development / Anti-Anti-Crawling - Post #923

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from urllib.request import Request, build_opener, ProxyHandler
from fake_useragent import UserAgent

url = 'http://httpbin.org/get'
headers = {'User-Agent': UserAgent().chrome}
request = Request(url, headers=headers)
# Two ways to specify a proxy:
# handler = ProxyHandler({"http":"ip:port"})                 # open proxy
# handler = ProxyHandler({"http":"name:password@ip:port"})   # proxy with credentials
handler = ProxyHandler({'http': '27.46.20.189:8888'})
# handler = ProxyHandler({"http": "398707160:j8inhg2g@39.105.117.225:16818"})
opener = build_opener(handler)
response = opener.open(request)
print(response.read().decode())

Output:

Why does "origin" in the result contain two values, and why do they differ from the proxy IP?
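
As far as I know, httpbin fills `origin` from the `X-Forwarded-For` request header when one is present, and each proxy or load balancer on the path may append an address to that header. A comma-separated `origin` therefore usually means the request passed through more than one hop, and a non-anonymous proxy can leave the caller's own address in the list. The sketch below only shows how to split such a value; the JSON body and its addresses (taken from the documentation ranges) are made up for illustration and are not the output of the script above.

```python
import json

# Hypothetical response body, shaped like the output of http://httpbin.org/get
body = '{"origin": "203.0.113.7, 198.51.100.23", "url": "http://httpbin.org/get"}'


def origin_hops(response_text):
    """Return the addresses listed in the "origin" field, in order."""
    origin = json.loads(response_text)["origin"]
    return [part.strip() for part in origin.split(",") if part.strip()]


hops = origin_hops(body)
print(hops)
```

Comparing each entry with the machine's own public address and with the configured proxy address is one way to tell which hop added which value.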

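Python Full Series / Stage 16: Python Crawler Development / Using the Scrapy Framework (Old) - Post #924

After enabling the proxy downloader middleware, the spider file is no longer executed.

Python Full Series / Stage 16: Python Crawler Development / Advanced Scrapy Framework - Post #925

Teacher, what is going on here? python3 setup.py install keeps failing.

Python Full Series / Stage 16: Python Crawler Development / Distributed Crawlers - Post #926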

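Teacher, the network configuration does not succeed, and uninstalling and reinstalling did not help either. Why is that?

Python Full Series / Stage 16: Python Crawler Development / Distributed Crawlers - Post #927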

room.zip

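What I want to do now is get the project address first, and then fetch the content behind "view more property details".
So I added another yield inside parse_info, but the code in the base_info function never runs. Does that mean yield only works inside the parse function?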
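
`yield` is not restricted to `parse`; any Scrapy callback may yield items or further requests. One thing worth checking, offered as a guess because the attached code is not shown here: a function that contains `yield` is a generator function, and calling it does not run its body until something iterates over the result. A plain call such as `self.base_info(response)` inside `parse_info` therefore executes nothing. The plain-Python sketch below reproduces that behaviour without Scrapy; the function names mirror the question and the bodies are hypothetical.

```python
calls = []


def base_info(page):
    calls.append(page)          # runs only when the generator is iterated
    yield {"detail": page}


def parse_info_broken(page):
    base_info(page)             # creates a generator and discards it
    yield {"address": page}


def parse_info_fixed(page):
    yield {"address": page}
    yield from base_info(page)  # iterates the inner generator


broken = list(parse_info_broken("p1"))
calls_after_broken = list(calls)
fixed = list(parse_info_fixed("p2"))
print(broken, calls_after_broken, fixed, calls)
```

In a spider, the usual pattern is to `yield scrapy.Request(url, callback=self.base_info)` from `parse_info`, so that Scrapy itself calls `base_info` with the new response.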

Python Full Series / Stage 16: Python Crawler Development / Dynamic Data Scraping - Post #928

mongo.zip

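Teacher, my code runs, but the data is not saved to the MongoDB database. Why is that?

Python Full Series / Stage 16: Python Crawler Development / Crawler Data Storage - Post #929

Teacher, how can I get all the field names in MongoDB?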
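
MongoDB collections have no fixed schema, so there is no single stored list of field names to read; one common approach is to scan the documents and collect the union of their keys. The sketch below does that over plain dictionaries, which is the shape PyMongo returns from `collection.find()`. The sample documents and the function name `all_field_names` are hypothetical.

```python
def all_field_names(documents):
    """Return the sorted union of top-level keys across all documents."""
    names = set()
    for doc in documents:
        names.update(doc.keys())
    return sorted(names)


# Hypothetical documents; with PyMongo this could be collection.find()
docs = [
    {"_id": 1, "title": "a", "price": 10},
    {"_id": 2, "title": "b", "address": "x"},
]
print(all_field_names(docs))
```

This only looks at top-level keys, and on a large collection it reads every document, so sampling a subset may be more practical.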

Python Full Series / Stage 16: Python Crawler Development / Crawler Data Storage - Post #930

Hello,