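Members can ask questions here, and the 百战程序员 (Baizhan Programmer) teachers answer every one.

Q&As that are helpful to everyone are marked "Recommended".
After finishing a lesson, browse the questions other students have asked; it will help you learn more thoroughly.

To date, students have asked a total of 133,940 questions.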

Python Full Series / Stage 16: Python Crawler Development / Crawler Basics, post 466

Nothing shows up at all:

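Python Full Series / Stage 16: Python Crawler Development / Countering Anti-Crawler Measures, post 467

Teacher, I'm working on the homework assignment: scraping job listings from Lagou (拉勾网). The response tells me I need to pass verification, and adding cookies didn't help. How should I handle this?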

import requests
from fake_useragent import UserAgent
from lxml import etree  # used by the parsing code commented out below

# Random Chrome User-Agent so the request looks like it comes from a browser
headers = {'User-Agent': UserAgent().chrome}
# Form data sent with the request (placeholder credentials)
args = {
    'user': '<phone-number>',
    'password': '<password>'
}
url = 'https://www.lagou.com/jobs/list_python%E5%B7%A5%E7%A8%8B%E5%B8%88/p-city_0?&cl=false&fromSearch=true&labelWords=sug&suginput=pytho'
response = requests.post(url, headers=headers, data=args)
print(response.text)
# e = etree.HTML(response.text)
# names = e.xpath('//h3/text()')
# details = e.xpath('//span[@class="add"]/em/text()')
# prices = e.xpath('//div/span[@class="money"]/text()')
# for name, detail, price in zip(names, details, prices):
#     print(name, ':', detail, ':', price)

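Python Full Series / Stage 16: Python Crawler Development / Mobile Crawler Development, post 468

I have a question. With this kind of site, after logging in, if I want to crawl other pages, do I need to use Splash to drive the browser to the relevant page, or can I simply request the page links afterwards? I would like to see an example that, after passing verification, works with the other pages in the browser.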

Python Full Series / Stage 16: Python Crawler Development / Countering Anti-Crawler Measures, post 469

"""
    Scrape Taobao product prices
"""
import requests
from lxml import etree


def get_html_text(url):
    """Fetch the page and return its HTML, or None if the request fails."""
    ua = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'
    }
    try:
        resp = requests.get(url, headers=ua, timeout=10)
        resp.raise_for_status()
        resp.encoding = resp.apparent_encoding
        return resp.text
    except requests.RequestException as e:
        # Catch only request-related errors instead of BaseException
        print(e)
        return None


def parser_page(html):
    """Print every product title and price found in the HTML."""
    e = etree.HTML(html)
    prices = e.xpath("//div[@class='price g_price g_price-highlight']/strong/text()")
    product_infos = e.xpath("//div[@class='row row-2 title']/a/text()")
    for price, info in zip(prices, product_infos):
        print(f"Product: {info.strip()}\nPrice: {price}\n\n\n")


def main():
    url = 'https://s.taobao.com/search?q=python书籍&imgfile=&js=1&stats_click=search_radio_all%3A1&initiative_id=staobaoz_2'
    html = get_html_text(url)
    if html is None:
        return  # the request failed and the error was already printed
    parser_page(html)


if __name__ == '__main__':
    main()

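Teacher, this is my crawler for scraping product prices. When I run it, the process just finishes and nothing is printed. Am I being blocked by anti-scraping measures?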
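One thing that can be said from the code alone: the loop in `parser_page` prints nothing whenever either XPath query returns an empty list, because `zip()` over an empty list yields no pairs. Whether that is caused by anti-scraping measures cannot be determined from the post. A first diagnostic step might be to report how many nodes each query matched. The helper name `summarize_matches` below is hypothetical and not part of the original script; this is a minimal sketch, not a definitive fix.

```python
def summarize_matches(prices, infos):
    """Describe what the two XPath queries returned.

    `prices` and `infos` are the lists produced by e.xpath(...) in the
    script above. An empty list means the expression matched nothing
    in the HTML that was actually downloaded.
    """
    if not prices or not infos:
        return f"no matches: {len(prices)} prices, {len(infos)} titles"
    # Pair each title with its price, one product per line.
    return "\n".join(
        f"{info.strip()}: {price}" for price, info in zip(prices, infos)
    )


if __name__ == "__main__":
    # Empty results make the silent failure visible.
    print(summarize_matches([], []))
    # Sample values (made up for illustration).
    print(summarize_matches(["59.00"], [" Python book "]))
```

If the summary reports zero matches, printing the first few hundred characters of the downloaded HTML would show whether the server returned the search results or some other page.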

Python Full Series / Stage 16: Python Crawler Development / Countering Anti-Crawler Measures, post 470

Teacher, how do I deal with a site that has a CAPTCHA?

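Python Full Series / Stage 16: Python Crawler Development / Using the Scrapy Framework (Old), post 471

After I set the IP, the emulator reports a network connection problem.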

Python Full Series / Stage 16: Python Crawler Development / Mobile Crawlers, post 472

Teacher, which of these three is used most often?

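Python Full Series / Stage 16: Python Crawler Development / Crawler Basics (Old), post 473

Teacher, how do I remove the text marked in red?

If I don't remove it, the content I get looks like this: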

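Python Full Series / Stage 16: Python Crawler Development / Mobile Crawler Development, post 474

Teacher, how should I understand this line of code, and what does the string(.) in the middle mean?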
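The line of code the question refers to is not shown in the post, so the following is only a general sketch. In XPath 1.0, `string(.)` converts the context node (`.`) to its string-value, which for an element is the concatenation of all text nodes beneath it, including text inside child tags. The example uses the standard library's `itertext()` to show the same result on a made-up snippet; with lxml, `element.xpath('string(.)')` would be the corresponding call.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet: a title whose text is split across a child tag.
snippet = "<h3>Python <em>Engineer</em> (Remote)</h3>"
node = ET.fromstring(snippet)


def string_value(element):
    """Join every text node under `element`, like XPath's string(.)."""
    return "".join(element.itertext())


if __name__ == "__main__":
    # .text holds only the text that comes before the first child tag.
    print(repr(node.text))
    # The string-value also includes text inside <em> and after it.
    print(repr(string_value(node)))
```

This is why `string(.)` is often used when the wanted text is interrupted by inline tags such as `<em>` or `<span>`.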

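Python Full Series / Stage 16: Python Crawler Development / Countering Anti-Crawler Measures, post 475

This is just a follow-up; I couldn't figure out how to reply to the earlier question.

Right now, when I run it, it does not return the page's HTML. I found two debug messages in the output. I have resolved one, and one remains.

I think it is probably

this debug message that is causing the problem.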

Python Full Series / Stage 16: Python Crawler Development / Mobile Crawler Development, post 476

Multithreading, multiprocessing, coroutines

Python Full Series / Stage 16: Python Crawler Development / Crawler Basics (Old), post 477

from urllib.request import urlopen, Request
from urllib.parse import quote

arg = "尚学堂"
# print(quote(arg))

# Percent-encode the keyword so it is safe to put in the query string
url = "https://www.baidu.com/s?wd={}".format(quote(arg))
# Set the User-Agent header
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) 123456"
}
# Wrap the URL and headers in a Request object
req = Request(url, headers=headers)
# Send the request
response = urlopen(req)
# Print the response body
print(response.read().decode())

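Python Full Series / Stage 16: Python Crawler Development / Crawler Basics (Old), post 478

Teacher, why is the boot2docker.iso I downloaded a compressed archive? After extracting it, I did not get the disc image file shown in your video.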

Python Full Series / Stage 16: Python Crawler Development / Scraping Dynamic Data, post 479

Teacher, why do I get this when I run it, while it runs fine in the Terminal?

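Python Full Series / Stage 16: Python Crawler Development / Mobile Crawler Development, post 480

Multithreading, multiprocessing, coroutines

Hello,

Multithreading, multiprocessing, coroutines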
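As a rough illustration of the topic, the sketch below runs the same simulated download with a thread pool and with coroutines. Crawling is mostly waiting on the network, which is the kind of work threads and coroutines handle well; separate processes are generally more useful for CPU-bound work and are left out here. The functions `fake_fetch` and `fake_fetch_async` are hypothetical stand-ins that sleep instead of making real requests.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(page):
    """Stand-in for a blocking HTTP request (no network is used)."""
    time.sleep(0.05)
    return f"page-{page}"


async def fake_fetch_async(page):
    """Stand-in for a non-blocking HTTP request."""
    await asyncio.sleep(0.05)
    return f"page-{page}"


def crawl_with_threads(pages):
    """Run the blocking function in a small pool of worker threads."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(fake_fetch, pages))


async def _gather_pages(pages):
    # gather() schedules all coroutines at once and keeps input order.
    return await asyncio.gather(*(fake_fetch_async(p) for p in pages))


def crawl_with_coroutines(pages):
    """Run all coroutines on a single thread with an event loop."""
    return list(asyncio.run(_gather_pages(pages)))


if __name__ == "__main__":
    pages = range(1, 5)
    print(crawl_with_threads(pages))
    print(crawl_with_coroutines(pages))
```

Both versions return the same list; the difference is that the thread pool uses several OS threads, while the coroutine version interleaves the waits on one thread.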