sky's blog

Python Web Scraping Tutorial 09

loskyertt

Created 2024-10-21 14:35:56, updated 2025-02-17 04:36:55


1. Code example

import requests
from lxml import etree

url = "https://www.spiderbuf.cn/playground/s08"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
}
html = requests.get(url=url, headers=headers).text

# Save the raw response so it can be inspected later
with open('./课程/08course/08.html', 'w', encoding='utf-8') as f:
    f.write(html)

root = etree.HTML(html)
trs = root.xpath('//tr')

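# Write one line per table row, with cells separated by '|'.
# The with-block closes the file, which the original code never did.
with open('./课程/08course/data08.txt', 'w', encoding='utf-8') as f:
    for tr in trs:
        tds = tr.xpath('./td')
        s = ''
        for td in tds:
            # string(.) returns the full text content of the cell
            s = s + str(td.xpath('string(.)')) + '|'
        print(s)
        if s != '':
            f.write(s + '\n')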

Running this code as-is parses no data at all, and the page it fetches is not the one we want.
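One quick way to confirm that a saved page contains no table data is to count its `<tr>` and `<td>` tags. The sketch below is only an illustration: it uses the standard library's `html.parser` instead of lxml so that it runs without third-party packages, and the names `CellCounter` and `count_table_cells` and both sample pages are hypothetical, not the site's real markup.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Counts the <tr> and <td> start tags seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase
        if tag == "tr":
            self.rows += 1
        elif tag == "td":
            self.cells += 1


def count_table_cells(html):
    """Return (row_count, cell_count) for the given HTML text."""
    counter = CellCounter()
    counter.feed(html)
    counter.close()
    return counter.rows, counter.cells


# Hypothetical sample pages, made up for this illustration
page_without_data = "<html><body><p>Nothing to show.</p></body></html>"
page_with_data = "<table><tr><td>1</td><td>Alice</td></tr></table>"

print(count_table_cells(page_without_data))
print(count_table_cells(page_with_data))
```

A result of zero rows and zero cells for the saved file would match the symptom described above: the loop over `//tr` has nothing to iterate over, so nothing is written.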

2. Page analysis

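Open the browser's developer console and select the Network tab:

[Image: 控制台.png]

The request method turns out to be POST, so the code has to send a POST request as well: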

import requests
from lxml import etree

url = "https://www.spiderbuf.cn/playground/s08"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
}
# Form data to send in the body of the POST request
payload = {'level': '8'}
# Send the POST request
html = requests.post(url=url, headers=headers, data=payload).text

# Save the raw response so it can be inspected later
with open('./课程/08course/08.html', 'w', encoding='utf-8') as f:
    f.write(html)

root = etree.HTML(html)
trs = root.xpath('//tr')

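# Write one line per table row, with cells separated by '|'.
# The with-block closes the file, which the original code never did.
with open('./课程/08course/data08.txt', 'w', encoding='utf-8') as f:
    for tr in trs:
        tds = tr.xpath('./td')
        s = ''
        for td in tds:
            # string(.) returns the full text content of the cell
            s = s + str(td.xpath('string(.)')) + '|'
        print(s)
        if s != '':
            f.write(s + '\n')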

The payload can be found here:

[Image: payload.png]

For reference, see "More complicated POST requests" in the Requests documentation.
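To make the role of the payload more concrete, here is a minimal sketch of what a form-encoded body looks like. When a dict is passed as `data=`, Requests form-encodes it; the standard library's `urlencode` produces the same kind of string, so it is used here as a stand-in that runs without a network connection. The JSON version is shown only for contrast, since the page in this tutorial is sent form data.

```python
import json
from urllib.parse import urlencode

# Same payload as in the tutorial code
payload = {"level": "8"}

# Form encoding: the body format used for a dict passed as data=
form_body = urlencode(payload)
print(form_body)

# JSON encoding: a different body format, shown only for comparison
json_body = json.dumps(payload)
print(json_body)
```

If the browser's Network tab lists the payload as form data, `data=payload` is the matching choice. A server that expects a JSON body would instead need `json=payload`.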

Title: Python Web Scraping Tutorial 09
Author: loskyertt
Created at : 2024-10-21 14:35:56
Updated at : 2025-02-17 04:36:55
Link: https://redefine.ohevan.com/2024/10/21/09Python爬虫/
License: This work is licensed under CC BY-NC-SA 4.0.
