夭柳

Python code to scrape all scratch-card (刮刮乐) answers from a Douban post

Copy the code below directly. The cookie may expire; if it does, open Douban in your browser, search for "question" in the search bar, click any id, then copy the Cookie value from the Headers and update it in the code before running.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.chrome.options import Options
    import time
    import requests
    import json
    import urllib.parse

    # Configure Chrome options and enable headless mode
    chrome_options = Options()
    chrome_options.add_argument('--headless')  # do not show the browser window

    # Initialize the WebDriver
    driver = webdriver.Chrome(options=chrome_options)

    # Ask the user for the URL
    url = input("Enter the URL of the target page: ")

    # Open the target page
    driver.get(url)

    # Wait for the page to load
    time.sleep(5)

    # Collect every element on the page that carries a data-id attribute
    data_id_elements = driver.find_elements(By.XPATH, "//*[@data-id]")

    # Extract the data-id values, keeping only those that are exactly 7 digits
    data_ids = [
        element.get_attribute("data-id")
        for element in data_id_elements
        if len(element.get_attribute("data-id")) == 7
        and element.get_attribute("data-id").isdigit()
    ]

    # Print all data-ids that passed the filter
    print(f"Found valid 7-digit data-ids: {data_ids}")

    # Request headers, used to imitate a browser request
    headers = { ...
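The 7-digit filter in the code above can also be pulled out into a small function so it can be checked without starting a browser. This is only a sketch: the name `filter_data_ids` is hypothetical and not part of the original code. It also makes two assumptions the original does not: it drops duplicate ids, and it accepts only ASCII digits (`str.isdigit()` also accepts some non-ASCII digit characters).

```python
import re

# Exactly seven ASCII digits, nothing else.
_SEVEN_DIGITS = re.compile(r"[0-9]{7}")


def filter_data_ids(raw_values):
    """Return the 7-digit ids from raw_values, in first-seen order.

    Hypothetical helper (not in the original post). None values are
    skipped, and duplicates are dropped, which the original list
    comprehension does not do.
    """
    seen = set()
    result = []
    for value in raw_values:
        if value is None:
            continue  # the attribute was missing on this element
        if _SEVEN_DIGITS.fullmatch(value) and value not in seen:
            seen.add(value)
            result.append(value)
    return result
```

With Selenium it could be called as `filter_data_ids(el.get_attribute("data-id") for el in data_id_elements)`, which also reads each attribute once instead of three times.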
