
Python Web Scraping Basics: The requests Module

Reader submission 698 2022-05-30

Requests

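The Python standard library provides modules such as urllib, urllib2, and httplib for making HTTP requests (these are the Python 2 names; in Python 3 they became urllib.request and http.client). Their APIs, however, are awkward. They were built for a different era and a different internet, and they demand a great deal of work, including overriding various methods, to accomplish even the simplest tasks.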

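Requests is an HTTP library written in Python and released under the Apache2 license. It provides a high-level wrapper over lower-level HTTP machinery (urllib3, which in turn builds on the standard library), and it makes issuing network requests far more pleasant for Python developers. With Requests you can easily reproduce most of the HTTP operations a browser performs.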

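Introduction

Installation:

    pip install requests

Commonly used attributes of a requests response:

    response = requests.get(url)
    response.text                  # body decoded to str using response.encoding
    response.content               # raw body as bytes
    response.encoding              # encoding used to decode response.text
    response.apparent_encoding     # encoding guessed from the body itself
    response.status_code           # e.g. 301 permanent redirect, 302 temporary redirect
    response.cookies.get_dict()    # cookies returned by the server, as a dict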
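The relationship between `response.content`, `response.encoding`, and `response.text` is easiest to see with plain bytes. The following is a standard-library-only sketch of that relationship, not requests' internal code; the sample string and the choice of GBK are assumptions made for illustration.

```python
from http import HTTPStatus

# Pretend these bytes are response.content from a server that sent GBK text.
raw_body = "水電費".encode("gbk")

# Decoding with the wrong charset garbles the text, which is what you see
# in response.text when response.encoding was guessed incorrectly.
wrong_text = raw_body.decode("latin-1")

# Decoding with the right charset recovers it, which is the effect of
# setting response.encoding = "gbk" before reading response.text.
right_text = raw_body.decode("gbk")

# The two redirect status codes mentioned above, with their reason phrases.
permanent = HTTPStatus(301).phrase
temporary = HTTPStatus(302).phrase

print(repr(wrong_text), right_text, permanent, temporary)
```

When a page comes back garbled, comparing `response.encoding` with `response.apparent_encoding` is usually the first thing to check.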

1. GET requests

1. Example without parameters

    import requests

    ret = requests.get('https://github.com/timeline.json')
    print(ret.url)
    print(ret.text)

2. Example with parameters

    import requests

    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.get("http://httpbin.org/get", params=payload)
    print(ret.url)
    print(ret.text)
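To see how `params` ends up in the URL without touching the network, the request can be built and prepared but never sent. This is a minimal sketch using `requests.Request(...).prepare()`; the variable name `prepared` is just an illustrative choice.

```python
import requests

payload = {"key1": "value1", "key2": "value2"}

# Build the request without sending it; prepare() applies the same
# query-string encoding that requests.get() would use.
prepared = requests.Request("GET", "http://httpbin.org/get", params=payload).prepare()

print(prepared.url)
```

The printed URL is the same value that `ret.url` reports after a real request.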

2. POST requests

1. Basic POST example

    import requests

    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.post("http://httpbin.org/post", data=payload)
    print(ret.text)
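When `data` is a dict, requests form-encodes it and sets the matching Content-Type header. The sketch below inspects that offline by preparing the request instead of sending it; nothing here contacts httpbin.org.

```python
import requests

payload = {"key1": "value1", "key2": "value2"}

# Prepare (but do not send) the POST so the encoded body can be inspected.
prepared = requests.Request("POST", "http://httpbin.org/post", data=payload).prepare()

print(prepared.headers["Content-Type"])
print(prepared.body)
```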

2. Example sending request headers and data

    import json

    import requests

    url = 'https://api.github.com/some/endpoint'
    payload = {'some': 'data'}
    headers = {'content-type': 'application/json'}

    ret = requests.post(url, data=json.dumps(payload), headers=headers)
    print(ret.text)
    print(ret.cookies)
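The manual `json.dumps` plus header approach above can be compared with the `json=` keyword, which serializes the payload and sets the Content-Type itself. The following sketch prepares both variants offline and checks that they carry the same data; the URL is the placeholder from the example above, and the names `manual` and `shortcut` are illustrative.

```python
import json

import requests

url = "https://api.github.com/some/endpoint"  # placeholder URL, never contacted
payload = {"some": "data"}

# Manual approach: serialize the body and set the header yourself.
manual = requests.Request(
    "POST",
    url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
).prepare()

# Shortcut: json= serializes the payload and sets Content-Type for you.
shortcut = requests.Request("POST", url, json=payload).prepare()

print(manual.headers["Content-Type"], shortcut.headers["Content-Type"])
print(json.loads(manual.body) == json.loads(shortcut.body))
```

Depending on the installed requests version the prepared body may be `str` or `bytes`, which is why the comparison goes through `json.loads` rather than comparing the bodies directly.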

3. Other request methods

    requests.get(url, params=None, **kwargs)
    requests.post(url, data=None, json=None, **kwargs)
    requests.put(url, data=None, **kwargs)
    requests.head(url, **kwargs)
    requests.delete(url, **kwargs)
    requests.patch(url, data=None, **kwargs)
    requests.options(url, **kwargs)

    # All of the methods above are built on top of this one
    requests.request(method, url, **kwargs)

4. More parameters

    import requests


    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.

        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
        :param json: (optional) json data to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
            ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
            or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
            defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
            to add for the file.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How long to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response

        Usage::

          >>> import requests
          >>> req = requests.request('GET', 'http://httpbin.org/get')
        """


    def param_method_url():
        requests.request(method='get', url='http://127.0.0.1:8000/test/')
        requests.request(method='post', url='http://127.0.0.1:8000/test/')


    def param_param():
        # params can be:
        # - a dict
        # - a string
        # - bytes (ASCII characters only)
        requests.request(method='get',
                         url='http://127.0.0.1:8000/test/',
                         params={'k1': 'v1', 'k2': '水電費'})

        requests.request(method='get',
                         url='http://127.0.0.1:8000/test/',
                         params="k1=v1&k2=水電費&k3=v3&k3=vv3")

        requests.request(method='get',
                         url='http://127.0.0.1:8000/test/',
                         params=bytes("k1=v1&k2=k2&k3=v3&k3=vv3", encoding='utf8'))

        # Wrong: the bytes contain non-ASCII characters
        requests.request(method='get',
                         url='http://127.0.0.1:8000/test/',
                         params=bytes("k1=v1&k2=水電費&k3=v3&k3=vv3", encoding='utf8'))


    def param_data():
        # data can be:
        # - a dict
        # - a string
        # - bytes
        # - a file object
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data={'k1': 'v1', 'k2': '水電費'})

        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data="k1=v1; k2=v2; k3=v3; k3=v4")

        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data="k1=v1;k2=v2;k3=v3;k3=v4",
                         headers={'Content-Type': 'application/x-www-form-urlencoded'})

        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data=open('data_file.py', mode='r', encoding='utf-8'),  # file content: k1=v1;k2=v2;k3=v3;k3=v4
                         headers={'Content-Type': 'application/x-www-form-urlencoded'})


    def param_json():
        # The value passed as json is serialized to a string with json.dumps(...)
        # and sent in the request body, with the header
        # {'Content-Type': 'application/json'}
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         json={'k1': 'v1', 'k2': '水電費'})


    def param_headers():
        # Send request headers to the server
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         json={'k1': 'v1', 'k2': '水電費'},
                         headers={'Content-Type': 'application/x-www-form-urlencoded'})


    def param_cookies():
        # Send cookies to the server
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data={'k1': 'v1', 'k2': 'v2'},
                         cookies={'cook1': 'value1'})

        # A CookieJar can also be used (the dict form is a wrapper built on top of it)
        from http.cookiejar import CookieJar
        from http.cookiejar import Cookie

        obj = CookieJar()
        obj.set_cookie(Cookie(version=0, name='c1', value='v1', port=None, domain='', path='/',
                              secure=False, expires=None, discard=True, comment=None,
                              comment_url=None, rest={'HttpOnly': None}, rfc2109=False,
                              port_specified=False, domain_specified=False,
                              domain_initial_dot=False, path_specified=False))
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         data={'k1': 'v1', 'k2': 'v2'},
                         cookies=obj)


    def param_files():
        # Upload a file
        file_dict = {
            'f1': open('readme', 'rb')
        }
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         files=file_dict)

        # Upload a file with a custom filename
        file_dict = {
            'f1': ('test.txt', open('readme', 'rb'))
        }
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         files=file_dict)

        # Upload with a custom filename, passing the content as a string
        file_dict = {
            'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf")
        }
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         files=file_dict)

        # Upload with a custom filename, content type, and extra headers
        file_dict = {
            'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf", 'application/text', {'k1': '0'})
        }
        requests.request(method='POST',
                         url='http://127.0.0.1:8000/test/',
                         files=file_dict)


    def param_auth():
        from requests.auth import HTTPBasicAuth, HTTPDigestAuth

        ret = requests.get('https://api.github.com/user', auth=HTTPBasicAuth('wupeiqi', 'sdfasdfasdf'))
        print(ret.text)

        ret = requests.get('http://192.168.1.1', auth=HTTPBasicAuth('admin', 'admin'))
        ret.encoding = 'gbk'
        print(ret.text)

        ret = requests.get('http://httpbin.org/digest-auth/auth/user/pass', auth=HTTPDigestAuth('user', 'pass'))
        print(ret)


    def param_timeout():
        # A single number applies to both the connect and the read timeout
        ret = requests.get('http://google.com/', timeout=1)
        print(ret)

        # A tuple is (connect timeout, read timeout)
        ret = requests.get('http://google.com/', timeout=(5, 1))
        print(ret)


    def param_allow_redirects():
        ret = requests.get('http://127.0.0.1:8000/test/', allow_redirects=False)
        print(ret.text)


    def param_proxies():
        # Proxies keyed by protocol
        proxies = {
            "http": "61.172.249.96:80",
            "https": "http://61.185.219.126:3128",
        }

        # Alternative: a proxy for one specific host (this replaces the dict above)
        proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}

        ret = requests.get("http://www.proxy360.cn/Proxy", proxies=proxies)
        print(ret.headers)

        from requests.auth import HTTPProxyAuth

        proxyDict = {
            'http': '77.75.105.165',
            'https': '77.75.105.165'
        }
        auth = HTTPProxyAuth('username', 'mypassword')

        r = requests.get("http://www.google.com", proxies=proxyDict, auth=auth)
        print(r.text)


    def param_stream():
        ret = requests.get('http://127.0.0.1:8000/test/', stream=True)
        print(ret.content)
        ret.close()

        from contextlib import closing
        with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
            # Process the response here.
            for i in r.iter_content():
                print(i)


    def requests_session():
        import requests

        session = requests.Session()

        # 1. First visit any page to obtain a cookie
        i1 = session.get(url="http://dig.chouti.com/help/service")

        # 2. Log in, carrying the cookie from the previous request;
        #    the backend authorizes the gpsd value in that cookie
        i2 = session.post(
            url="http://dig.chouti.com/login",
            data={
                'phone': "8615131255089",
                'password': "xxxxxx",
                'oneMonth': ""
            }
        )

        i3 = session.post(
            url="http://dig.chouti.com/link/vote?linksId=8589623",
        )
        print(i3.text)
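The session example above relies on `requests.Session` carrying cookies from one call to the next. Here is a minimal offline sketch of that behavior: instead of visiting a page, it puts a stand-in cookie into the session by hand and then prepares a request to show the cookie being attached. The cookie value `example-token` and the form data are assumptions for illustration, and nothing is sent to dig.chouti.com.

```python
import requests

session = requests.Session()

# Assumption: stand-in for the cookie a server would set on the first visit.
session.cookies.set("gpsd", "example-token")

# prepare_request() merges session state (cookies, default headers)
# into the outgoing request without sending it.
request = requests.Request(
    "POST",
    "http://dig.chouti.com/login",
    data={"phone": "123"},  # hypothetical form data
)
prepared = session.prepare_request(request)

print(prepared.headers.get("Cookie"))
```

A plain `requests.post(...)` call has no such shared state, which is why the login flow needs a `Session` object.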

References:

Wu Peiqi: http://www.cnblogs.com/wupeiqi/articles/6283017.html

Official documentation: http://cn.python-requests.org/zh_CN/latest/
