A simple wrapper around Python's urllib2 web request module

2026-05-03 17:02:04

This is a simple wrapper around urllib2, Python's web request module.

Example:

The code is as follows:

#!/usr/bin/python
#coding: utf-8
import base64
import urllib
import urllib2
import time

class SendRequest:
    '''
    This class sets up and sends an HTTP request, and exposes the response info.
    e.g. set the authorization type, request type..
    e.g. get html content, status code, cookie..
    SendRequest('http://10.75.0.103:8850/2/photos/square/type.json',
                data='source=216274069', type='POST', auth='base',
                user='zl2010', password='111111')
    '''
    def __init__(self, url, data=None, type='GET', auth=None, user=None, password=None, cookie=None, **header):
        '''
        url: request url, raises an error if empty
        data: data for post or get, a url-encoded string (e.g. from urllib.urlencode)
        type: GET, POST
        auth: optional, if given must be 'base' or 'cookie'
        user: user for auth
        password: password for auth
        cookie: cookie string, required when auth is 'cookie'
        other header info:
        e.g. referer='www.sina.com.cn'
        '''
        self.url = url
        self.data = data
        self.type = type
        self.auth = auth
        self.user = user
        self.password = password
        self.cookie = cookie

        # header keys are strings, so look them up by string;
        # a missing key gives None
        self.referer = header.get('referer')

        # 'user-agent' is not a valid keyword name, so it has to be
        # passed as **{'user-agent': '...'}
        self.user_agent = header.get('user-agent')

        self.setup_request()
        self.send_request()

    def setup_request(self):
        '''
        setup a request
        '''
        # string exceptions are not valid, raise a real exception type
        if self.url is None or self.url == '':
            raise ValueError('The url should not be empty!')

        # set request type
        if self.type == 'POST':
            self.Req = urllib2.Request(self.url, self.data)
        elif self.type == 'GET':
            if self.data is None:
                self.Req = urllib2.Request(self.url)
            else:
                self.Req = urllib2.Request(self.url + '?' + self.data)
        else:
            # fail here, otherwise self.Req is never created
            raise ValueError('The http request type is NOT supported now!')

        # set auth type
        if self.auth == 'base':
            if self.user is None or self.password is None:
                raise ValueError('The user or password was not given!')
            else:
                # b64encode does not append a newline, unlike encodestring
                auth_info = base64.b64encode(self.user + ':' + self.password)
                auth_info = 'Basic ' + auth_info
                self.Req.add_header("Authorization", auth_info)
        elif self.auth == 'cookie':
            if self.cookie is None:
                raise ValueError('The cookie was not given!')
            else:
                self.Req.add_header("Cookie", self.cookie)
        else:
            pass    # add other auth type here

        # set other header info
        if self.referer:
            self.Req.add_header('referer', self.referer)
        if self.user_agent:
            self.Req.add_header('user-agent', self.user_agent)

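    def send_request(self):
        '''
        send a request
        '''
        # defaults, so the getters still work when the server returns an error
        self.source = None
        self.goal_url = None
        self.head_dict = {}
        # get a response object
        try:
            self.Res = urllib2.urlopen(self.Req)
            self.source = self.Res.read()
            self.goal_url = self.Res.geturl()
            self.code = self.Res.getcode()
            self.head_dict = self.Res.info().dict
            self.Res.close()
        except urllib2.HTTPError as e:
            # 4xx/5xx responses: keep the status code, other fields stay empty
            self.code = e.code
            print e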

    def get_code(self):
        return self.code

    def get_url(self):
        return self.goal_url

    def get_source(self):
        return self.source

    def get_header_info(self):
        return self.head_dict

    # header names in head_dict are lower-case; dict.get returns None
    # when the header is missing
    def get_cookie(self):
        return self.head_dict.get('set-cookie')

    def get_content_type(self):
        return self.head_dict.get('content-type')

    def get_expires_time(self):
        return self.head_dict.get('expires')

    def get_server_name(self):
        return self.head_dict.get('server')

    def __del__(self):
        pass

# __all__ must list names as strings, not the objects themselves
__all__ = ['SendRequest']

if __name__ == '__main__':
    # The example for using the SendRequest class
    value = {'source': '216274069'}
    data = urllib.urlencode(value)
    url = 'http://10.75.0.103:8850/2/photos/square/type.json'
    user = 'wz_0001'
    password = '111111'
    auth = 'base'
    type = 'POST'
    t2 = time.time()
    rs = SendRequest('http://www.google.com')
    #rs = SendRequest(url, data=data, type=type, auth=auth, user=user, password=password)
    print 't2: ' + str(time.time() - t2)
    print '---------------get_code()---------------'
    print rs.get_code()
    print '---------------get_url()---------------'
    print rs.get_url()
    print '---------------get_source()---------------'
    print rs.get_source()
    print '---------------get_cookie()---------------'
    print rs.get_cookie()
    rs = None
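
The listing above targets Python 2, where urllib2 exists. In Python 3 the same functionality lives in urllib.request and urllib.error. As a rough sketch only, the request-building step (method, data, Basic auth, cookie and extra headers) might look like the following in Python 3. The function name build_request, its parameters, and the example.com URL are assumptions made for illustration and are not part of the original code; the function only builds the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request


def build_request(url, data=None, method='GET', user=None, password=None,
                  cookie=None, referer=None, user_agent=None):
    """Hypothetical helper: build (but do not send) a urllib.request.Request."""
    if not url:
        raise ValueError('The url should not be empty!')
    method = method.upper()
    if method not in ('GET', 'POST'):
        raise ValueError('Unsupported HTTP request type: %s' % method)

    body = None
    if data is not None:
        # data is a dict here; urlencode turns it into "key=value&..."
        query = urllib.parse.urlencode(data)
        if method == 'POST':
            body = query.encode('utf-8')   # POST bodies must be bytes
        else:
            url = url + '?' + query        # GET parameters go in the URL

    req = urllib.request.Request(url, data=body, method=method)

    # Basic auth: base64 of "user:password", prefixed with "Basic "
    if user is not None and password is not None:
        token = base64.b64encode(('%s:%s' % (user, password)).encode('utf-8'))
        req.add_header('Authorization', 'Basic ' + token.decode('ascii'))
    if cookie:
        req.add_header('Cookie', cookie)
    if referer:
        req.add_header('Referer', referer)
    if user_agent:
        req.add_header('User-Agent', user_agent)
    return req


# Example values below are placeholders, not real credentials.
req = build_request('http://example.com/type.json',
                    data={'source': '216274069'}, method='POST',
                    user='user', password='secret')
print(req.get_method())
print(req.data)
print(req.get_header('Authorization'))
```

Keeping the build step separate from the send step means the headers can be inspected without any network access; passing the returned object to urllib.request.urlopen would then send it.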
