Sina Weibo has switched to a new login script; the current version (ssologin.js v1.4.5) uses RSA to encrypt the password.
### Step 1: fetch the prelogin parameters
To log in to Weibo we first send a GET request to: url_prelogin = 'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.5)&_=1364875106625'
which returns four values; they are needed to encrypt the plaintext password locally:
~~~
servertime = data['servertime']
nonce = data['nonce']
pubkey = data['pubkey']
rsakv = data['rsakv']
~~~
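The prelogin endpoint answers with JSONP: the JSON body wrapped in the `preloginCallBack` callback. A self-contained sketch of stripping the wrapper and parsing it, using a made-up sample response:

```python
import json
import re

# Made-up sample of the JSONP body returned by url_prelogin.
jsonp = ('sinaSSOController.preloginCallBack({"retcode":0,'
         '"servertime":1364875106,"nonce":"ABCDEF",'
         '"pubkey":"EB2A38","rsakv":"1330428213"})')

# Strip the callback wrapper, then parse the JSON payload inside.
payload = re.search(r'\((.*)\)', jsonp).group(1)
data = json.loads(payload)
servertime = data['servertime']
nonce = data['nonce']
```

The real response carries a full RSA modulus in `pubkey`; the values above are placeholders.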
### Step 2: encrypt the credentials
Encode the username (su) — it is percent-encoded and then Base64-encoded, not actually encrypted:
~~~
su = base64.b64encode(urllib.quote(username))
~~~
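The snippet above is Python 2 (`urllib.quote`, byte strings). Under Python 3, where `quote` moved to `urllib.parse` and `b64encode` takes bytes, the same encoding might look like this, with a placeholder username:

```python
import base64
from urllib.parse import quote

username = 'test@example.com'  # placeholder account name

# Percent-encode first ('@' becomes %40), then Base64 the result.
su = base64.b64encode(quote(username).encode('utf-8')).decode('ascii')
```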
Encrypt the password (sp):
~~~
rsaPublickey = int(pubkey, 16)
key = rsa.PublicKey(rsaPublickey, 65537)  # 65537 is the fixed public exponent
message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)
sp = binascii.b2a_hex(rsa.encrypt(message, key))
~~~
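The plaintext handed to RSA is simply servertime, nonce and the password joined by a tab and a newline. A quick check of that layout with made-up values (note that `rsa.encrypt` applies random PKCS#1 v1.5 padding, so the resulting `sp` differs on every run even for the same input):

```python
# Sample values; the real ones come from the prelogin response.
servertime = 1364875106
nonce = 'ABCDEF'
password = 'secret'

# servertime TAB nonce NEWLINE password
message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)
```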
Submit the encrypted data:
~~~
postdata = {
    'entry': 'weibo',
    'gateway': '1',
    'from': '',
    'savestate': '7',
    'userticket': '1',
    'ssosimplelogin': '1',
    'vsnf': '1',
    'vsnval': '',
    'su': su,
    'service': 'miniblog',
    'servertime': servertime,
    'nonce': nonce,
    'pwencode': 'rsa2',
    'sp': sp,
    'encoding': 'UTF-8',
    'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
    'returntype': 'META',
    'rsakv': rsakv,
}
~~~
### Step 3: log in
~~~
url_login = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.5)'
resp = session.post(url_login,data=postdata)
~~~
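With `returntype=META`, the login endpoint answers with a small HTML page that redirects via `location.replace(...)`. The redirect URL can be pulled out with a regex; here it runs against a made-up stand-in for that page:

```python
import re

# Made-up stand-in for the META-redirect page the SSO endpoint returns.
html = ("<script>location.replace("
        "'http://weibo.com/ajaxlogin.php?retcode=0&ticket=ST-XXXX');"
        "</script>")

# Capture whatever location.replace() is called with.
login_url = re.findall(r"replace\('(.*)'\)", html)[0]
```

Following that URL with the same session completes the login and sets the weibo.com cookies.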
### Full code
~~~
#coding:utf-8
import requests
import base64
import re
import urllib
import rsa
import json
import binascii
username = 'xxx'
password = 'xxx'
session = requests.Session()
url_prelogin = 'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.5)&_=1364875106625'
url_login = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.5)'
# get servertime, nonce, pubkey, rsakv
resp = session.get(url_prelogin)
json_data = re.search(r'\((.*)\)', resp.content).group(1)
data = json.loads(json_data)
servertime = data['servertime']
nonce = data['nonce']
pubkey = data['pubkey']
rsakv = data['rsakv']
# calc su
su = base64.b64encode(urllib.quote(username))
#calc sp
rsaPublickey= int(pubkey,16)
key = rsa.PublicKey(rsaPublickey,65537)
message = str(servertime) +'\t' + str(nonce) + '\n' + str(password)
sp = binascii.b2a_hex(rsa.encrypt(message,key))
postdata = {
    'entry': 'weibo',
    'gateway': '1',
    'from': '',
    'savestate': '7',
    'userticket': '1',
    'ssosimplelogin': '1',
    'vsnf': '1',
    'vsnval': '',
    'su': su,
    'service': 'miniblog',
    'servertime': servertime,
    'nonce': nonce,
    'pwencode': 'rsa2',
    'sp': sp,
    'encoding': 'UTF-8',
    'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
    'returntype': 'META',
    'rsakv': rsakv,
}
resp = session.post(url_login,data=postdata)
login_url = re.findall(r"replace\('(.*)'\)", resp.content)
#print login_url
resp = session.get(login_url[0])
print resp.content
uid = re.findall('"uniqueid":"(\d+)",',resp.content)[0]
url = "http://weibo.com/u/"+uid
resp = session.get(url)
print resp.content
~~~
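The `uniqueid` lookup near the end of the script is a plain regex over the page body; against a made-up fragment it behaves like this:

```python
import re

# Made-up fragment of the post-login page body.
body = '{"retcode":"0","uniqueid":"1234567890","nick":"demo"}'

uid = re.findall(r'"uniqueid":"(\d+)"', body)[0]
url = 'http://weibo.com/u/' + uid
```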
Full code for posting a status update:
~~~
#coding:utf-8
import requests
import base64
import re
import urllib
import rsa
import json
import binascii
import time
username = 'xxx'
password = 'xxx'
session = requests.Session()
url_prelogin = 'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.5)&_=1364875106625'
url_login = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.5)'
# get servertime, nonce, pubkey, rsakv
resp = session.get(url_prelogin)
json_data = re.search(r'\((.*)\)', resp.content).group(1)
data = json.loads(json_data)
servertime = data['servertime']
nonce = data['nonce']
pubkey = data['pubkey']
rsakv = data['rsakv']
# calc su
su = base64.b64encode(urllib.quote(username))
#calc sp
rsaPublickey= int(pubkey,16)
key = rsa.PublicKey(rsaPublickey,65537)
message = str(servertime) +'\t' + str(nonce) + '\n' + str(password)
sp = binascii.b2a_hex(rsa.encrypt(message,key))
postdata = {
    'entry': 'weibo',
    'gateway': '1',
    'from': '',
    'savestate': '7',
    'userticket': '1',
    'ssosimplelogin': '1',
    'vsnf': '1',
    'vsnval': '',
    'su': su,
    'service': 'miniblog',
    'servertime': servertime,
    'nonce': nonce,
    'pwencode': 'rsa2',
    'sp': sp,
    'encoding': 'UTF-8',
    'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
    'returntype': 'META',
    'rsakv': rsakv,
}
resp = session.post(url_login,data=postdata)
login_url = re.findall(r'replace\("(.*)"\)', resp.content)
#print login_url
resp = session.get(login_url[0])
#print resp.content
uid = re.findall('"uniqueid":"(\d+)",',resp.content)[0]
#print uid
#url = "http://weibo.com/u/"+uid
#resp = session.get(url)
#print resp.content
def decode_content(content):
    result = re.findall(r'<script>STK && STK\.pageletM && STK\.pageletM\.view\((.*?)\)</script>', content)
    for i in result:
        # the page embeds \uXXXX escapes; round-trip through unicode_escape to restore them
        r = i.encode("utf-8").decode('unicode_escape').encode("utf-8")
        print r.replace("\\/", "/")
#url_search = "http://s.weibo.com/weibo/%s?topnav=1&wvr=5&b=1" % "php"
#resp = session.get(url_search)
#decode_content( resp.content )
def add_new(content, resp):
    add_url = "http://weibo.com/aj/mblog/add?_wv=5&__rnd=%s770" % int(time.time())
    add_data = {
        'text': content,
        'rank': 0,
        'rankid': '',
        'location': 'home',
        'module': 'stissue',
        'hottopicid': '',
        '_surl': '',
        'pic_id': '',
        '_t': 0,
    }
    headers = {}
    headers['set-cookie'] = resp.headers['set-cookie']
    headers['Referer'] = 'http://weibo.com/u/' + uid + '?topnav=1&wvr=5'
    resp = session.post(add_url, data=add_data, headers=headers)
    print resp.status_code
add_new("hello",resp)
~~~
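`decode_content` above relies on `unicode_escape` to turn the literal `\uXXXX` sequences embedded in the page's JavaScript back into readable text. A minimal Python 3 sketch of the same round-trip:

```python
# Literal backslash-u sequences, as they appear in the page source.
raw = r'\u4f60\u597d weibo'

# Encode to bytes, then let unicode_escape interpret the \uXXXX sequences.
text = raw.encode('ascii').decode('unicode_escape')
```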
Reposting (forwarding) a status:
~~~
def forward(mid, content):
    forwardurl = "http://weibo.com/aj/mblog/forward?_wv=5&__rnd=%s" % int(time.time())
    data = {'mid': mid, 'style-type': 1, 'reason': content, 'rank': 0, 'location': 'mblog', '_t': 0}
    headers = {}
    headers['set-cookie'] = resp.headers['set-cookie']
    headers['Referer'] = 'http://weibo.com/u/' + uid + '?topnav=1&wv=5'
    respon = session.post(forwardurl, data, headers=headers)
    print respon.status_code
forward('3606151827013483', "转发")
~~~
Following a user:
~~~
def followed(dstuid, oid):
    followedurl = "http://weibo.com/aj/f/followed?_wv=5&__rnd=%s" % int(time.time())
    data = {
        'uid': dstuid,
        'rank': 0,
        '_t': 0,
        'f': 0,
        'oid': oid,
        'nogroup': 'false',
        'challenge_uids': '',
        'check_challenge_value': '',
        'location': 'home',  # the original dict listed 'location' twice; only this value took effect
        'refer_sort': 'interest',
        'refer_flag': 'friend_bridge',
        'loc': 1,
    }
    headers = {}
    headers['set-cookie'] = resp.headers['set-cookie']
    headers['Referer'] = 'http://weibo.com/u/' + oid + '?topnav=1&wv=5'
    respon = session.post(followedurl, data, headers=headers)
    print respon.status_code
followed('2898801847',uid)
~~~
Extracting the search-result markup from the STK.pageletM.view payload:
~~~
result = re.findall(r'<script>STK && STK\.pageletM && STK\.pageletM\.view\(({"pid":"pl_weibo_direct".*?)\)</script>', resp.content)
print eval(result[0])['html'].encode("utf-8").decode('unicode_escape').encode("utf-8")
~~~
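Since the argument of `STK.pageletM.view` is plain JSON, `json.loads` does the same job as `eval` without executing page content; a sketch with a made-up payload:

```python
import json

# Made-up stand-in for the JSON argument of STK.pageletM.view.
payload = '{"pid":"pl_weibo_direct","html":"<div>\\u4f60\\u597d</div>"}'

data = json.loads(payload)  # json.loads decodes the \uXXXX escapes itself
html = data['html']
```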
# Posting a status update, take 2
~~~
import time
def add_new(content):
    add_url = "http://weibo.com/aj/mblog/add?_wv=5&__rnd=%s" % int(time.time())
    add_data = {
        'text': content,
        'rank': 0,
        'rankid': '',
        'location': 'home',
        'module': 'stissue',
        'hottopicid': '',
        '_surl': '',
        'pic_id': '',
        '_t': 0,
    }
    headers = {}
    headers['Referer'] = 'http://weibo.com/u/' + uid + '?topnav=1&wvr=5'
    resp = session.post(add_url, data=add_data, headers=headers)  # the session object is named `session`, not `s`
    print resp.status_code
add_new("@asmcos_智普教育 cookies?")
~~~