iQiyi danmaku (bullet comment) files are served from URLs of the form:
https://cmts.iqiyi.com/bullet/59/00/1307555900_300_1.z?rn=0.7268306364207229&business=danmu&is_iqiyi=true&is_video_page=true&tvid=1307555900&albumid=214500601&categoryid=2&qypid=01010021010000000000
In practice the query string can be dropped, which shortens it to:
https://cmts.iqiyi.com/bullet/59/00/1307555900_300_1.z
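URL structure:
https://cmts.iqiyi.com/bullet/<first two of the last four digits of tvid>/<last two digits of tvid>/<tvid>_300_<x>.z
For tvid 1307555900 the last four digits are 5900, which gives the /59/00/ path segments.
x is the package number. Danmaku are split into one package per 5 minutes, so the number of packages is the total video duration divided by 300 seconds, rounded up.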
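The URL rule above can be sketched in Python roughly as follows. The function names are hypothetical, and the sketch assumes that packages are numbered from 1 (as in the example URL) and that the tvid has at least four digits; neither is confirmed beyond the single example shown here.

```python
import math

BASE_URL = "https://cmts.iqiyi.com/bullet"
SEGMENT_SECONDS = 300  # each package covers 5 minutes


def segment_count(duration_seconds):
    """Number of 5-minute packages needed to cover the whole video."""
    return math.ceil(duration_seconds / SEGMENT_SECONDS)


def danmaku_url(tvid, x):
    """Build the shortened danmaku URL for package x of the given tvid."""
    tvid = str(tvid)
    # Assumption: pad short IDs so four trailing digits always exist.
    padded = tvid.zfill(4)
    first = padded[-4:-2]   # first two of the last four digits
    second = padded[-2:]    # last two digits
    return f"{BASE_URL}/{first}/{second}/{tvid}_{SEGMENT_SECONDS}_{x}.z"


def danmaku_urls(tvid, duration_seconds):
    """All package URLs for a video, assuming numbering starts at 1."""
    total = segment_count(duration_seconds)
    return [danmaku_url(tvid, x) for x in range(1, total + 1)]
```

For example, danmaku_url(1307555900, 1) reproduces the shortened URL shown earlier, and a 45-minute video (2700 seconds) would yield 9 package URLs.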
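Conversion method:
Read the file in binary mode, convert it to a byte array, decompress it with the zlib library, and decode the result as UTF-8.
Python implementation: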
import zlib

# Read the saved .z file as raw bytes
with open('1307555900_300_1.z', 'rb') as f:
    zread = f.read()
zarray = bytearray(zread)
# wbits=15+32 lets zlib auto-detect a zlib or gzip header
xml = zlib.decompress(zarray, 15+32).decode('utf-8')
with open('qiyi.xml', 'w', encoding='utf-8') as f:
    f.write(xml)
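The second argument 15+32 passed to zlib.decompress is the wbits value: 15 selects the largest window size, and adding 32 tells zlib to accept either a zlib or a gzip header automatically. The following minimal sketch illustrates this on a made-up payload (not real iQiyi data); the helper name unpack is hypothetical.

```python
import gzip
import zlib

AUTO_DETECT = 15 + 32  # max window size, plus automatic zlib/gzip header detection


def unpack(data):
    """Decompress zlib- or gzip-wrapped bytes and decode them as UTF-8."""
    return zlib.decompress(data, AUTO_DETECT).decode("utf-8")


# Made-up placeholder payload, only for demonstrating the round trip.
sample = "<danmu>placeholder</danmu>"
raw = sample.encode("utf-8")

zlib_packed = zlib.compress(raw)   # zlib header
gzip_packed = gzip.compress(raw)   # gzip header

# Both containers decompress with the same wbits value.
print(unpack(zlib_packed) == sample)
print(unpack(gzip_packed) == sample)
```

Using the auto-detecting value means the same call works whichever of the two headers the .z file carries.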
If the file has not been saved locally, read the response from the danmaku URL directly:
import zlib
import requests

# url is the danmaku link (omitted here)
zget = requests.get(url).content
zarray = bytearray(zget)
xml = zlib.decompress(zarray, 15+32).decode('utf-8')
Result: