Stock Historical Data Analysis: Coding Along with a Video

August 14, 2018 · Sharing · 51 views

First, the problems I ran into and how I solved them

Error when importing pandas_datareader

import pandas_datareader

This raises:

...
ImportError: cannot import name 'is_list_like'

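According to a Stack Overflow question, the workaround suggested in its comments is to alias the function at its old location before importing pandas_datareader: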

import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like

This resolves the error.
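The alias can also be applied defensively, so that it does nothing on pandas versions that still expose the old name. A minimal sketch; the helper name `ensure_is_list_like` is made up for illustration and is not part of pandas or pandas_datareader:

```python
import pandas as pd


def ensure_is_list_like():
    """Expose is_list_like at the old location if this pandas lacks it.

    Hypothetical helper: call it before importing pandas_datareader.
    """
    if not hasattr(pd.core.common, "is_list_like"):
        pd.core.common.is_list_like = pd.api.types.is_list_like


ensure_is_list_like()
print(pd.core.common.is_list_like([1, 2, 3]))  # a list counts as list-like
print(pd.core.common.is_list_like("BABA"))     # a plain string does not
```

Running the helper first and importing pandas_datareader afterwards keeps the patch in one place instead of repeating the assignment in every notebook.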

Error when fetching Alibaba stock data

alibaba = pdr.get_data_yahoo('BABA')

This raises:

ImmediateDeprecationError: Yahoo Actions has been immediately
deprecated due to large breaks in the API without the introduction of
a stable replacement. Pull Requests to re-enable these data connectors
are welcome.

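The discussion on GitHub mentions patching, editing the code by hand, and a pandas_datareader branch that already contains the fix.
In the end I followed the advice to install the development (dev) version of pandas_datareader.
The command given on GitHub is: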

pip install git+https://github.com/pydata/pandas-datareader.git

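Note: my environment is managed by Anaconda, but the dev package cannot be installed directly with conda.
I also don't have git installed, so I did the following instead: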

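  • Open Anaconda Prompt, the terminal that ships with Anaconda; on Windows it can be launched directly from the Start menu.
  • Before installing the dev package, uninstall the pandas-datareader package that is already installed:
pip uninstall pandas-datareader
  • Download the repository's zip archive from GitHub, extract it, and change into the extracted directory.
  • Then install the dev package from that directory:
python setup.py install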
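After the steps above it is worth confirming which pandas-datareader version the environment actually sees. A small sketch using only the standard library (Python 3.8 or later); `installed_version` is an illustrative helper, not something pandas-datareader provides:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# None here would mean the package is not installed in this environment
print(installed_version("pandas-datareader"))
```

Run it inside the same Anaconda Prompt environment, since each environment has its own set of installed packages.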

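pdr.get_data_yahoo('APPL') returned no data for Apple. At the time I took it for an API problem and dropped Apple's data; note, however, that Apple's ticker symbol is AAPL, not APPL, so the misspelled symbol is the more likely cause.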

Calling sns.distplot() produces a warning

C:\Users\weimo\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.
  warnings.warn("The 'normed' kwarg is deprecated, and has been "
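A comment on GitHub says the warning comes from a version change but does not seem to affect the result. A Chinese-language blog post I found suggests editing the source directly. In
seaborn/distributions.py
hist_kws.setdefault("normed", norm_hist)
is changed to
hist_kws.setdefault("density", norm_hist)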

In the end I decided to leave it alone for now; it is only a warning, after all.
My guess is that it is a data-source problem.
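A middle ground between editing seaborn's source and living with the noise is to silence just this one warning around the call. The sketch below uses only the standard library; `call_without_normed_warning` and `noisy_plot` are hypothetical names, and `noisy_plot` merely stands in for sns.distplot by emitting the same warning text:

```python
import warnings

# Start of the warning text shown above; matched as a regex against the message
NORMED_PATTERN = r"The 'normed' kwarg is deprecated"


def call_without_normed_warning(func, *args, **kwargs):
    """Call func while hiding only the 'normed' deprecation UserWarning."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=NORMED_PATTERN, category=UserWarning
        )
        return func(*args, **kwargs)


def noisy_plot():
    # Stand-in for sns.distplot(...): emits the same warning text
    warnings.warn(
        "The 'normed' kwarg is deprecated, and has been replaced by the "
        "'density' kwarg.",
        UserWarning,
    )
    return "plotted"


print(call_without_normed_warning(noisy_plot))
```

In the notebook the call would look like call_without_normed_warning(sns.distplot, alibaba['daily-return'].dropna(), bins=100, color='red'). Other warnings still get through, because the filter matches only this message.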

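Summary: it feels a lot like learning to plot in MATLAB... The course mentions the book Matplotlib for Python Developers, which now has a second edition (2018); the first edition dates back to 2009. I plan to study it properly. Maybe translate it along the way?
Official link to the book's color-image PDF: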
https://www.packtpub.com/sites/default/files/downloads/MatplotlibforPythonDevelopersSecondEdition_ColorImages.pdf
Chapter code on GitHub:
https://github.com/PacktPublishing/Matplotlib-for-Python-Developers-Second-Edition/
Book home page:
Matplotlib for Python Developers, 2nd Edition

Main notes

import numpy as np
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
from pandas import Series, DataFrame
import pandas_datareader as pdr

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from datetime import datetime
start = datetime(2014, 9, 20)
alibaba = pdr.get_data_yahoo('BABA', start=start)
amazon = pdr.get_data_yahoo('AMZN', start=start)
alibaba.head()
                 High        Low       Open      Close     Volume  Adj Close
Date
2014-09-19  99.699997  89.949997  92.699997  93.889999  271879400  93.889999
2014-09-22  92.949997  89.500000  92.699997  89.889999   66657800  89.889999
2014-09-23  90.480003  86.620003  88.940002  87.169998   39009800  87.169998
2014-09-24  90.570000  87.220001  88.470001  90.570000   32088000  90.570000
2014-09-25  91.500000  88.500000  91.089996  88.919998   28598000  88.919998
amazon.head()
                  High         Low        Open       Close   Volume   Adj Close
Date
2014-09-19  332.760010  325.570007  327.600006  331.320007  6886200  331.320007
2014-09-22  329.489990  321.059998  328.489990  324.500000  3109700  324.500000
2014-09-23  327.600006  321.250000  322.459991  323.630005  2352600  323.630005
2014-09-24  329.440002  319.559998  324.170013  328.209991  2642200  328.209991
2014-09-25  328.540009  321.399994  327.989990  321.929993  2928800  321.929993
#alibaba.shape
#alibaba.describe()
alibaba.to_csv("alibaba.csv")
amazon.to_csv("amazon.csv")
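Because the Yahoo connector proved unreliable, the saved CSV files can stand in for a fresh download later. A sketch of the round trip with a tiny hand-made frame: the two rows are copied from the alibaba.head() output, and the temporary directory exists only for the demo.

```python
import os
import tempfile

import pandas as pd

# Two rows copied from the alibaba.head() output above
sample = pd.DataFrame(
    {"Close": [93.889999, 89.889999], "Volume": [271879400, 66657800]},
    index=pd.to_datetime(["2014-09-19", "2014-09-22"]),
)
sample.index.name = "Date"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "alibaba.csv")
    sample.to_csv(path)
    # index_col/parse_dates restore the DatetimeIndex that to_csv wrote out
    restored = pd.read_csv(path, index_col="Date", parse_dates=True)

print(restored)
```

Without index_col and parse_dates the Date column would come back as an ordinary string column, and time-series plots would lose their date axis.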
alibaba['Adj Close'].plot(legend=True)
<matplotlib.axes._subplots.AxesSubplot at 0x2eb3813a128>



output_5_1.png

# Plot every column except Volume, which is orders of magnitude larger than the prices
for col in alibaba:
    if col == 'Volume':
        continue
    alibaba[col].plot(legend=True)

output_6_0.png

alibaba['high-low'] = alibaba['High'] - alibaba['Low']
alibaba.head()
                 High        Low       Open      Close     Volume  Adj Close  high-low
Date
2014-09-19  99.699997  89.949997  92.699997  93.889999  271879400  93.889999  9.750000
2014-09-22  92.949997  89.500000  92.699997  89.889999   66657800  89.889999  3.449997
2014-09-23  90.480003  86.620003  88.940002  87.169998   39009800  87.169998  3.860001
2014-09-24  90.570000  87.220001  88.470001  90.570000   32088000  90.570000  3.349998
2014-09-25  91.500000  88.500000  91.089996  88.919998   28598000  88.919998  3.000000
alibaba['high-low'].plot(figsize=(25,5))
<matplotlib.axes._subplots.AxesSubplot at 0x2eb388cd6d8>



output_9_1.png

alibaba['daily-return'] = alibaba['Adj Close'].pct_change()
alibaba['daily-return'].plot(figsize=(25,5),linestyle='--',marker='o')
<matplotlib.axes._subplots.AxesSubplot at 0x2eb38b609b0>



output_11_1.png
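For reference, pct_change() computes (p_t - p_{t-1}) / p_{t-1} for each row, so the first row has no predecessor and comes out as NaN. A quick check on the first three Adj Close values from the alibaba.head() table:

```python
import numpy as np
import pandas as pd

# First three Adj Close values from alibaba.head()
adj_close = pd.Series([93.889999, 89.889999, 87.169998])

daily_return = adj_close.pct_change()

# The same quantity written out by hand with a one-row shift
manual = (adj_close - adj_close.shift(1)) / adj_close.shift(1)

print(daily_return)
print(np.allclose(daily_return.dropna(), manual.dropna()))
```

This is also why the histogram call uses dropna(): the leading NaN has to be removed before plotting the distribution.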

sns.distplot(alibaba['daily-return'].dropna(),bins=100,color='red')
C:\Users\weimo\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.
  warnings.warn("The 'normed' kwarg is deprecated, and has been "





<matplotlib.axes._subplots.AxesSubplot at 0x2eb3a9bfac8>



output_12_2.png

start = datetime(2015, 1, 1)
company = ['GOOG', 'MSFT', 'AMZN', 'FB']
#company = 'APPL'
top_tech_df = pdr.get_data_yahoo(company,start=start)['Adj Close']
top_tech_df.head()
Symbols           AMZN         FB        GOOG       MSFT
Date
2014-12-31  310.350006  78.019997  523.521423  42.663837
2015-01-02  308.519989  78.449997  521.937744  42.948578
2015-01-05  302.190002  77.190002  511.057617  42.553627
2015-01-06  295.290009  76.150002  499.212799  41.929050
2015-01-07  298.420013  76.150002  498.357513  42.461777
top_tech_df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x180d6e03f60>



output_15_1.png

top_tech_dr = top_tech_df.pct_change()
top_tech_df[['FB', 'MSFT']].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x180d6de7550>



output_17_1.png

sns.jointplot(x='AMZN', y='GOOG', data=top_tech_dr, kind='scatter')
<seaborn.axisgrid.JointGrid at 0x180d6859f98>



output_18_1.png

sns.pairplot(top_tech_dr.dropna())
<seaborn.axisgrid.PairGrid at 0x180d6db60f0>



output_19_1.png

top_tech_dr['MSFT'].quantile(0.02)
-0.029942679770481886
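quantile(0.02) returns the value below which roughly 2% of the daily returns fall. One common reading, which is an interpretation and not something the post states, is a rough historical estimate of a bad-day loss: on about 98% of days MSFT lost less than about 3%. A small sketch with made-up returns showing pandas' default linear interpolation:

```python
import pandas as pd

# Made-up daily returns, for illustration only
returns = pd.Series([-0.05, -0.02, 0.0, 0.01, 0.03])

q02 = returns.quantile(0.02)

# Default interpolation is linear between the two nearest sorted values:
# position = 0.02 * (n - 1) = 0.08, between the 1st and 2nd smallest
ordered = returns.sort_values().to_numpy()
by_hand = ordered[0] + 0.08 * (ordered[1] - ordered[0])

print(q02, by_hand)
```

With several years of daily data the interpolation matters little, but on short samples the result depends noticeably on it.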


Tags: pandas, numpy, study notes

Last edited: 2018/08/14 12:43
