@jorgesantos
@santosjorge
PLOTCON 2017
A productivity tool that binds the power of Plotly and Pandas.
Optimized for Financial Charting and fast manipulation of data
Interactive
Open Source
import cufflinks as cf
import pandas as pd
import eikon as ek
The app id can be obtained from the API Key Generator
ek.set_app_id('AD2833XXXXXXXX')
Using ek.get_timeseries
to retrieve 2 years of Microsoft Data
msft=ek.get_timeseries('MSFT.O',fields='CLOSE',start_date='2015-05-04',end_date='2017-05-04')
msft.head()
MSFT.O | CLOSE |
---|---|
Date | |
2015-05-04 | 48.24 |
2015-05-05 | 47.60 |
2015-05-06 | 46.28 |
2015-05-07 | 46.70 |
2015-05-08 | 47.75 |
Charting 2 years of Microsoft Data
msft.iplot()
msft.iplot(title='Microsoft',theme='ggplot',width=3)
Getting a best fit line
msft.iplot(kind='lines',color='blue',width=2,bestfit=True,title='Bestfit')
annotations={'Date':'First Event','Date':'Second Event'}
Adding a Quarterly Results Event to a Microsoft Chart
# All annotations have basic support for HTML tags
ann={'2016-01-29':'Microsoft Reports Quarterly Results Above Expectations' \
'<BR>Lowering Outlook, Price Target and Rating.'}
msft.iplot(kind='lines',annotations=ann,textangle=0,title='Microsoft - Estimate Revisions')
We would like to retrieve Net (Adjusted) Dividends and overlay this information over the price chart of Microsoft.
# We will retreive Net and Gross Dividend for From May 2015 to April 2017
fields=['TR.DivExDate','TR.DivAdjustedGross','TR.DivAdjustedNet']
divs,err=ek.get_data('MSFT.O',fields=fields,
parameters={'SDate':'05May15','EDate':'30Apr17','DateType':'AD'})
# Converting to srings to datetime
divs['Dividend Ex Date']=pd.to_datetime(divs['Dividend Ex Date'])
divs=divs.set_index('Dividend Ex Date')[['Adjusted Gross Dividend Amount','Adjusted Net Dividend Amount']]
divs.head()
Adjusted Gross Dividend Amount | Adjusted Net Dividend Amount | |
---|---|---|
Dividend Ex Date | ||
2015-08-18 | 0.31 | 0.31 |
2015-11-17 | 0.36 | 0.36 |
2016-02-16 | 0.36 | 0.36 |
2016-05-17 | 0.36 | 0.36 |
2016-08-16 | 0.36 | 0.36 |
We will generate the Text Labels (including HTML tags) from the own DataFrame columns
# Text Format
table_str="<b>Net: </b> {0} <b><br>Gross: </b> {1}"
divs['text']=divs.apply(lambda x:table_str.format(x['Adjusted Gross Dividend Amount'],
x['Adjusted Net Dividend Amount']),axis=1)
# Dividend Markers
msft_divs=msft.join(divs,how='right').figure(kind='scatter',y='CLOSE',text='text',
mode='markers',hoverinfo='x+text')
# Price Chart
msft_price=msft.figure(colors='blue',title='Microsoft Dividends')
We merge the figures and plot the results
cf.tools.merge_figures([msft_price,msft_divs]).iplot(legend=False)
We will retrieve data from 2015 for both Microsoft and SPX
performance=ek.get_timeseries(['MSFT.O','.SPX'],
fields='CLOSE',
start_date='2015-05-04',
end_date='2017-05-04')
performance.head()
CLOSE | MSFT.O | .SPX |
---|---|---|
Date | ||
2015-05-04 | 48.24 | 2114.49 |
2015-05-05 | 47.60 | 2089.46 |
2015-05-06 | 46.28 | 2080.15 |
2015-05-07 | 46.70 | 2088.00 |
2015-05-08 | 47.75 | 2116.10 |
Normalizing (Percent change from day 0) the values for both instruments and charting the performance.
performance.normalize().iplot(kind='spread',title='Microsoft Performance vs SPX',
width=2,showgrid=False)
def get_estimates(instrument,fields,periods):
fields=[fields] if not isinstance(fields,list) else fields
# The string required for the calculation
string='Percent_Chg(TR.{field}MeanEstimate(period=FY{period1}),' \
'TR.{field}{actOrEst}(period=FY{period0}))'
d=defaultdict(list)
for field in fields:
for period in range(periods):
_f=string.format(field=field,period0=period,
period1=period+1,
actOrEst='ActValue' if period==0 else 'MeanEstimate')
d[field].append(_f)
df,err=ek.get_data(instrument,d[field])
d[field]=[_[0] for _ in df.transpose().values[1:].tolist()]
# Renaming the columns
index=['FY{1}/FY{0}'.format(_,_+1) for _ in range(periods)]
# Returning a DataFrame
return pd.DataFrame(d,index=index)
amz_chg=get_estimates('AMZN.O',fields=['EPS','Revenue'],periods=4)
amz_chg
EPS | Revenue | |
---|---|---|
FY1/FY0 | 36.851837 | 22.197527 |
FY2/FY1 | 71.432683 | 20.910831 |
FY3/FY2 | 62.131399 | 18.671825 |
FY4/FY3 | 43.932644 | 16.330261 |
amz_chg.iplot(kind='bar',title='Amazon Estimates YoY Changes',colorscale='-brbg',
yTitle='Percent Change',theme='ggplot')
We first retrieve the peers of Amazon (including Amazon)
peers,err=ek.get_data(['PEERS(AMZN.O)','AMZN.O'],['TR.CommonName','TR.CompanyMarketCap'])
And get the top 10 by Market Cap
peers=peers.sort_values(by='Company Market Cap',ascending=False)[:10]
peers
Instrument | Company Common Name | Company Market Cap | |
---|---|---|---|
29 | AAPL.OQ | Apple Inc | 7.715562e+11 |
10 | GOOGL.OQ | Alphabet Inc | 6.486604e+11 |
50 | AMZN.O | Amazon.com Inc | 4.497893e+11 |
11 | FB.OQ | Facebook Inc | 4.398198e+11 |
1 | BABA.N | Alibaba Group Holding Ltd | 2.883858e+11 |
2 | WMT.N | Wal-Mart Stores Inc | 2.296707e+11 |
15 | PCLN.OQ | Priceline Group Inc | 9.230081e+10 |
17 | NFLX.OQ | Netflix Inc | 6.705998e+10 |
30 | PYPL.OQ | PayPal Holdings Inc | 5.888569e+10 |
19 | YHOO.OQ | Yahoo! Inc | 4.650770e+10 |
Getting the EPS estimates for each of them
peers_eps={}
for ins in peers['Instrument']:
peers_eps[ins]=get_estimates(ins,'EPS',4).to_dict()['EPS']
peers_df=pd.DataFrame(peers_eps)
columns=peers[['Instrument','Company Common Name']].set_index('Instrument').to_dict()['Company Common Name']
peers_df=peers_df.rename(columns=columns).transpose()
peers_df.head()
FY1/FY0 | FY2/FY1 | FY3/FY2 | FY4/FY3 | |
---|---|---|---|---|
Apple Inc | 7.497834 | 16.079019 | 6.666622 | 1.440143 |
Amazon.com Inc | 36.851837 | 71.432683 | 62.131399 | 43.932644 |
Alibaba Group Holding Ltd | 42.290507 | 23.503356 | 27.087412 | 25.123601 |
Facebook Inc | 28.854374 | 18.251770 | 26.584551 | 25.350571 |
Alphabet Inc | -1.189196 | 18.595541 | 18.491230 | 15.634594 |
peers_df.iplot(kind='bar',colorscale='-purples',title='Amazon Peers - EPS Est % Change YoY',
yTitle='Percentage')
First we define the Universe over which the screening will be performed
# From the Russell 1000 Growth Index as strarting universe
# With a Market Cap between 5B and 20B USD
# With Analyst recommendations greater than 80 and Price to Mid Term component greater than 30
# And EPS >=10 and Revenue >=5 Smart Estimates (All in USD)
universe='SCREEN(U(IN(indices(4387903/*Russell 1000 Growth*/))), ' \
'TR.CompanyMarketCap>=5000000000, TR.CompanyMarketCap<20000000000, ' \
'TR.ARM100Region>=80, TR.PriceMoMidTermComponent>=30, ' \
'TR.EPSSmartEstLastYrGrowth>=10, TR.RevenueSmartEstLastYrGrowth>=5, CURN=USD)'
fields=['TR.CommonName','TR.PricePctChg52W',
'TR.CompanyMarketCap','TR.Revenue(Period:FY0)',
'TR.ARM100Region','TR.TRBCEconomicSector']
df,err=ek.get_data(universe,fields=fields)
We get the top 50 companies by Market Cap
df=df.sort_values(by='Company Market Cap',ascending=False)[:50]
df.head()
Instrument | Company Common Name | 52-week Price PCT Change | Company Market Cap | Revenue | Analyst Revisions Model Region Rank | TRBC Economic Sector Name | |
---|---|---|---|---|---|---|---|
0 | ROK.N | Rockwell Automation Inc | 37.540050 | 1.987381e+10 | 5.879500e+09 | 88 | Industrials |
36 | CFG.N | Citizens Financial Group Inc | 65.304309 | 1.895192e+10 | NaN | 99 | Financials |
1 | SWKS.OQ | Skyworks Solutions Inc | 51.342883 | 1.871298e+10 | 3.289000e+09 | 96 | Technology |
4 | WDAY.N | Workday Inc | 23.301105 | 1.837550e+10 | 1.569407e+09 | 93 | Technology |
16 | COL.N | Rockwell Collins Inc | 15.661032 | 1.677746e+10 | 5.259000e+09 | 99 | Industrials |
And add the Text on hover information
text='<b>RIC: </b>{0}<br><b>Name: </b>{1}<br><b>Market Cap: </b>{2} <br><b>ARM: </b>{2}'
df['text']=df.apply(lambda x:text.format(
x['Instrument'],
x['Company Common Name'],
round(x['Company Market Cap']/100000000,2),
x['Analyst Revisions Model Region Rank']),axis=1)
df.iplot(kind='bubble',x='52-week Price PCT Change',y='Revenue',size='Company Market Cap',
text='text',categories='TRBC Economic Sector Name',xTitle='52-week Price PCT Change',
yTitle='Revenue',zerolinewidth=2,zerolinecolor='darkgrey',colorscale='set3')
We can add another axis (z) and see them in a 3D form
df.iplot(kind='bubble3d',x='52-week Price PCT Change',y='Revenue',z='Analyst Revisions Model Region Rank',
size='Company Market Cap',text='Company Common Name',categories='TRBC Economic Sector Name',
xTitle='52-week Price PCT Change',yTitle='Revenue',zTitle='Analyst Recs',zerolinewidth=3,
zerolinecolor='darkgrey',colorscale='set3',width=0)
We retrieve the analysts' recommendations for the FTSE
recs,err=ek.get_data('0#.FTSE',['TR.CompanyName','TR.NumOfBuy','TR.NumOfHold','TR.NumOfSell'])
And a bit of data wrangling
del recs['Instrument']
recs=recs.set_index('Company Name')
recs=recs.rename(columns=dict(zip(recs.columns.values,['Buy','Hold','Sell'])))
# We get all recommendatins as percent of the total
recs=recs.apply(lambda x:x/x.sum(),axis=1).sort_values(by='Buy')
recs.head()
Buy | Hold | Sell | |
---|---|---|---|
Company Name | |||
Intu Properties PLC | 0.062500 | 0.437500 | 0.500000 |
Pearson PLC | 0.062500 | 0.625000 | 0.312500 |
WM Morrison Supermarkets PLC | 0.066667 | 0.533333 | 0.400000 |
Next PLC | 0.090909 | 0.681818 | 0.227273 |
InterContinental Hotels Group PLC | 0.105263 | 0.631579 | 0.263158 |
And we chart the results for all companies
recs.iplot(kind='barh',barmode='stack',margin={'l':300},dimensions=(1000,600),
colors=['green','blue','pink'])
We need to add the sector information to each company
sectors,err=ek.get_data('0#.FTSE',['TR.TRBCEconomicSector','TR.CompanyName'])
sectors=sectors.set_index('Company Name')
del sectors['Instrument']
recs=recs.join(sectors)
Some group by magic..
# Grouping by Sector (mean of percentage values)
recs=recs.groupby('TRBC Economic Sector Name').mean().sort_values(by='Buy')
And plot the sector results
recs.iplot(kind='barh',barmode='stack',margin={'l':200},colors=['green','blue','pink'],
title='FTSE Analyst Recommendations by Sector')
Presidential Inauguration Dates
potus={'Barack Obama': '2009-01-20',
'Donald J. Trump': '2017-01-20',
'George Bush': '1989-01-20',
'William J. Clinton': '1993-01-20',
'George W. Bush': '2001-01-20'}
potus_df={}
for p in potus:
# Getting the SPX closing values for the first 100 days
df=ek.get_timeseries('.SPX','CLOSE',
start_date=potus[p],
end_date=cf.date_tools.getDatefromDate(potus[p],100,'%Y-%m-%d'))
potus_df[p]=df['CLOSE'].values
Some data wrangling..
min_len=min([len(potus_df[k]) for k in potus_df])
for k in potus_df:
potus_df[k]=potus_df[k][:min_len]
df=pd.DataFrame(potus_df)
df.pct_change(1).iplot(kind='histogram',colorscale='set1',width=0.5,
title='Potus First 100 Days<br>1d Returns Distribution')
df.pct_change(1).iplot(kind='histogram',colorscale='set1',width=0.5,
title='Potus First 100 Days<br>1d Returns Distribution',subplots=True,
shared_xaxes=True)
df.normalize().iplot(colorscale='set1',width=2,legend='bottom',title='First 100 Days POTUS Returns (SPX)')