Warren Buffett Value Investing like a Quant

In this tutorial we use the key indicators that Warren Buffett relies on to judge the strength of an underlying business, so that we can find excellent stocks that are worth investigating further. However, the ASX alone currently has 2061 listed stocks, so how can we possibly reduce that number? With our Quant hat on, we can expedite this process using some key assumptions and our skills in Python.
Let’s import the dependencies.

import datetime as dt 
import pandas as pd

import concurrent.futures as cf
from yahoofinancials import YahooFinancials

import re
import ast
import time
import requests
from bs4 import BeautifulSoup

We will get a list of stocks from the following websites. You may wish to use this approach to create your own stock list of interest.

asx_200 = 'https://www.asx200list.com/'
all_ords = 'https://www.allordslist.com/'
small_ords = 'https://www.smallordslist.com/'

header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
}

# One list per index, filled in the same order as the URLs below.
ASX200, ALLORDS, SMALLORDS = [], [], []
asx_list = [ASX200, ALLORDS, SMALLORDS]
for index, url in enumerate([asx_200, all_ords, small_ords]):
    res = requests.get(url, headers=header)
    soup = BeautifulSoup(res.text, 'html.parser')
    divs = soup.findAll('table', class_='tableizer-table sortable')[0].findAll('tbody')
    for i, val in enumerate(divs[0]):
        if len(val) > 1:
            # Strip the row and cell tags so that each closing tag collapses to '/'.
            text = re.sub(r"[<trd>]","", str(val))
            text = text.split('/')
            # Assumes the first cell of each row holds the stock code.
            asx_list[index].append(text[0])
print('ASX200', ASX200)
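The regular expression in the loop is easier to follow on a single row. The sketch below applies the same substitution to a made-up table row (the HTML string is an assumption for illustration, not copied from the site) and shows that the first piece after splitting on '/' is the stock code.

```python
import re

# Hypothetical row, shaped like one <tr> of a constituents table.
sample_row = "<tr><td>CBA</td><td>Commonwealth Bank</td></tr>"

# Remove every '<', 't', 'r', 'd' and '>' character, so that each closing
# tag collapses to a single '/'.
stripped = re.sub(r"[<trd>]", "", sample_row)
parts = stripped.split('/')
code = parts[0]

print(stripped)
print(code)
```

Note that the character class also deletes the letters t, r and d from the company name. That is harmless here because only the upper-case code is kept.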

Let’s check for duplicates.

stockList = ASX200
stocks = [stock + '.AX' for stock in stockList]
stocks_set = set(stocks)

# Two equivalent checks: compare the set size, or count occurrences directly.
contains_duplicates = len(stocks_set) != len(stocks)
contains_duplicates = any(stocks.count(stock) > 1 for stock in stocks)
print(len(stocks_set), len(stocks), contains_duplicates)
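If the check does report duplicates, it helps to know which tickers are repeated and to drop the repeats without losing the original order. Here is a minimal sketch using only the standard library; the ticker codes are made up for the example.

```python
from collections import Counter

# Hypothetical stock list with one repeated code.
codes = ['CBA', 'BHP', 'CSL', 'BHP']
tickers = [code + '.AX' for code in codes]

# Count occurrences and keep the tickers that appear more than once.
duplicates = [t for t, n in Counter(tickers).items() if n > 1]

# dict.fromkeys keeps the first occurrence of each ticker, in order.
unique_tickers = list(dict.fromkeys(tickers))

print(duplicates)
print(unique_tickers)
```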

Now we can rely on a Python module called yahoofinancials to retrieve Yahoo Finance data by the bucket load. We can speed this up with multithreading, because calling this API is input/output limited rather than CPU limited. So create a function that we can pass a stock into, and make sure you only use the number of threads that are available to you. Check how many cores your computer has and then the number of threads per core.

balanceSheet = {}
incomeStatement = {}
cashStatement = {}

def retrieve_stock_data(stock):
    try:
        yahoo_financials = YahooFinancials(stock)
        balance_sheet_data = yahoo_financials.get_financial_stmts('annual', 'balance')
        income_statement_data = yahoo_financials.get_financial_stmts('annual', 'income')
        cash_statement_data = yahoo_financials.get_financial_stmts('annual', 'cash')

        # Store each statement under the ticker in the shared dictionaries.
        balanceSheet[stock] = balance_sheet_data['balanceSheetHistory'][stock]
        incomeStatement[stock] = income_statement_data['incomeStatementHistory'][stock]
        cashStatement[stock] = cash_statement_data['cashflowStatementHistory'][stock]
    except Exception:
        print('error with retrieving stock data', stock)

Now it is time to execute the function. However, if you are running a Python script (rather than JupyterLab) the data will not be cached between runs, so you will have to save it to local files as follows.

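# Run this once the download below has finished, otherwise the dictionaries are still empty.
with open('balanceSheet_ASX200.txt', 'w') as output:
    output.write(str(balanceSheet))
with open('incomeStatement_ASX200.txt', 'w') as output:
    output.write(str(incomeStatement))
with open('cashStatement_ASX200.txt', 'w') as output:
    output.write(str(cashStatement))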

Now run the multithreading function and wait for your data…

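start = time.time()
executor = cf.ThreadPoolExecutor(16)
futures = [executor.submit(retrieve_stock_data, stock) for stock in stocks]
# Block until every download has finished so that the timing is meaningful.
cf.wait(futures)
executor.shutdown()
end = time.time()
print('  time taken {:.2f} s'.format(end-start))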
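To see why threads help with input/output limited work, the sketch below replaces the API call with a stand-in that simply sleeps. The function `fake_fetch`, the tickers and the pool size are all assumptions for illustration. It also returns results through the futures instead of writing to shared dictionaries, which is one possible design rather than the approach used in this tutorial.

```python
import concurrent.futures as cf
import os
import time

def fake_fetch(ticker):
    """Stand-in for a network call: wait briefly, then return a result."""
    time.sleep(0.05)
    return ticker, {'rows': 4}

tickers = ['T{}.AX'.format(i) for i in range(8)]  # hypothetical tickers

# Sequential baseline: each call waits for the previous one.
start = time.time()
sequential = dict(fake_fetch(t) for t in tickers)
sequential_time = time.time() - start

# Threaded version: the pool size is an arbitrary choice for this demo.
workers = min(8, (os.cpu_count() or 1) * 2)
results, failed = {}, []
start = time.time()
with cf.ThreadPoolExecutor(max_workers=workers) as executor:
    futures = {executor.submit(fake_fetch, t): t for t in tickers}
    for future in cf.as_completed(futures):
        try:
            ticker, data = future.result()
            results[ticker] = data
        except Exception:
            # Keep track of tickers that failed so they can be retried.
            failed.append(futures[future])
threaded_time = time.time() - start

print('logical CPUs:', os.cpu_count())
print('sequential {:.2f} s, threaded {:.2f} s'.format(sequential_time, threaded_time))
```

Leaving the `with` block waits for all submitted tasks, so the timing covers the full download.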

How can you get your hands on more data? Well, the ASX website has a complete list of listed stocks available at https://www2.asx.com.au/markets/trade-our-cash-market/directory

Visit the website, download the csv file and import it into Python. You will then need to re-run the multithreading function with this stock list.

asxComp = pd.read_csv('ASX_Listed_Companies_31-08-2021_06-51-02_AEST.csv')
stockList = asxComp.loc[:,'ASX code'].to_list()

To open your saved data files, you can use the following code to retrieve dictionaries from text files (use the file names you saved them under). Remember to use the ast.literal_eval function to ensure you are creating dictionaries.

with open('balanceSheet.txt', 'r') as input:
    balanceSheet = ast.literal_eval(input.read())

with open('incomeStatement.txt', 'r') as input:
    incomeStatement = ast.literal_eval(input.read())
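The save and load steps can be checked end to end on a small example. This is a minimal sketch that writes a made-up dictionary to a temporary file and reads it back; the ticker, date and figures are hypothetical.

```python
import ast
import os
import tempfile

# Hypothetical data in the same nested shape as the statement dictionaries.
balance = {
    'ABC.AX': [{'2021-06-30': {'totalStockholderEquity': 400, 'commonStock': None}}]
}

path = os.path.join(tempfile.mkdtemp(), 'balanceSheet_demo.txt')

# Save: write the string representation of the dictionary.
with open(path, 'w') as f:
    f.write(str(balance))

# Load: literal_eval turns the text back into Python objects.
with open(path, 'r') as f:
    restored = ast.literal_eval(f.read())

print(restored == balance)
```

This round trip works for plain values such as numbers, strings, None, lists and dictionaries. A value like float('nan') is written as nan, which ast.literal_eval cannot read back.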

Now we can use the data to calculate the return on equity (ROE) and the earnings per share (EPS) growth over a number of years for each stock.

  1. ROE = Net Income / Shareholder’s Equity
  2. EPS = Profit / Common Shares
  3. EPSG = % annualised increase in EPS
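
As a worked example of the formulas above, here is a small sketch with made-up figures. It reads "annualised increase" as a compound annual growth rate, which is an assumption, and the function names are hypothetical helpers.

```python
def return_on_equity(net_income, equity):
    """ROE as a percentage."""
    return round(net_income / equity * 100, 2)

def annualised_growth(old_eps, new_eps, years):
    """Compound annual growth rate of EPS, as a percentage."""
    return round(((new_eps / old_eps) ** (1 / years) - 1) * 100, 2)

# Net income of 50 on equity of 400.
roe = return_on_equity(50, 400)

# EPS rising from 1.00 to 1.21 over two years.
epsg = annualised_growth(1.00, 1.21, 2)

print(roe, epsg)
```

With these numbers the ROE is 12.5% and the EPS growth is 10% a year, so both would pass a 10% requirement.
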
roe_dict, epsg_dict = {}, {}
count_missing, count_cond, count_eps_0 = 0, 0, 0
for (keyB, valB), (keyI, valI) in zip(balanceSheet.items(), incomeStatement.items()):
    try:
        if keyB == keyI:
            yearsI = [k for year in valI for k, v in year.items()]
            yearsB = [k for year in valB for k, v in year.items()]
            if yearsI == yearsB:
                count_cond += 1
                equity = [v['totalStockholderEquity'] for year in valB for k, v in year.items()]
                commonStock = [v['commonStock'] for year in valB for k, v in year.items()]

                profit = [v['grossProfit'] for year in valI for k, v in year.items()]
                revenue = [v['totalRevenue'] for year in valI for k, v in year.items()]
                netIncome = [v['netIncome'] for year in valI for k, v in year.items()]

                roe = [round(netin/eq*100,2) for netin, eq in zip(netIncome, equity)]
                roe_dict[keyB] = (round(sum(roe)/len(roe),2), roe)

                eps = [round(earn/stono,2) for earn, stono in zip(profit, commonStock)]
                try:
                    epsg = []
                    for ep in range(len(eps)):
                        if ep == 0:
                            continue  # the first year is the reference point
                        elif ep <= 3:
                            # Assumed calculation: annualised growth (%) between year `ep`
                            # and the first year in the list, taken to be the most recent.
                            epsg.append(round(100*((eps[0]/eps[ep])**(1/ep)-1),2))
                        else:
                            print('More than 4 years of FY data')
                    epsg_dict[keyB] = (round(sum(epsg)/len(epsg),2), epsg)
                except Exception:
                    # print(keyB, 'eps contains 0')
                    count_eps_0 += 1
                    epsg_dict[keyB] = (0, eps)
    except Exception:
        # print(keyB, 'data missing')
        count_missing += 1

print('Yearly data avail',count_cond, 'out of', len(balanceSheet))
print('Some key data missing', count_missing, 'out of', len(balanceSheet))
print('EPS Growth NaN', count_eps_0, 'out of', len(balanceSheet))
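
The nested list comprehensions are easier to read against a concrete example. The sketch below assumes each statement dictionary maps a ticker to a list of {report date: {field: value}} entries, which is the shape the loop above iterates over. The ticker and all figures are made up.

```python
# Hypothetical statements for one ticker; all figures are made up.
balance_demo = {'ABC.AX': [
    {'2021-06-30': {'totalStockholderEquity': 400}},
    {'2020-06-30': {'totalStockholderEquity': 500}},
]}
income_demo = {'ABC.AX': [
    {'2021-06-30': {'netIncome': 60}},
    {'2020-06-30': {'netIncome': 50}},
]}

roe_demo = {}
for ticker in balance_demo:
    # Collect the report dates from each statement.
    years_b = [date for year in balance_demo[ticker] for date in year]
    years_i = [date for year in income_demo[ticker] for date in year]
    if years_b != years_i:
        continue  # skip tickers whose statements cover different dates

    # Flatten one field per year into a plain list.
    equity = [f['totalStockholderEquity'] for year in balance_demo[ticker] for f in year.values()]
    net_income = [f['netIncome'] for year in income_demo[ticker] for f in year.values()]

    # Yearly ROE in percent, stored with its average as (average, yearly values).
    roe = [round(n / e * 100, 2) for n, e in zip(net_income, equity)]
    roe_demo[ticker] = (round(sum(roe) / len(roe), 2), roe)

print(roe_demo)
```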

Finally, we can apply our conditions to the two dictionaries and print the stocks that are present in both of them. These are the companies that are worth further investigation.

ROE_req = 10
EPSG_req = 10

print('-'*50, 'RETURN ON EQUITY','-'*50)
roe_crit = {k:v for (k,v) in roe_dict.items() if v[0] >= ROE_req and sum(n < 0 for n in v[1])==0}
# print(roe_crit)
print('-'*50, 'EARNINGS PER SHARE GROWTH','-'*50)
eps_crit = {k:v for (k,v) in epsg_dict.items() if v[0] >= EPSG_req and sum(n < 0 for n in v[1])==0}
# print(eps_crit)

print('-'*50, 'ROE & EPS Growth Criteria','-'*50)
both = [key1 for key1 in roe_crit.keys() for key2 in eps_crit.keys() if key2==key1]
print(both)
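
The same screen can be tried on a few made-up entries to see which conditions knock a stock out. In this sketch the tickers, figures and the helper `passes` are hypothetical, and a set intersection stands in for the nested loop.

```python
# Hypothetical results in the (average, yearly values) format used above.
roe_demo = {
    'AAA.AX': (15.0, [14.0, 16.0]),
    'BBB.AX': (12.0, [30.0, -6.0]),   # one negative year
    'CCC.AX': (8.0, [8.0, 8.0]),      # average below the requirement
}
epsg_demo = {
    'AAA.AX': (11.0, [10.0, 12.0]),
    'BBB.AX': (20.0, [20.0, 20.0]),
    'CCC.AX': (25.0, [25.0, 25.0]),
}

def passes(entry, minimum):
    """True if the average meets the minimum and no year is negative."""
    average, yearly = entry
    return average >= minimum and all(n >= 0 for n in yearly)

roe_ok = {k for k, v in roe_demo.items() if passes(v, 10)}
eps_ok = {k for k, v in epsg_demo.items() if passes(v, 10)}

# Stocks that satisfy both criteria.
shortlist = sorted(roe_ok & eps_ok)
print(shortlist)
```

Only AAA.AX survives: BBB.AX has a negative ROE year and CCC.AX falls short of the average ROE requirement.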

Thank you for following the tutorial and have fun using this code.