How to Make a Python Fully Automatic AI Based Trading System

Regadermawans
12 min read · Dec 19, 2020

A couple of weeks ago I was casually chatting with a friend, masks on, social distance, the usual stuff. He was telling me how he was trying to, and I quote, detox from the broker app he was using. I asked him about the meaning of the word detox in this particular context, worrying that he might go broke, but nah: he told me that he was constantly trading. “If a particular stock has been going up for more than one hour or so and I’m already over the 1% profit threshold then I sell”, he said, “among other personal rules I’ve been following”. Leaving aside the slight pseudoscientific aspect of those rules, I understood what he meant by detox: following them implied checking the phone an astronomically high number of times.

So I started wondering: would it be possible to automate the set of rules this guy has in mind? And actually — would it be possible to automate a saner set of rules, so I let the system do the trading for me? Since you’re reading this I assume you got caught by the title, so you’ve probably already guessed that the answer is yes. Let’s elaborate on that, but first of all: time is gold and I don’t want to clickbait anyone. This is what we’re going to do:

  1. Get some real-time, granular stock price data: ideally, in one-minute intervals. The richer the better; we’re going to use Yahoo! Finance for that, more details to follow.
  2. Instead of a personal set of rules, we are going to add some AI flavour to the system. Full disclosure: I’m by no means an expert in time series analysis, there are already lots of tutorials out there about how to train neural networks to trade, and I don’t really want to overengineer a toy system like this, so let’s keep it simple: a very basic ARIMA model will do for now.
  3. At this point we’ll have the data and the prediction coming from the algorithm, so we should be able to decide whether to sell, buy or hold; we need to connect with our broker to actually perform the action. We are going to use RobinHood and Alpaca.
  4. That’s pretty much it: the system is finished. The last thing we need is to deploy it somewhere, in our case AWS, and monitor the activity. I’ve chosen to send a Telegram message to a group every time an action is performed by my system.

And what are we going to need?

  • Python 3.6 with some libraries.
  • An AWS account with admin rights, for storage and deployment.
  • Node.js, just to set up the serverless framework for deployment.
  • A Telegram account, for monitoring.

Everything I’ve coded is available here. Okay! So, without further ado, let’s go for the first part: getting the data.

Getting the data

Getting the data is not easy. Some years ago there was an official Yahoo! Finance API, as well as alternatives like Google Finance — sadly, both have been discontinued for years now. But don’t worry, there’s still plenty of alternatives in the market. My personal requirements were:

  • Free of charge: for a production system I would definitely change this bullet point to cheap alternatives, but for a toy system, or proof of concept, whatever you want to call it, I want it free.
  • High rate limit: ideally no limit, but anything above 500-ish hits per minute is more than enough.
  • Real-time data: some APIs provide data with a slight delay, let’s say 15 minutes. I want the real deal: the closest I can get to the real-time price of the stock.
  • Easy to use: again, this is just a POC. I want the easiest one.

With that list in mind, I went for yfinance — the unofficial alternative to the old Yahoo Finance API. Bear in mind that for a real system, and based on the awesome list provided by Patrick Collins, I would definitely choose the Alpha Vantage API — but let’s keep it simple for now.

The yfinance library was developed by Ran Aroussi to get access to the Yahoo! Finance data when the official API was shut down. Quoting from the GitHub repository,

Ever since Yahoo! finance decommissioned their historical data API, many programs that relied on it to stop working.

yfinance aimes to solve this problem by offering a reliable, threaded, and Pythonic way to download historical market data from Yahoo! finance.

Sweet, good enough for me. How does it work? First we need to install it:

$ pip install yfinance --user

And then we can access everything using the Ticker object:

import yfinance as yf

google = yf.Ticker("GOOG")

That method is quite fast, slightly above 0.005 seconds on average, and returns LOTS of info about the stock; for instance, google.info contains 123 fields, including the following:

52WeekChange: 0.3531152
SandP52WeekChange: 0.17859101
address1: 1600 Amphitheatre Parkway
algorithm: None
annualHoldingsTurnover: None
annualReportExpenseRatio: None
ask: 1815
askSize: 1100
...
twoHundredDayAverage: 1553.0764
volume: 1320946
volume24Hr: None
volumeAllCurrencies: None
website: http://www.abc.xyz
yield: None
ytdReturn: None
zip: 94043

There is more info available through several attributes and methods: dividends, splits, balance_sheet or earnings among others. Most of them return the data in a pandas DataFrame object, so we’ll need to play with it a bit to get whatever we want. For now I just need the stock price over time; the history method is the best one for that purpose. We can select either the period or the start and end dates, and the frequency of the data down to one minute. Note that intraday information is only available if the period is shorter than 60 days, and that only 7 days’ worth of 1m granularity data can be fetched per request. The data for the last day with a 1m interval looks as follows:

df = google.history(period='1d', interval="1m")
print(df.head())

Dataframe of Google historical stock price — Image by Author
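Since only 7 days of 1m data can be fetched per request, covering a longer stretch means splitting it into several requests. Here is a minimal sketch of that splitting, using only the standard library. Note that split_into_windows is a hypothetical helper of mine, not part of yfinance; each window would then be passed to the history method through its start and end arguments.

```python
from datetime import date, timedelta


def split_into_windows(start, end, max_days=7):
    """Split [start, end) into consecutive windows of at most max_days days."""
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    windows = []
    window_start = start
    while window_start < end:
        # Never let a window run past the requested end date.
        window_end = min(window_start + timedelta(days=max_days), end)
        windows.append((window_start, window_end))
        window_start = window_end
    return windows


windows = split_into_windows(date(2020, 12, 1), date(2020, 12, 19))
for window_start, window_end in windows:
    # Hypothetical usage for each window:
    # google.history(start=window_start, end=window_end, interval="1m")
    print(window_start, "->", window_end)
```

The resulting dataframes could then be concatenated with pandas; I have left that part out to keep the sketch free of network calls.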

We can see how it’s indexed by the datetime and every entry has seven features: four fixed points of the stock price during that minute (open, high, low and close) plus the volume, dividends and stock splits. I’m going to use just the low, so let’s keep that data:

df = google.history(period='1d', interval="1m")
df = df[['Low']]
df.head()

Image by Author

Finally, since we’re going to use the data just for the last day, let’s reindex the dataframe to remove the date and timezone components and keep just the time one:

import pandas as pd

df['date'] = pd.to_datetime(df.index).time
df.set_index('date', inplace=True)
df.head()

Image by Author
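The steps above (keep the low, reindex by time) can be wrapped into a single function so the rest of the system only deals with one call. This is just a sketch: keep_low_by_time is a name I made up, and the sample dataframe is synthetic so the snippet runs without hitting Yahoo! Finance.

```python
import pandas as pd


def keep_low_by_time(history):
    """Keep only the 'Low' column and index it by time of day."""
    # Work on a copy so the dataframe returned by yfinance is left untouched.
    low = history[['Low']].copy()
    low.index = pd.Index(pd.to_datetime(low.index).time, name='date')
    return low


# Synthetic stand-in for google.history(period='1d', interval='1m')
sample_index = pd.date_range('2020-12-18 09:30', periods=3, freq='1min', tz='UTC')
sample = pd.DataFrame(
    {'Open': [1.0, 2.0, 3.0], 'Low': [0.5, 1.5, 2.5], 'Volume': [10, 20, 30]},
    index=sample_index,
)
low_by_time = keep_low_by_time(sample)
print(low_by_time)
```

With real data, the call would presumably be keep_low_by_time(google.history(period='1d', interval='1m')).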

Looking good! We already know how to fetch the latest info from yfinance — we’ll later feed our algorithm with this. But for that, we need an algorithm to feed: let’s go for the next part.

Adding the AI

I said it before but I’ll say it again: don’t try this at home. What I’m going to do here is fit a VERY simple ARIMA model to forecast the next value of the stock price; think of it as a dummy model. If you want to use this for real trading, I’d recommend looking for better and stronger models, but be aware: if it were easy, everyone would do it.

First let’s split the dataframe into train and test, so we can use the test set to validate the results of the dummy model — I’m going to keep the last 10% of the data as the test set:

X = df.index.values
y = df['Low'].values

# The split point is the 10% of the dataframe length
offset = int(0.10*len(df))

X_train = X[:-offset]
y_train = y[:-offset]
X_test = X[-offset:]
y_test = y[-offset:]

If we plot it, we get:

import matplotlib.pyplot as plt

plt.plot(range(0,len(y_train)),y_train, label='Train')
plt.plot(range(len(y_train),len(y)),y_test,label='Test')
plt.legend()
plt.show()

Image by Author

Now let’s fit the model with the training data and get the forecast. Note that the hyperparameters of the model are fixed whereas in the real world you should use cross-validation to get the optimal ones — check out this awesome tutorial about How To Grid Search ARIMA Hyperparameters With Python. I’m using a 5, 0, 1 configuration and getting the forecast for the moment immediately after the training data ends:

from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(y_train, order=(5,0,1)).fit()
forecast = model.forecast(steps=1)[0]

Let’s see how well our dummy model performed:

print(f'Real data for time 0: {y_train[len(y_train)-1]}')
print(f'Real data for time 1: {y_test[0]}')
print(f'Pred data for time 1: {forecast}')
---
Real data for time 0: 1776.3199462890625
Real data for time 1: 1776.4000244140625
Pred data for time 1: 1776.392609828666
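A single comparison like the one above is only one data point. A slightly more honest check is a walk-forward evaluation: forecast each point of the series using only the data before it, and average the errors. Below is a minimal sketch, assuming a forecaster is any function that takes the history and returns the next value. The names walk_forward_mae and naive_forecaster are mine, and the prices are made up.

```python
def walk_forward_mae(series, forecaster, min_train):
    """Mean absolute error of one-step-ahead forecasts over an expanding window."""
    if not 0 < min_train < len(series):
        raise ValueError("min_train must be between 1 and len(series) - 1")
    errors = []
    for t in range(min_train, len(series)):
        history = series[:t]  # everything observed before time t
        prediction = forecaster(history)
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


def naive_forecaster(history):
    # Baseline: predict that the next value equals the last observed one.
    return history[-1]


# To evaluate the ARIMA configuration from the text, the forecaster could be:
# lambda history: ARIMA(history, order=(5,0,1)).fit().forecast(steps=1)[0]
prices = [10.0, 10.5, 10.25, 10.75, 11.0]
baseline_mae = walk_forward_mae(prices, naive_forecaster, min_train=2)
print(baseline_mae)
```

If the ARIMA model cannot beat the naive baseline on this metric, it is probably not adding much.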

That’s not bad — we can work with it. With this info we can define a set of rules based on whatever we want to do, like holding if it’s going up or selling if it’s going down. I’m not going to elaborate on this part because I don’t want y’all to sue me saying you lost all your money, so please go ahead and define your own set of rules :) In the meantime, I’m going to explain the next part: connecting to the broker.
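Just to show the shape such a rule could take, here is a purely illustrative sketch: compare the forecast with the last known price and only act when the expected relative change exceeds a threshold. The decide function and the 0.1% threshold are assumptions of mine, not rules I actually trade with.

```python
def decide(last_price, forecast, threshold=0.001):
    """Return 'buy', 'sell' or 'hold' from the forecast relative change."""
    if last_price <= 0:
        raise ValueError("last_price must be positive")
    change = (forecast - last_price) / last_price
    if change > threshold:
        return 'buy'
    if change < -threshold:
        return 'sell'
    # Expected move is too small to act on.
    return 'hold'


print(decide(1776.32, 1776.39))
print(decide(100.0, 101.0))
print(decide(100.0, 98.5))
```

The threshold is there so the system does not trade on every tiny wiggle of the forecast; where to set it is entirely up to you.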

Connecting to the broker

As you probably have guessed, this part highly depends on the broker you’re using. I’m covering here two brokers, RobinHood and Alpaca; the reason is that both of them:

  • Have a public API (official or not) available.
  • Do not charge commissions for trading.

Depending on the type of your account you might have some limits: for instance, RobinHood allows just 3 day trades over a 5-day period if your account balance is below $25,000; Alpaca allows far more requests but still has a limit of 200 requests per minute per API key.
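A limit like 200 requests per minute can be respected on the client side with a small sliding-window throttle. The sketch below is my own illustration and not part of any broker library; RequestThrottle is a hypothetical class, and the clock and sleep arguments are only there so it can be exercised without actually waiting.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most max_calls per period seconds."""

    def __init__(self, max_calls=200, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of the calls inside the window

    def wait(self):
        """Block until another request is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))


throttle = RequestThrottle()
throttle.wait()  # call this right before every request to the broker
print(len(throttle.calls))
```

For a job that runs once a day this is overkill, but it becomes relevant as soon as the system polls prices or orders in a loop.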

RobinHood

There are several libraries that wrap the RobinHood API, but sadly, as far as I know none of them is official. Sanko’s library was the biggest one, with 1.5k stars on GitHub, but it has been discontinued; LichAmnesia’s has continued Sanko’s path, but has just 99 stars so far. I’m going to use the robin_stocks library, which has a little over 670 stars at the time of writing. Let’s install it:

$ pip install robin_stocks

Not all actions require login, but most of them do, so it’s useful to log in before doing anything else. RobinHood requires MFA, so it’s necessary to set it up: go to your account, turn on two-factor authentication and select “other” when asked about the app you want to use. You will be presented with an alphanumeric code, which you will use in the code below:

import pyotp
# With robin_stocks 2.x, use instead: import robin_stocks.robinhood as rh
import robin_stocks as rh

RH_USER_EMAIL = <<<YOUR EMAIL GOES HERE>>>
RH_PASSWORD = <<<YOUR PASSWORD GOES HERE>>>
RH_MFA_CODE = <<<THE ALPHANUMERIC CODE GOES HERE>>>

# Generate the current time-based one-time password from the MFA code
totp = pyotp.TOTP(RH_MFA_CODE).now()
login = rh.login(RH_USER_EMAIL, RH_PASSWORD, mfa_code=totp)

To buy or sell is pretty easy:

# Buying 5 shares of Google
rh.order_buy_market('GOOG', 5)

# Selling 5 shares of Google
rh.order_sell_market('GOOG', 5)

Check the docs for advanced usage and examples.

Alpaca

For Alpaca we are going to use the alpaca-trade-api library, which has over 700 stars in GitHub. To install:

$ pip install alpaca-trade-api

After signing in to your account you’ll get an API key ID and a secret key; both are needed for login:

import alpaca_trade_api as alpaca

ALPACA_KEY_ID = <<<YOUR KEY ID GOES HERE>>>
ALPACA_SECRET_KEY = <<<YOUR SECRET KEY GOES HERE>>>

# Change to https://api.alpaca.markets for live
BASE_URL = 'https://paper-api.alpaca.markets'

api = alpaca.REST(
    ALPACA_KEY_ID, ALPACA_SECRET_KEY, base_url=BASE_URL)

Submitting orders is slightly more complex than with RobinHood:

# Buying 5 shares of Google
api.submit_order(
    symbol='GOOG',
    qty='5',
    side='buy',
    type='market',
    time_in_force='day'
)

# Selling 5 shares of Google
api.submit_order(
    symbol='GOOG',
    qty='5',
    side='sell',
    type='market',
    time_in_force='day'
)

That’s it! Note that leaving your credentials in plain text is a very, VERY bad thing to do — do not worry though, we’ll switch in the next step to environment variables, which is far safer. Now let’s deploy everything to the cloud and monitor it.
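As a preview of the environment-variable approach, here is a minimal sketch of how the credentials could be read and validated in one place, so a missing variable fails loudly before any order is sent. The load_credentials helper is hypothetical (my own naming); the variable names are the ones used throughout this post.

```python
import os


def load_credentials(names, environ=None):
    """Return {name: value} for the required variables, or raise listing the missing ones."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Demo with an explicit mapping; omit `environ` to read the real os.environ.
demo_env = {"ALPACA_KEY_ID": "key-id", "ALPACA_SECRET_KEY": "secret"}
creds = load_credentials(["ALPACA_KEY_ID", "ALPACA_SECRET_KEY"], environ=demo_env)
print(sorted(creds))
```

Reporting every missing variable at once saves a few redeploys compared to failing on the first KeyError.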

Deploy and monitoring

We are going to deploy everything in AWS Lambda. This wouldn’t be the best option for a production system, obviously, since Lambda does not have storage and we would want to store the trained model somewhere, for instance in S3. However, this will do for now — we’ll schedule the Lambda to run daily, training the model every time with the data from the current day. For monitoring purposes we’ll set up a Telegram bot that will send a message with the action to be taken and its outcome. Note that AWS Lambda is free up to a certain limit, but be aware of the quotas in case you want to send lots of messages.

The first thing on the to-do list is creating a bot. I followed the official instructions from Telegram:

  • Search for the user @BotFather in Telegram.
  • Use the command /newbot and choose a name and username for your bot.
  • Get the token and store it somewhere safe, you’re going to need it shortly.

Next step: deployment. There are several ways of deploying to Lambda. I’m going to use the serverless framework, so let’s install it and create a template:

$ npm install serverless --global
$ serverless create --template aws-python3 --path ai_trading_system

That will create an ai_trading_system folder with three files: .gitignore, serverless.yml, and handler.py. The serverless file defines the deployment: what, when, and how it is going to be run. The handler file will contain the code to run:

import telegram
import sys
import os

CHAT_ID = XXXXXXXX
TOKEN = os.environ['TELEGRAM_TOKEN']

# The global variables should follow the structure:
# VARIABLE = os.environ['VARIABLE']
# for instance:
# RH_USER_EMAIL = os.environ['RH_USER_EMAIL']

def do_everything():
    # The previous code to get the data, train the model
    # and send the order to the broker goes here.
    return 'The action performed'

def send_message(event, context):
    bot = telegram.Bot(token=TOKEN)
    action_performed = do_everything()
    bot.sendMessage(chat_id=CHAT_ID, text=action_performed)

You need to change CHAT_ID to the ID of the group, the channel, or the conversation you want the bot to interact with. Here you can find how to get the ID from a channel and here is how to get the ID from a group.
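Since the Telegram message is the only monitoring we have, it is worth making sure failures are reported too, instead of the Lambda dying silently. A possible sketch, assuming the trading logic and the sending function are passed in as plain callables; run_and_report is a hypothetical helper, and a list stands in for the bot here so the snippet runs on its own.

```python
def run_and_report(action, send):
    """Run action() and pass a human-readable outcome to send(text)."""
    try:
        text = f"Action performed: {action()}"
    except Exception as exc:
        # Report any failure rather than letting the run die unnoticed.
        text = f"Trading run failed: {type(exc).__name__}: {exc}"
    send(text)
    return text


def broken_action():
    raise ValueError("no data returned")


sent = []  # stand-in for the Telegram chat
run_and_report(lambda: "hold GOOG", sent.append)
run_and_report(broken_action, sent.append)
print(sent)
```

In the handler, action would be do_everything and send would wrap the bot.sendMessage call with the chat ID.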

Now, we’re going to define how to run the code. Open serverless.yml and write:

org: your-organization-name
app: your-app-name
service: ai_trading_system

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  environment:
    TELEGRAM_TOKEN: ${env:TELEGRAM_TOKEN}
    # If using RobinHood
    RH_USER_EMAIL: ${env:RH_USER_EMAIL}
    RH_PASSWORD: ${env:RH_PASSWORD}
    RH_MFA_CODE: ${env:RH_MFA_CODE}
    # If using Alpaca
    ALPACA_KEY_ID: ${env:ALPACA_KEY_ID}
    ALPACA_SECRET_KEY: ${env:ALPACA_SECRET_KEY}

functions:
  cron:
    handler: handler.send_message
    events:
      # Invoke Lambda function at 21:00 UTC every day
      - schedule: cron(00 21 * * ? *)

This code tells AWS the kind of runtime we want and propagates the Telegram token from our own environment so we don’t have to deploy it. Afterwards, we’re defining the cron to run the function daily at 21:00 UTC time.
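The cron syntax used by AWS differs a bit from the classic Unix one: it has six fields (minutes, hours, day of month, month, day of week, year), and one of the two day fields must be a question mark. The sketch below simply labels the fields of an expression; parse_aws_cron is a hypothetical helper for illustration and does not validate the values themselves.

```python
AWS_CRON_FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_aws_cron(expression):
    """Split 'cron(...)' into a dict keyed by the six AWS field names."""
    expression = expression.strip()
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")
    values = expression[len("cron("):-1].split()
    if len(values) != len(AWS_CRON_FIELDS):
        raise ValueError(f"expected {len(AWS_CRON_FIELDS)} fields, got {len(values)}")
    return dict(zip(AWS_CRON_FIELDS, values))


schedule = parse_aws_cron("cron(00 21 * * ? *)")
print(schedule)
```

Following the same syntax, something like cron(00 21 ? * MON-FRI *) should restrict the run to weekdays, which makes sense given that markets are closed on weekends.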

The only thing left is to get the AWS credentials and set them, along with the token and the rest of variables, as environment variables before deploying. Getting the credentials is fairly easy:

From your AWS console:

  • Go to My Security Credentials > Users > Add user.
  • Choose a username and select Programmatic access.
  • Next page: select Attach existing policies directly > AdministratorAccess.
  • Copy the Access Key ID and the Secret Access Key and store them.

That’s it. Now, let’s export the AWS credentials and the Telegram token. Open a terminal and write:

$ export AWS_ACCESS_KEY_ID=[your key goes here]
$ export AWS_SECRET_ACCESS_KEY=[your key goes here]
$ export TELEGRAM_TOKEN=[your token goes here]

# If using RobinHood
$ export RH_USER_EMAIL=[your mail goes here]
$ export RH_PASSWORD=[your password goes here]
$ export RH_MFA_CODE=[your mfa code goes here]

# If using Alpaca
$ export ALPACA_KEY_ID=[your key goes here]
$ export ALPACA_SECRET_KEY=[your key goes here]

Install the necessary packages locally and finally, deploy everything to AWS:

$ pip3 install -r requirements.txt -t . --system
$ serverless deploy

We’re done! The bot will trade for us every day at 21:00 UTC time and will message us with the action performed. Not bad for a proof of concept — now I can tell my friend he can stop frantically checking his phone to trade :)

Note that all the resources we’ve used through this tutorial have their own documentation: I encourage y’all to go deeper on whatever you think is interesting — remember that this is just a toy system! However, as a toy system, I believe it is a good starting point for a richer, more complex product. Happy coding!

You can check the code in GitHub.

References:

[1] P. Collins, Best Stock APIs and Industry Landscape in 2020 (2020), Medium

[2] R. Aroussi, Reliably download historical market data from Yahoo! Finance with Python (2019), aroussi.com

[3] J. Brownlee, How to Grid Search ARIMA Model Hyperparameters with Python (2017), Machine Learning Mastery

[4] J. Brownlee, How to Make Out-of-Sample Forecasts with ARIMA in Python (2017), Machine Learning Mastery

[5] Serverless team, AWS Python Scheduled Cron Example, GitHub
