My name is Christophe. I’m 40 and a professor of physics at Paris University. I also do research in fundamental physics and hold a PhD in astrophysics, and I have strong skills in machine learning. Having lost money on manual trading, I decided during the French lockdown to investigate automated strategies based on machine learning. The result is the following model, which enters positions in a pool of 32 possible cryptocurrencies, the list of which is given below.
1. The model
The model solves an allocation problem on a daily basis. Leverage is 1, and there are 8 slots of 12.5% each to fill. The model first determines the weight allocated to the risk-free asset (USDT), say 4 × 12.5% = 50%. It then chooses, from the pool of 32 assets, the 4 that are most likely to go up by the next day. In this example it would buy 12.5% of BNB, 12.5% of SOL, 12.5% of MATIC, and 12.5% of ETH. Of course, the USDT allocation can vary from 0 to 100%.
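The slot mechanics described above can be sketched in a few lines of Python. This is only an illustration, not the model itself: the function name `allocate`, the `scores` mapping, and the score values are hypothetical, and the sketch assumes long-only positions with at most one slot per asset, which the text does not state explicitly.

```python
SLOTS = 8
SLOT_WEIGHT = 1.0 / SLOTS  # 12.5% per slot, leverage 1


def allocate(scores, usdt_slots):
    """Return {symbol: weight} for one day.

    scores     -- hypothetical mapping symbol -> model score
                  (higher means more likely to go up by the next day)
    usdt_slots -- number of slots kept in USDT, from 0 to 8
    """
    if not 0 <= usdt_slots <= SLOTS:
        raise ValueError("usdt_slots must be between 0 and 8")
    n_assets = SLOTS - usdt_slots
    # Keep the highest-scoring assets, one slot each (an assumption).
    ranked = sorted(scores, key=scores.get, reverse=True)
    picks = ranked[:n_assets]
    weights = {symbol: SLOT_WEIGHT for symbol in picks}
    # Whatever is not invested stays in USDT.
    usdt_weight = 1.0 - SLOT_WEIGHT * len(picks)
    if usdt_weight > 0:
        weights["USDT"] = usdt_weight
    return weights


# Made-up scores reproducing the example in the text: 4 slots in USDT.
example_scores = {"BNB": 0.9, "SOL": 0.8, "MATIC": 0.7, "ETH": 0.6, "BTC": 0.1, "XRP": 0.0}
example_weights = allocate(example_scores, usdt_slots=4)
```

With these made-up scores, `example_weights` holds 12.5% each of BNB, SOL, MATIC, and ETH, and 50% of USDT.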
The model’s decisions are based on machine learning trained on past data. In particular, I used only a small set of data from major cryptos such as BTC, ETH, BNB, … from 2017 to the end of 2020. The learning is based on a genetic evolution algorithm that selects the best-performing strategies given features computed from past data, in particular very mainstream indicators such as moving averages of various lengths, momentum, and so on.
During my research I came to realize that these indicators are poor predictors of the evolution of a given asset, but that their correlations are very relevant for ranking the relative strength of one asset with respect to another. This is why I eventually turned to an allocation problem instead of looking for a long/short strategy on a single pair. The algorithm thus generates many indicators of this kind and learns how to combine them in order to predict which assets are more likely to go up in the next 24 hours.
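To make the idea of relative strength concrete, here is a minimal Python sketch that ranks assets against each other using a single momentum indicator. It is a stand-in only: the actual model learns how to combine many indicators through a genetic algorithm, which is not reproduced here. The function names, the lookback of 3 days, and the prices are all hypothetical.

```python
def moving_average(prices, window):
    """Simple moving average of the last `window` prices."""
    return sum(prices[-window:]) / window


def momentum(prices, lookback):
    """Relative price change over the last `lookback` steps."""
    return prices[-1] / prices[-1 - lookback] - 1.0


def relative_strength_ranks(price_history, lookback=3):
    """Rank assets by momentum; rank 1 is the strongest.

    price_history -- mapping symbol -> list of daily closes (oldest first)
    """
    scores = {s: momentum(p, lookback) for s, p in price_history.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {symbol: rank for rank, symbol in enumerate(ordered, start=1)}


# Made-up prices for three hypothetical assets.
history = {
    "AAA": [100.0, 101.0, 102.0, 110.0],
    "BBB": [50.0, 50.0, 49.0, 48.0],
    "CCC": [10.0, 10.0, 10.0, 10.5],
}
ranks = relative_strength_ranks(history)
```

The point of the ranking is that only the ordering across assets matters, not the absolute value of the indicator for any single asset.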
The following backtest runs from the beginning of 2020 until the 19th of October 2021, that is, approx. 650 days. Note that the 2020 data were used for training, so the first part of this backtest is in-sample, but also note that data from 2021 have never been seen by the learning algorithm. The only place where the 2021 data were used was in selecting which 32 pairs to trade amongst the ~70 that Binance futures offers.
The 2021 period is therefore much closer to a true test than to a backtest. By the way, an early version of this model has been tweeting the daily allocation every day at 4pm (@bestcryptobot on Twitter) since the end of July. (The Twitter model trades only 15 pairs.) Fees are included in the calculation (0.1%, as for spot). Futures fees are actually lower, but funding fees have not been taken into account.
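As an illustration of how a 0.1% fee might enter such a calculation, the Python sketch below charges the fee on the notional traded at each daily reallocation. This is an assumption about the bookkeeping, not a description of the actual backtest code: the function names are hypothetical, weights are expressed as fractions of the portfolio, and funding fees are ignored, as in the text.

```python
FEE_RATE = 0.001  # 0.1% fee quoted in the text


def traded_notional(old_weights, new_weights):
    """Fraction of the portfolio that changes hands at a reallocation.

    Only non-USDT legs are counted, since every trade is against USDT.
    """
    symbols = (set(old_weights) | set(new_weights)) - {"USDT"}
    return sum(
        abs(new_weights.get(s, 0.0) - old_weights.get(s, 0.0)) for s in symbols
    )


def rebalance_cost(old_weights, new_weights, fee_rate=FEE_RATE):
    """Fee paid at a reallocation, as a fraction of portfolio value."""
    return fee_rate * traded_notional(old_weights, new_weights)


# Moving from 100% USDT into two slots of 12.5% each.
before = {"USDT": 1.0}
after = {"BNB": 0.125, "SOL": 0.125, "USDT": 0.75}
cost = rebalance_cost(before, after)
```

Under this assumption, a day on which the allocation does not change costs nothing, and a full rotation of all 8 slots costs at most 0.2% of the portfolio (selling 100% and buying 100%).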
When the model goes live on Diabolo, the pool of 32 pairs against USDT will be the following: BTC, ETH, XRP, ADA, BNB, ENJ, VET, ZIL, BTT, THETA, NEO, ICX, SOL, AVAX, EGLD, DOT, DOGE, WAVES, DENT, ONE, HOT, CHZ, ETC, LINK, XLM, FTM, MATIC, TRX, BCH, XMR, LTC, BAT. I may update this list from time to time.
The model is mostly long, although it sometimes takes short positions. Basically, it wins when the market goes up and loses less than the market when it goes down. As a consequence, it may do nothing (100% USDT) for quite a significant amount of time (see the figures below).
The backtest starts a couple of months before the Covid crash. In the following figures, the portfolio value is in green, BTC is in orange, and the average of the 32 assets is in red. Everything is scaled to 100 at the initial time. The horizontal axis is the number of days. These are the first 200 days:
These are now the first 400 days (basically all of 2020) and the beginning of the bull run. Note that the model does almost nothing for almost 100 days around summer 2020:
And finally, this is the full backtest, including the amazing bull run of altcoins last spring. Because gains are reinvested in these calculations, the portfolio grows nearly exponentially for three months and achieves a tremendous performance. Again, these are data not used in the training process (time = 400 corresponds to December 2020).
We note two things. First, the crash of May 2021 happens around time = 520. The market is down 50%, or even 70% for some alts, while the model is down only 25%. Then comes the huge bounce of August around time = 600. Note that this bounce was actually tweeted by the live model, which gained more than 100% in a month, that is, more than the green curve displayed here. However, the Twitter model also lost more in September. The main reason is that it trades fewer pairs, hence makes larger gains but takes more risk.
Finally, a word about today. As of mid-October, BTC is trying to break its all-time high (ATH) and is literally sucking the altcoins dry. The model is therefore not performing well (at least compared to BTC), but if money moves back from BTC to alts in the future, then we can expect similar results.
We also note that the model’s results depend quite a lot on the hour chosen to reallocate the portfolio. The following figure shows the 2021-only performance for six different reallocation hours. This shows the typical variability you may expect in the results. Hours around noon and 9pm (Paris local time) seem to be favored.
Figure: Performance of six portfolios for 2021. Depending on the chosen hour, the annualized volatility is around 90-100%, while the maximum drawdown varies between -25% and -34%.
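For readers who want to check figures of this kind on their own data, the two statistics quoted in the caption can be computed from a daily portfolio curve as in the Python sketch below. The conventions are assumptions, since the text does not specify them: volatility is the sample standard deviation of daily returns, annualized with 365 periods per year (crypto markets trade every day), and drawdown is measured from the running peak. The function names and the sample curve are made up.

```python
import math


def daily_returns(equity):
    """Simple returns between consecutive points of the equity curve."""
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]


def annualized_volatility(equity, periods_per_year=365):
    """Sample standard deviation of daily returns, scaled to one year."""
    r = daily_returns(equity)
    mean = sum(r) / len(r)
    variance = sum((x - mean) ** 2 for x in r) / (len(r) - 1)
    return math.sqrt(variance * periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough loss, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Made-up curve scaled to 100 at the initial time.
curve = [100.0, 120.0, 90.0, 135.0, 108.0]
drawdown = max_drawdown(curve)
volatility = annualized_volatility(curve)
```

On this made-up curve the worst drawdown is the fall from 120 to 90, that is, -25%.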