For anyone interested in stock market APIs and financial data, I started a YouTube channel on this topic last year which is starting to gain some traction:
It covers popular services and libraries like Alpaca, Robinhood Private API, TD Ameritrade option prices, Yahoo Finance, TradingView alerts, backtesting, and a lot more.
Curious if you have used your foundational data infra for trading live? Were you able to beat a benchmark (e.g., beta-adjusted index returns)?
I'm asking because I've used semi-pro and pro infra (paid services) and it was hard to get consistent returns. It was easy to get returns, but rarely beyond, say, S&P 500 returns (at which point, I might as well just invest in the S&P 500 index.)
What you really want to know is "how problematic is the noise in this data"?
One way to answer it is to create your own FF sorts [1] and regress your Yahoo-derived factor returns against Ken French's (which come from CRSP/Compustat). If the intercepts are small or statistically insignificant and the R²s are reasonable, then you're all set.
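A minimal sketch of that regression check, assuming you already have two aligned monthly return series: your own factor and the reference factor. The data here is synthetic, and `regress_factor` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def regress_factor(own, reference):
    """OLS fit of own = alpha + beta * reference + error."""
    own = np.asarray(own, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Design matrix: a column of ones (intercept) plus the reference returns.
    X = np.column_stack([np.ones_like(reference), reference])
    coef, *_ = np.linalg.lstsq(X, own, rcond=None)
    resid = own - X @ coef
    n, k = X.shape
    # Classical (homoskedastic) standard errors.
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    ss_tot = np.sum((own - own.mean()) ** 2)
    return {
        "alpha": coef[0],
        "beta": coef[1],
        "alpha_t": coef[0] / se[0],
        "r2": 1.0 - (resid @ resid) / ss_tot,
    }

# Synthetic stand-ins: 240 months of reference returns, and an "own" factor
# that equals the reference plus zero-mean noise (i.e. noisy but unbiased data).
rng = np.random.default_rng(0)
reference = rng.normal(0.003, 0.03, size=240)
own = reference + rng.normal(0.0, 0.005, size=240)

result = regress_factor(own, reference)
print(result)
```

With real data you would look for an alpha close to zero with a small t-statistic, a beta near one, and a high R². The standard errors above assume homoskedastic, uncorrelated residuals, which is a simplification.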
The big problem you will hit is survivorship bias in Yahoo!. My own research suggests that the quality is perfectly acceptable, provided you back-fill an unbiased universe (e.g. Russell 3K) from another source.
I'm actually surprised you didn't outperform the market with the survivorship bias. The backtest must have covered a very small time period.
Yahoo! doesn't have data for delisted or bankrupt companies, which introduces a "survivorship bias."
You need a set of companies that lacks this bias, and the set of Russell 3000 index constituents is a good set to use.
Some of the older Russell 3000 companies will be missing from Yahoo, because they are delisted or bankrupt. You need to find a secondary source for them, or "back-fill" your securities master.
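The back-fill step could be sketched roughly like this. The tickers, the price values, and the `build_securities_master` helper are all made up for illustration; in practice the universe would be the historical index constituents and the two dictionaries would come from your primary and secondary data sources.

```python
def build_securities_master(universe, primary, secondary):
    """Prefer the primary source, fall back to the secondary one.

    Returns (master, still_missing), where master maps
    ticker -> (source_name, data).
    """
    master = {}
    still_missing = []
    for ticker in universe:
        if ticker in primary:
            master[ticker] = ("primary", primary[ticker])
        elif ticker in secondary:
            master[ticker] = ("secondary", secondary[ticker])
        else:
            still_missing.append(ticker)
    return master, still_missing

# Hypothetical unbiased universe: includes names that were later delisted.
universe = ["AAA", "BBB", "DEAD1", "DEAD2"]
# Hypothetical primary source: only the survivors have data.
primary = {"AAA": [10.0, 11.0], "BBB": [20.0, 19.5]}
# Hypothetical secondary source covering one of the delisted names.
secondary = {"DEAD1": [5.0, 0.4]}

master, still_missing = build_securities_master(universe, primary, secondary)
print(master["DEAD1"])   # ('secondary', [5.0, 0.4])
print(still_missing)     # ['DEAD2']
```

Anything left in `still_missing` is a gap in the universe that would still bias a backtest.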
My sense is that the golden age of the home API trader is about to emerge and I'd love to hear how things are going for other tinkerers. From my side, I decided to dive back into math, which I've ended up enjoying so much that I haven't surfaced yet to put any of it to work.
Been trading for roughly 3 years now. And yup, I'm in exactly the same boat: just recently dove back into filling all the holes I have in my knowledge (portions of math/stats).
Khan is so great for that. Sal's contribution to society is massive. And he's great to listen to. Even at 1.5X speed. :-)
Math is funny that way. Miss out on some fundamental aspect and it'll ruin you. And Khan explains some really basic concepts in ways that have given me new perspective or insight.
Random question, but is this Tel from the east coast? (Won't name the city to respect anonymity.) Think we worked together before :)
Depends how new you are to it all, but assuming you just know basic trading stuff (what a bid/offer is, what options greeks are, etc.). Everything I do is stat arb type of trades. My path won't be the same as yours, but here's how I started.
What I did...
- read literally every link and topic under https://www.investopedia.com (the education tab -> investing trading section on the right)
- quantopian.com - look under the learning tab. I went through all the material - literally all of it. There's a series of Lectures that explains a ton
I read a _ton_ and played around with lots and lots of data to try and get my systems working. Still nowhere near where I'd like to be, but I've been profitable since the beginning (minus a giant loss on some gambles I took; I no longer trade discretionary because of this). The links I pasted above are what I'd consider the "easy" part of all this: anyone can read and learn. Learning the more in-depth stuff around specific topics is harder, but I have a bunch of books (willing to share some recommendations if you need).
Once you progress, you'll find specific areas you're interested in (equity options, commodity futures, spread trades, fixed income/bonds, etc.).
edit: think I should follow up with a comment/question I ask myself a lot. Is all the work/time worth it? In my case, I'd say yes, because I've always had an interest in finance and literally would be reading + working with markets on the side for fun anyway. If I think of all the effort poured into this vs say working for some large tech company... I'd probably say just go work the high paying tech job. Obviously you can make money in the markets, but the tradeoff for keeping your sanity is a very real question you have to ask yourself.
Very cool. It sounds like you're basically treating this as a profitable hobby. How much alpha are you achieving, and how many hours a week would you say you put into it? Are you trading equities only or options and other derivatives as well?
It's definitely more than a side hobby. I probably put in about 20 hours/week while also working 40 hours at my normal job. I've also had stints where I worked on trading full time (6 months+ stretch). That's why I sent my follow up comment on "is it worth it". If you're serious about trading, you need to understand the level of commitment. Also add the stress levels from when you inevitably hit bumps in the road (my large loss did huge damage to my psyche at the time). It's not for everyone.
To answer your question, I started in equities but moved to commodity and index futures/options.
The important math for understanding the foundations of finance and business isn't advanced math. It is accounting - primarily addition, subtraction and simple algebra or calculus 101 for things like interest calculations or discounted value modeling. This basic math also underpins the "fundamentals" of business used in objective, common-sense investing - the kind Warren Buffett is fond of.
Also, "stocks" and math are a classic "the map is not the territory" situation. Math describes stocks and business performance very well but does not define them. A machine learning algorithm trained on historical price data in concert with differential equations of 1,000 variables will historically do no better at investing than buying and holding index funds.
So the thing to focus on isn't going from math -> stock trading. It is to learn accounting and business and basic stock market concepts. Eventually a math background will help, but it's support not the foundation.
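To make the "basic math" point concrete, here is a small sketch of the compound-interest and discounted-value arithmetic mentioned above. The function names, rates, and cash flows are illustrative assumptions, not a valuation model.

```python
def future_value(principal, rate, years):
    """Value of `principal` after `years` of annual compounding at `rate`."""
    return principal * (1 + rate) ** years

def present_value(cash_flows, rate):
    """Discount a list of year-end cash flows back to today.

    cash_flows[0] is received at the end of year 1, cash_flows[1] at the
    end of year 2, and so on.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# 1,000 invested at 5% per year for 10 years.
fv = future_value(1000, 0.05, 10)
print(round(fv, 2))  # 1628.89

# Three annual payments of 100, discounted at 10%.
pv = present_value([100, 100, 100], 0.10)
print(round(pv, 2))  # 248.69
```

Nothing here goes beyond exponents and a sum, which is the level of math the comment is describing.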
I've been up to this recently. None of the maths I've used is very advanced yet. I've mostly been trying to use simple statistical tools like regression analysis as well as possible. I know a few people who have found employment in finance after having done graduate work in stochastic processes, but I'm not sure what the payoff would be for some of the more advanced stuff as a non-institutional trader.
Do you think this is the beginning of the golden age for home API traders due to market conditions or the tooling that enables home traders to operate getting better and better?
Not the OP, but I think it's the latter (tooling getting better and easier) combined with lower (or close-to-zero) trade execution fees and more widely available information.
What are some good fundamental data providers for small-time amateur investors (non-professionals)? I am looking for 1. US + international stocks 2. deep history of all historical accounting records (15+ years) 3. an API, ideally with Python examples
They're the only provider if you're not willing to spend several thousand per year. Beware that a lot of their info is scraped from other sources and does not have accuracy guarantees.
You can also scrape the fundamentals directly from the 10-K statements each company publishes, but it's very difficult to get a clean, consistent dataset out of it.
Ahh, the beauty and versatility of Python again. And the momentum doesn’t stop. I never regret jumping off of Perl, then Ruby ten years ago to fully commit to Python. I can see so many use-cases for it (with the rich ecosystem), on an almost daily basis.
Any programming language with an HTTP client and a SQLite driver won't have problems with this example. Although, this is so simple that even SQLite might be overkill for analyzing the data.
Python quickly falls over with large datasets. It's great as a "glue" language or for POCs for large data processing or handling jobs - but at some point you need to graduate to faster things...
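As a rough illustration of how little is needed, here is a sketch using only Python's standard-library `sqlite3` module. The table layout, symbols, and prices are invented, and the download step is left out: the rows are hard-coded where an HTTP response would normally be parsed.

```python
import sqlite3

# Hypothetical daily closes; in practice these would come from an HTTP API.
rows = [
    ("AAA", "2020-01-02", 10.0),
    ("AAA", "2020-01-03", 11.0),
    ("AAA", "2020-01-06", 12.0),
    ("BBB", "2020-01-02", 20.0),
    ("BBB", "2020-01-03", 22.0),
]

conn = sqlite3.connect(":memory:")  # use a file path to persist the data
conn.execute(
    "CREATE TABLE prices ("
    "symbol TEXT, date TEXT, close REAL, PRIMARY KEY (symbol, date))"
)
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
conn.commit()

# Average close per symbol, computed by the database.
avg_close = dict(
    conn.execute("SELECT symbol, AVG(close) FROM prices GROUP BY symbol")
)
print(avg_close)  # {'AAA': 11.0, 'BBB': 21.0}
```

For a dataset this small, a plain list or dictionary in memory would do the same job, which is the "overkill" point made above.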
What’s «large»? I’ve set up multi-TB analyses without issue, using dask. I don’t know of a more productive language for such analyses, either. What is even better?
Between things like pyspark or using an RDBMS you can scale pretty far. Even with other languages like Java this is the case. Once your dataset goes beyond a single machine you need some kind of data platform.
Language speed becomes mostly irrelevant compared to the strategy, framework, toolset, and dataset design for parallelization.
If you have a use case where a program in C runs in a reasonable time on your laptop but one in Python doesn't, that solution is only going to take you so far before you'll want to graduate from your laptop and take advantage of the Python ecosystem again in real big data contexts.
Well that just isn't true. The only reason Python is remotely even usable in this context is because it's merely a wrapper around C the vast majority of the time. Language definitely matters.
I know what this data stands for and what it represents; I just don't know what to do _after_ acquiring this sort of data. Any pointers on that? I live in a really poor country where I can't do algotrading anyway, but the eventual goal is to venture into that field by moving abroad. Any pointers on what I can build right now with free APIs or from data like this?
I bought historical option data in the past directly from CBOE [0]. Depending on what you need the prices are very reasonable. E.g. for a single symbol ~$70-$80 for a year depending on the interval you need. There is a big discount if you just buy all the symbols.
I cannot recommend the stock prices though. Somehow there were just a lot of mistakes and you can get them from other sources cheaper.
I didn't see much scraping on that site. It just looked like regular calls. Maybe the API is not as clean or consistent as it could be. But anyway I did a quick search for Alpaca. Is this the one you meant: https://alpaca.markets/?
https://www.youtube.com/parttimelarry