Show HN: Wallstreetlocal – View investments from America's biggest companies (github.com/bruhbruhroblox)
299 points by anonyonoor 6 months ago | 54 comments
Hello Hacker News! My name is Anonyo, and I am a seventeen-year-old from Southeast Michigan. This is wallstreetlocal, my passion project for the last year (and a half). I've posted this before, but I've finally open-sourced this entire project, so I thought I'd post it again.

Here's the short pitch.

The Securities and Exchange Commission (SEC) keeps record of every company in the United States. Companies whose holdings surpass $100 million though, are required to file a special type of form: the 13F form. This form, filed quarterly, discloses the filer's holdings, providing transparency into their investment activities and allowing the public and other market participants to monitor them.

The problem though, is that these holdings are often cumbersome to access, and valuable analysis is often hidden behind a paywall. Through wallstreetlocal, the SEC's 13F filers become more accessible and open.

By exploring the website (and the code), you can see the resources I used, check out some notable money managers I listed, and download any data that suits you. All for free. (Note, the mobile site likely needs work.)

I made this project to better democratize SEC filings, and also to get some hands-on experience. I love computers, and one day hope to get involved with startups. In the comments, I'd appreciate any and all advice, as well as feedback on how to improve the site.




Dang, I saw the name and hoped it was a map-based app that showed you ownership of things around you (further intriguing me because I don't believe such data exists at a local scale anyway).

Great project though! Opening up these sorts of semi-encumbered datasets is what keeps humans well informed.

I'm from MI as well and always wondered about deeper datasets for watching money and influence change hands


There are a lot of GIS / mapping software that shows property ownership and their boundaries, useful for hunting. onX, for example, and I think you can purchase 'ownership layers' on stuff like Avenza.

I agree with what you're saying though, it would be cool to take something like those property owner layers and find the ultimate legal entity for stuff like LLCs owning land.

E, only semi-related: In the 'urban design' part of the internet I've seen really cool mapping that puts bar charts on top of a city's grid, with the bar charts being how much tax revenue the city generates. It's really stark to see skyscraper-sized bars in downtown cores and mostly flat all around where cities have zoned residential separate from commercial, or even where suburbs tax less than the core city.


> There are a lot of GIS / mapping software that shows property ownership and their boundaries, useful for hunting. onX, for example, and I think you can purchase 'ownership layers' on stuff like Avenza.

Gaia GPS [1] is the one I use. It's got a lot of layers for free including land ownership which properly shows all of my neighbors plots. I use it often when collecting rock specimens and mushrooms to make sure I'm on public land that allows it or to figure out who to seek permission from.

[1] https://www.gaiagps.com/


Thanks for the pointer, I just downloaded the app. What is the name of the layer showing property boundaries for free? I can see the “Private Land (US)” but it’s prompting me to sign up for premium when I click on it.


Oooff that's my mistake. I might be grandfathered in on the premium layers and didn't realize.


Ah ok yeah I thought that was too good to be true. Currently using onX for this but am paying $30/year for the private land layer


Most of the information these services draw from is publicly available, and the stuff that isn’t is of questionable accuracy.


The app landglide does what you're saying pretty well.


Very interesting project, I like the overview. I also really like that you took the finance industry as a theme for your project.

>> Every company in the United States

Sorry for being so fussy, but I highly recommend changing the word 'company' and not using it in the future, as the title is quite misleading. No private company in the US has to register with the SEC or file with the SEC. 'Investment advisors', who also go by other aliases like 'asset manager', have to file a 13F only if they a) are registered with the SEC for fund marketing purposes and b) have, as you already mentioned, over $100 million under management (not 'in holdings'). This is also why large family offices (e.g. Bayshore Global Management of Sergey Brin) won't show up in any SEC records, as they meet the second criterion but not the first. The same goes for pretty much any non-asset-management company (e.g. McDonald's), as they do not raise money for fund vehicles. However, you have taken this into account on your website, and further below you wrote "money manager", which is correct in finance jargon.

I hope this gives you a better understanding, keep up the great work.


I think that’s not entirely correct either. For example, Nvidia had to file a 13F for its holdings after the value of its ARM shares exceeded 100M. And the requirement, I believe, is not for investment advisors (which is its own thing, see Registered Investment Advisors from the SEC), but broadly extends to institutional investment managers. That is a much looser definition that can certainly include private companies like broker-dealers, in addition to hedge funds, RICs, etc., as well as public corporations like Nvidia, depending on the types of activities they are involved in.


Good point, I did not mention that. Non-registered investment advisors are also included, but this relates more to banks, bank holdings and broker/dealers (e.g. investment banks) who trade on 'investment discretion'[1]. That means entities that are not registered as an investment advisor but still invest on behalf of outside clients and not for their own book. This clearly does not apply to Nvidia. It is highly unlikely and not logical that they would manage money for third parties in the name of the public corporate entity, just as a side hustle, as that would contribute little financially (in terms of fees earned from managing money on behalf of others) and little to their core business.

The reason for this 13F filing, as you already guessed to some extent, is that Nvidia is a publicly traded company. As such it is subject to a wide range of SEC filings including those from section 13[2].

Nvidia seems to be a rare case. Acquiring public equity as a company, especially as a public one, just for the purpose of managing current assets is very unusual but not out of the question, just away from the textbook. Given that it's ARM, whose acquisition failed before Nvidia filed the 13F, it could also serve some other purpose, e.g. showing that interest is still present.

[1] https://www.sec.gov/divisions/investment/13ffaq#:~:text=Bank....

[2] https://www.legalandcompliance.com/securities-law/sec-report...


I genuinely did not know that there was a real difference between companies and asset managers. Sorry for the confusion.

Thanks for informing me, and for the kind words, I'll make sure to avoid using the word company from now on.


[flagged]


You crossed into personal attack with this comment. That's not ok, and it looks like you've done it in other places as well (https://news.ycombinator.com/item?id=39245439). Can you please avoid that? It's against the site guidelines and destroys what HN is supposed to be for.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

Btw, saying "I don't mean to be rude." doesn't change this—if that is your intent, then you need to edit your comments so they express your intent unambiguously. More on that here if you want further explanation: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....


Guess I just missed it. The SEC's search database provides info for companies and money managers alike, so I wrongly assumed they were the same thing. Sorry.


They’re being rude, I don’t think you have anything to be sorry for. Very cool project, well done!


Impressive work. Just a small comment: it doesn't seem to track prices after a bonus issue or a stock split. Could you adjust the paid price for the stock to account for that? For example, for Google I see the price: $1413.61


Same with AMZN (in Berkshire Hathaway's holdings, for example): 1k USD -> 100 USD because of the stock split, which eventually shows up as -90% performance.
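
A minimal sketch of the adjustment both comments ask for, assuming the site stores a per-share paid price and can get a list of split events from one of its data sources. The function name, the `splits` structure, and the numbers are hypothetical and not taken from the wallstreetlocal code:

```python
from datetime import date

def split_adjusted_price(paid_price, bought_on, splits):
    """Divide the paid price by the ratio of every split after the purchase date.

    `splits` is a list of (effective_date, ratio) pairs, where ratio is the
    number of new shares per old share (20.0 for a 20-for-1 split).
    """
    adjusted = paid_price
    for effective_date, ratio in splits:
        if effective_date > bought_on:
            adjusted /= ratio
    return adjusted

# Illustrative numbers only: one 20-for-1 split after the purchase date.
splits = [(date(2022, 7, 18), 20.0)]
adjusted = split_adjusted_price(1413.61, date(2020, 6, 30), splits)
# A purchase made after the split is left untouched.
unchanged = split_adjusted_price(100.0, date(2023, 1, 3), splits)
print(round(adjusted, 2), unchanged)
```

Comparing the adjusted paid price with the current price keeps a split from showing up as a large loss.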


Nice job! Working on things is really the only way to get better and this is a great project to sink your teeth into.

My advice is: Keep working, keep learning. If you love computers and want to work for a startup you absolutely have what it takes to make that happen. And if there are no startups near you which are right for you, you can found your own.


Lots of competition here as services like WhaleWisdom are quite powerful at the basics around 13F.

Some ideas:

* Cluster the 13Fs into buckets. Performance buckets and volatility buckets and aggressiveness buckets.

* Build model portfolios that blend holdings from the best performers.

* Do some basic regime analysis and find which 13F filers perform best during the various regimes.


I have learned something today, thank you for that! The pitch is clear and the fact that you put so much work in open sourcing it is really impressive.


Please consider contributing data from this to https://www.data-liberation-project.org/

https://www.data-liberation-project.org/datasets/

Awesome work!!


Awesome project, I have submitted a request.


Strange, suddenly Vanguard Group doesn't show up anymore under top filers: https://www.wallstreetlocal.com/recommended/top But they should have about the same amount of assets as BlackRock: https://en.wikipedia.org/wiki/List_of_asset_management_firms

Why did they vanish from the list? Maybe a data bug?


Congrats on the project!

Here’s an idea to monetize it: implement collaborative filtering (example: funds in rows, assets in columns).

Once you do it, you will be able to cluster similar funds. Let’s say X funds have the asset A, but Y funds do not have it, and X and Y belong to the same cluster. Thus, if you recommend asset A to the Y funds, there’s a high probability they’d add it to their portfolio (if properly pitched). This is roughly how Netflix recommends movies, Spotify recommends songs, etc.

A lot of players in the industry would pay high $$$ for a recommender system like that. And you already made 80%: it’s only missing the final machine learning part (which is the fun part :))
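
A rough sketch of the funds-in-rows, assets-in-columns idea, using plain neighborhood-based collaborative filtering with cosine similarity. That is one simple variant chosen for illustration and not necessarily what the comment has in mind. The fund names and holdings are made up:

```python
from math import sqrt

# Hypothetical holdings: fund -> set of tickers it reports (a binary matrix row).
holdings = {
    "fund_a": {"AAPL", "MSFT", "NVDA"},
    "fund_b": {"AAPL", "MSFT"},
    "fund_c": {"XOM", "CVX"},
}

def cosine(a, b):
    """Cosine similarity between two binary holding vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend(target, holdings, top_n=3):
    """Score assets the target fund lacks by the similarity of the funds holding them."""
    scores = {}
    for other, assets in holdings.items():
        if other == target:
            continue
        sim = cosine(holdings[target], assets)
        for asset in assets - holdings[target]:
            scores[asset] = scores.get(asset, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [asset for asset, score in ranked[:top_n] if score > 0]

suggestions = recommend("fund_b", holdings)
print(suggestions)
```

A production recommender would more likely weight positions by value and use matrix factorization, but the row/column layout stays the same.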


This is a great idea, especially since I am already learning machine learning with Python right now. I will try this, thanks.


Feedback is: by making this an npm thing, you reduce the pool of people that will use this.

This should really be a lib that takes a folder of 13F forms and outputs a csv, or something like this. That's it. No need for webapps or whatever.
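
For what it's worth, a minimal sketch of such a lib, assuming each file in the folder is a 13F information table in XML. The element names (`infoTable`, `nameOfIssuer`, `cusip`, `value`, `sshPrnamt`) reflect my understanding of that format and should be checked against real filings. The function names and the sample data are hypothetical:

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

FIELDS = ["nameOfIssuer", "titleOfClass", "cusip", "value", "sshPrnamt"]

def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]

def rows_from_xml(data):
    """Yield one dict per infoTable entry in a 13F information table."""
    root = ET.fromstring(data)
    for entry in root.iter():
        if local(entry.tag) != "infoTable":
            continue
        found = {local(el.tag): (el.text or "").strip() for el in entry.iter()}
        yield {field: found.get(field, "") for field in FIELDS}

def folder_to_csv(folder, out):
    """Write every holding found in folder/*.xml to the open text file `out`."""
    writer = csv.DictWriter(out, fieldnames=["file"] + FIELDS)
    writer.writeheader()
    for path in sorted(Path(folder).glob("*.xml")):
        for row in rows_from_xml(path.read_bytes()):
            writer.writerow({"file": path.name, **row})

# Made-up sample entry, only to show the shape of the parsed rows.
SAMPLE = b"""<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>EXAMPLE CORP</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>000000000</cusip>
    <value>1234</value>
    <shrsOrPrnAmt><sshPrnamt>100</sshPrnamt><sshPrnamtType>SH</sshPrnamtType></shrsOrPrnAmt>
  </infoTable>
</informationTable>"""
rows = list(rows_from_xml(SAMPLE))
print(rows)
```

Calling `folder_to_csv("filings", sys.stdout)` after `import sys` would dump a whole folder. Note this only covers the raw filing data, with no price lookups.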


There's actually a couple of libraries that already do this, and this project can probably function like a library if edited here and there.

The problem I found though is that you can't really just use the raw filings. The data is much more useful when the stocks are queried with third party APIs and organized along with things like recent price data. This project alone uses three APIs, and while you could include that in a library and force the developer to get three different API keys, it just works better as a service.

If a "library" is all you want though, the API is available with documentation. There's also 13f.info, an open-source project that predates this one. Although it is still a service, it is more like a library and could probably be used like one.

https://13f.info/


Hi! This looks great. I am playing with SEC data at Embarc.com and found that ingesting the filings is no fun. Well, if it is xml, it is fine. But 8k filings are all HTML. Thankfully they have standard section names.

Anyhow, are you downloading the sec filings locally and processing them? It can be a lot of files! The EDGAR database has a lot of files in there. I download stuff daily, add to sqlite, and then process into various other things. I had to do some app side compression as the sqlite file gets big!
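
In case it helps anyone, here is roughly what app-side compression in front of SQLite can look like, using only the Python standard library. The table layout, function names, and accession number are invented for the example and are not my actual schema:

```python
import sqlite3
import zlib

def open_store(path=":memory:"):
    """Open (or create) a store with one compressed BLOB per filing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filings ("
        "accession TEXT PRIMARY KEY, form TEXT, body BLOB)"
    )
    return conn

def save_filing(conn, accession, form, text):
    """Compress the filing text before it reaches SQLite."""
    blob = zlib.compress(text.encode("utf-8"), level=9)
    conn.execute(
        "INSERT OR REPLACE INTO filings VALUES (?, ?, ?)",
        (accession, form, blob),
    )
    conn.commit()

def load_filing(conn, accession):
    """Return the decompressed filing text, or None if it is not stored."""
    row = conn.execute(
        "SELECT body FROM filings WHERE accession = ?", (accession,)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0]).decode("utf-8")

conn = open_store()
html = "<html>" + "<p>Item 1.01 Entry into a Material Definitive Agreement</p>" * 200 + "</html>"
save_filing(conn, "0000000000-24-000001", "8-K", html)
restored = load_filing(conn, "0000000000-24-000001")
stored_size = conn.execute("SELECT length(body) FROM filings").fetchone()[0]
print(len(html), stored_size)
```

Repetitive HTML like this compresses very well, which is where most of the savings come from.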


For the search database, I did have to manually download each and every company, all 856,000 of them. To cut down on size though, I included all data points except for filings. This is because they're large, but also because they're updated often. By excluding them the database stays more up-to-date.

Other than the search database which needs to be available on demand, the rest of the filers are queried and analyzed on demand. This is required because some filers get really, really big. Blackrock Inc alone is about a 30MB file.


I get the list of filings, and for now only download the ones I want, which are related to S-1 or 8k (there are lot that are related, including withdrawals of a filing, updates to a filing, etc., and some companies use an S-4 etc).

Right now I scan the list of daily filings, and for every cik I mark them as dirty. Another pass looks at ciks marked dirty, and downloads the xml or json of the filings. I then scan for anything new, create filing rows and mark them as dirty so to speak.
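
A simplified sketch of that two-pass dirty marking, with the network fetch replaced by a stub. Table names, column names, and the CIK are placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ciks (cik TEXT PRIMARY KEY, dirty INTEGER NOT NULL DEFAULT 0)")
conn.execute(
    "CREATE TABLE filings (accession TEXT PRIMARY KEY, cik TEXT, "
    "dirty INTEGER NOT NULL DEFAULT 1)"
)

def mark_dirty(conn, daily_index):
    """Pass 1: flag every CIK that appears in today's list of filings."""
    for cik in {entry["cik"] for entry in daily_index}:
        conn.execute("INSERT OR IGNORE INTO ciks (cik) VALUES (?)", (cik,))
        conn.execute("UPDATE ciks SET dirty = 1 WHERE cik = ?", (cik,))

def refresh_dirty(conn, fetch_filings):
    """Pass 2: fetch filings for dirty CIKs, insert unseen ones, clear the flag."""
    dirty = [row[0] for row in conn.execute("SELECT cik FROM ciks WHERE dirty = 1")]
    for cik in dirty:
        for accession in fetch_filings(cik):
            # New filing rows start out dirty (column default) for later processing.
            conn.execute(
                "INSERT OR IGNORE INTO filings (accession, cik) VALUES (?, ?)",
                (accession, cik),
            )
        conn.execute("UPDATE ciks SET dirty = 0 WHERE cik = ?", (cik,))
    conn.commit()

# Stub standing in for downloading a filer's filing list.
fake_remote = {"0000000001": ["acc-1", "acc-2"]}

def fetch_filings(cik):
    return list(fake_remote.get(cik, []))

mark_dirty(conn, [{"cik": "0000000001"}, {"cik": "0000000001"}])
refresh_dirty(conn, fetch_filings)
new_rows = conn.execute("SELECT COUNT(*) FROM filings WHERE dirty = 1").fetchone()[0]
still_dirty = conn.execute("SELECT COUNT(*) FROM ciks WHERE dirty = 1").fetchone()[0]
print(new_rows, still_dirty)
```

Because inserts ignore known accession numbers, rerunning either pass is harmless.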

Annual reports can be big, and for my purposes I don't need them so I skip them. The HTML of these things is garbage! But a lot of financial data is in the company facts which is nice and clean.


This is a fantastic project. Thank you for releasing it!

Do you have a plan to expand with new features and are you looking for contribution?

Will the database be updated automatically?


I plan to add a lot of features, and all suggestions are appreciated.

As for contribution, I am definitely looking for it. I have not been in the open-source field for long so I don't exactly know how to get it, but I would highly appreciate it if anyone could help.

Contribution would be especially helpful since I was still learning a lot of the technologies I used while I built this project, and the code is prone to newbie mistakes.


I am working on a somewhat similar project, for searching items 1 and 1a in 10-k annual reports, that I am hoping to release in the near future. I would be interested to hear what lessons you end up learning about scaling up to handle the interest you got from HN.


Definitely limit the more disk-heavy features, or spend more time (and money) on infrastructure. I was running the whole site on an 8vCPU 24GB RAM VM, and it almost immediately crashed due to the high disk reads.

This is likely due to the fact that the database is huge, and providing that data on demand is very resource intensive, especially when there are forty different people sending many requests a second.

If you don't want to compromise on data though, look into spending a little bit more time/money on infrastructure. I wish I had deployed the project on Kubernetes, instead of what I ended up doing.


Isn't there a website where you can view the trades by day of executives at large companies and see what moves they're making?

I used to have it bookmarked


This is a cool concept, great job on the project so far!


Whalewisdom and gurufocus have similar offerings. Not sure if they are 100% free.


Both offer premium memberships, and can often be really restricting when viewing/downloading data.

The main feature I added because of WhaleWisdom was the data download. For any filer you can download all data in CSV or JSON, as on WhaleWisdom that's unavailable. I also plan to add a bunch of features those sites have eventually.


Yeah, I'll throw my SEC Filings site https://Last10K.com into the mix too that has a free feature on how a manager's portfolio changed by tagging which stocks are new / sold / increased / decreased from the previous to current quarter. Here's an example from the OP's "Popular" and "Top" filers:

Tiger Global: https://last10k.com/sec-filings/1167483

Ruane, Cunniff Goldfarb: https://last10k.com/sec-filings/1720792
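
The tagging itself is conceptually simple. A bare-bones sketch, assuming each quarter has been reduced to a ticker-to-shares mapping (the function name and the numbers are made up, and this is not the site's actual code):

```python
def tag_changes(previous, current):
    """Compare two {ticker: shares} mappings and label each position."""
    tags = {}
    for ticker in sorted(previous.keys() | current.keys()):
        before = previous.get(ticker, 0)
        after = current.get(ticker, 0)
        if before == 0 and after > 0:
            tags[ticker] = "new"
        elif before > 0 and after == 0:
            tags[ticker] = "sold"
        elif after > before:
            tags[ticker] = "increased"
        elif after < before:
            tags[ticker] = "decreased"
        else:
            tags[ticker] = "unchanged"
    return tags

# Hypothetical share counts for two consecutive quarters.
q1 = {"AAPL": 1000, "MSFT": 500, "XOM": 200}
q2 = {"AAPL": 1500, "MSFT": 300, "NVDA": 50}
changes = tag_changes(q1, q2)
print(changes)
```

Real filings need positions matched by CUSIP and share counts adjusted for splits before a comparison like this is meaningful.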


I think your site may be getting hugged to death right now. I have a feeling you'll learn a lot more about scalability after being on the front page of HN :-)


Wow, you really nailed it on the head.

I have been tearing my hair out for the last hour trying to get my Always Free Oracle instance running again.


Great work, keep going!


So you're 17. It's a pretty unique niche for a young person, so I'd be interested to know where your curiosity in this comes from, and even more so, why you decided to share the output of your curiosity? What got you interested? Where do you see it going? Are other people around you at your stage in life also interested in this stuff?

imo you're on a great path, stick with it!


My cousin told me about 13F filers about a year and a half ago, and I was shocked to find out that all SEC data is public. After I saw that a lot of sites restrict data for SEC filings, I set out to find the SEC filings directly and make them more accessible.

This project was a way to use my web development experience for a good cause, and to learn a lot along the way. I hope to improve the project thoroughly through user suggestions, then maybe hand it off to other people. Like I said in my post, my main goal is to one day get into start-ups, so I can create good in the world through them. This is hopefully the first step in that journey.

Thanks for your kind words.


  504: GATEWAY_TIMEOUT
  Code: FUNCTION_INVOCATION_TIMEOUT
If the data changes irregularly, you’re probably better off making it a static site and having a script update it periodically, also to avoid excessive cloud charges (since you seem to be hosting this on Vercel).


Vercel hosts the front-end, but the back-end is what's down right now. I thought I could rely on my Always Free Oracle cloud instance but I guess not.

The problem with making a static site is that there are over 800,000 SEC filers, so it would be impossible to query all of them and store it.

I hadn't expected so much traffic, so I really have no clue how to handle this without excessive cloud charges. The best I've done so far is to look into free hosting for open-source projects and add a donation link to the homepage.


You can query a SQLite DB entirely client side over HTTP: https://github.com/proofrock/ws4sqlite

The pre-processed data would have to be served from somewhere though. I'm not sure if GitHub could be used to host.

If not, Scaleway Stardust includes a bit of disk, 75GB free S3-compatible storage, and most importantly: free 100mbps outbound data transfer for < $4/month.

There are probably other cheap shared web hosts that claim unlimited data transfer but not sure they'll deliver.


> The Securities and Exchange Commission (SEC) keeps record of every company in the United States.

No. SEC keeps tabs on publicly traded companies, large institutional investors and the like. Hence the 'securities' and 'exchange'. Most companies in the US are private and not publicly traded. And most companies are not asset managers.

> and valuable analysis is often hidden behind a paywall.

Sure, but that's because analysis is valuable.

> I made this project to better democratize SEC filings

The SEC already makes the information publicly available via their edgar system.

> In the comments, I'd appreciate any and all advice, as well as feedback on how to improve the site.

I'm not sure who your target audience is. Finance companies already have internal teams pulling this kind of data for them or they buy it from finance data providers. As for the general public, what need do they have for top 'filers'? As for me, I'd rather dump the data into a database or Excel and query it rather than looking at a static page.

At the very least, don't make the site static. Try to add a spreadsheet. Or at the very least a sorting option. Also, vanguard, blackrock, etc aren't top filers. They are the largest or top asset managers. Maybe tie news stories to each asset manager? And more data? But as I said, not sure what the value here is for the end user. Why would I or anyone else ever use your site? Or better yet, what do you use the site for?


Private companies don't have to file 13Fs, sure, but it's not true that the SEC doesn't 'keep tabs on' them, they still have to be non-fraudulent in their private offerings, etc.


[flagged]


To find value ahead of other market participants.

https://www.cnbc.com/2024/02/15/nvidia-holdings-disclosure-p...


As if that would be real...


Gains are gains, regardless. You don’t have to believe for the number go up.


There's no gain to be made like that. The fact that you mention Nvidia is selection bias: you won't mention the 1000s of other companies for which that didn't work, or where you would even have lost money.

I could tell you that stocks starting with a vowel have positive returns in Christmas of odd years. Even though "gains are gains", it's just a selection bias as well and isn't tradable.

As a non-professional investor, you are better off assuming that any information you have access to is already priced into the market. If the price is lower than what you think it should be, it's probably not a "good deal", but rather that you didn't understand the risk that other, more sophisticated players priced in.


Other retail traders buy on 13F, so number go up?



