Hacker News | zzleeper's comments

Maybe because in the places where most news is written (DC, NY, SF) crime has spiked drastically?

EG:

"D.C. Police have responded to 227 homicides so far in 2023. That's up 34% from the same time last year. It also tops 2021's total for the year, which was the highest since 2003. That means the homicide rate in the District is the highest it's been in 20 years, and climbing, with two months left in the year."

https://www.wusa9.com/article/news/crime/dc-homicide-rate-20...

https://www.axios.com/local/washington-dc/2023/10/03/dc-crim...


Homicide is a tiny fraction of all crime, and the numbers of homicides are low enough in large US cities that small changes can lead to fairly large percentages.
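To make the base-rate point concrete, here is a rough sketch with made-up counts (they are illustrative assumptions, not figures from the linked articles): the same absolute increase looks dramatic on a small base and negligible on a large one.

```python
def percent_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# Hypothetical counts, chosen only to illustrate the arithmetic.
small_base = percent_change(10, 15)      # +5 events on a base of 10
large_base = percent_change(1000, 1005)  # +5 events on a base of 1000

print(f"small base: {small_base:.1f}%  large base: {large_base:.1f}%")
```

The absolute change is identical in both cases; only the denominator differs.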



I was wondering about this too. Maybe the geographical distribution of crime is changing, even if overall there is less of it per capita.


Looks great! It would be very interesting to understand a bit of the why/how of some of the steps, such as the reranking and how you arrived at your chunking algo.


Thank you :). I updated the README to have some more explanation of the steps.

The chunking algorithm chunks by logical section (intro, abstract, authors, etc.) and also utilizes recursive subdivision chunking (chunk at 512 characters, then 256, then 128...). It is quite naive still but it works OK for now. An improvement would perhaps involve more advanced techniques like knowledge graph precomputation.
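For readers who want to see the idea in code, here is a minimal sketch of section-wise recursive subdivision. It is one reading of the description above, not the project's actual implementation; the function names, the `(size, piece)` output shape, and the rule of skipping subdivision for pieces that already fit the next size are all assumptions.

```python
def recursive_chunks(text, sizes=(512, 256, 128)):
    """Split text at the coarsest size, then subdivide each piece at finer sizes.

    Returns a flat list of (chunk_size_level, piece) tuples covering every level.
    """
    chunks = []

    def split(segment, level):
        size = sizes[level]
        for start in range(0, len(segment), size):
            piece = segment[start:start + size]
            chunks.append((size, piece))
            has_finer_level = level + 1 < len(sizes)
            # Only subdivide when the piece is longer than the next size,
            # otherwise the finer level would just duplicate this piece.
            if has_finer_level and len(piece) > sizes[level + 1]:
                split(piece, level + 1)

    split(text, 0)
    return chunks


def chunk_by_section(sections, sizes=(512, 256, 128)):
    """Chunk each logical section (e.g. abstract, intro) independently."""
    return {name: recursive_chunks(body, sizes) for name, body in sections.items()}
```

Chunking per section keeps pieces from straddling section boundaries, and keeping every granularity lets retrieval match either broad or narrow passages.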

Reranking works differently from plain vector search: instead of embedding each text chunk as a vector and performing cosine-similarity nearest-neighbor search, you use a Cross-Encoder model that compares two texts and outputs a similarity score. Specifically, I chose Cohere's Reranker, which specializes in comparing Query and Answer chunk pairs.
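A rough sketch of the reranking step, under loud assumptions: `overlap_score` is a toy stand-in for a real cross-encoder (it is not Cohere's API and not a trained model), and `rerank` is a hypothetical helper name. The point is only the shape of the pipeline: score each (query, chunk) pair jointly, then sort.

```python
def overlap_score(query, chunk):
    """Toy pairwise scorer: Jaccard overlap of lowercase tokens.

    A stand-in for a cross-encoder, which would read both texts together
    and output a learned relevance score.
    """
    q_tokens = set(query.lower().split())
    c_tokens = set(chunk.lower().split())
    if not q_tokens or not c_tokens:
        return 0.0
    return len(q_tokens & c_tokens) / len(q_tokens | c_tokens)


def rerank(query, candidates, score_pair=overlap_score, top_n=3):
    """Score every (query, candidate) pair and return the top_n as (score, text)."""
    scored = [(score_pair(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


candidates = [
    "the chunking algorithm splits text",
    "homicide rates in DC",
    "how the reranking step scores chunks",
]
best = rerank("how does reranking work", candidates, top_n=1)
```

In practice the candidates would come from a cheaper first-stage retriever, and `score_pair` would be swapped for a call to an actual reranking model.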


If you want to get something done quickly, try llama index.

If you want to learn/hack, pick an easy vectordb, get an OpenAI API account, and do a quick attempt

Then you can switch to a local LLM and embedder, and it helps a bit in learning what the pain points are


I was in that spot a few weeks ago. My requirements were not huge but a) I was on Windows and b) I didn't want to waste too much time setting it up.

Tried a few DBs that didn't work well (e.g. I think it was ChromaDB that didn't support Python 3.12) and ended up picking LanceDB.

Very simple onboarding (just built on top of parquet) but there are a few rough edges.

Curious how it compares with qdrant for non-crazy problems


I'm unsure if there is any comparison of LanceDB and Qdrant available out there, but there shouldn't be any issues with Python 3.12 and qdrant-client compatibility. Windows is also not a problem, as the typical local setup is usually based on Docker. Are there any specific features you are interested in?


I wish there would be a way to just buy sp500 minus a list of firms, so I could invest in a diversified way without going into Uber and friends (or any other firms that make profits on paper without the cash flow to back it)


> firms that make profits on paper without the cash flow to back it

Uber is cash-flow positive [1].

Due to stock-based compensation, many profitable tech firms hit this metric before GAAP. Put another way, if you owned the entire business, you could sustainably extract those profits.

[1] https://s23.q4cdn.com/407969754/files/doc_earnings/2023/q3/e...


Fractional shares would work, if you don't want the expense and unreliability of shorting.

Simplest thing would be to find an actively-managed mutual fund that focuses on what you want. (Warning: What you want is probably to lose money compared to the rest of the market.)


Be careful. The indexers might hear this and reassure you that the S&P is only optimal with Uber included at exactly the date it did and the fund will fail otherwise.


So buy SPY and short Uber?


Can you do this easily? Without paying $$ in transaction costs due to shorting (plus margin)


Most trading APIs are fairly simple; TD, for example, is very easy to use and very accessible. IBKR is a bit weirder in both protocol and access, but works very well.

You can pretty trivially DIY that.


https://pebble.finance/

(Disclosure: The CEO is an acquaintance, but I have no financial interest.)


You can do this with most direct indexing solutions.
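As a sketch of what "index minus a list of firms" amounts to: drop the excluded tickers and rescale the remaining weights so they sum to 1. The weights below are invented for illustration (they are not real S&P 500 weights), and `exclude_and_renormalize` is a hypothetical helper, not any broker's or provider's API.

```python
def exclude_and_renormalize(weights, excluded):
    """Remove excluded tickers and rescale remaining weights to sum to 1."""
    kept = {ticker: w for ticker, w in weights.items() if ticker not in excluded}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("nothing left to hold after exclusions")
    return {ticker: w / total for ticker, w in kept.items()}


# Made-up three-stock "index"; AAA and BBB are placeholder tickers.
index_weights = {"AAA": 0.5, "BBB": 0.3, "UBER": 0.2}
custom = exclude_and_renormalize(index_weights, excluded={"UBER"})
print(custom)
```

This ignores taxes, rebalancing, and transaction costs, which is most of what a direct indexing service actually handles.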


There's at least one ETF that tracks the top 50.


I wanted to point this out, albeit slightly less cynically. NickB invested a lot of capital into becoming the expert on WFH (see e.g. https://wfhresearch.com/project-team/ ).

Whenever I see this in other economists, more likely than not their findings tend to go towards where their capital is invested.


> Whenever I see...

LORDY, yes. The only solid "scientific" theory which Economists seem to have developed is the millennia-old one of predicting which "truths" each emperor / prince / nobleman / etc. will want to be told.


> the Fed can correctly model how ...

I wish :(


Also around DC area, the leasing office folks told me that the price changed dynamically and could only be overruled by the HQ. They said that many companies use that software in order to obtain their "optimal" price for the area, which is of course observationally equivalent to colluding with extra steps.

BTW, more recently I saw the same thing with large contractor companies. One quoted me an insane price for something and I asked why, and they replied that it's the price that the software (ServiceTitan) tells them to charge, and even showed me their screen.

Again, they said ServiceTitan decides the price depending on "the average price of other companies in the area" which is code word for you-know-what.


Am I missing the obvious here? A smart and nimble company can corner this market by lowering their price by X. They would have to give up X per unit but they make it up by (volume * lower_price). Isn't that how the market is supposed to work? Like if I want all the business and there's not an obvious differentiator I will compete on price?


That's not how it works because there is no volume. You can't provide more supply if there's no supply to provide. Each building has a fixed number of units and the management for those buildings are trying to maximize the value they get out of each unit. Most cities have a mostly inelastic number of units and population grows every year, so you can pretty much charge whatever you want and people will pay it since there's no alternatives.

Why doesn't someone just build a new building? Mostly they will be blocked by NIMBYism. NIMBYism is a near universal philosophy, to the point that renters who would benefit from this are typically NIMBYs too. Sometimes they're the loudest NIMBYs. It will be pretty much impossible to build anything new until the state takes control of building away from localities and forces them to allow building.


This assumes vacancy is at 0%. For cities like SF I can see this, but do all cities have near 0% vacancy rate? Genuine question, I don't know. I would offer at lower price and have all my units filled.


Not sure about smart, but the word "nimble" can never be used to describe anything about real estate.


Typically the price fixing part is one small feature of a large software suite, and deviations require written justification submitted to the provider (with the implied threat of losing access to the management software).


That's nuts. I had to do 100-hour weeks a few times in my career, and felt that any time above that had zero or even negative productivity.

Is it really necessary to do so many slide decks in order for IB to be as productive?


That was basically my worst experience ever over the course of 7 years and I have never personally met anyone who's worked more hours in one week (lol?) but yeah, any "meaty" meeting is likely to generate a few hundred slides if you include all of the revisions, so it gets out of hand fast. I probably had 5-10 100-hour weeks in any given year, but 70-80 hour weeks would be pretty normal. (I probably also had 4-5 40-hour weeks in any given year, but usually everyone keeps those on the DL...)

Shit compounds when there's a deadweight on the team, or a managing director that doesn't know what they want / has poor planning, or everyone's busy on other projects or the situation is actually complex. In that particular case, it was a combination of a deadweight analyst on a very lean team (I was the only associate, no VP, one very senior director trying to make MD and 2 other super senior MDs... for a $2B M&A deal), poor planning and an actually complex situation (crossborder, post-bankruptcy former public company with public debt still outstanding, multiple tax jurisdictions, CFIUS issues, you name it) leading up to a board meeting on Friday (to which I took a flight to deliver the books, which added to the hours...)


You made me search for 2022-23 studies on HCQ, to see if anything changes.

Behold, still fucking useless: https://www.drugs.com/medical-answers/hydroxychloroquine-eff... (one of many meta studies). Where do you get that disinfo??


https://www.medrxiv.org/content/10.1101/2023.04.03.23287649v...

It appears to improve mortality, though not necessarily symptom duration.

    HCQ-AZ treatment was associated with a significantly lower mortality rate than no HCQ-AZ after adjustment for sex, age, period and patient care setting (adjusted OR (aOR) 95% confidence interval (CI) 0.55, 0.45-0.68). The effect was greater among outpatients (71% death protection rate) than among inpatients (45%).


3rd factor problem. People with parasites are highly likely to die of covid, so treating the parasite decreases mortality.


Different drug (you're referring to ivermectin).

Still, I'm not sure there's any reason to trust this latest Didier Raoult paper any more than the meta-analyses of numerous papers and studies into AZ/HCQ carried out by institutions that hadn't staked their reputation on it being a cure in March 2020.


This was a French study, on French people.

So far as I know, France doesn't have a serious human parasite infection problem.


Literally searching for HCQ-AZ on duckduckgo gives you the results quick, it's 2nd and 3rd on the list.


Why do you call it disinformation instead of him just being wrong?

This obsession with disinfo, misinfo, malinfo portends exceedingly troubling things for our ability to talk with each other on the internet.


The person he was responding to wasn't simply making a [possibly] incorrect statement about medical treatments, he was insisting that the media generally favouring the emerging consensus of medical professions on treatments was "state lies after state lies" and "all propaganda"

I'm not sure it's fair (or even remotely in good faith) to argue that it's people responding to arguments expressed in that manner that are lowering the standard of internet discourse...


It's certainly fair and probably said in good faith. This is why people say "two wrongs don't make a right". Just because one person is lowering a bar doesn't mean a second can't lower it further.


I'd love to hear how a bar for discourse set at "[everything you read on unrelated wedge issue I've inserted into the conversation] is lies and propaganda" is lowered by "this is disinformation. Here's a study"...

People say "two wrongs don't make a right" when they're not singling out the words of only the second party as the reason why people can't talk to each other on the internet.


> People say "two wrongs don't make a right" when they're not singling out the words of only the second party as the reason why people can't talk to each other on the internet.

People say "two wrongs don't make a right" when they're saying that one bad behavior in response to another bad behavior is still a bad behavior.

> "this is disinformation. Here's a study"...

This is both more reasonable than and different from what was written.

> Where do you get that disinfo??

Couldn't that part simply have been left out? More to the point, is a person definitely wrong for thinking that it ought to have been left out?

> I'd love to hear how a bar [that is ostensibly resting on the ground] is lowered

Dig a hole. It can always get worse.


> More to the point, is a person definitely wrong for thinking that it ought to have been left out?

That is both more reasonable and different from what was written :-p

Yes, I think it is absolutely unfair for someone to observe an exchange between somebody ranting about how mainstream scientific consensus is all lies and propaganda and somebody in turn dismissing their argument for why everything was all lies and propaganda as disinfo, and place the blame for the internet discourse in general being suboptimal squarely on the terms used by the second person.

YMMV.


Well put! Reminds me of the habit that people have of calling each other Russian bots or AI. Then why are you arguing with a bot or an AI?

