Good initiative. Now people need to go through this and do the reviews :-)

Next step would be to do reproducible builds (if it's not already the case).




Not a Rust dev so maybe a dumb question, but is this more involved than just running diffs? If so, what needs to be done?


The key thing is interpreting the diff. Is there a difference because they ran some code generator and the crate contains generated code not present in the repo, or did they add a backdoor?
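
To make the mechanical part concrete, here's a rough Python sketch (the crate name, version, and repo URL are made-up examples); the interpretation is the part that takes work:

    # Hypothetical sketch: fetch the published crate and diff it against the repo tag.
    import subprocess, tarfile, urllib.request

    name, version = "foo", "1.2.3"          # made-up crate
    repo = "https://example.com/foo.git"    # made-up repository

    # crates.io serves the packaged .crate (a gzipped tarball) at this endpoint.
    url = f"https://crates.io/api/v1/crates/{name}/{version}/download"
    urllib.request.urlretrieve(url, f"{name}-{version}.crate")
    with tarfile.open(f"{name}-{version}.crate") as tar:
        tar.extractall(".")                  # unpacks to foo-1.2.3/ (untrusted archive, mind the sandbox)

    subprocess.run(["git", "clone", "--depth", "1", "--branch", f"v{version}", repo, "repo"], check=True)

    # Expect benign noise: Cargo.toml is rewritten by `cargo package`, and the tarball
    # adds files like .cargo_vcs_info.json and Cargo.toml.orig.
    subprocess.run(["diff", "-ru", "--exclude=.git", "repo", f"{name}-{version}"])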


Most of the diffs are probably innocuous. I suspect the most common diff would be the version line of Cargo.toml, both from CI that automatically updates that line, and people who forgot to update it before making a tag in git.


As someone with a crate in the 50MM+ download range, this happens all the time. I really should automate this via a GH action.
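
Something along these lines, probably, as a tag-time check (a rough Python sketch; it assumes v-prefixed tags, a single-package crate, and GitHub Actions' GITHUB_REF_NAME variable):

    # Hypothetical CI step: fail a tag build if Cargo.toml wasn't bumped to match the tag.
    import json, os, subprocess, sys

    tag_version = os.environ["GITHUB_REF_NAME"].removeprefix("v")   # e.g. v1.2.3 -> 1.2.3
    meta = json.loads(subprocess.run(
        ["cargo", "metadata", "--no-deps", "--format-version", "1"],
        capture_output=True, text=True, check=True).stdout)
    crate_version = meta["packages"][0]["version"]
    if tag_version != crate_version:
        sys.exit(f"Cargo.toml says {crate_version} but the tag says {tag_version}")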


Interested to see the crate, and maybe I can help?


https://crates.io/crates/ctor

Patches welcome!

Email in profile as well.


First pass gpt?


Heavily downvoted, which is fair because I didn't really explain what I meant, which was: would using LLMs to parse the generated diffs, as a first pass, be useful/efficient for spotting and interpreting discrepancies?
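
Something like this is what I had in mind, purely as a triage step, not a verdict (OpenAI's Python client assumed; the model name and prompt are placeholders, and the output would still need a human look):

    # Hypothetical first-pass triage of a crate-vs-repo diff with an LLM.
    from openai import OpenAI

    def triage(diff_text: str) -> str:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content":
                 "You review diffs between a published Rust crate and its source repository. "
                 "Classify the diff as LIKELY_BENIGN (version bumps, generated code, metadata) "
                 "or NEEDS_REVIEW, and explain why in one paragraph."},
                {"role": "user", "content": diff_text},
            ],
        )
        return resp.choices[0].message.content

    # Anything not clearly benign gets queued for a human reviewer.
    print(triage(open("foo-1.2.3.diff").read()))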


When your goal is to improve security, the unreliability that comes with LLMs is not the answer.


I don't think this is a relevant take. The goal is to implement a system that automatically scans countless packages and runs a heuristic to determine whether a package is suspicious or not. You're complaining about false positives/false negatives while ignoring that not checking packages at all is not an improvement either.


Personally I think using LLMs to scan is a good idea, but an honest downside is the potential for a false sense of security. I think LLMs here are useful for finding unintentional security flaws; I don't think they're a great tool for finding intentional ones, a la the xz situation. People might be less inclined to dig into the code directly if it was stamped with a green check mark by a GPT.


Using machine learning, including LLMs, to detect and mitigate malicious code is of interest to a whole lot of people smarter than me, which really suggests your flippant rejection of their potential is premature.

https://arxiv.org/abs/2405.17238
https://arxiv.org/abs/2404.02056
https://arxiv.org/abs/2404.19715
https://www.sciencedirect.com/science/article/pii/S266638992...


Not necessarily; it reduces false positives. It just doesn't do anything for false negatives (and arguably makes that problem worse).

If you just want a rough signal that the differences are valid, this seems fine. But I wouldn't use it as a guarantee.


Grabbing yet another online third party's untested data doesn't come with a guarantee either?


It could work for classifying honest/innocent differences.

However, LLMs are incredibly naive, so they could be easily fooled by a malicious actor (probably as easily as adding a comment saying this is definitely NOT a backdoor).


LLMs are broadly naive, but when fine-tuned on a narrow domain of expertise/knowledge, this problem is less pronounced.



