I haven't looked at what is directly impacted yet, but this is an oversimplification. Assuming all Garmin services are down, I've personally used my inReach in the following situations that have no real map involvement:
- Coordination of supply drops during multi-day backcountry ski tours.
- Weather updates in the Rocky Mountains where weather can change in an instant.
- Contacting a personal dispatch POC over messaging during an avalanche injury that required SAR without having to hit the SOS button.
Additionally, my wife, a biologist working in remote areas of Colorado, relies on her inReach Mini every day as a safety net. Most people I encounter in the outdoors space rarely use their Garmin for navigation. Instead, they use an app like onX, Gaia, or CalTopo.
Well, nobody needs anything 100% of the time besides oxygen. But if I pay a lot of money for a watch because it has certain features backed by services, I would definitely hope I can actually use them when I want to.
And nobody can guarantee 100% uptime for online software services. Not to give a free pass or excuse the downtime, but you shouldn't expect 100% uptime if it's not in an SLA you are paying for, and such an SLA would be quite expensive.
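For a sense of scale, even common paid SLA tiers allow a surprising amount of downtime. A quick back-of-the-envelope calculation (plain arithmetic, nothing Garmin-specific):

```python
# Allowed downtime per year for common SLA uptime tiers.
HOURS_PER_YEAR = 365.25 * 24

for uptime in (0.99, 0.999, 0.9999):
    allowed_hours = HOURS_PER_YEAR * (1 - uptime)
    print(f"{uptime:.2%} uptime -> ~{allowed_hours:.1f} hours of downtime per year")
# 99.00% -> ~87.7 h, 99.90% -> ~8.8 h, 99.99% -> ~0.9 h (roughly 53 minutes)
```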
There are no roles of that kind. If you want to work in IT, you need to be a recruiter or a manager.
While this might seem controversial, it's very accurate.
How about all alumni email addresses just become forward and reverse proxy relays? Let me add a forwarding address and provide outbound mail.
Permanent vanity addresses tied to accomplishments, clubs, etc. should be a straightforward business or product for Google, Microsoft, Cloudflare, etc. to offer.
Or patch DNS to allow the sale of email addresses, and process all forwards (with the forwarding address stored privately) before processing other MX records.
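As a rough sketch of the forwarding idea, assuming a private lookup table that maps vanity alumni addresses to real mailboxes: the handler and hostnames below are purely illustrative, not anything these providers actually offer.

```python
# Minimal sketch of an alumni-address forwarding relay (illustrative only).
# FORWARDS and the upstream SMTP host are hypothetical; a real service would
# keep the forwarding table private and authenticate senders for outbound mail.
import smtplib
from aiosmtpd.controller import Controller

FORWARDS = {"alum@example.edu": "real.person@mail.example"}  # hypothetical data

class ForwardingHandler:
    async def handle_DATA(self, server, session, envelope):
        for rcpt in envelope.rcpt_tos:
            target = FORWARDS.get(rcpt.lower())
            if target is None:
                continue  # unknown vanity address; a real relay would reject earlier
            # Re-send the original message to the privately stored forwarding address.
            with smtplib.SMTP("smtp.upstream.example", 25) as client:
                client.sendmail(envelope.mail_from, [target], envelope.content)
        return "250 Message accepted for delivery"

if __name__ == "__main__":
    controller = Controller(ForwardingHandler(), hostname="0.0.0.0", port=8025)
    controller.start()  # runs the SMTP listener in a background thread
    input("Relay running on port 8025; press Enter to stop.\n")
    controller.stop()
```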
There are actual SWE jobs where humans sift through this kind of noise. Someone told me they worked such a job recently.
It's a good tool to add pressure and raise expectations.
Maybe this is the future...
They only know the 22% number because unit tests that check for a fix are included in the benchmark. In other words, in a real-world situation, the human would still need to double-check. The patches this tool generates do not include appropriate tests or explanations and would never pass code review by a qualified human.
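Roughly speaking, a patch can be scored "resolved" only because each issue ships with tests that must flip from failing to passing. A hedged sketch of that kind of scoring loop follows; the function and field names are my guesses, not the actual benchmark harness:

```python
# Rough sketch of benchmark-style scoring: a patch counts as "resolved" only
# if the tests bundled with the issue pass after the patch is applied.
# Names here are hypothetical, not the real SWE-bench harness.
import subprocess

def is_resolved(repo_dir: str, patch_file: str, fail_to_pass_tests: list[str]) -> bool:
    apply = subprocess.run(["git", "-C", repo_dir, "apply", patch_file])
    if apply.returncode != 0:
        return False  # patch doesn't even apply cleanly
    # Run the tests that were failing before the fix; they must now pass.
    result = subprocess.run(["python", "-m", "pytest", *fail_to_pass_tests], cwd=repo_dir)
    return result.returncode == 0

# A human reviewer still has to judge whether the change is actually correct and
# maintainable, and whether it comes with its own tests; the harness only checks the above.
```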
SWE-bench Lite is a subset of extremely simple issues from a cherry-picked subset (SWE-bench) of a handful of large (presumably well-run) Python-only projects.
Here are some rules they used to trim down the SWE-bench Lite problems:
* We remove instances with images, external hyperlinks, references to specific commit shas and references to other pull requests or issues.
* We remove instances that have fewer than 40 words in the problem statement.
* We remove instances that edit more than 1 file.
* We remove instances where the gold patch has more than 3 edit hunks (see patch).
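For concreteness, those rules amount to a filter roughly like the sketch below; the instance fields are assumed names for illustration, not the dataset's actual schema.

```python
# Approximate reconstruction of the SWE-bench Lite filtering rules quoted above.
# Field names on `inst` are assumed for illustration, not the dataset's real schema.
import re

REFERENCE_PATTERN = re.compile(r"https?://|\b[0-9a-f]{7,40}\b|#\d+", re.IGNORECASE)

def keep_instance(inst: dict) -> bool:
    text = inst["problem_statement"]
    if inst.get("has_images") or REFERENCE_PATTERN.search(text):
        return False  # images, hyperlinks, commit shas, issue/PR references
    if len(text.split()) < 40:
        return False  # problem statement too short
    if inst["num_files_edited"] > 1:
        return False  # multi-file gold patches excluded
    if inst["num_edit_hunks"] > 3:
        return False  # gold patch has too many edit hunks
    return True
```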
You can't demonstrate whether a dataset is representative or not by "an example or two". You need to look at all the data.
And all of this is fine. It's just a benchmark suite and doesn't need to be fully representative. The dataset itself doesn't even claim to be, as far as I can find. All I'm saying is that the title wasn't really accurate.
At the time of writing, their repo is 12h old.
The training time isn't stated in the paper.
I'm thinking maybe one of these robots can replicate this and tell us how it went.