I got mine down to 160 bytes with some pixel tweaking and converting it to a 16-color indexed PNG. It's not a lot of work or very difficult (I'm an idiot at graphics editing), but you do need to spend the (small amount of) effort. I embed it as a data URI and it's just four lines of (col-80 wrapped) base64 text, which seems reasonable to me.
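If you want to script that rather than doing it by hand in an image editor, here's a rough sketch of the idea using Pillow (a recent version, for the quantize enum; the file name is just a placeholder):

```python
import base64
import io

from PIL import Image  # pip install Pillow

def favicon_data_uri(path, colors=16):
    """Quantize a PNG down to a small indexed palette and return an embeddable data URI."""
    img = Image.open(path).convert("RGBA")
    # 16 colours is usually plenty for a 16x16 or 32x32 icon
    indexed = img.quantize(colors=colors, method=Image.Quantize.FASTOCTREE)
    buf = io.BytesIO()
    indexed.save(buf, format="PNG", optimize=True)
    b64 = base64.b64encode(buf.getvalue()).decode("ascii")
    return f'<link rel="icon" type="image/png" href="data:image/png;base64,{b64}">'

print(favicon_data_uri("favicon.png"))  # placeholder file name
```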
Haven't managed to get my headshot below 10k without it looking horrible, no matter how much I tweaked the JPEG or WebP settings, and I thought that was just a tad too big to embed. Maybe I need to find a different picture that compresses better.
I got that 280k Discord favicon down to just 24K simply by opening it in GIMP and saving it again. I got it down to 12K by indexing it to 255 colours rather than using RGB (I can't tell the difference even at full size). You could probably make it even smaller if you tried, but that's diminishing returns. Still, I bet with 5 more minutes you could get it to ~5k or so.
It's very easy; you just need to care. Does it matter? Well, when I used Slack I regularly spent a minute waiting for them to push their >10M updates, so I'd say that 250k here and 250k there adds up and matters, giving real improvements to your customers.
The Event Horizon Telescope having a huge favicon I can understand; probably just some astronomer who uploaded it in WordPress or something. Arguably a fault of the software for not dealing with that more sensibly, but these sorts of oversights happen. A tech company that makes custom software for a living doing this is, quite frankly, embarrassing to the entire industry. It's a big fat "fuck you" to anyone from less developed areas with less-than-ideal internet connections.
It's not, at least for me. If you checked in devtools, that's the gzipped over-the-wire size. Hover over the size and it'll show you the actual resource size, which is still 285k for me.
This is rough math and not heavily researched, but in 2020 Discord had around 300 million users, and 285kB multiplied across that many people is a lot of wasted energy and bits flowing through the pipes. I generally agree with what you're saying, though: gzipped sizes are what's actually sent, plus some CPU usage somewhere to unzip it. Fewer bytes == less waste?
Includes but doesn't always use. PNG also includes filters which can dramatically decrease sizes, especially when combined with compression.
That's why tools like OptiPNG basically brute force all the combinations of options. Depending on the image content, different combinations of filters and compression settings will get the best file size.
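A toy version of that "try everything, keep the smallest" idea, sketched with Pillow; note Pillow only exposes the zlib level, not the PNG filter choice, so OptiPNG and friends will still beat this:

```python
import io

from PIL import Image

def smallest_recompression(path):
    """Re-encode a PNG at every zlib level and keep the smallest result."""
    img = Image.open(path)
    best = None
    for level in range(10):  # zlib compression levels 0..9
        buf = io.BytesIO()
        img.save(buf, format="PNG", compress_level=level)
        data = buf.getvalue()
        if best is None or len(data) < len(best):
            best = data
    return best  # bytes of the smallest encoding found

# OptiPNG goes further: it also brute-forces the five PNG filter types and
# several zlib strategies, which Pillow doesn't expose.
```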
I wouldn't be surprised if that was for a specific reason, like somehow showing up better somewhere for some reason, or something like that. Or maybe not; who knows...
A 256x256 PNG reduced to 256 colors with pixel transparency gets it to 2.68K. I manually dropped the color depth to indexed and saved it out in Photoshop, then used FileOptimizer to shrink it; it includes 12 different image shrinkers and runs them all.
Note that unlike some of the other tools mentioned here, pngquant does lossy compression. Might still be the right tool in many cases, but it means you should check the output while e.g. optipng is a no-brainer to add to whatever your publishing pipeline is.
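Something like this is roughly what that looks like in a pipeline; a hedged sketch that assumes pngquant and optipng are on PATH (flags taken from their docs, but treat it as illustrative):

```python
import shutil
import subprocess
from pathlib import Path

def optimize_favicon(src, dst, lossy=True):
    """Optionally quantize with pngquant (lossy), then always run optipng (lossless)."""
    shutil.copyfile(src, dst)
    if lossy:
        # pngquant rewrites the palette; eyeball the result, and note it exits
        # non-zero if it can't hit the quality target
        result = subprocess.run(
            ["pngquant", "--quality=60-80", "--force", "--output", dst, src]
        )
        if result.returncode != 0:
            shutil.copyfile(src, dst)  # fall back to the untouched original
    # optipng is lossless, so running it unconditionally is safe
    subprocess.run(["optipng", "-o7", "-quiet", dst], check=True)
    return Path(dst).stat().st_size

print(optimize_favicon("favicon.png", "favicon.min.png"), "bytes")  # placeholder names
```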
The difference between the Apple “precomposed” and standard icons had to do with the gloss effect on icons on pre-iOS 7 home screens.
When adding a website/webapp to these earlier home screens, the OS would apply a gloss effect over the icon in order to match the aesthetic of the standard apps.
The precomposed icon was a way for the developer to stop the OS from applying this effect, such as if their logo already had a different gloss effect applied (i.e., “precomposed”) or some other design where adding the glossy shine wouldn’t look right.
The standard icon allowed the OS to apply the gloss effect, which was a timesaver since Apple did tweak the gloss contour over the years: using a standard icon ensured that the website/webapp always matched the user’s OS version.
Also, we turned up 2,000 domains that redirect to a very shady site called happyfamilymedstore[dot]com. Stuff like avanafill[dot]com, pfzviagra[dot]com, prednisoloneotc[dot]com. These domains made it into the Tranco 100k somehow.
Lately, happyfamilymedstore has mysteriously always been in the top ~ten Google Images results for the super-niche bicycle-part searches I do. They seem to have ripped an insane amount of images that get reposted on their domain.
What most of them do is use WordPress exploits to get into random WordPress websites run by people who know nothing about managing a website and are running on a $3/mo shared hosting account.
After they get into these random WordPress sites, they then embed links back to their sketchy site in obscure places, so that the owners of the site don't notice but search bots do. They usually leave the WordPress site alone, but will create a user account so they can get back in later if WordPress patches the exploit. All of this exploiting and link adding is automated, so it's just done by crawlers and bots.
This is done tens of thousands or even millions of times over. All of these sketchy backlinks eventually add up, even if they are low quality, and provide higher ranking for the site they all point to.
Think of websites like mommy blogs, diet diaries, family sites, personal blogs, and random service companies (plumbers, pest control, restaurants, etc.) that had their nephew throw up a WordPress site instead of hiring a professional.
I don't mean to pick on WordPress, but it really is the most common culprit in these attacks, because so many WordPress sites are operated by people who aren't informed about basic security. Plus, WordPress is open source, so exploits get discovered by reading the source code, and attackers will sell those exploits instead of reporting them. So WordPress is stuck in an infinite cycle of chasing exploits and patching them.
You can have a separate system, even a locally running desktop app do that. You can still have a database, complex HTML templating, and image resizing! You just do it offline as a preprocessing step instead of online dynamically for each page view.
Unfortunately, this approach never took off, even though it scales trivially to enormous sites and traffic levels.
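A minimal sketch of that offline-preprocessing idea, with made-up file layout and helper names: resize images and render the HTML once at publish time, then serve the output directory with any dumb static file server.

```python
from pathlib import Path
from string import Template

from PIL import Image  # pip install Pillow

PAGE = Template("<html><body><h1>$title</h1><img src='$thumb'></body></html>")

def build_site(src_dir="content", out_dir="public", thumb_size=(640, 480)):
    """One-off preprocessing: resize images and write static HTML pages."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for photo in sorted(Path(src_dir).glob("*.jpg")):
        thumb_name = f"{photo.stem}_thumb.jpg"
        img = Image.open(photo)
        img.thumbnail(thumb_size)              # resize once, at publish time
        img.save(out / thumb_name, quality=85)
        page = PAGE.substitute(title=photo.stem, thumb=thumb_name)
        (out / f"{photo.stem}.html").write_text(page)

build_site()  # re-run whenever content changes; serve ./public with any static server
```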
I recently tried to optimise a CMS that was streaming photos from the database to the web tier, which then resized and even optimised them on the fly. Even with caching, the overheads were obscene: over 100 cores could barely push out 200 Mbps of content. Meanwhile, a single-core VM can easily do 1 Gbps of static content!
Here's a rough scheme I came up with (I never implemented it, though):
1. Use github pages to serve content.
2. Use github login to authenticate using just JS.
3. Use JS to implement rich text editor and other edit features.
4. When you're done editing, your browser creates a commit and pushes it using the GitHub API (a rough sketch of this step follows the list).
5. GitHub rebuilds your website, and a few seconds later your website reflects the changes. JavaScript with localStorage can reflect the changes instantly to improve the editing experience.
6. Comments could be implemented with a fork/pull request. Of course that implies that your users are registered on GitHub, so it may not be appropriate for every blog. Or just use an external commenting system.
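For step 4, a rough sketch of the commit call, written in Python for brevity (the scheme above would do the same thing from browser JS); it uses the GitHub REST contents endpoint, with the token, repo and path as placeholders:

```python
import base64

import requests  # pip install requests

def push_post(token, repo, path, markdown, message="Update post"):
    """Create or update a file in a GitHub repo, triggering a Pages rebuild."""
    url = f"https://api.github.com/repos/{repo}/contents/{path}"
    headers = {"Authorization": f"Bearer {token}"}
    body = {
        "message": message,
        "content": base64.b64encode(markdown.encode()).decode("ascii"),
    }
    # updating an existing file requires its current blob SHA
    existing = requests.get(url, headers=headers)
    if existing.status_code == 200:
        body["sha"] = existing.json()["sha"]
    requests.put(url, headers=headers, json=body).raise_for_status()

# placeholders: a personal access token, a repo you own, and a Jekyll post path
push_post("<token>", "alice/alice.github.io", "_posts/2021-06-01-hello.md", "# Hello")
```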
So, essentially a site generated with Jekyll, hosted on GitHub Pages with Utterances [0] for comments and updated with GitHub Actions.
I don’t know if the https://github.dev version of Visual Studio Code supports extensions/plugins, but if it does, then a rich text editor for Markdown is already available too.
All that’s left would be an instant refresh for editing.
There are plenty of places that you can go to on this planet with little to no law enforcement. Don't be surprised if you end up dead there. Handling global crime is very difficult.
I recently saw and reported one to a local business.
If you typed in the domain and visited it directly, it wouldn't redirect to the scam site. But if you clicked on a link from a Google search, then it would redirect.
That probably makes it harder for small website owners to find, since they're not clicking their own Google search results.
It happens through search engine optimization, SEO, and a mix of planting reviews and other tactics. Think of it like this - what would you do to get people talking about your site? You'd somehow put links, conversations, reviews, quotes, etc. in front of them.
I worked on Opera Link, the first built-in synchronization between different installations of the Opera browser, across desktop, Opera Mini and Opera Mobile (plus a web view).
Favicons got included in the data from day one, and it was awesome to get the look and feel of your bookmark bar/UI with the correct icons right away.
Back then we stored the bookmarks in a home-grown XML data store (built on top of MySQL, acting more or less as a key-value store). This worked quite nicely, and it allowed us to easily scale the system.
One night the databases and backends handling the client requests suddenly started eating a lot more memory, and the database started using much more storage than normal.
As one of only two backend devops working on Opera Link, I had to debug this, and find out what was going on. After a while I isolated the problem to a handful of users. But how could a few users affect the system so much?
As part of the XML data store, we naively decided to store the favicons in the XML as base64-encoded strings. While not pretty, a 16x16 PNG is not that much data, and even with thousands of bookmarks the total overhead of compression and parsing was negligible. What we did not foresee was what I uncovered that night: a semi-popular porn site had changed something on their server and started pointing browsers to their full-size content images as the favicon! Each image was multiple megabytes, sent from the client, parsed on the backend, decoded, verified, encoded back to base64, added to the XML DOM, serialized, compressed and pushed back to the database...
Before going to bed that night, I had implemented a blacklist of domains we would not accept favicons for, cleaned up the "affected" user data, and washed my eyes with soap.
The truth is that most services will have a set of devops people with access to personal information, and sometimes we need to look at private data to solve issues like this. My first instinct back then was that some smart hacker had created FUSE support for Link or something similar.
Opera Link did not encrypt bookmarks, speed dials, etc., but it did have data types that were encrypted with the master password, even while syncing. We were two people with the access and knowledge to look at individual user information, and we took it very seriously.
> In fact, I recommend that browsers ignore these hints because they are wrong much of the time.
I don't agree. That's the kind of coddling that encourages incompetence. Instead of compensating for others' mistakes, just let their stuff break.
I wonder if Safari on iOS ignores the hints. When I tested, I was surprised to see that pressing the share icon, which holds the option for `Add to Home Screen`, would cause a download of all of the icons listed with `link rel="icon"`.
A problem with this is that when a website breaks in one browser, but works in another, I imagine most people's reaction would be to blame the browser. This leads to a kind of race-to-the-bottom for browser compatibility. See for example the history of User-Agent strings.
The opposite is the case. Overall, being too lenient in what code accepts and applying heuristics will lead to way worse problems down the line. For example, you want your compiler to fail hard instead of saying: "Oh, this isn't a pointer, but I'm sure you meant well, I'm just going to treat it as a pointer!"
In this particular case, it seems to me that the hints serve no purpose and should be abolished altogether, and in the meantime fully ignored. All the necessary metadata is contained in the image file, and browsers should also be (relatively) strict about which image files with what metadata they accept, for security reasons alone.
And if they also went so far as limiting file size, the perpetrators that clog up bandwidth by putting up multi-MB favicons would catch on much earlier (or at all), too.
So what actually is the point of those hints, if browsers have to fallback anyway?
The hints are not a hint on how to render the icon; browsers don't need hints for that. The hints are an instruction to browsers on which icon to download in the case where multiple icons are specified.
If you are Safari and you don't know how to display SVG favicons, then you don't need to waste bytes downloading a favicon only to fail to display it. The HTML does not limit a site to only one favicon.
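In other words, the hints let the browser pick a candidate before it downloads anything. A toy sketch of that selection logic; the hint parsing and the "supported types" set are simplified assumptions, not how any real browser implements it:

```python
def pick_icon(candidates, target=16,
              supported=frozenset({"image/png", "image/x-icon", "image/svg+xml"})):
    """Choose which declared icon to download, using only the type/sizes hints."""
    def score(icon):
        sizes = icon.get("sizes", "")
        if sizes in ("", "any"):
            return target              # unknown size: treat as neutral
        width = int(sizes.lower().split("x")[0])
        return abs(width - target)     # prefer the size closest to what we'll render
    usable = [c for c in candidates if c.get("type", "image/png") in supported]
    return min(usable, key=score) if usable else None

icons = [
    {"href": "/icon.svg", "type": "image/svg+xml", "sizes": "any"},
    {"href": "/icon-512.png", "type": "image/png", "sizes": "512x512"},
    {"href": "/icon-16.png", "type": "image/png", "sizes": "16x16"},
]
# picks /icon-16.png; a browser without SVG support would also never fetch /icon.svg
print(pick_icon(icons))
```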
Why is that not done through the MIME type and using HEAD? The server is apparently much better able to figure out the MIME type, via magic numbers and the file extension of the actual file, than the author (human or not) of the HTML, as we see.
The same headers also inform the browser that they can skip downloading a favicon that they consider too big, for example.
Ugh, HEAD is not universally supported, not even for static content? Okay, I accept that this has value then.
As for the MIME type, for image types I'd say it's more than stable enough. Certainly much, much more stable than the 6.7% error rate mentioned in the article here, I'd be surprised if it was even 1%. If you double click on an image on your desktop for example, you can in almost all cases expect that it will be opened correctly. It ceases being a heuristic entirely if you tell the webserver that *.png is image/png, and only put PNGs with names ending in ".png".
Guess those are the reasons why I got out of web development 10 years ago; everything there is held together by scaffolding and is needlessly wasteful and inefficient.
You might be overthinking this. I agree with the philosophy that stricter is better, but in this case what do you expect broken hints to do?
They’re not used for rendering, they’re used for figuring out what to fetch. A HEAD request would be far less efficient than knowing ahead of time what to fetch: 1 request versus 2N+1 requests.
What you suggest sounds all fine but the entire web is user input for a browser, so no matter what, you need to define how to fail. If you can fail gracefully, you might as well do so, because a failure might not even be triggered by bad code/configuration on your side but simply by flaky network issues.
Yeah, I get how those hints make sense, now that you (and others in the thread) have told me how things are, and I did overlook that HEAD is still an extra request, while the attributes are (effectively) for free.
I do wish that content negotiation (e.g. Accept headers) worked properly. In the end though, those hints implement a subset of content negotiation in a reasonable way, given the state of affairs.
Just don't ignore filename extension. favicon.svg is SVG and that's about it. If you don't support SVG, don't download it. If you want to store png in favicon.svg, don't do that.
YouTube and Twitter both have wrong parameters. Presumably this means all major browsers ignore them or someone would have noticed their favicons not displaying right?
The point for the hints is probably that the browser doesn't need to fetch the 2000×2000 favicon if it only needs something in 16×16 to render in the tab bar.
That may be your viewpoint but browsers have historically always taken the other viewpoint. Take HTML parsing for example. You can miss closing tags and a ton of other stuff, and it'll all work on a best-effort basis.
The browser's job is to do the best it can; that's what users want. No one would use a browser that breaks at the smallest, tiniest error in the source code.
It's a vector graphic; its resolution is whatever you render it at. "S" as in, "Scalable".
Sure, there is some nuance in that you wouldn't want some fine detail to get lost at the displayed size, but presumably you know you're making a favicon when you do so.
Or, you're the NFL & you're going to supply a 4 megapixel image IDK.
> Sure, there is some nuance in that you wouldn't want some fine detail to get lost at the displayed size, but presumably you know you're making a favicon when you do so.
On the other hand, SVG is really not designed for the fine pixel control you want to make the icon look good at smaller sizes as it does not have the equivalent of font hinting.
... and wrote an interesting technical article about it, that even someone like me, who doesn't do web development, enjoys reading. Definitely why I come to HN (no sarcasm, it is).
Designed for personal use as a PWA, specifically on my iPhone. I migrated from Android, where I had a TI-89 emulator app. No such thing exists for iOS. Usability by others was never a requirement :)
My website, gameboyessentials.com, would not exist without this esoteric CSS property. I wanted to show Game Boy images in their exact resolution (160 by 144). With image-rendering: pixelated; I have crisp pictures on my site whose sizes are counted in bytes.
I wonder if there might be a way to map all these using t-SNE to discrete grid locations? Maybe even an autoencoder. I'd love to see what features it could pick out.
I don't see their data set, though. Hmmm.
Maybe I'll just have to crawl it on my own if I want to do it.
You can use t-SNE (or even better, UMAP or one of its variations) to create a 2D point cloud, and then use something like RasterFairy [1] to map the 2D positions to the cells of a grid. It usually works well.
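A sketch of that pipeline, using scipy's Hungarian algorithm in place of RasterFairy to snap the 2D embedding onto grid cells (the feature matrix is assumed to be something like flattened favicon pixels):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.manifold import TSNE

def grid_layout(features, grid_w, grid_h):
    """Embed icons in 2D with t-SNE, then give each one its own grid cell."""
    coords = TSNE(n_components=2, init="pca").fit_transform(features)
    coords = (coords - coords.min(0)) / (coords.max(0) - coords.min(0) + 1e-9)
    # centres of the target cells (needs grid_w * grid_h >= number of icons)
    gx, gy = np.meshgrid(np.linspace(0, 1, grid_w), np.linspace(0, 1, grid_h))
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)
    # minimise total squared distance between embedded points and assigned cells
    cost = ((coords[:, None, :] - cells[None, :, :]) ** 2).sum(axis=-1)
    _, assigned = linear_sum_assignment(cost)
    return cells[assigned]  # one (x, y) grid position per icon, in input order
```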
Favicons are slightly useful. You can serve your page at http://www.example.com with a favicon from https://example.com that has a HTTP Strict-Transport-Security header with includeSubDomains, and then future page loads in that browser will be https (across your whole domain). (This assumes you want your domain to be https)
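For illustration, the header that trick relies on looks something like this; a minimal Flask sketch, with the framework and file names chosen purely for the example:

```python
from flask import Flask, send_file  # pip install Flask

app = Flask(__name__)  # imagine this serving https://example.com

@app.after_request
def add_hsts(response):
    # browsers only honour HSTS over HTTPS; includeSubDomains covers www. as well
    response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return response

@app.route("/favicon.ico")
def favicon():
    # an http://www.example.com page referencing this icon makes the browser
    # learn the domain-wide HSTS policy as a side effect of fetching it
    return send_file("favicon.ico", mimetype="image/x-icon")
```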
I know of a company whose favicon was a hi-res true-color PNG that weighed in at more than 2 MB. The web site was the dominion of marketing, and suggestions to improve the situation were detrimental to one's career path. Sigh.
Not really relevant, but using Go to fetch the data, and then Ruby to process the data is the best. I used this exact set up for a project and it was amazing. Really the sweet spot of use cases for both languages.
Go's got an awesome feature set built into the language for building small networked services. I implemented a client for a cryptocurrency network to extract information about its status and clients. I can't really express why it's so good; it just feels right.
Same for Ruby: the syntax is perfectly suited for transforming, digging through and acting upon data. I didn't even add a Gemfile, only used standard library functions, transforming the data the Go program mined into usable information serialized as JSON, which was subsequently used as a static database for a webpage.
The non-PNG Apple touch icons might be CgBI files? It's an undocumented proprietary Apple extension to PNG which most PNG tools won't accept, but which Xcode uses for iOS apps.
That article was a fun read! There was one sentence that bothered me though.
> I recommend that browsers ignore these hints because they are wrong much of the time. We calculated a 6.7% error rate for these attributes, where the size is not found in the image or the type does not match the file format.
I think of "much" in this context as meaning at least more than 50% of the time, so I had to look up the definition of the word. One definition from Merriam-Webster is "more than is expected or acceptable: more than enough." So I guess the usage is acceptable!
I always enjoy finding I have a slightly wrong definition in my mind for a word. Many arguments (or much arguments?) fail to move forward because of differing, unidentified underlying assumptions that rest on words with slightly different definitions, with each person arguing a slightly different question in their mind.
This reminds me of the time I reported to CIRA (the Canadian domain registry) that their favicon was ~2MB with bad caching rules and was causing issues in ... many situations.
Didn't they miss all the pre-sized icons in their scan as well? For a while Apple encouraged multiple resolution sizes for favicons for... reasons.
I know they additionally missed the directory-specific favicons, which have always had iffy support (i.e. /index.html => /favicon.ico and /munks-page/index.html => /munks-page/favicon.ico).
One weird behavior with favicons that I noticed is that Firefox will download both the 16x16 icon that matches the size it's displayed at (on a 1x pixel-ratio screen) and the largest icon, and then display whichever finishes last. This behavior makes no sense to me.
That "I am feeling lucky" button does not seem random at all, it brought me in order to: Microsoft Windows, Blogger, The Financial Times, Github, Adobe ...
As every other location I randomly scroll to has no recognizable image on it ... that seems preselected :-)
I have always wanted to do this _exact_ analysis - so awesome! Every time I'm building some kind of semi-intelligent parser to fetch an arbitrary visual icon for a URL, I think to myself that there has gotta be a better way to do this.
> The favicon visualization brought memories of the million dollar homepage. I suppose it was precursor of NFTs.
It was not; NFTs are digital certificates saying that you own certain digital content, whereas The Million Dollar Homepage was basically selling ad space on a website.
You could argue you were buying part of the website (digital space) and therefore owned that part of it, but in reality you were renting it as ad space meant to promote your own website (a link).
The purpose and vision of The Million Dollar Homepage and NFTs are completely different, but I can see similarities between quasi-owning digital space (part of a website) and owning digital content or a digital certificate (a token).
Huh, there's a row of identical icons of 3 blue circles (search for cashadvancewow[dot]com), and all the domains using them are loan-related. Interesting way to do forensics on clone sites (although trying a few of them, they're not showing any icons right now, and the URL /favicon.ico 404s).
I checked a few of the sites and just got lorem-ipsum-style landing pages. I wonder what the point is; or are the scammers using the domains mostly for email?
There are multiple runs of "just a bit too abstract" icons that point into the abyssal cesspools of the Internet. Most of them seem to be about loans, so I'm going to avoid announcing that too loudly if I ever need a loan, since clearly, there are some scumbags out there.
What is the Tranco dataset that this is based on? I mean, come on -- anything that claims to be based on 'Alexa' (or any of these others: Cisco Umbrella/OpenDNS? Majestic? Quantcast?) is sooo suspect. None of these sources is that good, especially Alexa, which harks back to a time 20 years ago of browser toolbars and extensions that the large majority do not use anymore.
Just saying: yes, maybe it's easy to come up with a top-1000 list of sites on the net, but other than that no one really knows, unless you're someone like Google/Bing/Apple/Cloudflare with redirection URLs, DNS control, click tracking, etc.
> We did a hacky image analysis with ImageMagick to survey favicon colors. Here is the dominant color breakdown across our favicons. This isn’t very accurate, unfortunately. I suspect that many multicolored favicons are getting lumped in with purple.
Writing or reviewing a sentence like this should make you reconsider. Either do the right analysis or remove it from your article. When you say your analysis is probably wrong and the results look weird, why publish it as is?
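For what it's worth, a slightly less hacky dominant-colour pass isn't much code; here's a crude sketch with Pillow's palette quantization (still a heuristic, but it should be less likely to smear multicoloured icons into one hue bucket):

```python
from collections import Counter

from PIL import Image

def dominant_color(path, palette_size=8):
    """Return the most common palette colour of an image as an (R, G, B) tuple."""
    img = Image.open(path).convert("RGB").resize((32, 32))  # alpha is simply dropped
    reduced = img.quantize(colors=palette_size)             # median-cut palette reduction
    index, _ = Counter(reduced.getdata()).most_common(1)[0]
    palette = reduced.getpalette()
    return tuple(palette[index * 3:index * 3 + 3])

print(dominant_color("favicon.png"))  # placeholder file name
```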