Lossy CSS compression for fun and loss (or profit) (blog.danieljanus.pl)
254 points by todsacerdoti 10 months ago | 43 comments



I think it'd be a lot more interesting if you could feed it information about how often each rule is used.

For example, add some code to your live site that periodically samples the DOM from a random subset of real users: for every element that's actually visible on the page, see which style rules apply to it. Count those up and you get a big histogram.

Then let that feed into the compression. The least-used style rules on obscure pages should be the most compressed. Find a way to have those "reuse" other style rules that are close enough. Your most important content would be the least degraded.
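
A rough sketch of what that client-side sampling might look like (browser TypeScript; the /css-usage endpoint, the 1% sample rate, and the crude visibility check are all placeholders, not anything from the article):

    // Sample which style rules currently apply to visible elements and beacon
    // the counts home, so they can be aggregated into a usage histogram.
    function cssRulesOf(sheet: CSSStyleSheet): CSSRule[] {
      try {
        return Array.from(sheet.cssRules); // throws for cross-origin stylesheets
      } catch {
        return [];
      }
    }

    function sampleCssUsage(sampleRate = 0.01): void {
      if (Math.random() > sampleRate) return; // only sample ~1% of page views

      const counts = new Map<string, number>();

      for (const sheet of Array.from(document.styleSheets)) {
        for (const rule of cssRulesOf(sheet)) {
          if (!(rule instanceof CSSStyleRule)) continue; // style rules only
          let matches: Element[];
          try {
            matches = Array.from(document.querySelectorAll(rule.selectorText));
          } catch {
            continue; // selector the engine can't parse (e.g. vendor-specific)
          }
          // Crude visibility check: the element occupies space in the layout.
          const visible = matches.filter((el) => {
            const box = el.getBoundingClientRect();
            return box.width > 0 && box.height > 0;
          }).length;
          if (visible > 0) {
            counts.set(rule.selectorText, (counts.get(rule.selectorText) ?? 0) + visible);
          }
        }
      }

      navigator.sendBeacon("/css-usage", JSON.stringify(Object.fromEntries(counts)));
    }

    sampleCssUsage();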


Author here – thanks! This is an interesting avenue of exploration that I hadn’t thought about. Exactly the kind of feedback I hoped for.

I’ll try and give it a shot.


Using Playwright you can get coverage data for the CSS used on each page. It's straightforward to build into E2E tests. Chrome will also tell you which rules were used on any given page.

That in turn could give guidance for how you could pattern-match rules even further.
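
For reference, Playwright exposes this through its Chromium-only coverage API; a minimal sketch, with the URL as a placeholder:

    import { chromium } from "playwright";

    // Collect CSS coverage for a single page load using Playwright's
    // Chromium-only coverage API.
    async function cssCoverage(url: string) {
      const browser = await chromium.launch();
      const page = await browser.newPage();

      await page.coverage.startCSSCoverage();
      await page.goto(url);
      const entries = await page.coverage.stopCSSCoverage();
      await browser.close();

      // Each entry carries the stylesheet text plus the byte ranges that were used.
      for (const entry of entries) {
        const usedBytes = entry.ranges.reduce((sum, r) => sum + (r.end - r.start), 0);
        console.log(entry.url, `${usedBytes}/${entry.text?.length ?? 0} bytes used`);
      }
      return entries;
    }

    cssCoverage("https://example.com");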


That's really not as simple as you make it sound. Chrome and Playwright record the rules that were used, but if you don't trigger every media query, a lot of useful CSS will be flagged as unused. Your Playwright test also needs to hover over every element that has a :hover rule, and print CSS will be discarded. You also have to keep in mind that Chrome isn't the only browser, so some browser-specific CSS will be ignored. Another example: if you have a rule like :nth-child() on a list of results, and the HTML you test with doesn't contain that element right now but an API call might return more results later, that rule won't get counted.

There are also some false positives: for example, if you set a variable twice and only read it later, both variable declarations are marked as used, even though the first one was never read and is therefore useless.

To my knowledge there's no tool that automates this process. I wish there were one.

The only one I found is doei [1], but it's not finished and it only tries a couple of hardcoded media queries. It's far from a simple problem, but I'm sure someone could do it.

[1] https://github.com/JamieMason/doei
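
To illustrate how much state you would have to drive manually, here's a sketch that unions coverage across a few emulated viewports and color schemes; it still misses :hover, print styles, browser-specific rules, and content that only shows up after later API calls (the viewport list and scheme list are arbitrary):

    import { chromium } from "playwright";

    // Union CSS coverage across several viewport sizes and color schemes.
    // Even this only counts rules for the states you explicitly drive.
    async function coverageAcrossStates(url: string) {
      const browser = await chromium.launch();
      const page = await browser.newPage();
      const used = new Map<string, Set<string>>(); // stylesheet URL -> "start-end" ranges

      const viewports = [{ width: 360, height: 740 }, { width: 1280, height: 800 }];
      const schemes: Array<"light" | "dark"> = ["light", "dark"];

      for (const viewport of viewports) {
        for (const colorScheme of schemes) {
          await page.setViewportSize(viewport);
          await page.emulateMedia({ colorScheme });
          await page.coverage.startCSSCoverage();
          await page.goto(url);
          const entries = await page.coverage.stopCSSCoverage();
          for (const entry of entries) {
            const ranges = used.get(entry.url) ?? new Set<string>();
            for (const r of entry.ranges) ranges.add(`${r.start}-${r.end}`);
            used.set(entry.url, ranges);
          }
        }
      }

      await browser.close();
      return used;
    }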


Tailwind does this through analysis of the source. Not sure why you'd need to do dynamic analysis. You want to spin up a Chrome browser on each build to occasionally shave off a few bytes from a gzip stream?


My genuine thought was to use it to establish some approximation of how you could, with better accuracy, compress/merge styles more tightly for greater re-use.


I’ve had this same thought for creating better chunking. There needs to be some sort of measuring component.


[flagged]


Are you a bot that just rewrites the comments you're replying to?


Probably someone using OpenAI with a prompt that says to respond as a Hacker News user


Sorry, I replied to the wrong comment


Still failing the Turing test.


Right from the top I thought "Why would I risk this breaking my site at random?"... until the last paragraph. There the author states that (outside of the exploration for exploration's sake benefit) there may be potential gain here in comparing the output from this tool to a CSS monstrosity that has not been well managed. I think that could be quite valuable.


This needs upvoting; it could well be a lifesaver for your typical WP install based on 3 competing CSS frameworks and about half a meg of bunged-together styles that were once generated by some forgotten Node.js script that has since vacated the premises. Bookmarked.


Not a free (as in speech) license by the way https://oql.avris.it/license?c=Daniel%20Janus


Some other ideas in this vein:

1. Approximate colours by reducing them to their three-digit hex codes.

2. Detect repetitive rules and use native mixins.

3. Uglify the CSS names, although you'd need to edit the HTML accordingly.


Also possibly finding similar colors that could be combined in rules instead of repeated separately.

`.a{color: #FFFFFF;} .b{color: #FFFFFE;}` ==> `.a,.b{color:#FFF;}`
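
A sketch of that kind of merge: round each six-digit hex colour to its nearest three-digit form, then group selectors whose colour becomes identical (restricted to the color property purely for illustration):

    // Round 6-digit hex colours to their nearest 3-digit form (#FFFFFE -> #fff),
    // then merge selectors that end up with the same colour into one rule.
    function shortenHex(hex: string): string {
      const m = /^#([0-9a-f]{6})$/i.exec(hex);
      if (!m) return hex;
      const nibble = (i: number) =>
        Math.round(parseInt(m[1].slice(i, i + 2), 16) / 17).toString(16); // 0x11 steps
      return `#${nibble(0)}${nibble(2)}${nibble(4)}`;
    }

    function mergeColorRules(rules: Array<{ selector: string; color: string }>): string {
      const groups = new Map<string, string[]>();
      for (const { selector, color } of rules) {
        const key = shortenHex(color);
        groups.set(key, [...(groups.get(key) ?? []), selector]);
      }
      return [...groups.entries()]
        .map(([color, selectors]) => `${selectors.join(",")}{color:${color}}`)
        .join(" ");
    }

    // mergeColorRules([
    //   { selector: ".a", color: "#FFFFFF" },
    //   { selector: ".b", color: "#FFFFFE" },
    // ]) === ".a,.b{color:#fff}"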


And on the next pass, class .b can go, since we already have .a.

Then, if that is really all it does, it can be called .white.

On the next round trip it can check whether multiple .b elements share a parent with no other font colors.

.white then becomes #wrapper or body.


Same goes for embedded fonts


Sounds like begriffs' (the PostgREST author) CSS Ratiocinator

https://github.com/begriffs/css-ratiocinator


This I can get behind. It takes out the redundant or poorly structured rules that accumulate over time when writing CSS. It doesn't modify the style.


If you could also incorporate precedence rules you could get some more reduction. e.g. for the h1,h2 example, you'd have a selector for `h1,h2` (which is essentially the full h2 rules) and another for `h1` that overrides the font-size. Then when needing to "reduce" rules you could select for "smaller" rules and the only loss would be that h1 and h2 have the same size.

To do that I think you'd need to do the factorization on the CSS properties alone, and then apply the values in a predetermined order. But it would be cool/fun to test out!
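
A toy sketch of the override idea (not the article's algorithm): put one selector's full declarations into the grouped rule, then emit a later rule of equal specificity that overrides only what differs, so dropping that override in a lossy pass just leaves h1 with h2's size.

    type Decls = Record<string, string>;

    // Group one "base" selector's full declarations over all selectors, then
    // emit later override rules for whatever differs. Later rules of equal
    // specificity win, so source order carries the precedence. If an override
    // is dropped in a lossy pass, the element just falls back to the grouped
    // value (e.g. h1 ends up with h2's font-size).
    function groupWithOverrides(base: string, rules: Record<string, Decls>): string {
      const selectors = Object.keys(rules);
      const block = (d: Decls) =>
        Object.entries(d).map(([prop, value]) => `${prop}:${value}`).join(";");

      const out = [`${selectors.join(",")}{${block(rules[base])}}`];
      for (const sel of selectors) {
        if (sel === base) continue;
        const diff: Decls = {};
        for (const [prop, value] of Object.entries(rules[sel])) {
          if (rules[base][prop] !== value) diff[prop] = value; // override only what differs
        }
        if (Object.keys(diff).length > 0) out.push(`${sel}{${block(diff)}}`);
      }
      return out.join(" ");
    }

    // groupWithOverrides("h2", {
    //   h1: { "font-size": "2em", "font-weight": "bold" },
    //   h2: { "font-size": "1.5em", "font-weight": "bold" },
    // }) === "h1,h2{font-size:1.5em;font-weight:bold} h1{font-size:2em}"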


My snarky frontend response is "devs will do anything to avoid fixing their CSS," but truly the task is too herculean to bother with on most timescales. (You think "Untangle and reduce CSS bundle by X% with 0% improvement to any real KPI" is a ticket that's ever going to get prioritized?)

As others have said, this might be an interesting way to start zapping bloated CSS assets on aging codebases.


> Why not? The sheer joy of exploration is reason enough.

Amen


> CSS codebases have the tendency to grow organically and eventually start collapsing under their own weight,...

This is true if you don't co-locate CSS with the code that uses it. Remove the code that imports the modular CSS and you remove the CSS. Interesting project nonetheless (even written in Clojure).


Not wrong, but the main "collapsing" problem I've observed isn't dormant CSS that isn't used, but code which, whether legitimately used or not, is not scoped to the component it supposedly belongs to (and may sit alongside it in the source tree), interacting with or cascading to other elements unexpectedly. It seems like something you'd just solve by convention, wrapping SCSS around whole files and demanding it of your front-end developers, but I've still never seen it done thoroughly. Too many broad rules slip in there. I guess that's why styled components are popular, as inelegant a solution as that is :(


While the concept (lossy compression for things other than bitmaps and audio signals) is interesting, CSS is a pretty bad use case, although at least it still outputs something that works, which wouldn't be true for most program code.

E.g. ML training datasets. Hugging Face is filled with interesting stuff, often gigabytes in size. I'm sure skimming out the parts that are redundant or less useful to ML training algorithms could offer a nice catalogue of datasets that are quick to download and clean up. Image sets for ML: dropping irrelevant channels and whatnot. Weather forecasting or historical dumps, etc.


I'd like to see this coded into a browser extension, just to have the experience of seeing whether I notice it affecting sites at various lossiness levels.


Wait, I can't believe no one has pointed out that this wouldn't work with very simple cases like:

    a { ... }
    article a { ... }
    .card a { ... }


The article doesn't mention whether structure is preserved, which matters because if it's eliminating the "wrong" styles (the ones affecting structure), then everything else it's doing right is irrelevant.


> The program only works on style rules (which make up the majority of a typical CSS). It leaves the non-style rules unchanged.


I had expected this to tweak colors or margins to make them compress better when passed through brotli or gzip. What was done instead is cool too.
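
That reading of "lossy" is also easy to sketch: snap declaration values onto a coarser grid so more declarations become byte-identical and the deflate/brotli back-references get longer (the 4px grid and the 3-digit colour rounding are arbitrary choices, not from the article):

    // Snap pixel lengths to a 4px grid and hex colours to their 3-digit form,
    // so that more declarations become byte-identical and compress better.
    function quantizeValue(value: string): string {
      return value
        .replace(/(\d+(?:\.\d+)?)px/g, (_, n) => `${Math.round(parseFloat(n) / 4) * 4}px`)
        .replace(/#([0-9a-fA-F]{6})\b/g, (_, hex: string) =>
          "#" + [0, 2, 4]
            .map((i) => Math.round(parseInt(hex.slice(i, i + 2), 16) / 17).toString(16))
            .join("")
        );
    }

    // quantizeValue("margin: 13px 7px; color: #fefefe")
    // === "margin: 12px 8px; color: #fff"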


wow this sounds incredibly useful


Come on this is so stereotypically developer. It's a fun premise and all they had to do was implement it on a test page and screenshot the before/after. Something, ANYTHING to demonstrate visually the end result. Sigh.


Did you see all of the links below the text "here's how the page looks with various settings"?

It's admittedly a pretty trivial page to test it on, but it works.


Seems interesting, but for a CSS post, I would really expect some screenshots showing the difference


There are links to live examples of different compression, and the source code. That's as good as (perhaps even better than) screenshots in this case.


The author stopped just shy of showing the results of the example he provided.

I think this could be a useful tool in the IDE. Imagine vscode highlighting a line and suggesting another place to put it instead. Neat.


There are literally five live examples, along with ground truth, linked in the article, formatted as a bullet list. What more could you want?


The "more" that I wanted was for the conclusion to the small worked example to be shown. That is, what do the computed factor matrices B and C look like as CSS rules, for this tiny example? Does the A' that they multiply to equal the original A matrix?


This is exactly what I meant.

He started building a small, straight-to-the-point example but didn't show the result.


see 'If you just want to see some results, here is a sample with my homepage serving as a patient etherized upon a table'

https://danieljanus.pl/index30.html vs https://danieljanus.pl/


I don't understand what you're expecting but not seeing.


The "more" that I wanted was for the conclusion to the small worked example to be shown. That is, what do the computed factor matrices B and C look like as CSS rules, for this tiny example? Does the A' that they multiply to equal the original A matrix?




