I wouldn't recommend doing what the author suggests. crass doesn't squeeze out extra bytes when you reprocess its output because it already does this for you.
If you combine multiple minifiers, the bugs in one minifier can end up being amplified by the others. For instance, one minifier might not consider !important when moving around CSS. Another minifier might then take that output and perform an optimization that's no longer just unsafe, but now incorrect. It might even delete rulesets that it believes are not used, ruining your stylesheet.
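Here's a contrived sketch of the kind of compounding I mean (both passes are hypothetical, not any particular tool):

    // Hypothetical passes for illustration -- neither corresponds to a real minifier.
    const original = [
      ".warn { color: blue; }",
      ".warn { color: red !important; }", // renders red: !important wins regardless of order
    ].join("\n");

    // Pass 1 (unsafe): reorders rulesets without tracking !important.
    // The output happens to still render red, so nothing looks broken yet.
    const afterPass1 = ".warn{color:red!important}.warn{color:blue}";

    // Pass 2 (now incorrect): merges duplicate selectors assuming "later declaration
    // wins". On the original ordering that would have kept the !important declaration;
    // on pass 1's output it throws it away.
    const afterPass2 = ".warn{color:blue}"; // renders blue -- the stylesheet is broken

    console.log({ afterPass1, afterPass2 });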
There are suites that test the correctness of minifiers, and many don't do great. CSS still "works" when the syntax is invalid, so minifying an invalid stylesheet gives undefined results. Between bugs and undefined behavior, I wouldn't recommend mixing and matching just to save one or two TCP packets, especially with gzip/brotli on the wire.
> I don't recommend using minifiers which contain bugs
I recommend not using minifiers. Or compilers, for that matter. Or really any software. ;)
> or do unsafe transformations (a form of bug IMO)
I think it depends. In crass, unsafe transformations are only performed when you pass a special flag to the API or CLI. The constraints and effects are documented with the premise "If you're not doing X, Y, and Z, we'll squeeze out a few more bytes". X, Y, and Z are things like relying on browser bugs or only using deprecated features (e.g., using -webkit-border-radius without border-radius). Folks using CSS preprocessors (SCSS, stylus, LESS, etc.) oftentimes don't have CSS that could cause issues, and this option is safe enough for them (and provides a hefty benefit).
The problem with CSS is that you can guarantee many things as safe, but because of the flexibility of the language, it's impossible to minify "perfectly". Having knobs and levers for the developer to say "I'm not supporting IE9 or below, yes you're looking at ALL of my CSS, no I don't care about browsers older than three years..." gives much better results--especially on large projects--by trimming down the number of variables that could prevent different sorts of "optimizations".
So you don't recommend using a minifier? In all seriousness, I don't think anyone is recommending using one with known bugs but there will likely always be some latent bugs that aren't known.
Reprocessing one minifier's output with itself is probably fine, since you're not mixing and matching approaches to minification. It's likely that a minifier is internally consistent in how it minifies and simply doesn't do multiple passes.
Your time would be far better spent, though, making a pull request to properly implement multiple passes on your minifier of choice so that you can safely get that benefit. Or you could say "meh, I'm not losing sleep over the extra hundred bytes" and work on something more productive.
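For what it's worth, the multiple-pass loop itself is tiny. A sketch, with minify() standing in for whichever tool you actually use:

    // Sketch: run one minifier on its own output until it stops shrinking.
    // `minify` is a stand-in for your tool of choice, e.g. something along the
    // lines of css => crass.parse(css).optimize().toString() (check the real API).
    function reminify(css, minify, maxPasses = 10) {
      for (let i = 0; i < maxPasses; i++) {
        const next = minify(css);
        if (next.length >= css.length) return css; // reached a fixed point (or no gain)
        css = next;
      }
      return css;
    }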
I'm an academic. I get very unwell if I'm not pompous at least half of the time. I also had two days to put this together, from conception to submission here. I could have been more innovative, but given the time limit...
The best part was that the whole description is technically correct.
You should apply this to PNG optimization: a combination of optipng, pngout, pngcrush, DeflOpt, and advpng.
It's a bit more complicated since you have to also vary the command line arguments you pass to the program. (Different color types can sometimes help, as can different filters.)
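Scripted, it's roughly this shape (the flags here are only illustrative; check each tool's docs):

    // Sketch only: try a few PNG optimizers and keep the smallest output.
    const { execFileSync } = require("child_process");
    const { copyFileSync, statSync } = require("fs");

    const attempts = [
      ["optipng", ["-o7", "work.png"]],
      ["pngcrush", ["-brute", "-ow", "work.png"]],
      ["advpng", ["-z", "-4", "work.png"]],
    ];

    let best = Infinity;
    for (const [tool, args] of attempts) {
      copyFileSync("input.png", "work.png");            // fresh copy per attempt
      try {
        execFileSync(tool, args, { stdio: "ignore" });  // tool rewrites work.png in place
      } catch (e) { continue; }                         // tool missing or failed
      const size = statSync("work.png").size;
      if (size < best) { best = size; copyFileSync("work.png", "best.png"); }
    }
    console.log("best:", best, "bytes");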
The funny thing about the software I wrote is that you can have it do that too. You just need to define what a minifier is. I do permit command line parameters.
But honestly, just use pngzopfli as post-processing and pngquant as pre-processing.
That's pretty old tech. So old that many existing software packages do that for you; e.g., ImageOptim for OSX will run all of Zopfli, PNGOUT, OptiPNG, AdvPNG and PNGCrush behind a nice drag-and-drop interface.
Or it's because HN's commenting system is a trash fire and whoever coded it decided they couldn't be arsed to implement escapes, quotes, lists or fucking tables but just had to strip out arbitrary symbols, and obviously U+2495 NUMBER FOURTEEN FULL STOP (⒕) is important but god forbid anybody use U+262C "ADI SHAKTI" (<should be here but HN strips it out, see https://en.wikipedia.org/wiki/Khanda_(Sikh_symbol) >), or even better, U+2299 "CIRCLED DOT OPERATOR" (⊙) makes the cut but U+2609 "SUN" (<also a dot inside a circle, just a different one>) is banned from being used or discussed.
I knew HN stripped symbols pretty arbitrarily, so I went through a few non-letter Unicode blocks (Enclosed Alphanumerics[0], Mathematical Operators[1], Misc Symbols[2]) and checked which ones HN strips.
Font, sadly. There have been proposals for e.g. a font variant to select a monochrome vs. color version, but I don't think any of them stuck. So you get whatever your font does.
There is a CSS draft, I think, but the Unicode 'patch' for this is already real and it works 100% reliably in Safari and with varying and limited success in other browsers, which hopefully will climb aboard the clue train eventually. Just append code point FE0F (Emoji) or FE0E (Symbol) to the ambiguous character.
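For example, in JS (rendering of course still depends on the browser and font):

    // U+2764 HEAVY BLACK HEART has both variation sequences defined.
    const heart = "\u2764";
    console.log(heart + "\uFE0E"); // VS15: request text (monochrome) presentation
    console.log(heart + "\uFE0F"); // VS16: request emoji (color) presentation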
The story is deeper than just 'font' being the deciding factor, though.
The key thing to know is that a lot of emoji, especially many of the 'first' ones, were actually just existing (black & white) symbols that were mapped to new, color pictograms on emoji-supporting devices. This might sound like a nice backwards-compatible path, but it was, is, and will always be a disaster!
The Unicode Consortium stupidly, naïvely, regrettably dictated that such characters should render as emoji "in the mobile context" but as symbols elsewhere. Half a moment's thought shows the folly of this incredibly dumb proposal. Say I text you an emoji. Great. Mobile context. And then you copy my text and paste it into an email from your phone to your mother. Should the email look dramatically different depending on whether she checks it on her phone or on a desktop? Madness. On what planet does it make sense for a writing system to incorporate the viewing device into its decoding algorithm?
The ambiguity is resolved differently, even character-to-character in the same browser on the same device, as my tiny three-character test suite reveals.
Upsettingly to me, this means that a lot of documents written before emoji existed, or even written today on a non-emoji device such as a Windows 7 box, HAVE BEEN RETROACTIVELY AND PERMANENTLY BROKEN by the Unicode Consortium. Because, let's not kid ourselves: The meaning and tone of a document absolutely changes when one symbol is replaced with a conceptually similar but decidedly different cartoony pictogram.
I worry that this is the chaos that comes from folks not understanding compiler theory. This is the result: every minifier is a compiler, yet none of these minifiers boast properties like idempotence, and none of them are sufficiently correct to achieve their goal on the first try. I don't necessarily expect scholarly papers, but I would never expect a minifier to improve on its own output when run multiple times in a row!
It's not that rare for optimizers to be able to find more optimizations in a subsequent pass. For example, it's often true (for at least small gains) in assembly peephole optimizers. Once you've applied one round of peephole optimizations, which are basically just simple pattern-matching replacing instruction sequences with better ones, the resulting, improved instruction sequence may now have new sequences that the peephole optimizer could improve again if run a second time. You could try running this iteratively until some kind of fixed point if compilation time were no issue... although, it might also have cycles! That is often the nature of these kinds of pattern-matching, heuristic optimization passes. That's especially likely if you mixed-and-matched from different compilers, like running optimization passes from gcc, clang, and icc interleaved with each other. Though that example would be difficult to do in practice, because they aren't source-to-source optimizers, and have totally different IRs, so can't easily be chained in the way these CSS-to-CSS optimizers can be chained.
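A toy illustration (nothing to do with any real compiler): one removal can make a new removable pair adjacent, which a strict single left-to-right scan only sees on the next pass.

    // Toy single-pass peephole: delete adjacent "push X" / "pop X" pairs in one
    // left-to-right scan over the input, without re-examining earlier output.
    function peepholeOnce(instrs) {
      const out = [];
      let i = 0;
      while (i < instrs.length) {
        const cur = instrs[i], next = instrs[i + 1];
        if (next !== undefined && cur.startsWith("push ") && next === "pop " + cur.slice(5)) {
          i += 2; // drop the matched pair
        } else {
          out.push(cur);
          i += 1;
        }
      }
      return out;
    }

    const prog = ["push a", "push b", "pop b", "pop a", "ret"];
    console.log(peepholeOnce(prog));                // ["push a", "pop a", "ret"]
    console.log(peepholeOnce(peepholeOnce(prog)));  // ["ret"] -- the second pass finds more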
This is one of the problems with IT: it has too little science behind it. A lot of things are built based on common sense, tradition, and chutzpah. Much like building was in medieval times. No wonder things often crash and take a hard-to-predict time to build.
Some things can be over-engineered but under-scienced, so to speak, at the same time.
A lot of great computer science already exists, and is applicable, and some of it is taught at universities. But it's not required for a CRUD app or for a flappy bird clone. And when you grow up enough to want more complex things, a lot is already forgotten.
I have, I think, a different view: there is too much CS and not enough engineering in IT. The numerous CS grads have a bag of hammers as their toolset and so every engineering problem is a DS&A "nail." At least, here in the valley it seems that way. Outside of it there is at least in my experience much more of a nod toward software development as an engineering endeavor rather than as an exercise in pure CS.
Yes, the theoretical / scientific part does not percolate enough into engineering practice. Thus the practice is starving for a good conceptual view, while CS may be lacking in the engineering department and thus in practicality.
Guess it really depends. Maybe the frontend landscape can be a bit more chaotic since JS started on really shaky ground and so many new things come out all the time, but I'd say programming in general is still quite rigorous, especially if you're a big organization and can afford the time and resources to apply all the best software engineering practices with a whole team. Even if not, modern developers generally do quite well. Actually, hasn't the problem people have been complaining about centered on the presumed disconnect between CS academia and actual programming, about how a lot of graduates simply can't code a simple task? I'd say a lot of theoretical CS is just not that easily applicable to actual engineering work. There's a reason people divide science and engineering; the latter is much more about ingenuity and practicality in a lot of senses. They're not the same thing, and it would be really harsh to classify the whole field as some "medieval architecture" affair. Not to mention there are still plenty of great works of medieval architecture!
Yep. The first minifiers were just some regexes doing search and replace. Undoubtedly many nowadays are full tokenisers that build an AST, but I do suspect there's still plenty of room for improvement.
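For the curious, the early regex style looks roughly like this, which is also a demo of why an AST matters (sketch, not any particular tool):

    // Naive regex "minifier" of the early search-and-replace sort -- and a demo of
    // why it's fragile: it happily mangles whitespace inside strings too.
    function naiveMinify(css) {
      return css
        .replace(/\/\*[\s\S]*?\*\//g, "")   // strip comments
        .replace(/\s+/g, " ")               // collapse whitespace
        .replace(/\s*([{}:;,])\s*/g, "$1")  // drop spaces around punctuation
        .replace(/;}/g, "}")                // drop trailing semicolons
        .trim();
    }

    console.log(naiveMinify('.a { color : red ; }  /* note */  .b { content: "a  b"; }'));
    // -> .a{color:red}.b{content:"a b"}
    //    (the "a  b" string got collapsed -- an AST-based minifier would leave string
    //     contents alone)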
If you really want to get fancy you could use Selenium to get screenshots and compare them to check that the remynified CSS produces the same layout as the original CSS.
To add to that point, if you manage to get screenshots you can do an image comparison using perceptual hashing [0]. The idea is to compute hashes of images and compare those, rather than scanning through each pixel.
Note that perceptual hashes deliberately don't have the avalanche effect, so a minor difference between the two images yields only a small change in the hashes (desirable in this case). Then accept results that are within some small % error margin.
An example of a library that does this is jimp [1].
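If I remember the jimp API right, that's only a few lines (double-check against its docs; file names and thresholds here are made up):

    const Jimp = require("jimp");

    (async () => {
      // Assumes before.png / after.png are screenshots of the same page rendered
      // with the original and the remynified stylesheets.
      const [before, after] = await Promise.all([
        Jimp.read("before.png"),
        Jimp.read("after.png"),
      ]);
      const hashDistance = Jimp.distance(before, after);  // perceptual-hash distance, 0 = identical
      const pixelDiff = Jimp.diff(before, after).percent; // fraction of differing pixels
      console.log({ hashDistance, pixelDiff, ok: hashDistance < 0.05 && pixelDiff < 0.01 });
    })();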
Transforming .a-really-long-class-name-or-id into something shorter, like .x would save a lot more bytes.
Another thing is stripping unnecessary properties, which can save even more bytes, but with web apps becoming so dynamic these days, that would be hard, or really awkward for developers to work with.
This isn't something a minifier could do without also modifying HTML and JS, and since a lot of JS automatically generates CSS names (something like $(`.list-${i}`)), this would be nearly impossible.
The programmer could shorten class names when writing, but is the extra half a millisecond worth having to work with x, y and z as classes rather than more explicit names?
> This isn't something a minifier could do without also modifying HTML and JS, and since a lot of JS automatically generates CSS names (something like $(`.list-${i}`)), this would be nearly impossible.
The other option is what CSS Modules does: your scripts essentially "require" class names from your stylesheets, so as the class names in your stylesheet are minified, so are their JS references.
While I am not aware of any open source framework doing this, as long as the programmer sticks to a few basic rules, it should be possible.
If all HTML/JS/CSS is generated by a template language, the CSS classes are kept easily parsable, and you avoid dynamic class names in JS (data attributes are better for that anyway), you can just replace all classes with random ids.
That's only a secondary reason, the main reason for this is to avoid having to deal with CSS' "everything is global" flaw.
CSS Modules [1] and other tools allow you to require CSS like you would with any other dependency, and replace named classes by unique hashes at build time. This prevents your CSS from leaking to elements that do not explicitly require it.
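For anyone who hasn't seen it, usage looks roughly like this (assuming a bundler with CSS Modules enabled, e.g. webpack's css-loader; names are illustrative):

    /* button.module.css:
       .primary { background: blue; color: white; }
    */

    // button.js -- the class name is imported rather than hard-coded, so the build
    // step can rename ".primary" to a hashed (or minified) name in both places.
    import styles from "./button.module.css";

    export function makeButton(label) {
      const el = document.createElement("button");
      el.className = styles.primary; // e.g. "button_primary__x1z9q"
      el.textContent = label;
      return el;
    }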
I really can't tell if this is satire. But either everyone is in on the joke and out to get naive people like me (in that case, well played) or Hacker News does not seem to think so.
I don't think there is a single use case, at any scale, anywhere in the world, where saving up to 17% on a minified CSS file before gzip would matter in the slightest, let alone while adding 20 minutes to your build cycle :)
This discussion reminds me of this exploration into compressing JSON data. Unless you're considering the final gzipped size, you're doing it wrong:
"In short, if you try to eliminate redundancy by applying some kind of clever transformation or restructuring your data object before piping it to gzip, you will probably just do a crappy job of reimplementing gzip at the wrong layer of abstraction, and get worse results overall." -- https://github.com/wincent/relay/blob/hack/contrib/lightweig...
I'm more a fan of writing compact, clean, logical css in the first place.
A couple months ago I worked on a project and re-wrote a 129k minified css file someone created as a clean un-minified 12k css file that had 100% of the original functionality plus some additional UI improvements.
You can only get these improvements if you understand what you are writing and stop using sass to write bloated files.
It sounds like you don't understand what a minifier is.
A minifier takes your css code and removes the noise from it. It does so by removing whitespace, comments, duplicate rules and redundant properties.
What it sounds like you're really complaining about is using sass (which has a compressed output), because it removes a layer of abstraction and makes it easy to shoot yourself in the foot. This is true of almost anything and everything.
`rm bootstrap.css` look, a 100% reduction! And it led to a better website.
`gzip goodfile.css` and there's an improvement several times more effective than even the best minifier. And it keeps your source code legible in the browser and doesn't require a slow/buggy asset pipeline to test changes.
Yes, yes, I know minify+gzip can save like an additional 1% over what gzip alone does. To me, that's just not worth the cost to the developer.
You're sacrificing personal efficiency for bandwidth efficiency. You get to decide which is the best option for you and your users.
I generally share the opinion that most people using bootstrap probably don't need to. I'd recommend questioning it at the start of every project instead of assuming it's what a project starts with.
In my anecdotal experience using bootstrap correctly with enduser efficiency in mind is the exception, not the rule.
>Reflects poorly on those people then, not Bootstrap.
Yes.
>Of course, who wouldn't plan their project ahead of time?
Again, in my experience: MANY people using bootstrap
> The ones that don't need breadcrumbs?
> // @import "bootstrap/breadcrumbs";
> ...aaand we're done.
My comment was pointing out an inherent flaw in the popularity of bootstrap (which is no fault of their own): it's often used improperly and has led to MASSIVE amounts of resources being wasted globally.
My comment was made in the hope that someone who hadn't previously thought about WHY they are using bootstrap would think about it a little more at the start of their next project.
You decided to respond by essentially fulfilling the stereotype of "condescending IT guy."
I'm happy that you know how to use bootstrap as intended, and I apologize if my comment upset you... but I don't see the need to be a dick about it.
I'm sorry you took offense, I'm not sure what you found condescending.
If someone uses a hammer instead of a screwdriver we don't blame the hammer.
>in my experience: MANY people using bootstrap
Sure, but you said "YOU'RE" sacrificing efficiency. If you want to move the goalposts to "using Bootstrap incorrectly is inefficient", it will be a different discussion.
I agree, we should use tools correctly, but the comment I responded to (and the crux of the discussion) revolves around the suggestion that removing Bootstrap "led to a better website".
With modern front-end frameworks you only include the parts you need. No one in their right mind would include all of it. You only include the code for the modal if you need it.
One of the main points of CSS optimization is not transfer size, but the time it takes the CSS engine to parse a CSS file and figure out where to apply those rules to the DOM.
I think that while the described "reminification" method may not make sense in practice (as others have said, running the minification process multiple times will amplify minifier bugs), optimizing CSS (removing unused rules, grouping similar rules, optimizing selectors, etc.) does make sense, regardless of how small the reduction is compared to compression.
Great work! There are several directions in which you could make this more substantial. For one, it would be interesting to see the marginal benefit of iterative minification across iterations (it could be shown in a graph). Does Remynification approach an asymptotic size, or will further minification actually increase the size instead of decreasing it?
Also, it would be interesting to take screenshots using Selenium, as suggested by sbierwagen, and see whether Remynification actually preserves the initial layout.
Lastly, it would be interesting to see a theory proposed as to why such a method works, and what future minifiers can learn from it.
I would be extremely curious to find out what modifications were made by the minifiers in round 2 and onwards that they couldn't have caught the first time...?
JS or HTML I suppose I could see, because they're more complicated, but CSS by itself is very simple, so I'm wondering what actually got left out.
Do you have any diffs or info on what it was removing or doing?
No, but you can do this yourself quite easily. I am way too tired and sleep deprived to do more on this for at least a couple of days. Even then I have other priorities.
It'll be smaller because gzip can only losslessly compress what is there, but minifying first can lossily remove content like comments, or shrink it with conversions like "0px" to "0".
I found that rearranging by desired output seemed to work better, especially once you consider that compression works best on repeated longer strings. With minification as it stands you're forced to have the longer strings once, with the short strings (the CSS properties) repeated a lot.
This is exactly backwards from what you want. You want the short strings to appear infrequently, and the longer strings a lot.
CSS reset sheets are a bad example for this kind of thing, as they're already strongly sorted by desired output property, but for general CSS with a lot of components, for example, or a sheet with page-template-specific styling, it seems like it should minify and gzip a lot better.
Plus, you can group your CSS by relevant section, i.e. keep your colours separate from your alignment, from your fonts, etc.
Except the problem is that doing it this way requires some rearranging of the rules, which may cause trouble in a fairly small number of cases, so that's why it's out as an automated way of minifying things.
Does your "reminified" CSS actually compress better under GZIP than any of the independent minifiers? The variance you were seeing looks to be irrelevant in comparison.
I really should be doing something else, but this is too interesting.
It doesn't yet, but it could. Keep in mind that, in the worst case, I get to pick which of the four individual minifiers I run, so I will always at least be able to match the best one.
There is absolutely nothing that prevents me from running zopfli over the final outputs and comparing the sizes. I can use the compressed size to drive the search; it's just a change in the metric used to decide what counts as an improvement.
I'll add this to my todo list on this project. Looks like I may be doing more on this than I originally intended...
And while you're at it, load it in PhantomJS and measure parse and render time. Then try to figure out what weight to give bandwidth vs. CPU performance, realize other things can hog the CPU, get an ice cream, and give up.
"It took 25 extra minutes to save this 261 bytes. Worth it? You decide."
I think I've decided it's obviously not worth it, even if we built a time machine and sent the CSS back to 1955, when they'd appreciate a 261-byte difference. (Granted, they had no use for CSS in 1955.)
I totally read the title as "Remnification" and "Remnyfication". I didn't see the right spelling until I got to section 5! "OH! RE-minification"... ;)
Very nice, this is a fun project and a nice write up. I would definitely worry about lossy minification on production code, I've bumped into many minifier bugs that broke my valid CSS.
Also, pretty sure you could get Bootstrap.css down to a couple k-bytes and really truly pwn the file size leaderboard if you could dead-strip all the rules not actually referenced in your HTML & JS.
This is a good point. On any sizable site, especially one that has evolved over time, you can't really remove CSS; it just gradually accretes more and more. But most of the rules aren't even used on most pages. Being able to first remove anything not used, then remove anything that gets overridden in the cascade, could reduce CSS down to a tiny fraction of its full size (never mind comments and spacing). But would that give better results than loading a big, bloated, wasteful CSS file once and caching it for the entire site (or many sites, in a case like CDN'd Bootstrap)? Either way is one form of bloat and waste or another.
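As a crude first pass you can even eyeball this from the browser console: walk the loaded rules and list the selectors that match nothing on the current page. It only proves "unused on this page, right now" (it misses other pages and classes added later by JS), but it gives a feel for how much is dead.

    // Rough browser-console sketch, not a safe-to-delete analysis.
    const unused = [];
    for (const sheet of Array.from(document.styleSheets)) {
      let rules;
      try { rules = Array.from(sheet.cssRules); } catch (e) { continue; } // cross-origin sheets
      for (const rule of rules) {
        if (!(rule instanceof CSSStyleRule)) continue;
        try {
          if (!document.querySelector(rule.selectorText)) unused.push(rule.selectorText);
        } catch (e) { /* selectors the engine can't query, e.g. some vendor prefixes */ }
      }
    }
    console.log(unused.length, "selectors with no match on this page", unused);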
The problem with such a tree optimization is that you will only find a locally best solution.
If compressor1 generates the smallest result in step 1, all the other compressors will only try to minimize that result. But maybe compressor1 did something the other compressors are not optimized for, so you'll only find a locally best solution for a starting point of compressor1.
Maybe it would have been better to start with compressor3, because its result after step 2 is smaller than if you had started with compressor1.
This just makes no sense. If somehow the minimizer wasn't able to process it further, maybe it's just because it was already quite thoroughly processed, and you will lose some information if you continue? That's a totally unsound approach. What do those extra few bytes of savings do for loading time anyway? Probably very little, yet you risk very real degradation and broken behavior.
Not to mention the author even admitted to not being very proficient in CSS and doesn't even want to learn JS because he "dislikes it too much"... The whole description of the process is basically dragging something out of very little substance.
Good laughing material in general, though, just as he acknowledged at the end of the article, I guess.
> I handle crashing minifiers, as well as ones that loop forever.
It could actually be useful to know the exact CSS fed to the minifier that crashed/got stuck. One could check with a CSS validator whether it's correct CSS in the first place (if not, one of the previous minifiers screwed up), and if so, inform the minifier's maintainers that their tool crashes on this particular valid input.
> It could actually be useful to know the exact CSS fed to the minifier that crashed/got stuck.
I was actually thinking this could be used to speed up computation a bit too. If you're going to add a set of minifiers to the end of the chain, caching the intermediate results (really just the last one) would let you avoid reprocessing from scratch each time you need to do cssnano | cssnano | csso | *. I don't know if it does this now, but the description didn't talk about it either. It would also let you look at the chain and find where a mistake propagated from.
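Something like memoizing on the chain prefix would do it (runMinifier here is hypothetical, and this assumes a single input stylesheet):

    // Sketch: cache the output of each chain prefix so that, say,
    // "cssnano|cssnano|csso|*" reuses the stored "cssnano|cssnano|csso" result.
    const cache = new Map();

    function runChain(names, css, runMinifier) {
      let key = "";
      for (const name of names) {
        key = key ? key + "|" + name : name;
        if (!cache.has(key)) cache.set(key, runMinifier(name, css));
        css = cache.get(key);
      }
      return css; // the cache also keeps every intermediate result, handy for bug hunting
    }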
I wonder what the impact of (re|)minification is on actual network performance.
For example: consider a minified file x, which is likely getting gzipped when served. After gzip, is the reminified version smaller than the original? The same size? Bigger?
You might expect the obvious answer, but did anyone do the actual measurements?
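That measurement is a few lines with Node's zlib, if anyone wants to check their own files (file names here are just placeholders):

    // Quick check: compare gzipped sizes of the original vs. the (re)minified files.
    const { gzipSync } = require("zlib");
    const { readFileSync } = require("fs");

    for (const file of ["original.css", "minified.css", "reminified.css"]) {
      const raw = readFileSync(file);
      const gz = gzipSync(raw, { level: 9 });
      console.log(file + ": " + raw.length + " bytes raw, " + gz.length + " bytes gzipped");
    }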
I'm curious though, how do these minifications, or CSS minifications in general, affect runtime performance? Reducing the file size is great, and a goal in itself (though using brotli instead of gzip, where it's supported, matters more), but another important goal is rendering the page fast. Some CSS selectors are faster than others; is this something minifiers take into account? Or is it not significant at all?
Changing selectors would be an unsafe transform (how could the minifier know that the new selector matches the DOM?), and it's not related at all. Even if it were, selector speed isn't really relevant; the changes are way too minuscule.
Yes, of course, but to me this phrase means: "I conducted the experiment and you can reproduce it, but it can run forever and break your stuff". That hardly sounds like an enticing introduction.
It will break stuff. It will run forever. The software that I committed works in theory, but I don't think you should actually deploy it in its current form.
If you aren't interested in some playful carnage on a slow news day, no worries. Different strokes, eh? :)
What the phrase actually means is "I conducted the experiment and you can reproduce it, but you shouldn't use it in production".
Have you considered that you're not the target audience? Clearly people find this interesting. You might too, if you actually read the article. Also, if you're browsing Hacker News expecting to only see 100% bug-free tools suitable for immediate deployment, I have some bad news for you...
Of course people find this interesting, the post has 140 points. I just posted my negative reaction, that's it. This is what people do here in HN all the time :-)
Looking at modern websites, very few use HTTP/1.1's Accept-Encoding: gzip.
This alone could "minify the served CSS" a lot :)
(OK, on the client side it will always end up just as big, but the server is often paying for its traffic, and the client may be too, so it reduces traffic on the wire and means great savings.)
For public static websites, the savings from gzip compression totally justify not using HTTP/2 when traffic matters.
Now rewrite all your selectors in optimum precedence and specificity for file size.
After that, remember that it will all go through gzip, so let's see how reordering properties affects compression.
File size is not necessarily better performance. On today's hardware I believe (but I have not tested this...) that opening a zipped 1 MB file is often slower than just opening it directly, since the latter is basically just shoving it into RAM.
On the contrary, reading over the network or from disk (no matter how fast and modern) is still comparatively slow and is nearly always the bottleneck.
Most non-embedded CPUs are more than fast enough to decompress every bit they get from the network or disk without decompression becoming the new bottleneck. There are other factors like latency, seeking, the minimum block size, and it always depends on the application, but generally speaking reading and writing compressed data should always be considered.
That being said, for CSS the benchmark should rather be the size after gzip or the other commonly used HTTP stream compressions, which are almost universally applied anyway.
Compression and HTTPS don't play nice together, though. So in cases where every byte matters and you can't get rid of HTTPS (an admittedly rather niche use case), this minification still has some uses.
If you are talking about the BREACH vulnerability, CSS won't reflect user input. The attacker is not able to observe thousands of slight variations of the same file, which is how this attack works (simplified). Transferring static files compressed over HTTPS is completely safe to my knowledge.
So maybe if you inline your CSS into the page this comes into play.
Also I never said minification is useless, only that you need to compare plain+gzip vs minifier1+gzip vs minifier2+gzip to see the _effective_ size reduction.
Now I know why most of my colleagues in software development do not have a degree - academia really, really closes your mind into some strange bubble and shields you from the real world...