This appears to be bitcode.
It probably means they just started making use of more metadata, or of something else that is now included in the bitcode.
Bitcode also now deliberately trades off size vs speed and includes indexes used for LTO, etc.
They could be including those.
You should almost always expect bitcode to get beat by "llvm-dis|xz", because the goal of bitcode is not to be the most compact possible format, but instead, a compact format the compiler can use effectively :)
Now, if actual on-device binary sizes increased, my guesses would be:
1. it now includes bitcode, or another architecture, in the binary (which would be interesting)
2. Something has gone horribly horribly wrong :P Really, speaking as a guy whose org maintains the toolchains for platforms like this, there's a 0% chance we wouldn't notice a 2x-3x size increase.
The iOS Gmail app has increased in size to 140MB at the last update. I'm on a 16GB iPhone and I've noticed app sizes slowly increasing over the last two and a half years that I've had my phone. I have to keep an eye on app size to make sure I don't run out of space on my phone. I've already boycotted the Facebook app, but apps like Gmail are too important to uninstall. If it comes to it I'll be getting rid of a lot of other apps before Gmail (e.g. the Google app) when I need to reclaim space. I don't understand how app sizes keep creeping up; it's moronic. By contrast, the Reddit app is a slim 13MB and I'd say it does a lot more than the Gmail app.
The App Store binary format is really horrible when it comes to keeping the size down. Especially if you need to use libraries and whatnot. What it comes down to is this:
1 - The binary format itself isn't very efficient. Compressing it usually yields at least a 50% reduction in final size, often more
2 - Even though .ipa is a zipped format, there's no actual compression on the binary itself because it's encrypted before compression. I have no idea why they do this; I can't perceive any security benefit. (Any jailbroken phone can be tooled to output the unencrypted binary, and you can bet anyone trying to reverse engineer your app will be using a jailbroken phone for their work.)
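The encrypt-before-compress problem is easy to demonstrate: ciphertext is indistinguishable from random noise, and random noise doesn't compress. A quick Python sketch (random bytes stand in for an encrypted binary here; this isn't Apple's actual pipeline):

```python
import os
import zlib

# Real machine code is full of repeated patterns and compresses well.
payload = b"executable code has lots of repeated patterns " * 400

# Encrypted output looks like random bytes; os.urandom stands in for
# the encrypted binary.
encrypted_like = os.urandom(len(payload))

plain_zipped = zlib.compress(payload, 9)
enc_zipped = zlib.compress(encrypted_like, 9)

# The plain payload shrinks dramatically; the "encrypted" one doesn't at all.
print(len(payload), len(plain_zipped), len(enc_zipped))
```

Which is why an .ipa being "zipped" buys you nothing for the encrypted executable inside it.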
Basically, if you're a large company with lots of apps and shared dependencies (Google, Facebook, etc.), doing the "right" thing and writing shared library code across projects ends up bloating your binary size immensely. And since binary size (rather than resources like images) dominates the final app size, you're in trouble if you want to keep app size down.
Source: I work on the iOS app for a music streaming service. Despite our best efforts, our app's download size is ~35MB (~45MB installed). Our Android counterpart almost effortlessly achieves the same functionality in a 14MB downloadable package (they did some cleaning a while ago and got it down to 10MB at one point). Thing is, if we do proper compression (no binary encryption, single architecture) we end up with a 19MB .ipa file.
PS: The compression issue doesn't affect installed size, but I believe the issue here is simply that (1) the binary format they're using is just too damn inefficient by default
Features mostly. I mean, you seem to absolutely need the Gmail app. While I, too, have a Gmail account, I like the simplicity of the native mail app in iOS. I don't need any additional features. I get mail, I read mail, I can write mail.
it's gotta be something like mostly media (uncompressed bitmaps), metadata, (debug symbols maybe?), libraries / platform / compatibility layers (95% of which is never used for any particular install).
I have seen what it looks like when you actually try to cram "features mostly" into a few tens of megabytes and the GMail app is not that.
That's exactly what the Gmail app does too. What other features does it offer that the Mail app doesn't? If none, then why does it need to be so huge?
I picked on Gmail because it's the most recent culprit that I have noticed creeping (sometimes jumping) up in app size, but it isn't the only one guilty of this. Like I said in my original comment, the Reddit app is 13MB and does a bit more than the Gmail app. I can read/write posts and comments and view images; similar features to Gmail, but I can also watch gifs and videos. Can someone explain why there is an order-of-magnitude discrepancy between the two apps' sizes?
They're probably using several bloated SDKs within their app for core Google services, view elements, etc. If they're using dynamic linking, they have to include these in the binary even if they only use a small percentage of the libraries.
I've never used the iOS Mail app, but in general Gmail has a weird IMAP implementation that behaves wonkily with a lot of mail clients, so with Gmail, using the webmail or a Gmail-specific app is usually preferable.
I made the mistake of getting a 16GB iPhone 4s last time. I'll never make that mistake again (well, it's impossible now I guess). I was forever having to delete stuff to make room for updates (and iOS gets all sideways when you run out of space halfway through an app update) and managing the storage. It's just not worth the constant headache.
Erm, that's not true. As soon as GMail app updates on Android, it replaces the builtin one with full features. The shared "Play components" are things like push handling, which is builtin on iOS as well.
Play services and OS do not carry GMail shared components.
(Pardon the joke, but while the "m" obviously means "M" from context, I really wish people -- especially computer engineers -- wouldn't say bits when they mean something eight times as large.)
I think you have a good point. The whole reason for standards (like SI units) is so that people don't need to guess or interpret. I don't get why this was downvoted.
A bit isn't a divisible unit, so nobody is going to be confused as to whether "mb" stands for "millibits". As for the capitalization of "b", if we're going to be pedantic then we should say "Mo" for "megaoctets", since "byte" is, as far as IEEE standards are concerned, of ambiguous length. But I think we can trust people enough to not spend too long puzzling over whether "mb" means megabytes or megabits, just as we can trust them to assume that byte implies eight bits in this context.
What? Sure it is. When you measure the information content (or the entropy) of a message, you very frequently get non-integer numbers of bits per (character/unit/message/whatever).
Written English, for instance, has about 1.46 bits of information per character.
Entropy (in bits) is defined as H(X) = -\sum_x p(x) log_2 p(x)
There is no reason this has to be an integer, since probabilities are not restricted to being reciprocals of powers of 2.
Consider also that you can simply use a different logarithm base to get a different unit (e.g. use the natural logarithm to obtain the entropy in nats). It would be bizarre if the arbitrary choice of 2 as the base gave a unit that was indivisible.
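A quick sketch of that formula in Python, showing that only special distributions give whole-number entropies:

```python
import math

def entropy_bits(probs):
    """Shannon entropy H(X) = -sum_x p(x) * log2(p(x)), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))  # fair coin: exactly 1.0 bit
print(entropy_bits([0.9, 0.1]))  # biased coin: ~0.469 bits, not an integer

# Swap math.log2 for math.log and the same sum gives nats instead of bits.
```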
I think this whole confusion comes down to the difference between a bit as a "unit of information in the sense of information theory" [divisible] and a bit as a "single physical one or zero" [not divisible]. The relationship between the two is that the entropy of a random variable is a lower bound on the average number of bits required to represent it.
only when you consider bits to be the final, indivisible, fundamental unit of information.
which they aren't.
if you have a data storage thingy that can store any of three values, a ternary digit, it is exactly equivalent to log2(3) = ln(3) / ln(2) ~= 1.585 bits.
kind of like US pop-science articles like to say stuff like "a volume of 1.5 Olympic-size swimming pools" (because a megagallon is just weird), even though, obviously, you can never have half of such a pool or it would empty.
(ok after some consideration, you could have the bottom half)
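In Python terms (just restating the arithmetic above):

```python
import math

# Information capacity of one ternary digit (trit), in bits.
trit = math.log2(3)
print(trit)        # ~1.585 bits

# So e.g. 100 trits carry ~158.5 bits of information:
# you'd need 159 binary digits to store them losslessly.
print(100 * trit)
```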
you are obviously right, but i think that in the specific case described above -- computer code -- we have binary digits as final and indivisible units.
> But I think we can trust people enough to not spend too long puzzling over whether "mb" means megabytes or megabits
Please don't assume this. I have the great pleasure of working with network engineers, who have apparently globally decided that bits are a perfectly reasonable measurement of throughput, and who react very differently to speeds in Mb/s and MB/s. I'm not trying to be pedantic or say that this is how it should be; I'm just saying that people really do use both units, it is horribly confusing, and anything you can do to avoid ambiguity is appreciated.
Well, no. If we're going to be pedantic, we should say MiB (mebibytes), because file sizes on disk are expressed in multiples of powers of 2, but the SI prefixes are multiples of 10.
So a 1 megabyte file (as reported by the file system) is actually 1048576 bytes, which technically - sorry, I mean pedantically - speaking, is 1 mebibyte.
To make matters worse, disk manufacturers use the decimal prefixes, so our nice 1 terabyte drive holds 931 gibibytes, which the file system reports as "931 GB" (not GiB).
Finally, memory manufacturers use the binary prefix, so 1 megabyte of RAM is actually 1 mebibyte (1048576 bytes).
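The whole mess in one Python snippet (SI decimal prefixes vs IEC binary prefixes):

```python
MB,  GB,  TB  = 10**6, 10**9, 10**12   # SI (decimal) prefixes
MiB, GiB, TiB = 2**20, 2**30, 2**40    # IEC (binary) prefixes

print(MiB)        # 1048576 bytes in one mebibyte

# A drive marketed as "1 TB" (decimal), expressed in binary units:
print(TB / GiB)   # ~931.3, which a binary-prefix file system labels "931 GB"
```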
A bit of a mess, no?
All the above is, IMHO, a consequence of imprecision. If we get used to being loose with our terminology, we risk carrying that attitude over into our work product, with sometimes regrettable results.
So I'll continue to strive to be pedantic (translation: precise).
I didn't know OS X used the decimal prefixes, but that just means it's less true, not untrue. There are still many more systems out there that use the binary prefixes. I imagine most *nix does, and I'm not sure about Windows. And RAM is still sized in powers of 2.
I don't think it's terribly user hostile to express sizes as powers of two when you work with these kinds of numbers for a living, especially when it's near the bare metal (Erlang binary data type FTW!)
But I do think it's user hostile to have two different units depending on what you're looking at. If it were all decimal or all binary, it would be much easier.
Agreed. Fun fact: Wolfram Alpha understands mebibytes, which is useful if you want to quickly convert between networking specs (say, megabits) and "real" computer units.
Then again, maybe doing more simple math by actually using one's brain wouldn't hurt either. :)
Edit: And yes, I'm aware that you'll never get the converted speed of what is written on the network device's box. But sometimes it's nice to have an upper limit you can compare to at least.
Many, many wire protocols use 5 bits of bandwidth to send 4 bits of information, for various reasons. So dividing by 10 gives you a better estimate.
Of course, when gigabit became a thing, your practical throughput was more like 75 MB/s for a very long time, and being off by 25% in capacity planning is a pretty big error (one I've seen numerous engineers make, and a few make both mistakes, which means you're off by 40%).
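The two rules of thumb side by side, as a Python back-of-envelope (assuming an 8b/10b-style line coding, which not every gigabit PHY actually uses):

```python
line_rate_mbps = 1000                # "gigabit" link, in megabits per second

naive = line_rate_mbps / 8           # 125 MB/s: ignores all encoding overhead
rule_of_thumb = line_rate_mbps / 10  # 100 MB/s: divide-by-10 for 8b/10b-style coding

print(naive, rule_of_thumb)
# Plan capacity with 125 when the link delivers ~100 (or ~75 in practice,
# as noted above) and you're off by 25-40%.
```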
After using many Unix tools that have this convention, I'm ok with "10M" referring to 10 × 1024 × 1024 bytes (10 MiB), contrasted with "10 MB" meaning 10,000,000 bytes.
This is gatekeeping, since the message is coded to be obvious to those "in the know" (of course mb means mebibyte!) but is a barrier to those who are trying to learn more (mb probably means megabits per second? it's a unit for measuring download speed? why is the 's' left off?)
This is putting the burden of collaboration in the wrong place: it shouldn't be a question of, can we expect a reasonable engineer in the industry to understand this unambiguously (with some deductions); but rather, can I hold down the shift key when typing the abbreviation for megabytes.
Obviously this depends on the actual audience; don't bother following this in team chat, where speed is more important than clarity.
Because we solve practical problems and do not nitpick over what is technically correct. We are not drones, and we easily understand that in this context it's megabytes.
Just look at all that "technically it's mebibytes bla bla bla" in replies. No one cares. Write some code. Or better - go outside.
space launches have crashed because of confusion over standard units.
in that case it was confusion between metric and certain fantasy engineering units, but an error of 1000/1024 will cause troubles just as badly.
so with that attitude maybe don't write that code, and better stay inside or a rocket might fall on your head.
but for serious, that correction has probably taught more than 10 people the difference between uppercase B = bytes, lowercase b = bits, uppercase M = mega = 1 000 000, lowercase m = milli, and MiB = mebibyte = 1024 × 1024 bytes = 1 048 576 bytes, or at least made them aware of the important fact that there is a difference. your pedantry about nitpicking has taught nobody anything, except to always be alert because there are people like you who like to offload mental ballast and use wrong units, insisting their errors can be inferred and corrected from context... which is an important lesson also, but as a warning, not as a defence of the behaviour.
You should have included the whole window in the screenshot to prevent confusion: http://imgur.com/a/e2wb3
Specifically to those who don't use iTunes Connect, this page is titled "Estimated App Store file sizes for Build", and the (?) callout says "This is the amount of disk space the app will take up on the customer's device."
Then that's pretty confusing when the (?) button next to Install Size says "This is the amount of disk space the app will take up on the customer's device."
I'm not sure the same can be assumed for Apple, based on several things such as internal secrecy, differing organisational hierarchies, and their willingness to focus on things other than tangible metrics like end-user download size.
Between the org and culture differences I'm just not so sure Apple can be safely assumed as similar as you expect.
None of the things you mention change the fact that if you increase the iOS system image size by 3x, it will likely no longer fit in the default firmware partition :)
It impacts only the bitcode and not the final binary AFAIK. The issue is with debug-info not being stripped out completely as expected. It comes from a change from LLVM upstream that was made (in 3.9) to keep source locations for providing better optimization diagnostics. It interacts badly with the debug info stripping and apparently wasn't caught. It is used by the optimizer to tell you e.g. that a specific loop can't be vectorized.
See http://llvm.org/devmtg/2016-11/Slides/Nemet-Compiler-assiste... for more details about how it's used.
Note that the size increase is in the _bitcode_ portion of the binary. This slice is stripped from the binary before it makes it to the user's device. This means the size increase is merely an inconvenience during the development process, and has no impact on the size of apps as users see them.
And the 2.2x non-bitcode increase seems consistent with adomanico's report of a 2~3x increase of "App Store File Sizes" as reported by iTunes Connect: https://news.ycombinator.com/item?id=13992107
Would be interesting to see numbers for the compiled, stripped and thinned binary the end user will download from the App Store (they are in the details view for the app build on iTunes Connect). My guess is that at least most of the change should go away there.
With bitcode, app thinning and what not in the mix the Xcode build artifact is so much different from what's actually being downloaded it's hard to tell if this radar has real world implications for anyone but the developer uploading the build to apple. Still interesting, and potentially annoying though.
The Apple bug report was filed by my colleague, JP, after we noticed that the built size of Realm's frameworks dramatically increased after updating to Xcode 8.3. We pay attention to the built size of our frameworks as we distribute precompiled versions of them (https://github.com/realm/realm-cocoa/releases), and a significant size increase inconveniences our users. We've not tested the impact of the size on an app installed via the App Store, but since the increase is limited to the bitcode portion of the binary we have no reason to think it will be affected.
I'm not well versed in the details of iOS development. Why is this inconvenient during development? Is the bitcode portion copied to your device, and is that a slow process?
Some people like to check in precompiled versions of their dependencies. Others like to distribute precompiled versions of their frameworks or libraries. Larger binaries aren't great for either of those cases.
Larger files to store and upload to iTunes Connect mostly. I don't think bitcode is ever transferred to development devices.
I guess there is also a risk that a compilation process generating 3x larger files could be slower, as more work is being done - but that's just speculation.
I think the size iTunes Connect reports includes bitcode, and isn't representative of the size of the app once it makes it to a user's device. The information in the original bug report you linked to clearly shows the size increase is limited to the bitcode portion of the binary.
The size I'm referring to is shown in iTunes Connect, which is the administrative side of the App Store that the developer uses to manage their app releases. As far as I'm aware the size shown in the App Store's user-facing UI does reflect the size that will be downloaded by the device. I think it's possible to see this size in iTunes Connect, but since I don't myself have any apps in the App Store I can't easily verify this.
Is this also true for user downloads from a website? The majority of our clients get our apps via Enterprise certificates, not through the App Store process.
When you create an archive, Xcode includes the full version of your app but allows you to export variants from the archive. I think most enterprise users disable bitcode for simplicity but it's possible to slice the variants and deliver the correct one to your users based on their device if you want to.
So, the app explodes in size, and since almost no app provides a "clear cache/temp" feature, apps grow til you are crashing routinely. While iOS may clear some space when it feels like it, I have a monthly routine of deleting and reinstalling a slew of apps which take up gigs of space on the device after usage, even though they are just showing data stored on a server. I know, I shouldn't have to worry about this, that iOS will eventually clean it up... but when apps are crashing b/c they can't get space, I wind up having to manually step in.
So, a) for devs: if you think your app caches, provide a way to clear it (look at Opera's Coast browser, which puts it in the Settings app), and b) for users: if you think you are out of space, look at your apps and compare app size to total space used, and you'll find some hogs.
> "Note that the system may delete the Caches/ directory to free up disk space, so your app must be able to re-create or download these files as needed."
> tmp: "however, the system may purge this directory when your app is not running."
Despite this claim, it doesn't feel like it happens. Maybe apps aren't writing the cache files to the right place, and that's why they aren't being cleaned up.
But system-controlled garbage collection is great until it's not. When I need space, I need it now, not when the system decides I do. We've seen it with Java, and now with iOS. Yet another thing that iOS does on my behalf that I may wish to do myself.
I made the mistake of buying a 16 GB iPad mini 2. While pretty much all I use on it are a screenful of streaming apps, it's chronically low on memory. I have iCloud Photos enabled and it set to optimize storage, but it regularly gets in a situation where there's not enough free storage to upload new photos to iCloud Photos, since iCloud Photos has filled up the 4 GB free storage...
I've been using Google Photos for this. They offer unlimited storage (though they compress photos), and the sync process is hands off (just have to open the app).
This is due to bitcode, which won't actually affect the binary size seen by end users (i.e. app download size): https://twitter.com/jckarter/status/846796503775567872
"That at least shouldn't affect your users' download size, then."
Is this only true for store downloads? Because the majority of our apps are delivered simply by a website download to the device via Enterprise certificates.
Isn't 10.3 the first version to introduce the new APFS file system? If so, couldn't that have something to do with it? Does each app need to compile for both supported file systems now? I'm not an LLVM expert, but someone with more expertise on this subject might be able to say. I just found it odd that no one else here had mentioned it. It is the big update for 10.3.
Did anyone actually look at the content which is responsible for the increase in size? I hope it does not include the source itself, comments, and who knows what.
Apple charges $1200 to upgrade the latest touch bar rMBP from 512GB to 2TB of flash.
Let's not forget that they are a hardware vendor.
I don't think it's some grand conspiracy theory, but the interests of the vendor and of the user are not precisely aligned when it comes to efficient usage of storage. (The lack of stripping applications of their alternate language content on install/download also comes to mind.)
This theory is pretty easily debunked when you consider the primary goal of bitcode is to strip assets from apps that aren't relevant to a user's device, hence taking up less space.
(For example, removing @2x images for a plus device that uses @3x images, or vice-versa).
App thinning is orthogonal to ENABLE_BITCODE. The App Store could have been thinning app binaries years ago (literally just `lipo -thin`). Apple likely wanted ENABLE_BITCODE for better app store verification and perhaps architecture re-targeting in the future. I very much doubt the latter because Apple seems to have no compunctions about forcing developers to use newer SDKs, rebuilding apps for WatchOS or tvOS, or aggressively deprecating 32-bit only apps in the App Store.
I said explicitly in my comment that I don't think this is some grand conspiracy. I don't think Xcode makes big files to use up disk space. I think Apple just has little incentive across the entire ecosystem to use less storage or use storage more efficiently.
Afaik bitcode is so that Apple can rebuild binaries for different target architectures (e.g. new models of phone, watch, etc.) without source developer interaction. It should come in major handy in the OS X app store when ARM64 MacBooks ship in a year or two. A full app store of working apps on hardware launch day will make the bitcode requirement worth it when it's the smoothest architecture transition they've ever done (out of 680x0->PPC and PPC->x86).
Bitcode does not allow cross-architectural builds. This is a common misconception. IR (& bitcode) includes architecture and platform-specific ABI.
What it does allow is for better optimizations as the LLVM backend optimizer improves.
I would imagine that with enough engineering effort, a cross-architecture "porting" of IR would be feasible. I doubt Apple will bother to do that when they can just force developers to rebuild and republish lest they get left out of the App Store.
Outside of a Java-style high-level VM, cross-platform and cross-architecture support in the C world requires compilation. You cannot use what is not there. When compiling a C-family language, macros are used to determine architecture and platform, and extra code is simply not compiled. The simplest of examples is endianness handling, which would be totally broken if Intel-compiled code were automagically made to run on ARM.
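Python's struct module makes the endianness point concrete: the same 32-bit value has a different byte layout depending on byte order, and in C that layout assumption is baked in at compile time:

```python
import struct

value = 0x01020304

little = struct.pack('<I', value)  # little-endian layout (x86, ARM as used by iOS)
big    = struct.pack('>I', value)  # big-endian layout (e.g. classic PowerPC)

print(little.hex())  # 04030201
print(big.hex())     # 01020304
# Compiled code that assumes one of these layouts can't simply be
# re-targeted to the other after the fact.
```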
iOS and macOS are both little-endian, but there are many other differences between platforms (like pointer alignment, SIMD size, ObjC ABI) and bitcode makes no effort at all to accommodate them.
Bitcode can be used to recompile for minor ARM updates, compiler bugs, new optimizations, etc. without having to get developers to submit new binaries.
> Outside of a Java-style high-level VM, cross-platform cross-architecture in the C world requires compilation.
Kind of.
On IBM i, C compiles to TIMI bytecode just like everything else. For producing actual native code directly from the compiler you need the Metal C compiler, or the POSIX compatibility environment (PASE).
The TenDRA C and C++ compilers also used bytecode (TenDRA Distribution Format).
No, Apple is not intentionally wasting your storage. That's frankly a very offensive accusation to make.
As for alternate language content, honestly that stuff doesn't take up very much space, and stripping it would break the code signature. The only way Apple could strip that is if they perform the equivalent of App Thinning based on a list of languages you specify when downloading the binary from the App Store (and even that wouldn't apply to downloading non-MAS apps).
If so, okay, but I'm still confused about the 100 MB limit. Sometimes Facebook's apps can update with more than 100 MB, yet some apps cannot update over 100 MB.
I didn't downvote you, but I can help you understand why you were downvoted.
On HN, you'll usually get an answer if you ask a question.
"Is x going to be a problem for y?"
Making an incorrect statement however will get you downvoted.
"x will be a problem for y."
The other issue is your writing quality. You will get better results if you put more effort into your writing.
You don't have to have perfect English. Many people on HN have English as a second language.
Things like capital letters at the start of a sentence, and single full stops (periods) at the end of sentences, show effort. You will find people will be more tolerant of grammar issues if you at least put effort into the basics.