Hacker News
How Uber Deals with Large iOS App Size (uber.com)
257 points by rakingleaves on Feb 26, 2021 | 202 comments



An old (but fantastic) comment from the previous discussion about Uber's app size that addresses why the Uber app is so big: https://news.ycombinator.com/item?id=25376346


The summary of that comment is "we have to include a ton of stuff that will never be relevant to most users like payments APIs that only work in India."

This is why some global apps have different apps for different countries. It's a trade off. Would you rather have a single fat Uber app, or have to download Uber India when you arrive there?


I love that it’s one app. I remember landing in Delhi from San Francisco and the Uber app worked perfectly and I was so astonished. Effectively nothing else translates seamlessly like that. Combined with the incredible quality control on the vehicles side (every car I got in India had a seatbelt, where close to 0 taxis you’d get on the street would), I knew it was a very high functioning company to execute like this.


Exactly the same experience, from San Fran to Singapore and in between. Never really realised how good we have it, such a good point.


That’s no longer possible now because Uber withdrew from South-East Asia. You can use Grab and Gojek in Singapore, but they don’t operate in the USA.


You should have come pre-COVID then. Uber's ecosystem in India has kinda collapsed; driver quality and availability have gone way down as more and more drivers realize how bad a deal it is. Almost every driver I've spoken to here confirms that they are in terrible debt and have not made any money at all driving for Uber.


While I had a positive experience with the Uber app software-wise (app functioned seamlessly landing in Mumbai from U.S.A.), Uber as a service was not really comparable to what it is like in the U.S. and developed W. European countries. I got the sense that it was more of a strictly ride hailing / request service, not much accountability required from the driver. Often had > 3 drivers cancel after 10-15 minutes waiting before actually getting a ride. That being said, it was still a hugely valuable service to have in India as a foreigner.


Uber India isn’t like Uber New York but it’s a million times more reliable and predictable than almost any other commercial service you can get in India.


Really? Seems like it’d be more reliable and a lot faster to just hop in one of the hundred rickshaws on every street. Besides the fact that they’ll overcharge you if you’re a tourist.


Have you seen those things in an accident? I can easily afford the Uber with seatbelts and I am not going to trust my life to a three wheeler.


Reliable as IT offshoring commercial services?

You said any.


Yeah my experience was that it could be a bit of a pain with cancellations but there were enough cars on the road that I would always eventually get a ride.


Same here when I went from Bangalore to Houston. Amazing seamless experience.


> every car I got in India had a seatbelt

That sounds so weird to me... What do you mean cars don't have seat belts?!


> That sounds so weird to me... What do you mean cars don't have seat belts?!

I can confirm the OP's experience albeit in a different country (Mexico). In some regions, I haven't been in a single street taxi with working seat belts. Indeed, I've been in some private cars without them as well, or with more physical space for passengers than there were seat belts available.

Uber vehicles, though, have always provided those safety features. Some drivers work as both street taxi and Uber drivers (as they frequently cross into or live in regions without Uber but drive people to places that have it), so that quality assurance can trickle down in some cases. It honestly goes beyond seat belts though; an Uber car is more likely to have AC, electric windows, etc. than your average taxi, even in ridiculously hot parts of the country.


75% of Ubers in Egypt didn't have seat belts. They're just tucked away, and wearing one is not culturally the norm.


People have even chuckled when I reached to put a seatbelt on in India. :)


Not sure what to make of this. I’m Indian and the last time I was in India, there was strict execution of the seatbelt law and offenders were made to wait and fined heavily. Most drivers were wearing seatbelts.

Maybe because it was in Chennai, a southern city, and not in Mumbai or Delhi.


Sounds plausible. I was in Delhi a couple of years ago, using mostly Ola (Uber didn't seem to like my non-Indian card, strangely enough) and most of them did not have seat belts at all.


Used Uber in Delhi and the driver told me to relax as there was no need to wear the seatbelt because it's only mandatory for the driver.


Personally, I would download whatever the dominant ride-hailing app is in the country. Most of the time it's cheaper than Uber, like Grab in parts of Asia.


If you're landing in Asia from the US you're probably not going to care about the price (there are exceptions), and Uber has been reliable for me travelling, if that costs a few % more than the local version I'd be happy to pay for convenience.


> Uber has been reliable for me travelling

Unless India? [1]

[1] https://www.theguardian.com/technology/2017/jun/08/uber-exec...


This is such a non-feature for 99% of users.

The fact that Uber devotes engineering resources to serving the tiny 1% slice of users who care about having an app that works seamlessly across dozens of countries, at the expense of the much larger number of users with limited space on their outdated devices, is really emblematic of the Valley's out-of-whack priorities.


It's much more complicated than just countries. If you read the linked comment then you'll learn that it includes many regional customizations, even down to specific cities and airports.

An app for travel should absolutely prioritize UX and ease of use as you travel, however far away your destination is.


Interesting assumption/perspective - The only people in my circle of friends and co-workers who use Uber, are travelers. And nobody I know has a clue of their phone storage or has hit it due to apps (as opposed to videos/photos).

For some, Taxi works well enough and is (perceived as) licensed/trusted/reliable so they don't have a need to use Uber. Others are a bit of Luddites, or mistrustful. But I guess friction of Taxi / benefits of Uber just aren't high enough :-/

(Personally, I've only ever used Uber on specific travels; for 99% of my transactions, Taxi has been easier/more reliable. Don't get me wrong, I think the Taxi licensing/medallion model is outdated, the drivers have worked incentives, and cars aren't maintained as well as they could be. But I still normally don't find a benefit in Ubering).

Finally, FWIW, even traveling within country, I've noticed significantly different screens/features/options in, say, Ottawa or Toronto airports and vicinity. So I think overall a lot more people benefit from this monolithic model than may be immediately apparent.


I'm in the same boat. To a first approximation, I never use Uber in my local metro area. To a second approximation, I don't use it much traveling either. I'll often grab a cab if it's convenient rather than waiting for an Uber/Lyft to show up. Or if there's a good public transit option to my hotel I'll use that. In spite of (normally) 100+ days of travel per year I maybe use Uber a dozen times a year. (I do use a private car service to take me to the airport, but again Uber wouldn't be very good for that purpose.)


> This is such a non-feature for 99% of users.

I would imagine that travellers are an outsized percentage of high spenders, even if they are a small portion of users.


Totally this. It isn't uncommon for whales (usually 1 to 2 percent of an app's users) to account for 30 to 50 percent of its profit. So it is very worth thinking of those customers first.

Not all customers are equal. And if you build your app or site without knowing that you may well chase the wrong features.


I would agree with this. At home I walk, ride my bike, get on transit, rent a carshare etc. When I'm travelling, I'm walking or calling an Uber/Lyft. I'll take a train or bus only if it's super easy to navigate.


Same here. In my local area I understand the transit. I have a transit card. But when I am in a foreign city that speaks a different language these are not realistic options unless I spend effort on figuring that out and deal with local municipal transit authorities, which is not the vacation experience I am looking for.


I spent a lot of 2018/2019 travelling around North America. It was wonderful to never have to consider a taxi.


As a frequent traveller, I would be much more inclined to either pick the airports that are ever going to be relevant to me, or pick the destination airport when I am at the source airport, which Uber can easily detect. I visit ~10-15 airports most of the time. I would be happy to spend some time setting this up on my phone so I don't need to waste a huge amount of space and bandwidth every time I update Uber. For Uber it would be a win too, reducing the size of the app significantly. Maybe it is only me.


A willingness to put in some time up front to tailor a technology experience to yourself is one thing that separates the average HN user from the population at large.

For a significant majority of users, if it doesn't work well out of the gate, it's broken.


Even as an avid Hacker News user, I wouldn't tailor an app download to save bandwidth/space. Both of those are abundant.


Don’t app updates only push deltas anyway? It’s only the initial install that’s large.


It's not even just per country. If you fly to certain airports in certain states they have different rules that affect what the Uber app can do. Do they maintain a separate app for Washington and for New York? It gets pretty messy pretty quickly. Not only do you have to maintain these different edge cases, but you also need to maintain separate applications and all of the problems associated with that like keeping libraries and APIs in sync between them.


I'm probably missing something obvious as I'm tired, but why does the Uber app need stuff like specific airport rules stored in it rather than pulled from their servers when needed?

Map apps let you look at any city in the world without needing all of that data inside the core app, and if you have enough data to use the Uber app wouldn't you almost certainly also be fine to have it download in the background the required info (coordinates of where pickup is or isn't allowed, specific instructions message to display, etc.) the same as it receives information about local pricing, location of available cars nearby and so on?


The iOS app store prohibits downloading code at runtime, excluding some very specific circumstances.

As long as the airport rules can be stored as data, not code, then iOS would be fine with pulling them.. but at that point you have a data blob saying effectively "in country X, you can use payment Y", and you still need code that knows "payment Y means enable this section of the app and use this library"... and you can't download that library at runtime on iOS.

So my guess is that the code implementing regional rules has to be bundled with the app due to Apple's restrictions, and the associated data that could be dynamically downloaded is too small to not just bundle it too.
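
To make the data-vs-code split above concrete, here is a minimal sketch (in Python for brevity, not Swift). The server can send data naming which payment methods apply, but the handlers have to be compiled into the app already. Everything here (`BUNDLED_HANDLERS`, `enabled_payment_methods`, the `payment_y` name, the config shape) is hypothetical and not Uber's actual design.

```python
from typing import Callable, Dict, List

# Code that ships inside the app binary. On iOS this set is fixed when the
# build is reviewed; a later download cannot add entries to it.
def pay_with_card(amount: float) -> str:
    return f"card charged {amount:.2f}"

def pay_with_payment_y(amount: float) -> str:
    # Stand-in for a region-specific payment SDK (hypothetical).
    return f"payment Y requested for {amount:.2f}"

BUNDLED_HANDLERS: Dict[str, Callable[[float], str]] = {
    "card": pay_with_card,
    "payment_y": pay_with_payment_y,
}

def enabled_payment_methods(remote_config: Dict[str, List[str]], country: str) -> List[str]:
    """Return the methods the server enables for `country` that this build can run."""
    requested = remote_config.get(country, [])
    # A method named in the data but missing from the binary is unusable.
    return [name for name in requested if name in BUNDLED_HANDLERS]

# Data fetched at runtime (hypothetical shape): country -> payment method names.
remote_config = {"X": ["card", "payment_y", "payment_z"], "US": ["card"]}
print(enabled_payment_methods(remote_config, "X"))
```

In this sketch `payment_z` is silently dropped because no code for it was bundled, which is the reason the data blob alone doesn't shrink the binary.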


Connectivity at airports is not the best - if you’re on an expensive data plan those extra MBs cost a pretty penny which might put off the customer.

You need to think about the larger picture.


Wouldn't all the info about "how does Uber work at this airport" come to significantly less than a MB in JSON or whatever? It could surely survive without downloading special airport-specific media like logos. Do the airport rules require complicated, heavy code on the phone?

Thinking of my experience with Uber at airports, it's generally stuff like "pickups from this terminal are only available at the west exit" (show on map - so needs coordinates and message text), or "There is a £10 airport pickup fee for all private hire vehicles here which will be added to your bill", or whatever along those lines.

But perhaps your point still stands even if it's KBs rather than MBs.
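
As a rough sanity check on the KBs-vs-MBs question, this sketch serializes an airport rules payload of the kind described above and measures it. The field names, airport code, and coordinates are all invented for illustration; a real payload could look quite different.

```python
import json

# Hypothetical airport rules payload; every field and value is made up.
airport_rules = {
    "airport": "XXX",
    "pickup_points": [
        {"label": "Terminal 1, west exit", "lat": 51.4700, "lon": -0.4543},
    ],
    "pickup_fee": {"amount": 10.0, "currency": "GBP"},
    "message": "Pickups from this terminal are only available at the west exit.",
}

def payload_size_bytes(rules: dict) -> int:
    """Size of the rules once serialized as compact UTF-8 JSON."""
    return len(json.dumps(rules, separators=(",", ":")).encode("utf-8"))

size = payload_size_bytes(airport_rules)
print(f"{size} bytes")
```

A payload like this stays well under a kilobyte, so even with a few dozen pickup points it would be KBs rather than MBs.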


It’s also the airport layout information, including doors. From most US airports, you specify your pickup point by door (Arrival E6, for example), and I think some of that information lives in the app, along with the snap points that are rendered on the map (or even just the logic). There’s usually also a specific UX when requesting the ride where it asks you to select your airline to guess which terminal and egress to set as the pick up point.


Why would airport layout data not be handled the same way as city layouts? Maybe effectively automatically caching airports to make up for potential bandwidth issues?

And for the specific UX, sure but isn't that just a single feature for airports in general, and the app can pull the up to date airlines/terminals data for the specific airport when needed?


True, the rules data is small, which is why there's no point in not bundling it in the first place. Better to have an instant and smooth experience rather than try to save a few MBs by introducing a blocking network call before you can even interact with the app.


Given the streaming of location data for cars in the area, I don't think they care much about those extra MBs. If they did, you'd get a "N cars within X km of you", not a live-ish map.

A quick summary like "extra fee for airport pickup: X, pickup location restricted to: Y, etc." should be pretty comparable.


A live map of cars near you takes an incredibly small amount of data. Once you have the map data that'd be easily doable over a dialup connection.


Yes, it's pretty small, same as description of airport rules would be.


I don't have any numbers, but I would guess that a good part of Uber's user base consists of people that travel often. Not having to download a new app or update every time you go to a different country is a huge advantage.


There’s me, a frequent traveller who uses UberLUX 4+ times a day, spending more than 5000GBP/mo.

Then there are 100 guys with outdated devices who use UberX once or twice a month to get back home from the pub, probably splitting the ride with their friends.

I bring in more money than the latter group of people.


I’m a bit surprised you have the time for UberLUX. When I find myself with an expense account for Uber I always go for the lowest ETA, which is rarely Uber Premium. You must be hanging out in richer places than me.


In central London it’s never much further away than the cheaper options, the cars are actually nice (S-class 90% of the time) unlike in the US.


Why are you assuming that splitting it into multiple apps would require less engineering resources than keeping just one?


That 1% of users represents a lot more than 1% of profit, and engineering an application that is that robust across scenarios has benefits to the stationary users transiting through multiple scenarios as well (surge pricing to normal, credit card to Apple Pay, car ride to scooter ride, etc).


Au contraire, people with outdated phones are less likely to be able to afford Ubers. Much of Uber's profit comes from people booking Ubers from airports. I've used Uber all over the world and it's absolutely amazing.


All sorts of people use UberX and Uber Pool, actually.

Uber Pool is no longer available due to covid; but hopefully it will come back as vaccination rates go up.


Personally? A single fat app. Less fumbling around when I land somewhere to get their localized app, which will presumably only be available from that country’s App Store that’s inaccessible before I land there.

If I have to deal with the airport’s wifi... I don’t want to depend on downloading a 100mb binary over it.

Maybe they could have a local and global version available, but that’s already making it more complicated.


> If I have to deal with the airport’s wifi... I don’t want to depend on downloading a 100mb binary over it.

That pretty much doomed all these "local Uber competitors". Nobody would start looking for a local competitor, set up an account, get 2FA, register their payment method and then have the privilege of getting a "starting up" screen telling them how and where they can get a ride. Instead they just open Uber and get where they want.


>That pretty much doomed all these "local Uber competitors"

Did it? How much of Uber's usage in a random big city is from travelers based elsewhere vs locals? And it seems like others have succeeded in some narrower regions (Grab, Didi, etc.).


A whole lot of Uber's big spenders. This is on an individual basis, but I know many consultants/salespeople traveling all over the world who spend hundreds or thousands on Ubers a month and want the ease of use. Those are Uber's profitable customers, buying Lux/Black cars etc., in addition to many wealthy families that use Uber all over the world on vacation. Uber should be solving for those customers over a customer worried about an extra 50 or even 100mb on their phone.


> doomed

Uber lost in pretty much every market where they had local competition. Are there any counterexamples?


Not that it affects your overall point, but countries’ app stores are gated based on region you’ve configured your phone to, not where the phone’s physical location is.


The downside is trying to download Uber for the first time when arriving at an airport. However, I think Uber is so ubiquitous now that it's likely to already be installed.


The Google Play Store solved this problem with dynamic feature modules. Either at install time or later on, you can let users download only certain parts of the app. All with a single app store entry and app bundle.

We use this for devices which don't have NFC. If the device doesn't support it, then there is no reason to download the module for identification via passport NFC scans.
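
The NFC case boils down to filtering modules by device capability. A toy model of that selection logic, in Python rather than anything Android-specific; `FeatureModule`, `modules_to_install`, and the module names are hypothetical and this is not the Play Store's API:

```python
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class FeatureModule:
    name: str
    required_features: FrozenSet[str]  # device capabilities the module needs

def modules_to_install(modules: List[FeatureModule], device_features: FrozenSet[str]) -> List[str]:
    """Keep only modules whose requirements are a subset of the device's features."""
    return [m.name for m in modules if m.required_features <= device_features]

MODULES = [
    FeatureModule("core", frozenset()),
    FeatureModule("passport_nfc_scan", frozenset({"nfc"})),
]

print(modules_to_install(MODULES, frozenset({"camera"})))         # device without NFC
print(modules_to_install(MODULES, frozenset({"camera", "nfc"})))  # device with NFC
```

The device without NFC only gets `core`, so it never pays the download or storage cost of the scanning module.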


Interesting. I have had bad cell connectivity in unfamiliar airports, though, where I could barely keep a connection to e.g. the Lyft servers. What if I can’t download the module when I need it? I don’t think requiring the user to download it in advance of their flight is viable, either. If we lived in a world of universal, homogeneous, inexpensive connectivity, I’d be satisfied with the solution you mention. I guess if they had location/policy micro-modules small enough to fit in a single MTU, then anybody who could connect to Uber at all could be served.

It still boggles my mind that there could be ~100M meaningful instructions in a program.


Which sucks. I arrive in a new country, and need to find internet access???


Mate.. how are you going to get your uber anyway without internet?


If you want to use Uber afterwards... yes, you already established you'll rely on internet access.


How are you ordering an Uber without internet access?


Consider that many travelers are using global roaming data rates on their sim card when arriving at their destination. They should be able to just open the app and it 'just works'. Your proposal isn't something a good product manager would even consider as an option for more than a few seconds


It's not exactly a case of downloading an India version when you arrive there. In order to do so first you have to register an India App Store account. And in order to do that, you might need an India cell phone number and credit card.


Not necessarily. They could have all the Uber apps in all the app stores. It would be a huge pain for their engineers and their users which is probably why they don’t do it.


No, it's also substantially different between certain cities and regions in the CONUS. Otherwise that might be viable.


Easy. One app.

When I traveled to India (pre-COVID), it was great that I could just open the app & get where I needed -- even if I wanted to travel via tuk tuk.

It was more annoying when I got to Ireland, had to figure out the app to use was MyTaxi and get it all working (including payments), particularly as a foreigner.


That's a false trade off. You can load partial content. That's the big advance of html. UI as markup and you can stream the small required portions of a large app.


> This is why some global apps have different apps for different countries

Do you have some examples? Genuinely curious. To my knowledge, most of the major FAANG apps are single-binary.


Dedicated apps for fast food chains are one example. A quick search gave me "Burger King India", "PizzaHut Egypt" and "KFC UAE".

Why? I have no idea.


Totally different requirements. Some countries' fast food apps are for mobile ordering and delivery, some are a giant collection of coupons, and some are only for nutritional information. And some are just for promotional activities (only used for various promotional calendars.) After seeing all the different APKs I tried out a few for different countries out of curiosity and at least for Burger King they were entirely different applications with completely different use cases. To cram them all into one global app would be an enormous mess.


I think fast food chains are generally not actually owned by the same parent company. A company in India is licensing the Burger King branding and presumably some of the recipes from Burger King USA, rather than being a subsidiary thereof.


This is the real reason. For instance, McDonald's in India is itself run by two companies, one for North and West India, and another for South and East India.


Probably payment APIs, just like Uber. Or they are developed by different local app shops.


Starbucks has a separate China app. I suspect a part of this is due to app size because there are region-specific SDKs. Another is likely security. Regulations in some countries require data to be shared with the government and they don't want SDKs that collect this data to be included in more privacy-focused regions.


eBay is one that comes to mind right away (unless they changed it recently).

Google Pay is another one. They have a dedicated app in Singapore.

It seems like a lot of them went to single apps when they realized they could download data packs within the app. Stuff like Rick Steves guided tour apps used to be separate per city, but now it's a single app where you download the data for a certain city.

But I think you're right that all the major FAANG apps are single-binary.


>Do you have some examples? Genuinely curious. To my knowledge, most of the major FAANG apps are single-binary.

Worked on iOS app size at a FAANG for a couple of years -- this is untrue. At the very least there are different binaries for watch vs iPhone architectures.


As a user? The first option.

As an engineer? The first option too.


It strikes me that this would be the perfect use case for loadable modules. The Uber app could download the payment module you need on the first use and leave the dozens of other APIs off your device. This could also significantly cut down on the number of updates (300MB downloads each time!) that the app needs, since NA or European users won't have to re-download the app because some Indian payment API changed.

Unfortunately the way Apple and Google set up their walled gardens makes this impossible. I guess Apple would prefer if the Uber app dropped all of that and just made everybody use ApplePay instead.


>>> The Uber app could download the payment module you need on the first use

When I'm opening Uber at 1am in the cold to get a ride home, this is not the time to download a payment SDK update.


Or when you just landed on the other side of the planet and don't have good internet access (or it costs $$/Mb): the app is still expected to work, because you need your ride right now.


Uber doesn't work without decent internet access though. Maybe you can make the argument for $$/Mb, but there's no point in Uber creating the app so that you can get to the pay screen without internet when you need the internet to use Uber.


Decent can still be pretty bad and still work. And of course it's worth it because a phone can find/lose its connection all the time for a bunch of reasons even if the network is high quality and high bandwidth.


The whole point is the seamless transition. Would you want to sit there twiddling your thumbs in a potentially unfamiliar area, while the region's version of the app loads? Or would you rather just have it work?


This keeps getting repeated in this thread & I keep not understanding it. If I downloaded Uber & set it up with the payment method that I need, why would that payment method suddenly change at 1am in India?

The way this should work is when you set up payment option X, it downloads the relevant payment info & then you're set from then on without any other modules unless you add a new payment option. Likely pre-bundle the generally "global" options (Apple Pay/Android Pay since those are platform-native & credit cards since those are likely small implementations).

The real reason is that you will always have drop-off because the download phase is split in two (on the other hand you'll have increase in installs because the app size is smaller). That would need App Store integration with the loadable modules so that you could say "Install these payment features of the app". This may not be a win because again, it requires the user to do more work. Simple for everyday users will often win the day even if inefficient vs more optimal options that achieve that optimality by pushing complication to the user.


It's like 5MB and you have to be connected to the Internet to use Uber in the first place.


This is anecdotal but...

I recently was traveling. I landed at a new destination and checked the internet speed. Mobile via my tablet just outside the airport was theoretically 50KB/second according to Google's speed check. However, actually downloading a file from US servers was 5-15KB/second because the latency (3000ms+) was so high that packets were constantly being dropped.

That's at best, 75 seconds waiting for a download. At worst it's 16 minutes.

On the other hand, I was able to get Uber on my phone and though it was painfully slow, it found a driver in under 30 seconds.
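
For anyone who wants to redo the arithmetic, here is a small sketch of the idealized transfer time at the rates in this comment. It assumes the roughly 5MB module size suggested in the parent comment and 1MB = 1000KB, and it ignores latency and retransmits, so real downloads would be slower.

```python
def download_seconds(size_kb: float, rate_kb_per_s: float) -> float:
    """Idealized transfer time: size divided by sustained throughput."""
    if rate_kb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_kb / rate_kb_per_s

SIZE_KB = 5 * 1000  # assumed module size: 5MB, with 1MB = 1000KB

for rate in (50, 15, 5):  # KB/s figures reported in the comment above
    secs = download_seconds(SIZE_KB, rate)
    print(f"{rate} KB/s -> {secs:.0f} s ({secs / 60:.1f} min)")
```

Under these assumptions the range is roughly 100 to 1000 seconds, so the slow end matches the 16-minute figure; the fast end comes out a little above 75 seconds, which presumably reflects a slightly different size or rate.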


Google encourages this with feature modules and app bundles. Apple doesn't allow it because they've always been against downloading executable code that doesn't go through app review. Same reason they don't like game streaming or downloading react native at runtime.

I'd like to see them open up this possibility in a controlled way one day. Something like a review process for feature modules that could be updated in a similar process to full apps.


Google is also serving a different market (average of lower-end phones on lower-end networks); however, I worry that every day Google's stance on apps and app review starts to look more and more like Apple's.

WRT Apple enabling this: I imagine developers could get into a bit of a versioning hell-scape there if they were to decouple updates of different modules in their app (do you know if your app is working with FeatureA v1, v2, v3, or not at all? How about FeatureB?). If Apple were to do this it would look something like app extensions do today (separate binary stored within the IPA - that's possibly thinned out and rehydrated on device); probably with very little control over what's loaded (similar to how they did rollouts: this percent on this date and nothing else).


Interesting idea, I assume Apple would solve the versioning problem the same way Google is currently handling it. Apple is doing something like this already but there isn't too much detail other than the mention of update packages here: https://developer.apple.com/documentation/xcode/reducing_you...


Actually Google has recently added that feature with Android App Bundles. Apple has something similar with On-Demand Resources, but specifically prohibits binary code in them. Hopefully, with Android having the feature, it will induce Apple to allow binary dynamic libraries in On-Demand Resources too.


Another person who thinks the internet is available everywhere.


Do you understand why we are assuming internet availability when ordering an Uber?


Sure, but not why you are assuming it is fast or cheap.


How are all you people without internet access ordering Ubers?


Also, this was an entertaining retrospective recently posted:

https://mobile.twitter.com/stantwinb/status/1336890442768547...

I can’t seem to find the old link, but I’m pretty sure it was on hacker news and somebody posted a nice collation of the posts.


After reading that I feel like coining a law:

Any discussion about software distribution will inevitably result in an argument about dynamic linking vs. static linking.


> And then you have the binary size bloat with Swift that OP talks about.

Hopefully, swift ABI stability will reduce that. The new bytecode stuff will help to reduce bloat, as well, but he notes that a lot of SDKs are used. In my experience, SDKs and dependencies often won’t work, compiled with bytecode. Hopefully, that’s changing.

That code repetition thing also happens when a lot of dependencies are used; which often reinvent the wheel. That just comes with the territory. It can be addressed by using highly granular dependencies, but that sort of flies in the face of why we use dependencies. One advantage that Uber has, is they are an 800-lb gorilla. They could contract for specific configurations of dependencies.

I’m not a big Uber user (but I’ve used it a few times). I think it’s a fairly well-done app, as a user.


Yeah that’s a great read, super interesting detail. You wonder what would have happened if Apple didn’t up the limit.

This post is next level though, deeep optimization. All of it is just increasing the ceiling though, there is some limit on number of features Uber can offer in one app.. and eventually that limit will be reached, doesn’t sound like they are willing to accept being over the limit either. Wonder what space saving techniques are left in the box?


And it applies not only to Uber, but to tons of other apps. Unfortunately incentives are not aligned to make rarely used features load on-demand. I wonder what percentage of an application's binary the average user actually executes.


Why should a user care about a few extra MBs on their phone for a frequently used app? If we were talking about storage prices/capacity from 10 years ago, or much larger apps, sure, this would be a discussion, but it shouldn’t be nowadays. I get it, programmers like to focus on efficiency, but I promise you 99% of users don’t care about the extra MBs given the price of storage and the value of Uber nowadays.


I’m the founder of a YC company in the current batch focused on solving this exact problem! https://www.emergetools.com

We parse Obj-C and Swift runtime metadata to determine size contributions of individual types and functions in your app. We use this analysis to post PR comments with granular size diffs to help devs write smaller, better code.

I tried it out on the Uber app and immediately noticed a disproportionate impact from their code-gen dependency injection framework, Needle. The codegen is responsible for over 30k classes in the app binary, and contributes over 10mb! In general codegen is a common problem with Swift binary sizes, and the fewer reference types generated the better, it even helps with startup time!

We’ve written a blog post with case studies about how 7 of the most popular iOS apps could reduce their size: https://medium.com/swlh/how-7-ios-apps-could-save-you-500mb-...


"The Lyft app has hundreds of duplicate files, the largest consumer of space is a single asset catalog copied 73 times in separate bundles. Another asset catalog that is virtually identical except for the timestamp at which it was created is copied 67 times. Each of these contain nothing but 482 colors (colors can be stored in asset catalogs to simplify management of dark mode). With each one taking ~250kb these quickly eat up 35mb."

I read this as: Lyft installed Dark mode for 35mb. I can only imagine what my JavaScript modules are doing behind the scenes.
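
One rough way to look for this kind of duplication in an unpacked app bundle is to hash every file and group the paths by digest. The sketch below is only an illustration, not how Emerge's analysis works; the names `find_duplicates` and `wasted_bytes` are made up for this example. Byte-for-byte hashing would also miss the catalogs that are "virtually identical except for the timestamp".

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Group regular files under `root` by SHA-256 digest.

    Returns {digest: [paths]} for digests shared by more than one file.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # a symlink is not a second copy on disk
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large assets are not loaded at once.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}


def wasted_bytes(groups):
    """Bytes saved if only one copy of each duplicated file were kept."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1) for paths in groups.values()
    )
```

Pointing `find_duplicates` at the extracted `.app` directory and sorting the groups by wasted size should surface cases like an asset catalog copied into dozens of bundles.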


My biggest complaint with iOS development is how confusing Xcode's build system is. Extracting code out to shared frameworks is a confusing process and I can understand how so many of the top apps get it wrong. Also, it's clearly not a priority for Apple because they don't provide easy inspection tools. Best case for them is the user buys a new phone with more storage.


Yep this is exactly the problem I'm trying to solve! A lot of large app companies have switched from Xcode to third party build systems like Buck or Bazel. This can make things faster but even more confusing. I've found analyzing the actual build products to be the best solution to make sure nothing unintended is happening.


Interesting! Are you interested in going back into the build process to try and thin the app directly? Or just in helping developers identify the sources of app size?


Currently identifying sources and offering suggestions for how to make improvements. For many apps this can reveal a lot of opportunity!


I agree that it's not necessarily the most straightforward, but I would think that among the hundreds of engineers they have, at least some have figured out how it works to the point where they can do this…


I’m curious - does React Native/Expo do any better job at this, with tree-shaking and package building?


Expo produces ridiculously huge bundle sizes. I would steer clear of it.


Discussed here:

Launch HN: Emerge (YC W21) – Monitor and reduce iOS app size - https://news.ycombinator.com/item?id=26014180 - Feb 2021 (44 comments)


My first thought reading the headline was “wasn’t there recently a show HN about this exact problem?”

Glad to see it near the top - saved me a search.


Is the name a nod to Gentoo's emerge? (https://wiki.gentoo.org/wiki/Portage#emerge)

Just curious.


I'd like to see Apple expose more control over their app thinning technologies.

Currently they only deliver the binary for the device's CPU, and only the assets for the device's asset class. There's then some tech targeted at game devs for on-demand assets for things like game levels that you don't need all of on device at one time.

I suspect the limitations of this are around the binary not being subject to this, but maybe it could be. I can see a couple of options, one is some way of extending the asset classes to code features, so that the App Store doesn't have to download iPad screens for iPhones, etc. Perhaps this could be extended with either App Store account region or locale so that, Uber in this example could not include the Venmo SDK outside of the US where no one has heard of Venmo.

Or perhaps Apple could extend the on-demand assets to allow for some sort of plugin system, perhaps backed by Swift Packages, such that apps can on-demand decide they need the Venmo SDK because they're in the US, and download just that. I don't think we want a generalised package manager here, I don't envision that SDK coming from Venmo directly, but allowing an app author to upload all their separate packages if they want to.

With feature heavy, international apps such as Uber I'd expect this to dramatically improve things. I'm not sure whether this benefit would translate to that much demand across the whole App Store though as I think this matters more to a very few big apps. Apple is at that optimisation point in the iOS lifecycle though so perhaps it's worth it to them.


Well, since on-demand resources were made for games (Lua is specifically called out as 'ok'), I could imagine many games also making their initial size far smaller if they used binary code to make new levels or regions. Games are a very large chunk of the App Store and its revenue.


Games don't really use those technologies because we want to use the same tech on Android and Apple, so we typically roll our own or pay a third-party BaaS. The big players probably almost always roll their own.


Sure, but until Apple makes an official way to provide binary dynamic libraries on demand without breaking the App Store rules, which would probably be delivered through something like binary on-demand resources, the big game players can't do it either. Whatever official version Apple makes will probably update in the background better too, since they have full OS control, unlike ad hoc apps.


Can someone help me understand this? They blame the source of the large bundle size on:

> The choice of Swift as our primary programming language, our fast-paced development environment and feature additions, layered software and its dependencies, and statically linked platform libraries result in large app binaries

but can somebody familiar with iOS development explain what makes app bundles so big? Actual CPU instructions or config can't contribute this significantly. The entire Bible is about 4.5mb. If you're writing an app by yourself you almost certainly didn't write that much text in the source code. A sibling comment links to https://news.ycombinator.com/item?id=25376346 which says that they have a lot of screens but even something like "PayTM (15+ screens)" is still just textual source code and config that I don't follow how it gets beyond kilobytes. The App Store places them at 309mb, so ~68 bibles.

I understand when games are large because they typically ship with images and videos included in the binary for game assets. But for a normal application where does the size come from?

Is it dependencies? (And how did _they_ get so big?) That weird intro video they have on the loading screen? Are they shipping bitmaps of the cities they have markets in?


Uber's article focuses on binary size, but the App Store 309mb number is app bundle size. 120mb of this is not coming from the binary. I have a breakdown of this here: https://news.ycombinator.com/item?id=25380198

App size can be measured in many ways like download size, install size, binary size, thinned size. I wrote about the most important ones here: https://docs.emergetools.com/docs/what-is-app-size


I'd say this is the most comprehensive breakdown: https://news.ycombinator.com/item?id=25376346


Let's say there are a hundred screens in the app and the app is 300mb. Does it really take 3mb of source code, about 3/4 of a bible, to render one screen?

(I do understand that source code isn't what ships in the binary, but for the sake of argument let's say they're 1:1 in size.)


Considering that Uber's binary size (330MB) is comparable to similar apps such as Google Maps (224MB), Lyft (435MB), and Didi (332MB), it might just be par for the course for iOS apps.

Yelp, for example, is what you might call a "straightforward CRUD app" (to Yelp engineers, I know it's probably legit complicated and hard), and that is 292MB on the App Store.

It's probably to do with how the framework handles lifecycle management and combining static assets like text and image with business logic that lives in Controllers.


This is par for the course for large companies with many engineers working on writing code without spending enough time on keeping app sizes low.


It must be mostly the dependencies and assets they're pulling in for each screen, and not simply the source code. They could be using a different SDK for each type of payment they take, which is a lot. If the app has 250 features, and each feature includes 4MB of assets (images, icons, sounds, etc.), that's already a gig right there. I also suspect that there's a lot of reinventing the wheel going on, since there's 40+ feature teams all working on the app at the same time.


"Reinventing wheels" are represented by the machine-code outlining :)

These are code. Swift is a safe language with more runtime checks than other "zero-abstraction" languages. It also supports "value" semantics and can apply monomorphization to generics (although with no guarantee). All of this means you can have functions with slightly different view models duplicated many times throughout the binary.

Not to mention that the language itself needs to generate a lot of retains / releases for refcounting purposes (the blog post also pointed this out).

All in all, Swift as a language is not particularly optimized for small binary sizes, and a lot of trade-offs were made to improve usability rather than binary size. That being said, there are more opportunities the compiler could exploit (and currently doesn't) to reduce binary size.


Redoing the same code feature does not lead to exactly the same machine code


The argument you’ll have by saying that for its sake will be pretty useless, because source code and machine code are nowhere near 1:1 in size.


The equivalent code in Objective-C is significantly smaller than the same code in Swift. Swift has a lot of implicitly specialized templates, which really bloats code size, like it would in C++. If you compare binary sizes of apps from the pre-Swift era, you'll notice many are far smaller. Like I remember Tweetbot being 4.5mb in the pre-iOS 6 days.


I'm not very familiar with iOS dev, but I'd suspect a lot of dependencies, yes.

Also, the Uber app has a LOT more features than you would expect at a glance, due to extensive customization of the experience (i.e. feature flagging) along many vectors, and so it wouldn't surprise me at all if this ends up adding to a lot of code.

Edit: Linked post from sibling commenter bhupy outlines this in detail.


> The app has a couple of millions of lines of code

I wonder if Uber is planning to do anything about that? The technique described in the article (whole program instructions outlining optimisation) is a band aid style solution, merely delaying the inevitable: the code produced by numerous teams independent of each other will inevitably cross first the download size limit threshold, and later maintainability threshold.


I also find it unbelievable to have so much code for a single app. This is approaching the order of magnitude of OS code bases: consider that the entire Linux kernel is around 28M lines, so roughly 15x the Uber app.

The binary size is also in the same ballpark as what the entire Windows 98 needed for installation.

I'm glad Uber is doing something about this, but in my opinion Apple should tackle this across their entire ecosystem at the toolchain level. Devices with less than 64GB of storage can quickly run out of space with just a handful of applications installed.

Unfortunately it's in Apple's interest that people buy devices with more storage, so I don't expect them to invest much effort in this.


> I also find it unbelievable to have so much code for a single app

This point comes up in a lot of discussions about non-trivial software. My theory is that it's of the same nature as underestimating development complexity when planning your own work as an engineer. Project after project, everyone (me included) keeps forgetting that they _will_ spend 80% of their time (and code) dealing with small issues and edge cases in their product.

Who hasn't thought at some point that they can write a Twitter clone in a weekend, or hasn't been fascinated by the number of simple bugs in someone else's product, thinking that they are obviously just bad engineers?


For both technical and UX reasons it's in Apple's interest to make app updates lightweight, and those likely far outweigh any driving force they might have to encourage people to move to higher storage sizes.

(1) if it takes a long time for people to update their apps, that's a crap experience that people are having on Apple devices, which goes directly against the grain of Apple's whole value proposition ("use our stuff and your life will be great!")

(2) For technical reasons, it's in Apple's interest to reduce app image sizes; less strain on infrastructure, easier to scale, etc. (300MB * 1.2M (# of app store reviews) = 360 terabytes transiting their networks whenever Uber pushes an update. All that has to be load balanced, CDN'd, etc.)
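
The estimate in (2) can be checked with a couple of lines. This is only a back-of-envelope sketch using the same assumptions as the comment: review count stands in for installs, every install downloads the full 300 MB, and units are decimal. It ignores compression or any partial-update mechanism, and the function name is made up for illustration.

```python
def update_traffic_tb(app_size_mb, installs):
    """Traffic in decimal terabytes if every install downloads the full app once."""
    return app_size_mb * installs / 1_000_000  # 1 TB = 1,000,000 MB


print(update_traffic_tb(300, 1_200_000))  # 360.0
```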


I imagine there are a few attempts.

One worth calling out (and recently written about) is server-driven UI: https://artem-tyurin.medium.com/screenflow-an-unfinished-att...

The more they can make the app a "thin-client" (effectively just taking configuration from the server on how to display components w/in a Screen), the more product code they can pull from the app.


> later maintainability threshold

I mean I imagine no one person or team completely understands the entire app. Different people/teams are responsible for different portions of the app. Each team only needs to understand their modules and the few other modules they interact with.


They are already waaaay beyond the maintainability threshold.


Complementary: This thread on Uber's transition to Swift that almost broke them https://twitter.com/StanTwinB/status/1336890442768547845

Includes, among other things: forcing Apple to increase cellular download limits, 45 seconds for letters to appear in Xcode, 12 seconds to call main, rewriting the linker and so on.


I'm sure swift's shifting sand castle language changes didn't help.


Good read. Good to know that Uber engineering culture was as much of a dumpster fire (despite the brilliant moves) as their product.


Hmm, looks like this comment didn't go over well. I think they had brilliant engineers (far beyond what I can even imagine) that pulled-off incredible feats to avoid disaster. But the disaster need not have loomed. Rewrite-the-world development smacks of what I call "cowboy engineering". Why didn't they migrate the existing code-base one layer at a time?


Brilliant is a big word for a company that still hasn't earned a profit


They took a lot of investors' money. So in a sense it was a success.


This is certainly a great read, and working on it must have been very interesting. That being said, in my experience things like these are invariably technical band-aids over social problems. Whenever I see things like this, often paired with statements like “there’s so many screens and feature flags”, usually the problem is not there but actually in many other processes: for example, the design team adds assets in a way that is not caught by the usual tooling that checks binary size, or the build process adds duplicate files into the bundle that nobody notices. Sometimes the underlying issue is hard to fix, like code size explosion due to a custom templating engine, but it really should get addressed at some point. Changes like these don’t actually solve the underlying issues, which can be a benefit for a while, but eventually they become so complex that they are hard to maintain and start impacting productivity in harder-to-measure ways, such as increasing build times and reducing the quality of debugging information.


> usually the problem is

Usually, sure. But sometimes there is a lot to do, and if I may, Uber is not your usual app. At the point where you're being very choosy about the access modifiers on your classes, you probably thought about icon assets already.

Someone elsewhere in thread linked to a partial list of concerns the app needs to cover, many of which are location-specific. You might say "well split the app by geography", but that just trades one set of problems for another, and that new set of problems could well be worse for the business overall. Paying a team of people to do this junk may be a whole lot cheaper than suffering a reduction in customer engagement when they fly to a new country and don't have the right app anymore.


> At the point where you're being very choosy about the access modifiers on your classes, you probably thought about icon assets already.

You'd think, but many of the most popular apps accidentally ship these all the time. I think another comment mentioned that much of the code size seems to be coming from a code generation framework.


> While power law and fractal patterns have revealed themselves in several physical, biological, and man-made phenomena, to our knowledge we are the first to identify their presence in machine-code sequences in computer executable code. Presumably, machine code is a human expression of instructions to a computer and it is well established that all human languages show a power-law in the frequency of the words.

Made me chuckle. Maybe the authors should look at getting an ACM subscription.

[https://dl.acm.org/doi/pdf/10.1145/1391984.1391986]


It is an overclaim to attribute the findings in the blog here to the "Power Laws in Software" paper published in 2008 in ACM Transactions on Software Engineering and Methodology. The ACM paper casts a wide net over the many places where distributions in software show a power law, and it focuses on software modules/libraries/classes and their dependencies. There is a mention of CPU ISAs using instruction frequency in CISC architectures, but no in-depth treatment of the subject. This blog focuses on the machine instructions and exhaustively analyzes not just instruction frequency but whole sequences of instructions, their frequencies, and their lengths.


Machine-code outlining sounds kind of like the opposite of function inlining. Right down to the name! I am amazed I've never heard of this optimization technique being used in compilers before -- it sounds like it could improve performance in many cases by making code smaller (or hurt performance for the same reason that inlining can help performance)...
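
To make the "opposite of inlining" idea concrete, here is a toy sketch that treats instructions as plain strings, finds the most frequent run of a fixed length, and replaces each occurrence with a call to one shared copy. It only illustrates the general idea; it is not the algorithm from Uber's post or from any compiler, and `outline_once` and the `OUTLINED_0` label are invented for this example. A real outliner also has to deal with registers, the stack, and call overhead.

```python
from collections import Counter


def outline_once(instrs, length):
    """Outline the most frequent run of `length` instructions, if that shrinks the code.

    Returns (new_instrs, outlined_body), or (instrs, None) when nothing is outlined.
    """
    windows = Counter(
        tuple(instrs[i:i + length]) for i in range(len(instrs) - length + 1)
    )
    if not windows:
        return instrs, None
    pattern, _count = windows.most_common(1)[0]

    # Replace non-overlapping occurrences, scanning left to right.
    out, i, replaced = [], 0, 0
    while i < len(instrs):
        if tuple(instrs[i:i + length]) == pattern:
            out.append("call OUTLINED_0")
            i += length
            replaced += 1
        else:
            out.append(instrs[i])
            i += 1

    # Each call site saves (length - 1) instructions; the shared body
    # costs `length` instructions plus a return.
    saving = replaced * (length - 1) - (length + 1)
    if saving <= 0:
        return instrs, None
    return out, list(pattern) + ["ret"]


code = [
    "retain x0", "mov x1, x0", "release x0", "add",
    "retain x0", "mov x1, x0", "release x0", "sub",
    "retain x0", "mov x1, x0", "release x0",
]
new_code, body = outline_once(code, 3)
print(len(code), "->", len(new_code) + len(body))  # 11 -> 9
```

The size win comes from paying for the repeated sequence once, and the cost is an extra call and return at every site, which is the performance trade-off mentioned above.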


It has... a few million... lines of code?

What?

Linux has 30 million of C!

I'm speechless. I cannot fathom how & why.


I've observed that lines of code are measured differently based on whether the writer is trying to convince the audience that the subject matter is big and complicated and the reader should respect the magnitude of dealing with this particular piece of software OR whether the author wants you to appreciate the brevity/simplicity/approachability of the software in question. The first decision made in this decision tree is whether you just use wc, or whether you filter out empty lines. Next go the comments. Next go the syntactically less significant lines (just a closing brace that could go on the previous line). Wash, rinse, repeat.

It's a variant of the "I didn't have time to write you a short ____ so I wrote a long one instead" adage.

I would guess (but only guess) that this article erred on the side of overstating size.


How? 1000 engineers x 1000 LoC/each = 1M LoC

Why? If you have 100+ engineers at any given time, shipping features over a period of a few years, you'll hit 1M in no time.

It sounds like a lot, but it really isn't when you consider the amount of people working on it.

Now whether or not you can build the same thing with less LoC, probably. But it's not like it was built from the ground up with every piece of functionality planned out from day 1, so there will be inefficiencies.

Comparing it to Linux is pointless. Platforms should be relatively stable, products are ever changing and the shelf-life of the code is sometimes measured in weeks/months.


The app is the size of a Debian installer with dozens of executables made over the course of 50 years.


Cannot fathom why LOC is a metric? Me neither. Lots of stuff has millions of lines of code in various languages with wide ranging feature-sets and functionality. LOC has near zero meaning across the language/project boundary.


...as a demoscener who has released multiple 64k/4k intros I hereby formally say: LOL.


It's maybe a naive question, but from my point of view I don't understand why it's even a problem. Why is the Uber app so big?


This article is bogus. They spend a whole bunch of time benchmarking build sizes, nothing about why they need so much code in an iOS app to begin with.

Apple has dropped limits for large app downloads on cellular. They now put up a dialog to tell the user how big the download is and if they wish to defer to a WiFi download.

I checked the size of the Uber app and it's about 300MB. Uber Driver is 232MB and Uber Eats is 228MB.


I would love to work on something like this. Optimizing assembler instructions for size, speed.. just writing some. How do you get a job in this? Is embedded land like this?


Why not do machine outlining in LTO/ThinLTO? `opt` doesn't really scale to huge modules in terms of memory consumption and multi-threading; that's the reason ThinLTO was invented in the first place.

I think adding machine outlining into LLVM Pass pipeline is still doable with LLVM plugin (with new PassManager)...worst case just come up with a custom LLVM/Clang


So Uber has about 5000 engineers. If all of them write/click/draw 10,000 keystrokes per day, and it's all new code, and Uber has been around 10 years, that's 182 gigabytes of 'human input'.

Compiling that down to 200 MB isn't too shabby!
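
For anyone who wants to check the 182 GB figure, here is the arithmetic as a small sketch. It assumes one byte per keystroke, 365 working days a year, and decimal gigabytes, none of which the comment states explicitly; the function name is made up.

```python
def human_input_gb(engineers, keystrokes_per_day, years,
                   bytes_per_keystroke=1, days_per_year=365):
    """Total 'human input' in decimal gigabytes under the stated assumptions."""
    total_bytes = (engineers * keystrokes_per_day * days_per_year
                   * years * bytes_per_keystroke)
    return total_bytes / 1e9


print(human_input_gb(5000, 10_000, 10))  # 182.5
```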


On a smaller note, adding lossless compression to the image assets in the Uber app can save more than 14% of 2.4MB

If your app has larger images, don't waste user bandwidth and optimize your assets!


I was quite surprised by the increase in build times:

> Overall, 5 rounds of outlining builds in 66 minutes — a 45-minutes addition to the baseline.


There's apparently a bananas crunch/backstory to this, where they committed to Swift before realizing they would hit its limits, and had to come up with a bunch of this optimization madness on the fly. I guess this is the cleaned up version and the more final, stable optimizations for the company blog:

https://twitter.com/StanTwinB/status/1336890442768547845


Wouldn't UPX give similar results in deduplicating binary code?


No. UPX won't work on iOS for various reasons, but it is also generally speaking a poor choice to use on macOS. There are several issues here:

* It usually does not reduce the size of the file in transit, as most files are compressed for distribution, and even if they are not, most HTTP servers will use transparent gzip compression

* It does not actually reduce the size of files at rest since APFS (and HFS+ before it) support transparent decompression. The layer this is handled at is sufficiently low level most people do not even realize it is happening (stat(2) returns the uncompressed size, you need to look at extended attributes to see the real on disk size). Admittedly this does not handle binaries that are drag installed on macOS. You can find out more details here: https://github.com/RJVB/afsctool

* UPX slows down app launch because you now have to decompress the entire executable before you launch it, which means you need to read the entire compressed executable from disk before you launch it

* UPX greatly increases the memory overhead of running an application. Because you decompressed it in memory, all the executable pages are dirty memory that needs to be kept in memory or written to swap. That means you immediately loaded everything into memory instead of just the pages you needed. Normally a binary's pages are brought in from disk as necessary, and because of that they are unmodified clean memory. The built-in compression support compresses smaller blocks of the binary and thus can still bring them in individually (technically this reduces the compression, but the trade-off of being able to keep page demand loading working is more than worth it).

* UPX makes the system perform worse under memory pressure. The fact that it decompresses the pages in userspace means that from the kernel's perspective they are dirty. If the kernel needs to evict them due to memory pressure it needs to swap them out. Uncompressed files (or those compressed with the built-in filesystem compression) are clean memory, which means that under memory pressure the kernel can just throw them out and then reload them from the file later, no need to write out the pages.

In the past there were legitimate reasons for tools like UPX, and there may still be on other operating systems, but it simply does not make sense on Darwin platforms.


Is part of the strategy using private iOS APIs, tracking everyone outside of ToS, lying about it, getting caught, and then being too large for Apple to actually punish them?


This seems like a case of sloppy product management saved (or rather having its consequences delayed) by person-years of ingenious engineering.

Build times in tens of minutes seem terrible.


Oh I would kill for a sub half hour build time at $dayjob.


Wouldn't native apps be much smaller? I wish more companies did this. Uber on Android handles quite poorly.


They are essentially compressing their app...?


wow, all that code and it's still absolutely terrible to use. Imagine


328.9 MB I'm sure it could easily be 1/10 of that while keeping all features important to the user.


Important to a user in 1 region/country.

https://news.ycombinator.com/item?id=25376346

Uber is a global app, so the other 9/10s of the code is for all the features and functionality you'll never see outside your region since there is currently no way to split up binaries by region.


328 mb more than a complete OS. No doubt they can EASILY trim 30-60%. Asset optimization, stripping SDK libraries and you're done.

It's like graffiti.. the app is so big already, that the devs don't give a damn about optimizing.. why bother if they are just A/B-feature tests? 30mb for some unoptimized screens for example


If you click the link OP posted and read the previous thread you will see they can "easily" do ass. They already spent millions in engineering time to cut the app size as the company existence literally depended on it when Apple's bundle size limit was 150 MB.

If your app is serving hundreds of cities with specific per-city customizations and all code and assets are in a single binary, life gets tough.


If you open up the IPA, you'll see that basically they've recreated some Android-XML-compatible rendering engine.

The localization files (50MB) -> All the strings files are double the size (unpacked), because of useless comments. There's 25MB already.

In the assets catalog -> half an MB for an upscaled (!) Visa card. Other images where JPEG or HEIF would be a better choice. Probably 10-20 MB in total.

Strip all ICC/Gamma from the PNGs -> another 10mb.

pngcrush the images -> about 40%

And then of course the binary itself which is probably full of unused information.


I would like Apple to be doing some of these against all the apps.

Few app developers have the time/bandwidth to do these things and it would be a very inefficient use of resources to have everyone do it over and over again.


I hope they have enough resources “They already spent millions in engineering time to cut the app size”

The image optimizations and precalculation should be done by Apple. But the dev could use a lossy format for certain files. That’s up to the dev, not Apple.

Maybe it’s because I used to write some J2ME games, or some games back when Apple only allowed 15mb, I think. I had to optimize the hell out of my assets. Still, I think it’s ludicrous that certain apps are almost half a GB in size.


Nice check. I guess they relaxed a bit since Apple raised the app size limits :) (sorry for the snark in the previous post)


Well it’s easy to assume that big companies do the work you’d expect. But I think the bigger the team, the less they care about all aspects.

Especially with A/B tests, because they are just temporary.


Curiously, they've also had that issue and wrote a dedicated tool to clean up unreachable code after old A/B tests:

https://www.infoq.com/news/2020/04/uber-piranha-unreachable-...

I also know it first-hand as just last week I've been doing a mega-cleanup of years-old A/B test flags in our own code.


tbh I'm actually surprised that the executables / code takes up so much space at all. I'm sure the executable has some embedded resources in as well, but 130mb for just the executable is quite a lot imo.


It is still amazing to me how Uber cannot narrow down the use cases enough. To me it was a done product in 2014, with no need for additional features. I think the software industry as a whole does not have the concept of 1.0. We are trying to ship one more thing all the time.


I'd take a sledge hammer to it. The app doesn't have to be an app at all. It could simply be a stream with an os interactive overlay that intercepts touches. Like a thin client for phones.


A stream may also make it harder for the app to work in areas with poor connection. Which, given Uber's use case, is probably a likely scenario and one that could lose a lot of customers to competitors.


Pull it on first run and cache it. If you're in a place with a poor internet connection, you aren't going to be able to download the app from the App Store anyway. I am going to assume 80% of the code packaged in the app is never used by a majority of their customer base. Like all the business features where a company gives its employees allowances.


From some simple experiments with recording my desktop to an MP4 file, I've found the delta compression to be extremely efficient when there is only a little motion. Perhaps still a deal killer, true.


That's called a web browser and a web app.

And then there wouldn't be anything to hog-up 1/3 of a GiB on every customer's phone, and it would always be up-to-date. Just don't ever lose internet access.


How do you expect to hail a taxi when you don't have internet access in the first place?

Does the app use SMS when the internet connection is lost?


Web apps can be cached and run entirely offline.

https://en.m.wikipedia.org/wiki/Progressive_web_application#...


Especially getting lost in a cell dead zone.

It's not exactly a web app but it could be made that way with WebRTC.


That could trigger the App Store review block. But this is Uber, so I'm guessing there's special treatment :)


Interesting. I've not looked at the long rule list in a while.


Uber is just a scam to launder Saudi blood money through SoftBank. There is zero chance that human-driven cars will go away. And self-driving cars cannot drive on roads with human drivers. Uber is in the later stages of the scam now and has "sold" (actually, they gave 400 million USD to the company they "sold" the division to) its self-driving setup. They've admitted the only business model that would make them profitable is impossible. It's over. They're just trying to take the money and run now.

That their "app" is large is irrelevant to the scam.


According to their public investor reports, they've been EBITDA profitable on Rides for years. They're also profitable in more rationalized Eats markets (markets where there aren't other VC-funded companies burning hundreds of millions of dollars on subsidies). What do they need self-driving for?


Not exactly: https://finance.yahoo.com/quote/UBER/financials?p=UBER

It is weird that something as exploitative as Uber can't even stay in business in the long term. Eventually profit does matter and the weak stocks will be culled in the next panic. Uber will be one of them.


To still be profitable after they stop breaking local laws and regulation catches up with them. https://horanaviation.com/publications-uber



