OK, so most of these problems are related to the Runner and the Docker containers used for builds:
1. You can make the Runner not create cache containers through the Advanced Settings, using the `disable_cache` option (see the config sketch below),
2. As for the Git repo, you can force the Runner to retry cloning,
3. The 500 with variables is new to me; let us know if you manage to get a stack trace for this error.
More generally, tuning the runner's polling interval to minimize latency is also tricky, especially with multiple independent runners, and the runner doesn't handle 429s well in my experience (getting stuck in a tight retry loop without backing off sensibly, thus continuing to exceed its request limit).
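For reference, a minimal sketch of the relevant Runner settings in `config.toml` (the runner name, image, and interval value are placeholders, and the Docker executor is assumed; the registration `url` and `token` are omitted for brevity):

```toml
# config.toml -- GitLab Runner (values below are placeholders)

# Global polling interval in seconds: how often the runner asks GitLab for new jobs.
# Raising it lowers request volume; lowering it reduces job pickup latency.
check_interval = 3

[[runners]]
  name = "docker-builds"        # hypothetical runner name
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"
    disable_cache = true        # don't create cache containers for builds
```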
What's frustrating is that this is a regression: I used to be able to queue up many jobs and never got these job failures (which, again, give no error message whatsoever).
Thanks for writing that comment and making us aware of that case. I'm very sorry that it took so long to get reviewed and merged. The number of issues and merge requests that pass through our hands is just crazy, and sometimes some of them get overlooked and aren't properly scheduled. Having persistent people like you is what finally gets them merged and made part of the product. Thank you for doing that! From our side, we will try our best to make this process more efficient.
One point may not have been clear: this isn't just a matter of a single bugfix being available to the userbase sooner rather than later. Rather, it is that multiple iterations through a development cycle were missed. From my rough recollection:
1. There was a problem setting up an OS X guest to work with the VirtualBox runner on OS X using passwords. We worked around this with pubkey-based auth between guest and host.
2. There was also a timeout issue mentioned on the tracker for which Sam found the relevant code and suggested (but did not create) a fix.
3. Sam seemed to think a snapshot-reuse approach was too complex and likely to hang or fail. I didn't agree and was trying to explore ways in which snapshots could be reused reliably, since doing so cut CI time in half for my use case.
4. For uploading artifacts, the OS X guest's gitlab-runner binary had the wrong name, requiring the user to manually create a symlink with the correct name. (I think this is fixed in 9.0, but I'm not sure since I still have the symlink there and it's still working.)
At the time my CI builds (and probably Sam's) weren't working at all, so we both ate the cost of learning how the CI build system worked in order to debug it. If there had been feedback saying that his MR was the right (or wrong) approach, I would have been happy to apply the knowledge I gained at the time to the related problems mentioned above in order to debug and eventually fix them.
But now that knowledge is gone. The cost for me to regain it outweighs the benefits of small improvements to what is, at the moment, a working if imperfect system.
1. Since a review app is tied to a branch, the review app is stopped as soon as the branch is deleted. This is actually happening in the above video: at the end you can see the stopped review app :) (see the sketch after this list).
2. You can use your own services. Sid did propose using the Helm package. If you look into the Helm package sources you will see how they do it, and by following the same approach you can get the same result on vanilla Docker.
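To illustrate point 1, here is a rough sketch of how a review app and its stop job could be wired up in `.gitlab-ci.yml` (the job names, scripts, and URL are placeholders):

```yaml
# .gitlab-ci.yml -- hypothetical review app jobs (scripts and URL are placeholders)
review:
  stage: deploy
  script:
    - ./deploy-review.sh "$CI_BUILD_REF_NAME"      # deploy this branch somewhere reachable
  environment:
    name: review/$CI_BUILD_REF_NAME
    url: https://$CI_BUILD_REF_NAME.review.example.com
    on_stop: stop_review                           # run this job when the branch goes away

stop_review:
  stage: deploy
  script:
    - ./teardown-review.sh "$CI_BUILD_REF_NAME"
  when: manual
  variables:
    GIT_STRATEGY: none                             # the branch may already be deleted
  environment:
    name: review/$CI_BUILD_REF_NAME
    action: stop
```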
The biggest plus of using the integrated registry is that you get GitLab's integrated authentication and authorization, following the groups and members assigned to your GitLab projects, which makes it really easy to keep private container repositories on the registry.
Second, the built-in registry is really easy to configure and maintain: you only have to specify the address and provide a certificate to start using it. You also use the same backup mechanism as for your GitLab installation.
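For example, assuming the Omnibus package, that boils down to roughly this in `/etc/gitlab/gitlab.rb` (the domain and certificate paths below are placeholders):

```ruby
# /etc/gitlab/gitlab.rb -- Omnibus package assumed; domain and paths are placeholders
registry_external_url 'https://registry.example.com'

# TLS certificate and key for the registry host
registry_nginx['ssl_certificate']     = "/etc/gitlab/ssl/registry.example.com.crt"
registry_nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/registry.example.com.key"
```

After a `sudo gitlab-ctl reconfigure`, the registry should be served on that address.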
In the future we may make it possible to use external storage for container images, like S3. This is something that is already supported by docker/distribution.
I am super stoked to see this and will be using the crap out of it and pointing others to it (the registry is kind of a pain to get going)! It looks like this is just the v2 registry (from Distribution) integrated into GitLab, so I'm wondering what's stopping me from backing this registry with S3? Is it just not supported by the GitLab config YAML? I back my private registry with S3 and it's just a couple of config options to enable it. Or am I misunderstanding some fundamental concept here? Thanks for the awesome work!
Glad to hear you're super stoked! I think you're on the money regarding the S3 backing: it's really a matter of making that configuration accessible. I expect you can work around that by doing it yourself.
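If you do go that route, the S3 part of docker/distribution's own `config.yml` looks roughly like this (the bucket, region, and credentials below are placeholders; check the distribution storage-driver docs for the full option list):

```yaml
# registry config.yml (docker/distribution) -- placeholder credentials and bucket
storage:
  s3:
    accesskey: AKIAEXAMPLE          # placeholder
    secretkey: REPLACE_ME           # placeholder
    region: us-east-1
    bucket: my-registry-bucket
    encrypt: true
  cache:
    blobdescriptor: inmemory
```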
It really comes down to cost and convenience. We don't charge anything additional to run the container registry (including unlimited private projects for personal or business use), and it's already installed with GitLab.
Having said that, we do love deep integration, so we'll continue to improve it going forward. If you have any ideas for improvements, please do create an issue!
We are thinking big, and the integrated Container Registry is part of our plan. We will soon have integrated deployments with easy-to-access environments. We also plan to introduce manual actions on pipelines, allowing you to execute arbitrary things, e.g. promote to production, but also merge back to master.
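Purely as a speculative sketch of that last point (the feature is still being planned, so the keyword, job name, and script here are assumptions on my part), a promote-to-production action could end up looking something like this in `.gitlab-ci.yml`:

```yaml
# .gitlab-ci.yml -- speculative sketch; keyword, job name, and script are assumptions
production:
  stage: deploy
  script:
    - ./deploy.sh production        # placeholder deployment step
  when: manual                      # pipeline pauses here until someone triggers the job
  environment:
    name: production
```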
Kamil, GitLab CI/CD Lead