[{"data":1,"prerenderedAt":817},["ShallowReactive",2],{"/en-us/blog/how-we-built-gitlab-geo":3,"navigation-en-us":37,"banner-en-us":447,"footer-en-us":457,"blog-post-authors-en-us-Gabriel Mazetto":699,"blog-related-posts-en-us-how-we-built-gitlab-geo":713,"blog-promotions-en-us":754,"next-steps-en-us":807},{"id":4,"title":5,"authorSlugs":6,"body":8,"categorySlug":9,"config":10,"content":14,"description":8,"extension":25,"isFeatured":12,"meta":26,"navigation":27,"path":28,"publishedDate":20,"seo":29,"stem":33,"tagSlugs":34,"__hash__":36},"blogPosts/en-us/blog/how-we-built-gitlab-geo.yml","How We Built Gitlab Geo",[7],"gabriel-mazetto",null,"engineering",{"slug":11,"featured":12,"template":13},"how-we-built-gitlab-geo",false,"BlogPost",{"title":15,"description":16,"authors":17,"heroImage":19,"date":20,"body":21,"category":9,"tags":22},"How we built GitLab Geo","Take a deep dive into the many architectural decisions we made while building GitLab Geo.",[18],"Gabriel Mazetto","https://res.cloudinary.com/about-gitlab-com/image/upload/v1749678985/Blog/Hero%20Images/how-we-built-geo-cover.jpg","2018-09-14","\n[Geo](https://docs.gitlab.com/ee/administration/geo/index.html), our solution for read-only mirrors of your GitLab instance, started with our co-founder [Dmitriy Zaporozhets](/company/team/#dzaporozhets)’ crazy idea of making not only the repositories, but the entire GitLab instance accessible from multiple geographical locations.\n\nAt that time (Q4 of 2015) there were only a few competitors trying to provide an *automatic mirroring* solution for repositories and/or issue trackers, and they were mostly built around an additional independent instance and a bunch of webhooks to replicate events. Also, in those cases, no other data was shared outside this asynchronous replication channel, and you had to set up the webhook per project and take care of the users yourself. Long story short: this was not practical for any instance with more than a couple of projects.\n\nWe also had a previous experience early that year [using DRBD to migrate 9 TB of data](/blog/moving-all-your-data/) from our dedicated co-location hosting to the AWS cloud,\nwhich didn't provide the scale, performance, or the UX we had in mind for the future.\n\nHere's the history of how we built Geo:\n\n## Phase 1: MVP\n\nGeo's first mission was to provide people who were located in satellite offices, or in distant locations, with fast access to the tools they need to get work done. The plan was not only to make it faster for Git clones to occur in remote offices but also to provide a fully functional read-only version of GitLab: all project issues, Git repositories, Wikis, etc. automatically synchronized from the primary with as little delay as possible.\n\nTo get there we made a few architectural decisions:\n\n#### 1. Use native database replication\n\nThis would allow us to replicate any user-visible information, user content, user and permissions, projects, any project relation to groups/namespaces, etc. Basically, any data ever written to the database in the primary node made readily available to the others, without any extra communication overhead in the webhooks.\n\nIt is also the most [Boring Solution](https://handbook.gitlab.com/handbook/values/#efficiency), as it uses proven technologies developed for databases in the past two decades. To simplify the endeavor we decided to support only PostgreSQL.\n\n#### 2. 
#### 2. Use API calls to notify secondary nodes of changes that should happen on disk

This is the second synchronization mechanism. If a new project is created or a repository updated, this notification lets every other node know it has a pending action and should replicate the new data on disk.

#### 3. Use Git itself to replicate the repositories

We investigated many alternatives for replicating our repositories, from basic UNIX tools (like `rsync` or equivalent) to specific distributed filesystem features. We were aiming for a simple solution, as ideally we had to support the lowest common denominator: a Linux machine running the default filesystem (ext3 or ext4). That limitation ruled out any implementation based on a distributed filesystem.

We considered `rsync` and its variants as well, which could potentially work for our use case, but they would add significant CPU overhead to each synchronization operation, and we expected that overhead to grow as repositories get bigger and bigger.

Using `rsync` would also require granting more on-disk permissions than we were comfortable with, and restricting its reach could be an engineering challenge in itself. The same can be said for `scp` and its variants.

In the end, we decided to use Git itself and benefit from its internal protocol. This was an easy decision to make: we understood the protocol well enough, and we already had the required safeguards in place. All we needed was a slightly different authentication mechanism for the node-to-node synchronization.

#### 4. Always push code to the primary, pull code from anywhere

When we started Geo, there was no bundled Git support for a multi-repository "transactional" replication, or information on how to implement one.

We quickly figured out that implementing something along those lines would require either a *global lock* or a variant of [Raft](https://raft.github.io/)/[Paxos](https://en.wikipedia.org/wiki/Paxos_(computer_science)) on top of Git's internal protocol.

Both solutions have their downsides and tradeoffs, and factoring in the time and effort required to build them correctly led us to opt for the simplest implementation: always push to the primary, notify secondaries that repository data changed, and have the secondaries fetch the changes. This is also in line with our motto of [Boring Solutions](https://handbook.gitlab.com/handbook/values/#efficiency).

The initial repository synchronization is no different from a `git clone <remote> --mirror`. The same goes for repository updates: they behave very much like `git fetch <remote> --prune`. The difference is that we also need to replicate additional internal metadata that is not normally exposed to a regular user.
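Conceptually, a secondary's repository sync boils down to those two Git commands. Here is a sketch of what that loop might look like, with a hypothetical URL and path; the real implementation also replicates internal metadata and uses Geo's own node-to-node authentication:

```python
# Sketch of the conceptual sync flow on a secondary node.
# URL and path are hypothetical placeholders.
import os
import subprocess

PRIMARY_URL = "https://primary.example.com/group/project.git"  # hypothetical
LOCAL_PATH = "/var/opt/repos/group/project.git"                # hypothetical

def sync_repository():
    if not os.path.isdir(LOCAL_PATH):
        # Initial synchronization: equivalent to `git clone <remote> --mirror`
        subprocess.run(["git", "clone", "--mirror", PRIMARY_URL, LOCAL_PATH],
                       check=True)
    else:
        # Subsequent updates: equivalent to `git fetch <remote> --prune`
        subprocess.run(["git", "-C", LOCAL_PATH, "fetch", "origin", "--prune"],
                       check=True)
```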
![GitLab Geo - MVP Synchronization Architecture Diagram](https://about.gitlab.com/images/blogimages/how-we-built-geo/geo-architecture-mvp.png){: .medium.center}

#### 5. Don’t replicate Redis data between nodes

We initially thought we could replicate Redis, like the main database, in order to share cached data, session information, etc. This would have allowed us to implement a Single Sign-On solution very easily, and by reusing the cache we would speed up the initial page load.

At that time Redis only supported **Leader** to **Follower** replication, and even though that is usually very fast on a local network, replicating data across disparate geographical locations can add significant latency.

This additional latency would undermine the initial objective of simplifying the Single Sign-On implementation. If you log in on the primary node and get redirected to the secondary, chances are the session information will not yet be available on the secondary node due to replication latency.

That would eventually fix itself by redirecting back and forth, but if the latency is significant enough, your browser will terminate the connection because of its redirect-loop prevention. Another downside of this approach is that it creates a hard dependency on the primary node being online; otherwise the secondary node would be inaccessible and/or completely broken.

On top of all these issues, we needed a writable Redis instance on the secondary node anyway, in order to persist jobs for our Jobs system.

So it made sense, in the end, to give up on the idea of replicating Redis, and we started looking for another solution to the authentication problem.

#### 6. Authenticate on the primary node only

Because we can’t write to the main database of secondary nodes, audit logs, brute-force protection state, password recovery tokens, etc. can't be persisted on secondary nodes. The only viable solution, then, is to authenticate on the primary and redirect the user to the secondary.

This decision also helped with the integration of any company-specific authentication systems. If a company uses internal authentication based on LDAP, CAS, or SAML, for instance, it doesn't have to replicate that system to the other location or configure firewall rule exceptions to accept traffic over the internet.

#### 7. Implement Single Sign-On and Single Sign-Off using OAuth

With the previous Redis limitations in mind, we looked into alternatives for implementing authentication. We had to choose between a CAS-based and an OAuth-based solution. As we already had OAuth provider support inside GitLab, we went with that.

Basically, for every Geo node configured in the database we also have a corresponding OAuth application inside GitLab, and whenever a user tries to log into a Geo node, they get redirected to the primary node and, on first login, need to "allow" the "Geo application" to access their account.

The shortcoming here is that if you are not already logged in and the primary goes down, you can't log in again until the primary node's connectivity issue is fixed.

#### 8. Build a read-only layer on the application side to prevent accidents

We needed this safeguard in place in case any required subsystem was misconfigured. With the read-only layer, we can prevent the instance from diverging from the primary in a non-recoverable way. It's also this layer that prevents anyone from pushing a repository change directly to a secondary node.
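The post doesn't show the layer itself, but the shape of such a guard is simple: every write path asks whether the current node may write before proceeding. A hypothetical sketch, not GitLab's actual code:

```python
# Hypothetical sketch of an application-side read-only guard; this is not
# GitLab's actual code, just the shape of the safeguard described above.
class ReadOnlyNodeError(Exception):
    """Raised when a write is attempted on a Geo secondary."""

class GeoNode:
    def __init__(self, secondary):
        self.secondary = secondary

def ensure_writable(node):
    # Secondaries must never accept writes, or they would diverge
    # from the primary in a non-recoverable way.
    if node.secondary:
        raise ReadOnlyNodeError("this node is a read-only Geo secondary")

ensure_writable(GeoNode(secondary=False))   # fine on the primary
# ensure_writable(GeoNode(secondary=True))  # would raise ReadOnlyNodeError
```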
#### 9. Don’t replicate any user attachments yet, just redirect to the primary

Instead of trying to replicate user attachments at this stage, we decided to simply rewrite their URLs to point to the primary node. This allowed us to iterate faster and still provide a decent experience to end users.

They would still enjoy faster access to the repository data and have the web UI render content from a closer location, with the exception of issue/merge request attachments, avatars, etc., which were still fetched from the primary. But as those are also highly cacheable, the impact is minimal.

This was the initial foundation that allowed us to validate Geo as a viable solution. Later on, we took care of replicating the missing data as well.

### Bonus trivia

The name **Geo** came only after a while: the feature was first called **GitLab RE** (*Read-Only Edition*), then **GitLab RO** (*Read Only*), before getting its final name: **GitLab Geo**.

## Phase 2: First-generation synchronization mechanism

With the MVP implementation done, we needed to pave the way for a stable release. The first part we decided to improve was the notification mechanism for pending changes. During the MVP, we built a custom API endpoint and a buffered queue. That queue was optimized to store only unique, idempotent events: if a project received three push events in the last few seconds, we only needed to store and process one event notification.

We decided that instead of building our own custom notification "protocol" and implementing early optimizations, we should leverage an existing GitLab capability: our own webhooks system.

![GitLab Geo - First Generation Synchronization Architecture Diagram](https://about.gitlab.com/images/blogimages/how-we-built-geo/geo-architecture-first-gen.png){: .medium.center}

By taking that route, we would be forced to "[drink our own champagne](https://en.wikipedia.org/wiki/Eating_your_own_dog_food#Alternative_terms)" and, as a result, improve our existing functionality. That decision actually improved our system-wide webhooks in a few ways: we added new system-wide webhook events, expanded the granularity of the information available, and fixed some performance issues.

We also improved the security of our webhooks implementation by adding a way to verify that a notification came from a trusted source. Previously, the only option was whitelisting the originating IP address.

This security limitation was not present in the MVP version, as we reused the admin's personal token as the authorization mechanism for the API; that is also not ideal, but better than the previous webhook implementation.
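GitLab's webhook verification is based on a shared secret token, sent to the receiver in the `X-Gitlab-Token` header. A minimal receiver-side check, as a sketch (the secret value is a placeholder):

```python
# Minimal check of GitLab's webhook secret token, which arrives in the
# X-Gitlab-Token header. The secret itself is a placeholder value.
import hmac

WEBHOOK_SECRET = "change-me"  # configured both in GitLab and on the receiver

def trusted_source(headers):
    token = headers.get("X-Gitlab-Token", "")
    # compare_digest avoids leaking the secret through timing differences
    return hmac.compare_digest(token, WEBHOOK_SECRET)
```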
I consider this the first generation of the synchronization mechanism used in the wild. It reacted in near real time to small updates, and webhooks were fast enough, and parallelizable enough, for the scale we wanted to support.

As the very first version of Geo was only concerned with getting repositories available and in sync from one location to the other, that's where we focused all of our efforts. At that time, setting up a new Geo node required an almost identical clone of the primary to be available in advance. That included not only replicating the database but also *rsyncing* the repositories from one node to the other. For improved consistency, we initially required a *stop the world* phase, so as not to lose changes made between the start of the backup and the moment the secondary node was completely set up.

While this was still close to a barebones solution, it already provided value for remote teams working together on a shared repository, or simply for any project that needed to synchronize code between different locations. We had a few customers trying it out and evaluating its potential, but it was still not ready for production use, as we were still missing a lot of functionality.

The *stop the world* phase mentioned above was later phased out with the help of improved setup documentation. Much later, a good chunk of the initial cloning step got simplified by leveraging improvements in the next-generation synchronization and by introducing a backfilling mechanism.

### First-generation synchronization pitfalls

While our first-generation solution worked fine for highly active repositories, the use of webhooks as a notification mechanism had some really obvious drawbacks.

If, for any reason, a notification failed to be delivered, it had to be rescheduled and retried. And because we were using our internal Jobs system to dispatch the webhooks, having a node go dark for a few hours meant our Jobs system would be busy retrying operations against an unreachable destination for at least that same amount of time.

Depending on the volume of data and how long changes had been accumulating, that could even fill up the Redis instance's storage. If that ever happened, we would have to resync the whole instance and start from scratch.

We improved the retry mechanism with custom Geo logic to alleviate the problem, but it was clear to us that this was not going to be a viable solution for a Generally Available (*stable*) release.

Also, the backoff algorithm in the retry logic, combined with the asynchronous nature of the system, meant important changes could take a long time to replicate, especially in less active projects. The busiest projects were less affected, as any update to the repository would bring it to the current state rather than to the state at the time the update notification was issued, and a project receiving many updates during the day is expected to generate many notification events as well.
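To make that trade-off concrete, here is a generic sketch of retrying with exponential backoff; the parameters are invented, not Geo's actual values. The longer a notification keeps failing, the longer the gap until the next attempt, which is exactly why a quiet project could lag far behind:

```python
# Generic sketch of retrying with exponential backoff, as described above.
# max_attempts, base_delay, and cap are invented illustrative values.
import random
import time

def retry_with_backoff(operation, max_attempts=8, base_delay=1.0, cap=600.0):
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Delay doubles on each attempt (1s, 2s, 4s, ...), with jitter,
            # capped so an unreachable node isn't hammered forever.
            delay = min(cap, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay / 10))
```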
Any implementation misstep between sending the webhook and receiving and processing it on the other side could mean losing that information forever. Again, this was not a major issue for highly active projects, which would eventually receive a new, valid update notification and sync to the current state, but the outliers could stay out of date until someone noticed or another update arrived much later.

We also wanted to make Geo a viable Disaster Recovery solution in the long term, so missing updates without a way to recover from them was not an option.

## Phase 3: Second-generation synchronization mechanism

We started looking for alternative ways of notifying the secondary nodes, and also considered switching to standalone queue systems instead. We were also worried about the lack of control over the order in which operations would happen in a parallel, asynchronous replication system, and about the effect that had on the data on disk.

A few examples of situations that can arise from the parallelism and the async nature of the system:

1. A project removal event can be processed before a project update for the same repository
1. A rename, followed by the creation of a project under the new name and new content being pushed to it, can lead to temporary data loss if processed in the wrong order

There was also the case where a notification arrived before the database had replicated the required data: for example, when the node receives the notification for a new project's creation, but the database doesn't have the project yet.

That required the secondary node to keep a "mailbox" of received events until they were ready to be processed. As they were basically Jobs, that meant retrying until the job succeeded.

Considering all the complexity we had brought into the application layer, we investigated a few standalone queue systems to which we could offload the burden, but ultimately decided to build an event queue mechanism in PostgreSQL instead, as it had three important advantages:

#### 1. No extra dependencies
We were already replicating the database, so there is no need to install, configure, and maintain another process, worry about backing up yet another component, integrate it into our Omnibus package, and provide support for it to our users.

#### 2. No more delayed processing
If an event arrives on the other side, the data associated with it is already there as well. We can also guarantee consistency with transactions, and repeat less information than with the webhooks implementation.

#### 3. Easy to retry/restore from backup or in a disaster situation
With a standalone queue system, to have a consistent backup you either need some sort of <abbr title="Write-Ahead Logging">WAL</abbr> files that can help rebuild a consistent state across the systems, or you have to do backups in a "stop the world" way; otherwise, you may lose data.

### Our implementation

We took inspiration from how other log-based replication systems work (like the database itself) and implemented ours with a central table as the main source of truth, plus a few others holding bookkeeping for specific event types. Any relevant information we used to ship with the webhook notification is now part of this implementation, with extras to support the previously missing replicable events.

On the secondary node, these new tables are read by a dedicated daemon (we call it the Geo Log Cursor), and as the name suggests, it holds a persistent pointer to the last processed event. This also allows us to report the state of replication and monitor whether replication is broken. We also made it highly available, so you can boot one up as **Active** and keep a few extras as **Standby**. If the Active daemon stops responding for a specified amount of time, a new election starts and one of the Standbys takes over as the new Active.
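The post doesn't publish the schema, but a log-based queue with a persistent cursor can be sketched in a few lines of SQL and Python. All table and column names below are invented; in Geo, the event tables arrive on the secondary via database replication, while the cursor lives in the separate tracking database (a single connection is used here for brevity):

```python
# Hypothetical sketch of a log-based event queue with a persistent cursor,
# in the spirit of the Geo Log Cursor; all names are invented.
import psycopg2

SCHEMA = """
CREATE TABLE IF NOT EXISTS geo_events (
    id         bigserial PRIMARY KEY,  -- monotonically increasing log position
    event_type text      NOT NULL,     -- e.g. 'repository_updated'
    payload    jsonb     NOT NULL
);
CREATE TABLE IF NOT EXISTS geo_cursor (
    id            int    PRIMARY KEY DEFAULT 1,
    last_event_id bigint NOT NULL DEFAULT 0   -- last processed log position
);
INSERT INTO geo_cursor (id) VALUES (1) ON CONFLICT DO NOTHING;
"""

def process_pending(dsn, handle):
    """Read events past the cursor, handle them, and advance the cursor."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(SCHEMA)
        cur.execute("SELECT last_event_id FROM geo_cursor WHERE id = 1")
        last_id = cur.fetchone()[0]
        cur.execute("SELECT id, event_type, payload FROM geo_events"
                    " WHERE id > %s ORDER BY id", (last_id,))
        for event_id, event_type, payload in cur.fetchall():
            handle(event_type, payload)  # replay is safe if handlers are idempotent
            cur.execute("UPDATE geo_cursor SET last_event_id = %s WHERE id = 1",
                        (event_id,))
```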
The second part of the new system requires a persistence layer on the secondary node to keep synchronization state and metadata. We implemented this with another PostgreSQL instance.

We couldn’t reuse the main instance, as we were replicating it in *Streaming replication* mode. With *Streaming replication*, the whole instance is replicated, and you can't write anything to it. The alternative that allows replicating and writing to the same instance is *Logical replication* mode, but at that time there was no official *Logical replication* support in the PostgreSQL versions we supported (pglogical was also not a viable alternative back then).

With the new persistence layer (we call it the *Geo Tracking Database*), we had the foundation to actively compare "desired vs. actual" state and find missing data on any secondary instance. We built a more robust backfilling mechanism on top of that as well.

Queries spanning the two database instances (the replicated secondary and the Tracking Database) were made much faster and more scalable by enabling the Postgres FDW ([Foreign Data Wrapper](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html)) extension. That allowed us to query data with a few **LEFT JOIN** operations across the two instances, instead of polling with multiple queries from the application layer against the two databases in isolation.
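As an illustration of the gain: once the replicated database's tables are exposed inside the Tracking Database through `postgres_fdw`, the "desired vs. actual" comparison collapses into a single query. The schema and table names below are invented for the sketch:

```python
# Hypothetical sketch of the cross-database query pattern; schema and table
# names are invented. Assumes the replicated database's tables were imported
# into the tracking database, e.g. with:
#   CREATE EXTENSION postgres_fdw;
#   CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw ...;
#   IMPORT FOREIGN SCHEMA public FROM SERVER gitlab_secondary
#     INTO gitlab_secondary;
import psycopg2

FIND_STALE = """
    SELECT p.id
    FROM gitlab_secondary.projects AS p      -- foreign table (replicated DB)
    LEFT JOIN project_registry AS r          -- local tracking-database table
           ON r.project_id = p.id
    WHERE r.project_id IS NULL                              -- never synced
       OR r.last_synced_at < p.last_repository_updated_at   -- out of date
"""

def stale_project_ids(tracking_dsn):
    """One LEFT JOIN replaces round-tripping between the two databases."""
    with psycopg2.connect(tracking_dsn) as conn, conn.cursor() as cur:
        cur.execute(FIND_STALE)
        return [project_id for (project_id,) in cur.fetchall()]
```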
![GitLab Geo - Second Generation Synchronization Architecture Diagram](https://about.gitlab.com/images/blogimages/how-we-built-geo/geo-architecture-second-gen.png){: .medium.center}

### Other improvements

Another important shortcoming we fixed was how we replicated SSH keys. This was technical debt we had carried since the first implementation. Historically, GitLab built its SSH authorization mechanism the way many other Git implementations do: by writing each user-provided SSH key to the `authorized_keys` file on the server and pointing each one to our [gitlab-shell](https://gitlab.com/gitlab-org/gitlab-shell) application.

This implementation allowed us to authenticate authorized users, and because we control how the shell application is invoked, we can pass a specific key ID to it, which is later used to identify the user in our database and to authorize or deny operations on specific repositories.

The problem with this approach, in general, is that the bigger the user base, the slower the initial request will be, as OpenSSH has to scan the whole file (**O(N)** complexity). With Geo, it's not just about speed: any delay in updating this file, whether to add a new key or to revoke an existing one, is very undesirable.

When we decided to fix that, we did it for both Geo and GitLab Core, using an interesting feature present in newer versions of OpenSSH (6.9 and above) that allows overriding the authorized-keys step: instead of reading the keys from a file, OpenSSH invokes a specified command (*O(1)* complexity). You can read more about it [in the documentation here](https://docs.gitlab.com/ee/administration/operations/fast_ssh_key_lookup.html#doc-nav).
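The OpenSSH feature in question is the `AuthorizedKeysCommand` directive (the `%k` key token it relies on is what requires 6.9 or above). Per the linked documentation, the `sshd_config` change looks roughly like this; the exact path to the check command depends on how GitLab is installed:

```
# /etc/ssh/sshd_config (the script path varies per installation type)
Match User git
  # Look the key up via gitlab-shell instead of scanning authorized_keys
  AuthorizedKeysCommand /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell-authorized-keys-check git %u %k
  AuthorizedKeysCommandUser git
Match all
```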
We fixed another shortcoming around repository synchronization by switching from Git over SSH to Git over HTTPS. The initial motivation was to simplify the setup steps, but that decision also allowed us to shape the synchronization differently when it originated from a Geo node versus a regular request. Internally, we store additional metadata in the repository, as well as commits that may no longer exist in your regular branches but were part of a previous merge request or had user comments associated with them.

Switching fully to HTTP(S) also made it simpler to run our development instances locally with [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit), which helped improve our own internal development process as well.

## Phase 4: Third-generation synchronization and the path to a Disaster Recovery solution

While still working on Phase 3, we discovered another major limitation in how we stored files on disk. GitLab, for historical reasons, stored repositories and file attachments in a disk structure that mirrors the base URL routes. For the group and project `gitlab-org/gitlab-ce`, there would be a path on disk that includes `gitlab-org/gitlab-ce` as part of it. The same is true for file attachments.

Keeping both the database and the disk in sync, even without considering Geo replication, means that any time a project is renamed, several things have to be renamed on disk as well.

This is not only slow and error prone: what should we do if something fails to rename in the middle of the "transaction"? It is also problematic once replication comes into play, as we are again forced to process events in the correct order or risk a temporarily inconsistent state.

We tried to find a solution to the problems around the order of execution of events, and we came up with three ideas:

1. **Find or build a queue system that is guaranteed to process things in the same order they were scheduled**
2. **Detect and recover from any replication failure or data corruption**
3. **Make every replication operation idempotent, removing the queue-ordering requirement completely**

The first was easily ruled out: even if we switched to a queue system with that type of guarantee, it would either be slow, due to the lack of parallelism required to guarantee ordering, or extremely complex and hard to use, as it would require extra care to provide the same guarantees while also working in parallel.

We found no system that satisfied our needs, and even if we had adopted a standalone queue solution, we would have lost the Postgres advantage from the previous generation: having the main database and the queue system always in sync.

Ruling out the first idea, we took on the second, detecting and recovering from failures and data corruption, as we concluded we needed it for *Disaster Recovery* anyway. Any robust *Disaster Recovery* solution needs to guarantee that the data it holds is exactly the data it's supposed to have. If, for any reason, that data gets corrupted or someone removes it from disk, the solution needs a way of detecting that and restoring the data to the desired state.

To achieve that, we built a robust verification mechanism that generates a checksum of the repository's state and stores it in a separate table on the primary node. That table gets replicated to secondary nodes, where another checksum is calculated (and stored in the Tracking Database). If both checksums match, we know the data is consistent. The checksum is recalculated automatically when an update event is processed, but recalculation can also be triggered manually.

![Screen Capture - Repository Verification Status](https://about.gitlab.com/images/blogimages/how-we-built-geo/verification-status-primary.png){: .medium.center}

We used that mechanism to validate all repositories on GitLab.com when we successfully [migrated from Azure to GCP](/blog/gcp-move-update/) last month.

The verification mechanism alone is not enough, though: while it gives us a way to detect divergence, we can do better. That's why we also decided to implement the third idea and make every replication operation idempotent, removing any situation where processing events in the wrong order would put data in a temporarily inconsistent state.

We call that solution [Hashed Storage](https://docs.gitlab.com/ee/administration/repository_storage_types.html). It is a complete rewrite of how GitLab stores files on disk. Instead of reusing the paths present in the URLs, we hash the internal IDs and derive the disk path from that hash. With Hashed Storage, renaming a project or moving it to a new group requires only the database operations to be persisted, as the location on disk never changes.

![Hashed Storage and Legacy Storage example](https://about.gitlab.com/images/blogimages/how-we-built-geo/hashed-storage-disk-path-example.png){: .medium.center}

By making the paths on disk immutable and non-conflicting, any `create`, `move`, or `remove` operation can happen in any order, and it will never put the system in an inconsistent state. Replicating a project rename, or a move from one group/owner to another, requires only the database change to be propagated to take full effect on a secondary node.
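To make the idea concrete, here is a sketch of deriving an immutable path from a project's internal ID. The exact layout shown (a SHA-256 of the ID under an `@hashed` prefix, with two levels of fan-out directories) is illustrative:

```python
# Sketch of deriving an immutable disk path from a project's internal ID,
# in the spirit of Hashed Storage; the layout shown is illustrative.
import hashlib

def hashed_repository_path(project_id):
    digest = hashlib.sha256(str(project_id).encode()).hexdigest()
    # Two levels of fan-out directories keep any single directory small.
    # The path depends only on the immutable ID, so renames never touch disk.
    return f"@hashed/{digest[0:2]}/{digest[2:4]}/{digest}.git"

print(hashed_repository_path(1))
# @hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git
```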
## What to expect from Geo in the near future

Implementing Geo has been an important effort at GitLab, involving many different areas. It is a crucial infrastructure feature that allowed us to migrate from one cloud provider to another. We also believe it's an important component for the needs of many organizations today, from providing peace of mind about data safety in the event of a disaster, to easing the burdens of distributed teams across the globe.

We've been using the feature ourselves, which allowed us to stress-test the biggest and most challenging GitLab installation, GitLab.com, making sure it will work just as well for any other customer.

Over the upcoming months we will be focusing on the following items:

* Release a push proxy for Geo secondary nodes: [Pull and push from the same remote transparently](https://gitlab.com/groups/gitlab-org/-/epics/124)
* Release [Hashed Storage as *Generally Available*](https://gitlab.com/groups/gitlab-org/-/epics/75)
* Improve configuration: We want to reduce the steps and make it [simpler by automating most of them](https://gitlab.com/groups/gitlab-org/-/epics/367)
* Improve the verification step: [Improve the signals we use for the checksum](https://gitlab.com/gitlab-org/gitlab-ee/issues/5196)
* [Improve the Geo UX and UI](https://gitlab.com/groups/gitlab-org/-/epics/369)
* Keep improving performance and reliability
* Support replication of [GitLab Pages](https://gitlab.com/gitlab-org/gitlab-ee/issues/4611) and the internal [Docker Registry](https://gitlab.com/gitlab-org/gitlab-ee/issues/2870)

Cover photo by [NASA](https://unsplash.com/photos/Q1p7bh3SHj8) on [Unsplash](https://unsplash.com/)
Solutions",{"href":94,"dataGaName":90,"dataGaLocation":43},"/solutions/",[96,120,143],{"title":97,"description":98,"link":99,"items":104},"Automation","CI/CD and automation to accelerate deployment",{"config":100},{"icon":101,"href":102,"dataGaName":103,"dataGaLocation":43},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[105,109,112,116],{"text":106,"config":107},"CI/CD",{"href":108,"dataGaLocation":43,"dataGaName":106},"/solutions/continuous-integration/",{"text":72,"config":110},{"href":77,"dataGaLocation":43,"dataGaName":111},"gitlab duo agent platform - product menu",{"text":113,"config":114},"Source Code Management",{"href":115,"dataGaLocation":43,"dataGaName":113},"/solutions/source-code-management/",{"text":117,"config":118},"Automated Software Delivery",{"href":102,"dataGaLocation":43,"dataGaName":119},"Automated software delivery",{"title":121,"description":122,"link":123,"items":128},"Security","Deliver code faster without compromising security",{"config":124},{"href":125,"dataGaName":126,"dataGaLocation":43,"icon":127},"/solutions/application-security-testing/","security and compliance","ShieldCheckLight",[129,133,138],{"text":130,"config":131},"Application Security Testing",{"href":125,"dataGaName":132,"dataGaLocation":43},"Application security testing",{"text":134,"config":135},"Software Supply Chain Security",{"href":136,"dataGaLocation":43,"dataGaName":137},"/solutions/supply-chain/","Software supply chain security",{"text":139,"config":140},"Software Compliance",{"href":141,"dataGaName":142,"dataGaLocation":43},"/solutions/software-compliance/","software compliance",{"title":144,"link":145,"items":150},"Measurement",{"config":146},{"icon":147,"href":148,"dataGaName":149,"dataGaLocation":43},"DigitalTransformation","/solutions/visibility-measurement/","visibility and measurement",[151,155,159],{"text":152,"config":153},"Visibility & Measurement",{"href":148,"dataGaLocation":43,"dataGaName":154},"Visibility and Measurement",{"text":156,"config":157},"Value Stream Management",{"href":158,"dataGaLocation":43,"dataGaName":156},"/solutions/value-stream-management/",{"text":160,"config":161},"Analytics & Insights",{"href":162,"dataGaLocation":43,"dataGaName":163},"/solutions/analytics-and-insights/","Analytics and insights",{"title":165,"items":166},"GitLab for",[167,172,177],{"text":168,"config":169},"Enterprise",{"href":170,"dataGaLocation":43,"dataGaName":171},"/enterprise/","enterprise",{"text":173,"config":174},"Small Business",{"href":175,"dataGaLocation":43,"dataGaName":176},"/small-business/","small business",{"text":178,"config":179},"Public Sector",{"href":180,"dataGaLocation":43,"dataGaName":181},"/solutions/public-sector/","public sector",{"text":183,"config":184},"Pricing",{"href":185,"dataGaName":186,"dataGaLocation":43,"dataNavLevelOne":186},"/pricing/","pricing",{"text":188,"config":189,"link":191,"lists":195,"feature":275},"Resources",{"dataNavLevelOne":190},"resources",{"text":192,"config":193},"View all resources",{"href":194,"dataGaName":190,"dataGaLocation":43},"/resources/",[196,229,247],{"title":197,"items":198},"Getting started",[199,204,209,214,219,224],{"text":200,"config":201},"Install",{"href":202,"dataGaName":203,"dataGaLocation":43},"/install/","install",{"text":205,"config":206},"Quick start guides",{"href":207,"dataGaName":208,"dataGaLocation":43},"/get-started/","quick setup 
checklists",{"text":210,"config":211},"Learn",{"href":212,"dataGaLocation":43,"dataGaName":213},"https://university.gitlab.com/","learn",{"text":215,"config":216},"Product documentation",{"href":217,"dataGaName":218,"dataGaLocation":43},"https://docs.gitlab.com/","product documentation",{"text":220,"config":221},"Best practice videos",{"href":222,"dataGaName":223,"dataGaLocation":43},"/getting-started-videos/","best practice videos",{"text":225,"config":226},"Integrations",{"href":227,"dataGaName":228,"dataGaLocation":43},"/integrations/","integrations",{"title":230,"items":231},"Discover",[232,237,242],{"text":233,"config":234},"Customer success stories",{"href":235,"dataGaName":236,"dataGaLocation":43},"/customers/","customer success stories",{"text":238,"config":239},"Blog",{"href":240,"dataGaName":241,"dataGaLocation":43},"/blog/","blog",{"text":243,"config":244},"Remote",{"href":245,"dataGaName":246,"dataGaLocation":43},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"title":248,"items":249},"Connect",[250,255,260,265,270],{"text":251,"config":252},"GitLab Services",{"href":253,"dataGaName":254,"dataGaLocation":43},"/services/","services",{"text":256,"config":257},"Community",{"href":258,"dataGaName":259,"dataGaLocation":43},"/community/","community",{"text":261,"config":262},"Forum",{"href":263,"dataGaName":264,"dataGaLocation":43},"https://forum.gitlab.com/","forum",{"text":266,"config":267},"Events",{"href":268,"dataGaName":269,"dataGaLocation":43},"/events/","events",{"text":271,"config":272},"Partners",{"href":273,"dataGaName":274,"dataGaLocation":43},"/partners/","partners",{"backgroundColor":276,"textColor":277,"text":278,"image":279,"link":283},"#2f2a6b","#fff","Insights for the future of software development",{"altText":280,"config":281},"the source promo card",{"src":282},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758208064/dzl0dbift9xdizyelkk4.svg",{"text":284,"config":285},"Read the latest",{"href":286,"dataGaName":287,"dataGaLocation":43},"/the-source/","the source",{"text":289,"config":290,"lists":292},"Company",{"dataNavLevelOne":291},"company",[293],{"items":294},[295,300,306,308,313,318,323,328,333,338,343],{"text":296,"config":297},"About",{"href":298,"dataGaName":299,"dataGaLocation":43},"/company/","about",{"text":301,"config":302,"footerGa":305},"Jobs",{"href":303,"dataGaName":304,"dataGaLocation":43},"/jobs/","jobs",{"dataGaName":304},{"text":266,"config":307},{"href":268,"dataGaName":269,"dataGaLocation":43},{"text":309,"config":310},"Leadership",{"href":311,"dataGaName":312,"dataGaLocation":43},"/company/team/e-group/","leadership",{"text":314,"config":315},"Team",{"href":316,"dataGaName":317,"dataGaLocation":43},"/company/team/","team",{"text":319,"config":320},"Handbook",{"href":321,"dataGaName":322,"dataGaLocation":43},"https://handbook.gitlab.com/","handbook",{"text":324,"config":325},"Investor relations",{"href":326,"dataGaName":327,"dataGaLocation":43},"https://ir.gitlab.com/","investor relations",{"text":329,"config":330},"Trust Center",{"href":331,"dataGaName":332,"dataGaLocation":43},"/security/","trust center",{"text":334,"config":335},"AI Transparency Center",{"href":336,"dataGaName":337,"dataGaLocation":43},"/ai-transparency-center/","ai transparency 
center",{"text":339,"config":340},"Newsletter",{"href":341,"dataGaName":342,"dataGaLocation":43},"/company/contact/#contact-forms","newsletter",{"text":344,"config":345},"Press",{"href":346,"dataGaName":347,"dataGaLocation":43},"/press/","press",{"text":349,"config":350,"lists":351},"Contact us",{"dataNavLevelOne":291},[352],{"items":353},[354,357,362],{"text":50,"config":355},{"href":52,"dataGaName":356,"dataGaLocation":43},"talk to sales",{"text":358,"config":359},"Support portal",{"href":360,"dataGaName":361,"dataGaLocation":43},"https://support.gitlab.com","support portal",{"text":363,"config":364},"Customer portal",{"href":365,"dataGaName":366,"dataGaLocation":43},"https://customers.gitlab.com/customers/sign_in/","customer portal",{"close":368,"login":369,"suggestions":376},"Close",{"text":370,"link":371},"To search repositories and projects, login to",{"text":372,"config":373},"gitlab.com",{"href":57,"dataGaName":374,"dataGaLocation":375},"search login","search",{"text":377,"default":378},"Suggestions",[379,381,385,387,391,395],{"text":72,"config":380},{"href":77,"dataGaName":72,"dataGaLocation":375},{"text":382,"config":383},"Code Suggestions (AI)",{"href":384,"dataGaName":382,"dataGaLocation":375},"/solutions/code-suggestions/",{"text":106,"config":386},{"href":108,"dataGaName":106,"dataGaLocation":375},{"text":388,"config":389},"GitLab on AWS",{"href":390,"dataGaName":388,"dataGaLocation":375},"/partners/technology-partners/aws/",{"text":392,"config":393},"GitLab on Google Cloud",{"href":394,"dataGaName":392,"dataGaLocation":375},"/partners/technology-partners/google-cloud-platform/",{"text":396,"config":397},"Why GitLab?",{"href":85,"dataGaName":396,"dataGaLocation":375},{"freeTrial":399,"mobileIcon":404,"desktopIcon":409,"secondaryButton":412},{"text":400,"config":401},"Start free trial",{"href":402,"dataGaName":48,"dataGaLocation":403},"https://gitlab.com/-/trials/new/","nav",{"altText":405,"config":406},"Gitlab Icon",{"src":407,"dataGaName":408,"dataGaLocation":403},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":405,"config":410},{"src":411,"dataGaName":408,"dataGaLocation":403},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":413,"config":414},"Get Started",{"href":415,"dataGaName":416,"dataGaLocation":403},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/get-started/","get started",{"freeTrial":418,"mobileIcon":422,"desktopIcon":424},{"text":419,"config":420},"Learn more about GitLab Duo",{"href":77,"dataGaName":421,"dataGaLocation":403},"gitlab duo",{"altText":405,"config":423},{"src":407,"dataGaName":408,"dataGaLocation":403},{"altText":405,"config":425},{"src":411,"dataGaName":408,"dataGaLocation":403},{"button":427,"mobileIcon":432,"desktopIcon":434},{"text":428,"config":429},"/switch",{"href":430,"dataGaName":431,"dataGaLocation":403},"#contact","switch",{"altText":405,"config":433},{"src":407,"dataGaName":408,"dataGaLocation":403},{"altText":405,"config":435},{"src":436,"dataGaName":408,"dataGaLocation":403},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1773335277/ohhpiuoxoldryzrnhfrh.png",{"freeTrial":438,"mobileIcon":443,"desktopIcon":445},{"text":439,"config":440},"Back to pricing",{"href":185,"dataGaName":441,"dataGaLocation":403,"icon":442},"back to 
pricing","GoBack",{"altText":405,"config":444},{"src":407,"dataGaName":408,"dataGaLocation":403},{"altText":405,"config":446},{"src":411,"dataGaName":408,"dataGaLocation":403},{"title":448,"button":449,"config":454},"See how agentic AI transforms software delivery",{"text":450,"config":451},"Watch GitLab Transcend now",{"href":452,"dataGaName":453,"dataGaLocation":43},"/events/transcend/virtual/","transcend event",{"layout":455,"icon":456,"disabled":27},"release","AiStar",{"data":458},{"text":459,"source":460,"edit":466,"contribute":471,"config":476,"items":481,"minimal":688},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":461,"config":462},"View page source",{"href":463,"dataGaName":464,"dataGaLocation":465},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":467,"config":468},"Edit this page",{"href":469,"dataGaName":470,"dataGaLocation":465},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":472,"config":473},"Please contribute",{"href":474,"dataGaName":475,"dataGaLocation":465},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":477,"facebook":478,"youtube":479,"linkedin":480},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[482,529,583,627,654],{"title":183,"links":483,"subMenu":498},[484,488,493],{"text":485,"config":486},"View plans",{"href":185,"dataGaName":487,"dataGaLocation":465},"view plans",{"text":489,"config":490},"Why Premium?",{"href":491,"dataGaName":492,"dataGaLocation":465},"/pricing/premium/","why premium",{"text":494,"config":495},"Why Ultimate?",{"href":496,"dataGaName":497,"dataGaLocation":465},"/pricing/ultimate/","why ultimate",[499],{"title":500,"links":501},"Contact Us",[502,505,507,509,514,519,524],{"text":503,"config":504},"Contact sales",{"href":52,"dataGaName":53,"dataGaLocation":465},{"text":358,"config":506},{"href":360,"dataGaName":361,"dataGaLocation":465},{"text":363,"config":508},{"href":365,"dataGaName":366,"dataGaLocation":465},{"text":510,"config":511},"Status",{"href":512,"dataGaName":513,"dataGaLocation":465},"https://status.gitlab.com/","status",{"text":515,"config":516},"Terms of use",{"href":517,"dataGaName":518,"dataGaLocation":465},"/terms/","terms of use",{"text":520,"config":521},"Privacy statement",{"href":522,"dataGaName":523,"dataGaLocation":465},"/privacy/","privacy statement",{"text":525,"config":526},"Cookie preferences",{"dataGaName":527,"dataGaLocation":465,"id":528,"isOneTrustButton":27},"cookie preferences","ot-sdk-btn",{"title":88,"links":530,"subMenu":539},[531,535],{"text":532,"config":533},"DevSecOps platform",{"href":70,"dataGaName":534,"dataGaLocation":465},"devsecops platform",{"text":536,"config":537},"AI-Assisted Development",{"href":77,"dataGaName":538,"dataGaLocation":465},"ai-assisted development",[540],{"title":541,"links":542},"Topics",[543,548,553,558,563,568,573,578],{"text":544,"config":545},"CICD",{"href":546,"dataGaName":547,"dataGaLocation":465},"/topics/ci-cd/","cicd",{"text":549,"config":550},"GitOps",{"href":551,"dataGaName":552,"dataGaLocation":465},"/topics/gitops/","gitops",{"text":554,"config":555},"DevOps",{"href":556,"dataGaName":557,"dataGaLocation":465},"/topics/devops/","devops",{"text":559,"config":560},"Version 
Control",{"href":561,"dataGaName":562,"dataGaLocation":465},"/topics/version-control/","version control",{"text":564,"config":565},"DevSecOps",{"href":566,"dataGaName":567,"dataGaLocation":465},"/topics/devsecops/","devsecops",{"text":569,"config":570},"Cloud Native",{"href":571,"dataGaName":572,"dataGaLocation":465},"/topics/cloud-native/","cloud native",{"text":574,"config":575},"AI for Coding",{"href":576,"dataGaName":577,"dataGaLocation":465},"/topics/devops/ai-for-coding/","ai for coding",{"text":579,"config":580},"Agentic AI",{"href":581,"dataGaName":582,"dataGaLocation":465},"/topics/agentic-ai/","agentic ai",{"title":584,"links":585},"Solutions",[586,588,590,595,599,602,606,609,611,614,617,622],{"text":130,"config":587},{"href":125,"dataGaName":130,"dataGaLocation":465},{"text":119,"config":589},{"href":102,"dataGaName":103,"dataGaLocation":465},{"text":591,"config":592},"Agile development",{"href":593,"dataGaName":594,"dataGaLocation":465},"/solutions/agile-delivery/","agile delivery",{"text":596,"config":597},"SCM",{"href":115,"dataGaName":598,"dataGaLocation":465},"source code management",{"text":544,"config":600},{"href":108,"dataGaName":601,"dataGaLocation":465},"continuous integration & delivery",{"text":603,"config":604},"Value stream management",{"href":158,"dataGaName":605,"dataGaLocation":465},"value stream management",{"text":549,"config":607},{"href":608,"dataGaName":552,"dataGaLocation":465},"/solutions/gitops/",{"text":168,"config":610},{"href":170,"dataGaName":171,"dataGaLocation":465},{"text":612,"config":613},"Small business",{"href":175,"dataGaName":176,"dataGaLocation":465},{"text":615,"config":616},"Public sector",{"href":180,"dataGaName":181,"dataGaLocation":465},{"text":618,"config":619},"Education",{"href":620,"dataGaName":621,"dataGaLocation":465},"/solutions/education/","education",{"text":623,"config":624},"Financial services",{"href":625,"dataGaName":626,"dataGaLocation":465},"/solutions/finance/","financial 
services",{"title":188,"links":628},[629,631,633,635,638,640,642,644,646,648,650,652],{"text":200,"config":630},{"href":202,"dataGaName":203,"dataGaLocation":465},{"text":205,"config":632},{"href":207,"dataGaName":208,"dataGaLocation":465},{"text":210,"config":634},{"href":212,"dataGaName":213,"dataGaLocation":465},{"text":215,"config":636},{"href":217,"dataGaName":637,"dataGaLocation":465},"docs",{"text":238,"config":639},{"href":240,"dataGaName":241,"dataGaLocation":465},{"text":233,"config":641},{"href":235,"dataGaName":236,"dataGaLocation":465},{"text":243,"config":643},{"href":245,"dataGaName":246,"dataGaLocation":465},{"text":251,"config":645},{"href":253,"dataGaName":254,"dataGaLocation":465},{"text":256,"config":647},{"href":258,"dataGaName":259,"dataGaLocation":465},{"text":261,"config":649},{"href":263,"dataGaName":264,"dataGaLocation":465},{"text":266,"config":651},{"href":268,"dataGaName":269,"dataGaLocation":465},{"text":271,"config":653},{"href":273,"dataGaName":274,"dataGaLocation":465},{"title":289,"links":655},[656,658,660,662,664,666,668,672,677,679,681,683],{"text":296,"config":657},{"href":298,"dataGaName":291,"dataGaLocation":465},{"text":301,"config":659},{"href":303,"dataGaName":304,"dataGaLocation":465},{"text":309,"config":661},{"href":311,"dataGaName":312,"dataGaLocation":465},{"text":314,"config":663},{"href":316,"dataGaName":317,"dataGaLocation":465},{"text":319,"config":665},{"href":321,"dataGaName":322,"dataGaLocation":465},{"text":324,"config":667},{"href":326,"dataGaName":327,"dataGaLocation":465},{"text":669,"config":670},"Sustainability",{"href":671,"dataGaName":669,"dataGaLocation":465},"/sustainability/",{"text":673,"config":674},"Diversity, inclusion and belonging (DIB)",{"href":675,"dataGaName":676,"dataGaLocation":465},"/diversity-inclusion-belonging/","Diversity, inclusion and belonging",{"text":329,"config":678},{"href":331,"dataGaName":332,"dataGaLocation":465},{"text":339,"config":680},{"href":341,"dataGaName":342,"dataGaLocation":465},{"text":344,"config":682},{"href":346,"dataGaName":347,"dataGaLocation":465},{"text":684,"config":685},"Modern Slavery Transparency Statement",{"href":686,"dataGaName":687,"dataGaLocation":465},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"items":689},[690,693,696],{"text":691,"config":692},"Terms",{"href":517,"dataGaName":518,"dataGaLocation":465},{"text":694,"config":695},"Cookies",{"dataGaName":527,"dataGaLocation":465,"id":528,"isOneTrustButton":27},{"text":697,"config":698},"Privacy",{"href":522,"dataGaName":523,"dataGaLocation":465},[700],{"id":701,"title":18,"body":8,"config":702,"content":704,"description":8,"extension":25,"meta":708,"navigation":27,"path":709,"seo":710,"stem":711,"__hash__":712},"blogAuthors/en-us/blog/authors/gabriel-mazetto.yml",{"template":703},"BlogAuthor",{"name":18,"config":705},{"headshot":706,"ctfId":707},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749678982/Blog/Author%20Headshots/brodock-headshot.jpg","brodock",{},"/en-us/blog/authors/gabriel-mazetto",{},"en-us/blog/authors/gabriel-mazetto","elgZFaYSswD9MZUjesPB6vCwvmu4bay0z__KaEaQWI8",[714,728,741],{"content":715,"config":726},{"body":716,"title":717,"description":718,"authors":719,"heroImage":721,"date":722,"category":9,"tags":723},"Most CI/CD tools can run a build and ship a deployment. 
Where they diverge is what happens when your delivery needs get real: a monorepo with a dozen services, microservices spread across multiple repositories, deployments to dozens of environments, or a platform team trying to enforce standards without becoming a bottleneck.\n  \nGitLab's pipeline execution model was designed for that complexity. Parent-child pipelines, DAG execution, dynamic pipeline generation, multi-project triggers, merge request pipelines with merged results, and CI/CD Components each solve a distinct class of problems. Because they compose, understanding the full model unlocks something more than a faster pipeline. In this article, you'll learn about the five patterns where that model stands out, each mapped to a real engineering scenario with the configuration to match.\n  \nThe configs below are illustrative. The scripts use echo commands to keep the signal-to-noise ratio low. Swap them out for your actual build, test, and deploy steps and they are ready to use.\n\n\n## 1. Monorepos: Parent-child pipelines + DAG execution\n\n\nThe problem: Your monorepo has a frontend, a backend, and a docs site. Every commit triggers a full rebuild of everything, even when only a README changed.\n\n\nGitLab solves this with two complementary features: [parent-child pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#parent-child-pipelines) (which let a top-level pipeline spawn isolated sub-pipelines) and [DAG execution via `needs`](https://docs.gitlab.com/ci/yaml/#needs) (which breaks rigid stage-by-stage ordering and lets jobs start the moment their dependencies finish).\n\n\nA parent pipeline detects what changed and triggers only the relevant child pipelines:\n\n```yaml\n# .gitlab-ci.yml\nstages:\n  - trigger\n\ntrigger-services:\n  stage: trigger\n  trigger:\n    include:\n      - local: '.gitlab/ci/api-service.yml'\n      - local: '.gitlab/ci/web-service.yml'\n      - local: '.gitlab/ci/worker-service.yml'\n    strategy: depend\n```\n\n\nEach child pipeline is a fully independent pipeline with its own stages, jobs, and artifacts. The parent waits for all of them via [strategy: depend](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#wait-for-downstream-pipeline-to-complete) so you get a single green/red signal at the top level, with full drill-down into each service's pipeline. This organizational separation is the bigger win for large teams: each service owns its pipeline config, changes in one cannot break another, and the complexity stays manageable as the repo grows.\n\n\nOne thing worth knowing: when you pass [multiple files to a single `trigger: include:`](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#combine-multiple-child-pipeline-configuration-files), GitLab merges them into a single child pipeline configuration. This means jobs defined across those files share the same pipeline context and can reference each other with `needs:`, which is what makes the DAG optimization possible. If you split them into separate trigger jobs instead, each would be its own isolated pipeline and cross-file `needs:` references would not work.\n\n\nCombine this with `needs:` inside each child pipeline and you get DAG execution. 
Your integration tests can start the moment the build finishes, without waiting for other jobs in the same stage.\n\n```yaml\n# .gitlab/ci/api-service.yml\nstages:\n  - build\n  - test\n\nbuild-api:\n  stage: build\n  script:\n    - echo \"Building API service\"\n\ntest-api:\n  stage: test\n  needs: [build-api]\n  script:\n    - echo \"Running API tests\"\n```\n\n\nWhy it matters: Teams with large monorepos typically report significant reductions in pipeline runtime after switching to DAG execution, since jobs no longer wait on unrelated work in the same stage. Parent-child pipelines add the organizational layer that keeps the configuration maintainable as the repo and team grow.\n\n![Local downstream pipelines](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738759/Blog/Imported/hackathon-fake-blog-post-s/image3_vwj3rz.png \"Local downstream pipelines\")\n\n## 2. Microservices: Cross-repo, multi-project pipelines\n\n\nThe problem: Your frontend lives in one repo, your backend in another. When the frontend team ships a change, they have no visibility into whether it broke the backend integration and vice versa.\n\n\nGitLab's [multi-project pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#multi-project-pipelines) let one project trigger a pipeline in a completely separate project and wait for the result. The triggering project gets a linked downstream pipeline right in its own pipeline view.\n\n\nThe frontend pipeline builds an API contract artifact and publishes it, then triggers the backend pipeline. The backend fetches that artifact directly using the [Jobs API](https://docs.gitlab.com/ee/api/jobs.html#download-a-single-artifact-file-from-specific-tag-or-branch) and validates it before allowing anything to proceed. If a breaking change is detected, the backend pipeline fails and the frontend pipeline fails with it.\n\n```yaml\n# frontend repo: .gitlab-ci.yml\nstages:\n  - build\n  - test\n  - trigger-backend\n\nbuild-frontend:\n  stage: build\n  script:\n    - echo \"Building frontend and generating API contract...\"\n    - mkdir -p dist\n    - |\n      echo '{\n        \"api_version\": \"v2\",\n        \"breaking_changes\": false\n      }' > dist/api-contract.json\n    - cat dist/api-contract.json\n  artifacts:\n    paths:\n      - dist/api-contract.json\n    expire_in: 1 hour\n\ntest-frontend:\n  stage: test\n  script:\n    - echo \"All frontend tests passed!\"\n\ntrigger-backend-pipeline:\n  stage: trigger-backend\n  trigger:\n    project: my-org/backend-service\n    branch: main\n    strategy: depend\n  rules:\n    - if: $CI_COMMIT_BRANCH == \"main\"\n```\n\n```yaml\n# backend repo: .gitlab-ci.yml\nstages:\n  - build\n  - test\n\nbuild-backend:\n  stage: build\n  script:\n    - echo \"All backend tests passed!\"\n\nintegration-test:\n  stage: test\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"pipeline\"\n  script:\n    - echo \"Fetching API contract from frontend...\"\n    - |\n      curl --silent --fail \\\n        --header \"JOB-TOKEN: $CI_JOB_TOKEN\" \\\n        --output api-contract.json \\\n        \"${CI_API_V4_URL}/projects/${FRONTEND_PROJECT_ID}/jobs/artifacts/main/raw/dist/api-contract.json?job=build-frontend\"\n    - cat api-contract.json\n    - |\n      if grep -q '\"breaking_changes\": true' api-contract.json; then\n        echo \"FAIL: Breaking API changes detected - backend integration blocked!\"\n        exit 1\n      fi\n      echo \"PASS: API contract is compatible!\"\n```\n\n\nA few things worth noting in this config. 
The `integration-test` job uses `$CI_PIPELINE_SOURCE == \"pipeline\"` to ensure it only runs when triggered by an upstream pipeline, not on a standalone push to the backend repo. The frontend project ID is referenced via `$FRONTEND_PROJECT_ID`, which should be set as a [CI/CD variable](https://docs.gitlab.com/ci/variables/) in the backend project settings to avoid hardcoding it.\n\n\nWhy it matters: Cross-service breakage that previously surfaced in production gets caught in the pipeline instead. The dependency between services stops being invisible and becomes something teams can see, track, and act on.\n\n\n![Cross-project pipelines](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738762/Blog/Imported/hackathon-fake-blog-post-s/image4_h6mfsb.png \"Cross-project pipelines\")\n\n\n## 3. Multi-tenant / matrix deployments: Dynamic child pipelines\n\n\nThe problem: You deploy the same application to 15 customer environments, or three cloud regions, or dev/staging/prod. Updating a deploy stage across all of them one by one is the kind of work that leads to configuration drift. Writing a separate pipeline for each environment is unmaintainable from day one.\n\n\nGitLab's [dynamic child pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#dynamic-child-pipelines) let you generate a pipeline at runtime. A job runs a script that produces a YAML file, and that YAML becomes the pipeline for the next stage. The pipeline structure itself becomes data.\n\n\n```yaml\n# .gitlab-ci.yml\nstages:\n  - generate\n  - trigger-environments\n\ngenerate-config:\n  stage: generate\n  script:\n    - |\n      # ENVIRONMENTS can be passed as a CI variable or read from a config file.\n      # Default to dev, staging, prod if not set.\n      ENVIRONMENTS=${ENVIRONMENTS:-\"dev staging prod\"}\n      for ENV in $ENVIRONMENTS; do\n        cat > ${ENV}-pipeline.yml \u003C\u003C EOF\n      stages:\n        - deploy\n        - verify\n      deploy-${ENV}:\n        stage: deploy\n        script:\n          - echo \"Deploying to ${ENV} environment\"\n      verify-${ENV}:\n        stage: verify\n        script:\n          - echo \"Running smoke tests on ${ENV}\"\n      EOF\n      done\n  artifacts:\n    paths:\n      - \"*.yml\"\n    exclude:\n      - \".gitlab-ci.yml\"\n\n.trigger-template:\n  stage: trigger-environments\n  trigger:\n    strategy: depend\n\ntrigger-dev:\n  extends: .trigger-template\n  trigger:\n    include:\n      - artifact: dev-pipeline.yml\n        job: generate-config\n\ntrigger-staging:\n  extends: .trigger-template\n  needs: [trigger-dev]\n  trigger:\n    include:\n      - artifact: staging-pipeline.yml\n        job: generate-config\n\ntrigger-prod:\n  extends: .trigger-template\n  needs: [trigger-staging]\n  trigger:\n    include:\n      - artifact: prod-pipeline.yml\n        job: generate-config\n  when: manual\n```\n\n\nThe generation script loops over an `ENVIRONMENTS` variable rather than hardcoding each environment separately. Pass in a different list via a CI variable or read it from a config file and the pipeline adapts without touching the YAML. The trigger jobs use [extends:](https://docs.gitlab.com/ci/yaml/#extends) to inherit shared configuration from `.trigger-template`, so `strategy: depend` is defined once rather than repeated on every trigger job. Add a new environment by updating the variable, not by duplicating pipeline config. 
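If your environments are genuinely uniform and you don't need a separate child pipeline per environment, [`parallel: matrix:`](https://docs.gitlab.com/ci/yaml/#parallelmatrix) is a lighter alternative worth knowing: one job definition fans out into one job per value. A minimal sketch, not a drop-in replacement for the dynamic approach above:\n\n```yaml\n# Sketch: one deploy job per environment, all within a single pipeline\ndeploy:\n  stage: deploy\n  parallel:\n    matrix:\n      - ENVIRONMENT: [dev, staging, prod]\n  script:\n    - echo \"Deploying to ${ENVIRONMENT}\"\n```\n\nThe dynamic child pipeline approach earns its keep once environments need their own stages, ordering, or approval gates. 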
Add [when: manual](https://docs.gitlab.com/ci/yaml/#when) to the production trigger and you get a promotion gate baked right into the pipeline graph.\n\n\nWhy it matters: SaaS companies and platform teams use this pattern to manage dozens of environments without duplicating pipeline logic. The pipeline structure itself stays lean as the deployment matrix grows.\n\n\n![Dynamic pipeline](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738765/Blog/Imported/hackathon-fake-blog-post-s/image7_wr0kx2.png \"Dynamic pipeline\")\n\n\n## 4. MR-first delivery: Merge request pipelines, merged results, and workflow routing\n\n\nThe problem: Your pipeline runs on every push to every branch. Expensive tests run on feature branches that will never merge. Meanwhile, you have no guarantee that what you tested is actually what will land on `main` after a merge.\n\n\nGitLab has three interlocking features that solve this together:\n\n\n*   [Merge request pipelines](https://docs.gitlab.com/ci/pipelines/merge_request_pipelines/) run only when a merge request exists, not on every branch push. This alone eliminates a significant amount of wasted compute.\n\n*   [Merged results pipelines](https://docs.gitlab.com/ci/pipelines/merged_results_pipelines/) go further. GitLab creates a temporary merge commit (your branch plus the current target branch) and runs the pipeline against that. You are testing what will actually exist after the merge, not just your branch in isolation.\n\n*   [Workflow rules](https://docs.gitlab.com/ci/yaml/workflow/) let you define exactly which pipeline type runs under which conditions and suppress everything else. The `$CI_OPEN_MERGE_REQUESTS` guard below prevents duplicate pipelines firing for both a branch and its open MR simultaneously.\n\n\nWith those three working together, here is what a tiered pipeline looks like:\n\n```yaml\n# .gitlab-ci.yml\nworkflow:\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS\n      when: never\n    - if: $CI_COMMIT_BRANCH\n    - if: $CI_PIPELINE_SOURCE == \"schedule\"\n\nstages:\n  - fast-checks\n  - expensive-tests\n  - deploy\n\nlint-code:\n  stage: fast-checks\n  script:\n    - echo \"Running linter\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"push\"\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nunit-tests:\n  stage: fast-checks\n  script:\n    - echo \"Running unit tests\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"push\"\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nintegration-tests:\n  stage: expensive-tests\n  script:\n    - echo \"Running integration tests (15 min)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\ne2e-tests:\n  stage: expensive-tests\n  script:\n    - echo \"Running E2E tests (30 min)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nnightly-comprehensive-scan:\n  stage: expensive-tests\n  script:\n    - echo \"Running full nightly suite (2 hours)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"schedule\"\n\ndeploy-production:\n  stage: deploy\n  script:\n    - echo \"Deploying to production\"\n  rules:\n    - if: $CI_COMMIT_BRANCH == \"main\"\n      when: manual\n```\n\nWith this setup, the pipeline behaves differently depending on context. 
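One setup note: the workflow rules above route `merge_request_event` pipelines, but merged results specifically are enabled per project (under **Settings > Merge requests**); the YAML itself doesn't change. With that switched on, the contexts play out like this: 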
A push to a feature branch with no open MR runs lint and unit tests only. Once an MR is opened, the workflow rules switch from a branch pipeline to an MR pipeline, and the full integration and E2E suite runs against the merged result. Merging to `main` queues a manual production deployment. A nightly schedule runs the comprehensive scan once, not on every commit.\n\n\nWhy it matters: Teams routinely cut CI costs significantly with this pattern, not by running fewer tests, but by running the right tests at the right time. Merged results pipelines catch the class of bugs that only appear after a merge, before they ever reach `main`.\n\n\n![Conditional pipelines (within a branch with no MR)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738768/Blog/Imported/hackathon-fake-blog-post-s/image6_dnfcny.png \"Conditional pipelines (within a branch with no MR)\")\n\n\n\n![Conditional pipelines (within an MR)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738772/Blog/Imported/hackathon-fake-blog-post-s/image1_wyiafu.png \"Conditional pipelines (within an MR)\")\n\n\n\n![Conditional pipelines (on the main branch)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738774/Blog/Imported/hackathon-fake-blog-post-s/image5_r6lkfd.png \"Conditional pipelines (on the main branch)\")\n\n## 5. Governed pipelines: CI/CD Components\n\n\nThe problem: Your platform team has defined the right way to build, test, and deploy. But every team has their own `.gitlab-ci.yml` with subtle variations. Security scanning gets skipped. Deployment standards drift. Audits are painful.\n\n\nGitLab [CI/CD Components](https://docs.gitlab.com/ci/components/) let platform teams publish versioned, reusable pipeline building blocks. Application teams consume them with a single `include:` line and optional inputs — no copy-paste, no drift. Components are discoverable through the [CI/CD Catalog](https://docs.gitlab.com/ci/components/#cicd-catalog), which means teams can find and adopt approved building blocks without needing to go through the platform team directly.\n\n\nHere is a component definition from a shared library:\n\n```yaml\n# templates/deploy.yml\nspec:\n  inputs:\n    stage:\n      default: deploy\n    environment:\n      default: production\n---\ndeploy-job:\n  stage: $[[ inputs.stage ]]\n  script:\n    - echo \"Deploying $APP_NAME to $[[ inputs.environment ]]\"\n    - echo \"Deploy URL: $DEPLOY_URL\"\n  environment:\n    name: $[[ inputs.environment ]]\n```\nAnd here is how an application team consumes it:\n\n```yaml\n# Application repo: .gitlab-ci.yml\nvariables:\n  APP_NAME: \"my-awesome-app\"\n  DEPLOY_URL: \"https://api.example.com\"\n\ninclude:\n  - component: gitlab.com/my-org/component-library/build@v1.0.6\n  - component: gitlab.com/my-org/component-library/test@v1.0.6\n  - component: gitlab.com/my-org/component-library/deploy@v1.0.6\n    inputs:\n      environment: staging\n\nstages:\n  - build\n  - test\n  - deploy\n```\n\nThree lines of `include:` replace hundreds of lines of duplicated YAML. The platform team can push a security fix to `v1.0.7` and teams opt in on their own schedule — or the platform team can pin everyone to a minimum version. 
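Both modes are just version references on the `include:` line. A sketch showing the two styles side by side, using the same hypothetical component library (`@~latest` resolves to the newest release published to the CI/CD Catalog):\n\n```yaml\ninclude:\n  # Pinned: fully reproducible; the team upgrades on its own schedule\n  - component: gitlab.com/my-org/component-library/deploy@v1.0.6\n  # Tracking: picks up each new catalog release automatically\n  - component: gitlab.com/my-org/component-library/deploy@~latest\n    inputs:\n      environment: staging\n```\n\n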
Either way, one change propagates everywhere instead of needing to be applied repo by repo.\n\n\nPair this with [resource groups](https://docs.gitlab.com/ci/resource_groups/) to prevent concurrent deployments to the same environment, and [protected environments](https://docs.gitlab.com/ci/environments/protected_environments/) to enforce approval gates - and you have a governed delivery platform where compliance is the default, not the exception.\n\n\nWhy it matters: This is the pattern that makes GitLab CI/CD scale across hundreds of teams. Platform engineering teams enforce compliance without becoming a bottleneck. Application teams get a fast path to a working pipeline without reinventing the wheel.\n\n\n![Component pipeline (imported jobs)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738776/Blog/Imported/hackathon-fake-blog-post-s/image2_pizuxd.png \"Component pipeline (imported jobs)\")\n\n## Putting it all together\n\nNone of these features exist in isolation. The reason GitLab's pipeline model is worth understanding deeply is that these primitives compose:\n\n*   A monorepo uses parent-child pipelines, and each child uses DAG execution\n\n*   A microservices platform uses multi-project pipelines, and each project uses MR pipelines with merged results\n\n*   A governed platform uses CI/CD components to standardize the patterns above across every team\n\n\nMost teams discover one of these features when they hit a specific pain point. The ones who invest in understanding the full model end up with a delivery system that actually reflects how their engineering organization works, not a pipeline that fights it.\n\n## Other patterns worth exploring\n\n\nThe five patterns above cover the most common structural pain points, but GitLab's pipeline model goes further. A few others worth looking into as your needs grow:\n\n\n*   [Review apps with dynamic environments](https://docs.gitlab.com/ci/environments/) let you spin up a live preview for every feature branch and tear it down automatically when the MR closes. Useful for teams doing frontend work or API changes that need stakeholder sign-off before merging.\n\n*   [Caching and artifact strategies](https://docs.gitlab.com/ci/caching/) are often the fastest way to cut pipeline runtime after the structural work is done. Structuring `cache:` keys around dependency lockfiles and being deliberate about what gets passed between jobs with [artifacts:](https://docs.gitlab.com/ci/yaml/#artifacts) can make a significant difference without changing your pipeline shape at all.\n\n*   [Scheduled and API-triggered pipelines](https://docs.gitlab.com/ci/pipelines/schedules/) are worth knowing about because not everything should run on a code push. Nightly security scans, compliance reports, and release automation are better modeled as scheduled or [API-triggered](https://docs.gitlab.com/ci/triggers/) pipelines with `$CI_PIPELINE_SOURCE` routing the right jobs for each context.\n\n## How to get started\n\nModern software delivery is complex. Teams are managing monorepos with dozens of services, coordinating across multiple repositories, deploying to many environments at once, and trying to keep standards consistent as organizations grow. GitLab's pipeline model was built with all of that in mind.\n\nWhat makes it worth investing time in is how well the pieces fit together. Parent-child pipelines bring structure to large codebases. Multi-project pipelines make cross-team dependencies visible and testable. 
Dynamic pipelines turn environment management into something that scales gracefully. MR-first delivery with merged results ensures confidence at every step of the review process. And CI/CD Components give platform teams a way to share best practices across an entire organization without becoming a bottleneck.\n\nEach of these features is powerful on its own, and even more so when combined. GitLab gives you the building blocks to design a delivery system that fits how your team actually works, and grows with you as your needs evolve.\n\n> [Start a free trial of GitLab Ultimate](https://about.gitlab.com/free-trial/) to use pipeline logic today.\n\n## Read more\n\n*   [Variable and artifact sharing in GitLab parent-child pipelines](https://about.gitlab.com/blog/variable-and-artifact-sharing-in-gitlab-parent-child-pipelines/)\n*   [CI/CD inputs: Secure and preferred method to pass parameters to a pipeline](https://about.gitlab.com/blog/ci-cd-inputs-secure-and-preferred-method-to-pass-parameters-to-a-pipeline/)\n*   [Tutorial: How to set up your first GitLab CI/CD component](https://about.gitlab.com/blog/tutorial-how-to-set-up-your-first-gitlab-ci-cd-component/)\n*   [How to include file references in your CI/CD components](https://about.gitlab.com/blog/how-to-include-file-references-in-your-ci-cd-components/)\n*   [FAQ: GitLab CI/CD Catalog](https://about.gitlab.com/blog/faq-gitlab-ci-cd-catalog/)\n*   [Building a GitLab CI/CD pipeline for a monorepo the easy way](https://about.gitlab.com/blog/building-a-gitlab-ci-cd-pipeline-for-a-monorepo-the-easy-way/)\n*   [A CI/CD component builder's journey](https://about.gitlab.com/blog/a-ci-component-builders-journey/)\n*   [CI/CD Catalog goes GA: No more building pipelines from scratch](https://about.gitlab.com/blog/ci-cd-catalog-goes-ga-no-more-building-pipelines-from-scratch/)","5 ways GitLab pipeline logic solves real engineering problems","Learn how to scale CI/CD with composable patterns for monorepos, microservices, environments, and governance.",[720],"Omid Khan","https://res.cloudinary.com/about-gitlab-com/image/upload/v1772721753/frfsm1qfscwrmsyzj1qn.png","2026-04-09",[106,724,725,23],"DevOps platform","tutorial",{"featured":27,"template":13,"slug":727},"5-ways-gitlab-pipeline-logic-solves-real-engineering-problems",{"content":729,"config":739},{"title":730,"description":731,"authors":732,"heroImage":734,"date":735,"body":736,"category":9,"tags":737},"How to use GitLab Container Virtual Registry with Docker Hardened Images","Learn how to simplify container image management with this step-by-step guide.",[733],"Tim Rizzi","https://res.cloudinary.com/about-gitlab-com/image/upload/v1772111172/mwhgbjawn62kymfwrhle.png","2026-03-12","If you're a platform engineer, you've probably had this conversation:\n  \n*\"Security says we need to use hardened base images.\"*\n\n*\"Great, where do I configure credentials for yet another registry?\"*\n\n*\"Also, how do we make sure everyone actually uses them?\"*\n\nOr this one:\n\n*\"Why are our builds so slow?\"*\n\n*\"We're pulling the same 500MB image from Docker Hub in every single job.\"*\n\n*\"Can't we just cache these somewhere?\"*\n\nI've been working on [Container Virtual Registry](https://docs.gitlab.com/user/packages/virtual_registry/container/) at GitLab specifically to solve these problems. It's a pull-through cache that sits in front of your upstream registries — Docker Hub, dhi.io (Docker Hardened Images), MCR, and Quay — and gives your teams a single endpoint to pull from. 
Images get cached on the first pull. Subsequent pulls come from the cache. Your developers don't need to know or care which upstream a particular image came from.\n\nThis article shows you how to set up Container Virtual Registry, specifically with Docker Hardened Images in mind, since that's a combination that makes a lot of sense for teams that care about security but don't want to make their developers' lives harder.\n\n## What problem are we actually solving?\n\nThe platform teams I usually talk to manage container images across three to five registries:\n\n* **Docker Hub** for most base images\n* **dhi.io** for Docker Hardened Images (security-conscious workloads)\n* **MCR** for .NET and Azure tooling\n* **Quay.io** for Red Hat ecosystem stuff\n* **Internal registries** for proprietary images\n\nEach one has its own:\n\n* Authentication mechanism\n* Network latency characteristics\n* Way of organizing image paths\n\nYour CI/CD configs end up littered with registry-specific logic. Credential management becomes a project unto itself. And every pipeline job pulls the same base images over the network, even though they haven't changed in weeks.\n\nContainer Virtual Registry consolidates this. One registry URL. One authentication flow (GitLab's). Cached images are served from GitLab's infrastructure rather than traversing the internet each time.\n\n## How it works\n\nThe model is straightforward:\n\n```text\nYour pipeline pulls:\n  gitlab.com/virtual_registries/container/1000016/python:3.13\n\nVirtual registry checks:\n  1. Do I have this cached? → Return it\n  2. No? → Fetch from upstream, cache it, return it\n```\n\nYou configure upstreams in priority order. When an image pull comes in, the virtual registry checks each upstream until it finds the image. The result gets cached for a configurable period (default 24 hours).\n\n```text\n┌─────────────────────────────────────────────────────────┐\n│                    CI/CD Pipeline                       │\n│                          │                              │\n│                          ▼                              │\n│   gitlab.com/virtual_registries/container/\u003Cid>/image   │\n└─────────────────────────────────────────────────────────┘\n                           │\n                           ▼\n┌─────────────────────────────────────────────────────────┐\n│            Container Virtual Registry                   │\n│                                                         │\n│  Upstream 1: Docker Hub ────────────────┐               │\n│  Upstream 2: dhi.io (Hardened) ────────┐│               │\n│  Upstream 3: MCR ─────────────────────┐││               │\n│  Upstream 4: Quay.io ────────────────┐│││               │\n│                                      ││││               │\n│                    ┌─────────────────┴┴┴┴──┐            │\n│                    │        Cache          │            │\n│                    │  (manifests + layers) │            │\n│                    └───────────────────────┘            │\n└─────────────────────────────────────────────────────────┘\n```\n\n## Why this matters for Docker Hardened Images\n\n[Docker Hardened Images](https://docs.docker.com/dhi/) are great because of the minimal attack surface, near-zero CVEs, proper software bills of materials (SBOMs), and SLSA provenance. 
If you're evaluating base images for security-sensitive workloads, they should be on your list.\n\nBut adopting them creates the same operational friction as any new registry:\n\n* **Credential distribution**: You need to get Docker credentials to every system that pulls images from dhi.io.\n* **CI/CD changes**: Every pipeline needs to be updated to authenticate with dhi.io.\n* **Developer friction**: People need to remember to use the hardened variants.\n* **Visibility gap**: It's difficult to tell if teams are actually using hardened images vs. regular ones.\n\nVirtual registry addresses each of these:\n\n**Single credential**: Teams authenticate to GitLab. The virtual registry handles upstream authentication. You configure Docker credentials once, at the registry level, and they apply to all pulls.\n\n**No CI/CD changes per-team**: Point pipelines at your virtual registry. Done. The upstream configuration is centralized.\n\n**Gradual adoption**: Since images get cached with their full path, you can see in the cache what's being pulled. If someone's pulling `library/python:3.11` instead of the hardened variant, you'll know.\n\n**Audit trail**: The cache shows you exactly which images are in active use. Useful for compliance, useful for understanding what your fleet actually depends on.\n\n## Setting it up\n\nHere's a real setup using the Python client from this demo project.\n\n### Create the virtual registry\n\n```python\nfrom virtual_registry_client import VirtualRegistryClient\n\nclient = VirtualRegistryClient()\n\nregistry = client.create_virtual_registry(\n    group_id=\"785414\",  # Your top-level group ID\n    name=\"platform-images\",\n    description=\"Cached container images for platform teams\"\n)\n\nprint(f\"Registry ID: {registry['id']}\")\n# You'll need this ID for the pull URL\n```\n\n### Add Docker Hub as an upstream\n\nFor official images like Alpine, Python, etc.:\n\n```python\ndocker_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://registry-1.docker.io\",\n    name=\"Docker Hub\",\n    cache_validity_hours=24\n)\n```\n\n### Add Docker Hardened Images (dhi.io)\n\nDocker Hardened Images are hosted on `dhi.io`, a separate registry that requires authentication:\n\n```python\ndhi_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-docker-username\",\n    password=\"your-docker-access-token\",\n    cache_validity_hours=24\n)\n```\n\n### Add other upstreams\n\n```python\n# MCR for .NET teams\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://mcr.microsoft.com\",\n    name=\"Microsoft Container Registry\",\n    cache_validity_hours=48\n)\n\n# Quay for Red Hat stuff\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://quay.io\",\n    name=\"Quay.io\",\n    cache_validity_hours=24\n)\n```\n\n### Update your CI/CD\n\nHere's a `.gitlab-ci.yml` that pulls through the virtual registry:\n\n```yaml\nvariables:\n  VIRTUAL_REGISTRY_ID: \u003Cyour_virtual_registry_ID>\n\n  \nbuild:\n  image: docker:24\n  services:\n    - docker:24-dind\n  before_script:\n    # Authenticate to GitLab (which handles upstream auth for you)\n    - echo \"${CI_JOB_TOKEN}\" | docker login -u gitlab-ci-token --password-stdin gitlab.com\n  script:\n    # All of these go through your single virtual registry\n    \n    # Official Docker Hub images (use library/ prefix)\n    - docker pull 
gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/library/alpine:latest\n    \n    # Docker Hardened Images from dhi.io (no prefix needed)\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/python:3.13\n    \n    # .NET from MCR\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/dotnet/sdk:8.0\n```\n\n### Image path formats\n\nDifferent registries use different path conventions:\n\n| Registry | Pull URL Example |\n|----------|------------------|\n| Docker Hub (official) | `.../library/python:3.11-slim` |\n| Docker Hardened Images (dhi.io) | `.../python:3.13` |\n| MCR | `.../dotnet/sdk:8.0` |\n| Quay.io | `.../prometheus/prometheus:latest` |\n\n### Verify it's working\n\nAfter some pulls, check your cache:\n\n```python\nupstreams = client.list_registry_upstreams(registry['id'])\nfor upstream in upstreams:\n    entries = client.list_cache_entries(upstream['id'])\n    print(f\"{upstream['name']}: {len(entries)} cached entries\")\n\n```\n\n## What the numbers look like\n\nI ran tests pulling images through the virtual registry:\n\n| Metric | Without Cache | With Warm Cache |\n|--------|---------------|-----------------|\n| Pull time (Alpine) | 10.3s | 4.2s |\n| Pull time (Python 3.13 DHI) | 11.6s | ~4s |\n| Network roundtrips to upstream | Every pull | Cache misses only |\n\n\n\n\nThe first pull is the same speed (it has to fetch from upstream). Every pull after that, for the cache validity period, comes straight from GitLab's storage. No network hop to Docker Hub, dhi.io, MCR, or wherever the image lives.\n\nFor a team running hundreds of pipeline jobs per day, that's hours of cumulative build time saved.\n\n## Practical considerations\nHere are some considerations to keep in mind:\n\n### Cache validity\n\n24 hours is the default. For security-sensitive images where you want patches quickly, consider 12 hours or less:\n\n```python\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-username\",\n    password=\"your-token\",\n    cache_validity_hours=12\n)\n```\n\nFor stable, infrequently-updated images (like specific version tags), longer validity is fine.\n\n### Upstream priority\n\nUpstreams are checked in order. If you have images with the same name on different registries, the first matching upstream wins.\n\n### Limits\n\n* Maximum of 20 virtual registries per group\n* Maximum of 20 upstreams per virtual registry\n\n## Configuration via UI\n\nYou can also configure virtual registries and upstreams directly from the GitLab UI—no API calls required. Navigate to your group's **Settings > Packages and registries > Virtual Registry** to:\n\n* Create and manage virtual registries\n* Add, edit, and reorder upstream registries\n* View and manage the cache\n* Monitor which images are being pulled\n\n## What's next\n\nWe're actively developing:\n\n* **Allow/deny lists**: Use regex to control which images can be pulled from specific upstreams.\n\nThis is beta software. 
It works, people are using it in production, but we're still iterating based on feedback.\n\n## Share your feedback\n\nIf you're a platform engineer dealing with container registry sprawl, I'd like to understand your setup:\n\n* How many upstream registries are you managing?\n* What's your biggest pain point with the current state?\n* Would something like this help, and if not, what's missing?\n\nPlease share your experiences in the [Container Virtual Registry feedback issue](https://gitlab.com/gitlab-org/gitlab/-/work_items/589630).\n## Related resources\n- [New GitLab metrics and registry features help reduce CI/CD bottlenecks](https://about.gitlab.com/blog/new-gitlab-metrics-and-registry-features-help-reduce-ci-cd-bottlenecks/#container-virtual-registry)\n- [Container Virtual Registry documentation](https://docs.gitlab.com/user/packages/virtual_registry/container/)\n- [Container Virtual Registry API](https://docs.gitlab.com/api/container_virtual_registries/)",[725,738,23],"product",{"featured":12,"template":13,"slug":740},"using-gitlab-container-virtual-registry-with-docker-hardened-images",{"content":742,"config":752},{"title":743,"description":744,"authors":745,"heroImage":747,"date":748,"category":9,"tags":749,"body":751},"How IIT Bombay students are coding the future with GitLab","At GitLab, we often talk about how software accelerates innovation. But sometimes, you have to step away from the Zoom calls and stand in a crowded university hall to remember why we do this.",[746],"Nick Veenhof","https://res.cloudinary.com/about-gitlab-com/image/upload/v1750099013/Blog/Hero%20Images/Blog/Hero%20Images/blog-image-template-1800x945%20%2814%29_6VTUA8mUhOZNDaRVNPeKwl_1750099012960.png","2026-01-08",[259,621,750],"open source","The GitLab team recently had the privilege of judging the **iHack Hackathon** at **IIT Bombay's E-Summit**. The energy was electric, the coffee was flowing, and the talent was undeniable. But what struck us most wasn't just the code — it was the sheer determination of students to solve real-world problems, often overcoming significant logistical and financial hurdles to simply be in the room.\n\n\nThrough our [GitLab for Education program](https://about.gitlab.com/solutions/education/), we aim to empower the next generation of developers with tools and opportunity. Here is a look at what the students built, and how they used GitLab to bridge the gap between idea and reality.\n\n## The challenge: Build faster, build securely\n\nThe premise for the GitLab track of the hackathon was simple: Don't just show us a product; show us how you built it. We wanted to see how students utilized GitLab's platform — from Issue Boards to CI/CD pipelines — to accelerate the development lifecycle.\n\nThe results were inspiring.\n\n## The winners\n\n### 1st place: Team Decode — Democratizing Scientific Research\n\n**Project:** FIRE (Fast Integrated Research Environment)\n\nTeam Decode took home the top prize with a solution that warms a developer's heart: a local-first, blazing-fast data processing tool built with [Rust](https://about.gitlab.com/blog/secure-rust-development-with-gitlab/) and Tauri. They identified a massive pain point for data science students: existing tools are fragmented, slow, and expensive.\n\nTheir solution, FIRE, allows researchers to visualize complex formats (like NetCDF) instantly. What impressed the judges most was their \"hacker\" ethos. 
They didn't just build a tool; they built it to be open and accessible.\n\n**How they used GitLab:** Since the team lived far apart, asynchronous communication was key. They utilized **GitLab Issue Boards** and **Milestones** to track progress and integrated their repo with Telegram to get real-time push notifications. As one team member noted, \"Coordinating all these technologies was really difficult, and what helped us was GitLab... the Issue Board really helped us track who was doing what.\"\n\n![Team Decode](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/epqazj1jc5c7zkgqun9h.jpg)\n\n### 2nd place: Team BichdeHueDost — Reuniting to Solve Payments\n\n**Project:** SemiPay (RFID Cashless Payment for Schools)\n\nThe team name, BichdeHueDost, translates to \"Friends who have been set apart.\" It's a fitting name for a group of friends who went to different colleges but reunited to build this project. They tackled a unique problem: handling cash in schools for young children. Their solution used RFID cards backed by a blockchain ledger to ensure secure, cashless transactions for students.\n\n**How they used GitLab:** They utilized [GitLab CI/CD](https://about.gitlab.com/topics/ci-cd/) to automate the build process for their Flutter application (APK), ensuring that every commit resulted in a testable artifact. This allowed them to iterate quickly despite the \"flaky\" nature of cross-platform mobile development.\n\n![Team BichdeHueDost](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/pkukrjgx2miukb6nrj5g.jpg)\n\n### 3rd place: Team ZenYukti — Agentic Repository Intelligence\n\n**Project:** RepoInsight AI (AI-powered, GitLab-native intelligence platform)\n\nTeam ZenYukti impressed us with a solution that tackles a universal developer pain point: understanding unfamiliar codebases. What stood out to the judges was the tool's practical approach to onboarding and code comprehension: RepoInsight-AI automatically generates documentation, visualizes repository structure, and even helps identify bugs, all while maintaining context about the entire codebase.\n\n**How they used GitLab:** The team built a comprehensive CI/CD pipeline that showcased GitLab's security and DevOps capabilities. They integrated [GitLab's Security Templates](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates/Security) (SAST, Dependency Scanning, and Secret Detection), and utilized [GitLab Container Registry](https://docs.gitlab.com/user/packages/container_registry/) to manage their Docker images for backend and frontend components. They created an AI auto-review bot that runs on merge requests, demonstrating an \"agentic workflow\" where AI assists in the development process itself.\n\n![Team ZenYukti](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/ymlzqoruv5al1secatba.jpg)\n\n## Beyond the code: A lesson in inclusion\n\nWhile the code was impressive, the most powerful moment of the event happened away from the keyboard.\n\nDuring the feedback session, we learned about the journey Team ZenYukti took to get to Mumbai. They traveled over 24 hours, covering nearly 1,800 kilometers. Because flights were too expensive and trains were booked, they traveled in the \"General Coach,\" a non-reserved, severely overcrowded carriage.\n\nAs one student described it:\n\n*\"You cannot even imagine something like this... there are no seats... people sit on the top of the train. This is what we have endured.\"*\n\nThis hit home. 
[Diversity, Inclusion, and Belonging](https://handbook.gitlab.com/handbook/company/culture/inclusion/) are core values at GitLab. We realized that for these students, the barrier to entry wasn't intellect or skill, it was access.\n\nIn that moment, we decided to break that barrier. We committed to reimbursing the travel expenses for the participants who struggled to get there. It's a small step, but it underlines a massive truth: **talent is distributed equally, but opportunity is not.**\n\n![hackathon class together](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380252/o5aqmboquz8ehusxvgom.jpg)\n\n### The future is bright (and automated)\n\nWe also saw incredible potential in teams like Prometheus, who attempted to build an autonomous patch remediation tool (DevGuardian), and Team Arrakis, who built a voice-first job portal for blue-collar workers using [GitLab Duo](https://about.gitlab.com/gitlab-duo-agent-platform/) to troubleshoot their pipelines.\n\nTo all the students who participated: You are the future. Through [GitLab for Education](https://about.gitlab.com/solutions/education/), we are committed to providing you with the top-tier tools (like GitLab Ultimate) you need to learn, collaborate, and change the world — whether you are coding from a dorm room, a lab, or a train carriage. **Keep shipping.**\n\n> :bulb: Learn more about the [GitLab for Education program](https://about.gitlab.com/solutions/education/).\n",{"slug":753,"featured":12,"template":13},"how-iit-bombay-students-code-future-with-gitlab",{"promotions":755},[756,770,781,793],{"id":757,"categories":758,"header":760,"text":761,"button":762,"image":767},"ai-modernization",[759],"ai-ml","Is AI achieving its promise at scale?","Quiz will take 5 minutes or less",{"text":763,"config":764},"Get your AI maturity score",{"href":765,"dataGaName":766,"dataGaLocation":241},"/assessments/ai-modernization-assessment/","modernization assessment",{"config":768},{"src":769},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/qix0m7kwnd8x2fh1zq49.png",{"id":771,"categories":772,"header":773,"text":761,"button":774,"image":778},"devops-modernization",[738,567],"Are you just managing tools or shipping innovation?",{"text":775,"config":776},"Get your DevOps maturity score",{"href":777,"dataGaName":766,"dataGaLocation":241},"/assessments/devops-modernization-assessment/",{"config":779},{"src":780},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138785/eg818fmakweyuznttgid.png",{"id":782,"categories":783,"header":785,"text":761,"button":786,"image":790},"security-modernization",[784],"security","Are you trading speed for security?",{"text":787,"config":788},"Get your security maturity score",{"href":789,"dataGaName":766,"dataGaLocation":241},"/assessments/security-modernization-assessment/",{"config":791},{"src":792},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/p4pbqd9nnjejg5ds6mdk.png",{"id":794,"paths":795,"header":798,"text":799,"button":800,"image":805},"github-azure-migration",[796,797],"migration-from-azure-devops-to-gitlab","integrating-azure-devops-scm-and-gitlab","Is your team ready for GitHub's Azure move?","GitHub is already rebuilding around Azure. 
Find out what it means for you.",{"text":801,"config":802},"See how GitLab compares to GitHub",{"href":803,"dataGaName":804,"dataGaLocation":241},"/compare/gitlab-vs-github/github-azure-migration/","github azure migration",{"config":806},{"src":780},{"header":808,"blurb":809,"button":810,"secondaryButton":815},"Start building faster today","See what your team can do with the intelligent orchestration platform for DevSecOps.\n",{"text":811,"config":812},"Get your free trial",{"href":813,"dataGaName":48,"dataGaLocation":814},"https://gitlab.com/-/trial_registrations/new?glm_content=default-saas-trial&glm_source=about.gitlab.com/","feature",{"text":503,"config":816},{"href":52,"dataGaName":53,"dataGaLocation":814},1776454389816]