
Teams that are afraid of their CI/CD pipelines treat them like infrastructure: something you carefully architect, document, and maintain. That's wrong. Pipelines are feedback loops. The only way to make them reliable is to break them constantly and learn from it.
When you treat a pipeline like infrastructure, you get
Pipeline reliability is not the same as deployment reliability or application reliability. Your pipeline is supposed to fail. That's its job: to catch problems before they hit production. A pipeline that never fails either isn't being used or isn't actually testing anything.
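If you want to check that claim against your own history, here is a rough sketch. It assumes you can export recent runs as a list of (pipeline name, passed) records; the `Run` type, the `suspicious_pipelines` helper, and the `min_runs` threshold of 20 are all made up for illustration and don't come from any CI tool's API.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One pipeline run (hypothetical record shape, not a real CI API)."""
    pipeline: str
    passed: bool


def suspicious_pipelines(runs, min_runs=20):
    """Return names of pipelines with at least min_runs runs and zero failures."""
    totals = {}
    failures = {}
    for run in runs:
        totals[run.pipeline] = totals.get(run.pipeline, 0) + 1
        if not run.passed:
            failures[run.pipeline] = failures.get(run.pipeline, 0) + 1
    # A pipeline that is used a lot and has never gone red deserves a second look.
    return sorted(
        name
        for name, total in totals.items()
        if total >= min_runs and failures.get(name, 0) == 0
    )
```

Anything this returns is not necessarily broken, but it is worth asking whether it is actually testing anything.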
You can't design your way to a good pipeline. You can only iterate your way there.
The tooling barely matters. Pick something standard (GitLab CI, GitHub Actions, even Jenkins if you're stuck with it) and just start. Laugh anyone who says they have a better custom solution out of the room.
What matters is your willingness to fuck with it. To let it break. To learn from the breaks. To make it disposable enough that you're not afraid to change it.
The more disposable it is, the more fearless you can be with it.
Start gluing basics together. Add what seems useful. Remove what isn't. Repeat until it stops sucking.
The only way to make it reliable is to break it a lot.