This project is responsible for building and shipping artifacts from the NuMarquez repository (our fork of the Marquez project).
Front-end application that serves data lineage metadata and discovery capabilities to data platform users. It lets users visually navigate the lineage of jobs and datasets, learning from the metadata about these resources and how they relate to each other (dependencies).
OpenLineage-compatible back-end application responsible for storing lineage events and producing data entities such as jobs, datasets, and runs from those events. It also serves data to the Web UI and powers other data lineage tools at Nubank.
Regular Marquez API, capable of read and write operations, connected to the WRITER Aurora PostgreSQL cluster endpoint.
Application connected to the READER Aurora PostgreSQL cluster endpoint, restricted to GET (read) operations.
In the staging environment there is one workflow per action type and per application. The available actions are:
build: Build Docker images
deploy: Deploy software through Helm into Kubernetes
destroy: Remove the Helm charts from Kubernetes
All staging workflows start with "pr-<action_name>", and each has its own message that triggers it from a PR comment. For example, to build a new Docker image for the Marquez API, comment in a PR:
cicd/pr-build-marquez-api-image:<NuMarquez branch to use, including "main">
Please check the workflows to see how to trigger them.
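As a rough illustration of the trigger convention described above, the sketch below parses a staging PR comment. The exact comment grammar is an assumption inferred from the single example given (a workflow name starting with "pr-<action>", a colon, then one argument); `parse_staging_comment` and `StagingTrigger` are hypothetical names, and the real workflows may match comments differently.

```python
import re
from typing import NamedTuple, Optional

# Actions listed for the staging environment.
STAGING_ACTIONS = ("build", "deploy", "destroy")

# Assumed shape: cicd/pr-<action>[-<rest of workflow name>]:<argument>
_TRIGGER = re.compile(
    r"^cicd/(?P<workflow>pr-(?P<action>"
    + "|".join(STAGING_ACTIONS)
    + r")(?:-[a-z0-9-]+)?):(?P<argument>\S+)$"
)


class StagingTrigger(NamedTuple):
    workflow: str  # e.g. "pr-build-marquez-api-image"
    action: str    # one of STAGING_ACTIONS
    argument: str  # text after the colon, e.g. a branch name


def parse_staging_comment(comment: str) -> Optional[StagingTrigger]:
    """Return the parsed trigger, or None if the comment is not a staging trigger."""
    match = _TRIGGER.match(comment.strip())
    if match is None:
        return None
    return StagingTrigger(match["workflow"], match["action"], match["argument"])
```

For instance, `parse_staging_comment("cicd/pr-build-marquez-api-image:main")` yields the workflow name `pr-build-marquez-api-image`, the action `build`, and the argument `main`.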
For the production environment we use an upstream/downstream strategy for building, deploying, and removing deployments of the NuMarquez applications.
It works as follows:
graph TD
A0(Comment in a PR containing 'cicd/release:<definitionFile>') -->A1(release-marquez)
A1(release-marquez) -->B1(downstream-marquez-api-write)
A1(release-marquez) -->B2(downstream-marquez-api-read-only)
A1(release-marquez) -->B3(downstream-marquez-web-ui)
B1(downstream-marquez-api-write) -->C1(build-docker-image)
C1(build-docker-image) -->C2(deploy-into-k8s)
B1(downstream-marquez-api-write) -->C3(only-deploy-into-k8s)
C2(deploy-into-k8s) -->C4(Marquez API Write deployed)
C3(only-deploy-into-k8s) -->C4(Marquez API Write deployed)
B2(downstream-marquez-api-read-only) -->C5(build-docker-image)
C5(build-docker-image) -->C6(deploy-into-k8s)
B2(downstream-marquez-api-read-only) -->C7(only-deploy-into-k8s)
C6(deploy-into-k8s) -->C8(Marquez API Read-only deployed)
C7(only-deploy-into-k8s) -->C8(Marquez API Read-only deployed)
B3(downstream-marquez-web-ui) -->C9(build-docker-image)
C9(build-docker-image) -->C10(deploy-into-k8s)
B3(downstream-marquez-web-ui) -->C11(only-deploy-into-k8s)
C10(deploy-into-k8s) -->C12(Marquez Web UI deployed)
C11(only-deploy-into-k8s) -->C12(Marquez Web UI deployed)
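The release fan-out in the diagram can be modeled as a small planning function. This is only a sketch: the downstream and step names come from the diagram, but the `build_image` mapping is a hypothetical stand-in for whatever strategy the release definition file actually carries, and defaulting to deploy-only is an assumption.

```python
from typing import Dict, List

# Downstream workflow names as they appear in the diagram.
DOWNSTREAMS = (
    "downstream-marquez-api-write",
    "downstream-marquez-api-read-only",
    "downstream-marquez-web-ui",
)


def steps_for(build_image: bool) -> List[str]:
    """Return the diagram's step sequence for one downstream workflow."""
    if build_image:
        return ["build-docker-image", "deploy-into-k8s"]
    return ["only-deploy-into-k8s"]


def plan_release(build_image: Dict[str, bool]) -> Dict[str, List[str]]:
    """Map every downstream workflow to its steps.

    Downstreams missing from `build_image` default to deploy-only
    (an assumption made for this sketch).
    """
    unknown = set(build_image) - set(DOWNSTREAMS)
    if unknown:
        raise ValueError(f"unknown downstream workflows: {sorted(unknown)}")
    return {name: steps_for(build_image.get(name, False)) for name in DOWNSTREAMS}
```

Calling `plan_release({"downstream-marquez-web-ui": True})` would rebuild the image for the Web UI only and take the deploy-only path for both API deployments.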
And for removing:
graph TD
A0(Comment in a PR containing 'cicd/revert:<chartName>') -->A1(revert-marquez)
A1(revert-marquez) -->B1(downstream-marquez-api-write)
A1(revert-marquez) -->B2(downstream-marquez-api-read-only)
A1(revert-marquez) -->B3(downstream-marquez-web-ui)
B1(downstream-marquez-api-write) -->C1(remove-helm-chart)
B2(downstream-marquez-api-read-only) -->C2(remove-helm-chart)
B3(downstream-marquez-web-ui) -->C3(remove-helm-chart)
C1(remove-helm-chart) -->D1(Marquez API Write removed)
C2(remove-helm-chart) -->D2(Marquez API Read-only removed)
C3(remove-helm-chart) -->D3(Marquez Web UI removed)
To release, open a PR and comment:
cicd/release:<name of the release file without '.json'>
This triggers the release-marquez.yaml workflow, which validates the content and sends it to an event-driven layer. That layer triggers the "downstream" workflow files, which are expected to match the build and deploy strategy defined in the release definition file.
For more about the release definition files, see here.
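To make the comment convention concrete, here is a minimal sketch that turns a release comment into the definition file name. The allowed characters in the name and the rejection of a trailing ".json" are assumptions for illustration; `release_file_from_comment` is a hypothetical helper, not part of the actual workflow.

```python
import re
from typing import Optional

# Assumed shape: cicd/release:<file name without the ".json" extension>
_RELEASE = re.compile(r"^cicd/release:(?P<name>[A-Za-z0-9._-]+)$")


def release_file_from_comment(comment: str) -> Optional[str]:
    """Return '<name>.json' for a release comment, or None if it does not match."""
    match = _RELEASE.match(comment.strip())
    if match is None:
        return None
    name = match["name"]
    if name.endswith(".json"):
        # The comment is expected to omit the extension (assumed to be an error here).
        return None
    return f"{name}.json"
```

Under these assumptions, a comment naming a release file `example-release` resolves to `example-release.json`, while any comment with a different prefix is ignored.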
To remove a deployment, open a PR and comment:
cicd/revert:<name of the Helm chart>
This triggers the revert-marquez.yaml workflow, which builds an event describing the deletion of the Helm chart and sends it to an event-driven layer that triggers the "downstream" workflow files.
The current chart names are:
marquez-api-write
marquez-api-read-only
marquez-web-ui
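The revert flow can be sketched in the same spirit: parse the comment, check the chart name against the list above, and build a deletion event. The event fields (`action`, `chart`) are invented for illustration, since the real event schema is not described here, and `build_revert_event` is a hypothetical name.

```python
import json
from typing import Optional

# Chart names listed above.
CHART_NAMES = ("marquez-api-write", "marquez-api-read-only", "marquez-web-ui")
PREFIX = "cicd/revert:"


def build_revert_event(comment: str) -> Optional[str]:
    """Return a JSON deletion event for a revert comment, or None for other comments.

    Raises ValueError when the comment names a chart that is not in CHART_NAMES.
    """
    text = comment.strip()
    if not text.startswith(PREFIX):
        return None
    chart = text[len(PREFIX):]
    if chart not in CHART_NAMES:
        raise ValueError(f"unknown Helm chart: {chart!r}")
    # Hypothetical event shape; the real schema may differ.
    return json.dumps({"action": "revert", "chart": chart}, sort_keys=True)
```

Validating the chart name before emitting the event keeps a mistyped comment from reaching the downstream workflows at all.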