Welcome to KTO Group, where innovation drives excitement in iGaming.
Founded in 2018 by Andreas Bardun, we're transforming online gaming with a focus on transparency and player satisfaction.
At KTO.com, we blend the thrill of sports betting with online casino entertainment, tailored to local markets and powered by our proprietary platform for a seamless, personalized experience.
KTO is a rising leader in LATAM, proudly ranked among Brazil's top 10 iGaming brands.
Join us as we set new standards in trust, innovation, and the future of iGaming.
SUMMARY OF THE POSITION
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our technology team at KTO.
The successful candidate will be responsible for designing, implementing, and maintaining scalable and reliable infrastructure while ensuring seamless deployments and system stability.
You will work closely with development teams, applying SRE principles to optimize system performance, observability, and automation.
MAIN RESPONSIBILITIES
Design, develop, and maintain automation solutions using Terraform for centralized Infrastructure as Code (IaC) management.
Implement and manage CI/CD pipelines with GitHub Actions and ArgoCD to support continuous and secure application deployments.
Enhance system stability and reliability by establishing advanced observability practices, using Elastic Cloud, Grafana, and Prometheus to monitor APM, logs, and metrics with event correlation.
Proactively identify and resolve performance and availability issues in distributed systems to ensure minimal downtime and high reliability.
Manage and optimize containerized environments with Kubernetes, ensuring scalability and high availability of applications.
Collaborate with development teams to align operations with SRE best practices, including SLIs, SLOs, and error budgets.
Advocate for and implement strategies for blue/green deployments and Continuous Deployment to minimize risk during releases.
EXPERIENCE & QUALIFICATIONS REQUIRED
Proven experience with Terraform for managing Infrastructure as Code.
Strong expertise in container orchestration with Kubernetes and Helm for centralized management.
Proficiency in observability tools, particularly Elastic Cloud, Grafana, and Prometheus, for monitoring and troubleshooting.
Expertise in CI/CD pipelines with GitHub Actions and ArgoCD.
Solid experience with Linux systems and shell scripting.
Hands-on experience with AWS or similar cloud platforms.
Knowledge of programming languages such as Python, Go, or Java.
Experience with SQL and NoSQL databases.
Background in deploying and managing highly available, scalable production environments.
NICE TO HAVE
Experience with advanced deployment strategies such as canary releases or feature flags.
Knowledge of distributed tracing and correlation techniques.
Exposure to DevOps practices and tools applied to reliability engineering.
Certifications in Cloud Computing, Kubernetes, or DevOps-related areas.
At KTO, diversity isn't just a buzzword – it's our strength.
We're all about creating an inclusive environment where everyone feels valued and empowered.
Together, we're not just working on projects – we're making a real impact in our communities.
Join us in celebrating diversity and driving meaningful change!
KTO is licensed for Brazilian sports betting and online gaming under Portaria 2.093/2024, ensuring a secure and regulated environment for our operations.