Site Reliability Engineer [REMOTE | +/- 2hrs CEST]
Own the infra of a cloud platform rewriting efficiency from first principles. Direct impact, zero bureaucracy, world-class systems team.
We usually respond within a day
The cloud is broken: It's wasteful, slow, awfully expensive, and burdened with legacy tech that wasn't built for today's workloads. At Unikraft we're building a generational, truly millisecond-native, extremely scalable cloud platform that provides exponentially higher efficiency. Are you bored with your current job? Want to push the boundaries of what's possible in the cloud to the absolute limit?
Our team consists of some of the best systems, performance, and security geeks out there, and is backed by top investors with category leaders as our customers. We believe a focused team of exceptional people, moving fast with conviction, can rebuild the cloud from first principles and make extreme efficiency (e.g., millions of users on a few servers) available to everyone.
What you'll do and why it's career-defining
This is a rare opportunity to work at the very foundation of a generational cloud platform -- one that's rewriting the rules of infrastructure performance. As an SRE at Unikraft, you won't just keep the lights on. You'll be building the reliability and deployment machinery that underpins a product developers love and trust with their most demanding workloads.
You'll work closely with world-class systems engineers and have direct ownership over production environments, deployment pipelines, and observability infrastructure. If you care deeply about reliability, love automation, and want your work to have a measurable impact on a fast-growing platform, this is your role.
What You'll Own
Deployments & Reliability
Maintain and operate customer on-prem and cloud deployments of our platform, ensuring reliability and rapid troubleshooting of technical issues.
Plan, package, and roll out software updates both internally and to customers, including testing and validation.
Collaborate with engineering to ensure quality deployments and maintain a high standard of product reliability.
Deploy, manage, and troubleshoot Kubernetes clusters for reliable, scalable infrastructure.
Observability & Automation
Set up and manage monitoring systems to proactively detect and resolve issues in production environments.
Write scripts and automation for deployment, infrastructure management, and CI/CD workflows.
Build tooling and automation to streamline deployment and platform integration.
Contribute to continuous integration pipelines that catch regressions across components and system integrations.
Documentation
Create and maintain clear documentation for systems, processes, and tools to support team effectiveness.
What We're Looking For
At least 2 years of experience working in high-pressure production environments.
Proven experience in Linux system administration, software packaging, and delivery.
Solid understanding of Linux networking fundamentals and best practices, including firewalls, DNS, and proxies.
Experience managing and troubleshooting Kubernetes clusters in production.
Good understanding of the CNCF/cloud-native landscape and associated tools.
Familiarity with observability tools such as Prometheus and Grafana.
Basic scripting skills (e.g., Bash, Python).
Familiarity with cloud platforms (e.g., AWS, GCP, Azure).
Interest in automation tools like Ansible, Terraform, or similar.
Exposure to CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI).
Familiarity with microservice architectures, serverless, and DevOps best practices.
[BONUS] Familiarity with virtualization solutions like QEMU/KVM -- micro-VMMs like Cloud Hypervisor or Firecracker are a big plus.
Why you will love this team
Elite founders, real access: Work directly with globally recognized deep-tech founders who've spent their careers at the frontier of systems and cloud research. You'll learn more here in a year than most people do in five.
World-class product: A category-defining technology that sparks genuine excitement among developers.
Zero bureaucracy: Founder-led, product-obsessed, and deeply technical.
Fully Remote, Fully Flexible: Work from your favorite place, at your most productive times.
Retreats, Game Nights and More: Fun-focused team retreats and other events to recharge and build great relationships.
The Standard Stuff: Competitive salary, 6 weeks of vacation, development opportunities.
Department: Platform Team
Locations: Berlin
Remote status: Fully Remote