Position Title: Site Reliability Engineer

Date: Jul 14, 2026

Requisition ID: 25576

Work Location:

Sintra, Sintra, PT, 2639-002

MAKE HISTORY WITH US!

At PMI, we’ve chosen to do something incredible. We’re totally transforming our business and building our future on smoke-free products with the power to improve the lives of a billion smokers worldwide.

With huge change comes huge opportunity. So, wherever you join us, you’ll enjoy the freedom to dream up and deliver better, brighter solutions and the space to move your career forward in endlessly different directions.

Role purpose

The SRE role focuses on improving the reliability, operability, and observability of production services through hands‑on engineering and operational work. The role combines day‑to‑day operational responsibilities with continuous improvement activities across monitoring, alerting, incident support, logging, and automation.

The SRE works on live systems and is expected to investigate production issues, troubleshoot complex problems, and implement improvements that make services more reliable and easier to operate. This includes configuring and maintaining dashboards, alerts, log views, and automation using established SRE tooling and infrastructure‑as‑code practices. The role is execution‑oriented and applies defined standards and frameworks rather than setting organizational reliability strategy.

The role also includes supporting the adoption and usage of SLIs and SLOs by implementing agreed definitions, ensuring correct data sources, and helping teams use reliability signals in daily operations. The focus is on consistent implementation and operational use, not ownership of the reliability framework itself.

An AI‑oriented mindset is expected. This means understanding the concepts and potential applications of AI‑assisted capabilities within SRE tools (for example, anomaly detection, noise reduction, correlation, and automation support), and being able to work with AI‑enabled features where they are available. The role does not require building AI models, but does require the ability to understand how AI‑driven features influence observability, alerting, and operational workflows, and to use these features responsibly within existing tools.

In addition, the SRE is expected to interact with external vendors related to SRE tooling and platforms. This includes acting as a technical point of contact for operational topics such as troubleshooting, integrations, upgrades, and feature usage. Vendor interaction is expected to grow progressively over time, starting with guided collaboration and moving toward more autonomous technical ownership.

Overall, the role is intended for an engineer who can operate independently on complex tasks, apply SRE practices consistently, understand modern observability and automation tooling (including AI‑assisted capabilities), and contribute to improving reliability through practical, measurable changes.

Key responsibilities

Monitoring & observability
- Implement and improve monitoring capabilities to ensure real-time visibility and proactive issue detection.
- Design, build, and maintain dashboards, alerts, and supporting telemetry.

On-call & escalation enablement
- Improve alert routing, escalation policies, and integrations between monitoring and alerting tools.
- Support platform teams in adopting alerting best practices and reducing alert noise.

Incident support & problem management
- Contribute to the resolution of complex incidents through structured troubleshooting and analysis.
- Support root cause analysis, documentation, and corrective actions to prevent recurrence.

Log aggregation & analysis
- Improve log ingestion, analysis, and visualization using ELK.
- Build reusable dashboards and alerts based on log patterns and operational signals.

SLO/SLI implementation
- Support the definition and implementation of SLIs and SLOs.
- Use reliability data and error budgets to guide operational and engineering improvements.

Infrastructure as Code & automation
- Develop and maintain Terraform/Terraform Enterprise assets, including reusable modules.
- Automate onboarding, configuration, and operational workflows to reduce manual effort.

Vendor interaction & management
- Act as a technical point of contact for SRE-related vendors (e.g. observability, alerting, CI/CD).
- Support tool onboarding, upgrades, integrations, and issue resolution with vendors.
- Participate in vendor reviews, follow-ups, and roadmap discussions together with senior engineers or leadership.
- Ensure vendor-provided solutions align with SRE standards, tooling strategy, and operational needs.

Documentation & knowledge sharing
- Maintain technical documentation, runbooks, and operational guidelines.
- Share knowledge within the SRE team and contribute to repeatable, scalable practices.

Must-have capabilities

SRE / Reliability practices

- Intermediate understanding of SRE principles and practices.
- Ability to handle more complex tasks and contribute to continuous improvement of processes.
- Intermediate troubleshooting and problem‑solving skills in production environments.

Technical skills

- New Relic: monitoring and alerting setup, including custom dashboards.
- ELK: log management, analysis, visualization, and alerting.
- Opsgenie: alert management, routing, escalation policies, and integrations.
- Terraform / Terraform Enterprise (advanced): IaC tasks, module creation, and lifecycle management.
- Bitbucket / GitHub (advanced): branching strategies, pull requests, and code reviews.
- Python: scripting and automation, including API integrations.
- JavaScript: scripting for automation and tool integrations.
- Jenkins: CI/CD pipelines, complex workflows, and integrations.
- AWS: understanding of core cloud services and reliability fundamentals.

Vendor coordination

- Ability to work with external vendors on technical topics, including issue triage, implementation support, and follow‑ups.
- Comfortable representing the SRE perspective in discussions with vendors.

Should-have capabilities

- Ability to mentor junior engineers and provide technical guidance.
- Strong communication and collaboration skills, including working across internal teams and with external vendors.

Nice-to-have capabilities

- Understanding of Node.js.
- Familiarity with container technologies (Docker, Kubernetes).
- Familiarity with Ansible.

Please note that only online applications will be taken into consideration.

Only selected candidates will be contacted.

Note for applicants in Poland:

In this position, you will earn no less than PLN 17,621 gross per month.