available for opportunities · Dublin, Ireland
Site Reliability Engineer · Incident Response Manager
7+ years keeping high-availability fintech and enterprise systems reliable, observable, and resilient. Specialist in monitoring strategy, incident management, and operational automation.
work history
featured work
Built a Python utility that automatically captures and formats incident timelines during major outages, eliminating manual logging during high-pressure P1 situations and significantly reducing post-incident documentation effort.
Designed and implemented a full-stack monitoring strategy across production payment services using Dynatrace, Splunk, and New Relic. Reduced alert noise by consolidating thresholds and introducing synthetic monitoring for critical user journeys.
Developed a structured readiness framework to assess and improve operational stability before major releases — covering monitoring coverage, runbook completeness, on-call preparedness, and rollback plans across engineering teams.
Designed a suite of Bash scripts to automate repetitive operational tasks for national customs infrastructure at Version 1, saving 30+ engineering hours per month and reducing risk from manual processes in high-volume environments.
Contributed to AWS migration projects at Version 1, supporting workload transitions from on-premises VMware and Hyper-V environments to AWS while maintaining operational continuity for public sector clients.
Built a Python tool that generates structured P1 incident report templates pre-populated with service context, stakeholder lists, and escalation paths — cutting initial triage time and ensuring consistent communication during critical outages.
capabilities
credentials
academic background
open to new roles
Looking for a Senior SRE or Incident Response role in Dublin or remote. Let's connect.