Mid Cloud Observability Engineer

Experian

Brazil | OTHER | Posted 1 day(s) ago | $0-$0 / yr

Salary: $0-$0 / yr

Region: Brazil

Start Date: ASAP

About Experian

Experian is a global data and technology company that drives opportunities for people and businesses around the world. We operate in diverse markets such as financial services, healthcare, automotive, agribusiness, insurance, and more. Experian invests in people and advanced new technologies to unlock the power of data. We have an incredible team of 25,200 employees in 32 countries. Our uniqueness is valuing yours. Experian's people-centric, inclusive, and purpose-driven culture is recognized by numerous awards — including World’s Best Workplaces™ 2025 (Fortune's Top 25 global) and Great Place To Work™ in 26 countries, among others. Check out Experian Life on social media or explore our careers website to understand why. Experian is also proud to be an equal opportunity employer and an affirmative action employer. If you have a disability or need that requires accommodation, please let us know as soon as possible.

About this Role

A cloud observability engineer’s day is about making complex systems understandable, improving signal quality, and enabling faster, smarter debugging across teams.

  • Check system health and review alerts and incidents.

  • Triage alerts

  • Investigate issues

  • Improve observability instrumentation

  • Build and improve dashboards

  • Alert optimization

  • Work with development teams and other engineering partners.

  • Continuous improvements.

  • Release support.

Observability & Monitoring

  • Design and implement observability frameworks using metrics, logs, and distributed tracing

  • Develop dashboards, alerts, and visualizations to monitor system health

  • Standardize observability practices across engineering teams (logging, telemetry, tracing)

  • Implement and manage native monitoring tools:

  • Amazon CloudWatch (metrics, logs, alarms, dashboards)

  • AWS X-Ray (distributed tracing)

  • AWS Distro for OpenTelemetry (ADOT)

Detection & Response

  • Build alerting systems that avoid alert fatigue

  • Participate in on-call rotations

  • Build intelligent alerting using

Improve System Reliability / Proactively Reduce Risk

  • Identify reliability risks to help harden systems against failure

  • Reduction in alert noise / false positives

  • Increased observability coverage (% of services instrumented)

  • Improved SLO compliance

Instrumentation & Telemetry Engineering

  • Embed observability into applications

  • Add tracing/metrics into code

  • Standardize logging formats

  • Ensure all services are observable end-to-end:

  • Microservices (EKS, ECS)

  • Serverless (Lambda, API Gateway)

  • Data services (RDS, DynamoDB, S3)

Collaboration Across Teams

  • Development

  • Platform/infra teams

  • Security & operations

Requirements

  • Experience in SRE, DevOps, or Cloud Engineering

  • Cloud platforms: AWS

  • Experience with AWS services, including:

  • CloudWatch, X-Ray, Lambda

  • ECS/EKS, API Gateway

  • RDS, DynamoDB, S3

  • Hands-on experience with an observability tool, such as:

  • Dynatrace

  • Splunk

  • Datadog

  • OpenTelemetry

  • Prometheus, Grafana

  • AWS Distro for OpenTelemetry

  • Strong understanding of:

  • Containers & orchestration (Docker, Kubernetes)

  • CI/CD pipelines

  • Infrastructure as Code (Terraform, CloudFormation)

  • Monitoring/observability tools

Nice to have

  • Experience building observability platforms at scale in AWS

  • Familiarity with multi-account AWS environments

  • Experience with cost optimization for observability (logging/metrics ingestion)

  • Experience with high-scale distributed systems
