Data Engineer

Pavago

Brazil

About

  • Job Title: Data Engineer
  • Position Type: Full-Time, Remote
  • Working Hours: U.S. client business hours (with flexibility for pipeline monitoring and data refresh cycles)

About the Role

  • Our client is seeking a Data Engineer to design, build, and maintain reliable data pipelines and infrastructure that deliver clean, accessible, and actionable data. This role requires strong software engineering fundamentals, experience with modern data stacks, and an eye for quality and scalability. The Data Engineer ensures data flows seamlessly from source systems to warehouses and BI tools, powering decision-making across the business.

Responsibilities

Pipeline Development

  • Build and maintain ETL/ELT pipelines using Python, SQL, or Scala.
  • Orchestrate workflows with Airflow, Prefect, Dagster, or Luigi (a minimal orchestration sketch follows this list).
  • Ingest structured and unstructured data from APIs, SaaS platforms, relational databases, and streaming sources.
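
As a concrete illustration of this orchestration work, below is a minimal sketch of a daily ETL flow written with Airflow's TaskFlow API. The API endpoint, field names, and load target are assumptions made for the example, not details from this posting; Prefect or Dagster would express the same flow with their own decorators.

```python
# Minimal daily ETL DAG sketch using Airflow's TaskFlow API (Airflow 2.4+).
# The source API, field names, and load target are hypothetical placeholders.
from datetime import datetime

import requests
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a hypothetical SaaS API.
        resp = requests.get("https://api.example.com/v1/orders", timeout=30)
        resp.raise_for_status()
        return resp.json()

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Keep completed orders only and normalise field names.
        return [
            {"order_id": r["id"], "amount_usd": r["amount"], "ordered_at": r["created"]}
            for r in records
            if r.get("status") == "completed"
        ]

    @task
    def load(rows: list[dict]) -> None:
        # A real pipeline would write to Snowflake/BigQuery/Redshift here;
        # this placeholder just reports the row count.
        print(f"would load {len(rows)} rows into the warehouse")

    load(transform(extract()))


orders_etl()
```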

Data Warehousing

  • Manage data warehouses (Snowflake, BigQuery, Redshift).
  • Design schemas (star/snowflake) optimized for analytics.
  • Implement partitioning, clustering, and query performance tuning (illustrated in the sketch after this list).
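
To make the partitioning and clustering point concrete, the sketch below creates a date-partitioned, clustered fact table through the google-cloud-bigquery client. The project, dataset, table, and column names are invented for illustration; Snowflake and Redshift express the same ideas (clustering keys, sort/dist keys) with their own syntax.

```python
# Sketch: create a date-partitioned, clustered fact table in BigQuery.
# Project, dataset, table, and column names are placeholder assumptions.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

ddl = """
CREATE TABLE IF NOT EXISTS `my_project.analytics.fct_orders`
(
  order_id    STRING,
  customer_id STRING,
  amount_usd  NUMERIC,
  ordered_at  TIMESTAMP
)
PARTITION BY DATE(ordered_at)  -- prune scans to the dates a query touches
CLUSTER BY customer_id         -- co-locate rows for common filters and joins
"""

client.query(ddl).result()  # .result() blocks until the DDL job completes
```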

Data Quality & Governance

  • Implement validation checks, anomaly detection, and logging for data integrity (see the validation sketch after this list).
  • Enforce naming conventions, lineage tracking, and documentation (dbt, Great Expectations).
  • Maintain compliance with GDPR, HIPAA, or industry-specific regulations.
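
As a sketch of the kind of validation check this implies, the snippet below uses plain pandas rather than a particular framework (dbt tests or Great Expectations would express similar rules declaratively). The column names and thresholds are illustrative assumptions.

```python
# Sketch: lightweight validation checks on a batch before it reaches BI tools.
# Column names, thresholds, and logger configuration are illustrative assumptions.
import logging

import pandas as pd

logger = logging.getLogger("data_quality")


def validate_orders(df: pd.DataFrame) -> None:
    """Raise on hard integrity failures, log a warning on soft anomalies."""
    # Hard checks: fail the pipeline run if these are violated.
    if df["order_id"].isnull().any():
        raise ValueError("null order_id values found")
    if df["order_id"].duplicated().any():
        raise ValueError("duplicate order_id values found")
    if (df["amount_usd"] < 0).any():
        raise ValueError("negative order amounts found")

    # Soft anomaly check: warn if the latest day's volume drops well below average.
    # Assumes ordered_at is already a datetime column.
    daily_counts = df.groupby(df["ordered_at"].dt.date).size()
    if len(daily_counts) > 1 and daily_counts.iloc[-1] < 0.5 * daily_counts.iloc[:-1].mean():
        logger.warning("latest day's row count is less than half the recent average")
```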

Streaming & Real-Time Data

  • Develop and monitor streaming pipelines with Kafka, Kinesis, or Pub/Sub (a consumer sketch follows this list).
  • Ensure low-latency ingestion for time-sensitive use cases.
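
A minimal consumer sketch for this kind of pipeline, using the kafka-python client, is shown below. The topic, broker address, consumer group, and sink are assumptions; Kinesis and Pub/Sub use their own SDKs but follow the same subscribe-and-process shape.

```python
# Sketch: low-latency ingestion from a Kafka topic using kafka-python.
# Topic, broker, consumer group, and downstream sink are placeholder assumptions.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "user_events",                       # hypothetical topic name
    bootstrap_servers="localhost:9092",  # hypothetical broker address
    group_id="event-ingestion",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # A real pipeline would buffer events and write micro-batches to the
    # warehouse or a staging bucket; this placeholder just prints them.
    print(event.get("event_type"), event.get("user_id"))
```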

Collaboration

  • Partner with analysts and data scientists to provide curated, reliable datasets.
  • Support BI teams in building dashboards (Tableau, Looker, Power BI).
  • Document data models and pipelines for knowledge transfer.

Infrastructure & DevOps

  • Containerize data services with Docker and orchestrate in Kubernetes.
  • Automate deployments via CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI).
  • Manage cloud infrastructure using Terraform or CloudFormation.

What Makes You a Perfect Fit

  • Passion for clean, reliable, and scalable data.
  • Strong problem-solving skills and a debugging mindset.
  • Balance of software engineering rigor and data intuition.
  • Collaborative communicator who thrives in cross-functional environments.

Required Experience & Skills (Minimum)

  • 3+ years in data engineering or back-end development.
  • Strong Python and SQL skills.
  • Experience with at least one major data warehouse (Snowflake, Redshift, BigQuery).
  • Familiarity with pipeline orchestration tools (Airflow, Prefect).

Ideal Experience & Skills

  • Experience with dbt for transformations and data modeling.
  • Streaming data experience (Kafka, Kinesis, Pub/Sub).
  • Cloud-native data platforms (AWS Glue, GCP Dataflow, Azure Data Factory).
  • Background in regulated industries (healthcare, finance) with strict compliance requirements.

What Does a Typical Day Look Like?

A Data Engineer’s day revolves around keeping pipelines running, improving reliability, and enabling teams with high-quality data. You will:

  • Check pipeline health in Airflow/Prefect and resolve any failed jobs.
  • Ingest new data sources, writing connectors for APIs or SaaS platforms.
  • Optimize SQL queries and warehouse performance to reduce costs and latency (see the dry-run sketch after this list).
  • Collaborate with analysts/data scientists to deliver clean datasets for dashboards and models.
  • Implement validation checks to prevent downstream reporting issues.
  • Document and monitor pipelines so they’re reproducible, scalable, and audit-ready.

In essence, you ensure the business has accurate, timely, and trustworthy data powering every decision.
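
One way the cost-and-latency work mentioned above shows up in day-to-day practice is estimating how much data a query will scan before running it. The sketch below uses a BigQuery dry run for that check; the query text and table name are invented for illustration, and other warehouses expose similar EXPLAIN or profiling tools.

```python
# Sketch: estimate query cost with a BigQuery dry run before executing it.
# The query text and table name are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT customer_id, SUM(amount_usd) AS revenue
FROM `my_project.analytics.fct_orders`
WHERE ordered_at >= TIMESTAMP('2024-01-01')   -- partition filter limits the scan
GROUP BY customer_id
"""

# dry_run=True validates the query and reports bytes scanned without running it.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(query, job_config=job_config)

gb_scanned = job.total_bytes_processed / 1e9
print(f"dry run: this query would scan about {gb_scanned:.2f} GB")
```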

Key Metrics for Success (KPIs)

  • Pipeline uptime ≥ 99%.
  • Data freshness within agreed SLAs (hourly, daily, weekly).
  • Zero critical data quality errors reaching BI/analytics.
  • Cost-optimized queries and warehouse performance.
  • Positive feedback from data consumers (analysts, scientists, leadership).

Interview Process

  • Initial Phone Screen
  • Video Interview with Pavago Recruiter
  • Technical Task (e.g., build a small ETL pipeline or optimize a SQL query)
  • Client Interview with Engineering/Data Team
  • Offer & Background Verification