Data Engineer

Pavago

Brazil

About

  • Job Title: Data Engineer
  • Position Type: Full-Time, Remote
  • Working Hours: U.S. client business hours (with flexibility for pipeline monitoring and data refresh cycles)

About the Role

  • Our client is seeking a Data Engineer to design, build, and maintain reliable data pipelines and infrastructure that deliver clean, accessible, and actionable data. This role requires strong software engineering fundamentals, experience with modern data stacks, and an eye for quality and scalability. The Data Engineer ensures data flows seamlessly from source systems to warehouses and BI tools, powering decision-making across the business.

Responsibilities

Pipeline Development

  • Build and maintain ETL/ELT pipelines using Python, SQL, or Scala.
  • Orchestrate workflows with Airflow, Prefect, Dagster, or Luigi (a minimal orchestration sketch follows this list).
  • Ingest structured and unstructured data from APIs, SaaS platforms, relational databases, and streaming sources.
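
As a concrete illustration of this orchestration work, below is a minimal sketch of a daily ETL flow written with Airflow's TaskFlow API. The API endpoint, field names, and load target are assumptions made for the example, not details from this posting; Prefect or Dagster would express the same flow with their own decorators.

```python
# Minimal daily ETL DAG sketch using Airflow's TaskFlow API (Airflow 2.4+).
# The source API, field names, and load target are hypothetical placeholders.
from datetime import datetime

import requests
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a hypothetical SaaS API.
        resp = requests.get("https://api.example.com/v1/orders", timeout=30)
        resp.raise_for_status()
        return resp.json()

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Keep completed orders only and normalise field names.
        return [
            {"order_id": r["id"], "amount_usd": r["amount"], "ordered_at": r["created"]}
            for r in records
            if r.get("status") == "completed"
        ]

    @task
    def load(rows: list[dict]) -> None:
        # A real pipeline would write to Snowflake/BigQuery/Redshift here;
        # this placeholder just reports the row count.
        print(f"would load {len(rows)} rows into the warehouse")

    load(transform(extract()))


orders_etl()
```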

Data Warehousing

  • Manage data warehouses (Snowflake, BigQuery, Redshift).
  • Design schemas (star/snowflake) optimized for analytics.
  • Implement partitioning, clustering, and query performance tuning (illustrated in the sketch after this list).
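
To make the partitioning and clustering point concrete, the sketch below creates a date-partitioned, clustered fact table through the google-cloud-bigquery client. The project, dataset, table, and column names are invented for illustration; Snowflake and Redshift express the same ideas (clustering keys, sort/dist keys) with their own syntax.

```python
# Sketch: create a date-partitioned, clustered fact table in BigQuery.
# Project, dataset, table, and column names are placeholder assumptions.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

ddl = """
CREATE TABLE IF NOT EXISTS `my_project.analytics.fct_orders`
(
  order_id    STRING,
  customer_id STRING,
  amount_usd  NUMERIC,
  ordered_at  TIMESTAMP
)
PARTITION BY DATE(ordered_at)  -- prune scans to the dates a query touches
CLUSTER BY customer_id         -- co-locate rows for common filters and joins
"""

client.query(ddl).result()  # .result() blocks until the DDL job completes
```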

Data Quality & Governance

  • Implement validation checks, anomaly detection, and logging for data integrity (see the validation sketch after this list).
  • Enforce naming conventions, lineage tracking, and documentation (dbt, Great Expectations).
  • Maintain compliance with GDPR, HIPAA, or industry-specific regulations.
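
As a sketch of the kind of validation check this implies, the snippet below uses plain pandas rather than a particular framework (dbt tests or Great Expectations would express similar rules declaratively). The column names and thresholds are illustrative assumptions.

```python
# Sketch: lightweight validation checks on a batch before it reaches BI tools.
# Column names, thresholds, and logger configuration are illustrative assumptions.
import logging

import pandas as pd

logger = logging.getLogger("data_quality")


def validate_orders(df: pd.DataFrame) -> None:
    """Raise on hard integrity failures, log a warning on soft anomalies."""
    # Hard checks: fail the pipeline run if these are violated.
    if df["order_id"].isnull().any():
        raise ValueError("null order_id values found")
    if df["order_id"].duplicated().any():
        raise ValueError("duplicate order_id values found")
    if (df["amount_usd"] < 0).any():
        raise ValueError("negative order amounts found")

    # Soft anomaly check: warn if the latest day's volume drops well below average.
    # Assumes ordered_at is already a datetime column.
    daily_counts = df.groupby(df["ordered_at"].dt.date).size()
    if len(daily_counts) > 1 and daily_counts.iloc[-1] < 0.5 * daily_counts.iloc[:-1].mean():
        logger.warning("latest day's row count is less than half the recent average")
```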

Streaming & Real-Time Data

  • Develop and monitor streaming pipelines with Kafka, Kinesis, or Pub/Sub (a consumer sketch follows this list).
  • Ensure low-latency ingestion for time-sensitive use cases.
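
A minimal consumer sketch for this kind of pipeline, using the kafka-python client, is shown below. The topic, broker address, consumer group, and sink are assumptions; Kinesis and Pub/Sub use their own SDKs but follow the same subscribe-and-process shape.

```python
# Sketch: low-latency ingestion from a Kafka topic using kafka-python.
# Topic, broker, consumer group, and downstream sink are placeholder assumptions.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "user_events",                       # hypothetical topic name
    bootstrap_servers="localhost:9092",  # hypothetical broker address
    group_id="event-ingestion",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # A real pipeline would buffer events and write micro-batches to the
    # warehouse or a staging bucket; this placeholder just prints them.
    print(event.get("event_type"), event.get("user_id"))
```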

Collaboration

  • Partner with analysts and data scientists to provide curated, reliable datasets.
  • Support BI teams in building dashboards (Tableau, Looker, Power BI).
  • Document data models and pipelines for knowledge transfer.

Infrastructure & DevOps

  • Containerize data services with Docker and orchestrate in Kubernetes.
  • Automate deployments via CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI).
  • Manage cloud infrastructure using Terraform or CloudFormation.

What Makes You a Perfect Fit

  • Passion for clean, reliable, and scalable data.
  • Strong problem-solving skills and a debugging mindset.
  • Balance of software engineering rigor and data intuition.
  • Collaborative communicator who thrives in cross-functional environments.

Required Experience & Skills (Minimum)

  • 3+ years in data engineering or back-end development.
  • Strong Python and SQL skills.
  • Experience with at least one major data warehouse (Snowflake, Redshift, BigQuery).
  • Familiarity with pipeline orchestration tools (Airflow, Prefect).

Ideal Experience & Skills

  • Experience with dbt for transformations and data modeling.
  • Streaming data experience (Kafka, Kinesis, Pub/Sub).
  • Cloud-native data platforms (AWS Glue, GCP Dataflow, Azure Data Factory).
  • Background in regulated industries (healthcare, finance) with strict compliance requirements.

What Does a Typical Day Look Like?

A Data Engineer’s day revolves around keeping pipelines running, improving reliability, and enabling teams with high-quality data. You will:

  • Check pipeline health in Airflow/Prefect and resolve any failed jobs.
  • Ingest new data sources, writing connectors for APIs or SaaS platforms.
  • Optimize SQL queries and warehouse performance to reduce costs and latency (see the dry-run sketch after this list).
  • Collaborate with analysts/data scientists to deliver clean datasets for dashboards and models.
  • Implement validation checks to prevent downstream reporting issues.
  • Document and monitor pipelines so they’re reproducible, scalable, and audit-ready.

In essence, you ensure the business has accurate, timely, and trustworthy data powering every decision.
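
One way the cost-and-latency work mentioned above shows up in day-to-day practice is estimating how much data a query will scan before running it. The sketch below uses a BigQuery dry run for that check; the query text and table name are invented for illustration, and other warehouses expose similar EXPLAIN or profiling tools.

```python
# Sketch: estimate query cost with a BigQuery dry run before executing it.
# The query text and table name are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT customer_id, SUM(amount_usd) AS revenue
FROM `my_project.analytics.fct_orders`
WHERE ordered_at >= TIMESTAMP('2024-01-01')   -- partition filter limits the scan
GROUP BY customer_id
"""

# dry_run=True validates the query and reports bytes scanned without running it.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(query, job_config=job_config)

gb_scanned = job.total_bytes_processed / 1e9
print(f"dry run: this query would scan about {gb_scanned:.2f} GB")
```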

Key Metrics for Success (KPIs)

  • Pipeline uptime ≥ 99%.
  • Data freshness within agreed SLAs (hourly, daily, weekly).
  • Zero critical data quality errors reaching BI/analytics.
  • Cost-optimized queries and warehouse performance.
  • Positive feedback from data consumers (analysts, scientists, leadership).

Interview Process

  • Initial Phone Screen
  • Video Interview with Pavago Recruiter
  • Technical Task (e.g., build a small ETL pipeline or optimize a SQL query)
  • Client Interview with Engineering/Data Team
  • Offer & Background Verification