
Data Engineer (DBT + Spark + Argo) (Remote – LATAM)
Jobgether
Brazil
About
- This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Data Engineer (DBT + Spark + Argo) in Latin America.
- We are seeking a highly skilled Data Engineer to join a remote-first, collaborative team driving the modernization of large-scale data platforms in the healthcare sector. In this role, you will work on transforming legacy SQL pipelines into modular, scalable, and testable DBT architectures, leveraging Spark for high-performance processing and Argo for workflow orchestration. You will implement modern lakehouse solutions, optimize storage and querying strategies, and enable real-time analytics with ElasticSearch. This position offers the chance to contribute to a cutting-edge, cloud-native data environment, working closely with cross-functional teams to deliver reliable, impactful data solutions.
Accountabilities
- Translate legacy T-SQL logic into modular, scalable DBT models powered by Spark SQL (an illustrative sketch follows this list).
- Build reusable, high-performance data transformation pipelines.
- Develop testing frameworks to ensure data accuracy and integrity within DBT workflows.
- Design and orchestrate automated workflows using Argo Workflows and CI/CD pipelines with Argo CD.
- Manage reference datasets and mock data (e.g., ICD-10, CPT), maintaining version control and governance.
- Implement efficient storage and query strategies using Apache Hudi, Parquet, and Iceberg.
- Integrate ElasticSearch for analytics through APIs and pipelines supporting indexing and querying.
- Collaborate with DevOps teams to optimize cloud storage, enforce security, and ensure compliance.
- Participate in Agile squads, contributing to planning, estimation, and sprint reviews.
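To give a concrete flavor of the transformation work described above, the sketch below re-expresses a legacy-style SQL aggregation as a modular Spark SQL step with a lightweight data check. It is an illustration only: the Spark session setup, table, column, and S3 path names (claims, icd10_code, s3://example-bucket/...) are hypothetical, and in the actual role this logic would be packaged as DBT models and tests rather than a standalone script.

```python
# Hypothetical sketch: a legacy T-SQL-style aggregation re-expressed as a
# modular Spark SQL transformation with a simple data check. Names and
# paths are placeholders, not taken from the job description.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("claims_rollup_sketch").getOrCreate()

# Load raw claims (hypothetical Parquet layout) and expose them to Spark SQL.
claims = spark.read.parquet("s3://example-bucket/raw/claims/")
claims.createOrReplaceTempView("claims")

# Modular, testable transformation: monthly claim counts per diagnosis code.
monthly_by_diagnosis = spark.sql("""
    SELECT icd10_code,
           date_trunc('month', service_date) AS service_month,
           COUNT(*)                          AS claim_count
    FROM claims
    GROUP BY icd10_code, date_trunc('month', service_date)
""")

# Lightweight check, analogous in spirit to a DBT not_null test.
assert monthly_by_diagnosis.filter("icd10_code IS NULL").count() == 0

monthly_by_diagnosis.write.mode("overwrite").parquet(
    "s3://example-bucket/marts/monthly_claims_by_diagnosis/"
)
```

Keeping each transformation this small and individually testable is what makes the later migration into versioned DBT models straightforward.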
Requirements
- Strong experience with DBT for data modeling, testing, and deployment.
- Hands-on proficiency in Spark SQL, including performance tuning.
- Solid programming skills in Python for automation and data manipulation.
- Familiarity with Jinja templating to build reusable DBT components.
- Practical experience with data lake formats: Apache Hudi, Parquet, Iceberg.
- Expertise in Argo Workflows and CI/CD integration with Argo CD.
- Deep understanding of AWS S3 storage, performance tuning, and cost optimization.
- Experience with ElasticSearch for indexing and querying structured/unstructured data (see the sketch after this list).
- Knowledge of healthcare data standards (e.g., ICD-10, CPT).
- Ability to work cross-functionally in Agile environments.
- Nice to have: Experience with Docker, Kubernetes, cloud-native data tools (AWS Glue, Databricks, EMR), CI/CD automation, data compliance standards (HIPAA, SOC2), or contributions to open-source DBT/Spark projects.
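As a companion to the lake-format and ElasticSearch requirements above, here is a hedged sketch of pushing a curated lake table into an ElasticSearch index and querying it back. It assumes PySpark plus the elasticsearch-py 8.x client; the endpoint, index, and field names are hypothetical, and a production pipeline would use bulk indexing (or a dedicated Spark-to-ElasticSearch connector) rather than a per-row loop.

```python
# Hypothetical sketch: indexing a curated lake table into ElasticSearch for
# analytics. Endpoint, index, and field names are placeholders.
from pyspark.sql import SparkSession
from elasticsearch import Elasticsearch

spark = SparkSession.builder.appName("es_indexing_sketch").getOrCreate()
es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Read a curated table from the lake (plain Parquet here; Hudi or Iceberg
# tables would use their respective Spark readers).
summary = spark.read.parquet(
    "s3://example-bucket/marts/monthly_claims_by_diagnosis/"
)

# Index a small sample row by row to keep the example short.
for row in summary.limit(1000).collect():
    es.index(
        index="claims_summary",
        document={
            "icd10_code": row["icd10_code"],
            "service_month": str(row["service_month"]),
            "claim_count": row["claim_count"],
        },
    )

# Query back: documents matching a single (hypothetical) diagnosis code.
resp = es.search(index="claims_summary", query={"term": {"icd10_code": "E11.9"}})
print(resp["hits"]["total"])
```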
Benefits
- Contractor agreement with payment in USD.
- 100% remote work within LATAM.
- Observance of local public holidays.
- Access to English classes and professional learning platforms.
- Referral program and other growth opportunities.
- Exposure to cutting-edge data engineering projects in a cloud-native environment.
Jobgether Hiring Process
- Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
- When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
- 🔍 Our AI thoroughly analyzes your CV and LinkedIn profile, evaluating your skills, experience, and achievements.
- 📊 It compares your profile against the job’s core requirements and past success factors to calculate a match score.
- 🎯 The top 3 candidates with the highest match are automatically shortlisted.
- 🧠 When necessary, our human team may perform additional review to ensure no strong candidate is overlooked.
- The process is transparent, skills-based, and unbiased, focusing solely on your fit for the role. Once the shortlist is completed, it is shared with the hiring company, which then determines next steps such as interviews or additional assessments.
- Thank you for your interest!
- #LI-CL1