Data Engineer


Job Description

We are seeking a Data Engineer to build enterprise-grade data pipelines for a global client. You will join a premier AI services provider driving innovation through data science and technology.

Responsibilities:

  • Design and build production-grade data pipelines in Databricks using Spark/PySpark and SQL.
  • Develop and maintain an Analytics ID stitching pipeline using deterministic and probabilistic matching techniques across multiple customer data sources.
  • Build and manage modular data marts (Identity, Behavior, Demographics) with independent refresh cadences.
  • Implement and maintain a scalable feature store supporting downstream analytics and data science use cases.
  • Own the end-to-end data lifecycle: ingestion, transformation, validation, deployment, monitoring, and optimization.
  • Develop data quality frameworks including schema drift detection, anomaly monitoring, match-rate validation, and automated deduplication audits.
  • Implement CI/CD processes for multi-environment promotion (dev/staging/prod) in Databricks environments.
  • Coordinate orchestration workflows and manage dependencies using Databricks Workflows or similar tools.
  • Collaborate closely with Data Architects and Client stakeholders to translate business rules into scalable technical solutions.
  • Produce comprehensive technical documentation including data contracts, lineage maps, architecture diagrams, and operational runbooks.
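To give a flavor of the ID stitching work described above, here is a minimal pure-Python sketch of deterministic and probabilistic matching. The field names, the similarity measure, and the 0.85 threshold are illustrative assumptions only; in practice this logic would run as a PySpark pipeline over the client's customer data sources.

```python
from difflib import SequenceMatcher

def deterministic_match(a, b):
    # Deterministic rule: exact match on a stable identifier.
    # (Hypothetical field name "email"; real pipelines often use hashed keys.)
    return a.get("email") is not None and a.get("email") == b.get("email")

def probabilistic_match(a, b, threshold=0.85):
    # Probabilistic rule: fuzzy name similarity, gated on matching postal code.
    # Threshold and fields are illustrative assumptions.
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return a.get("zip") == b.get("zip") and name_sim >= threshold

def stitch(records):
    # Assign a shared analytics ID to records linked by either rule.
    ids, next_id = {}, 0
    for i, rec in enumerate(records):
        assigned = None
        for j in range(i):
            if deterministic_match(rec, records[j]) or probabilistic_match(rec, records[j]):
                assigned = ids[j]
                break
        if assigned is None:
            assigned, next_id = next_id, next_id + 1
        ids[i] = assigned
    return [ids[i] for i in range(len(records))]
```

A production version would also handle transitive matches, blocking keys for scale, and match-rate validation, as the responsibilities above imply.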

Requirements:

  • 4+ years of experience in Data Engineering building production-grade data pipelines at scale.
  • Strong hands-on experience with Databricks and Apache Spark (PySpark preferred).
  • Advanced SQL skills (complex joins, CTEs, window functions, performance tuning).
  • Experience developing identity resolution or entity matching pipelines (deterministic and/or probabilistic).
  • Experience designing and implementing data marts or dimensional models (Kimball or similar).
  • Familiarity with data quality frameworks (schema drift detection, validation, anomaly monitoring).
  • Experience implementing CI/CD for data pipelines and managing multi-environment deployments.
  • Strong communication skills and ability to present technical concepts to non-technical stakeholders.
  • Experience using Jira for ticket tracking and Confluence for documentation.
  • Advanced English level (written and spoken) required for client-facing collaboration and technical presentations.

Nice to Have:

  • Experience with third-party data providers (Epsilon, LiveRamp, Neustar).
  • Experience with feature stores (Databricks Feature Store, Feast, or similar).
  • Knowledge of Databricks Unity Catalog.
  • Experience managing large-scale customer data (transactions, loyalty, retail/QSR data).
  • Experience with Delta Lake / Lakehouse architecture.
  • Familiarity with orchestration tools such as Airflow.
  • Experience working in consulting or embedded enterprise client environments.
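For a sense of what the schema drift detection mentioned above involves, here is a minimal sketch that diffs an expected column map against an observed one. Column names and types are hypothetical; a production framework would read schemas from the catalog and alert on drift automatically.

```python
def detect_schema_drift(expected, observed):
    """Compare an expected schema (column -> type) with an observed one.

    Returns columns that went missing, appeared unexpectedly,
    or changed type. All names here are illustrative.
    """
    missing = set(expected) - set(observed)
    added = set(observed) - set(expected)
    type_changed = {
        col for col in set(expected) & set(observed)
        if expected[col] != observed[col]
    }
    return {"missing": missing, "added": added, "type_changed": type_changed}
```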

Benefits:

  • 📚 Learning Opportunities: Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
  • Access to AI learning paths to stay up to date with the latest technologies.
  • Study plans, courses, and additional certifications tailored to your role.
  • Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
  • English lessons to support your professional communication.
  • 👨🏽‍💻 Travel opportunities to attend industry conferences and meet clients.
  • 👩‍🏫 Mentoring and Development: Career development plans and mentorship programs to help shape your path.
  • 🎁 Celebrations & Support: Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
  • Company-provided equipment.
  • ⚖️ Flexible working options to help you strike the right balance.
  • Other benefits may vary according to your location in LATAM.