Data Engineer

Posted 34 days ago

Job Description

As a Senior Data Engineer at Pythian, you will collaborate with teams around the globe to design impactful cloud data solutions, lead technical projects, and optimize data architectures with a focus on advanced analytics.

Responsibilities:

  • Design and develop end-to-end cloud-based solutions with a strong emphasis on data applications and infrastructure.
  • Lead discovery and design sessions with customers to gather requirements and translate functional needs into detailed designs.
  • Create and contribute to technical design documents and other project-related documentation.
  • Work with stakeholders to identify technical and business requirements, and apply best practices and standards to achieve successful project outcomes.
  • Consistently apply established practices and standards for cloud solutions.
  • Write high-performance, reliable, and maintainable code.
  • Develop test automation frameworks and associated tooling to ensure project success.
  • Handle complex and diverse cloud-based projects, including tasks such as collecting, managing, analyzing, and visualizing very large datasets.
  • Build efficient and scalable data pipelines for batch and real-time use cases across various source and target systems (see the orchestration sketch after this list).
  • Optimize ETL/ELT pipelines, troubleshoot pipeline issues, and enhance observability dashboards.
  • Execute data pipeline-specific DevOps activities, such as IaC provisioning, implementing data security, and automation.
  • Analyze potential issues, perform root cause analyses, and resolve technical challenges.
  • Review bug descriptions, functional requirements, and design documents to produce comprehensive test plans and test cases.
  • Tune the performance of batch and real-time data processing pipelines.
  • Ensure security best practices are followed when working on internal and customer-facing cloud data platforms.
  • Build foundational CI/CD pipelines for all infrastructure components, data pipelines, and custom data applications.
  • Develop observability and data quality solutions for data platforms, including ML and AI applications.
  • Act as a trusted advisor for customers, addressing technical queries and providing support.
  • Engage in thought leadership activities such as whitepaper authoring, conference presentations, and podcasting.
  • Suggest and implement ways to improve project progress and efficiency.
  • Participate in pre-sales activities when required.
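
The pipeline responsibilities above are orchestration-heavy in practice. As a minimal sketch only (assuming Apache Airflow 2.4+; the DAG, task, and dataset names are hypothetical, not Pythian's actual stack), a daily batch pipeline of the kind described might look like this:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_orders(ds, **_):
        # Placeholder extract step: pull one day's orders from a source system.
        print(f"extracting orders for {ds}")

    def load_orders(ds, **_):
        # Placeholder load step: land the extracted batch in the warehouse.
        print(f"loading orders for {ds}")

    with DAG(
        dag_id="daily_orders_elt",       # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_orders)
        load = PythonOperator(task_id="load", python_callable=load_orders)

        extract >> load  # extract must complete before load runs

A production pipeline would add retries, alerting, and data-quality checks, which is where the observability work mentioned above comes in.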

Requirements:

  • Experience in implementing complex data architecture, data modeling, data design, and persistence (e.g., warehousing, data marts, data lakes).
  • Proficiency in a programming language such as Python, Java, Go, or Scala.
  • Experience with big data cloud technologies like Microsoft Fabric, Databricks, EMR, Athena, Glue, BigQuery, Dataproc, and Dataflow.
  • Ideally, strong hands-on experience with Google Cloud Platform data technologies: Google BigQuery, Google Dataflow, and running PySpark and SparkSQL code on Dataproc.
  • Solid understanding of Spark (PySpark or SparkSQL), including use of the DataFrame API as well as analyzing and performance-tuning Spark queries (see the sketch after this list).
  • Strong experience in data orchestration using Apache Airflow.
  • Highly proficient in SQL.
  • Strong experience using code repositories such as GitHub, with demonstrable GitOps best practices.
  • Good knowledge of popular database and data warehouse technologies and concepts from Google, Amazon, or Microsoft (cloud and conventional RDBMS), such as BigQuery, Redshift, Microsoft Azure SQL Data Warehouse, and Snowflake.
  • Knowledge of how to design distributed systems and the trade-offs involved.
  • Strong knowledge of CI/CD tools and frameworks such as Jenkins and GitLab for implementing DevOps pipelines.
  • Proficiency in using GenAI tools for productivity (e.g., Copilot).
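
As an illustration of the Spark expectations above (a minimal sketch, assuming PySpark 3.x; all paths, tables, and column names are hypothetical), a DataFrame-API job with one routine tuning step might look like this:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

    # Hypothetical inputs; a real job might read from BigQuery or GCS instead.
    orders = spark.read.parquet("gs://example-bucket/orders/")
    customers = spark.read.parquet("gs://example-bucket/customers/")

    # Broadcasting the small dimension table avoids a shuffle-heavy join,
    # a routine performance-tuning step for Spark queries.
    daily_revenue = (
        orders.join(F.broadcast(customers), "customer_id")
        .groupBy("order_date", "country")
        .agg(F.sum("amount").alias("revenue"))
    )

    daily_revenue.explain()  # inspect the physical plan while tuning
    daily_revenue.write.mode("overwrite").parquet("gs://example-bucket/daily_revenue/")

The same join and aggregation could be expressed in SparkSQL; either way, the explain() output is usually where tuning work starts.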

Benefits:

  • Competitive total rewards package
  • Blog during work hours.
  • Work remotely from your home; there’s no daily travel to an office!
  • All you need is a stable internet connection.
  • Collaborate with some of the best and brightest in the industry!
  • Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like!
  • We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!
  • You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more).
  • A generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity.