Data Science Specialist – Feature Store & ML Platform

Posted 19hrs ago

Job Description

Compass UOL is hiring a Data Science Specialist to lead the development of Feature Store capabilities and implement efficient data processing. The role requires expertise in ML platforms and AWS services.

Responsibilities:

  • Lead the development and evolution of Feature Store capabilities: data lineage, feature views, feature recommendation, and new query engines;
  • Design and implement Apache Iceberg tables with a focus on read performance, versioning, and schema evolution;
  • Architect and optimize the serving layer with Redis for real-time features with strict latency SLOs;
  • Integrate and optimize Amazon EMR as a query and large-scale processing engine;
  • Define and implement feature selection and transformation pipelines with end-to-end traceability;
  • Establish standards for feature quality, versioning, and governance across the platform;
  • Act as the technical reference for data and data science teams that consume the Feature Store.

Requirements:

  • Proven expertise in feature engineering on enterprise ML platforms (Feast, Tecton, Hopsworks, or equivalents)
  • Advanced proficiency in Apache Spark / PySpark for distributed processing at scale
  • Deep knowledge of Apache Iceberg and lakehouse architectures (comparative experience with Delta Lake and Hudi)
  • Expertise in Redis for low-latency feature serving, including cache invalidation strategies and efficient serialization
  • Solid production experience with AWS data services (S3, Glue, EMR, Redshift, Athena)

Preferred:

  • Production experience with data lineage and metadata catalogs (DataHub, OpenMetadata, Marquez)
  • Experience with Amazon EMR: cluster configuration, cluster optimization, and Spark job tuning
  • Expertise in MLOps practices focused on versioning and traceability of data artifacts
  • Prior experience in a financial context with high-cardinality, high-frequency data and regulatory requirements
  • Familiarity with data quality tools at scale (Great Expectations, Soda, dbt tests)