Senior Data Scientist – International eKYC, Identity Graph

Posted 2hrs ago

Job Description

Socure is hiring a Senior Data Scientist to drive international eKYC solutions and entity resolution for identity verification. The role works with cross-functional teams to launch and scale these solutions.

Responsibilities:

  • Lead the design, development, and deployment of ML and graph-based algorithms for international entity resolution, identity trust scoring, and anomaly detection across heterogeneous, country‑specific datasets.
  • Architect reusable matching and linking frameworks that work across multiple ID schemes (e.g., national ID numbers, passports, voter IDs, mobile accounts, bank accounts) and local name/address conventions.
  • Develop probabilistic and rule‑augmented models that handle noisy, sparse, or partially labeled international data while maintaining explainability and regulatory defensibility.
  • Define and evolve the international extension of Socure’s identity graph: schema design, linkage strategies, quality tiers, and confidence scoring that can be leveraged by multiple products (Verify, KYC, watchlists, fraud).
  • Design and implement robust data quality and monitoring frameworks for international identity data (coverage, stability, drift, regional bias, label quality) and integrate them into modeling and production monitoring workflows.
  • Own experimentation strategy for major international eKYC initiatives:
      • Design offline evaluations and online A/B tests that reflect local ground truth constraints and data sparsity.
      • Define success metrics that balance approval rates, fraud capture, and regulatory/operational constraints per market.
      • Analyze lift, stability, and fairness trade‑offs and drive go/no‑go decisions with Product and Engineering.
  • Contribute to model governance documentation and support responses to regulators and large enterprise customers regarding model logic, data provenance, fairness, and monitoring for international markets.

Requirements:

  • Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field, or equivalent practical experience.
  • 6+ years of hands-on applied ML / data science experience (4+ with Ph.D.), including owning production models and pipelines in high‑stakes domains (fraud, risk, identity, payments, credit, or similar).
  • Significant prior work on international or multi‑region products is strongly preferred (e.g., cross‑country KYC, credit risk, payments, or compliance systems).
  • Expert‑level proficiency in Python and SQL, with extensive experience in distributed data processing (Spark/PySpark, Databricks or similar) on very large datasets.
  • Deep experience designing, training, and deploying models for classification, ranking, anomaly detection, and/or graph learning, including:
      • Feature engineering for noisy/heterogeneous identity data.
      • Robust evaluation under label sparsity and feedback delays.
      • Calibration and thresholding tailored to regional risk and regulatory constraints.
  • Proven expertise with graph technologies (e.g., Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms (entity resolution, link prediction, community detection, label propagation) at scale.

Benefits:

  • Offers Equity
  • Offers Bonus