AI Research Engineer – Kernel & Inference Optimization

Posted 1 hour ago

Job Description

As an AI Research Engineer at Tether, you will focus on optimizing model serving and inference architectures, collaborating with global teams to develop cutting-edge fintech solutions.

Responsibilities:

  • Drive innovation in model serving and inference architectures
  • Optimize model deployment and inference strategies
  • Work on resource-efficient models for limited hardware
  • Engineer robust inference pipelines
  • Establish comprehensive performance metrics
  • Identify and resolve bottlenecks in production environments

Requirements:

  • A degree in Computer Science or related field
  • Ideally a PhD in NLP, Machine Learning, or a related field
  • Knowledge of Metal Shading Language (MSL)
  • Proven experience in low-level kernel optimizations
  • Strong expertise in writing GPU kernels for mobile devices
  • Practical experience in developing and deploying end-to-end inference pipelines
  • Deep understanding of modern model serving architectures
  • Experience with distributed inference systems and techniques such as tensor parallelism
  • Understanding of advanced optimization methods

Benefits:

  • Work remotely from anywhere in the world
  • Opportunity to collaborate with global teams
  • Cutting-edge projects in fintech