AI Research Engineer – Kernel & Inference Optimization
Posted 1 hour ago
Job Description
Tether is hiring an AI Research Engineer to optimize model serving and inference architectures. The role involves collaborating with teams around the world to develop cutting-edge fintech solutions.
Responsibilities:
- Drive innovation in model serving and inference architectures
- Optimize model deployment and inference strategies
- Work on resource-efficient models for limited hardware
- Engineer robust inference pipelines
- Establish comprehensive performance metrics
- Identify and resolve bottlenecks in production environments
Requirements:
- A degree in Computer Science or related field
- Ideally a PhD in NLP, Machine Learning, or a related field
- Knowledge of Metal Shading Language (MSL)
- Proven experience in low-level kernel optimizations
- Strong expertise in writing GPU kernels for mobile devices
- Practical experience in developing and deploying end-to-end inference pipelines
- Deep understanding of modern model serving architectures
- Experience with distributed inference systems and techniques such as tensor parallelism
- Understanding of advanced optimization methods
Benefits:
- Work remotely from anywhere in the world
- Opportunity to collaborate with global teams
- Cutting-edge projects in fintech