LLM Evaluation Engineer
Job Description
Develop evaluation systems for AI behavior in enterprise environments using LLMs and real-time data. Collaborate on guardrails and enforcement strategies for AI compliance.
Responsibilities:
- Build the evaluation layer in the ThirdLaw platform for LLM prompts and responses
- Design and tune real-time guardrails, classifiers, and semantic judgment systems
- Implement evaluation strategies with semantic similarity, foundation model scoring, and rule-based systems
- Integrate model outputs with downstream enforcement actions (e.g. redaction, escalation, blocking)
- Prototype, tune, and productize small language models for classification, labeling, or scoring
- Collaborate with data infrastructure engineers to connect evaluation logic with ingestion and storage
- Build tools to observe, debug, and improve evaluator performance across data distributions
- Define abstractions for reusable evaluation components that can scale across use cases
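As a concrete illustration of the rule-based evaluation and enforcement responsibilities above, here is a minimal sketch of a reusable evaluator component. All names (`EvalResult`, `rule_evaluator`, the action labels) are hypothetical, not part of the ThirdLaw platform:

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical result type: a score in [0, 1] plus the
# enforcement action the evaluator recommends downstream.
@dataclass
class EvalResult:
    score: float
    action: str  # e.g. "allow", "redact", "block"

def rule_evaluator(patterns: dict[str, str]) -> Callable[[str], EvalResult]:
    """Build a rule-based evaluator from regex -> action mappings."""
    compiled = {re.compile(p, re.IGNORECASE): a for p, a in patterns.items()}

    def evaluate(text: str) -> EvalResult:
        # First matching rule wins; no match means the text passes.
        for pattern, action in compiled.items():
            if pattern.search(text):
                return EvalResult(score=0.0, action=action)
        return EvalResult(score=1.0, action="allow")

    return evaluate

# Usage: a reusable PII check that recommends redaction on SSN-like strings.
pii_check = rule_evaluator({r"\b\d{3}-\d{2}-\d{4}\b": "redact"})
print(pii_check("My SSN is 123-45-6789").action)  # redact
print(pii_check("All clear here").action)         # allow
```

In practice such rule-based checks would run alongside semantic and model-based evaluators, with the abstraction letting each component plug into the same enforcement pipeline.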
Requirements:
- 7+ years of experience in ML systems or AI engineering roles
- At least 1–2 years working directly with LLMs, NLP pipelines, or semantic search
- Deep understanding of foundation models (e.g. OpenAI, Claude, Mistral, Llama) and their APIs
- Hands-on experience with vector search (e.g. FAISS, Qdrant, Weaviate) and embeddings pipelines
- Proven ability to implement real-time or near-real-time evaluation logic using semantic similarity, classifier scoring, or structured rules
- Strong in Python, with familiarity using libraries like Hugging Face Transformers, LangChain, and PyTorch or TensorFlow
- Ability to reason about model behavior, test prompt configurations, and debug complex decision logic in production
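The semantic-similarity evaluation mentioned in the requirements can be sketched with plain cosine similarity over embeddings. The toy 3-dimensional vectors below stand in for real model embeddings, and the threshold value is an illustrative assumption:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_score(response_emb: np.ndarray,
                   reference_embs: list[np.ndarray],
                   threshold: float = 0.8) -> tuple[float, bool]:
    """Score a response against reference embeddings.

    Returns the best similarity and whether it clears the threshold;
    a failing score could trigger escalation or blocking downstream.
    """
    best = max(cosine_similarity(response_emb, ref) for ref in reference_embs)
    return best, best >= threshold

# Toy embeddings stand in for outputs of a real embedding model.
refs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
resp = np.array([0.9, 0.1, 0.0])
score, passed = semantic_score(resp, refs)
print(round(score, 3), passed)
```

A production version would batch these comparisons through a vector index such as FAISS or Qdrant rather than a Python loop.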
Benefits:
- Generous, well-designed benefits
- Market-rate cash compensation
- Above-market equity