AI/ML Engineer
Posted 69 days ago
Job Description
Nordcloud is looking for an AI/ML Engineer to lead the implementation of multi-agent systems and deliver innovative solutions for clients in cloud technology.
Responsibilities:
- Leading the implementation and composition of tool-using agents that interact with APIs, databases, and knowledge systems.
- Leading a team of AI/ML Engineers to implement multi-agent systems.
- Building persistent agent memory systems (short-term, long-term, and episodic memory).
- Implementing fault-tolerant orchestration for multi-agent pipelines.
- Building and scaling cloud-native systems (on AWS, Azure, or GCP).
- Simulating and testing multi-agent interactions for scalability, safety, and emergent behaviours.
- Building guardrail systems using tools like Guardrails AI, NeMo Guardrails, or custom validators.
- Embedding compliance and observability hooks in every agent interaction.
Requirements:
- Proven practical experience in professional services.
- Core AI/ML Expertise: deep understanding of transformer architectures, attention mechanisms, and LLM training pipelines.
- Agentic System Design: understanding of agent architectures (e.g., ReAct, Reflexion, Voyager, AutoGPT, CrewAI, AutoGen).
- Familiarity with agent orchestration frameworks (LangChain / LangGraph, Semantic Kernel, LlamaIndex, Swarm, etc.).
- Deep understanding of multi-agent communication protocols (e.g., MCP and A2A).
- Designing hierarchical agents: planner, executor, verifier, critic, and memory manager roles.
- Ability to balance autonomy vs. control, implementing “human-in-the-loop” governance mechanisms.
- Hands-on experience with version control, CI/CD, and containerization (GitHub Actions, Docker, Kubernetes).
- Model registry, versioning, and promotion (MLflow, Weights & Biases).
- Prompt evaluation, feedback loops, token optimisation, cost monitoring.
- Understanding of continuous deployment of multi-agent pipelines via Argo CD, GitOps, or Terraform.
- Observability for AI: telemetry on performance, latency, and behavioural drift.
- Integration of vector databases for memory and retrieval.
- Designing retrieval-augmented generation (RAG) pipelines with dynamic context injection.
- Familiarity with document loaders, chunking strategies, and embedding optimisation.
- Understanding of prompt injection, data exfiltration, and model hallucination vulnerabilities.
- Experience with safety layers (content filters, moderation, model output evaluation).
- Designing ethical and secure agent autonomy frameworks (role constraints, audit trails).
Benefits:
- Individual training budget and exam fees for certifications.
- Flexible working hours and a remote working model.
- Company laptop and any other equipment you need.
- Local benefits package, including a 30-day holiday allowance, pension allowance, Qualitrain card, and more.