Arabic AI Evaluation Specialist
Posted 65 days ago
Job Description
Welo Data is hiring an Arabic AI Evaluation Specialist to help improve large language models. The role involves assessing AI performance and ensuring accurate results for Arabic content.
Responsibilities:
- Conduct side-by-side comparisons of AI responses and rate their quality on a 1–5 scale based on established guidelines.
- Design scenario-based and edge-case prompts to evaluate model behavior, including tricky, ambiguous, or incomplete information situations.
- Assess outputs for instruction adherence, factual accuracy, tone, safety, and overall usefulness.
- Develop clear evaluation rubrics and criteria to ensure consistent scoring across tasks.
- Create reliable reference materials (articles, transcripts, reports, etc.) to serve as the source of truth for testing.
- Write well-structured “gold standard” responses that demonstrate the most accurate and helpful answer.
- Identify potential issues such as hallucinations, inconsistencies, or cultural/contextual mismatches.
Requirements:
- Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
- B2 or superior level of English.
- Native fluency in Modern Standard Arabic and the Egyptian dialect.
- Strong understanding of the distinction between Fusha and ‘Ammiyya.
- Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
- Ability to work independently and manage workflows effectively in a remote environment.
- Nice to have: proficiency in one or more additional Arabic dialects.
- Strong attention to detail and critical thinking to identify hallucinations and bias.
- Familiarity with data annotation platforms and model evaluation tools.
- Experience in prompt engineering, AI evaluation, linguistic QA, or translation is a plus.
- Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.
Benefits:
- Limitless Flexibility
- Limitless Growth
- Limitless Support
- Real Impact




