Multilingual AI Quality Evaluation Specialist

4 days ago


Drottninggatan, Stockholm, Sweden | Market Partner | Full time | 10,000 - 80,000 per year


The world's most popular audio streaming subscription service is looking for a Multilingual AI Quality Evaluation Specialist to join the band on a consultant assignment. The client transformed music listening forever when it launched in 2008.

Period: ASAP (full-time), with a possibility of extension.

About the role

We're looking for a Multilingual AI Quality Evaluation Specialist to define, test, and continuously improve the client's multilingual AI quality standards.
You'll design and optimize evaluation frameworks, datasets, and scoring methodologies that power QUAIL (our Quality Assessment AI for Language) and MAP, the Multilingual AI Portal. This role bridges localization and linguistic expertise, quality data analysis, and evaluation design. You'll turn linguistic nuance, business intent, and content goals into measurable, automatable evaluation logic, ensuring that every AI-generated or AI-translated output across the company is accurate, fluent, culturally relevant, and fit for purpose.

By joining this team, you will shape the evaluation intelligence layer that underpins the client's multilingual AI ecosystem.
Your work ensures that AI outputs are linguistically accurate, culturally adaptive, and explainably evaluated, directly influencing the experience of hundreds of millions of global users.

What you'll do
  • Build and implement evaluation methodologies across multilingual settings and content types.
  • Develop and validate multilingual evaluation rubrics aligned with QUAIL's multi-metric architecture (accuracy, fluency, tone, compliance, factuality).
  • Design calibration studies comparing QUAIL's LLM judgments with human-rated benchmarks to ensure scoring reliability and explainability.
  • Define sampling and scoring protocols (human-in-the-loop validation, confidence thresholds, correlation metrics).
  • Collaborate with ML engineers to train and fine-tune evaluators using gold datasets and human-annotated examples; contribute to the synthetic data generation pipeline (template-based data creation and validation).
  • Analyze model outputs and error patterns, using QUAIL's scoring results to identify quality gaps and update routing or prompt logic in MAP.
  • Partner cross-functionally with Localization, PZN, and GLEE to ensure consistent language quality signals across the client's agentic products.
  • Partner with engineers to implement feedback loops that iteratively improve model accuracy and cultural fit.
  • Define and document language-specific quality guidelines, thresholds, and evaluation protocols for internal use.
  • Measure LLM evaluator stability and bias across locales and continuously improve prompt instructions for fairness and accuracy.

Who you are
  • Background in Language Quality Evaluation, Applied Linguistics, Computational Linguistics, or Language Quality Research.
  • Experience in LLM evaluation, machine translation evaluation, or linguistic annotation pipelines across multilingual settings.
  • Strong understanding of linguistic or translation evaluation frameworks and metrics such as MQM, MetricX, or COMET; hands-on experience designing multidimensional rubrics or scoring schemes.
  • Demonstrated understanding of GenAI evaluation methods: prompt testing, model calibration, and score validation.
  • Experience collaborating with ML teams on training linguistic data pipelines, annotation workflows, or fine-tuning evaluators.
  • Deep linguistic and cultural sensitivity across multiple locales; able to articulate what "fit for purpose" quality means per content type.
  • Bonus: experience with programmatic QA, e.g., rule-based or API-driven validation using Python, YAML, or gRPC-based systems.
  • Bonus: experience with inter-rater reliability measurement (e.g., Krippendorff's alpha, Cohen's κ) or human–AI agreement studies.

We are Market Partner

Market Partner is proud to be an equal opportunity employer. You are welcome in our community no matter who you are, where you come from, or what you look like. We review applications on an ongoing basis and may fill the position as soon as we find the right candidate.


