THESIS: From Raw Data to Insights: Comparing Modern Data Engineering Tools



Sundsvall, Sweden Knightec Group Full time 550,000 - 850,000 per year
High-level description

Data engineering is the process of refining raw data into a usable state, for example by transforming raw CSV or JSON files into structured formats ready for analysis. Many platforms and tools support this process, each with different trade-offs in performance, scalability, and usability. This thesis will explore and compare modern data engineering platforms by applying them to real-world open datasets (such as weather data).
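
As a rough illustration of what "refining raw data into a usable state" can mean in practice, the sketch below parses a small raw CSV sample into typed records using only the Python standard library. The sample data, the column names (`station`, `date`, `temp_c`), and the `Reading` type are assumptions made for this example and are not part of the thesis specification.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import Iterator, Optional

# Hypothetical sample of raw weather data; the column names are assumptions.
RAW_CSV = """station,date,temp_c
Sundsvall,2025-01-01,-3.5
Sundsvall,2025-01-02,
Sundsvall,2025-01-03,-1.0
"""


@dataclass(frozen=True)
class Reading:
    station: str
    day: date
    temp_c: Optional[float]  # None when the raw value is missing


def parse_readings(raw: str) -> Iterator[Reading]:
    """Turn raw CSV text into typed records, keeping missing values explicit."""
    for row in csv.DictReader(io.StringIO(raw)):
        temp = row["temp_c"].strip()
        yield Reading(
            station=row["station"],
            day=date.fromisoformat(row["date"]),
            temp_c=float(temp) if temp else None,
        )


readings = list(parse_readings(RAW_CSV))
```

Representing a missing temperature as `None` rather than dropping the row leaves the cleaning decision to a later, explicit transformation step.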

Who are we looking for?

Bachelor's or Master of Science students in Computer Science or Computer Engineering.

Project description

This thesis aims to evaluate and compare different cloud-based data engineering platforms. The work will involve building end-to-end data pipelines — from ingestion of raw open datasets (e.g., weather data) to transformation, storage, and analysis — and systematically comparing the platforms across a set of criteria. Examples of relevant platforms include Databricks, AWS-native analytics tools, or similar technologies used in industry.

Purpose and Scope

This thesis will investigate the following areas:

  • Data ingestion & storage – Investigate how different platforms handle ingestion and storage of raw datasets (APIs, CSV/JSON files, large historical archives).
  • Data processing & transformation – Implement cleaning, aggregation, and enrichment workflows, and compare efficiency and flexibility.
  • Query & analytics performance – Measure and analyse query execution times and scalability for simple and complex analytical queries.
  • Cost & resource utilization – Estimate and compare the cost implications of running equivalent workloads.
  • Developer experience & integration – Evaluate usability, debugging, and integration with complementary tools (e.g., BI dashboards, machine learning frameworks).
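
To give a feel for the query-performance criterion above, the following sketch times a simple and an aggregating query and reports the median over several runs. An in-memory SQLite database stands in for a real platform here, purely as an assumption for illustration. The table layout, the generated data, and the helper `median_runtime` are likewise hypothetical, and a real comparison would run equivalent queries against each platform under test.

```python
import sqlite3
import statistics
import time
from typing import Callable, List


def median_runtime(run: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over several runs."""
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# In-memory SQLite is a stand-in for a real platform (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (station TEXT, day TEXT, temp_c REAL)")
rows = [("Sundsvall", f"2025-01-{d:02d}", float(d % 7) - 3.0) for d in range(1, 29)]
conn.executemany("INSERT INTO weather VALUES (?, ?, ?)", rows)


def simple_query():
    # A simple analytical query: count all rows.
    return conn.execute("SELECT COUNT(*) FROM weather").fetchall()


def aggregate_query():
    # A slightly more complex query: average temperature per station.
    return conn.execute(
        "SELECT station, AVG(temp_c) FROM weather GROUP BY station"
    ).fetchall()


results = {
    "simple": median_runtime(simple_query),
    "aggregate": median_runtime(aggregate_query),
}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds (median of 5 runs)")
```

Using the median of repeated runs rather than a single measurement reduces the influence of one-off delays such as cold caches, which matters when the same workload is compared across platforms.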

An Exciting Journey with Knightec Group
Semcon and Knightec have joined forces as Knightec Group. Together, we are Northern Europe's leading strategic partner in product and digital service development. With a unique combination of cross-functional expertise and a holistic business understanding, we help our clients realize their strategies – from idea to complete solution.

Practical Information
This is a thesis position located at our office in Sundsvall. The start date is January or March 2026.

Please submit your application as soon as possible. If you have any questions, you are welcome to contact Johanna Edström. Note that due to GDPR, we only accept applications through our careers page.




