CV

kenneth.enevoldsen@cas.au.dk · github.com/KennethEnevoldsen

Employment

2025–present

Postdoc, Aarhus University
Research on continuous development and evaluation of language models

Education

2021–2024

PhD, Aarhus University
Center for Humanities Computing, in collaboration with Quantitative Genomics Group and Aarhus University Hospital
Main supervisor: Kristoffer Nielbo · Co-supervisors: Doug Speed, Andreas Danielsen
Research stays: UCLA (2023, Prof. Vwani Roychowdhury), UC Berkeley (2023, Prof. Tim Tangherlini)

2016–2022

BSc & MSc Cognitive Science, Aarhus University
Elective: Mathematics · GPA: 11.67/12.00

Professional Experience

2024–2025

Research Assistant, Aarhus University
Teaching and research in Natural Language Processing at Cognitive Science

2018–2022

Instructor, Aarhus University
Natural Language Processing, Computational Modelling, and Experimental Methods at Cognitive Science
Topics: GLM, GLMM, Bayesian modelling, R, Python, HPC, NLP, cognitive modelling

2018–2021

Student Developer, Center for Humanities Computing Aarhus
HPC, NLP, and information extraction

2017–2020

Junior Consultant, JHN Processor
Data management, data collection, economics, and user experience

Funding

2022–2025

Multiple Grants, Danish E-Infrastructure Cooperation
>300,000 GPU core hours and >1,000,000 CPU core hours
Case numbers: DeiC-AU-N5-2024079, DeiC-AU-N1-2025144, DeiC-KU-N5-2025117, H2-2023-15, H2-2023-16, 2022-H2-11

Counselling

2023–2024

The Danish Agency for Digital Governance (Digitaliseringsstyrelsen)
Invited presentations and counselling on current limitations and opportunities of language technology for Danish

Supervision

2025–present

Jakob Grøhn Damgaard
PhD Co-supervisor

2025

Anton Drasbæk
Master's thesis supervisor

2025

Jørgen Højlund Wibe
Master's thesis supervisor

2023

Emil Jessen
Master's thesis supervisor

Open-source Projects

Selected open-source projects

2025–present

Danish Dynaword
The largest corpus of open-source Danish text data · Core developer and maintainer

2024–present

Massive Multilingual Embedding Benchmark (MTEB)
The de facto Python package and benchmark for evaluating text and image embedding models across languages and use cases · Core developer and maintainer

2024–2025

Scandinavian Embedding Benchmark
The de facto benchmark for estimating the quality of Scandinavian embedding models; later merged into MTEB · Core developer and maintainer

2023–present

Augmenty
A Python package for text augmentation with use cases in bias detection, evaluating model robustness, and improving model performance · Core developer and maintainer

2023–present

timeseriesflattener
A package for converting irregularly spaced time series, such as electronic health records, into statically shaped data frames · Initial developer, maintained by others

2022–present

TextDescriptives
A package for extracting text features such as dependency dynamics and metrics of text quality · Co-developer and maintainer

2022–present

Tomsup 👍
Theory of Mind Simulation using Python · Agent-based simulation implementing variational recursive k-ToM · Core developer and maintainer

2022–present

UD_Danish-DDT
The Danish Universal Dependencies Treebank, a high-quality linguistic resource · Maintainer

2021–present

DaCy
State-of-the-art Danish NLP · POS tagging (98.37 acc), NER (84.39 F1), dependency parsing (88.44 LAS) on DDT and DaNE · Core developer and maintainer

2021–present

DANSK
DANSK (Danish Annotations for NLP Specific TasKs) is a dataset of texts from diverse domains annotated for 18 entity types. Actively used in EuroEval · Language resource

Open-source Contributions

Selected contributions

2025

spacy-lookup-data, Explosion
Added Danish lexeme probabilities

2024

datasets, Huggingface
Fixed a compatibility issue with numpy >=2.0.0

2024

curated-transformers, Explosion
Added support for ELECTRA models

2024

spacy-curated-transformers, Explosion
Added support for ELECTRA tokenizers

2023

confection, Explosion
Fixed an issue where configs could not be filled

2023

curated-transformers, Explosion
Added support for ELECTRA models

2022

transformers, Huggingface
Bug fixes for training masked language models using Flax

2021

spacy-transformers, Explosion
Allowed passing arguments to the transformer backend to obtain attention weights