Helping large language models understand the world a bit better. I build and adapt tokenizers, fine-tune multimodal models, and streamline LLM pipelines for production-grade use.
- Quick highlights
- Recent experience
- Projects you can try right now
- Publications & research
- How to reach me
- Research Intern at Mohamed bin Zayed University of Artificial Intelligence (Jun 2025 – present) exploring alternative tokenization methods and preparing publication-ready research.
- Mid-level NLP Engineer at DeepPavlov (May 2025 – present) driving R&D, benchmarking LLMs, and working with GPU stacks from diverse vendors.
- Previously at Center for Applied AI (Skolkovo), Higher School of Economics, Moscow Aviation Institute, and Innopolis University, shipping tokenizer tooling, fine-tuning Qwen and Llama models, and deploying NLP services.
- Active open-source maintainer of tokenizer tooling.
Languages & soft skills
- English (proficient)
- Russian (native)
- Flexibility · Responsibility · Curiosity
Research Intern · MBZUAI — Abu Dhabi, UAE (Jun 2025 – present)
- Design and evaluate alternative tokenization strategies for LLM inference.
- Author an academic paper on tokenizer-driven performance gains.
Mid-level NLP Engineer · DeepPavlov — Moscow, Russia (May 2025 – present)
- Lead R&D initiatives and LLM evaluation workflows.
- Run comparative benchmarks on GPU infrastructure from Chinese manufacturers.
Mid-level NLP Engineer · Center for Applied AI, Skolkovo — Moscow, Russia (Feb 2025 – May 2025)
- Fine-tuned Qwen2.5-VL and built supporting pipelines.
- Designed prompting strategies to generate actionable feedback on heterogeneous specifications.
NLP Researcher · Higher School of Economics — Moscow, Russia (Jun 2024 – May 2025)
- Fine-tuned Llama3-8B-Instruct for Russian-language tasks.
- Developed a Russian BPE tokenizer and tooling to manipulate existing vocabularies safely.
- Built a grammar benchmark suite to quantify improvements across downstream tasks.
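For illustration only, the core of BPE training behind work like this can be sketched in a few lines of pure Python (a toy example of the merge-learning loop, not the actual HSE tooling; `train_bpe` is a name I've made up here):

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn BPE merge rules from a list of words (toy sketch)."""
    # Start with each word as a tuple of single characters.
    words = Counter(tuple(w) for w in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word in the corpus.
        merged = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] += freq
        words = merged
    return merges
```

Real tooling (e.g. the Hugging Face `tokenizers` library) adds pre-tokenization, special tokens, and safe vocabulary surgery on top of this loop.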
ML / Backend Engineer · Moscow Aviation Institute — Moscow, Russia (Jul 2023 – Oct 2023)
- Delivered a sentence theme classifier and optimized database queries.
- Integrated Telegram-based interfaces for model delivery.
NLP Engineer · Innopolis University — Innopolis, Russia (Jun 2023 – Jul 2023)
- Developed a deep-learning sentiment model for YouTube comments.
- Fine-tuned BERT for domain-specific tone classification.
Rethinking Tokenization — EACL 2026 (under review)
Researcher & writer, 2025. Investigates how alternative tokenizations of the same text impact LLM inference quality.
TokenSubstitution — ACL 2026 (in progress)
Proposes a cost-effective adaptation approach for improving LLM generation quality in a target language.
Multi-Aspect Tokenizer Evaluation — Russian AI Journey 2025 (accepted)
Demonstrates tokenizer adaptation as a cost-effective technique by analyzing text quality and token efficiency across diverse benchmarks.
CRUD Calendar LLM Chatbot — Telegram assistant
- Features: calendar CRUD, news summarisation, voice reminders.
- Stack: Telegram Bot API, FastAPI, RAG pipeline with Qwen2.5-VL.
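To show how the retrieval step of such a RAG pipeline fits together, here is a minimal bag-of-words sketch (my own toy example, not the bot's actual implementation; `retrieve` and `build_prompt` are illustrative names):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved context into a generation prompt."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

A production pipeline would swap the overlap score for dense embeddings and pass the assembled prompt to the generator model.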
Innopolis University — B.S. in Data Analysis & Artificial Intelligence (2022 – 2026)
Key coursework: Software Systems Analysis and Design, Human-AI Interaction, Mathematical Analysis.
- Tutor for first-year students at Innopolis University (Sep 2023 – Jan 2024), helping newcomers acclimate and organizing community events.
- Always exploring ways to make LLM tooling more accessible and efficient.
- Portfolio: 1kkiren.ru
- Email: [email protected]
- Profiles: GitHub