Dikshya Mohanty · AI Researcher

Publications

My research spans large language models, multilingual NLP, evaluation, and applications in scientific domains. All preprints are available on arXiv.

Published

A Longitudinal, Multinational, and Multilingual Corpus of News Coverage of the Russo-Ukrainian War

Dikshya Mohanty, et al.

LREC 2026

We present DNIPRO, a novel longitudinal, multinational, and multilingual corpus containing 246K+ news articles covering the Russo-Ukrainian War. The dataset enables analysis of how geopolitical narratives evolve over time and differ across national and linguistic boundaries, providing insights into information dynamics and framing strategies.

LREC 2026 → arXiv:2601.16309 →

Addressing the Ecological Fallacy in Larger LLMs with the Author's Context

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian

To appear at CoNLL 2026

Investigates how incorporating author-level context can address the ecological fallacy in large language models, particularly when models are trained on aggregate-level data but used to make individual-level predictions.

arXiv:2603.05928 →

Big Data and Omics Technology in Fisheries Biology: New Approaches and Changing Perspectives

Bimal Prasanna Mohanty, Dikshya Mohanty, et al.

Book Chapter, 2020

A comprehensive overview of how big data and omics technologies are transforming fisheries biology, covering new computational approaches and their impact on understanding aquatic ecosystems.

ResearchGate →

Under Review

Teaching and Evaluating LLMs to Reason About Polymer Design Related Tasks

Dikshya Mohanty, et al.

Under Review at ACL Rolling Review May 2026 Cycle

We introduce PolyBench, a comprehensive benchmark with 125K+ tasks for evaluating compositional reasoning in polymer design and synthesis. The benchmark includes knowledge-augmented reasoning traces and diagnostic evaluations that reveal specific skill gaps in how LLMs handle multi-constraint reasoning problems. Our analysis provides insights into improving LLM capabilities for scientific applications.

arXiv:2601.16312 →

In Preparation

Measuring Novelty and Diversity in LLM Reasoning for Polymer Design

Dikshya Mohanty, Niranjan Balasubramanian

In Preparation

How do composite reward signals that combine validity, novelty, and diversity shape LLM post-training on open-ended polymer design tasks, and what does this reveal about building models that can generate scientifically feasible hypotheses under competing objectives?

Benchmarking Human-Context-aware Large Language Models

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian

In Preparation

Developing benchmarks to evaluate how well large language models can incorporate and reason about human context, including individual differences, cultural backgrounds, and personal characteristics in their responses.

LHLC: Large Human Language Data Corpus

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian

Technical Report To Be Submitted

A large-scale corpus of human language data designed to support research on human-context-aware language models and individual-level language understanding.

Discovering the Language Factor: A Basic Human Trait Rooted in Language

August Nilsson, Dikshya Mohanty, et al.

In Preparation

Exploring fundamental human traits that can be identified and measured through language patterns, combining computational linguistics with psychometric analysis to discover language-based factors.

Benchmarking LLMs on Extraction of Battery Electrolytes Properties from Literature

Manoj Praveen Nandigama, Kuldeepsinh Raj, Dikshya Mohanty, et al.

In Preparation

Developing benchmarks for evaluating how well LLMs can extract structured information about battery electrolyte properties from scientific literature, with applications to materials science research acceleration.

Experience

My journey through academia and industry has shaped how I approach research, balancing theoretical rigor with practical impact.

Research Assistant

Stony Brook University

Aug 2024 – Present

Conducting research on large language models with a focus on compositional reasoning, retrieval-augmented generation, and evaluation. Working on developing benchmarks and techniques to improve LLM capabilities in scientific domains. Collaborating with interdisciplinary teams on projects funded by DARPA and NSF.

Key achievements: one paper published at LREC 2026 and one under review at ACL Rolling Review, contributor to the DARPA SciFy program, recipient of the SUNY RF Academic Fellowship.

Manager / Lead Data Scientist

Ernst & Young LLP (EY)

Jan 2020 – Aug 2024

Led and mentored a team of data scientists delivering production ML platforms for tax and financial analytics supporting 40+ enterprise clients. Owned system architecture across modeling, scalable ML APIs, deployment, and MLOps in highly regulated environments.

Designed human-in-the-loop ML systems with feedback-driven metrics and active learning, driving sustained cost reduction and operational efficiency. Built end-to-end pipelines from data ingestion to model deployment, ensuring compliance with enterprise security and governance standards.

Recognition: EY Ovation Award (FY2022-23, Top 5% performer), EY Bravo Awards (FY2020-23), AI Challenge Winner (FY2020, FastAI Kaggle competition).

Applied Science Intern, Seller Lending

Amazon.com

Summer 2019

Built predictive models to estimate early seller success and risk for automated lending and loan allocation decisions in Amazon Seller Lending. Developed behavioral features and comprehensive experimentation pipelines including offline validation, A/B testing frameworks, and feature attribution analysis.

Engineered features that ranked in the top decile among ~30K existing features, demonstrating significant predictive power for seller success metrics.

Data Science Developer

Tata Consultancy Services (TCS)

Jul 2016 – Jul 2018

Built end-to-end ML/NLP pipelines for client applications, spanning data ingestion, modeling, evaluation, and API deployment. Developed clinical and pharmaceutical NLP systems for text classification and domain language understanding using large scientific corpora.

Open Source & Research Artifacts

Research datasets, benchmarks, and tools shared with the community. Committed to open science and reproducible research.

PolyBench / PolyLM

Under Review · ACL Rolling Review (May 2026 Cycle)

A 125K+ task benchmark for multi-constraint polymer design and synthesis, including knowledge-augmented reasoning traces and diagnostic evaluation to analyze compositional reasoning and skill gaps in LLMs.

Paper → Code (Coming Soon) → Dataset (Coming Soon) →

DNIPRO

Published · LREC 2026

A longitudinal, multinational, and multilingual corpus of 246K+ news articles for analyzing geopolitical narratives and framing shifts across countries during the Russo-Ukrainian War.

LREC 2026 → arXiv → Dataset (Coming Soon) →

LHLC: Large Human Language Data Corpus

Technical Report To Be Submitted

A large-scale corpus of human language data designed to support research on human-context-aware language models and individual-level language understanding.

Dataset (Coming Soon) →

Research Tools & Utilities

A collection of scripts and utilities for ML research workflows: data preprocessing, evaluation metrics, experiment tracking, and visualization tools.

GitHub →

Selected Projects

Research projects exploring LLM capabilities, alignment, multimodal learning, and systems design.

LLM Alignment & Catastrophic Forgetting

Research Project

Studied the effects of sequential fine-tuning (SFT, RLHF) on GPT-2, quantifying catastrophic forgetting and its impact on commonsense reasoning.

GitHub →

Viral Humor Analysis

Research Project

Used transformer-based NLP on 500K+ Reddit posts to characterize humor as measurable attributes and analyze how these signals drive audience engagement.

GitHub →

LLaVA 1.5 Adaptive Pruning

Research Project

Implemented adaptive token pruning strategies in a 7B Vision-Language Transformer and benchmarked efficiency tradeoffs on VQA-v2, TextVQA, and POPE.

GitHub →

Commodity Price Forecasting

Research Project

Built multimodal models combining time-series data with NLP over government reports to predict oil and gas prices and generate economic explanations.

GitHub →

Delulu – RISC-V Processor

Systems Project

Designed and implemented an RV64IM 5-stage pipelined processor in SystemVerilog with caching, forwarding, branch prediction, and modular verification.

GitHub →