Sajjadur Rahman

sajjadurr@adobe.com

Adobe Founders Tower, San Jose, CA 95113

I am a Senior Applied Science Manager at Adobe. I drive the Center for Excellence for AI Quality at Adobe Experience Platform and, specifically, lead the Continual Learning and AI Safety initiatives. Prior to that, I was a Senior Research Scientist and the founding Research Manager of the Data-AI Symbiosis (DAIS) group at Megagon Labs, where I led the development of data platforms for enterprise agentic systems. I received my PhD from CS@Illinois, where I worked on interactive analytics of big data with Aditya Parameswaran.

My research synthesizes techniques from data management, AI, and HCI to build scalable, interactive, and reliable systems. My work has been published in premier conferences in Databases (SIGMOD and VLDB), HCI (CHI and CSCW), and NLP (EMNLP and NAACL). I collaborated on research projects that were recognized with the best demo award (at ICDE) and featured in popular tech blogs. My research has been deployed in open-source data exploration systems (DataSpread and Lux) and adopted in industry to support creative design@B12. I served as Program Chair of several workshops (DAIS@ICDE'25 and MATCHING@ACL'23) and on the program committees of SIGMOD, VLDB, EMNLP, ACL, and IEEE ISSRE.

News

Received Distinguished Reviewer Award at SIGMOD 2025.
April 22, 2025
Joined Adobe Inc. to lead AI Quality and Safety initiatives at Adobe Experience Platform.
April 14, 2025
Released Cypherbench, a large-scale benchmark for evaluating NL2Cypher tasks.
January 7, 2024
Article on RAG in the Wild accepted at IEEE Data Engineering Bulletin.
December 22, 2024
Workshop on Data-AI Systems accepted at ICDE'25.
September 15, 2024
Paper on "Characterizing LLMs as Rationalizers" accepted at ACL'24 Findings.
May 15, 2024
Paper on "Benchmarking Data Discovery in the Enterprise" accepted at GUIDE-AI@SIGMOD'24.
May 12, 2024
Demo on "Human-LLM Collaborative Annotation System" accepted at EACL'24.
January 22, 2024
Paper on "Human-LLM Collaborative Annotation Through Effective Verification" accepted at CHI'24.
January 18, 2024

Selected Projects

A System 2 Perspective of AI Agents

We outline our approaches toward understanding and implementing a more effective agentic workflow in the wild. To achieve this goal, we draw on the cognitive science concepts of System 1 (fast, intuitive thinking) and System 2 (slow, deliberate, analytical thinking).

Paper · Blog · Code

Human-in-the-loop data science

Characterizing Human-in-the-loop information extraction workflows

We observed that data science workers follow an iterative task model consisting of information foraging and sensemaking loops across all the phases of an information extraction workflow. We found several limitations in both loops stemming from a lack of adherence to existing cognitive engineering principles.

Paper · Video · Blog

Progressive Visualization with IncVisage

IncVisage is a progressive visualization tool that reveals “salient” features of a visualization quickly while minimizing error, enabling rapid and error-free decision making. The approach is orders of magnitude faster than traditional visualization systems.

Paper · Video · Code

Selected Publications

Retrieval Augmented Generation in the Wild: A System 2 Perspective.
Sajjadur Rahman, Dan Zhang, Nikita Bhutani, Estevam Hruschka, Eser Kandogan. IEEE Data Engineering Bulletin, 2025.
Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks
Aditi Mishra, Sajjadur Rahman, Hannah Kim, Kushan Mitra, Estevam Hruschka. ACL 2024 Findings.
Human-LLM collaborative annotation through effective verification of LLM labels
Xinru Wang, Hannah Kim, Sajjadur Rahman, Kushan Mitra, Zhengjie Miao. CHI 2024.
Low-resource Interactive Active Labeling for Fine-tuning Language Models
Seiji Maekawa, Dan Zhang, Hannah Kim, Sajjadur Rahman, Estevam Hruschka. Findings of EMNLP 2022.
Low-resource Entity Set Expansion: A Comprehensive Study on User-Generated Text
Yutong Shao, Nikita Bhutani*, Sajjadur Rahman*, Estevam Hruschka. Findings of NAACL 2022.
Characterizing practices, limitations, and opportunities related to text information extraction workflows: a human-in-the-loop perspective
Sajjadur Rahman, Eser Kandogan. CHI 2022.
NOAH: Interactive Spreadsheet Exploration with Dynamic Hierarchical Overviews
Sajjadur Rahman, Mangesh Bendre, Yuyang Liu, Shichu Zhu, Nick Su, Karrie Karahalios, Aditya Parameswaran. VLDB 2021.
Leam: An Interactive System for In-situ Visual Text Analysis
Sajjadur Rahman, Peter Griggs, Çağatay Demiralp. CIDR 2021.
MixTAPE: Mixed-initiative Team Action Plan Creation Through Semi-structured Notes, Automatic Task Generation, and Task Classification
Sajjadur Rahman, Pao Siangliulue, Adam Marcus. CSCW 2020. Employed at [B12].
Benchmarking Spreadsheet Systems.
Sajjadur Rahman, Kelly Mack, Mangesh Bendre, Ruilin Zhang, Karrie Karahalios, Aditya Parameswaran. SIGMOD 2020. Featured in [The Morning Paper].
[Demo] Faster, Higher, Stronger: Redesigning Spreadsheets for Scale
Mangesh Bendre, Tana Wattanawaroon, Sajjadur Rahman, Kelly Mack, Yuyang Liu, Shichu Zhu, Yu Lu, Pingjing Yang, Xinyan Zhu, Kevin Chang, Karrie Karahalios, Aditya Parameswaran. ICDE 2019. Best Demo Award.
I’ve Seen “Enough”: Incrementally Improving Visualizations to Support Rapid Decision Making
Sajjadur Rahman, Maryam Aliakbarpour, Ha-Kyung Kong, Eric Blais, Karrie Karahalios, Aditya Parameswaran, Ronitt Rubinfeld. VLDB 2017.
SeeDB: efficient data-driven visualization recommendations to support visual analytics
Manasi Vartak, Sajjadur Rahman, Samuel Madden, Aditya Parameswaran, Neoklis Polyzotis. VLDB 2015. Employed by [Ponder/Lux].

Synergistic Activities

2024	VLDB'24 (PC), SIGMOD'25 (PC), Reviewer: NLP4HR@EACL'24, TiiS
2023	MATCHING@ACL 2023 (Program Chair), MATCHING@ACL 2023 (Panel Moderator), ACL 2023 (PC), EMNLP 2023 (PC), BLP@EMNLP 2023 (PC), SIGMOD 2023 (Demo PC), BigVIS 2023 (PC), CHI 2023 (Reviewer)
2022	EMNLP 2022 (PC), DaSH@EMNLP 2022 (PC), ISSRE 2022 (Industry PC), BigVIS 2022 (PC), WIT@ACL (PC), CHI 2022 (Reviewer)
2021	SIGMOD 2022 (PC), VLDB 2021 (Panelist), BigVIS 2021 (PC), WIT@KDD (PC), CHI 2021 (Reviewer)
2020	SIGMOD 2021 (PC)
2019	VLDB 2019 (Demo, external reviewer)
2018	IIT 2018 (PC)

Papers by Research Themes

Interactivity

Towards Transparent, Reusable, and Customizable Data Science in Computational Notebooks. CHI 2023 LBW.

[Demo] Weedle: Composable Dashboard for Data-centric NLP in Computational Notebooks. WWW 2023.

NOAH: Interactive Spreadsheet Exploration with Dynamic Hierarchical Overviews. VLDB 2021.

[Workshop] Towards integrated, interactive, and extensible text data analytics with LEAM. DASH-LA@NAACL 2021.

Leam: An Interactive System for In-situ Visual Text Analysis. CIDR 2021.

MixTAPE: Mixed-initiative Team Action Plan Creation Through Semi-structured Notes, Automatic Task Generation, and Task Classification. CSCW 2020.

I’ve Seen “Enough”: Incrementally Improving Visualizations to Support Rapid Decision Making. VLDB 2017.

SeeDB: efficient data-driven visualization recommendations to support visual analytics. VLDB 2015.

Scalability

[Vision] Toward multifaceted human-centered AI. HCAI@NeurIPS 2022.

Low-resource Interactive Active Labeling for Fine-tuning Language Models. Findings of EMNLP 2022.

Benchmarking Spreadsheet Systems. SIGMOD 2020.

NOAH: Interactive Spreadsheet Exploration with Dynamic Hierarchical Overviews. VLDB 2021.

[Demo] Faster, Higher, Stronger: Redesigning Spreadsheets for Scale. ICDE 2019.

I’ve Seen “Enough”: Incrementally Improving Visualizations to Support Rapid Decision Making. VLDB 2017.

SeeDB: efficient data-driven visualization recommendations to support visual analytics. VLDB 2015.

Empirical Studies

[Workshop] CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems. Guide-AI@SIGMOD 2024.

Characterizing LLMs as Rationalizers of Knowledge-intensive Tasks. Findings of ACL 2024.

Human-LLM collaborative annotation through effective verification of LLM labels. CHI 2024.

Characterizing practices, limitations, and opportunities related to text information extraction workflows: a human-in-the-loop perspective. CHI 2022.

Low-resource Entity Set Expansion: A Comprehensive Study on User-Generated Text. Findings of NAACL 2022.

Benchmarking Spreadsheet Systems. SIGMOD 2020.

[Workshop] Understanding Data Analysis Workflows on Spreadsheets: Roadblocks and Opportunities. HILDA@SIGMOD 2020.