Objectives

Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Performing such tasks over large and heterogeneous collections of tabular data, as found in enterprise data lakes and on the Web, is extremely challenging and an attractive research topic in data management, AI, and related communities. The goal of this workshop is to bring together researchers and practitioners from these diverse communities who work on the fundamental research challenges of tabular data analysis and on building automated solutions in this space.

We aim to provide a forum for: a) exchange of ideas between two communities: 1) an active community of data management researchers working on data integration and schema and data matching problems over tabular data, and 2) a vibrant community of researchers in the AI and Semantic Web communities working on the core challenge of matching tabular data to Knowledge Graphs as part of the ISWC SemTab Challenges; b) presentation of late-breaking results related to several emerging research areas such as table representation learning and its applications, use of large language models (LLMs) for tabular data analysis, and automation of data science pipelines that rely on tabular data; c) discussion of real-world challenges related to implementing industrial-scale tabular data analysis pipelines, as well as data lake and data lakehouse solutions.

Call For Papers

Audience: Our workshop encourages participation from researchers in the data management, AI, and Semantic Web communities working on a wide range of problems relevant to tabular data analysis. We hope that it will serve as a single reference point for researchers and practitioners working in this area and help form new collaborations. We also aim to provide a venue for industry researchers and practitioners who rely on tabular data analysis tasks to present use cases and discuss their needs in addressing real-world problems and building large-scale solutions.

Topics of Interest include but are not limited to:
  • Semantic Table Annotation
  • Automated Tabular Data Understanding
  • Using Large Language Models (LLMs) for Tabular Data Analysis
  • Exploratory Data Analysis over Tabular Data
  • Table Search in Data Lakes
  • Tabular Data Discovery
  • Metadata Management for Tabular Data Analysis
  • Data Augmentation with Tabular Data
  • Integration and Matching of Tabular Data
  • Knowledge Graph Construction and Completion with Tabular Data
  • Automated Discovery of ML Features from Tabular Data
  • ML Model Development with Tabular Data
  • Visualization and Interfaces for Tabular Data Analysis
  • Data Wrangling for Tabular Data Analysis
  • Deep Learning and Representation Learning for Tabular Data Analysis
  • Extraction and Analysis of Tabular Data from (HTML/PDF) Documents and Images
  • Analysis of Tabular Data on the Web (Web Tables)
  • Practical Applications of Tabular Data Analysis
  • Benchmarking and Evaluation Frameworks for Tabular Data Analysis

Submissions

Contributions to the workshop can take the form of technical papers, posters, or statements of interest addressing various aspects of tabular data analysis, as well as reports on SemTab Challenge participation. Long technical papers should be 8-10 pages long. Short technical papers should be no more than 4 pages long. Posters should not exceed 2 pages. References do not count towards the page limits mentioned above.

Submission site: https://cmt3.research.microsoft.com/TaDA2025
Submissions should follow the format outlined in the provided zipped LaTeX proceedings directory.

Reviews will be anonymous (not dual anonymous). Authors of accepted papers will have the option to include their papers in the VLDB workshop proceedings. At least one co-author is expected to register for the VLDB 2025 conference and present the paper in person. Please visit the VLDB 2025 registration instructions for more information.

Important Dates

  • Submission deadline: May 15, 2025
  • Notification of acceptance: June 10, 2025
  • Camera-ready copy due: July 1, 2025
  • Workshop Day: September 5, 2025
All times are Anywhere on Earth (AoE).

Organization

General Chairs:
Program Committee Chairs:
Proceedings and Publicity Chair:
Program Committee:
  • Aamod Khatiwada (Northeastern University)
  • Amine Mhedhbi (Polytechnique Montréal)
  • Anastasia Dimou (KU Leuven)
  • Andra-Denis Ionescu (TU Delft)
  • Christian Bizer (University of Mannheim)
  • Christos Diou (Harokopio University of Athens)
  • George Papadakis (University of Athens)
  • Haridimos Kondylakis (FORTH-ICS & University of Crete)
  • Ismael Sanz (Universitat Jaume I)
  • Kaustubh Beedkar (IIT Delhi)
  • Kavitha Srinivas (IBM Research)
  • Kostas Stefanidis (Tampere University)
  • Marco Mesiti (University of Milan)
  • Paolo Papotti (EURECOM)
  • Rafael Berlanga Llavori (Universitat Jaume I)
  • Roee Shraga (WPI)
  • Romila Pradhan (Purdue University)
  • Sajjadur Rahman (Megagon Labs)
  • You Wu (Google)
  • Yuval Moskovitch (University of Michigan)
  • Zezhou Huang (Columbia University)