9:00 - 10:30 |
Session 1: Tabular Data Discovery |
9:00 - 9:10 |
Opening Remarks |
9:10 - 9:35 |
ALT-GEN: Benchmarking Table Union Search using Large Language Models
Koyena Pal, Aamod Khatiwada, Roee Shraga, Renée J. Miller
9:35 - 10:00 |
Finding Support for Tabular LLM Outputs
Grace Fan, Roee Shraga, Renée J. Miller
10:00 - 10:15 |
Humboldt: Metadata-Driven Extensible Data Discovery
Alex Bäuerle, Çağatay Demiralp, Michael Stonebraker
10:15 - 10:30 |
Toward a Declarative Query Language for Machine Learning
Hasan Jamil
10:30 - 11:00 |
Coffee Break |
11:00 - 12:30 |
Session 2: Keynote Session |
11:00 - 12:00 |
Keynote talk by
Haoyu Dong, Microsoft Research
Spreadsheet Intelligence and Data Analytics
12:00 - 12:15 |
Data Quality Management for Responsible AI in Data Lakes
Carolina Cortes, Camila Sanz, Lorena Etcheverry, Adriana Marotta
12:15 - 12:30 |
Large Language Models as Data Preprocessors
Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
12:30 - 14:00 |
Lunch Break |
14:00 - 15:20 |
Session 3: LLMs & Tabular Data |
14:00 - 14:25 |
Schema Matching with Large Language Models: an Experimental Study
Marcel Parciak, Brecht Vandevoort, Frank Neven, Liesbet M. Peeters, Stijn
14:25 - 14:40 |
LLMs for Data Engineering on Enterprise Data
Jan-Micha Bodensohn, Ulf Brackmann, Liane Vogel, Matthias Urban, Anupam Sanghi,
Carsten Binnig
14:40 - 14:55 |
Transform Table to Database Using Large Language Models
Zezhou Huang, Jia Guo, Eugene Wu
14:55 - 15:20 |
DEMA: Enhancing Causal Analysis through Data Enrichment and Discovery in
Data Lakes
Kayvon Heravi, Saathvik Dirisala, Babak Salimi
15:20 - 16:00 |
Poster Session |
16:00 - 17:10 |
Session 4: Machine Learning & Tabular Data |
16:00 - 16:25 |
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on
Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin
Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang,
Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos,
Zheng Zhang
16:25 - 16:40 |
Fast and Accurate Regional Effect Plots for Automated Tabular Data
Vasilis Gkolemis, Theodore Dalamagas, Eirini Ntoutsi, Christos Diou
16:40 - 16:55 |
GFS: Graph-based Feature Synthesis for Prediction over Relational
Han Zhang, Quan Gan, David Wipf, Weinan Zhang
16:55 - 17:10 |
Closing Remarks & Awards
Keynote by Shi Han
Haoyu Dong, Microsoft Research
Title: Spreadsheet Intelligence and Data Analytics
Abstract: This keynote will unveil cutting-edge technologies designed to tackle the major
challenges in spreadsheet intelligence, encompassing areas such as detecting table ranges,
analyzing table structures and sheet layouts, understanding data semantics, and recommending
data presentations. Building on spreadsheet intelligence, the presentation will also
highlight our research and engineering efforts in boosting the automation of data analytics to help
Microsoft build technical leadership in the Business Intelligence market.
In light of the rise of Large Language Models (LLMs), we will also present our latest explorations into
integrating LLMs with spreadsheet intelligence and data analytics.