9:00 - 10:30 |
Session 1: Tabular Data Discovery |
9:00 - 9:10 |
Opening Remarks |
9:10 - 9:35 |
ALT-GEN: Benchmarking Table Union Search using Large Language Models
[pdf]
Koyena Pal, Aamod Khatiwada, Roee Shraga, Renée J. Miller
|
9:35 - 10:00 |
Finding Support for Tabular LLM Outputs
[pdf]
Grace Fan, Roee Shraga, Renée J. Miller
|
10:00 - 10:15 |
Humboldt: Metadata-Driven Extensible Data Discovery
[pdf]
Alex Bäuerle, Çağatay Demiralp, Michael Stonebraker
|
10:15 - 10:30 |
Toward a Declarative Query Language for Machine Learning
[pdf]
Hasan Jamil
|
|
10:30 - 11:00 |
Coffee Break |
11:00 - 12:30 |
Session 2: Keynote Session |
11:00 - 12:00 |
Keynote talk by Shi Han and Haoyu Dong, Microsoft Research
Spreadsheet Intelligence and Data Analytics
|
12:00 - 12:15 |
Data Quality Management for Responsible AI in Data Lakes
[pdf]
Carolina Cortes, Camila Sanz, Lorena Etcheverry, Adriana Marotta
|
12:15 - 12:30 |
Large Language Models as Data Preprocessors
[pdf]
Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
|
|
12:30 - 14:00 |
Lunch Break |
14:00 - 15:20 |
Session 3: LLMs & Tabular Data |
14:00 - 14:25 |
Schema Matching with Large Language Models: an Experimental Study
[pdf]
Marcel Parciak, Brecht Vandevoort, Frank Neven, Liesbet M. Peeters, Stijn Vansummeren
|
14:25 - 14:40 |
LLMs for Data Engineering on Enterprise Data
[pdf]
Jan-Micha Bodensohn, Ulf Brackmann, Liane Vogel, Matthias Urban, Anupam Sanghi, Carsten Binnig
|
14:40 - 14:55 |
Transform Table to Database Using Large Language Models
[pdf]
Zezhou Huang, Jia Guo, Eugene Wu
|
14:55 - 15:20 |
DEMA: Enhancing Causal Analysis through Data Enrichment and Discovery in Data Lakes
[pdf]
Kayvon Heravi, Saathvik Dirisala, Babak Salimi
|
|
15:20 - 16:00 |
Poster Session |
16:00 - 17:10 |
Session 4: Machine Learning & Tabular Data |
16:00 - 16:25 |
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on RDBs
[pdf]
Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin
Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang,
Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos,
Zheng Zhang
|
16:25 - 16:40 |
Fast and Accurate Regional Effect Plots for Automated Tabular Data Analysis
[pdf]
Vasilis Gkolemis, Theodore Dalamagas, Eirini Ntoutsi, Christos Diou
|
16:40 - 16:55 |
GFS: Graph-based Feature Synthesis for Prediction over Relational Databases
[pdf]
Han Zhang, Quan Gan, David Wipf, Weinan Zhang
|
16:55 - 17:10 |
Closing Remarks & Awards
|
|
Keynote by Shi Han and Haoyu Dong, Microsoft Research
Title: Spreadsheet Intelligence and Data Analytics
Abstract: This keynote will unveil cutting-edge technologies designed to tackle the major
challenges in spreadsheet intelligence, encompassing areas such as detecting table ranges,
analyzing table structures and sheet layouts, understanding data semantics, and recommending
data presentations. Building on spreadsheet intelligence, the presentation will also
highlight our research and engineering efforts to boost the automation of data analytics to help
Microsoft build technical leadership in the Business Intelligence market.
Given the rise of Large Language Models (LLMs), we will also present our latest explorations into
integrating LLMs with spreadsheet intelligence and data analytics.