Sebastian Jäger
Publications
From Data Imputation to Data Cleaning - Automated Cleaning of Tabular Data Improves Downstream Predictive Performance
We develop and evaluate an application-agnostic ML-based data cleaning approach using well-established imputation techniques for automated detection and cleaning of erroneous values. To improve the degree of automation, we combine imputation techniques with conformal prediction (CP), a model-agnostic and distribution-free method to quantify and calibrate the uncertainty of ML models.
Sebastian Jäger, Felix Bießmann
PDF, Code, Poster
Automated Extraction of Fine-Grained Standardized Product Information from Unstructured Multilingual Web Data
We implement models that reliably predict product attributes across online shops, languages, or both, and can be used to match product taxonomies between online retailers.
Alexander Flick, Sebastian Jäger, Ivana Trajanovska, Felix Bießmann
Project, DOI
GreenDB - A Dataset and Benchmark for Extraction of Sustainability Information of Consumer Goods
We present the second public release of GreenDB and a first benchmark for sustainability information extraction.
Sebastian Jäger, Alexander Flick, Jessica Adriana Sanchez Garcia, Kaspar von den Driesch, Karl Brendel, Felix Bießmann
PDF, Code, Dataset, Slides, DOI
Nudging Sustainable Consumption: A Large-Scale Data Analysis of Sustainability Labels for Fashion in German Online Retail
A large-scale data analysis of sustainability labels for fashion in German online retail.
Maike Gossen, Sebastian Jäger, Marja Lena Hoffmann, Felix Bießmann, Ruben Korenke, Tilman Santarius
PDF, DOI
GreenDB: Toward a Product-by-Product Sustainability Database
We present the first public release of the GreenDB and describe its scraping pipeline.
Sebastian Jäger, Jessica Greene, Max Jakob, Ruben Korenke, Tilman Santarius, Felix Bießmann
PDF, Code, Dataset, DOI
A Benchmark for Data Imputation Methods
Comparison of data imputation methods on a wide range of datasets, missingness patterns, and missingness fractions.
Sebastian Jäger, Arndt Allhorn, Felix Bießmann
PDF, Code, DOI
Compressing BERT - An Evaluation and Combination of Methods
Evaluation and improvement of BERT-of-Theseus based on real-world datasets and tasks.
Sebastian Jäger
PDF
Parallelized Training of Deep NN – Comparison of Current Concepts and Frameworks
Kubernetes-based evaluation of TensorFlow's and MXNet's throughput, scalability, and practical ease of use.
Sebastian Jäger, Hans Peter Zorn, Stefan Igel, Christian Zirpins
Slides, DOI
Machine Learning Im Kubernetes Cluster
Explains how Kubernetes and the right tooling help to implement machine learning projects efficiently and successfully.
Sebastian Jäger
Source Document
Horizontales Skalieren Von Deep Learning Frameworks
Kubernetes-based evaluation of the horizontal scalability of TensorFlow and MXNet using the Fashion-MNIST and PTB datasets.
Sebastian Jäger
PDF