Publications

468 Publications visible to you, out of a total of 468

Abstract

Machine learning (ML) models are developed on a learning dataset covering only a small part of the data of interest. If model predictions are accurate for the learning dataset but fail for unseen data then generalization error is considered high. This problem manifests itself within all major sub-fields of ML but is especially relevant in medical applications. Clinical data structures, patient cohorts, and clinical protocols may be highly biased among hospitals such that sampling of representative learning datasets to learn ML models remains a challenge. As ML models exhibit poor predictive performance over data ranges sparsely or not covered by the learning dataset, in this study, we propose a novel method to assess their generalization capability among different hospitals based on the convex hull (CH) overlap between multivariate datasets. To reduce dimensionality effects, we used a two-step approach. First, CH analysis was applied to find mean CH coverage between each of the two datasets, resulting in an upper bound of the prediction range. Second, 4 types of ML models were trained to classify the origin of a dataset (i.e., from which hospital) and to estimate differences in datasets with respect to underlying distributions. To demonstrate the applicability of our method, we used 4 critical-care patient datasets from different hospitals in Germany and USA. We estimated the similarity of these populations and investigated whether ML models developed on one dataset can be reliably applied to another one. We show that the strongest drop in performance was associated with the poor intersection of convex hulls in the corresponding hospitals’ datasets and with a high performance of ML methods for dataset discrimination. Hence, we suggest the application of our pipeline as a first tool to assess the transferability of trained models. 
We emphasize that datasets from different hospitals represent heterogeneous data sources, and the transfer from one database to another should be performed with utmost care to avoid implications during real-world applications of the developed models. Further research is needed to develop methods for the adaptation of ML models to new hospitals. In addition, more work should be aimed at the creation of gold-standard datasets that are large and diverse with data from varied application sites.
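
The convex hull coverage idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes only two features per patient (the study uses multivariate data), and the names `convex_hull`, `inside_hull`, and `hull_coverage` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def hull_coverage(reference: List[Point], other: List[Point]) -> float:
    """Fraction of points in `other` that fall inside the hull of `reference`."""
    hull = convex_hull(reference)
    return sum(inside_hull(hull, p) for p in other) / len(other)


# Toy data (invented for illustration, not from the study).
hospital_a = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
hospital_b = [(1, 1), (3, 3), (5, 5), (2, 6)]

coverage_b_in_a = hull_coverage(hospital_a, hospital_b)
coverage_a_in_b = hull_coverage(hospital_b, hospital_a)
print(coverage_b_in_a, coverage_a_in_b)
```

A low coverage value in either direction would suggest that a model trained on one dataset is being asked to predict outside the range it has seen. In higher dimensions, membership in a hull is typically tested differently (for example, via linear programming), which this two-dimensional sketch does not cover.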

Authors: Konstantin Sharafutdinov, Jayesh S Bhat, Sebastian Johannes Fritsch, Kateryna Nikulina, Moein E Samadi, Richard Polzin, Hannah Mayer, Gernot Marx, Johannes Bickenbach, Andreas Schuppert

Date Published: 1st Oct 2022

Publication Type: Journal article

Abstract

Background: With the growing number of medications taken, the prevalence of medication risks increases. These include, for example, drug-drug interactions, which can reduce but also amplify the desired and undesired effects of individual drugs. Objective: The joint project POLAR (POLypharmazie, Arzneimittelwechselwirkungen und Risiken; polypharmacy, drug interactions, and risks) aims to contribute to the detection of medication risks in patients with polymedication, using methods and processes of the Medical Informatics Initiative (MII) on the basis of "real-world data" (inpatient treatment data from university hospitals). The article presents the specific clinical problems and illustrates them with a concrete analysis example. Materials and methods: Specific pharmacological questions are represented algorithmically and evaluated in distributed analyses at 13 data integration centers. An essential prerequisite for applying these algorithms is the MII core data set structure, which relies on international IT, interoperability, and terminology standards. Results: POLAR has shown for the first time that inpatient treatment data can be made usable across sites, in compliance with data protection requirements, for research questions on drug-related problems, on the basis of harmonized, interoperable data exchange formats. Conclusions: As an interim status of POLAR, a first preliminary result of an analysis is shown. In addition, more general technical, legal, and communication-related opportunities and challenges are presented, with a focus on the use of inpatient treatment data as "real-world data" for research.

Authors: André Scherag, Wahram Andrikyan, Tobias Dreischulte, Pauline Dürr, Martin F Fromm, Jan Gewehr, Ulrich Jaehde, Miriam Kesselmeier, Renke Maas, Petra A Thürmann, Frank Meineke, Daniel Neumann, Julia Palm, Thomas Peschel, Editha Räuscher, Susann Schulze, Torsten Thalheim, Thomas Wendt, Markus Loeffler, D Ammon, W Andrikyan, U Bartz, B Bergh, T Bertsche, O Beyan, S Biergans, H Binder, M Boeker, H Bogatsch, R Böhm, A Böhmer, J Brandes, C Bulin, D Caliskan, I Cascorbi, M Coenen, F Dietz, F Dörje, T Dreischulte, J Drepper, P Dürr, A Dürschmid, F Eckelt, R Eils, A Eisert, C Engel, F Erdfelder, K Farker, M Federbusch, S Franke, N Freier, T Frese, M Fromm, K Fünfgeld, T Ganslandt, J Gewehr, D Grigutsch, W Haefeli, U Hahn, A Härdtlein, R Harnisch, S Härterich, M Hartmann, R Häuslschmid, C Haverkamp, O Heinze, P Horki, M Hug, T Iskra, U Jaehde, S Jäger, P Jürs, C Jüttner, J Kaftan, T Kaiser, K Karsten Dafonte, M Kesselmeier, S Kiefer, S Klasing, O Kohlbacher, D Kraska, S Krause, S Kreutzke, R Krock, K Kuhn, S Lederer, M Lehne, M Löbe, M Loeffler, C Lohr, V Lowitsch, N Lüneburg, M Lüönd, I Lutz, R Maas, U Mansmann, K Marquardt, A Medek, F Meineke, A Merzweiler, A Michel-Backofen, Y Mou, B Mussawy, D Neumann, J Neumann, C Niklas, M Nüchter, K Oswald, J Palm, T Peschel, H Prokosch, J Przybilla, E Räuscher, L Redeker, Y Remane, A Riedel, M Rottenkolber, F Rottmann, F Salman, J Schepers, A Scherag, F Schmidt, S Schmiedl, K Schmitz, G Schneider, A Scholtz, S Schorn, B Schreiweis, S Schulze, A K Schuster, M Schwab, H Seidling, S Semler, K Senft, M Slupina, R Speer, S Stäubert, D Steinbach, C Stelzer, H Stenzhorn, M Strobel, T Thalheim, M Then, P Thürmann, D Tiller, P Tippmann, Y Ucer, S Unger, J Vogel, J Wagner, J Wehrle, D Weichart, L Weisbach, S Welten, T Wendt, R Wettstein, I Wittenberg, R Woltersdorf, M Yahiaoui-Doktor, S Zabka, S Zenker, S Zeynalova, L Zimmermann, D Zöller, for the POLAR project

Date Published: 1st Sep 2022

Publication Type: Journal article

Abstract

We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.

Authors: Luise Modersohn, Stefan Schulz, Christina Lohr, Udo Hahn

Date Published: 1st Aug 2022

Publication Type: InCollection

Abstract

Numerous prediction models of the SARS-CoV-2 pandemic were proposed in the past. Unknown parameters of these models are often estimated based on observational data. However, lag in case-reporting, changing testing policy or incompleteness of data lead to biased estimates. Moreover, parametrization is time-dependent due to changing age-structures, emerging virus variants, non-pharmaceutical interventions, and vaccination programs. To cover these aspects, we propose a principled approach to parametrize a SIR-type epidemiologic model by embedding it as a hidden layer into an input-output non-linear dynamical system (IO-NLDS). Observable data are coupled to hidden states of the model by appropriate data models considering possible biases of the data. This includes data issues such as known delays or biases in reporting. We estimate model parameters including their time-dependence by a Bayesian knowledge synthesis process considering parameter ranges derived from external studies as prior information. We applied this approach to a specific SIR-type model and to data from Germany and Saxony, demonstrating good prediction performance. Our approach can estimate and compare the relative effectiveness of non-pharmaceutical interventions and provide scenarios of the future course of the epidemic under specified conditions. It can be translated to other data sets, i.e., other countries and other SIR-type models.
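
For readers unfamiliar with SIR-type models, the following is a minimal sketch of a basic SIR simulation with constant parameters. It does not reproduce the IO-NLDS embedding, the data models, or the Bayesian estimation described in the abstract. The function name `simulate_sir` and all parameter values are assumptions chosen purely for illustration.

```python
from typing import List, Tuple

State = Tuple[float, float, float]  # (susceptible, infected, recovered)


def simulate_sir(beta: float, gamma: float, s0: float, i0: float,
                 r0: float, days: int, dt: float = 0.1) -> List[State]:
    """Integrate a basic SIR model with explicit Euler steps.

    beta  : transmission rate per day (assumed constant here)
    gamma : recovery rate per day (assumed constant here)
    Returns one (S, I, R) state per day, including day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, conserved by the model
    steps_per_day = round(1 / dt)
    trajectory: List[State] = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Illustrative values only; not estimates from the paper.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=9990, i0=10, r0=0, days=60)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print(f"Infections peak on day {peak_day}")
```

In the approach summarized above, parameters such as the transmission rate would vary over time and be inferred from biased, delayed observations rather than fixed in advance as they are in this sketch.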

Authors: Y. Kheifetz, H. Kirsten, M. Scholz

Date Published: 2nd Jul 2022

Publication Type: Journal article

Human Diseases: COVID-19

Abstract

BACKGROUND: The secondary use of deidentified but not anonymized patient data is a promising approach for enabling precision medicine and learning health care systems. In most national jurisdictions (e.g., in Europe), this type of secondary use requires patient consent. While various ethical, legal, and technical analyses have stressed the opportunities and challenges for different types of consent over the past decade, no country has yet established a national consent standard accepted by the relevant authorities. METHODS: A working group of the national Medical Informatics Initiative in Germany conducted a requirements analysis and developed a GDPR-compliant broad consent standard. The development included consensus procedures within the Medical Informatics Initiative, a documented consultation process with all relevant stakeholder groups and authorities, and the ultimate submission for approval via the national data protection authorities. RESULTS: This paper presents the broad consent text together with a guidance document on mandatory safeguards for broad consent implementation. The mandatory safeguards comprise i) independent review of individual research projects, ii) organizational measures to protect patients from involuntary disclosure of protected information, and iii) comprehensive information for patients and public transparency. This paper further describes the key issues discussed with the relevant authorities, especially the position on additional or alternative consent approaches such as dynamic consent. DISCUSSION: Both the resulting broad consent text and the national consensus process are relevant for similar activities internationally. A key challenge of aligning consent documents with the various stakeholders was explaining and justifying the decision to use broad consent and the decision against using alternative models such as dynamic consent. 
Public transparency for all secondary use projects and their results emerged as a key factor in this justification. While currently largely limited to academic medicine in Germany, the first steps for extending this broad consent approach to wider areas of application, including smaller institutions and medical practices, are currently under consideration.

Authors: Sven Zenker, Daniel Strech, Kristina Ihrig, Roland Jahns, Gabriele Müller, Christoph Schickhardt, Georg Schmidt, Ronald Speer, Eva Winkler, Sebastian Graf von Kielmansegg, Johannes Drepper

Date Published: 1st Jul 2022

Publication Type: Journal article

Abstract

BACKGROUND: Clinical decision support systems often adopt and operationalize existing clinical practice guidelines leading to higher guideline availability, increased guideline adherence, and data integration. Most of these systems use an internal state-based model of a clinical practice guideline to derive recommendations but do not provide the user with comprehensive insight into the model. OBJECTIVE: Here we present a novel approach based on dynamic guideline visualization that incorporates the individual patient’s current treatment context. METHODS: We derived multiple requirements to be fulfilled by such an enhanced guideline visualization. Using Business Process Model and Notation (BPMN) as the representation format for computer-interpretable guidelines, a combination of graph-based representation and logical inferences is adopted for guideline processing. A context-specific guideline visualization is inferred using a business rules engine. RESULTS: We implemented and piloted an algorithmic approach for guideline interpretation and processing. As a result of this interpretation, a context-specific guideline is derived and visualized. Our implementation can be used as a software library but also provides a representational state transfer (REST) interface. Spring, Camunda, and Drools served as the main frameworks for implementation. A formative usability evaluation of a demonstrator tool that uses the visualization yielded high acceptance among clinicians. CONCLUSIONS: The novel guideline processing and visualization concept proved to be technically feasible. The approach addresses known problems of guideline-based clinical decision support systems. Further research is necessary to evaluate the applicability of the approach in specific medical use cases.

Authors: Jonas Fortmann, Marlene Lutz, Cord Spreckelsen

Date Published: 1st Jun 2022

Publication Type: Journal article

Abstract

Background: In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third parties. Therefore, alternative data usage policies, which comply with present privacy guidelines, are of particular interest. Objective: We aim to enable analyses on sensitive patient data while simultaneously complying with local data protection regulations using an approach called the Personal Health Train (PHT), which is a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location. Methods: In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored in three different and distributed data providers. Results: We show that our infrastructure enables the training of data models based on distributed data sources. Conclusion: Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles of sharing patient data. We further demonstrate its ability to fuel medical science by making distributed data sets available for scientists or health care practitioners.

Authors: Sascha Welten, Yongli Mou, Laurenz Neumann, Mehrshad Jaberansary, Yeliz Yediel Ucer, Toralf Kirsten, Stefan Decker, Oya Beyan

Date Published: 1st Jun 2022

Publication Type: Journal article

Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig