TU Dresden
A joint project with Uni Leipzig

Sim2Real: Simulated Training and Test Data for Biomedical Image Analysis

Mentored by Stefanie Speidel & Bjoern Andres
at National Center for Tumor Diseases (NCT), UKD Dresden / Chair for Machine Learning for Computer Vision, TU Dresden

A major bottleneck in applications of machine learning to biological and medical tasks, such as cell lineage tracing and surgery, is the lack of annotated data. This includes both a lack of annotated training data required for learning sufficiently complex models and a lack of annotated test data for empirically analyzing the accuracy and robustness of learned models and algorithms in rare but important cases. A promising approach to overcoming this bottleneck is the synthesis of realistic training data and targeted test data by means of simulation.

A challenge in this context is to bridge the domain gap, such that models trained on synthetic data generalize well to real data, and such that analyses on synthetic test data are informative about analyses on real data. Recent advances in image synthesis and neural rendering enable controllable generation of realistic data without explicit supervision between simulated and real data during training. Such synthetic data can then facilitate training or evaluation in realistic settings where labeled data is limited or no ground truth is available.

The PhD topic tackles the generation of controllable, realistic image or video data for cell lineage tracing and surgery as application domains. Crucial research challenges in this context include, for example:

  • How can consistent and label-preserving data be generated for robot-assisted surgery using recent advances in computer vision for generative models (e.g. GANs, diffusion models) and neural rendering (e.g. NeRF)?
  • How can the variation of cell shape, cell motion and the appearance of intracellular structure be incorporated in the synthesis of 3D+t data for cell lineage tracing?
  • How can known priors from the surgical domain (e.g. anatomical knowledge, knowledge about light sources) be leveraged to infuse expert knowledge into the data generation process?
  • How can priors and data modeling be balanced to ensure an effective tradeoff between realism and label preservation?
  • How can evaluation scenarios be generated in this context that provide ground truth which would otherwise be unobtainable (e.g. depth maps, optical flow, point correspondences)?
  • What are possible metrics to quantify data quality?
  • How can edge cases be considered and bias be accounted for?

Work Environment

You will be working at the National Center for Tumor Diseases (NCT) Dresden and the chair of Machine Learning for Computer Vision. The NCT Dresden, located on the medical campus, combines patient therapy and research under one roof and offers a unique research platform including an experimental operating room and a novel simulation room for robot-assisted surgery. The group there researches applied machine learning methods for robot-assisted surgery and surgical data science. The second workplace will be the chair of Machine Learning for Computer Vision at the computer science faculty. The chair researches computer vision methods for different applications in the life sciences and industry.

Prerequisites

  • Master’s Degree (or equivalent) in computer science, electrical engineering, applied mathematics or related fields of expertise
  • Very good programming skills (e.g. C++, Python)
  • Excellent skills and practical experience in one or more of the following research areas are beneficial:
    • Machine Learning
    • Computer Vision
  • Ability to collaborate well in an interdisciplinary environment