TU Dresden
A joint project with Uni Leipzig

RobSurgVis: Vision-Language Model meets Next-Generation Robotic Surgery

Mentored by Stefanie Speidel, Frank Fitzek
at UKD Dresden, National Center for Tumor Diseases (NCT) and TU Dresden, ComNets Chair

A shortage of surgical staff in an aging society is one of the major challenges surgery will face in the near future. Robotic surgery is one way to overcome this challenge. In particular, we need to bridge the gap between robotic systems and machine learning to provide intelligent assistance functions during an intervention and to enable natural interaction between the surgeon and the robotic assistant. Vision-language models are showing impressive results across a range of use cases, including medical applications, and are also a promising approach for robotic surgery. Combining a vision-language model with a robot allows the surgeon to give instructions to the robot based on the current scene, e.g. “focus on the liver” or “grasp the needle”, which are then translated into actions.

This PhD topic tackles the generation of a vision-language model for robotic surgery to enable next-generation surgeon-machine cooperation. Crucial research challenges in this context include, for example:

  • How can recent advances in foundation models be leveraged to generate a vision-language model for robotic surgery?
  • How can we embed surgical expert knowledge and multi-modal data sources into a queryable model?
  • How can intuitive human-machine interaction such as conversational interaction be realized with the model?
  • How can queries be translated into executable actions for the robot?
  • How can we use the model to guide the surgeon within a multi-arm robotic surgery testbed?

Work environment

You will be working at the National Center for Tumor Diseases (NCT) Dresden and the ComNets Chair at TU Dresden. The NCT Dresden, located on the medical campus, combines patient therapy and research under one roof and offers a unique research platform, including an experimental operating room and a novel simulation room for robot-assisted surgery. The group researches applied machine learning methods for robot-assisted surgery and surgical data science. The second workplace will be the Deutsche Telekom Chair of Communication Networks at the electrical engineering faculty. The chair provides expertise and a multi-arm robotic platform for different use cases.

Prerequisites

  • Master’s Degree (or equivalent) in computer science, mathematics, electrical engineering or related fields of expertise
  • Very good programming skills (e.g. C++, Python)
  • Excellent skills and practical experience in one or more of the following research areas are beneficial:
    • Robotics
    • Machine Learning
    • Computer vision
  • Ability to collaborate well in an interdisciplinary environment

Further details on the requirements and application process can be found in SECAI's announcement for open PhD positions in 2024.