Diver in the Data Ocean
Michael Trubetskov is a computer specialist. He evaluates information that researchers obtain from laser-assisted analysis of molecules in blood.
Even as a little boy growing up in the Soviet Union, Dr. Michael Trubetskov loved tinkering with metal toys. Using his Конструктор construction kits, he would often assemble components into large structures. It was here that his fascination with science was born. Trubetskov has been working at the Max Planck Institute of Quantum Optics (MPQ) since 2012. His work is still very much about creating useful tools from smaller components, except that his toy is now software. Trubetskov writes programs aimed at helping scientists diagnose cancer using laser light, non-invasively and before symptoms become apparent.
“Earlier, the components determined what I was able to build,” says Trubetskov. In his office on the Garching Forschungszentrum research campus, a wall of screens invites one to dive into oceans of data and every kind of programming code. Trubetskov motions to the monitors. “Now I work with a toy that offers me unlimited possibilities. If I need a new component, I don’t go looking for it in my toolbox anymore. I just make it myself.” Trubetskov’s job is to develop programs that prepare measured data for cancer diagnosis. “My software’s aim is to clean up unprocessed data, i.e., filter out background noise and maximize the actual information relevant to cancer diagnosis.”
As a member of the Broadband Infrared Diagnostics (BIRD) project team, Trubetskov is one link in a chain made up of physicists, mathematicians and physicians that combines laser light and cancer diagnosis. At the Ludwig-Maximilians-Universität München (LMU) and the MPQ they are working on analyzing the molecular composition of blood using infrared waves to determine a patient’s state of health. The hope is to be able to detect cancer at an early stage, when the chances of successful treatment are the highest. However, the interaction between light and molecules isn’t necessarily directly visible. The most important information is hidden deep inside measurement data and overshadowed by the “background noise” caused by instruments as well as the complex chemistry in blood. It’s Trubetskov’s job to ferret out this valuable information. With the help of his programs, scientists are able to isolate and process the relevant data before it is further analyzed by artificial intelligence.
To use a metaphor, the BIRD research group is searching for a needle in a haystack, or rather, for needles in tens of thousands of haystacks. The haystacks are the blood samples collected from cancer patients and healthy volunteers. The needles are the characteristics of blood that make cancer diagnosis possible — also called the “molecular fingerprint.” When a femtosecond light pulse (a femtosecond is a millionth of a billionth of a second) hits a blood sample, the molecules in the blood begin to vibrate. It is through this “echo” that the molecular fingerprint can be read.
The problem is that the scientists don’t know exactly which characteristics of molecular fingerprints are indicators of cancer; in other words, they don’t know which needles they should be looking for. Additionally, the haystacks are teeming with “false” needles: interfering signals generated by the instruments, which are difficult to distinguish from the characteristics they are looking for. In fact, it’s impossible to cleanly separate the original pulse from the echo, since the echo is produced and influenced by the pulse. Moreover, the short-pulse laser itself is so new that its intensity is not always constant. Its fluctuations are random and must be taken into account.
To make a comparison of the blood samples possible, Trubetskov must remove the “false” needles, that is, suppress the interfering signals, in order to isolate the desired needles. Only then is it possible to analyze the relevant characteristics. The comparison of these characteristics is subsequently carried out by so-called “neural networks,” which search the data sets for patterns.
The complicated preparation of the measured data requires wide-ranging knowledge. Trubetskov’s training as a physicist, mathematician and computer scientist gives him the combination of theoretical and practical experience he needs. “Often, what counts is intuition,” says Trubetskov. “Sometimes you can just feel that you’re on the right track. And it’s often not possible to solve problems by just sitting at your desk.” When Trubetskov isn’t making any headway with a tricky problem, he takes a break, goes swimming, and picks it up again afterwards. “Sometimes you just have to take a break and do something else, and suddenly the solution will come to you.”
One of the biggest challenges is the constantly changing requirements. “It’s often been the case that I’ve just finished writing a program and then my colleague asks me to completely change the fundamental aspects of it,” says Trubetskov. In order to deal with these requirements, Trubetskov relies on a strategy known as “agile software development.” Instead of following fixed construction plans and designing the development of programs down to the last detail, Trubetskov leaves room for change. “It’s not a linear process.” But the work is worth it. “The best feeling is when something works.” Trubetskov points to the computer tower whirring under his desk. “This isn’t much more than a clutter of silicon and cables. If we can teach this machine how to give us insights into reality and to possibly diagnose cancer, it would make me incredibly proud.” And so the boy who once tinkered with his metal toys has become a researcher who is helping to shape the science of tomorrow.