How do cancer cells differ from healthy cells? A new machine learning algorithm called “ikarus” knows the answer, a team led by MDC bioinformatician Altuna Akalin reports in the journal Genome Biology. The AI program has found a gene signature that is characteristic of tumors.
When it comes to identifying patterns in mountains of data, humans are no match for artificial intelligence (AI). In particular, a branch of AI called machine learning is often used to find regularities in datasets, whether for stock market analysis, image and voice recognition, or the classification of cells. To reliably distinguish cancer cells from healthy cells, a team led by Dr. Altuna Akalin, head of the Bioinformatics and Omics Data Science Platform at the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), has developed a machine learning program called “ikarus”.
The program found a pattern in tumor cells that is common to different types of cancer and consists of a characteristic combination of genes. According to the team's paper in the journal Genome Biology, the algorithm also detected types of genes in this pattern that had not previously been clearly linked to cancer.
Machine learning essentially means that an algorithm uses training data to learn how to answer a particular question on its own. It does so by searching for patterns in the data that help it solve the problem. After the training phase, the system can generalize from what it has learned in order to evaluate unknown data. “It was a big challenge to get suitable training data in which experts had already made a clear distinction between ‘healthy’ and ‘cancerous’ cells,” says Jan Dohmen, the first author of the paper.
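The train-then-generalize idea described above can be illustrated with a deliberately simple sketch. It is not the ikarus algorithm: it uses a nearest-centroid classifier as a stand-in technique, and the function names, feature values, and labels are all assumptions made up for illustration.

```python
from statistics import mean


def train_centroids(samples, labels):
    """Learn one average profile (centroid) per label from labeled training data."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: distance(centroids[label]))


# Toy training data: two made-up measurements per cell (not real expression values).
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["cancerous", "cancerous", "healthy", "healthy"]

# Training phase: summarize the labeled examples.
centroids = train_centroids(train_x, train_y)

# Generalization: evaluate a cell that was not part of the training data.
prediction = predict(centroids, [0.7, 0.3])
print(prediction)  # cancerous
```

The sketch also shows why the expert labels in the training data matter so much: the classifier can only be as good as the “healthy” and “cancerous” annotations it learns from.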
Surprisingly high success rate
In addition, single-cell sequencing datasets are often noisy. That is, the information they contain about the molecular properties of individual cells is not very precise. This may be because a different number of genes is detected in each cell, or because the samples are not always processed in the same way. As Dohmen and his colleague Dr. Vedran Franke, co-head of the study, report, the team sifted through countless publications and contacted a considerable number of research groups to obtain suitable datasets. The team finally trained the algorithm using data from lung and colorectal cancer cells before applying it to datasets for other types of tumors.
During the training phase, ikarus had to find a list of characteristic genes, which it then used to classify the cells. “We tried out different approaches and refined them,” says Dohmen. It was time-consuming work, as all three scientists report. “The important thing was that ikarus eventually used two lists, one for tumor genes and one for genes from other cells,” Franke explains. After the learning phase, the algorithm was able to reliably distinguish between healthy cells and tumor cells in other types of cancer as well, for example in tissue samples from patients with liver cancer or neuroblastoma. Its success rate tended to be very high, which surprised even the research group. “I didn’t think there would be a common feature that so accurately defines the tumor cells of different types of cancer,” says Akalin. “But we’re still not sure whether this method works for all types of cancer,” Dohmen adds. To turn ikarus into a reliable tool for cancer diagnosis, the researchers now want to test it on additional types of tumors.
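The two-list idea can be pictured with a minimal sketch. It is an illustrative assumption, not the scoring method ikarus actually uses: each cell receives the average expression of the genes on each list, and the higher score decides the call. The gene names, expression values, and function names are hypothetical placeholders.

```python
def signature_score(cell, gene_list):
    """Average expression of the listed genes in one cell (missing genes count as 0)."""
    if not gene_list:
        return 0.0
    return sum(cell.get(gene, 0.0) for gene in gene_list) / len(gene_list)


def classify_cell(cell, tumor_genes, normal_genes):
    """Call a cell 'tumor' when its tumor-list score exceeds its normal-list score."""
    tumor = signature_score(cell, tumor_genes)
    normal = signature_score(cell, normal_genes)
    return "tumor" if tumor > normal else "normal"


# Hypothetical placeholder gene names and expression values, for illustration only.
TUMOR_GENES = ["GENE_T1", "GENE_T2"]
NORMAL_GENES = ["GENE_N1", "GENE_N2"]

cells = {
    "cell_a": {"GENE_T1": 5.0, "GENE_T2": 3.0, "GENE_N1": 1.0},
    "cell_b": {"GENE_T1": 0.5, "GENE_N1": 4.0, "GENE_N2": 2.0},
}

calls = {
    name: classify_cell(expression, TUMOR_GENES, NORMAL_GENES)
    for name, expression in cells.items()
}
print(calls)  # {'cell_a': 'tumor', 'cell_b': 'normal'}
```

Comparing two scores rather than thresholding a single one means a cell is judged relative to its own expression level, which is one plausible way to cope with the noisy single-cell data described earlier.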
AI as a fully automated diagnostic tool
The project aims to go far beyond the classification of “healthy” versus “cancerous” cells. In initial tests, ikarus has already shown that the method can also distinguish other types (and certain subtypes) of cells from tumor cells. “We want to make this approach more comprehensive and develop it further so that it can distinguish between all possible cell types in a biopsy,” says Akalin.
In hospitals, pathologists usually examine tumor tissue samples only under the microscope to identify the different cell types. It is tedious and time-consuming work. With ikarus, this step could one day become a fully automated process. In addition, Akalin says, the data could be used to draw conclusions about the immediate environment of the tumor. That, in turn, could help doctors choose the best treatment, because the composition of the cancerous tissue and its microenvironment often indicates whether a particular treatment or medication will be effective. AI may also help in developing new medicines. “ikarus makes it possible to identify genes that are potential drivers of cancer,” says Akalin. New therapeutic agents could then be used to target these molecular structures.
A remarkable aspect of the publication is that it was produced entirely during the COVID pandemic. None of those involved were at their usual desks at the Berlin Institute for Medical Systems Biology (BIMSB), which is part of the MDC. Instead, they worked from home and communicated with one another only digitally. In Franke’s view, therefore, “this project shows that digital structures can be created to facilitate scientific research under these conditions.”
Jan Dohmen et al, Identifying tumor cells at the single-cell level using machine learning, Genome Biology (2022). DOI: 10.1186/s13059-022-02683-1. Genomebiology.biomedcentral.co… 6 / s13059-022-02683-1