Identify the authors of synthetic DNA sequences.
Advances in knowledge and tools in systems and synthetic biology are widening the applicability of biotechnology. At the same time, they lower the barrier to entry for approaches such as genome engineering and enlarge the group of people able to use them. From a biosecurity perspective, possible misuse of this technology poses serious security threats.
Possible countermeasures include so-called lab-of-origin approaches. Their goal is to identify the group or lab in which a genetically engineered construct was produced. To this end, machine learning algorithms identify relevant features within DNA sequences that allow the sequences to be attributed to specific labs. While lab-of-origin predictors already exist, new datasets and advances in the field of large language models offer the potential to improve this process further. The goal of this project is to develop a new lab-of-origin predictor from a newly created dataset.
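To make the attribution idea concrete, the following is a minimal sketch in Python, not the method this project will use. It assumes a deliberately simple feature (overlapping 3-mer counts) and a nearest-profile classifier based on cosine similarity. The lab names, the toy sequences, and the helper functions `kmer_counts`, `cosine`, `train`, and `predict` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(v * b.get(key, 0) for key, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labelled, k=3):
    """Sum the k-mer counts of each lab's sequences into one profile per lab."""
    profiles = {}
    for lab, seq in labelled:
        profiles.setdefault(lab, Counter()).update(kmer_counts(seq, k))
    return profiles


def predict(profiles, seq, k=3):
    """Return the lab whose profile is most similar to the query sequence."""
    query = kmer_counts(seq, k)
    return max(profiles, key=lambda lab: cosine(query, profiles[lab]))


# Toy, made-up training data (hypothetical labs and sequences).
training = [
    ("lab_A", "ATATATATATATATAT"),
    ("lab_A", "TATATATATATTATAT"),
    ("lab_B", "GCGCGCGCGCGCGCGC"),
    ("lab_B", "CGCGCGGCGCGCGCGC"),
]
profiles = train(training)
prediction = predict(profiles, "ATATATTATATA")
print(prediction)
```

Real constructs are far longer and differ in much subtler ways than these toy sequences, which is why learned features and larger models are of interest here; the sketch only shows the overall shape of the task: sequence in, features out, most similar lab returned.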
Additional Information
Project Capacity | Three IREP students |
Project available for | Spring, Summer and Fall 2024 |
Credits | 18 |
Available via Remote | No |
Project Supervisor | Erik Kubaczka |