File(s) under embargo
Reason: Some files are part of unpublished papers.
1 year(s), 6 month(s), 18 day(s) until file(s) become available
Supporting data for GENOMICS AND DEEP LEARNING GUIDED DISCOVERY OF RIBOSOMAL PEPTIDE BIOSYNTHETIC GENES: FROM MINING TO DESIGN
This dataset contains the supporting data for the three studies in my thesis.
The first study, Chapter II of my thesis, uses a correlational network method to find unclustered proteases of lanthipeptides. Files from "ChapterII supplementary data.xlsx" to "Figure2.15_cytoscape.zip" are supplementary tables and source data for figures, while "correlational-network-v1.0.zip", "23777967_protease_cluster.csv.xz", and "23777967_protease.fasta.xz" are the source code and data to reproduce the network analysis.
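The two ".xz" files can be read without unpacking them to disk first. The following is a minimal sketch using only the Python standard library; the helper names read_xz_csv and read_xz_fasta are hypothetical and are not part of "correlational-network-v1.0.zip". It assumes the files are UTF-8 text in ordinary CSV and FASTA layouts, and it makes no assumption about what the CSV columns mean.

```python
import csv
import lzma
from typing import Iterator, List, Tuple


def read_xz_csv(path: str) -> Iterator[List[str]]:
    """Yield rows (lists of strings) from an xz-compressed CSV file.

    Hypothetical helper: the column layout is not interpreted here.
    """
    with lzma.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle)


def read_xz_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an xz-compressed FASTA file.

    The header is returned without the leading '>'; sequence lines that
    belong to one record are joined into a single string.
    """
    header = None
    chunks: List[str] = []
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    # Emit the final record, if any.
    if header is not None:
        yield header, "".join(chunks)
```

Assuming the files sit in the working directory, next(read_xz_fasta("23777967_protease.fasta.xz")) would return the first record, and iterating read_xz_csv("23777967_protease_cluster.csv.xz") would stream the table row by row. Both helpers are generators, so large files are not loaded into memory all at once.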
The second study, Chapter III of my thesis, uses a deep learning approach to distinguish RiPP precursors from other peptides and to classify them. "ChapterIII supplementary data.xlsx" contains the supplementary tables, including training data, model performance, and predicted RiPP precursors, while "TrRiPP.zip" contains the source code and trained model of the deep learning approach.
The third study, Chapter IV of my thesis, uses deep learning models to predict the precursors of given RiPP-modifying enzymes and to generate novel RiPP-modifying enzymes. "ChapterIV supplementary data.xlsx" contains the supplementary tables, including training data, model performance, and generated RiPP-modifying enzymes and precursors, while "BGCDesign.zip" contains the source code of the deep learning models.