HKU Data Repository

File(s) under embargo

Reason: Some files are part of unpublished papers.

1 year(s), 11 month(s), 2 day(s) until file(s) become available

Supporting data for GENOMICS AND DEEP LEARNING GUIDED DISCOVERY OF RIBOSOMAL PEPTIDE BIOSYNTHETIC GENES: FROM MINING TO DESIGN

dataset
posted on 2023-07-07, 08:27, authored by Zheng Zhong, Yongxin Li

This dataset contains the supporting data for the three studies in my thesis.


The first study, Chapter II of my thesis, uses a correlational network method to find unclustered proteases of lanthipeptides. Files from "ChapterII supplementary data.xlsx" to "Figure2.15_cytoscape.zip" are supplementary tables and source data for figures, while "correlational-network-v1.0.zip", "23777967_protease_cluster.csv.xz", and "23777967_protease.fasta.xz" are the source code and data to reproduce the network analysis.
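The two ".xz" files named above are xz-compressed, so they can be read without unpacking them to disk first. The following is a minimal sketch using only the Python standard library, assuming "23777967_protease.fasta.xz" is a conventional FASTA file (header lines starting with ">", sequences possibly wrapped over several lines). The function name `read_fasta_xz` is hypothetical and is not part of the deposited source code.

```python
import lzma
from typing import Iterator, Tuple


def read_fasta_xz(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an xz-compressed FASTA file.

    Assumes a conventional FASTA layout; the header is returned
    without its leading ">" and wrapped sequence lines are joined.
    """
    header = None
    chunks = []
    # "rt" opens the compressed stream in text mode, decompressing lazily.
    with lzma.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

Iterating with `for name, seq in read_fasta_xz("23777967_protease.fasta.xz")` streams one record at a time, which keeps memory use low for a large file. The companion "23777967_protease_cluster.csv.xz" can be opened the same way with `lzma.open` and passed to `csv.reader`; its column layout is not described here, so it is not assumed.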


The second study, Chapter III of my thesis, uses a deep learning approach to distinguish RiPP precursors from other peptides and to classify them. "ChapterIII supplementary data.xlsx" contains the supplementary tables, including training data, model performance, and predicted RiPP precursors, while "TrRiPP.zip" contains the source code and trained model of the deep learning approach.


The third study, Chapter IV of my thesis, uses deep learning models to predict the precursors of given RiPP-modifying enzymes and to generate novel RiPP-modifying enzymes. "ChapterIV supplementary data.xlsx" contains the supplementary tables, including training data, model performance, and generated RiPP-modifying enzymes and precursors, while "BGCDesign.zip" contains the source code of the deep learning models.

Funding

Hong Kong Research Grants Council ECS grant HKU27107320

Hong Kong Branch of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (SMSEGL20SC02)
