Genome-Wide Analysis of the Cyclin-Dependent Kinases (CDK) and Cyclin Family in Molluscs

2021-12-22 11:40 YANG Qiong, YU Hong and LI Qi
Journal of Ocean University of China, 2021, Issue 6

YANG Qiong1), YU Hong1), 2), *, and LI Qi1), 2)

1),,266003,2),,266237,

Cell cycle regulation, which plays a pivotal role in organism growth and development, is primarily driven by cyclin-dependent kinases (CDKs) and Cyclins. Although CDK and Cyclin genes have been characterized in some animals, the study of the CDK and Cyclin families in molluscs, an ancient bilaterian group with high morphological diversity, is still in its infancy. In this study, we identified and characterized 95 CDK genes and 114 Cyclin genes in seven representative species of molluscs. Phylogenetic analysis grouped the genes of the CDK and Cyclin families into eight and 15 subfamilies, respectively. Notably, duplication of the CDK9 gene was detected in three of the mollusc genomes, which has never been recorded in animals. We speculate that duplication is the main cause of the expansion of the CDK9 subfamily in these three molluscs, which also sheds new light on the function of CDK9. In addition, Cyclin B is the largest subfamily of the Cyclin family in the seven molluscs, with an average of three genes per species. Our findings help in better understanding CDK and Cyclin function and evolution in molluscs.

molluscs; CDK; Cyclin; genome-wide analysis

1 Introduction

Cyclin-dependent kinases (CDKs) form a large family of serine/threonine-specific kinases that control eukaryotic cell cycle progression by binding cyclin partners (Pines, 1995; Johnson and Walker, 1999; Murray, 2004). During the cell cycle, cyclins accumulate and are degraded periodically, and CDK/cyclin complexes are activated at specific phases of the cycle (Morgan, 1997). In addition to cell cycle regulation, CDKs and Cyclins are also involved in transcription, RNA processing, apoptosis and neurogenesis (Pines, 1995; Malumbres, 2011; Hydbring et al., 2016).

The original member of the CDK family was found in genetic screens in yeast (Beach et al., 1982), and the first Cyclin gene was identified in sea urchin eggs (Evans et al., 1983). Subsequently, many members of the two families were identified in other species based on the conserved protein kinase domains. As a result, 6 to 8 CDKs and 9 to 15 Cyclins were identified in fungi; 8 to 52 Cyclin-like proteins and 25 to 29 CDK proteins were detected in plants (Wang et al., 2004; La et al., 2006; Ma et al., 2013); and 11 to 28 Cyclin and 14 to 20 CDK genes were found in animals (Cao et al., 2014; Malumbres, 2014). The increased complexity of cell cycle regulation in animals and plants might account for their larger CDK and Cyclin families compared with fungi, while more frequent gene duplication events in plant genomes might explain why plants have more CDKs and Cyclins than animals (Wang et al., 2004). Research on invertebrates has focused on a few organisms, such as nematodes and sea urchins: six CDK and 11 Cyclin genes are present in the nematode (Boxem, 2006), and 11 CDK and 14 Cyclin genes have been identified in the sea urchin (Cao et al., 2014). Despite their conserved functions in eukaryotic cell cycle regulation, CDKs and Cyclins have undergone an extraordinary degree of evolutionary divergence and specialization. Therefore, investigating the evolutionary history of CDKs and Cyclins will enhance our understanding of animal and plant evolution and of organism development.

Several studies have examined the evolutionary features of the CDK and Cyclin families. Some members were conserved in animals, whereas the number of others varied among organisms (Nieduszynski et al., 2002). Cell-cycle-related CDKs were shown to become more evolutionarily and functionally diverse as transcriptional complexity increased (Guo and Stiller, 2004). A comparative phylogenetic analysis of Cyclins from protists to plants, fungi and animals suggested that Cyclins can be divided into three groups (groups I, II and III) (Ma et al., 2013). Phylogenetic analysis of CDK and Cyclin proteins in 18 premetazoan lineages indicated that the CDK4/6 subfamily emerged simultaneously with eumetazoans, and that the evolutionary conservation of the Cyclin D subfamily is also tightly linked with the appearance of eumetazoans (Cao et al., 2014).

Mollusca, the most speciose phylum in the marine realm, with highly diverse body forms and lifestyles, plays an important role in evolution and in ecosystems. Many molluscs, especially bivalves such as the geoduck clam, the freshwater pearl mussel and the ocean quahog, can reach great ages (over 100 years) (Ziuganov et al., 2000; Strom et al., 2004; Wanamaker et al., 2008) and are increasingly regarded as longevity models (Abele et al., 2009; Bodnar, 2009). Despite this remarkable evolutionary and biological significance, knowledge of the CDK and Cyclin families in Mollusca is still in its infancy. With the recent rapid development of high-throughput sequencing technologies, the number of sequenced molluscan genomes has increased rapidly (Simakov et al., 2013; Albertin et al., 2015; Wang et al., 2017), which provides a unique opportunity to enhance our understanding of the CDK and Cyclin families in molluscs.

In this study, we identified and characterized CDK and Cyclin genes in seven mollusc species based on genome-wide data. Phylogenetic analysis and gene structure comparison of these proteins were conducted. The results reveal detailed evolutionary information on CDKs and their Cyclin partners, providing insights into the potential functions of CDK and Cyclin genes in molluscs.

2 Materials and Methods

2.1 Database Searching and Identification of CDK and Cyclin Genes

To identify CDK and Cyclin genes in seven representative mollusc species, their whole-genome sequence databases from NCBI were searched using query sequences generated from all CDK and Cyclin family members in human, vase tunicate, fruit fly, purple sea urchin and starlet sea anemone (Cao et al., 2014). Tblastn was used to obtain the initial pool of CDK and Cyclin candidates with an E-value cutoff of 1e-5. After removing repeated entries, a unique set of sequences was kept for further analysis. All putative CDK and Cyclin family proteins collected by the BLAST search were subjected to a preliminary phylogenetic analysis. We verified the putatively identified Cyclin proteins by searching against the SMART database (http://smart.embl-heidelberg.de/).
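The filtering and de-duplication step described above can be illustrated with a minimal sketch. It assumes the tblastn hits were saved in BLAST tabular format (12 tab-separated columns, E-value in the eleventh); the query names, scaffold names and scores below are hypothetical, and collapsing hits per subject sequence is a simplification, since a single scaffold can carry several genes.

```python
# Hypothetical tblastn hits in BLAST tabular format; columns are
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
HITS = """\
HsCDK1\tscaffold_12\t68.2\t290\t92\t0\t4\t293\t1500\t2369\t3e-120\t410
HsCDK2\tscaffold_12\t61.0\t285\t111\t0\t1\t285\t1503\t2357\t8e-98\t350
HsCDK9\tscaffold_40\t35.1\t120\t78\t0\t20\t139\t700\t1059\t2e-03\t38.1
HsCCNB1\tscaffold_7\t44.5\t250\t139\t0\t150\t399\t9000\t9749\t1e-50\t190
"""


def filter_hits(tabular, max_evalue=1e-5):
    """Keep hits with E-value <= max_evalue; retain the best hit per subject."""
    best = {}
    for line in tabular.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # fails the E-value cutoff
        if subject not in best or evalue < best[subject][1]:
            best[subject] = (query, evalue)  # repeated entry: keep the stronger hit
    return best


candidates = filter_hits(HITS)
for subject, (query, evalue) in sorted(candidates.items()):
    print(subject, query, evalue)
```

In practice the surviving candidates would still need the domain check and preliminary tree described in the text before being accepted as family members.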

2.2 Phylogenetic Analysis and Classification of the CDK and Cyclin Gene Families

To investigate the evolutionary relationships of the CDK and Cyclin families, the CDK and Cyclin amino acid sequences of the seven mollusc species and several representative metazoans were used to perform the phylogenetic analysis. All CDK sequences were aligned using ClustalW (http://www.ebi.ac.uk/clustalw/) with the default parameters. Gblocks (http://molevol.cmima.csic.es/castresana/index.html) was used to eliminate poorly aligned positions and divergent regions. The phylogenetic trees were built with MEGA 7.0 using the neighbor-joining (NJ) method with 1000 bootstrap replicates. Because the Cyclin family is less conserved than the CDK family, the Cyclin genes were aligned with MAFFT v7.402 (Katoh and Standley, 2013) and poorly aligned sequences were removed. Only the conserved regions (Cyclin-N and Cyclin-C domains) were used for the subsequent maximum likelihood (ML) phylogenetic analysis. The ML tree was constructed using RAxML v8.2.12 (Stamatakis, 2014) as implemented in the CIPRES Science Gateway v. 3.3 (http://www.phylo.org/index.php) with 1000 bootstrap replicates and the LG model. The tree was displayed with Interactive Tree of Life (ITOL, https://itol.embl.de/).
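The bootstrap test mentioned above rests on resampling alignment columns with replacement and repeating the analysis on each pseudo-alignment. The sketch below shows only that resampling step, paired with a simple p-distance; it is not the MEGA or RAxML implementation, and the three aligned sequences are toy data invented for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")  # no comparable sites in this replicate
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy aligned sequences (hypothetical, not taken from the study)
alignment = {"seqA": "MKVLG-YEK", "seqB": "MKILGAYDK", "seqC": "MRVLGAFEK"}

rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
observed = p_distance(alignment["seqA"], alignment["seqB"])
print("observed p-distance seqA vs seqB:", observed)
```

A tree would be rebuilt from each replicate, and the support for a clade is the fraction of replicate trees in which that clade appears.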

2.3 Sequence Analysis and Structural Characterization

The Compute pI/Mw tool at the Expert Protein Analysis System (ExPASy) site (https://web.expasy.org/compute_pi/) was used to calculate coding sequence (CDS) length, molecular weight (MW) and isoelectric point (pI). Subcellular localizations were predicted with the online software BUSCA (http://busca.biocomp.unibo.it/). We used MEME (http://meme-suite.org/tools/meme) to analyze the motifs of CDK and Cyclin proteins with the following parameters: minimum motif width, 6; maximum motif width, 50; and number of motifs, 10. ITOL was used to visualize the results. The exon-intron structures of CDKs and Cyclins were obtained with the GSDS software (Hu et al., 2015) and shown in ITOL. To further analyze CDK9 genes, we used CLUSTAL W (https://www.genome.jp/tools-bin/clustalw) to align the CDK9 genes from the seven mollusc species, human (Has-) and purple sea urchin (Spu-). The sequence alignment was exported into ESPript 3.0 (http://espript.ibcp.fr/ESPript/ESPript/). Structural features were described with reference to human CDK9 (extracted from the CDK9-Cyclin T complex, PDB code: 3TN9).
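As a rough illustration of what a molecular weight calculation involves, the sketch below sums average residue masses and adds one water molecule for the free termini. It is an approximation under stated assumptions, not the ExPASy tool itself: the residue masses are standard average values, the function name is ours, and the peptide is hypothetical.

```python
# Average masses (Da) of amino acid residues, i.e. free amino acid minus water
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-terminal groups


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified protein."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


peptide = "MKVLGAYEK"  # hypothetical peptide, not from the study
print(f"{molecular_weight(peptide) / 1000:.2f} kDa")
```

Values computed this way may differ slightly from those of a dedicated tool, which can use different mass tables or account for modifications.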

3 Results

3.1 Identification of CDK and Cyclin Genes

As summarized in Table 1, a total of 209 genes were identified in the seven molluscs, including 95 genes of the CDK family and 114 genes of the Cyclin family. The number of CDK genes per species ranged from 13 to 15, while the number of Cyclin genes varied from 13 to 21. The composition of CDK and Cyclin family members in each species is shown in Figs.1 and 2.

Table 1 Distribution of CDK and Cyclin family proteins in representative species of Mollusca

Fig.1 Schematic representation of the distribution of different CDK family members in mollusc species. A black dot indicates the presence of clear homologs of CDK family members. Phylogenetic relationships of these organisms are derived from COI genes using MEGA 7.0 with the neighbor-joining method.

Information on the CDK and Cyclin family genes of the seven species is provided in Table 2 and Table 3, including name, identifier (ID), number of amino acids (aa), isoelectric point (pI), molecular weight (MW) and subcellular localization. The length of the proteins encoded by the CDK genes ranges from 207 to 1538aa, with predicted MW ranging from 23.57 to 172.26kD. The length of the Cyclin proteins varies from 119 to 1162aa, with predicted MW varying from 13.48 to 128.23kD. In one of the seven mollusc species, the average lengths of CDK and Cyclin proteins are far shorter than those in the other species. For the CDK family, the pI was between 5.2 and 9.65, with an average of 7.94. Overall, 72% of the CDK family proteins had a pI above 7, suggesting that these proteins are rich in basic amino acids. For the Cyclin family, the pI was between 4.56 and 10.21, with an average of 7.03. Among the seven molluscs, Cyclin proteins are rich in acidic amino acids in some species and rich in basic amino acids in others.
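The pI summary above can be reproduced for a single species with a few lines of code. A protein with a pI above 7 carries a net positive charge at neutral pH, which points to an excess of basic residues. The sketch uses the Crassostrea gigas CDK pI values listed in Table 2; the shortened gene labels are ours.

```python
# pI values of the Crassostrea gigas CDK proteins as listed in Table 2
cdk_pi = {
    "CDK1": 8.77, "CDK2": 6.39, "CDK5-like": 7.08, "CDK6": 5.56, "CDK7": 8.90,
    "CDK8": 8.87, "CDK9": 9.29, "CDK10": 8.34, "CDK11B": 5.66, "CDK12": 9.65,
    "CDK14": 9.10, "CDK17": 8.57, "CDK20": 6.54,
}

# pI > 7: net positive charge at neutral pH, i.e. relatively rich in basic residues
basic = sorted(name for name, pi in cdk_pi.items() if pi > 7)
mean_pi = sum(cdk_pi.values()) / len(cdk_pi)
fraction_basic = len(basic) / len(cdk_pi)

print(f"mean pI = {mean_pi:.2f}")
print(f"{len(basic)} of {len(cdk_pi)} proteins have pI > 7 ({fraction_basic:.0%})")
```

The same tally over all 95 CDK proteins would be needed to recover the family-wide figures quoted in the text.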

Fig.2 Schematic representation of the distribution of different Cyclin family members in mollusc species. A black dot indicates the presence of clear homologs of Cyclin family members. Phylogenetic relationships of these organisms are derived from COI genes using MEGA 7.0 with the neighbor-joining method.

Table 2 The bioinformation of the CDK family genes in seven mollusc species

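Species | Gene name | Gene ID | aa | MW (Da) | pI | SL
Biomphalaria glabrata | Cyclin-dependent kinase 20 like | LOC106077158 | 345 | 39103.31 | 6.40 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 1 | LOC105346130 | 302 | 34767.45 | 8.77 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 2 | LOC105325500 | 273 | 31119.97 | 6.39 | Cytoplasm
Crassostrea gigas | Cyclin-dependent-like kinase 5 | LOC105344147 | 307 | 35163.58 | 7.08 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 6 | LOC105327718 | 331 | 37997.29 | 5.56 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 7 | LOC105317788 | 341 | 38673.04 | 8.90 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 8 | LOC105330051 | 439 | 50809.92 | 8.87 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 9 | LOC105331277 | 401 | 45542.43 | 9.29 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 10 | LOC105319320 | 379 | 43230.11 | 8.34 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 11B | LOC105335549 | 805 | 93715.25 | 5.66 | Nucleus
Crassostrea gigas | Cyclin-dependent kinase 12 | LOC105347499 | 1254 | 140359.71 | 9.65 | Nucleus
Crassostrea gigas | Cyclin-dependent kinase 14 | LOC105345433 | 563 | 63876.56 | 9.10 | Cytoplasm
Crassostrea gigas | Cyclin-dependent kinase 17 | LOC105323040 | 487 | 55235.06 | 8.57 | Nucleus
Crassostrea gigas | Cyclin-dependent kinase 20 | LOC105329436 | 343 | 38937.04 | 6.54 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20242291 | 255 | 29038.46 | 6.08 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20241075 | 345 | 39290.22 | 6.01 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20244259 | 469 | 53223.91 | 8.75 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20234475 | 315 | 35823.18 | 8.30 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20234150 | 389 | 45244.09 | 8.63 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20239796 | 329 | 37878.99 | 8.89 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20247146 | 385 | 44433.34 | 8.05 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20245965 | 355 | 40991.65 | 9.32 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20244620 | 370 | 42831.72 | 9.10 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20246230 | 421 | 48908.21 | 8.82 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20246046 | 338 | 38491.45 | 7.01 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20246924 | 357 | 40843.33 | 6.65 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20245758 | 300 | 40843.33 | 7.73 | Cytoplasm
Lottia gigantea | Hypothetical protein | GeneID 20248236 | 302 | 34804.35 | 8.26 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 1 like | LOC110448550 | 304 | 35107.79 | 8.58 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 2 like | LOC110446820 | 209 | 23567.31 | 8.88 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent-like kinase 5 | LOC110443310 | 306 | 35121.21 | 6.72 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 6 like | LOC110452126 | 334 | 37547.70 | 5.25 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 7 like | LOC110443310 | 340 | 38786.85 | 7.16 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 8 like | LOC110456883 | 462 | 53195.86 | 8.87 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 9 like | LOC110455838 | 396 | 45303.21 | 9.19 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 9 like | LOC110441367 | 406 | 46812.90 | 9.22 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 10 like | LOC110443388 | 381 | 43332.29 | 8.42 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 11B like | LOC110458839 | 786 | 91502.18 | 5.28 | Nucleus
Mizuhopecten yessoensis | Cyclin-dependent kinase 12 like | LOC110465128 | 1509 | 168036.26 | 9.43 | Nucleus
Mizuhopecten yessoensis | Cyclin-dependent kinase 14 like | LOC110460375 | 388 | 43443.78 | 6.90 | Cytoplasm
Mizuhopecten yessoensis | Cyclin-dependent kinase 17 like | LOC110448834 | 462 | 52254.92 | 8.03 | Nucleus
Mizuhopecten yessoensis | Cyclin-dependent kinase 20 like | LOC110458038 | 343 | 38809.74 | 6.13 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 1 like | LOC106880228 | 304 | 35059.48 | 7.09 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 2 like | LOC106876684 | 277 | 31959.04 | 6.76 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent-like kinase 5 | LOC106880362 | 296 | 33712.73 | 7.59 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 6 like | LOC106881146 | 281 | 32143.16 | 7.63 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 7 like | LOC106883163 | 284 | 32319.45 | 6.01 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 8 like | LOC106878348 | 461 | 52741.77 | 9.05 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 9 like | LOC106871831 | 366 | 42452.25 | 9.17 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 10 like | LOC106884424 | 378 | 43703.84 | 8.87 | Cytoplasm
Octopus bimaculoides | Cyclin-dependent kinase 11B like | LOC106872191 | 819 | 95046.49 | 6.18 | Nucleus
Octopus bimaculoides | Cyclin-dependent kinase 12 like | LOC106872449 | 1538 | 172264.59 | 9.43 | Nucleus
Octopus bimaculoides | Cyclin-dependent kinase 14 like | LOC106877593 | 511 | 57441.61 | 8.73 | Nucleus
Octopus bimaculoides | Cyclin-dependent kinase 17 like | LOC106879770 | 485 | 55107.16 | 8.54 | Nucleus
Octopus bimaculoides | Cyclin-dependent kinase 20 like | LOC106883147 | 277 | 31332.10 | 7.66 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 1 like | LOC112558173 | 302 | 34856.67 | 8.90 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 2 like | LOC112556656 | 299 | 34252.60 | 8.34 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent-like kinase 5 | LOC112564382 | 296 | 34090.17 | 6.46 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 6 like | LOC112556232 | 352 | 40137.83 | 6.35 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 7 like | LOC112553685 | 347 | 39426.77 | 9.08 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 8 like | LOC112566334 | 461 | 53007.09 | 9.34 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 9 like | LOC112574862 | 374 | 43554.36 | 9.09 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 9 like | LOC112575238 | 365 | 42398.45 | 9.17 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 10 like | LOC112576905 | 385 | 44234.19 | 9.07 | Cytoplasm
Pomacea canaliculata | Cyclin-dependent kinase 11B like | LOC112556334 | 876 | 102025.48 | 6.15 | Endomembrane system
Pomacea canaliculata | Cyclin-dependent kinase 12 like | LOC112563814 | 1366 | 151859.73 | 9.37 | Nucleus
Pomacea canaliculata | Cyclin-dependent kinase 14 like | LOC112569616 | 518 | 58116.28 | 9.22 | Nucleus
Pomacea canaliculata | Cyclin-dependent kinase 17 like | LOC112571428 | 474 | 53459.24 | 8.76 | Nucleus
Pomacea canaliculata | Cyclin-dependent kinase 20 like | LOC112571983 | 348 | 39541.86 | 6.79 | Cytoplasm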

Table 3 The bioinformation of the Cyclin family genes in seven mollusc species

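Species | Gene name | Gene ID | aa | MW (Da) | pI | SL
Lottia gigantea | Hypothetical protein | GeneID 20246422 | 268 | 30977.27 | 6.39 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20246595 | 327 | 37778.55 | 8.30 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20231798 | 259 | 30480.69 | 6.03 | Plasma membrane
Lottia gigantea | Hypothetical protein | GeneID 20244950 | 283 | 32262.58 | 7.68 | Nucleus
Lottia gigantea | Hypothetical protein | GeneID 20240526 | 246 | 28417.69 | 8.23 | Organelle membrane
Lottia gigantea | Hypothetical protein | GeneID 20251369 | 119 | 13482.71 | 7.11 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20230294 | 597 | 67949.11 | 6.74 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20245912 | 416 | 47791.65 | 5.57 | Nucleus
Lottia gigantea | Hypothetical protein | GeneID 20244919 | 284 | 32306.59 | 5.20 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20246312 | 293 | 34071.35 | 8.03 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20247052 | 280 | 32927.36 | 5.85 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20239855 | 373 | 42369.17 | 6.72 | Organelle membrane
Lottia gigantea | Hypothetical protein | GeneID 20244560 | 246 | 27994.84 | 5.68 | Endomembrane system
Lottia gigantea | Hypothetical protein | GeneID 20247260 | 416 | 47048.34 | 5.34 | Plasma membrane
Mizuhopecten yessoensis | Cyclin A like | LOC110457210 | 426 | 48319.24 | 5.73 | Endomembrane system
Mizuhopecten yessoensis | Cyclin B2 like | LOC110459617 | 120 | 13902.52 | 4.70 | Nucleus
Mizuhopecten yessoensis | Cyclin B3 like | LOC110441995 | 470 | 53347.17 | 8.75 | Endomembrane system
Mizuhopecten yessoensis | Cyclin B like | LOC110443637 | 460 | 51643.46 | 7.64 | Organelle membrane
Mizuhopecten yessoensis | Cyclin C like | LOC110462430 | 207 | 24651.43 | 6.21 | Endomembrane system
Mizuhopecten yessoensis | Cyclin D2 like | LOC110454764 | 293 | 33657.09 | 5.15 | Plasma membrane
Mizuhopecten yessoensis | Cyclin D2 like | LOC110444575 | 143 | 16823.79 | 4.78 | Nucleus
Mizuhopecten yessoensis | Cyclin E like | LOC110443792 | 438 | 49996.98 | 5.71 | Nucleus
Mizuhopecten yessoensis | Cyclin F like | LOC110461504 | 794 | 89574.03 | 6.31 | Plasma membrane
Mizuhopecten yessoensis | Cyclin G1 like | LOC110441487 | 370 | 42327.8 | 5.36 | Endomembrane system
Mizuhopecten yessoensis | Cyclin H like | LOC110451027 | 327 | 37828.79 | 8.57 | Organelle membrane
Mizuhopecten yessoensis | Cyclin I like | LOC110441488 | 336 | 38617.73 | 6.40 | Nucleus
Mizuhopecten yessoensis | Cyclin K like | LOC110446434 | 563 | 61970.42 | 9.06 | Plasma membrane
Mizuhopecten yessoensis | Cyclin L1 like | LOC110458915 | 472 | 55269.27 | 10.2 | Organelle membrane
Mizuhopecten yessoensis | Cyclin O like | LOC110455988 | 462 | 51200.11 | 4.56 | Plasma membrane
Mizuhopecten yessoensis | Cyclin T2 like | LOC110443050 | 821 | 91190.45 | 9.19 | Plasma membrane
Mizuhopecten yessoensis | Cyclin Y like | LOC110452085 | 354 | 40390.85 | 6.42 | Nucleus
Mizuhopecten yessoensis | FAM58A like | LOC110464947 | 232 | 27445.67 | 6.55 | Endomembrane system
Octopus bimaculoides | Cyclin A3-1 | LOC106867971 | 401 | 45996.09 | 7.47 | Plasma membrane
Octopus bimaculoides | Cyclin A like | LOC106879537 | 457 | 52092.23 | 5.72 | Organelle membrane
Octopus bimaculoides | Cyclin B3 like | LOC106882904 | 420 | 48814.73 | 7.94 | Organelle membrane
Octopus bimaculoides | Cyclin B like | LOC106879992 | 384 | 43863.63 | 9.03 | Organelle membrane
Octopus bimaculoides | Cyclin C like | LOC106867551 | 290 | 33916.23 | 6.45 | Endomembrane system
Octopus bimaculoides | Cyclin D2 like | LOC106878898 | 286 | 32866.25 | 5.15 | Endomembrane system
Octopus bimaculoides | Cyclin E like | LOC106874854 | 436 | 49233.78 | 6.47 | Nucleus
Octopus bimaculoides | Cyclin F like | LOC106883185 | 780 | 87353.83 | 6.76 | Plasma membrane
Octopus bimaculoides | Cyclin H like | LOC106879470 | 313 | 36671.25 | 5.83 | Endomembrane system
Octopus bimaculoides | Cyclin I like | LOC106881559 | 328 | 38109.24 | 8.18 | Endomembrane system
Octopus bimaculoides | Cyclin K like | LOC106871018 | 674 | 74331.63 | 9.06 | Plasma membrane
Octopus bimaculoides | Cyclin L1 like | LOC106882358 | 478 | 56177.31 | 10.07 | Organelle membrane
Octopus bimaculoides | Cyclin O | LOC106872739 | 360 | 42281.57 | 7.93 | Plasma membrane
Octopus bimaculoides | Cyclin T1 like | LOC106879624 | 604 | 66781.68 | 9.33 | Plasma membrane
Octopus bimaculoides | Cyclin Y like | LOC106884459 | 370 | 42075.54 | 8.33 | Nucleus
Octopus bimaculoides | FAM58A like | LOC106879276 | 230 | 27006.26 | 8.92 | Endomembrane system
Pomacea canaliculata | Cyclin A like | LOC112560487 | 433 | 48959.77 | 5.17 | Organelle membrane
Pomacea canaliculata | Cyclin B3 like | LOC112555046 | 332 | 38130 | 6.48 | Endomembrane system
Pomacea canaliculata | Cyclin B like | LOC112555595 | 314 | 35548.19 | 5.29 | Endomembrane system
Pomacea canaliculata | Cyclin B like | LOC112553650 | 461 | 51222.91 | 8.68 | Organelle membrane
Pomacea canaliculata | Cyclin C like | LOC112554859 | 299 | 34726.07 | 6.34 | Plasma membrane
Pomacea canaliculata | Cyclin D2 like | LOC112566123 | 297 | 33617.77 | 5.05 | Endomembrane system
Pomacea canaliculata | Cyclin E2 like | LOC112577049 | 467 | 53680.42 | 5.92 | Endomembrane system
Pomacea canaliculata | Cyclin F like | LOC112564092 | 763 | 84712.55 | 6.36 | Plasma membrane
Pomacea canaliculata | Cyclin H like | LOC112557535 | 324 | 38457.24 | 6.07 | Organelle membrane
Pomacea canaliculata | Cyclin I like | LOC112559393 | 328 | 37207.27 | 6.64 | Endomembrane system
Pomacea canaliculata | Cyclin K like | LOC112570940 | 570 | 62518.29 | 9.09 | Plasma membrane
Pomacea canaliculata | Cyclin L1 like | LOC112556543 | 490 | 56800.93 | 10.21 | Endomembrane system
Pomacea canaliculata | Cyclin Q like | LOC112572637 | 232 | 26812.06 | 6.86 | Endomembrane system
Pomacea canaliculata | Cyclin T1 like | LOC112558079 | 826 | 92553.59 | 8.68 | Endomembrane system
Pomacea canaliculata | Cyclin Y like | LOC112556592 | 375 | 43055.82 | 6.81 | Cytoplasm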

3.2 Phylogenetic Analysis and Classification of the CDK and Cyclin Genes

To analyze the characteristics of molluscan CDK and Cyclin proteins, and to examine the CDK and Cyclin genes of molluscs and other representative animals in evolutionary terms, a phylogenetic tree was constructed using the CDK proteins from the seven molluscs, human, vase tunicate, fruit fly, starlet sea anemone, purple sea urchin and zebrafish. As shown in Fig.3, the CDK family clustered into eight groups: CDK1, CDK4/6, CDK5, CDK7, CDK20, CDK8, CDK9 and CDK10/11. Because Cyclin sequences have diverged greatly, a reliable Cyclin phylogenetic tree covering molluscs and all the other animals could not be obtained as it was for the CDKs. We therefore constructed an ML tree from the seven mollusc species, human, purple sea urchin, and a subset of genes from vase tunicate and fruit fly (Fig.4). According to the phylogenetic tree, the Cyclin family in molluscs could be divided into 15 groups: Cyclin A, Cyclin B, Cyclin C, Cyclin D, Cyclin E, Cyclin F, Cyclin G/I, Cyclin H, Cyclin J, Cyclin K, Cyclin L, Cyclin O, Cyclin Q, Cyclin T and Cyclin Y (Fig.4). The Cyclin B subfamily has the largest number of members in Mollusca (two Cyclin B-like genes and one Cyclin B3 gene).

3.3 CDK and Cyclin Gene Structure and Conserved Motifs

To further interpret the structural diversity of CDK and Cyclin proteins, the gene structures and conserved motifs were analyzed (Fig.5 and Fig.6). The structures of the CDK and Cyclin genes were moderately conserved among the various subfamilies, and the number and location of exons were similar within each subfamily, indicating similar function. Among the CDK genes, the members of one group carried the most introns, 19 to 20, except for one member with seven; the members of another group carried the fewest, six to nine. In the Cyclin family, the Cyclin F subfamily showed the most introns, 14 to 17, except for one member with eight; the Cyclin D subfamily showed the fewest, four to eight. In general, these results indicate that the CDKs and Cyclins in each group possess a similar number of exons, which further supports the evolutionary classification. The alignment of all CDK9 proteins from the seven molluscs is shown in Fig.7. Multiple sequence alignment of the CDK9 proteins revealed a highly conserved CDK domain.

Fig.3 Phylogenetic relationships of CDK genes in Mollusca and several representative metazoans, including human, vase tunicate, fruit fly, starlet sea anemone, purple sea urchin and zebrafish. The phylogenetic tree is constructed using the neighbor-joining (NJ) method with 1000 bootstrap replicates. The bootstrap values are represented by various colors. The CDK genes of Mollusca are marked in blue and those of other organisms in black. All proteins are labeled with species names followed by accession numbers.

Fig.4 Phylogenetic relationships of Cyclin genes in Mollusca and several representative metazoans, including human, purple sea urchin, and a subset of genes from vase tunicate and fruit fly. The phylogenetic tree is constructed using the maximum likelihood (ML) method with 1000 bootstrap replicates. The bootstrap values are represented by various colors. The Cyclin genes of Mollusca are marked in blue and those of other organisms in black. All proteins are labeled with species names followed by accession numbers.

Motifs, which are consensus or conserved regions in protein or nucleotide sequences, were also analyzed in this study (Fig.5 and Fig.6). In total, 10 conserved motifs of molluscan CDKs and Cyclins were identified using MEME. The length of these motifs varied from 15 to 29aa in the CDK family and from 15 to 41aa in the Cyclin family. For Cyclin genes, motifs 1, 3 and 4 were identified as N-terminal domains of Cyclin, while motifs 5 and 9 were identified as C-terminal domains of Cyclin. For the CDK family, motifs 1, 2, 3, 6 and 7 were identified as protein kinase domains. Motif 2 (QLLRGJAYCHSNRILHRDLKPQNJLI) and motif 5 (DQLDRIFKVLGTPTEETWPGV) were common to all seven genomes, except CDK16 in one species. Taken together, the similar gene structures and conserved motifs within each subfamily further support the accuracy of the phylogenetic tree. On the other hand, the structural differences between subfamilies also indicate functional diversity of the CDK and Cyclin genes in Mollusca.

4 Discussion

The cell cycle is controlled by regulatory units, the cyclins, together with catalytic units, the cyclin-dependent kinases. With the evolution of eukaryotes, the number of CDKs and Cyclins increased (Gunbin et al., 2011; Cao et al., 2014). For example, one species contains 6 CDKs and 15 Cyclins, whereas in human the gene number is up to 20 (CDKs) and 29 (Cyclins) (Malumbres and Barbacid, 2005; Malumbres, 2014). However, investigation of cell cycle regulation in eukaryotes is limited, especially in Mollusca. Our work represents the first genome-wide identification of CDK and Cyclin family members in molluscs and provides insights into their molecular evolution. In our study, we identified 95 genes of the CDK family and 114 genes of the Cyclin family from the seven molluscs. The number of CDK genes ranged from 13 to 15, and the number of Cyclin genes varied from 13 to 21, which is consistent with the evolutionary trend of CDKs and Cyclins in premetazoan lineages. Additionally, the number and composition of the CDK genes were more stable than those of the Cyclin genes in the seven mollusc species, suggesting that the CDK family is more conserved than the Cyclin family and that members of the Cyclin family might function in a species-specific manner.

Fig.5 Phylogenetic relationships, gene structure and architecture of conserved protein motifs in CDK genes from seven mollusc species. (A) The phylogenetic tree is constructed based on the conserved structure of the CDK proteins of the seven mollusc species using MEGA 7.0 software. (B) Exons and introns of CDK genes. Blue boxes indicate untranslated regions; red boxes indicate exons; and black lines indicate introns. (C) The motif composition of CDK proteins. The motifs are displayed in boxes with different colors. The length of the protein can be estimated using the scale at the bottom.

4.1 The Features of CDK Family in Molluscs

Based on our results, CDK genes in the seven molluscs are highly conserved. The number of CDK family members is stable, ranging from 13 to 15, and the members can be divided into eight subfamilies, which is consistent with previous reports (Liu and Kipreos, 2000; Guo and Stiller, 2004). The eight subfamilies can be classified into three cell-cycle-related subfamilies (CDK1/2, CDK4/6 and CDK5) and five transcriptional subfamilies (CDK7, CDK8, CDK9, CDK10/11 and CDK20) according to their functional characteristics (Liu and Kipreos, 2000; Cao et al., 2014). Cell-cycle-related CDK subfamilies bind Cyclin A, Cyclin B, Cyclin D and Cyclin E to promote each phase of the cell cycle. The different functions and structures of these CDK subfamilies have been described in an excellent review (Wood and Endicott, 2018).

According to the phylogenetic tree, CDK5 is the most diverse subfamily, with three clades: CDK5, CDK16/17/18 and CDK14/15. The CDK5 subfamily differs greatly among metazoans (Mikolcevic et al., 2012), but in the seven mollusc species it is more conserved in terms of gene composition, phylogeny and motifs. CDK14 and CDK15 were both detected in vertebrates (Mikolcevic et al., 2012), but in the seven mollusc species only CDK14 was identified. CDK16/17/18 are also named PCTAIRE 1/2/3 (PCTK1/2/3); they contain a PCTAIRE sequence in the C-helix and are characterized by a conserved catalytic domain. In mammals, CDK16, CDK17 and CDK18 are expressed in neurons, suggesting that they play an essential role in the nervous system (Hirose et al., 1997; Herskovits and Davies, 2006; Shimizu et al., 2014). At present, PCTAIRE kinases are being studied as potential new targets for cancer treatment (Dixon-Clarke et al., 2017; Wang et al., 2018). In this study, only one gene of the CDK16/17/18 group, CDK17, was found in six mollusc species, while two genes, CDK16 and CDK17, were found in the remaining species. That species has large neurons and is suited to neurobiological study, so CDK16 and CDK17 might play important roles in it. CDK17 was highly conserved among the vertebrates, and most eumetazoans contain only CDK17 (Mikolcevic et al., 2012). It has therefore been suggested that CDK17 might have originated earlier than CDK16/18. Our results seem to provide additional evidence to support this scenario.

Except for one subfamily, the other seven subfamilies of CDK genes have only one duplication in the seven mollusc species. Gene structure and motif analyses of these seven CDK subfamilies found no large differences in exon number, multiple copies or homologous genes, which further highlights that these seven types of CDK appear to be widely conserved among mollusc species. The CDK9 subfamily consists of two clades, CDK9 and CDK12/13 (Liu and Kipreos, 2000). The CDK9 subfamily has been inferred to have split into two clades before the divergence of metazoans and fungi (Cao et al., 2014). In our results, all seven molluscs contain CDK9 and CDK12/13. CDK12 and CDK13 are both CTD kinases with similar functions (Kondrashov, 2012; Zhang et al., 2016). In this study, they were not detected together in the same species, which suggests that they substitute for each other. Notably, two copies of the CDK9 gene are present in three of the species. This is the first time that duplication of CDK9 genes has been identified in animal genomes. According to the phylogenetic analysis, the two copies within a species do not cluster tightly together, but instead cluster with the CDK9 genes of other species (Fig.3). Gene duplication has been the main cause of expansion of various gene families, and is associated with the adaptation of animals to changing environments (Kondrashov, 2012). As a subunit of the positive transcription elongation factor b complex, CDK9 regulates transcription elongation in cooperation with Cyclin T. It also forms a complex with Cyclin K to regulate DNA damage signaling in replicating cells and to support recovery from transient replication arrest (Yu et al., 2010). CDK9 is a multifunctional kinase involved in a broad range of physiological processes, including myogenesis, cell growth, cellular viability and apoptosis (De Falco and Giordano, 1998; Franco et al., 2018). So far, no study of CDK9 genes has been reported in molluscs, and our understanding of CDK9 functions is limited. Our results suggest that these three species might be good materials for exploring CDK9 function and evolution.

4.2 The Features of Cyclin Family in Molluscs

The 114 Cyclin family genes identified in this study can be divided into three major groups (Group I, Group II and Group III), which is consistent with a previous study (Ma et al., 2013). According to our results, Group II includes Cyclin Y, while the remaining Cyclin subfamilies fall into Group I and Group III. All types of Cyclin genes discovered in metazoans were also detected in the seven mollusc species. Cyclins A, B, D and E cooperate with CDK1 and CDK4/6 to regulate the cell cycle directly (Malumbres, 2014). It should be noted that Cyclin B is the largest subfamily in the seven mollusc species, with an average of three genes per species: two Cyclin B-like genes and one Cyclin B3 gene, which fall into three separate clades. It is well known that Cyclin B, in partnership with CDK1, drives the G2-M transition in mitosis. The number of Cyclin B genes differs among species: for example, there are three Cyclin B genes in human, mouse and zebrafish; two in purple sea urchin, cattle and dog; and one in Rhesus monkey, zebra finch and Florida lancelet (Gunbin et al., 2011; Cao et al., 2014). Cyclin B in invertebrates evolved both rapidly and at uneven rates (Gunbin et al., 2011). In most animals, two conserved B-type cyclins are detected: Cyclin B-like protein and Cyclin B3 (Nieduszynski et al., 2002). Cyclin B3 is more important for the regulation of meiosis than of mitosis and is relatively conserved in vertebrates and invertebrates (Nguyen et al., 2002; van der Voet et al., 2009; Miles et al., 2010). For Cyclin B-like protein, some organisms such as human have two genes, Cyclin B1 and Cyclin B2, which can compensate for each other in function (Chotiner et al., 2019). In our study, two Cyclin B-like genes were detected. Based on the phylogenetic analysis, the two Cyclin B-like genes clustered into two distinct clades, one of which clustered with Cyclin B1 and Cyclin B2 of human, suggesting that this Cyclin B-like protein has high homology with human Cyclin B1 and Cyclin B2, while the other may carry out functions specific to molluscs. Mollusca is the second largest animal phylum, with more than 100000 species widely distributed in lakes, marshes, oceans, mountains and other environments; in adapting to different habitats, the various groups have evolved very different morphological structures and lifestyles. A distinct Cyclin B-like gene is perhaps a developmental strategy for adapting to their changeable living environments (Nieduszynski et al., 2002; Gunbin et al., 2011).

Except for the cell-cycle-regulating genes (,,and),,,andin the seven mollusc species are relatively conserved, without gene duplication and with similar gene structures and motifs. There is little difference in Cyclin family composition between classes, except for. Compared with other gastropods, the Cyclin family ofis more similar to that of bivalves, especially. This may be because bothandlive in the intertidal zone and face a complex and variable environment. Additionally, a recent study suggestedshould not be part of the Cyclin family as it has no characteristic Lys-Glu pair (Quandt et al., 2020). In this study, we still analyzedas it consists of two typical cyclin domains, Cyclin N and Cyclin C. In addition,andwere not discovered in. However, according to recent studies,andare conserved in metazoan species and are specific to animals (Ma et al., 2013; Cao et al., 2014). We found partial sequence ofingenome. The lack of wholeandgenes was possibly because of genome incompleteness (N50: 7298 bp; L50: 32153 bp).
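The assembly statistics quoted above can be illustrated with a minimal sketch. It assumes the commonly used definitions, in which N50 is the length of the shortest contig among the longest contigs that together cover at least half of the assembly, and L50 is the number of those contigs. Naming conventions differ between tools, so the figures reported in this study may follow a different convention. The function name `n50_l50` and the example lengths are hypothetical and are not taken from the study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths in bp.

    Assumed definitions (conventions vary between tools):
      N50: length of the contig at which the running total of lengths,
           taken from longest to shortest, first reaches half the assembly.
      L50: number of contigs needed to reach that point.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy assembly: total 300 bp, so half is 150 bp.
example_lengths = [100, 80, 50, 30, 20, 10, 10]
print(n50_l50(example_lengths))  # (80, 2)
```

A low N50 indicates that the assembly is split into many short pieces, which makes it more likely that a gene spans a break between contigs and is recovered only as a partial sequence.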

Acknowledgements

This study was supported by grants from the National Natural Science Foundation of China (No. 31672649) and the National Key R&D Program of China (No. 2018YFD0900200).

Abele, D., Brey, T., and Philipp, E., 2009. Bivalve models of aging and the determination of molluscan lifespans., 44 (5):307-315, DOI: 10.1016/j.exger.2009.02.012.

Albertin, C. B., Simakov, O., Mitros, T., Wang, Z. Y., Pungor, J. R., Edsinger-Gonzales, E., et al., 2015. The octopus genome and the evolution of cephalopod neural and morphological novelties., 524 (7564):220-224, DOI: 10.1038/nature14668.

Beach, D., Durkacz, B., and Nurse, P., 1982. Functionally homologous cell cycle control genes in budding and fission yeast., 300 (5894):706-709, DOI: 10.1038/300706a0.

Bodnar, A., 2009. Marine invertebrates as models for aging research., 44 (8):477-484, DOI: 10.1016/j.exger.2009.05.001.

Boxem, M., 2006. Cyclin-dependent kinases in., 1:1-12, DOI: 10.1186/1747-1028-1-6.

Cao, L., Chen, F., Yang, X., Xu, W., Xie, J., and Yu, L., 2014. Phylogenetic analysis of CDK and cyclin proteins in premetazoan lineages., 14 (1):1-16, DOI: 10.1186/1471-2148-14-10.

Chotiner, J. Y., Wolgemuth, D. J., and Wang, P. J., 2019. Functions of cyclins and CDKs in mammalian gametogenesis., 101 (3):591-601, DOI: 10.1093/biolre/ioz070.

De Falco, G., and Giordano, A., 1998. CDK9 (PITALRE): A multifunctional cdc2-related kinase., 177 (4):501-506, DOI: 10.1002/(SICI)1097-4652(199812)177:4<501::AID-JCP1>3.0.CO;2-4.

Dixon-Clarke, S. E., Shehata, S. N., Krojer, T., Sharpe, T. D., von Delft, F., Sakamoto, K., et al., 2017. Structure and inhibitor specificity of the PCTAIRE-family kinase CDK16., 474:699-713, DOI: 10.1042/BCJ20160941.

Evans, T., Rosenthal, E. T., Youngblom, J., Distel, D., and Hunt, T., 1983. Cyclin: A protein specified by maternal mRNA in sea urchin eggs that is destroyed at each cleavage division., 33 (2):389-396, DOI: 10.1016/0092-8674(83)90420-8.

Franco, L. C., Morales, F., Boffo, S., and Giordano, A., 2018. CDK9: A key player in cancer and other diseases., 119 (2):1273-1284, DOI: 10.1002/jcb.26293.

Gunbin, K. V., Suslov, V. V., Turnaev, I. I., Afonnikov, D. A., and Kolchanov, N. A., 2011. Molecular evolution of cyclin proteins in animals and fungi., 11 (1):224, DOI: 10.1186/1471-2148-11-224.

Guo, Z., and Stiller, J. W., 2004. Comparative genomics of cyclin-dependent kinases suggest co-evolution of the RNAP II C-terminal domain and CTD-directed CDKs., 5 (1):69, DOI: 10.1186/1471-2164-5-69.

Herskovits, A., and Davies, P., 2006. The regulation of tau phosphorylation by PCTAIRE 3: Implications for the pathogenesis of Alzheimer’s disease., 23 (2):398-408, DOI: 10.1016/j.nbd.2006.04.004.

Hirose, T., Tamaru, T., Okumura, N., Nagai, K., and Okada, M., 1997. PCTAIRE 2, a Cdc2-related serine/threonine kinase, is predominantly expressed in terminally differentiated neurons., 249 (2):481-488, DOI: 10.1111/j.1432-1033.1997.t01-1-00481.x.

Hu, B., Jin, J., Guo, A. Y., Zhang, H., Luo, J., and Gao, G., 2015. GSDS 2.0: An upgraded gene feature visualization server., 31 (8):1296-1297, DOI: 10.1093/bioinformatics/btu817.

Hydbring, P., Malumbres, M., and Sicinski, P., 2016. Non-canonical functions of cell cycle cyclins and cyclin-dependent kinases., 17 (5):280-292, DOI: 10.1038/nrm.2016.27.

Johnson, D., and Walker, C., 1999. Cyclins and cell cycle checkpoints., 39 (1):295-312, DOI: 10.1146/annurev.pharmtox.39.1.295.

Katoh, K., and Standley, D. M., 2013. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability., 30 (4):772-780, DOI: 10.1093/molbev/mst010.

Kondrashov, F. A., 2012. Gene duplication as a mechanism of genomic adaptation to a changing environment., 279 (1749):5048-5057, DOI: 10.1098/rspb.2012.1108.

La, H., Li, J., Ji, Z., Cheng, Y., Li, X., Jiang, S., et al., 2006. Genome-wide analysis of cyclin family in rice (L.)., 275 (4):374-386, DOI: 10.1007/s00438-005-0093-5.

Liu, J., and Kipreos, E. T., 2000. Evolution of cyclin-dependent kinases (CDKs) and CDK-activating kinases (CAKs): Differential conservation of CAKs in yeast and metazoa., 17 (7):1061-1074, DOI: 10.1093/oxfordjournals.molbev.a026387.

Ma, Z., Wu, Y., Jin, J., Yan, J., Kuang, S., Zhou, M., et al., 2013. Phylogenetic analysis reveals the evolution and diversification of cyclins in eukaryotes., 66 (3):1002-1010, DOI: 10.1016/j.ympev.2012.12.007.

Malumbres, M., 2011. Physiological relevance of cell cycle kinases., 91 (3):973-1007, DOI: 10.1152/physrev.00025.2010.

Malumbres, M., 2014. Cyclin-dependent kinases., 15 (6):1-10, DOI: 10.1186/gb4184.

Malumbres, M., and Barbacid, M., 2005. Mammalian cyclin-dependent kinases., 30 (11):630-641, DOI: 10.1016/j.tibs.2005.09.005.

Mikolcevic, P., Rainer, J., and Geley, S., 2012. Orphan kinases turn eccentric: A new class of cyclin Y-activated, membrane-targeted CDKs., 11 (20):3758-3768, DOI: 10.4161/cc.21592.

Miles, D. C., van den Bergen, J. A., Sinclair, A. H., and Western, P. S., 2010. Regulation of the female mouse germ cell cycle during entry into meiosis., 9 (2):408-418, DOI: 10.4161/cc.9.2.10691.

Morgan, D. O., 1997. Cyclin-dependent kinases: Engines, clocks, and microprocessors., 13 (1):261-291, DOI: 10.1146/annurev.cellbio.13.1.261.

Murray, A. W., 2004. Recycling the cell cycle: Cyclins revisited., 116 (2):221-234, DOI: 10.1016/s0092-8674(03)01080-8.

Nguyen, T. B., Manova, K., Capodieci, P., Lindon, C., Bottega, S., Wang, X. Y., et al., 2002. Characterization and expression of mammalian cyclin B3, a prepachytene meiotic cyclin., 277 (44):41960-41969, DOI: 10.1074/jbc.M203951200.

Nieduszynski, C. A., Murray, J., and Carrington, M., 2002. Whole-genome analysis of animal A- and B-type cyclins., 3 (12):research0070.1-0070.8, DOI: 10.1186/gb-2002-3-12-research0070.

Pines, J., 1995. Cyclins and cyclin-dependent kinases: A biochemical view., 308:697-711, DOI: 10.1042/bj3080697.

Quandt, E., Ribeiro, M. P., and Clotet, J., 2020. Atypical cyclins: The extended family portrait., 77 (2):231-242, DOI: 10.1007/s00018-019-03262-7.

Shimizu, K., Uematsu, A., Imai, Y., and Sawasaki, T., 2014. Pctaire1/Cdk16 promotes skeletal myogenesis by inducing myoblast migration and fusion., 588 (17):3030-3037, DOI: 10.1016/j.febslet.2014.05.060.

Simakov, O., Marletaz, F., Cho, S. J., Edsinger-Gonzales, E., Havlak, P., Hellsten, U., et al., 2013. Insights into bilaterian evolution from three spiralian genomes., 493 (7433):526-531, DOI: 10.1038/nature11696.

Stamatakis, A., 2014. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies., 30 (9):1312-1313, DOI: 10.1093/bioinformatics/btu033.

Strom, A., Francis, R. C., Mantua, N. J., Miles, E. L., and Peterson, D. L., 2004. North Pacific climate recorded in growth rings of geoduck clams: A new tool for paleoenvironmental reconstruction., 31 (6):L06206, DOI: 10.1029/2004GL019440.

van der Voet, M., Lorson, M. A., Srinivasan, D. G., Bennett, K. L., and van den Heuvel, S., 2009.mitotic cyclins have distinct as well as overlapping functions in chromosome segregation., 8 (24):4091-4102, DOI: 10.4161/cc.8.24.10171.

Wanamaker, A. D., Heinemeier, J., Scourse, J. D., Richardson, C. A., Butler, P. G., Eiriksson, J., et al., 2008. Very long-lived mollusks confirm 17th century AD tephra-based radiocarbon reservoir ages for North Icelandic shelf waters., 50 (3):399-412, DOI: 10.1017/S0033822200053510.

Wang, G., Kong, H., Sun, Y., Zhang, X., Zhang, W., Altman, N., et al., 2004. Genome-wide analysis of the cyclin family in Arabidopsis and comparative phylogenetic analysis of plant cyclin-like proteins., 135 (2):1084-1099, DOI: 10.1104/pp.104.040436.

Wang, H. T., Liu, H. L., Min, S. P., Shen, Y. B., Li, W., Chen, Y. Q., et al., 2018. CDK16 overexpressed in non-small cell lung cancer and regulates cancer cell growth and apoptosis via a p27-dependent mechanism., 103:399-405, DOI: 10.1016/j.biopha.2018.04.080.

Wang, S., Zhang, J., Jiao, W., Li, J., Xun, X., Sun, Y., et al., 2017. Scallop genome provides insights into evolution of bilaterian karyotype and development., 1 (5):1-12, DOI: 10.1038/s41559-017-0120.

Wood, D. J., and Endicott, J. A., 2018. Structural insights into the functional diversity of the CDK-cyclin family., 8 (9):180112, DOI: 10.1098/rsob.180112.

Yu, D. S., Zhao, R. X., Hsu, E. L., Cayer, J., Ye, F., Guo, Y., et al., 2010. Cyclin-dependent kinase 9-cyclin K functions in the replication stress response., 11 (11):876-882, DOI: 10.1038/embor.2010.153.

Zhang, T., Kwiatkowski, N., Olson, C. M., Dixon-Clarke, S. E., Abraham, B. J., Greifenberg, A. K., et al., 2016. Covalent targeting of remote cysteine residues to develop CDK12 and CDK13 inhibitors., 12 (10):876-884, DOI: 10.1038/nchembio.2166.

Ziuganov, V., San Miguel, E., Neves, R. J., Longa, A., Fernandez, C., Amaro, R., et al., 2000. Life span variation of the freshwater pearl shell: A model species for testing longevity mechanisms in animals., 29 (2):102-105, DOI: 10.1639/0044-7447.

September 7, 2020;

November 16, 2020;

February 9, 2021

© Ocean University of China, Science Press and Springer-Verlag GmbH Germany 2021

* Corresponding author. Tel: 0086-532-82032773

E-mail: hongyu@ouc.edu.cn

(Edited by Qiu Yantao)