Cellular and Molecular Biology Verification of expression of LINC00648 in the serum of lung cancer patients by TCGA database

TCGA data were used to verify the expression of LINC00648 in lung cancer patients and to provide a reference for clinical practice. Lung cancer transcriptome data were downloaded from the TCGA database and LINC00648 data were extracted for analysis. Fifty-two patients with lung cancer diagnosed in our hospital from May 2014 to March 2016 were enrolled as the patient group and 30 healthy individuals as the control group. RT-qPCR was used to detect the expression of LINC00648 in serum, the patients were followed up, and bioinformatics was used to analyze the potential mechanism of LINC00648. LINC00648 was highly expressed in lung cancer. In patients with high LINC00648 expression, the probability of lymphatic metastasis and low differentiation was significantly increased, the overall survival rate was reduced and the prognosis was poor. LINC00648 had 17 potential target miRs and 78 miR-targeted mRNAs. Enrichment analysis suggested that LINC00648 may participate in SMAD binding, transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding, PDZ domain binding, cytokine binding, activin binding, RNA polymerase II activating transcription factor binding, transforming growth factor-beta receptor binding, etc. LINC00648 was also associated with the Hippo signaling pathway, Transcriptional misregulation in cancer, the MAPK signaling pathway and Proteoglycans in cancer. PPI protein co-expression analysis identified 55 co-expression pairs, of which KIF11 was the most common. High expression of LINC00648 in lung cancer patients indicates a poor prognosis, and LINC00648 is expected to become a potential diagnostic marker for lung cancer.


Introduction
As one of the most common causes of cancer-related death worldwide, lung cancer has become one of the most serious public health problems in the world (1). Recent epidemiological statistics showed that in 2018 (2) there were more than 2 million new lung cancer cases and 1.7 million deaths worldwide. The incidence of lung cancer is increasing year by year, and patients are becoming younger. By pathological type, lung cancer is divided into small cell lung cancer and non-small cell lung cancer, which account for 15-20% and 80-85% of cases, respectively (3,4). Although molecular targeted therapy has improved the survival rate of lung cancer in recent years, the survival of lung cancer patients is still not ideal. Studies have found that most patients have already progressed to the middle or late stage by the time they are admitted to hospital and diagnosed pathologically, which makes surgical treatment difficult and leads to poor prognosis (5)(6)(7). Data show that the 5-year survival rate of lung cancer patients is only 16%, and improving it has become one of the main problems that clinicians need to solve (8).
At present, apart from pathological biopsy, the most accurate diagnostic approach for lung cancer is imaging (9). Previous studies reported (10) that, in large randomized trials, mortality among participants screened with low-dose CT was 20% lower than among those screened with chest X-ray. This is an important reason why low-dose CT is recommended for the early diagnosis of lung cancer. However, because of the radiation exposure and high price of low-dose CT, it is difficult for patients to undergo repeated tests (11). Clinicians therefore urgently need an economical, effective and noninvasive biomarker for early lung cancer.
Non-coding RNA has been a hot topic across disciplines over the past ten years (12). Previously, research on non-coding RNA was limited by technical and instrumental shortcomings. With the continuous improvement of technology, more and more non-coding RNAs have been found to be associated with the occurrence and development of various diseases (13). Among them, short non-coding RNA (miR), circular RNA (circRNA) and long non-coding RNA (LncRNA) are the most prominent (14)(15)(16). LncRNA is a long-chain non-coding RNA with a length of more than 200 nt. Previous studies found (17) that LncRNAs are differentially expressed in various tumors, such as lung cancer, gastric cancer, liver cancer and colon cancer (18)(19)(20)(21), have certain diagnostic value and are expected to become potential diagnostic markers for tumors. LINC00648 is a member of the LncRNA family. Very few studies have addressed LINC00648, and there is little research on it in lung cancer. In this study, we found through TCGA database analysis that LINC00648 is highly expressed in lung cancer patients and is expected to become a potential diagnostic marker for lung cancer patients.
Cell Mol Biol (Noisy le Grand) 2020 | Volume 66 | Issue 3 Ling Zhao et al.
Therefore, in this study, we verified the diagnostic value and potential mechanism of LINC00648 in lung cancer patients through the TCGA database combined with clinical experiments, providing potential directions for clinical diagnosis and treatment.

TCGA database analysis
We downloaded mRNA transcriptome data for lung adenocarcinoma and lung squamous cell carcinoma from the GDC data portal (https://portal.gdc.cancer.gov). The data were merged with a Perl script, and LINC00648 expression was then extracted for the lung cancer samples. The extracted data were log2(x + 1) transformed before differential analysis. Altogether 1145 samples were downloaded, of which 1037 were cancer samples and 108 were adjacent tissue samples.
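The transformation described above can be illustrated with a minimal sketch. The function names and the raw values are hypothetical, and the difference of group means shown here is only one simple way to compare groups; the text does not state which differential analysis method was used.

```python
import math
from statistics import mean


def log2_transform(values):
    """Apply log2(x + 1) to raw expression values."""
    return [math.log2(v + 1) for v in values]


def mean_difference(tumor, normal):
    """Difference of group means on the log2(x + 1) scale (assumed summary)."""
    return mean(log2_transform(tumor)) - mean(log2_transform(normal))


# Hypothetical raw expression values, for illustration only
tumor_raw = [7, 15, 31]
normal_raw = [0, 1, 3]

print(log2_transform(tumor_raw))               # [3.0, 4.0, 5.0]
print(mean_difference(tumor_raw, normal_raw))  # 3.0
```

Adding 1 before taking the logarithm keeps samples with zero expression defined on the log scale.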

Clinical data collection
Fifty-two patients with lung cancer diagnosed and treated in our hospital from May 2014 to March 2016 were enrolled as the study group, including 40 male and 12 female patients, with an average age of 62.5 ± 6.2 years. Another 30 healthy individuals examined during the same period were enrolled as the control group, including 22 males and 8 females, with an average age of 61.5 ± 5.3 years. This study was approved by the Medical Ethics Committee of our hospital. Inclusion criteria were as follows: patients were confirmed to have primary non-small cell lung cancer by histological examination and had not received targeted therapy, radiotherapy or chemotherapy before the study; patients were staged according to the 8th edition of the Union for International Cancer Control (UICC) classification; patients and their families were informed about the study and signed informed consent forms. Exclusion criteria were as follows: patients with other tumors or an expected survival of less than 3 months; patients who did not cooperate with follow-up; patients with congenital diseases or immune deficiency diseases; patients who had a serious infection before this study.

Collection and analysis of serum samples
Peripheral blood (5 mL) was collected from each subject in both groups, left to stand for 30 min and centrifuged at 3000 rpm for 10 min, and the supernatant was then collected for RT-qPCR amplification.

RT-qPCR detection
Total RNA was extracted with a TRIzol kit (Invitrogen Company, USA), and its purity, concentration and integrity were assessed by UV spectrophotometry and agarose gel electrophoresis. Reverse transcription was then performed with the TaqMan™ Reverse Transcription kit (Invitrogen Company, USA), strictly following the kit instructions, and the resulting cDNA was used for subsequent experiments. PCR amplification was carried out using the PrimeScript RT Master Mix kit (Takara Bio Company, Japan). Amplification system: 10 μL of SYBR qPCR Mix, 0.8 μL each of upstream and downstream primers, 2 μL of cDNA product, 0.4 μL of 50× Rox reference dye, and RNase-free water to a final volume of 20 μL. PCR reaction conditions: pre-denaturation at 95 ℃ for 60 s, followed by 40 cycles of denaturation at 95 ℃ for 30 s and annealing and extension at 60 ℃ for 40 s. Each sample was run in three replicate wells, and all specimens were tested 3 times. GENE and GAPDH were used as internal references, and the data were analyzed with the 2^(-ΔΔCt) method (22). The PCR instrument was an ABI 7500 PCR system. The upstream primer of LINC00648 was 5'-TCCCAGTGACCCCA-3' and the downstream primer was 5'-GCCTAACCGGTGCTGCTG-3'; the GAPDH upstream primer was 5'-GAGAGAGAGAGAGAGACCTCACCGCTG-3' and the downstream primer was 5'-ACTGTGAGAGAGAGAGAGAGAGAATTCAGT-3'.
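For readers unfamiliar with the 2^(-ΔΔCt) calculation, the following is a minimal sketch. It assumes a single reference gene and uses the mean ΔCt of the control group as the calibrator; the function names and Ct values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Delta Ct: target gene Ct minus reference gene Ct for one sample."""
    return target_ct - reference_ct


def relative_expression(sample_target_ct, sample_reference_ct,
                        control_target_cts, control_reference_cts):
    """Relative expression by 2^(-ddCt), calibrated to the control-group mean dCt."""
    calibrator = mean(
        delta_ct(t, r) for t, r in zip(control_target_cts, control_reference_cts)
    )
    ddct = delta_ct(sample_target_ct, sample_reference_ct) - calibrator
    return 2 ** (-ddct)


# Hypothetical Ct values: the control dCt is 10 in both wells, the sample dCt is 8
fold_change = relative_expression(26.0, 18.0, [28.0, 29.0], [18.0, 19.0])
print(fold_change)  # 4.0
```

A ΔΔCt of -2 corresponds to a fourfold higher relative expression in the sample than in the calibrator.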

Follow-up
The patients were followed up until March 2019. Follow-up was conducted by telephone and through outpatient electronic medical records at the 1st, 3rd, 6th, 9th and 12th months of each year.

Bioinformatics analysis
StarBase (http://starbase.sysu.edu.cn/) was used to predict the miRs targeted by LINC00648. The miRDB, miRTarBase and TargetScan online tools were used to predict the target mRNAs of the predicted miRs, and Cytoscape was used to visualize the competing endogenous RNA (ceRNA) network. The R package clusterProfiler was used for GO and KEGG enrichment analysis, and STRING was used to visualize the protein co-expression network of the target mRNAs.
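As an illustration of how predicted interactions can be assembled into a ceRNA network, the sketch below builds an edge list from a lncRNA, its predicted miRs and their predicted mRNAs, then counts how many edges touch each node. The miR and gene names are placeholders rather than the predictions reported in this study, and the helper functions are hypothetical.

```python
from collections import Counter


def build_cerna_edges(lncrna, mirna_targets):
    """Return (source, target) edges: lncRNA to miR, then miR to each mRNA."""
    edges = []
    for mirna, mrnas in mirna_targets.items():
        edges.append((lncrna, mirna))
        for mrna in sorted(set(mrnas)):  # drop duplicate predictions
            edges.append((mirna, mrna))
    return edges


def node_degrees(edges):
    """Count the number of edges touching each node."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


# Placeholder predictions, for illustration only
predicted = {"miR-A": ["GENE1", "GENE2"], "miR-B": ["GENE2"]}
edges = build_cerna_edges("LINC00648", predicted)
degrees = node_degrees(edges)

# Tab-separated edge list, a format that network tools can usually import
for source, target in edges:
    print(f"{source}\t{target}")
```

Nodes with the highest degree are the natural candidates for hub molecules in such a network.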

Statistical analysis
In this study, SPSS 20.0 was used for statistical analysis and GraphPad 7 was used to draw the figures. The Kolmogorov-Smirnov (K-S) test was used to assess the distribution of measurement data. Normally distributed data were expressed as mean ± standard deviation (mean ± SD). Comparisons between two groups were conducted with the independent-samples t-test, comparisons among multiple groups with one-way analysis of variance (expressed as F), and subsequent pairwise comparisons with the LSD-t test. ROC curves were used to evaluate the diagnostic value of LINC00648 in lung cancer, and the Pearson test was used to analyze correlations between genes. Kaplan-Meier (K-M) survival curves were plotted to describe overall survival and compared with the log-rank test, and multivariate Cox regression was used to analyze patient prognosis. Differences were considered statistically significant at p < 0.05.
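The area under the ROC curve used in this kind of analysis can be computed directly from its rank interpretation: it equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The sketch below is a plain-Python illustration with hypothetical values; it is not the SPSS procedure used in the study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical relative expression values, for illustration only
cases = [3.1, 2.4, 1.9]
controls = [1.0, 2.0, 2.5]
auc = roc_auc(cases, controls)
print(round(auc, 3))  # 6 of the 9 pairs favour the case
```

An AUC of 0.5 corresponds to a marker with no discriminating ability, and 1.0 to perfect separation of the two groups.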

Baseline data
Comparison of the clinical data of the two groups showed no statistically significant differences in gender, age, BMI, smoking history or drinking history (p > 0.05), indicating that the two groups were comparable. Fifty-two patients with lung cancer diagnosed and treated in our hospital from May 2014 to March 2016 were enrolled as the study group, including 40 male and 12 female patients, with an average age of 62.5 ± 6.2 years. Another 30 healthy individuals examined during the same period were enrolled as the control group, including 22 males and 8 females, with an average age of 61.5 ± 5.3 years (Table 1).

Diagnostic value of LINC00648 in lung cancer
To confirm the diagnostic value of LINC00648 in lung cancer, we plotted the ROC curve. The area under the curve of LINC00648 for the diagnosis of lung cancer was 0.886, indicating high diagnostic value. To examine the expression of LINC00648 in patients with early lung cancer, we further compared its expression among patients with different TNM stages. LINC00648 was differentially expressed across stages, and ROC curve analysis showed that it has certain clinical value in diagnosing early lung cancer (Figure 2 and Table 3).

Expression of LINC00648 in lung cancer
TCGA database analysis showed that the expression of LINC00648 in cancer samples was significantly higher than that in the control samples (p < 0.001). We therefore carried out clinical experiments to detect the expression of LINC00648 in the serum of the patient and control groups. The results were consistent with those from the database: the expression of LINC00648 in the serum of cancer patients was significantly increased. The relationship between LINC00648 and the patients' pathological data was then analyzed, with patients divided into high and low expression groups according to the median value of LINC00648. Patients with high expression of LINC00648 were significantly more likely to have stage III+IV disease, lymphatic metastasis and low differentiation (Figure 1 and Table 2).

Relationship between LINC00648 and survival of lung cancer patients
All 52 patients were followed up. The overall survival rate was 21.15% (11 cases). The patients were divided into a survival group and a death group according to outcome. The expression of LINC00648 in the death group was significantly higher than that in the survival group. ROC curve analysis showed that the area under the curve of LINC00648 for predicting death was 0.747, which has certain clinical value. The patients were further divided into high and low expression groups according to the median value of LINC00648. The overall survival rate of patients in the high expression group was significantly reduced (p = 0.001, Figure 3).
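The median split and survival estimate described above can be sketched as follows. The Kaplan-Meier estimator multiplies, at each death time, the running survival probability by (1 - deaths / number at risk). The follow-up times, event flags and function names are hypothetical, and assigning values strictly above the median to the high group is an assumption, since the text does not say how values equal to the median were handled. The log-rank comparison is not implemented here.

```python
from statistics import median


def split_by_median(expression):
    """Label each value 'high' if above the median, else 'low' (assumed rule)."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival)] at each death time; events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all subjects whose follow-up ends at this time point
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months, for illustration only
groups = split_by_median([1.0, 2.0, 3.0, 4.0])
curve = kaplan_meier([6, 12, 12, 24, 30], [1, 1, 0, 1, 0])
for time_point, probability in curve:
    print(time_point, round(probability, 2))
```

Running the estimator separately on the high and low expression groups yields the two curves that a log-rank test would then compare.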

Prognosis analysis of lung cancer patients
The pathological data of lung cancer patients were collected for univariate Cox regression analysis. The results showed that lymphatic metastasis, differentiation, TNM stage and LINC00648 were associated with patient prognosis. Multivariate Cox regression analysis then showed that lymphatic metastasis and LINC00648 were independent prognostic factors (Table 4).

Bioinformatics analysis
The potential mechanism of LINC00648 was further explored through bioinformatics analysis. First, a total of 17 target miRs of LINC00648 were predicted (Table 5). Then, a total of 78 potential target mRNAs of these 17 miRs were predicted (Table 6). GO enrichment analysis suggested that LINC00648 may participate in 7 important biological processes (Table 7). KEGG enrichment analysis indicated that LINC00648 may be involved in the Hippo signaling pathway, Transcriptional misregulation in cancer, the MAPK signaling pathway and Proteoglycans in cancer (Table 8). Subsequently, PPI protein co-expression analysis identified 55 co-expression pairs, among which KIF11 was the protein involved in the most pairs (Figure 4).

Discussion
In this study, we showed experimentally that LINC00648 is highly expressed in lung cancer patients. Patients with high expression were significantly more likely to have stage III+IV disease, lymphatic metastasis and low differentiation, and they had a poor prognosis. LINC00648 is expected to become a potential biomarker for the diagnosis and prognosis of lung cancer.
In this study, we detected the expression of LINC00648 in serum samples of clinical patients and found that it was significantly increased in patients with lung cancer. We further compared expression across stages and found that LINC00648 expression rose with increasing TNM stage. ROC curve analysis showed that LINC00648 had high clinical value in the diagnosis of lung cancer, including early lung cancer, and it is expected to become a potential diagnostic marker of lung cancer. Lung cancer is one of the top three malignant tumors in terms of clinical mortality and morbidity. Improving the prognosis and survival of lung cancer patients has always been a difficult problem for clinicians (27). Previous studies (28) found that early diagnosis can effectively improve the survival rate of lung cancer patients, but there are relatively few markers for predicting the survival prognosis of lung cancer patients. In this study, we followed up the patients and compared serum LINC00648 expression according to survival status. Expression was markedly higher in the death group. Moreover, ROC curve analysis showed that the area under the curve of LINC00648 for predicting death in lung cancer patients was greater than 0.7, which has certain clinical value. Further comparison of the high and low expression groups showed that the overall survival rate was significantly reduced in the high expression group, suggesting that LINC00648 could serve as a potential indicator of survival in lung cancer patients. Cox regression analysis also showed that LINC00648 was an independent prognostic factor for lung cancer patients.
The results above provide preliminary evidence that LINC00648 has high clinical value in the diagnosis and prognosis of lung cancer, but the underlying mechanism is still unclear. We therefore used bioinformatics to identify possible mechanisms, laying the foundation for our subsequent research. We first predicted the miRs that may have binding sites for LINC00648, then predicted the potential target mRNAs of these miRs and visualized the ceRNA network. In recent years, studies have found that crosstalk occurs between LncRNAs and mRNAs through competition for shared miR response elements (MREs); such transcripts are called ceRNAs (29). Many studies have indicated (30,31) that ceRNAs play different roles in the occurrence and development of various diseases. In this study, we found a total of 17 miRs that may bind to LINC00648.
A literature search showed that miR-223-3p, miR-199a-5p and miR-199b-5p have been studied relatively extensively in lung cancer (32)(33)(34). Online prediction software was then used to predict the downstream mRNAs of these miRs, and 78 potential mRNAs were found. GO enrichment analysis suggested that LINC00648 may participate in SMAD binding, transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding, PDZ domain binding, cytokine binding, activin binding, RNA polymerase II activating transcription factor binding and transforming growth factor-beta receptor binding. In addition, KEGG enrichment analysis indicated that LINC00648 may be involved in the Hippo signaling pathway, Transcriptional misregulation in cancer, the MAPK signaling pathway and Proteoglycans in cancer. Many previous reports (35)(36)(37)(38) have shown that these pathways play an important role in the occurrence and development of lung cancer, and these studies provide the direction and foundation for our future research. Subsequently, we performed PPI protein co-expression analysis of the predicted mRNAs and found 55 co-expression pairs, among which KIF11 was the protein involved in the most pairs. KIF11, kinesin family member 11, is located on human chromosome 10q23.33. Previous studies found (39) that the differential expression of KIF11 is closely related to the poor prognosis of lung cancer, but the specific mechanism has not been clearly established, which may be a direction for our future research (40)(41)(42)(43)(44)(45)(46)(47)(48)(49)(50)(51)(52)(53)(54)(55).
In this study, we verified the clinical value of LINC00648 in lung cancer through the TCGA database and clinical experiments, and predicted the potential mechanism of LINC00648 through bioinformatics analysis. However, this study still has certain limitations. First, we did not collect serum samples from patients with benign lung lesions, so whether LINC00648 has diagnostic value in distinguishing lung cancer from benign lung lesions is unclear. Second, although we analyzed the potential mechanism of LINC00648 through bioinformatics, we did not carry out further experiments to verify it. Finally, because the sample size was small, more samples are needed to verify whether LINC00648 can be applied widely in clinical practice. We therefore hope to carry out basic experiments in future research and to collect additional samples and cases to further confirm the results of this study. In summary, high expression of LINC00648 in lung cancer patients indicates a poor prognosis, and LINC00648 is expected to become a potential diagnostic marker for lung cancer.
Author contribution
YL and CW wrote the manuscript and analyzed and interpreted the patients' general data. JL performed PCR. YT was responsible for the analysis of observation indicators. All authors read and approved the final manuscript.