Authors
Keywords
Abstract
The human beta globin gene family has five members, whose products function in oxygen transport in the blood. Mutations in these genes cause several diseases, the most common being sickle cell anemia and thalassemia. In this study, a structural and phylogenetic analysis of the family was performed using bioinformatics tools. The gene sequence was retrieved and submitted to each tool for a specific analysis. The analyses were: similarity search against other organisms, determination of conserved domains and motifs, multiple sequence alignment, generation of a phylogenetic tree, identification of exons and introns, and determination of restriction sites. The tools used for these analyses were BLASTP, Pfam, Clustal Omega, MEGA7, the Gene Structure Display Server, and Serial Cloner, respectively. The results indicate that the genes diverged through duplication followed by mutation. The gene consists mostly of non-coding sequence, with 3 exons and 2 introns; the combined length of the 2 introns is far greater than that of the 3 exons. In the future, techniques such as CRISPR/Cas gene editing may be used to treat diseases related to the beta globin gene family.
Key words: human beta globin gene, bioinformatics study of HBB, genome-wide study of HBB, beta subunit, hemoglobin beta subunit.
Introduction:
Human beta globin is a globin protein encoded by the HBB gene. It is usually functional together with the alpha subunit[1]. These subunits combine to form the hemoglobin A protein, which functions in the transfer of oxygen. Hemoglobin A contains four globin chains, i.e., two alpha and two beta chains. The beta globin chain has 147 amino acids and a molecular weight of 15,867 Daltons[2]. The HBB gene is located on chromosome 11. The sequence of the HBB gene can be altered by mutation, and the mutated sequence produces an abnormal protein that may cause sickle cell anemia or thalassemia. These diseases are inherited genetic disorders[3-5]. The protein is present in the form of a tetramer. Functional human hemoglobin is assembled through two rounds of dimerization of its subunits. The interaction occurs between HBB and HBA1. In the initial dimerization, one HBB and one HBA1 combine to form a complex. The next interaction then occurs between two such dimers to form a tetramer of two HBB and two HBA1 subunits[6-9].
This family has 5 genes in humans, mouse, gorilla, and chimpanzee: epsilon, gamma-G, gamma-A, delta, and beta. These genes have almost the same function in all of these species. Beta globin is the main beta-like globin of hemoglobin A in adult humans, and its function is the transfer of oxygen[10, 11]. The globin family evolved by duplication. First, an alpha globin gene duplicated; mutations then accumulated in the copy, and a new globin was formed and selected[12]. The newly formed gene duplicated again, and beta globin arose. The alpha and beta globins trace back to a common ancestor in the gnathostomes and evolved about 540 million years ago, around the gnathostome-cyclostome divergence[13, 14]. That lineage is in turn reported to have diverged about 800 million years ago, at the split between vertebrates and invertebrates. Beta globin is described as having a couple of domains, i.e., the terahertz domain and the 1H domain, and its motif is the BRE motif[15-17].
The normal function of beta globin is outlined above, but a complete functional characterization requires several bioinformatic tools. The role of beta globin has been examined in several in-silico studies of the gene family. These studies revealed that the gene family contains a couple of functional domains and a motif. Mutation can alter the structure and function of the products of this gene family. DNA sequencing of several species revealed that the globin gene family diverged from ancestral genes shared across different species: it diverged from leghemoglobin in plants to hemoglobin and myoglobin in animals. The proteins encoded by all members of the family have the same function, i.e., oxygen transport[18-21].
Current studies address the treatment of diseases caused by mutations in the beta globin gene family, and new techniques and methodologies continue to be developed. Gene editing is among the most promising treatments for genetic diseases such as sickle cell disease, because no cure is currently available for this disease. Gene editing is carried out with the CRISPR/Cas9 system, which consists of an endonuclease and a guide RNA. Together they produce a double-stranded DNA break, and the cellular repair mechanism then repairs the mutated gene. Current studies plan to use this system to treat the mutations in the HBB gene that cause several genetic diseases. Other approaches, such as bone marrow transplantation, are also used to treat genetic diseases like beta thalassemia, but finding an appropriate donor is difficult, and the disease may recur after treatment. Because this method has limitations, treatment strategies also include the development of tools for identifying the mutation. miRNA is also a therapeutic target in this strategy[22-29].
Materials and Methods:
Sequence retrieval :
The sequence of the beta globin gene was retrieved from NCBI. The term "human beta globin gene" was entered in the search bar on the home page of the NCBI website, and the sequence was saved in FASTA format (https://www.ncbi.nlm.nih.gov/nuccore/NC_000011.10?report=fasta&from=5225464&to=5227071&strand=true). The protein sequence of the saved gene was then obtained using transcription and translation tools, and was confirmed with the ExPASy Translate tool (https://web.expasy.org/translate/). BLASTP was then performed with the derived protein sequence against the non-redundant sequence database (https://blast.ncbi.nlm.nih.gov/Blast.cgi#). Multiple sequence alignment was then performed. The cDNA was obtained, the chromosomal information was gathered, and the conserved domains were confirmed. All of these analyses were performed directly on the NCBI platform. The multiple sequence alignment and the confirmation of the conserved domain were performed with MEGA7.
Proposed names | Gene Locus | Protein accession # | RNA accession # | Exons | Chr # | ORF length | Amino acid length | Start of Genomic Location | Conserved domains in protein sequence |
HBBH | LOC110006319 | NP_000509.1 | NM_000518.5 | 3 | 11 | 441 | 147 | 5001 | cd08925 |
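The translation step can be illustrated with a minimal Python sketch. It assumes a single-record FASTA input and the standard genetic code; the helper names `read_fasta` and `translate` and the 30-nucleotide example fragment are illustrative assumptions, not the tools or the full sequence used in this study. The arithmetic matches the table above: an ORF of 441 nucleotides corresponds to 441 / 3 = 147 codons.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def read_fasta(text):
    """Return the sequence of a single-record FASTA string (header line dropped)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "".join(line for line in lines if not line.startswith(">")).upper()


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        amino_acid = CODON_TABLE.get(cds[pos:pos + 3], "X")  # X marks ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical short fragment used only for illustration (not the full gene).
example_fasta = """>illustrative_fragment
ATGGTGCATCTGACT
CCTGAGGAGAAGTCT
"""

cds = read_fasta(example_fasta)
protein = translate(cds)
print(len(cds), "nt ->", len(protein), "aa:", protein)
print("ORF of 441 nt ->", 441 // 3, "codons")
```

In practice the same check can be made by comparing the translated sequence with the protein record retrieved from NCBI.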
Identification of Conserved Motif :
Every protein family contains some conserved sequences. Over evolutionary time these sequences have remained conserved, or unchanged, because a change in such a region impairs the protein, so such changes are not selected. These regions were identified using the Pfam tool, which was run on the protein sequences derived from different species. The conserved motifs were identified at https://www.genome.jp/tools/motif/.
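As a rough illustration of what "conserved" means for an alignment, the sketch below marks alignment columns in which every sequence carries the same residue and groups consecutive columns into blocks. This is a much simpler stand-in than the profile-based search performed by a domain database; the toy sequences, the function names, and the `threshold` and `min_length` parameters are all hypothetical.

```python
from collections import Counter


def conserved_columns(aligned, threshold=1.0):
    """Return 0-based columns whose most common residue reaches `threshold` frequency.

    Columns containing a gap character ('-') are skipped.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(length):
        residues = [seq[col] for seq in aligned]
        if "-" in residues:
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(residues) >= threshold:
            hits.append(col)
    return hits


def conserved_blocks(columns, min_length=3):
    """Group consecutive conserved columns into inclusive (start, end) blocks."""
    blocks = []
    start = prev = None
    for col in columns:
        if start is None:
            start = prev = col
        elif col == prev + 1:
            prev = col
        else:
            if prev - start + 1 >= min_length:
                blocks.append((start, prev))
            start = prev = col
    if start is not None and prev - start + 1 >= min_length:
        blocks.append((start, prev))
    return blocks


# Hypothetical aligned fragments, for illustration only.
toy_alignment = [
    "MVHLTPEEKS",
    "MVHLTGEEKA",
    "MVHFTAEEKA",
]

columns = conserved_columns(toy_alignment)
blocks = conserved_blocks(columns)
print(columns, blocks)
```

Lowering `threshold` below 1.0 tolerates occasional substitutions, which is closer to how conservation is usually judged across distant species.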
Phylogenetic Analysis :
Multiple sequence alignment of the BLASTP hits was performed using CLUSTALW within the MEGA7 software. The phylogenetic tree was then constructed using the neighbor-joining and maximum-likelihood methods, and was exported in Newick format and as a PNG file. The tree showed which sequences are closely related and which are divergent. Both rooted and unrooted trees were produced.
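Distance-based tree methods such as neighbor joining start from a matrix of pairwise distances between aligned sequences. The sketch below computes the simplest such measure, the p-distance (proportion of differing sites), for a set of toy sequences. It does not implement neighbor joining or maximum likelihood, and it is not the procedure used inside MEGA7; the sequence names and function names are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Return a nested dict of p-distances for a {name: aligned_sequence} mapping."""
    names = list(alignment)
    return {a: {b: p_distance(alignment[a], alignment[b]) for b in names} for a in names}


def closest_pair(matrix):
    """Return (name_a, name_b, distance) for the pair with the smallest distance."""
    names = list(matrix)
    pairs = [(matrix[a][b], a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    distance, a, b = min(pairs)
    return a, b, distance


# Hypothetical aligned fragments, for illustration only.
toy_alignment = {
    "seq_A": "MVHLTPEEKS",
    "seq_B": "MVHLTPEEKA",
    "seq_C": "MVHFTAEEKA",
    "seq_D": "MGHFSAEDKA",
}

matrix = distance_matrix(toy_alignment)
pair = closest_pair(matrix)
print(pair)
```

Sequences separated by a small distance are expected to cluster together in the resulting tree, while large distances indicate divergent sequences.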
Determination of Gene Structure :
The gene structure, i.e., the number and position of introns and exons, was determined with the Gene Structure Display Server (http://gsds.gao-lab.org/). The sequences were uploaded in FASTA format. First, the FASTA format option was selected at the top of the home page. The RNA sequence was then uploaded in the first input box, and the DNA sequence of the beta subunit of the globin gene family was uploaded in the second input box. After clicking submit, the gene structure was displayed.
Restriction Site Identification :
The restriction sites were identified using Serial Cloner. First, Serial Cloner was installed, and the genomic sequence was loaded into the application. After the sequence was processed, i.e., by clicking on the sequence map, the restriction sites were identified. The result lists each restriction site together with the restriction enzyme that can digest the sequence at that site.
RESULTS:
Determination of conserved domains :
The sequence of the HBB gene was extracted from NCBI and downloaded in FASTA format. NCBI BLASTP was then performed, and the homologous sequences from different organisms were downloaded. The conserved domains were then identified using Pfam. The Pfam results showed that the sequence contains 8 domains.
Phylogenetic analysis of hemoglobin beta subunit :
The phylogenetic analysis was performed on the protein sequences. First, multiple sequence alignment of the sequences was performed with CLUSTAL Omega, run from within MEGA7. The result was also confirmed with the online version of CLUSTAL Omega. The alignment was saved in a file in MEGA7 format, and this file was then loaded into MEGA7 to generate the phylogenetic tree. The tree was generated by both methods, i.e., maximum likelihood and neighbor joining. The saved tree shows the homology between the sequences obtained from different species.
Gene Structure Display :
To determine the structure of the gene, such as the number of introns and exons, the Gene Structure Display Server was used. The server requires two sequences, i.e., the CDS and the genomic DNA sequence of human beta globin. When the FASTA format submission option was selected, two input boxes appeared. The CDS sequence was pasted in the first box and the gene sequence was pasted in the second box, both in FASTA format. After clicking submit, the result appeared. There are 3 exons and 2 introns. The combined length of the 2 introns is far greater than that of the 3 exons. Thus, most of the sequence is non-coding.
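The comparison of exon and intron lengths can be reproduced from exon coordinates alone, since each intron is the gap between two neighboring exons. The following sketch shows the arithmetic; the coordinates are hypothetical values for a generic 3-exon gene and are not the HBB annotation, and the function names are assumptions introduced for illustration.

```python
def introns_from_exons(exons):
    """Derive intron intervals from exon (start, end) pairs (1-based, inclusive)."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")
        if next_start > prev_end + 1:  # adjacent exons leave no intron between them
            introns.append((prev_end + 1, next_start - 1))
    return introns


def total_length(intervals):
    """Sum the lengths of inclusive (start, end) intervals."""
    return sum(end - start + 1 for start, end in intervals)


# Hypothetical coordinates for a 3-exon gene (illustration only, not HBB).
exons = [(1, 150), (301, 500), (1401, 1600)]

introns = introns_from_exons(exons)
exon_bp = total_length(exons)
intron_bp = total_length(introns)
intron_fraction = intron_bp / (exon_bp + intron_bp)

print("introns:", introns)
print("exon bp:", exon_bp, "intron bp:", intron_bp)
print("intron fraction:", round(intron_fraction, 3))
```

With real annotation, the exon coordinates would come from the alignment of the CDS or mRNA against the genomic sequence.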
Identification Of Restriction Sites :
Restriction enzymes are a class of enzymes that cut DNA at specific sites. These enzymes occur naturally in bacteria, where they defend the cell against bacteriophages: when viral DNA enters the cell, these enzymes chop the foreign DNA into pieces. This system is used in various techniques of molecular biology, recombinant DNA technology, and genetic engineering. To use these enzymes, however, the site at which a specific enzyme cuts must be known, because a restriction digest cannot be planned without knowing the restriction site of the enzyme. The enzyme chosen is one whose restriction site is present in the sequence of interest. To find the restriction sites, the Serial Cloner software was used. First, the genomic sequence was extracted from NCBI and loaded into the Serial Cloner application. After clicking on the sequence map, the results appeared, showing the restriction sites together with the names of the enzymes that cut at each site. The output is a map of the sequence with all restriction sites and enzymes.
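The idea behind a restriction map can be sketched as a search for each enzyme's recognition sequence within the DNA. The example below uses three well-known enzymes (EcoRI, BamHI, HindIII) and a made-up test sequence; it is not how Serial Cloner works internally, and the function name `find_sites` is hypothetical. Because these three recognition sequences are palindromic, scanning one strand is enough to find sites on both.

```python
# Recognition sequences of three commonly used restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, enzymes=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions]} for enzymes with at least one site."""
    sequence = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = sequence.find(site, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Made-up sequence for illustration only (not the HBB genomic sequence).
toy_sequence = "ATGAATTCGGTACCAAGCTTTTGAATTCAA"

sites = find_sites(toy_sequence)
non_cutters = [name for name in RECOGNITION_SITES if name not in sites]
print(sites)
print("non-cutters:", non_cutters)
```

Enzymes listed as non-cutters have no site in the sequence and therefore cannot be used to digest it.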
Discussion :
HBB is a member of the human beta globin gene family, which has 5 members. The hemoglobin protein has 4 subunits, i.e., 2 alpha subunits and 2 beta subunits. The main function of the protein is the transport of oxygen in the blood from the lungs to all other parts of the body. The globin gene was duplicated many times: initially the alpha subunit gene duplicated, and then the beta subunit gene duplicated. Therefore, hemoglobin contains 4 globin subunits.
These genes share ancestry with the leghemoglobin of soybean and the globins of insects. The whole analysis was carried out to study the structure and function of the beta globin gene family. This gene family is ancient, because the earliest evidence of it is found in insects and soybean. Several abnormalities are related to mutations in these genes.
Conclusion :
This article presents a structural study of the beta subunit of the globin gene family. The gene family has a very important functional role in the transport of oxygen in the blood. The family has 5 genes, which are linked to each other on chromosome 11 in humans. The structural analysis covers the conserved domains and motifs, i.e., how many domains are present in the gene family, as well as the introns and exons. The study found 3 exons and 2 introns. Many diseases are related to mutations in this gene. In the future, new methodologies may provide treatments for these diseases.
References
2. Storz, J.F., et al., Complex signatures of selection and gene conversion in the duplicated globin genes of house mice. Genetics, 2007. 177(1): p. 481-500.
3. Kwiatkowski, D.P., How malaria has affected the human genome and what human genetics can teach us about malaria. American Journal of Human Genetics, 2005. 77(2): p. 171-192.
4. Goehler, H., et al., A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Molecular Cell, 2004. 15(6): p. 853-865.
5. Goldberg, D.S. and F.P. Roth, Assessing experimentally derived interactions in a small world. Proceedings of the National Academy of Sciences, 2003. 100(8): p. 4372-4376.
6. Shaanan, B., Structure of human oxyhaemoglobin at 2·1 Å resolution. Journal of Molecular Biology, 1983. 171(1): p. 31-59.
7. Sidore, C., et al., Genome sequencing elucidates Sardinian genetic architecture and augments association analyses for lipid and blood inflammatory markers. Nature Genetics, 2015. 47(11): p. 1272-1281.
8. Reed, F.A. and S.A. Tishkoff, African human diversity, origins and migrations. Current Opinion in Genetics & Development, 2006. 16(6): p. 597-605.
9. Olivieri, N.F., Z. Pakbaz, and E. Vichinsky, Hb E/beta-thalassaemia: a common & clinically diverse disorder. The Indian Journal of Medical Research, 2011. 134(4): p. 522-531.
10. Modiano, D., et al., Haemoglobin C protects against clinical Plasmodium falciparum malaria. Nature, 2001. 414(6861): p. 305-308.
11. Piel, F.B., et al., The distribution of haemoglobin C and its prevalence in newborns in Africa. Scientific Reports, 2013. 3(1): p. 1671.
12. Rubin, E.M., et al., Introduction and expression of the human Bs-globin gene in transgenic mice. American Journal of Human Genetics, 1988. 42(4): p. 585-591.
13. Stamatoyannopoulos, G., et al., The molecular basis of blood diseases. 1987.
14. Tuan, D., et al., The "beta-like-globin" gene domain in human erythroid cells. Proceedings of the National Academy of Sciences, 1985. 82(19): p. 6384-6388.
15. Forrester, W.C., et al., Evidence for a locus activation region: the formation of developmentally stable hypersensitive sites in globin-expressing hybrids. Nucleic Acids Research, 1987. 15(24): p. 10159-10177.
16. Higgs, D.R., Do LCRs open chromatin domains? Cell, 1998. 95(3): p. 299-302.
17. Grosveld, F., et al., Position-independent, high-level expression of the human β-globin gene in transgenic mice. Cell, 1987. 51(6): p. 975-985.
18. Bulger, M. and M. Groudine, Looping versus linking: toward a model for long-distance gene activation. Genes & Development, 1999. 13(19): p. 2465-2477.
19. Gribnau, J., et al., Intergenic transcription and developmental remodeling of chromatin subdomains in the human β-globin locus. Molecular Cell, 2000. 5(2): p. 377-386.
20. Lander, E.S., et al., Initial sequencing and analysis of the human genome. 2001.
21. Waterston, R.H. and L. Pachter, Initial sequencing and comparative analysis of the mouse genome. Nature, 2002. 420(6915): p. 520-562.
22. Lindblad-Toh, K., et al., A high-resolution map of human evolutionary constraint using 29 mammals. Nature, 2011. 478(7370): p. 476-482.
23. Ponting, C.P. and R.C. Hardison, What fraction of the human genome is functional? Genome Research, 2011. 21(11): p. 1769-1776.
24. Jones, F.C., et al., The genomic basis of adaptive evolution in threespine sticklebacks. Nature, 2012. 484(7392): p. 55-61.
25. Grossman, S.R., et al., Identifying recent adaptations in large-scale genomic data. Cell, 2013. 152(4): p. 703-713.
26. Fraser, H.B., Gene expression drives local adaptation in humans. Genome Research, 2013. 23(7): p. 1089-1096.
27. Jeong, S., et al., The evolution of gene regulation underlies a morphological difference between two Drosophila sister species. Cell, 2008. 132(5): p. 783-793.
28. Cabeda, J.M., et al., Unexpected pattern of β-globin mutations in β-thalassaemia patients from northern Portugal. British Journal of Haematology, 1999. 105(1): p. 68-74.
29. Giardina, B., et al., The Multiple Functions of Hemoglobin. Critical Reviews in Biochemistry and Molecular Biology, 1995. 30(3): p. 165-196.