Final Report/Thesis 2019
==Background==
===DNA===
DNA is the hereditary material that stores genetic information in humans. There are two types of DNA in human beings: nuclear DNA, which is located in the cell nucleus, and mitochondrial DNA, which is located in the mitochondria. This project focuses only on the analysis of nuclear DNA. DNA stores genetic information as a sequence built from four types of nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T) [2]. A sugar molecule and a phosphate molecule are attached to each base to form a unit called a nucleotide. The bases pair up (A with T and C with G), and the nucleotides are arranged in two strands that form a spiral-shaped double helix [2]. In general, a DNA molecule is a genetic sequence formed by many base pairs, and the order of these base pairs encodes the instructions for building and maintaining an organism [2]. Human DNA contains about 3 billion bases, of which more than 99% are common to all human beings; the physiological differences among people depend on the remaining less than 1%.

[[File:DNA.png|thumb|300px|center|Figure 2: DNA structure]]

===Chromosome===
[[File:chromosome1.png|thumb|300px|Figure 3: Chromosome structure]]
[[File:chromosome2.png|thumb|300px|Figure 4: 23 pairs of chromosomes in human]]

A chromosome is a thread-like structure in which DNA molecules are tightly coiled around histone proteins. Each human cell contains 23 pairs of chromosomes, 46 in total. Of these, 22 pairs are autosomes, which are common to both males and females, and the 23rd pair consists of the sex chromosomes, which differ between males and females. In this project, the DNA data analysis focuses only on autosomes.

===SNP===
A single nucleotide polymorphism (SNP) is a genetic variation among human beings.
Each SNP represents a difference in a single nucleotide, the basic building block of DNA. For instance, an SNP may replace a nucleotide carrying the base guanine (G) with one carrying cytosine (C). On average, SNPs occur about once in every 1,000 nucleotides in a person's DNA. Most SNPs do not affect the health of their carrier; however, some of these variations may be associated with diseases.

===DNA reference file===
A DNA reference file stores a set of SNP data from its owner's DNA. All DNA reference files used in this project follow the file format of 23andMe, a company that aims to provide personal genetic information to customers using advanced genetic analysis techniques and web-based interactive tools. A screenshot of a sample file is shown below.

[[File:Dna_ref.png|thumb|center|300px|Figure 5: Sample DNA file from 23andMe]]

As shown in the figure, the DNA reference file has four columns: rsid, chromosome, position and genotype. The rsid is a unique identifier for a specific SNP. An rsid starts with "rs" followed by a number (e.g. rs123456), and these identifiers are commonly used by researchers and databases. A second, special format starts with "i" followed by a number (e.g. i123456). This "i" format is used internally by 23andMe to identify otherwise unknown SNPs and cannot be used with public databases. The second column, chromosome, identifies which chromosome the SNP belongs to (chromosomes 1 to 22 for the autosomes considered in this project). The third column, position, gives the position of the SNP in the owner's DNA sequence. The last column, genotype, gives the bases of the variant (A, T, G, C). Note that for some SNPs the genotype cannot be determined, in which case "--" is displayed in the genotype column. Only SNPs with identified bases can be used for DNA analysis.
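The four-column layout described above can be read with a short script. The following Python sketch is illustrative only and is not the project's actual code: it assumes the file is tab-separated and that header or comment lines start with "#", and the names <code>Snp</code>, <code>read_snps</code>, <code>usable_snps</code> and <code>is_public_rsid</code> are hypothetical. The sample rows reuse the example identifiers from the text; their positions and genotypes are made up.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Snp(NamedTuple):
    """One row of a DNA reference file (hypothetical helper type)."""
    rsid: str
    chromosome: str
    position: int
    genotype: str


# Chromosome labels "1".."22": the autosomes this project analyses.
AUTOSOMES = {str(n) for n in range(1, 23)}


def read_snps(handle: TextIO) -> Iterator[Snp]:
    """Yield every well-formed row; assumes tab-separated columns."""
    for line in handle:
        line = line.strip()
        # Assumption: comment and header lines start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip rows that do not have exactly four columns
        rsid, chromosome, position, genotype = fields
        yield Snp(rsid, chromosome, int(position), genotype)


def usable_snps(handle: TextIO) -> Iterator[Snp]:
    """Keep autosomal SNPs whose genotype was identified (no '-')."""
    for snp in read_snps(handle):
        if snp.chromosome not in AUTOSOMES:
            continue
        if "-" in snp.genotype:
            continue
        yield snp


def is_public_rsid(rsid: str) -> bool:
    """True for 'rs<number>' ids; 'i<number>' ids are 23andMe-internal."""
    return rsid.startswith("rs") and rsid[2:].isdigit()


# Made-up sample rows, for illustration only.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs123456\t1\t1000\tAG\n"
    "i123456\t2\t2000\tCC\n"
    "rs234567\t3\t3000\t--\n"
    "rs345678\tX\t4000\tAA\n"
)

if __name__ == "__main__":
    for snp in usable_snps(io.StringIO(SAMPLE)):
        print(snp.rsid, snp.chromosome, snp.position, snp.genotype,
              "public" if is_public_rsid(snp.rsid) else "internal")
```

In this sketch the row with genotype "--" and the row on chromosome X are dropped, which mirrors the two restrictions stated in this section: only autosomes are analysed, and only SNPs with identified bases are used. Rows with "i" identifiers are kept but flagged, since they cannot be matched against public databases.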