Glossary
Last updated on 2023-05-02
- accession
 - a unique identifier assigned to each sequence or set of sequences
- BLAST
 - the Basic Local Alignment Search Tool, hosted at NCBI, which finds regions of local similarity between a query sequence (e.g. DNA or protein) and the sequences in a database
 - categorical variable
 - Variables can be classified as categorical (aka, qualitative) or quantitative (aka, numerical). Categorical variables take on a fixed number of values that are names or labels.
 - cleaned data
 - data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis
 - conditional formatting
 - formatting that is applied to a specific cell or range of cells depending on a set of criteria
 - CSV (comma separated values) format
 - a plain text file format in which values are separated by commas
 - factor
 - a variable that takes on a limited number of possible values (i.e. categorical data)
 - Gb
 - gigabyte of file storage or file size (more conventionally abbreviated GB, since Gb can also denote a gigabit)
 - Gbase
 - a gigabase represents one billion nucleic acid bases (Gbp may indicate one billion base pairs of nucleic acid)
 - headers
 - names at the tops of columns that describe the column contents (sometimes optional)
 - metadata
 - data which describes other data
 - NGS
 - common acronym for “Next Generation Sequencing”, a term currently being replaced by “High Throughput Sequencing”
 - null value
 - a value used to record observations missing from a dataset
 - observation
 - a single measurement or record of the object being recorded (e.g. the weight of a particular mouse)
 - plain text
 - unformatted text
 - quality assurance
 - any process which checks data for validity during entry
 - quality control
 - any process which removes problematic data from a dataset
 - raw data
 - data that has not been manipulated and represents actual recorded values
 - rich text
 - formatted text (e.g. text that appears bolded, colored or italicized)
 - string
 - a sequence of characters (e.g. “thisisastring”)
 - TSV (tab separated values) format
 - a plain text file format in which values are separated by tabs
 - variable
 - a category of data being collected on the object being recorded (e.g. a mouse’s weight)
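
Several of the terms above (CSV and TSV formats, headers, observations, null values, categorical variables) can be seen together in one small example. The following is a minimal sketch in Python, using only the standard library. The mouse data, the column names, the `read_observations` helper, and the choice to treat an empty field as a null value are all assumptions made for illustration, not part of any real dataset or required convention.

```python
import csv
import io

# Hypothetical raw data in CSV format; the first row holds the headers.
RAW_CSV = """mouse_id,sex,weight_g
M001,F,21.4
M002,M,
M003,F,19.8
"""


def read_observations(text, delimiter=","):
    """Parse delimited plain text into a list of observations (one dict per row)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        # Assumption for this sketch: an empty field records a null value.
        rows.append({key: (value if value != "" else None) for key, value in row.items()})
    return rows


observations = read_observations(RAW_CSV)

# Headers are the column names taken from the first row of the file.
headers = list(observations[0].keys())

# "sex" is a categorical variable: it takes a fixed set of labels.
sex_levels = sorted({row["sex"] for row in observations})

# Observations whose weight was not recorded.
missing_weights = [row["mouse_id"] for row in observations if row["weight_g"] is None]

print(headers)
print(sex_levels)
print(missing_weights)
```

The same helper reads TSV data when called with `delimiter="\t"`. Every value comes back as a string, so a quantitative variable such as the weight would still need converting to a number before analysis.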