Learning Objectives
Following this assignment students should be able to:
- use, modify, and write custom functions
- use the output of one function as the input of another
- understand and use the basic relational operators
- use an if statement to evaluate conditionals
Reading
- Topics
  - Functions
  - Conditionals
- Readings
Lecture Notes
Exercises
Writing Functions (5 pts)
1. Copy the following function (which converts weights in pounds to weights in grams) into your assignment and replace the ________ with the variable names for the input and output.

    convert_pounds_to_grams <- function(________) {
      grams = 453.6 * pounds
      return(________)
    }

   Use the function to calculate how many grams there are in 3.75 pounds.
2. Copy the following function (which converts temperatures in Fahrenheit to temperatures in Celsius) into your assignment and replace the ________ with the needed commands and variable names so that the function returns the calculated value for Celsius.

    convert_fahrenheit_to_celsius <- ________(________) {
      celsius = (fahrenheit - 32) * 5 / 9
      ________(________)
    }

   Use the function to calculate the temperature in Celsius if the temperature in Fahrenheit is 80°F.
3. Write a function named double that takes a number as input and outputs that number multiplied by 2. Run it with an input of 512.
4. Write a function named prediction that takes three arguments, x, a, and b, and returns y using y = a + b * x (like a prediction from a simple linear model). Run it with x = 12, a = 6, and b = 0.8.

Expected outputs for Writing Functions

Use and Modify (10 pts)
The length of an organism is typically strongly correlated with its body mass. This is useful because it allows us to estimate the mass of an organism even if we only know its length. This relationship generally takes the form:
mass = a * length ^ b
where the parameters a and b vary among groups. This allometric approach is regularly used to estimate the mass of dinosaurs since we cannot weigh something that is only preserved as bones.

The following function estimates the mass of an organism in kg based on its length in meters for a particular set of parameter values, those for Theropoda (where a has been estimated as 0.73 and b has been estimated as 3.63; Seebacher 2001).

    get_mass_from_length_theropoda <- function(length){
      mass <- 0.73 * length ^ 3.63
      return(mass)
    }

- Use this function to print out the mass of a Theropoda that is 16 m long based on its reassembled skeleton.
- Create a new version of this function called get_mass_from_length() that takes length, a and b as arguments and uses the following code to estimate the mass: mass <- a * length ^ b. Use this function to estimate the mass of a Sauropoda (a = 214.44, b = 1.46) that is 26 m long.
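To make the form of the allometric equation concrete, here is a minimal sketch of the arithmetic. The values of a, b, and len are hypothetical, chosen only to keep the numbers easy to check; they are not estimates for any real group. Note that R evaluates ^ before *, so the exponent applies to the length only.

```r
# Hypothetical parameter values (not estimates for any real group)
a <- 2
b <- 3
len <- 3  # a length in meters; named len here to avoid masking base R's length()

# ^ is evaluated before *, so this is a * (len ^ b), not (a * len) ^ b
mass <- a * len ^ b
print(mass)  # 54

# The same expression works element-wise on a vector of lengths
lens <- c(1, 2, 3)
masses <- a * lens ^ b
print(masses)  # 2 16 54
```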
Combining Functions (10 pts)
Write two functions:
  - One called get_mass_from_length() that takes length (in m), a and b as arguments, has the following default arguments a = 39.9 and b = 2.6, uses the following code to estimate the mass (in kg): mass <- a * length ^ b, and returns it. (This function is the answer to the Default Arguments exercise, so feel free to copy over your answer if you’ve done that exercise.)
  - One called convert_kg_to_pounds that converts kilograms into pounds (pounds = 2.205 * kg)
- Use these two functions (each function should be called separately) to estimate the weight, in pounds, of a 12 m long Stegosaurus with a = 10.95 and b = 2.64 (the estimated a and b values for Stegosauria from Seebacher 2001).
- Use these two functions (each function should be called separately) to estimate the weight, in pounds, of a 4 m long dinosaur using the default parameters.
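As a reminder of how default arguments and separate function calls fit together, here is a minimal sketch. The functions scale_value() and add_offset() and all their values are hypothetical illustrations, unrelated to the dinosaur functions in this exercise.

```r
# Hypothetical helper functions, used only to illustrate the pattern
scale_value <- function(x, multiplier = 10) {
  scaled <- x * multiplier
  return(scaled)
}

add_offset <- function(x, offset = 1) {
  shifted <- x + offset
  return(shifted)
}

# Relying on the default value of multiplier
scaled_default <- scale_value(3)                 # 3 * 10

# Overriding the default by naming the argument
scaled_custom <- scale_value(3, multiplier = 2)  # 3 * 2

# Calling each function separately: store the first result, then pass it on
shifted <- add_offset(scaled_custom)             # 6 + 1

print(scaled_default)  # 30
print(scaled_custom)   # 6
print(shifted)         # 7
```

Storing the intermediate result in a variable makes each step easy to inspect before it is passed to the next function.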
Choice Operators (10 pts)
Create the following variables.
    w <- 10.2
    x <- 1.3
    y <- 2.8
    z <- 17.5
    colors <- c("red", "blue", "green")
    masses <- c(45.2, 36.1, 27.8, 81.6, 42.4)
    dna1 <- "attattaggaccaca"
    dna2 <- "attattaggaacaca"

Use them to print whether or not the following statements are TRUE or FALSE.

- w is greater than 10
- "green" is in colors
- x is greater than y
- Each value in masses is greater than 40.
- 2 * x + 0.2 is equal to y
- dna1 is the same as dna2
- dna1 is not the same as dna2
- w is greater than x, or y is greater than z
- x times w is between 13.2 and 13.5 (there is no way to indicate “between” in R, so to do this we have to separately check if the number is greater than the minimum value and less than the maximum value, combining these two conditions with &)
- Each mass in masses is between 30 and 50.
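The "between" check described above can be sketched as follows. The variable names and numbers are hypothetical and deliberately different from the ones in the exercise.

```r
# Hypothetical value and bounds, chosen only for illustration
value <- 7.5
lower <- 5
upper <- 10

# Both conditions must be TRUE for the value to lie between the bounds
is_between <- value > lower & value < upper
print(is_between)  # TRUE

# & works element-wise, so the same check applies to every value in a vector
values <- c(3, 7.5, 12)
are_between <- values > lower & values < upper
print(are_between)  # FALSE TRUE FALSE
```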
Simple If Statement (10 pts)
To determine if a file named thesis_data.csv exists in your working directory you can use the following code to get a list of available files and directories:

    list.files()

- Use the %in% operator to write a conditional statement that checks to see if thesis_data.csv is in this list.
- Write an if statement that loads the file using read_csv() only if the file exists.
- Add an else clause that prints “OMG MY THESIS DATA IS MISSING. NOOOO!!!!” if the file doesn’t exist.
- Make sure your actual thesis data is backed up.
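For reference, here is a minimal sketch of how %in% behaves and how its result can drive an if/else. The file names are hypothetical and a plain character vector stands in for the output of list.files(), so the sketch runs the same way in any directory.

```r
# Hypothetical file names standing in for the output of list.files()
available_files <- c("notes.txt", "plot.png", "analysis.R")

# %in% returns TRUE if the left-hand value appears anywhere in the vector
has_notes <- "notes.txt" %in% available_files
has_report <- "report.pdf" %in% available_files
print(has_notes)   # TRUE
print(has_report)  # FALSE

# A single TRUE/FALSE value can be used as the condition of an if statement
if (has_report) {
  status <- "report found"
} else {
  status <- "report not found"
}
print(status)  # "report not found"
```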
Size Estimates by Name (20 pts)
You’re going to write a function to estimate a dinosaur’s mass based on its length. The general form of the equation for doing this is:
mass <- a * length ^ b
The parameters a and b vary by the group of dinosaurs, so you decide to create a function that lets you specify which dinosaur group you need to estimate the size of by name and then have the function automatically choose the right parameters.

Create a new function get_mass_from_length_by_name() that takes two arguments, the length and the name of the dinosaur group. Inside this function use if/else if/else statements to check to see if the name is one of the following values and if so use the associated equation to estimate the species mass:

- Stegosauria: mass = 10.95 * length ^ 2.64 (Seebacher 2001)
- Theropoda: mass = 0.73 * length ^ 3.63 (Seebacher 2001)
- Sauropoda: mass = 214.44 * length ^ 1.46 (Seebacher 2001)

If the name is not any of these values the function should return NA.

Run the function for:
1. A Stegosauria that is 10 meters long.
2. A Theropoda that is 8 meters long.
3. A Sauropoda that is 12 meters long.
4. An Ankylosauria that is 13 meters long.
Challenge (optional): If the name is not one of the values that have a and b values, print a warning that the function doesn’t know how to convert that group, including the group’s name in a message like “No known estimation for Ankylosauria”. You can use the function warning("your warning text") to print a warning and the function paste() to combine text with a value from a variable: paste("My name is", name). Doing this successfully will modify your answer to (4), which is fine.

Challenge (optional): Change your function so that it uses two different equations for Stegosauria. When Stegosauria is greater than 8 meters long use the equation above. When it is less than 8 meters long use mass = 8.5 * length ^ 2.8. Run the function for a Stegosauria that is 6 meters long.

Expected outputs for Size Estimates by Name
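The paste() and warning() functions mentioned in the first challenge can be combined as in the following minimal sketch. The function warn_unknown() and the strings used here are hypothetical illustrations, not part of the expected answer.

```r
# paste() joins its arguments into one string, separated by a space by default
name <- "Ada"
greeting <- paste("My name is", name)
print(greeting)  # "My name is Ada"

# Hypothetical function: issue a warning built with paste(), then return NA
warn_unknown <- function(label) {
  warning(paste("Unrecognized label:", label))
  return(NA)
}

# suppressWarnings() is used here only to keep this demonstration quiet
result <- suppressWarnings(warn_unknown("example"))
print(result)  # NA

# tryCatch() can capture the text of the warning to confirm what was issued
caught <- tryCatch(
  warn_unknown("example"),
  warning = function(w) conditionMessage(w)
)
print(caught)  # "Unrecognized label: example"
```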
DNA or RNA (15 pts)
Write a function that determines if a sequence of base pairs is DNA, RNA, or if it is not possible to tell given the sequence provided. RNA has the base Uracil ("u") instead of the base Thymine ("t"), so sequences with u’s are RNA, sequences with t’s are DNA, and sequences with neither are unknown.

You can check if a string contains a character (or a longer substring) in R using the str_detect function from the stringr package: str_detect(string, substring), which will return TRUE if substring is present in string. So str_detect(sequence, "u") will check if the string in the sequence variable has the base u.

Name the function dna_or_rna() and have it take sequence as an argument. Have the function return one of three outputs: "DNA", "RNA", or "UNKNOWN". Call the function on each of the following sequences.

    seq1 <- "ttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcg"
    seq2 <- "gauuauuccccacaaagggagugggauuaggagcugcaucauuuacaagagcagaauguuucaaaugcau"
    seq3 <- "gaaagcaagaaaaggcaggcgaggaagggaagaagggggggaaacc"

Challenge (optional): Figure out how to make your function work with both upper and lower case letters, or even strings with mixed capitalization.
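To see how str_detect() behaves before using it on sequences, here is a minimal sketch. It assumes the stringr package is installed, and the strings are hypothetical examples unrelated to the sequences in this exercise.

```r
library(stringr)

# Hypothetical string, unrelated to the sequences in the exercise
word <- "banana"

has_nan <- str_detect(word, "nan")  # "nan" occurs inside "banana"
has_x <- str_detect(word, "x")      # there is no "x" in "banana"
print(has_nan)  # TRUE
print(has_x)    # FALSE

# str_detect() is vectorized over its first argument
words <- c("apple", "banana", "cherry")
has_an <- str_detect(words, "an")
print(has_an)  # FALSE TRUE FALSE
```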
Expected outputs for DNA or RNA

Climate Space Rewrite (20 pts)
This is a follow up to Climate Space.
Producing a plot of occurrences on the available climate space for each of the three species required a lot of repetition of very similar code. Whenever this happens, it is usually an indication that a function could be used instead. Such functions reduce the repetition in producing the three species plots, which enables you to save time and prevent errors by not having to rewrite the same code multiple times.
- Create a function to download occurrence data and extract the corresponding climate data, which should return a dataset of all the bioclim variables for a single species. Because the latitude and longitude columns for each occurrence dataset have different names you can select and set them to the same name using the column index, instead of the column name, to get only those columns (e.g., select(longitude = 2, latitude = 3)).
- Create a second function for plotting the occurrences for a single species onto the available climate space, then use this function to generate separate plots for each of the three tree species.
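Selecting and renaming columns by index can be sketched as follows. This assumes the dplyr package is installed; the data frame and its column names are hypothetical stand-ins for a real occurrence dataset.

```r
library(dplyr)

# Hypothetical occurrence table; the column names are made up for illustration
occurrences <- data.frame(
  species = c("A", "B"),
  decimalLongitude = c(-82.3, -81.9),
  decimalLatitude = c(29.6, 30.1)
)

# Keep only columns 2 and 3, giving them standard names in the process
coords <- select(occurrences, longitude = 2, latitude = 3)
print(names(coords))  # "longitude" "latitude"
```

Because the columns are chosen by position, the same call works on datasets whose coordinate columns have different names, as long as they sit in the same positions.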