Like every other programming language, R has control structures that let you control the flow of your code's execution.
if, else: for testing a condition. The else section is optional.
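Here is a minimal sketch of the basic form. The variable names and the threshold are made up for illustration.

```r
x <- 7  # illustrative value

# Run one branch or the other depending on the condition
if (x > 5) {
  label <- "greater than 5"
} else {
  label <- "5 or less"
}

# The else section can be left out; nothing runs when the condition is FALSE
if (x > 100) {
  label <- "greater than 100"
}

print(label)
```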
If the whole point is to assign a value to a variable, you can use if/else as an expression and assign its result directly.
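A short sketch of the assignment form, with illustrative values: in R, if/else returns the value of the branch that ran, so the result can be assigned in one line.

```r
x <- 3  # illustrative value

# y receives the value of whichever branch is evaluated
y <- if (x > 5) 10 else 0

print(y)
```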
for: for executing a loop a fixed number of times. It takes a variable and assigns it successive values from a sequence or vector.
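As a sketch, a for loop can walk over the elements of a vector directly or over its indices. The fruits vector is just sample data.

```r
fruits <- c("apple", "banana", "cherry")  # illustrative vector

# Loop directly over the elements
for (fruit in fruits) {
  print(fruit)
}

# Loop over the indices; seq_along() also behaves well for empty vectors
total_chars <- 0
for (i in seq_along(fruits)) {
  total_chars <- total_chars + nchar(fruits[i])
}

print(total_chars)
```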
while: for executing a loop while a condition is true. It begins by testing the condition; if it is true, the loop body executes and the condition is tested again, and if it is false, R skips the loop.
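A small sketch of both cases, using made-up counters: a loop that runs until its condition turns false, and one whose condition is false from the start so the body never runs.

```r
count <- 0

# The condition is checked before every iteration
while (count < 5) {
  count <- count + 1
}
print(count)

# The condition is FALSE on the first test, so the body is skipped entirely
body_ran <- FALSE
while (count < 0) {
  body_ran <- TRUE
}
print(body_ran)
```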
repeat: for executing an infinite loop. The only way to exit the loop is to call break.
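A sketch of a repeat loop that doubles a number until it passes an arbitrary limit of 100. Without the break, this loop would never end.

```r
x <- 1
iterations <- 0

repeat {
  x <- x * 2
  iterations <- iterations + 1

  # repeat has no condition of its own, so the exit test lives in the body
  if (x > 100) {
    break
  }
}

print(x)
print(iterations)
```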
break: for stopping the execution of a loop and continuing from the next line of code after the loop (this is how a repeat loop is exited).
next: for skipping the rest of the current iteration and moving on to the next one.
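To see break and next side by side, here is a sketch that collects odd numbers: next skips the even ones, and break stops the loop once the value passes an arbitrary cutoff of 7.

```r
odds <- c()

for (i in 1:10) {
  if (i %% 2 == 0) {
    next   # skip even numbers and go to the next iteration
  }
  if (i > 7) {
    break  # leave the loop completely
  }
  odds <- c(odds, i)
}

print(odds)
```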
Writing multiple lines of code in the interactive command-line environment is hard. I used the script editor to write the code in this post and then copied it to the R console.
Loop functions
Loop functions are very similar to loops. They are just more compact and easier to use on the command line.
lapply loops over a list and evaluates a function on each element. If the first argument is not a list, it is coerced to one (using as.list). lapply always returns a list. Any arguments passed to lapply beyond the FUN parameter are captured by the ellipsis (...) and passed on as parameters to FUN. FUN can be an anonymous function.
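A sketch of those points with made-up data: a plain call, an extra argument passed through to FUN, an anonymous function, and a non-list input.

```r
scores <- list(a = 1:5, b = c(10, 20, 30))  # illustrative data

# Apply mean to each element; the result is always a list
means <- lapply(scores, mean)
print(means)

# na.rm = TRUE is not an lapply parameter, so it is passed on to mean
with_gaps <- list(a = c(1, NA, 3))
clean_means <- lapply(with_gaps, mean, na.rm = TRUE)
print(clean_means)

# FUN can be an anonymous function
firsts <- lapply(scores, function(v) v[1])
print(firsts)

# A vector is coerced to a list before the function is applied
squares <- lapply(1:3, function(i) i^2)
print(squares)
```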
sapply tries to simplify the result of lapply where possible. If the result is a list where every element has length 1, it returns a vector. If the result is a list where every element is a vector of the same length (>1), it returns a matrix. If it can't figure things out, it returns a list.
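The three outcomes can be sketched with the same made-up list; only the function changes.

```r
scores <- list(a = 1:5, b = c(10, 20, 30))  # illustrative data

# Every result has length 1, so we get a named vector
as_vector <- sapply(scores, mean)
print(as_vector)

# Every result has length 2 (min and max), so we get a matrix
as_matrix <- sapply(scores, range)
print(as_matrix)

# The results have different lengths, so sapply falls back to a list
as_list <- sapply(scores, function(v) v[v > 3])
print(as_list)
```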
apply applies a function over the margins of an array. It is often used to apply a function to the rows or columns of a matrix. It takes as parameters the array, the margin (which indicates the dimension whose slices are passed to the function), and the function to be applied. For a matrix with 20 rows and 10 columns, passing 2 for the margin applies the function to columns, so summing gives a vector of length 10 containing the sum of each column. Passing 1 for the margin applies the function to rows, so summing gives a vector of length 20 containing the sum of each row.
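A sketch with a 20 x 10 matrix. The values 1 to 200 are just a convenient, reproducible filler.

```r
x <- matrix(1:200, nrow = 20, ncol = 10)  # illustrative 20 x 10 matrix

# Margin 2: the function receives one column at a time
col_totals <- apply(x, 2, sum)
print(length(col_totals))

# Margin 1: the function receives one row at a time
row_totals <- apply(x, 1, sum)
print(length(row_totals))
```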
For sums and means of matrix dimensions, there are some shortcuts:
- rowSums = apply(x, 1, sum)
- rowMeans = apply(x, 1, mean)
- colSums = apply(x, 2, sum)
- colMeans = apply(x, 2, mean)
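A quick sketch, on a small made-up matrix, that checks each shortcut against its apply equivalent.

```r
x <- matrix(1:6, nrow = 2)  # illustrative 2 x 3 matrix

# Each comparison should print TRUE
print(all(rowSums(x)  == apply(x, 1, sum)))
print(all(rowMeans(x) == apply(x, 1, mean)))
print(all(colSums(x)  == apply(x, 2, sum)))
print(all(colMeans(x) == apply(x, 2, mean)))
```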
tapply applies a function over subsets of a vector. It is roughly equivalent to using split and lapply together, with the result simplified where possible. split takes a vector or other object and splits it into groups determined by a factor or a list of factors.
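A sketch comparing the two routes on made-up data. The heights and group labels are hypothetical.

```r
heights <- c(150, 160, 170, 180)          # illustrative measurements
group   <- factor(c("a", "b", "a", "b"))  # illustrative grouping factor

# Mean height within each group
by_group <- tapply(heights, group, mean)
print(by_group)

# split() builds one vector per group; sapply() then summarises each one
pieces    <- split(heights, group)
via_split <- sapply(pieces, mean)
print(via_split)
```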
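mapply is a multivariate version of sapply: it applies a function in parallel over several sets of arguments. For example, mapply(rep, 1:4, 4:1) repeats each element of 1:4 the corresponding number of times in 4:1.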
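A sketch of that call, plus a second made-up example where every result has length 1 and the output simplifies to a vector.

```r
# Calls rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1) and collects the results
repeated <- mapply(rep, 1:4, 4:1)
print(repeated)

# The function receives one element from each argument at a time
sums <- mapply(function(x, y) x + y, 1:3, 4:6)
print(sums)
```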
In this post we introduced the basic control structures in R. They are almost the same as in any C-like programming language.
Stay tuned for more R notes.