---
title: "R basics part 1"
author: "Jacques Colinge"
date: "11/29/2021"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

\

## A. Data types

The basic data types in R are logical, integer, double, complex and character. By default, any number is considered a double; to explicitly specify an integer, the letter L must be appended:

```{r}
a <- 26
typeof(a)
b <- 26L
typeof(b)
```

Logical values are written ```TRUE``` and ```FALSE```. The classical arithmetic operators and rules apply to numerical data types as in most languages. Note that ```/``` always performs a real division, even between integers. Integer arithmetic operators are also available: ```%/%``` for integer division and ```%%``` for its remainder.

```{r}
23/3
23L/3L     # real division, the result is a double
23L %/% 3L # integer division
23L %% 3L  # remainder
```

Powers can be written with two equivalent notations:

```{r}
5**4
5^4
```

Character literals are delimited by single or double quotes:

```{r}
"message"
'message'
```

Comparison operators compute logical values:

```{r}
6 < 45
23 >= 45
```

Logical operators work as usual:

```{r}
34 < 5 && 6 > 7
34 < 23 || 7 > 5
!(23 < 4)
```

Variable identifiers are made of letters, dots (.), underscores (_), and digits. The first character must be a letter or a dot, and a leading dot cannot be followed by a digit. The assignment operator can be written ```<-``` (preferred) or ```=```.

```{r}
my.age <- 27
code <- 34+my.age*3
name <- "Robert"
```

One central data structure in R is the *atomic vector*. An atomic vector is a sequence of values that all have the same basic type. Vectors can be created with the function ```c()``` (short for combine), and elements are accessed by position, starting at index 1:

```{r}
ages <- c(23,24,45,21,34,65,43,77,12,14,24)
ages[3]
ages[6]
```

Vectors are natural in R, meaning that many functions can take single values or vectors as input.
In such a case, they usually apply to each element of the input vector and return a vector of the results (functions apply component-wise):

```{r}
sqrt(64)
sqrt(c(1,4,9,16))
sin(ages)
```

The arithmetic and logical operators also apply component-wise to vectors:

```{r}
v <- c(2,45,3,4,1,90,233)
v/2
v+5
c(1,2,3,4)*c(3,2,3,2)
c(1,2,3,4)*c(3,2,3) # lengths differ: the shorter vector is recycled and R issues a warning
ages < 27
```

Note that true linear algebra operators are written in a special way to avoid any ambiguity. For instance, the matrix product is ```%*%```.

Component-wise logical operators between logical vectors are written with a single symbol. Compare:

```{r, error=TRUE}
ages<33 & ages>20
ages<33 && ages>20 # error since R 4.3.0; older versions only compared the first elements of each vector
```

Why should we bother using ```&&``` and ```||``` for scalars, which R regards as vectors of length 1 anyway? We could always use ```&``` and ```|```. There is a difference: ```&&``` and ```||``` evaluate their operands from left to right and stop as soon as the result is known (short-circuit evaluation), which avoids unnecessary computations. For instance, in an expression like ```a<10 && c>23```, evaluating ```c>23``` is useless as soon as ```a<10``` is ```FALSE```. This optimization does not happen with ```&``` and ```|```, which always evaluate both operands.

\

## B. Elementary plotting and summarizing

R has been designed to facilitate statistical computations and the graphical display of statistical data. It comes with a powerful graphics package and all kinds of standard statistical functions.

```{r}
# generate 1000 pseudo-random numbers following a normal distribution with mean 2 and sd 1.5
rand.num <- rnorm(1000, mean=2, sd=1.5)
hist(rand.num)
summary(rand.num)
mean(rand.num)
sd(rand.num)
median(rand.num)
quantile(rand.num, probs=0.25)
```

Check the help pages of the seven functions above to learn about their many options and parameters.
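
As a minimal sketch of how such options can be explored, the chunk below fixes the random seed with ```set.seed()``` so that the simulated sample is reproducible, requests several quantiles at once through the ```probs``` argument, and lists the formal arguments of ```rnorm()``` with ```args()```. The seed value 42 and the variable names ```rand.demo``` and ```q``` are arbitrary choices made for this illustration only.

```{r}
set.seed(42) # arbitrary seed (an assumption of this example); makes the draw reproducible
rand.demo <- rnorm(1000, mean=2, sd=1.5)

# several quantiles in a single call: the three quartiles of the sample
q <- quantile(rand.demo, probs=c(0.25, 0.5, 0.75))
q

# the 50% quantile coincides with the median
all.equal(unname(q["50%"]), median(rand.demo))

# resetting the seed reproduces exactly the same numbers
set.seed(42)
identical(rand.demo, rnorm(1000, mean=2, sd=1.5))

# formal arguments of a function; ?rnorm or help(rnorm) opens the full help page
args(rnorm)
```

The same pattern applies to the other functions: for example, ```args(hist)``` or ```?quantile``` can be typed in the console to see which parameters are available.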