The basic data types in R are logical, integer, double, complex and character. By default, a numeric literal is a double; to write an integer literal, the letter L must be appended:
a <- 26
typeof(a)
## [1] "double"
b <- 26L
typeof(b)
## [1] "integer"
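As a quick check of the five basic types listed above, typeof() can be applied to one literal of each kind. This is only a small sketch; the names `examples` and `types` are illustrative and do not come from the text.

```r
# One literal of each basic type (the names `examples` and `types` are hypothetical)
examples <- list(TRUE, 26L, 26, 2+3i, "text")
types <- sapply(examples, typeof)
print(types)

# as.integer() converts a double to an integer, dropping the fractional part
converted <- as.integer(26.9)
print(typeof(converted))
print(converted)
```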
Logical values are written TRUE and FALSE.
Otherwise, the usual arithmetic operators and rules apply to numerical data types, as in most languages. Two integer arithmetic operators are also available: %/% for integer division and %% for its remainder. Note that / always returns a double, even when both operands are integers.
23/3
## [1] 7.666667
23L/3L
## [1] 7.666667
23L %/% 3L
## [1] 7
23L %% 3L
## [1] 2
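The two integer operators are linked by the identity (a %/% b) * b + a %% b == a. The sketch below checks it on the values used above and on a negative operand, where the remainder takes the sign of the divisor. The variable names are illustrative only.

```r
a <- 23L
b <- 3L
quotient <- a %/% b     # 7
remainder <- a %% b     # 2
identity_holds <- quotient * b + remainder == a
print(identity_holds)   # TRUE

# With a negative dividend, %/% rounds toward minus infinity
neg_quotient <- (-7) %/% 2    # -4
neg_remainder <- (-7) %% 2    # 1
print(c(neg_quotient, neg_remainder))
```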
The power can be written with two different notations:
5**4
## [1] 625
5^4
## [1] 625
Character literals are delimited by single or double quotes:
"message"
## [1] "message"
'message'
## [1] "message"
Comparison operators enable the computation of logical values:
6 < 45
## [1] TRUE
23 >= 45
## [1] FALSE
Logical operators work as usual:
34 < 5 && 6 > 7
## [1] FALSE
34 < 23 || 7 > 5
## [1] TRUE
!(23 < 4)
## [1] TRUE
Variable identifiers are made of letters, dots (.), underscores (_), and digits. The first character must be a letter or a dot, and a leading dot must not be followed by a digit. The assignment operator can be written <- (preferred) or =.
my.age <- 27
code <- 34+my.age*3
name <- "Robert"
One central data structure in R is the atomic vector. An atomic vector is a sequence of values that are all of the same basic type. Vectors can be generated by the function c() (short for combine):
ages <- c(23,24,45,21,34,65,43,77,12,14,24)
ages[3]
## [1] 45
ages[6]
## [1] 65
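Because all elements of an atomic vector share one basic type, c() converts mixed inputs to a common type. The sketch below illustrates this, together with a few other indexing forms (several positions, a range, a negative index). The names `mixed_num` and `mixed_chr` are illustrative.

```r
# Mixed inputs are converted to a single common type
mixed_num <- c(1L, 2.5)        # integer and double give a double vector
mixed_chr <- c(1, "a", TRUE)   # anything combined with a character gives character
print(typeof(mixed_num))
print(mixed_chr)

ages <- c(23,24,45,21,34,65,43,77,12,14,24)
print(length(ages))    # 11
print(ages[c(1, 3)])   # elements 1 and 3: 23 45
print(ages[2:4])       # elements 2 to 4: 24 45 21
print(ages[-1])        # every element except the first
```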
Vectors are natural in R, meaning that many functions can take single values or vectors as input. In such a case, they usually apply to each element of the input vector and return a vector with each result as output (functions apply component-wise):
sqrt(64)
## [1] 8
sqrt(c(1,4,9,16))
## [1] 1 2 3 4
sin(ages)
## [1] -0.8462204 -0.9055784 0.8509035 0.8366556 0.5290827 0.8268287
## [7] -0.8317747 0.9995202 -0.5365729 0.9906074 -0.9055784
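A function written only with component-wise operations accepts vectors without any extra work. The helper `to_months` below is hypothetical and serves only to illustrate this.

```r
# `to_months` is a hypothetical helper, not a function provided by R
to_months <- function(years) years * 12

single <- to_months(2)              # 24
several <- to_months(c(1, 2, 10))   # 12 24 120
print(single)
print(several)
```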
The arithmetic and logical operators also apply component-wise to vectors:
v <- c(2,45,3,4,1,90,233)
v/2
## [1] 1.0 22.5 1.5 2.0 0.5 45.0 116.5
v+5
## [1] 7 50 8 9 6 95 238
c(1,2,3,4)*c(3,2,3,2)
## [1] 3 4 9 8
c(1,2,3,4)*c(3,2,3) # lengths differ: the shorter vector is recycled, with a warning
## Warning in c(1, 2, 3, 4) * c(3, 2, 3): longer object length is not a
## multiple of shorter object length
## [1] 3 4 9 12
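The result above comes from R's recycling rule: when the operands differ in length, the shorter one is repeated until it matches the longer one. A warning is issued only when the longer length is not a multiple of the shorter one. A minimal sketch, with illustrative variable names:

```r
# Length 4 and length 2: the shorter vector is repeated twice, without a warning
recycled <- c(1, 2, 3, 4) * c(10, 100)
explicit <- c(1, 2, 3, 4) * c(10, 100, 10, 100)
print(recycled)                        # 10 200 30 400
print(identical(recycled, explicit))   # TRUE

# A single value is the extreme case: it is recycled over the whole vector
shifted <- c(1, 2, 3) + 5
print(shifted)                         # 6 7 8
```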
ages < 27
## [1] TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
Note that real linear algebra operators are written in a special way to avoid any ambiguity. For instance, the matrix product is %*%.
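To make the distinction concrete, the sketch below applies both * and %*% to the same small matrix. The matrix `A` is an arbitrary example; matrix() fills it column by column.

```r
A <- matrix(1:4, nrow = 2)   # rows are (1, 3) and (2, 4)

elementwise <- A * A         # component-wise product
product <- A %*% A           # matrix product

print(elementwise)           # rows (1, 9) and (4, 16)
print(product)               # rows (7, 15) and (10, 22)
```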
Component-wise logical operators between logical vectors are written with a single symbol. Compare:
ages<33 & ages>20
## [1] TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
ages<33 && ages>20 # && expects single values: R 4.2 warns and uses only the first element of each vector; since R 4.3.0 this is an error
## Warning in ages < 33 && ages > 20: 'length(x) = 11 > 1' in coercion to
## 'logical(1)'
## Warning in ages < 33 && ages > 20: 'length(x) = 11 > 1' in coercion to
## 'logical(1)'
## [1] TRUE
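When a single TRUE or FALSE is needed from a whole vector, for instance in an if condition, one common approach is to reduce the component-wise result with any() or all() before using && or ||. A minimal sketch, where `in_range` is an illustrative name:

```r
ages <- c(23,24,45,21,34,65,43,77,12,14,24)

in_range <- ages < 33 & ages > 20   # component-wise, length 11
print(any(in_range))                # TRUE: at least one age is in the range
print(all(in_range))                # FALSE: not every age is in the range
print(sum(in_range))                # 4: TRUE counts as 1
print(ages[in_range])               # 23 24 21 24

# Both sides of && are now single logical values
has_senior_and_valid <- any(ages > 70) && all(ages > 0)
print(has_senior_and_valid)         # TRUE
```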
Why should we bother using && and || for scalars, which are anyway regarded as vectors of length 1 in R? We could always use & and |.
There is indeed a difference: with scalar values, && and || usually evaluate faster because unnecessary computations are skipped (short-circuit evaluation). For instance, in an expression like a<10 && c>23, evaluating whether c>23 is useless as soon as a<10 is FALSE. This optimization does not happen with & and |.
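The difference can be observed by recording which operands are actually evaluated. In the sketch below, `check` and `evaluated` are hypothetical names introduced only for this demonstration.

```r
# `check` records its label each time it is evaluated, then returns `value`
evaluated <- character(0)
check <- function(label, value) {
  evaluated <<- c(evaluated, label)
  value
}

result_short <- check("left", FALSE) && check("right", TRUE)
seen_short <- evaluated   # only "left": the right operand was skipped

evaluated <- character(0)
result_full <- check("left", FALSE) & check("right", TRUE)
seen_full <- evaluated    # "left" "right": both operands were evaluated

print(seen_short)
print(seen_full)
```

Both expressions return FALSE; only the amount of work differs.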
R has been designed to facilitate statistical computations and graphical display of statistical data. It comes with a powerful graphical package and all kinds of standard statistical functions.
rand.num <- rnorm(1000, mean=2, sd=1.5) # generates 1000 pseudo random numbers following a normal distribution with mean 2 and sd 1.5
hist(rand.num)
summary(rand.num)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.2378 0.9297 1.9049 1.9125 2.9887 6.3051
mean(rand.num)
## [1] 1.912537
sd(rand.num)
## [1] 1.52041
median(rand.num)
## [1] 1.904866
quantile(rand.num,probs=0.25)
## 25%
## 0.929745
Check the documentation of the seven functions above (for example with ?quantile) to learn about their many options and parameters.
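Since rnorm() draws pseudo random numbers, the values shown above change from one run to the next. A minimal sketch of a reproducible variant, assuming an arbitrary seed of 42 and the illustrative names `sample_values` and `quartiles`; it also requests several quantiles in one call:

```r
set.seed(42)   # arbitrary seed, chosen only so that the draw can be repeated
sample_values <- rnorm(1000, mean = 2, sd = 1.5)

# Several quantiles can be requested at once through `probs`
quartiles <- quantile(sample_values, probs = c(0.25, 0.5, 0.75))
print(quartiles)

# With the default settings, the 50% quantile matches the median
print(all.equal(unname(quartiles["50%"]), median(sample_values)))

# Resetting the same seed reproduces exactly the same numbers
set.seed(42)
repeated <- rnorm(1000, mean = 2, sd = 1.5)
print(identical(sample_values, repeated))
```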