1. R Fundamentals

1.1. General Programming Elements

[4]:
# install.packages(c("Hmisc", "FactoMineR"))
[39]:
# example function: prints x as a percentage of y, rounded to 2 decimals
percentage <- function(x, y, ...) {
    result <- (x * 100) / y
    rounded_res <- round(result, 2)
    cat(x, "is", rounded_res, "percent of", y)
}
[6]:
percentage(24,256)
24 is 9.38 percent of 256
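Because percentage() prints with cat(), it returns NULL invisibly and its result cannot be reused in a later computation. A minimal sketch of a variant that returns the rounded value instead; the name percentage_value and the digits argument are hypothetical and do not come from the original notebook.

```r
# Hypothetical variant of percentage(): returns the value instead of printing it
percentage_value <- function(x, y, digits = 2) {
  round((x * 100) / y, digits)
}

# The returned value can be stored and reused
p <- percentage_value(24, 256)
cat(24, "is", p, "percent of", 256, "\n")
```

Separating the computation from the printing keeps the function usable inside larger expressions, for example when filling a column of a data frame.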
[9]:
nchar("a b")
3
[10]:
load_v <- scan(file="https://bit.ly/3mlXPoY", what="char", sep="\n")
load_v
  1. 'This is an example character vector.'
  2. 'This is another example character vector.'
  3. 'This is yet another example character vector.'
[29]:
length(load_v)
str(load_v)
typeof(load_v)
3
 chr [1:3] "This is an example character vector." ...
'character'
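Functions such as nchar() are vectorized, so they apply to every element of a character vector at once. A short sketch, assuming the three strings shown in the output above are typed in by hand rather than downloaded (so the example runs offline):

```r
# Same three strings as load_v, entered manually so no network access is needed
load_v <- c("This is an example character vector.",
            "This is another example character vector.",
            "This is yet another example character vector.")

n_chars <- nchar(load_v)        # number of characters in each element
words   <- strsplit(load_v, " ") # list of word vectors, one per element
n_words <- lengths(words)       # number of space-separated tokens per element

print(n_chars)
print(n_words)
```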
[31]:
# matrix
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, 3, 3, dimnames=list(row_names, col_names))
mat
A matrix: 3 × 3 of type dbl
                      BNC    COCA  GloWbE
each + N            30708  141012  535316
every + N           28481  181127  857726
each and every + N    140     907   14529
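Note that matrix() fills column by column by default, which is why the first three values end up in the BNC column. A brief sketch of how such a matrix might then be queried; the choice of operations (name-based indexing, column totals, column percentages) is illustrative and not part of the original notebook.

```r
# Rebuild the matrix from the cell above
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, 3, 3, dimnames = list(row_names, col_names))

cell   <- mat["each + N", "COCA"]  # one cell, selected by row and column name
bnc    <- mat[, "BNC"]             # a whole column as a named vector
totals <- colSums(mat)             # total frequency per corpus
pct    <- round(prop.table(mat, margin = 2) * 100, 2)  # column percentages

print(totals)
print(pct)
```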
[34]:
# read a data frame from a tab-separated file
df <- read.table("https://bit.ly/2HzfquJ", header=TRUE, sep="\t")
head(df)
A data.frame: 6 × 9
  corpus_file  corpus_file_info  match                         intensifier  construction   adjective   syllable_count_adj  NP        syllable_count_NP
  <fct>        <fct>             <fct>                         <fct>        <fct>          <fct>       <int>               <fct>     <int>
1 KBF.xml      S conv            a quite ferocious mess        quite        preadjectival  ferocious   3                   mess      1
2 AT1.xml      W biography       quite a flirty person         quite        predeterminer  flirty      2                   person    2
3 A7F.xml      W misc            a rather anonymous name       rather       preadjectival  anonymous   4                   name      1
4 ECD.xml      W commerce        a rather precarious foothold  rather       preadjectival  precarious  4                   foothold  2
5 B2E.xml      W biography       quite a restless night        quite        predeterminer  restless    2                   night     1
6 AM4.xml      W misc            a rather different turn       rather       preadjectival  different   3                   turn      1
[35]:
# build a data frame
corpus <- c("BNC", "COCA", "Hansard", "Strathy")
size <- c(100, 450, 1600, 50)
variety <- c("GB", "US", "GB", "CA")
period <- c("1980s-1993", "1990-2012", "1803-2005", "1970s-2000s")
df.manual <- data.frame(size, variety, period, row.names = corpus)
df.manual
A data.frame: 4 × 3
         size   variety  period
         <dbl>  <fct>    <fct>
BNC       100   GB       1980s-1993
COCA      450   US       1990-2012
Hansard  1600   GB       1803-2005
Strathy    50   CA       1970s-2000s
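The <fct> column types in the output above reflect the behaviour of R before version 4.0.0, where data.frame() converted strings to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the same call yields character columns. A small sketch that sets the argument explicitly to reproduce the output shown, then accesses the data in a few common ways; the selection of operations is illustrative.

```r
corpus  <- c("BNC", "COCA", "Hansard", "Strathy")
size    <- c(100, 450, 1600, 50)
variety <- c("GB", "US", "GB", "CA")
period  <- c("1980s-1993", "1990-2012", "1803-2005", "1970s-2000s")

# stringsAsFactors = TRUE is set explicitly so the result matches the <fct> output
df.manual <- data.frame(size, variety, period, row.names = corpus,
                        stringsAsFactors = TRUE)

sizes       <- df.manual$size                       # one column as a vector
coca_period <- df.manual["COCA", "period"]          # one cell by row name and column name
gb          <- df.manual[df.manual$variety == "GB", ]  # rows matching a condition
mean_size   <- mean(df.manual$size)                 # summary of a numeric column

print(gb)
print(mean_size)
```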
[36]:
# save and read R data (RDS)
# saveRDS(df.manual.2, file="/pathtoyour/workingdirectory/df.manual.2.rds")
# readRDS("/pathtoyour/workingdirectory/df.manual.2.rds") # Mac
[37]:
# for loop example
for (i in letters[1:5]) print(i)
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[38]:
# if / else if / else example
# (the vector is named integers to avoid masking the base function integer())
integers <- c(-7, -4, -1, 0, 3, 12, 14)
for (i in seq_along(integers)) {
  if (integers[i] < 0) print(paste(integers[i], "-> negative")) # first if statement
  else if (integers[i] == 0) print(paste(integers[i], "-> zero")) # second (nested) if statement
  else print(paste(integers[i], "-> positive")) # everything else is positive
}
[1] "-7 -> negative"
[1] "-4 -> negative"
[1] "-1 -> negative"
[1] "0 -> zero"
[1] "3 -> positive"
[1] "12 -> positive"
[1] "14 -> positive"
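The loop above tests one element at a time. The same classification can also be written without an explicit loop, since comparison operators and ifelse() work on whole vectors. A minimal sketch of that alternative, offered as one possible idiom rather than as the notebook's own method:

```r
integers <- c(-7, -4, -1, 0, 3, 12, 14)

# Nested ifelse() mirrors the if / else if / else chain, applied to the whole vector
labels <- ifelse(integers < 0, "negative",
                 ifelse(integers == 0, "zero", "positive"))

# paste() is vectorized too, so one call builds all seven strings
result <- paste(integers, "->", labels)
print(result)
```

The vectorized form returns a character vector that can be stored or added to a data frame, whereas the loop only prints.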

https://corpling.modyco.fr/workshops/M2TAL/1.R.fundamentals.html#2_R_scripts