1. R Fundamentals
1.1. General Programming Elements
[4]:
# install.packages(c("Hmisc", "FactoMineR"))
[39]:
# example function: report x as a percentage of y
percentage <- function(x, y, ...) {
  result <- (x * 100) / y           # raw percentage
  rounded_res <- round(result, 2)   # keep two decimal places
  cat(x, "is", rounded_res, "percent of", y, "\n")
}
[6]:
percentage(24,256)
24 is 9.38 percent of 256
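The function above prints its result with cat(), so the value cannot be stored or reused. A minimal sketch of a variant that returns the rounded number instead; the name percentage_value and the digits argument are assumptions introduced here for illustration, not part of the original notebook.

```r
# Hypothetical variant of percentage(): returns the value instead of printing it
percentage_value <- function(x, y, digits = 2) {
  stopifnot(is.numeric(x), is.numeric(y))  # fail early on non-numeric input
  round((x * 100) / y, digits)
}

res <- percentage_value(24, 256)          # stored in a variable for later use
res
percentage_value(c(24, 128), 256)         # vectorised: works on several values at once
```

Because the result is returned rather than printed, it can be passed on to other functions such as paste() or sum().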
[9]:
nchar("a b")
3
[10]:
load_v <- scan(file="https://bit.ly/3mlXPoY", what="char", sep="\n")
load_v
- 'This is an example character vector.'
- 'This is another example character vector.'
- 'This is yet another example character vector.'
[29]:
length(load_v)
str(load_v)
typeof(load_v)
3
chr [1:3] "This is an example character vector." ...
'character'
[31]:
# matrix
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, 3, 3, dimnames=list(row_names, col_names))
mat
| | BNC | COCA | GloWbE |
|---|---|---|---|
| each + N | 30708 | 141012 | 535316 |
| every + N | 28481 | 181127 | 857726 |
| each and every + N | 140 | 907 | 14529 |
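Since the matrix carries row and column names, its cells can be addressed by name rather than by position. The sketch below rebuilds the same matrix so that it runs on its own; the choice of cells and the use of colSums() are illustrative only.

```r
# Same matrix as above, rebuilt so that this sketch is self-contained
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, 3, 3, dimnames = list(row_names, col_names))

mat["each + N", "COCA"]   # one cell, selected by row and column name
mat[, "BNC"]              # a whole column, returned as a named vector
colSums(mat)              # column totals, one per corpus
```

Note that matrix() fills column by column by default, which is why the first three values end up in the BNC column.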
[34]:
# read a data frame
df <- read.table("https://bit.ly/2HzfquJ", header=TRUE, sep="\t")
head(df)
| | corpus_file | corpus_file_info | match | intensifier | construction | adjective | syllable_count_adj | NP | syllable_count_NP |
|---|---|---|---|---|---|---|---|---|---|
| | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> | <int> |
| 1 | KBF.xml | S conv | a quite ferocious mess | quite | preadjectival | ferocious | 3 | mess | 1 |
| 2 | AT1.xml | W biography | quite a flirty person | quite | predeterminer | flirty | 2 | person | 2 |
| 3 | A7F.xml | W misc | a rather anonymous name | rather | preadjectival | anonymous | 4 | name | 1 |
| 4 | ECD.xml | W commerce | a rather precarious foothold | rather | preadjectival | precarious | 4 | foothold | 2 |
| 5 | B2E.xml | W biography | quite a restless night | quite | predeterminer | restless | 2 | night | 1 |
| 6 | AM4.xml | W misc | a rather different turn | rather | preadjectival | different | 3 | turn | 1 |
[35]:
# build a data frame
corpus <- c("BNC", "COCA", "Hansard", "Strathy")
size <- c(100, 450, 1600, 50)
variety <- c("GB", "US", "GB", "CA")
period <- c("1980s-1993", "1990-2012", "1803-2005", "1970s-2000s")
df.manual <- data.frame(size, variety, period, row.names = corpus)
df.manual
| | size | variety | period |
|---|---|---|---|
| | <dbl> | <fct> | <fct> |
| BNC | 100 | GB | 1980s-1993 |
| COCA | 450 | US | 1990-2012 |
| Hansard | 1600 | GB | 1803-2005 |
| Strathy | 50 | CA | 1970s-2000s |
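Once the data frame exists, columns are reached with `$` and rows can be filtered with a logical condition. A small sketch, rebuilding the data frame from the cell above so that it runs on its own; the object name gb is an assumption made for this example.

```r
# Same data frame as above, rebuilt so that this sketch is self-contained
corpus <- c("BNC", "COCA", "Hansard", "Strathy")
size <- c(100, 450, 1600, 50)
variety <- c("GB", "US", "GB", "CA")
period <- c("1980s-1993", "1990-2012", "1803-2005", "1970s-2000s")
df.manual <- data.frame(size, variety, period, row.names = corpus)

df.manual$size                                 # one column as a vector
gb <- df.manual[df.manual$variety == "GB", ]   # keep only the rows where variety is "GB"
rownames(gb)
df.manual["COCA", "period"]                    # one cell, by row name and column name
```

The comparison works the same way whether variety is stored as a factor or as a character vector.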
[36]:
# save and read R data (RDS)
# saveRDS(df.manual.2, file="/pathtoyour/workingdirectory/df.manual.2.rds")
# readRDS("/pathtoyour/workingdirectory/df.manual.2.rds") # Mac
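The commented-out calls above depend on a path that must be adapted to each machine. A runnable sketch of the same save-and-restore round trip, assuming a temporary file is acceptable for demonstration; df.example, rds_path and df.restored are hypothetical names introduced here.

```r
# Small illustrative data frame (hypothetical stand-in for the object to save)
df.example <- data.frame(size = c(100, 450),
                         variety = c("GB", "US"),
                         row.names = c("BNC", "COCA"))

rds_path <- tempfile(fileext = ".rds")   # temporary path, valid on any platform
saveRDS(df.example, file = rds_path)     # write a single R object to disk
df.restored <- readRDS(rds_path)         # read it back; the result must be assigned

all.equal(df.example, df.restored)       # check that the round trip preserved the data
unlink(rds_path)                         # remove the temporary file
```

Unlike load(), readRDS() does not restore the object under its original name, so the result has to be assigned to a variable.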
[37]:
# example: for loop
for (i in letters[1:5]) print(i)
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[38]:
# example: if / else if / else
numbers <- c(-7, -4, -1, 0, 3, 12, 14)
for (i in seq_along(numbers)) {
  if (numbers[i] < 0) {
    print(paste(numbers[i], "-> negative"))   # first condition
  } else if (numbers[i] == 0) {
    print(paste(numbers[i], "-> zero"))       # second condition, tested only if the first fails
  } else {
    print(paste(numbers[i], "-> positive"))   # fallback when neither condition holds
  }
}
[1] "-7 -> negative"
[1] "-4 -> negative"
[1] "-1 -> negative"
[1] "0 -> zero"
[1] "3 -> positive"
[1] "12 -> positive"
[1] "14 -> positive"
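The loop above tests one element at a time. As a sketch of an alternative, the same classification can be written without an explicit loop using the vectorised ifelse(); the names numbers and labels are assumptions for this example.

```r
numbers <- c(-7, -4, -1, 0, 3, 12, 14)

# Nested ifelse(): the inner call handles every value that is not negative
labels <- ifelse(numbers < 0, "negative",
                 ifelse(numbers == 0, "zero", "positive"))

paste(numbers, "->", labels)   # one string per element, as in the loop version
```

This form returns a character vector that can be stored or added as a column, whereas the loop only prints.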
https://corpling.modyco.fr/workshops/M2TAL/1.R.fundamentals.html#2_R_scripts