1. R Fundamentals

1.1. General programming elements

[4]:

# install.packages(c("Hmisc", "FactoMineR"))

[39]:

# example function: report x as a percentage of y
percentage <- function(x, y) {
    result <- (x * 100) / y
    rounded_res <- round(result, 2)
    cat(x, "is", rounded_res, "percent of", y, "\n")
}

[6]:

percentage(24,256)

24 is 9.38 percent of 256
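Because percentage() above prints with cat(), its result cannot be stored and reused in a later computation. A minimal sketch of a variant that returns the rounded number instead; the name percentage_value and the digits argument are assumptions made for illustration, not part of the original notebook:

```r
# Hypothetical variant of percentage(): returns the value instead of printing it
percentage_value <- function(x, y, digits = 2) {
  round((x * 100) / y, digits)
}

# The returned number can be stored and reused
p <- percentage_value(1, 3)
p_sum <- p + percentage_value(1, 4)
print(p)
print(p_sum)
```

Returning the value keeps computation and display separate: call print() or cat() on the result only where output is needed.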

[9]:

nchar("a b") # returns 3: the space counts as a character

[10]:

load_v <- scan(file="https://bit.ly/3mlXPoY", what="char", sep="\n")
load_v

'This is an example character vector.'
'This is another example character vector.'
'This is yet another example character vector.'

[29]:

length(load_v)
str(load_v)
typeof(load_v)

 chr [1:3] "This is an example character vector." ...

'character'
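length() counts the elements of a vector, while nchar() counts the characters inside each element. A short sketch of the difference, rebuilding the three strings shown in the output above by hand so that it runs without network access; the name example_v is an assumption:

```r
# Local copy of the three strings displayed above (hypothetical name example_v)
example_v <- c("This is an example character vector.",
               "This is another example character vector.",
               "This is yet another example character vector.")

n_elements <- length(example_v) # number of elements in the vector
n_chars <- nchar(example_v)     # number of characters in each element
print(n_elements)
print(n_chars)
```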

[31]:

# matrix
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, 3, 3, dimnames=list(row_names, col_names))
mat

A matrix: 3 × 3 of type dbl
	BNC	COCA	GloWbE
each + N	30708	141012	535316
every + N	28481	181127	857726
each and every + N	140	907	14529
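Once a matrix has dimnames, cells, rows, and columns can be selected by name rather than by position. A minimal sketch that rebuilds the same matrix and queries it; the name col_totals is an assumption:

```r
# Rebuild the matrix from the cell above (values are filled column by column)
values <- c(30708, 28481, 140, 141012, 181127, 907, 535316, 857726, 14529)
row_names <- c("each + N", "every + N", "each and every + N")
col_names <- c("BNC", "COCA", "GloWbE")
mat <- matrix(values, nrow = 3, ncol = 3, dimnames = list(row_names, col_names))

cell <- mat["every + N", "COCA"] # one cell, selected by row and column name
bnc <- mat[, "BNC"]              # one column, returned as a named vector
col_totals <- colSums(mat)       # one total per corpus
print(cell)
print(bnc)
print(col_totals)
```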

[34]:

# read a data frame from a tab-separated file
df <- read.table("https://bit.ly/2HzfquJ", header=TRUE, sep="\t")
head(df)

A data.frame: 6 × 9
	corpus_file	corpus_file_info	match	intensifier	construction	adjective	syllable_count_adj	NP	syllable_count_NP
	<fct>	<fct>	<fct>	<fct>	<fct>	<fct>	<int>	<fct>	<int>
1	KBF.xml	S conv	a quite ferocious mess	quite	preadjectival	ferocious	3	mess	1
2	AT1.xml	W biography	quite a flirty person	quite	predeterminer	flirty	2	person	2
3	A7F.xml	W misc	a rather anonymous name	rather	preadjectival	anonymous	4	name	1
4	ECD.xml	W commerce	a rather precarious foothold	rather	preadjectival	precarious	4	foothold	2
5	B2E.xml	W biography	quite a restless night	quite	predeterminer	restless	2	night	1
6	AM4.xml	W misc	a rather different turn	rather	preadjectival	different	3	turn	1

[35]:

# build a data frame by hand
corpus <- c("BNC", "COCA", "Hansard", "Strathy")
size <- c(100, 450, 1600, 50)
variety <- c("GB", "US", "GB", "CA")
period <- c("1980s-1993", "1990-2012", "1803-2005", "1970s-2000s")
df.manual <- data.frame(size, variety, period, row.names = corpus)
df.manual

A data.frame: 4 × 3
	size	variety	period
	<dbl>	<fct>	<fct>
BNC	100	GB	1980s-1993
COCA	450	US	1990-2012
Hansard	1600	GB	1803-2005
Strathy	50	CA	1970s-2000s

[36]:

# save and read an R object (RDS); replace the path with your own working directory
# saveRDS(df.manual, file="/pathtoyour/workingdirectory/df.manual.rds")
# df.manual <- readRDS("/pathtoyour/workingdirectory/df.manual.rds") # Mac
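The commented lines above depend on a path that exists only on your machine. A runnable sketch of the same round trip, assuming a temporary file created with tempfile() in place of a working directory; the names df.example, rds_path, and df.restored are hypothetical:

```r
# Small data frame used only for this illustration
df.example <- data.frame(size = c(100, 450),
                         variety = c("GB", "US"),
                         row.names = c("BNC", "COCA"))

rds_path <- tempfile(fileext = ".rds") # temporary file, removed when the session ends
saveRDS(df.example, file = rds_path)   # write a single R object to disk
df.restored <- readRDS(rds_path)       # read it back; the result must be assigned to a name

same_object <- identical(df.example, df.restored)
print(same_object)
```

Unlike save() and load(), saveRDS() stores one object without its name, so readRDS() lets you choose the name under which it is restored.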

[37]:

# example: for loop
for (i in letters[1:5]) print(i)

[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"

[38]:

# example: if / else if / else
integers <- c(-7, -4, -1, 0, 3, 12, 14)
for (i in seq_along(integers)) {
  if (integers[i] < 0) print(paste(integers[i], "-> negative")) # first test
  else if (integers[i] == 0) print(paste(integers[i], "-> zero")) # second test, reached only if the first fails
  else print(paste(integers[i], "-> positive")) # fallback when both tests fail
}

[1] "-7 -> negative"
[1] "-4 -> negative"
[1] "-1 -> negative"
[1] "0 -> zero"
[1] "3 -> positive"
[1] "12 -> positive"
[1] "14 -> positive"
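The loop above tests one element at a time. Since comparison operators in R work on whole vectors, the same labels can be produced without a loop. A minimal sketch using nested ifelse() calls; the names sign_labels and result are assumptions:

```r
integers <- c(-7, -4, -1, 0, 3, 12, 14)

# ifelse() evaluates the condition for every element at once
sign_labels <- ifelse(integers < 0, "negative",
                      ifelse(integers == 0, "zero", "positive"))

result <- paste(integers, "->", sign_labels)
print(result)
```

The output is a single character vector of seven strings rather than seven separate printed lines.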

1.2. References

https://corpling.modyco.fr/workshops/M2TAL/1.R.fundamentals.html#2_R_scripts