readtext can also recurse through subdirectories. In our example, the folder txt/movie_reviews contains two subfolders (called neg and pos). We can load all texts included in both folders.

library(stringi)

# Make some text with page numbers
sample_text_a <- "The quick brown fox named Seamus jumps over the lazy dog also named Seamus, 
page 1 
with the newspaper from a boy named quick Seamus, in his mouth.
page 2
The quicker brown fox jumped over 2 lazy dogs."
sample_text_a
# "The quick brown fox named Seamus jumps over the lazy dog also named Seamus, \npage 1 \nwith the newspaper from a boy named quick Seamus, in his mouth.\npage 2\nThe quicker brown fox jumped over 2 lazy dogs."

# Remove "page" and respective digit
sample_text_a2 <- unlist(stri_split_fixed(sample_text_a, '\n'), use.names = FALSE)
sample_text_a2 <- stri_replace_all_regex(sample_text_a2, "page \\d*", "")
sample_text_a2 <- stri_trim_both(sample_text_a2)
# Drop the lines left empty once the page markers are gone
sample_text_a2 <- sample_text_a2[sample_text_a2 != '']
stri_paste(sample_text_a2, collapse = '\n')
# "The quick brown fox named Seamus jumps over the lazy dog also named Seamus,\nwith the newspaper from a boy named quick Seamus, in his mouth.\nThe quicker brown fox jumped over 2 lazy dogs."

In the second example we remove page numbers which have the format “- X -”.

sample_text_b <- "The quick brown fox named Seamus 
- 1 - 
jumps over the lazy dog also named Seamus, with 
- 2 - 
the newspaper from a boy named quick Seamus, in his mouth. 
- 33 - 
The quicker brown fox jumped over 2 lazy dogs."
sample_text_b
# "The quick brown fox named Seamus \n- 1 - \njumps over the lazy dog also named Seamus, with \n- 2 - \nthe newspaper from a boy named quick Seamus, in his mouth. \n- 33 - \nThe quicker brown fox jumped over 2 lazy dogs."
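The removal step for the “- X -” format is not shown in the text, so here is a minimal sketch of one way it could be done with stringi. It assumes each page marker sits on its own line, and that the third marker in the sample reads "- 33 -". The names `lines_b`, `keep` and `sample_text_b2` are introduced here for illustration only.

```r
library(stringi)

# Sample text rebuilt with explicit \n escapes
# (assumption: the third page marker is "- 33 -")
sample_text_b <- paste0(
  "The quick brown fox named Seamus \n- 1 - \n",
  "jumps over the lazy dog also named Seamus, with \n- 2 - \n",
  "the newspaper from a boy named quick Seamus, in his mouth. \n- 33 - \n",
  "The quicker brown fox jumped over 2 lazy dogs."
)

# Split into lines so each page marker becomes its own element
lines_b <- unlist(stri_split_fixed(sample_text_b, "\n"), use.names = FALSE)

# Trim first so the pattern can be anchored to the whole line
lines_b <- stri_trim_both(lines_b)

# Keep only lines that are NOT of the form "- <digits> -"
keep <- !stri_detect_regex(lines_b, "^-\\s*\\d+\\s*-$")

# Reassemble the remaining lines into one string
sample_text_b2 <- stri_paste(lines_b[keep], collapse = "\n")
sample_text_b2
```

Anchoring the pattern with `^` and `$` is a deliberate choice here: it removes only lines that consist entirely of a page marker, so a dash-number-dash sequence inside a sentence would be left alone.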
# Manifestos with docvars from filenames
readtext(
  paste0(DATA_DIR, "/txt/EU_manifestos/*.txt"),
  docvarsfrom = "filenames",
  docvarnames = c("unit", "context", "year", "language", "party"),
  dvsep = "_",
  encoding = "ISO-8859-1")
# readtext object consisting of 17 documents and 5 docvars.
# Description: df
#   doc_id                  text    unit context year language party
# 1 EU_euro_2004_de_PSE.txt "\"PES
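To make the filename convention concrete, the following base R sketch splits one filename on the "_" separator and pairs the pieces with the docvar names used above. It only illustrates the idea and is not readtext's internal implementation; readtext may, for example, convert the resulting values to other types. The variables `filename`, `stem`, `parts` and `docvars` are hypothetical names.

```r
# One filename from the manifesto folder, as it appears in the output above
filename <- "EU_euro_2004_de_PSE.txt"
docvarnames <- c("unit", "context", "year", "language", "party")

# Drop the file extension, then split the stem on the separator
stem <- tools::file_path_sans_ext(filename)
parts <- strsplit(stem, "_", fixed = TRUE)[[1]]

# Pair each piece with its docvar name (all values stay character here)
docvars <- setNames(as.list(parts), docvarnames)
str(docvars)
```

The split only lines up with the names when every filename has exactly as many "_"-separated pieces as there are entries in `docvarnames`, which is worth checking before loading a whole folder.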