Interrogation
AnonymousBystander
Forum Posts: 229
Fire of Insight
3
Joined 28th Sep 2018 Forum Posts: 229
When I look at my poems section I see the title of the poem and the number of reads ... How can I scrape this data?
I'm looking to build a bar chart (using R) of the poems and their number of reads ...
Help anyone?
AnonymousBystander
For those familiar with the R software environment, here's my solution:
NoPoemsRead <- function(x){
  ## Read the page source, one element per line of HTML
  poems <- readLines(x, warn = FALSE)
  ## Extract out the title lines ([-1] drops the first "<h2" match)
  poem_title <- poems[grep("<h2", poems)[-1]]
  ## Extract out the number of reads lines
  poem_reads <- poems[grep("reads</small", poems)]
  ## De-clutter work space
  rm(poems)
  ## Tidy up the data: keep only the digits, then convert to numeric
  poem_reads <- as.numeric(gsub("[^0-9]+", "", poem_reads))
  ## Strip the HTML tags, then the leading tabs and any trailing whitespace
  poem_title <- trimws(gsub("<.*?>", "", poem_title))
  names(poem_reads) <- poem_title
  barplot(poem_reads, las = 2, cex.names = 0.75, main = "The Number of Poems Read",
          ylab = "Number of Reads", col = "skyblue")
}
NoPoemsRead("https://deepundergroundpoetry.com/poems-by/AnonymousBystander/")
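If you want to check the parsing without hitting the site, the extraction step can be pulled out of the function above and run on lines of HTML already in memory. What follows is only a rough sketch: parse_poem_reads is a made-up helper name, and the sample markup is invented to resemble the patterns the code greps for ("<h2" and "reads</small"), so the real page source may well differ. It also sorts the counts so the most-read poems come first.

```r
## Hypothetical helper (not part of the original solution): parse read
## counts from a character vector of HTML lines instead of a URL.
parse_poem_reads <- function(html_lines) {
  ## Title lines, skipping the first "<h2" match as the code above does
  title_lines <- html_lines[grep("<h2", html_lines)[-1]]
  ## Lines holding the read counts
  reads_lines <- html_lines[grep("reads</small", html_lines)]
  ## Stop early if titles and counts do not pair up one-to-one
  stopifnot(length(title_lines) == length(reads_lines))
  ## Keep only the digits, then convert to numeric
  reads <- as.numeric(gsub("[^0-9]+", "", reads_lines))
  ## Strip the tags and surrounding whitespace to get plain titles
  names(reads) <- trimws(gsub("<.*?>", "", title_lines))
  ## Most-read poems first
  sort(reads, decreasing = TRUE)
}

## Invented sample markup, only to exercise the function
sample_html <- c(
  "<h2>Poems by SomeUser</h2>",
  "\t\t<h2><a href=\"/poem/a\">First Poem</a></h2>",
  "\t\t<small>120 reads</small>",
  "\t\t<h2><a href=\"/poem/b\">Second Poem</a></h2>",
  "\t\t<small>345 reads</small>"
)
reads <- parse_poem_reads(sample_html)
print(reads)
```

For the live page, the same helper could be fed readLines(url), and the result passed to barplot() with the same arguments as before.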
AnonymousBystander
Here's what my code produces ...