Interrogation

AnonymousBystander
Fire of Insight
United Kingdom 3 awards
Joined 28th Sep 2018
Forum Posts: 229

When I look at my poems section, I see the title of each poem and its number of reads. How can I scrape this data?

I'm looking to build a bar chart (using R) of the poems and their number of reads.

Can anyone help?

AnonymousBystander
Fire of Insight
United Kingdom 3 awards
Joined 28th Sep 2018
Forum Posts: 229

For those familiar with R, here's my solution:

NoPoemsRead <- function(x){
 poems <- readLines(x)
 ## Extract out the title lines
 
 poem_title <- poems[grep("<h2", poems)[-1]]
 
 ## Extract out the number of reads lines
 
 poem_reads <- poems[grep("reads</small", poems)]
 
 ## De-clutter work space
 
 rm(poems)
 
 ## tidy up the data
 poem_reads <- unlist(lapply(poem_reads, function(x) as.numeric(gsub("[^0-9]+","",x))))
 
 ## the two lines below could be done in one
 poem_title <- unlist(lapply(poem_title, function(x) gsub("\t\t\t\t\t\t\t\t\t\t<.*?>", "", x)))
 poem_title <- unlist(lapply(poem_title, function(x) gsub("<.*?>", "", x)))
 
 names(poem_reads) <- poem_title
 
 barplot(poem_reads, las = 2,cex.names = 0.75, main = "The Number of Poems Read",
         ylab="Number of Reads", col = "skyblue")
}

NoPoemsRead("https://deepundergroundpoetry.com/poems-by/AnonymousBystander/")
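
To try the extraction steps without fetching the page, here is a minimal sketch that runs the same grep/gsub idea on a few made-up lines. The sample markup (sample_lines) is hypothetical and only imitates what the function above appears to expect; the real page's HTML may differ. It also uses the pattern "<[^>]+>" and trimws() in place of the two title-cleaning calls, which is a swap on my part rather than the original method.

```r
# Hypothetical sample lines imitating the page markup (an assumption,
# not copied from the real site)
sample_lines <- c(
  "<h2>Poems by Someone</h2>",
  "\t\t<h2><a href=\"/poem/first\">First Poem</a></h2>",
  "\t\t<small>120 reads</small>",
  "\t\t<h2><a href=\"/poem/second\">Second Poem</a></h2>",
  "\t\t<small>45 reads</small>"
)

# Keep every <h2> line except the first one (the page heading)
title_lines <- sample_lines[grep("<h2", sample_lines)[-1]]

# Keep the lines that carry the read counts
reads_lines <- sample_lines[grep("reads</small", sample_lines)]

# gsub() is vectorised, so it can be applied to the whole vector at once:
# strip the tags, then trim the leading tabs
poem_title <- trimws(gsub("<[^>]+>", "", title_lines))

# Drop everything that is not a digit and convert to numbers
poem_reads <- as.numeric(gsub("[^0-9]+", "", reads_lines))

# Named numeric vector, ready to pass to barplot()
names(poem_reads) <- poem_title
print(poem_reads)
```

Note that the digit-stripping step assumes the reads line contains no other digits (for example in a link); if it does, the numbers would run together.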

AnonymousBystander
Fire of Insight
United Kingdom 3 awards
Joined 28th Sep 2018
Forum Posts: 229


Here's what my code produces ...
