02-sentiment-analysis.Rmd
Lines changed: 7 additions & 7 deletions
@@ -113,8 +113,8 @@ Now we can plot these sentiment scores across the plot trajectory of each novel.
 ```{r sentimentplot, dependson = "janeaustensentiment", fig.width=9, fig.height=10, fig.cap="Sentiment through the narratives of Jane Austen's novels"}
 library(ggplot2)
 
-ggplot(janeaustensentiment, aes(index, sentiment, fill = book)) +
03-tf-idf.Rmd
Lines changed: 23 additions & 20 deletions
@@ -50,8 +50,8 @@ There is one row in this `book_words` data frame for each word-book combination;
 ```{r plottf, dependson = "book_words", fig.height=9, fig.width=9, fig.cap="Term Frequency Distribution in Jane Austen's Novels"}
 library(ggplot2)
 
-ggplot(book_words, aes(n/total, fill = book)) +
-  geom_histogram(show.legend = FALSE) +
+ggplot(book_words, aes(n/total)) +
+  geom_histogram() +
   xlim(NA, 0.0009) +
   facet_wrap(~book, ncol = 2, scales = "free_y")
 ```
@@ -79,9 +79,9 @@ freq_by_rank
 
 The `rank` column here tells us the rank of each word within the frequency table; the table was already ordered by `n` so we could use `row_number()` to find the rank. Then, we can calculate the term frequency in the same way we did before. Zipf's law is often visualized by plotting rank on the x-axis and term frequency on the y-axis, on logarithmic scales. Plotting this way, an inversely proportional relationship will have a constant, negative slope.
 
-```{r zipf, dependson = "freq_by_rank", fig.width=7, fig.height=5, fig.cap="Zipf's law for Jane Austen's novels"}
+```{r zipf, dependson = "freq_by_rank", fig.width=6, fig.height=5, fig.cap="Zipf's law for Jane Austen's novels"}
 freq_by_rank %>%
-  ggplot(aes(rank, `term frequency`, color = book)) +
+  ggplot(aes(rank, `term frequency`, group = book)) +
   geom_line(size = 1.2, alpha = 0.8) +
   scale_x_log10() +
   scale_y_log10()
@@ -102,9 +102,9 @@ Classic versions of Zipf's law have