Changes for new version of tidyr (map, unnest) · codingbooks/tidy-text-mining@d6e9099 · GitHub

Commit d6e9099

Author: Julia Silge
Changes for new version of tidyr (map, unnest)
1 parent a77b52f commit d6e9099

1 file changed: 4 additions and 3 deletions

05-document-term-matrices.Rmd

Lines changed: 4 additions & 3 deletions
@@ -310,7 +310,8 @@ Each of the items in the `corpus` list column is a `WebCorpus` object, which is

 ```{r stock_tokens, dependson = "stock_articles"}
 stock_tokens <- stock_articles %>%
-  unnest(map(corpus, tidy)) %>%
+  mutate(corpus = map(corpus, tidy)) %>%
+  unnest(cols = (corpus)) %>%
   unnest_tokens(word, text) %>%
   select(company, datetimestamp, word, id, heading)

@@ -319,7 +320,7 @@ stock_tokens

 Here we see some of each article's metadata alongside the words used. We could use tf-idf to determine which words were most specific to each stock symbol.

-```{r}
+```{r stocktfidfdata, dependson="stock_tokens"}
 library(stringr)

 stock_tf_idf <- stock_tokens %>%
@@ -331,7 +332,7 @@ stock_tf_idf <- stock_tokens %>%

 The top terms for each are visualized in Figure \@ref(fig:stocktfidf). As we'd expect, the company's name and symbol are typically included, but so are several of their product offerings and executives, as well as companies they are making deals with (such as Disney with Netflix).

-```{r stocktfidf, dependson = "stock_tf_idf", echo = FALSE, fig.cap = "The 8 words with the highest tf-idf in recent articles specific to each company", fig.height = 8, fig.width = 8}
+```{r stocktfidf, dependson = "stocktfidfdata", echo = FALSE, fig.cap = "The 8 words with the highest tf-idf in recent articles specific to each company", fig.height = 8, fig.width = 8}
 stock_tf_idf %>%
   group_by(company) %>%
   top_n(8, tf_idf) %>%
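
For readers who cannot run the `stock_articles` chunk, the `mutate()` + `unnest(cols = ...)` pattern introduced by this commit can be tried on toy data. The sketch below is only an illustration, not code from the book: the `articles` tibble and the `to_tibble()` helper are hypothetical stand-ins for `stock_articles` and `tidy()`, and it assumes tidyr 1.0.0 or later, where (as far as I understand the release notes) `unnest()` expects the list column to be named through `cols` rather than computed inline.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical stand-in for stock_articles: one row per company, with a
# list column whose elements still need converting to data frames.
articles <- tibble(
  company = c("A", "B"),
  corpus = list(
    c("first text", "second text"),
    c("third text")
  )
)

# Hypothetical stand-in for tidy(): turns one list element into a tibble.
to_tibble <- function(x) {
  tibble(id = seq_along(x), text = x)
}

# Step 1: convert each list element in place with mutate() + map().
# Step 2: unnest the converted column, naming it explicitly via `cols`.
unnested <- articles %>%
  mutate(corpus = map(corpus, to_tibble)) %>%
  unnest(cols = corpus)

unnested
```

The result has one row per element of each company's list entry, with the `company` value repeated alongside the `id` and `text` columns produced by the helper. Splitting the conversion and the unnesting into two steps mirrors the change made in the `stock_tokens` chunk.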
