Web scraping and API commented out for first round · Issue #64 · UBC-DSCI/introduction-to-datascience-python · GitHub

Web scraping and API commented out for first round #64

Closed
trevorcampbell opened this issue Dec 19, 2022 · 2 comments · Fixed by #264
Labels
1st edition Planned for inclusion in 1st print edition enhancement New feature or request

Comments

@trevorcampbell
Contributor
trevorcampbell commented Dec 19, 2022

I've commented out web scraping and API use in #49, since the material is optional and we need to think more carefully about at least the web scraping part (e.g. scrapy instead of beautifulsoup).

I vote we leave this out for the first round of the course, since we have much higher priority items to handle at the moment. We should of course add it back in once we have the more important material sorted out, though.

I've also commented out the learning objectives; remember to uncomment when we reintroduce this.

@trevorcampbell
Contributor Author

For when we reintroduce this, some comments left over from an earlier review:

  • R & Py - Is the selector gadget better than built-in inspectors?
  • Format craigslist HTML with correct syntax highlighting
  • Pandas read_html page FileNotFoundError, explain "droplevels"
    • I think we should also build up this method more to highlight its immense convenience, and not say that it is "fantastic" to read via beautiful soup.
  • I think scrapy is both more powerful and intuitive than beautiful soup
  • Twitter images broken
  • Introduce the print function and for loops before using them for tweepy
  • Another data file not found for the tweets
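On the "droplevels" item above: the pandas method is spelled `droplevel`, and it is typically needed because `read_html` can return a table whose column header has more than one level. The following is only a minimal sketch of what such an explanation might demonstrate. The table is built by hand with made-up city names and values (not data from the book) so that it runs without a network connection or an HTML parser.

```python
import pandas as pd

# Hypothetical two-level header, similar in shape to what pd.read_html
# can return for an HTML table whose header spans two rows.
columns = pd.MultiIndex.from_tuples(
    [("Population", "City"), ("Population", "2020"), ("Population", "2021")]
)

# Made-up illustrative values, not real data.
df = pd.DataFrame(
    [["Vancouver", 100, 110], ["Kelowna", 50, 55]],
    columns=columns,
)

# Drop the outer (level 0) header so only the inner labels remain.
flat = df.droplevel(0, axis=1)
print(list(flat.columns))
```

After the call, the columns are plain labels rather than tuples, so a column can be selected with `flat["City"]` instead of `df[("Population", "City")]`.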

@trevorcampbell trevorcampbell added the 1st edition Planned for inclusion in 1st print edition label Sep 17, 2023
@trevorcampbell
Contributor Author

See also UBC-DSCI/introduction-to-datascience#487 -- we should move away from Twitter at this point.
