Cheat Sheet: APIs and Data Collection
Package/Method Description Code Example

Accessing element attribute
Description: Access the value of a specific attribute of an HTML element.
Syntax:
attribute = element["attribute"]
Example:
href = link_element["href"]
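
As a minimal sketch of attribute access, the snippet below parses a small inline HTML string. The markup and the `title` lookup are illustrative assumptions, not part of the original sheet. It also uses `Tag.get()`, which returns `None` instead of raising `KeyError` when an attribute is absent.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
html = '<a class="link" href="https://example.com/docs">Docs</a>'
soup = BeautifulSoup(html, "html.parser")
link_element = soup.find("a")

# Square brackets raise KeyError if the attribute is missing.
href = link_element["href"]

# .get() returns None (or a supplied default) when the attribute is missing.
title = link_element.get("title")

# Multi-valued attributes such as class come back as a list.
classes = link_element["class"]

print(href, title, classes)
```

Square brackets suit attributes that must be present; `.get()` is the safer choice when scraping pages whose markup varies.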

BeautifulSoup()
Description: Parse the HTML content of a web page using BeautifulSoup. The parser type can vary based on the project.
Syntax:
soup = BeautifulSoup(html, "html.parser")
Example:
html = requests.get("https://api.example.com/data").text
soup = BeautifulSoup(html, "html.parser")
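
BeautifulSoup parses markup, not a URL, so the page text has to be obtained first. The sketch below skips the download and parses an inline string (an assumed stand-in for a fetched page) so that it runs without network access.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the text of a downloaded page.
html = "<html><head><title>Sample Page</title></head><body><p>Hello</p></body></html>"

# "html.parser" is the parser bundled with Python; other parsers are optional installs.
soup = BeautifulSoup(html, "html.parser")

# The parsed tree can be navigated by tag name.
page_title = soup.title.string
first_paragraph = soup.p.text

print(page_title, first_paragraph)
```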

delete()
Description: Send a DELETE request to remove data or a resource from the server. DELETE requests delete a specified resource on the server.
Syntax:
response = requests.delete(url)
Example:
response = requests.delete("https://api.example.com/delete")

find()
Description: Find the first HTML element that matches the specified tag and attributes.
Syntax:
element = soup.find(tag, attrs)
Example:
first_link = soup.find("a", {"class": "link"})

find_all()
Description: Find all HTML elements that match the specified tag and attributes.
Syntax:
elements = soup.find_all(tag, attrs)
Example:
all_links = soup.find_all("a", {"class": "link"})
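
The difference between `find()` and `find_all()` is easiest to see side by side. The following sketch uses a made-up list of links: `find()` returns the first match or `None`, while `find_all()` returns a list that may be empty.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
html = """
<ul>
  <li><a class="link" href="/first">First</a></li>
  <li><a class="link" href="/second">Second</a></li>
  <li><a href="/plain">Plain</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# First matching element, or None when nothing matches.
first_link = soup.find("a", {"class": "link"})
missing = soup.find("table")

# Every matching element, as a list.
all_links = soup.find_all("a", {"class": "link"})
hrefs = [link["href"] for link in all_links]

print(first_link["href"], hrefs, missing)
```

Because `find()` can return `None`, it is worth checking the result before reading attributes from it.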

findChildren()
Description: Find all child elements of an HTML element.
Syntax:
children = element.findChildren()
Example:
child_elements = parent_div.findChildren()
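
One detail worth knowing: in Beautiful Soup 4, `findChildren()` is a legacy alias of `find_all()`, so by default it returns nested descendants as well as direct children. The sketch below, on assumed sample markup, shows `recursive=False` limiting the search to direct children.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first paragraph contains a nested <b> tag.
html = "<div id='parent'><p>One <b>bold</b></p><p>Two</p></div>"
soup = BeautifulSoup(html, "html.parser")
parent_div = soup.find("div")

# Default behaviour: every descendant tag, at any depth.
all_descendants = [tag.name for tag in parent_div.findChildren()]

# recursive=False keeps only the direct children.
direct_children = [tag.name for tag in parent_div.findChildren(recursive=False)]

print(all_descendants, direct_children)
```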

get()
Description: Perform a GET request to retrieve data from a specified URL. GET requests are typically used for reading data from an API. The response variable will contain the server's response, which you can process further.
Syntax:
response = requests.get(url)
Example:
response = requests.get("https://api.example.com/data")
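
A GET request can fail in several ways (bad URL, timeout, HTTP error status), so real code usually wraps it. The helper below is a hedged sketch; the name `fetch_text` and the 10-second timeout are assumptions chosen for illustration.

```python
import requests

def fetch_text(url, timeout=10):
    """Return the response body as text, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx and 5xx status codes.
        response.raise_for_status()
    except requests.RequestException as error:
        # RequestException is the base class for errors raised by requests.
        print(f"Request failed: {error}")
        return None
    return response.text

# A URL without a scheme is rejected before any network traffic is sent.
result = fetch_text("not-a-url")
print(result)
```

With a reachable endpoint, `fetch_text("https://api.example.com/data")` would return the body as a string instead of `None`.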

Headers
Description: Include custom headers in the request. Headers can provide additional information to the server, such as authentication tokens or content types.
Syntax:
headers = {"HeaderName": "Value"}
Example:
base_url = "https://api.example.com/data"
headers = {"Authorization": "Bearer YOUR_TOKEN"}
response = requests.get(base_url, headers=headers)
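
When the same header (for example, an authorization token) is needed on every call, one option is to set it once on a `requests.Session`. The sketch below only prepares a request, without sending it, to show how session-level and per-request headers are merged. The `Accept` header is an added assumption.

```python
import requests

session = requests.Session()
# Headers set here are attached to every request made through this session.
session.headers.update({"Authorization": "Bearer YOUR_TOKEN"})

# Per-request headers are merged with, and take precedence over, session headers.
request = requests.Request(
    "GET",
    "https://api.example.com/data",
    headers={"Accept": "application/json"},
)
prepared = session.prepare_request(request)

print(prepared.headers["Authorization"])
print(prepared.headers["Accept"])
```

Calling `session.get(url)` would send the request with the same merged headers.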

Import Libraries
Description: Import the necessary Python libraries for web scraping.
Syntax:
from bs4 import BeautifulSoup

json()
Description: Parse JSON data from the response. This extracts and works with the data returned by the API. The response.json() method converts the JSON response into a Python data structure (usually a dictionary or list).
Syntax:
data = response.json()
Example:
response = requests.get("https://api.example.com/data")
data = response.json()
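
Once the JSON is parsed, the result is ordinary Python data. The sketch below uses `json.loads` on a hard-coded string as a stand-in for `response.json()`, so it runs offline. The payload shape (`page`, `items`, `name`) is hypothetical.

```python
import json

# Stand-in for the body of an API response; with a live API,
# data = response.json() would give the same kind of structure.
sample_body = '{"page": 1, "items": [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]}'
data = json.loads(sample_body)

# JSON objects become dicts and JSON arrays become lists.
names = [item["name"] for item in data["items"]]

# dict.get() avoids a KeyError when an optional field is absent.
next_page = data.get("next_page")

print(data["page"], names, next_page)
```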

find_next_sibling()
Description: Find the next sibling element in the DOM.
Syntax:
sibling = element.find_next_sibling()
Example:
next_sibling = current_element.find_next_sibling()

parent
Description: Access the parent element in the Document Object Model (DOM).
Syntax:
parent = element.parent
Example:
parent_div = paragraph.parent
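
The two navigation entries above (sibling and parent) can be sketched together on a small assumed document. The example also shows that the `.next_sibling` attribute may return a whitespace text node, whereas the `find_next_sibling()` method skips ahead to the next tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
html = """
<div id="intro">
  <p>First paragraph</p>
  <p>Second paragraph</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
paragraph = soup.find("p")

# Move up one level in the tree.
parent_div = paragraph.parent

# Next sibling that is a tag.
next_paragraph = paragraph.find_next_sibling()

# The raw attribute returns the very next node, here the whitespace between tags.
raw_sibling = paragraph.next_sibling

# The last paragraph has no following sibling tag.
after_last = next_paragraph.find_next_sibling()

print(parent_div["id"], next_paragraph.text, repr(raw_sibling), after_last)
```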

post()
Description: Send a POST request to a specified URL with data. POST requests create or update resources on the server. The data parameter contains the data to send to the server, often in JSON format.
Syntax:
response = requests.post(url, data)
Example:
response = requests.post("https://api.example.com/submit", data={"key": "value"})
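
Passing a dictionary through `data=` sends it form-encoded. To send a JSON body, requests also accepts a `json=` keyword. The sketch below prepares both variants without sending them so the difference can be inspected offline. The URL and payload are the placeholder values from the example above.

```python
import json
import requests

url = "https://api.example.com/submit"
payload = {"key": "value"}

# data= with a dict produces a form-encoded body.
form_request = requests.Request("POST", url, data=payload).prepare()

# json= serializes the dict to JSON and sets the Content-Type header.
json_request = requests.Request("POST", url, json=payload).prepare()

print(form_request.headers["Content-Type"], form_request.body)
print(json_request.headers["Content-Type"], json.loads(json_request.body))
```

Which form to use depends on what the API expects; many JSON APIs reject form-encoded bodies.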

put()
Description: Send a PUT request to update data on the server. PUT requests are used to update an existing resource on the server with the data provided in the data parameter, typically in JSON format.
Syntax:
response = requests.put(url, data)
Example:
response = requests.put("https://api.example.com/update", data={"key": "value"})

Query parameters
Description: Pass query parameters in the URL to filter or customize the request. Query parameters specify conditions or limits for the requested data.
Syntax:
params = {"param_name": "value"}
Example:
base_url = "https://api.example.com/data"
params = {"page": 1, "per_page": 10}
response = requests.get(base_url, params=params)
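
To see what the `params` dictionary does to the URL, the request can be prepared without being sent. This is a small offline sketch using the same placeholder URL and parameters as the example above.

```python
import requests

base_url = "https://api.example.com/data"
params = {"page": 1, "per_page": 10}

# Preparing the request encodes the params into the query string.
prepared = requests.Request("GET", base_url, params=params).prepare()

print(prepared.url)
```

Letting requests build the query string also takes care of URL-encoding special characters in parameter values.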

select()
Description: Select HTML elements from the parsed HTML using a CSS selector.
Syntax:
element = soup.select(selector)
Example:
titles = soup.select("h1")
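
`select()` returns a list of matches, and the related `select_one()` returns only the first match or `None`. The sketch below tries a few common selector forms (tag, class, and child combinator) on assumed sample markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
html = """
<div id="main" class="content">
  <h1>Title</h1>
  <p class="lead">Intro</p>
  <p>Body</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Tag selector: always returns a list, even for a single match.
titles = [tag.text for tag in soup.select("h1")]

# Class selector with select_one(): first match only.
lead = soup.select_one("p.lead")

# Child combinator: <p> tags directly inside the element with id="main".
paragraphs = soup.select("#main > p")

print(titles, lead.text, len(paragraphs))
```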

status_code
Description: Check the HTTP status code of the response. The HTTP status code indicates the result of the request (success, error, redirection). It can be used for error handling and decision-making in your code.
Syntax:
response.status_code
Example:
url = "https://api.example.com/data"
response = requests.get(url)
status_code = response.status_code
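
One way to act on the status code is to map it to its standard HTTP class. The helper below is an illustrative sketch; the function name `classify_status` and the returned labels are assumptions, while the numeric ranges follow the standard HTTP status classes.

```python
def classify_status(status_code):
    """Map an HTTP status code to a coarse category."""
    if 200 <= status_code < 300:
        return "success"
    if 300 <= status_code < 400:
        return "redirection"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "informational or unknown"

# Sample codes, used here in place of a live response.status_code.
for code in (200, 301, 404, 503):
    print(code, classify_status(code))
```

With a live response, the call would be `classify_status(response.status_code)`.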

Tag (tags for find() and find_all())
Description: Specify any valid HTML tag as the tag parameter to search for elements of that type. Here are some common HTML tags that you can use with the tag parameter.
Example:
- "a": Find anchor (<a>) tags.
- "p": Find paragraph (<p>) tags.
- "h1", "h2", "h3", "h4", "h5", "h6": Find heading tags from level 1 to 6 (<h1>, <h2>, and so on).
- "table": Find table (<table>) tags.
- "tr": Find table row (<tr>) tags.
- "td": Find table cell (<td>) tags.
- "th": Find table header cell (<th>) tags.
- "img": Find image (<img>) tags.
- "form": Find form (<form>) tags.
- "button": Find button (<button>) tags.
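
Several of the tags listed above are typically used together when scraping a table. The sketch below reads an assumed sample table into a header list and a list of rows.

```python
from bs4 import BeautifulSoup

# Hypothetical table used only for illustration.
html = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Header cells use the "th" tag.
column_names = [th.text for th in table.find_all("th")]

# Data cells use the "td" tag; the header row has none and is skipped.
rows = []
for tr in table.find_all("tr"):
    cells = [td.text for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

print(column_names, rows)
```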

text
Description: Retrieve the text content of an HTML element.
Syntax:
text = element.text
Example:
title_text = title_element.text
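
The `.text` attribute keeps whatever whitespace the markup contains, so scraped strings often need cleaning. The sketch below, on an assumed heading, compares `.text` with two common ways of tidying it.

```python
from bs4 import BeautifulSoup

# Hypothetical heading with padding and a nested tag.
html = "<h1>  Quarterly <em>Report</em>  </h1>"
soup = BeautifulSoup(html, "html.parser")
title_element = soup.find("h1")

# .text concatenates all nested text, whitespace included.
raw_text = title_element.text

# Option 1: strip the outer whitespace with a plain string method.
stripped = raw_text.strip()

# Option 2: get_text() strips each text fragment and joins them with a separator.
spaced = title_element.get_text(" ", strip=True)

print(repr(raw_text), repr(stripped), repr(spaced))
```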
© IBM Corporation. All rights reserved.