Hands-On Activity: 2. Exploring The Semi-Structured Data Model of JSON

This document provides instructions for exploring the structure of JSON data through hands-on activities. The activities guide the user to: 1) Display the nested structure of a JSON file using a schema viewer. 2) Extract specific data values from fields within the JSON file by running a script that prompts for the file name, tweet number, and path of the desired field.

Uploaded by

HGE05

Hands-on Activity 2: Exploring the Semi-structured Data Model of JSON

Learning Goals:

By the end of this activity, you will be able to:

1. Display the nested structure of a JSON file.


2. Extract data from a JSON file.

Instructions:

Step 1. Open a terminal shell. Click the square black box at the top left of the screen to open a
terminal shell.

Run cd Downloads/lect4data/json to change into the directory containing the JSON file.

Step 2. Look at JSON file. Let's look at the contents of the JSON file:
more twitter.json
Press the spacebar to go down and q to quit more.

The contents of the file are difficult to understand since they are packed together.
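
To see why the packed form is hard to read, it can help to re-indent a small piece of JSON. The following is a minimal sketch using Python's standard json module. The sample string is a hypothetical, heavily shortened tweet and is not taken from twitter.json; only the field names text, entities, hashtags, and symbols come from this activity.

```python
import json

def pretty(raw: str) -> str:
    """Re-indent a compact JSON string so that its nesting is visible."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True)

# Hypothetical sample, NOT real data from twitter.json.
sample = '{"text":"hello","entities":{"hashtags":[],"symbols":[]}}'
print(pretty(sample))
```

Each level of nesting is indented by two spaces, so nested fields such as hashtags appear visibly underneath entities.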

Step 3. View JSON schema. We can view the schema of the JSON file by running json_schema.py:
./json_schema.py twitter.json | more

The top-level fields are contributors, truncated, text, etc. Some fields have nested fields, such as
entities, which contains symbols, media, hashtags, etc. If you go down (press spacebar), you will see
multiple levels of nesting.

Enter q to quit more.
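
The idea of a schema view can be sketched in a few lines of Python. This is not the source of json_schema.py, which is not shown in this activity. It is an illustrative sketch that assumes a schema listing is just the field names, indented by nesting depth. The function name schema_lines and the sample tweet are hypothetical.

```python
import json

def schema_lines(value, depth=0):
    """Yield one indented line per field name, recursing into nested objects."""
    if isinstance(value, dict):
        for key in sorted(value):
            yield "  " * depth + key
            yield from schema_lines(value[key], depth + 1)
    elif isinstance(value, list):
        # Assumption: the first element is representative of the whole list.
        for item in value[:1]:
            yield from schema_lines(item, depth)

# Hypothetical, shortened tweet; the values are made up.
tweet = json.loads(
    '{"text": "hi", "truncated": false, '
    '"entities": {"hashtags": [], "symbols": []}}'
)
print("\n".join(schema_lines(tweet)))
```

Fields that hold plain values (strings, numbers, booleans) produce a single line, while fields that hold objects are followed by their nested fields at a deeper indent.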

Step 4. Extract values in JSON data. We can extract individual values from fields within the JSON
data by running print_json.py:
./print_json.py
The print_json.py script asks for the file name, tweet number, and path to extract. The path is the
path to the field in the schema.
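
Following a path into nested JSON can be sketched as below. This is not the code of print_json.py, which is not shown here. It is a minimal illustration that assumes nested field names are joined with "/", as in retweeted_status/retweet_count. The function name extract and the sample values are hypothetical.

```python
import json

def extract(tweet, path):
    """Follow a slash-separated path of field names into a parsed tweet."""
    value = tweet
    for field in path.split("/"):
        value = value[field]  # raises KeyError if the field is absent
    return value

# Hypothetical tweet; the values are made up and not from twitter.json.
tweet = json.loads(
    '{"text": "example", "retweeted_status": {"retweet_count": 3}}'
)
print(extract(tweet, "text"))
print(extract(tweet, "retweeted_status/retweet_count"))
```

A top-level field needs only its own name as the path, while a nested field needs the name of every enclosing field, in order from the outermost to the innermost.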

Let's look at the value for the text field in the 99th tweet. First, enter twitter.json for the filename:

Next, enter 99 for the number:

Next, enter text for the path:

Note: you may remember the field text from the schema:

The result is:

Now let's find the value for retweeted_status/retweet_count in the 99th tweet. The retweet_count
field is nested in the retweeted_status field, so we enter retweeted_status/retweet_count for the
path:
