extended 006 with metadata example and printf styling · hydroffice/python_basics@5aba2bc

Commit 5aba2bc

author
giumas
committed
extended 006 with metadata example and printf styling
1 parent a766597 commit 5aba2bc

File tree

1 file changed

+340
-1
lines changed

006_Dictionaries.ipynb

Lines changed: 340 additions & 1 deletion
@@ -206,6 +206,341 @@
206206
"***"
207207
]
208208
},
209+
{
210+
"cell_type": "markdown",
211+
"metadata": {},
212+
"source": [
213+
"## A `dict` as a Metadata Container"
214+
]
215+
},
216+
{
217+
"cell_type": "markdown",
218+
"metadata": {},
219+
"source": [
220+
"We will now explore the use of a `dict` as a [metadata](https://en.wikipedia.org/wiki/Metadata) container. \n",
221+
"\n",
222+
"Descriptive metadata describes a resource for purposes such as data discovery and identification. Following our previous examples of experiments collecting water salinity and temperature values, we will use a `dict` to store metadata such as:\n",
223+
"\n",
224+
"- The author of the measurements (`\"first_name\"` and `\"last_name\"`).\n",
225+
"- The location where the measurements took place (`\"latitude\"` and `\"longitude\"`).\n",
226+
"- The time frame when the measurements happened (`\"start_timestamp\"` and `\"end_timestamp\"`)."
227+
]
228+
},
229+
{
230+
"cell_type": "markdown",
231+
"metadata": {},
232+
"source": [
233+
"Thus, a complete set of metadata will be represented by a `dict` containing the following six keys (with the corresponding value type):\n",
234+
"\n",
235+
"- `\"first_name\"` ➜ `str` type\n",
236+
"- `\"last_name\"` ➜ `str` type\n",
237+
"- `\"latitude\"` ➜ `float` type\n",
238+
"- `\"longitude\"` ➜ `float` type\n",
239+
"- `\"start_timestamp\"` ➜ `datetime` type\n",
240+
"- `\"end_timestamp\"` ➜ `datetime` type"
241+
]
242+
},
243+
{
244+
"cell_type": "markdown",
245+
"metadata": {},
246+
"source": [
247+
"This is the first time that we use the [`datetime`](https://docs.python.org/3.6/library/datetime.html?#module-datetime) type! "
248+
]
249+
},
250+
{
251+
"cell_type": "markdown",
252+
"metadata": {},
253+
"source": [
254+
"<img align=\"left\" width=\"6%\" style=\"padding-right:10px;\" src=\"images/key.png\">\n",
255+
"\n",
256+
"A variable of `datetime` type contains all the information from both a date and a time."
257+
]
258+
},
259+
{
260+
"cell_type": "markdown",
261+
"metadata": {},
262+
"source": [
263+
"As you can read in the [Python documentation](https://docs.python.org/3.6/library/datetime.html?#datetime-objects), the `datetime` constructor is part of the `datetime` module (yes, they both have the same name!) and takes several parameters: \n",
264+
"\n",
265+
"- `datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)` "
266+
]
267+
},
268+
{
269+
"cell_type": "markdown",
270+
"metadata": {},
271+
"source": [
272+
"For the purposes of this notebook, you can ignore all the parameters after the first six. In fact, we will call the `datetime` constructor with only six values (from `year` to `second`)."
273+
]
274+
},
275+
{
276+
"cell_type": "code",
277+
"execution_count": null,
278+
"metadata": {},
279+
"outputs": [],
280+
"source": [
281+
"from datetime import datetime\n",
282+
"\n",
283+
"begin_timestamp = datetime(2019, 2, 22, 12, 32, 40)\n",
284+
"print(str(begin_timestamp))"
285+
]
286+
},
287+
{
288+
"cell_type": "markdown",
289+
"metadata": {},
290+
"source": [
291+
"**How can the above code actually work?** It works because the parameters after the first 3 (e.g., `hour=0`) have a **default value** assigned to them (the `=0` in this specific example). This implies that, if you do *not* pass values for those parameters, Python will assign them those defined default values. "
292+
]
293+
},
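Review note: the default-value behaviour described in the cell above can be checked with a minimal sketch like the following. The variable names (`midnight`, `explicit`, `noon`) are illustrative only and are not part of the notebook.

```python
from datetime import datetime

# Only the three required parameters are passed:
# hour, minute and second fall back to their default value (0).
midnight = datetime(2019, 2, 22)

# The same moment, with the default values spelled out explicitly.
explicit = datetime(2019, 2, 22, 0, 0, 0)

# An optional parameter can also be passed by name,
# leaving the remaining ones at their defaults.
noon = datetime(2019, 2, 22, hour=12)

print(midnight == explicit)  # True
print(noon)                  # 2019-02-22 12:00:00
```

Only `year`, `month` and `day` are mandatory; omitting any of them raises a `TypeError`.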
294+
{
295+
"cell_type": "markdown",
296+
"metadata": {},
297+
"source": [
298+
"We can now write our `metadata`:"
299+
]
300+
},
301+
{
302+
"cell_type": "code",
303+
"execution_count": null,
304+
"metadata": {},
305+
"outputs": [],
306+
"source": [
307+
"metadata = dict()\n",
308+
"metadata[\"first_name\"] = \"John\"\n",
309+
"metadata[\"last_name\"] = \"Doe\"\n",
310+
"metadata[\"latitude\"] = 43.135555\n",
311+
"metadata[\"longitude\"] = -70.939534\n",
312+
"metadata[\"start_timestamp\"] = datetime(2019, 2, 22, 12, 32, 40)\n",
313+
"metadata[\"end_timestamp\"] = datetime(2019, 2, 22, 12, 34, 14)\n",
314+
"\n",
315+
"print(metadata)"
316+
]
317+
},
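Review note: as a possible follow-up to the cell above, the same metadata could be written as a single `dict` literal, and the two timestamps could be combined to obtain the duration of the measurements. This is only a sketch; the `duration` variable is an assumption and does not appear in the notebook.

```python
from datetime import datetime

# Same keys and values as the cell above, written as one dict literal.
metadata = {
    "first_name": "John",
    "last_name": "Doe",
    "latitude": 43.135555,
    "longitude": -70.939534,
    "start_timestamp": datetime(2019, 2, 22, 12, 32, 40),
    "end_timestamp": datetime(2019, 2, 22, 12, 34, 14),
}

# Subtracting two datetime values gives a timedelta object.
duration = metadata["end_timestamp"] - metadata["start_timestamp"]

print(len(metadata))             # 6
print(duration.total_seconds())  # 94.0
```

Storing the timestamps as `datetime` values (rather than as strings) is what makes this kind of arithmetic possible.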
318+
{
319+
"cell_type": "markdown",
320+
"metadata": {},
321+
"source": [
322+
" "
323+
]
324+
},
325+
{
326+
"cell_type": "markdown",
327+
"metadata": {
328+
"solution2": "hidden",
329+
"solution2_first": true
330+
},
331+
"source": [
332+
"<img align=\"left\" width=\"6%\" style=\"padding-right:10px;\" src=\"images/test.png\">\n",
333+
"\n",
334+
"Populate and print a `metadata` dictionary containing the following three keys: your `\"username\"`, the `\"begin_time\"` and the `\"end_time\"` for the execution of this exercise."
335+
]
336+
},
337+
{
338+
"cell_type": "code",
339+
"execution_count": null,
340+
"metadata": {
341+
"solution2": "hidden"
342+
},
343+
"outputs": [],
344+
"source": [
345+
"metadata = dict()\n",
346+
"metadata[\"username\"] = \"jdoe\"\n",
347+
"metadata[\"begin_time\"] = datetime(2019, 2, 22, 12, 34, 20)\n",
348+
"metadata[\"end_time\"] = datetime(2019, 2, 22, 12, 34, 21)\n",
349+
"\n",
350+
"print(metadata)"
351+
]
352+
},
353+
{
354+
"cell_type": "code",
355+
"execution_count": null,
356+
"metadata": {},
357+
"outputs": [],
358+
"source": []
359+
},
360+
{
361+
"cell_type": "markdown",
362+
"metadata": {},
363+
"source": [
364+
"***"
365+
]
366+
},
367+
{
368+
"cell_type": "markdown",
369+
"metadata": {},
370+
"source": [
371+
"# More on String Formatting"
372+
]
373+
},
374+
{
375+
"cell_type": "markdown",
376+
"metadata": {},
377+
"source": [
378+
"In the last section of this notebook, we will explore the mechanisms that Python provides for **string formatting**, that is, for controlling how a value is printed."
379+
]
380+
},
381+
{
382+
"cell_type": "markdown",
383+
"metadata": {},
384+
"source": [
385+
"At this point, you know how to print a value of `str` type:"
386+
]
387+
},
388+
{
389+
"cell_type": "code",
390+
"execution_count": 28,
391+
"metadata": {},
392+
"outputs": [
393+
{
394+
"name": "stdout",
395+
"output_type": "stream",
396+
"text": [
397+
"The first name is: John\n"
398+
]
399+
}
400+
],
401+
"source": [
402+
"metadata = dict()\n",
403+
"metadata[\"first_name\"] = \"John\"\n",
404+
"\n",
405+
"print(\"The first name is: \" + metadata[\"first_name\"])"
406+
]
407+
},
408+
{
409+
"cell_type": "markdown",
410+
"metadata": {},
411+
"source": [
412+
"You also know that you can convert values of other types to `str` using `str()`:"
413+
]
414+
},
415+
{
416+
"cell_type": "code",
417+
"execution_count": 29,
418+
"metadata": {},
419+
"outputs": [
420+
{
421+
"name": "stdout",
422+
"output_type": "stream",
423+
"text": [
424+
"The position is: 43.135555, -70.939534\n",
425+
"Start time: 2019-02-22 12:32:40\n"
426+
]
427+
}
428+
],
429+
"source": [
430+
"metadata = dict()\n",
431+
"metadata[\"latitude\"] = 43.135555\n",
432+
"metadata[\"longitude\"] = -70.939534\n",
433+
"metadata[\"start_timestamp\"] = datetime(2019, 2, 22, 12, 32, 40)\n",
434+
"\n",
435+
"print(\"The position is: \" + str(metadata[\"latitude\"]) + \", \" + str(metadata[\"longitude\"]))\n",
436+
"print(\"Start time: \" + str(metadata[\"start_timestamp\"]))"
437+
]
438+
},
439+
{
440+
"cell_type": "markdown",
441+
"metadata": {},
442+
"source": [
443+
"It is possible to achieve the same results by using the `%` (modulo) operator, as in the following examples:"
444+
]
445+
},
446+
{
447+
"cell_type": "code",
448+
"execution_count": 33,
449+
"metadata": {},
450+
"outputs": [
451+
{
452+
"name": "stdout",
453+
"output_type": "stream",
454+
"text": [
455+
"The first name is: John\n"
456+
]
457+
}
458+
],
459+
"source": [
460+
"metadata = dict()\n",
461+
"metadata[\"first_name\"] = \"John\"\n",
462+
"\n",
463+
"print(\"The first name is: %s\" % (metadata[\"first_name\"], ))"
464+
]
465+
},
466+
{
467+
"cell_type": "code",
468+
"execution_count": 37,
469+
"metadata": {},
470+
"outputs": [
471+
{
472+
"name": "stdout",
473+
"output_type": "stream",
474+
"text": [
475+
"The position is: 43.135558, -70.939534\n",
476+
"Start time: 2019-02-22 12:32:40\n"
477+
]
478+
}
479+
],
480+
"source": [
481+
"metadata = dict()\n",
482+
"metadata[\"latitude\"] = 43.135558\n",
483+
"metadata[\"longitude\"] = -70.939534\n",
484+
"metadata[\"start_timestamp\"] = datetime(2019, 2, 22, 12, 32, 40)\n",
485+
"\n",
486+
"print(\"The position is: %s, %s\" % (metadata[\"latitude\"], metadata[\"longitude\"]))\n",
487+
"print(\"Start time: %s\" % (metadata[\"start_timestamp\"], ))"
488+
]
489+
},
490+
{
491+
"cell_type": "markdown",
492+
"metadata": {},
493+
"source": [
494+
"If you look at the above examples, you will notice the presence of `%s` placeholders in the string. The string is followed by a `%` operator, then by one or more values enclosed in parentheses."
495+
]
496+
},
497+
{
498+
"cell_type": "markdown",
499+
"metadata": {},
500+
"source": [
501+
"<img align=\"left\" width=\"6%\" style=\"padding-right:10px;\" src=\"images/info.png\">\n",
502+
"\n",
503+
"In the above code, the values after the `%` operator that are inside the parentheses create a [`tuple`](https://docs.python.org/3.6/library/stdtypes.html?#tuples). <br>\n",
504+
"A `tuple` is a Python container that represents an immutable sequence (thus, you cannot change the content of a `tuple`)."
505+
]
506+
},
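Review note: the two properties of a `tuple` relied on above (the trailing comma for a single value, and immutability) can be illustrated with a short sketch. The variable names are hypothetical and only serve this example.

```python
first_name = "John"

# A one-element tuple needs the trailing comma...
single = (first_name, )
# ...without it, the parentheses are just grouping and the result is a str.
not_a_tuple = (first_name)

print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>

# A tuple is immutable: assigning to one of its items raises a TypeError.
try:
    single[0] = "Jane"
except TypeError as error:
    print("Cannot modify a tuple:", error)
```

This is why the notebook writes `(metadata["first_name"], )` with a trailing comma when only one value follows the `%` operator.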
507+
{
508+
"cell_type": "markdown",
509+
"metadata": {},
510+
"source": [
511+
"String formatting using the `%` operator provides [additional printing options](https://docs.python.org/3.6/library/stdtypes.html#printf-style-string-formatting). Among them, you can decide how many decimal digits will be printed for a `float` value. \n",
512+
"\n",
513+
"For instance, by using `%.4f` as a placeholder, Python will print the value rounded to four decimal digits: "
514+
]
515+
},
516+
{
517+
"cell_type": "code",
518+
"execution_count": 41,
519+
"metadata": {},
520+
"outputs": [
521+
{
522+
"name": "stdout",
523+
"output_type": "stream",
524+
"text": [
525+
"The position is: 43.1356, -70.9395\n"
526+
]
527+
}
528+
],
529+
"source": [
530+
"metadata = dict()\n",
531+
"metadata[\"latitude\"] = 43.135558\n",
532+
"metadata[\"longitude\"] = -70.939534\n",
533+
"\n",
534+
"print(\"The position is: %.4f, %.4f\" % (metadata[\"latitude\"], metadata[\"longitude\"]))"
535+
]
536+
},
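Review note: a small sketch, using the same latitude value as the cell above, suggests that the precision in `%.Nf` rounds the value rather than cutting it off. The minimum-width form `%10.4f` is one of the additional printf-style options and is not used in the notebook.

```python
latitude = 43.135558

# The number after the dot is the count of decimal digits to keep.
two_digits = "%.2f" % latitude
four_digits = "%.4f" % latitude

print(two_digits)   # 43.14
print(four_digits)  # 43.1356 (rounded, not truncated to 43.1355)

# A minimum field width can be added before the dot:
# the result is padded with spaces on the left up to 10 characters.
padded = "%10.4f" % latitude
print(repr(padded))  # '   43.1356'
```

The original `float` stored in the variable is not changed; only its printed representation is affected.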
537+
{
538+
"cell_type": "markdown",
539+
"metadata": {},
540+
"source": [
541+
"***"
542+
]
543+
},
209544
{
210545
"cell_type": "markdown",
211546
"metadata": {},
@@ -222,7 +557,11 @@
222557
"* [The official Python 3.6 documentation](https://docs.python.org/3.6/index.html)\n",
223558
" * [Glossary](https://docs.python.org/3.6/glossary.html)\n",
224559
" * [Mapping Types - dict](https://docs.python.org/3.6/library/stdtypes.html#mapping-types-dict)\n",
225-
" * [Collections - OrderedDict](https://docs.python.org/3.6/library/collections.html?highlight=ordereddict#ordereddict-objects)"
560+
" * [Collections - OrderedDict](https://docs.python.org/3.6/library/collections.html?highlight=ordereddict#ordereddict-objects)\n",
561+
" * [`datetime`](https://docs.python.org/3.6/library/datetime.html?#module-datetime) \n",
562+
" * [`tuple`](https://docs.python.org/3.6/library/stdtypes.html?#tuples)\n",
563+
"* [Hash function](https://en.wikipedia.org/wiki/Hash_function)\n",
564+
"* [Metadata](https://en.wikipedia.org/wiki/Metadata)"
226565
]
227566
},
228567
{
