diff --git a/README.md b/README.md index 607f136..32615a6 100644 --- a/README.md +++ b/README.md @@ -1,15 +1,29 @@ # python Jupyter notebooks and datasets for the interesting pandas/python/data science video series. +# Contribution + +Feel free to contribute or suggest new ideas. To get in touch write on [mail](mailto:grouprivl@gmail.com?subject=[GitHub]%20Source%20Python). + +You can find nice guide about GitHub contribution: +* [Contributing to projects](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) +* [Step-by-step guide to contributing on GitHub](https://www.dataschool.io/how-to-contribute-on-github/) + # Who is this repo for? -For people who are interested in data science, data analysis and finding interesting relation for data. This repository is related to site: https://blog.softhints.com/tag/pandas/ where you can find more interesting videos. The youtube channel is: +For people who are interested in data science, data analysis and finding interesting insights for data. This repository is related to sites: +* [DataScientYst.com - Data Science Tutorials, Exercises, Guides, Videos with Python and Pandas](https://datascientyst.com/) +* [SoftHints.com - Python, Pandas, Linux, SQL Tutorials and Guides](https://softhints.com/) + +where you can find more interesting articles. + +New website dedicated to Pandas and Data Science was started: https://datascientyst.com/. It has better organization and covers topics in many areas. -https://www.youtube.com/channel/UCg5rvP_D735oSBatdcH5ZFA -# Popular Videos +The youtube channel is: -https://softhints.com/youtube-videos.html +* [SoftHints Youtube](https://www.youtube.com/@softhints/) +* [Popular Videos](https://www.youtube.com/@softhints/videos) # Latest Videos diff --git a/notebooks/Books/Think Python/Think_Python_Chapter_12__Tuples.ipynb b/notebooks/Books/Think Python/Think_Python_Chapter_12__Tuples.ipynb new file mode 100644 index 0000000..fc2e3da --- /dev/null +++ b/notebooks/Books/Think Python/Think_Python_Chapter_12__Tuples.ipynb @@ -0,0 +1,1497 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Chapter 12  Tuples\n", + "\n", + "\n", + "* 12.1  Tuples are immutable\n", + "* 12.2  Tuple assignment\n", + "* 12.3  Tuples as return values\n", + "* 12.4  Variable-length argument tuples\n", + "* 12.5  Lists and tuples\n", + "* 12.6  Dictionaries and tuples\n", + "* 12.7  Sequences of sequences" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.1 Tuples are immutable" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This chapter presents one more built-in type, the tuple, and then\n", + "shows how lists, dictionaries, and tuples work together.\n", + "I also present a useful feature for variable-length argument lists,\n", + "the gather and scatter operators." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "One note: there is no consensus on how to pronounce “tuple”. Some people say **“tuh-ple”**, which rhymes with “supple”. But in the context of programming, most people say **“too-ple”**, which rhymes with “quadruple”." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " A tuple is a sequence of values. The values can be any type, and\n", + "they are indexed by integers, so in that respect tuples are a lot\n", + "like lists. The important difference is that tuples are immutable.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Syntactically, a tuple is a comma-separated list of values:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = 'a', 'b', 'c', 'd', 'e'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Although it is not necessary, it is common to enclose tuples in\n", + "parentheses:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = ('a', 'b', 'c', 'd', 'e')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "To create a tuple with a single element, you have to include a final\n", + "comma:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t1 = 'a',\n", + "type(t1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "A value in parentheses is not a tuple:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t2 = ('a')\n", + "type(t2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Another way to create a tuple is the built-in function tuple.\n", + "With no argument, it creates an empty tuple:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = tuple()\n", + "t" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "If the argument is a sequence (string, list or tuple), the result\n", + "is a tuple with the elements of the sequence:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = tuple('lupins')\n", + "t" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Because tuple is the name of a built-in function, you should\n", + "avoid using it as a variable name." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Most list operators also work on tuples. The bracket operator\n", + "indexes an element:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = ('a', 'b', 'c', 'd', 'e')\n", + "t[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "And the slice operator selects a range of elements.\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t[1:3]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " But if you try to modify one of the elements of the tuple, you get\n", + "an error:\n", + "
\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t[0] = 'A'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Because tuples are immutable, you can’t modify the elements. But you\n", + "can replace one tuple with another:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = ('A',) + t[1:]\n", + "t" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "This statement makes a new tuple and then makes t refer to it." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The relational operators work with tuples and other sequences;\n", + "Python starts by comparing the first element from each\n", + "sequence. If they are equal, it goes on to the next elements,\n", + "and so on, until it finds elements that differ. Subsequent\n", + "elements are not considered (even if they are really big).\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "(0, 1, 2) < (0, 3, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "(0, 1, 2000000) < (0, 3, 4)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.2 Tuple assignment" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It is often useful to swap the values of two variables.\n", + "With conventional assignments, you have to use a temporary\n", + "variable. For example, to swap a and b:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = 4\n", + "b = 3\n", + "print(f'a: {a}, b: {b}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "temp = a\n", + "a = b\n", + "b = temp" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(f'a: {a}, b: {b}')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "

Bonus: Tower of Hanoi

\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "This solution is cumbersome; tuple assignment is more elegant:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a, b = b, a" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The left side is a tuple of variables; the right side is a tuple of\n", + "expressions. Each value is assigned to its respective variable. \n", + "All the expressions on the right side are evaluated before any\n", + "of the assignments." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "
\n", + " The number of variables on the left and the number of\n", + "values on the right have to be the same:\n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a, b = 1, 2, 3" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "More generally, the right side can be any kind of sequence\n", + "(string, list or tuple). For example, to split an email address\n", + "into a user name and a domain, you could write:\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "addr = 'monty@python.org'\n", + "uname, domain = addr.split('@')\n", + "uname, domain" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = ['Everest', 8849, 27.9881, 86.9250]\n", + "name, height, latitude, longitude = data\n", + "\n", + "print(name, height, latitude, longitude)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The return value from split is a list with two elements;\n", + "the first element is assigned to uname, the second to\n", + "domain." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "uname #'monty'\n", + "domain #'python.org'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.3 Tuples as return values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Strictly speaking, a function can only return one value, but\n", + "if the value is a tuple, the effect is the same as returning\n", + "multiple values. For example, if you want to divide two integers\n", + "and compute the quotient and remainder, it is inefficient to\n", + "compute x//y and then x%y. It is better to compute\n", + "them both at the same time.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The built-in function divmod takes two arguments and\n", + "returns a tuple of two values, the quotient and remainder.\n", + "You can store the result as a tuple:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = divmod(7, 3)\n", + "t" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Or use tuple assignment to store the elements separately:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "quot, rem = divmod(7, 3)\n", + "quot" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "rem" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Here is an example of a function that returns a tuple:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def min_max(t):\n", + " return min(t), max(t)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "max and min are built-in functions that find\n", + "the largest and smallest elements of a sequence. min_max\n", + "computes both and returns a tuple of two values.\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.4 Variable-length argument tuples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Functions can take a variable number of arguments. A parameter\n", + "name that begins with * gathers arguments into\n", + "a tuple. For example, printall\n", + "takes any number of arguments and prints them:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def printall(*args):\n", + " print(args)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The gather parameter can have any name you like, but args is\n", + "conventional. Here’s how the function works:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "printall(1, 2.0, '3','x')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "
\n", + " The complement of gather is scatter. If you have a\n", + "sequence of values and you want to pass it to a function\n", + "as multiple arguments, you can use the * operator.\n", + "For example, divmod takes exactly two arguments; it\n", + "doesn’t work with a tuple:\n", + "
\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = (7, 3)\n", + "divmod(t)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " But if you scatter the tuple, it works:\n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "divmod(*t)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Many of the built-in functions use\n", + "variable-length argument tuples. For example, max\n", + "and min can take any number of arguments:\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "max(1, 2, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "But sum does not.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sum(1, 2, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "As an exercise, write a function called sum_all that takes any number\n", + "of arguments and returns their sum." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.5 Lists and tuples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "zip is a built-in function that takes two or more sequences and\n", + "interleaves them. The name of the function refers to\n", + "a zipper, which interleaves two rows of teeth." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example zips a string and a list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "s = 'abc'\n", + "t = [0, 1, 2]\n", + "zip(s, t)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The result is a zip object that knows how to iterate through\n", + "the pairs. The most common use of zip is in a for loop:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for pair in zip(s, t):\n", + " print(pair)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "A zip object is a kind of iterator, which is any object\n", + "that iterates through a sequence. Iterators are similar to lists in some\n", + "ways, but unlike lists, you can’t use an index to select an element from\n", + "an iterator.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you want to use list operators and methods, you can\n", + "use a zip object to make a list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "zip(s, t)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "list(zip(s, t))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The result is a list of tuples; in this example, each tuple contains\n", + "a character from the string and the corresponding element from\n", + "the list.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the sequences are not the same length, the result has the\n", + "length of the shorter one." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "list(zip('Anne', 'Elk'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "You can use tuple assignment in a for loop to traverse a list of\n", + "tuples:\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = [('a', 0), ('b', 1), ('c', 2)]\n", + "for letter, number in t:\n", + " print(number, letter)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Each time through the loop, Python selects the next tuple in\n", + "the list and assigns the elements to letter and \n", + "number. The output of this loop is:\n", + "
\n", + "\n", + "0 a\n", + "\n", + "1 b\n", + "\n", + "2 c" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "If you combine zip, for and tuple assignment, you get a\n", + "useful idiom for traversing two (or more) sequences at the same\n", + "time. For example, has_match takes two sequences, t1 and\n", + "t2, and returns True if there is an index i\n", + "such that t1[i] == t2[i]:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def has_match(t1, t2):\n", + " for x, y in zip(t1, t2):\n", + " if x == y:\n", + " return True\n", + " return False" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "If you need to traverse the elements of a sequence and their\n", + "indices, you can use the built-in function enumerate:\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for index, element in enumerate('abc'):\n", + " print(index, element)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The result from enumerate is an enumerate object, which\n", + "iterates a sequence of pairs; each pair contains an index (starting\n", + "from 0) and an element from the given sequence.\n", + "In this example, the output is" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "0 a\n", + "1 b\n", + "2 c" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Again.\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.6 Dictionaries and tuples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "Dictionaries have a method called items that returns a sequence of\n", + "tuples, where each tuple is a key-value pair.\n", + "
\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "d = {'a':0, 'b':1, 'c':2}\n", + "t = d.items()\n", + "t" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The result is a dict_items object, which is an iterator that\n", + "iterates the key-value pairs. You can use it in a for loop\n", + "like this:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for key, value in d.items():\n", + " print(key, value)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "As you should expect from a dictionary, the items are in no\n", + "particular order." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "
\n", + "Going in the other direction, you can use a list of tuples to\n", + "initialize a new dictionary: \n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = [('a', 0), ('c', 2), ('b', 1)]\n", + "d = dict(t)\n", + "d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Combining dict with zip yields a concise way\n", + "to create a dictionary:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "d = dict(zip('abc', range(3)))\n", + "d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The dictionary method update also takes a list of tuples\n", + "and adds them, as key-value pairs, to an existing dictionary.\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It is common to use tuples as keys in dictionaries (primarily because\n", + "you can’t use lists). For example, a telephone directory might map\n", + "from last-name, first-name pairs to telephone numbers. Assuming\n", + "that we have defined last, first and number, we\n", + "could write:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "directory[last, first] = number" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "The expression in brackets is a tuple. We could use tuple\n", + "assignment to traverse this dictionary.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for last, first in directory:\n", + " print(first, last, directory[last,first])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "This loop traverses the keys in directory, which are tuples. It\n", + "assigns the elements of each tuple to last and first, then\n", + "prints the name and corresponding telephone number." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There are two ways to represent tuples in a state diagram. The more\n", + "detailed version shows the indices and elements just as they appear in\n", + "a list. For example, the tuple ('Cleese', 'John') would appear\n", + "as in Figure 12.1.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But in a larger diagram you might want to leave out the\n", + "details. For example, a diagram of the telephone directory might\n", + "appear as in Figure 12.2." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here the tuples are shown using Python syntax as a graphical\n", + "shorthand. The telephone number in the diagram is the complaints line\n", + "for the BBC, so please don’t call it." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.7 Sequences of sequences" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "I have focused on lists of tuples, but almost all of the examples in\n", + "this chapter also work with lists of lists, tuples of tuples, and\n", + "tuples of lists. To avoid enumerating the possible combinations, it\n", + "is sometimes easier to talk about sequences of sequences." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In many contexts, the different kinds of sequences (strings, lists and\n", + "tuples) can be used interchangeably. So how should you choose one\n", + "over the others?\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "To start with the obvious, strings are more limited than other\n", + "sequences because the elements have to be characters. They are\n", + "also immutable. If you need the ability to change the characters\n", + "in a string (as opposed to creating a new string), you might\n", + "want to use a list of characters instead.\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "
\n", + "Lists are more common than tuples, mostly because they are mutable.\n", + "But there are a few cases where you might prefer tuples:\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "Because tuples are immutable, they don’t provide methods like sort and reverse, which modify existing lists. But Python\n", + "provides the built-in function sorted, which takes any sequence\n", + "and returns a new list with the same elements in sorted order, and\n", + "reversed, which takes a sequence and returns an iterator that\n", + "traverses the list in reverse order.\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.8 Debugging" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lists, dictionaries and tuples are examples of data\n", + "structures; in this chapter we are starting to see compound data\n", + "structures, like lists of tuples, or dictionaries that contain tuples\n", + "as keys and lists as values. Compound data structures are useful, but\n", + "they are prone to what I call shape errors; that is, errors\n", + "caused when a data structure has the wrong type, size, or structure.\n", + "For example, if you are expecting a list with one integer and I\n", + "give you a plain old integer (not in a list), it won’t work.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To help debug these kinds of errors, I have written a module\n", + "called structshape that provides a function, also called\n", + "structshape, that takes any kind of data structure as\n", + "an argument and returns a string that summarizes its shape.\n", + "You can download it from http://thinkpython2.com/code/structshape.py" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here’s the result for a simple list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from structshape import structshape\n", + "t = [1, 2, 3]\n", + "structshape(t)\n", + "'list of 3 int'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "A fancier program might write “list of 3 ints”, but it\n", + "was easier not to deal with plurals. Here’s a list of lists:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t2 = [[1,2], [3,4], [5,6]]\n", + "structshape(t2)\n", + "'list of 3 list of 2 int'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "If the elements of the list are not the same type,\n", + "structshape groups them, in order, by type:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t3 = [1, 2, 3, 4.0, '5', '6', [7], [8], 9]\n", + "structshape(t3)\n", + "'list of (3 int, float, 2 str, 2 list of int, int)'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Here’s a list of tuples:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "s = 'abc'\n", + "lt = list(zip(t, s))\n", + "structshape(lt)\n", + "'list of 3 tuple of (int, str)'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "And here’s a dictionary with 3 items that map integers to strings." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "d = dict(lt) \n", + "structshape(d)\n", + "'dict of 3 int->str'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "If you are having trouble keeping track of your data structures,\n", + "structshape can help." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12.9 Glossary" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import ipytracer\n", + "from IPython.core.display import display\n", + "\n", + "def bubble_sort(unsorted_list):\n", + " x = ipytracer.ChartTracer(unsorted_list)\n", + " display(x)\n", + " length = len(x)-1\n", + " for i in range(length):\n", + " for j in range(length-i):\n", + " if x[j] > x[j+1]:\n", + " x[j], x[j+1] = x[j+1], x[j]\n", + " return x.tolist()\n", + "\n", + "bubble_sort([6,4,7,9])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import ipytracer\n", + "from IPython.core.display import display\n", + "\n", + "def bubble_sort(unsorted_list):\n", + " x = ipytracer.List1DTracer(unsorted_list)\n", + " display(x)\n", + " length = len(x)-1\n", + " for i in range(length):\n", + " for j in range(length-i):\n", + " if x[j] > x[j+1]:\n", + " x[j], x[j+1] = x[j+1], x[j]\n", + " print(unsorted_list) \n", + " return x.tolist()\n", + "\n", + "bubble_sort([6,4,7,9])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import ipytracer\n", + "from IPython.core.display import display\n", + "import re\n", + "\n", + " \n", + "def quick_sort(arr): \n", + " input_list = ipytracer.ChartTracer(arr)\n", + " display(input_list)\n", + "\n", + " def alphanum_key(key):\n", + " return [int(s) if s.isdigit() else s.lower() for s in re.split(\"([0-9]+)\", key)]\n", + "\n", + " return sorted(input_list, key=alphanum_key)\n", + "\n", + "\n", + "quick_sort(['6','4','7','9','3','5','1','8'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import random\n", + "def merge_sort(collectionx: list) -> list:\n", + " collectionx = ipytracer.List1DTracer(collectionx)\n", + " display(collectionx)\n", + " \n", + " for i in range(0, 8):\n", + " collectionx[i] = i\n", + " collectionx[i-1] = i-1\n", + " collectionx[i-2] = i*2\n", + "\n", + "\n", + "merge_sort([6,4,7,9,3,5,1,8,2])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def merge_sort(collection: list) -> list:\n", + "\n", + "\n", + " def merge(left: list, right: list) -> list:\n", + " \"\"\"merge left and right\n", + " :param left: left collection\n", + " :param right: right collection\n", + " :return: merge result\n", + " \"\"\"\n", + "\n", + " def _merge():\n", + " while left and right:\n", + " yield (left if left[0] <= right[0] else right).pop(0)\n", + " yield from left\n", + " yield from right\n", + "\n", + " return list(_merge())\n", + "\n", + " if len(collection) <= 1:\n", + " return collection\n", + " mid = len(collection) // 2\n", + " display(ipytracer.List1DTracer(collection))\n", + " left = merge_sort(collection[:mid])\n", + " right = merge_sort(collection[mid:])\n", + " x = merge(left, right)\n", + " display(x)\n", + " return merge(left, right)\n", + "\n", + "merge_sort([6,4,7,9,3,5,1,8,2])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def shell_sort(collection):\n", + " collection = ipytracer.List1DTracer(collection)\n", + " display(collection)\n", + " gaps = [701, 301, 132, 57, 23, 10, 4, 1]\n", + "\n", + " for gap in gaps:\n", + " for i in range(gap, len(collection)):\n", + " j = i\n", + " while j >= gap and collection[j] < collection[j - gap]:\n", + " collection[j], collection[j - gap] = collection[j - gap], collection[j]\n", + " j -= gap\n", + " return collection\n", + "\n", + "shell_sort([6,4,7,9,3,5,1,8,2])" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/notebooks/csv/data.csv.zip b/notebooks/csv/data.csv.zip new file mode 100644 index 0000000..1acfcc2 Binary files /dev/null and b/notebooks/csv/data.csv.zip differ