4 Strings

The document provides an overview of strings in Python, covering topics such as Unicode, common sequence operations, string indexing, and various string manipulation methods like split, join, and partition. It also includes practical examples and assignments related to isograms and substitution ciphers. Additionally, it outlines operations for encoding and decoding text using a simple substitution cipher.

Strings in Python

CS100 / CS101
Outline

Unicode

Common Sequence Operations

String Indexing

split, join, partition
String

immutable sequence of Unicode code points

can include letters, diacritical marks (é, ï, ô, ...),
numbers, currency symbols, emoji,
punctuation, space and line break characters,
and more.
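
A small sketch of what "immutable" means in practice. The variable names here are illustrative only: assigning to an index fails, so a change is made by building a new string.

```python
word = "nation"

# Item assignment is not allowed on a str, because strings are immutable.
try:
    word[0] = "N"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string instead of modifying the old one.
capitalized = "N" + word[1:]

print(changed_in_place)  # False
print(word)              # nation
print(capitalized)       # Nation
```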

Example Devanagari string
üŋíc0de
Unicode

144,697 characters, 159 modern + historic scripts,
symbols, emoji, non-visual control and formatting
codes

Standard: How to store and display
– Normalization rules, decomposition, collation, rendering,
and bidirectional text display order, ...
– Encoding, representation

ASCII vs. Unicode
https://home.unicode.org, Unicode wikipage
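
The ideas on this slide (code points, normalization, encoding, ASCII vs. Unicode) can be tried out in the interpreter. The following is a minimal sketch using the standard `unicodedata` module; the character chosen is just an example.

```python
import unicodedata

text = "\u00e9"  # the single code point for 'é'

# ord() gives the code point of one character, chr() is its inverse.
code_point = ord(text)
same_char = chr(code_point)

# Normalization: NFD splits 'é' into 'e' plus a combining accent,
# NFC composes it back into one code point.
decomposed = unicodedata.normalize("NFD", text)
recomposed = unicodedata.normalize("NFC", decomposed)

# Encoding turns code points into bytes; UTF-8 needs two bytes here.
utf8_bytes = text.encode("utf-8")

print(code_point)                  # 233
print(len(text), len(decomposed))  # 1 2
print(utf8_bytes)                  # b'\xc3\xa9'
print("abc".isascii(), text.isascii())  # True False
```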
Common Sequence Operations

Most Python sequence types (str, list, tuple, ...)
support these operations
String Indexing

 N   a   t   i   o   n
 0   1   2   3   4   5

 N   a   t   i   o   n
-6  -5  -4  -3  -2  -1
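
The positive and negative positions above can be checked directly. This is a short sketch using the same word; the variable names are illustrative.

```python
s = "Nation"

first = s[0]          # 'N'
last = s[-1]          # 'n', same item as s[len(s) - 1]
middle = s[1:4]       # 'ati', the item at index 4 is excluded
every_other = s[::2]  # 'Nto'
reversed_s = s[::-1]  # 'noitaN'

print(first, last, middle, every_other, reversed_s)
```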
Common Sequence Operations
x in s                 True if an item of s is equal to x, else False
x not in s             False if an item of s is equal to x, else True
s + t                  the concatenation of s and t
s * n or n * s         equivalent to adding s to itself n times
s[i]                   ith item of s, origin 0
s[i:j]                 slice of s from i to j
s[i:j:k]               slice of s from i to j with step k
len(s)                 length of s
min(s)                 smallest item of s
max(s)                 largest item of s
s.index(x[, i[, j]])   index of the first occurrence of x in s (at or after
                       index i and before index j)
s.count(x)             total number of occurrences of x in s
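
A quick sketch that exercises each operation in the table on a string, and a few of them on a list to show they are not string-specific. The sample values are made up for illustration.

```python
s = "banana"
t = "split"

contains = "nan" in s        # True (for str, `in` tests substrings)
missing = "x" not in s       # True
joined = s + t               # 'bananasplit'
repeated = "ab" * 3          # 'ababab'
length = len(s)              # 6
smallest, largest = min(s), max(s)  # 'a', 'n'
first_n = s.index("n")       # 2
next_n = s.index("n", 3)     # 4, search starts at index 3
an_count = s.count("an")     # 2

# The same operations on a list.
numbers = [3, 1, 2]
list_results = (2 in numbers, numbers + [4], min(numbers), numbers.index(1))

print(contains, missing, joined, repeated, length)
print(smallest, largest, first_n, next_n, an_count)
print(list_results)
```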
Strings

Single quotes, Double quotes

Multiline strings in Triple quotes
– Docstrings
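
A brief sketch of the three quoting styles. The function `greet` is a hypothetical example, included only to show where a docstring goes.

```python
single = 'He said "hi"'   # double quotes can appear inside single quotes
double = "It's fine"      # and the other way round

# Triple quotes let a literal span several lines.
multiline = """first line
second line"""

def greet(name):
    """Return a short greeting for name."""
    return "Hello, " + name

print(multiline.splitlines())  # ['first line', 'second line']
print(greet.__doc__)           # Return a short greeting for name.
```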
Operations on Strings

Convert int to str, vice-versa

Convert list to str, vice-versa
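
One possible way to do these conversions, sketched with built-ins only. The sample values are illustrative.

```python
number = 42
as_text = str(number)      # int -> str: '42'
back_to_int = int("42")    # str -> int: 42

letters = list("abc")      # str -> list of characters
rejoined = "".join(letters)  # list of str -> str

words = "to be or not".split()  # str -> list of words
sentence = " ".join(words)

# join() only accepts strings, so convert other items first.
digits = ",".join(str(n) for n in [1, 2, 3])

print(as_text, back_to_int, letters, rejoined)
print(words, sentence, digits)
```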

Split Operations
>>> '1,2,3'.split(',')
['1', '2', '3']
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
>>> '1 2 3'.split()
['1', '2', '3']
>>> '1 2 3'.split(maxsplit=1)
['1', '2 3']
>>> ' 1 2 3 '.split()
['1', '2', '3']
Join, Partition
>>> chickens = ["hen", "egg", "rooster"]
>>> ' '.join(chickens)
'hen egg rooster'
>>> ' :: '.join(chickens)
'hen :: egg :: rooster'
>>> 'foo.bar'.partition('.')
('foo', '.', 'bar')
Iterate through characters of the String
>>> for code_point in some_string:
...     print(code_point)
>>> for index, code_point in enumerate(some_string):
...     print(index, ": ", code_point)
Find, Replace
s.replace(<old>, <new>[, <count>])
s.capitalize()
s.swapcase()
s.lower(), s.upper(), s.title()
s.count(<sub>[, <start>[, <end>]])
s.startswith(<prefix>[, <start>[, <end>]]), s.endswith(<suffix>[, <start>[, <end>]])
s.find(<sub>[, <start>[, <end>]]), s.rfind(<sub>[, <start>[, <end>]])
s.index(<sub>[, <start>[, <end>]]), s.rindex(<sub>[, <start>[, <end>]])
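
A short sketch of how several of these methods behave on one sample sentence (the sentence is made up). Note the difference between `find`, which returns -1 when nothing is found, and `index`, which raises `ValueError`.

```python
s = "the quick brown fox"

replaced = s.replace("quick", "slow")  # 'the slow brown fox'
first_only = s.replace("o", "0", 1)    # 'the quick br0wn fox'
capitalized = s.capitalize()           # 'The quick brown fox'
titled = s.title()                     # 'The Quick Brown Fox'
swapped = s.upper().swapcase()         # 'the quick brown fox'
starts, ends = s.startswith("the"), s.endswith("dog")  # True, False
first_o, last_o = s.find("o"), s.rfind("o")            # 12, 17
not_found = s.find("z")                # -1

try:
    s.index("z")
    index_raised = False
except ValueError:
    index_raised = True

print(replaced, first_only, capitalized, titled, swapped, sep="\n")
print(starts, ends, first_o, last_o, not_found, index_raised)
```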
Character Classification
s.isalnum()
s.isalpha()
s.isdigit()
s.islower(), s.isupper()
s.strip([<chars>]), s.lstrip([<chars>]), s.rstrip([<chars>])
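
The methods above can be tried on a few sample values, as in this sketch. The inputs are illustrative.

```python
checks = {
    "alnum": "abc123".isalnum(),   # letters and digits only -> True
    "alpha": "abc123".isalpha(),   # contains digits -> False
    "digit": "2024".isdigit(),     # True
    "lower": "hello".islower(),    # True
    "upper": "Hello".isupper(),    # not all uppercase -> False
    "empty": "".isalpha(),         # empty string -> False
}

stripped = "  padded  ".strip()       # whitespace removed from both ends
stripped_x = "xxhixx".strip("x")      # 'hi'
left_only = "xxhixx".lstrip("x")      # 'hixx'
right_only = "xxhixx".rstrip("x")     # 'xxhi'

print(checks)
print(repr(stripped), stripped_x, left_only, right_only)
```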
Strings Lab Assignments

Isogram
CS100/CS101
Outline

Example: Isogram

Lab Homework: Simple Cipher

Submit both
Isogram

Determine if a word or phrase is an isogram.

An isogram (also known as a "non-pattern word") is a
word or phrase without a repeating letter. Spaces and
hyphens, however, are allowed to appear multiple times.

Examples of isograms:
– lumberjacks, background, downstream, six-year-old, isogram

The word isograms, however, is not an isogram, because
the s repeats.

https://en.wikipedia.org/wiki/Isogram, exercism.org
Isogram

Input: A word or a phrase (String)

Output – True / False
– True (if the supplied input is an Isogram)
– False (if the supplied input is NOT an Isogram)
Steps to do

Propose a solution

Formalize the plain English solution
– Flowchart OR Pseudocode

We’ll run it through a bunch of sample inputs

If the expected output is returned every time,
let’s translate the solution to Python code
Isogram – Solution
Isogram – Flowchart
Isogram – Pseudocode
Isogram – Sample Inputs

“” (Empty String)

Example words from first slide, uncopyrightable (longest
isogram)

First# Clan! (string contains punctuation marks)

“BackGround” (String contains upper- and lowercase letters)

Non-isograms, ranging from a single repeated letter to several
repeated letters.

...
Isogram – Solution

If the string is Empty, return True

For every letter in the String
Isogram – Python
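def is_isogram(string):
    in_chars = []  # lowercase letters seen so far
    for i in string:
        if i.lower() in in_chars:
            return False  # a letter repeats, so not an isogram
        if i.isalpha():
            # Track letters only; spaces, hyphens and punctuation may repeat.
            in_chars.append(i.lower())
    return True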
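
As a cross-check, here is an alternative sketch that compares the number of letters with the number of distinct letters. The name `is_isogram_set` is hypothetical and not part of the lab. The sample inputs are taken from the slides, with expected results worked out by hand.

```python
def is_isogram_set(text):
    """Return True if no letter repeats in text, ignoring case and non-letters."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    return len(letters) == len(set(letters))

# Sample inputs and the result each one should give.
samples = {
    "": True,
    "lumberjacks": True,
    "six-year-old": True,
    "uncopyrightable": True,
    "First# Clan!": True,
    "BackGround": True,
    "isograms": False,   # the s repeats
    "Alphabet": False,   # 'A' and 'a' count as the same letter
}

for text, expected in samples.items():
    print(repr(text), is_isogram_set(text), expected)
```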
Homework – Encryption, Decryption

http://en.wikipedia.org/wiki/Substitution_cipher, Simple Cipher at exercism.org


Encryption, Decryption – Key
The ROT13 key can be written as
“nnnnnnnnnnnnn”
Each letter in the Key is what the character ‘a’
would become at that position, so it sets the shift
applied to the message letter in the same position
Examples.
Key is “aaaaa”. “hello” ==> “hello”
Key is “ddddd”. “hello” ==> “khoor”
Key is “nnnnn”. “hello” ==> “uryyb”
Key is “abcde”. “hello” ==> “hfnos”
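
The examples above can be reproduced with a small sketch. The helper names `shift_letter` and `apply_key` are hypothetical, and the code assumes lowercase a-z only and a key at least as long as the message.

```python
def shift_letter(letter, key_letter):
    """Shift one lowercase letter by the distance from 'a' to key_letter."""
    shift = ord(key_letter) - ord("a")
    position = ord(letter) - ord("a")
    # Wrap around past 'z' with modulo 26.
    return chr((position + shift) % 26 + ord("a"))

def apply_key(message, key):
    """Shift each message letter by the key letter in the same position."""
    # Assumption: len(key) >= len(message); zip stops at the shorter one.
    return "".join(shift_letter(m, k) for m, k in zip(message, key))

print(apply_key("hello", "aaaaa"))  # hello
print(apply_key("hello", "ddddd"))  # khoor
print(apply_key("hello", "nnnnn"))  # uryyb
print(apply_key("hello", "abcde"))  # hfnos
```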
Homework – Encryption, Decryption


Implement a Substitution Cipher

Substitution cipher replaces plaintext with an
identical length ciphertext

Ciphers render text less readable while still
allowing easy deciphering



Substitution Cipher

Create 2 functions: encode(), and decode().

encode(text, key="dddddddddddddddddddd"):
– text: plain text to be encrypted
– key: key to use to encrypt. Keep the default as ROT3.
– encode implements a simple shift cipher using the key (e.g. Caesar
cipher)

encode("dontlookup") should return "grqworrnxs"

encode("dontlookup", "abcdefghij") should return "dppwpturcy"
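
One possible shape for `encode`, offered as a sketch rather than the expected lab solution. It assumes lowercase a-z input and repeats the key with `itertools.cycle` when the message is longer than the key.

```python
from itertools import cycle

def encode(text, key="d" * 20):
    """Shift each letter of text forward by the matching key letter."""
    result = []
    # cycle(key) repeats the key for as long as the text needs it.
    for letter, key_letter in zip(text, cycle(key)):
        shift = ord(key_letter) - ord("a")
        result.append(chr((ord(letter) - ord("a") + shift) % 26 + ord("a")))
    return "".join(result)

print(encode("dontlookup"))                # grqworrnxs
print(encode("dontlookup", "abcdefghij"))  # dppwpturcy
```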



Substitution Cipher

Create 2 functions: encode(), and decode().

decode(text, key="dddddddddddddddddddd"):
– text: cipher text to be decrypted to plaintext
– key: key that was used to encrypt. Default is ROT3.
– decode reverses the simple shift cipher using the key (e.g. Caesar
cipher)

decode("grqworrnxs") should return "dontlookup"

decode("dppwpturcy", "abcdefghij") should return "dontlookup"
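
Decoding shifts each letter backward by the same amount the key would shift it forward. The following is a sketch under the same assumptions as the slides (lowercase a-z, no spaces), with the key repeated as needed; it is not the only valid approach.

```python
from itertools import cycle

def decode(text, key="d" * 20):
    """Shift each letter of text backward by the matching key letter."""
    result = []
    for letter, key_letter in zip(text, cycle(key)):
        shift = ord(key_letter) - ord("a")
        # Python's % always returns a non-negative result for a positive
        # modulus, so subtracting the shift wraps correctly before 'a'.
        result.append(chr((ord(letter) - ord("a") - shift) % 26 + ord("a")))
    return "".join(result)

print(decode("grqworrnxs"))                # dontlookup
print(decode("dppwpturcy", "abcdefghij"))  # dontlookup
```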



Substitution Cipher – Assumptions

If key supplied is “abc”, then the full key is
‘abcabcabc...’ (as long as the message is)

Assume there are no spaces and punctuation
marks in the message

Max message size is 100 letters

State any other assumptions in the comments
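
The assumptions above could be captured in two small helpers, sketched here. The names `expand_key` and `check_message` are hypothetical, and treating the message as lowercase a-z only is an extra assumption not stated on the slide.

```python
def expand_key(key, length):
    """Repeat key until it covers `length` letters, then trim the excess."""
    # Assumption: key is not empty (an empty key would divide by zero).
    repeats = length // len(key) + 1
    return (key * repeats)[:length]

def check_message(message, max_length=100):
    """Return message unchanged, or raise ValueError if an assumption is broken."""
    if len(message) > max_length:
        raise ValueError("message is longer than the maximum size")
    # Assumption: only lowercase a-z; this also rejects spaces and punctuation.
    if not all("a" <= ch <= "z" for ch in message):
        raise ValueError("message must contain only the letters a-z")
    return message

print(expand_key("abc", 9))  # abcabcabc
print(expand_key("abc", 7))  # abcabca
```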



Summary

Unicode

Common Sequence Operations

String Indexing

split, join, partition
