With notes from Think Python (2nd Edition)
int, float, string, bool
list, tuple, set, dict
file, Exception
R in your early courses more
Think Python (2nd Edition) Chapter 1.5
4.5 + 2.5 # 7.0
5.2 / 2.3
2.2 ** 3.1
# 2.1 ^ 3.1 # This is still not an exponent and causes an error
8.4 // 3.1 # 2.0
float("-1002.101") # Convert string to float
float(10) == 10.0 # Convert `int` to `float` (almost always unnecessary)
_, remainder = divmod(8.4, 2.05) # (4.0, 0.20000000000000107)
print(0.2 == remainder) # Floating point arithmetic can be confusing
print(remainder - 0.2)
print(abs(remainder - 0.2) < 1e-10)
False
1.0547118733938987e-15
True
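The tolerance comparison above can also be written with math.isclose from the standard library. This is a minimal sketch, assuming a relative tolerance of 1e-9 suits the values involved; the variable names are illustrative.

```python
import math

_, remainder = divmod(8.4, 2.05)

# Exact equality fails because of floating point rounding error
exactly_equal = remainder == 0.2

# math.isclose compares within a tolerance (rel_tol=1e-9 is an assumed choice)
close_enough = math.isclose(remainder, 0.2, rel_tol=1e-9)

print(exactly_equal, close_enough)
```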
Think Python (2nd Edition) Chapter 1.5, Chapter 7.5
bools are used for conditional execution
def check_num(x):
    x = float(x)
    if not x.is_integer():
        print(x, "is a float")
    elif x % 2:
        print(x, "is odd")
    else:
        print(x, "is even")
check_num(4.0)
check_num(5)
check_num(0.1)
4.0 is even
5.0 is odd
0.1 is a float
Any object can be cast to bool. If it casts to True, then it is “truthy”.
Think Python (2nd Edition) Chapter 5
any: returns True if any element is “truthy”
all: returns True if all elements are “truthy”
filter: easy way to filter a sequence of elements
int: casting bools to ints can be very useful: int(True) or int(False)
Instead of if x == True:, just write if x:
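A short sketch of the truthiness helpers named above. The sample list is made up for illustration; empty containers, 0, and None are falsy, while a non-empty list such as [0] is truthy.

```python
values = [0, 1, "", "text", None, [], [0]]

# filter(None, ...) keeps only the truthy elements
truthy_values = list(filter(None, values))

any_truthy = any(values)  # at least one element is truthy
all_truthy = all(values)  # every element would need to be truthy

# Casting bools to ints lets us count truthy elements by summing
truthy_count = sum(int(bool(v)) for v in values)

if truthy_values:  # preferred over: if bool(truthy_values) == True
    print(truthy_values, any_truthy, all_truthy, truthy_count)
```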
None.
To check for None, simply do x is None or x is not None.
Think Python (2nd Edition) Chapter 3.10
x == None and x != None also work, but using is is considered to be correct, because there is only one None object.
None, so you will run into bugs if you do
None if it cannot be found. It’s important to check a function's return values in its documentation.
s = "abcdef"
try:
    print("z index:", s.index("z"))
except ValueError as e:
    print("Exception:", e)
print("z find:", s.find("z"))
# This is a dictionary comprehension
letter_lookup = {letter: index for index, letter in enumerate(s)}
print("Index of b:", letter_lookup.get("b"))
print("Index of z:", letter_lookup.get("z"))
Exception: substring not found
z find: -1
Index of b: 1
Index of z: None
e
4 is my favorite number
Think Python (2nd Edition) Chapter 8
len("abcdef") # 6
sorted("defabc") # ['a', 'b', 'c', 'd', 'e', 'f'] (a list, not a string)
"".join(reversed("hello world")) # "dlrow olleh" (reversed alone returns an iterator)
"hi friend".replace("friend", "john") # "hi john"
"HELLO FRIEND".lower() # "hello friend"
"hello, how are you?".split() # ["hello,", "how", "are", "you?"]
" hello there! ".strip() # "hello there!"
"+".join(["good", "evening", "friend"]) # "good+evening+friend"
"hello" in "hello friend" # True
x = input("How are you?") # User input
[1, 'b', 'c', 'd', 5]
Think Python (2nd Edition) Chapter 10
.extend: add another sequence to a list
.count: count the occurrences of an element in a list
.pop: remove and return the last element of a list
.index: find the index of an element in a list
[1, 2, 3, 4]
True
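The list methods named above can be sketched as follows. The list contents are made up for illustration.

```python
letters = ["a", "b"]

letters.extend(["c", "a"])      # letters is now ['a', 'b', 'c', 'a']
count_a = letters.count("a")    # how many times "a" appears
last = letters.pop()            # removes and returns the last element
index_c = letters.index("c")    # position of the first "c"

print(letters, count_a, last, index_c)
```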
my_list = list("abcdefgh")
print(my_list[:3], my_list[4:], my_list[2:5], my_list[:-1], my_list[::2])
['a', 'b', 'c'] ['e', 'f', 'g', 'h'] ['c', 'd', 'e'] ['a', 'b', 'c', 'd', 'e', 'f', 'g'] ['a', 'c', 'e', 'g']
Think Python (2nd Edition) Chapter 19.5
.pop: Get and remove an arbitrary element from a set
.add: Add an element to a set
.remove: Remove an element from a set
.union: Combine sets
Use frozenset if you do not plan on changing it
dicts are great ways to map keys to values. A simple example is a histogram:
my_str = "Welcome to Duke!"
my_dict = {}
for character in my_str:
    if character in my_dict:
        my_dict[character] = my_dict[character] + 1
    else:
        my_dict[character] = 1
print(my_dict)
{'W': 1, 'e': 3, 'l': 1, 'c': 1, 'o': 2, 'm': 1, ' ': 2, 't': 1, 'D': 1, 'u': 1, 'k': 1, '!': 1}
Think Python (2nd Edition) Chapter 11
.keys: Get an iterable of all the keys in a dictionary
.values: Get an iterable of all the values in a dictionary
.items: Get an iterable of all the key-value pairs in a dictionary
.update: Combine two dictionaries
{'a': 1, 'b': 3, 'c': 4}
.get: Get a value from a dictionary but specify a default
Do key in my_dict, not key in my_dict.keys()
collections.defaultdict is another way to handle the need for default values in a dictionary
Nest dictionaries if necessary
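As a sketch of the tips above, the histogram can be rebuilt with .get and with collections.defaultdict. The variable names histogram and default_histogram are illustrative.

```python
from collections import defaultdict

my_str = "Welcome to Duke!"

# .get with a default of 0 removes the need for the if/else branch
histogram = {}
for character in my_str:
    histogram[character] = histogram.get(character, 0) + 1

# defaultdict(int) supplies 0 automatically for missing keys
default_histogram = defaultdict(int)
for character in my_str:
    default_histogram[character] += 1

has_e = "e" in histogram          # membership test directly on the dict
missing = histogram.get("z")      # None, because "z" is not a key
missing_with_default = histogram.get("z", 0)
```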
Tuples are very similar to lists, but they cannot be changed! They are immutable.
my_list = [1, 2, 3]
my_tuple = tuple(my_list)
my_list[1] = 100
print(my_tuple)
try:
    my_tuple[1] = 100
except TypeError as e:
    print(e)
(1, 2, 3)
'tuple' object does not support item assignment
Think Python (2nd Edition) Chapter 12
To make a tuple with one element: x = (1,)
Immutability allows tuples to be used as dict keys and be stored in sets:
my_key = (1, 2, 3, "four")
my_dict = {my_key: "my favorite numbers"}
my_set = set()
my_set.add(my_key)
print(my_dict)
print(my_set)
my_other_key = (1, ["a", "list"], 3)
try: # All elements of the tuple need to be hashable too
    my_set.add(my_other_key)
except TypeError as e:
    print(e)
{(1, 2, 3, 'four'): 'my favorite numbers'}
{(1, 2, 3, 'four')}
unhashable type: 'list'
from collections import Counter # Makes a dictionary that counts elements
my_str = "I enjoy eating almonds"
def print_data(data):
    for item in data:
        print(item, end=" ")
    print()
print_data(my_str)
print_data(set(my_str)) # No order or repeats
print_data(list(my_str))
print_data(tuple(my_str))
print_data(Counter(my_str)) # Dictionaries are ordered by insertion, iterates over keys
I e n j o y e a t i n g a l m o n d s
o g d e l s t y j n m i a I
I e n j o y e a t i n g a l m o n d s
I e n j o y e a t i n g a l m o n d s
I e n j o y a t i g l m d s
Think Python (2nd Edition) Chapter 7, Chapter 8.3
len: Get the natural size of an object
enumerate: Iterate over an object, but with index, value pairs
zip: Iterate over two objects at the same time
map: Map the values of an iterable using a function
itertools module: Contains many useful functions for specific kinds of iteration
while loops are useful if you are unsure how many iterations are needed
continue and break
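The iteration helpers listed above can be sketched together. The sample data and variable names are made up for illustration, and itertools.chain stands in for the many tools in the itertools module.

```python
from itertools import chain

names = ["amy", "barry", "carl"]
ages = [20, 30, 25]

indexed = list(enumerate(names))      # (index, value) pairs
paired = list(zip(names, ages))       # walk two sequences together
shouted = list(map(str.upper, names)) # apply a function to each value
combined = list(chain(names, ages))   # one itertools example: concatenate iterables

# A while loop with continue and break: sum the odd numbers up to 7
total = 0
n = 0
while True:
    n += 1
    if n % 2 == 0:
        continue  # skip even numbers
    if n > 7:
        break     # stop once we pass 7
    total += n
```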
def my_func(a, b, c=5):
    return f"({a} + {b}) * {c} = {(a+b) * c}"
print(my_func(1, 1, 0))
print(my_func(10, 4))
(1 + 1) * 0 = 0
(10 + 4) * 5 = 70
def my_func_with_inf_args(*args, **kwargs):
    print(args)
    print(kwargs)
my_func_with_inf_args(1, 2, 3, a=4, b=5)
(1, 2, 3)
{'a': 4, 'b': 5}
def call_twice(some_func, *args, **kwargs):
    some_func(*args, **kwargs)
    some_func(*args, **kwargs)
call_twice(my_func_with_inf_args, 1, 2, f=9)
(1, 2)
{'f': 9}
(1, 2)
{'f': 9}
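Because functions can be passed as arguments, as call_twice does above, short one-off functions are often written inline with lambda. A minimal sketch with made-up data:

```python
words = ["banana", "fig", "cherry"]

# The lambda is an anonymous function used as the sort key
by_length = sorted(words, key=lambda word: len(word))

# A lambda can also be bound to a name, though def is usually clearer
add = lambda a, b: a + b
three = add(1, 2)
```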
lambda function
Objects are everywhere and naturally you can make your own.
from random import shuffle
SUITS = ("Hearts", "Spades", "Clubs", "Diamonds")
RANKS = (2, 3, 4, 5, 6, 7, 8, 9, 10, "Jack", "Queen", "King", "Ace")
class CardDeck:
    def __init__(self, empty=False): # Constructor
        self.cards = []
        if not empty:
            for suit in SUITS:
                for rank in RANKS:
                    self.cards.append((suit, rank))
    def add_card(self, suit, rank):
        self.cards.append((suit, rank))
    def shuffle(self):
        shuffle(self.cards) # Calls random.shuffle, not this method
    def draw_card(self):
        return self.cards.pop()
my_deck = CardDeck()
my_deck.shuffle()
print(my_deck.draw_card())
print(my_deck.draw_card())
('Hearts', 9)
('Spades', 7)
Think Python (2nd Edition) Chapter 15, Chapter 16, Chapter 17, Chapter 18
Your own!
class DiscardPile(CardDeck): # Inheritance lets you reuse code
    def __init__(self):
        super().__init__(empty=True)
    def add_card(self, deck):
        drawn_card = deck.draw_card()
        self.cards.append(drawn_card)
        return drawn_card
my_deck = CardDeck()
discard = DiscardPile()
discarded_card = discard.add_card(my_deck)
print
draw_card
my_deck.cards
__init__ method
my_deck
If your program runs into an error, it will terminate if the resulting Exception is not caught.
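A minimal sketch of catching an exception and of raising one yourself. The function names safe_divide and check_age are hypothetical and exist only for this illustration.

```python
def safe_divide(a, b):
    """Return a / b, or None when b is zero instead of terminating."""
    try:
        return a / b
    except ZeroDivisionError:
        return None

def check_age(age):
    """Raise our own exception when the input is invalid."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return age

quotient = safe_divide(10, 4)
failed = safe_divide(1, 0)

try:
    check_age(-5)
    message = "no error"
except ValueError as e:
    message = str(e)
```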
Exception: Almost all exceptions fall under this class
ArithmeticError: Base exception for OverflowError, ZeroDivisionError, FloatingPointError
AttributeError: Attempting to access an attribute that does not exist on an object
IndexError: Attempting to use an invalid index on a sequence
KeyboardInterrupt: Raised when ctrl-c is pressed during execution (derives from BaseException, not Exception)
NameError: Using a variable that does not exist
TypeError: Operating on two objects with incompatible types
ValueError: Input to a function is invalid
raise your own!
# Writing files, use the mode argument
# Careful! This will erase the file's existing contents if it is present
my_file = open("path_to_file.txt", mode="w")
my_file.write("output I want to keep")
my_file.close() # You must close the file, or your results may be lost
# Reading files
my_file = open("path_to_file.txt", mode="r") # The mode is r by default
print(my_file.readline())
my_file.close()
Python Docs on Files, Think Python (2nd Edition) Chapter 14
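The write and read steps above can also be sketched using with, which closes each file automatically, together with os.path and json. The file names notes.txt and data.json are hypothetical, and a temporary directory is used so that no real files are touched.

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "notes.txt")  # hypothetical file name

    with open(text_path, mode="w") as my_file:
        my_file.write("output I want to keep\nsecond line\n")
    # The file is closed here, even if an exception was raised inside the block

    with open(text_path) as my_file:
        lines = my_file.readlines()  # every line, as a list of strings

    json_path = os.path.join(folder, "data.json")  # hypothetical file name
    with open(json_path, mode="w") as my_file:
        json.dump({"name": "amy", "age": 20}, my_file)
    with open(json_path) as my_file:
        loaded = json.load(my_file)

    files_in_folder = sorted(os.listdir(folder))
    text_exists = os.path.exists(text_path)
```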
The os.path module and the pathlib module contain many methods for operating on the file system:
os.path.exists: Determine if a file exists at a given path
os.path.join: Join two path components together
os.listdir (in the os module): List files in a directory
json.load in the json module is vital for reading .json files. The module is also useful for writing dictionaries to a file with json.dump.
.readlines(): Read all of a file's lines into a list at once
Use the with statement to automatically close files when you are done using them. This includes when an exception interrupts your program.
A friend has a directory of 1000 files where each file has one of the following extensions: .csv, .tsv, .json. However, each file has comments throughout it delimited by ##, so they do not follow the proper format. They ask you to write a Python script which will combine all the files into one while removing the comments and ensuring the data is in a proper .csv format.
Character-delimited files and JSON
person,age,job,favorite_color
amy,20,waiter,blue
barry,30,engineer,grey
## we have only adults in the dataset
carl,25,None,purple ## None means unemployed
## this person did not understand the survey dan,29,superhero,pineapple
person age job favorite_color
amy 20 waiter blue
barry 30 engineer grey
## we have only adults in the dataset
carl 25 None purple ## None means unemployed
## this person did not understand the survey dan 29 superhero pineapple
{
    "data": [ ## List of people
        {
            "name": "amy",
            "age": 20,
            "job": "waiter",
            "favorite_color": "blue"
        },
        {
            ## Barry is friends with my Dad, Jerry
            "name": "barry",
            "age": 30,
            "job": "engineer",
            "favorite_color": "grey"
        },
        ... omitted for brevity
    ]
}
Create a variable to store data from all files
For every file my friend has
    Read it in line-by-line
    Remove all comments in each line
    Remove all empty lines
    If it is a .csv
        Split on the commas and store it
    If it is a .tsv
        Split on the tabs and store it
    If it is a .json
        Read the .json and store it
Turn the variable that is storing all the data into a .csv
Create a variable to store data from all files
For every file my friend has, Read it in as a string
import os
path_to_folder = "./my_friend/stored/the/files/here"
my_data = [] # Stores the rows collected from every file
for file_name in os.listdir(path_to_folder):
    # os.listdir returns bare file names, so join them with the folder path
    file_path = os.path.join(path_to_folder, file_name)
    with open(file_path) as file:
        file_as_lines = file.readlines()
    # [
    #     "person,age,job,favorite_color\n",
    #     "amy,20,waiter,blue\n",
    #     "barry,30,engineer,grey\n",
    #     "## we have only adults in the dataset\n",
    #     "carl,25,None,purple ## None means unemployed\n",
    #     "## this person did not understand the survey dan,29,superhero,pineapple"
    # ]
Remove all comments in each line then remove all empty lines
def remove_comments(line):
    no_whitespace = line.strip() # => "## this is a comment"
    if "##" in no_whitespace:
        comment_starts_at_index = no_whitespace.index("##")
        filtered = no_whitespace[:comment_starts_at_index]
        return filtered.strip() # In case there are spaces around the comment
    return no_whitespace
print("test 1", remove_comments(" ## this is a comment "))
print("test 2", remove_comments("person,age,job,favorite_color"))
print("test 3", remove_comments("person,age,job,favorite_color## comments "))
test 1
test 2 person,age,job,favorite_color
test 3 person,age,job,favorite_color
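To connect this step to the next one, the cleaned lines need to be filtered and joined back into a single string. The sketch below assumes that file_with_no_comments, used in the next step, is that newline-joined string. strip_comments_from_lines is a hypothetical helper name, and remove_comments is rewritten compactly with the same behavior so the block runs on its own.

```python
def remove_comments(line):
    """Drop everything from the first '##' onward, then trim whitespace."""
    return line.split("##", 1)[0].strip()

def strip_comments_from_lines(lines):
    """Clean each line, drop the ones left empty, and rejoin with newlines."""
    cleaned = [remove_comments(line) for line in lines]
    return "\n".join(line for line in cleaned if line)

file_as_lines = [
    "person,age,job,favorite_color\n",
    "amy,20,waiter,blue\n",
    "barry,30,engineer,grey\n",
    "## we have only adults in the dataset\n",
    "carl,25,None,purple ## None means unemployed\n",
    "## this person did not understand the survey dan,29,superhero,pineapple",
]
file_with_no_comments = strip_comments_from_lines(file_as_lines)
print(file_with_no_comments)
```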
If it is a .csv or .tsv, split on the commas or tabs and store it
def get_delimited_entries(file_as_str, delimiter):
    to_return = []
    lines = file_as_str.split("\n")
    for line in lines[1:]: # Skip header
        to_return.append(line.split(delimiter))
    return to_return
if file_path.endswith(".csv"):
    my_data.extend(get_delimited_entries(file_with_no_comments, ","))
elif file_path.endswith(".tsv"):
    my_data.extend(get_delimited_entries(file_with_no_comments, "\t"))
If it is a .json, read the .json and store it
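One possible sketch of this step, not necessarily how the full script does it. It assumes the comments have already been stripped from the text, that no string value contains ##, and that the file has the {"data": [...]} layout shown earlier. get_json_entries is a hypothetical helper name.

```python
import json

def get_json_entries(file_as_str):
    """Parse comment-free JSON text into rows of [name, age, job, favorite_color]."""
    parsed = json.loads(file_as_str)
    rows = []
    for person in parsed["data"]:
        # Convert every value to str so rows match those from .csv and .tsv files
        rows.append([
            str(person["name"]),
            str(person["age"]),
            str(person["job"]),
            str(person["favorite_color"]),
        ])
    return rows

cleaned_json = '{"data": [{"name": "amy", "age": 20, "job": "waiter", "favorite_color": "blue"}]}'
json_rows = get_json_entries(cleaned_json)
```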
Turn the variable that is storing all the data into a .csv
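A minimal sketch of this final step using the standard csv module. The HEADER list and the write_csv name are assumptions for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io

HEADER = ["person", "age", "job", "favorite_color"]  # assumed column names

def write_csv(rows, out_file):
    """Write the header followed by every collected row."""
    writer = csv.writer(out_file)
    writer.writerow(HEADER)
    writer.writerows(rows)

my_data = [["amy", "20", "waiter", "blue"], ["barry", "30", "engineer", "grey"]]
buffer = io.StringIO()
write_csv(my_data, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, the csv documentation recommends opening it with newline="" so that line endings are not translated twice.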
Full script here
In a professional environment, unit tests may be written to ensure bugs are not introduced during the development process.
import unittest
from my_module import my_func
class TestMyStuff(unittest.TestCase):
    def test_my_func(self):
        my_output = my_func(1, 2, 3)
        self.assertEqual(6, my_output)
    def test_my_func_with_strings(self):
        my_output = my_func(1, 2, 3, "h")
        self.assertEqual(60, my_output)
    def test_my_func_raises_exception(self):
        with self.assertRaises(ValueError):
            my_output = my_func(1, 2, 3, "h", "")
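Since my_module above is a placeholder, here is a self-contained sketch of the same idea that can be run as-is. It tests a compact rewrite of the remove_comments helper defined earlier. The test class name is hypothetical, and the tests are run programmatically rather than from the command line.

```python
import unittest

def remove_comments(line):
    """Drop everything from the first '##' onward, then trim whitespace."""
    return line.split("##", 1)[0].strip()

class TestRemoveComments(unittest.TestCase):
    def test_full_line_comment(self):
        self.assertEqual("", remove_comments("  ## this is a comment  "))

    def test_no_comment(self):
        self.assertEqual("a,b,c", remove_comments("a,b,c"))

    def test_trailing_comment(self):
        self.assertEqual("a,b,c", remove_comments("a,b,c ## note  "))

# Load and run the tests without exiting the interpreter
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRemoveComments)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```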
C, C++, Fortran, etc. to achieve very similar speeds.
numpy is so much faster.