With notes from Think Python (2nd Edition)
(int, float, string, bool)
(list, tuple, set, dict)
(file, Exception)
R in your early courses more
Think Python (2nd Edition) Chapter 1.5
4.5 + 2.5 # 7.0
5.2 / 2.3
2.2 ** 3.1
# 2.1 ^ 3.1 # This is still not an exponent and causes an error
8.4 // 3.1 # 2.0
float("-1002.101") # Convert string to float
float(10) == 10.0 # Convert `int` to `float` (almost always unnecessary)
_, remainder = divmod(8.4, 2.05) # (4.0, 0.20000000000000107)
print(0.2 == remainder) # Floating point arithmetic can be confusing
print(remainder - 0.2)
print(abs(remainder - 0.2) < 1e-10)
False
1.0547118733938987e-15
True
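The manual tolerance check above can also be written with the standard library's math.isclose. This is a minimal sketch; the variable names exactly_equal and close_enough are chosen here for illustration only.

```python
import math

_, remainder = divmod(8.4, 2.05)

# Exact equality fails because of rounding error in binary floating point
exactly_equal = remainder == 0.2

# math.isclose compares within a tolerance (relative tolerance 1e-09 by default)
close_enough = math.isclose(remainder, 0.2)

print(exactly_equal, close_enough)
```

A relative tolerance scales with the size of the numbers being compared, which a fixed cutoff such as 1e-10 does not.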
Think Python (2nd Edition) Chapter 1.5, Chapter 7.5
bools are used for conditional execution
def check_num(x):
    x = float(x)
    if not x.is_integer():
        print(x, "is a float")
    elif x % 2:
        print(x, "is odd")
    else:
        print(x, "is even")
check_num(4.0)
check_num(5)
check_num(0.1)
4.0 is even
5.0 is odd
0.1 is a float
bool. If they cast to True, then they are “truthy”.
Think Python (2nd Edition) Chapter 5
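As a rough illustration of truthiness, the sketch below casts a handful of values to bool. The values list is made up for this example.

```python
# Illustrative values only: empty containers, zero, and None cast to False
values = [0, 1, "", "text", [], [0], None, 0.0]

truthiness = {repr(value): bool(value) for value in values}

for text, is_truthy in truthiness.items():
    print(text, "is truthy" if is_truthy else "is falsy")
```

Note that [0] is truthy even though its only element is falsy: a container's truthiness depends on whether it is empty, not on what it holds.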
any: returns True if any element is “truthy”
all: returns True if all elements are “truthy”
filter: easy way to filter a sequence of elements
int: casting bools to ints can be very useful: int(True) or int(False)
if x == True:, just if x:
None.
None simply do x is None or x is not None.
Think Python (2nd Edition) Chapter 3.10
x == None and x != None also work, but using is is considered to be correct, because there is only one None object.
None, so you will run into bugs if you do
None if it cannot be found. It’s important to determine return values with documentation.
s = "abcdef"
try:
    print("z index:", s.index("z"))
except ValueError as e:
    print("Exception:", e)
print("z find:", s.find("z"))
# This is a dictionary comprehension
letter_lookup = {letter: index for index, letter in enumerate(s)}
print("Index of b:", letter_lookup.get("b"))
print("Index of z:", letter_lookup.get("z"))
Exception: substring not found
z find: -1
Index of b: 1
Index of z: None
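One way the None-versus-truthiness distinction can cause a bug, sketched with the letter_lookup dictionary from the example above. The functions describe and describe_buggy are hypothetical helpers written for this illustration.

```python
s = "abcdef"
letter_lookup = {letter: index for index, letter in enumerate(s)}

def describe(letter):
    index = letter_lookup.get(letter)  # None when the letter is missing
    if index is None:
        return f"{letter!r} not found"
    return f"{letter!r} is at index {index}"

def describe_buggy(letter):
    index = letter_lookup.get(letter)
    if not index:  # Bug: index 0 is falsy, so "a" is reported as missing
        return f"{letter!r} not found"
    return f"{letter!r} is at index {index}"

print(describe("a"))
print(describe_buggy("a"))
print(describe("z"))
```

Testing with is None separates "missing" from a legitimate falsy value such as 0.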
e
4 is my favorite number
Think Python (2nd Edition) Chapter 8
len("abcdef") # 6
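sorted("defabc") # ['a', 'b', 'c', 'd', 'e', 'f'] (a list; "".join(sorted("defabc")) gives "abcdef")
reversed("hello world") # An iterator; "".join(reversed("hello world")) gives "dlrow olleh"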
"hi friend".replace("friend", "john") # "hi john"
"HELLO FRIEND".lower() # "hello friend"
"hello, how are you?".split() # ["hello,", "how", "are", "you?"]
" hello there! ".strip() # "hello there!"
"+".join(["good", "evening", "friend"]) # "good+evening+friend"
"hello" in "hello friend" # True
x = input("How are you?") # User input
[1, 'b', 'c', 'd', 5]
Think Python (2nd Edition) Chapter 10
.extend: add another sequence to a list
.count: count the occurrences of an element in a list
.pop: remove and return the last element from a list
.index: find the index of an element in a list
[1, 2, 3, 4]
True
my_list = list("abcdefgh")
print(my_list[:3], my_list[4:], my_list[2:5], my_list[:-1], my_list[::2])
['a', 'b', 'c'] ['e', 'f', 'g', 'h'] ['c', 'd', 'e'] ['a', 'b', 'c', 'd', 'e', 'f', 'g'] ['a', 'c', 'e', 'g']
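A short sketch of the list methods extend, count, pop, and index in action. The starting list and the variable names are illustrative assumptions, not taken from the original slides.

```python
my_list = [1, 2, 3]

my_list.extend([4, 2])       # Appends every element of another sequence
twos = my_list.count(2)      # Number of times 2 appears
last = my_list.pop()         # Removes and returns the last element
position = my_list.index(3)  # Index of the first occurrence of 3

print(my_list, twos, last, position)
```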
Think Python (2nd Edition) Chapter 19.5
.pop: Get and remove an arbitrary element from a set
.add: Add element to a set
.remove: Remove an element from a set
.union: Combine sets
frozenset if you do not plan on changing it
dicts are great ways to map keys to values. A simple example is a histogram:
my_str = "Welcome to Duke!"
my_dict = {}
for character in my_str:
    if character in my_dict:
        my_dict[character] = my_dict[character] + 1
    else:
        my_dict[character] = 1
print(my_dict)
{'W': 1, 'e': 3, 'l': 1, 'c': 1, 'o': 2, 'm': 1, ' ': 2, 't': 1, 'D': 1, 'u': 1, 'k': 1, '!': 1}
Think Python (2nd Edition) Chapter 11
.keys: Get an iterable of all the keys in a dictionary
.values: Get an iterable of all the values in a dictionary
.items: Get an iterable of all the key-value pairs in a dictionary
.update: Combine two dictionaries
{'a': 1, 'b': 3, 'c': 4}
.get: Get a value from a dictionary but specify a default
Do key in my_dict, not key in my_dict.keys()
collections.defaultdict is another way to handle the need for default values in a dictionary
Nest dictionaries if necessary
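The histogram from earlier can be rebuilt with .get, with collections.defaultdict, or with collections.Counter. The sketch below shows all three side by side; the names hist_get, hist_default, and hist_counter are chosen here for illustration.

```python
from collections import Counter, defaultdict

my_str = "Welcome to Duke!"

# .get returns the default (0) when the key is not present yet
hist_get = {}
for character in my_str:
    hist_get[character] = hist_get.get(character, 0) + 1

# defaultdict(int) creates missing entries with the value int(), which is 0
hist_default = defaultdict(int)
for character in my_str:
    hist_default[character] += 1

# Counter does the counting in one call
hist_counter = Counter(my_str)

print(hist_get == dict(hist_default) == dict(hist_counter))
```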
Tuples are very similar to lists, but they cannot be changed! They are immutable.
my_list = [1, 2, 3]
my_tuple = tuple(my_list)
my_list[1] = 100
print(my_tuple)
try:
    my_tuple[1] = 100
except Exception as e:
    print(e)
(1, 2, 3)
'tuple' object does not support item assignment
Think Python (2nd Edition) Chapter 12
To make a tuple with one element: x = (1,)
Immutability allows tuples to be used as dict keys and be stored in sets:
my_key = (1, 2, 3, "four")
my_dict = {my_key: "my favorite numbers"}
my_set = set()
my_set.add(my_key)
print(my_dict)
print(my_set)
my_other_key = (1, ["a", "list"], 3)
try: # All elements of the tuple need to be immutable too
    my_set.add(my_other_key)
except Exception as e:
    print(e)
{(1, 2, 3, 'four'): 'my favorite numbers'}
{(1, 2, 3, 'four')}
unhashable type: 'list'
from collections import Counter # Makes a dictionary that counts elements
my_str = "I enjoy eating almonds"
def print_data(data):
    for item in data:
        print(item, end=" ")
    print()
print_data(my_str)
print_data(set(my_str)) # No order or repeats
print_data(list(my_str))
print_data(tuple(my_str))
print_data(Counter(my_str)) # Dictionaries are ordered by insertion, iterates over keys
I e n j o y e a t i n g a l m o n d s
o g d e l s t y j n m i a I
I e n j o y e a t i n g a l m o n d s
I e n j o y e a t i n g a l m o n d s
I e n j o y a t i g l m d s
Think Python (2nd Edition) Chapter 7, Chapter 8.3
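A small sketch of three built-in iteration helpers: enumerate, zip, and map. The names and ages lists are illustrative values, not part of the original notes.

```python
names = ["amy", "barry", "carl"]
ages = [20, 30, 25]

# enumerate yields (index, value) pairs
numbered = [(index, name) for index, name in enumerate(names)]

# zip walks several iterables in step and stops at the shortest one
pairs = list(zip(names, ages))

# map applies a function to every element (it is lazy, so wrap it in list)
upper = list(map(str.upper, names))

print(numbered)
print(pairs)
print(upper)
```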
len: Get the natural size of an object
enumerate: Iterate over an object, but with index, value pairs
zip: Iterate over two or more objects at the same time
map: Map the values of an iterable using a function
itertools module: Contains many useful functions for specific kinds of iteration
while loops are useful if you are unsure how many iterations are needed
continue and break
def my_func(a, b, c=5):
    return f"({a} + {b}) * {c} = {(a+b) * c}"
print(my_func(1, 1, 0))
print(my_func(10, 4))
(1 + 1) * 0 = 0
(10 + 4) * 5 = 70
def my_func_with_inf_args(*args, **kwargs):
    print(args)
    print(kwargs)
my_func_with_inf_args(1, 2, 3, a=4, b=5)
(1, 2, 3)
{'a': 4, 'b': 5}
def call_twice(some_func, *args, **kwargs):
    some_func(*args, **kwargs)
    some_func(*args, **kwargs)
call_twice(my_func_with_inf_args, 1, 2, f=9)
(1, 2)
{'f': 9}
(1, 2)
{'f': 9}
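Because functions can be passed as arguments like any other object, as call_twice does above, short one-off functions are often written as lambdas. A minimal sketch follows; the people list is made up for illustration.

```python
people = [("amy", 20), ("barry", 30), ("carl", 25)]

# The lambda extracts the age, which sorted and max use as the comparison key
by_age = sorted(people, key=lambda person: person[1])
oldest = max(people, key=lambda person: person[1])

print(by_age)
print(oldest)
```

A lambda is limited to a single expression, so anything longer is usually clearer as a regular def.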
lambda function
Objects are everywhere and naturally you can make your own.
from random import shuffle
SUITS = ("Hearts", "Spades", "Clubs", "Diamonds")
RANKS = (2, 3, 4, 5, 6, 7, 8, 9, 10, "Jack", "Queen", "King", "Ace")
class CardDeck:
    def __init__(self, empty=False): # Constructor
        self.cards = []
        if not empty:
            for suit in SUITS:
                for rank in RANKS:
                    self.cards.append((suit, rank))
    def add_card(self, suit, rank):
        self.cards.append((suit, rank))
    def shuffle(self):
        shuffle(self.cards) # Calls the module-level shuffle imported from random
    def draw_card(self):
        return self.cards.pop()
my_deck = CardDeck()
my_deck.shuffle()
print(my_deck.draw_card())
print(my_deck.draw_card())
('Hearts', 9)
('Spades', 7)
Think Python (2nd Edition) Chapter 15, Chapter 16, Chapter 17, Chapter 18
Your own!
class DiscardPile(CardDeck): # Inheritance lets you reuse code
    def __init__(self):
        CardDeck.__init__(self, empty=True)
    def add_card(self, deck):
        drawn_card = deck.draw_card()
        self.cards.append(drawn_card)
        return drawn_card
my_deck = CardDeck()
discard = DiscardPile()
discarded_card = discard.add_card(my_deck)
(print)
(draw_card)
(my_deck.cards)
__init__ method
(my_deck)
If your program runs into an error, it will terminate if the resulting Exception is not caught.
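A minimal sketch of raising and catching an exception so the program keeps running. EmptyDeckError and this standalone draw_card function are hypothetical names introduced for the example; they are not part of the CardDeck class above.

```python
class EmptyDeckError(Exception):
    """Hypothetical custom exception for drawing from an empty pile."""

def draw_card(cards):
    if not cards:
        raise EmptyDeckError("no cards left to draw")
    return cards.pop()

cards = [("Hearts", 9)]
first = draw_card(cards)

try:
    draw_card(cards)  # The list is now empty, so this raises
except EmptyDeckError as e:
    message = f"Caught: {e}"

print(first)
print(message)
```

Catching the specific exception type, rather than a bare Exception, avoids hiding unrelated bugs.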
Exception: Base class for nearly all built-in exceptions
ArithmeticError: Base exception for OverflowError, ZeroDivisionError, FloatingPointError
AttributeError: Attempting to access an attribute that does not exist on an object
IndexError: Attempting to use an invalid index on a sequence
KeyboardInterrupt: Raised when ctrl-c is pressed during execution (inherits from BaseException, not Exception)
NameError: Using a variable that does not exist
TypeError: Operating on two objects with incompatible types
ValueError: Input to a function is invalid
raise your own!
# Writing files, use the mode argument
# Careful! Mode "w" erases the file's existing contents if it is present
my_file = open("path_to_file.txt", mode="w")
my_file.write("output I want to keep")
my_file.close() # You must close the file, or your results may be lost
# Reading files
my_file = open("path_to_file.txt", mode="r") # The mode is r by default
print(my_file.readline())
my_file.close()
Python Docs on Files, Think Python (2nd Edition) Chapter 14
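The same write-then-read sequence can be sketched with with blocks, which close the file automatically. The temporary directory is an assumption made here so the example does not touch real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "path_to_file.txt")

    # The file is closed when the block ends, even if an exception is raised
    with open(path, mode="w") as my_file:
        my_file.write("output I want to keep")

    with open(path) as my_file:  # mode="r" is the default
        first_line = my_file.readline()

    closed_after_block = my_file.closed

print(first_line)
print(closed_after_block)
```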
The os.path module and the pathlib module contain many methods for operating on the file system:
os.path.exists: Determine if a file exists at a given path
os.path.join: Join two or more path components together
os.listdir: List files in a directory
json.load in the json module is vital for reading .json files. The module is also useful for writing dictionaries to a file with json.dump.
.readlines(): Read all the lines of a file into a list at once
Use the with statement to automatically close files when you are done using them. This includes if your program terminates unexpectedly.
A friend has a directory of 1000 files where each file has one of the following extensions: .csv, .tsv, .json. However, each file has comments throughout it delimited by ##, so they do not follow the proper format. They ask you to write a Python script which will combine all the files into one while removing the comments and ensuring the data is in a proper .csv format.
Character-delimited files and JSON
person,age,job,favorite_color
amy,20,waiter,blue
barry,30,engineer,grey
## we have only adults in the dataset
carl,25,None,purple ## None means unemployed
## this person did not understand the survey dan,29,superhero,pineapple
person age job favorite_color
amy 20 waiter blue
barry 30 engineer grey
## we have only adults in the dataset
carl 25 None purple ## None means unemployed
## this person did not understand the survey dan 29 superhero pineapple
{
"data": [ ## List of people
{
"name": "amy",
"age": 20,
"job": "waiter",
"favorite_color": "blue"
},
{
## Barry is friends with my Dad, Jerry
"name": "barry",
"age": 30,
"job": "engineer",
"favorite_color": "grey"
},
... omitted for brevity
]
}
Create a variable to store data from all files
For every file my friend has
    Read it in line-by-line
    Remove all comments in each line
    Remove all empty lines
    If it is a .csv
        Separate the commas and store it
    If it is a .tsv
        Separate the tabs and store it
    If it is a .json
        Read the .json and store it
Turn the variable that is storing all the data into a .csv
Create a variable to store data from all files
For every file my friend has, Read it in as a string
import os
path_to_folder = "./my_friend/stored/the/files/here"
for file_name in os.listdir(path_to_folder):
    # os.listdir returns bare file names, so join each with the folder path
    file_path = os.path.join(path_to_folder, file_name)
    with open(file_path) as file: # Closes the file automatically
        file_as_lines = file.readlines()
    # [
    # "person,age,job,favorite_color\n",
    # "amy,20,waiter,blue\n",
    # "barry,30,engineer,grey\n",
    # "## we have only adults in the dataset\n",
    # "carl,25,None,purple ## None means unemployed\n",
    # "## this person did not understand the survey dan,29,superhero,pineapple"
    # ]
Remove all comments in each line then remove all empty lines
def remove_comments(line):
    no_whitespace = line.strip() # => "## this is a comment"
    if "##" in no_whitespace:
        comment_starts_at_index = no_whitespace.index("##")
        filtered = no_whitespace[:comment_starts_at_index]
        return filtered.strip() # In case there are spaces around the comment
    return no_whitespace
print("test 1", remove_comments(" ## this is a comment "))
print("test 2", remove_comments("person,age,job,favorite_color"))
print("test 3", remove_comments("person,age,job,favorite_color## comments "))
test 1
test 2 person,age,job,favorite_color
test 3 person,age,job,favorite_color
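One possible way to apply the comment removal to every line and then drop the empty ones. This is a sketch: remove_comments is restated here in a compact form with the same behavior, and building file_with_no_comments by joining the surviving lines is an assumption about how that variable is produced.

```python
def remove_comments(line):
    # Keep only the text before the first "##", then trim whitespace
    return line.split("##", 1)[0].strip()

file_as_lines = [
    "person,age,job,favorite_color\n",
    "amy,20,waiter,blue\n",
    "## we have only adults in the dataset\n",
    "carl,25,None,purple ## None means unemployed\n",
]

cleaned = [remove_comments(line) for line in file_as_lines]
non_empty = [line for line in cleaned if line]  # Empty strings are falsy

# Assumed form of the string handed to the later parsing step
file_with_no_comments = "\n".join(non_empty)
print(file_with_no_comments)
```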
If it is a .csv or .tsv, separate the commas or tabs and store it
def get_delimited_entries(file_as_str, delimiter):
    to_return = []
    lines = file_as_str.split("\n")
    for line in lines[1:]: # Skip header
        to_return.append(line.split(delimiter))
    return to_return
if file_path.endswith(".csv"):
    my_data.extend(get_delimited_entries(file_with_no_comments, ","))
elif file_path.endswith(".tsv"):
    my_data.extend(get_delimited_entries(file_with_no_comments, "\t"))
If it is a .json read the .json and store it
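A hedged sketch of the .json step. The ## comments make the raw file invalid JSON, so they are stripped line by line before parsing. It assumes "##" never appears inside a string value, and the raw_lines sample is a shortened stand-in for a real file.

```python
import json

raw_lines = [
    '{',
    '  "data": [ ## List of people',
    '    {"name": "amy", "age": 20, "job": "waiter", "favorite_color": "blue"},',
    '    ## Barry is friends with my Dad, Jerry',
    '    {"name": "barry", "age": 30, "job": "engineer", "favorite_color": "grey"}',
    '  ]',
    '}',
]

# Strip comments, drop lines that become empty, then parse what is left
cleaned = [line.split("##", 1)[0].strip() for line in raw_lines]
parsed = json.loads("\n".join(line for line in cleaned if line))

# Convert each person into a row with the same column order as the .csv files
rows = [
    [person["name"], str(person["age"]), person["job"], person["favorite_color"]]
    for person in parsed["data"]
]
print(rows)
```

Converting age to a string keeps the rows consistent with those read from the delimited files, where every entry is text.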
Turn the variable that is storing all the data into a .csv
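For the final step, the standard csv module can write the collected rows. In this sketch an in-memory io.StringIO buffer stands in for a real output file, and the header and my_data values are illustrative.

```python
import csv
import io

header = ["person", "age", "job", "favorite_color"]
my_data = [
    ["amy", "20", "waiter", "blue"],
    ["barry", "30", "engineer", "grey"],
]

# StringIO stands in for a file opened with mode="w" and newline=""
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)    # One header row for the combined file
writer.writerows(my_data)  # Then every stored entry

csv_text = buffer.getvalue()
print(csv_text)
```

Using csv.writer rather than joining on commas by hand means any value that itself contains a comma gets quoted correctly.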
Full script here
In a professional environment, unit tests may be written to ensure bugs are not introduced during the development process.
import unittest
from my_module import my_func
class TestMyStuff(unittest.TestCase):
    def test_my_func(self):
        my_output = my_func(1, 2, 3)
        self.assertEqual(6, my_output)
    def test_my_func_with_strings(self):
        my_output = my_func(1, 2, 3, "h")
        self.assertEqual(60, my_output)
    def test_my_func_raises_exception(self):
        with self.assertRaises(ValueError):
            my_output = my_func(1, 2, 3, "h", "")
C, C++, Fortran, etc. to achieve very similar speeds.
numpy is so much faster.