
Python Tips
PURPOSE
Sharing a collection of quick references for common issues encountered when coding in Python
- CLASSES
- FUNCTIONS
- DICTIONARIES
- LISTS
- FORMATTING
- Testing via Python
- MISC UTIL
- Round Robin Ticket Assigner
- Using map to get sum of a 2D array
- Get files matching a regular expression
- Sending List/Array as data while making a request
- Suppress error of Subprocess check Output
- Make multilevel directories
- Store temporary information to Temp Folder
- Restore Timestamps of Extracted Files
- Binomial Coefficient - nCr
- Python not able to detect folders as packages
- Upgrade Pip in Venv
CLASSES
Using variables of Parent Class
To access variables from parent class in child class,
- Call the constructor (the __init__ method) of the Parent in the Child class's __init__ method.
class Parent:
    def __init__(self):
        self.parent_name = "Parent"
class Child(Parent):
    def __init__(self):
        self.child_name = "Child"
        Parent.__init__(self)
    def print_all(self):
        print("parent:", self.parent_name)
        print("child:", self.child_name)
Child().print_all()
Load value during Class Initialization using own function
class A:
    def __init__(self):
        self.a = self.load_from_func()
    def load_from_func(self):
        return "test_value"
    def print_a(self):
        print(self.a)
obj = A()
obj.print_a()
This can be useful when you need to read a file, such as a JSON file, to fill in the values of your instance variables.
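As a rough sketch of the JSON case mentioned above, the class below fills an attribute from a file during initialization. The class name `Settings`, the file name, and the keys are hypothetical and only serve the illustration.

```python
import json
import os
import tempfile

class Settings:
    """Hypothetical class whose values are loaded from a JSON file."""
    def __init__(self, path):
        self.path = path
        # Call our own method during initialization to fill the attribute.
        self.values = self.load_from_file()

    def load_from_file(self):
        with open(self.path) as f:
            return json.load(f)

# Write a small JSON file to the temp folder so the example is self-contained.
config_path = os.path.join(tempfile.gettempdir(), "settings_example.json")
with open(config_path, "w") as f:
    json.dump({"name": "demo", "retries": 3}, f)

settings = Settings(config_path)
print(settings.values)
os.remove(config_path)  # clean up the example file
```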
FUNCTIONS
Get current function's name
There may be times when you want to dynamically get the current function's name, for example, when trying to add functionality to a test suite such as pytest. The simplest way is to use the inspect module.
import inspect
this_function_name = inspect.currentframe().f_code.co_name
print(this_function_name)
You can also inspect the stack to get the function's name:
import inspect
fn_name = inspect.stack()[0][3]  # 0 is the stack depth (the current frame); index 3 holds the function name
print(fn_name)
Accessing Function attributes if you know its name
Suppose you have the name of a function that belongs to a class, and you want to call it or access its attributes. You can use the built-in getattr function to get the function object from its name.
import inspect
# Assuming that this function belongs to the class TestRobustaRegistry
def fill_fn_dict(self, value):
    fn_name = inspect.stack()[1][3]  # depth 1 -> name of the calling function
    fn = getattr(TestRobustaRegistry, fn_name)  # `fn` becomes the function
    fn.__dict__['value'] = value  # Adding attributes to function dictionary
    print(f"{value} added to {fn}.__dict__")
    test_doc_string = fn.__doc__  # get the docstring of the function
    print(f'Test docstring: {test_doc_string}')
DICTIONARIES
Delete keys from dictionary
You can use the del keyword to delete keys from a dictionary
test_dict = {'one': 'val1', 'two': 'val2'}
print(test_dict)
del test_dict['one']
print(test_dict)
Get the first key from a dictionary
Use dict.keys() to get the keys, and then fetch the first one by index
test_dict = {'one': 'val1', 'two': 'val2'}
print(test_dict)
print(test_dict.keys())
first_key = list(test_dict.keys())[0]
print(first_key)
Pretty print dictionaries while logging
When logging data structures such as dictionaries, the contents are hard to decipher unless you look very closely. This defeats the purpose of logging in the first place.
You might have used pprint.pprint for printing dictionaries to the command line. Similarly, we can use pprint.pformat. It takes the input and generates a pretty-printed string, which can then be passed to the logger.
Furthermore, you can add a \n character so that the dictionary does not awkwardly start directly after the timestamp.
import logging
from pprint import pformat
ds = [{'hello': 'there'}]
logging.debug(f"logging datastructure:\n{pformat(ds)}")
Use copy to make duplicate dictionaries
There could be times when you want to check whether a dictionary has changed.
For this, you decide to store the current state in a temp dictionary and do the operations on the original one.
After the operation, you expect a simple == check to tell you whether anything changed. Overall, a reasonable approach.
So you start by saying,
current = {'1': 'one'}
temp = current
current['2'] = 'two'
if temp == current:
    print("Something's wrong I can feel it!")
else:
    print("State has changed")
You run the above code and wonder why the equality check is passing.
The reason is that when you use the = operator to make a temporary copy, you are actually only creating a reference to the same dictionary! (Pointer nightmares intensify.)
For this specific purpose, what you want is a shallow copy of the original dictionary and not a reference.
In Python, you can do that by using the dictionary's copy method.
current = {'1': 'one'}
temp = current.copy()
current['2'] = 'two'
if temp == current:
    print("Something's wrong I can feel it!")
else:
    print("State has changed")
Voila, you can now compare state between dictionaries!
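One caveat worth sketching: a shallow copy still shares any nested objects with the original. The example below is a minimal illustration, assuming a dictionary with a nested list (the key name `tags` is made up), and uses copy.deepcopy from the standard library when nested values must be compared too.

```python
import copy

original = {"tags": ["a", "b"]}

shallow = original.copy()        # new outer dict, same inner list object
deep = copy.deepcopy(original)   # new outer dict and a new inner list

# Mutate the nested list in place.
original["tags"].append("c")

print(shallow == original)  # the shallow copy shares the list, so it changed too
print(deep == original)     # the deep copy kept the old state
```

For flat dictionaries like the one above, the copy method is enough; deepcopy only matters once values are themselves mutable containers.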
Dictionary to JSON
import json
data = {
'catbs5': {
'en': [
{'article_id': '123', 'title': '123title'},
{'article_id': '1234', 'title': '1234title'},
],
'tw': [
{'article_id': '123', 'title': '123title'},
{'article_id': '1234', 'title': '1234title'},
],
},
'catbs4': {
'en': [
{'article_id': '123', 'title': '123title'},
{'article_id': '1234', 'title': '1234title'},
],
'tw': [
{'article_id': '123', 'title': '123title'},
{'article_id': '1234', 'title': '1234title'},
],
}
}
json_data = json.dumps(data)
print(json_data)
LISTS
List slicing
Lists support slicing, which lets you specify the range of indexes you want to access.
[start:stop:step]
- start: read the list starting from this index (inclusive)
- stop: stop reading the list just before this index (exclusive)
- step: by default, the list is read one element at a time. To change the step size, pass in this value.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(a)
print(a[1:])
print(a[1:3])
print(a[4:-2])
print(a[1::2])
Convert list to indexed tuple list
Say you want to sort the elements of a list, but don't want to lose the original index,
you can convert your list to an indexed tuple list.
num_list = [0, 1, 2, 3, 3, 4, 3]
indexed_num_list = list(enumerate(num_list))
print(num_list)
print(indexed_num_list)
sorted_list_with_index_stored = sorted(indexed_num_list, key=lambda tuple_elem: tuple_elem[1], reverse=True)
print(sorted_list_with_index_stored)
Unpacking lists to individual values
To unpack a list and store its items in individual variables:
row = ["Title", "url", 33, "title2", "keyword"]
title, url, price, title2, keyword = row
print(title, url, price, title2, keyword)
Combining lists together
To combine lists together, you can use the plus (+) operator
a = [1, 2]
b = [3, 4]
c = a + b
print(c)
Bonus: You can also use the multiply (*) operator to generate a list with repeated elements
n = 10
a = [0]
b = a * n
print(b)
Generating strings from lists after filtering False values
In case you want to join a list of string values while ignoring values that evaluate to False,
- use filter together with join
x = ["a", "b", None, "4"]
y = " | ".join(filter(None, x))
print(y)
- Note that the filter function can also take functions as filters.
- The passed function must return True/False for each value in the passed sequence.
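A small sketch of passing a function to filter is shown below. The predicate name `is_long_word` and the sample words are made up for illustration.

```python
def is_long_word(word):
    """Hypothetical predicate: keep only strings longer than three characters."""
    return len(word) > 3

words = ["tree", "a", "python", "", "map"]

# filter calls the predicate once per element and keeps the truthy results.
long_words = list(filter(is_long_word, words))
print(long_words)

joined = " | ".join(filter(is_long_word, words))
print(joined)
```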
Sort by length
There is a Pythonic way to sort elements by their length: use the key parameter of the sorted function.
list_of_elems = ['ccc', 'aaaa', 'd', 'bb']
print(sorted(list_of_elems, key=len))
data = [
    {'len': 12, 'name': 33},
    {'len': 1, 'name': 33},
]
print(sorted(data, key=lambda elem: elem['len']))
The key parameter can also be passed to the max function
list_of_elems = ['ccc', 'aaaa', 'd', 'bb']
print(max(list_of_elems, key=len))
data = [
    {'len': 12, 'name': 33},
    {'len': 1, 'name': 33},
]
print(max(data, key=lambda elem: elem['len']))
FORMATTING
Add 0 padding to strings
There are times when you need to add padding zeroes to numbers that you are converting to strings.
- Use zfill, a standard string method specifically designed for this use case
- For example, when calculating time differences and then printing the output
hrs = 4
minutes = 3
time = f"{str(hrs).zfill(2)} hour(s) {str(minutes).zfill(2)} min(s)"
print(time)
You could also use general string formatting on the numbers directly:
hrs = 4
mins = 3
print(f"{hrs:02} hour(s) {mins:02} min(s)")
Stripping values generated during a split operation
- Use list comprehension
test_str = "a, b, c,d"
out_list = [val.strip() for val in test_str.split(',')]
print(out_list)
Convert Numbers to Hex
The int function used for converting values to numbers also supports base conversion.
To convert a hexadecimal string to a number, just pass in the base as 16.
num_hex = int('fff', 16)
print(num_hex)
# Similarly for octal numbers, we can pass base 8
num_oct = int('66', 8)
print(num_oct)
Note that the string should be a valid hexadecimal number (i.e. only the digits 0-9 and the letters a-f, in either case, are allowed)
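For the opposite direction, turning an integer into a hexadecimal or octal string, the built-ins hex, oct and format can be used. The sketch below reuses the values from the example above.

```python
number = 4095

print(hex(number))           # string with the 0x prefix
print(format(number, "x"))   # lowercase hex digits, no prefix
print(f"{number:04X}")       # uppercase, zero-padded to 4 digits
print(oct(54))               # octal string with the 0o prefix

# Round trip: the hex string parses back to the same integer.
assert int(format(number, "x"), 16) == number
```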
Convert bytes to Human Readable format
To convert bytes to a human readable format, we repeatedly divide the number by 2**10 (1024) for as long as it is large enough to divide. The number of divisions gives us the unit label (KB/MB/GB) for the original input in bytes.
import logging
import traceback
def format_bytes(size):
    """Convert bytes to Human Readable Sizes"""
    try:
        power = 2**10
        n = 0
        power_labels = {0: '', 1: 'K', 2: 'M', 3: 'G', 4: 'T'}
        # stop at the largest label so that the lookup below cannot fail
        while size >= power and n < len(power_labels) - 1:
            size /= power
            n += 1
        return f"{size:.2f} {power_labels[n]}B"
    except Exception:
        logging.error(traceback.format_exc())
        return size
print(format_bytes(12345))
print(format_bytes(12345678910))
Testing via Python
Good Testcase pattern to follow
import logging
import traceback
log = logging.getLogger(__name__)
def verify_something():  # hypothetical name for the test helper
    status = False
    try:
        # verification logic
        # if required, use internal try/except clause
        # if happy, then set status to true and break out of any loop that is running
        status = True
    except Exception:
        log.error(traceback.format_exc())
    return status
Skip Test Cases based on command line arguments in Pytest
We can use pytest hook (pytest_collection_modifyitems) to dynamically skip test cases based on argument values.
# In conftest.py
import pytest
def pytest_addoption(parser):
    # store the value passed on the command line (store_true would only yield True/False)
    parser.addoption(
        "--host", action="store", default="prod", help="Environment Prod or Engg"
    )
def pytest_collection_modifyitems(config, items):
    host_marker = config.getoption("--host")
    if host_marker in ['prod', '']:
        return
    skip_if_host_engg = pytest.mark.skip(reason="host should be prod")
    for item in items:
        if "skip_if_host_engg" in item.keywords:
            item.add_marker(skip_if_host_engg)
# In testfile
import pytest
@pytest.mark.skip_if_host_engg
def test_stats_for_live_chat():
    pass
MISC UTIL
Round Robin Ticket Assigner
We have an {agent_name: weightage} dictionary, based on which we need to assign tickets in a round robin manner.
1. Go over each agent one by one
2. If the agent has capacity to accept a ticket
- give them one ticket
- reduce their capacity by one
3. Repeat steps 1 & 2 till all agents have reached 0 capacity
agent_ticket_dict = {'Rahul': 3, 'Ramesh': 1, 'Rajesh': 0, 'Rakesh': 3, 'Brijesh': 4}
tags_list = []
sum_counts = sum(agent_ticket_dict.values())
print("Agent:Number of Tickets:" + str(agent_ticket_dict))
print(f"Number of Agents for this run: {len(agent_ticket_dict)}")
print(f"Total Tickets that will be assigned: {sum_counts}")
assigner_exhausted = False
while not assigner_exhausted:
    for agent in agent_ticket_dict:
        if agent_ticket_dict[agent]:
            tags_list.append('to_do_' + agent)
            agent_ticket_dict[agent] -= 1
    print(f"Assigner State:\n{agent_ticket_dict}")
    sum_counts = sum(agent_ticket_dict.values())
    print(f"Remaining Tickets that will be assigned: {sum_counts}")
    if not sum_counts:
        assigner_exhausted = True
Using map to get sum of a 2D array
arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sum_arr = sum(map(sum, arr))
print('Sum of 2D Array:', sum_arr)
test_list = ['Start', 'SSS', 'Strong', 'Table']
total_elems_starting_with_S = sum(map(lambda x: 1 if x[0] == "S" else 0, test_list))
print(total_elems_starting_with_S)
Get files matching a regular expression
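We can use glob and fnmatch for extracting files that match a specific pattern. Note that both modules match Unix shell-style wildcards (such as * and ?) rather than full regular expressions.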
Let's say you have a list of files
test_dir
│ test.txt
│ test2.txt
│ other.txt
│ test.py
And you want to extract files that have the name test in them,
or in other words, files that match the pattern: test*
- Using glob
import glob, os
with open('test.py', 'w') as f:
    f.write('import os')
files = glob.glob(os.getcwd() + '/*.py')
print(files)
- We get a list of files with their full paths
- Using fnmatch
import fnmatch, os
with open('test.py', 'w') as f:
    f.write('import os')
files = fnmatch.filter(os.listdir(), '*.py')
print(files)
- We can see that fnmatch returns the filename only
- To create the full path, we would need to use os.path.join on the current directory and the filename
Note: The glob module uses the os and fnmatch modules internally.
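The os.path.join step for fnmatch results can be sketched as follows. The example builds a throwaway directory with the same file names as the test_dir listing above, so that it can run anywhere, and matches the test* pattern.

```python
import fnmatch
import os
import tempfile

# Build a throwaway directory that mirrors the test_dir listing above.
with tempfile.TemporaryDirectory() as test_dir:
    for name in ["test.txt", "test2.txt", "other.txt", "test.py"]:
        with open(os.path.join(test_dir, name), "w") as f:
            f.write("")

    # fnmatch.filter works on bare names, so join each match with the directory.
    matches = sorted(fnmatch.filter(os.listdir(test_dir), "test*"))
    full_paths = [os.path.join(test_dir, name) for name in matches]

print(matches)
```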
Sending List/Array as data while making a request
Suppose you want to send something like:
[{"name": "ayush"}, {"name": "mandowara"}]
as data when making a request. You can do this very easily with the requests library.
import requests
url = 'https://thaturlwhichneeds.array/asparam'
headers = {"auth-token": "xyz-auth-token"}
data = [
    {'name': 'ayush'},
    {'name': 'mandowara'},
]
requests.post(url, json=data, headers=headers)
The json keyword will encode the data to (you guessed it) JSON. It will also set the Content-Type header to application/json.
I guess that's why requests has the tagline HTTP for Humans
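For comparison, here is a rough standard-library sketch of what the json keyword saves you from doing by hand: encoding the list yourself and setting the Content-Type header. It only builds the request object and does not send anything; the URL and token are the placeholders from the example above.

```python
import json
import urllib.request

url = "https://thaturlwhichneeds.array/asparam"  # placeholder URL from the example above
data = [{"name": "ayush"}, {"name": "mandowara"}]

# Encode the list ourselves and set the header that requests would set for us.
body = json.dumps(data).encode("utf-8")
req = urllib.request.Request(
    url,
    data=body,
    headers={"Content-Type": "application/json", "auth-token": "xyz-auth-token"},
    method="POST",
)

# Inspect the prepared request without sending it.
print(req.get_method())
print(req.data)
```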
Suppress error of Subprocess check Output
In case you call a process via subprocess, but do not wish to see its error output in case it writes any,
- Just redirect the standard error (stderr) to DEVNULL
- Note that check_output still raises CalledProcessError if the process exits with a non-zero status; only the stderr output is hidden
import subprocess as subp
subp.check_output("<Call the Process>", stderr=subp.DEVNULL)
Make multilevel directories
Suppose you want to create folders in a path such as test\inner_folder\main\, but inner_folder does not exist,
you can use os.makedirs
import os
base = os.getcwd()
dir_structure = 'test/inner_folder/main/'
print(base)
path = os.path.join(base, dir_structure)
os.makedirs(path, exist_ok=True)
# Note: exist_ok -> suppress OSError if path already exists
for d in os.walk('test'):
    print(d)
Store temporary information to Temp Folder
There could be situations where you are generating files that are only relevant during the execution of your script and are not meant to be stored for the long term. Moreover, you don't want these files to be tracked by Git. While you could add them to .gitignore, a much cleaner way is to use the temp folder provided by the OS itself. Let's say you want to create a lock file during the execution of a particular script so that another instance of the script does not interfere with the current execution:
import tempfile
from os.path import join, exists
tempfolder = tempfile.gettempdir() #Locate Temp Folder
lock_file = 'script_lock.lck'
lock_file_path = join(tempfolder, lock_file)
with open(lock_file_path, 'w') as f:
    f.write('Locking')
if exists(lock_file_path):
    print("Lock found.")
You can also use the tempfile.TemporaryFile() function to generate a temp file at run time if you do not need a particular file name.
import tempfile
f = tempfile.TemporaryFile(mode='w+')  # text mode; the default mode is binary ('w+b')
f.write('temporary info')
Restore Timestamps of Extracted Files
Usually when a zip is extracted, the timestamps of the extracted files correspond to the time of extraction rather than to the original modification time. This causes problems when we want to find the most recent file in a list of files. To fix this, we can restore the timestamps of the files by reading them from the original zip file.
Just call the restore_timestamps_of_zip_contents function after extraction is done:
import logging
import os
import time
import traceback
from zipfile import ZipFile
def restore_timestamps_of_zip_contents(zipname, extract_dir):
    """Restores the timestamps of zipfile contents after extraction
    Parameters
    ----------
    zipname: str
        path of the zip file which was extracted
    extract_dir: str
        where the zip was extracted
    Returns
    -------
    None
    """
    try:
        with ZipFile(zipname, 'r') as zf:
            for f in zf.infolist():
                # path to this extracted f-item
                fullpath = os.path.join(extract_dir, f.filename)
                # still need to adjust the dt o/w item will have the current dt
                date_time = time.mktime(f.date_time + (0, 0, -1))
                # update dt
                os.utime(fullpath, (date_time, date_time))
    except Exception:
        logging.warning(traceback.format_exc())
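To try the idea end to end, the sketch below creates a small archive with a known timestamp, extracts it, and copies the stored timestamp back onto the extracted file. The file names and the date are made up for illustration, and the restore step is inlined so that the example is self-contained.

```python
import os
import tempfile
import time
from zipfile import ZipFile, ZipInfo

with tempfile.TemporaryDirectory() as workdir:
    zip_path = os.path.join(workdir, "example.zip")
    extract_dir = os.path.join(workdir, "out")

    # Create an archive whose single member carries an old timestamp.
    info = ZipInfo("notes.txt", date_time=(2020, 1, 2, 3, 4, 6))
    with ZipFile(zip_path, "w") as zf:
        zf.writestr(info, "hello")

    with ZipFile(zip_path) as zf:
        zf.extractall(extract_dir)
        # Same idea as the function above: copy each stored timestamp back.
        for member in zf.infolist():
            stamp = time.mktime(member.date_time + (0, 0, -1))
            os.utime(os.path.join(extract_dir, member.filename), (stamp, stamp))

    restored_mtime = os.path.getmtime(os.path.join(extract_dir, "notes.txt"))
    expected_mtime = time.mktime((2020, 1, 2, 3, 4, 6, 0, 0, -1))

print(time.ctime(restored_mtime))
```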
Binomial Coefficient - nCr
To calculate nCr, use the built-in comb function from the math module (available from Python 3.8):
from math import comb
print(comb(10, 3))
Python not able to detect folders as packages
There are times when you have a proper folder structure for a project, but Python cannot work out that you are importing a file from within the project directory.
One way is to run your code from a proper starting point, such as the root of the project directory, and change your import statements to be relative to that starting point. However, this is a tedious process.
A quicker hack is to append your project folder to the system path using the sys module.
In the file where you are importing the other file as a module, add this to the top:
import sys
sys.path.append('/path/to/project_folder')
Now, Python will also search this folder when importing modules.
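The effect can be checked with a small self-contained sketch. It creates a throwaway folder containing one module (the names `demo_helpers` and `GREETING` are hypothetical), appends the folder to sys.path, and then imports the module.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway "project folder" containing one module (hypothetical names).
project_folder = tempfile.mkdtemp()
with open(os.path.join(project_folder, "demo_helpers.py"), "w") as f:
    f.write("GREETING = 'hello from demo_helpers'\n")

# Without this line, the import below would raise ModuleNotFoundError.
sys.path.append(project_folder)

# Make sure the import system notices the freshly written file.
importlib.invalidate_caches()
demo_helpers = importlib.import_module("demo_helpers")
print(demo_helpers.GREETING)
```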
Upgrade Pip in Venv
While using Python, you will often see the warning that you are using an old version of pip.
This is especially annoying because you seemingly cannot upgrade pip in a venv due to an Access Denied error.
Well, worry not, the fix is simpler than you think.
Just run this
py -m pip install --upgrade pip #this is correct
Instead of this
pip install --upgrade pip
This is because when you run pip directly, without the py command, pip is trying to replace itself,
i.e. a running executable is supposed to be overwritten, which is denied by some operating systems.
When you run it through py, pip runs as a module inside the Python interpreter rather than as its own executable, and hence this problem is avoided.
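The py launcher ships with Python on Windows. On other systems, or from inside a script, the same idea can be sketched with sys.executable, which points at the interpreter that is currently running. The snippet below only builds and prints the command; running it is left commented out.

```python
import sys

# sys.executable is the interpreter running this script, so inside an
# activated venv it points at the venv's own Python.
upgrade_cmd = [sys.executable, "-m", "pip", "install", "--upgrade", "pip"]
print(" ".join(upgrade_cmd))

# To actually run the upgrade (not executed here):
# import subprocess
# subprocess.run(upgrade_cmd, check=True)
```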