You may look at my job title (or picture) and think, “Oh, this is easy, he’s going to resolve to stand up at his desk more.” Well, you’re not wrong, that is one of my resolutions, but I have an even more important one. I, Jeremy Gibson, resolve to do less work in 2019. You’re probably thinking that it’s bold to admit this on my employer’s blog. Again, you’re not wrong, but I think I can convince them that the less work I do, the clearer and more functional my code will become. My resolution has three components.
1) I will stop using os.path to do path manipulations and will only use pathlib.Path on any project that uses Python 3.4+
I acknowledge that pathlib is better than me at keeping operating system eccentricities in mind. It is also better at keeping my code DRYer and more readable. I will not fight that.
Let's take a look at an example that is very close to parity. First, a simple case using os.path and pathlib.
```python
# Opening a file with os.path
import os

p = 'my_file.txt'
if not os.path.exists(p):
    open(p, 'a').close()
with open(p) as fh:
    pass  # Manipulate
```
Next, pathlib.Path
```python
# Opening a file with Path
from pathlib import Path

p = Path("my_file.txt")
if not p.exists():
    p.touch()
with p.open() as fh:
    pass  # Manipulate
```
This seems like a minor improvement, if any at all, but hear me out. The pathlib version is more internally consistent. Pathlib sticks to its own idiom, whereas os.path must step outside of itself to accomplish path-related tasks like file creation. While this might seem minor, not having to code-switch to accomplish a task can be a big help for new developers and veterans, too.
Not convinced by the previous example? Here’s a more complex example of path work that you might typically run across during development — validating a set of files in one location and then moving them to another location, while making the code workable over different operating systems.
With os.path
```python
import os
from shutil import move

p_source = os.path.join(os.getcwd(), "my", "source", "path")
p_target = os.path.join("some", "target", "path")

for root, dirs, files in os.walk(p_source):
    for f in files:
        if f.endswith(".tgz"):
            # Validate
            if valid:
                move(os.path.join(root, f), p_target)
```
With pathlib
```python
from pathlib import Path

# pathlib translates path separators
p_source = Path.cwd() / "my" / "source" / "path"
p_target = Path("some/target/path")

for pth in p_source.rglob("*.tgz"):
    # Validate
    if valid:
        pth.rename(p_target / pth.name)
```
Note: with pathlib I don't have to worry about os.sep.
Less work!
More readable!
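To see what I mean about separators, here is a small sketch using pathlib's Pure* classes (which model paths without touching the filesystem, so it runs the same on any OS):

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same / operator produces the right separator for each flavor.
posix = PurePosixPath("some") / "target" / "path"
windows = PureWindowsPath("some") / "target" / "path"

print(posix)    # some/target/path
print(windows)  # some\target\path
```

On a real system, plain Path picks the correct flavor for the OS it is running on, so I never have to spell out os.sep myself.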
Also, as in the first example, all path manipulation and control is now contained within the library, so there is no need to pull in outside os functions or the shutil module. To me, this is more satisfying. When working with paths, it makes sense to work with one type of object that understands itself as a path, rather than different collections of functions nested in other modules.
Ultimately, for me, this is a more human way to think about the processes that I am manipulating, which makes them easier and less work. Yaay!
2) I will start using f'' strings on Python 3.6+ projects.
I acknowledge that adding .format() is a waste of precious line characters (I'm looking at you, PEP 8) and % notation is unreadable. The f'' string makes my code more elegant and easier to read. F-strings also move closer to the other idioms used by Python, like r'' and b'' and the (no longer necessary on Python 3) u''. Yes, this is a small thing, but less work is the goal.
```python
for k, v in somedict.items():
    print('The key is {}\n The value is {}'.format(k, v))
```
vs.
```python
for k, v in somedict.items():
    print(f'The key is {k}\n The value is {v}')
```
Another advantage in readability and maintainability is that I don't have to keep track of parameter position, as before with .format(k, v), if I later decide that I really want v before k.
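Here is a small sketch of that reordering point (the key/value strings are just made-up sample data):

```python
k, v = "color", "blue"

# With .format, swapping the output order means juggling positional
# indexes (or switching to named placeholders).
old = "The value is {1}, the key is {0}".format(k, v)

# With an f-string, I just edit the string itself.
new = f"The value is {v}, the key is {k}"

print(old)  # The value is blue, the key is color
print(new)  # The value is blue, the key is color
```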
3) I will work toward, as much as possible, writing my tests before I write my code.
I acknowledge that I am bad about jumping into a problem, trying to solve it before I fully understand the behavior I want to see (don't judge me, I know some of you do this, too). I hope, foolishly, that the behavior will reveal itself as I solve the various problems that crop up.
Writing your tests first may seem counterintuitive, but hear me out. This is known as Test-Driven Development (TDD). Rediscovered by Kent Beck in 2003, it is a programming methodology that seeks to tackle the problem of managing code complexity with our puny human brains.
The basic concept is simple: to understand how to build your program you must understand how it will fail. So, the first thing that you should do is write tests for the behaviors of your program. These tests will fail and that is good because now you (the programmer with the puny human brain) have a map for your code. As you make each test pass, you will quickly know if the code doesn’t play well with other parts of the code, causing the other tests to fail.
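A minimal sketch of that test-first flow, using the standard library's unittest (the slugify function and its behavior here are hypothetical, invented just to illustrate writing the tests before the code):

```python
import unittest

def slugify(title):
    # Written *after* the tests below, with just enough code to make
    # them pass.
    return title.strip().lower().replace(" ", "-")

class TestSlugify(unittest.TestCase):
    # These tests come first: they pin down the behavior I want,
    # fail until slugify exists, then guard it against regressions.
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("My New Post"), "my-new-post")

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(slugify("  Hello World "), "hello-world")

if __name__ == "__main__":
    unittest.main()
```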
This idea is closely related to Acceptance Test Driven Development, which you may have also heard of, and is mentioned in this Caktus post.
It All Adds Up
Although these three parts of my resolution are not huge, together they will allow me to work less: initially, as I write the code, and then in the future, when I come back to code that I wrote two sprints ago and that is now a mystery to me.
So that's it, I'm working less next year, and that will make my code better.