An Opinionated Guide for Writing Awesome Python Scripts!
Introduction
In Python development, the right setup can significantly boost productivity and help you write robust scripts. A well-chosen collection of libraries can make a huge difference in the quality of the programs you write. From command-line argument parsing to HTTP request handling, this post walks you through setting up an opinionated Python environment tailored for writing robust, reliable scripts.
I have combined all the libraries here and published a working prototype in this repository. Feel free to check it out: https://github.com/vik-y/opinionated-python-example
Environment Management with Pipenv
Before diving into the libraries that will power our scripts, let’s talk about environment management. Pipenv combines the best of pip and virtualenv, creating a seamless environment for your Python projects.
Install Pipenv:
pip install pipenv
With Pipenv, you can manage your project’s packages using the Pipfile and Pipfile.lock, ensuring that everyone on your team is working with the same package versions.
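For reference, a Pipfile covering the libraries used in this post might look like the sketch below (the version wildcards and Python version are illustrative, not prescriptive):

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
fire = "*"
pydantic = "*"
requests = "*"
tenacity = "*"
celery = "*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.11"
```

Running pipenv install against this file generates a Pipfile.lock pinning exact versions, which is what keeps installs reproducible across machines.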
Command-Line Interfaces with Fire
Google’s Fire library transforms any Python object, function, or module into a command-line interface. This makes it incredibly easy to create scripts that are both powerful and user-friendly.
Install Fire:
pipenv install fire
Example usage:
import fire

def greet(name="World"):
    return f"Hello, {name}!"

if __name__ == "__main__":
    fire.Fire(greet)
Running this script with python script.py --name=Python will output "Hello, Python!".
With Fire, you can quickly add a CLI to your scripts.
Data Validation with Pydantic
Pydantic provides data validation using Python type annotations. This ensures that the data flowing through your application is correct, reducing the likelihood of bugs and making your scripts more robust.
Install Pydantic:
pipenv install pydantic
Example usage:
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str

user = User(id=1, name="John Doe")
print(user)
Instead of a plain dataclass, you can use Pydantic to get robust objects with data validation built in.
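To see that validation in action, here is a minimal sketch (reusing the same `User` model) showing how Pydantic coerces compatible input and rejects bad input, instead of silently accepting it:

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str

# Compatible input is coerced: the string "1" becomes the int 1
user = User(id="1", name="John Doe")
print(user.id)

# Incompatible input raises ValidationError instead of passing bad data through
try:
    User(id="not-a-number", name="John Doe")
except ValidationError as e:
    print("Validation failed for field:", e.errors()[0]["loc"])
```

This fail-fast behavior is exactly what makes scripts more robust: malformed data is caught at the boundary where it enters your program, not deep inside your logic.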
Containerization with Docker
Docker allows you to package your application and its dependencies together into a container, ensuring that it runs the same way on every machine.
Example Dockerfile:
FROM python:3.11-slim
WORKDIR /app
COPY Pipfile Pipfile.lock ./
RUN pip install pipenv && pipenv install --deploy --ignore-pipfile
COPY . .
CMD ["pipenv", "run", "python", "your_script.py"]
Build and run your Docker container:
docker build -t my-python-app .
docker run my-python-app
Dockerizing Python scripts is simple: the example above works for any script that uses Pipenv.
Testing with Pytest
Pytest is a powerful tool for writing simple and scalable test cases. Ensure your scripts are working as expected by covering them with tests.
Install Pytest:
pipenv install pytest
Example test:
from script import greet  # assuming greet lives in script.py

def test_greet():
    assert greet("Python") == "Hello, Python!"
Run your tests:
pipenv run pytest
Any useful script should be testable; if you’re writing anything serious, cover it with pytest. Here’s a great resource for learning pytest: https://github.com/pluralsight/intro-to-pytest
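One pytest feature worth knowing early is parametrize, which runs a single test over multiple inputs. A self-contained sketch (with `greet` defined inline for illustration):

```python
import pytest

def greet(name="World"):
    return f"Hello, {name}!"

@pytest.mark.parametrize("name,expected", [
    ("Python", "Hello, Python!"),
    ("World", "Hello, World!"),
])
def test_greet(name, expected):
    # pytest runs this test once per (name, expected) tuple above
    assert greet(name) == expected
```

Each tuple shows up as a separate test case in the pytest report, so a failure tells you exactly which input broke.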
Retryability with Requests and Tenacity
HTTP requests can fail for various reasons, from temporary network issues to server overloads. Ensuring your scripts can gracefully handle these situations is crucial.
Install Requests and Tenacity:
pipenv install requests tenacity
Example usage:
import requests
from tenacity import retry, stop_after_attempt, wait_fixed

@retry(stop=stop_after_attempt(3), wait=wait_fixed(2))
def get(url):
    response = requests.get(url)
    response.raise_for_status()
    return response.json()
This function will attempt to make an HTTP GET request up to three times, waiting two seconds between each attempt.
Background Tasks with Celery
For longer-running tasks, or tasks that you want to run periodically, Celery is an excellent choice. It allows you to distribute tasks across multiple worker processes, even on different machines.
Install Celery:
pipenv install celery
Example usage:
import os
from celery import Celery
from scraper import fetch_data, process_data

REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
app = Celery("tasks", broker=REDIS_URL)

@app.task
def scrape_and_process(url: str):
    try:
        data = fetch_data(url)
        process_data(data)
    except Exception as e:
        print(f"An error occurred: {str(e)}")
Run a worker:
celery -A your_script worker --loglevel=info
And call your task:
result = scrape_and_process.delay("https://example.com")
Note that reading the result back with result.get() additionally requires configuring a result backend; with only a broker, .delay() fires the task without returning its outcome.
Conclusion
By leveraging the power of these libraries and tools, you can create Python scripts that are not only powerful and flexible but also robust and reliable. From managing environments with Pipenv to writing test cases with Pytest, this setup ensures that you are prepared to handle a wide array of scripting challenges. Happy coding!
Check out a full working prototype that uses all the libraries mentioned above to build a script that asynchronously scrapes web pages: https://github.com/vik-y/opinionated-python-example