Mastering the Art of Python Project Setup: A Step-by-Step Guide
Whether you’re a seasoned developer or just getting started with 🐍 Python, it’s important to know how to build robust and maintainable projects. This tutorial will guide you through the process of setting up a Python project using some of the most popular and effective tools in the industry. You’ll learn how to use GitHub and GitHub Actions for version control and continuous integration, as well as other tools for testing, documentation, packaging and distribution. The tutorial is inspired by resources such as Hypermodern Python and Best Practices for a new Python project. However, this is not the only way to do things and you might have different preferences or opinions. The tutorial is intended to be beginner-friendly but also covers some advanced topics. In each section, you’ll automate some tasks and add badges to your project to show your progress and achievements.
The repository for this series can be found at github.com/johschmidt42/python-project-johannes
- OS: Linux, Unix, macOS, Windows (WSL2 with e.g. Ubuntu 20.04 LTS)
- Tools: python3.10, bash, git, tree
- Version Control System (VCS) Host: GitHub
- Continuous Integration (CI) Tool: GitHub Actions
You are expected to be familiar with the version control system (VCS) git. If not, here’s a refresher for you: Introduction to Git
Commits will be based on best practices for git commits & Conventional Commits. There is a Conventional Commit plugin for PyCharm and a VSCode extension that help you write commits in this format.
Overview
- Part I (GitHub, IDE, Python environment, configuration, app)
- Part II (Formatting, Linting, Command management, CI)
- Part III (Testing, CI)
- Part IV (Documentation, CI/CD)
- Part V (Versioning & Releases, CI/CD)
- Part VI (Containerisation, Docker, CI/CD)
Structure
- Formatters & linters (isort, black, flake8, mypy)
- Configurations (isort, .flake8, .mypy.ini)
- Command management (Makefile)
- CI (lint.yml)
- Badge (Linting)
- Bonus (Automatic linting in PyCharm, Create requirements.txt with Poetry)
If you’ve ever worked in a team, you know that to achieve code and style consistency, you need to agree on formatters and linters. It will help you with onboarding new members to the codebase, create fewer merge conflicts and generally save time because developers don’t have to care about formatting and style while coding.
If you don’t know the difference between a formatter and a linter and/or would like to see them in action, check out this tutorial!
One option for formatting and linting Python code is wemake-python-styleguide, which claims to be the “strictest and most opinionated Python linter ever”. However, I prefer the popular combination of isort and black as formatters, flake8 as linter and mypy as static type checker. mypy adds static type checking to Python, which is one of the most exciting features in Python development right now.
We’re going to add these tools to our project with Poetry. But since these tools are not part of the application, they should be added as dev-dependencies. Since Poetry 1.2.0, we can use dependency groups:
Poetry provides a way to organize your dependencies by groups. For instance, you might have dependencies that are only needed to test your project or to build the documentation.
When adding the dependencies, we can specify the group they should belong to with --group.
> poetry add --group lint isort black flake8 mypy
Structuring the dev-dependencies in groups will make more sense later. The main idea is that we can save time and resources in CI pipelines by installing only the dependencies that are required for a specific task, such as linting.
Because isort and black disagree on a few points, we need to enforce that isort uses the black profile.
So we add the configuration to the pyproject.toml file:
# pyproject.toml
...
[tool.isort]
profile = "black"
...
flake8 also needs to “use the black profile”. However, flake8 has not (yet) adopted pyproject.toml as the central location for project configuration (see this heated discussion, or use the pyproject-plugin), which is why we add the configuration in a .flake8 file:
# .flake8
[flake8]
max-line-length = 88
extend-ignore = E203
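As a rough illustration of why these two settings line up with black: 88 is black's default line length, and black puts spaces around the colon in slices whose bounds are expressions, which flake8's E203 check ("whitespace before ':'") would otherwise report. The variable names below are made up for the example.

```python
# Hypothetical data, used only to illustrate the slice style black produces.
ham = list(range(10))
lower, upper, offset = 1, 5, 2

# black keeps spaces around ':' when the slice bounds are expressions.
# flake8 reports this as E203, hence `extend-ignore = E203` in .flake8.
window = ham[lower + offset : upper + offset]

print(window)  # [3, 4, 5, 6]
```

Without the ignore rule, black would keep producing this layout and flake8 would keep rejecting it.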
For mypy, we can add the configuration of the tool according to the docs:
# pyproject.toml
...
[tool.mypy]
# third party import
ignore_missing_imports = true
# dynamic typing
disallow_any_unimported = true
disallow_any_expr = false
disallow_any_decorated = false
disallow_any_explicit = true
disallow_any_generics = false
disallow_subclassing_any = true
# platform
python_version = "3.10"
# untyped
disallow_untyped_calls = true
disallow_untyped_defs = true
disallow_incomplete_defs = true
disallow_untyped_decorators = true
# None and Optional
no_implicit_optional = true
# Warnings
warn_return_any = false
warn_unreachable = true
# Misc
pretty = true
...
Mypy has many settings that you can customize to suit your preferences. I won’t cover all of them here, but I encourage you to read the mypy documentation and learn how to configure the static type checker for your project!
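To make two of these settings more concrete, here is a minimal sketch of a function that should pass under disallow_untyped_defs and no_implicit_optional. The function name and arguments are invented for illustration and are not part of the example app.

```python
from typing import Optional


def greet(name: str, greeting: Optional[str] = None) -> str:
    """Fully annotated, so it satisfies disallow_untyped_defs.

    With no_implicit_optional = true, a default of None has to be declared
    as Optional[str]; `greeting: str = None` would be rejected by mypy.
    """
    if greeting is None:
        greeting = "Hello"
    return f"{greeting}, {name}!"


# For comparison, mypy would reject this under disallow_untyped_defs,
# although Python itself runs it without complaint:
#
#     def greet_untyped(name):
#         return "Hello, " + name

print(greet("World"))  # Hello, World!
```

The point is that these checks happen before the code runs; the interpreter itself never enforces the annotations.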
Let’s see our new tools in action:
> isort . --check
Skipped 2 files
> black . --check
would reformat src/example_app/app.py
Oh no! 💥 💔 💥
1 file would be reformatted, 1 file would be left unchanged.
> flake8 .
...
> mypy .
Success: no issues found in 2 source files
Only one of the tools (black) reported an issue that we can fix. Omitting the --check flag will run the formatter black on our Python files.
> black .
At this point we could consider adding pre-commit hooks that run these linters every time we commit. But using mypy with pre-commit is a bit fiddly, so I’ll leave it up to you if you want (and like) pre-commit hooks.
As we add new tools to our project, we also need to remember some commands to use them. These commands can get complicated and hard to remember over time. That’s why it’s useful to have a single file where we can store and name commands for our project. This is where the Makefile comes in. Many devs are unaware that you can use make in a Python project to automate different parts of developing a project. It’s a common tool in the world of software development with languages such as C or C++. It can be used, for example, to run tests, linters, builds etc. It’s an underutilized tool, and by integrating it into your routine, you can save time and avoid errors.
GNU Make controls the generation of executables and other non-source files of a program from the program’s source files.
That way, we don’t need to remember all the commands and their arguments and options. It lets us specify a set of tasks via a common interface and allows us to run several commands sequentially.
# Makefile
format-black:
	@black .

format-isort:
	@isort .

lint-black:
	@black . --check

lint-isort:
	@isort . --check

lint-flake8:
	@flake8 .

lint-mypy:
	@mypy ./src

lint-mypy-report:
	@mypy ./src --html-report ./mypy_html

format: format-black format-isort

lint: lint-black lint-isort lint-flake8 lint-mypy
To do stuff with make, you type make in a directory that has a file called Makefile. You can also type make -f followed by a file name to use a different Makefile. By default, make prints out each command before it runs it, so that you can see what it’s doing. But there is a UNIX dogma saying that “success should be silent”. So to silence commands in a target, we can start the command with a `@` character. Now we just need to run these two commands in a shell
> make format
> make lint
to run all our formatters and linters on our source code. If you want to know more about the format of a Makefile, how to set variables, add prerequisites and phony targets, I highly recommend reading: python-makefile by Aniket Bhattacharyea!
If you want to have a well documented Makefile, check out the bonus part of this part at the bottom!
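For readers more at home in Python than in make, the behaviour of a target like lint can be approximated as "run each command in order and stop at the first failure". The sketch below is only an illustration of that idea, not a replacement for the Makefile. The function name run_tasks is hypothetical, and stand-in commands are used so that it runs without the linters installed.

```python
import subprocess
import sys
from typing import Sequence


def run_tasks(commands: Sequence[Sequence[str]]) -> int:
    """Run commands in order; stop at the first non-zero exit code."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Stand-in commands so the sketch runs anywhere; in the project these would
# be e.g. ["black", ".", "--check"] and ["isort", ".", "--check"].
ok = [sys.executable, "-c", "pass"]
fail = [sys.executable, "-c", "import sys; sys.exit(3)"]
marker = [sys.executable, "-c", "print('never reached')"]

exit_code = run_tasks([ok, fail, marker])
print(exit_code)  # 3
```

The non-zero exit code is what later lets a CI job mark the run as failed.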
Now that we have a few more config files and a new Makefile as a task runner, our project should resemble this:
.
├── .flake8
├── LICENSE
├── Makefile
├── README.md
├── poetry.lock
├── pyproject.toml
└── src
    └── example_app
        ├── __init__.py
        └── app.py

2 directories, 8 files
Working in a team of professional software developers brings a number of challenges. Making sure that nothing is broken and everyone is working on the same formatted code is one of them. For this we use continuous integration (CI), a software development practice that allows members of a team to integrate their work frequently. In our case, so far, new features (feature branches) that change source files must pass our linters to preserve style consistency. There are a lot of CI tools such as CircleCI, TravisCI, Jenkins etc., but in the scope of this tutorial we will use GitHub’s CI/CD workflow solution GitHub Actions.
Now that we can run our formatters and linters locally, let’s set up our first workflow that will run on a GitHub server. To do this, we will create a new feature branch called feat/lint-ci and add the file .github/workflows/lint.yml
Let’s break it down to make sure we understand each part. GitHub Actions workflows must be created in the .github/workflows directory of the repository in the format of .yaml or .yml files. If you’re seeing these for the first time, you can check them out here to better understand them. In the upper part of the file, we give the workflow a name (name: Linting) and define on which signals/events this workflow should be started (on: ...). Here, we want it to run when new commits come into a pull request targeting the main branch or when commits go to the main branch directly. The job runs in an ubuntu-latest* (runs-on) environment and executes the following steps:
- Check out the repository using the branch name that is stored in the default environment variable ${{ github.head_ref }}. GitHub action: checkout@v3
- Install Poetry with pipx, because pipx is pre-installed on all GitHub runners. If you have a self-hosted runner in e.g. Azure, you’d need to install it yourself or use an existing GitHub action that does it for you.
- Set up the Python environment and cache the virtualenv based on the content of the poetry.lock file. GitHub action: setup-python@v4
- Install only the requirements that are needed to run the different linters with poetry install --only lint**
- Run the linters with the make command: poetry run make lint. Please note that running the tools is only possible in the virtualenv, which we can access through poetry run.
*We could also run this in a container (docker), but containerisation will be covered in Part VI
**We used poetry install --only lint to install only the dependencies in the group lint. You may wonder: how can we check that these dependencies are enough to run the tools locally? Well, in Poetry 1.2.0, the environment depends on both the Python interpreter and the pyproject.toml file. So we would need to delete the existing environment with poetry env remove or poetry env remove --all, then create a new clean environment with poetry env use python3 and run poetry install --only lint. This seems like a hassle, right? I agree, but that’s how it works for now. You can read more about this issue in this StackOverflow post.
Now that we have our first workflow, how can we see it in action? Or better yet: how can we test it before pushing it to GitHub? There are two ways to do that:
- We can push our changes and see the results on GitHub
- We can use the tool act, which lets us run GitHub Actions locally and avoid the trial-and-error approach.
Let’s try the first option and push our changes to our feature branch. Once we open a pull request, we can see that the workflow has started running.
And we can also see that it actually failed:
The reason for this error is that we didn’t run this command
> poetry install
/home/runner/work/python-project-johannes/python-project-johannes/example_app does not contain any element
beforehand to check whether our app was installed correctly in the site-packages directory or whether the name or mapping was incorrect. We can solve this by making sure that the name attribute in our pyproject.toml matches the name of our package directory inside src, and also removing the packages attribute for now:
# pyproject.toml
[tool.poetry]
name = "example_app"
...
Running the pipeline a second time, we see that … it fails again!
This time, our static type checker mypy reported errors because of unfollowed imports. We can reproduce this by running the same commands from the workflow locally (installing only the lint packages). It turns out that mypy tries to follow the imports in a file, but when it can’t (because the imported package was not installed with poetry install --only lint), the imported names get the type Any! This is described in the mypy documentation. We can solve this by installing our application dependencies AND the lint dependencies with
> poetry install --with lint
This time, we see that it succeeded. Hallelujah!
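Why does it matter that unfollowed imports become Any? A minimal sketch, with made-up names, of how a value typed as Any slips past the type checker. The explicit Any annotation is only a stand-in for a name imported from a package that mypy cannot find; the configuration shown earlier would itself flag it through disallow_any_explicit.

```python
from typing import Any


def double(value: int) -> int:
    return value * 2


# Stand-in for a name imported from a package mypy cannot resolve:
# such names are treated as Any instead of their real type.
unresolved: Any = "3"

# mypy accepts this call because Any is compatible with every type,
# yet at runtime a str is passed where an int was intended.
result = double(unresolved)
print(result)  # 33
```

Installing the application dependencies alongside the lint group lets mypy see the real types, so mistakes like this one can be reported.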
To summarise, here’s what our repository tree looks like now:
.
├── .flake8
├── .github
│   └── workflows
│       └── lint.yml
├── LICENSE
├── Makefile
├── README.md
├── poetry.lock
├── pyproject.toml
└── src
    └── example_app
        ├── __init__.py
        └── app.py

4 directories, 9 files
Once we merge our PR into the main branch, the workflow will run again. We can display the status of our CI pipeline on the homepage of our repository by adding a badge to the README.md file.
To get the badge, we need to click on a workflow run (main branch) and copy the lines
The badge markdown can be copied and added to the README.md:
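For reference, GitHub serves workflow status badges at a URL of the form https://github.com/OWNER/REPOSITORY/actions/workflows/WORKFLOW-FILE/badge.svg. The small helper below is a hypothetical sketch of how the corresponding markdown can be assembled; it uses the repository and workflow file named in this series, and "Linting" as the label to match the workflow name.

```python
def workflow_badge_markdown(
    owner: str, repo: str, workflow_file: str, label: str
) -> str:
    """Build README markdown for a GitHub Actions workflow status badge."""
    workflow_url = (
        f"https://github.com/{owner}/{repo}/actions/workflows/{workflow_file}"
    )
    # The image links back to the workflow's page on GitHub.
    return f"[![{label}]({workflow_url}/badge.svg)]({workflow_url})"


print(
    workflow_badge_markdown(
        "johschmidt42", "python-project-johannes", "lint.yml", "Linting"
    )
)
```

In practice, copying the snippet from the GitHub UI as described above gives the same kind of line without any code.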
Our landing page on GitHub now looks like this ❤:
If you want to know how this magically shows the current status of the last pipeline run on main, take a look at the commit statuses API on GitHub.