Assorted Python Utility Code
This is a collection of Python utilities that we use in various scripts across our research group.
Use
We don’t publish this module to PyPI, so you need to add it to your project by directly referencing the Git repository:
dependencies = [
    "utils@git+https://codeberg.org/pencelab/utils.git"
]
You’ll also need to add the following to pyproject.toml if you’re using Hatch (as most of our projects do):
[tool.hatch.metadata]
allow-direct-references = true
If you’re using mypy, it won’t be able to find the typing information for a package that’s locally installed like this. You can silence the warnings by setting the following:
[[tool.mypy.overrides]]
module = ["utils", "utils.*"]
ignore_missing_imports = true
Documentation
All methods in this repository are carefully documented with Sphinx; see the autogenerated docs for more details. (FIXME: I’m not actually generating and hosting these yet; watch this space.)
Briefly, this repository includes:
utils.config:
- An API for fetching a configuration object for all of our utilities. This lets us store things like API keys in a centralized location. The utils.config.load method returns a dict with the following keys:
  - pubmed_api_key: A PubMed API key for faster metadata queries
utils.core_ext:
- remove_nones: recursively remove None values from dictionaries (including in nested dictionaries or lists)
utils.corpus:
- format_document: create a short, formatted representation of a JSON document in our corpus format
- options: click options to add for customization of format_document
utils.data_files:
- check_source: check the presence of a dataSource or textSource in a document
- add_source: add a dataSource or textSource with version and timestamp
- save_with_backup: save out data to a JSON file, making a backup if we would overwrite an existing file
utils.decorators:
- compose: compose multiple decorators into a single decorator
- fail_counter: add a failure counter attribute (essentially a function-level static variable) to a function
utils.net:
- download_file: download a file to the local filesystem with nice progress bar reporting
utils.text:
- full_strip: remove HTML tags and extra whitespace from a string
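The one-line descriptions above leave some behavior open, so here is a rough illustration of what two of these helpers might look like. This is not the code from this repository: both function bodies are hypothetical sketches. They assume that remove_nones drops None entries from lists as well as from dictionaries and returns a new object, and that compose(a, b) is equivalent to stacking @a above @b.

```python
from typing import Any, Callable


def remove_nones(value: Any) -> Any:
    """Return a copy of `value` with None entries dropped at every nesting level.

    Hypothetical sketch; assumes None is removed from lists as well as dicts.
    """
    if isinstance(value, dict):
        # Drop keys whose value is None, and recurse into the remaining values.
        return {k: remove_nones(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # Drop None elements, and recurse into the remaining elements.
        return [remove_nones(v) for v in value if v is not None]
    # Scalars (and any other types) are returned unchanged.
    return value


def compose(*decorators: Callable) -> Callable:
    """Combine decorators so that @compose(a, b) acts like @a stacked above @b.

    Hypothetical sketch; the argument order is an assumption.
    """
    def composed(func: Callable) -> Callable:
        # Apply right to left: the last decorator wraps the function first.
        for decorator in reversed(decorators):
            func = decorator(func)
        return func

    return composed
```

Under these assumptions, remove_nones({"doi": None, "title": "A"}) keeps only the title key, and a function decorated with @compose(a, b) is wrapped by b first and then by a.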
License
All code in this repository is released under the GNU GPL v3.