Excel Anonymize

A simple helper script (Python 3) to modify an Excel data sheet and anonymize a specific column without losing the assignments.

In the following example table, column A (ID) consists of numbers identifying individual persons. An ID can appear in more than one row, e.g. 202.

ID   Data
101  Lore
202  Ipsum
303  dolor
202  sit

The script creates random strings (AnonID) but retains the assignments. In this example, the newly created AnonID for ID=202 is the same in rows 3 and 5.

ID   AnonID  Data
101  T42     Lore
202  8UN     Ipsum
303  9SQ     dolor
202  8UN     sit

By default the original ID column is removed (use --keep to retain it), so the actual result looks like this.

AnonID  Data
T42     Lore
8UN     Ipsum
9SQ     dolor
8UN     sit
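
The behavior shown in the tables above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from excel_anon.py: the names `make_code` and `anonymize_rows`, the three-character codes, and the alphabet of uppercase letters and digits are all hypothetical, and the rows are plain lists instead of an Excel sheet.

```python
import secrets
import string

# Assumption: codes use uppercase letters and digits, like 'T42' or '8UN'.
ALPHABET = string.ascii_uppercase + string.digits


def make_code(length=3):
    """Return a random string of the given length (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def anonymize_rows(rows, column=0, keep=False, length=3):
    """Return new rows with the values in `column` replaced by random codes.

    Equal original values always get the same code, and different values
    never share a code. With keep=True the original value stays and the
    code is added in an additional column right after it.
    """
    mapping = {}  # original ID -> AnonID
    used = set()  # codes already handed out
    result = []
    for row in rows:
        original = row[column]
        if original not in mapping:
            code = make_code(length)
            # Redraw on a collision. This assumes far fewer distinct IDs
            # than possible codes (36 ** length), otherwise it never ends.
            while code in used:
                code = make_code(length)
            mapping[original] = code
            used.add(code)
        anon = mapping[original]
        if keep:
            new_row = row[:column + 1] + [anon] + row[column + 1:]
        else:
            new_row = row[:column] + [anon] + row[column + 1:]
        result.append(new_row)
    return result


# Data rows of the example table (without the header row).
rows = [[101, "Lore"], [202, "Ipsum"], [303, "dolor"], [202, "sit"]]
anonymized = anonymize_rows(rows)
for row in anonymized:
    print(row)
```

The dictionary is what keeps the assignments: the second occurrence of 202 looks up the code created for the first one instead of drawing a new one. The codes themselves are random, so they differ on every run.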

Why is it useful?

In research or other data-driven projects you sometimes have access to personal data. This is a problem for ethical and legal reasons. If your research question does not require identifying individual persons, you have to anonymize the data.

Usage

Usage:
    excel_anon <excel-datei> [<spalte>] [<zeile>] [--sheet=<name>] [--keep] [-h | --help]

Options:
    excel-datei         Name of the Excel file.
    spalte              Column to anonymize (in Excel notation, i.e. letters). [default: 'A']
    zeile               Start row (in Excel notation). [default: 1]
    --sheet=<name>      Name of the worksheet.
    --keep              The real IDs are not deleted. The pseudo IDs are generated in an additional column.
    -h --help           This help text.
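
The column argument is given in Excel notation, i.e. as letters. As a rough sketch of how such a label maps to a column number, the hypothetical helper below (`column_letter_to_index` is an illustrative name, not a function known to exist in excel_anon.py) converts it to a 1-based index.

```python
def column_letter_to_index(letters):
    """Convert an Excel column label ('A', 'Z', 'AA', ...) to a 1-based index."""
    letters = letters.strip().upper()
    if not (letters.isascii() and letters.isalpha()):
        raise ValueError(f"not a valid Excel column: {letters!r}")
    index = 0
    for char in letters:
        # Treat the label as a base-26 number whose digits run from A=1 to Z=26.
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


print(column_letter_to_index("A"))
print(column_letter_to_index("AA"))
```

Here 'A' maps to 1, 'Z' to 26 and 'AA' to 27, matching the order of the columns in a spreadsheet.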