Muzik Faktry: Processing music files on Linux
CONTENTS
- INTRODUCTION
- PRIMARY FEATURES
- DEPENDENCIES
- INSTALLATION
- CONFIGURATION
- DIRECTORY STRUCTURE
- USAGE
- REPAIRING DAMAGED FILES
- COMMAND LINE OPTIONS
- HISTORY
INTRODUCTION
The purpose of Muzik Faktry is to produce clean, high quality FLAC files from lossless source files without compromising audio quality. The script is a user-friendly, menu driven Bash shell script that should run on any operating system which supports recent versions of Bash. The script can perform many operations on your music files prior to adding them to your collection, all of which are non-destructive, meaning there is no reduction in audio quality. Given the focus of the script, there is limited support for formats other than FLAC and PCM/WAV; however, the Split An Album, Save Metadata and Decompress tasks are able to handle a wide variety of formats in order to get you started.
The audience for Muzik Faktry might range from the audiophile to the casual listener. In either case the script is designed to be easy to use while still providing a plethora of options.
Muzik Faktry was developed primarily to satisfy my own needs and thus it may not satisfy yours. For example there are no facilities for creating albums, writing files to CD, fetching lyrics, embedding cover art, or creating play lists.
For additional information see:
- The default configuration file (/config/default.conf) contains a great deal of useful information.
- Also see 'Muzik Faktry: Processing music files on Linux': https://12bytes.org/projects/muzik-faktry/
- Source code repository: https://codeberg.org/12bytes.org/muzikfaktry
If you find any bugs please report them on the Muzik Faktry code repository, or leave a comment on the dedicated web page, or send mail to 'muzikfaktry runbox com' (i trust you know where the @ and . go).
PRIMARY FEATURES
- Batch, one-by-one and single file processing
- Album splitting
- Read, write, save, restore and strip metadata
- Tag to file name, file name to tag
- Format file names using regular expressions
- Batch edit file names using regular expressions
- Decompress various lossless formats and encode to FLAC
- Comprehensive integrity and metadata checking
- Generate spectrograms of the audio frequencies
- Trim silence from the beginning and end of tracks
- Volume normalization using the ReplayGain v2 spec.
- Optimization of file size
- Find potentially duplicate files
- Multiple configuration profiles
- And more...
DEPENDENCIES
A recent version of Bash is required.
Several dependencies required by Muzik Faktry are installed by default with most GNU/Linux-based desktop operating systems or should be available in your package repository. The script will notify you of any missing dependencies. Important dependencies which may not be installed include:
- mediainfo (required) - sources: https://github.com/MediaArea/MediaInfo binaries: https://mediaarea.net/en/MediaInfo website: https://mediaarea.net/MediaInfo
- rsgain (required) - sources and binaries: https://github.com/complexlogic/rsgain
- shellcheck (strongly recommended) - sources and binaries: https://github.com/koalaman/shellcheck/releases website: https://www.shellcheck.net/
- shntool (required) - sources: http://shnutils.freeshell.org/shntool/
Regarding shntool, if you are working with files other than FLAC or WAV, additional dependencies may be necessary. For more information see the shntool website.
While optional, shellcheck is strongly recommended since it is used to check the syntax of the active configuration file. Any syntax errors in the configuration file can result in disaster since the file is essentially part of the muzikfaktry.sh script. If for some reason you don't want to install shellcheck, an on-line tool is available on the shellcheck website.
If a program is not available in your package repository you may drop precompiled binaries in the /bin folder and Muzik Faktry will use them. Make sure the files are executable:
$ chmod +x <file>
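If you would like to check for the programs listed above before running the script, something along the following lines might do. This is only a rough sketch and not the check Muzik Faktry itself performs: the function name 'check_deps' is made up for illustration, the program names are assumed to match the binary names, and only the PATH is searched (binaries dropped in the /bin folder are not considered).

```bash
#!/usr/bin/env bash
# check_deps is a hypothetical helper, not part of Muzik Faktry.
# Prints one "missing: NAME" line for every program not found in PATH
# and returns 1 if at least one program is missing.
check_deps() {
    local missing=0 prog
    for prog in "$@"; do
        if ! command -v "$prog" > /dev/null 2>&1; then
            printf 'missing: %s\n' "$prog"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (program names assumed to equal the binary names):
#   check_deps mediainfo rsgain shellcheck shntool
```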
INSTALLATION
Downloads:
- Code repository: https://codeberg.org/12bytes.org/muzikfaktry
- ZIP archive: https://codeberg.org/12bytes.org/muzikfaktry/archive/main.zip
- TAR.GZ archive: https://codeberg.org/12bytes.org/muzikfaktry/archive/main.tar.gz
Unpack the archive somewhere where you have read, write and execute permissions, perhaps in your /home directory, and make the script executable:
$ chmod +x muzikfaktry.sh
CONFIGURATION
The default.conf file is located in the /config directory. It is particularly important to understand that, although the format of the file closely resembles that of a typical INI file (option=value), the configuration file is effectively part of the main script and therefore proper Bash syntax is crucial. It is for this reason that the installation of the shellcheck package is strongly recommended.
Instead of modifying the default.conf file, create a new file with a descriptive name and a '.conf' extension after which you may either copy the entire contents of the default.conf file to the new file, or add only the options you want to change. In the case of the latter, be sure to add the following code to the top of the file:
#!/usr/bin/env bash
# shellcheck disable=SC2034
Upon running the script you will be prompted to choose a configuration file to load if more than one exists. If no configuration file exists, the internal default settings shall be used.
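Since a configuration file is treated as Bash code, it may be worth checking a new file by hand before loading it. The following is a minimal sketch, assuming a hypothetical helper named 'check_conf'; it is not taken from the script. 'bash -n' only parses the file without executing it, and shellcheck is consulted only when it is installed.

```bash
#!/usr/bin/env bash
# check_conf is a hypothetical helper, not part of Muzik Faktry.
# Returns 0 if the file parses as Bash (and passes shellcheck, if installed).
check_conf() {
    local conf=$1
    # Parse only, never execute: catches unbalanced quotes, stray tokens, etc.
    bash -n "$conf" || return 1
    # shellcheck is optional; use it for a more thorough check when available.
    if command -v shellcheck > /dev/null 2>&1; then
        shellcheck --shell=bash "$conf" || return 1
    fi
    return 0
}

# Example call (the file name is only an illustration):
#   check_conf config/my-profile.conf && echo "looks ok"
```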
DIRECTORY STRUCTURE
The following folders and files will be automatically created when muzikfaktry.sh is run:
/backup : album files are moved here after splitting if they are kept
/bin : precompiled dependencies may be dropped here if not installed
/config : holds one or more configuration files
/discard/metadata_issues : holds files with easily repairable metadata issues
/discard/minor_issues : holds files with other repairable issues
/discard/serious_issues : holds files having serious problems or have failed a test
/discard/unrepairable : holds files which are not likely to be repairable
/discard/user_discard : holds files which were discarded by the user
/finished : holds finished files ready to be added to your collection
/holding : can be used to store files to be processed later
/logs : session log files
/metadata : holds metadata files created using the 'Save Metadata' task
/spectro : holds PNG spectrograms created using the 'Spectrogram' task
/working : files being actively worked on
duplicates.txt : contains information regarding potentially duplicate files
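The script creates this layout on its own, so the following is purely illustrative. Should you ever want to recreate the same directory tree by hand, a sketch like this might serve; the function name 'create_tree' is hypothetical and the directory names are copied from the list above.

```bash
#!/usr/bin/env bash
# create_tree is a hypothetical helper, not part of Muzik Faktry.
# Creates the directory layout described above beneath the given root.
create_tree() {
    local root=$1 dir
    for dir in backup bin config \
               discard/metadata_issues discard/minor_issues \
               discard/serious_issues discard/unrepairable discard/user_discard \
               finished holding logs metadata spectro working; do
        mkdir -p -- "$root/$dir" || return 1
    done
}

# Example call:
#   create_tree "$HOME/muzikfaktry"
```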
USAGE
Miscellaneous
IT IS STRONGLY SUGGESTED TO WORK ON COPIES OF YOUR SOURCE FILES. Muzik Faktry does not back up source files which it modifies. For example, when a compressed file is decompressed, the source file is deleted. The only exception is when splitting an album file, in which case the album and CUE sheet files are moved to the /backup directory if they are kept, or the /discard/user_discard directory if they are discarded.
During the course of operations your input will be solicited, often in the form of a question which might require a "Y" or "N" key press (lower case). Appended to the end of such a question might be "[Y/n]" or "[y/N]" where the default choice is indicated by the upper case letter. To select the default choice you may of course press the corresponding key (lower case), or the Enter/Return key, or most any other key.
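The prompt behaviour described above can be pictured roughly as follows. This is a sketch under assumptions, not the script's own code: the names 'resolve_choice' and 'ask' are made up, and it is assumed that only the lower case key opposite to the default changes the outcome.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of Muzik Faktry.

# Map one key press to "yes" or "no".
# $1 = key that was pressed (empty for Enter), $2 = default answer (y or n)
resolve_choice() {
    local key=$1 default=$2
    case "$default:$key" in
        y:n) echo no ;;    # default is yes; only lower case "n" overrides it
        n:y) echo yes ;;   # default is no; only lower case "y" overrides it
        y:*) echo yes ;;   # Enter or most any other key: keep default yes
        *)   echo no ;;    # Enter or most any other key: keep default no
    esac
}

# Ask a question and read a single key without waiting for Enter.
# $1 = question text, $2 = default answer (y or n)
ask() {
    local question=$1 default=$2 hint key
    if [ "$default" = y ]; then hint='[Y/n]'; else hint='[y/N]'; fi
    read -r -n 1 -p "$question $hint " key
    echo >&2
    resolve_choice "$key" "$default"
}

# Example call:
#   [ "$(ask 'Keep this file?' y)" = yes ] && echo "keeping"
```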
To begin a session, copy your source files to the /working directory, open a terminal, cd to the /muzikfaktry directory and run the script:
$ ./muzikfaktry.sh
Alternatively, assuming you already have a backup of your music collection, you could simply rename the folder in which it resides to 'working' and copy the muzikfaktry.sh script to the directory above the /working directory. You may also create a 'config' directory alongside the /working directory and drop a configuration file in it if you wish (technically the muzikfaktry.sh script is the only file required to get started).
Personally i like to open my file manager to the /working directory and move it to one side of my screen, then open a terminal and move it to the other side. This allows me to watch what's happening in the /working directory as files are processed.
Files which are discarded, either by the user or automatically if batch mode is activated, are moved to a sub-folder of the /discard directory.
Tasks
Following are all of the available tasks:
 1) Set File Permissions    10) Trim Silence         19) Find Duplicates
 2) Split An Album          11) Encode To FLAC       20) File Information
 3) Save Metadata           12) Strip Metadata       21) Play
 4) Tags To File Name       13) Restore Metadata     22) Rotate Files
 5) Format File Names       14) File Name To Tags    23) Logging Operations
 6) Edit File Names         15) Write Tags           24) Help
 7) Decompress              16) Normalize Gain       25) Quit
 8) Integrity Check 1       17) Optimize
 9) Spectrogram             18) Integrity Check 2
While the order of the tasks may not seem intuitive, there are reasons for it, and whichever tasks you need should generally be run in the order listed. Some tasks are able to handle any file type, while others may be limited to WAV or FLAC files only. A typical scenario is one where you start with FLAC tracks, in which case you may want to consider running the following tasks, at a minimum:
1. Format File Names : recommended
2. Decompress : required for the Integrity Check 1 task
3. Integrity Check 1 : required
4. Encode To FLAC : required
5. Normalize Gain : recommended
6. Integrity Check 2 : required
Task Notes
- Each time a task is run you will be asked if you want to batch process all files in the /working directory. If you are unfamiliar with Muzik Faktry it is suggested to avoid batch processing.
- It is suggested to work with one file type at a time to avoid confusion.
- For safety reasons the script automatically removes non-printable characters from file names and sanitizes the file name of the file being processed by changing it to 'working.[ext]'. If for some reason the script is terminated before a task completes, such as by pressing Ctrl+C, it is possible that the temporary name will not have been reverted. In such a case the original name of the file can be discovered by looking at the terminal output, or by checking the log file, or by examining its metadata if there is any.
- When splitting album files the album must be accompanied by a CUE file of the same name. For a file name of "album.flac" both "album.cue" and "album.flac.cue" are acceptable. Gap files (*.pregap.wav) are deleted automatically. Batch processing will be disabled when splitting albums in order to avoid file name conflicts.
- Decompressing or compressing files will remove all metadata. Text metadata can be saved beforehand by running the Save Metadata task, then written back later using the Restore Metadata task. The Save Metadata task can be avoided if some or all of the metadata you require is contained in the file name, in which case it can be written back using the File Name To Tags task.
- The Integrity Check tasks are largely responsible for flagging files with basic problems such as low bit or sample rates, missing metadata, etc.
- Other than the optional Find Duplicates task, the Integrity Check 2 task is typically the final task to run before adding the files to your collection.
- If the source file is known to be a good FLAC file and you only want to change something such as the compression ratio, the Encode To FLAC task will non-destructively re-encode the file using the settings defined in the configuration file.
- The Normalize Gain task writes ReplayGain information as metadata to the files (the audio stream is not touched). This data can be removed at any time using the Strip Metadata task (by default all metadata will be removed, however this can be changed in the configuration file).
- The Edit File Names task changes file names using Perl Compatible Regular Expressions. This task is always run in batch mode. To use a regular expression, the expression must be terminated with a forward slash "/" after which an optional replacement expression may be used. Back-referencing is allowed in the form of "\N". Following are some examples:
Remove the first segment of a file name, "Album - " in this case, from all files using a capture group "()" to capture the text we want to retain. The replace expression uses a back-reference "\1" to our capture group:
File names:
Album - Artist - Title
Album - Track - Artist - Title
Expression: ^(?:.+? - )(.+)/\1
Results:
Artist - Title
Track - Artist - Title
Remove "foo" from all files using a case-insensitive "(?i)" regular expression, but only if "foo" is a whole word:
File names:
foo name
some Foo name
more name foo
Foonew name
Expression: (?i)(?:^foo | foo(?= |$))/
Results:
name
some name
more name
Foonew name
Error Levels
In addition to informational messages, tasks can report 4 different error levels when a problem is detected with a file:
- "ERROR" level errors generally indicate that the file was not able to be processed by an executable. This is likely due to a damaged or incompatible file, or specifying incorrect options for an executable in the configuration file. These files are automatically moved to the /discard/unrepairable directory when batch mode is activated.
- "WARNING" level errors generally indicate that the file failed a test based on criteria which are either specified in the configuration file or hard-coded, or that an executable failed to process the file. While unlikely, it is possible that such files are repairable. These files are automatically moved to the /discard/serious_issues directory when batch mode is activated.
- "NOTICE" level errors indicate problems which are repairable using another task. Such files may have header issues, a problematic file name, etc. These files are automatically moved to the /discard/minor_issues directory when batch mode is activated and can later be rotated back to the /working directory using the Rotate Files task.
- "METADATA" level errors indicate problems which are repairable using another task. Typically such files are missing required metadata or contain excess metadata. These files are automatically moved to the /discard/metadata_issues directory when batch mode is activated and can later be rotated back to the /working directory using the Rotate Files task.
A file which produced no errors, but which is discarded by the user when batch mode is not activated, will be moved to the /discard/user_discard directory.
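The relationship between error levels and discard directories described above can be summarized as a simple lookup. This is only a sketch of the mapping, not the script's own code; the function name 'discard_dir_for' and the 'USER' label for a manual discard are hypothetical.

```bash
#!/usr/bin/env bash
# discard_dir_for is a hypothetical helper, not part of Muzik Faktry.
# Prints the discard directory used for a given error level,
# or returns 1 for an unknown level.
discard_dir_for() {
    case "$1" in
        ERROR)    echo discard/unrepairable ;;
        WARNING)  echo discard/serious_issues ;;
        NOTICE)   echo discard/minor_issues ;;
        METADATA) echo discard/metadata_issues ;;
        USER)     echo discard/user_discard ;;   # made-up label: discarded by hand
        *)        return 1 ;;
    esac
}

# Example call:
#   discard_dir_for NOTICE
```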
Tagging Operations
Tags can be read from a variety of formats but may only be written to FLAC files. Any of the tags defined in the Vorbis I format specification may be written (see: https://xiph.org/vorbis/doc/v-comment.html).
Tags can be written to all files in the /working directory by activating batch mode. One use-case for this is writing a predefined comment tag, such as "Muzik Faktry" along with a version string. This could be useful should you wish to later reprocess your music files when a newer version of the script is released with a different feature set.
Most tagging tasks should be self-explanatory with the possible exception of the Save Metadata and Restore Metadata tasks. The Save Metadata task will attempt to read the values for each tag specified in the configuration file for all files in the /working directory and write this information to text files in the /metadata directory, each file having the same name as the base file name of the music file along with a .txt extension. This information can later be written back to FLAC files using the Restore Metadata task. Saving metadata could be useful if the file contains more metadata than can be derived from the file name and written as tags using the File Name To Tags task.
Spectrograms
The Spectrogram task creates an image file of the audio frequency spectrum which can be used to help determine whether the audio is actually lossless or whether it was up-scaled/up-sampled from a lossy format, such as when some genius transcodes an MP3 to FLAC. More information regarding spectrogram analysis is available on the web, including on the Muzik Faktry page at https://12bytes.org/projects/muzik-faktry/.
If the task is run in batch mode then the spectrograms are stored as PNG image files in the /spectro directory and the file names will be the same as the audio file with a PNG extension. If batch mode is not active then each image is opened for previewing as it is generated and no image files are written to disk.
Finding Duplicate Files
This task attempts to find potentially duplicate files based on the following:
- CRC checksums: files with the same checksums are almost certainly duplicates
- file sizes: these files are probably duplicates
- number of audio samples: these files could be duplicates
- fuzzy file name matching: these files may or may not be duplicates
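To illustrate the first (and strongest) criterion, the following sketch groups files by checksum. It is not how Muzik Faktry does it: the function name 'find_crc_duplicates' is made up, and it uses the CRC of the whole file as printed by the standard 'cksum' utility, so two files with identical audio but different tags would not be matched here. File names containing newlines are not handled.

```bash
#!/usr/bin/env bash
# find_crc_duplicates is a hypothetical helper, not part of Muzik Faktry.
# Prints "FIRST == LATER" for every file in the given directory whose
# whole-file CRC and size equal those of a file listed before it.
find_crc_duplicates() {
    local dir=$1 file
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        # cksum prints: CRC, size in bytes, file name
        cksum "$file"
    done | awk '
        {
            key = $1 ":" $2                       # CRC plus size
            name = $0
            sub(/^[0-9]+ [0-9]+ /, "", name)      # keep the file name, spaces included
            if (key in first) {
                print first[key] " == " name
            } else {
                first[key] = name
            }
        }'
}

# Example call:
#   find_crc_duplicates working
```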
Rotating Files
Invoking this task will accomplish the following in the order specified:
1. If there are files in the /working directory, they will be moved to the /finished directory. Here it is assumed that all files were subjected to all previous required tasks and are ready to add to your collection.
2. If condition 1 is not applicable and there are files in the /holding directory, they will be moved to the /working directory.
3. If conditions 1 and 2 are not applicable and there are files in the /discard/metadata_issues directory, they will be moved to the /working directory.
4. If conditions 1, 2 and 3 are not applicable and there are files in the /discard/minor_issues directory, they will be moved to the /working directory.
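The cascade above might be pictured as follows. This is a sketch of one reading of the description, namely that each invocation performs only the first applicable step; the function names are hypothetical and the code is not taken from the script.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of Muzik Faktry.

# Succeeds if the directory contains at least one entry.
dir_has_files() {
    local f
    for f in "$1"/*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Perform the first applicable rotation beneath the given root directory.
rotate_files() {
    local root=$1 src
    # Condition 1: finished work leaves the /working directory.
    if dir_has_files "$root/working"; then
        mv -- "$root"/working/* "$root/finished/"
        return
    fi
    # Conditions 2 to 4: refill /working from the first non-empty source.
    for src in holding discard/metadata_issues discard/minor_issues; do
        if dir_has_files "$root/$src"; then
            mv -- "$root/$src"/* "$root/working/"
            return
        fi
    done
}

# Example call, from within the muzikfaktry directory:
#   rotate_files .
```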
REPAIRING DAMAGED FILES
Problems such as low sample or bit rates, poor encoding methods, damaged or truncated audio, or poor sound quality cannot be repaired by Muzik Faktry nor any other tool i'm aware of, regardless of their claims. Often the best alternative in such cases is to locate a better copy. Lesser problems may be repaired by simply decompressing and recompressing the file using the 'Decompress' and 'Encode To FLAC' tasks. Though not used by Muzik Faktry, you may find the Kwave and Wavfix tools helpful also.
COMMAND LINE OPTIONS
Run Muzik Faktry with '-h' to see the available options:
$ ./muzikfaktry.sh -h
HISTORY
After i started listening to music on a PC i became more attuned to the audio quality and the metadata that was included in the files. At the time i was using that other OS and listening to MP3s, and so i built a tool chain for processing my music which checked for errors, added ReplayGain information, etc. I was happy with the process i created but after moving to GNU/Linux, and with low expectations, i wanted to recreate my tool chain the best i could, thus MP3 Factory was born, which was the original name of this script. Eventually i figured out that the MP3 format is largely obsolete, problematic and frustrating to work with and so i started focusing on lossless files in the FLAC format which have a much more refined structure. Besides, the older MP3 Factory version of this script flagged every one of my MP3 files as "junk" due to extensive integrity checking. Abandoning the MP3 format of course necessitated a re-branding of the script as well as replacing my entire music collection with lossless files. Eventually i happily dropped all support for the MP3 format other than the ability to read its metadata.
Muzik Faktry has since evolved into a fairly comprehensive tool that has greatly exceeded both my original expectations and the inferior tool chain i had constructed on Winblows (the spelling is intentional), plus it's been a great project for learning some of the many intricacies of Bash shell scripting. Whether Bash is an appropriate language to code a multi-thousand line bit of software is certainly debatable, however it's currently the only Linux compatible scripting language i'm familiar with.