

---
layout: post
title: "Measurement errors and dimensional analysis in R"
date: "2020-11-17T18:00:00+01:00"
slug: "measurement-errors"
aliases:
  - /2020/11/06/measurement-errors
bibliography:
  - /media/bay/taha/sites/hugo/solarchemist/static/bib/library.bib
  - /media/bay/taha/sites/hugo/solarchemist/static/bib/references.bib
csl: /media/bay/taha/sites/hugo/solarchemist/static/bib/materials-chemistry-and-physics.csl
link-citations: true
tags: ["R", "Python", "WSL"]
draft: no
comments: false
output:
  blogdown::html_page:
    toc: true
---

```{r packages, echo=F, results='hide', message=FALSE}
library(knitr)
library(errors)
library(constants)
library(units)
library(quantities)
library(common)
library(oceanoptics)
library(dplyr)
library(magrittr)
library(ggplot2)
library(ggforce)
```

```{r global_options, echo=FALSE, results='hide', message=FALSE}
options(
  digits = 7,
  width = 84,
  continue = " ",
  prompt = "> ",
  warn = 0,
  knitr.graphics.error = FALSE,
  stringsAsFactors = FALSE)
opts_chunk$set(
  dev = 'svg',
  fig.align = 'center',
  echo = TRUE,
  eval = TRUE,
  cache = FALSE,
  collapse = TRUE,
  results = 'hide',
  message = FALSE,
  warning = TRUE,
  tidy = FALSE)
```

I wrote this blog post in preparation for a hands-on web presentation I gave on Nov 13, 2020. It's published mostly as-is in the hopes it might benefit others. [Comments or questions on the text](https://hyp.is/LwjIgikrEeu6xX800vMY4g/solarchemist.se/about/) or [regarding the code](https://codeberg.org/taha/2020-11-17-measurement-errors/issues) are most welcome!
Today I would like to demonstrate how to get automatic error propagation for your data analysis in R  using the [errors package by Iñaki Ucar et al.](https://r-quantities.github.io/errors/) @Ucar2018 I would also like to demonstrate how to combine that with dimensional analysis  (with automatic unit conversion) using the [units package by Pebesma et al.](https://r-quantities.github.io/units/) @Pebesma2016 This is usually known as *quantity calculus*.   To make it easy for you, dear reader, to follow along, we will take it from the start and assume all you have is a Windows 10 computer.  If you work on MacOS[^1] or Linux, you can [skip right ahead to "Working with measurement errors"](#working-with-quantities-and-measurement-errors) (but you may have to adapt some of the steps, as they were written with Windows 10 in mind).   [^1]: If you run MacOS, you probably want to install [Homebrew, a Linux-like package manager](https://brew.sh).   Most of the fundamental tools used for reproducible scientific work,  such as [R](https://www.r-project.org/),  [knitr](http://kbroman.org/knitr_knutshell/),  [git](https://www.freecodecamp.org/news/what-is-git-and-how-to-use-it-c341b049ae61/),  [Python](https://www.python.org/about/gettingstarted/), and not to be forgotten, the whole chain of document generation tools such as [LaTeX](https://www.latex-project.org/), [Markdown](https://www.markdownguide.org/), and [pandoc](https://pandoc.org/), "just work" on any Linux distribution, but can be a bit of a hassle on Windows.   It is not hard to see why that is. All these tools are available for free and with libre software licences.  Windows is neither free nor libre, and has a long history of working against this academic model of software sharing.  These days, Microsoft, Google and other tech behemoths embrace open source software, [simply because it allows them to monetise work without paying for it](http://marktarver.com/thecathedralandthebizarre.html). 
Although, to be fair, they are also contributing back (to varying degrees) to this free/libre pool of software. For an academic, open source tooling offers many tangible and intangible benefits, prime among them [making the entire work reproducible](https://juretriglav.si/thoughts-on-reproducibility-of-open-scientific-software/), and effectively future-proof.

So, let's get to it. We will make use of the fact that Microsoft has added something called the Windows Subsystem for Linux (WSL) to Windows 10, which allows you to run a (nearly feature-complete) Linux shell. In fact, Microsoft recently released WSL version 2 (WSL2), which attempts to bring the Linux shell natively to the Windows 10 desktop.

In order to work with R, the next sections will show you how to use WSL2 to install Ubuntu 20.04, and then we will walk through installing R and RStudio Server on WSL2 Ubuntu. You will then be able to run R (in RStudio) from a browser window on your Windows desktop. Pretty cool, right?

With that in place, we will explore how to use R and the suite of r-quantities packages to seamlessly manipulate vectors that have associated measurement uncertainties *and* units through the steps of your data analysis, by way of an example using a subset of data from a real experiment in the Edvinsson lab.

## Install WSL2 on Windows 10

In the near future, installing Ubuntu on Windows will be as simple as wsl --install (it's [in the development version of Windows 10](https://devblogs.microsoft.com/commandline/distro-installation-added-to-wsl-install-in-windows-10-insiders-preview-build-20246/)), but for now, we have to [follow the steps](https://docs.microsoft.com/en-us/windows/wsl/install-win10) below to achieve the same.
Open Powershell as administrator, then enable WSL1:   ~~~powershell dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart ~~~   To enable WSL2, we must first enable the Windows virtual machine (VM) feature:   ~~~powershell dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart ~~~   Now you have to restart Windows.   Next, [download the WSL2 Linux kernel update package for x64](https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi) (or [ARM64](https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_arm64.msi), depending on your machine), and install it.   Then open Powershell as administrator again, and set WSL2 as default for all future VMs:   ~~~powershell wsl --set-default-version 2 ~~~   Now it is time to [install our Linux distribution of choice (i.e., Ubuntu) from the Microsoft Store](https://www.microsoft.com/store/apps/9n6svws3rx71). Open the Microsoft Store, find Ubuntu, and install it. (And also, by the way, the Microsoft Store is awful. Such nasty underhanded nudging by Microsoft to create a wholly unnecessary "account").   Anyway, you just installed Ubuntu on your Windows machine, congratulations!  Now you get to create your Linux username and password: simply start the app Ubuntu from the Start menu.       ### Parenthesis: if your Windows 10 is itself a virtual machine    This is unlikely to apply to 99% of my readers, so just skip ahead to the next section.   Who does this section apply to? Well, if you, like me, run Windows 10 as a VM on top of your regular OS,  you will need to make sure that a) your host machine (layer 0) supports nested virtualisation, and b) that  your hypervisor has nested virtualisation enabled.    
Details for the KVM hypervisor on a Linux host are given in the footnote.[^2]

[^2]: Check that the bare-metal Linux host [supports "Nested virtualisation"](https://www.linux-kvm.org/page/Nested_Guests) by inspecting either one of /etc/modprobe.d/qemu-system-x86.conf or /sys/module/kvm_intel/parameters/nested. The first file should contain nested=1, and the second should contain Y or 1. If not, set them accordingly. With the Windows VM shut down, edit the KVM XML config file so that the `<cpu>` tag includes `mode="host-passthrough"`, like this: `<cpu mode="host-passthrough"/>`. Note that these settings must be in place *before* installing WSL2 Ubuntu on Windows. Also note that WSL2 supports nested virtualisation, but WSL1 does not.

### Parenthesis: managing and rebooting your WSL2 virtual machines

Closing the Ubuntu shell in Windows does not restart the Ubuntu VM, and sometimes it can be useful to effect a reboot. So how do we do that?

[It's not obvious](https://askubuntu.com/questions/1131122/cant-restart-shutdown-ubuntu-under-wsl), but the wsl executable has two useful commands,

~~~powershell
wsl --list --verbose
wsl --terminate Ubuntu-20.04
~~~

that allow you to list all installed VMs (running or not), and to shut one down with --terminate; wsl -d Ubuntu-20.04 starts it again. Starting the VM can also be achieved from the app link in the Start menu, assuming the VM has one.

## Install R and RStudio server on WSL2 Ubuntu

With a complete Ubuntu shell in place, we can now go about our business and install R.
In your Ubuntu shell, add the public key of the CRAN repository, then add the Ubuntu 20.04 CRAN repo of R version 4.x to your package manager's list of sources:

```{bash, echo=TRUE, eval=FALSE}
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
echo "deb https://cran.uni-muenster.de/bin/linux/ubuntu focal-cran40/" | sudo tee /etc/apt/sources.list.d/r-project.list
```

Note that you can change the domain to any one of the [CRAN mirrors](https://cran.r-project.org/mirmon_report.html) if you want. At this point, remember to tell apt about the new source:

```{bash, echo=TRUE, eval=FALSE}
sudo apt update
```

Install R and its dependencies using the apt package manager (we will also install libudunits2-dev, [a C library and utility for handling physical units](https://github.com/Unidata/UDUNITS-2/) that the R package units depends on):

```{bash, echo=TRUE, eval=FALSE}
sudo apt install r-base r-base-core r-recommended r-base-dev gdebi-core build-essential libcurl4-gnutls-dev libxml2-dev libssl-dev libudunits2-dev
```

This installs about 400 MB of software in a jiffy. That's R!

Now download the latest RStudio Server package file for Ubuntu 18+ and install it:

```{bash, eval=FALSE, echo=TRUE}
wget https://rstudio.org/download/latest/stable/server/bionic/rstudio-server-latest-amd64.deb
sudo gdebi rstudio-server-latest-amd64.deb
```

Great! Now all you have to do is launch your RStudio Server:

```{bash, eval=FALSE, echo=TRUE}
sudo rstudio-server start
```

With that, you can open your browser (Chromium/Chrome recommended) at http://localhost:8787 and enjoy your very own instance of RStudio Server. (You sign in with the same username and password as in your Ubuntu Linux VM.)

And with that, all the preparatory steps are done.[^3] We can now start working in our data analysis environment in R.
[^3]: We should mention here that there is a known limitation to WSL2 virtualisation regarding  having services (such as RStudio Server in this case) persist across reboots of the VM or even across reboots of the Windows machine itself. In short, [WSL2 does not support systemd nor init](https://github.com/microsoft/WSL2-Linux-Kernel/issues/44), so any services on the Linux VM will [not autostart when the VM is rebooted](https://stackoverflow.com/a/55582694). Even so, the web is [full](https://github.com/DamionGans/ubuntu-wsl2-systemd-script) of [hacks](https://github.com/arkane-systems/genie) [attempting to circumvent this limitation](https://reddit.com/r/bashonubuntuonwindows/comments/ezpcyu/how_to_make_wsl2_run_services_at_startup/). On the other hand, [this might not be such a big problem](https://dev.to/bowmanjd/you-probably-don-t-need-systemd-on-wsl-windows-subsystem-for-linux-49gn)  for the kind of experimentation that having an otherwise complete Linux shell available on the Windows desktop allows.         ## Working with quantities and measurement errors   The quantities package (which ties together the capabilities of the units and errors packages) has an [excellent vignette on quantity calculus in R](https://cran.r-project.org/web/packages/quantities/vignettes/introduction.html).    The errors package is [available on CRAN](https://cran.r-project.org/web/packages/errors/index.html), and its [git repo is published on Github](https://github.com/r-quantities/errors).  The package is also [presented by its authors in an R Journal article from 2018](https://journal.r-project.org/archive/2018/RJ-2018-075/index.html), and there is also [an informative blog post by the main author for the very first release of the package](https://www.enchufa2.es/archives/errors-0-0-1.html).   
The [units package](https://github.com/r-quantities/units) handles physical quantities (values with associated units) in R, and builds on UNIDATA's [udunits-2 C library](https://github.com/Unidata/UDUNITS-2) of over 3000 supported units, making it a very robust implementation while avoiding the sin of re-inventing the wheel. It does lead to a dependency on the system-level package udunits2, however, making it very hard to install and work with units on Windows.

I think that these packages have finally brought R up to par with Python with regards to quantity calculus, which is a crucial aspect for any scientist or engineer handling physical measurement values. Although it should be noted that the [uncertainties](https://pythonhosted.org/uncertainties/) [package](https://github.com/lebigot/uncertainties/) in Python is more mature (v1.0 was released in 2010, whereas the errors package in R has not yet reached that milestone), the errors package is very usable; indeed I used it for practically all the analysis steps of the latest paper I co-authored (I even managed to cite the package). @Ahmed2020

Let's take the errors and units packages out for a spin! Since this is a clean R installation, we will first need to install these packages and any other packages we would like to have available.

### Install necessary R packages

Let's install all the r-quantities packages, and a subset of the [tidyverse packages](https://www.tidyverse.org/) (feel free to install any other packages you may like).

```{r 'rstudio-server-install-packages', results="asis", echo=FALSE}
include_graphics("assets/rstudio-server-install-packages.jpg")
```

```{r, eval=FALSE}
install.packages(c("errors", "units", "constants", "quantities"))
install.packages(c("dplyr", "magrittr", "ggplot2", "ggforce", "tibble"))
install.packages("remotes")
remotes::install_github("solarchemist/R-common")
remotes::install_github("solarchemist/oceanoptics")
```

(This operation may take a minute.)
Now let's load all the packages we will need and get to work on some real-life data.

```{r, echo=TRUE, eval=FALSE}
library(errors)
library(constants)
library(units)
library(quantities)
library(common)
library(oceanoptics)
library(dplyr)
library(magrittr)
library(ggplot2)
library(ggforce)
```

```{r units-options, echo=FALSE}
# aim: get proper (mathematically) division sign between value and unit in plot axis labels
units_options(
  # as soon as a division sign is included in sep, plot fails!
  sep = c("~", "~"),
  # but it seems we can achieve div sign between value and unit by hacking the group variable
  group = c("/", ""), # appending ~ or space to group[1] throws an error, weird
  # use negative powers, as division signs between units can be ambiguous
  negative_power = TRUE,
  # for my slash-sign hacking to work at all, parse must be FALSE, it seems
  parse = FALSE)
```

### Back to work

```{r 'create-shared-rda', eval=FALSE, echo=FALSE}
## this chunk is left here to document how the shared rda file was created
# the H02A files
this.files <-
  list.files(
    path = "/media/bay/taha/chepec/laboratory/OOHR2000/160329/H02A",
    pattern = "\\.txt",
    full.names = TRUE)
# add the MB dye concentration for each experiment
this.conc <-
  c(100, 100, 50, 50, 20, 20, 10, 10,
    5, 5, 2, 2, 1, 1, 0.5, 0.5) # uM
# collect all experiments into a single dataframe
df.MB <- NULL
for (i in 1:length(this.files)) {
  df.MB <-
    bind_rows(
      df.MB,
      bind_cols(
        OO2df(datafile = this.files[i], version = "1"),
        conc = this.conc[i]))
}
df.MB <-
  df.MB %>%
  # H02AA experiments were bad (saturated detector), drop them from df
  filter(!sampleid %in% c("H02AA", "H02AA+EtOH")) %>%
  # the timestamp column is unnecessary, drop it
  select(-DateTime)
# save this dataframe as RDA file to disk
save(
  df.MB,
  file = here::here("assets/uvvis-abs-MB-conc-series.rda"))
```

For the sake of keeping this post short, I will spare you the legwork of reading the individual spectra files into a dataframe.
I have done that and saved the resultant dataframe to an rda-file, which has been uploaded to a publicly accessible URL. Let's import the data into a variable and have a quick look at the dataframe:

```{r 'read-uvvis-MB-conc-series', echo=TRUE, results="markup"}
# read the MB abs spectra from the public URL
df.MB <-
  common::LoadRData2Variable(url = "https://public.solarchemist.se/data/uvvis-abs-MB-conc-series.rda")
df.MB %>% glimpse()
```

As you can see, it is raw data from the spectrometer, and it comes with neither measurement errors nor units. (Unfortunately, it is still all-too-common for lab instruments to export data without such attributes.)

It is perfectly acceptable to add *estimated* standard errors to this raw data (we will add units at the same time using the set_quantities() function of the quantities package).

```{r, echo=TRUE}
# specify unit and error for the measured concentration values
df.MB$conc <-
  set_quantities(
    df.MB$conc * 1e-6,
    "mol/L",
    # set relative error 2% (estimate)
    df.MB$conc * 1e-6 * 0.02)
# specify unit and error for wavelength values to instrumental resolution
df.MB$wavelength <-
  set_quantities(df.MB$wavelength, "nm", 1.5)
# specify error for absorbance values to 1% rel error (estimate)
df.MB$intensity <-
  # absorbance is unitless, or more precisely, has a unit of "1"
  set_quantities(df.MB$intensity, "1", df.MB$intensity * 0.01)
# add a column describing the solvent (H2O or H2O+EtOH)
# note: the "+" must be escaped in the regex pattern
df.MB$solvent <- "H2O"
df.MB$solvent[grep("\\+EtOH$", df.MB$sampleid)] <- "H2O+EtOH"
```

For plotting, we will use the package ggplot2, which is like a cousin to the dplyr data manipulation package. Both come with a sort of "coding grammar", which is very comfortable to use once you get used to it. In my opinion it is much easier to express chains of manipulations using this grammar (you can find plenty of examples in the code below).
We should note that ggplotting two columns that both are units objects **will fail** with `Error in Ops.units(): both operands of the expression should be "units" objects`, due to the units package (rightfully) determining that they don't have the same units (or units that can be converted into each other). The (very elegant) solution to this finicky behaviour is [provided by the ggforce package](https://ggforce.data-imaginist.com/reference/scale_unit.html). All we have to do is make sure we load library(ggforce), and then we can get back to using ggplot2 as usual (mostly as usual; you will notice that we now use a scale_*_unit() function instead of the usual scale_*_continuous()).

(ref:uvvis-MB-conc-series) Recorded UV-Vis spectra of MB in water and MB in water with 10 vol% EtOH (with the estimated measurement error of the absorbances shown as vertical error bars in red).

```{r 'uvvis-MB-conc-series', echo=TRUE, fig.cap="(ref:uvvis-MB-conc-series)", fig.align="center"}
ggplot(
  df.MB %>%
    filter(wavelength %>% as.numeric() >= 250) %>%
    filter(wavelength %>% as.numeric() <= 850)) +
  facet_wrap(~solvent) +
  geom_errorbar(
    colour = "red",
    aes(x = wavelength,
        ymin = errors_min(intensity),
        ymax = errors_max(intensity))) +
  geom_line(aes(x = wavelength, y = intensity, group = sampleid)) +
  scale_x_unit(
    breaks = seq(300, 800, 100),
    expand = expansion(0.03, 0)) +
  scale_y_unit(expand = expansion(0.03, 0)) +
  labs(x = "Wavelength", y = "Abs") +
  theme(legend.position = "none")
```

> To check whether a unit already exists in the udunits2 library, I like to use the udunits2::ud.is.parseable() function. If you know of a better way, please let me know!

Now, as you have surely already noticed, this dataset is simply a series of UV-Vis absorbance spectra at varying concentrations (of methylene blue in water, and in water with some ethanol, in this case).
Here's how the data looks at this point (after we set errors and/or units for some columns):

```{r echo=TRUE, results="markup"}
df.MB %>% glimpse()
```

Most columns are of type "character", but note how $wavelength (and two other columns) say something else in that field. We can take a closer look at the structure of the dataframe using the str() function:

```{r echo=TRUE, results="markup"}
df.MB %>% str()
```

Having thus set suitably estimated (or perhaps even properly calculated) errors for the measured quantities *wavelength*, *absorbance*, and *concentration*, we will use this data to calculate the spectral absorption coefficient of MB in the two solvents we used.

As an introductory exercise, let's define the wavelength of the strongest absorption band along with its unit and a synthetic standard error value:

```{r echo=TRUE}
wl.max.abs <- set_quantities(665, "nm", 5)
```

We can change the notation of the error from parenthesis to plus-minus when printing a quantity:

```{r, echo=TRUE, results="markup"}
print(wl.max.abs, notation = "plus-minus", digits = 1)
```

> By default, errors uses *parenthesis* notation (which is more compact) and a single significant digit for errors.

We can also print more digits for the error part by setting digits. Both of these settings can also be set globally (for our entire R session) using the options() function, like this:

```{r, echo=TRUE}
options(errors.notation = "plus-minus", errors.digits = 2)
```

Before we get into calculating the spectral absorption coefficients, recall that the Beer-Lambert law

$$A = \epsilon lc \;\Longleftrightarrow\; \epsilon = \frac{A}{lc}$$

includes the optical path length of the cuvette, $l=\SI{1}{\cm}$ in our case.
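For readers who want to see what the errors package does under the hood, here is a minimal sketch, in plain Python, of the first-order (quadrature) propagation rule applied to $\epsilon = A/(lc)$: for products and quotients of uncorrelated quantities, the relative errors add in quadrature. The numbers below are made-up illustrative values, not taken from the MB dataset:

```python
import math

def quotient_error(a, da, b, db):
    """First-order error propagation for the quotient a/b of two
    uncorrelated quantities: relative errors add in quadrature."""
    q = a / b
    dq = abs(q) * math.sqrt((da / a) ** 2 + (db / b) ** 2)
    return q, dq

# made-up illustrative values:
# A = 0.500 +/- 0.005 (1% relative error, as in our absorbance estimate)
# l = 1.00 +/- 0.01 cm, c = 1.0e-5 +/- 2e-7 mol/L (2% relative error)
A, dA = 0.500, 0.005
l, dl = 1.00, 0.01
c, dc = 1.0e-5, 2.0e-7

# propagate through the product lc, then the quotient A/(lc)
lc = l * c
dlc = abs(lc) * math.sqrt((dl / l) ** 2 + (dc / c) ** 2)
eps, deps = quotient_error(A, dA, lc, dlc)

print(f"epsilon = {eps:.4g} +/- {deps:.3g} L/(mol cm)")
```

This is exactly the kind of bookkeeping that set_quantities() and the arithmetic operators of the errors package automate for every column of the dataframe (including the correlation terms, which this sketch omits).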
Let's set that as a variable so we can use it when calculating $\epsilon$:

```{r, echo=TRUE}
# I don't know the tolerances of the cuvette dimensions (quite the oversight, I know),
# but since these cuvettes are mass-produced I am assuming they are pretty tight, and
# so I am estimating the error at 1/10 mm
optical.length <- set_quantities(1, "cm", 0.01)
```

```{r 'lm-wrapper', echo=FALSE}
# wrap lm() to drop quantities and get around error when calling summary(lm())
# https://www.r-spatial.org/r/2018/08/31/quantities-final.html
qlm <- function(formula, data, ...) {
  # get units info, then drop quantities
  row <- data[1, ]
  for (var in colnames(data)) if (inherits(data[[var]], "quantities")) {
    data[[var]] <- drop_quantities(data[[var]])
  }

  # fit linear model and add units info for later use
  fit <- lm(formula, data, ...)
  fit$units <- lapply(eval(attr(fit$terms, "variables"), row), units)
  class(fit) <- c("qlm", class(fit))
  fit
}
# get quantities attributes back for the coef method
# (which is the only class we care about in this demonstration)
coef.qlm <- function(object, ...)
{
  # compute coefficients' units
  coef.units <- lapply(object$units, as_units)
  for (i in seq_len(length(coef.units) - 1) + 1)
    coef.units[[i]] <- coef.units[[1]] / coef.units[[i]]
  coef.units <- lapply(coef.units, units)

  # use units above and vcov diagonal to set quantities
  coef <- mapply(set_quantities, NextMethod(), coef.units,
    sqrt(diag(vcov(object))), mode = "symbolic", SIMPLIFY = FALSE)

  # use the rest of the vcov to set correlations
  p <- combn(names(coef), 2)
  for (i in seq_len(ncol(p)))
    covar(coef[[p[1, i]]], coef[[p[2, i]]]) <- vcov(object)[p[1, i], p[2, i]]

  coef
}
```

With a series of UV-Vis absorbance spectra at varying concentrations, we can of course plot the linear relationship $A \propto c$ at a particular wavelength (such as the main absorbance peak), but for the sake of this exercise we will take the calculation one step further and calculate $\epsilon(\lambda, A, c)$, i.e., the absorption coefficient for all wavelengths. It is not pretty code, but it gets the job done. Here goes:

```{r 'create-shared-abscoeff', eval=FALSE, echo=TRUE}
# loop over each spectrum by solvent
solvents <- df.MB %>% pull(solvent) %>% unique()
# beware, unique() silently drops all quantity attributes
wl.steps <- df.MB %>% filter(sampleid == unique(df.MB$sampleid)[1]) %>% pull(wavelength)
# hold the results of the loop in an empty dataframe, pre-made in the right dimensions
# complication! quantities cols must use the same (or convertible) units from the start
abs.coeff <-
  data.frame(
    wavelength = rep(wl.steps, length(solvents)),
    solvent = "",
    # slope of linear fit
    k = rep(set_quantities(1, "L/mol", 1), length(solvents) * length(wl.steps)),
    # intercept of linear fit
    m = rep(set_quantities(1, "1", 1), length(solvents) * length(wl.steps)),
    # calculated abs coeff
    coeff = rep(set_quantities(1, "L*cm^-1*mol^-1", 1), length(solvents) * length(wl.steps)),
    rsq = NA, # R-squared of linear fit
    adj.rsq = NA) # adj R-squared of linear fit
i <- 0
# Fair warning: this loop is horribly slow, and may take several minutes to complete!
for (s in 1:length(solvents)) {
  message("Solvent: ", solvents[s])
  # varying conc of a particular solvent
  for (w in 1:length(wl.steps)) {
    # to keep track of the current row in abs.coeff, we will use a counter i
    i <- i + 1
    message("[", i, "] ", solvents[s], " :: ", wl.steps[w], " nm")
    # temporary df which we will use to calculate linear fits (abs.coeff) for each wl.step
    intensity.by.wl <-
      df.MB %>%
      filter(solvent == unique(df.MB$solvent)[s]) %>%
      # even though we drop the errors when filtering, it does not affect the returned rows
      filter(as.numeric(wavelength) == unique(as.numeric(df.MB$wavelength))[w]) %>%
      select(sampleid, wavelength, intensity, conc, solvent)
    # now we can calculate abs coeff for this solvent at the current wavelength
    this.fit <-
      # now this is the really tricky one
      # https://github.com/r-quantities/quantities/issues/10
      qlm(data = intensity.by.wl, formula = intensity ~ conc)
    # save R-squared and adjusted R-squared
    abs.coeff$rsq[i] <- summary(this.fit)$r.squared
    abs.coeff$adj.rsq[i] <- summary(this.fit)$adj.r.squared
    # save coefficients of linear fit
    abs.coeff$m[i] <- coef(this.fit)[[1]]
    abs.coeff$k[i] <- coef(this.fit)[[2]]
    abs.coeff$coeff[i] <- coef(this.fit)[[2]] / optical.length
    # solvent
    abs.coeff$solvent[i] <- unique(intensity.by.wl$solvent)
  }
}
save(
  abs.coeff,
  file = here::here("assets/uvvis-abscoeff-MB-conc-series.rda"))
```

```{r 'read-abscoeff-MB-conc-series', echo=TRUE}
# read the calculated MB abs coeff from the public URL
abs.coeff <-
  LoadRData2Variable(url = "https://public.solarchemist.se/data/uvvis-abscoeff-MB-conc-series.rda")
```

Well, that was quite a workout![^4] But now that we have saved the calculated absorption coefficients to an rda-file on disk, we won't have to repeat that calculation (for a while at least). But the proof of the pudding is in the eating, so let's plot the absorption coefficients we got.

[^4]: Note that we had to redefine the lm() function using our own wrapper that takes care of unsetting and resetting the units and errors. [See above](#wrapper-function-redefining-lm-to-support-unitserrors). The code was copied verbatim from [Ucar's r-spatial article](https://www.r-spatial.org/r/2018/08/31/quantities-final.html).

(ref:abscoeff-MB-conc-series) Calculated spectral absorption coefficient for MB in water and MB in water with 10 vol% EtOH, with the estimated standard error due to propagated measurement errors of the underlying quantities (red errorbars).

```{r 'abscoeff-MB-conc-series', echo=TRUE, fig.cap="(ref:abscoeff-MB-conc-series)", fig.align="center"}
ggplot(
  abs.coeff %>%
    filter(wavelength %>% as.numeric() >= 250) %>%
    filter(wavelength %>% as.numeric() <= 850)) +
  geom_errorbar(
    colour = "red",
    aes(x = wavelength,
        ymin = errors_min(coeff),
        ymax = errors_max(coeff))) +
  geom_line(aes(x = wavelength, y = coeff, group = solvent)) +
  scale_x_unit(
    breaks = seq(300, 800, 100),
    expand = expansion(0.03, 0)) +
  scale_y_unit(expand = expansion(0.03, 0)) +
  labs(x = "Wavelength", y = "Abs. coeff.") +
  theme(legend.position = "none")
```

That looks good. Note how the vertical errorbars of the two curves (different solvents) overlap completely in certain spectral regions, notably at the main absorbance peak.
And not only did our entire analysis automatically propagate the measurement errors, we also managed to do the same for the units, performing automatic dimensional analysis at the same time (note how the units in the plot axis labels are automatically derived from the data).

That's quantity calculus, in R, with real-life data! I hope you learned something new, and that this will inspire you to use quantity calculus in your future work.

## But what about Python?

I feel it would be bad form to end this post without even mentioning Python's support for error propagation and dimensional analysis, which far predates R's.

Python has a whole array of [packages](https://pypi.org/project/units/) to [choose](https://pypi.org/project/python-units/) from for [working with physical units](https://kitchingroup.cheme.cmu.edu/blog/2013/01/19/Using-units-in-python/), but as far as I can tell none of them build on the UNIDATA database.

However, the [pint package](https://github.com/hgrecco/pint) appears to be the most popular by far (with 1200 stars on Github, it is actually an order of magnitude more popular than all the related R packages), and it integrates with NumPy, Pandas, and the uncertainties package.

Support for errors and error propagation has been around since 2010 courtesy of the [uncertainties package](https://pythonhosted.org/uncertainties) by Eric O. Lebigot.

I am not aware of any package for integrating the uncertainties and pint packages, but perhaps that is not necessary in Python; I am not sure on that point.

In any case, getting up and running with Python is fairly easy. In fact, Python 3.x comes pre-installed on the Ubuntu shell we just configured. You should be aware that you could just *write and execute* your Python code directly from RStudio Server (or R in general) using the [reticulate package](https://rstudio.github.io/reticulate/) for R.
But if you prefer working with Python in its own IDE, you can [install Jupyter Notebook](https://www.digitalocean.com/community/tutorials/how-to-set-up-jupyter-notebook-with-python-3-on-ubuntu-18-04) and work from a browser, just like for RStudio Server. Just follow these steps.

In the WSL Ubuntu shell, run the following commands to install Jupyter Notebook:

~~~bash
sudo apt install python3-pip python3-dev
sudo -H pip3 install --upgrade pip
sudo -H pip3 install jupyter
~~~

then run jupyter notebook and copy-paste the provided URL into your browser.

> Note: for serious work, you should look into installing JupyterLab and/or JupyterHub instead of Jupyter Notebook.

[This template for a student lab report](https://nbviewer.jupyter.org/gist/solarchemist/f9eb4f93518d16e0e6d3) I co-authored way back when uses both uncertainties and numpy for its calculations, and may be of interest.

## Wrapper function redefining lm() to support units/errors

```{r, ref.label='lm-wrapper', echo=TRUE, eval=FALSE}
```

## sessionInfo()

```{r results='markup', cache=F}
sessionInfo()
```

## Read more on your own...
+ [Does R understand physical quantities?, Edzer Pebesma, R-spatial (2016)](https://www.r-spatial.org/r/2016/06/10/units.html) + [Measurement units in R now simplify, Edzer Pebesma, R-spatial (2016)](https://www.r-spatial.org/r/2016/08/16/units2.html) + [Automatic units in axis labels, Edzer Pebesma, R-spatial (2016)](https://www.r-spatial.org/r/2016/09/29/plot_units.html) + [Linear model where the data has uncertainty using R, Gimelist, StackExchange (2016)](https://stats.stackexchange.com/questions/235693/linear-model-where-the-data-has-uncertainty-using-r) + [Using RStudio Server in Windows WSL2, JooYoung Seo, RStudio Support (2020)](https://support.rstudio.com/hc/en-us/articles/360049776974-Using-RStudio-Server-in-Windows-WSL2) + [Using RStudio Server on the Windows 10 subsystem for Linux, Axel-Cleris Gailloty (2020)](https://agailloty.github.io/Rstudio-server-windows/) + [Enabling virtual machine control structure shadowing on a nested virtual machine with the Intel Xeon E5-2600 v3 product family, David L. Mulnix, Intel (2014)](https://software.intel.com/content/www/us/en/develop/blogs/enabling-virtual-machine-control-structure-shadowing-on-a-nested-virtual-machine.html) + [Upgrading RStudio Server, Ian Pylvainen, RStudio Support (2020)](https://support.rstudio.com/hc/en-us/articles/216079967-Upgrading-RStudio-Server) + [R installation and administration, R Core Team, CRAN (2020)](https://cran.r-project.org/doc/manuals/r-release/R-admin.html)     ## References