Tool for building and analyzing document-vector models drawn from our corpus
Code to produce a tokenized corpus from JSON documents
A collection of Python utilities that we use across various scripts
Documentation and tooling for our corpus of journal articles
Documentation on how we fetch journal articles from Sci-Hub
Code for performing basic geocoding against the GeoNames gazetteer data