Evaluate Data Quality of a Title List

Once you have located a provider's title lists, evaluate their quality to confirm that they are suitable for use in GOKb and to understand how each list is formatted.


Step 1: Verify that key GOKb columns are present

  • Your data should contain every required GOKb field whose value differs from title to title.
  • (A few required GOKb fields are rarely found in provider lists, but they are easy to populate because their value is the same for every title.)
  • The following fields must be present:
    • Publication title
    • Print ISSN
    • Electronic ISSN
    • URL
    • Date of first online issue
  • Note that it is OK if individual titles are lacking ISSNs. It's only required that the data set contain a column for each ISSN type.
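
For long or numerous lists, the column check in Step 1 can be scripted. The following is a minimal sketch, not a GOKb tool. It assumes the provider list is tab-delimited and uses KBART-style header names (publication_title, print_identifier, online_identifier, title_url, date_first_issue_online). Providers that label their columns differently would need a different mapping, and the function name missing_required_columns is hypothetical.

```python
import csv
import io

# Assumption: KBART-style header names for the five required fields.
# A provider's list may label these columns differently.
REQUIRED_COLUMNS = {
    "publication_title": "Publication title",
    "print_identifier": "Print ISSN",
    "online_identifier": "Electronic ISSN",
    "title_url": "URL",
    "date_first_issue_online": "Date of first online issue",
}


def missing_required_columns(text, delimiter="\t"):
    """Return the required field names whose columns are absent from the header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])
    present = {name.strip().lower() for name in header}
    return [label for column, label in REQUIRED_COLUMNS.items() if column not in present]


# Illustrative data only: this sample has no electronic ISSN column.
sample = (
    "publication_title\tprint_identifier\ttitle_url\tdate_first_issue_online\n"
    "Journal of Examples\t1234-5679\thttps://example.org/joe\t1999-01-01\n"
)
print(missing_required_columns(sample))
```

For the sample above, the script prints ['Electronic ISSN']. Only the header row is inspected, so a list passes even when individual titles lack ISSNs.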

Step 2: Get a general sense of data quality

  • Skim the list to make sure that most fields are populated throughout the data set.
  • Spot check a few titles to make sure the ISSNs, URLs, and coverage dates are correct.
  • You may also want to get a sense of which non-required GOKb fields are available, including:
    • Additional identifiers (DOIs, provider-specific IDs, etc.)
    • Date of last online issue
    • Volume of first online issue
    • Number of first online issue
    • Volume of last online issue
    • Number of last online issue
    • Embargo period
    • Publisher names (if the package contains resources from multiple publishers)
  • The goal here is to get a sense of what data is included, what's missing, and how good the overall quality is.
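
The skim and spot check in Step 2 can be supplemented with a short script. The sketch below makes the same assumptions as a typical KBART file (tab-delimited text with a header row), and the function names fill_rates and issn_is_well_formed are hypothetical. It reports how fully each column is populated and tests whether an ISSN has a valid check digit. A well-formed ISSN is not necessarily the right ISSN for a title, so this does not replace checking a few titles by hand.

```python
import csv
import io
import re


def fill_rates(text, delimiter="\t"):
    """Return {column: fraction of rows that have a non-blank value}."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return {}
    return {
        column: sum(1 for row in rows if (row.get(column) or "").strip()) / len(rows)
        for column in rows[0]
    }


def issn_is_well_formed(issn):
    """Check the NNNN-NNNC pattern and the ISSN check digit (modulus 11)."""
    candidate = issn.strip().upper()
    if not re.fullmatch(r"[0-9]{4}-?[0-9]{3}[0-9X]", candidate):
        return False
    digits = candidate.replace("-", "")
    # The first seven digits are weighted 8 down to 2.
    total = sum(int(d) * weight for d, weight in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


# Illustrative data only.
sample = (
    "publication_title\tprint_identifier\tonline_identifier\n"
    "Journal of Examples\t1234-5679\t\n"
    "Example Review\t\t2049-3630\n"
)
for column, rate in fill_rates(sample).items():
    print(f"{column}: {rate:.0%} populated")
print(issn_is_well_formed("1234-5679"))
```

A low fill rate on an ISSN column is not a problem in itself, but a low fill rate on a required column such as the URL or the date of first online issue is worth raising with the provider.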

Step 3: Determine treatment of title histories

  • Determine whether the provider includes historic titles on its list.
    • Some providers will include all previous iterations of a title with appropriate access dates for each based on when it was published.
    • Some providers will list only the current iteration of a title and group all available coverage under that title.
  • If historic titles are provided, determine whether the title list includes any linkages between earlier and later titles.
    • If a provider is using KBART Phase II, they may create links by populating the "preceding_publication_title_id" field.
    • Some providers may just use free text notes to indicate histories.
  • Depending on the size of your list, you may wish to add title history data to your list before adding it to OpenRefine or GOKb.
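
To see whether a list carries machine-readable title history links, you can look for populated values in the "preceding_publication_title_id" column. The following is a minimal sketch under stated assumptions: the list is tab-delimited and uses the KBART Phase II columns title_id, publication_title, and preceding_publication_title_id. The function name title_history_links is hypothetical, and histories recorded only in free-text notes will not be detected.

```python
import csv
import io


def title_history_links(text, delimiter="\t"):
    """Return (earlier title, later title) pairs linked by preceding_publication_title_id."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    titles_by_id = {
        (row.get("title_id") or "").strip(): (row.get("publication_title") or "").strip()
        for row in rows
        if (row.get("title_id") or "").strip()
    }
    links = []
    for row in rows:
        preceding_id = (row.get("preceding_publication_title_id") or "").strip()
        if preceding_id:
            # Fall back to the raw ID when the earlier title is not on the list.
            earlier = titles_by_id.get(preceding_id, preceding_id)
            links.append((earlier, (row.get("publication_title") or "").strip()))
    return links


# Illustrative data only.
sample = (
    "title_id\tpublication_title\tpreceding_publication_title_id\n"
    "T1\tExample Bulletin\t\n"
    "T2\tExample Review\tT1\n"
)
print(title_history_links(sample))
```

For the sample above, the script prints [('Example Bulletin', 'Example Review')]. An empty result means either that the provider lists only current titles or that it does not populate the linking field, so the list still needs a manual look.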

Next step

Create GOKb Provider Documentation

Operated as a Community Resource by the Open Library Foundation