4. Data Extraction Tips & Tricks

4.1. Factor-Level Extraction

4.1.1. Defining the factor itself

What do I do if…

… it is not clear which factors are considered relevant points of comparison, and which are not

Factors listed here are considered “unmodifiable” or “immutable”, and thus not relevant for our purposes:

  • Immutable Factors

For examples of extractable factors, please see:

  • Common Factor Types
  • Binary and Continuous Factors

… I’m confused about how to extract a factor

If you are unsure how to extract a factor, ask on Slack for clarification. Also add a note explaining why the factor was extracted the way it was.

… I want to compare conventional, ABF, organic, ‘welfare’ or ‘humane’ production systems

Common Factor Types

4.1.2. Selecting exposed and referent groups

Note

Groups are otherwise referred to as levels.

What do I do if…

… I am not sure which is which?

Selecting a Referent Level

… a group is not clearly defined (e.g. an “Other” group)?

Non-informative Levels

… there are more than two designated groups in the study?

Multiple Discrete Levels (Categories)

Non-informative Levels

4.1.3. Selecting the sample type

4.1.3.1. Which sample type should be extracted if multiple sample types (e.g. fecal, water, dirt…) are available?

Sample Type

4.1.4. Choosing the microbe subtype

What do I do if…

… a microbe subtype is not listed in the dropdown in CEDAR

For Salmonella species:

  • Salmonella Species

4.1.5. Factor data

What do I do if…

… the data are only available in a figure

If factor data are available only in a figure (i.e. no numbers are given on the graph or in the text) and the numerical value cannot be determined with certainty (i.e. it is not clearly zero or 100%), indicate this in the notes field and skip extracting the factor.

… multiple data formats (e.g. a contingency table and a prevalence table) are available for a factor

Multiple Data Formats

… measurements are provided for multiple time points

Multiple Production Stages

Multiple Timepoints Within a Single Production Stage

Multiple Timepoints Within the Farm Stage

… the study uses SIR (Susceptible, Intermediate, and Resistant)

If a study includes an ‘Intermediate’ category, add the intermediate isolates/prevalence to the resistant category (i.e. we round up intermediate to resistant).
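
The rule above amounts to simple arithmetic on the reported counts. The following is a minimal sketch, assuming the study reports isolate counts for each SIR category; the function name `merge_sir_counts` and the example numbers are hypothetical and are not part of CEDAR.

```python
def merge_sir_counts(susceptible, intermediate, resistant):
    """Fold intermediate isolates into the resistant category.

    Returns (susceptible, resistant_including_intermediate).
    Hypothetical helper for illustration only.
    """
    if min(susceptible, intermediate, resistant) < 0:
        raise ValueError("isolate counts cannot be negative")
    return susceptible, resistant + intermediate


# Made-up example: 60 susceptible, 10 intermediate, 30 resistant isolates.
s, r = merge_sir_counts(60, 10, 30)
print(s, r)  # 60 40
```

The total number of isolates is unchanged; only the split between susceptible and resistant moves. If a study reports prevalences rather than counts, the same addition applies to the intermediate and resistant percentages, provided both use the same denominator.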

… odds ratios from both multi-variable and univariable analyses are available

Odds Ratio Extraction

… there are zero observations of resistance in both the exposed and referent groups

Zero Observations of Resistance

… the results are in log(Odds) or an estimate/coefficient of a logistic regression

Recall that the Odds Ratio = e^x, where x is the reported log(Odds) value or logistic regression coefficient; exponentiate x to obtain the odds ratio.
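
As a minimal sketch of this conversion, assuming the study reports the coefficient on the natural-log scale: the function name `odds_ratio_from_coefficient` is hypothetical. The optional confidence interval is an added assumption rather than a rule from this guide, and it uses the standard Wald approximation exp(x ± z·SE).

```python
import math


def odds_ratio_from_coefficient(coef, se=None, z=1.96):
    """Convert a natural-log odds coefficient to an odds ratio.

    coef: logistic regression coefficient (log odds ratio).
    se:   optional standard error of the coefficient.
    z:    critical value for the interval (1.96 for about 95%).
    Returns (odds_ratio, (lower, upper)), or (odds_ratio, None)
    when no standard error is supplied.
    Hypothetical helper for illustration only.
    """
    odds_ratio = math.exp(coef)
    if se is None:
        return odds_ratio, None
    # Wald interval: exponentiate the bounds computed on the log scale.
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return odds_ratio, (lower, upper)


# A coefficient of 0 corresponds to an odds ratio of exactly 1 (no association).
print(odds_ratio_from_coefficient(0.0))  # (1.0, None)
```

If a study reports results on a different log base (for example log10), this sketch does not apply as written; check the paper's methods before converting.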

… the data are presented only as a relative risk

We cannot use relative risk at this time. Do not extract the factor’s data, but indicate the omission by attaching a note to the associated reference through the Notes and Issues tab.

… the study reports multi-drug resistance (MDR)

MDR Rules

… the study reports genomic data on AMR

Genomic data

4.2. General

What do I do if…

… there are no factors to extract

If there are no factors to extract, indicate this using the Exclude Extraction Reason field, and skip the reference.

… an item I need is missing from a dropdown

If an item is missing from a dropdown (i.e. a non-free-text field), reach out on Slack. If the decision is made to use an alternative item in the list, add a note to justify this replacement.