All reference data (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) in this AMR package are reliable, up-to-date and freely available. We continually export our data sets to formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. We also provide tab-separated text files that are machine-readable and suitable for input in any software program, such as laboratory information systems.

On this page, we explain how to download them and what the structure of the data sets looks like.

microorganisms: Full Microbial Taxonomy

A data set with 52 171 rows and 23 columns, containing the following column names:
mo, fullname, status, kingdom, phylum, class, order, family, genus, species, subspecies, rank, ref, oxygen_tolerance, source, lpsn, lpsn_parent, lpsn_renamed_to, gbif, gbif_parent, gbif_renamed_to, prevalence, and snomed.

This data set is available in R as microorganisms, after you load the AMR package.

It was last updated on 14 July 2023 08:49:06 UTC. Find more info about the structure of this data set here.

Direct download links:

NOTE: The exported files for SAS, SPSS and Stata contain only the first 50 SNOMED codes per record, as their file size would otherwise exceed 100 MB, the file size limit of GitHub. Their file structures and compression techniques are very inefficient. Advice? Use R instead. It’s free and much better in many ways.

The tab-separated text file and Microsoft Excel workbook both contain all SNOMED codes as comma-separated values.
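
As a minimal sketch of what this means when reading the tab-separated file outside R, the Python snippet below splits the snomed field into a list of codes. The inline rows are abridged from the example content on this page (only 3 of the 23 columns are kept), the helper names split_codes and read_snomed are hypothetical, and the quoting and missing-value encoding of the real file are assumptions that should be checked against the downloaded file.

```python
import csv
import io

# Abridged, illustrative rows (3 of the 23 columns). The E. coli code list is
# truncated on this page, so only the three codes shown there are used here.
SAMPLE_TSV = (
    "mo\tfullname\tsnomed\n"
    "B_ESCHR_ADCR\tEscherichia adecarboxylata\t\n"
    "B_ESCHR_ALBR\tEscherichia albertii\t419388003\n"
    "B_ESCHR_COLI\tEscherichia coli\t1095001000112106, 715307006, 737528008\n"
)


def split_codes(field):
    """Split a comma-separated field into a list of stripped, non-empty codes."""
    return [code.strip() for code in field.split(",") if code.strip()]


def read_snomed(tsv_text):
    """Return a dict mapping each mo code to its list of SNOMED codes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["mo"]: split_codes(row["snomed"]) for row in reader}


snomed_by_mo = read_snomed(SAMPLE_TSV)
print(snomed_by_mo["B_ESCHR_ALBR"])  # ['419388003']
print(snomed_by_mo["B_ESCHR_ADCR"])  # []
```

The codes are kept as strings because they are identifiers, not quantities.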

Source

This data set contains the full microbial taxonomy of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF):

  • Parte, AC et al. (2020). List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612. Accessed from https://lpsn.dsmz.de on December 11th, 2022.
  • GBIF Secretariat (2023). GBIF Backbone Taxonomy. Checklist dataset. Accessed from https://www.gbif.org on January 8th, 2024.
  • Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name ‘Microorganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL: https://phinvads.cdc.gov

Example content

Included (sub)species per taxonomic kingdom:

Kingdom             Number of (sub)species
(unknown kingdom)   1
Animalia            1 379
Archaea             1 314
Bacteria            36 501
Fungi               7 905
Protozoa            5 071

Example rows when filtering on genus Escherichia:

mo fullname status kingdom phylum class order family genus species subspecies rank ref oxygen_tolerance source lpsn lpsn_parent lpsn_renamed_to gbif gbif_parent gbif_renamed_to prevalence snomed
B_ESCHR Escherichia accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia genus Castellani et al., 1919 facultative anaerobe LPSN 515602 482 3221780 11158430 1.0 407310004, 407251000, 407281008, …
B_ESCHR_ADCR Escherichia adecarboxylata synonym Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia adecarboxylata species Leclerc, 1962 aerobe LPSN 776052 515602 777447 1.0
B_ESCHR_ALBR Escherichia albertii accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia albertii species Huys et al., 2003 aerobe LPSN 776053 515602 5427575 3221780 1.0 419388003
B_ESCHR_BLTT Escherichia blattae synonym Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia blattae species Burgess et al., 1973 likely facultative anaerobe LPSN 776056 515602 788468 1.5
B_ESCHR_COLI Escherichia coli accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia coli species Castellani et al., 1919 facultative anaerobe LPSN 776057 515602 11286021 3221780 1.0 1095001000112106, 715307006, 737528008, …
B_ESCHR_DYSN Escherichia dysenteriae accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia dysenteriae species likely facultative anaerobe GBIF 10862979 3221780 1.5

antibiotics: Antibiotic (+Antifungal) Drugs

A data set with 483 rows and 14 columns, containing the following column names:
ab, cid, name, group, atc, atc_group1, atc_group2, abbreviations, synonyms, oral_ddd, oral_units, iv_ddd, iv_units, and loinc.

This data set is available in R as antibiotics, after you load the AMR package.

It was last updated on 24 February 2024 14:16:52 UTC. Find more info about the structure of this data set here.

Direct download links:

The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the ATC codes, common abbreviations, trade names and LOINC codes as comma-separated values.

Source

This data set contains all EARS-Net and ATC codes gathered from WHO and WHONET, and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.

Example content

ab cid name group atc atc_group1 atc_group2 abbreviations synonyms oral_ddd oral_units iv_ddd iv_units loinc
AMK 37768 Amikacin Aminoglycosides D06AX12, J01GB06, S01AA21 Aminoglycoside antibacterials Other aminoglycosides ak, ami, amik, … amicacin, amikacillin, amikacin, … 1.0 g 101493-5, 11-7, 12-5, …
AMX 33613 Amoxicillin Beta-lactams/penicillins J01CA04 Beta-lactam antibacterials, penicillins Penicillins with extended spectrum ac, amox, amx actimoxi, amoclen, amolin, … 1.5 g 3.0 g 101498-4, 15-8, 16-6, …
AMC 23665637 Amoxicillin/clavulanic acid Beta-lactams/penicillins J01CR02 Beta-lactam antibacterials, penicillins Combinations of penicillins, incl. beta-lactamase inhibitors a/c, amcl, aml, … amocla, amoclan, amoclav, … 1.5 g 3.0 g
AMP 6249 Ampicillin Beta-lactams/penicillins J01CA01, S01AA19 Beta-lactam antibacterials, penicillins Penicillins with extended spectrum am, amp, ampi acillin, adobacillin, amblosin, … 2.0 g 6.0 g 101477-8, 101478-6, 18864-9, …
AZM 447043 Azithromycin Macrolides/lincosamides J01FA10, S01AA26 Macrolides, lincosamides and streptogramins Macrolides az, azi, azit, … aritromicina, aruzilina, azasite, … 0.3 g 0.5 g 100043-9, 16420-2, 16421-0, …
PEN 5904 Benzylpenicillin Beta-lactams/penicillins J01CE01, S01AA14 Combinations of antibacterials Combinations of antibacterials bepe, pen, peni, … abbocillin, ayercillin, bencilpenicilina, … 3.6 g

antivirals: Antiviral Drugs

A data set with 120 rows and 11 columns, containing the following column names:
av, name, atc, cid, atc_group, synonyms, oral_ddd, oral_units, iv_ddd, iv_units, and loinc.

This data set is available in R as antivirals, after you load the AMR package.

It was last updated on 20 October 2023 12:51:48 UTC. Find more info about the structure of this data set here.

Direct download links:

The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the trade names and LOINC codes as comma-separated values.

Source

This data set contains all ATC codes gathered from WHO and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.

Example content

av name atc cid atc_group synonyms oral_ddd oral_units iv_ddd iv_units loinc
ABA Abacavir J05AF06 441300 Nucleoside and nucleotide reverse transcriptase inhibitors abacavir sulfate, avacavir, ziagen 0.6 g 29113-8, 30273-7, 30287-7, …
ACI Aciclovir J05AB01 135398513 Nucleosides and nucleotides excl. reverse transcriptase inhibitors acicloftal, aciclovier, aciclovirum, … 4.0 g 4 g
ADD Adefovir dipivoxil J05AF08 60871 Nucleoside and nucleotide reverse transcriptase inhibitors adefovir di, adefovir di ester, adefovir dipivoxyl, … 10.0 mg
AME Amenamevir J05AX26 11397521 Other antivirals amenalief 0.4 g
AMP Amprenavir J05AE05 65016 Protease inhibitors agenerase, carbamate, prozei 1.2 g 29114-6, 30296-8, 30297-6, …
ASU Asunaprevir J05AP06 16076883 Antivirals for treatment of HCV infections sunvepra, sunvepratrade 0.2 g

clinical_breakpoints: Interpretation from MIC values & disk diameters to SIR

A data set with 29 883 rows and 13 columns, containing the following column names:
guideline, type, host, method, site, mo, rank_index, ab, ref_tbl, disk_dose, breakpoint_S, breakpoint_R, and uti.

This data set is available in R as clinical_breakpoints, after you load the AMR package.

It was last updated on 24 February 2024 14:16:52 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains interpretation rules for MIC values and disk diffusion diameters. Included guidelines are CLSI (2011-2023) and EUCAST (2011-2023).

Clinical breakpoints in this package were validated through and imported from WHONET, a free desktop Windows application developed and supported by the WHO Collaborating Centre for Surveillance of Antimicrobial Resistance. More can be read on their website. The developers of WHONET and this AMR package have been in contact about sharing their work. We greatly appreciate their work on the WHONET software.

The CEO of CLSI and the chairman of EUCAST endorsed the work and public use of this AMR package (and consequently the use of their breakpoints) in June 2023, when the future distribution of clinical breakpoints was discussed in a meeting between CLSI, EUCAST, the WHO, and the developers of WHONET and the AMR package.

NOTE: this AMR package (and the WHONET software as well) contains internal methods to apply the guidelines, which is a rather complex process. For example, some breakpoints must be applied to certain species groups (which, in this package, are available through the microorganisms.groups data set). It is important to take this into account when using the breakpoints for your own purposes.

Example content

guideline type host method site mo mo_name rank_index ab ab_name ref_tbl disk_dose breakpoint_S breakpoint_R uti
EUCAST 2023 human human DISK B_ACHRMB_XYLS Achromobacter xylosoxidans 2 MEM Meropenem A. xylosoxidans 10ug 26.000 20.000 FALSE
EUCAST 2023 human human MIC B_ACHRMB_XYLS Achromobacter xylosoxidans 2 MEM Meropenem A. xylosoxidans 1.000 4.000 FALSE
EUCAST 2023 human human DISK B_ACHRMB_XYLS Achromobacter xylosoxidans 2 SXT Trimethoprim/sulfamethoxazole A. xylosoxidans 1.25ug/23.75ug 26.000 26.000 FALSE
EUCAST 2023 human human MIC B_ACHRMB_XYLS Achromobacter xylosoxidans 2 SXT Trimethoprim/sulfamethoxazole A. xylosoxidans 0.125 0.125 FALSE
EUCAST 2023 human human DISK B_ACHRMB_XYLS Achromobacter xylosoxidans 2 TZP Piperacillin/tazobactam A. xylosoxidans 30ug/6ug 26.000 26.000 FALSE
EUCAST 2023 human human MIC B_ACHRMB_XYLS Achromobacter xylosoxidans 2 TZP Piperacillin/tazobactam A. xylosoxidans 4.000 4.000 FALSE
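
To illustrate how the breakpoint_S and breakpoint_R columns can be read, here is a minimal Python sketch that classifies a single measurement. It assumes the EUCAST comparison convention (MIC: S when at or below breakpoint_S, R when above breakpoint_R; disk: S when at or above breakpoint_S, R when below breakpoint_R). Other guidelines may define the resistant boundary differently, and the sketch ignores species groups, infection site and the uti column, so it is not a substitute for the package's internal methods. The function names are hypothetical.

```python
def interpret_mic(mic, breakpoint_s, breakpoint_r):
    """Classify an MIC value (mg/L), assuming the EUCAST convention."""
    if mic <= breakpoint_s:
        return "S"
    if mic > breakpoint_r:
        return "R"
    return "I"


def interpret_disk(diameter, breakpoint_s, breakpoint_r):
    """Classify a disk diffusion zone diameter (mm), assuming the EUCAST convention."""
    if diameter >= breakpoint_s:
        return "S"
    if diameter < breakpoint_r:
        return "R"
    return "I"


# Meropenem for A. xylosoxidans, using the EUCAST 2023 example rows above:
# MIC breakpoints S = 1, R = 4; disk breakpoints S = 26, R = 20.
print(interpret_mic(0.5, 1.0, 4.0))  # S
print(interpret_mic(2.0, 1.0, 4.0))  # I
print(interpret_mic(8.0, 1.0, 4.0))  # R
print(interpret_disk(22, 26, 20))    # I
```

When breakpoint_S and breakpoint_R are equal, as in the trimethoprim/sulfamethoxazole rows, no value can fall in between, so only S and R are returned.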

intrinsic_resistant: Intrinsic Bacterial Resistance

A data set with 134 634 rows and 2 columns, containing the following column names:
mo and ab.

This data set is available in R as intrinsic_resistant, after you load the AMR package.

It was last updated on 16 December 2022 15:10:43 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains all intrinsic resistance defined by EUCAST for all bug-drug combinations, and is based on ‘EUCAST Expert Rules’ and ‘EUCAST Intrinsic Resistance and Unusual Phenotypes’ v3.3 (2021).

Example content

Example rows when filtering on Enterobacter cloacae:

microorganism antibiotic
Enterobacter cloacae Acetylmidecamycin
Enterobacter cloacae Acetylspiramycin
Enterobacter cloacae Amoxicillin
Enterobacter cloacae Amoxicillin/clavulanic acid
Enterobacter cloacae Ampicillin
Enterobacter cloacae Ampicillin/sulbactam
Enterobacter cloacae Avoparcin
Enterobacter cloacae Azithromycin
Enterobacter cloacae Benzylpenicillin
Enterobacter cloacae Cadazolid
Enterobacter cloacae Cefadroxil
Enterobacter cloacae Cefalexin
Enterobacter cloacae Cefalotin
Enterobacter cloacae Cefazolin
Enterobacter cloacae Cefoxitin
Enterobacter cloacae Clarithromycin
Enterobacter cloacae Clindamycin
Enterobacter cloacae Cycloserine
Enterobacter cloacae Dalbavancin
Enterobacter cloacae Dirithromycin
Enterobacter cloacae Erythromycin
Enterobacter cloacae Flurithromycin
Enterobacter cloacae Fusidic acid
Enterobacter cloacae Gamithromycin
Enterobacter cloacae Josamycin
Enterobacter cloacae Kitasamycin
Enterobacter cloacae Lincomycin
Enterobacter cloacae Linezolid
Enterobacter cloacae Meleumycin
Enterobacter cloacae Midecamycin
Enterobacter cloacae Miocamycin
Enterobacter cloacae Nafithromycin
Enterobacter cloacae Norvancomycin
Enterobacter cloacae Oleandomycin
Enterobacter cloacae Oritavancin
Enterobacter cloacae Pirlimycin
Enterobacter cloacae Primycin
Enterobacter cloacae Pristinamycin
Enterobacter cloacae Quinupristin/dalfopristin
Enterobacter cloacae Ramoplanin
Enterobacter cloacae Rifampicin
Enterobacter cloacae Rokitamycin
Enterobacter cloacae Roxithromycin
Enterobacter cloacae Solithromycin
Enterobacter cloacae Spiramycin
Enterobacter cloacae Tedizolid
Enterobacter cloacae Teicoplanin
Enterobacter cloacae Telavancin
Enterobacter cloacae Telithromycin
Enterobacter cloacae Thiacetazone
Enterobacter cloacae Tildipirosin
Enterobacter cloacae Tilmicosin
Enterobacter cloacae Troleandomycin
Enterobacter cloacae Tulathromycin
Enterobacter cloacae Tylosin
Enterobacter cloacae Tylvalosin
Enterobacter cloacae Vancomycin
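
Because the data set is a plain list of bug-drug pairs, checking for intrinsic resistance comes down to a membership test. The Python sketch below shows this with three pairs taken from the example rows above. It uses the displayed names for readability, whereas the actual data set stores codes in its mo and ab columns; the function name is hypothetical.

```python
# Three illustrative pairs from the example rows; the real data set holds
# mo and ab codes rather than full names.
INTRINSIC_SAMPLE = {
    ("Enterobacter cloacae", "Amoxicillin"),
    ("Enterobacter cloacae", "Cefazolin"),
    ("Enterobacter cloacae", "Vancomycin"),
}


def is_intrinsically_resistant(microorganism, antibiotic, pairs=INTRINSIC_SAMPLE):
    """Return True if the (microorganism, antibiotic) pair is listed."""
    return (microorganism, antibiotic) in pairs


print(is_intrinsically_resistant("Enterobacter cloacae", "Vancomycin"))  # True
print(is_intrinsically_resistant("Enterobacter cloacae", "Meropenem"))   # False
```

A False result only means that the pair is absent from the list that was loaded, which here is a three-row sample.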

dosage: Dosage Guidelines from EUCAST

A data set with 503 rows and 9 columns, containing the following column names:
ab, name, type, dose, dose_times, administration, notes, original_txt, and eucast_version.

This data set is available in R as dosage, after you load the AMR package.

It was last updated on 22 June 2023 13:10:59 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

EUCAST breakpoints used in this package are based on the dosages in this data set.

Currently included dosages in the data set are meant for: (), ‘EUCAST Clinical Breakpoint Tables’ v11.0 (2021), and ‘EUCAST Clinical Breakpoint Tables’ v12.0 (2022).

Example content

ab name type dose dose_times administration notes original_txt eucast_version
AMK Amikacin standard_dosage 25-30 mg/kg 1 iv 25-30 mg/kg x 1 iv 13
AMX Amoxicillin high_dosage 2 g 6 iv 2 g x 6 iv 13
AMX Amoxicillin standard_dosage 1 g 3 iv 1 g x 3-4 iv 13
AMX Amoxicillin high_dosage 0.75-1 g 3 oral 0.75-1 g x 3 oral 13
AMX Amoxicillin standard_dosage 0.5 g 3 oral 0.5 g x 3 oral 13
AMX Amoxicillin uncomplicated_uti 0.5 g 3 oral 0.5 g x 3 oral 13

example_isolates: Example Data for Practice

A data set with 2 000 rows and 46 columns, containing the following column names:
date, patient, age, gender, ward, mo, PEN, OXA, FLC, AMX, AMC, AMP, TZP, CZO, FEP, CXM, FOX, CTX, CAZ, CRO, GEN, TOB, AMK, KAN, TMP, SXT, NIT, FOS, LNZ, CIP, MFX, VAN, TEC, TCY, TGC, DOX, ERY, CLI, AZM, IPM, MEM, MTR, CHL, COL, MUP, and RIF.

This data set is available in R as example_isolates, after you load the AMR package.

It was last updated on 21 January 2023 22:47:20 UTC. Find more info about the structure of this data set here.

Source

This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.

Example content

date patient age gender ward mo PEN OXA FLC AMX AMC AMP TZP CZO FEP CXM FOX CTX CAZ CRO GEN TOB AMK KAN TMP SXT NIT FOS LNZ CIP MFX VAN TEC TCY TGC DOX ERY CLI AZM IPM MEM MTR CHL COL MUP RIF
2002-01-02 A77334 65 F Clinical B_ESCHR_COLI R I I R R R R R R R R R R
2002-01-03 A77334 65 F Clinical B_ESCHR_COLI R I I R R R R R R R R R R
2002-01-07 067927 45 F ICU B_STPHY_EPDR R R R R S S S S S S R R R
2002-01-07 067927 45 F ICU B_STPHY_EPDR R R R R S S S S S S R R R
2002-01-13 067927 45 F ICU B_STPHY_EPDR R R R R R S S S S R R R
2002-01-13 067927 45 F ICU B_STPHY_EPDR R R R R R S S S S R R R R

example_isolates_unclean: Example Data for Practice

A data set with 3 000 rows and 8 columns, containing the following column names:
patient_id, hospital, date, bacteria, AMX, AMC, CIP, and GEN.

This data set is available in R as example_isolates_unclean, after you load the AMR package.

It was last updated on 27 August 2022 18:49:37 UTC. Find more info about the structure of this data set here.

Source

This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.

Example content

patient_id hospital date bacteria AMX AMC CIP GEN
J3 A 2012-11-21 E. coli R I S S
R7 A 2018-04-03 K. pneumoniae R I S S
P3 A 2014-09-19 E. coli R S S S
P10 A 2015-12-10 E. coli S I S S
B7 A 2015-03-02 E. coli S S S S
W3 A 2018-03-31 S. aureus R S R S

microorganisms.groups: Species Groups and Microbiological Complexes

A data set with 521 rows and 4 columns, containing the following column names:
mo_group, mo, mo_group_name, and mo_name.

This data set is available in R as microorganisms.groups, after you load the AMR package.

It was last updated on 14 July 2023 08:49:06 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains species groups and microbiological complexes, which are used in the clinical_breakpoints data set.

Example content

mo_group mo mo_group_name mo_name
B_ACNTB_BMNN-C B_ACNTB_BMNN Acinetobacter baumannii complex Acinetobacter baumannii
B_ACNTB_BMNN-C B_ACNTB_CLCC Acinetobacter baumannii complex Acinetobacter calcoaceticus
B_ACNTB_BMNN-C B_ACNTB_DJKS Acinetobacter baumannii complex Acinetobacter dijkshoorniae
B_ACNTB_BMNN-C B_ACNTB_NSCM Acinetobacter baumannii complex Acinetobacter nosocomialis
B_ACNTB_BMNN-C B_ACNTB_PITT Acinetobacter baumannii complex Acinetobacter pittii
B_ACNTB_BMNN-C B_ACNTB_SFRT Acinetobacter baumannii complex Acinetobacter seifertii
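
When using the breakpoints outside the package, a species may need to be matched against the groups it belongs to. The Python sketch below builds lookups in both directions from the six example rows above. How the package itself resolves groups is not described on this page, so the candidate_codes helper (own code first, then group codes) is only an assumption for illustration, and all function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# The six example rows above, as tab-separated text.
SAMPLE_TSV = (
    "mo_group\tmo\tmo_group_name\tmo_name\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_BMNN\tAcinetobacter baumannii complex\tAcinetobacter baumannii\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_CLCC\tAcinetobacter baumannii complex\tAcinetobacter calcoaceticus\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_DJKS\tAcinetobacter baumannii complex\tAcinetobacter dijkshoorniae\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_NSCM\tAcinetobacter baumannii complex\tAcinetobacter nosocomialis\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_PITT\tAcinetobacter baumannii complex\tAcinetobacter pittii\n"
    "B_ACNTB_BMNN-C\tB_ACNTB_SFRT\tAcinetobacter baumannii complex\tAcinetobacter seifertii\n"
)


def load_groups(tsv_text):
    """Return (members_of, groups_of): group -> member codes, member -> group codes."""
    members_of = defaultdict(set)
    groups_of = defaultdict(set)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        members_of[row["mo_group"]].add(row["mo"])
        groups_of[row["mo"]].add(row["mo_group"])
    return dict(members_of), dict(groups_of)


def candidate_codes(mo, groups_of):
    """List the codes to try for an isolate: its own code, then its group codes."""
    return [mo] + sorted(groups_of.get(mo, ()))


members_of, groups_of = load_groups(SAMPLE_TSV)
print(candidate_codes("B_ACNTB_PITT", groups_of))  # ['B_ACNTB_PITT', 'B_ACNTB_BMNN-C']
print(len(members_of["B_ACNTB_BMNN-C"]))           # 6
```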

microorganisms.codes: Common Laboratory Codes

A data set with 4 957 rows and 2 columns, containing the following column names:
code and mo.

This data set is available in R as microorganisms.codes, after you load the AMR package.

It was last updated on 8 July 2023 15:30:05 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains commonly used codes for microorganisms, from laboratory systems and WHONET.

Example content

code mo
1011 B_GRAMP
1012 B_GRAMP
1013 B_GRAMN
1014 B_GRAMN
1015 F_YEAST
103 B_ESCHR_COLI
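
A possible use of this table outside R is translating laboratory codes into mo codes. The Python sketch below does this with the six example rows above. It keeps the codes as text, on the assumption that laboratory codes are identifiers that should not be converted to numbers; the function names are hypothetical.

```python
import csv
import io

# The six example rows above, as tab-separated text.
SAMPLE_TSV = (
    "code\tmo\n"
    "1011\tB_GRAMP\n"
    "1012\tB_GRAMP\n"
    "1013\tB_GRAMN\n"
    "1014\tB_GRAMN\n"
    "1015\tF_YEAST\n"
    "103\tB_ESCHR_COLI\n"
)


def load_code_map(tsv_text):
    """Return a dict mapping each laboratory code (as text) to its mo code."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["code"].strip(): row["mo"].strip() for row in reader}


def translate(code, code_map):
    """Look up a laboratory code; returns None when the code is unknown."""
    return code_map.get(str(code).strip())


code_map = load_code_map(SAMPLE_TSV)
print(translate("103", code_map))   # B_ESCHR_COLI
print(translate(1015, code_map))    # F_YEAST
print(translate("9999", code_map))  # None
```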