Pesticideinfo.org is the largest pesticide-related database in the world, and in 2003 it earned an award from the US EPA. Here’s an example of the chemical page for Atrazine.
Though the look of the site is now quite dated (it was built 8 years ago and is no longer updated), it’s both fast and reliable, even running on older equipment. This is due to extensive optimization, both in the data design and in MySQL tuning.
I was the Co-Director and technical lead for this project and wrote all front-end and back-end code.
The data include over 100 distinct datasets collected from authoritative sources around the globe, including EPA, UN and European datasets. These data are imported and harmonized to create a fully cross-referenced dataset that can be queried and used in a broad variety of ways.
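To make the idea of harmonizing and cross-referencing concrete, here is a minimal sketch in Python. It is not the project's actual import code (that was written largely in SQL). It assumes, purely for illustration, that each source identifies a chemical by a CAS registry number that may be formatted inconsistently; the function names, source names and field names are all hypothetical.

```python
from collections import defaultdict


def normalize_cas(raw):
    """Reduce a CAS number to canonical form, e.g. '0001912-24-9' -> '1912-24-9'.

    Assumption: sources differ only in hyphenation and zero padding.
    """
    digits = "".join(ch for ch in raw if ch.isdigit()).lstrip("0")
    if len(digits) < 4:
        raise ValueError(f"not a plausible CAS number: {raw!r}")
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def harmonize(sources):
    """Merge records from several sources into one cross-referenced mapping.

    `sources` maps a (hypothetical) source name to a list of record dicts,
    each with a 'cas' key. The result maps the canonical CAS number to
    {field: {source_name: value}}, so every value stays traceable to the
    source it came from.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            key = normalize_cas(record["cas"])
            for field, value in record.items():
                if field != "cas":
                    merged[key].setdefault(field, {})[source_name] = value
    return dict(merged)


# Illustrative input only: placeholder sources and values.
example = harmonize({
    "source_a": [{"cas": "1912-24-9", "name": "Atrazine"}],
    "source_b": [{"cas": "0001912-24-9", "status": "example value"}],
})
```

Keeping the per-source values side by side, rather than overwriting them, is one way to leave conflicting facts visible so they can be researched and resolved by hand.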
Because we imported data from so many sources, the scripts to load, validate and process the data were extensive, and included over 40 pages of SQL code alone (SQL is normally quite terse). You can read more on the data here. The diagram below shows part of the database schema for this project.
Chemical data are harmonized across many sources, and questions of fact are carefully researched by Dr. Susan Kegley, now CEO and Principal at The Pesticide Research Institute. The quality of the dataset meant our top users at the time were the US EPA and the US Military, and today it’s used as a reference resource by scientists from around the world.
Toxicity data from around the globe
To communicate complex toxicity data, we created a scorecard that reflects toxicity assessments from dozens of sources.
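As a rough illustration of how a scorecard can condense many assessments, the Python sketch below keeps the most severe rating reported for each toxicity category. This is an assumed aggregation rule, not necessarily the one the site uses; the rating scale, category names and function name are hypothetical.

```python
# Hypothetical rating scale, ordered from least to most severe.
SEVERITY = ["unknown", "low", "moderate", "high"]


def scorecard(assessments):
    """Summarize (source, category, rating) tuples as one rating per category.

    Assumed rule: the most severe rating from any source wins.
    """
    card = {}
    for _source, category, rating in assessments:
        if rating not in SEVERITY:
            raise ValueError(f"unrecognized rating: {rating!r}")
        current = card.get(category, "unknown")
        if SEVERITY.index(rating) > SEVERITY.index(current):
            card[category] = rating
        else:
            # Record the category even when the rating adds nothing new.
            card.setdefault(category, current)
    return card


# Illustrative input only: placeholder sources and ratings.
summary = scorecard([
    ("source_a", "acute toxicity", "low"),
    ("source_b", "acute toxicity", "high"),
    ("source_a", "carcinogenicity", "unknown"),
])
```

Taking the most severe rating is a conservative choice; a real scorecard might instead weight sources by authority or show the disagreement explicitly.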