West Nile virus genomic data from California

Summary: Genomic epidemiology of West Nile virus in California. Data here.

We are sequencing West Nile virus from California, with an emphasis on San Diego, Kern, and Sacramento/Yolo counties, to understand 1) how the virus spreads between regions, 2) how it is maintained locally between seasons, and 3) which factors promote local outbreaks. Our goal is to generate 600-700 new West Nile virus genomes from infected birds and mosquitoes collected from 2004 to 2017.

The table below shows the number of sequenced genomes by county. Pierce and Spokane are counties in Washington State; all others are in California.

County Sequence Count
Kern 189
SanDiego 179
Sacramento 144
Yolo 35
LosAngeles 18
Fresno 13
Stanislaus 10
Kings 8
Butte 8
Tulare 6
SanBernardino 5
Pierce 4
Yuba 3
Spokane 3
Shasta 3
Placer 3
Lake 3
ContraCosta 3
Sutter 2
Riverside 2
Merced 2
Alameda 2
Solano 1
SanJoaquin 1
Madera 1
Calaveras 1
Total 649
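As a quick sanity check on the table above, the following Python sketch parses the per-county counts and confirms that they add up to the stated total. The counts are copied from the table; the `FOCAL` list and the function name `parse_counts` are illustrative choices, not part of the repository.

```python
# Per-county counts copied from the table above (label, then genome count).
TABLE = """\
Kern 189
SanDiego 179
Sacramento 144
Yolo 35
LosAngeles 18
Fresno 13
Stanislaus 10
Kings 8
Butte 8
Tulare 6
SanBernardino 5
Pierce 4
Yuba 3
Spokane 3
Shasta 3
Placer 3
Lake 3
ContraCosta 3
Sutter 2
Riverside 2
Merced 2
Alameda 2
Solano 1
SanJoaquin 1
Madera 1
Calaveras 1
"""

# Counties the project emphasizes (hypothetical constant name).
FOCAL = ["SanDiego", "Kern", "Sacramento", "Yolo"]


def parse_counts(text):
    """Turn 'County N' lines into a {county: count} dictionary."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        county, count = line.rsplit(maxsplit=1)
        counts[county] = int(count)
    return counts


counts = parse_counts(TABLE)
total = sum(counts.values())
focal_total = sum(counts[c] for c in FOCAL)

print(f"counties: {len(counts)}, total genomes: {total}")
print(f"focal counties: {focal_total} ({100 * focal_total / total:.1f}% of total)")
```

The same parsing approach should work on an updated copy of the table as new data are uploaded, provided each row keeps the label-then-count layout.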

The samples from San Diego County were provided by Nikos Garfield and Saran Grewal from the San Diego County Vector Control Program. The samples from all other counties in California were provided by Chris Barker, Sarah Wheeler, and Ying Fang (University of California, Davis).

The samples from Washington (WA) were provided by Krisztian Magori from Eastern Washington University, and by Amy Salamone, Wayne Clifford, and David Kangiser from the Washington State Department of Health.

Sequencing is being performed with an amplicon-based scheme using Primal. Our full protocol is available online here. Briefly, after preparing cDNA from WNV samples, we generate 38 amplicons in two PCR reactions; each amplicon is ~400 bp long and overlaps its neighbors by ~100 bp. The two amplicon pools are then combined and subjected to standard Illumina library preparation. Sequencing data are processed using the iVar pipeline, which will be released soon.
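The tiling arithmetic behind this scheme can be sketched in a few lines of Python. This is an idealized illustration, not the actual primer design: it assumes every amplicon is exactly 400 bp with exactly 100 bp of overlap, and it assumes that alternating amplicons are assigned to the two PCR reactions so that overlapping neighbors are never amplified together (a common design for tiled schemes; the protocol linked above is the authority). The function and variable names are hypothetical.

```python
def tile_amplicons(n_amplicons=38, length=400, overlap=100):
    """Return idealized (start, end, pool) tuples, 0-based, end-exclusive."""
    step = length - overlap  # each amplicon starts 300 bp after the previous one
    tiles = []
    for i in range(n_amplicons):
        start = i * step
        end = start + length
        pool = 1 if i % 2 == 0 else 2  # assumption: alternate between the two reactions
        tiles.append((start, end, pool))
    return tiles


def pool_has_overlap(tiles, pool):
    """Check whether any two consecutive amplicons within one pool overlap."""
    in_pool = [t for t in tiles if t[2] == pool]
    return any(a[1] > b[0] for a, b in zip(in_pool, in_pool[1:]))


tiles = tile_amplicons()
span = tiles[-1][1] - tiles[0][0]  # total length covered by the tiling

print(f"amplicons: {len(tiles)}, covered span: {span} bp")
print(f"pool 1 overlaps: {pool_has_overlap(tiles, 1)}, "
      f"pool 2 overlaps: {pool_has_overlap(tiles, 2)}")
```

Under these idealized numbers the tiling spans 38 x 400 - 37 x 100 = 11,500 bp, which is in line with a West Nile virus genome of roughly 11 kb given that the amplicon length and overlap are approximate.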

Stay tuned for new data uploads and preliminary analysis.

Please note that the draft_consensus_sequences folder contains genomes that we are still processing to exclude potential contamination.

Disclaimer. Please note that this data is still based on work in progress and should be considered preliminary. If you intend to include any of these data in publications, please let us know – otherwise please feel free to download and use without restrictions. We have shared this data with the hope that people will download and use it, as well as scrutinize it so we can improve our methods and analyses. Please contact us if you have any questions or comments – we’ll buy beers for #ResearchParasites that spot flaws and faults in the data and come up with improvements!

Andersen Lab
The Scripps Research Institute
La Jolla, CA, USA
[email protected]