Archived Post

From listmasteranimalgenome.org  Wed Feb  1 09:07:17 2017
Return-Path: <listmasteranimalgenome.org>
X-Original-To: omia-supportanimalgenome.org
Delivered-To: omia-supportanimalgenome.org
Received: by nagrp5.ansci.iastate.edu (Postfix, from userid 501)
	id 931B2C072084; Wed,  1 Feb 2017 09:07:17 -0600 (CST)
From: Matthew McClure <mmcclureicbf.com> 
Subject: RE: Limited representation of OMIA causative mutations for cattle in SNP databases
       
Postmaster: submission approved by list moderator
To: OMIA-Supporters <omia-supportanimalgenome.org>
Date: Wed, 01 Feb 2017 09:07:17 -0600

I agree that having journals require mutations to be deposited into dbSNP
(or similar) is a nice idea, but having been on both sides of this issue, I
can say it is not easy to get your data into dbSNP (at least it was not the
last time I did).

What I would ask is that reviewers require that the authors include
sufficient flanking sequence in their papers, even if only in a
supplementary table. Personally, I can attest that a dbSNP ID is great, but
having enough flanking sequence in the paper is almost as good. When
reviewers/journals allow authors to publish a paper about causative mutation
X and provide only 5 bp of flanking sequence or 4 flanking amino acids, it
does the community no good.

As for Marina Fortes' comment that we should think about "more ways to show
the value and need of database curators, so that more of us can justify
allocating time to this important task," I fully agree. For ~120 bovine
traits I have done the work that Zhiliang describes ("comprehend, locate,
match, merge," plus BLAST, identify, ...), and it takes a lot of time and
effort.

Over the past 2 years Jen McClure and I have put together a booklet covering
all of the mutations on our custom bovine SNP chip (the IDB), so that farmers
can easily understand the genetic diseases when I start providing reports on
which genetic traits are in their herds (later this year). The farmer version
is basic, with ~1 paragraph describing the disease/trait, a list of known
common sires that carry the mutation, and so on. The extended version
contains the farmer information, ~1 paragraph of vet/scientist description,
OMIA, references, genomic information (e.g. p.W345L) and ~100 bp of flanking
sequence. (A screenshot of HH1 is below.)

I've offered this information to Frank for OMIA, and he hopes to incorporate
it. I will eventually publish it in a journal along with allele frequency
data in Irish cattle (we have a lot genotyped). Until then, if the community
wishes to use this resource, you can find it at
https://www.icbf.com/wp/?page_id=2170 under the Genetic Disease section
(choose the Clinician and Research edition). Although the booklet does not
list authors, if you reference it please cite Jennifer McClure and me as the
authors of this work. The booklet only lists information on the diseases that
we have validated probes for, but I do have some information on genetic
traits that are in the validation pipeline.

So yes, it’s a good idea to ask journal editors to require submission to
dbSNP, dbVAR, etc., but I would also ask the community, when acting as
reviewers, to require authors to provide enough flanking sequence.

Sincerely,

Matthew McClure, PhD Geneticist

Irish Cattle Breeding Federation Highfield House, Shinagh, Bandon, Co. Cork,
P72X050 Ireland

O: +353 (0)23 882-0153 F: +353 (0)23 882-0229 C: +353 (0)87 238-8327 SKYPE:
matthew.mcclure8


-----Original Message-----
.From: Hu, Zhiliang [AN S] [mailto:zhuiastate.edu]
.Sent: 01 February 2017 03:32
.To: OMIA-Supporters
.Cc: Hu, Zhiliang [AN S]
.Subject: Re: Limited representation of OMIA causative mutations for cattle in SNP
databases

I completely agree with Carrie that journals requiring data deposition into
a well-maintained public database as a prerequisite for publication is an
essential first step, not only to make the data available to the public, but
also to let the databases populate and re-synthesize the data and make them
more useful under a federated database concept. In this case, the federation
is between OMIA and dbSNP. Many people, including the authors of the Korean
paper that Frank commented on, may not fully realize what it takes to move
information across databases.

For example, in the case of Animal QTLdb, it took a long time to convince
journal editors that such a requirement would benefit the whole community in
terms of data reuse, meta-analysis, structured analysis, networked analysis,
etc. Thanks to the understanding and vision of the editors of Animal Genetics
and the Journal of Animal Science, we are fortunate to have their endorsement
in requiring QTL/association data deposition as a prerequisite for
publication. Thanks also to the team at Iowa State University, led by Jim
Reecy, Max Rothschild, Susan Lamont and Chris Tuggle, for having successfully
pursued this. We need to convince more journals to do the same.

Going one step further: suppose the data are volunteered into a database with
each publication -- does this information get integrated automatically? A
common perception is that, for example, when you have a phenotype located on
a genome map (be it cytogenetic, linkage or physical) and a bunch of SNPs in
the same genome, some cyber magic will automatically give you a list of SNPs
mapped to the phenotype. That would be an ideal world, if you are lucky. When
this does not happen automatically, whom do you ask? Where do you go for an
answer? There is an easily neglected aspect of database development: data
curation. Data curation takes time, needs manpower and, most importantly,
requires someone with the knowledge and skills to comprehend, locate, match,
merge, and synthesize information from different sources. In addition, the
job can be tedious, time-consuming, and challenging in terms of keeping up
with emerging sciences/technologies. We need to convince the funding agencies
to invest in this professional manpower for post-publication data curation,
along with tool builders who can make the process efficient and effective.
Frank has been doing exceptionally outstanding work in OMIA development and
data curation. We need more people to follow and support him.

Thirdly, only through effective data integration and exchange between
specialized databases can information become more useful and knowledge extend
beyond the horizon. Again, the example can be between OMIA and dbSNP (more
such examples exist, such as between OMIA and Uberon, QTLdb, NCBI Gene DB,
dbVAR, ...), linking information in a number of different dimensions.

This message has become too long, so I will stop here.

The Korean paper in the coming issue of Animal Genetics is well timed,
although it could have been better geared towards a practical approach to
solving the problem.

Zhiliang


-----Original Message-----
.From: Carrie Finno <cjfinnogmail.com>
.Date: Tue, Jan 31, 2017 at 03:15 PM
.To: OMIA-Supporters <omia-supportanimalgenome.org>
.Subject: Re: Limited representation of OMIA causative mutations for cattle in SNP
databases
Resent-From: <omia-supportanimalgenome.org>
Resent-Date: Tue, Jan 31, 2017 at 03:15 PM

Dear group,

I agree that it is problematic that these variants are not being uploaded
to open-access databases. Unfortunately, I think that this will only be
solved once journals *require* the upload prior to publication. This is how
next-gen sequence data is getting uploaded to SRA now... you cannot even
submit a paper unless you have already uploaded your data to NCBI SRA. So,
although uploading to SRA is quite a time-consuming task (as is dbSNP), it
will get done in order to submit a manuscript.

So, perhaps we need to rally to encourage journal editors to require that
any new putative functional variants be uploaded to dbSNP in the same manner?

I look forward to further discussion.

Best,

Carrie


---------------------------------------------------------------------------
Carrie Finno, DVM, PhD
Diplomate, ACVIM
Assistant Professor, Veterinary Genetics University of California-Davis
280 Vet Med II
One Shields Ave
Davis, CA 95616
(530)-752-2739
---------------------------------------------------------------------------

On Sat, Jan 28, 2017 at 8:38 AM, Frank Nicholas wrote:

> Dear OMIA colleagues,
>                 The following correspondence is self-explanatory.
>                 I look forward to some fruitful discussion and,
> ultimately, to resolving the problem.
>                 Regards
>                 Frank
>
> .From: Frank Nicholas
> .Sent: Saturday, January 28, 2017 12:10 PM
> .Subject: RE: Excerpt from: Animal Genetics Content Alert (New
> Articles)
>
> Dear Suzanne
>                 Thank you very much for circulating this
> just-published note in Animal Genetics.
>                 To everyone included in Suzanne’s email:
> I spotted this note last week, and wrote immediately to its senior
> authors (who have been cc’d in this message). In my message to the
> authors, I thanked them for highlighting a problem that has been
> occupying my mind for several years, namely that many likely causal
> variants in OMIA are not included in dbSNP or any other variant database.
>                 Because the Korean team has raised such an important
> issue, and because the discussion needs to include a wider group than
> is included in this email list, I suggest that we transfer this
> discussion to the OMIA Support Group discussion list
> http://www.animalgenome.org/...munity/omia-support/ , kindly set up
> a few years ago by Zhiliang Hu.
> If you are not on this list and wish to continue with this
> conversation, please join up!
> Zhiliang and Jim Reecy (who are also cc’d in this message) have very
> generously led two grant proposals to the USDA for funding OMIA
> enhancements; both were near-misses. Solving the problem highlighted
> in the Animal Genetics note is one of three projects that comprised
> the USDA grant proposals (the other two being text mining and ontologies).
>                 For several years I have been compiling manually a
> table of OMIA likely causal variants, aiming to provide for each
> variant its location on the relevant current assembly. For any variant
> that has been entered in dbSNP, this is an easy task. For the many
> that have not been so entered, it can be very time-consuming to dig
> out the relevant information, especially for variants published in the
> pre-assembly era. Ensembl’s Variant Effect Predictor (VEP)
> http://asia.ensembl.org/info/docs/tools/vep/index.html has proved to
> be very useful, enabling the table to be populated with relevant
> information on-the-fly via REST APIs.
> My plan has been to place an abbreviated form of this table on the
> OMIA web site, highlighting those variants that are not in dbSNP or
> Ensembl Variation, hoping that this will stimulate authors to submit
> their variant to one of the databases.
>                 Also, in recent times, whenever anyone publishes a new
> likely causal variant, I write to them, asking if they would submit
> the variant to a database, and explaining about the OMIA table.
> Interestingly, I am often told that entering single variants in dbSNP
> is a very tedious business and, consequently, authors are often
> reluctant to make the submission. Having never submitted a variant to
> any database, I have no first-hand experience. But if anyone in this
> email list has had some experience, it would be very helpful to hear
> from you via the OMIA Support Group discussion list.
> http://www.animalgenome.org/...munity/omia-support/
>                 Another strategy on the OMIA to-do list is to ask
> journal editors to require any likely causal variant to be submitted
> to a variant database prior to publication.  Zhiliang and Jim and I
> have also planned to ask editors to require relevant OMIA IDs to be
> included in any paper publishing a likely causal variant.
>                 There’s more to be said, but I’ll leave further
> discussion to the OMIA Support Group discussion list.
>                 Suffice to say that I welcome the Korean team
> highlighting this problem, and, with the support of people in this
> list and of the wider OMIA community, I am optimistic that we will be
> able to work with colleagues at NCBI and Ensembl to solve it!
>                 Regards
>                 Frank
>
> .From: Hubbard, Suzanne [mailto:Suzanne.HubbardARS.USDA.GOV]
> .Sent: Saturday, January 28, 2017 1:04 AM
> .Subject: Excerpt from: Animal Genetics Content Alert (New Articles)
>
> Animal Genetics
> © Stichting International Foundation for Animal Genetics
>
> Early View http://onlinelibrary.wiley.com/...urnal/10.1111/(ISSN)
> 1365-2052/earlyview?campaign=wolearlyview (Online Version of Record
> published before inclusion in an issue)
>
> These Early View articles are now available on Wiley Online Library
> http://onlinelibrary.wiley.com?campaign=wolearlyview
>
> Brief Notes
>
> Limited representation of OMIA causative mutations for cattle in SNP
> databases http://onlinelibrary.wiley.com/...i/10.1111/age.12534/
> abstract?campaign=wolearlyview
> Aditi Sharma, Yongmin Cho, Bong-Hwan Choi, Han-Ha Chai, Jong-Eun Park
> and Dajeong Lim Version of Record online: 24 JAN 2017 | DOI:
> 10.1111/age.12534