Slide1
Sequence Variant Literature Search Tips and Tricks
Jessica Mester, MS, LCGC
Slide2
Disclosure
I am an employee of GeneDx, Inc., a wholly-owned subsidiary of OPKO Health, Inc.
Slide3
Overview
Effective use of variant nomenclature in your lit search
Where to look?
Common lit search speed bumps and how to overcome them
Slide4
Purpose of the literature search
Find articles that allow for application of ACMG criteria during variant curation
Functional studies
Case reports (de novo, segregation, phenotype, co-occurrence)
Molecular characterization
Case-control data
Evidence might be for or against pathogenicity
Better understand phenotypic spectrum
There is not one perfect literature search tool: a combination of sources might be necessary!
Slide5
Starting your search
HGMD (Human Gene Mutation Database, http://www.hgmd.cf.ac.uk/ac/index.php)
Professional ($$$) version
Public (free) version: 3 years behind, available to academic/non-profit users
Gene-specific databases
Several listed at https://grenada.lumc.nl/LSDB_list/lsdbs (Locus Specific Database List), maintained by LOVD
If you are focusing on one gene/disease, and it has a dedicated variant database, become best friends with that resource!
ClinVar: some submitters provide citations, but not all are specific to the variant
Make sure to review the article yourself to ensure your specific variant is actually present!
Slide6
Literature Search Tools
MasterMind: https://mastermind.genomenon.com/
Others in development
Slide7
Google/GoogleScholar > PubMed
PubMed: may “hit” if the variant is in the abstract or title, but that’s it!
Google
Helpful: locates variants in article text, supplemental tables
Not as helpful: also finds non-genetics related links (ATM variants: finds you ATM machines at an address resembling your variant nomenclature)
Detective work sometimes needed to figure out what article a table is from
GoogleScholar
Limited to published academic literature
Better than regular Google at finding variants in tables within a paper
Slide8
Building Google Search Terms
GENE AND ("V1" OR "V2" OR "V3" ...)
Take a “generous” approach to nomenclature – authors might not use strict HGVS format!
Include genomic position (GRCh37/hg19 still most used)
Remember alternate or “historic” gene names (STK11 = LKB1, NBN = NBS1…)
Know if alternate transcripts or nomenclature are used in addition to HGVS (standard)
MUTYH: alternate transcripts (c.1187G>A/p.G396D = c.1145G>A/p.G382D)
BRCA1/2: BIC nomenclature; nucleotide numbering starts at the beginning of the cDNA clone.
Watch for “push to the right”
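The GENE AND ("V1" OR "V2" ...) pattern from this slide can be assembled programmatically. The following is a minimal Python sketch, not a tool from the talk: `build_search_string` and its `aliases` parameter are hypothetical names, and grouping historic gene names with OR is an assumption about how one might fold them into the query.

```python
def build_search_string(gene, variant_terms, aliases=()):
    """Assemble GENE AND ("V1" OR "V2" ...) from a list of name variants.

    `aliases` (hypothetical parameter) holds historic gene names; joining
    them to the gene symbol with OR is an assumption of this sketch.
    """
    # Drop duplicate terms while keeping the order the curator supplied.
    unique_terms = list(dict.fromkeys(variant_terms))
    if not unique_terms:
        raise ValueError("at least one variant term is required")
    quoted = " OR ".join(f'"{term}"' for term in unique_terms)
    names = [gene, *aliases]
    gene_part = names[0] if len(names) == 1 else "(" + " OR ".join(names) + ")"
    return f"{gene_part} AND ({quoted})"


# MUTYH example from the slide: one variant named on two transcripts.
print(build_search_string("MUTYH", ["1187G>A", "G396D", "1145G>A", "G382D"]))
```

Passing both transcript numberings in one call keeps articles that use either naming in the same result set.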
Slide9
Finding Alternate Transcripts/Nomenclature
Might be on a laboratory report
In the VCI: appears in “basic information” of Evidence View; also on ClinVar variant page
Excellent article on this subject: PMID 30096381, DiStefano M et al. J Mol Diagn. 2018 Nov;20(6):789-801. “Curating Clinically Relevant Transcripts for the Interpretation of Sequence Variants.” (Companion talk by Dr. DiStefano on ClinGen YouTube channel)
Slide10
“Push to the Right” examples
Normal sequence: CTGCTGCTGAAAAA
Variant, deletion of one “A” (highlighted base shown in brackets): CTGCTGCTG[A]AAAA
Could be named c.25delA OR c.29delA: can’t tell which A is really deleted.
HGVS nomenclature would “push to the right”: c.29delA
Deletion of 3 base pairs in repetitive sequence: CT[GCT]GCTGAAAAA
No matter which three are deleted, surrounding sequence reads the same:
CT[GCT]GCTGAAAAA
CTGCT[GCT]GAAAAA
[CTG]CTGCTGAAAAA
Consider alternate potential nomenclatures accordingly
Position markers on the slide figure: 15, 21, 25, 29
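The “push to the right” idea can be checked on the toy sequence from this slide. The sketch below is an illustration under stated assumptions, not HGVS tooling: the function names are hypothetical, and `FIRST_BASE = 16` is inferred from the slide naming the five A's c.25 through c.29.

```python
def equivalent_deletion_starts(seq, start, length):
    """Return every 0-based start whose deletion yields the same sequence."""
    result = seq[:start] + seq[start + length:]
    return [
        s for s in range(len(seq) - length + 1)
        if seq[:s] + seq[s + length:] == result
    ]


def push_to_the_right(seq, start, length):
    """Shift a deletion as far 3' as possible without changing the result."""
    # Moving the deleted window one base right is harmless whenever the base
    # leaving the window equals the base entering it.
    while start + length < len(seq) and seq[start] == seq[start + length]:
        start += 1
    return start


SEQ = "CTGCTGCTGAAAAA"
FIRST_BASE = 16  # assumption: first base is c.16, so the A's are c.25 to c.29

# Single "A" deletion described at the first A (index 9, i.e. c.25).
shifted = push_to_the_right(SEQ, 9, 1)
print(f"c.{shifted + FIRST_BASE}del")

# 3-bp deletion described as GCT at index 2: list every equivalent placement.
for s in equivalent_deletion_starts(SEQ, 2, 3):
    print(f"c.{s + FIRST_BASE}_{s + FIRST_BASE + 2}del")
```

With these assumptions the single-base deletion shifts to c.29del, and the 3-bp deletion has seven equivalent placements (c.16_18del through c.22_24del), any of which an author might have used.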
Slide11
Lit Search Term Examples
Variant type: lit search string
Nonsense: PTEN AND ("1003C>T" OR "1003 C>T" OR "Arg335Ter" OR "R335X" OR "Arg335STOP" OR "89720852")
Missense: NF1 AND ("277T>C" OR "277 T>C" OR "Cys93Arg" OR "C93R" OR "29486100")
Frameshift: MYBPC3 AND ("1028delC" OR "1028del" OR "Thr343MetfsX7" OR "T343MfsX7" OR "Thr343Metfs*" OR "T343Mfs*" OR "47367820")
Intronic: FBN1 AND ("1148-2A>C" OR "1148-2 A>C" OR "IVS10-2A>C" OR "IVS10-2 A>C" OR "48808561")
In-frame indel: MSH6 AND ("2157_2159delTAC" OR "2157delTAC" OR "2157del3" OR "Thr720del" OR "T720del" OR "48027279")
Different amino acid change at same residue (variant of interest: TP53 His179Asn): TP53 AND ("His179*" OR "H179*"). Would NOT include “NOT His179Asn” because one article might discuss both variants.
* = wildcard operator; finds anything starting with what precedes it.
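The nonsense and missense rows above follow a regular pattern, so their term lists can be generated rather than typed. This is a hedged Python sketch: `substitution_terms` and `search_string` are hypothetical helpers, and the set of spellings simply mirrors the two example rows (it does not cover frameshift, intronic, or indel naming).

```python
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def substitution_terms(c_pos, ref, alt, aa_ref, aa_pos, aa_alt, genomic_pos):
    """List name variants for a single-nucleotide substitution.

    Pass aa_alt="Ter" for a nonsense variant; the X and STOP spellings
    mirror the PTEN row of the slide.
    """
    terms = [f"{c_pos}{ref}>{alt}", f"{c_pos} {ref}>{alt}"]
    one_letter_ref = AA3_TO_1[aa_ref]
    if aa_alt == "Ter":
        terms += [
            f"{aa_ref}{aa_pos}Ter",
            f"{one_letter_ref}{aa_pos}X",
            f"{aa_ref}{aa_pos}STOP",
        ]
    else:
        terms += [
            f"{aa_ref}{aa_pos}{aa_alt}",
            f"{one_letter_ref}{aa_pos}{AA3_TO_1[aa_alt]}",
        ]
    terms.append(str(genomic_pos))  # genomic coordinate, as in the slide rows
    return terms


def search_string(gene, terms):
    """Join terms into the GENE AND ("V1" OR "V2" ...) pattern."""
    return f"{gene} AND (" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Reproduces the missense (NF1) row of the slide.
print(search_string("NF1", substitution_terms(277, "T", "C", "Cys", 93, "Arg", 29486100)))
```

Generating the list this way makes it harder to forget a spelling, but the output still needs a curator's eye for gene-specific legacy nomenclature.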
Slide12
Google Search Results
Don’t worry, it’s not that bad
Several links: take you to websites that have extracted variants or info from ClinVar, gnomAD, other databases
Can pull up articles where MYBPC3 is mentioned, but the variant with the same nomenclature is found in a different gene
Over time: becomes easier to recognize potentially useful hits
Slide13
Finding Source Publications for Google “Hits”
Best case scenario: link takes you directly to article.
Slide14
Database “Hits”
ClinVar: typically top of search results
LOVD entry
Slide15
Detective Work Needed
Clues: name of document indicates first or last author likely to be BH Funke, published in Genetics in Medicine during 2010; “200661” might be article number
Slide16
Journal website…
Slide17
Google Scholar
Slide18
Caution: double-dipping
Same individual/family may be reported in several different articles
Occasionally recognized and previous publications cited
If not, look for…
Overlapping authors
Patient recruitment/ascertainment in article methods
Helpful clinical details (gender, ethnicity, family history…)
If it’s a rare variant, and several of these factors line up…most likely to be the same individual
Slide19
In Summary
Finding helpful literature is a learning process, skills honed with time and experience
Use available database resources to help increase efficiency
Use comprehensive nomenclature terms in Google and GoogleScholar searches
Always verify that your specific variant of interest is included in a publication
Slide20
Thank You!
ClinGen Education Working Group
Karen Wain
Danielle Azzaritti
Sarah Barnett
Lisa Kurtz
Erin Riggs
PTEN VCEP Biocurators
Felicia Hernandez
Melody Perpich
Kaitlin Sesock
Other ClinGen/Broad folks
Jenny Goldstein
Steven Harrison
Becky Siegert
Slide21
www.clinicalgenome.org
clingen@clinicalgenome.org
ClinGen is primarily funded by NHGRI through the following three grants: U41HG006834, U41HG009649, U41HG009650.