Re: [AIRG] Biomedical Text Mining with KinderMiner, 04/24


Date: Wed, 24 Apr 2019 14:50:07 +0000
From: Aubrey Barnard <barnard@xxxxxxxxxxx>
Subject: Re: [AIRG] Biomedical Text Mining with KinderMiner, 04/24
AIRG,

This afternoon Ron Stewart will be introducing us to some knowledge 
discovery techniques that he and his collaborators have found useful for 
mining PubMed and similar biomedical literature. I would argue that 
knowledge discovery and automatically building knowledge bases are 
fundamental to any AI approach that we expect to interact with humans or 
to exhibit "common sense". Just as search transformed the web, we are 
waiting to see how machine reading will transform scientific inquiry.

4pm, CS 3310

Read the KinderMiner paper first, and then maybe look at the others as 
you have interest / time:
1. KinderMiner: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543342/
2. LBD survey: https://dx.doi.org/10.1515%2Fjdis-2017-0019
3. BERT: https://arxiv.org/abs/1810.04805

Aubrey


On 4/22/19 9:50 AM, Stewart, Ron wrote:
> Hi All,
> The next AIRG is at 4PM on 4/24 in CS 3310.
> 
> I have been collaborating with a lot of biologists for several 
> years, and they often ask questions like “What genes are involved with 
> cardiomyocytes?” or “What genes are involved with WNT signaling?” or 
> maybe “What drugs might be useful for a particular disease?” So I 
> would go off to Google Scholar and comb through the literature to try 
> to find an association between some gene or drug and some biological 
> process or key phrase (such as “cardiomyocyte”). I got sick of doing 
> that. While there is extensive literature on various aspects of NLP, I 
> mostly ignored it to start, and so we built a simple method based on 
> co-occurrence to rank genes or drugs by how likely they are to be 
> involved or associated with a biological process or key phrase.
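
To make the co-occurrence idea concrete before the talk, here is a minimal sketch of how such a ranking could work. It is not the KinderMiner code. The toy abstracts, the function names, the substring matching, the 0.05 cutoff, and the use of a one-sided Fisher's exact test as the significance filter are all my assumptions for illustration; the paper is the authority on what the real method does.

```python
from math import comb


def fisher_right_tail(both, target_only, key_only, neither):
    """One-sided Fisher's exact test: P(co-occurrence count >= observed)."""
    n_target = both + target_only   # articles mentioning the target term
    n_key = both + key_only         # articles mentioning the key phrase
    total = n_target + key_only + neither
    upper = min(n_target, n_key)
    # Sum hypergeometric probabilities from the observed count upward.
    tail = sum(
        comb(n_target, k) * comb(total - n_target, n_key - k)
        for k in range(both, upper + 1)
    )
    return tail / comb(total, n_key)


def rank_targets(abstracts, key_phrase, targets, alpha=0.05):
    """Rank target terms by the share of their articles that mention key_phrase."""
    key = key_phrase.lower()
    texts = [a.lower() for a in abstracts]
    results = []
    for target in targets:
        term = target.lower()
        # Crude substring matching; a real system needs proper term matching.
        both = sum(1 for t in texts if term in t and key in t)
        target_only = sum(1 for t in texts if term in t and key not in t)
        key_only = sum(1 for t in texts if key in t and term not in t)
        neither = len(texts) - both - target_only - key_only
        if both == 0:
            continue
        p_value = fisher_right_tail(both, target_only, key_only, neither)
        if p_value <= alpha:
            ratio = both / (both + target_only)
            results.append((target, ratio, p_value))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical toy corpus, invented for this example only.
toy_abstracts = (
    ["GATA4 drives cardiomyocyte differentiation."] * 6
    + ["GATA4 expression in liver."]
    + ["TP53 and cardiomyocyte stress."]
    + ["TP53 mutation in tumors."] * 5
    + ["Unrelated abstract about soil."] * 7
)

ranked = rank_targets(toy_abstracts, "cardiomyocyte", ["GATA4", "TP53"])
for target, ratio, p_value in ranked:
    print(f"{target}: ratio={ratio:.2f}, p={p_value:.4f}")
```

In this toy corpus only GATA4 survives the filter, because TP53 shares a single abstract with the key phrase and that is well within chance. On real PubMed counts the same contingency table would be filled from an index rather than by scanning strings.
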
> 
> The first paper is our KinderMiner paper (Finn Kuusisto is the first 
> author):
> 
> AMIA Jt Summits Transl Sci Proc 
> <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543342/#>. 2017; 2017: 
> 166–174.
> Published online 2017 Jul 26.
> PMCID: PMC5543342
> PMID: 28815126 <https://www.ncbi.nlm.nih.gov/pubmed/28815126>
> 
> 
>   A Simple Text Mining Approach for Ranking Pairwise Associations in
>   Biomedical Applications
> 
> Finn Kuusisto 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Kuusisto%20F%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> PhD,1 John Steill 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Steill%20J%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> MS,1 Zhaobin Kuang 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Kuang%20Z%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> MS,2 James Thomson 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Thomson%20J%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> VMD, PhD,1,2 David Page 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Page%20D%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> PhD,2 and Ron Stewart 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Stewart%20R%5BAuthor%5D&cauthor=true&cauthor_uid=28815126>, 
> PhD1
> 
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543342/
> 
> KinderMiner provides a quick way to get an idea of what the roughly 
> 30 million articles in PubMed say about the biological process or 
> key phrase you are interested in.
> 
> ------------------------------------------------------------------------------------------------------------------------
> After going through KinderMiner (which is not math heavy and certainly 
> not very AI-ish), I wanted to talk about Serial KinderMining (SKiM). 
> SKiM is designed to look for associations across distinct and 
> separated literature domains. (Example: some drug or compound might be 
> mentioned in the nutrition literature to have some effect on a symptom. 
> This symptom might be mentioned in some medical journals as being 
> associated with a disease. However, you might not ever see a paper that 
> talks about the drug/compound and the disease together.) This falls 
> under the larger scope of Literature-Based Discovery (LBD).
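
The drug, symptom, disease chain in that example can be sketched as two chained co-occurrence lookups. This is only an illustration of the general A-B-C pattern from the LBD literature, not SKiM itself. The function names, the raw count threshold (used here in place of any statistical test), and the toy abstracts are all my assumptions. The fish oil and Raynaud pairing echoes Swanson's classic example, but every sentence below is made up.

```python
def count_together(abstracts, term_a, term_b):
    """Count abstracts that mention both terms (case-insensitive substring match)."""
    a, b = term_a.lower(), term_b.lower()
    return sum(1 for text in abstracts if a in text.lower() and b in text.lower())


def linked_terms(abstracts, anchor, candidates, min_count=2):
    """Candidates that co-occur with the anchor in at least min_count abstracts."""
    return [c for c in candidates
            if count_together(abstracts, anchor, c) >= min_count]


def abc_candidates(abstracts, a_term, b_terms, c_terms, min_count=2):
    """Find (A, B, C) chains where A-B and B-C co-occur but A and C never do."""
    found = []
    for b in linked_terms(abstracts, a_term, b_terms, min_count):
        for c in linked_terms(abstracts, b, c_terms, min_count):
            # Keep only links that no single abstract states directly.
            if count_together(abstracts, a_term, c) == 0:
                found.append((a_term, b, c))
    return found


# Hypothetical toy corpus: two "literatures" that never cite each other.
toy_abstracts = [
    "Dietary fish oil lowered blood viscosity in volunteers.",
    "Fish oil supplements and blood viscosity: a small trial.",
    "Fish oil and platelet aggregation in rats.",
    "Raynaud patients show raised blood viscosity.",
    "Blood viscosity is elevated during Raynaud attacks.",
    "Migraine and sleep quality.",
]

candidates = abc_candidates(
    toy_abstracts,
    "fish oil",
    b_terms=["blood viscosity", "platelet aggregation"],
    c_terms=["Raynaud", "migraine"],
)
print(candidates)
```

The final check that A and C never appear together is what makes a chain a candidate discovery rather than a restatement of something already published.
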
> 
> A nice review that covers some of the classic examples of LBD is:
> 
> J Data Inf Sci. 2017 Dec; 2(4): 43–64. 
> <https://www.degruyter.com/view/j/jdis.2017.2.issue-4/jdis-2017-0019/jdis-2017-0019.xml>
> doi: 10.1515/jdis-2017-0019 <https://dx.doi.org/10.1515%2Fjdis-2017-0019>
> PMCID: PMC5771422
> NIHMSID: NIHMS932897
> PMID: 29355246 <https://www.ncbi.nlm.nih.gov/pubmed/29355246>
> 
> 
>   Rediscovering Don Swanson: the Past, Present and Future of
>   Literature-Based Discovery
> 
> Neil R. Smalheiser 
> <https://www.ncbi.nlm.nih.gov/pubmed/?term=Smalheiser%20NR%5BAuthor%5D&cauthor=true&cauthor_uid=29355246>
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5771422/
> 
> 
> Again, none of this is very AI-ish. I’ll talk a little about SKiM and 
> how we are using it towards LBD about drug repurposing.
> ------------------------------------------------------------------------------------------------------------------------
> 
> I’m interested in word embeddings as a potential complement to 
> KinderMining-like approaches.
> 
> If there is any time left after that, then I would probably talk about 
> word embeddings and some recent work on context-specific word embeddings 
> such as BERT:
> 
> 
>   BERT: Pre-training of Deep Bidirectional Transformers for Language
>   Understanding
> 
> Jacob Devlin 
> <https://arxiv.org/search/cs?searchtype=author&query=Devlin%2C+J>, 
> Ming-Wei Chang 
> <https://arxiv.org/search/cs?searchtype=author&query=Chang%2C+M>, Kenton 
> Lee <https://arxiv.org/search/cs?searchtype=author&query=Lee%2C+K>, 
> Kristina Toutanova 
> <https://arxiv.org/search/cs?searchtype=author&query=Toutanova%2C+K>
> https://arxiv.org/abs/1810.04805
> 
> 
> 
> Ron
> 
> Ron Stewart, Ph.D
> Associate Director-Bioinformatics
> Regenerative Biology Laboratory
> Morgridge Institute for Research
> 608 316-4349
> rstewart@xxxxxxxxxxxxxxxxxxxxxx <mailto:rstewart@xxxxxxxxxxxxxxxxxxxxxx>
> 
> 
> 
> 
> 
