[AIRG] Biomedical Text Mining with KinderMiner, 04/24


Date: Mon, 22 Apr 2019 14:50:14 +0000
From: "Stewart, Ron" <RStewart@xxxxxxxxxxxxx>
Subject: [AIRG] Biomedical Text Mining with KinderMiner, 04/24
Hi All,
   The next AIRG is at 4PM on 4/24 in CS 3310.

    I have been collaborating with a lot of biologists for several years, and they often ask questions like "What genes are involved with cardiomyocytes?" or "What genes are involved with WNT signaling?" or maybe "What drugs might be useful for a particular disease?"  So I would go off to Google Scholar and comb through the literature, trying to find an association between some gene or drug and some biological process or key phrase (such as "cardiomyocyte").  I got sick of doing that.  While there is extensive literature on various aspects of NLP, I mostly ignored it to start, and we built a simple method, based on co-occurrence, to rank genes or drugs by how likely they are to be involved with or associated with a biological process or keyphrase.

The first paper is our KinderMiner paper  (Finn Kuusisto is the first author):

. 2017; 2017: 166–174.
Published online 2017 Jul 26.
PMCID: PMC5543342
PMID: 28815126

A Simple Text Mining Approach for Ranking Pairwise Associations in Biomedical Applications

Finn Kuusisto, PhD,1 John Steill, MS,1 Zhaobin Kuang, MS,2 James Thomson, VMD, PhD,1,2 David Page, PhD,2 and Ron Stewart, PhD1


KinderMiner provides a quick way to get an idea of what the roughly 30 million articles in PubMed say about the particular biological process or keyphrase you are interested in.
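As a rough illustration of the co-occurrence idea (a sketch only, not KinderMiner's actual scoring, which works from real PubMed article counts and a statistical test; every count and gene name below is made up):

```python
TOTAL_ARTICLES = 30_000_000  # rough size of PubMed, per the text above

def cooccurrence_score(n_both, n_term, n_keyphrase, n_total=TOTAL_ARTICLES):
    """Ratio of observed co-occurrence to what independence would predict.
    A score well above 1 suggests the term and keyphrase are associated."""
    expected = n_term * n_keyphrase / n_total
    return n_both / expected if expected else 0.0

# Hypothetical counts: gene -> (articles with gene AND keyphrase, articles with gene)
counts = {
    "GATA4": (400, 5_000),    # strongly co-occurs with the keyphrase
    "TP53":  (50, 90_000),    # common gene, little specific co-occurrence
}
n_keyphrase = 20_000  # hypothetical count of articles mentioning "cardiomyocyte"

ranked = sorted(counts,
                key=lambda g: cooccurrence_score(counts[g][0], counts[g][1], n_keyphrase),
                reverse=True)
# "GATA4" ranks above "TP53": it co-occurs far more often than chance predicts
```

A real system would also need a significance test (the gene might be rare) and careful query construction, but the ranking idea is just this ratio.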

------------------------------------------------------------------------------------------------------------------------
After going through KinderMiner (which is not math heavy and certainly not very AI-ish), I wanted to talk about Serial KinderMining (SKiM).  SKiM is designed to look for associations across distinct, separated literature domains.  (Example: some drug or compound might be mentioned in the nutrition literature as having some effect on a symptom.  That symptom might be mentioned in medical journals as being associated with a disease.  However, you might never see a paper that discusses the drug/compound and the disease together.)  This falls under the larger scope of Literature-Based Discovery (LBD).
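The A-to-B-to-C linking behind this kind of LBD can be sketched with a toy in-memory index (article IDs below are invented, and the terms are just Swanson's classic fish-oil/Raynaud's example from the LBD literature; the real system queries PubMed at scale):

```python
# Toy literature index: term -> set of article IDs mentioning that term.
literature = {
    "fish oil":          {1, 2, 3},       # A: drug/compound domain
    "blood viscosity":   {2, 3, 7, 8},    # B: symptom/intermediate
    "raynauds disease":  {7, 8, 9},       # C: disease domain
}

def cooccur(term_a, term_b, index):
    """Two terms co-occur if at least one article mentions both."""
    return bool(index[term_a] & index[term_b])

def abc_links(a, c, index):
    """Find intermediate B terms linking A to C,
    even when no single article mentions A and C together."""
    return [b for b in index
            if b not in (a, c)
            and cooccur(a, b, index)
            and cooccur(b, c, index)]
```

Here "fish oil" and "raynauds disease" share no article, yet "blood viscosity" bridges them. That bridging step, applied serially over ranked KinderMiner results, is the intuition behind SKiM as I understand it from the description above.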

A nice review that covers some of the classic examples of LBD is:

PMCID: PMC5771422
NIHMSID: NIHMS932897
PMID: 29355246

Rediscovering Don Swanson: the Past, Present and Future of Literature-Based Discovery



Again, none of this is very AI-ish.  I'll talk a little about SKiM and how we are using it for LBD aimed at drug repurposing.
------------------------------------------------------------------------------------------------------------------------

I'm interested in word embeddings as a potential complement to KinderMining-like approaches.

If there is any time left after that, I will probably talk about word embeddings and some recent work on context-specific word embeddings, such as BERT:

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding




Ron

Ron Stewart, Ph.D.
Associate Director-Bioinformatics
Regenerative Biology Laboratory
Morgridge Institute for Research
608 316-4349




