I mean, if somebody includes the name of a gene in their text, the goal is to annotate the gene mention in the text of the paper (and then use that annotation). It's often nontrivial to extract the meaning from the text without a human expert reading it. A typical case is a reference to a gene by name where the paper's authors really meant the transcript or the protein product.
The semantic parsing part is an annotation process. Making accurate annotations takes AI algorithms, because a huge amount of context is required.
So if authors instead marked up their papers so that each mentioned entity's semantic meaning was explicit, it would be much easier for an AI that scans all papers and generates hypotheses.
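To make the idea concrete, here's a minimal sketch of what author-supplied markup could buy you. The tag name and the identifier schemes (HGNC for the gene, UniProt for the protein product) are illustrative assumptions, not an existing publishing standard; the point is just that with explicit markup, disambiguation becomes a trivial pattern match instead of a context-heavy inference problem.

```python
import re

# Hypothetical inline markup an author might add; the <entity> tag and
# the HGNC/UniProt identifier schemes are illustrative assumptions.
text = (
    'We measured <entity id="HGNC:11998" type="gene">TP53</entity> '
    'expression and levels of the '
    '<entity id="UniProt:P04637" type="protein">p53</entity> protein.'
)

# With explicit markup, "extraction" needs no AI at all -- gene vs.
# transcript vs. protein is stated by the author, not guessed.
pattern = re.compile(r'<entity id="([^"]+)" type="([^"]+)">([^<]+)</entity>')
annotations = [
    {"id": eid, "type": etype, "mention": mention}
    for eid, etype, mention in pattern.findall(text)
]
```

Here the TP53 gene and the p53 protein resolve to different identifiers even though a human reader might loosely call both "the gene."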
I'm thinking of a less ambitious angle for AI, or at least for parsing algorithms: cross-referencing commonly researched subjects and methods. If, say, a certain method seems to yield less-than-ideal results, it would be nice to know whether someone else figured out the problem well in advance of any laboratory work. Feeding that sort of information into a computer would be dead simple, since I assume most methods are easily categorized as it is.
No, I don't think it's dead simple to take methods and compare them. The problem is that most method details are implicit: papers leave out many of the aspects that are required to replicate a study.
Thank you very much for unpacking it and all the best in your career in Biology! I am in tech now, but studied Biology at Texas A&M for undergrad so hearing the words in your response reminded me of the good ol' days!