Nice piece from @NathanBenaich
"Covid-19 is so new and complex that the data needed to train AI to combat it does not exist"
^ a common issue in all of drug discovery, not just Covid-19... https://twitter.com/NathanBenaich/status/1307638896403132416
We need more few-shot learners in bio + chem to make these problems tractable even when large datasets don’t exist or can’t be readily generated (a big focus for us @invivo_ai)
Groups like @RecursionPharma are leading the charge on data generation and showing compelling evidence of these approaches working in practice: https://twitter.com/recursionchris/status/1307729228582936576?s=21
But what if we can’t scale the biology or chemistry to the constraints of existing deep learning algos?
We’ve been conditioned to focus on the data piece & to assume that bigger data = better prediction. A useful heuristic, but not the full story
In drug discovery, we need new ML approaches built *specifically* for small / sparse datasets, closely integrated with strategies for data augmentation + active learning (smart data vs big data)
Seeing strong evidence in current pharma collabs that this combo can quickly unlock previously intractable problems for ML in drug discovery