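Speaker: Professor Liang Liu (劉亮), Department of Statistics and Institute of Bioinformatics, University of Georgia
Title: Species delimitation using machine learning
Time: May 16, 2024, 9:30-10:30 a.m.
Venue: Room 620, Analysis and Testing Center
Hosts: School of Life Sciences; Provincial Key Laboratory of Comparative Genomics for Universities; Jiangsu International Joint Research Center for Genomics; Institute of Science and Technology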
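About the speaker:
Liang Liu is a professor in the Department of Statistics and the Institute of Bioinformatics at the University of Georgia, USA. He is one of the founders of new species-tree methods in the international field of molecular phylogenomics, and he received the 2008 outstanding research award of the Society of Systematic Biologists. He has long served as a reviewer for international journals including Systematic Biology, Bioinformatics, Journal of Mathematical Biology, Molecular Biology and Evolution, and Molecular Ecology. He has published more than 80 papers in international journals such as Science, PNAS, Systematic Biology, Molecular Biology and Evolution, and Bioinformatics; his papers have been cited roughly 35,000 times in total, and his most-cited paper has more than 24,000 citations. He also serves as a second-round reviewer for the U.S. National Science Foundation.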
Abstract:
In the realm of biology, species are identified through a classification system that groups together organisms with shared traits and the ability to reproduce with each other. There's a keen interest in understanding how species are defined and whether their evolutionary roots can be traced through genetic sequences. Various techniques are employed for species identification, including Automatic Barcode Gap Discovery (ABGD), the Generalized Mixed Yule Coalescent (GMYC) method, and the Poisson Tree Processes (PTP) method. Yet these methods come with drawbacks, such as long running times and difficulty handling large datasets. In this talk, we delve into employing supervised machine learning techniques such as CatBoost, XGBoost, Classification Tree, Support Vector Machine, and K-Nearest Neighbors. Additionally, we explore unsupervised machine learning with K-means clustering and a deep learning approach using neural networks for species delimitation. We examine five distinct species trees as our test cases. Our findings reveal that the supervised machine learning models are more accurate than the unsupervised machine learning and deep learning models.