Chinese search giant Baidu has beaten Microsoft and Google in an ongoing natural language processing (NLP) competition, thanks to the linguistic differences between Chinese and English.
Baidu’s model, called ERNIE (Enhanced Representation through kNowledge IntEgration), recorded the highest score, 90.1, just ahead of Microsoft and Google, on the General Language Understanding Evaluation (GLUE) benchmark, a platform for evaluating and analysing natural language understanding systems. The model was first developed to understand Chinese, but researchers soon found that it also understood English better.
Baidu’s ERNIE was inspired by Google’s BERT, a “masked” language model used by the US tech giant to train AI to understand human language. Google’s model hides 15 per cent of the words in each sequence and then tries to predict them based on the context.
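To make the masking idea concrete, here is a minimal Python sketch of hiding roughly 15 per cent of the words in a sequence. It is only an illustration, not Google's implementation: the function name `mask_tokens`, the `[MASK]` placeholder string, the fixed random seed and the sample sentence are all assumptions chosen for this example, and the prediction step (the trained model itself) is left out.

```python
import random

MASK = "[MASK]"  # placeholder symbol, assumed for this sketch


def mask_tokens(tokens, ratio=0.15, seed=0):
    """Hide about `ratio` of the tokens; return the masked copy and the hidden words."""
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    # Always hide at least one token, never more than the sequence holds.
    n = min(len(tokens), max(1, round(len(tokens) * ratio)))
    positions = rng.sample(range(len(tokens)), n)
    masked = list(tokens)
    targets = {}
    for i in positions:
        targets[i] = masked[i]  # the word a model would have to predict
        masked[i] = MASK
    return masked, targets


sentence = ("the quick brown fox jumps over the lazy dog "
            "near the quiet river bank today").split()
masked, targets = mask_tokens(sentence)
print(" ".join(masked))
print(targets)
```

A model trained this way sees only `masked` and is scored on how well it recovers the words stored in `targets` from the surrounding context.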
However, many Chinese characters have no inherent meaning until they are strung together with other characters, a key linguistic difference from English. The team at Baidu therefore trained its model to hide whole strings of characters that carry meaning together, rather than single characters, and to predict the masked strings.
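The difference from hiding single characters can be sketched in a few lines of Python. This is a hedged illustration rather than Baidu's code: the function name `mask_units`, the `[MASK]` placeholder, the fixed seed and the hand-made segmentation of the sample sentence into meaningful units are all assumptions. In practice the segmentation would come from a separate word or entity segmenter.

```python
import random

MASK = "[MASK]"  # placeholder symbol, assumed for this sketch


def mask_units(units, ratio=0.15, seed=0):
    """Hide whole multi-character units instead of isolated characters."""
    if not units:
        return [], []
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n = min(len(units), max(1, round(len(units) * ratio)))
    chosen = set(rng.sample(range(len(units)), n))
    masked_chars, targets = [], []
    for idx, unit in enumerate(units):
        if idx in chosen:
            targets.append(unit)                      # the string to be predicted
            masked_chars.extend([MASK] * len(unit))   # one mask per hidden character
        else:
            masked_chars.extend(unit)                 # keep visible characters as-is
    return masked_chars, targets


# Hand-segmented sample: "Harbin / is / Heilongjiang / 's / provincial capital"
units = ["哈尔滨", "是", "黑龙江", "的", "省会"]
masked_chars, targets = mask_units(units)
print("".join(masked_chars))
print(targets)
```

Because every character of a chosen unit is hidden at once, the model cannot guess a masked character from its neighbours inside the same word and has to rely on the wider sentence.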