NTPU AACSB EMI x FGFC x IM Invited Talk 2
Time: 2022/11/23 (Wednesday) 9:10-12:00
Title: Matching Texts with Data for Evidence-based Information Retrieval
Speaker: Prof. Makoto P. Kato, University of Tsukuba, Japan. https://www.mpkato.net/
In-Person and Online (Hybrid): (Google Meet: https://meet.google.com/miy-fbif-max )
In-Person Place: B8F40, National Taipei University
Registration: https://docs.google.com/forms/d/14GJH9Ww8ifaQHYCFEUknZ3_x2zJpFNc8W5mQVds9y0o/viewform
Organizer: NTPU AACSB EMI x FGFC x IM
Host: Prof. Min-Yuh Day, National Taipei University
Biography
Makoto P. Kato received his Ph.D. from the Graduate School of Informatics, Kyoto University (Yoshidahonmachi, Sakyo Ward), in 2012. He is currently an associate professor in the Faculty of Library, Information and Media Science, University of Tsukuba, Japan. In 2008, he and his co-authors received the WISE 2008 Kambayashi Best Paper Award for the article ‘Can Social Tagging Improve Web Image Search?’. In 2010, he served as a JSPS Research Fellow of the Japan Society for the Promotion of Science. From 2010 to 2012, he also completed two internships at Microsoft Research Asia (both under the supervision of Dr. Tetsuya Sakai, one in the WIT group and one in the WSM group) and an internship at Microsoft Research (under the supervision of Dr. Susan Dumais in the CLUES group). From 2012, he worked as an assistant professor at the Graduate School of Informatics, Kyoto University, Japan, and from 2019 as an associate professor there. His research interests include information retrieval, web mining, and machine learning. He is a member of the Knowledge Acquisition System Laboratory (Kato Laboratory), University of Tsukuba, Japan.
Abstract
We now face the problem of misinformation and disinformation on the Web, and search engines struggle to retrieve reliable information from the vast amount of Web data. One possible solution to this problem is to find reliable evidence on the Web that supports a claim. But what is “reliable evidence”? It can include authorities’ opinions, scientific papers, or the wisdom of crowds. However, such evidence is sometimes subjective, since it is produced by people.
This talk discusses several approaches that incorporate another, highly objective type of evidence, namely numerical data, for reliable information access.
(1) Entity Retrieval based on Numerical Attributes
Entity retrieval is the task of retrieving entities for a given text query, and it is usually based on text matching between the query and entity descriptions. Our recent work attempts to match the query against the numerical attributes of entities and to produce explainable rankings. For example, our approach ranks cameras by numerical attributes such as resolution, f-number, and weight in response to queries such as “camera for astrophotography” and “camera for hiking”.
(2) Data Search
When people encounter suspicious claims on the Web, data can be a reliable source for fact checking. NTCIR Data Search is an evaluation campaign that aims to foster data search research by developing an evaluation infrastructure and organizing shared tasks for data search. This talk introduces the first test collection for data search and some findings from it.
(3) Data Summarization
While the data search project aims to develop a data search system that helps end users make decisions based on data, it is still difficult for users to interpret data quickly. Data summarization techniques are therefore also necessary to enable users to incorporate data into their information-seeking process. This talk presents recent techniques for automatic visualization and text-based data summarization.