Pre-Training Tasks For Embedding-Based Large-Scale Retrieval: Difference between revisions

From statwiki
Jump to navigation Jump to search
Line 9: Line 9:


===Contributions of this paper===
===Contributions of this paper===
*Introduce a set of tasks that model natural feedback from a teacher and hence assess the feasibility of dialog-based language learning.
*Placholder
*Evaluated some baseline models on this data and compared them to standard supervised learning.
*Placholder
*Introduced a novel forward prediction model, whereby the learner tries to predict the teacher’s replies to its actions, which yields promising results, even with no reward signal at all
*Placholder
 
Code for this paper can be found on Github:https://github.com/facebook/MemNN/tree/master/DBLL


==Background on  Memory Networks==
==Background on  Memory Networks==

Revision as of 20:40, 17 November 2020

Contributors

Pierre McWhannel wrote this summary with editorial and technical contributions from STAT 946 fall 2020 classmates. The summary is based on the paper "Pre-Training Tasks for Embedding-Based Large-Scale Retrieval" which was presented at ICLR 2020. The authors of this paper are Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar [1].

Introduction

This problem domain which the paper is centered on is large-scale query-document retrieval. This problem is: given a query/question to collect relevant documents which contain the answer(s) and identify the span of words which contain the answer(s). This problem is often separated into two steps 1) a retrieval phase which reduces a large corpus to a much smaller subset and 2) a reader which is applied to read the shortened list of documents and score them based on their ability to answer the query and identify the span containing the answer, these spans can be ranked according to some scoring function. In the setting of open-domain question answering the query, [math]\displaystyle{ q }[/math] is a question and the documents [math]\displaystyle{ d }[/math] represent a passage which may contain the answer(s). The focus of this paper is on the retriever in open-domain question answering, in this setting a scoring function [math]\displaystyle{ f: \mathcal{X} \times \mathcal{Y} \rightarrow \mathbb{R} }[/math] is utilized to map [math]\displaystyle{ (q,d) \in \mathcal{X} \times \mathcal{Y} }[/math] to a real value which represents the score of how relevant the passage is to the query. We desire high scores for relevant passages and and low scores otherwise and this is the job of the retriever. Certain characteristics desired of the retriever are high recall since the reader will only be applied to a subset meaning there is an an opportunity to identify false positive but not false negatives. The other characteristic desired of a retriever is low latency. The popular approach to meet these heuristics has been the BM-25 [2] which is relies on token-based matching between two high-dimensional and sparse vectors representing [math]\displaystyle{ q }[/math] and [math]\displaystyle{ d }[/math] this approach performances well and retrieval can be performed in time sublinear to the number of passages with inverter indexing. The downfall of this type of algorithm is its inability to be optimized for a specific task.

Typically there are

Contributions of this paper

  • Placholder
  • Placholder
  • Placholder

Background on Memory Networks


Figure 2: end-to-end model


A memory network combines learning strategies from the machine learning literat


References

[1] Wei-Cheng Chang, Felix X Yu, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. Pre-training tasks for embedding-based large-scale retrieval. arXiv preprint arXiv:2002.03932, 2020.

[2] Stephen Robertson, Hugo Zaragoza, et al. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4):333–389, 2009.