Many MRC models proposed so far achieve evaluation scores beyond human performance on various tasks and datasets, but I think it is hard to simply say YES when asked, "Do these models understand a given context better than humans?"
First, it is debatable whether current MRC techniques really "understand" at all. Understanding and digesting the content before answering is clearly different from learning the type of problem and finding and combining specific patterns in a given passage to produce an output.
Second, it is not clear how well current MRC tasks, evaluation metrics, and datasets reflect the "real world". The linked paper is the first work to address this point: it analyzes 57 MRC tasks and datasets, and classifies and defines the tasks, evaluation metrics, and datasets. Most people who have researched this area will already know much of this, but it seems useful as a way of organizing it all in one place.
The paper concludes by presenting two open issues: (1) What aspects of current MRC technology need to be supplemented? (2) How much do we actually understand about "understanding"? I think both are things we need to consider on the way to human-like AI.