Meta-Learning For Domain Generalization

Revision as of 17:44, 9 November 2020

Presented by

Parsa Ashrafi Fashi

Introduction

The domain shift problem arises when a model trained on one data distribution performs poorly when tested on a domain with a different distribution. Domain generalization (DG) tackles this problem by producing models that perform well on unseen target domains. Several approaches have been proposed, such as training a model for each source domain, extracting a domain-agnostic component shared across domains, and learning semantic features. Meta-learning, and specifically Model-Agnostic Meta-Learning (MAML), which has been widely adopted recently, yields models capable of adapting or generalizing to new tasks and environments that were never encountered during training. By treating domains as tasks, the paper tackles domain generalization in a model-agnostic way.


Previous Work

There are three common approaches to domain generalization. The simplest is to train a model for each source domain and estimate which model performs best on a new unseen target domain [1]. A second approach presumes that any domain is composed of a domain-agnostic and a domain-specific component. By factoring out the two components during training on the source domains, the domain-agnostic component can be extracted and transferred as a model that is likely to work on a new domain [2]. Finally, a domain-invariant feature representation can be learnt to minimize the gap between multiple source domains, which should provide a domain-independent representation that performs well on a new target domain [3][4][5].

Method

In the DG setting, we assume there are S source domains [math]\displaystyle{ S }[/math] and T target domains [math]\displaystyle{ T }[/math]. We define a single model parametrized by [math]\displaystyle{ \theta }[/math] to solve the specified task. DG aims to train [math]\displaystyle{ \theta }[/math] on the source domains such that it generalizes to the target domains. At each learning iteration we split the original S source domains [math]\displaystyle{ S }[/math] into S−V meta-train domains [math]\displaystyle{ \bar{S} }[/math] and V meta-test domains [math]\displaystyle{ \breve{S} }[/math] (virtual-test domains). This mimics real train-test domain shifts, so that over many iterations we can train a model that generalizes well in the final test on the target domains [math]\displaystyle{ T }[/math].
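The per-iteration split can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `split_domains`, the use of a uniformly random shuffle, and the example domain names are assumptions made here for illustration.

```python
import random


def split_domains(source_domains, num_meta_test, rng):
    """Randomly split the source domains into meta-train and meta-test lists.

    Hypothetical helper: the text only says that the S source domains are
    split into S-V meta-train and V meta-test domains at every iteration;
    drawing the split uniformly at random is an assumption.
    """
    if not 0 < num_meta_test < len(source_domains):
        raise ValueError("need at least one meta-train and one meta-test domain")
    shuffled = list(source_domains)
    rng.shuffle(shuffled)
    meta_test = shuffled[:num_meta_test]    # V virtual-test domains
    meta_train = shuffled[num_meta_test:]   # S - V meta-train domains
    return meta_train, meta_test


rng = random.Random(0)
sources = ["photo", "art_painting", "cartoon"]  # illustrative source domains
for iteration in range(3):
    meta_train, meta_test = split_domains(sources, 1, rng)
    print(iteration, meta_train, meta_test)
```

Because the split is redrawn at every iteration, each source domain plays the role of a held-out domain at some point during training.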

The paper explains the method in two settings: supervised learning and reinforcement learning.

Supervised Learning

First, [math]\displaystyle{ l(\hat{y},y) }[/math] is defined as the cross-entropy loss between the prediction [math]\displaystyle{ \hat{y} }[/math] and the label [math]\displaystyle{ y }[/math]: [math]\displaystyle{ l(\hat{y},y) = -y \log(\hat{y}) }[/math]. The process is as follows.

Meta-Train

The model is updated on the S−V meta-train domains [math]\displaystyle{ \bar{S} }[/math], and the loss function is defined as: [math]\displaystyle{ F(.) = \frac{1}{S-V} \sum\limits_{i=1}^{S-V} \frac {1}{N_i} \sum\limits_{j=1}^{N_i} l_{\theta}(\hat{y}_j^{(i)}, y_j^{(i)}) }[/math] In this step the model is optimized by a gradient descent step with step size [math]\displaystyle{ \alpha }[/math]: [math]\displaystyle{ \theta^{\prime} = \theta - \alpha \nabla_{\theta} F(\theta) }[/math]

Meta-Test

In each mini-batch the model is also virtually evaluated on the V meta-test domains [math]\displaystyle{ \breve{S} }[/math]. This meta-test evaluation simulates testing on new domains with different statistics, in order to allow learning to generalize across domains. The loss for the adapted parameters calculated on the meta-test domains is as follows: [math]\displaystyle{ G(.) = \frac{1}{V} \sum\limits_{i=1}^{V} \frac {1}{N_i} \sum\limits_{j=1}^{N_i} l_{\theta^{\prime}}(\hat{y}_j^{(i)}, y_j^{(i)}) }[/math]

The loss on the meta-test domains is calculated using the updated parameters [math]\displaystyle{ \theta^{\prime} }[/math] from meta-train. This means that optimizing [math]\displaystyle{ G }[/math] with respect to the original parameters [math]\displaystyle{ \theta }[/math] requires the second derivative of [math]\displaystyle{ F }[/math] with respect to [math]\displaystyle{ \theta }[/math].
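
To see where the second derivative comes from, the chain rule can be applied to the meta-test loss evaluated at the adapted parameters [math]\displaystyle{ \theta^{\prime} = \theta - \alpha \nabla_{\theta} F(\theta) }[/math]. The following is a short derivation, assuming [math]\displaystyle{ F }[/math] is twice differentiable:

[math]\displaystyle{ \nabla_{\theta} G(\theta^{\prime}) = \left( I - \alpha \nabla_{\theta}^{2} F(\theta) \right) \nabla_{\theta^{\prime}} G(\theta^{\prime}) }[/math]

Here [math]\displaystyle{ \nabla_{\theta}^{2} F(\theta) }[/math] is the Hessian of the meta-train loss. It enters only through a product with the meta-test gradient, so the full Hessian matrix never needs to be formed explicitly.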

Final Objective Function

Combining the two loss functions, the final objective is: [math]\displaystyle{ \arg\min_{\theta} \; F(\theta) + \beta G(\theta - \alpha F^{\prime}(\theta)) }[/math], where [math]\displaystyle{ \beta }[/math] weights the meta-test loss. Algorithm 1 illustrates the supervised learning approach.

Algorithm 1: MLDG Supervised Learning Approach.
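
The supervised objective can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the paper's implementation. A binary logistic-regression model stands in for the network so that the gradient and the Hessian-vector product can be written analytically, and all function names (`loss`, `grad`, `hvp`, `mldg_objective`, `mldg_gradient`) are hypothetical. Each domain is assumed to be a pair `(xs, ys)` of feature vectors and 0/1 labels.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss(theta, domains):
    """Mean over domains of the per-domain mean binary cross-entropy."""
    total = 0.0
    for xs, ys in domains:
        dom = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(dot(theta, x))
            dom -= y * math.log(p) + (1 - y) * math.log(1 - p)
        total += dom / len(xs)
    return total / len(domains)


def grad(theta, domains):
    """Gradient of `loss` with respect to theta."""
    g = [0.0] * len(theta)
    for xs, ys in domains:
        for x, y in zip(xs, ys):
            p = sigmoid(dot(theta, x))
            w = (p - y) / (len(xs) * len(domains))
            for k in range(len(theta)):
                g[k] += w * x[k]
    return g


def hvp(theta, domains, v):
    """Hessian of `loss` at theta multiplied by the vector v."""
    out = [0.0] * len(theta)
    for xs, _ in domains:
        for x in xs:
            p = sigmoid(dot(theta, x))
            w = p * (1 - p) * dot(x, v) / (len(xs) * len(domains))
            for k in range(len(theta)):
                out[k] += w * x[k]
    return out


def mldg_objective(theta, meta_train, meta_test, alpha, beta):
    """F(theta) + beta * G(theta - alpha * grad F(theta))."""
    g_f = grad(theta, meta_train)
    theta_prime = [t - alpha * g for t, g in zip(theta, g_f)]
    return loss(theta, meta_train) + beta * loss(theta_prime, meta_test)


def mldg_gradient(theta, meta_train, meta_test, alpha, beta):
    """Gradient of the objective, including the second-order term."""
    g_f = grad(theta, meta_train)                      # meta-train gradient
    theta_prime = [t - alpha * g for t, g in zip(theta, g_f)]
    g_g = grad(theta_prime, meta_test)                 # meta-test gradient at theta'
    h_g = hvp(theta, meta_train, g_g)                  # Hessian of F times g_g
    return [gf + beta * (gg - alpha * hg)
            for gf, gg, hg in zip(g_f, g_g, h_g)]
```

A full update would subtract a learning rate times `mldg_gradient(...)` from `theta`. In practice an automatic differentiation framework would compute the same quantity by differentiating through the inner gradient step.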

Reinforcement Learning

In the reinforcement learning (RL) setting, we now assume an agent with a policy π that takes states x as input and produces actions a in a sequential decision-making task: [math]\displaystyle{ a_t = \pi_{\theta}(x_t) }[/math]. The agent operates in an environment and its goal is to maximize its return, [math]\displaystyle{ R = \sum\limits_{t} \delta^t R_t(x_t, a_t) }[/math], where [math]\displaystyle{ \delta }[/math] is the discount factor. Here, tasks map to return functions and domains map to different environments.
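
As a small illustration of the return formula, the sketch below sums the rewards of a single trajectory with discounting. The function name `discounted_return` and the example reward list are assumptions made for illustration.

```python
def discounted_return(rewards, delta):
    """Sum of delta**t * reward_t over the time steps t of one trajectory."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (delta ** t) * reward
    return total


# Example: three steps with reward 1 each and discount factor 0.5.
print(discounted_return([1.0, 1.0, 1.0], 0.5))
```

The negative of this quantity, averaged over rollouts in the relevant environments, plays the role of the loss in the RL variant of the method.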

Meta-Train

In meta-training, the loss function [math]\displaystyle{ F(\cdot) }[/math] now corresponds to the negative return [math]\displaystyle{ R }[/math] of policy [math]\displaystyle{ \pi_{\theta} }[/math], averaged over all the meta-training environments in [math]\displaystyle{ \bar{S} }[/math].

Meta-Test

This step mirrors the meta-test step of supervised learning, and the loss is again the negative return. For RL, calculating this loss requires rolling out the policy with the meta-train updated parameters [math]\displaystyle{ \theta^{\prime} }[/math] in the meta-test domains to collect new trajectories and rewards. The reinforcement learning approach is illustrated in Algorithm 2.

Algorithm 2: MLDG Reinforcement Learning Approach.

Experiments

The proposed method is evaluated in four experiments (two supervised learning and two reinforcement learning).

Illustrative Synthetic Experiment

In this experiment, nine domains are synthesized by sampling curved deviations from a diagonal line classifier. We treat eight of these as sources for meta-learning and hold out the last for the final test. Fig. 1 shows the nine synthetic domains, which are related in form but differ in the details of their decision boundaries. The results show that MLDG performs near-perfectly, whereas the baseline model, which does not take domains into account, overfits in the bottom-left corner.

Figure 1: Synthetic experiment illustrating MLDG.

Object Recognition

For object recognition, the PACS multi-domain recognition benchmark is used, a dataset designed for cross-domain recognition problems. This dataset has 7 categories (‘dog’, ‘elephant’, ‘giraffe’, ‘guitar’, ‘house’, ‘horse’ and ‘person’) and 4 domains of different stylistic depictions (‘Photo’, ‘Art painting’, ‘Cartoon’ and ‘Sketch’). The diverse depiction styles provide a significant domain gap. The results of the proposed approach compared to other approaches are presented in Table 1. The baseline models are D-MTAE [5], Deep-All [2], DSN [6] and AlexNet+TF [2]. On average, the proposed method outperforms the other methods.

Table 1: Cross-domain recognition accuracy (Multi-class accuracy) on the PACS dataset. Best performance in bold.

Cartpole

Table 2: Cart-Pole RL. Domain generalisation performance across pole length. Average reward testing on 3 held out domains with random lengths. Upper bound: 200.
Table 3: Cart-Pole RL. Generalisation performance across both pole length and cart mass. Return testing on 3 held out domains with random length and mass. Upper bound: 200.

Mountain Car

Conclusion

References

[1]: [Xu et al. 2014] Xu, Z.; Li, W.; Niu, L.; and Xu, D. 2014. Exploiting low-rank structure from latent domains for domain generalization. In ECCV.

[2]: [Li et al. 2017] Li, D.; Yang, Y.; Song, Y.-Z.; and Hospedales, T. 2017. Deeper, broader and artier domain generalization. In ICCV.

[3]: [Muandet, Balduzzi, and Schölkopf 2013] Muandet, K.; Balduzzi, D.; and Schölkopf, B. 2013. Domain generalization via invariant feature representation. In ICML.

[4]: [Ganin and Lempitsky 2015] Ganin, Y., and Lempitsky, V. 2015. Unsupervised domain adaptation by backpropagation. In ICML.

[5]: [Ghifary et al. 2015] Ghifary, M.; Bastiaan Kleijn, W.; Zhang, M.; and Balduzzi, D. 2015. Domain generalization for object recognition with multi-task autoencoders. In ICCV.

[6]: [Bousmalis et al. 2016] Bousmalis, K.; Trigeorgis, G.; Silberman, N.; Krishnan, D.; and Erhan, D. 2016. Domain separation networks. In NIPS.