Learning How to Learn: Meta Learning Approach to Improve Deep Learning

Dr. Ashish Kr. Chakraverti; Sugandha Chakraverti; Dr. Yashpal Singh

doi:10.17577/IJERTCONV8IS10001

ENCADEMS - 2020 (Volume 8 - Issue 10)

Open Access
Authors : Dr. Ashish Kr. Chakraverti, Sugandha Chakraverti, Dr. Yashpal Singh
Paper ID : IJERTCONV8IS10001
Volume & Issue : ENCADEMS – 2020 (Volume 8 – Issue 10)
Published (First Online): 18-07-2020
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Dr. Ashish Kr. Chakraverti, Associate Professor, CSE, MIET, Gr. Noida, UP

Sugandha Chakraverti, Assistant Professor, CSE, RKGIT, Ghaziabad

Dr. Yashpal Singh, Professor, MIET, Gr. Noida, UP

Abstract: Meta-learning describes the abstraction of designing higher-level components associated with training deep neural networks. The term "meta-learning" is used frequently in the deep learning literature, often referring to "AutoML", "few-shot learning", or "neural architecture search" when the subject is the automated design of neural network architectures. Emerging from comically titled papers such as "Learning to learn by gradient descent by gradient descent", the success of OpenAI's Rubik's Cube robotic hand demonstrates the maturity of the idea. Meta-learning is the most promising paradigm to advance the state of the art of deep learning and artificial intelligence.

Meta-learning is one of the most active areas of research in the deep learning space. Some schools of thought within the artificial intelligence (AI) community subscribe to the thesis that meta-learning is one of the stepping stones towards unlocking artificial general intelligence (AGI). In recent years, there has been an explosion in research and development of meta-learning techniques. However, some of the fundamental ideas behind meta-learning are still widely misunderstood by data scientists and engineers. From that perspective, we thought it would be useful to review some of the fundamental concepts and history of meta-learning, as well as some of the popular algorithms in the space.

Keywords: Deep Learning; Meta-learning; Artificial General Intelligence

INTRODUCTION

The term metalearning first occurred in the area of educational psychology. One of the most cited researchers in this field, Biggs, described metalearning as being aware of and taking control of one's own learning [6]. Thus, metalearning is seen as an understanding and adaptation of learning itself on a higher level than merely acquiring subject knowledge. In that way, a person aware of and capable of metalearning can assess their learning approach and adjust it according to the requirements of a specific task.

Metalearning as used in a machine learning context has many similarities to this description. Subject knowledge translates into base-learning, where experience is accumulated for one specific learning task. Metalearning starts at a higher level and is concerned with accumulating experience over several applications of a learning system, according to [9].

Over the last 20 years, machine learning research has been faced with an increasing number of available algorithms, including a multitude of parametrisation, preprocessing and postprocessing approaches, as well as a substantially extended range of applications due to increasing computing power and wider availability of computer-readable data sets. By promoting a better understanding of machine learning itself, metalearning can provide invaluable help in avoiding extensive trial-and-error procedures for algorithm selection and brute-force searches for a suitable parametrisation. Understanding how to profit from the past experience of a predictive model on certain tasks can improve the performance of a learning algorithm and help to better understand what makes a given algorithm perform well on a given problem.

The idea of metalearning is not new; one of the first and seminal contributions was provided by [53]. However, the literal term only started appearing in the machine learning literature in the 1990s, and many publications still deal with problems related to metalearning without using the actual word. This contribution tries to cover every point of view from which metalearning has been investigated, citing books, research and review papers of the last decade. We hope this survey will provide a useful resource for the data mining and machine learning community.

The rest of this paper is organised as follows. In Section 2 we review definitions of metalearning given in the scientific literature, focusing on common themes occurring in all of them. Section 3 describes different notions of metalearning, linking them to the definitions given in Section 2. In Section 4 practical considerations arising when designing a metalearning system are discussed, while open research directions are listed in Section 5.

DEFINITION

In the 1990s, the term metalearning started to appear in machine learning research, although the concept itself dates back to the mid-1970s [53]. A number of definitions of metalearning have been given; the following list cites the main review papers and books from the last decade:
1. Metalearning studies how learning systems can increase in efficiency through experience; the goal is to understand how learning itself can become flexible according to the domain or task under study. ([65])
2. The primary goal of metalearning is the understanding of the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. ([25])
3. Metalearning is the study of principled methods that exploit metaknowledge to obtain efficient models and solutions by adapting machine learning and data mining processes. ([9])
4. Metalearning monitors the automatic learning process itself, in the context of the learning problems it encounters, and tries to adapt its behaviour to perform better. ([62])

Learning systems that adapt and improve by experience are a key concept of definitions 1, 3 and 4. This in itself, however, does not suffice as a description, as it basically applies to all machine learning algorithms. Metalearning becomes metalearning by looking at different problems, domains, tasks or contexts, or simply past experience. This aspect is inherent in all of the definitions, although somewhat disguised in definition 3, which uses the term metaknowledge instead. Metaknowledge as described by the authors stands for knowledge to be exploited from past learning tasks, which may mean past learning tasks on the same data or on data from another problem domain. Definition 2 differs in emphasising a better comprehension of the interaction between domains and learning mechanisms, which does not necessarily imply the goal of improved learning systems, but rather the pursuit of a better understanding of the tasks on which individual learners succeed or fail.
  
Rephrasing the common ground that the above definitions share, we propose to define a metalearning system as follows:

Definition 1
1. A metalearning system must include a learning subsystem, which adapts with experience.
2. Experience is gained by exploiting metaknowledge extracted
   a. ... in a previous learning episode on a single dataset, and/or
   b. ... from different domains or problems.

Furthermore, a concept often used in metalearning is that of a bias, which, in this context, refers to a set of assumptions influencing the choice of hypotheses for explaining the data. A distinction is drawn between declarative bias, which specifies the representation of the space of hypotheses (for example, representing hypotheses using neural networks only), and procedural bias, which affects the ordering of the hypotheses (for example, preferring hypotheses with a smaller runtime). The bias in base-learning according to this theory is fixed, whereas metalearning tries to choose the right bias dynamically.

NOTIONS OF METALEARNING

Metalearning can be employed in a variety of settings, with a certain disagreement in the literature about what exactly constitutes a metalearning problem. Different notions are presented in this section, while keeping an eye on the question of whether they can be called metalearning approaches according to Definition 1. Figure 1 groups general machine learning and metalearning approaches in relation to Definition 1. Each of the three circles represents a cornerstone of the definition (1: adapt with experience, 2a: metaknowledge on the same data set, 2b: metaknowledge from different domains); the approaches are arranged into the circles and their overlapping sections depending on which parts of the definition apply to them. As an example, ensemble methods generally work with experience gained on the same data set (part 2a) and adapt with experience (part 1); however, the only approach potentially applying all three parts of the definition is algorithm selection, which appears where all three circles overlap.
  
  Fig. 1 Notions of metalearning vs. components of a metalearning system
1. Ensemble methods and combinations of base-learners

Model combination is often used when several applicable algorithms for a problem are available. Instead of selecting a single algorithm for a problem, the risk of choosing the wrong one can be reduced by combining all or a subset of the available outcomes. In machine learning, advanced model combination can be facilitated by ensemble learning according to [17] and [69], which comprises strategies for training and combining the outputs of a number of machine learning algorithms. One frequently used approach of this type is resampling, leading to a number of ensemble generation techniques. Two very popular resampling-based ensemble building methods are bagging [12] and boosting [21].

Although in these cases the information about base-learning is drawn in the sense of point 2a of Definition 1, these algorithms are limited to a single problem domain with a bias that is fixed a priori, so that, using the definition above, they do not undoubtedly qualify as metalearning methods.
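
To make the idea of resampling-based ensemble generation concrete, the following is a minimal sketch of bootstrap aggregation (bagging), one well-known method of this kind. The base learner (a one-dimensional threshold "stump"), the function names and the toy data are assumptions made for illustration only; they are not taken from the works cited above.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a 1-D threshold classifier: predict 1 if x >= threshold, else 0."""
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging(data, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in data]) for _ in range(n_models)]

def predict(thresholds, x):
    """Combine the individual stumps by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): (feature value, class label).
data = [(0.1, 0), (0.4, 0), (0.5, 0), (1.5, 1), (1.8, 1), (2.2, 1)]
ensemble = bagging(data)
```

Every model sees the same problem domain and the same fixed hypothesis space; only the training sample varies, which is why such ensembles touch only part 2a of Definition 1.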
2. Algorithm recommendation

A considerable amount of metalearning research has been devoted to the area of algorithm recommendation. In this special case of metalearning, the aspect of interest is the relationship between data characteristics and algorithm performance, with the final goal of predicting an algorithm or a set of algorithms suitable for a specific problem under study. A common motivation is that it is infeasible to examine all possible alternative algorithms by trial and error, and that experts are needed if algorithms are to be pre-selected. This application of metalearning can thus be useful both for providing a recommendation to an end user and for automatically selecting or weighting the most promising algorithms.
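
The relationship between data characteristics and algorithm performance can be illustrated with a small nearest-neighbour recommender. This is a sketch under stated assumptions, not a method from the cited literature: the choice of metafeatures, the metadatabase entries and the algorithm names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical metadatabase: (metafeatures, best algorithm observed on that dataset).
# Metafeatures used here: (log10 of #instances, #features, class entropy in bits).
METADATABASE = [
    ((2.0, 4.0, 1.58), "naive_bayes"),
    ((2.2, 8.0, 1.00), "naive_bayes"),
    ((4.0, 20.0, 0.95), "random_forest"),
    ((4.5, 50.0, 0.80), "random_forest"),
    ((3.0, 10.0, 1.00), "svm"),
]

def class_entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def metafeatures(rows, labels):
    """Describe a dataset by a few simple characteristics."""
    return (math.log10(len(rows)), float(len(rows[0])), class_entropy(labels))

def recommend(query, k=3):
    """Return the algorithm that wins most often among the k nearest datasets."""
    nearest = sorted(METADATABASE, key=lambda entry: math.dist(entry[0], query))[:k]
    return Counter(algo for _, algo in nearest).most_common(1)[0][0]

# A new dataset with 100 instances, 5 features and two balanced classes.
rows = [[0.0] * 5 for _ in range(100)]
labels = [0] * 50 + [1] * 50
suggestion = recommend(metafeatures(rows, labels))
```

In this sketch the metafeatures are compared on their raw scales; a practical system would probably normalise them so that no single characteristic dominates the distance.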
[62] points out another aspect: it is not only the algorithms themselves, but also different parameter settings that will naturally let the performance of the same algorithm vary on different datasets. It would be possible to regard versions of the same algorithm with different parameter settings as different learning algorithms altogether, but the author advocates treating the subject and studying its effects differently. Such an approach has, for example, been taken in [26] and [41], where the authors discuss a hybrid metalearning and search-based technique to facilitate the choice of optimal parameter values of a Support Vector Machine (SVM). In this approach, the candidate parameter settings recommended by a metalearning algorithm are used as a starting point for further optimisation using Tabu Search or Particle Swarm Optimization techniques, with great success. [51] investigate increasing the accuracy and decreasing the runtime of a genetic algorithm for selecting learning parameters for a Support Vector Machine and a Random Forests classifier. Based on past experience on other datasets and the corresponding dataset characteristics, metalearning is used to select a promising initial population for the genetic algorithm, reducing the number of iterations needed to find accurate solutions.
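
The hybrid idea of seeding a search procedure with recommended parameter settings can be sketched as follows. Note the substitutions: a simple greedy hill climb stands in for Tabu Search, Particle Swarm Optimization or a genetic algorithm, and a synthetic function stands in for the cross-validated error of an SVM. The recommended starting points are hypothetical.

```python
def validation_error(c, gamma):
    """Stand-in objective (an assumption): a smooth bowl with its minimum at
    log10(C) = 1, log10(gamma) = -2. A real system would cross-validate an SVM."""
    return (c - 1.0) ** 2 + (gamma + 2.0) ** 2

def hill_climb(start, step=0.25, max_iters=100):
    """Greedy local search over (log10 C, log10 gamma).

    Returns (best point, its error, number of moves made)."""
    point, error = start, validation_error(*start)
    for iteration in range(max_iters):
        c, g = point
        neighbours = [(c + step, g), (c - step, g), (c, g + step), (c, g - step)]
        best = min(neighbours, key=lambda p: validation_error(*p))
        best_error = validation_error(*best)
        if best_error >= error:
            return point, error, iteration  # no neighbour improves: local optimum
        point, error = best, best_error
    return point, error, max_iters

def warm_started_search(candidates):
    """Run the local search from each candidate setting and keep the best result."""
    return min((hill_climb(c) for c in candidates), key=lambda result: result[1])

# Hypothetical settings recommended by a metalearner from similar past datasets,
# compared with an uninformed starting point far from the optimum.
recommended = [(0.5, -1.5), (2.0, -3.0)]
cold_start = [(-3.0, 3.0)]
warm = warm_started_search(recommended)
cold = warm_started_search(cold_start)
```

On this toy objective both searches reach the same optimum, but the warm-started one needs far fewer moves, which mirrors the reduction in iterations described above.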

An interesting treatment of the above problem can also be found in [31], where the authors propose to take into account not only the expected performance of the algorithm but also its estimated training time. In this way the algorithms can be ordered according to their estimated training complexity, which makes it possible to produce relatively well-performing models very quickly and then look for better solutions while the models already trained are producing predictions. These ideas are further extended in [30], where some modifications of the complexity measures used are introduced.
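
Ordering candidates by estimated training cost can be sketched as a simple schedule that trains cheap models first, so that a usable model exists early on. The candidate names, time estimates and scores below are hypothetical, and the elapsed time is accumulated from the estimates rather than measured.

```python
def anytime_schedule(candidates, time_budget):
    """Train candidates from cheapest to most expensive until the budget is spent.

    `candidates` maps a name to (estimated_training_time, train_function); the
    function returns a validation score. Yields (elapsed, name, score) whenever
    a newly trained model beats the best one so far.
    """
    elapsed, best_score = 0.0, float("-inf")
    for name, (cost, train) in sorted(candidates.items(), key=lambda kv: kv[1][0]):
        if elapsed + cost > time_budget:
            break  # every remaining candidate is at least as expensive
        elapsed += cost
        score = train()
        if score > best_score:
            best_score = score
            yield elapsed, name, score

# Hypothetical estimates (in seconds) and scores; real values would come from
# complexity estimates and actual training runs.
candidates = {
    "svm_rbf": (120.0, lambda: 0.91),
    "naive_bayes": (1.0, lambda: 0.80),
    "decision_tree": (5.0, lambda: 0.78),
    "random_forest": (60.0, lambda: 0.89),
}
history = list(anytime_schedule(candidates, time_budget=100.0))
```

With a budget of 100 seconds the schedule first reports the cheap model and later replaces it with a better, more expensive one; the most expensive candidate is never started.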

The classic application area of algorithm selection in machine learning is classification. [56], however, tries to generalise the concepts to other areas including regression, sorting, constraint satisfaction and optimisation. Metalearning for algorithm selection has also been investigated in the area of time series forecasting, where the term was first used in [48]. A comprehensive and recent treatment of the subject can be found in [66] and [37], where time series are clustered according to their characteristics, and recommendation rules or combination weights are derived with machine learning algorithms.

CONSIDERATIONS FOR USING METALEARNING

Before applying metalearning to any problem, certain practical choices have to be made. These include the choice of a metalearning algorithm, which can even constitute a meta-metalearning problem itself. The selection of appropriate metaknowledge and the problem of setting up and maintaining metadatabases also have to be tackled; research efforts in these areas are summarised in this section.
As metalearning profits from knowledge obtained while looking at data from other problem domains, having sufficient datasets at one's disposal is important. [57] propose transforming existing datasets (datasetoids) to obtain a larger number of them and show the success of the approach on a metalearning post-processing problem. [62] states that there is no lack of experiments being done, but datasets and the information obtained often remain in people's heads and labs. The author proposes a framework to export experiments to specifically designed experiment databases based on an ontology for experimentation in machine learning. The resulting database can then, for example, give information on rankings of learning algorithms, the behaviour of ensemble methods, learning curve analyses and the bias-variance behaviour of algorithms. One example of such a database is the Open Experiment Database. An analysis of this database together with a critical review can be found in [19].

An alternative approach to the problem of scarce metadatabases has been presented in [50], where the authors describe a dataset generator able to produce synthetic datasets with specified values of some metafeatures (such as kurtosis and skewness). Although the proposed generator appears to be at a very early stage of development, the idea is very promising, also from the point of view of performing controlled experiments on datasets with specified properties. Similarly to feature selection, synthetic data generation has received considerable attention in the recent generic machine learning and data mining literature, especially in the context of data streams and concept drift (see [3] and references therein).
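
The statistical metafeatures named above can be computed directly from the standardised moments of each attribute. The sketch below shows one common convention (population moments, non-excess kurtosis); averaging over attributes to obtain dataset-level values is an illustrative choice, and the function names are hypothetical.

```python
def skewness(values):
    """Third standardised moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def kurtosis(values):
    """Fourth standardised moment (population form, not excess kurtosis)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2

def dataset_metafeatures(columns):
    """Average the per-attribute statistics into dataset-level metafeatures."""
    return {
        "mean_skewness": sum(skewness(c) for c in columns) / len(columns),
        "mean_kurtosis": sum(kurtosis(c) for c in columns) / len(columns),
    }

# Two toy attributes: one symmetric, one with a long right tail.
features = dataset_metafeatures([[1, 2, 3, 4, 5], [1, 1, 1, 10]])
```

A generator of the kind described would work in the opposite direction, searching for data whose computed metafeatures match requested target values.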

CONCLUSIONS AND RESEARCH CHALLENGES

Research in the area of metalearning is continuing in several directions. One area is the identification of metafeatures. The vast majority of publications investigate extracting features from the dataset, mostly in the form of statistical or information-theoretic measures. Landmarking is a different approach that uses simple base-learning algorithms and their performance to describe the dataset at hand. However, [9] argue that characterising learning algorithms and gaining a better understanding of their behaviour would be a valuable research avenue; very few publications, for example [63], exist in this area to date.
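
Landmarking can be sketched in a few lines: the dataset is described by how well cheap learners perform on it. The two landmarkers chosen here (a majority-class predictor and a 1-nearest-neighbour classifier, both evaluated by leave-one-out) and the toy data are assumptions for illustration.

```python
from collections import Counter

def majority_landmark(rows, labels):
    """Leave-one-out accuracy of always predicting the majority class of the rest."""
    hits = 0
    for i, label in enumerate(labels):
        rest = labels[:i] + labels[i + 1:]
        hits += Counter(rest).most_common(1)[0][0] == label
    return hits / len(labels)

def one_nn_landmark(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        others = [(sum((a - b) ** 2 for a, b in zip(row, r)), l)
                  for j, (r, l) in enumerate(zip(rows, labels)) if j != i]
        hits += min(others, key=lambda pair: pair[0])[1] == label
    return hits / len(labels)

def landmark(rows, labels):
    """Describe a dataset by the performance of simple learners on it."""
    return {"majority": majority_landmark(rows, labels),
            "one_nn": one_nn_landmark(rows, labels)}

# Toy dataset (hypothetical): two well-separated clusters, unbalanced classes.
rows = [[0.0], [0.1], [0.2], [0.3], [1.0], [1.1]]
labels = ["a", "a", "a", "a", "b", "b"]
description = landmark(rows, labels)
```

The resulting values can be used as metafeatures in place of, or alongside, statistical measures: a large gap between the two landmarkers suggests that locality in the feature space carries class information.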

Many publications on metalearning focus on selecting the base-learning method that is most likely to perform well for a specific problem. Fewer publications, such as [11], consider ranking algorithms, which can be used to guide combination weights and to increase the robustness of a metalearning system.
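
How a ranking can guide combination weights is shown in the sketch below. Average ranks over past datasets are turned into weights by inverse-rank normalisation; this weighting scheme, the algorithm names and the accuracy values are illustrative assumptions rather than the procedure of any cited work, and ties in accuracy are not treated specially.

```python
def average_ranks(results):
    """`results` maps dataset -> {algorithm: accuracy}; rank 1 is best on a dataset."""
    totals = {}
    for scores in results.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, algorithm in enumerate(ordered, start=1):
            totals[algorithm] = totals.get(algorithm, 0) + rank
    return {a: total / len(results) for a, total in totals.items()}

def rank_weights(ranks):
    """Turn average ranks into combination weights that sum to one.

    Inverse-rank weighting is an illustrative choice, not a prescribed scheme."""
    inverse = {a: 1.0 / r for a, r in ranks.items()}
    norm = sum(inverse.values())
    return {a: w / norm for a, w in inverse.items()}

# Hypothetical accuracies on three past datasets.
results = {
    "d1": {"svm": 0.90, "tree": 0.85, "knn": 0.80},
    "d2": {"svm": 0.70, "tree": 0.75, "knn": 0.60},
    "d3": {"svm": 0.88, "tree": 0.80, "knn": 0.82},
}
ranks = average_ranks(results)
weights = rank_weights(ranks)
```

Compared with picking only the top-ranked algorithm, weighting by rank keeps every candidate in play, so a single poor recommendation affects the combined prediction less.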

Regarding adaptivity and continuous monitoring, many approaches go further than the static traditional metalearning approaches, for example by using architectures that support life-long learning, such as in [33]. However, research in this area still has a long way to go in investigating continuous adjustment, rebuilding or discarding of base-learners with the help of metalearning approaches.

Users of predictive systems are faced with a difficult choice among an ever-increasing number of models and techniques. Metalearning can help to reduce the amount of experimentation by providing dynamic advice in the form of assistants, decrease the time that has to be spent on introducing, tuning and maintaining models, and help to promote machine learning outside of an academic environment.

REFERENCES
1. Abbasi, A., Albrecht, C., Vance, A.O., Hansen, J.V.: Metafraud: a meta-learning framework for detecting financial fraud. Management Information Systems Quarterly 36(4), 1293-1327 (2012)
2. Aiolli, F.: Transfer learning by kernel meta-learning. Journal of Machine Learning Research-Proceedings Track 27, 81-95 (2012)
3. Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: Data stream mining: a practical approach. Tech. rep., The University of Waikato (2011)
4. Bensusan, H., Giraud-Carrier, C., Kennedy, C.: A higher-order approach to metalearning. Proceedings of the ECML2000 workshop on Meta-Learning: Building automatic advice strategies for Model Selection and Method Combination (2000)
5. Bernstein, A., Provost, F., Hill, S.: Toward intelligent assistance for a data mining process: an ontology-based approach for cost-sensitive classification. IEEE Transactions on Knowledge and Data Engineering 17, 503-518 (2005)
6. Biggs, J.B.: The role of meta-learning in study process. British Journal of Educational Psychology 55, 185-212 (1985)
7. Bishop, C.: Neural Networks for Pattern Recognition. Oxford University Press, New York, USA (1995)
8. Bonissone, P.P.: Lazy meta-learning: creating customized model ensembles on demand. In: Advances in Computational Intelligence, pp. 1-23. Springer (2012)
9. Brazdil, P., Giraud-Carrier, C., Soares, C., Vilalta, R.: Metalearning: Applications to Data Mining. Springer (2009)
10. Brazdil, P., Soares, C.: A comparison of ranking methods for classification algorithm selection. In: R. de Mantaras, E. Plaza (eds.) Machine Learning: Proceedings of the 11th European Conference on Machine Learning ECML2000, pp. 63-74. Springer (2000)
11. Brazdil, P., Soares, C., de Costa, P.: Ranking learning algorithms: Using IBL and metalearning on accuracy and time results. Machine Learning 50(3), 251-277 (2003)
12. Breiman, L.: Bagging predictors. Machine Learning 24(2), 123-140 (1996)
13. Bruha, I., Famili, A.: Postprocessing in machine learning and data mining. ACM SIGKDD Explorations Newsletter 2, 110-114 (2000)
14. Budka, M., Gabrys, B.: Ridge regression ensemble for toxicity prediction. Procedia Computer Science 1(1), 193-201 (2010). DOI 10.1016/j.procs.2010.04.022. URL http://www.sciencedirect.com/science/article/pii/S1877050910000232
15. Budka, M., Gabrys, B., Ravagnan, E.: Robust predictive modelling of water pollution using biomarker data. Water Research 44(10), 3294-3308 (2010). DOI 10.1016/j.watres.2010.03.006. URL http://www.sciencedirect.com/science/article/pii/S004313541000179X
16. Cao, L.: Domain-driven data mining: Challenges and prospects. IEEE Transactions on Knowledge and Data Engineering 22, 755-769 (2010)
17. Dietterich, T.: Ensemble methods in machine learning. In: Proceedings of the First International Workshop on Multiple Classifier Systems, pp. 1-15 (2000)
18. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 71-80 (2000)
19. Driessens, K., Vanwinckelen, G., Blockeel, H.: Meta-learning from an experiment database. In: Proceedings of the Workshop on Teaching Machine Learning at the 29th International Conference on Machine Learning, Edinburgh, UK (2012)
20. Evgeniou, T., Micchelli, C., Pontil, M.: Learning multiple tasks with kernel methods. Journal of Machine Learning Research 6, 615-637 (2005)
21. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119-139 (1997). DOI 10.1006/jcss.1997.1504
22. Fürnkranz, J., Petrak, J., Brazdil, P., Soares, C.: On the use of fast subsampling estimates for algorithm recommendation. Tech. rep., Österreichisches Forschungsinstitut für Artificial Intelligence (2002)
23. Gama, J., Brazdil, P.: Cascade generalisation. Machine Learning 41(3), 315-343 (2000)
24. Giraud-Carrier, C.: The data mining advisor: Meta-learning at the service of practitioners. In: Proceedings of the Fourth International Conference on Machine Learning and Applications, ICMLA 05, pp. 113-119. IEEE Computer Society, Washington, DC, USA (2005)
25. Giraud-Carrier, C.: Metalearning – a tutorial. Tutorial at the 7th International Conference on Machine Learning and Applications (ICMLA), San Diego, California, USA (2008)
26. Gomes, T.A., Prudencio, R.B., Soares, C., Rossi, A.L., Carvalho, A.: Combining meta-learning and search techniques to select parameters for support vector machines. Neurocomputing 75(1), 3-13 (2012)
27. Guazzelli, A., Zeller, M., Lin, W.C., Williams, G.: PMML: An open standard for sharing models. The R Journal 1(1), 60-65 (2009)
28. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. The Journal of Machine Learning Research 3, 1157-1182 (2003)
29. Hernansaez, J.M., Botía, J.A., Gómez-Skarmeta, A.F.: METALA: a J2EE technology based framework for web mining. Revista Colombiana de Computación 5(1) (2004)
30. Jankowski, N.: Complexity measures for meta-learning and their optimality. Solomonoff 85th Memorial. Lecture Notes in Computer Science. Springer-Verlag (2011)
31. Jankowski, N., Grabczewski, K.: Universal meta-learning architecture and algorithms. In: W. Duch, K. Grabczewski, N. Jankowski (eds.) Meta-learning in Computational Intelligence. Springer (2009)
32. Kadlec, P., Gabrys, B.: Learnt topology gating artificial neural networks. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2008) as part of the 2008 IEEE World Congress on Computational Intelligence (WCCI2008), pp. 2605-2612 (2008)
33. Kadlec, P., Gabrys, B.: Architecture for development of adaptive on-line prediction models. Memetic Computing 4(1), 241-269 (2009)
34. Kalousis, A., Hilario, M.: Feature selection for meta-learning. In: D. Cheung, G. Williams, Q. Li (eds.) Advances in Knowledge Discovery and Data Mining, Lecture Notes in Computer Science, vol. 2035, pp. 222-233. Springer Berlin Heidelberg (2001)
35. Kalousis, A., Theoharis, T.: NOEMON: design, implementation and performance results of an intelligent assistant for classifier selection. Intelligent Data Analysis 5(3), 319-337 (1999)
36. Köpf, C., Iglezakis, I.: Combination of task description strategies and case base properties for meta-learning. In: Proceedings of the 2nd international workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning, pp. 65-76 (2002)
37. Lemke, C., Gabrys, B.: Meta-learning for time series forecasting and forecast combination. Neurocomputing 73(10), 2006-2016 (2010)
38. Lemke, C., Riedel, S., Gabrys, B.: Dynamic combination of forecasts generated by diversification procedures applied to forecasting of airline cancellations. In: Proceedings of the IEEE Symposium Series on Computational Intelligence, pp. 85-91 (2009)
39. Matijas, M., Suykens, J.A., Krajcar, S.: Load forecasting using a multivariate meta-learning system. Expert Systems with Applications 40(11), 4427-4437 (2013)
40. Metal: Meta-learning assistant for providing user support in machine learning and data mining. http://www.metal-kdd.org/ (2002)
41. de Miranda, P., Prudencio, R., de Carvalho, A., Soares, C.: An experimental study of the combination of meta-learning with particle swarm algorithms for SVM parameter selection. Computational Science and Its Applications - ICCSA 2012, pp. 562-575 (2012)
42. Molina, M.D.M., Romero, C., Ventura, S., Luna, J.M.: Meta-learning approach for automatic parameter tuning: A case of study with educational datasets. In: EDM, pp. 180-183 (2012)
43. Morik, K., Scholz, M.: The miningmart approach to knowledge discovery in databases. In: Intelligent Technologies for Information Analysis, pp. 47-65. Springer (2004)
44. Nguyen, P., Kalousis, A., Hilario, M.: A meta-mining infrastructure to support kd workflow optimization. ECML PKDD 2011, p. 1 (2011)
45. Nguyen, P., Kalousis, A., Hilario, M.: Experimental evaluation of the e-lico meta-miner. 5th Planning to learn workshop WS28 at ECAI 2012, p. 18 (2012)
46. Pan, S., Yang, Q.: A survey on transfer learning. Knowledge and Data Engineering, IEEE Transactions on 22(10), 1345-1359 (2010)
47. Pfahringer, B., Bensusan, H., Giraud-Carrier, C.: Meta-learning by landmarking various learning algorithms. In: Proceedings of the Seventeenth International Conference on Machine Learning, pp. 743-750. Morgan Kaufmann (2000)
48. Prudencio, R., Ludermir, T.: Using machine learning techniques to combine forecasting methods. In: Proceedings of the 17th Australian Joint Conference on Artificial Intelligence, pp. 1122-1127 (2004)
49. Prudencio, R.B., Ludermir, T.B.: Meta-learning approaches to selecting time series models. Neurocomputing 61, 121-137 (2004)
50. Reif, M., Shafait, F., Dengel, A.: Dataset generation for Meta-Learning. In: KI-2012: Poster and Demo Track, pp. 69-73 (2012)
51. Reif, M., Shafait, F., Dengel, A.: Meta-learning for evolutionary parameter optimization of classifiers. Machine Learning 87, 357-380 (2012)
52. Reif, M., Shafait, F., Goldstein, M., Breuel, T., Dengel, A.: Automatic classifier selection for non-experts. Pattern Analysis and Applications pp. 1-14 (2012). DOI 10.1007/s10044-012-0280-z
53. Rice, J.: The algorithm selection problem. In: M. Rubinov, M.C. Yovits (eds.) Advances in Computers, vol. 15. Academic Press, Inc. (1976)
54. Silver, D., Bennett, K.: Guest editor's introduction: special issue on inductive transfer learning. Machine Learning 73, 215-220 (2008)
55. Silver, D.L., Poirier, R., Currie, D.: Inductive transfer with context-sensitive neural networks. Machine Learning 73(3), 313-336 (2008)
56. Smith-Miles, K.: Cross-disciplinary perspectives on meta-learning for algorithm selection. ACM Computing Surveys 41(6), 1-25 (2008)
57. Soares, C.: UCI++: Improved support for algorithm selection using datasetoids. In: T. Theeramunkong, B. Kijsirikul, N. Cercone, T.B. Ho (eds.) Advances in Knowledge Discovery and Data Mining, Lecture Notes in Computer Science, vol. 5476, pp. 499-506. Springer Berlin Heidelberg (2009)
58. Todorovski, L., Blockeel, H., Dzeroski, S.: Ranking with predictive clustering trees. In: T. Elomaa, H. Mannila, H. Toivonen (eds.) Proceedings of the 13th European Conference on Machine Learning, pp. 444-455. Springer (2002)
59. Todorovski, L., Brazdil, P., Soares, C.: Report on the experiments with feature selection in meta-level learning. In: Proceedings of the PKDD-00 Workshop on Data Mining, Decision Support, Meta-Learning and ILP: Forum for Practical Problem Presentation and Prospective Solutions. Citeseer (2000)
60. Todorovski, L., Dzeroski, S.: Combining classifiers with meta decision trees. Machine Learning 50(3), 223-249 (2003)
61. Tsai, C.F., Hsu, Y.F.: A meta-learning framework for bankruptcy prediction. Journal of Forecasting 32(2), 167-179 (2013)
62. Vanschoren, J.: Understanding machine learning performance with experiment databases. Ph.D. thesis, Arenberg Doctoral School of Science, Engineering & Technology, Katholieke Universiteit Leuven (2010)
63. Vanschoren, J., Blockeel, H.: Towards understanding learning behavior. In: Proceedings of the Annual Machine Learning Conference of Belgium and the Netherlands, pp. 89-96 (2006)
64. Vilalta, R., Drissi, Y.: A characterization of difficult problems in classification. In: Proceedings of the 6th European Conference on Principles and Practice of Knowledge Discovery in Databases, Helsinki, Finland (2002)
65. Vilalta, R., Drissi, Y.: A perspective view and survey of meta-learning. Artificial Intelligence Review 18, 77-95 (2002)
66. Wang, X., Smith-Miles, K., Hyndman, R.: Rule induction for forecasting method selection: Meta-learning the characteristics of univariate time series. Neurocomputing 72, 2581-2594 (2009)
67. Wirth, R., Shearer, C., Grimmer, U., Reinartz, T., Schloesser, J., Breitner, C., Engels, R., Lindner, G.: Towards process-oriented tool support for kdd. Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery, Trondheim, Norway (1997)
68. Wolpert, D.: Stacked generalization. Neural Networks 5, 241-259 (1992)
69. Yao, X., Islam, M.: Evolving artificial neural network ensembles. IEEE Computational Intelligence Magazine 3, 31-42 (2008)
70. Zhang, J., Ghahramani, Z., Yang, Y.: Flexible latent variable models for multi-task learning. Machine Learning 37, 221-242 (2008)
