TEACHER: Achim Ahrens

AN INTRODUCTION TO MACHINE LEARNING IN STATA

Over the last few years we have experienced an unprecedented explosion in the availability of data relating to social, economic, financial and health-related phenomena. Researchers, professionals and policy makers therefore now have access to enormous datasets (so-called Big Data) containing an abundance of information regarding individuals, companies and institutions.

Machine learning has evolved in response to both the need to analyse extremely large databases and the availability of sophisticated software and powerful computing capacity. Machine learning, an application of artificial intelligence, offers a relatively new approach to data analysis: it trains systems to learn and improve automatically from experience without being explicitly programmed, relying instead on patterns and inference in the data.

This workshop offers participants an introduction to both Machine Learning techniques and the commands for Machine Learning recently introduced in Stata 16.

The workshop is aimed at researchers and professionals working in biostatistics, economics, epidemiology, social and political sciences and public health who wish to implement Machine Learning techniques in Stata.

Participants should be familiar with Stata. An introductory knowledge of econometrics and/or statistics is also required.

SESSION 1: EXAMPLES OF MACHINE LEARNING METHODOLOGIES

This opening session introduces the more popular supervised and unsupervised Machine Learning (ML) techniques and their implementation in Stata, with a focus on regression trees, random forests and cluster analysis.

SESSION 2: REGULARIZED REGRESSION

Regularized regression and the Lasso approach play a central role in Machine Learning. This session is therefore devoted to Lasso, Elastic Net and related methodologies. We will demonstrate their application in Stata using both the user-written Lassopack commands and Stata 16’s new Machine Learning routines.

SESSION 3: CAUSAL INFERENCE WITH MACHINE LEARNING

The primary strength of Machine Learning is prediction. In this session, we illustrate how Lasso and other Machine Learning methodologies can also be used to facilitate causal inference. The workshop concludes by looking at the latest developments in the evolving literature on Machine Learning and causal inference.

The Workshop will be held on September 27th at the Hotel Brunelleschi, Piazza Santa Elisabetta 3.

During the workshop, participants will be provided with a printed copy of the course handouts, the Stata do files and databases used throughout the workshop, and a temporary licence of Stata 16 valid for 30 days. To benefit as much as possible from the course, we strongly suggest that participants bring their own laptops so that they can actively follow the applied sections.

Attendance is limited to a maximum of 15 participants.

Individuals interested in attending the Workshop must return their completed registration forms by email (formazione@tstat.it) to TStat by 15th September 2019.

Further details regarding our registration procedures, including our commercial terms and conditions, can be found at https://www.tstat.it/utenti/xvi-convegno-italiano-degli-utenti-distata.


