Perspectives on Computational Research
(MACS 30200). Rick Evans and Benjamin Soltoff, M/W 11:30-1:20 p.m. & Weekly Lab Wednesdays 4:30-5:20 p.m.
This course focuses on applying computational methods to conducting social scientific research through a student-developed research project. Students will identify a research question of their own interest that involves a direct reference to social scientific theory, use of data, and a significant computational component. Students will collect data; develop, apply, and interpret statistical learning models; and generate a fully reproducible research paper. We will identify how computational methods can be used throughout the research process, from data collection and tidying, to exploration, visualization and modeling, to the final communication of results. The course will include modules on theoretical and practical considerations, including topics such as epistemological questions about research design, writing and critiquing papers, and additional computational tools for analysis.
Computing for the Social Sciences
(MACS 30500). Benjamin Soltoff, M/W 1:30-2:50 p.m. & Weekly Lab Wednesdays 3-4:20 p.m.
This is an applied course for social scientists with little-to-no programming experience who wish to harness growing digital and computational resources. The focus of the course is on generating reproducible research through the use of programming languages and version control software. Major emphasis is placed on a pragmatic understanding of core principles of programming and packaged implementations of methods. Students will leave the course with basic computational skills implemented through many computational methods and approaches to social science; while students will not become expert programmers, they will gain the knowledge of how to adapt and expand these skills as they are presented with new questions, methods, and data.
(MACS 40700). Benjamin Soltoff, M/W 1:30-2:50 p.m.
Social scientists frequently wish to convey information to a broader audience in a cohesive and interpretable manner. Visualizations are an excellent method to summarize information and report analysis and conclusions in a compelling format. This course introduces the theory and applications of data visualization. Students will learn techniques and methods for developing rich, informative and interactive, web-facing visualizations based on principles from graphic design and perceptual psychology. Students will practice these techniques on many types of social science data, including multivariate, temporal, geospatial, text, hierarchical, and network data. These techniques will be developed using a variety of software implementations such as R, ggplot2, D3, and Tableau.
Advanced Topics in Causal Inference
(MACS 52000). Guanglei Hong, Kazuo Yamaguchi, and Fang Yang, Tuesdays 1:30-4:20 p.m.
This course provides an in-depth discussion of selected topics in causal inference that are beyond what is covered in the introduction to causal inference course. The course is intended for graduate students and advanced undergraduate students who have taken the intro course and want to extend their knowledge in causal inference. Topics include (1) alternative matching methods, randomization inference for testing hypotheses and sensitivity analysis; (2) marginal structural models and structural nested models for time-varying treatment; (3) Rubin Causal Model (RCM) and Heckman’s scientific model of causality; (4) latent class treatment variable; (5) measurement error in the covariates; (6) the M-estimation for the standard error of the treatment effect for the use of IPW; (7) the local average treatment effect (LATE) and its problems, sensitivity analysis to examine the impact of plausible departure from the IV assumptions, and identification issues of multiple IVs for multiple/one treatments; (8) multilevel data for treatment evaluation in multilevel experimental designs and observational designs, and spillover effects; (9) nonignorable missingness and informative censoring issues.
Spatial Regression Analysis
(MACS 55000). Luc Anselin, M/W 1:30-2:50 p.m. PQ: Graduate level econometrics or multivariate regression, matrix algebra.
This course covers statistical and econometric methods specifically geared to the problems of spatial dependence and spatial heterogeneity in cross-sectional data. The main objective of the course is to gain insight into the scope of spatial regression methods, to be able to apply them in an empirical setting, and to properly interpret the results of spatial regression analysis. While the focus is on spatial aspects, the types of methods covered have general validity in statistical practice. The course covers the specification of spatial regression models in order to incorporate spatial dependence and spatial heterogeneity, as well as different estimation methods and specification tests to detect the presence of spatial autocorrelation and spatial heterogeneity. Special attention is paid to the application to spatial models of generic statistical paradigms, such as Maximum Likelihood, Generalized Method of Moments and the Bayesian perspective. An important aspect of the course is the application of open source software tools such as R, GeoDa and PySAL to solve empirical problems.
Computational Social Science Workshop
(MACS 50000). James Evans, Thursdays 11-12:20 p.m. Saieh 247. PQ: Computation students must register for an R. Other faculty and graduate students welcome.
High performance and cloud computing, massive digital traces of human behavior from ubiquitous sensors, and a growing suite of efficient model estimation, machine learning and simulation tools are not just extending classical social science inquiry, but transforming it to pose novel questions at larger and smaller scales. The Computational Social Science (CSS) Workshop is a weekly event that features this work, highlights associated skills and data, and explores the use of CSS in the world. The CSS Workshop alternates weekly between research workshops and professional workshops. The research workshops feature new CSS work from top faculty and advanced graduate students from UChicago and around the world, while professional workshops highlight useful skills and data (e.g., machine learning with Python’s scikit-learn; the Twitter firehose API) and showcase practitioners using CSS in the government, industry and nonprofit sectors. Each quarter, the CSS Workshop also hosts a distinguished lecture, debate and dinner, and a student conference.
Machine Learning for Public Policy
(CAPP 30254). Rayid Ghani, Days/Times TBD. PQ: PPHA 31100 or PPHA 31300 and CAPP 30122 or PPHA 30550.
This course will be an introduction to machine learning and how it can be applied to public policy problems. It’s designed for students who are interested in learning how to use modern, scalable, computational data analysis methods and tools, and apply them to social and policy problems. This course will teach students: what role machine learning can play in designing, implementing, evaluating, and improving public policy; machine learning methods and tools; how to solve policy problems using machine learning methods and tools. This is a hands-on course where students will be expected to use Python (as well as other computational tools) to implement solutions to various policy problems. We will cover supervised and unsupervised learning algorithms and will learn how to use them with data from a variety of public policy problems in areas such as education, public health, sustainability, economic development, and public safety.
Databases for Public Policy
(CAPP 30235). Aaron Elmore, Days/Times TBD. PQ: CAPP 30122.
The course will cover the foundations of Database Management Systems (DBMS). This includes data models, database design, SQL, core database system components (e.g. transactions, recovery, query processing), distributed databases, NewSQL/NoSQL, and systems for data analytics (e.g. column-oriented databases, data warehouses). The goals for this class are for you to have the ability to model and design a database, an understanding of the core components of a database management system, the ability to write SQL, and an understanding of the differences between databases and data models.
Computer Science with Applications-3
(CAPP 30123). Matthew Wachs, Days/Times TBD.
This three-quarter sequence teaches computational thinking and skills to students who are majoring in the sciences, mathematics, and economics. Lectures cover topics in (1) programming, such as recursion, abstract data types, and processing data; (2) computer science, such as clustering methods, event-driven simulation, and theory of computation; and to a lesser extent (3) numerical computation, such as approximating functions and their derivatives and integrals, solving systems of linear equations, and simple Monte Carlo techniques. Applications from a wide variety of fields serve both as examples in lectures and as the basis for programming assignments. In recent offerings, students have written programs to evaluate betting strategies, determine the number of machines needed at a polling place, and predict the size of extinct marsupials. Students learn Java, Python, R and C++.
Statistical Theory and Methods - 2
(STAT 24500). Chao Gao, Days/Times TBD. PQ: STAT 24400 w/ grade of B- or better, or STAT 24410, w/ grade of C+ or better; and MATH 19620 or 20250 or 25500 or 25800 or STAT 24300.
This course is the second quarter of a two-quarter systematic introduction to the principles and techniques of statistics, as well as to practical considerations in the analysis of data, with emphasis on the analysis of experimental data. This course continues from either STAT 24400 or STAT 24410 and covers statistical methodology, including the analysis of variance, regression, correlation, and some multivariate analysis. Some principles of data analysis are introduced, and an attempt is made to present the analysis of variance and regression in a unified framework. Statistical software is used.
(STAT 28000). Lek-Heng Lim, Days/Times TBD.
This is an introductory course on optimization that will cover the rudiments of unconstrained and constrained optimization of a real-valued multivariate function. The focus is on the settings where this function is, respectively, linear, quadratic, convex, or differentiable. Time permitting, topics such as nonsmooth, integer, vector, and dynamic optimization may be briefly addressed. Materials will include basic duality theory, optimality conditions, and intractability results, as well as algorithms and applications.
Analysis in Rn-1
(MATH 20300). Daniil Rudenko, Days/Times TBD. PQ: MATH 16300 or MATH 15910 or MATH 15900 or MATH 19900.
For students concentrating in Computational Economics with no prior exposure to Real Analysis. Both theoretical and problem solving aspects of multivariable calculus are treated carefully. This course covers the construction of the real numbers, the topology of R^n including the Bolzano-Weierstrass and Heine-Borel theorems, and a detailed treatment of abstract metric spaces, including convergence and completeness, compact sets, continuous mappings, and more.
Analysis in Rn-2
(MATH 20400). Marco Mendez Guaraco, Days/Times TBD. PQ: MATH 20700 OR MATH 20300 AND MATH 20250 or STAT 24300.
For students concentrating in Computational Economics who have taken MATH 20300 or who have prior exposure to Real Analysis. This course covers differentiation in R^n including partial derivatives, gradients, the total derivative, the Chain Rule, optimization problems, vector-valued functions, and the Inverse and Implicit Function Theorems.
Analysis in Rn-3
(MATH 20500). Marco Mendez Guaraco, Days/Times TBD. PQ: MATH 20400 or MATH 20800.
For students concentrating in Computational Economics with excellent exposure to Real Analysis. This course covers integration in R^n including Fubini's Theorem and iterated integration, line and surface integrals, differential forms, and the theorems of Green, Gauss, and Stokes.
(MPCS 55001). Geraldine Brady, Days/Times TBD. PQ: immersion math (MPCS 50103) or placement. Immersion programming (MPCS 50101) or programming waiver, or core programming (MPCS 51036 or 51040), or instructor consent.
The course is an introduction to the design and analysis of efficient algorithms, with emphasis on developing techniques for the design and rigorous analysis of algorithms rather than on implementation. Algorithmic problems include sorting and searching, discrete optimization, and algorithmic graph theory. Design techniques include divide-and-conquer methods, dynamic programming, greedy methods, graph search, as well as the design of efficient data structures. Methods of algorithm analysis include asymptotic notation, evaluation of recurrences, and the concepts of polynomial-time algorithms. NP-completeness is introduced toward the end of the course. Students who complete the course will have demonstrated the ability to use divide-and-conquer methods, dynamic programming methods, and greedy methods when an algorithmic design problem calls for such a method. They will have learned the design strategies employed by the major sorting algorithms and the major graph algorithms, and will have demonstrated the ability to use these design strategies or modify such algorithms to solve algorithmic problems when appropriate. They will have derived and solved recurrences describing the performance of divide-and-conquer algorithms, have analyzed the time and space complexity of dynamic programming algorithms, and have analyzed the efficiency of the major graph algorithms, using asymptotic analysis.
(MPCS 53001). Zachary Freeman, Days/Times TBD. PQ: MPCS 51036 or 51040 or 51100 (completed or concurrently enrolled). Non-MPCS students must meet prerequisites and complete Course Request Form.
Students will learn database design and development and will build a simple but complete web application powered by a relational database. We start by showing how to model relational databases using the prevailing technique for conceptual modeling -- Entity-Relationship Diagrams (ERD). Concepts covered include entity sets and relationships, entity key as a unique identifier for each object in an entity set, one-one, many-one, and many-many relationships as well as translational rules from conceptual modeling (ERD) to relational table definitions. We also examine the relational model and functional dependencies and their application to the methods for improving database design: normal forms and normalization. After design and modeling, students will learn the universal language of relational databases: SQL (Structured Query Language). We start by introducing relational algebra -- the theoretical foundation of SQL. Then we examine in detail the two aspects of SQL: data definition language (DDL) and the data manipulation language (DML). Concepts covered include subqueries (correlated and uncorrelated), aggregation, various types of joins including outer joins and syntax alternatives. Students will gain significant experience with writing and reading SQL queries throughout the course in the detailed discussions in class, online homework, and the real-world individual project.
(PLSC 30600). Justin Grimmer, Days/Times TBD.
Computational Approaches to Cognitive Neuroscience
(PSYC 34410). Nicholas Hatsopoulos, Days/Times TBD. PQ: BIOS 24231 or CPNS 33100.
This course is concerned with the relationship of the nervous system to higher order behaviors (e.g., perception, object recognition, action, attention, learning, memory, and decision making). Psychophysical, functional imaging, and electrophysiological methods are introduced. Mathematical and statistical methods (e.g. neural networks and algorithms for studying neural encoding in individual neurons and decoding in populations of neurons) are discussed. Weekly lab sections allow students to program cognitive neuroscientific experiments and simulations.
(CMSC 35400). Imre Kondor, T/Th 3-4:20 p.m. PQ: Instructor consent.
This course provides hands-on experience with a range of contemporary machine learning algorithms, as well as an introduction to the theoretical aspects of the subject. Topics covered include: the PAC framework, Bayesian learning, graphical models, clustering, dimensionality reduction, kernel methods including SVMs, matrix completion, neural networks, and an introduction to statistical learning theory.
Machine Learning and Large Scale Data Analysis
(CMSC 25025). John Lafferty, T/Th 1:30-2:50 p.m. & Weekly Lab Wednesdays 1:30-2:50, 3-4:20, or 4:30-5:50 p.m. PQ: CMSC 15400 or CMSC 12200 and STAT 22000 or STAT 23400, or by consent.
This course is an introduction to machine learning and the analysis of large data sets using distributed computation and storage infrastructure. Basic machine learning methodology and relevant statistical theory will be presented in lectures. Homework exercises will give students hands-on experience with the methods on different types of data. Methods include algorithms for clustering, binary classification, and hierarchical Bayesian modeling. Data types include images, archives of scientific articles, online ad clickthrough logs, and public records of the City of Chicago. Programming will be based on Python and R, but previous exposure to these languages is not assumed.
Applications of Hierarchical Linear Models
(SOCI 30112). Stephen Raudenbush, Days/Times TBD.
A number of diverse methodological problems such as correlates of change, analysis of multi-level data, and certain aspects of meta-analysis share a common feature--a hierarchical structure. The hierarchical linear model offers a promising approach to analyzing data in these situations. This course will survey the methodological literature in this area, and demonstrate how the hierarchical linear model can be applied to a range of problems.
Time Series Analysis and Stochastic Processes
(MPCS 58020). Andrew Siegel, Monday 5:30-8:30 p.m. PQ: MPCS 51036 or 51040 or 51100. Non-MPCS students must meet prerequisites and complete Course Request Form.
Stochastic processes are driven by random events. They can be used to model phenomena in a broad range of disciplines, including science/engineering (e.g. computational physics, chemistry, and biology), business/finance (e.g. investment models and operations research), and computer systems (e.g. client/server workloads and resilience modeling). In many cases relatively simple stochastic simulations can provide estimates for problems that are difficult or impossible to model with closed-form equations. In this class we focus on the rudimentary ideas and techniques that underlie stochastic time series analysis, discrete-event modeling, and Monte Carlo simulations. Course lectures will focus on the basic principles of probability theory, their efficient implementation on modern computers, and examples of their application to real-world problems. Upon completion of the course, students should have an adequate background to quickly learn specific Monte Carlo approaches in depth in their chosen field of interest.
Modeling and Signal Analysis for Neuroscientists
(CPNS 32111). Wim Van Drongelen, Days/Times TBD. PQ: BIOS 26210, BIOS 26211 or instructor approval.
The course provides an introduction to signal analysis and modeling for neuroscientists. We cover linear and nonlinear techniques and model both single neurons and neuronal networks. The goal is to provide students with the mathematical background to understand the literature in this field and the principles of analysis and simulation software, and to allow them to construct their own tools. Several of the 90-minute lectures include demonstrations and/or exercises in Matlab.
Genes and Environment in Language and Cognitive Development
(PSYC 42052). Daniel Yurovsky and Susan Levine, Days/Times TBD.
Children show tremendous variability in how quickly and how well they learn their native language. Where does this variability come from? We’ll explore both genetic and environmental contributions to language and cognitive development, aiming for an integrative understanding that moves beyond debates about nature and nurture. Readings will include work in behavioral genetics, environmental plasticity, niche inheritance, and cultural evolution and transmission.