Scalable big data systems are a central part of modern data science. This course will cover topics including design and use of parallel dataflow systems (MapReduce/Hadoop and Spark), scalable and parallel Python analytics frameworks, and cloud data systems (cloud storage, cloud-native data processing). A major component of this course is hands-on programming using scalable analytics tools and cloud resources such as Google Cloud and Azure Cloud.