About This Course
Welcome to the self-paced version of Mining of Massive Datasets!
The course is based on the text Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman, who by coincidence are also the instructors for the course. The book is published by Cambridge Univ. Press, but by arrangement with the publisher, you can download a free copy Here. The material in this on-ine course closely matches the content of the Stanford course CS246.
The major topics covered include: MapReduce systems and algorithms, Locality-sensitive hashing, Algorithms for data streams, PageRank and Web-link analysis, Frequent itemset analysis, Clustering, Computational advertising, Recommendation systems, Social-network graphs, Dimensionality reduction, and Machine-learning algorithms.
The course is intended for graduate students and advanced undergraduates in Computer Science. At a minimum, you should have had courses in Data structures, Algorithms, Database systems, Linear algebra, Multivariable calculus, and Statistics.
Jure is an associate professor of computer science at Stanford. His research area is mining of large social and information networks. He is the author of the Stanford Network Analysis Platform, a general-purpose network analysis and graph mining library. For more information, see his Home Page.
Anand is a serial entrepreneur, venture capitalist, and academic, based in Silicon Valley. He founded two successful startups, Junglee (acquired by Amazon) and Kosmix (acquired by Walmart). At Amazon, he was co-inventor of Mechanical Turk. Currently, he is a founding partner of Milliways Labs, an early-stage venture-capital firm. For more information, see his Blog, called "Datawocky".
Jeffrey D. Ullman
Jeff Ullman is a retired professor of Computer Science at Stanford. His Home Page offers additional information about the instructor.