Welcome
What you’re reading is a lightly modified version of R for Data Science for the class Data Science for Linguists at the University of Pittsburgh. R4DS was originally written by Hadley Wickham and Garrett Grolemund (book site, GitHub). It was released under a Creative Commons BY-NC-ND 3.0 License, and this version is being made available under that same license.
Text in yellow boxes like this one are from Dan. 99.9% of the rest is by the original authors.
This is the website for “R for Data Science”. This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data.
This website is (and will always be) free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. If you’d like a physical copy of the book, you can order it from bookshop.org; it was published by O’Reilly in January 2017. If you’d like to give back please make a donation to Kākāpō Recovery: the kākāpō (which appears on the cover of R4DS) is a critically endangered native NZ parrot; there are only 213 left.
Please note that R4DS uses a Contributor Code of Conduct. By contributing to this book, you agree to abide by its terms.
Acknowledgements
R4DS is a collaborative effort and many people have contributed fixes and improvements via pull request: adi pradhan (@adidoit), Andrea Gilardi (@agila5), Ajay Deonarine (@ajay-d), @AlanFeder, pete (@alonzi), Alex (@ALShum), Andrew Landgraf (@andland), @andrewmacfarland, Michael Henry (@aviast), Mara Averick (@batpigandme), Brent Brewington (@bbrewington), Bill Behrman (@behrman), Ben Herbertson (@benherbertson), Ben Marwick (@benmarwick), Ben Steinberg (@bensteinberg), Brandon Greenwell (@bgreenwell), Brett Klamer (@bklamer), Christian Mongeau (@chrMongeau), Cooper Morris (@coopermor), Colin Gillespie (@csgillespie), Rademeyer Vermaak (@csrvermaak), Abhinav Singh (@curious-abhinav), Curtis Alexander (@curtisalexander), Christian G. Warden (@cwarden), Kenny Darrell (@darrkj), David Rubinger (@davidrubinger), David Clark (@DDClark), Derwin McGeary (@derwinmcgeary), Daniel Gromer (@dgromer), @djbirke, Devin Pastoor (@dpastoor), Julian During (@duju211), Dylan Cashman (@dylancashman), Dirk Eddelbuettel (@eddelbuettel), Edwin Thoen (@EdwinTh), Ahmed El-Gabbas (@elgabbas), Eric Watt (@ericwatt), Erik Erhardt (@erikerhardt), Etienne B. Racine (@etiennebr), Everett Robinson (@evjrob), Flemming Villalona (@flemingspace), Floris Vanderhaeghe (@florisvdh), Garrick Aden-Buie (@gadenbuie), Garrett Grolemund (@garrettgman), Josh Goldberg (@GoldbergData), bahadir cankardes (@gridgrad), Gustav W Delius (@gustavdelius), Hadley Wickham (@hadley), Hao Chen (@hao-trivago), Harris McGehee (@harrismcgehee), Hengni Cai (@hengnicai), Ian Sealy (@iansealy), Ian Lyttle (@ijlyttle), Ivan Krukov (@ivan-krukov), Jacob Kaplan (@jacobkap), Jazz Weisman (@jazzlw), John D. Storey (@jdstorey), Jeff Boichuk (@jeffboichuk), Gregory Jefferis (@jefferis), <U+848B><U+96E8><U+8499> (@JeldorPKU), Jennifer (Jenny) Bryan (@jennybc), Jen Ren (@jenren), Jeroen Janssens (@jeroenjanssens), Jim Hester (@jimhester), JJ Chen (@jjchern), Joanne Jang (@joannejang), John Sears (@johnsears), @jonathanflint, Jon Calder (@jonmcalder), Jonathan Page (@jonpage), Justinas Petuchovas (@jpetuchovas), Jose Roberto Ayala Solares (@jroberayalas), Julia Stewart Lowndes (@jules32), Sonja (@kaetschap), Kara Woo (@karawoo), Katrin Leinweber (@katrinleinweber), Karandeep Singh (@kdpsingh), Kyle Humphrey (@khumph), Kirill Sevastyanenko (@kirillseva), @koalabearski, Kirill Müller (@krlmlr), Noah Landesberg (@landesbergn), @lindbrook, Mauro Lepore (@maurolepore), Mark Beveridge (@mbeveridge), Matt Herman (@mfherman), Mine Cetinkaya-Rundel (@mine-cetinkaya-rundel), Matthew Hendrickson (@mjhendrickson), @MJMarshall, Mustafa Ascha (@mustafaascha), Nelson Areal (@nareal), Nate Olson (@nate-d-olson), Nathanael (@nateaff), Nick Clark (@nickclark1000), @nickelas, Nirmal Patel (@nirmalpatel), Nina Munkholt Jakobsen (@nmjakobsen), Jakub Nowosad (@Nowosad), Peter Hurford (@peterhurford), Patrick Kennedy (@pkq), Radu Grosu (@radugrosu), Ranae Dietzel (@Ranae), Robin Gertenbach (@rgertenbach), Richard Zijdeman (@rlzijdeman), Robin (@Robinlovelace), Emily Robinson (@robinsones), Rohan Alexander (@RohanAlexander), Romero Morais (@RomeroBarata), Albert Y. Kim (@rudeboybert), Saghir (@saghirb), Jonas (@sauercrowd), Robert Schuessler (@schuess), Seamus McKinsey (@seamus-mckinsey), @seanpwilliams, Luke Smith (@seasmith), Matthew Sedaghatfar (@sedaghatfar), Sebastian Kraus (@sekR4), Sam Firke (@sfirke), Shannon Ellis (@ShanEllis), @shoili, S’busiso Mkhondwane (@sibusiso16), @spirgel, Steven M. Mortimer (@StevenMMortimer), Stéphane Guillou (@stragu), Sergiusz Bleja (@svenski), Tal Galili (@talgalili), Tim Waterhouse (@timwaterhouse), TJ Mahr (@tjmahr), Thomas Klebel (@tklebel), Tom Prior (@tomjamesprior), Terence Teo (@tteo), Will Beasley (@wibeasley), @yahwes, Yihui Xie (@yihui), Yiming (Paul) Li (@yimingli), Hiroaki Yutani (@yutannihilation), @zeal626, Azza Ahmed (@zo0z)
R4DS is hosted by https://www.netlify.com as part of their support of open source software and communities.