From April 11, 2017

Why an Undergraduate Data Science Degree?

The job ‘Data Scientist’ was heralded as “The Sexiest Job of the 21st Century” by Harvard Business Review in 2012[1], at a crest of the ongoing publicity around the career fields associated with ‘big data.’ Articles on both the discipline and its reality regularly appear in a variety of popular press outlets, including The Economist[2] and The New York Times[3], alongside growing discussion in more scholarly venues. The increased need for this specialty is driven by the fact that human activity is already generating petabytes of data each day and “data is projected by some experts to increase by 2,000 percent between now and 2020”[4]. Society will need more professionals and researchers capable of competently dealing with the huge influx of data that will accumulate over the next decade and beyond.

All this is great, and certainly helps motivate the creation of an undergraduate degree in data science (the language above came from our internal proposal), but it’s not what actually inspired me to start the process. That came from two sessions at SIGCSE 2014[5]. The first was on a paper by Paul Anderson, James Bowring, Renee McCauley, George Pothering and Christopher Starr titled “An Undergraduate Degree in Data Science: Curriculum and a Decade of Implementation Experience” (DOI: http://dx.doi.org/10.1145/2538862.2538936, also linked on the resource pages). The other was a panel session, “Data Science as an Undergraduate Degree,” with Paul Anderson, James McGuffee and David Uminsky (DOI: http://dx.doi.org/10.1145/2538862.2538868). At these sessions I got to hear what the College of Charleston (Paul Anderson) and the University of San Francisco (David Uminsky) were doing with their undergraduate degrees. It sounded a lot like what Valparaiso University was already offering, with the exception of perhaps an introductory course in data science. Moreover, it sounded like exactly the sort of degree I wish I’d been able to take as an undergraduate!

However, being able to actually follow through and offer the program depended on several additional factors (besides the excitement). Before diving further into the process of actually creating the curriculum and its elements, I want to discuss what made Valpo ready to start a Data Science degree, so you can evaluate for yourself whether it’s even feasible…

Valparaiso University already had…

  • A large Mathematics & Statistics Department (14 tenured/tenure-track faculty, two full-time lecturers, and adjuncts).
  • Significant faculty experience in operations research, graph theory and scientific computing.
  • A deep statistics curriculum including an actuarial science major.
  • A complete computer science degree, covering all the basics.
  • A Mathematics & Statistics Department and a Computing & Information Sciences Department that had only recently split into two departments, and so still had very strong communication and ties.
  • A large, top-ranked college of engineering, which required more frequent offerings of mathematics and statistics electives (partially populated by engineers).
  • A master’s degree in Information Technology with 150-250 students, regularly offering courses in data mining and information management systems (databases).
  • A master’s degree in Analytics and Modeling, where many of the courses were cross-listed with undergraduate courses.

Together, these factors allowed Valpo to start the new degree with minimal curricular changes or additions, which is not something feasible at most schools. Now, you certainly don’t need all of these factors to start your own program, but you’ll probably at least need strong mathematics, statistics and computer science departments with good, clear communication between them. The rest just makes it easier.


[1] https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/

[2] 5,300 search hits for “Big Data” in the print edition of The Economist. www.economist.com

[3] 304 search hits for “Big Data” articles in the last 12 months of NY Times. www.nytimes.com

[4] http://www.wired.com/2015/01/a-new-generation-of-data-requires-next-generation-systems/

[5] SIGCSE refers to the Association for Computing Machinery (ACM)’s Special Interest Group on Computer Science Education. Specifically, SIGCSE is usually used to refer to the group’s annual Technical Symposium, typically held in early March.