Editor’s note: The Focus on Research column highlights different research projects and topics being explored at Penn State. Each column will feature the work of a different researcher from across all disciplines. The following originally appeared on The Conversation.
Big data is increasingly becoming part of everyday life. Network security companies use it to improve the accuracy of their intrusion detection services. Dating services use it to help clients find soulmates. It can enhance the efficiency and accuracy of fraud detection, in turn helping protect your personal finances.
“Big data” is a catchall term for any data set of exceedingly large volume. It could be transaction information at a credit card company, invoice data at an online retailer or meteorological measurements from a weather station. All these data sets have unique characteristics that make it extremely difficult to use conventional computing technologies and techniques to store and process them for analysis. Their variety is daunting, and high velocity is required to handle them in a timely manner.
Organizations in any field can use big data to enhance their effectiveness, which is why there are seemingly unlimited career opportunities in big data these days. The big data industry is expanding fast, with the market predicted to grow at a compound annual rate of 23.1 percent from 2014 to 2019.
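To get a feel for what a compound annual growth rate of 23.1 percent means, here is a rough back-of-the-envelope sketch in Python. It assumes five annual compounding steps between 2014 and 2019, which is one reading of the forecast period and not something the prediction itself spells out. The function name is made up for illustration.

```python
def compound_growth(rate: float, years: int) -> float:
    """Return the cumulative growth multiple after compounding `rate` for `years` periods."""
    return (1 + rate) ** years


CAGR = 0.231  # 23.1 percent, the rate quoted in the forecast
YEARS = 5     # assumption: five annual steps from 2014 to 2019

multiple = compound_growth(CAGR, YEARS)
print(f"Market size multiple after {YEARS} years: {multiple:.2f}x")
```

Under that assumption, the market would end the period at roughly 2.8 times its starting size.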
So who is going to store, manage and process all this information? Well, why not you? Companies are starved for people with this kind of expertise. Big data is a growth industry, and people from a variety of academic backgrounds can find successful careers in this area.
Many backgrounds lead to big data
But you didn’t major in “big data”? Don’t worry. Your academic background shouldn’t be an inhibiting factor when you start to contemplate becoming a big data professional.
People working in fields such as physics, bioinformatics, statistics, political science and psychology are already heavy users and analyzers of large amounts of data. The transition from these types of disciplines to big data analytics could be relatively smooth.
If your original education and training didn’t focus on data, that’s not necessarily a problem. Your own discipline-specific knowledge, insights and perspectives can be valuable when figuring out how to leverage big data in the most sensible way. The only catch is you need to be willing and able to acquire the technical skills necessary to either analyze or work with big data.
Types of jobs in this field
Despite the unique nature of each big data career, there are common categories of jobs or career paths.
The most fundamental of these focus on data infrastructure — how the data is actually housed and accessed. These infrastructure jobs involve developing and maintaining the necessary hardware and software. A cloud computing environment is especially well equipped to handle big data due to its scalable nature.
Big data management professionals work on top of the data infrastructure, populating it with data and manipulating that data. Conventional database management workers are natural candidates who could be trained quickly to work as big data management experts. They already have general database management knowledge, but they need to get up to speed on dealing with big data, which can be much more unstructured than what you find in a traditional database, where each record conforms to a certain structure in the form of data fields and types. Imagine a student record, with discrete first name and last name fields. Big data often doesn’t have this kind of nice organization: It can be as unstructured as a bunch of Twitter feeds or Facebook postings by millions of users.
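The contrast between a structured record and unstructured text can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the `StudentRecord` class, the sample name, the sample posts and the `extract_hashtags` helper are all invented, and real social media data would call for far more robust handling.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A structured record: every field has a fixed name and type."""
    first_name: str
    last_name: str


# Structured data: you can ask for a field directly by name.
record = StudentRecord(first_name="Ada", last_name="Lovelace")
print(record.last_name)

# Unstructured data: free-form posts with no fields at all (made-up samples).
posts = [
    "Just aced my stats midterm!!! #bigdata",
    "anyone know a good intro to machine learning? @friend",
]


def extract_hashtags(text: str) -> list[str]:
    """Pull hashtags out of free text; the structure has to be imposed by code."""
    return [word.strip(".,!?").lower() for word in text.split() if word.startswith("#")]


for post in posts:
    print(extract_hashtags(post))
```

With the student record, the last name is simply there for the asking. With the posts, even a basic question such as "what topics are mentioned?" requires writing code to dig the answer out of the text.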
Statisticians are essential in the big data industry. They’re the number crunchers who specialize in analyzing and interpreting the data. Statisticians use many advanced techniques that require years of training. They depend on the data infrastructure providers and data management workers to store and retrieve their source data for further processing.
Visualization specialists are also key in the big data industry. One of the most critical aspects of big data analytics is communicating the results of an analysis to decision-makers, who often lack expertise in data interpretation or statistics. Visualization empowers a layperson to understand the significance and implications of the numbers produced by a big data analytics effort. Think about being presented with a large set of numbers that you’re told indicate a changing climate. It’s a lot easier to understand the data’s significance when shown a graph with a sharp turn upwards, implying exponential growth.
Finally, machine learning experts focus on automating the statistical and visual interpretations of big data. Automation is critical, especially when the amount of data to be analyzed is beyond human capabilities — as is the case in most big data scenarios. Machine learning is based on self-learning algorithms. These computer programs autonomously enhance their own performance and accuracy through trial and error.
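For a taste of what learning through trial and error looks like in practice, here is a minimal sketch of a perceptron, one of the simplest self-learning algorithms. It is an illustrative stand-in chosen for this example, not a method the column singles out, and the toy task (learning the logical AND of two inputs), the function names and the settings are all assumptions.

```python
# Toy training data: the logical AND of two inputs (an illustrative choice).
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train(data, epochs=20, learning_rate=1):
    """Nudge the weights after every wrong guess (the perceptron update rule)."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in data:
            # The error is 0 when the guess was right, so nothing changes.
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)
predictions = [predict(weights, bias, inputs) for inputs, _ in TRAINING_DATA]
print(predictions)
```

The program starts out knowing nothing, guesses, and corrects itself each time it gets an answer wrong. After a handful of passes over the four examples it classifies all of them correctly, without anyone having written the rule for AND into the code.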
Many curricula available today through universities provide foundational knowledge in all the technical areas of big data. Students can eventually pick their specialty, which can be further honed in a graduate program.
Expected qualities and skills
One of the core qualities often found in big data professionals is willingness to learn. The big data landscape is dynamic and constantly requires continuing education. To survive in this environment, you should enjoy learning new skills and be unafraid of trying out novel technologies.
And the most successful big data worker isn’t just a numbers geek. People in this area also need to have a business mindset. Companies are always eager to leverage the information stemming from their big data analyses. They’re looking for people who naturally make connections between actionable information and what the companies are striving to accomplish in both the short and long term. If you aren’t interested in linking the two, your job security may eventually be at risk.
You could focus on any one of these areas of big data (data infrastructure, data management, statistics, visualization or machine learning) and become an expert. Another option is to become a generalist: you gain exposure to all these technical areas and, as a project manager, work with the specialists to solve any given problem.
As a university professor specializing in information sciences and technology, I encounter many students who discover their true passion for big data only during their senior year, while searching for jobs. By then, they’ve missed a golden opportunity to prepare themselves academically for this thriving profession. The earlier this epiphany comes before graduation, the better. But there’s nothing holding back grad students or adult learners from investing their time wisely and acquiring the necessary skills. This is especially true in big data analytics, where learning resources abound for both self-study and traditional education.
What are you waiting for? Start your journey today.
Jungwoo Ryoo is an associate professor of information sciences and technology at Penn State Altoona.