What skills do data scientists need?
Data scientists act as translators, bridging the gap between the world of raw data and actionable business solutions. They possess a keen understanding of the data landscape within an organization. This empowers them to identify, acquire, and analyze the most relevant data for uncovering valuable insights. Their ability to explain findings and predictions to the business world supports informed decision making. They are skilled detectives, adept at using a variety of analytical and statistical techniques to unlock the hidden potential within data. These insights are then transformed into actionable roadmaps and strategies that directly address real-world business problems.
BY Shao-Fang (Pam) Wang

Building the Foundation: Data Acquisition and Analysis in Data Science
At their core, data scientists are data analysis professionals. Their fundamental skills are identifying and acquiring the data needed to answer business problems, then performing analyses and modeling to extract insights.
For example, imagine wanting to understand the revenue of a food delivery service app. Data scientists would identify relevant data points like usage patterns, order volume, average order cost, and customer satisfaction survey results. Data scientists pinpoint the specific data sources (tables, databases) holding the most accurate and complete information for each metric. They then preprocess the data by identifying and handling outliers, removing incorrect data, imputing missing values, and ensuring the data is in a format suitable for analysis.
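The preprocessing steps described above can be sketched in a few lines. This is only an illustrative sketch, not a prescribed method: the function name `clean_order_costs`, the choice of median imputation, the 1.5 × IQR rule for capping outliers, and the treatment of negative costs as incorrect data are all assumptions made for the example.

```python
from statistics import median, quantiles


def clean_order_costs(costs):
    """Return a cleaned copy of a list of order costs.

    Assumptions (hypothetical, for illustration only):
    - None marks a missing value; negative costs are treated as incorrect.
    - Missing or incorrect entries are imputed with the median of valid values.
    - Outliers are capped to the 1.5 * IQR fences rather than dropped.
    """
    # Keep only entries that are present and plausible.
    valid = [c for c in costs if c is not None and c >= 0]

    # Quartiles of the valid values define the outlier fences.
    q1, _, q3 = quantiles(valid, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    fill_value = median(valid)

    cleaned = []
    for c in costs:
        if c is None or c < 0:
            cleaned.append(fill_value)              # impute missing or incorrect data
        else:
            cleaned.append(min(max(c, low), high))  # cap extreme values
    return cleaned


raw_costs = [12.0, 15.0, None, 14.0, 13.0, 400.0, -5.0, 16.0]
cleaned_costs = clean_order_costs(raw_costs)
```

In practice the right handling depends on the metric: a very large order may be a genuine catering purchase rather than an error, so capping versus keeping is a judgment call made together with the business.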
If the organization doesn't have the necessary data for the analysis, data scientists collaborate with engineers and other teams to design effective data collection methods, such as collecting online user behavior data, conducting user surveys, or utilizing publicly available databases. This collaborative approach ensures the identification, acquisition, and analysis of the correct data for robust analysis.
Beyond data acquisition and preparation, data scientists possess the skills to analyze data and address business problems. This involves activities like exploratory analysis, building machine learning model prototypes, and applying a range of statistical techniques and tools. Their proficiency extends to programming languages (SQL, Python, R), statistical techniques, machine learning algorithms, and software tools (AWS, Tableau, Snowflake). They select among these tools deliberately to extract the desired insights and to keep results accurate and unbiased. Data scientists also understand the principles of experimentation, particularly A/B testing, which is commonly used within organizations. They can design experiments, facilitate their execution, and analyze the results.
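As one illustration of analyzing an A/B test, the sketch below compares conversion rates between two variants with a two-proportion z-test. The function name and the sample numbers are hypothetical, and the z-test is only one of several reasonable choices; it assumes large, independent samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / n_a: conversions and users in variant A (control).
    conv_b / n_b: conversions and users in variant B (treatment).
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 2,000 users per variant.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
significant = p_value < 0.05  # 0.05 is a conventional, not mandatory, threshold
```

The sample size and the significance threshold should be fixed when the experiment is designed, before any results are inspected.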
Through these combined skills, data scientists uncover patterns, predict future trends using machine learning algorithms, and employ statistical analysis to understand relationships between variables. Ultimately, they translate this knowledge into actionable insights that drive data-driven decision making.
Beyond the Data: Leveraging Business Acumen for Impact
In addition to technical skills, data scientists excel at translating real-world problems into actionable data analysis. This critical skill relies on a blend of technical expertise and business acumen. Data scientists actively listen to stakeholders to grasp the core business issue and its impact. Through this process, they build a comprehensive understanding of the industry, the company's specific challenges, and how data can best support and resolve those problems.
For example, "Why is revenue down?" is a common business problem. For data scientists, however, it lacks the specificity needed for actionable insights. To address this, they identify the core question behind the problem. This might involve proposing hypotheses that link the business problem to potential solutions. Using a food delivery service as an example, they might hypothesize, drawing on their business knowledge and past analyses, that customer churn or a decrease in order volume is a contributing factor.
Through discussion and analysis, data scientists can pinpoint the actual cause and transform broad business questions into specific, data-driven strategies. Instead of asking "Why is revenue down?", data scientists might reframe it as a more focused query: "Are there certain age groups in Taiwan whose meal order volume has decreased significantly over the past six months?"
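A focused question like this one maps naturally onto a small aggregation. The sketch below is a minimal illustration under stated assumptions: the record layout `(age_group, month, order_count)`, the function name, and the sample figures are all hypothetical, and the comparison assumes the periods before and after `split_month` are of equal length.

```python
from collections import defaultdict


def order_change_by_age_group(orders, split_month):
    """Relative change in order volume per age group.

    orders: iterable of (age_group, month, order_count), month as 'YYYY-MM'.
    split_month: months before it form the baseline period, the rest the
    recent period (the two periods are assumed to be equally long).
    Returns {age_group: (recent - baseline) / baseline}.
    """
    baseline = defaultdict(int)
    recent = defaultdict(int)
    for age_group, month, count in orders:
        # 'YYYY-MM' strings sort chronologically, so string comparison works.
        if month < split_month:
            baseline[age_group] += count
        else:
            recent[age_group] += count

    return {
        group: (recent[group] - total) / total
        for group, total in baseline.items()
        if total > 0  # skip groups with no baseline orders
    }


# Hypothetical monthly order counts for two age groups.
sample_orders = [
    ("18-24", "2023-12", 1000), ("18-24", "2024-01", 700),
    ("25-34", "2023-12", 1200), ("25-34", "2024-01", 1260),
]
changes = order_change_by_age_group(sample_orders, split_month="2024-01")
declining = [group for group, change in changes.items() if change < -0.10]
```

Whether a drop counts as significant is a separate statistical question; the 10% cutoff here is only a placeholder for that judgment.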
This targeted strategy allows data scientists to delve deeper into the Taiwanese market, for example, by analyzing meal order characteristics across different age groups. Their analysis may reveal a decline in orders from specific age groups. They can then further investigate how marketing campaigns can align with these customers' preferences and influence their app usage. Ultimately, the analysis can identify the most effective marketing campaigns for this target group and estimate the potential increase in order volume if the company adopts the recommended strategy. By transforming a broad "why" question into actionable steps, data scientists address the revenue downturn and pave the way for potential future revenue growth.
Data scientists act as skilled detectives when translating business questions. They use active listening to grasp the core issue presented by stakeholders. Clarifying questions get to the heart of the problem, uncovering hidden opportunities for improvement. This isn't a solo mission; through collaboration, data scientists work with other teams to refine the initial question, ensuring it aligns with everyone's expectations. Once a clear understanding is established, analytical thinking takes center stage. Here, the data scientist's problem-solving skills come into play, identifying the root cause of the business problem. Critical thinking allows them to evaluate potential solutions and their feasibility. Finally, they identify the metrics that will measure the success of the chosen solution, ensuring it effectively addresses the business challenge.
The Power of Data Storytelling: Communication Skills for Data Scientists
Effective communication skills are the cornerstone of a successful data scientist. While technical skills and business knowledge are essential, data scientists excel as service providers who leverage their analytical skills to generate actionable insights. Their work goes beyond simply generating results; they are also skilled storytellers, weaving data-driven insights into clear and concise messages for audiences with varying levels of technical expertise. This ability to translate complex analyses into compelling narratives that resonate with stakeholders and connect directly to real-world business objectives ensures stakeholders can take action on the insights and drive better outcomes for the organization.
Data scientists bridge the gap between the analytical world and other teams by explaining analyses and data in a way that ensures everyone is on the same page. This is crucial for two key reasons. First, clear communication fosters a shared understanding of the insights extracted from the data and their potential implications for the business. Second, it encourages valuable feedback from other teams. This collaborative exchange can significantly improve the analysis itself and ultimately, its real-world impact.
To effectively tell their data stories, data scientists rely heavily on strong data visualization skills. They need to translate complex data sets into clear and visually compelling formats, such as charts and graphs, to support the story and make their points. These visualizations not only enhance communication with audiences who may not have a strong technical background, but can also reveal hidden patterns and trends within the data itself, helping to explain complex concepts that are difficult to describe verbally. Critically, data visualization should not be misleading. Data scientists should not cherry-pick data to fit a predetermined narrative.
Imagine a data scientist working for a food delivery service who discovers a statistically significant link between delivery speed and customer satisfaction. While the two generally increase together, faster deliveries might improve satisfaction scores only up to a certain point. To better understand this nuanced relationship, the data scientist could create a scatter plot with satisfaction scores on the y-axis and delivery time on the x-axis. The plot would likely show satisfaction rising as delivery times shorten and then leveling off, indicating that beyond that point other factors influence satisfaction scores.
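Before drawing the scatter plot, a quick way to check for such a leveling-off is to average satisfaction within delivery-time bins. The sketch below is illustrative only: the function name, the 10-minute bin width, and the sample records are assumptions, not data from any real service.

```python
from statistics import mean


def mean_satisfaction_by_delivery_bin(records, bin_width=10):
    """Average satisfaction score per delivery-time bin.

    records: iterable of (delivery_minutes, satisfaction_score).
    Returns {bin_start_minutes: mean_score}, ordered by bin start.
    """
    bins = {}
    for minutes, score in records:
        # Bin start, e.g. 25 minutes falls into the 20-29 minute bin.
        start = int(minutes // bin_width) * bin_width
        bins.setdefault(start, []).append(score)
    return {start: mean(scores) for start, scores in sorted(bins.items())}


# Hypothetical (delivery time in minutes, satisfaction on a 1-5 scale) pairs.
sample_records = [
    (12, 4.8), (18, 4.7), (25, 4.7), (28, 4.6),
    (35, 4.0), (38, 3.8), (45, 3.1),
]
summary = mean_satisfaction_by_delivery_bin(sample_records)
```

If the averages for the fastest bins are nearly identical while slower bins drop off, that supports the reading that speeding up already-fast deliveries adds little satisfaction.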
In addition, the data scientist might find that a 20% improvement in delivery speed could lead to a 5% to 15% increase in satisfaction scores. Rather than simply reporting these numbers, the data scientist would translate them into a compelling narrative that demonstrates empathy for the customer perspective. This narrative would also address the potential impact on the business. For example, a primary reason customers utilize food delivery services is to save time. By reducing delivery times, the company is essentially returning valuable time to its customers, aligning directly with the company's core value. The analysis has already shown that this enhanced convenience is expected to boost customer satisfaction, which may ultimately lead to increased revenue and decreased customer churn. Further analysis could directly explore the relationship between these factors.
Furthermore, using data tables, the data scientist can break down the potential range of outcomes (5% to 15% increase) and the associated risks. They would then outline factors influencing the outcome, such as the likelihood of achieving a 15% increase versus a 5% increase. In addition, they would leverage past data and analysis to propose strategies to maximize the potential for a higher increase. This could involve recommending investments in additional delivery personnel or developing algorithms to optimize routes. By anticipating potential outcomes and proposing alternative strategies, the data scientist demonstrates a proactive and comprehensive understanding of the situation.
Effective communication goes beyond simply reporting numbers. Data scientists leverage data visualization, storytelling, and empathy to ensure everyone can interpret the findings, their implications, and potential actions. This collaborative approach is critical for achieving optimal outcomes based on data-driven insights.
Data Science Expertise: From Wrangling Data to Driving Decisions
Data scientists are a powerful blend of technical expertise and soft skills, enabling them to bridge the gap between the analytical world and the business landscape. They use analytical tools to dissect complex problems, acting as data wranglers who clean and manipulate massive datasets. Their problem-solving mentality, fueled by curiosity and a keen eye for detail, ensures accurate and insightful analysis. But data scientists aren't just number-crunchers; their business acumen allows them to translate real-world problems into actionable data questions and interpret results within a business context. Finally, their communication mastery empowers them to turn complex findings into clear narratives, fostering collaboration and driving data-driven decisions across all levels.
By mastering this unique blend of skills, data scientists become invaluable assets, bridging the gap between intricate data analysis and the practical world of research and business strategy.
Reference
✨Further Reading: If you would like to read the Chinese version, please refer to 《深藏不漏的數據科學家》. If you would like to know more about data science, please refer to "The secret behind Big Data-Data Science" and "The Indispensable Role of Data Science in Our Lives".