

Data changes who can do what! - DSBoost #30
🎙️ Podcast of the week
SDS 703: How Data Happened: A History, with Columbia Prof. Chris Wiggins
Key takeaways:
Technologies like cryptography, and by extension data science, have political implications in that they change the dynamics of power. However, "political" in this context doesn't necessarily relate to voting; it refers more broadly to the distribution and exercise of power.
For example, the ability to analyze large datasets can give organizations, governments, or even individuals capabilities that weren't possible before, such as predictive analytics, data-driven decision-making, or automation.
Data changes who can do what.
Including the humanities in data science is important:
As data science has a broader impact on society, understanding ethical implications is crucial.
Data scientists need to communicate their findings not just to technologists but also to a broader audience, which humanities can help improve.
Knowledge in humanities helps data scientists understand the broader societal and political implications of their work.
Ethical considerations are critical to ensure that the technology serves the broader good and isn’t affected by biases.
🧵 Featured content
Microsoft is bringing Python to Excel
The goal is to boost data analysis and visualizations.
Here are the possibilities:
Use Python libraries like Matplotlib and Seaborn to analyze and visualize data in Excel. Then refine the results with Excel's formulas and charts.
Type Python code into an Excel cell. Microsoft's cloud runs the calculations and sends the results back to your sheet.
No additional software or add-ons are needed to use this feature.
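To make this more concrete, here is a rough sketch of the kind of code that could go into a Python cell. It is written to run as plain Python, so the worksheet range is replaced by a small made-up DataFrame. Inside Excel, as far as we understand the feature, the cell is entered with `=PY(` and the range is read with the built-in `xl()` helper instead. The column names and numbers are purely illustrative.

```python
import pandas as pd

# Stand-in for a worksheet range (made-up data for illustration).
# Inside an Excel Python cell, this would instead look something like:
#   df = xl("A1:B5", headers=True)
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "sales": [120, 95, 130, 105],
})

# Total sales per region. In Excel, the value of the cell's last
# expression is what comes back to the sheet.
summary = df.groupby("region")["sales"].sum()
print(summary)
```

From there, the summary could be charted with Seaborn or Matplotlib, or handed back to regular Excel formulas.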
All you need to start ML:
(link to the tweet)
Design Patterns in ML:
(link to the tweet)
To elaborate on the idea above:
Traditional software engineering depends far less on data than machine learning does. Since the data changes over time, ML design patterns need to adapt with it.
In general software engineering, the input and output are usually well defined. Data, on the other hand, involves randomness and is less exact. Working with it is closer to exploration, so the output cannot be clearly defined in advance.
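A toy sketch can illustrate the point. It is not taken from the linked thread: the synthetic data, the one-number "model", and all function names below are assumptions made for illustration. A threshold is fitted on one batch of data and then scored on a fresh batch from the same distribution and on a batch whose distribution has shifted. The code stays the same, yet the quality of the output changes with the data.

```python
import random
import statistics


def make_data(rng, n, shift=0.0):
    """Draw (value, label) pairs from two Gaussians; `shift` moves both class means."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        mean = (2.0 if label else 0.0) + shift
        data.append((rng.gauss(mean, 1.0), label))
    return data


def fit_threshold(data):
    """'Train' by placing the decision boundary midway between the class means."""
    mean0 = statistics.mean(x for x, y in data if y == 0)
    mean1 = statistics.mean(x for x, y in data if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, data):
    """Share of points that fall on the correct side of the threshold."""
    correct = sum((x > threshold) == bool(y) for x, y in data)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
threshold = fit_threshold(make_data(rng, 2000))

same_dist = make_data(rng, 2000)           # new data, same distribution
drifted = make_data(rng, 2000, shift=2.0)  # new data after the distribution moved

acc_before = accuracy(threshold, same_dist)
acc_after = accuracy(threshold, drifted)
print(f"accuracy on similar data: {acc_before:.2f}")
print(f"accuracy after drift:     {acc_after:.2f}")
```

Note that there is no single "correct" accuracy to assert against here, only a score that has to be monitored over time. That is one reason ML systems tend to need patterns such as retraining and monitoring on top of the usual software ones.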