How to stand out with your portfolio? - DSBoost #5
Welcome to the fifth issue of DSBoost, the weekly newsletter where you can discover interesting people in the ML/AI world, get the main takeaways of a relevant podcast, and stay up to date with the latest news in the field!
💬 Interview of the week
This week we interviewed Brian, who is a Data Science and Analytics leader. Enjoy:
What did you study/are you studying (if your background is different from DS, how did you end up in the field)?
I have over 20 years of Product Management experience and recently returned to school to get a BS in Analytics.
What are your favorite resource sites and books (ML/AI)?
What got you into your current role (portfolio, certification, etc.)?
I lobbied our CTO to let me start an Analytics and Data Science team. We're now about 20 people and cover those functions plus Data Engineering.
What do you enjoy the most in your work?
Delighting people with data/insights in ways they didn't think were possible or didn't have the skills to extract independently.
What tools do you use the most / favorite tools?
I'm a fan of VSCode as an everyday coding environment. My tech stack is Snowflake, Looker, Astronomer, dbt, Fivetran, and AWS.
Do you use ChatGPT or other AI tools during your work? If so, how do they help you? Do they change your approach to problems?
I've been using ChatGPT to summarize bodies of text and to help form interview questions/answers for Data Scientists. It's fun to come up with a question (behavioral or technical) and compare ChatGPT's answer with the candidates'.
I also use GitHub Copilot a lot. It helps speed up coding simple routines that aren't proprietary but necessary.
What is your favorite topic within the field?
I enjoy building classifiers. There is something awesome about delivering a solution that helps people make decisions.
For example - which site visitors will most likely purchase / convert?
Which one of the recent AI/ML models will have the most significant impact on the industry in your opinion?
It's hard to say right now - so much is happening in AI, and a lot will change in 12 months. I'm very impressed by text-to-image/video right now. There is a potentially massive impact on the Media & Entertainment industry where we will see significant portions of the content we watch AI-generated.
What is the biggest mistake you've made? (preferably DS related)
I spent the first year of my DS learning taking the wrong courses. They were probably great courses, but they were master's-level coursework that went deep into the math without really covering the fundamentals.
Learn the fundamentals of why an algorithm does what it does. Don't worry about the math at first; learn that later.
You have excellent projects on your blog. What would you suggest to beginners when they are building their portfolio? How to stand out?
You can start with simple projects on well-known datasets (Titanic, Iris, Yelp, etc.) to get used to the mechanics.
The greatest learning comes when you need to find your own dataset, figure out how to extract it, automate the process, and prepare the data for Machine Learning or NLP.
Get used to GitHub and the workflow. Use it with every project and don't skimp. First of all, you’re going to use Git every day in your professional life. Second, it helps you showcase your work.
Can you share a fun fact about yourself?
I taught myself how to program and started a company, which was acquired when I was 25. That’s how I got into Product Management.
🎙️ Podcast of the week
A data storyteller is someone who uses data to tell a story, which can include visuals, graphics, charts, and other forms of data representation.
Color is an essential component of data visualization and storytelling, and should be used intentionally and purposefully, rather than randomly or arbitrarily.
Using a limited color palette can create a cohesive design and prevent overwhelming the viewer.
Color contrast is important for readability and accessibility, particularly for those with visual impairments.
Different colors can have cultural or symbolic associations that should be considered when choosing color schemes.
Experimenting with different color combinations and testing them with the audience can help determine what works best.
Canva, Datawrapper, Tableau, Power BI, Qlik, R, Python, and Excel are tools that can be used for data visualization.
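To make the takeaway about color contrast a bit more concrete, here is a minimal Python sketch of how a contrast check could look. It assumes the WCAG 2.x definition of contrast ratio and its 4.5:1 threshold for normal-size text (AA level); neither comes from the podcast itself, and the function names are made up for illustration.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel (0-255) to linear light, per WCAG 2.x."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb' (0 = black, 1 = white)."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio between two colors, ranging from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def readable(foreground: str, background: str, threshold: float = 4.5) -> bool:
    """True if the pair meets the assumed threshold (4.5:1 is WCAG AA for normal text)."""
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    # Compare a few label colors against a white chart background.
    for label_color in ("#000000", "#767676", "#777777", "#ffcc00"):
        ratio = contrast_ratio(label_color, "#ffffff")
        print(f"{label_color} on white: {ratio:.2f}:1, readable={readable(label_color, '#ffffff')}")
```

A check like this can be run over a chart's palette before publishing, so low-contrast label or line colors are caught early rather than by the audience.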
🧵 Featured threads
🤖 What happened this week?
Microsoft Research has released BioGPT, an LLM trained on biomedical research literature. It achieves better-than-human performance on answering biomedical-related questions.
OpenAI is set to release Foundry, a new developer program that will allow customers to run OpenAI model inference at scale with dedicated capacity.
OpenAI suggests in this article that its systems are getting closer to AGI, so it needs to be increasingly cautious with the creation and deployment of its models.
👥 Under the radar
Here are a few words from Zeng personally:
Hey there, I'm Zeng and I'm an AI artist which basically means I create amazing art using the power of artificial intelligence. My passion for art led me to launch femmestock, a monthly subscription service for feminine AI images that are perfect for social media. With femmestock, I'm on a mission to bring beauty and creativity to the digital world. I'm also the co-creator of the WhoWhatWhyAi newsletter with my partner Brian, where we explore the fascinating and ever-evolving world of AI. When I'm not busy creating art or curating the latest AI trends, you can find me learning no-code tools and working on different no-code projects. Let's connect and explore the world of AI together.