02 : About me

Introduction

Hello, my name is Yang Xiao. Welcome, and thank you for visiting my personal page.

I graduated from Baruch College with an MS in Statistics in December 2020. I am a self-motivated data person who wants to apply what I learned in college to real-life problems, whether to help people or to create value for a company. My skills include data visualization, hypothesis testing, model building, validation, and selection, NLP, and machine learning techniques.

I like cooking. I usually search online for an interesting recipe and spend hours completing it, which gives me a sense of accomplishment. I am also happy to share the food I make with others (most of it tastes great...).

So I am going to show you the projects I have done at school, at work, and in my own practice, and I will also show you my cooking skills! (The page is still under construction... 😉)

Work Experience

Sep 2020 -- Present

Data Scientist - Prospect 33

I am working as a data scientist on an email project. The project aims to make it easier for managers to gain deep insight into their email data and to manage their meeting invitations.

  • Fetched email data from company accounts daily and automatically via the Gmail API.
  • Updated invitation information daily and automatically via the Calendar API.
  • Used NLP techniques to parse the email data and saved the results to a CRM database.

May 2020 -- Nov 2020

Data Scientist Intern - COVID-19 Simulator - Prospect 33

Engaged as a Data Scientist (Internship) to support the development of an AI-driven modeling tool that helps predict COVID-19 spread rates throughout the US. The Covid County Simulator (covidcountysim.org) was developed to enable decision-makers responsible for local response to make sound, data-driven policies, and to give citizens a tool for drawing attention to potential major outbreaks in their communities. The tool's primary purpose is to run simulations of the infection in the community to better measure the impact of limiting or restricting movement.

  • Used Python to create data pipelines to extract and clean raw data from disparate sources.
  • Built an SIR model to predict and simulate the spread of COVID-19 in every county in the USA.
  • Tested several methods before settling on Ridge regression, which performed best.
  • Used Flask and Dash to design a dashboard for the model.

Jan 2017 -- Mar 2017

Data Analyst Intern - China DADI Insurance Co.

  • Calculated premiums and established payment methods for clients.
  • Learned new products and services and received technical assistance in developing new accounts.
  • Customized insurance programs for individual customers, covering a variety of risks.
  • Interviewed prospective clients to collect data about their financial resources and needs and about the physical condition of the person or property to be insured.

May 2016 -- Jul 2016

Data Analyst - Hunan Provincial Government

As air quality in Hunan province worsened because of pollution, our team took on a government contract to gather relevant data and identify the factors that contributed significantly to air pollution. We concluded the project by writing a report to the government.

  • Analyzed data, identified pollution contributors, and made predictions using SPSS and R.
  • Wrote reports and made suggestions to the government based on the analytical outcomes.

Education

Sep 2013 -- Jul 2017

BS in Statistics

Central South University

Jan 2019 -- Dec 2020

MS in Data Science

Baruch College

Jan 2021 -- Present

Self-learning

From work, books, and the internet

Language

English Fluent

Chinese Mandarin Native Language

Social Network