Yiqun Hu

Results-driven data scientist with extensive experience developing complex neural networks for applications including image recognition and natural language processing. I have worked across the finance and healthcare industries, adapting to each domain and delivering tailored solutions. Beyond data, I find exhilaration in rock climbing, an activity that hones my problem-solving and resilience – key assets for untangling complex challenges in data science.

LinkedIn  /  GitHub  /  Email

Education
Johns Hopkins University,  Baltimore, MD
   Master in Information Systems,  2021-2022
Northern Illinois University,   DeKalb, IL
Master in Statistics,   2019-2021
- Business Analytics and SAS Scholarship
- Dissertation: Traffic Fatality Rate Prediction Based on Deep Neural Network and Bayesian Neural Network

Bachelor in Mathematics,   2015-2019
- Major in Probability & Statistics, Minor in Economics
Work Experience
Data Scientist
Johns Hopkins University Center for Digital Health & Artificial Intelligence (CDHAI), Baltimore, MD
- Deep Learning & Machine Learning in Healthcare.
Teaching Assistant
Johns Hopkins University Carey Business School, Baltimore, MD
- Graduate-level Data Analytics course (Nov. 2022 - Jan. 2023), taught by Prof. Mohammad Yazdi.
- Graduate-level Cloud Computing with Hadoop course (Jan. 2023 - March 2023), taught by Prof. Minghong Xu.
- Graduate-level Machine Learning course (March 2023 - June 2023), taught by Prof. Gordon Gao.
Data Scientist
Ping An Technology, Shanghai, China
Supplemental Instruction Leader
- Math101 Fall 2018, supervised by Michael Mutersbaugh
Supplemental Instruction Leader
- Math103 Spring 2019, supervised by Jeanne Padilla
Patent
CN113190683A Enterprise ESG Index Determination Method Based on Clustering Technology and Related Product
Shizhuo Zhu, Xi Shao, Yiqun Hu
China Patent, 2021

The embodiments of this application provide a method that can distinguish original news from reprinted news among multiple articles about a news event, improving the accuracy of enterprise ESG scoring and the precision of investment decisions.

Awards & Achievements
  • AWS Certified Machine Learning - Specialty, Nov 2023
  • HackerRank Advanced SQL Certificate, Feb 2024
  • Business Analytics & SAS Scholarship by Northern Illinois University College of Business OM&IS Department
  • Society of Actuaries Exam P Passed, Credential ID: 832335, July 2020
  • 2017 CSEE Cup National College Students' Electrical Math Modeling Competition: Second Prize (Top 15%)
Projects
    Boston House Prices Regression using Random Forest Regressor and XGBRegressor (Python Keras, sklearn)
    Supervised by Prof. Graeme Warren (JHU), 2022
    colab

    Led a team of two to conduct descriptive and predictive analytics on factors that could contribute to the value of owner-occupied homes in Boston, using two regressors (Random Forest Regressor and XGBRegressor).

    Traffic Fatality Rate Prediction Based on Deep Neural Network and Bayesian Neural Network
    Yiqun Hu, supervised by Prof. Duchwan Ryu (NIU)
    Master Dissertation, 2021
    ProQuest

    The Bayesian Neural Network model performed better than a general Deep Neural Network model on traffic fatality rate prediction.

    Wine Quality Prediction with Statistical Learning (Python)
    Supervised by Prof. Lei (Larry) Hua (NIU), 2019

    Performed exploratory data analysis (EDA) in Python to explore the influence of 11 predictors on wine quality, using datasets from the UCI Machine Learning Repository.

    Predicted wine quality using Linear Regression, KNN, Random Forest, and Ridge and Lasso Regression, and optimized the final Random Forest model to improve prediction accuracy.

    Factors Affecting the Number of Prescription Drugs by Longitudinal Regression Analysis (SAS)
    Supervised by Prof. Chaoxiong (Michelle) Xia (NIU), 2018

    Led a team of 4 students in building models of the relationship between the number of prescription medicines and 15 independent variables, based on 4,000+ records from the LSOA II Wave 2 Survivor data (CDC).

    Used SAS to build multiple regression models (Binomial, Normal, Negative Binomial, Poisson, Forward Logit, Backward Logit) to find the best-fitting model, and presented the analyses in a 20-page paper.

    Monthly Milk Production Forecast by Time Series Analysis (R)
    Supervised by Prof. Lelys Bravo de Guenni (UIUC), 2018

    Consolidated and cleaned the datasets of monthly milk production (pounds per cow) from January 1962 to December 1975.

    Completed data visualization, regression model building, and residual analysis, then forecast production for the next 10 years with an ARIMA model.

    CN-ESG (Environmental, Social, Governance) Evaluation Framework
    Supervised by Dr. Shizhuo Zhu (NYU Shanghai), 2020 - 2021

    Collaborated with cross-functional teams to build a distinctive CN-ESG evaluation framework comprising 24 themes, 154 secondary indicators, and 43 industry indicators using a variety of Machine Learning techniques, and integrated the ESG scores into the FactSet dataset.

    Performed deep-learning sentiment analysis using BERT on over 20 million public opinion records and improved model performance through fine-tuning (F1 score increased by 3%).

    Developed RESTful APIs to deploy trained ML models into a Linux production environment (Python, Flask, CentOS).


    National Park Biodiversity Data Visualization using R Shiny (Course Project)
    Supervised by Prof. Mohammad Ali Alamdar Yazdi (JHU), 2022

    webpage | code
    Data visualization and statistics on the presence and status of species from park to park.

    Web development for Xiaomi™ Smart Home Appliances (Course Project)
    Supervised by Mr. Joseph Demasco (JHU), 2021

    code
    Designed, developed, and deployed a website for Xiaomi smart home appliances with WordPress, HTML, CSS, JavaScript, PHP, and SQL.

Volunteer Experience
Photojournalist, Northern Star Press


    © Yiqun Hu | Last Update: March 2024