I am an M.S.E. student in Data Science in the Department of Applied Mathematics and Statistics (AMS) at the Whiting School of Engineering, Johns Hopkins University. My primary interests lie in Graph Neural Networks (GNNs) and Large Language Models (LLMs). I have hands-on experience with Python, TensorFlow, and PyTorch, and have worked on several projects involving Time Series Analysis, LLMs, and GNNs.
Currently, I am actively seeking full-time opportunities in the fields of Time Series Analysis, LLMs, GNNs, AI, and Data Science, where I can apply my knowledge and skills to real-world challenges. I am particularly interested in roles that allow me to contribute to the development and deployment of advanced AI models. Please feel free to connect with me via email for job opportunities.
📖 Education
- Sep 2023 - May 2025 (Expected), M.S.E. in Data Science, Johns Hopkins University, Baltimore, MD, U.S.A.
- Sep 2018 - June 2023, Bachelor in Data Science, Drexel University, Philadelphia, PA, U.S.A. *
- Sep 2018 - June 2022, Bachelor in Computer Science, Lanzhou University, Gansu, China *
* I participated in a joint cooperative program and hold bachelor’s degrees from both Lanzhou University and Drexel University.
💻 Work Experience
Health-Union, LLC | Philadelphia, PA
Sep 2022 - Mar 2023 | Data Co-op
- Responsible for data management and data acquisition using Snowflake.
- Conducted data monitoring, identified the cause of abnormal data, and found corresponding solutions.
- Applied Temporal Convolutional Network (TCN) models to accurately predict the click-through rates on the Health-Union website, enhancing both model performance and predictive precision.
AchievAI Technology Co., Ltd. | Guangdong, China
Dec 2020 - Sep 2022 | Python Programmer (90% remote)
- Developed algorithms for image processing and machine learning, successfully extracting precise annual ring data from noisy wood texture images, enhancing accuracy and robustness.
- Conducted research on generating high-quality wood texture images with GANs.
- Developed a Python plugin on Linux for converting 3D wood models (.stl) to high-resolution .tiff images.
Newland Digital Technology Co., Ltd. | Fujian, China
Aug 2020 - Sep 2020 | Python Programmer
- Conducted market research on community security systems, integrating big data analysis to produce a comprehensive report.
- Developed Python web crawlers using Requests and Beautiful Soup, processing data from over 1,000 web pages for efficient data collection and storage.
- Implemented automated techniques to work around anti-scraping measures, including user behavior simulation and dynamic IP proxies, improving data extraction accuracy and stability.
🎖 Technical Skills
- Languages: Python, C, SQL, MATLAB, R, JavaScript, PHP, Bash, Lisp
- Frameworks: PyTorch, TensorFlow, LangChain, scikit-learn, OpenCV, NetworkX, Selenium
- Deep Learning Models: LLMs, GRNNs, Graph Transformer, GAN, ResNet, TCN, LSTM
- Machine Learning: Support Vector Machines, Naive Bayes, Decision Trees, Logistic Regression
📝 Publications
- Earthquake Magnitude Estimation Using Gated Graph Recurrent Neural Networks, Yantian Ding, Luana Ruiz, ICASSP 2025 (Under Review)
- HIV-AICare: A Domain Knowledge-guided Reinforcement Learning Approach for Optimizing Antiretroviral Therapy in People with HIV, Dapeng Yao, Wei Jin, Yao Zhao, Luis Parra-Rodriguez, Jane O’Halloran, Raha Dastgheyb, Zhengling Qi, Yantian Ding, David B. Hanna, Andrea Norcini-Pala, Amanda B. Spence…, Nature Medicine (Under Review)
- Phonocardiogram (PCG) Murmur Detection Based on the Mean Teacher Method, Yi Luo, Zuoming Fu, Yantian Ding, Xiaojian Chen, Vishal Patel, Kai Ding, Sensors (Accepted)
- Ground Based-cloud Classification based on Comparing Different Classification Models, Tao Zhang, Yantian Ding, Wangjiang Gong, Zeyu Hou, CONF-CDS 2022
🎙 Research Projects
HIV-AICare Web Chatbot and Website Development | Research at Johns Hopkins University
May 2024 - Current | Advised by Prof. Yanxun Xu
- Co-author of the paper “HIV-AICare: A Domain Knowledge-guided Reinforcement Learning Approach for Optimizing Antiretroviral Therapy in People with HIV” (Under Review).
- Developed a chatbot for the HIV-AICare website based on MedAlpaca and retrieval-augmented generation (RAG), helping users interpret the web model’s calculation results and navigate the platform.
- Wrote Python scripts to gather and process project data, using Selenium to automate interactions with web pages.
Earthquake Dataset for Graph Neural Network | Research at Johns Hopkins University
Oct 2023 - Current | Advised by Prof. Luana Ruiz
- First author of the paper “Earthquake Magnitude Estimation Using Gated Graph Recurrent Neural Networks” (Under Review).
- Built and optimized a seismic time series dataset (approximately 295 GB) from New Zealand’s GeoNet, and designed and implemented the preprocessing workflows (data cleaning, graph structure construction, and data format conversion) needed to meet the input requirements of GRNN models.
- Refactored the GRNN code using PyTorch Geometric (PyG), improving performance and efficiency and achieving a roughly 20% increase in accuracy compared to a traditional Graph Transformer.
CALEX Extraction System | Graduation Project at Drexel University
Sep 2022 - Jun 2023 | Advised by Prof. Hegler Tissot
- Managed the development and coding of the four-layer CALEX Extraction System, enhancing the processing of natural language texts for temporal data annotation.
- Assisted in dataset annotation, focusing on accuracy and efficiency to support applications in various domains.
- Utilized rule-based programming to standardize temporal expressions, improving the system’s scalability and reliability across its four layers.
Traffic Accident Prediction based on Neural Network | Graduation Project at Lanzhou University
Dec 2021 - Jun 2022 | Advised by Prof. Longjie Li
- Built a graph residual neural network with an attention mechanism, achieving a prediction accuracy of 92.42% on unseen data.
- Collected and collated accident, speed, and road data for Brooklyn from NYC Open Data, NOAA, Uber Movement, and OpenStreetMap.
- Developed a traffic accident severity forecasting model for the Brooklyn area in Python.
Ground Based-cloud Classification based on Comparing Different Classification Models
May 2021 - Aug 2021 | Advised by Prof. David Woodruff
- Co-first author of the paper “Ground Based-cloud Classification based on Comparing Different Classification Models”, in the Lecture Notes on Data Engineering and Communications Technologies book series (CONF-CDS 2022), accepted in March 2022.
- Preprocessed around 15,000 cloud images of 5 types into a common format and pixel size using LBP.
- Implemented four classification models (KSVM, MLP, Custom Vision, and ResNet) and compared their accuracy.
Raspberry Pi-Based Facial Recognition System
July 2020 - Apr 2021 | Advised by Prof. Qiming Liu
- Designed a compact facial recognition system equipped with a trained face recognition algorithm.
- Trained a machine learning model with high accuracy and fast response.
- Built and debugged the Raspberry Pi hardware system.
🏆 Honors and Awards
- A.J. Drexel Scholarship, Fall 2021
- Overseas Exchange Scholarship, Lanzhou University, 2021
- Overseas Exchange Scholarship, Lanzhou University, 2022