Qianyu Zheng
Undergraduate Student, Researcher (ML)
I primarily work on applying computational methods, specifically machine learning, to scientific discovery in the natural sciences. My current research focuses on leveraging large-scale data mining, machine learning, and high-performance computing in computational modeling to answer scientific questions and support data-driven decision making.
During my Bachelor's studies and internship, I have collaborated with researchers, scholars, and developers from Georgia Tech, the Leibniz Institute of Plant Biochemistry, and the University of Maryland.
Education
Aug. 2022 — May 2025 (Est.)
B.S. in Computer Science
Georgia Institute of Technology, Atlanta, GA
Current GPA: 4.00/4.00
Industry Research Experience
Summer 2024
Leibniz Institute of Plant Biochemistry, Halle (Saale), Germany
Research Scientist, Computational Chemistry
Developing and applying computational methods to study the structure and function of proteins and enzymes.
- Spearheaded protein engineering research, establishing scalable and visualizable strategies to investigate protein families of millions of sequences and extract meaningful insights via sequence similarity networks and graph analysis.
- Crafted biologically meaningful data-splitting strategies with clustering and evolutionary algorithms to support fair model evaluation and reliable research outcomes.
Academic Research Experience
Apr. 2023 — Now
Georgia Institute of Technology, Atlanta, GA
Undergraduate Researcher, Fung group, CSE
Advisor: Assoc. Prof. Victor Fung
Researching applications of Graph Neural Networks (GNNs) in materials science.
- Conduct independent research to design more stable machine learning force fields for molecular dynamics simulation.
- Obtained proficiency in GNNs, PyTorch, deep learning, and research methodologies.
Honors and Awards
2024
President's Undergraduate Research Award
For research project "Accurate and stable machine learning force field for crystal structures"
2022 - 2024
Faculty Honors
For academic excellence with a GPA of 4.0
Certificates
April 2024
Microsoft Office Specialist: Excel Expert
Microsoft, MO-201
Expertise in managing workbook options and settings, managing and formatting data, creating advanced formulas and macros, and managing advanced charts and tables in Excel.
April 2024
Microsoft Office Specialist: Excel Associate
Microsoft, MO-200
Expertise in managing worksheets and workbooks, managing data cells and ranges, managing tables and table data, performing operations with formulas and functions, and managing charts in Excel.
March 2024
AWS Certified Machine Learning - Specialty
Amazon Web Services, MLS-C01
Expertise in building, training, tuning, and deploying machine learning (ML) models on AWS.
March 2024
AWS Certified Cloud Practitioner
Amazon Web Services, CLF-C02
Foundational, high-level understanding of AWS Cloud, services, and terminology
Projects
August 2024 - Now
Designed a multimodal tool for flexible querying of human protein sequences in the UniProt database.
- Used the Llama 3 LLM to generate text queries as training data, and trained a CLIP-style model (BERT + ESM) with contrastive learning on protein sequence and user query embeddings.
- Developed a Flask application (Flask, HTML/CSS) deployed with AWS Fargate, ECR, and ECS. Now live at nl2prot.org.
- Gained experience in multimodal learning, LLM training, cloud computing, PyTorch, and deep learning.
August 2024 - Now
Participate in the Workout Of the Day (WOD) prediction project group at Data Science @ GT.
- Use Python to build data cleaning and feature engineering pipelines for downstream machine learning tasks.
- Leverage modern optimization libraries to design an automated hyperparameter search pipeline for modeling.
Jan. 2024 - May 2024
Developed a Machine Learning solution for audio classification in a group of 4.
- Used Python to preprocess audio data with signal processing and feature extraction (e.g., Mel spectrogram, STFT, chroma features).
- Fine-tuned large vision models (ResNet, ViT) for genre classification on Mel spectrograms.
Jan. 2024 - May 2024
Developed a diet management app in a group of 4 using Android Studio and Java.
- Designed a user-friendly interface for users to track their diet and nutrition.
- Implemented an MVVM architecture with design patterns, a Firebase backend, and unit testing.
- Gained experience in Git CI/CD pipeline, Android Studio, Java, and software engineering.
Oct. 2023 - Nov. 2023
Explored the applications of Singular Value Decomposition (SVD) in image compression and signal denoising.
Oct. 2023 - Dec. 2023
Implemented a pipeline to extract source data from various forms of scientific plots.
- Used YOLO and fine-tuned OCR to extract the data from the plots.
- Deployed with a Flask backend and an HTML/CSS frontend.
Feb. 2023
A tool for corporate PR teams to analyze the impact of tweets on stock prices.
- Trained a word2vec model for tweet analysis.
- Used GPT to generate sample tweets.
- Deployed with a Flask backend and Wix on Google Cloud.