Hi, I’m Xiaoya Lin (林小雅), an aspiring researcher currently pursuing my Bachelor of Electrical and Electronic Engineering (Year 3) at Nanyang Technological University. My research interests center on Generative Visual Forensics & Deepfake Detection, Image Restoration, and Visual Anomaly & Robust Generative Modeling.

You can find my CV here: LinXiaoya’s Curriculum Vitae.

Email / Github / LinkedIn / Wechat

🎓 Education

Nanyang Technological University, Singapore
Bachelor of Electrical and Electronic Engineering
Aug 2023 – May 2027 (Expected)

  • Specialization in Data Science and Machine Learning
  • Dean’s List (2024)
  • NTU Science & Engineering Undergraduate Scholarship Recipient

💼 Research & Work Experience

GlobalFoundries

Data Scientist Intern (May 2025 – Dec 2025)
Research Focus: Intelligent Data Engineering for Scalable Analysis and Visual ML

Project 1: Scalable Trace Data Pipeline & Compression for Semiconductor Analytics

  • Designed and optimized modular pipelines using AWS S3, Boto3, and PySpark for multi-month trace data retrieval and analysis.
  • Investigated and deployed Parquet + Snappy compression to improve computational efficiency while preserving data fidelity.
  • Achieved major scalability milestones: reduced extraction time from hours to 4–8 minutes, cut storage by up to 90%, and eliminated memory crashes on large-scale workloads.

Project 2: Trace-to-Image Machine Learning Pipeline

  • Conducted a literature survey and pilot study on time-series-to-image conversion and visual classification for semiconductor trace data.
  • Developed a multi-phase pipeline that converts segmented traces into images (e.g., Gramian Angular Fields, line charts), then trains an RNN baseline model with hyperparameter tuning via Optuna and KerasTuner.
  • Delivered a reproducible ML pipeline with ≥5% accuracy improvement, validated through internal presentations and journal deliverables.
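
For readers unfamiliar with the trace-to-image step, here is a minimal sketch of a Gramian Angular Summation Field in plain Python. It is illustrative only and is not the internship code: the function names and the sample trace are hypothetical, and a production pipeline would more likely use NumPy or a dedicated library.

```python
import math


def rescale_to_unit_interval(series):
    """Min-max rescale a sequence of numbers to the range [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant trace carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]


def gramian_angular_summation_field(series):
    """Return the GASF matrix G with G[i][j] = cos(phi_i + phi_j),
    where phi_i = arccos(x_i) and x_i is the rescaled sample."""
    scaled = rescale_to_unit_interval(series)
    # Clamp before acos to guard against floating-point drift outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical trace segment (assumption: any short numeric sequence works).
trace = [0.0, 1.0, 2.0, 1.0, 0.0]
gasf = gramian_angular_summation_field(trace)
```

The result is a symmetric square matrix with one row and column per sample, which can be rendered as a grayscale image and passed to a visual classifier.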

A*STAR

Healthcare Data Pre-Processing Research Intern (Jan 2025 – Apr 2025)

  • Preprocessed and anonymized facial healthcare datasets for AI validation
  • Applied feature preservation techniques aligned with model optimization
  • Supported research in compliance with national healthcare standards

Classbro (Shanghai DAOBI EdTech)

Data Science & ML Instructor (Jun 2024 – Dec 2024)

  • Developed and delivered lectures on core ML algorithms and SQL
  • Designed course content that contributed 10,000 SGD+ in growth

🧠 Skills & Technologies

  • Programming Languages: Python, R, C
  • Data Science Tools: Pandas, NumPy, Scikit-learn, TensorFlow, XGBoost
  • Cloud & Big Data: AWS S3, Boto3, PySpark
  • Databases: MySQL, MongoDB (NoSQL)
  • Software & Tools: Jupyter Lab, Figma
  • Certifications: Bloomberg Market Concepts, Coursera ML & Python modules

🔬 Research Projects

URECA Project – AlzCare Smart Watch (Aug 2024 – Present)

  • Leading AI wearable development for dementia care
  • Implementing behavioral monitoring, geofencing, and fall detection models
  • Engaging with SG Jamiyah & SG DementiaHub for user-centered validation
  • Applied ensemble ML models to Kaggle data
  • Achieved 91% accuracy via TensorFlow neural network optimization
  • Delivered business insights through structured analytics and visualization

EV Transition Database Analysis (Aug 2023 – Nov 2023)

  • Used MySQL & MongoDB to assess EV infrastructure in Singapore
  • Proposed strategic recommendations for sustainable transportation

🌱 Co-Curricular Activities & Leadership

  • NTU IET, Liaison Manager (Jul 2024 – Present)
  • NTU Investment Interactive Club, Member (Jul 2024 – Present)
  • CFLS-MUN, Academic Director (Dec 2019 – Dec 2021)
    • Chaired the North-East MUN Conference with 1,000+ attendees
    • Led delegations to World MUN and Yale MUN
    • Trained 100+ students in debate strategy and resolution writing