Course 1: Computational Methods and Machine Learning for Medical Research
This course focuses on advanced predictive modeling, from classical machine learning to deep learning, with a strong emphasis on medical applications.
- Regression vs. Classification
- Parametric, Non-Parametric, and Semi-Parametric Models
- Bias-Variance Tradeoff & Model Fitting (Overfitting, Underfitting)
- The Curse of Dimensionality
- Model Training, Testing, and Cross-Validation Strategies
- Model Accuracy Metrics (for classification and regression)
- Bootstrapping and Permutation Testing (for t-statistics, z-statistics)
- Regression Models: Linear/Multiple Linear, Polynomial/Splines, Regularization (Lasso, Ridge, Elastic Net)
- Classification Models: Logistic Regression, Support Vector Machines (SVMs), Decision Trees & Random Forest, Generalized Additive Models (GAMs)
- Dimensionality Reduction: Principal Component Analysis (PCA)
- Data Visualization and Embedding: t-SNE, UMAP
- Clustering Algorithms
- Feature Selection and Engineering Techniques
- Hyperparameter Tuning and Optimization
- Model Selection and Comparison Strategies
- Model Interpretation and Feature Importance (e.g., SHAP, LIME)
- Building and Interpreting Nomograms
- Deep Learning Basics: The Gradient Descent Algorithm and Optimizers
- Introduction to PyTorch for Medical AI
- Neural Network Architectures: FNNs, CNNs, RNNs & Transformers
- Medical Imaging and Computer Vision: Image Pre-processing and Analysis
- Natural Language Processing (NLP): Using Large Language Models (LLMs)
- Specialized Applications: Neural Signal Decoding (EEG), Genetic Analysis
- Using Pre-trained Models and Transfer Learning
- Creating and Deploying Interactive Models with Shiny
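As a small illustration of the "Model Accuracy Metrics" topic listed above, the sketch below derives accuracy, sensitivity, specificity, and positive predictive value from binary predictions. The function name `classification_metrics` and the label vectors are hypothetical, chosen only for this example; they are not part of the course materials.

```python
def classification_metrics(y_true, y_pred):
    """Return common binary classification metrics (label 1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive cases).
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # recall / true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # precision
    }


# Made-up labels for illustration only (1 = disease present).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In screening settings sensitivity and specificity are usually reported together, because accuracy alone can look good on imbalanced data.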
Course 2: Scientific Research Methodology and Medical Statistics
This foundational course covers the entire research lifecycle, from formulating a question to publishing results, including the essential statistical methods required.
- Research Philosophy and the History of Evidence-Based Medicine
- Paradigms: Positivist, Interpretivist, Pragmatist, Critical
- Hierarchy of Evidence and Quality Assessment
- Quantitative, Qualitative, and Mixed-Methods Research Designs
- Secondary Research Designs (Intro to Systematic Reviews, Meta-Analyses)
- Genesis of a Strong Research Question: FINER Criteria
- The PICO(T) Framework for Clinical Questions
- Operationalizing Questions into Variables and Hypotheses
- Literature Review and Identifying Research Gaps
- Writing a Study Protocol: Aims, Methods, Timeline
- Data Collection Planning & Data Management (HIPAA)
- IRB/Ethical Approval and Informed Consent
- Protocol Registration (e.g., ClinicalTrials.gov)
- Sampling: Probability and Non-Probability Methods
- Bias: Selection, Information, Confounding, and Observer Bias
- Strategies to Avoid Bias: Randomization, Blinding, Matching
- Sample Size Calculation: Power, Effect Size, Type I/II Errors
- Descriptive Statistics & Normality Testing
- Univariate Analysis (Parametric and Non-parametric Tests)
- Categorical Data Analysis (Chi-square, Fisher's Exact Test)
- Multivariable Analysis (Multiple Linear & Logistic Regression)
- Statistical Inference: P-values, Confidence Intervals, Hypothesis Testing
- Causal Inference: Bradford Hill Criteria, DAGs, Propensity Scores
- Clinical Trials: Phases (I-IV), Adaptive Designs
- Quality Assessment: CONSORT, STROBE, GRADE System
- Research Integrity: Identifying Bad Research, Scientific Misconduct
- The Scientific Peer Review System
- Scientific Writing: Manuscripts, Case Reports (CARE guidelines)
- Grant Writing and Research Funding
- AI as an Intelligent Research Assistant
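To make the "Sample Size Calculation" topic above concrete, here is a minimal sketch of the normal-approximation formula for comparing two group means with equal allocation: n per group = 2 (z_{1-alpha/2} + z_{power})^2 / d^2, where d is the standardized effect size. The function name `n_per_group` and its defaults are assumptions made for this example; dedicated software based on the t-distribution will give slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation; effect_size is the standardized
    difference (Cohen's d). Hypothetical helper for illustration.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls the Type I error rate
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type II error rate
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Smaller effects need many more participants for the same power.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```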
Course 3: Systematic Review and Meta-analysis with AI
A specialized course on evidence synthesis, integrating modern AI tools to streamline the review process from search to publication.
- Overview and History of Evidence Synthesis (Cochrane)
- Types of Reviews: Intervention, Diagnostic, Prognostic
- PRISMA 2020 Reporting Guidelines
- Statistical Foundations: Fixed vs. Random Effects, Heterogeneity (I²), Forest Plots
- Scientific Databases (PubMed, Embase, Cochrane) & Grey Literature
- Advanced Search Strategy Development (MeSH, Boolean operators)
- AI-Enhanced Search: Semantic Search, Automated Query Expansion
- Screening Tools: Manual (Rayyan) and AI-Assisted (ASReview)
- Managing the PRISMA Flow Diagram
- Traditional vs. AI-Assisted Data Extraction (NLP, NER)
- Risk of Bias Assessment Tools: RoB 2 (RCTs), ROBINS-I (Non-randomized Studies of Interventions)
- Plotting and Visualizing Risk of Bias (Traffic Light Plots)
- Pairwise and Single-Arm Meta-Analysis
- Network Meta-Analysis (NMA)
- Diagnostic Test Accuracy Meta-Analysis (HSROC)
- Meta-Regression, Subgroup Analysis, and Trial Sequential Analysis
- Publication Bias Assessment (Funnel Plots, Egger's Test)
- Sensitivity Analysis (Leave-one-out)
- The GRADE Approach for Assessing Certainty of Evidence
- Agentic Frameworks: Coding (R: meta, metafor; Python: PyMeta) vs. No-Code Platforms
- Writing the Results and Preparing the Manuscript for Publication
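The "Statistical Foundations" topic above (fixed-effect pooling and I²) can be sketched in a few lines. This is a minimal inverse-variance fixed-effect example, not a replacement for packages such as meta or metafor; the function name `fixed_effect_pool` and the three study estimates are invented for illustration.

```python
import math


def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I² (in percent)."""
    weights = [1 / v for v in variances]  # more precise studies get larger weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))      # standard error of the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, q, i2


# Made-up study effects (e.g. log odds ratios) and their variances.
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]
pooled, se, q, i2 = fixed_effect_pool(effects, variances)
print(f"pooled={pooled:.3f}, SE={se:.3f}, Q={q:.2f}, I2={i2:.1f}%")
```

A random-effects model would add a between-study variance term to each weight; that step is omitted here to keep the sketch short.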
Course 4: Survival Analysis and Individual Reconstructed Data Meta-analysis
This course delves into time-to-event analysis and the advanced technique of reconstructing and pooling individual patient data (IPD) from published studies.
- Time-to-Event Data, Censoring, Survival and Hazard Functions
- Data Structure for Survival Analysis
- Kaplan-Meier Curves and Survival Probability Estimation
- Estimating Median Survival Time
- Comparing Survival Between Groups: Log-Rank Test
- Interpreting Hazard Ratios
- Building Multivariable Cox Regression Models
- Assessing the Proportional Hazards Assumption
- Time-Dependent Covariates & Landmark Analysis
- Competing Risks Analysis (Fine-Gray model)
- Conditional Survival and Smooth Survival Plots
- Parametric Survival Models (Weibull, etc.)
- Advantages of IPD over Aggregate Data
- One-stage vs. Two-stage IPD Meta-analysis
- Digitizing Kaplan-Meier Curves (Guyot Algorithm)
- Reconstructing IPD Using R/Python Tools
- Validating and Assessing Quality of Reconstructed Data
- Performing Meta-Analysis with Reconstructed IPD
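For the "Kaplan-Meier Curves and Survival Probability Estimation" topic above, the product-limit estimator can be written out directly. The sketch below is a bare-bones version with no confidence intervals; the function name `kaplan_meier` and the follow-up data are hypothetical, and in practice a library such as R's survival package would normally be used.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] using the product-limit estimator.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)  # still under observation at t
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk             # conditional survival past t
        curve.append((t, survival))
    return curve


# Made-up follow-up data: months of follow-up and event indicators.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored subjects contribute to the number at risk until their censoring time but never trigger a drop in the curve, which is why the estimate only changes at observed event times.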
Course 5: R and Python Programming for Medical Research
A practical, hands-on course designed to equip researchers with the fundamental programming skills needed for data analysis in R and Python.
- R Basics: Data Types, Functions, Control Structures
- Data Engineering with the Tidyverse (dplyr, tidyr)
- Data Visualization with ggplot2
- Python Basics for Data Science
- Data Manipulation with Pandas and NumPy
- Introduction to PyTorch for Deep Learning
- Data Pre-processing and Handling Missing Data (Imputation)
- Data Transformation and Encoding
- Balancing Data: Upsampling, Downsampling, Bootstrapping
- Descriptive Statistics and Normality Testing in R/Python
- Implementing Parametric and Non-parametric Tests
- Bootstrapping for Confidence Intervals
- Permutation Tests
- Using Models Like Gemini for Automated Data Extraction
- Using Models Like BioMistral for Prognostic Tasks
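The "Bootstrapping for Confidence Intervals" and "Permutation Tests" topics above can both be sketched with the Python standard library. The helpers `bootstrap_ci` and `permutation_p_value` are hypothetical names, the data are invented, and the percentile interval shown is only one of several bootstrap interval types.

```python
import random
from statistics import mean


def bootstrap_ci(data, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(data, k=len(data))) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling under the null hypothesis
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Made-up measurements for two groups, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [11, 12, 13, 14, 15]
print(bootstrap_ci(group_a))
print(permutation_p_value(group_a, group_b))
```

Fixing the seed keeps the example reproducible; with real data the number of resamples is usually increased until the interval and p-value stabilize.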
Course 6: Retrospective Database Analysis in R and Python
This course provides a practical guide to working with large, real-world retrospective databases like SEER and NIS, covering the entire workflow from data extraction to advanced analysis.
- Overview of Major Databases: NSQIP, SEER, NIS
- Data Extraction Techniques and Data Dictionaries
- Ethical and Methodological Challenges
- Data Filtering and Subsetting Techniques
- Advanced Data Cleaning Strategies
- Dealing with Missing Data in Large Datasets
- Developing a Statistical Analysis Plan (SAP)
- Controlling for Confounding
- Propensity Score Matching (PSM)
- Propensity Score Weighting (PSW)
- Assessing Covariate Balance After Matching
- Applying Multiple Linear and Logistic Regression Models
- Model Building and Selection in a High-Dimensional Setting
- Interpreting Results from Observational Database Studies
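As an illustration of the "Assessing Covariate Balance After Matching" topic above, the sketch below computes a standardized mean difference (SMD) for one continuous covariate. The function name `standardized_mean_difference` and the age values are assumptions for this example; an absolute SMD below about 0.1 is a commonly used rule of thumb for adequate balance, not a fixed standard.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for a continuous covariate: mean difference over the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Made-up ages in treated and control groups, before and after a hypothetical matching step.
age_treated = [62, 65, 70, 71, 74]
age_control_before = [50, 55, 58, 61, 66]
age_control_after = [61, 66, 69, 72, 74]

print(round(standardized_mean_difference(age_treated, age_control_before), 2))
print(round(standardized_mean_difference(age_treated, age_control_after), 2))
```

Unlike a p-value, the SMD does not shrink simply because the sample is large, which makes it better suited to the very large cohorts drawn from registry databases.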