Bioinformatics

What I do

I build reproducible bioinformatics workflows for microbial ecology and multi-omics, covering metabarcoding, whole-genome sequencing, metagenomics, metatranscriptomics, and viromics. These workflows turn raw sequencing data into interpretable ecological and functional insight.


Core Areas

Amplicon Sequencing and Analysis
16S / 18S / COI / ITS
  • Study design and experimental planning (controls, contamination mitigation, metadata structure)
  • Read QC, primer/adapter trimming, denoising (ASVs / OTUs)
  • Taxonomic assignment and community profiling
  • Diversity and ecological statistics (alpha and beta diversity, ordination, hypothesis testing)
  • Publication-ready visualization and fully reproducible reporting
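
As a minimal illustration of the alpha and beta diversity metrics listed above, the sketch below computes the Shannon index for one sample and the Bray-Curtis dissimilarity between two samples. The function names and the ASV counts are hypothetical and chosen only for illustration; real analyses would typically rely on an established ecology library rather than hand-written functions.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' (natural log) for one sample's feature counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts contribute nothing to the sum, so they are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' count vectors."""
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Hypothetical ASV counts: an even community and one dominated by a single ASV.
sample_a = [10, 10, 10, 10]
sample_b = [40, 0, 0, 0]

print(f"Shannon A: {shannon_index(sample_a):.3f}")
print(f"Shannon B: {shannon_index(sample_b):.3f}")
print(f"Bray-Curtis A vs B: {bray_curtis(sample_a, sample_b):.2f}")
```

The even community has the higher Shannon index, and the two samples are strongly dissimilar even though their total read counts match.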
Genomics (Isolates / WGS)
Short-read • Long-read • Hybrid

Short-read genomics

  • QC → assembly → polishing → assessment
  • Genome annotation and functional characterization
  • Comparative genomics and phylogenomics
  • Plasmid identification and analysis

Long-read genomics

  • Nanopore read QC and assembly strategies
  • Polishing, validation, and circularization checks
  • High-contiguity genome reconstruction
  • Structural variation and genome architecture assessment

Hybrid assembly (short + long reads)

  • Hybrid assembly design and troubleshooting
  • Optimization of assembly parameters
  • Quality assessment and standardized reporting of assembly metrics
  • Integration into downstream comparative analyses
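
To make the assembly metrics mentioned above concrete, here is a small sketch that computes N50 and a few summary values from a list of contig lengths. The function names and the example lengths are assumptions made for illustration; dedicated assessment tools report these and many more metrics.

```python
def n50(contig_lengths):
    """Contig length at which the running total, taken from the longest
    contig downward, first reaches half of the total assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


def assembly_summary(contig_lengths):
    """Collect a few standard contiguity metrics in one dictionary."""
    return {
        "n_contigs": len(contig_lengths),
        "total_bp": sum(contig_lengths),
        "largest_contig": max(contig_lengths, default=0),
        "n50": n50(contig_lengths),
    }


# Hypothetical contig lengths in base pairs.
contigs = [100, 80, 50, 30, 20, 20]
print(assembly_summary(contigs))
```

Reporting the same dictionary for short-read, long-read, and hybrid assemblies of one isolate gives a like-for-like comparison of contiguity.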
Metagenomics
Shotgun community sequencing
  • Read QC, decontamination, and host-removal strategies
  • Taxonomic and functional profiling
  • Genome-resolved metagenomics: assembly → binning → QC → dereplication
  • Coverage-normalized abundance estimation across samples
  • Metabolic modeling and pathway reconstruction
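
The idea behind coverage-normalized abundance can be sketched as follows: raw mapped-read counts are converted to mean coverage depth, which corrects for genome length, and then scaled to proportions within a sample. This is one simple normalization among several in use, and the function names, read length, and genome sizes below are hypothetical.

```python
def mean_coverage(mapped_reads, read_length, genome_length):
    """Approximate mean depth: total mapped bases divided by genome length."""
    return mapped_reads * read_length / genome_length


def relative_abundance(coverages):
    """Scale per-genome coverage to proportions that sum to 1 within a sample."""
    total = sum(coverages.values())
    if total == 0:
        return {name: 0.0 for name in coverages}
    return {name: cov / total for name, cov in coverages.items()}


# Hypothetical sample: both bins recruit the same number of 150 bp reads,
# but bin_B is three times longer than bin_A.
coverages = {
    "bin_A": mean_coverage(mapped_reads=1000, read_length=150, genome_length=50_000),
    "bin_B": mean_coverage(mapped_reads=1000, read_length=150, genome_length=150_000),
}
print(coverages)
print(relative_abundance(coverages))
```

Equal read counts do not imply equal abundance: after length correction the shorter genome accounts for three quarters of the community in this toy sample.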
Metatranscriptomics
RNA-based activity profiling
  • RNA read QC and preprocessing strategies
  • Mapping-based quantification and transcript abundance estimation
  • DNA–RNA coupling analyses (expressed activity vs. genomic potential)
  • Differential expression workflows (where the study design and replication support them)
  • Functional activity interpretation within ecological context
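
A rough sketch of the quantification and DNA–RNA coupling steps above: read counts are converted to TPM (transcripts per million) so that genes of different lengths are comparable, and a log2 ratio of RNA to DNA abundance indicates whether a gene is expressed above or below its genomic potential. The gene names, counts, and the pseudocount of 1 are illustrative assumptions, not a recommended setting.

```python
import math


def tpm(counts, lengths):
    """Transcripts per million: length-normalized rates scaled to sum to 1e6."""
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def log2_rna_dna_ratio(rna_value, dna_value, pseudocount=1.0):
    """log2(RNA / DNA) with a pseudocount to avoid division by zero."""
    return math.log2((rna_value + pseudocount) / (dna_value + pseudocount))


# Hypothetical genes: equal read counts, but gene_B is twice as long.
counts = {"gene_A": 100, "gene_B": 100}
lengths = {"gene_A": 1000, "gene_B": 2000}
print(tpm(counts, lengths))

# A positive value suggests expression above what gene copy number alone implies.
print(log2_rna_dna_ratio(rna_value=3.0, dna_value=1.0))
```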
Viromics
Viral community genomics
  • Viral contig identification and QC (virome-aware pipelines)
  • vOTU clustering and viral taxonomy assignment
  • Auxiliary metabolic gene (AMG) identification
  • Viral diversity and abundance profiling
  • Viral–host comparative analysis
Functional Screening and Genome Mining
AMR • Biosynthetic potential • Synthetic communities
  • Antibiotic resistance gene profiling (genomes and metagenomes)
  • Screening for biosynthetic gene clusters and production-associated genes
  • Trait-based genome mining to support synthetic community development
  • Functional complementarity analysis and candidate selection
Applied Statistics & Modeling
Ecological modeling • Ordination • Resilience frameworks
  • Community ecology statistics (Bray–Curtis dissimilarity, clustering, diversity metrics)
  • Constrained ordination (RDA / db-RDA) and PERMANOVA
  • Modeling relationships between environmental gradients and community structure
  • Ecosystem resilience modeling (e.g., functional redundancy frameworks)
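
For readers unfamiliar with PERMANOVA, the sketch below shows the core of a one-factor test: a pseudo-F statistic computed from a distance matrix, with significance assessed by permuting group labels. It is a simplified, pure-Python illustration under stated assumptions (a single grouping factor, unrestricted permutations, a toy distance matrix), not a replacement for established implementations.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    # Assumes ss_within > 0, i.e. not all within-group distances are zero.
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: six samples in two groups, close within groups, far between them.
groups = ["a", "a", "a", "b", "b", "b"]
dist = [
    [0.0 if i == j else (0.1 if groups[i] == groups[j] else 0.9) for j in range(6)]
    for i in range(6)
]
f_stat, p_value = permanova(dist, groups)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

With only three samples per group there are few distinct label arrangements, so the smallest attainable p-value is limited no matter how strong the separation is; this is one reason replication matters in study design.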