DWG-001 / Portfolio
Computer Vision · Weakly Supervised Learning
Campinas, BR · 2026
Gabriel Gutierrez

Making computer vision work in the real world,
where labels are noisy, annotations are scarce,
and ground truth is a moving target.

Role
PhD Candidate in Computer Science
Institution
Unicamp — Campinas, BR
Focus
Weakly Supervised Semantic Segmentation
Publication
Geophysical Prospecting · EAGE 85th
Framework
discovery-unicamp/Minerva
Background
NLP → Vision · Supervised → Weakly Supervised
01 — About

Who I am

Background
I'm a PhD candidate in Computer Science at Unicamp and a software engineer by training. The two things I do most seriously are research and engineering. In my work they're deeply intertwined rather than separate tracks.
Research focus
I study weakly supervised semantic segmentation, teaching computer vision models to understand images without painstaking pixel-level annotation. Real-world labels are expensive and imprecise, so I focus on methods that learn from noisy, incomplete, or inaccurate labels. This is especially meaningful in domains like geoscience, where expert annotation is both scarce and inherently subjective.
Through-line
My path runs from NLP to vision, supervised to weakly supervised, seismic facies to general segmentation. I work extensively with transformer architectures (ViT, SegFormer, SETR, MiT, BERT) and CNN-based models (DeepLab, U-Net). My engineering background means I approach research with a systems mindset, and my research means I build tools shaped by real scientific needs, not abstract requirements.
02 — Research

Projects

01 /

Weakly Supervised Semantic Segmentation

PhD research on training segmentation models under weak supervision, when ground truth annotations are noisy, incomplete, or generated automatically. Current focus on pseudo-label generation and refinement strategies that allow better use of scarce labeled data without sacrificing model reliability.

Computer Vision · Weak Supervision · Pseudo-labels · Semantic Segmentation
In Progress
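As a minimal sketch of the pseudo-label idea described above (this is an illustrative baseline, not the actual method under development: the function name, threshold, and `ignore_index` convention are assumptions), one common starting point is confidence-thresholded pseudo-labels, where low-confidence pixels are masked out so the loss ignores them:

```python
import numpy as np

def pseudo_labels(probs, threshold=0.9, ignore_index=255):
    """Confidence-thresholded pseudo-labels from per-pixel class probabilities.

    probs: (C, H, W) array of softmax probabilities.
    Pixels whose top-class probability falls below `threshold` are set to
    `ignore_index`, so a segmentation loss can skip them during training.
    """
    conf = probs.max(axis=0)                     # (H, W) top-class probability
    labels = probs.argmax(axis=0).astype(np.int64)  # (H, W) most likely class
    labels[conf < threshold] = ignore_index      # mask unreliable pixels
    return labels
```

Refinement strategies then operate on top of such initial labels, e.g. by re-estimating the masked regions as the model improves.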
02 /

Transformer Architectures for Seismic Segmentation

A systematic comparison of transformer-based segmentation architectures applied to seismic facies data, bridging state-of-the-art vision models and the practical demands of geoscientific interpretation. The study addresses model performance and the inherent ambiguity of expert annotation in subsurface data.

Seismic Facies · SegFormer · SETR · Geoscience
Published in Geophysical Prospecting (peer-reviewed)
Presented at the 85th EAGE Annual Conference
Published
03 /

Fake News Detection with BERT

Two years as a junior researcher applying transformer-based NLP to automated misinformation detection. This was where I first got serious about machine learning; it laid the methodological foundation that carried from supervised NLP into the weakly supervised vision work that followed.

NLP · BERT · Transformers · Misinformation
Completed
03 — Open Source

Minerva

Software Architect & Core Maintainer
Training infrastructure for researchers who need reproducibility.

Minerva fills the gap between raw PyTorch and production MLOps, the space where researchers burn time on glue code. It provides concrete, opinionated classes so experiments can be built and reproduced, not assembled from scratch each time.

The architecture is layered: readers, datasets, data modules, and pipelines, each with a defined contract to extend. The standout decision is FromPretrained: Minerva wraps messy SSL checkpoint transfer into a constructor-compatible class with regex filters and a rename map, composing cleanly with YAML configs.

Aimed at graduate researchers in applied deep learning: time-series, computer vision, limited-label domains. Not a production platform. Engineering for science means reproducibility and honest defaults over throughput.
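The regex-filter and rename-map pattern behind FromPretrained can be sketched as follows. This is not Minerva's actual API: the function name, signature, and prefix-based rename convention are assumptions, shown only to illustrate how checkpoint keys from an SSL run get mapped onto a downstream model:

```python
import re

def filter_and_rename(state_dict, keep, rename):
    """Adapt a checkpoint's keys for a downstream model.

    keep:   regex; only keys matching it survive (e.g. backbone weights).
    rename: {old_prefix: new_prefix} applied to each surviving key, so
            SSL-era module names line up with the new model's attributes.
    """
    pattern = re.compile(keep)
    out = {}
    for key, value in state_dict.items():
        if not pattern.match(key):
            continue  # drop heads/projectors not needed downstream
        for old, new in rename.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        out[key] = value
    return out
```

For example, keeping only `backbone.*` weights from an SSL checkpoint and renaming them to `encoder.*` lets the filtered dict be loaded into the downstream model with strict matching on the encoder alone.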

01
PyTorch Lightning foundation — structured training loops with researcher-first configurability and a clean path to evaluation pipelines
02
Layered composable architecture — readers, datasets, data modules, and pipelines each have defined contracts and base classes to extend
03
FromPretrained — composable checkpoint loading with regex filters and rename maps for clean transfer from SSL runs
04
YAML-driven reproducibility — every parameter serializable; collaborators can reproduce any experiment with a single CLI command
05
SSL model catalog — LFR, TF-C, and SimCLR-style modules for representation learning on limited-label data
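To make the YAML-driven style concrete, a config in this spirit might look like the fragment below. The field names and values here are hypothetical, not Minerva's actual schema; the point is that the full experiment, model class, hyperparameters, and data, is captured in one serializable file:

```yaml
# Hypothetical experiment config: field names are illustrative only.
model:
  class_path: SegFormer
  init_args:
    num_classes: 6
data:
  class_path: SeismicDataModule
  init_args:
    batch_size: 8
trainer:
  max_epochs: 100
seed: 42
```

A collaborator reproduces the run by pointing the CLI at this file, rather than reconstructing the setup from a script.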
04 — Technical Depth

Skills

Vision Models
  • ViT (Vision Transformer)
  • SegFormer
  • SETR
  • Segmenter
  • MiT (Mix Transformer)
  • DeepLab Family
  • U-Net
NLP & Language
  • BERT
  • Transformer architectures
  • Fine-tuning strategies
  • Text classification
Research Methods
  • Weakly supervised learning
  • Pseudo-label generation
  • Systematic comparison
  • Geoscience applications
  • Seismic segmentation
Engineering
  • PyTorch + Lightning
  • Python (primary)
  • OSS framework design
  • Software architecture
  • Modular ML pipelines