Daniel Tobon Collazos

About me

I currently work at XmartLabs as a Machine Learning Engineer, where I contribute to projects involving embedding generation for e-commerce recommendation systems using image and text features.

From October 2022 to September 2024, I worked at AB-InBev as an MLOps Engineer. I led a company-wide Inner Sourcing initiative that produced 62 reusable CI/CD workflows to accelerate delivery pipelines, cutting pipeline setup time by 40%. I also deployed and maintained microservices in Kubernetes using Terraform, collaborated globally with DevOps and Data Science teams, and developed frameworks for API configuration and asynchronous data ingestion with PySpark and asyncio.

I worked until December 2021 as a Machine Learning Scientist at AITuring, supervised by Diego Ismael León Nieto. My role involved refactoring the existing machine learning pipeline.

I worked until December 2020 as a researcher in electronics, embedded systems, and hardware design at IMaR Technology Gateway, supervised by Krishna Panduru. My role involved IoT and embedded systems for industrial applications.

I obtained my BSc in Mechatronics Engineering from the Universidad Autonoma de Occidente (supervised by Victor Romero). My degree project involved developing a robotic perception system to estimate the geometric features of a tree.

I was born in 1994 and grew up in Santiago de Cali, Colombia. I moved to Tralee, Ireland in 2019 after finishing my undergraduate degree, and moved back to Colombia in December 2020.

Publications & Research

A photogrammetric system for dendrometric feature estimation of individual trees
Daniel Tobon Collazos
IEEE Colombian Conference on Robotics and Automation (CCRA).

Work Experience

Machine Learning Engineer

Collaborated with LEVIs as a Machine Learning Engineer to generate 800 image and text embeddings (1024 dimensions each) for recommendations on an e-commerce platform, using cosine similarity against 100 clothing images.
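As an illustration of the cosine-similarity ranking mentioned above, here is a minimal sketch in plain Python. The catalog names, the 4-dimensional toy vectors, and the helper names `cosine_similarity` and `top_k` are all hypothetical; they stand in for the real 1024-dimensional embeddings and do not reproduce the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=3):
    """Return the k catalog items most similar to the query embedding."""
    scored = [(item_id, cosine_similarity(query, emb))
              for item_id, emb in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); real ones would have 1024 dimensions.
catalog = {
    "jeans": [1.0, 0.0, 0.0, 0.0],
    "jacket": [0.7, 0.7, 0.0, 0.0],
    "shirt": [0.0, 0.0, 1.0, 0.0],
}
query = [0.9, 0.1, 0.0, 0.0]
ranking = top_k(query, catalog, k=2)
print(ranking)
```

With larger catalogs, the same ranking is usually computed as one matrix product over normalized embeddings rather than a Python loop.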

MLOps Engineer

Worked on global-scale MLOps initiatives with AB-InBev, improving infrastructure and engineering practices:
* Created an inner-source project with 62 reusable workflows, cutting pipeline setup time by 40%
* Managed Kubernetes clusters with Terraform, deploying 3 microservices
* Collaborated with 6 DevOps engineers, 10 data scientists, and 7 stakeholders on a global project
* Developed an asynchronous POST module with PySpark to handle 500 requests/sec to an external REST API
* Implemented a unit/integration testing framework for Databricks
* Built a config manager using OmegaConf for FastAPI + SQLAlchemy-based services
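
The asynchronous POST module listed above can be pictured with a small asyncio sketch. This is only an assumption about the general pattern, not the actual module: `fake_post` is a hypothetical stand-in for the HTTP call, `max_in_flight` is an illustrative parameter, and PySpark is left out entirely. Note that a semaphore bounds the number of requests in flight; it does not by itself enforce a requests-per-second rate.

```python
import asyncio


async def fake_post(payload):
    """Hypothetical stand-in for an HTTP POST to an external REST API."""
    await asyncio.sleep(0.001)  # simulate network latency
    return {"status": 200, "echo": payload}


async def post_all(payloads, max_in_flight=50):
    """Send all payloads concurrently, with at most max_in_flight at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(payload):
        async with semaphore:
            return await fake_post(payload)

    # gather returns results in the same order as the input payloads
    return await asyncio.gather(*(guarded(p) for p in payloads))


results = asyncio.run(post_all([{"id": i} for i in range(200)]))
print(len(results))
```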

Machine Learning Scientist

Worked as a member of the AI team, which focused on researching, designing, and building self-running ML systems and tools to detect and classify objects. Identified data distribution issues that affected model performance by debugging the ML models, used the results to improve them, and developed architectures according to client goals and requirements.
* Refactored deep learning models for object detection and classification
* Used Google Cloud Platform resources such as Cloud Functions, Vertex AI, and Cloud Storage buckets to test ML pipelines
* Improved evaluation metrics such as the confusion matrix, accuracy, and IoU
* Proposed computer vision alternatives to the current baseline
* Researched MLOps proposals
* Enabled custom metrics in TensorBoard
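
For readers unfamiliar with IoU (intersection over union), one of the evaluation metrics listed above, the following is a minimal sketch for two axis-aligned bounding boxes. The `(x_min, y_min, x_max, y_max)` box format and the function name `iou` are assumptions made for the example, not the format used in the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1)
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```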

Research Engineer, Embedded Systems/Electronics/Hardware Design

Under Krishna Panduru's guidance, I played a key role in the prototyping laboratory on short-term projects (2-3 months each): researching, building, coding, deploying, and documenting software that runs on embedded systems to analyze sensor data.
* Managed a vision system for quality inspection using OpenCV
* Demonstrated an interface framework for an Intel RealSense ZR300 camera in PCL and ROS
* Maintained and demonstrated a data science and Industrial IoT project for a human-machine interface to automate quality inspection in the biomedical industry
* Adjusted an RFID project to read an animal tag using the RFIDRW-E-TTL with the nRF5 SDK 16.0
* Completed a CMake project for the nRF52 SDK to program an nRF52832 using J-Link
* Developed, modified, and debugged an IoT application for reading strain data from a strain gauge sensor using an ESP32 microcontroller and the ESP-IDF framework

Selected Projects

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm
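
To make the idea concrete, here is a minimal, unoptimized DBSCAN sketch in plain Python. It is an illustration of the algorithm only, not the code of this project: the names `dbscan`, `region_query`, `eps`, and `min_pts`, and the toy 2D points, are chosen for the example, and neighbors are found by brute force rather than with a spatial index.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The two tight groups become clusters 0 and 1, and the isolated point at (50, 50) is labeled as noise.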

3D Dendrometric feature estimation of an individual tree with OpenMVG-PMVS2

This project is a photogrammetric system for dendrometric feature estimation of individual trees. It reconstructs an individual tree in 3D using Open Multiple View Geometry (openMVG) and estimates the tree's dendrometric features: diameter at breast height (DBH), tree crown height, total tree height, crown volume, morphic factor, and percentage of canopy missing.

pointcloudToMesh

C++ application to convert a point cloud in PCD, PLY, TXT, or XYZ format to a mesh representation using greedy projection triangulation (GP3).

See Also