Juan de la Cruz Programming • Data Science • Mathematics

About Me

Hello, my name is Juan de la Cruz, and I’m a data engineer at HSBC. I specialize in designing and maintaining data pipelines, ensuring data quality, and enabling scalable analysis to support strategic decision-making. I hold a Bachelor of Science in Mathematics and Chemistry from the University of the Ozarks. My experiences have fueled a strong passion for data science, and my goal is to understand the entire journey of data, from collection to decision.

Experience

HSBC México, S.A.

Data Engineer

Designed and implemented advanced data architectures and data marts to support the data needs of multiple business units within the bank, improving data stability and reliability and delivering actionable insights for data-driven decision-making.

University of Arkansas

INBRE Summer Research Fellow

Programmed Metropolis and Subspace Sampling Monte Carlo algorithms in C to simulate a system of particles in the canonical ensemble. Designed and tested a new Monte Carlo method to calculate the free-energy difference in a double-well potential using the Arkansas High-Performance Computing Center resources.

National Computational Science Institute

XSEDE Empower Program Fellow

Performed 3D incompressible turbulent-mixing simulations on the Arkansas High-Performance Computing Center (AHPCC) and on Stampede2, an XSEDE computational resource, to assess the efficacy of graph neural networks for simulating the Rayleigh-Taylor (RT) instability.

University of the Ozarks

Data Analyst and Health Informant

Built tools for automated data collection to create data visualizations and dashboards for the university’s business unit, marketing department, and institutional research.

Advanced Materials Research Center

Research Intern - 15th Summer Research Program at CIMAV

Introduced an automatic-fitting option, based on genetic algorithms, for various parameters in ANAELU, a novel software package for 2D X-ray diffraction.

Jones Learning Center

Academic Tutor - Student Success Center & TRIO Program

Enhanced student learning by optimizing a wide range of instructional approaches and innovative classroom activities.

Education

University of the Ozarks

December 2021

Bachelor of Science in Mathematics and Chemistry

Minor(s): Computer Science and Economics

Coursework

Mathematics and Programming

Calculus II & III | Advanced Calculus | Probability and Statistics | Linear Algebra | Discrete Mathematics | Abstract Algebra | Data Structures and Algorithms

Chemistry

General Chemistry I & II | Quantitative Chemical Analysis | Physical Chemistry | Organic Chemistry | General Physics

Featured Projects

COVID-19 Dashboard

I developed data dashboards and visualizations to support decision-making for COVID-19 surveillance, outbreak management, and response activities at the University of the Ozarks. The dashboards integrate raw data from Salesforce, extracted via API, as well as data from the Johns Hopkins University COVID-19 data repository and the Arkansas Department of Health, providing insights into the situation at the county and state levels. Python was used to perform data analysis, data mining, web scraping, and metric calculations to populate each field of the dashboards.

WEBSITE ARTICLE

FlowChecked

FlowChecked aims to be the first step toward an airflow simulation designed to predict the propagation of airborne viruses in enclosed spaces. The simulation was developed using the Navier-Stokes equations and the finite difference method. The primary goal of this project is to raise awareness and help prevent new coronavirus transmissions by highlighting the risks associated with indoor activities. The project also suggests safer layout configurations and air conditioning system designs to reduce the risk of infection in these spaces. Additionally, various layouts were created to simulate the spread of the virus across different scenarios.

WEBSITE GITHUB

Pink Code

The project involves a convolutional neural network (CNN) written in Python that analyzes a dataset of mammography scans from breast cancer patients. After training on numerous mammograms, the model can quickly identify the density of masses and classify them as benign or malignant. Currently, the CNN's accuracy is approaching 93%.

WEBSITE GITHUB

Sentiment and Geospatial Analysis of COVID-19 Tweets

In this entry for the CdeCMX Challenge, which tackles various coding challenges across different areas of expertise, we developed methods for analyzing COVID-19 data through tweets and case statistics by municipality in Mexico. Below are our visualizations and results.

WEBSITE GITHUB

ANAELU

Recently developed 2D X-ray detectors have created a demand for fast and reliable tools to interpret 2D-XRD patterns. The CIMAV Crystallography Group is actively addressing this need with the introduction of ANAELU, a novel software package. This Rietveld-style program enables the characterization of polycrystal structures, particularly texture evaluation, through 2D-XRD modeling and fitting to experimental data. While previous versions of ANAELU were accurate, the manual nature of parameter optimization made the process slow. The current work marks a significant update for ANAELU, introducing an automatic-fitting option for various model parameters, thereby improving efficiency.

POSTER

Languages

My favorite languages for scientific computing, data science, and mathematical modeling.

Front-End

My preferred technologies for front-end web development and component design.

Back-End

My preferred technologies for back-end web programming and database architecture.

Tools

My favorite tools for version control, machine learning support, and container orchestration.

Get in Touch