281-541-0361 | jus@justusl.com | justusl.com | linkedin.com/in/justusl | github.com/juicestus
Relevant Coursework
1st place overall at HowdyHack 2024.
1st place in the TAMUHack 2026 HPC Track, sponsored by G-Research.
3rd place at the Build4Good 2025 Pokerbot Challenge, sponsored by Hudson River Trading and Susquehanna International Group (SIG).
Commodities Research Engineering.
I took part in Optiver's Future Focus externship, where I learned how to program for a quantitative trading environment.
I interned with Microsoft's Azure Compute Host Networking team, developing high-performance remote direct memory access (RDMA) networking infrastructure. I primarily worked on erasure coding support to enable loss tolerance over the WAN. This reduced the need for expensive retransmissions, significantly decreasing tail latency for cross-region RDMA traffic.
I also worked on experiments to quantify the side effects of running Live VM Migration (LM) over RDMA, allowing the team to gain the performance benefits of LM over RDMA without negatively impacting storage workloads for other customers.
In addition, I helped build hyperscale load-testing infrastructure for cross-region RDMA, LM RDMA, GPU Direct RDMA, and Guest RDMA.
After high school, I interned at NASA's Johnson Space Center, with the robotics systems technology branch. I wrote firmware for test equipment that was used to validate the drivetrain of the VIPER (Volatiles Investigating Polar Exploration Rover) lunar rover.
I interned at WebCreek, a software contractor in my hometown, during my junior and senior years of high school. I worked on TrainBeyond, an immersive equipment training simulation system licensed to high-profile clients in the oil/gas and heavy machinery industries. Most of my work was Oculus and OpenXR graphics programming in C#. Much of it directly resulted in new customer acquisition.
I work at Texas A&M's Internet Research Lab (IRL) under Dr. Dmitri Loguinov. IRL conducts research in a number of different avenues, including network performance analysis, stochastic modeling, large-scale Internet measurement, big-data computing, graph algorithms, congestion control, peer-to-peer networking, and web crawling.
My specific work is in designing fast, distribution-insensitive sorting and partitioning algorithms. I am currently working on a family of vectorized partitioning algorithms. By extending from bipartition to large n-partition, these algorithms are able to achieve a high work-per-key intensity per fetch to DRAM. In combination with the Vortex framework, we can produce a samplesort algorithm that is very fast and insensitive to the input distribution.
Due to the deep level of our optimizations (generally at the μ-op level), we design our core partitioning algorithms instruction-by-instruction in x86-64 assembler with the AVX2 instruction set extension.
Languages: C, C++, x86-64 Assembly, Python, Java, C#, JavaScript, TypeScript, Golang, Rust, Swift, Arm Assembly.
Technologies: Git, CMake, Linux, Networking, TCP/UDP/IP, SIMD, Drivers, OpenCV, TensorFlow, PyTorch, .NET, ASP, Flask, React, Node.js, jQuery, Unity3D, JavaFX, AWS, Azure, Firebase, Google Cloud, STM32, ESP8266/32, MongoDB, SQL.
Areas: Low-Latency Networking, Kernel-Bypass Networking, Performance Optimization, Big-Data Processing, Systems Programming, Distributed Systems, Embedded Systems, Firmware Development, Microarchitecture, Full-Stack Development.