Welcome to my website!
About me
Hello, I am Frederico Rocha Boller, also known as El Medonho!
I am from Brazil, live in Petrópolis/RJ, and am currently an undergrad at Universidade Federal do Rio de Janeiro (UFRJ), on track to graduate in Jul/27. I will start with my hobbies, as I believe they are one of the best ways to get to know someone. For as long as I can remember, my favorite thing to do has been playing games. For the last 10 years I have been running and hiking as a sport (recently I have been doing some trail running), and lately I have taken a liking to reading, developing projects (like this one) and cooking with friends. Most of my time at college has been spent practicing competitive programming on sites like Codeforces and AtCoder to participate in ICPC alongside my teammates Marcos and Tiago.
Projects
Actor Network Community Analysis
You can check the whole ipynb notebook here.
Final project for my Complex Networks course. This project analyses an actor co-star network to detect collaboration communities. It explores the correlation between these communities and film genre, language and actors' age. It uses the Louvain and Leiden algorithms for community detection and heuristic methods for measuring the correlation.
This project uses the public dataset from IMDb. To run the algorithms on the graph, I used the NetworkX library with the cuGraph backend (uses CUDA to run Louvain and Leiden faster). The full project is available on my GitHub.
- ExtractData.cpp: Used to extract data from the dataset and create more friendly .txt files with the information that was needed in the analysis.
- CommunityDetection.py: Used to run Louvain and Leiden community detection algorithms on the extracted networks.
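As a rough illustration of what an extracted network could look like, the sketch below turns a movie-to-cast mapping into weighted co-star edges, where each weight counts the movies two actors share. The toy data, the `costar_edges` helper and the dictionary layout are assumptions made for this example; they are not taken from the project's files.

```python
from collections import Counter
from itertools import combinations


def costar_edges(casts):
    """Map each unordered actor pair to the number of movies they share."""
    weights = Counter()
    for cast in casts.values():
        # sorted(set(...)) drops duplicate credits and gives a canonical pair order
        for a, b in combinations(sorted(set(cast)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical toy data; the real project reads the IMDb dataset instead.
casts = {
    "movie_1": ["Ana", "Bruno", "Carla"],
    "movie_2": ["Ana", "Bruno"],
    "movie_3": ["Carla", "Diego"],
}

edges = costar_edges(casts)
for (a, b), w in sorted(edges.items()):
    print(a, b, w)
```

Triples like these could then be loaded into NetworkX with `Graph.add_weighted_edges_from` and passed to `networkx.community.louvain_communities(G, weight="weight")`. Whether the project's script is organized this way is an assumption.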
Nodes in the graph represent actors, and two actors are adjacent if they have appeared in at least one movie together. Edge weights are proportional to the number of movies in which the two actors have appeared together. In this project, I made several cuts in the graph, and all of them gave similar results. After running the community detection algorithms, I constructed a feature vector for each actor that encodes favorite genres, age and language spoken, and measured the Euclidean distance between these vectors. It turned out that nodes in the same community were much closer (in terms of Euclidean distance) than nodes from different communities. From this, I concluded empirically that there is a strong correlation between network topology and the actors' demographic and professional profiles.
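The comparison described above can be sketched as follows: compute the mean Euclidean distance between actors in the same community and between actors in different communities, then compare the two. The feature layout, the values and the community labels below are all made up for illustration and do not come from the project's data.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Hypothetical feature vectors: [share of drama roles, share of comedy roles, age / 100]
features = {
    "Ana": [0.9, 0.1, 0.35],
    "Bruno": [0.8, 0.2, 0.40],
    "Carla": [0.1, 0.9, 0.60],
    "Diego": [0.2, 0.8, 0.55],
}

# Hypothetical community labels, as a detection algorithm might return them.
community = {"Ana": 0, "Bruno": 0, "Carla": 1, "Diego": 1}


def mean_distances(features, community):
    """Return (mean intra-community distance, mean inter-community distance)."""
    intra, inter = [], []
    for a, b in combinations(features, 2):
        d = dist(features[a], features[b])  # Euclidean distance
        (intra if community[a] == community[b] else inter).append(d)
    return mean(intra), mean(inter)


intra, inter = mean_distances(features, community)
print(f"intra={intra:.3f} inter={inter:.3f}")
```

A much smaller intra-community mean than inter-community mean is the kind of signal the paragraph above refers to. On a real graph the pairs would likely need to be sampled, since enumerating all pairs grows quadratically with the number of actors.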
High performance computing with CUDA
This project grew out of my earlier wish to learn CUDA, and was further motivated by the need for a project for my distributed systems course. You can find all of my code in my GitHub repo.
Initially, I spent time reading books and blog posts, which gave me much of the initial knowledge I needed. I also practiced coding in CUDA on a LeetCode-like website called Tensara, which encouraged me to look into general kernel optimizations (widely used in machine learning workloads).
Once I understood the fundamentals of GPU architecture, optimizations and the code flow of CUDA, I was able to start working on a distributed version of GEMM that used both my main PC and my old one. Since the hardware was asymmetric between nodes (one has an RTX 5060 Ti and the other a GTX 1050 Ti), I had to balance the workload between them in order to achieve a speedup over my baseline of a single RTX 5060 Ti. The main difficulty here was networking: both of my environments ran in WSL, and Windows was hiding them behind Hyper-V's internal NAT, which forced me to route traffic through a relay in São Paulo (using Tailscale) and increased latency. In the end, I minimized the amount of data transferred by generating the inputs from seeds on each node and by making the final matrix the sum of many matrix multiplications, and I was able to achieve a 20% speedup using both GPUs over the single RTX 5060 Ti baseline. After switching my OS to Ubuntu Server, I was able to make the two machines communicate directly over LAN.
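A minimal sketch of the data-minimizing idea, under stated assumptions: plain Python stands in for the CUDA kernels and the network transport, the matrices are tiny, and the 0.8 work share for the faster GPU is a made-up number rather than a measured one. Each worker receives only a list of seeds, regenerates its inputs locally, and returns a single accumulated matrix, so only seeds and one partial result per node would need to cross the network.

```python
import random

N = 4  # matrix side; tiny so the sketch runs instantly


def gen_matrix(seed):
    """Deterministically regenerate an N x N integer matrix from a seed."""
    rng = random.Random(seed)
    return [[rng.randint(-3, 3) for _ in range(N)] for _ in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def worker(seeds):
    """Rebuild inputs from seeds locally and return only the accumulated product."""
    acc = [[0] * N for _ in range(N)]
    for s in seeds:
        # Seeds 2*s and 2*s + 1 give the two operands of the s-th multiplication.
        acc = add(acc, matmul(gen_matrix(2 * s), gen_matrix(2 * s + 1)))
    return acc


def split(seeds, fast_share):
    """Give the faster node a share of the work proportional to fast_share."""
    cut = round(len(seeds) * fast_share)
    return seeds[:cut], seeds[cut:]


seeds = list(range(10))
fast, slow = split(seeds, 0.8)  # 0.8 is a hypothetical share, not a measurement
combined = add(worker(fast), worker(slow))
baseline = worker(seeds)  # single-node reference result
print(combined == baseline)
```

Because the final matrix is a sum of independent products, the split point can be tuned to the relative throughput of the two GPUs without changing the result.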
You can see the whole ipynb notebook, which goes into more depth, here.
Tools
- Jujuba chooses
- Coming soon