Momentum is a widely-used strategy for accelerating the convergence of gradient-based optimization techniques. Momentum was designed to speed up learning in directions of low curvature, without becoming unstable in directions of high curvature. In deep learning, most practitioners set the value of momentum to 0.9 without attempting to further tune this hyperparameter (i.e., this is the default value for momentum in many popular deep learning packages). However, there is no indication that this choice for the value of momentum is universally well-behaved.

Within this post, we overview recent research indicating that decaying the value of momentum throughout training can aid…

In this post, I aim to outline the theoretical foundations for one of the most canonical quantum phenomena — **entanglement**. I aim to do this in a way that can be fully understood by anyone with a basic understanding of arithmetic and (very basic) linear algebra. Consequently, the majority of this post will be comprised of (hopefully understandable) descriptions of the background knowledge needed to fully understand quantum entanglement from a mathematical/scientific perspective. These tools come from both quantum mechanical (e.g., quantum states, superposition, observables) and mathematical (e.g., complex numbers, tensor products, eigenvectors/values) worlds. …

Recommendation systems are one of the most widespread forms of machine learning in modern society. Whether you are looking for your next show to watch on Netflix or listening to an automated music playlist on Spotify, recommender systems impact almost all aspects of the modern user experience. One of the most common ways to build a recommendation system is with matrix factorization, which finds ways to predict a user’s rating for a specific product based on previous ratings and other users’ preferences. …

For the last two years, I have been researching Compositional Pattern Producing Networks (CPPN), a type of augmenting topology neural network proposed in [1]. However, throughout my research, I was always baffled by some of the concepts and theory behind CPPNs, and I struggled to understand how they work. Although a few open source versions of CPPN exist online, I wanted to build my own implementation for customization purposes in my research, which turned out to be significantly more involved than I initially expected. So, I wanted to create a comprehensive explanation of the theory behind CPPNs and how they…

Within the Sklearn Python library, there are a group of several example data sets that can be imported. Among these data sets is a binary classification breast cancer data set that was drawn from observations made in the state of Wisconsin. I chose to work on this data set for a personal project because it presented a topic that I found to be impactful and interesting (and a good way to practice my skills in data science) — **my goal was to train a classification model to identify malignant breast cancer tumors with an accuracy of >95%**. …

PhD student at Rice University