You can reach out to me at mynkpl1998@gmail.com.
[~]$ whoami
1. Introduction
Machine learning models keep growing, and larger models tend to perform better. However, they need a large amount of memory just to be loaded, often more than consumer hardware provides. Model compression shrinks a large model into a smaller one at the cost of a small, often negligible, loss in accuracy. The compressed models can then be optimized to run on consumer hardware and can make use of NPUs for high performance....
Introduction Data Representation
What is tokenization?
Machine learning models operate on numerical representations of input data. Language models such as LLMs take text as input. Tokenization is the process of splitting text into smaller chunks called tokens, each of which is then assigned a numerical representation. The collection of unique tokens forms what we call a vocabulary: essentially a dictionary that maps each token to a unique integer. The size of the vocabulary is determined by the number of unique tokens in the corpus....
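The token-to-integer mapping described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: it splits on whitespace, whereas real LLM tokenizers typically use subword schemes, and the names `tokenize`, `build_vocabulary`, `encode`, and `kUnknownId` are hypothetical helpers that do not come from the post.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical vocabulary type: maps each unique token to an integer id.
using Vocabulary = std::unordered_map<std::string, int>;

// Assumed placeholder id for tokens that are not in the vocabulary.
constexpr int kUnknownId = -1;

// Split text on whitespace. This is a simplifying assumption; production
// tokenizers usually split into subword units instead.
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream stream(text);
    std::vector<std::string> tokens;
    std::string token;
    while (stream >> token) {
        tokens.push_back(token);
    }
    return tokens;
}

// Assign ids in order of first appearance in the corpus.
Vocabulary build_vocabulary(const std::string& corpus) {
    Vocabulary vocab;
    for (const std::string& token : tokenize(corpus)) {
        // emplace inserts only if the token is new, so repeated tokens
        // keep the id they received the first time.
        vocab.emplace(token, static_cast<int>(vocab.size()));
    }
    return vocab;
}

// Convert text to a sequence of ids using an existing vocabulary.
std::vector<int> encode(const std::string& text, const Vocabulary& vocab) {
    std::vector<int> ids;
    for (const std::string& token : tokenize(text)) {
        auto it = vocab.find(token);
        ids.push_back(it == vocab.end() ? kUnknownId : it->second);
    }
    return ids;
}
```

With the corpus "the cat sat on the mat", the vocabulary has five entries because "the" appears twice but is counted once, which matches the idea that vocabulary size equals the number of unique tokens.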
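Bitwise Operators
The following bitwise operators are available in the C/C++ programming languages.

| Operator | Meaning |
| --- | --- |
| & | Bitwise AND |
| \| | Bitwise OR |
| ^ | Bitwise XOR |
| ~ | Bitwise complement |
| << | Shift left |
| >> | Shift right |

XOR
The main property of XOR is that the result bit is 0 when both operand bits are the same and 1 when they differ. Equivalently, XOR with 0 keeps a bit unchanged and XOR with 1 flips it.

| A | B | Result |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

Few Properties of XOR
Here A and B are single bits; for a multi-bit value, XOR with all ones (A ^ ~0) gives ~A.

| Property | Result |
| --- | --- |
| A ^ 0 | A |
| A ^ 1 | ~A |
| A ^ A | 0 |
| A ^ A ^ A | A |
| A ^ B ^ A | B |
| A ^ B ^ B | A |

Few properties of Shift operators....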
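The XOR properties listed above can be checked directly in C++, and two of them give rise to well-known tricks. The sketch below is illustrative only: `check_xor_properties`, `xor_swap`, and `find_unpaired` are hypothetical names, and the checks use 32-bit values, so the single-bit rule "A ^ 1 = ~A" is written as XOR with an all-ones mask.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Verify the XOR properties for one pair of 32-bit values.
// Note: assert is compiled out when NDEBUG is defined.
inline void check_xor_properties(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t all_ones = ~std::uint32_t{0};
    assert((a ^ 0u) == a);                                     // A ^ 0 = A
    assert((a ^ all_ones) == static_cast<std::uint32_t>(~a));  // flip every bit
    assert((a ^ a) == 0u);                                     // A ^ A = 0
    assert((a ^ a ^ a) == a);                                  // A ^ A ^ A = A
    assert((a ^ b ^ a) == b);                                  // A ^ B ^ A = B
    assert((a ^ b ^ b) == a);                                  // A ^ B ^ B = A
}

// Swap two values without a temporary, using A ^ B ^ A = B and A ^ B ^ B = A.
// The address check matters: XORing a variable with itself would zero it.
inline void xor_swap(std::uint32_t& a, std::uint32_t& b) {
    if (&a == &b) return;
    a ^= b;  // a holds A ^ B
    b ^= a;  // b holds B ^ (A ^ B) = A
    a ^= b;  // a holds (A ^ B) ^ A = B
}

// If every value appears exactly twice except one, XOR of all values is the
// unpaired one, because each pair cancels (A ^ A = 0) and 0 ^ X = X.
inline std::uint32_t find_unpaired(const std::vector<std::uint32_t>& values) {
    std::uint32_t result = 0;
    for (std::uint32_t v : values) {
        result ^= v;
    }
    return result;
}
```

In everyday code `std::swap` is the clearer choice; the XOR swap is shown only because it follows directly from the last two rows of the properties table.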
I assume the reader is familiar with the basics of reinforcement learning and has a basic understanding of statistics and a bit of calculus. One should be comfortable manipulating value functions, policies, and Bellman equations. The main idea of this blog post is to summarize and extend the understanding of reinforcement learning methods that directly optimize the policy. More or less, this post is a summary for me to revisit the concepts and various tricks that are helpful when dealing with policy-based optimization....