Hello! This is my first post. I created this blog to document ML experiments, small projects, and random notes.
I wanted a place to share my thoughts on machine learning and to write tutorials that might help others. This blog is both a learning tool for me and, I hope, a resource for others interested in ML and development.
You can expect to see content about:
- Machine learning experiments and research
- Programming tutorials and tips
- Project write-ups and case studies
- Random thoughts and observations
Here's a simple Python example:
def greet(name):
    return f"Hello, {name}!"

print(greet("World"))
Inline math works great: $E = mc^2$
Block math equations also render beautifully:
$$ \int_0^1 x^2 \, dx = \frac{1}{3} $$
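As a quick sanity check on that integral, here's a minimal Python sketch that approximates it numerically with the midpoint rule. The function name `midpoint_integral` and the default of 1000 slices are my own illustrative choices, not part of any library.

```python
def midpoint_integral(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    # Evaluate f at the midpoint of each slice and sum the rectangle areas.
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

approx = midpoint_integral(lambda x: x**2, 0.0, 1.0)
# The approximation should agree with the exact value 1/3 to within 1e-6.
print(abs(approx - 1 / 3) < 1e-6)
```

With 1000 slices the midpoint rule's error for $x^2$ on $[0, 1]$ is about $10^{-7}$, so the check passes comfortably.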
I'm planning to write about topics like neural networks, data visualization, and various programming techniques. Stay tuned!
Thanks for reading, and welcome to my blog!