GPT Technical Evolutionary History (1)
GPT, which stands for Generative Pre-trained Transformer, is a series of large-scale language models developed by OpenAI. In order to understand…
Feb 15

DeepSeek-R1 Technical Analysis: Incentivizing Reasoning Capability in LLMs via Reinforcement…
In the previous five blog posts, I explained five key techniques the DeepSeek model uses to reduce training cost and improve model accuracy:
Feb 4

Transformer Clear Explanation: Attention Is All You Need! - 2017
Before the Transformer, RNNs (recurrent neural networks), LSTMs (long short-term memory, a variant of the RNN), and gated RNNs had been firmly…
Jan 21

AlexNet: ImageNet Classification with Deep Convolutional Neural Networks - 2012
This work was done by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton in 2012. AlexNet won the 2012 ImageNet Challenge, and…
Dec 31, 2024