mcore customization doc minor fix (NVIDIA#8421) (NVIDIA#8437)
Signed-off-by: Huiying Li <[email protected]>
Co-authored-by: Huiying <[email protected]>
github-actions[bot] and HuiyingLi authored Feb 16, 2024
1 parent 3fda655 commit 88126f3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/nlp/nemo_megatron/mcore_customization.rst
@@ -1,7 +1,7 @@
 Megatron Core Customization
 ---------------------------
 
-Megatron Core (Mcore) offers a range of functionalities, one of the most notable being the ability for users to train GPT models on an epic scale. Users can use ``megatron.core.models.gpt.GPTModel`` (Mcore GPTModel) to initialize the model, and then pretrain/load weights into the model. Mcore GPTModel adopts the typical GPT structure, beginning with embedding layer, positional encoding, followed by a series of transformer layers and finally output layer.
+Megatron Core (Mcore) offers a range of functionalities, one of the most notable being the ability for users to train Transformer models on an epic scale. Users can enable decoder/GPT variants by using ``megatron.core.models.gpt.GPTModel`` (Mcore GPTModel) to initialize the model, and then pretrain/load weights into the model. Mcore GPTModel adopts the typical GPT structure, beginning with embedding layer, positional encoding, followed by a series of transformer layers and finally output layer.
 
 In the rapidly advancing world of LLM, it is increasingly important to experiment with various configurations of the transformer block within each transformer layer. Some of these configurations involve the use of different module classes. While it is possible to achieve this with “if else” statements in Mcore, doing so makes Mcore less readable and less maintainable in the long term. Mcore spec intends to solve this challenge by allowing users to specify a customization of the transformer block in each layer, without modifying code in mcore.
 We will dive more into the details of mcore spec in the first section of this blog. Then, we will demonstrate the usefulness of mcore spec using Falcon as an example.
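The paragraph changed by this commit describes initializing Mcore GPTModel and then customizing the transformer block through a layer spec. A minimal sketch of that flow, assuming the megatron-core APIs available around the time of this commit (``TransformerConfig``, ``get_gpt_layer_local_spec``, and the ``GPTModel`` constructor) and illustrative hyperparameter values not taken from the commit:

# Minimal sketch, not part of the commit. Assumes megatron-core (circa early
# 2024) is installed and that torch.distributed plus Mcore's model-parallel
# state are already initialized, e.g. via
# megatron.core.parallel_state.initialize_model_parallel().
from megatron.core.models.gpt.gpt_model import GPTModel
from megatron.core.models.gpt.gpt_layer_specs import get_gpt_layer_local_spec
from megatron.core.transformer.transformer_config import TransformerConfig

# Transformer hyperparameters (illustrative values only).
config = TransformerConfig(
    num_layers=2,
    hidden_size=256,
    num_attention_heads=8,
    use_cpu_initialization=True,
)

# The layer spec selects the module classes used inside every transformer
# layer; swapping in a different spec (e.g. a Falcon-style one, as the doc
# later demonstrates) is the customization point that Mcore spec provides.
layer_spec = get_gpt_layer_local_spec()

model = GPTModel(
    config=config,
    transformer_layer_spec=layer_spec,
    vocab_size=32000,
    max_sequence_length=1024,
)

Because the spec is passed in at construction time, a custom transformer block needs no "if else" branches inside Mcore itself, which is the maintainability argument the changed paragraph makes.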