[Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py by whx-sjtu · Pull Request #3769 · vllm-project/vllm-ascend

whx-sjtu · 2025-10-25T10:26:56Z

This PR follows up on #3189: it continues refactoring sfa into mla and finally removes deepseek_v3_2.py. It is the last PR in the DeepSeek modeling refactor; after it, all DeepSeek-specific model code is removed from vllm_ascend.

Furthermore, after this PR DeepSeek V3.2 can run chunked prefill with correct accuracy. An example is shown below:

llm = LLM(
    model=model,
    tensor_parallel_size=GPUs_per_dp_rank,
    enforce_eager=True,
    compilation_config={
        "cudagraph_capture_sizes": [2, 4],
        # "cudagraph_mode": "FULL_DECODE_ONLY",
    },
    speculative_config={
        "method": "deepseek_mtp",
        "num_speculative_tokens": 1,
    },
    quantization="ascend",
    max_num_seqs=4,
    max_model_len=1024,
    max_num_batched_tokens=12,
    gpu_memory_utilization=0.9,
    enable_expert_parallel=True,
    trust_remote_code=True,
    additional_config={},
)
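The small max_num_batched_tokens value above is what forces chunked prefill: any prompt longer than the per-step token budget has to be prefilled over several steps. The sketch below is a simplified illustration of that splitting, not vLLM's scheduler. The helper name split_prefill_chunks is hypothetical, and it ignores decode tokens, speculative tokens, and other requests that share the same budget in a real run.

```python
from typing import List, Tuple


def split_prefill_chunks(prompt_len: int, token_budget: int) -> List[Tuple[int, int]]:
    """Split a prompt of `prompt_len` tokens into [start, end) chunks.

    Hypothetical helper for illustration only; a real scheduler also has to
    share the budget with decode tokens and with other requests.
    """
    if prompt_len < 0:
        raise ValueError("prompt_len must be non-negative")
    if token_budget <= 0:
        raise ValueError("token_budget must be positive")
    chunks = []
    start = 0
    while start < prompt_len:
        # Each step may process at most `token_budget` prompt tokens.
        end = min(start + token_budget, prompt_len)
        chunks.append((start, end))
        start = end
    return chunks


# A 30-token prompt under a 12-token budget is prefilled in three steps.
print(split_prefill_chunks(prompt_len=30, token_budget=12))
# → [(0, 12), (12, 24), (24, 30)]
```

A budget this small is likely chosen so that even short test prompts exercise the multi-chunk path.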

Inference results:

INFO 10-30 16:36:48 [llm.py:306] Supported_tasks: ['generate']
Adding requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 123.55it/s]
Processed prompts:   0%|                                                                                                           | 0/7 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]INFO 10-30 16:36:48 [llm.py:306] Supported_tasks: ['generate']
Adding requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 117.83it/s]
Processed prompts: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [02:01<00:00, 17.41s/it, est. speed input: 0.43 toks/s, output: 29.77 toks/s]
DP rank 1, Prompt: '窗前明月光，', Generated text: '疑是地上霜。举头望明月，低头思故乡。\n\n——李白《静夜思》\n\n一、故乡\n\n我的故乡在浙江省绍兴府山阴县东浦村。东浦在绍兴府城的西北，离城约二十里路。东浦是一个很大的村庄，居民约有三千户人家。人家大都沿河而居，中间是一条大河，直通州山。大河把东浦村分为南北两岸，两岸往来，是用一条用三块石板拼成的石桥，叫做“新桥”。新桥附近，是一个热闹的市集，店铺林立，人来人往，熙熙攘攘，是东浦村的商业中心。\n\n东浦的居民，大都姓陈，只有少数姓沈姓胡的。姓陈的分为好几房，我们这一房叫做“颍川郡陈氏”，是河南颍川郡迁来的。我们这一房，在东浦村算是大族，族中曾出过不少名人，如陈洪绶（老莲）是明末的大画家，陈鹤是清代的诗人，陈元鼎是清代的学者，陈寿祺是清代的史学家，陈师曾是近代的画家，陈半丁是近代的画家，陈叔通是近代的政治家，陈布雷是近代的政论家，陈仪是近代的军人，陈诚是近代的军人，陈立夫是近代的政治家，陈果夫是近代的政治家，陈公博是近代的政治家，陈璧君是近代的政治家，陈铭枢是近代的军人，陈济棠是近代的军人，陈绍宽是近代的军人，陈调元是近代的军人，陈继承是近代的军人，陈长捷是近代的军人，陈安宝是近代的军人，陈明仁是近代的军人，陈\n\n我的故乡\n\n我的故乡在浙江省绍兴府山阴县东浦村。东浦在绍兴府城的西北，离城约二十里路。东浦是一个很大的村庄，居民约有三千户人家。人家大都沿河而居，中间是一条大河，直通州山。大河把东浦村分为南北两岸，两岸往来，是用一条用三块石板拼成的石桥，叫做“新桥”。新桥附近，是一个热闹的市集，店铺林立，人来人往，熙熙攘攘，是东浦村的商业中心。\n\n东浦的居民，大都姓陈，只有少数姓'
DP rank 1, Prompt: 'The president of the United States is Mr.', Generated text: ' Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. 
The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the'
DP rank 1, Prompt: 'The capital of France is', Generated text: " Paris. This is the largest city in France and its main political, cultural and commercial center. The modern location of the city is the north of the central part of the country, on the banks of the Seine River. In addition to being the capital of France, Paris is also the administrative center of the Ile-de-France region. The population of the city is more than 2.2 million people. This is the core of a large Parisian agglomeration called Big Paris, which has more than 11 million inhabitants. This agglomeration is one of the largest in Europe.\n\n## Paris - the capital of France\n\nThe capital of France is located on the Seine River, in the historical region of Ile-de-France. The city is a major transport hub, as well as the national highway network. Several major highways and railways diverge from Paris in different directions. There are two international airports and a river port in the city.\n\nParis is a world famous cultural and scientific center. The headquarters of UNESCO is located here. The city has a large number of museums, art galleries, exhibition centers, theaters. The most famous museums in Paris are the Louvre, the Musée d'Orsay, the Carnavalet Museum, the Grevin Museum, the Picasso Museum, the Cité des Sciences and Industry, etc. The most famous educational institutions are the University of Paris (Sorbonne), the College de France, the Polytechnic Institute, the Higher practical school, National Institute of Management, National agronomic institute and others.\n\n## History of Paris\n\nThe capital of France has a long history. The first settlements on the site of the modern city were in the 4th millennium BC. e. In the 3rd century BC. e. the Celtic tribe of the Parisians settled here. The first mention of Paris (Lutetia) is found in the 6th book of Julius Caesar on the war with the Gauls, dated 53 BC. e. 
In that era, the city was a fortified fortress of the Gauls, located on the island of the Seine. In 52 BC. e. the fortress was captured by the Romans. They gave it the name Lutetia and rebuilt it in their own traditions, with baths, a forum and an amphitheatre. The city existed as a center of the Gallo-Roman culture until the 5th century, when it was captured by the Franks, who gave it its modern name. Since the 6th century, Paris has been the main city"
DP rank 1, Prompt: 'The future of AI is', Generated text: " here, and it's powered by you.\n\nIntroducing Intel® Gaudi® 3 AI accelerator, delivering 50% faster training and inference, 40% better inference power efficiency, and 1.5x better performance per dollar on average versus Nvidia H100.¹\n\nAccelerate your AI ambitions with an open, scalable systems approach that lets you integrate Intel Gaudi 3 accelerators into existing on-prem, cloud, and edge environments.\n\nLearn more at intel.com/AI\n\n¹ Performance varies by use case, configuration, and other factors. Learn more at intel.com/PerformanceIndex. Results simulated as of March 2024. Performance results are based on testing by Intel as of 3/27/2024 and may not reflect all publicly available security updates. No product or component can be absolutely secure. Intel technologies may require enabled hardware, software, or service activation. Your costs and results may vary. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. 
© Intel Corporation\n\n# Contents\n\n## 4\n**Editor’s letter**  \nThe AI revolution is here, and it’s just getting started.\n\n## 6\n**The AI revolution**  \nHow artificial intelligence is changing the world.\n\n## 8\n**The AI revolution in business**  \nHow AI is transforming the way companies operate.\n\n## 10\n**The AI revolution in health care**  \nHow AI is improving patient outcomes and reducing costs.\n\n## 12\n**The AI revolution in education**  \nHow AI is personalizing learning and making education more accessible.\n\n## 14\n**The AI revolution in transportation**  \nHow AI is making our roads safer and our commutes more efficient.\n\n## 16\n**The AI revolution in entertainment**  \nHow AI is creating new forms of entertainment and changing the way we consume media.\n\n## 18\n**The AI revolution in government**  \nHow AI is being used to improve public services and make government more efficient.\n\n## 20\n**The AI revolution in the military**  \nHow AI is being used to develop new weapons and improve military operations.\n\n## 22\n**The AI revolution in the environment**  \nHow AI is being used to address climate change and other environmental challenges.\n\n## 24\n**The AI revolution in society**  \nHow AI is changing the way we live, work, and interact with each other.\n\n## 26\n**The future of"
DP rank 1, Prompt: '感时花溅泪，', Generated text: '恨别鸟惊心。\n\n烽火连三月，家书抵万金。\n\n白头搔更短，浑欲不胜簪。\n\n【注释】\n\n①春望：春天登高远望。②国破：指长安被安史叛军占领。③城：指长安城。④感时：感伤时事。⑤恨别：恨恨离别。⑥烽火：指战争。⑦家书：家信。⑧抵：值。⑨白头：指白发。⑩搔：抓。⑪浑：简直。⑫不胜簪：插不上簪。簪，古代男子束发，用簪来插住头发，也用它把帽子别在头发上。\n\n【译文】\n\n国家已经破碎不堪，只有山河还在。长安城里又是春天了，但是经过叛军的烧杀抢掠，早已满目荒凉，到处长着又深又密的草木。虽然春花盛开，但看了不是使人愉快，而是让人流泪，觉得花好像也在流泪；虽然到处是春鸟和鸣，但心里由于和家人离别而忧伤，听了鸟鸣，不仅不高兴，还让人惊心。战乱持续了很长时间了，家里已久无音信，一封家信可以抵得上一万两黄金那么宝贵。由于忧伤烦恼，头上的白发越来越稀少，简直连簪子也戴不住了。\n\n【赏析】\n\n这首诗是肃宗至德二年（757）三月杜甫在长安时所作。当时长安被安史叛军焚掠一空，满目荒凉。诗中即景生情，抒写了忧时伤乱的感慨。全诗沉着蕴藉，真挚自然。“国破山河在，城春草木深”，开篇即写春望所见：国都沦陷，城池残破，虽然山河依旧，可是乱草遍地，林木苍苍。一个“破”字，使人休日惊心，继而一个“深”字，令人满目凄然。诗人在此明为写景，实为抒感，寄情于物，托感于景，为全诗创造了气氛。“感时花溅泪，恨别鸟惊心。”花鸟本为娱人之物，但因感时恨别，却使诗人见了反而堕泪惊心。诗的前四句，都统在“望”字中。诗人俯仰瞻视，视线由近而远，又由远而近，视野从城到山河，再由满'
Processed prompts: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [02:01<00:00, 17.40s/it, est. speed input: 0.43 toks/s, output: 29.78 toks/s]
DP rank 0, Prompt: '窗前明月光，', Generated text: '疑是地上霜。举头望明月，低头思故乡。\n\n——李白《静夜思》\n\n一、故乡\n\n我的故乡在浙江省绍兴府山阴县东浦村。东浦在绍兴府城的西北，离城约二十里路。东浦是一个很大的村庄，居民约有三千户人家。人家大都沿河而居，中间是一条大河，直通州山。大河把东浦村分为南北两岸，两岸往来，是用一条用三块石板拼成的石桥，叫做“新桥”。新桥附近，是一个热闹的市集，店铺林立，人来人往，熙熙攘攘，是东浦村的商业中心。\n\n东浦的居民，大都姓陈，只有少数姓沈姓胡的。姓陈的分为好几房，我们这一房叫做“颍川郡陈氏”，是河南颍川郡迁来的。我们这一房，在东浦村算是大族，族中曾出过不少名人，如陈洪绶（老莲）是明末的大画家，陈鹤是清代的诗人，陈元鼎是清代的学者，陈寿祺是清代的史学家，陈师曾是近代的画家，陈半丁是近代的画家，陈叔通是近代的政治家，陈布雷是近代的政论家，陈仪是近代的军人，陈诚是近代的军人，陈立夫是近代的政治家，陈果夫是近代的政治家，陈公博是近代的政治家，陈璧君是近代的政治家，陈铭枢是近代的军人，陈济棠是近代的军人，陈绍宽是近代的军人，陈调元是近代的军人，陈继承是近代的军人，陈长捷是近代的军人，陈安宝是近代的军人，陈明仁是近代的军人，陈\n\n我的故乡\n\n我的故乡在浙江省绍兴府山阴县东浦村。东浦在绍兴府城的西北，离城约二十里路。东浦是一个很大的村庄，居民约有三千户人家。人家大都沿河而居，中间是一条大河，直通州山。大河把东浦村分为南北两岸，两岸往来，是用一条用三块石板拼成的石桥，叫做“新桥”。新桥附近，是一个热闹的市集，店铺林立，人来人往，熙熙攘攘，是东浦村的商业中心。\n\n东浦的居民，大都姓陈，只有少数姓'
DP rank 0, Prompt: 'The president of the United States is Mr.', Generated text: ' Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. 
The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the United States is Mr. Obama. The president of the'
DP rank 0, Prompt: 'The capital of France is', Generated text: " Paris. This is the largest city in France and its main political, cultural and commercial center. The modern location of the city is the north of central France, in the region of Ile-de-France. Paris has a large number of attractions, among which the Eiffel Tower is only one of them. The city is also famous for its cultural institutions and world-famous architectural ensembles. The Louvre, the Musée d'Orsay, the Centre Georges Pompidou are just a few of the famous museums in Paris. The city is also known for its fashion, cuisine and atmosphere.\n\n## What to see in Paris\n\n### 1. Eiffel Tower\n\nThe Eiffel Tower is one of the most famous and recognizable landmarks in the world. It is located in Paris, France, and was built for the 1889 World's Fair. The tower is 324 meters high and was the tallest man-made structure in the world until the completion of the Chrysler Building in New York City in 1930. The Eiffel Tower is made of iron and weighs over 10,000 tons. It has three levels that are open to the public, with restaurants and observation decks on the first and second levels. The tower is visited by millions of tourists every year and is a symbol of Paris and France.\n\n### 2. Louvre Museum\n\nThe Louvre Museum is one of the world's largest and most famous art museums. It is located in Paris, France, and is home to over 35,000 works of art, including the Mona Lisa and the Venus de Milo. The museum is housed in the Louvre Palace, which was originally built as a fortress in the late 12th century. The Louvre is divided into eight departments: Egyptian Antiquities; Near Eastern Antiquities; Greek, Etruscan, and Roman Antiquities; Islamic Art; Sculpture; Decorative Arts; Paintings; and Prints and Drawings. The museum is visited by millions of people each year and is a must-see for anyone interested in art and history.\n\n### 3. 
Arc de Triomphe\n\nThe Arc de Triomphe is a famous monument located in Paris, France. It stands at the western end of the Champs-Élysées and honors those who fought and died for France in the French Revolutionary and Napoleonic Wars. The arch is 50 meters tall and 45 meters wide, and it is adorned with sculptures and reliefs depicting military victories and important events in French history. Visitors can climb to the top of"
DP rank 0, Prompt: 'The future of AI is', Generated text: " here, and it's powered by you.\n\nScale AI is the foundation of all AI applications. Our proprietary platform harnesses the power of human intelligence to generate high-quality data at scale, enabling machines to understand and interact with the world. We provide the essential human-annotated data that trains and validates AI models, ensuring they are accurate, reliable, and safe. From autonomous vehicles to advanced language models, our work is at the forefront of AI innovation, driving the development of technologies that are transforming industries and improving lives.\n\nJoin us in shaping the future of AI.\n\nScale AI is seeking a highly motivated and experienced Senior Manager, Strategic Finance to join our team. In this role, you will be responsible for leading the strategic financial planning and analysis for our business units. You will work closely with senior leadership to drive financial strategy, optimize resource allocation, and support key business decisions. 
The ideal candidate will have a strong background in financial analysis, strategic planning, and cross-functional collaboration.\n\nYou will:\n\n- Lead the annual and quarterly financial planning processes for the business units, including budgeting, forecasting, and long-range planning\n- Develop and maintain financial models to support strategic initiatives, business cases, and investment decisions\n- Partner with business unit leaders to provide financial insights, analyze performance, and identify opportunities for growth and efficiency\n- Prepare and present financial reports, dashboards, and presentations to senior leadership and stakeholders\n- Conduct variance analysis and provide actionable recommendations to improve financial performance\n- Support M&A activities, including due diligence, financial modeling, and integration planning\n- Drive process improvements and implement best practices in financial planning and analysis\n- Collaborate with cross-functional teams, including Accounting, Operations, and Product, to ensure alignment and accuracy in financial reporting and analysis\n\nIdeally you'd have:\n\n- Bachelor's degree in Finance, Accounting, Economics, or a related field; MBA or advanced degree preferred\n- 7+ years of experience in financial planning and analysis, investment banking, consulting, or a related field\n- Strong financial modeling and analytical skills, with the ability to translate complex data into actionable insights\n- Excellent communication and presentation skills, with the ability to effectively communicate financial information to non-financial stakeholders\n- Proven ability to lead and manage cross-functional projects and initiatives\n- High level of proficiency in Excel and financial planning software; experience with SQL or other data analysis tools is a plus\n- Strategic thinker with a proactive and results-oriented mindset\n- Ability to thrive in a fast-paced, dynamic environment and manage multiple priorities\n\nThe 
base salary range for this full"
DP rank 0, Prompt: '感时花溅泪，', Generated text: '恨别鸟惊心。\n\n烽火连三月，家书抵万金。\n\n白头搔更短，浑欲不胜簪。\n\n【注释】\n\n①春望：春天登高远望。②国破：指长安被安史叛军占领。③城：指长安城。④感时：感伤时事。⑤恨别：恨恨离别。⑥烽火：指战争。⑦家书：家信。⑧抵：值。⑨白头：指白发。⑩搔：抓。⑪浑：简直。⑫不胜簪：插不上簪。簪，古代男子束发，用簪来插住头发，也用它把帽子别在头发上。\n\n【译文】\n\n国家已经破碎不堪，只有山河还在。长安城里又是春天了，但是经过叛军的烧杀抢掠，早已满目荒凉，到处长着又深又密的草木。虽然春花盛开，但看了不是使人愉快，而是让人流泪，觉得花好像也在流泪；虽然到处是春鸟和鸣，但心里由于和家人离别而忧伤，听了鸟鸣，不仅不高兴，还让人惊心。战乱持续了很长时间了，家里已久无音信，一封家信可以抵得上一万两黄金那么宝贵。由于忧伤烦恼，头上的白发越来越稀少，简直连簪子也戴不住了。\n\n【赏析】\n\n这首诗是肃宗至德二年（757）三月杜甫在长安时所作。当时长安被安史叛军焚掠一空，满目荒凉。诗中即景生情，抒写了忧时伤乱的感慨。全诗沉着蕴藉，真挚自然。“国破山河在，城春草木深”，开篇即写春望所见：国都沦陷，城池残破，虽然山河依旧，可是乱草遍地，林木苍苍。一个“破”字，使人休日惊心，继而一个“深”字，令人满目凄然。诗人在此明为写景，实为抒感，寄情于物，托感于景，为全诗创造了气氛。“感时花溅泪，恨别鸟惊心。”花鸟本为娱人之物，但因感时恨别，却使诗人见了反而堕泪惊心。诗的前四句，都统在“望”字中。诗人俯仰瞻视，视线由近而远，又由远而近，视野从城到山河，再由满'

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/83f478bb19489b41e9d208b47b4bb5a95ac171ac

gemini-code-assist

Code Review

This pull request continues the refactoring of SFA into MLA, culminating in the removal of deepseek_v3_2.py. The changes primarily involve adapting attention mechanisms and model layers to the new structure. My review has identified several critical issues that could lead to runtime errors. Specifically, there's a potential KeyError or AttributeError in mla_v1.py from unsafe handling of the indexer argument. Additionally, sfa_v1.py contains multiple TypeError risks due to incorrect unpacking of return values from ReplicatedLinear layers. These issues need to be addressed to ensure the refactoring is successful and the code remains stable.
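The two classes of issue named in the review can be sketched without the real code. The example below is an assumption-laden illustration, not the contents of mla_v1.py or sfa_v1.py: FakeReplicatedLinear is a stand-in that returns an (output, bias) pair, FakeIndexer and its topk attribute are hypothetical, and the helper names are invented for this sketch.

```python
from typing import Any, Optional, Tuple


class FakeReplicatedLinear:
    """Stand-in for a linear layer whose call returns (output, bias).

    This is not the vLLM class; it only mimics a tuple return value.
    """

    def __call__(self, x: list) -> Tuple[list, Optional[list]]:
        return [2.0 * v for v in x], None


class FakeIndexer:
    """Hypothetical indexer exposing a single illustrative attribute."""

    topk = 4


def project(layer: FakeReplicatedLinear, x: list) -> list:
    # Unpack the pair explicitly. Treating the raw return value as the
    # output would hand a tuple to later arithmetic and raise TypeError.
    output, _bias = layer(x)
    return output


def resolve_topk(default: int = 0, **kwargs: Any) -> int:
    # kwargs["indexer"] raises KeyError when the caller omits the argument,
    # and accessing .topk on None raises AttributeError. Using .get() plus
    # an explicit None check avoids both.
    indexer = kwargs.get("indexer")
    if indexer is None:
        return default
    return indexer.topk


print(project(FakeReplicatedLinear(), [1.0, 2.0]))  # → [2.0, 4.0]
print(resolve_topk())                               # → 0
print(resolve_topk(indexer=FakeIndexer()))          # → 4
```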

github-actions · 2025-10-25T10:34:02Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- A PR should do only one thing; smaller PRs enable faster reviews.
- Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
- Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

github-actions · 2025-10-28T04:54:14Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: whx-sjtu <2952154980@qq.com>

whx-sjtu changed the title ~~[Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py~~ [WIP][Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py Oct 25, 2025

gemini-code-assist Bot reviewed Oct 25, 2025

Comment thread vllm_ascend/attention/mla_v1.py Outdated

Comment thread vllm_ascend/attention/mla_v1.py Outdated

Comment thread vllm_ascend/attention/sfa_v1.py Outdated

Comment thread vllm_ascend/attention/sfa_v1.py Outdated

Comment thread vllm_ascend/attention/sfa_v1.py Outdated

whx-sjtu force-pushed the rm_ds_model branch from 69a37de to 79817fd Compare October 28, 2025 04:54

github-actions Bot added the merge-conflicts label Oct 28, 2025

whx-sjtu force-pushed the rm_ds_model branch from 79817fd to 7f57a95 Compare October 28, 2025 07:17

github-actions Bot removed the merge-conflicts label Oct 28, 2025

whx-sjtu force-pushed the rm_ds_model branch 2 times, most recently from b1447ad to c603611 Compare October 29, 2025 02:31

github-actions Bot added the module:core label Oct 29, 2025

whx-sjtu force-pushed the rm_ds_model branch 4 times, most recently from 7410844 to c13d089 Compare October 29, 2025 13:23

whx-sjtu changed the title ~~[WIP][Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py~~ [Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py Oct 29, 2025

refactor sfa into mla and remove sfa.py, sfa_v1.py

a95eb72

Signed-off-by: whx-sjtu <2952154980@qq.com>

whx-sjtu force-pushed the rm_ds_model branch from c13d089 to 08c72ae Compare October 29, 2025 13:24

add back sfa_v1 and refactor it

e6be679

Signed-off-by: whx-sjtu <2952154980@qq.com>

whx-sjtu force-pushed the rm_ds_model branch from 08c72ae to e6be679 Compare October 29, 2025 13:36

whx-sjtu added the ready and ready-for-test labels Oct 29, 2025

whx-sjtu mentioned this pull request Oct 30, 2025

[Feat][UT] Support Deepseekv32 FULL_DECODE_ONLY mode and add unit test of sfa_v1 #3763

Merged

wangxiyuan approved these changes Oct 30, 2025

wangxiyuan merged commit f6149f3 into vllm-project:main Oct 30, 2025
46 of 51 checks passed

wangxiyuan mentioned this pull request Jan 26, 2026

[Community] Nominate whx-sjtu as maintainer #6268

Merged

Yikun mentioned this pull request Feb 5, 2026

[v0.13.0rc2] FAQ / Feedback | 问题/反馈 #6186

Closed

wangxiyuan merged 2 commits into vllm-project:main from whx-sjtu:rm_ds_model
