A vulnerability causes VMess over TCP connections to be identifiable (with PoC) #2054

Closed
nlzy opened this issue Oct 10, 2022 · 10 comments
Labels: bug (Something isn't working), Stale

Comments

@nlzy
Contributor

nlzy commented Oct 10, 2022

Abstract

A security issue was discovered in the implementation of the VMess protocol that can make VMess connections identifiable. The only requirement for the attacker is the ability to capture all of the victim's network traffic.

Exploit

In the VMess protocol, each data chunk is followed by some padding:

2 bytes     L-P bytes    P bytes
length L    data         padding

V2Ray uses the PRNG from "math/rand" to fill the padding fields. The effective seed range of this PRNG is very limited: it can only generate 2**31-2 different random streams.

An attacker can precompute all 2**31-2 random streams, select some bytes from each stream as a pattern, and build a pattern-to-seed lookup table.

The attacker then searches the captured traffic for these patterns. If a connection contains a known pattern, the attacker looks up the corresponding seed and regenerates the random stream. A connection is identified as a VMess connection if its subsequent traffic contains many consecutive bytes from that random stream.

PoC

https://github.com/nlzy/vmess-identify-poc

This PoC analyzes packet captures in pcap format. If a VMess packet is found, the source and destination addresses of the packet are printed.

Fix

Replace "math/rand" with a cryptographically secure PRNG. It's fixed in PR #2032.
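As a minimal sketch of what such a replacement looks like in Go (this is not the actual change in PR #2032, and the function name fillPadding is hypothetical), padding can be filled from "crypto/rand" instead:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// fillPadding overwrites p with bytes from the operating system's
// cryptographically secure random source.
// (Hypothetical helper, not V2Ray's API.)
func fillPadding(p []byte) error {
	_, err := rand.Read(p)
	return err
}

func main() {
	padding := make([]byte, 16)
	if err := fillPadding(padding); err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", padding)
}
```

With this source there is no small seed space to enumerate, so the lookup-table approach described above no longer applies.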

@nlzy
Contributor Author

nlzy commented Oct 10, 2022

@klzgrad

klzgrad commented Oct 12, 2022

These paddings will send in clear text.

Why? Use of math/rand is not the bug. This is the bug. H/2 and TLS both have paddings but they are encrypted. Reading lots of bytes from /dev/urandom hurts performance.

@nlzy
Contributor Author

nlzy commented Oct 12, 2022

These paddings will send in clear text.

Why? Use of math/rand is not the bug. This is the bug. H/2 and TLS both have paddings but they are encrypted. Reading lots of bytes from /dev/urandom hurts performance.

Either encrypting the padding or using a CSPRNG will fix this. I just picked the fix with minimal code changes.

Regarding performance, I think it's another issue and less important. Can you provide a real-world example of reduced availability due to performance issues, or estimate how much the system's latency can be reduced and throughput improved by using encryption instead of reading /dev/urandom?

@klzgrad

klzgrad commented Oct 13, 2022

@nlzy
Contributor Author

nlzy commented Oct 13, 2022

https://stackoverflow.com/questions/29935034/using-dev-urandom-for-dddisk-performance-test-is-a-good-idea-or-not This user here showed urandom has a throughput of 8MB/s.

8 MByte/s is good, 8 MByte/s is enough, even for 10 Gigabit networks, seriously.

@Kylejustknows

Consider using urandom to generate the seed for math/rand and the number of bytes to draw from it before reseeding.

So we have a constantly changing random stream without a performance hit.
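One possible reading of this suggestion, sketched in Go: draw a seed and a segment length from "crypto/rand", use "math/rand" for that many bytes, then reseed. The type reseedingSource, its fields, and the 1..65536 segment range are all assumptions for illustration, not anything from V2Ray. Note that each segment is still one of the 2**31-2 math/rand streams.

```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	mrand "math/rand"
)

// reseedingSource is a hypothetical padding generator that reseeds
// math/rand from crypto/rand after a random number of bytes.
type reseedingSource struct {
	rng  *mrand.Rand
	left int // bytes remaining before the next reseed
}

// reseed draws a fresh seed and segment length from the OS CSPRNG.
func (s *reseedingSource) reseed() error {
	var b [10]byte
	if _, err := crand.Read(b[:]); err != nil {
		return err
	}
	seed := int64(binary.LittleEndian.Uint64(b[:8]))
	s.rng = mrand.New(mrand.NewSource(seed))
	s.left = 1 + int(binary.LittleEndian.Uint16(b[8:])) // 1..65536 (assumed range)
	return nil
}

// Fill writes len(p) bytes of padding, reseeding whenever the current
// segment is exhausted. The zero value is ready to use.
func (s *reseedingSource) Fill(p []byte) error {
	for len(p) > 0 {
		if s.left == 0 {
			if err := s.reseed(); err != nil {
				return err
			}
		}
		n := len(p)
		if n > s.left {
			n = s.left
		}
		s.rng.Read(p[:n])
		s.left -= n
		p = p[n:]
	}
	return nil
}

func main() {
	var src reseedingSource
	padding := make([]byte, 16)
	if err := src.Fill(padding); err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", padding)
}
```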

@nlzy
Contributor Author

nlzy commented Oct 17, 2022

Consider using urandom to generate the seed for math/rand and the number of bytes to draw from it before reseeding.

So we have a constantly changing random stream without a performance hit.

In terms of throughput, the data length of a chunk can be tens of thousands of bytes, which is hundreds of times the padding length. That's why I say that a padding generation speed of 8 MByte/s is enough to handle 10 Gigabit networks.
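A rough back-of-the-envelope check of that claim, as a small Go calculation. It assumes decimal units (10 Gbit/s = 1.25e9 bytes/s, 8 MByte/s = 8e6 bytes/s) and ignores protocol overhead; the function name requiredRatio is made up for this example.

```go
package main

import "fmt"

// requiredRatio returns how many bytes of traffic each byte of padding
// must correspond to, so that a padding source producing
// paddingBytesPerSec can keep up with a link of linkBitsPerSec.
func requiredRatio(linkBitsPerSec, paddingBytesPerSec float64) float64 {
	linkBytesPerSec := linkBitsPerSec / 8
	return linkBytesPerSec / paddingBytesPerSec
}

func main() {
	ratio := requiredRatio(10e9, 8e6)
	fmt.Printf("traffic-to-padding ratio needed: %.2f\n", ratio)
}
```

Under these assumptions the ratio comes out at roughly 156, so chunks whose data is hundreds of times longer than the padding stay within the 8 MByte/s budget.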

In terms of latency, reading /dev/urandom once costs just one more syscall than using encryption, which takes only 0.0000001 seconds on modern CPUs.

Where is the performance issue?

Replacing "crypto/rand" with something else requires carefully considering security, writing the code, and reviewing the new implementation, and it increases the complexity of the project.

What is this for, just to be 0.0000001 seconds faster? Can we stop wasting time on this boring performance issue?

@bigdadada

https://github.com/nlzy/vmess-identify-poc

I just tested with version 5.0.8, and this code does not complete the identification

@nlzy
Contributor Author

nlzy commented Mar 27, 2023

https://github.com/nlzy/vmess-identify-poc

I just tested with version 5.0.8, and this code does not complete the identification

Please make sure you have strictly followed the steps and notes in the README, then generate some network traffic (opening about 10~20 web pages should be enough). If there is still no output, restart the v2ray client and try again. The identification rate is not 100%, so you may have to try a few times.

It is preferable to use Chrome and open some web pages as the proxy payload. The default pattern-selection parameters are suited to that.

@github-actions
Contributor

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the Stale label Jul 26, 2023
@github-actions github-actions bot closed this as not planned Aug 1, 2023