[doc] Updated document on max map count #1037

Merged
merged 6 commits into from
Jun 3, 2024
2 changes: 1 addition & 1 deletion README.md
Expand Up @@ -113,7 +113,7 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).

### 🚀 Start up the server

1. Ensure `vm.max_map_count` >= 262144 ([more](./docs/guides/max_map_count.md)):
1. Ensure `vm.max_map_count` >= 262144:

> To check the value of `vm.max_map_count`:
>
2 changes: 1 addition & 1 deletion README_ja.md
Expand Up @@ -95,7 +95,7 @@

### 🚀 Start up the server

1. Ensure `vm.max_map_count` >= 262144 ([more](./docs/guides/max_map_count.md)):
1. Ensure `vm.max_map_count` >= 262144:

> To check the value of `vm.max_map_count`:
>
2 changes: 1 addition & 1 deletion README_zh.md
Expand Up @@ -94,7 +94,7 @@

### 🚀 Start up the server

1. Ensure `vm.max_map_count` is no less than 262144 ([more](./docs/guides/max_map_count.md))
1. Ensure `vm.max_map_count` is no less than 262144:

> To check the value of `vm.max_map_count`:
>
71 changes: 0 additions & 71 deletions docs/guides/max_map_count.md

This file was deleted.

121 changes: 98 additions & 23 deletions docs/quickstart.md → docs/quickstart.mdx
Expand Up @@ -4,6 +4,8 @@ slug: /
---

# Quick start
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.

Expand All @@ -25,29 +27,102 @@ This quick start guide describes a general process from:

## Start up the server

This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.

1. Ensure `vm.max_map_count` >= 262144:

> To check the value of `vm.max_map_count`:
>
> ```bash
> $ sysctl vm.max_map_count
> ```
>
> If the value is less than 262144, reset `vm.max_map_count` to at least 262144:
>
> ```bash
> # In this case, we set it to 262144:
> $ sudo sysctl -w vm.max_map_count=262144
> ```
>
> This change will be reset after a system reboot. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
>
> ```bash
> vm.max_map_count=262144
> ```
> See [this guide](./guides/max_map_count.md) for instructions on permanently setting `vm.max_map_count` on an operating system other than Linux.
This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.

<details>
<summary>1. Ensure <code>vm.max_map_count</code> >= 262144:</summary>

`vm.max_map_count` sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behavior, and the system will throw out-of-memory errors when a process reaches the limit.

RAGFlow v0.7.0 uses Elasticsearch for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
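
The threshold comparison described above can be sketched as a small shell helper. This is only an illustration: `map_count_ok` and `REQUIRED_MAP_COUNT` are hypothetical names, not part of RAGFlow, and the `/proc` lookup assumes a Linux host.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAGFlow): succeeds when VALUE >= MIN.
REQUIRED_MAP_COUNT=262144

map_count_ok() {
  local value=$1 min=${2:-$REQUIRED_MAP_COUNT}
  [ "$value" -ge "$min" ]
}

# On Linux, the current value is also exposed under /proc.
if [ -r /proc/sys/vm/max_map_count ]; then
  current=$(cat /proc/sys/vm/max_map_count)
  if map_count_ok "$current"; then
    echo "vm.max_map_count=$current is sufficient"
  else
    echo "vm.max_map_count=$current is below $REQUIRED_MAP_COUNT"
  fi
fi
```

Keeping the comparison in its own function makes it easy to reuse in a startup script before the Elasticsearch container is launched.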

<Tabs
defaultValue="linux"
values={[
{label: 'Linux', value: 'linux'},
{label: 'macOS', value: 'macos'},
{label: 'Windows', value: 'windows'},
]}>
<TabItem value="linux">
1.1. Check the value of `vm.max_map_count`:

```bash
$ sysctl vm.max_map_count
```

1.2. If the value is less than 262144, reset `vm.max_map_count` to at least 262144:

```bash
$ sudo sysctl -w vm.max_map_count=262144
```

:::caution WARNING
This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
:::

1.3. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:

```bash
vm.max_map_count=262144
```
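
The permanent change in step 1.3 could be scripted so that it updates an existing entry instead of appending a duplicate. The following is a minimal sketch under stated assumptions: `set_sysctl_conf` is a hypothetical helper, the demo works on a scratch file rather than the real **/etc/sysctl.conf**, and commented-out entries are deliberately left untouched.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set KEY=VALUE in a sysctl.conf-style FILE,
# replacing an existing entry or appending a new one.
set_sysctl_conf() {
  local file=$1 key=$2 value=$3
  local key_re
  # Escape dots so that "vm.max_map_count" is matched literally.
  key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
  touch "$file"
  if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file"; then
    # Entry exists: rewrite it in place (a .bak copy is kept).
    sed -i.bak -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key}=${value}|" "$file"
  else
    # No entry yet: append one.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demo on a scratch file, so nothing under /etc is modified.
demo=$(mktemp)
printf 'vm.max_map_count=65530\n' > "$demo"
set_sysctl_conf "$demo" vm.max_map_count 262144
cat "$demo"
rm -f "$demo" "$demo.bak"
```

To apply it for real, the function would have to run as root against **/etc/sysctl.conf**, followed by `sudo sysctl -p` to load the file without a reboot.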
</TabItem>
<TabItem value="macos">
If you are on macOS with Docker Desktop, then you *must* use docker-machine to update `vm.max_map_count`:

```bash
$ docker-machine ssh
$ sudo sysctl -w vm.max_map_count=262144
```

:::caution WARNING
This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
:::
</TabItem>
<TabItem value="windows">

#### If you are on Windows with Docker Desktop, then you *must* use docker-machine to set `vm.max_map_count`:

```bash
$ docker-machine ssh
$ sudo sysctl -w vm.max_map_count=262144
```
#### If you are on Windows with Docker Desktop WSL 2 backend, then use docker-desktop to set `vm.max_map_count`:

1.1. Run the following in WSL:
```bash
$ wsl -d docker-desktop -u root
$ sysctl -w vm.max_map_count=262144
```

:::caution WARNING
This change will be reset after you restart Docker. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
:::

1.2. If you do not wish to run those commands each time you restart Docker, you can update your `%USERPROFILE%\.wslconfig` as follows to make your change permanent and global for all WSL distributions:

```bash
[wsl2]
kernelCommandLine = "sysctl.vm.max_map_count=262144"
```
*This causes all WSL2 virtual machines to have that setting assigned when they start.*

:::note
If you are on Windows 11 or Windows 10 version 22H2, and have installed the Microsoft Store version of WSL, you can also update the **/etc/sysctl.conf** within the docker-desktop WSL distribution to keep your change permanent:

```bash
$ wsl -d docker-desktop -u root
$ vi /etc/sysctl.conf
```

```bash
# Append a line, which reads:
vm.max_map_count = 262144
```
:::
</TabItem>
</Tabs>

</details>

2. Clone the repo:

32 changes: 14 additions & 18 deletions docs/references/faq.md
Expand Up @@ -194,11 +194,7 @@ Ignore this warning and continue. All system warnings can be ignored.

![](https://github.com/infiniflow/ragflow/assets/93570324/ef5a6194-084a-4fe3-bdd5-1c025b40865c)

#### 4.3 Why does it take so long to parse a 2MB document?

Parsing requests have to wait in queue due to limited server resources. We are currently enhancing our algorithms and increasing computing power.

#### 4.4 Why does my document parsing stall at under one percent?
#### 4.3 Why does my document parsing stall at under one percent?

![stall](https://github.com/infiniflow/ragflow/assets/93570324/3589cc25-c733-47d5-bbfc-fedb74a3da50)

Expand All @@ -211,7 +207,7 @@ docker logs -f ragflow-server
2. Check if the **task_executor.py** process exists.
3. Check if your RAGFlow server can access hf-mirror.com or huggingface.com.

#### 4.5 Why does my pdf parsing stall near completion, while the log does not show any error?
#### 4.4 Why does my pdf parsing stall near completion, while the log does not show any error?

If your RAGFlow is deployed *locally*, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the `MEM_LIMIT` value in **docker/.env**.

Expand All @@ -225,17 +221,17 @@ If your RAGFlow is deployed *locally*, the parsing process is likely killed due

![nearcompletion](https://github.com/infiniflow/ragflow/assets/93570324/563974c3-f8bb-4ec8-b241-adcda8929cbb)

#### 4.6 `Index failure`
#### 4.5 `Index failure`

An index failure usually indicates an unavailable Elasticsearch service.

#### 4.7 How to check the log of RAGFlow?
#### 4.6 How to check the log of RAGFlow?

```bash
tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
```

#### 4.8 How to check the status of each component in RAGFlow?
#### 4.7 How to check the status of each component in RAGFlow?

```bash
$ docker ps
Expand All @@ -249,7 +245,7 @@ d8c86f06c56b mysql:5.7.18 "docker-entrypoint.s…" 7 days ago Up
cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
```
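
Reading the full `docker ps` table by eye can be error-prone, so the check could be narrowed to "which expected containers are not running". This is a rough sketch: `missing_containers` is a hypothetical helper, and the two container names are taken from elsewhere in this FAQ; extend the list to match your own deployment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads running container names from stdin (one per
# line) and prints each expected name (given as arguments) that is absent.
missing_containers() {
  local running name
  running=$(cat)
  for name in "$@"; do
    printf '%s\n' "$running" | grep -Fxq -- "$name" || printf '%s\n' "$name"
  done
}

# Usage (requires a running Docker daemon):
#   docker ps --format '{{.Names}}' | missing_containers ragflow-server ragflow-minio
```

An empty result means every listed container is up; any printed name is a component worth inspecting with `docker logs`.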

#### 4.9 `Exception: Can't connect to ES cluster`
#### 4.8 `Exception: Can't connect to ES cluster`

1. Check the status of your Elasticsearch component:

Expand All @@ -276,26 +272,26 @@ $ docker ps
curl http://<IP_OF_ES>:<PORT_OF_ES>
```

#### 4.10 Can't start ES container and get `Elasticsearch did not exit normally`
#### 4.9 Can't start ES container and get `Elasticsearch did not exit normally`

This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf** and your change to this value was reset after a system reboot.

#### 4.11 `{"data":null,"retcode":100,"retmsg":"<NotFound '404: Not Found'>"}`
#### 4.10 `{"data":null,"retcode":100,"retmsg":"<NotFound '404: Not Found'>"}`

Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.

#### 4.12 `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`
#### 4.11 `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`

A correct Ollama IP address and port is crucial to adding models to Ollama:

- If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
- If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
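
A quick sanity check before entering an Ollama address is to reject loopback hosts, since they are only reachable from the machine itself. The sketch below assumes a hypothetical `is_loopback` helper; it uses simple prefix matching and is not a full IP address validator.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when HOST is a loopback name or address.
# Prefix matching only; this does not validate the address format.
is_loopback() {
  case "$1" in
    127.*|localhost|::1) return 0 ;;
    *) return 1 ;;
  esac
}

host="127.0.0.1"
if is_loopback "$host"; then
  echo "$host is a loopback address; use an address that RAGFlow can reach"
fi
```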

#### 4.13 Do you offer examples of using deepdoc to parse PDF or other files?
#### 4.12 Do you offer examples of using deepdoc to parse PDF or other files?

Yes, we do. See the Python files under the **rag/app** folder.

#### 4.14 Why did I fail to upload a 10MB+ file to my locally deployed RAGFlow?
#### 4.13 Why did I fail to upload a 10MB+ file to my locally deployed RAGFlow?

You probably forgot to update the **MAX_CONTENT_LENGTH** environment variable:

Expand All @@ -314,7 +310,7 @@ docker compose up ragflow -d
```
*Now you should be able to upload files of sizes less than 100MB.*

#### 4.15 `Table 'rag_flow.document' doesn't exist`
#### 4.14 `Table 'rag_flow.document' doesn't exist`

This exception occurs when starting up the RAGFlow server. Try the following:

Expand All @@ -337,15 +333,15 @@ This exception occurs when starting up the RAGFlow server. Try the following:
docker compose up
```

#### 4.16 `hint : 102 Fail to access model Connection error`
#### 4.15 `hint : 102 Fail to access model Connection error`

![hint102](https://github.com/infiniflow/ragflow/assets/93570324/6633d892-b4f8-49b5-9a0a-37a0a8fba3d2)

1. Ensure that the RAGFlow server can access the base URL.
2. Do not forget to append **/v1/** to **http://IP:port**:
**http://IP:port/v1/**
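
The suffix rule in step 2 can be expressed as a small normalization function. This is an illustrative sketch only: `with_v1_suffix` is a hypothetical name, and the function simply ensures that a base URL ends in **/v1/** whether or not it already carries the suffix or a trailing slash.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print BASE_URL with exactly one trailing "/v1/".
with_v1_suffix() {
  local url=${1%/}   # drop a single trailing slash, if present
  case "$url" in
    */v1) printf '%s/\n' "$url" ;;
    *)    printf '%s/v1/\n' "$url" ;;
  esac
}

with_v1_suffix "http://IP:port"
```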

#### 4.17 `FileNotFoundError: [Errno 2] No such file or directory`
#### 4.16 `FileNotFoundError: [Errno 2] No such file or directory`

1. Check if the status of your minio container is healthy:
```bash