translation: ram and cache #1591

Merged
merged 9 commits into from
Jan 13, 2025

Conversation

magentaqin
Contributor

If this pull request (PR) pertains to Chinese-to-English translation, please confirm that you have read the contribution guidelines and complete the checklist below:

  • This PR represents the translation of a single, complete document, or contains only bug fixes.
  • The translation accurately conveys the original meaning and intent of the Chinese version. If deviations exist, I have provided explanatory comments to clarify the reasons.

If this pull request (PR) is associated with coding or code transpilation, please attach the relevant console outputs to the PR and complete the following checklist:

  • I have thoroughly reviewed the code, focusing on its formatting, comments, indentation, and file headers.
  • I have confirmed that the code execution outputs are consistent with those produced by the reference code (Python or Java).
  • The code is designed to be compatible on standard operating systems, including Windows, macOS, and Ubuntu.

@@ -18,54 +18,54 @@ There are three types of storage devices in computers: <u>hard disk</u>, <u>rand
| Speed | Slower, several hundred to thousands MB/s | Faster, several tens of GB/s | Very fast, several tens to hundreds of GB/s |
| Price | Cheaper, several cents to yuan / GB | More expensive, tens to hundreds of yuan / GB | Very expensive, priced with CPU |
Contributor Author

I'm considering translating this currency unit "yuan" to "USD".

Contributor

please add another commit to address this

Contributor Author

> please add another commit to address this

Updated. Please take a look.

Contributor

yuelinxin left a comment

Great work! Please consider my suggestions in your next commit. Also, could you change the title of the PR into "translation: ...", so it is easier to track.

@@ -1,12 +1,12 @@
# Memory and cache *

In the first two sections of this chapter, we explored arrays and linked lists, two fundamental and important data structures, representing "continuous storage" and "dispersed storage" respectively.
In the first two sections of this chapter, we explored arrays and linked lists, two fundamental data structures that represent 'continuous storage' and 'dispersed storage,' respectively.
Contributor

any reason for changing " into ' ?

Contributor Author

' marks emphasis, while " marks a quotation. I think the context here calls for emphasis rather than a quotation.

Owner

From GPT-4:

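In this passage, double quotation marks (" ") should be used. The reasons are explained below:

Choosing double quotation marks (" ")
1. Convention in academic and technical writing:
• Double quotation marks are commonly used to set off terms, specific concepts, or formally defined phrases, highlighting their technical or specific meaning.
2. American English usage:
• In American English, double quotation marks are more commonly used for quoted material, even when they only mark a term or phrase.

Why not single quotation marks (' ')?
• Single quotation marks are usually used for a quotation nested inside double quotation marks (for example, when the quoted material itself contains another quotation), or for particular emphasis, but they are rarely used in academic or technical writing.

Therefore, "continuous storage" and "dispersed storage" in the sentence are correct, because they define or describe specific terms.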

Owner

By the way, the book extensively uses " to reference terms. If an update is needed, it is recommended to apply the changes uniformly in a new PR.

Contributor Author

I see. Thanks for your professional suggestions.

@@ -18,54 +18,54 @@ There are three types of storage devices in computers: <u>hard disk</u>, <u>rand
| Speed | Slower, several hundred to thousands MB/s | Faster, several tens of GB/s | Very fast, several tens to hundreds of GB/s |
| Price | Cheaper, several cents to yuan / GB | More expensive, tens to hundreds of yuan / GB | Very expensive, priced with CPU |

We can imagine the computer storage system as a pyramid structure shown in the figure below. The storage devices closer to the top of the pyramid are faster, have smaller capacity, and are more costly. This multi-level design is not accidental, but the result of careful consideration by computer scientists and engineers.
The computer storage system can be visualized as a pyramid, as shown in the figure below。The storage devices at the top of the pyramid are faster, have smaller capacities, and are more expensive. This multi-level design is not accidental, but a deliberate outcome of careful consideration by computer scientists and engineers.
Contributor

please change the Chinese EOS character into the English EOS character (following "below")

Contributor Author

> please change the Chinese EOS character into the English EOS character (following "below")

Fixed.


Overall, **hard disks are used for long-term storage of large amounts of data, memory is used for temporary storage of data being processed during program execution, and cache is used to store frequently accessed data and instructions** to improve program execution efficiency. Together, they ensure the efficient operation of computer systems.
Overall, **hard disks provide long-term storage for large volumes of data, memory serves as temporary storage for data being processed during program execution, and cache stores frequently accessed data and instructions to enhance execution efficiency.**. Together, they ensure the efficient operation of computer systems.
Contributor

please remove the redundant EOS character between "execution efficiency" and "**"

Contributor Author

Fixed


As shown in the figure below, during program execution, data is read from the hard disk into memory for CPU computation. The cache can be considered a part of the CPU, **smartly loading data from memory** to provide fast data access to the CPU, significantly enhancing program execution efficiency and reducing reliance on slower memory.
As shown in the figure below, during program execution, data is read from the hard disk into memory for CPU computation. The cache, acting as an extension of the CPU, **intelligently preloads data from memory**, enabling faster data access for the CPU.This greatly improves program execution efficiency while reducing reliance on slower memory.
Contributor

please add a space character between "CPU." and "This greatly improves"

Contributor Author

Fixed.

- **Cache lines**: Caches don't store and load data byte by byte but in units of cache lines. Compared to byte-by-byte transfer, the transmission of cache lines is more efficient.
- **Prefetch mechanism**: Processors try to predict data access patterns (such as sequential access, fixed stride jumping access, etc.) and load data into the cache according to specific patterns to improve the hit rate.
- **Spatial locality**: If data is accessed, data nearby is likely to be accessed in the near future. Therefore, when loading certain data, the cache also loads nearby data to improve the hit rate.
- **Cache lines**: Caches operate by storing and loading data in units called cache lines, rather than individual bytes. This approach improves efficiency by transferring larger blocks of data at once.
Contributor

"by storing and loading" -> "by loading and storing"

Contributor Author

Thanks for your suggestion, but I would like to keep the current wording, as the word order matches the Chinese version.

- **Spatial locality**: If data is accessed, data nearby is likely to be accessed in the near future. Therefore, when loading certain data, the cache also loads nearby data to improve the hit rate.
- **Cache lines**: Caches operate by storing and loading data in units called cache lines, rather than individual bytes. This approach improves efficiency by transferring larger blocks of data at once.
- **Prefetch mechanism**: Processors predict data access patterns (e.g., sequential or fixed-stride access) and preload data into the cache based on these patterns to increase the cache hit rate.
- **Spatial locality**: When a specific piece of data is accessed, nearby data is likely to be accessed soon. To leverage this, caches load adjacent data along with the requested data, enhancing hit rates.
Contributor

"enhancing" -> "improving" or "increasing"
enhancing is for a property, hit rate is a value, so we prefer improve or increase

Contributor

for example, "enhance stability", "increase probability", stability here is a property while probability is a value. "improve" should work with both property and value.

Contributor Author

Fixed
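
As a rough illustration of the cache-line and spatial-locality bullets discussed in this thread, the sketch below simulates a toy LRU cache and compares the hit rate of sequential access with access that lands on a new line every time. The 64-byte line size, the 8-line capacity, and the `hit_rate` helper are assumptions made for the example, not values or code from the book, and a real CPU cache is considerably more complex.

```python
from collections import OrderedDict

LINE_SIZE = 64   # bytes per cache line (assumed for illustration)
NUM_LINES = 8    # deliberately tiny capacity so evictions actually occur


def hit_rate(addresses, line_size=LINE_SIZE, num_lines=NUM_LINES):
    """Return the fraction of accesses served by a toy LRU cache."""
    cache = OrderedDict()  # keys are line indices, ordered oldest to newest
    hits = 0
    for addr in addresses:
        line = addr // line_size       # the whole line is loaded, not one byte
        if line in cache:
            hits += 1
            cache.move_to_end(line)    # mark the line as most recently used
        else:
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(addresses)


# 4-byte elements read one after another: 16 elements share each 64-byte line.
sequential = [i * 4 for i in range(1024)]
# One element per line: every access falls in a line that was never loaded.
strided = [i * 64 for i in range(1024)]

seq_rate = hit_rate(sequential)
strided_rate = hit_rate(strided)
print(f"sequential: {seq_rate:.4f}, one-per-line: {strided_rate:.4f}")
```

In this model, sequential access misses only on the first element of each line, while the one-element-per-line pattern never reuses a loaded line.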

- **Temporal locality**: If data is accessed, it's likely to be accessed again in the near future. Caches use this principle to retain recently accessed data to improve the hit rate.

In fact, **arrays and linked lists have different cache utilization efficiencies**, mainly reflected in the following aspects.
In fact, **arrays and linked lists have different cache utilization efficiencies**, which can be analyzed as follows.
Contributor

i prefer the original wording

Contributor Author

The "which" clause is necessary for grammatical correctness. Thanks for your suggestion; I updated "can be analyzed" to "mainly reflected". Please take a look!

- **Cache lines**: Linked list data is scattered throughout memory, and since caches load "by line," the proportion of loading invalid data is higher.
- **Prefetch mechanism**: The data access pattern of arrays is more "predictable" than that of linked lists, meaning the system is more likely to guess which data will be loaded next.
- **Spatial locality**: Arrays are stored in concentrated memory spaces, so the data near the loaded data is more likely to be accessed next.
- **Occupied space**: Linked list elements require additional memory for pointers, resulting in greater space consumption compared to arrays. This reduces the effective amount of useful data stored in the cache.
Contributor

remove redundant space character between ":" and "Linked list"

Contributor Author

Fixed.

- **Occupied space**: Linked list elements require additional memory for pointers, resulting in greater space consumption compared to arrays. This reduces the effective amount of useful data stored in the cache.
- **Cache lines**: Linked list elements are scattered across memory, and since caches load data "by line," they are more likely to include unrelated or invalid data. Arrays, with their contiguous storage, make better use of cache lines.
- **Prefetch mechanism**: Arrays follow a predictable access pattern due to their contiguous memory allocation, enabling the system's prefetch mechanism to accurately anticipate upcoming data loads. In contrast, linked lists, with their scattered storage, have less predictable access patterns, reducing prefetch efficiency.
- **Spatial locality**: Arrays benefit from high spatial locality, as data stored near a currently accessed element is more likely to be accessed next. Linked lists lack this advantage because their elements are not stored adjacently in memory.
Contributor

remove "high" from "high spatial locality", again, spatial locality is a property, high and low are for numerical values

Contributor Author

I updated the whole paragraph.
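
The "Occupied space" and "Cache lines" bullets in this paragraph can be made concrete with a small model that counts how many cache lines a traversal touches and what share of the loaded bytes is actual payload. This is only a sketch under stated assumptions: 64-byte lines, 4-byte array elements, hypothetical 16-byte list nodes (value plus pointer), and a worst-case layout in which every node sits in its own region of memory. Real allocators often place nodes closer together.

```python
LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


def lines_touched(addresses, obj_size):
    """Return the set of cache-line indices covered by the given objects."""
    lines = set()
    for addr in addresses:
        first = addr // LINE_SIZE
        last = (addr + obj_size - 1) // LINE_SIZE
        lines.update(range(first, last + 1))
    return lines


def useful_ratio(addresses, obj_size, payload_size):
    """Payload bytes divided by the total bytes loaded into the cache."""
    loaded = len(lines_touched(addresses, obj_size)) * LINE_SIZE
    return payload_size * len(addresses) / loaded


n = 256
# Array: 4-byte values stored back to back.
array_addrs = [i * 4 for i in range(n)]
# Linked list (hypothetical worst case): one 16-byte node every 4096 bytes,
# of which only 4 bytes are the stored value.
list_addrs = [i * 4096 for i in range(n)]

array_lines = len(lines_touched(array_addrs, 4))
list_lines = len(lines_touched(list_addrs, 16))
array_ratio = useful_ratio(array_addrs, 4, 4)
list_ratio = useful_ratio(list_addrs, 16, 4)
print(array_lines, list_lines, array_ratio, list_ratio)
```

Under these assumptions the array needs 16 lines and every loaded byte is payload, whereas the scattered list needs one line per node and most of each loaded line is pointer data or unrelated memory.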

magentaqin changed the title from "doc: en translation for ram and cache" to "translation: ram and cache" on Dec 26, 2024
magentaqin
Contributor Author

> Great work! Please consider my suggestions in your next commit. Also, could you change the title of the PR into "translation: ...", so it is easier to track.

Thank you for your suggestions. I've already updated the PR title.

Contributor

yuelinxin left a comment

appreciate your effort, please consider my final suggestions

- **Prefetch mechanism**: The data access pattern of arrays is more "predictable" than that of linked lists, meaning the system is more likely to guess which data will be loaded next.
- **Spatial locality**: Arrays are stored in concentrated memory spaces, so the data near the loaded data is more likely to be accessed next.
- **Occupied space**: Linked list elements take up more space than array elements, resulting in less effective data being held in the cache.
- **Cache lines**: The linked list data is scattered throughout the memory, and the cache is "loaded by row", so the proportion of invalid data loaded is higher.
Contributor

I don't think the "the"s added in this line are needed, because we are not referring to any specific item or scenario.

Contributor Author

Fixed. I removed two "the"s.

- **Spatial locality**: Arrays are stored in concentrated memory spaces, so the data near the loaded data is more likely to be accessed next.
- **Occupied space**: Linked list elements take up more space than array elements, resulting in less effective data being held in the cache.
- **Cache lines**: The linked list data is scattered throughout the memory, and the cache is "loaded by row", so the proportion of invalid data loaded is higher.
- **Prefetch mechanism**: The data access pattern of arrays is more "predictable" than that of linked lists, that is, it is easier for the system to guess the data that is about to be loaded.
Contributor

you can change the first "that is" into "in other words" so we don't have two "that is" so close to each other, or it reads a bit weird.

Contributor Author

There are not two instances of "that is" here. The earlier phrase is "that of", which refers back to the data access pattern.

- **Occupied space**: Linked list elements take up more space than array elements, resulting in less effective data being held in the cache.
- **Cache lines**: The linked list data is scattered throughout the memory, and the cache is "loaded by row", so the proportion of invalid data loaded is higher.
- **Prefetch mechanism**: The data access pattern of arrays is more "predictable" than that of linked lists, that is, it is easier for the system to guess the data that is about to be loaded.
- **Spatial locality**: Arrays are stored in a centralized memory space, so data near the data being loaded is more likely to be accessed soon.
Contributor

i prefer "continuous" over "centralized"

Contributor Author

Thanks. I'll take your advice!

Contributor

yuelinxin left a comment

Great work!

magentaqin
Contributor Author

> Great work!

Thank you as well!

krahets added the translation (English translation) and documents (documents-related) labels on Jan 13, 2025
krahets merged commit 9c78c51 into krahets:main on Jan 13, 2025
1 check passed
3 participants