Skip to content

Conversation

@guoyuhong
Copy link

We used a static code scan tool to scan arrow code. There are several possible bugs:

  1. The return value of PyDict_SetItem is not used.
  2. Currently, EventLoop:: Shutdown should be called explicitly, which is error-prone and causing leak when the user forgets to call it.
  3. There is an unclosed file descriptor in io.cc when path name is too long.

Besides, we also made the following small changes:

  1. When we use Plasma in Yarn and when a node uses too much memory, a SIGTERM signal will be sent to Plasma. Current plasma will exit silently. We also some log to plasma store to help us to debug.
  2. ARROW_LOG will always evaluate the output expression even when it is not enabled, which is not efficient.
  3. The constructor of Java class ObjectStoreData is private, which is not convenient when we want to create a mock plasma store.
  4. Fix a call to ObjectStoreData which misplaces meta and data according to https://github.com/apache/arrow/blob/master/java/plasma/src/main/java/org/apache/arrow/plasma/ObjectStoreLink.java#L32 .

@xhochy
Copy link
Member

xhochy commented Feb 15, 2019

@guoyuhong Out of curiosity which "code scan tool" did you use?

@guoyuhong
Copy link
Author

@xhochy It is called Coverity.

@wesm
Copy link
Member

wesm commented Feb 15, 2019

Coverity just started offering free scans to open source projects again. @pitrou can you review?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lint tool conflicts here, so I disable clang-format.

@guoyuhong
Copy link
Author

@pcmoritz Please take a look at this PR.

@guoyuhong
Copy link
Author

Ping... @pcmoritz

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @guoyuhong . Here are some comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry for not being specific here. It's better to not override the second parameter, so that the proper status code is returned (for example a Python MemoryError would be turned into a OutOfMemory Status, etc.).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(attempting to fix it directly from Github...)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I checked out common.h. Yes, ConvertPyError() is better.

@pitrou
Copy link
Member

pitrou commented Feb 18, 2019

Rebased, will wait for CI.

@pitrou
Copy link
Member

pitrou commented Feb 18, 2019

Hmm, I really don't understand what the merge script is saying me here:

$ ./dev/merge_arrow_pr.py 
ARROW_HOME = /home/antoine/arrow
PROJECT_NAME = arrow
Which pull request would you like to merge? (e.g. 34): 3656

Pull request 3656 is not mergeable in its current form.
Continue? (experts only!) (y/n): n

@wesm did something weird happen on git master?

@wesm
Copy link
Member

wesm commented Feb 19, 2019

@pitrou master had a hiccup about 5 hours ago at 19:30 UTC; I accidentally created a merge commit in master (using the GitHub UI -- I have since asked INFRA to disallow merge commits through the UI) and quickly rebased and force-pushed master. It's possible that this PR was impacted. I have also seen the merge script misbehave at times like this when the GitHub API returns bad results now and then

@kszucs
Copy link
Member

kszucs commented Feb 19, 2019

@wesm I've rebased it since that.

@pitrou pitrou closed this in 135d481 Feb 19, 2019
@pitrou
Copy link
Member

pitrou commented Feb 19, 2019

Finally merged. Thanks @guoyuhong !

@guoyuhong
Copy link
Author

@pitrou Thanks!

@guoyuhong guoyuhong deleted the fixPlasma branch February 19, 2019 11:30
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
We used a static code scan tool to scan arrow code. There are several possible bugs:
1. The return value of `PyDict_SetItem` is not used.
2. Currently, `EventLoop:: Shutdown` should be called explicitly, which is error-prone and causing leak when the user forgets to call it.
3. There is an unclosed file descriptor in `io.cc` when path name is too long.

Besides, we also made the following small changes:
1. When we use Plasma in Yarn and when a node uses too much memory, a SIGTERM signal will be sent to Plasma. Current plasma will exit silently. We also some log to plasma store to help us to debug.
2. `ARROW_LOG` will always evaluate the output expression even when it is not enabled, which is not efficient.
3. The constructor of Java class `ObjectStoreData` is private, which is not convenient when we want to create a mock plasma store.
4. Fix a call to `ObjectStoreData` which misplaces `meta` and `data` according to https://github.com/apache/arrow/blob/master/java/plasma/src/main/java/org/apache/arrow/plasma/ObjectStoreLink.java#L32 .

Author: Yuhong Guo <[email protected]>
Author: Antoine Pitrou <[email protected]>

Closes apache#3656 from guoyuhong/fixPlasma and squashes the following commits:

634e36a <Antoine Pitrou> Use default argument value to `ConvertPyError`
b547f2f <Yuhong Guo> remove if from ARROW_LOG
440f097 <Yuhong Guo> Address comment
d3eb22f <Yuhong Guo> Lint
79b4af3 <Yuhong Guo> Fix and Lint
434a039 <Yuhong Guo> Make constructor of ObjectStoreData public
b2ddba6 <Yuhong Guo> Fix misplace of meta and data in PlasmaClient.java
a667402 <Yuhong Guo> Do not evaluate logging strings when logging is not enabled.
5be7b89 <Yuhong Guo> Fix unclosed fd reported by code scan tool
3a917d4 <Yuhong Guo> Fix not used return value in deserialize.cc reported by code scan tool
ed56a48 <Yuhong Guo> Fix possible unclosed EventLoop reported by code scanning tool
3fed926 <Yuhong Guo> Add plasma log
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants