Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add environment parsing #1091

Merged
merged 2 commits into from
May 21, 2017

Conversation

alxchk
Copy link
Contributor

@alxchk alxchk commented May 20, 2017

No description provided.

Copy link
Owner

@giampaolo giampaolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I made an initial review but there's a lot to adjust here. In general:

  • always indent with 4 spaces
  • unless the function body is very small or you don't allocate any resource (e.g. a malloc, an open fd etc.) ALWAYS use goto;. As a general rule, every time you want to do return NULL think again and do goto error; instead. Forgetting to free resources every time you mean to return an error is very simple, it doesn't scale and that's the door which introduces memory leaks; also, if you want to test you made no mistakes, you can use make test-memleaks, which is not bullet proof as it doesn't cover all your code paths but it's something
  • write docstrings describing what the C function does and be explicit in stating what the possible return values are and what they mean so that the users of the function know how to deal with them

@@ -383,6 +385,10 @@ def _proc_name_and_args(self):
return cext.proc_name_and_args(self.pid, self._procfs_path)

@memoize_when_activated
def _proc_environ(self):
return cext.proc_environ(self.pid, self._procfs_path)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to use memoize_when_activated as this function only return one thing.

const char *procfs_path;
PyObject *py_retlist = NULL;
PyObject *envname;
PyObject *envval;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a py_ prefix to all python-related variables. Also fix indentation.

char **env;
ssize_t env_count = -1;
char *dm;
int i;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation


py_retlist = PyDict_New();
if (! py_retlist)
return PyErr_NoMemory();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation; also you should return NULL instead.


for (i=0; i<env_count; i++) {
if (! env[i])
break;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

}
}

result[len] = '\00';
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result is not freed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to free it. If there is no data to read, then just return empty C string (len = 0, alloced size = 1, result[0] = 0 == valid zero length C string)

size_t pblock_size;
int i;

assert(ptr_size == 4 || ptr_size == 8);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly checks that I didn't pass some invalid pointer size

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK but what happens? A warning is print on stderr?

if (lseek(fd, offt, SEEK_SET) == (off_t)-1) {
PyErr_SetFromErrno(PyExc_OSError);
return -1;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fd is not closed (use goto error;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caller's responsibility


if (r != ptr_size) {
PyErr_SetString(
PyExc_ValueError, "Pointer block is truncated");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather throw a RuntimeError here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about this?

as, info.pr_envp, ptr_size);

if (env_count >= 0 && count)
*count = env_count;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

@alxchk alxchk force-pushed the solaris_feature_envp_as_parsing branch 2 times, most recently from 6d89c05 to cd7a5a2 Compare May 21, 2017 11:51
@alxchk
Copy link
Contributor Author

alxchk commented May 21, 2017

Updated

@alxchk alxchk force-pushed the solaris_feature_envp_as_parsing branch from cd7a5a2 to 1e8dd17 Compare May 21, 2017 11:59
return py_retdict;

enomem:
PyErr_NoMemory();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not free resources. Get rid of this and just do:

PyErr_NoMemory();
goto error;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which resources are not freed there?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, which you do in the error body later:

+    if (env && env_count >= 0)
 +        psutil_free_cstrings_array(env, env_count);
 +
 +    Py_XDECREF(py_envname);
 +    Py_XDECREF(py_envval);
 +    Py_XDECREF(py_retdict);
 +    return NULL;

As a general rule you should only define one "error" goto block, not different gotos, and implement the whole free resource logic in there (in 1 place only). Apply this construct everywhere. It's safer and also consistent with the existent code base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this, but I can't get this...
After landing enomem codeflow get to error. So It do the same + set NoMemory exception.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because before malloc you may already have allocated other resources. Or maybe you didn't, but you change your code later and open a fd or use another malloc before this one, or create a Python list or whatever.
The "free" logic should just live in one place (error:) so that every time you want to raise an exception you set the error type (PyErr_NoMemory(); or whatever) and then just do goto error.
This way you don't have to reason about what resources you allocated thus far and you don't have to infer whether to use goto errmem, goto oserror etc.

@@ -0,0 +1,302 @@
/*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you missed this comment

sprintf(proc_path, "%s/%i/as", procfs_path, pid);
fd = open(proc_path, O_RDONLY);
if (fd < 0)
PyErr_SetFromErrno(PyExc_OSError);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh you're right

@giampaolo
Copy link
Owner

Please split this in 2 files: utils.c and environ.c.

@alxchk alxchk force-pushed the solaris_feature_envp_as_parsing branch from 1e8dd17 to 62c6407 Compare May 21, 2017 18:46
@alxchk alxchk force-pushed the solaris_feature_envp_as_parsing branch from 62c6407 to 385bfeb Compare May 21, 2017 18:48
@alxchk
Copy link
Contributor Author

alxchk commented May 21, 2017

This is my last bump. I'm tired of rebasing and not going to do anything else. Feel free to reject or to change as you like if you don't like the code

@giampaolo
Copy link
Owner

giampaolo commented May 21, 2017

Dude, I appreciate what you are doing here, really, and I understand your frustration, but this is how code review works. Most of the nitpicks I provided in this review session are objectively correct and well motivated. psutil has very high code quality standards, pretty much the same as when you submit a PR for the cPython project (basically that is where I trained myself and I'm coming from being a core-dev myself).
...And I'm picky because once your code gets in I am likely the one who will have to maintain it in the future, not you, because you are contributing something now, but you may disappear in the future, so I care to understand what you're doing and why (hence why I ask so many questions, like the "what does assert do?").
It is objectively better to stick with 4 spaces indentation, always check return codes, always free resources in a single point, provide docstrings and choose C file names which make sense. These are basically the only complaints I gave you thus far and there were a lot of these. Don't get me wrong, the functionality you're providing is great per-se, and I will go as far as saying I wouldn't have been able to implement it myself otherwise it would have already been there, but the fact that I had to comment 57 times on this PR shows the code you provided may have worked but it wasn't top notch. In fact it's much better now.

@giampaolo
Copy link
Owner

I will merge this now and fix the other few remaining nitpicks myself. I appreciate your contribution.

@giampaolo giampaolo merged commit 68e04de into giampaolo:master May 21, 2017
@giampaolo
Copy link
Owner

I get this:

======================================================================
ERROR: psutil.tests.test_process.TestProcess.test_as_dict
----------------------------------------------------------------------
Traceback (most recent call last):
  File "psutil/tests/test_process.py", line 1127, in test_as_dict
    d = p.as_dict()
  File "psutil/__init__.py", line 574, in as_dict
    ret = meth()
  File "psutil/__init__.py", line 883, in environ
    return self._proc.environ()
  File "psutil/_pssunos.py", line 339, in wrapper
    return fun(self, *args, **kwargs)
  File "psutil/_pssunos.py", line 419, in environ
    return cext.proc_environ(self.pid, self._procfs_path)
RuntimeError: Dereferencing procfs pointers of 64bit process is not possible

======================================================================
ERROR: psutil.tests.test_process.TestProcess.test_pid_0
----------------------------------------------------------------------
Traceback (most recent call last):
  File "psutil/tests/test_process.py", line 1366, in test_pid_0
    ret = meth()
  File "psutil/__init__.py", line 883, in environ
    return self._proc.environ()
  File "psutil/_pssunos.py", line 339, in wrapper
    return fun(self, *args, **kwargs)
  File "psutil/_pssunos.py", line 419, in environ
    return cext.proc_environ(self.pid, self._procfs_path)
RuntimeError: Dereferencing procfs pointers of 64bit process is not possible

======================================================================
FAIL: psutil.tests.test_process.TestProcess.test_weird_environ
----------------------------------------------------------------------
Traceback (most recent call last):
  File "psutil/tests/test_process.py", line 1445, in test_weird_environ
    self.assertEqual(p.environ(), {"A": "1", "C": "3"})
AssertionError: {'A': '1'} != {'A': '1', 'C': '3'}
- {'A': '1'}
+ {'A': '1', 'C': '3'}

@alxchk
Copy link
Contributor Author

alxchk commented May 22, 2017

First two happens when you try to get environment of 64 bit process from 32 bit. It happens because psinfo_t has different fields size for 32 and 64 bit processes. So when you read psinfo_t structure of 64 bit process from 32 you will get truncated invalid pointers. That's the check is_ptr_dereference_possible about.

@alxchk
Copy link
Contributor Author

alxchk commented May 22, 2017

What 3rd test is about I can't get, because as I understand http://pubs.opengroup.org/onlinepubs/9699919799/ it's not possible to not to have '=' sign.

@giampaolo
Copy link
Owner

giampaolo commented May 28, 2017

OK, got back from my vacation so looking back at this. There is another problem: when running as a limited user I'm able to query only the current process, whereas all other processes either raise AccessDenied (which is fine) or return an empty dict (which is not).

@giampaolo
Copy link
Owner

OK I think I know why this happens.

giampaolo added a commit that referenced this pull request May 28, 2017
* rename C module

* rename C file

* fix compilation error

* small refactoring

* small refactoring

* raise AccessDenied if info.pr_envp is empty, see #1091 (comment)

* fix envs with no equal sign

* style

* update doc

* style
@giampaolo
Copy link
Owner

OK, I should have fixed it: #1097

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants