-
-
Notifications
You must be signed in to change notification settings - Fork 268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
netrc: Implement .netrc parsing #244
base: master
Are you sure you want to change the base?
netrc: Implement .netrc parsing #244
Conversation
This implementation parses the $HOME/.netrc (or a file specified by the user) through the `--netrc[=file]` or `-R[file]` options. It uses a simple FSM in order to parse the file and find the credentials corresponding to the FTP host, or uses the "default" entry otherwise, if any, according to the specification found on [1]. Tokens not applicable to axel usage were not considered. [1] https://docs.oracle.com/cd/E19455-01/806-0633/6j9vn6q5f/index.html Closes: axel-download-accelerator#116 Signed-off-by: David Polverari <[email protected]>
PS: On this implementation, I had to use the As GCC considers this specific comment as an "explicit fallthrough", the error doesn't show, but I don't know whether other compilers (like clang) will present the same behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- whatever the parser needs should be into it's own struct, and it should get a pointer to it.
- allocate buffers with malloc/calloc to fail gracefully when OOM
- the state machine could be encoded as functions using tail merging
- don't use variables to stop a loop
- the parser shouldn't know anything about
axel_t
, outputs should be explicit too, pass pointers and lengths, so usestrlcpy
- have you thought about using
mmap
? that could simplify the design.
@@ -161,6 +162,15 @@ main(int argc, char *argv[]) | |||
} | |||
} | |||
break; | |||
case 'R': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's define a negative version, it's necessary for if there's a default in the axel config file. I'm not sure a short version is needed (I've been thinking about removing support for systems not supporting getopt_long anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About negative version of the option, I don't know whether it is a good idea to make .netrc parsing default, as it may break existing users' assumptions/scripts/workflows, etc. What do you think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mean to make it the default, but if you have the option on the configuration file, then you need a way to disable it from the command line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll add it.
Thanks for the review! I will try to address the points you made as soon as I can. |
About this point, I was thinking about using I could either Or maybe I can just write a tokenizer function by iterating through the mmaped bytes and filling a For me, the current Edit: I'm implemented a custom tokenizer to use along with mmap. I will rewrite the PR and will send as soon as I can. |
On 18/Oct/2019 20:51, davidpolverari wrote:
<...>
Or maybe I can just write a tokenizer function by iterating through
the mmaped bytes and filling a `char []` with each found token, while
keeping track of the beginning of the next token by means of local
static pointer.
For me, the current `fgets` approach and the `malloc` one seem the
most KISS. What are your views about it?
It would be worth knowing how it compares in practice, I suspect it's
less overhead in the end, but for now the stdio API is fine.
It doesn't need to copy anything, and since the format is so simple, you
can get away with something as simple as strspn + strncmp, altough we
would need a limiting version of strspn, but that's about it.
|
I sent a new commit for the current PR. As soon as the code is ready for merging, I'll rebase everything. |
I already addressed most of the points you made on my local repo. As soon as I implement the remaining ones ( |
On 22/Oct/2019 18:15, davidpolverari wrote:
I was afraid of using strlcpy because it is not standard (neither ISO
C nor POSIX), and thus could cause portability problems. But if it is
ok for you, I'll use it.
We provide strlcpy and strlcat for systems that don't have it on libc
and don't have libbsd.
It's OK to use non-standard functions if they're generic.
|
I didn't forget about this issue. I was busy because of work, but will resume as soon as possible. |
Missing: - add negative version (--no-netrc) - add struct for netrc use, and use a pointer to it instead of using netrc_filename
I just sent another WIP commit with most of the proposed changes, including using Later on I will try to refactor |
I believe that with those latest PRs most of the requested things were changed. I have doubts now as to where we should keep the After that, I will be able to implement the negative ( |
Well, with this last PR I think the changes are almost all done, the only missing piece being the negative option (and removing the debugging Any thoughts about it? When everything is on right shape I can rebase it on master and push it again in a more organized way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better now. I made a couple of comments, the only important task now is to complete the get_creds implementation.
src/conn.c
Outdated
netrc_t * netrc; | ||
netrc = netrc_init(conn->conf->netrc_filename, conn->host, conn->user, sizeof(conn->user), conn->pass, sizeof(conn->pass)); | ||
netrc_parse(netrc); | ||
if (conn->conf->netrc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code here shouldn't be conditional, and it should be for all protocols.
We want to support other formats, like authinfo, in the future.
It should be into a separate function, so that refactoring doesn't touch code around it:
conf_auth_setup(conf);
Also auto-login should only happen if we didn't got the credentials from somewhere else, so a check for that is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I will move this code to the conn.c:conn_set()
function, as that is the one that deals with parsing URLs and extracting fields like username, password and hostname from them and copying to conn_t
. But once parsing is entangled with auth code, I'd rather leave this kind of refactoring to another PR.
What else do you think it is missing on |
@davidpolverari please take a look at ismaell@888ec1e. That fixes both the null deref at the mmap function, the parsing (which now aborts on unkonwn tokens), and removes two globals. What I'm missing there is a cache, so that each machine is only parsed once. |
- removed netrc_free - simplified netrc_mmap - reworked parser
- simplified parser flow - replaced memcpy with strlcpy
Your parser implementation is way more elegant than mine 😄. Anyways, I had to make minor changes to make it work and to simplify (IMHO) the program flow. Besides, we can't use As the parser now aborts on unknown tokens, we may need to include the other tokens used by the format which are unused on axel, and turn them in no-ops, otherwise a compliant file might get ignored. |
Regarding the caching (memoization), I think we could create a function to wrap But don't you think it is better to implement it as a separate PR later? |
I failed to mention I made changes at https://github.com/ismaell/axel netrc branch |
I cherry-picked your commits here, as they would be the same changes I'd have to make. |
@davidpolverari hey, 'sup? are you still around? |
Hi! Yes, was just a little bit busy with other stuff. |
fbfc36f
to
3c21dc4
Compare
This implementation parses the $HOME/.netrc (or a file specified by the
user) through the
--netrc[=file]
or-R[file]
options.It uses a simple FSM in order to parse the file and find the credentials
corresponding to the FTP host, or uses the "default" entry otherwise,
if any, according to the specification found on [1]. Tokens not
applicable to axel usage were not considered.
[1] https://docs.oracle.com/cd/E19455-01/806-0633/6j9vn6q5f/index.html
Closes: #116
Signed-off-by: David Polverari [email protected]