Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netrc: Implement .netrc parsing #244

Open
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

davidpolverari
Copy link
Contributor

This implementation parses the $HOME/.netrc (or a file specified by the
user) through the --netrc[=file] or -R[file] options.

It uses a simple FSM in order to parse the file and find the credentials
corresponding to the FTP host, or uses the "default" entry otherwise,
if any, according to the specification found on [1]. Tokens not
applicable to axel usage were not considered.

[1] https://docs.oracle.com/cd/E19455-01/806-0633/6j9vn6q5f/index.html

Closes: #116

Signed-off-by: David Polverari [email protected]

This implementation parses the $HOME/.netrc (or a file specified by the
user) through the `--netrc[=file]` or `-R[file]` options.

It uses a simple FSM in order to parse the file and find the credentials
corresponding to the FTP host, or uses the "default" entry otherwise,
if any, according to the specification found on [1]. Tokens not
applicable to axel usage were not considered.

[1] https://docs.oracle.com/cd/E19455-01/806-0633/6j9vn6q5f/index.html

Closes: axel-download-accelerator#116

Signed-off-by: David Polverari <[email protected]>
@davidpolverari
Copy link
Contributor Author

PS: On this implementation, I had to use the // fall through comment in order to placate GCC's -Wimplicit-fallthrough, as the project Makefile WARN_CFLAGS are considering a lot of warnings as errors and in this case it was the desired outcome.

As GCC considers this specific comment as an "explicit fallthrough", the error doesn't show, but I don't know whether other compilers (like clang) will present the same behavior.

Copy link
Member

@ismaell ismaell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • whatever the parser needs should be into it's own struct, and it should get a pointer to it.
  • allocate buffers with malloc/calloc to fail gracefully when OOM
  • the state machine could be encoded as functions using tail merging
  • don't use variables to stop a loop
  • the parser shouldn't know anything about axel_t, outputs should be explicit too, pass pointers and lengths, so use strlcpy
  • have you thought about using mmap? that could simplify the design.

src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
@@ -161,6 +162,15 @@ main(int argc, char *argv[])
}
}
break;
case 'R':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define a negative version, it's necessary for if there's a default in the axel config file. I'm not sure a short version is needed (I've been thinking about removing support for systems not supporting getopt_long anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About negative version of the option, I don't know whether it is a good idea to make .netrc parsing default, as it may break existing users' assumptions/scripts/workflows, etc. What do you think about it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean to make it the default, but if you have the option on the configuration file, then you need a way to disable it from the command line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll add it.

src/conf.h Outdated Show resolved Hide resolved
src/text.c Outdated Show resolved Hide resolved
@davidpolverari
Copy link
Contributor Author

Thanks for the review! I will try to address the points you made as soon as I can.

@davidpolverari
Copy link
Contributor Author

davidpolverari commented Oct 19, 2019

* have you thought about using `mmap`? that could simplify the design.

About this point, I was thinking about using mmap, but using it precludes the use of strtok*, because mmap won't null-terminate the mmaped file and strtok{,_r} will try to change the string, something unacceptable for a mmaped file, and something that will not work on read-only mappings, which would be the right way to mmap .netrc anyway.

I could either malloc(file_size + spare bytes), load file contents on this mem and null-terminate it, in order to work on that copy with strtok*, but in the unlikely event that someone has an absurdly big .netrc file, this approach might fail. Performance-wise, I don't think it would be a burden in most cases.

Or maybe I can just write a tokenizer function by iterating through the mmaped bytes and filling a char [] with each found token, while keeping track of the beginning of the next token by means of local static pointer.

For me, the current fgets approach and the malloc one seem the most KISS. What are your views about it?

Edit: I'm implemented a custom tokenizer to use along with mmap. I will rewrite the PR and will send as soon as I can.

@ismaell
Copy link
Member

ismaell commented Oct 20, 2019 via email

@davidpolverari
Copy link
Contributor Author

I sent a new commit for the current PR. As soon as the code is ready for merging, I'll rebase everything.

src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
@davidpolverari
Copy link
Contributor Author

I already addressed most of the points you made on my local repo. As soon as I implement the remaining ones (netrc_t, strspn-like interace, and --no-netrc) I will send the PR.

@ismaell
Copy link
Member

ismaell commented Oct 24, 2019 via email

@davidpolverari
Copy link
Contributor Author

I didn't forget about this issue. I was busy because of work, but will resume as soon as possible.

Missing:

- add negative version (--no-netrc)

- add struct for netrc use, and use a pointer to it instead of using netrc_filename
@davidpolverari
Copy link
Contributor Author

I just sent another WIP commit with most of the proposed changes, including using strspn and strcspn-like functions (which I named memspn and memcspn as they don't deal with null-terminated strings).

Later on I will try to refactor next_token() into something like a memtok_r(), and will try to get rid of the global variables by explicitly passing them as arguments to the auxiliary static functions, besides implementing the remaining proposed changes.

src/netrc.c Outdated Show resolved Hide resolved
@davidpolverari
Copy link
Contributor Author

I believe that with those latest PRs most of the requested things were changed. I have doubts now as to where we should keep the netrc_t pointer, at what point we should init the structure, etc. I think it would be easier if we could decouple the username and password from conn_t, and had a struct used globally for it, but it would imply touching on a lot of places.

After that, I will be able to implement the negative (--no-netrc) option.

@davidpolverari
Copy link
Contributor Author

Well, with this last PR I think the changes are almost all done, the only missing piece being the negative option (and removing the debugging printf I put). There are other things that I thought of doing at a later point in time, as moving the memspn, memcspn and memtok functions elsewhere on axel code, but I think that can be postponed.

Any thoughts about it? When everything is on right shape I can rebase it on master and push it again in a more organized way.

Copy link
Member

@ismaell ismaell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better now. I made a couple of comments, the only important task now is to complete the get_creds implementation.

src/text.c Outdated Show resolved Hide resolved
src/conn.c Outdated
netrc_t * netrc;
netrc = netrc_init(conn->conf->netrc_filename, conn->host, conn->user, sizeof(conn->user), conn->pass, sizeof(conn->pass));
netrc_parse(netrc);
if (conn->conf->netrc) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code here shouldn't be conditional, and it should be for all protocols.

We want to support other formats, like authinfo, in the future.

It should be into a separate function, so that refactoring doesn't touch code around it:

conf_auth_setup(conf);

Also auto-login should only happen if we didn't got the credentials from somewhere else, so a check for that is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I will move this code to the conn.c:conn_set() function, as that is the one that deals with parsing URLs and extracting fields like username, password and hostname from them and copying to conn_t. But once parsing is entangled with auth code, I'd rather leave this kind of refactoring to another PR.

src/netrc.c Outdated Show resolved Hide resolved
@davidpolverari
Copy link
Contributor Author

Looks much better now. I made a couple of comments, the only important task now is to complete the get_creds implementation.

What else do you think it is missing on get_creds() implementation?

@ismaell
Copy link
Member

ismaell commented Jan 20, 2020

@davidpolverari please take a look at ismaell@888ec1e.

That fixes both the null deref at the mmap function, the parsing (which now aborts on unkonwn tokens), and removes two globals.

What I'm missing there is a cache, so that each machine is only parsed once.

ismaell and others added 2 commits January 20, 2020 20:46
- removed netrc_free
- simplified netrc_mmap
- reworked parser
- simplified parser flow
- replaced memcpy with strlcpy
@davidpolverari
Copy link
Contributor Author

Your parser implementation is way more elegant than mine 😄. Anyways, I had to make minor changes to make it work and to simplify (IMHO) the program flow. Besides, we can't use memcpy because the pointers returned by memtok aren't NULL-terminated, thus we need strlcpy to NULL-terminate it the copy for us.

As the parser now aborts on unknown tokens, we may need to include the other tokens used by the format which are unused on axel, and turn them in no-ops, otherwise a compliant file might get ignored.

@davidpolverari
Copy link
Contributor Author

Regarding the caching (memoization), I think we could create a function to wrap netrc_parse and populate/verify/return the previous results contained in an array of few elements of a struct (host, username, password) inside netrc_t, acting like a circular buffer. With few elements, we wouldn't need any fancy algorithm to do it.

But don't you think it is better to implement it as a separate PR later?

src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
src/netrc.c Show resolved Hide resolved
src/netrc.c Outdated Show resolved Hide resolved
@ismaell
Copy link
Member

ismaell commented Jan 23, 2020

I failed to mention I made changes at https://github.com/ismaell/axel netrc branch

@davidpolverari
Copy link
Contributor Author

I cherry-picked your commits here, as they would be the same changes I'd have to make.

@ismaell ismaell self-assigned this Mar 25, 2020
@ismaell ismaell added this to the v2.18 milestone Mar 25, 2020
@ismaell
Copy link
Member

ismaell commented Apr 6, 2020

@davidpolverari hey, 'sup? are you still around?

@davidpolverari
Copy link
Contributor Author

Hi! Yes, was just a little bit busy with other stuff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.netrc support
2 participants