This repository has been archived by the owner on Jun 9, 2020. It is now read-only.

allow use of separate files for TSV headers? #97

Open
RichMorin opened this issue May 22, 2014 · 1 comment

Comments

@RichMorin

Let's say that I have created a pair of huge (e.g., multi-gigabyte) TSV files. After importing them, I find that I need to edit the header lines to add indexing, etc.

I can't edit the files directly; they're far too large for any conventional text editor. So, I need to use Unix tools such as head(1), tail(1), and cat(1) to manipulate the files in and around the editing process. This is both annoying and time-consuming.
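For illustration, the head/tail/cat workaround described above can be approximated in one streaming pass. This is only a minimal sketch, not part of the batch importer: the function name `replace_header` and the sample header text are hypothetical, and it assumes the header is exactly the first line of the file.

```python
import shutil


def replace_header(src_path, dst_path, new_header):
    """Copy src_path to dst_path, swapping only the first line.

    Hypothetical helper: assumes the header occupies exactly one line.
    The data is streamed in chunks, so memory use stays small even for
    multi-gigabyte files.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.readline()  # read and discard the old header line
        # Write the new header with exactly one trailing newline.
        dst.write(new_header.rstrip("\r\n").encode("utf-8") + b"\n")
        # Copy everything after the old header in 1 MiB chunks.
        shutil.copyfileobj(src, dst, length=1024 * 1024)


if __name__ == "__main__":
    import sys

    # Usage: python replace_header.py old.tsv new.tsv "col1<TAB>col2"
    replace_header(sys.argv[1], sys.argv[2], sys.argv[3])
```

This still rewrites the entire data file for every header change, which is the cost that separate header files would avoid.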

So, I'd like to have a way to use separate files for the TSV headers. That would allow me to edit the (tiny) header files, leaving the (huge) data files alone. Please consider adding a feature such as this.

@rswarup82

Hi Mike,

I spent some time with the neo4j import tool that comes with the 2.2.x version, which allows me to provide the headers for node/relationship CSV files in a separate file. I found this feature very useful: when importing billions of nodes and relationships into a graph database, the files are inevitably large, so opening them in a text editor is not feasible. Moreover, node/relationship CSV files are sometimes split into multiple files, in which case copying the header into each file is impractical.

Is there any way to have this feature available in the batch importer tool in a coming release? To use version 2.1.8 in production, we might have to use the batch importer tool for bulk data import into neo4j.

Looking forward to hearing back from you.
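To make the idea concrete, here is a rough sketch of how a reader could combine one small header file with several header-less data files. It is not how the batch importer or the neo4j import tool is implemented; the function name `read_rows` and the file layout (one header line in its own file, tab-delimited data parts) are assumptions for illustration.

```python
import csv


def read_rows(header_path, data_paths, delimiter="\t"):
    """Yield one dict per data row, taking column names from header_path.

    Hypothetical helper: assumes header_path holds a single header line
    and that none of the files in data_paths contains a header.
    """
    with open(header_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f, delimiter=delimiter), None)
    if header is None:
        raise ValueError("header file is empty: " + header_path)

    # Stream each data part in order; only one row is held in memory.
    for path in data_paths:
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.reader(f, delimiter=delimiter):
                yield dict(zip(header, row))
```

With this layout, changing a column definition means editing only the header file, and the same header applies to every part of a split data set.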

Thanks for your support.
Swarup Rakshit
